Improvement of Mask-RCNN Object Segmentation Algorithm

  • Conference paper
Intelligent Robotics and Applications (ICIRA 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11740)

Abstract

Semantic maps play a key role in tasks such as mobile robot navigation. However, visual SLAM algorithms based on multi-view geometry do not make full use of the rich semantic information in the scene: the map points they retain are merely spatial geometric points without semantics. Since convolutional neural networks have achieved breakthroughs in object detection, the instance segmentation algorithm Mask-RCNN can be combined with SLAM to construct a semantic map. However, Mask-RCNN tends to treat part of the image background as foreground, which makes object segmentation inaccurate. The GrabCut segmentation algorithm, in turn, is time-consuming and tends to treat foreground as background, which leads to over-segmentation at object edges. Motivated by these observations, this paper proposes a novel algorithm that combines Mask-RCNN and GrabCut segmentation. Comparing the experimental results of Mask-RCNN, GrabCut, and the improved algorithm on the data set shows that the improved algorithm achieves the best segmentation results and significantly improves the accuracy of object segmentation, demonstrating the effectiveness of the proposed algorithm.
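The full method is only described in the paywalled paper, but the abstract makes the overall idea clear: use the Mask-RCNN prediction to initialise GrabCut so that GrabCut's colour model cleans up the over-inclusive Mask-RCNN boundary. The sketch below shows one plausible way to wire the two together with OpenCV; the trimap construction (erode for definite foreground, dilate for the probable region), the kernel sizes, and the function name refine_mask_with_grabcut are my assumptions for illustration, not the authors' exact fusion rule.

```python
import cv2
import numpy as np


def refine_mask_with_grabcut(image_bgr, coarse_mask, iters=5):
    """Refine a coarse instance mask (e.g. from Mask-RCNN) with GrabCut.

    image_bgr   : H x W x 3 uint8 image in OpenCV BGR order.
    coarse_mask : H x W array, nonzero where the detector predicted the object.
    Returns an H x W uint8 mask (1 = foreground, 0 = background).
    """
    coarse = (np.asarray(coarse_mask) > 0).astype(np.uint8)

    # Erode the detector mask to keep only pixels we are confident are
    # foreground; dilate it to bound the region that might still contain
    # the object. The 5x5 kernel and 3 iterations are arbitrary choices.
    kernel = np.ones((5, 5), np.uint8)
    sure_fg = cv2.erode(coarse, kernel, iterations=3)
    maybe_fg = cv2.dilate(coarse, kernel, iterations=3)

    # Build the GrabCut trimap: definite background everywhere, probable
    # foreground in the dilated band, definite foreground in the eroded core.
    gc_mask = np.full(coarse.shape, cv2.GC_BGD, dtype=np.uint8)
    gc_mask[maybe_fg > 0] = cv2.GC_PR_FGD
    gc_mask[sure_fg > 0] = cv2.GC_FGD

    bgd_model = np.zeros((1, 65), np.float64)
    fgd_model = np.zeros((1, 65), np.float64)

    # Mask-initialised GrabCut; the rectangle argument is ignored in this mode.
    cv2.grabCut(image_bgr, gc_mask, None, bgd_model, fgd_model,
                iters, cv2.GC_INIT_WITH_MASK)

    # Keep pixels GrabCut labelled as definite or probable foreground.
    return np.where(
        (gc_mask == cv2.GC_FGD) | (gc_mask == cv2.GC_PR_FGD), 1, 0
    ).astype(np.uint8)
```

Seeding GrabCut this way means its colour model only has to relabel the uncertain band around the detector's boundary, which keeps the run time down while letting it trim background pixels that Mask-RCNN wrongly included; this matches the trade-off the abstract describes, though the authors' actual combination scheme may differ.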

Acknowledgment

This work was supported by the National Key R&D Program of China under Grant 2017YFB1301103, the Fundamental Research Funds for the Central Universities of China under Grants N172604003 and N172603001, the Doctoral Foundation of the Liaoning Science and Technology Department under Grant 20170520244, and the National Natural Science Foundation of China under Grants 61701101, U1713216, 61803077, and 61603080.

Author information

Corresponding author

Correspondence to Shiguang Wen.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Wu, X., Wen, S., Xie, Ya. (2019). Improvement of Mask-RCNN Object Segmentation Algorithm. In: Yu, H., Liu, J., Liu, L., Ju, Z., Liu, Y., Zhou, D. (eds) Intelligent Robotics and Applications. ICIRA 2019. Lecture Notes in Computer Science, vol 11740. Springer, Cham. https://doi.org/10.1007/978-3-030-27526-6_51

  • DOI: https://doi.org/10.1007/978-3-030-27526-6_51

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-27525-9

  • Online ISBN: 978-3-030-27526-6

  • eBook Packages: Computer Science, Computer Science (R0)
