Abstract
Preparing high-quality datasets for urban scene understanding is labor-intensive, especially for datasets designed for autonomous driving applications. Using coarse ground truth (GT) annotations of these datasets without detriment to the accuracy of semantic image segmentation, measured by the mean intersection over union (mIoU), could simplify and speed up dataset preparation and model fine-tuning before practical application. Here, the results of a comparative analysis of the semantic segmentation accuracy obtained with the PSPNet deep learning architecture are presented for finely and coarsely annotated images from the Cityscapes dataset. Two scenarios were investigated: scenario 1, fine GT images for training and prediction; and scenario 2, fine GT images for training and coarse GT images for prediction. The results demonstrate that, for the most important classes, the mean accuracy of semantic image segmentation is higher for coarse GT annotations than for fine GT ones, while the standard deviation is lower. This means that, for some applications, unimportant classes can be excluded and the model can be tuned further for selected classes and specific regions on the coarse GT dataset without loss of accuracy. Moreover, this opens up the prospect of using deep neural networks to prepare such coarse GT datasets.
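As an illustration of the accuracy metric named in the abstract, the following is a minimal sketch of per-class IoU and mIoU over flattened label maps. It is not the authors' evaluation code: the function names `per_class_iou` and `mean_iou`, the `IGNORE_LABEL` value of 255, and the toy labels are assumptions made for this example. The optional `classes` argument mirrors the idea of excluding unimportant classes from the mean.

```python
from typing import List, Optional, Sequence

# Assumed "void" label id; ground truth pixels with this value are skipped.
IGNORE_LABEL = 255


def per_class_iou(gt: Sequence[int], pred: Sequence[int], num_classes: int,
                  ignore_label: int = IGNORE_LABEL) -> List[Optional[float]]:
    """Return IoU per class, or None for classes absent from both gt and pred.

    gt and pred are flattened label maps of equal length; pred values are
    assumed to lie in the range 0..num_classes-1.
    """
    intersection = [0] * num_classes
    gt_count = [0] * num_classes
    pred_count = [0] * num_classes
    for g, p in zip(gt, pred):
        if g == ignore_label:
            continue  # unlabeled pixel: contributes to no class
        gt_count[g] += 1
        pred_count[p] += 1
        if g == p:
            intersection[g] += 1
    ious: List[Optional[float]] = []
    for c in range(num_classes):
        # |A union B| = |A| + |B| - |A intersect B|
        union = gt_count[c] + pred_count[c] - intersection[c]
        ious.append(intersection[c] / union if union > 0 else None)
    return ious


def mean_iou(ious: Sequence[Optional[float]],
             classes: Optional[Sequence[int]] = None) -> float:
    """Average IoU over the selected classes, skipping undefined entries."""
    selected = range(len(ious)) if classes is None else classes
    values = [ious[c] for c in selected if ious[c] is not None]
    return sum(values) / len(values) if values else float("nan")


# Toy example: six pixels, three classes, one ignored pixel.
gt_labels = [0, 0, 1, 1, 255, 2]
pred_labels = [0, 1, 1, 1, 0, 2]
ious = per_class_iou(gt_labels, pred_labels, num_classes=3)
miou_all = mean_iou(ious)
miou_subset = mean_iou(ious, classes=[0, 2])  # mean over a chosen class subset
```

Comparing the two scenarios would then amount to calling `per_class_iou` once with the fine GT labels and once with the coarse GT labels for the same predictions, and comparing the per-class values.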
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Taran, V., Gordienko, Y., Rokovyi, A., Alienin, O., Stirenko, S. (2020). Impact of Ground Truth Annotation Quality on Performance of Semantic Image Segmentation of Traffic Conditions. In: Hu, Z., Petoukhov, S., Dychka, I., He, M. (eds) Advances in Computer Science for Engineering and Education II. ICCSEEA 2019. Advances in Intelligent Systems and Computing, vol 938. Springer, Cham. https://doi.org/10.1007/978-3-030-16621-2_17
DOI: https://doi.org/10.1007/978-3-030-16621-2_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-16620-5
Online ISBN: 978-3-030-16621-2
eBook Packages: Intelligent Technologies and Robotics (R0)