Realtime Human Segmentation in Video

Zhang, Tairan; Lang, Congyan; Xing, Junliang

doi:10.1007/978-3-030-05716-9_17

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11296)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

Human segmentation from a single image using deep learning models has improved significantly. However, applying a deep single-image human segmentation model directly to video gives unsatisfactory results: the segmentation is discontinuous across frames, and the segmentation process is slow. To address these issues, we propose a new real-time video human segmentation framework, designed for single-person videos, that produces smooth, accurate, and fast segmentation results. The framework consists of a fully convolutional network and a tracking module based on a level set algorithm. The fully convolutional network segments the human region in the first frame of the video sequence, and the tracking module obtains the segmentation of each subsequent frame using the segmentation result of the previous frame as its initial segmentation. The fully convolutional network is trained on human image datasets. To evaluate the framework, we created and annotated a new single-person video dataset. The experimental results demonstrate accurate and smooth human segmentation at a much higher speed than using a deep human segmentation model alone.
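
The control flow described in the abstract (segment the first frame once, then propagate the mask frame by frame) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names (`segment_video`, `track_mask`, `boundary_band`, `moving_square`), the grayscale input, and the simple two-mean region rule that stands in for the level set tracker are all assumptions made for illustration. The first-frame segmenter is passed in as a callable, so a trained fully convolutional network could take the place of the toy threshold used here.

```python
import numpy as np


def boundary_band(mask):
    """Return pixels with at least one 4-neighbour of the opposite label."""
    padded = np.pad(mask, 1, mode="edge")
    centre = padded[1:-1, 1:-1]
    neighbours = (padded[:-2, 1:-1], padded[2:, 1:-1],
                  padded[1:-1, :-2], padded[1:-1, 2:])
    band = np.zeros(mask.shape, dtype=bool)
    for shifted in neighbours:
        band |= shifted != centre
    return band


def track_mask(frame, prev_mask, iterations=50):
    """Evolve the previous mask on a new grayscale frame.

    Assumed stand-in for a level set tracker: only pixels on the current
    contour may switch label, and each one joins the region whose mean
    intensity is closer to its own value.
    """
    mask = prev_mask.copy()
    for _ in range(iterations):
        if mask.all() or not mask.any():
            break  # one region is empty, nothing left to compare
        mean_in = frame[mask].mean()
        mean_out = frame[~mask].mean()
        closer_to_fg = np.abs(frame - mean_in) < np.abs(frame - mean_out)
        new_mask = np.where(boundary_band(mask), closer_to_fg, mask)
        if np.array_equal(new_mask, mask):
            break  # contour has stopped moving
        mask = new_mask
    return mask


def segment_video(frames, segment_first_frame):
    """Segment frame 0 with the given model, then track all later frames."""
    masks = [segment_first_frame(frames[0])]
    for frame in frames[1:]:
        masks.append(track_mask(frame, masks[-1]))
    return masks


def moving_square(num_frames=5, size=24, step=2, side=6):
    """Hypothetical test data: a bright square drifting right on black."""
    frames, truth = [], []
    for k in range(num_frames):
        frame = np.zeros((size, size))
        frame[5:5 + side, 5 + k * step:5 + side + k * step] = 1.0
        frames.append(frame)
        truth.append(frame > 0.5)
    return frames, truth
```

Calling `segment_video(frames, lambda f: f > 0.5)` on the output of `moving_square()` runs the whole loop on synthetic data. Because the tracker only updates pixels on the contour, its per-frame cost depends on the contour length and the amount of motion rather than on a full network forward pass, which is the intuition behind the speed advantage claimed in the abstract.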

Author information

Authors and Affiliations

School of Computer Science and Technology, Beijing Jiaotong University, Beijing, 100044, People’s Republic of China
Tairan Zhang & Congyan Lang
National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, People’s Republic of China
Junliang Xing

Corresponding author

Correspondence to Tairan Zhang.

Editor information

Editors and Affiliations

Information Technologies Institute, Centre for Research and Technology Hellas, Thessaloniki, Greece
Ioannis Kompatsiaris
EURECOM, Sophia Antipolis, France
Benoit Huet
Information Technologies Institute, Centre for Research and Technology Hellas, Thessaloniki, Greece
Vasileios Mezaris
Dublin City University, Dublin, Ireland
Cathal Gurrin
National Chiao Tung University, Hsinchu, Taiwan
Wen-Huang Cheng
Information Technologies Institute, Centre for Research and Technology Hellas, Thessaloniki, Greece
Stefanos Vrochidis

About this paper

Cite this paper

Zhang, T., Lang, C., Xing, J. (2019). Realtime Human Segmentation in Video. In: Kompatsiaris, I., Huet, B., Mezaris, V., Gurrin, C., Cheng, WH., Vrochidis, S. (eds) MultiMedia Modeling. MMM 2019. Lecture Notes in Computer Science, vol 11296. Springer, Cham. https://doi.org/10.1007/978-3-030-05716-9_17

DOI: https://doi.org/10.1007/978-3-030-05716-9_17
Published: 11 December 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05715-2
Online ISBN: 978-3-030-05716-9
eBook Packages: Computer Science, Computer Science (R0)
