
Enabling More Accurate Bounding Boxes for Deep Learning-Based Real-Time Human Detection

  • Conference paper

In: Recent Trends in Communication, Computing, and Electronics

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 524)


Abstract

While human detection is widely recognized and applied in many areas, its use for behavioral analysis in medical research has rarely been reported. Recently, however, active efforts have been made to recognize behavioral disorders by measuring gait variability through pattern analysis of human detection results in camera-recorded videos. For this purpose, robust human detection algorithms are crucial. In this work, we modified deep learning models by restricting multi-class detection to human detection. We also improved the localization of human detection by adjusting the input image according to the ratio of objects in the image and by refining the resulting bounding boxes through interpolation. Experimental results demonstrate that adopting these proposals significantly increases the accuracy of human detection.
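
The paper's implementation is not given on this page; the following is a minimal sketch, in Python, of two of the abstract's ideas: restricting a multi-class detector's output to the person class, and refining one person's bounding boxes across video frames by linear interpolation. The detection tuple format and the names (`keep_persons`, `interpolate_track`) are illustrative assumptions, not the authors' API; the third idea, adjusting the input image size according to the ratio of objects, depends on detector internals and is not sketched.

```python
# Illustrative sketch only, not the authors' code.
from typing import List, Optional, Tuple

# Assumed detection format: (class_name, confidence, x, y, w, h).
Detection = Tuple[str, float, float, float, float, float]
Box = Tuple[float, float, float, float]


def keep_persons(detections: List[Detection], min_conf: float = 0.5) -> List[Box]:
    """Turn multi-class detection into human detection: keep only
    sufficiently confident 'person' boxes, discard every other class."""
    return [(x, y, w, h)
            for cls, conf, x, y, w, h in detections
            if cls == "person" and conf >= min_conf]


def interpolate_track(boxes: List[Optional[Box]]) -> List[Optional[Box]]:
    """Refine one person's per-frame boxes by interpolation: frames where
    the detector missed (None) are filled linearly from the nearest
    detected boxes on either side. Leading/trailing gaps stay None."""
    out = list(boxes)
    known = [i for i, b in enumerate(out) if b is not None]
    for a, b in zip(known, known[1:]):
        for i in range(a + 1, b):            # missed frames between a and b
            t = (i - a) / (b - a)            # interpolation weight in (0, 1)
            out[i] = tuple((1 - t) * p + t * q
                           for p, q in zip(out[a], out[b]))
    return out


# Example: two missed frames between detections at frames 0 and 3.
track = [(10.0, 10.0, 50.0, 100.0), None, None, (16.0, 10.0, 50.0, 100.0)]
print(interpolate_track(track))
# -> x is filled in as 12.0 and 14.0; the other coordinates stay constant.
```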



Acknowledgements

This work was supported by the Brain Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (NRF-2016M3C7A1905477, NRF-2014M3C7A1046050) and the Basic Science Research Program through the NRF funded by the Ministry of Education (NRF-2017R1D1A1B03036423). This study was approved by the Institutional Review Board of Gwangju Institute of Science and Technology (IRB no. 20180629-HR-36-07-04). All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. This article does not contain any studies with animals performed by any of the authors.

Author information


Corresponding authors

Correspondence to Jeonghwan Gwak or Jong-In Song.



Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Jeong, H., Gwak, J., Park, C., Khare, M., Prakash, O., Song, JI. (2019). Enabling More Accurate Bounding Boxes for Deep Learning-Based Real-Time Human Detection. In: Khare, A., Tiwary, U., Sethi, I., Singh, N. (eds) Recent Trends in Communication, Computing, and Electronics. Lecture Notes in Electrical Engineering, vol 524. Springer, Singapore. https://doi.org/10.1007/978-981-13-2685-1_33

Download citation

  • DOI: https://doi.org/10.1007/978-981-13-2685-1_33

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-2684-4

  • Online ISBN: 978-981-13-2685-1

  • eBook Packages: Engineering, Engineering (R0)
