Robust Hand Pose Regression Using Convolutional Neural Networks

Gomez-Donoso, Francisco; Orts-Escolano, Sergio; Cazorla, Miguel

doi:10.1007/978-3-319-70833-1_48

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 693)

Included in the following conference series:

Iberian Robotics conference

Abstract

Hand pose estimation is useful for several human-computer interaction applications, such as sign language recognition, the identification of more complex behaviors such as hand gestures, and interaction in virtual reality. In this work, we propose a system that predicts the 2D positions of the hand joints using a monocular color camera. To do that, we use a 3D hand tracking sensor to collect ground truth information, which is projected onto the camera image plane. We present a novel pipeline that leverages deep learning techniques for hand pose estimation. The proposed convolutional neural network (CNN) infers the joints of the hand from an image without the need for any additional sensor.
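
The projection step mentioned in the abstract can be illustrated with a standard pinhole camera model. The Python sketch below is not the authors' implementation. It assumes that the rotation and translation from the tracking sensor frame to the camera frame, and the camera intrinsics (fx, fy, cx, cy), are already known from a calibration step, and it ignores lens distortion. The function name project_joints and all parameter names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


def project_joints(
    joints_sensor: Sequence[Point3D],
    rotation: Sequence[Sequence[float]],
    translation: Sequence[float],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point2D]:
    """Project 3D joints (sensor frame) to 2D pixel coordinates.

    Assumed model: p_cam = R * p_sensor + t, followed by an ideal
    pinhole projection without lens distortion.
    """
    pixels: List[Point2D] = []
    for x, y, z in joints_sensor:
        # Rigid transform from the sensor frame to the camera frame.
        xc = rotation[0][0] * x + rotation[0][1] * y + rotation[0][2] * z + translation[0]
        yc = rotation[1][0] * x + rotation[1][1] * y + rotation[1][2] * z + translation[1]
        zc = rotation[2][0] * x + rotation[2][1] * y + rotation[2][2] * z + translation[2]
        if zc <= 0.0:
            # A point at or behind the camera plane has no valid projection.
            raise ValueError("joint is not in front of the camera")
        # Perspective division, then scaling and shifting by the intrinsics.
        u = fx * xc / zc + cx
        v = fy * yc / zc + cy
        pixels.append((u, v))
    return pixels


# Illustrative values only: identity extrinsics and made-up intrinsics.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
example_joints = [(0.0, 0.0, 1.0), (0.1, -0.05, 0.5)]
example_pixels = project_joints(
    example_joints, IDENTITY, [0.0, 0.0, 0.0], 500.0, 500.0, 320.0, 240.0
)
```

Under these assumptions, each projected 2D joint can serve as a regression target for the corresponding color image, which is the role the abstract gives to the projected ground truth.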

Acknowledgments

This work was supported by the Spanish Government grant TIN2016-76515-R, with FEDER funds.

Author information

Authors and Affiliations

Computer Science Research Institute, University of Alicante, P.O. Box 99, 03080, Alicante, Spain
Francisco Gomez-Donoso, Sergio Orts-Escolano & Miguel Cazorla

Corresponding author

Correspondence to Francisco Gomez-Donoso.

Editor information

Editors and Affiliations

Escuela Técnica Superior de Ingeniería, Universidad de Sevilla, Sevilla, Spain
Anibal Ollero
Institut de Robòtica I Informàtica Industrial (CSIC-UPC), Universitat Politècnica de Catalunya, Barcelona, Spain
Alberto Sanfeliu
Departamento de Informática e Ingeniería de Sistemas, Escuela de Ingeniería y Arquitectura, Instituto de Investigación en Ingeniería de Aragón, Zaragoza, Spain
Luis Montano
Institute of Electronics and Telematics Engineering of Aveiro (IEETA), Universidade de Aveiro, Aveiro, Portugal
Nuno Lau
IDMEC, Instituto Superior Técnico de Lisboa, Universidade de Lisboa, Lisbon, Portugal
Carlos Cardeira

Cite this paper

Gomez-Donoso, F., Orts-Escolano, S., Cazorla, M. (2018). Robust Hand Pose Regression Using Convolutional Neural Networks. In: Ollero, A., Sanfeliu, A., Montano, L., Lau, N., Cardeira, C. (eds) ROBOT 2017: Third Iberian Robotics Conference. ROBOT 2017. Advances in Intelligent Systems and Computing, vol 693. Springer, Cham. https://doi.org/10.1007/978-3-319-70833-1_48

DOI: https://doi.org/10.1007/978-3-319-70833-1_48
Published: 12 November 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70832-4
Online ISBN: 978-3-319-70833-1
eBook Packages: Engineering, Engineering (R0)
