Abstract
State-of-the-art deep learning models for hand-related tasks, e.g. 3D hand joint estimation, need vast amounts of annotated data to perform well, and this lack of data is a problem of paramount importance. Consequently, training deep learning models on synthetic datasets has become a trend and represents a promising avenue for improving existing approaches. Nevertheless, currently existing synthetic datasets lack accurate and complete annotations, realism, and rich hand-object interactions. To address this, we present a synthetic dataset featuring rich hand-object interactions in photorealistic scenarios. Its applications to hand-related challenges are broad. To validate our data, we propose an initial approach to 3D hand joint estimation using a graph convolutional network fed with point cloud data. A further point in favour of our dataset is that interactions are performed with realistic objects taken from the YCB dataset, which allows systems trained on our synthetic data to be tested on real images/videos in which the same objects are manipulated.
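The abstract's pipeline, a graph convolutional network operating on a hand point cloud, can be illustrated with a minimal sketch. This is not the authors' GraphHands architecture: the k-nearest-neighbour graph construction, the Kipf-Welling-style convolution, the mean pooling, and all sizes (64 points, 21 joints, k=8, hidden width 16) are illustrative assumptions.

```python
import numpy as np

def knn_graph(points, k):
    """Build a symmetric k-nearest-neighbour adjacency matrix over a point cloud."""
    n = len(points)
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    adj = np.zeros((n, n))
    for i in range(n):
        for j in np.argsort(d[i])[1:k + 1]:  # skip index 0 (the point itself)
            adj[i, j] = adj[j, i] = 1.0
    return adj

def graph_conv(x, adj, weight):
    """One graph convolution: ReLU(D^-1/2 (A+I) D^-1/2 X W)."""
    a_hat = adj + np.eye(len(adj))                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))  # symmetric normalisation
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ x @ weight, 0.0)    # ReLU

# Illustrative data: 64 points with xyz coordinates as input features.
rng = np.random.default_rng(0)
cloud = rng.standard_normal((64, 3))
adj = knn_graph(cloud, k=8)
w1 = rng.standard_normal((3, 16)) * 0.1
w2 = rng.standard_normal((16, 21 * 3)) * 0.1  # 21 hand joints * xyz

h = graph_conv(cloud, adj, w1)
# Pool per-point features into a single prediction of 21 3D joint positions.
joints = graph_conv(h, adj, w2).mean(axis=0).reshape(21, 3)
print(joints.shape)  # (21, 3)
```

In practice the weights would be learned by regressing against the dataset's ground-truth joint annotations, and a learned graph or dynamic edge construction would replace the fixed kNN graph.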
Acknowledgements
This work has been funded by the Spanish Government grant TIN2016-76515-R for the COMBAHO project, supported with FEDER funds. It has also been supported by three Spanish national grants for PhD studies (FPU15/04516, FPU17/00166, and ACIF/2018/197), by the University of Alicante project GRE16-19, and by the Valencian Government project GV/2018/022. Experiments were made possible by a generous hardware donation from NVIDIA.
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Castro-Vargas, JA., Garcia-Garcia, A., Oprea, S., Martinez-Gonzalez, P., Garcia-Rodriguez, J. (2020). 3D Hand Joints Position Estimation with Graph Convolutional Networks: A GraphHands Baseline. In: Silva, M., Luís Lima, J., Reis, L., Sanfeliu, A., Tardioli, D. (eds) Robot 2019: Fourth Iberian Robotics Conference. ROBOT 2019. Advances in Intelligent Systems and Computing, vol 1093. Springer, Cham. https://doi.org/10.1007/978-3-030-36150-1_45
DOI: https://doi.org/10.1007/978-3-030-36150-1_45
Print ISBN: 978-3-030-36149-5
Online ISBN: 978-3-030-36150-1
eBook Packages: Intelligent Technologies and Robotics (R0)