Robotic grasping recognition using multi-modal deep extreme learning machine

Wei, Jie; Liu, Huaping; Yan, Gaowei; Sun, Fuchun

doi:10.1007/s11045-016-0389-0

Published: 03 March 2016

Multidimensional Systems and Signal Processing, volume 28, pages 817–833 (2017)

Jie Wei¹,², Huaping Liu¹, Gaowei Yan² & Fuchun Sun¹

Abstract

Recognizing which parts of an object are graspable is important for an intelligent robot that must perform complicated tasks. To obtain good grasping performance, it is crucial to learn rich representations efficiently from multi-modal RGB-D images. To address this problem, we propose a multi-modal deep extreme learning machine structure. In this structure, an unsupervised hierarchical extreme learning machine (ELM) extracts features from the RGB and depth modalities separately. A shared layer then combines the RGB and depth features. Finally, an ELM serves as the supervised classifier that makes the final decision. Experimental validation on the Cornell grasping dataset shows that the proposed multi-modal fusion method achieves better grasp recognition performance.
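The three stages named in the abstract (per-modality unsupervised feature extraction, a shared fusion layer, and a supervised ELM classifier) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the full text is not reproduced here, so the ELM autoencoder formulation, the tanh activation, the layer sizes, the ridge parameter, concatenation as the fusion step, and the synthetic stand-in data are all assumptions chosen for illustration.

```python
import numpy as np

def elm_autoencoder(X, n_hidden, reg, rng):
    """Fit one ELM autoencoder layer (assumed formulation).

    Hidden weights are random and never trained; only the output
    weights beta, which reconstruct X from the hidden layer, are
    solved in closed form by ridge regression.
    """
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ X)
    return beta  # shape: (n_hidden, n_features)

def fit_stack(X, layer_sizes, reg, rng):
    """Stack ELM autoencoders; each layer encodes with beta transposed."""
    betas = []
    for n_hidden in layer_sizes:
        beta = elm_autoencoder(X, n_hidden, reg, rng)
        betas.append(beta)
        X = np.tanh(X @ beta.T)
    return betas

def encode(X, betas):
    """Map inputs through a fitted stack of autoencoder layers."""
    for beta in betas:
        X = np.tanh(X @ beta.T)
    return X

def fit_elm_classifier(X, y, n_classes, n_hidden, reg, rng):
    """Supervised ELM: random hidden layer, ridge-solved output weights."""
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)
    T = np.eye(n_classes)[y]  # one-hot targets
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ T)
    return W, b, beta

def predict_elm(X, model):
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)

def demo(seed=0):
    """Run the pipeline on synthetic data (not the Cornell dataset)."""
    rng = np.random.default_rng(seed)
    n = 200
    y = rng.integers(0, 2, size=n)  # 1 = graspable, 0 = not (hypothetical labels)
    rgb = rng.standard_normal((n, 48)) + y[:, None]    # stand-in RGB features
    depth = rng.standard_normal((n, 16)) - y[:, None]  # stand-in depth features

    # Stage 1: unsupervised feature extraction, one stack per modality.
    rgb_betas = fit_stack(rgb, [32, 16], 1e-2, rng)
    depth_betas = fit_stack(depth, [12, 8], 1e-2, rng)

    # Stage 2: shared layer over the concatenated modality features.
    joint = np.hstack([encode(rgb, rgb_betas), encode(depth, depth_betas)])
    shared = encode(joint, fit_stack(joint, [16], 1e-2, rng))

    # Stage 3: supervised ELM classifier on the shared representation.
    model = fit_elm_classifier(shared, y, 2, 64, 1e-2, rng)
    return shared, predict_elm(shared, model), y
```

Calling `demo()` returns the shared representation, the predicted labels, and the true labels. Because every layer is solved in closed form rather than by backpropagation, training reduces to a handful of linear solves, which is the efficiency argument usually made for ELM-based models.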

Acknowledgments

This work was supported in part by the National Key Project for Basic Research of China under Grant 2013CB329403; in part by the National High-tech Research and Development Plan under Grant 2015AA042306; in part by the National Natural Science Foundation of China under Grants 61210013 and 61450011; and in part by the Tsinghua University Initiative Scientific Research Program under Grant 20131089295.

Author information

Authors and Affiliations

Department of Computer Science and Technology, Tsinghua University, State Key Laboratory of Intelligent Technology and Systems, TNLIST, Beijing, People’s Republic of China
Jie Wei, Huaping Liu & Fuchun Sun
College of Information Engineering, Taiyuan University of Technology, Taiyuan, Shanxi, People’s Republic of China
Jie Wei & Gaowei Yan

Corresponding authors

Correspondence to Huaping Liu or Gaowei Yan.

About this article

Cite this article

Wei, J., Liu, H., Yan, G. et al. Robotic grasping recognition using multi-modal deep extreme learning machine. Multidim Syst Sign Process 28, 817–833 (2017). https://doi.org/10.1007/s11045-016-0389-0

Received: 27 September 2015
Revised: 22 January 2016
Accepted: 19 February 2016
Published: 03 March 2016
Issue Date: July 2017
DOI: https://doi.org/10.1007/s11045-016-0389-0
