An Implementation and Improvement of Convolutional Neural Networks on HSA Platform

Bao, Zhenshan; Luo, Qi; Zhang, Wenbo

doi:10.1007/978-981-10-6385-5_50

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 727))

Included in the following conference series:

International Conference of Pioneering Computer Scientists, Engineers and Educators

Abstract

Most heterogeneous architectures today are assembled from IP modules supplied by different hardware vendors, a model that is inefficient. To address this problem, AMD and other hardware vendors jointly proposed the Heterogeneous System Architecture (HSA) specification. HSA helps developers speed up design and programming, and it also improves system performance and reduces power consumption. In this paper we present the implementation of a framework for accelerating the training and classification of arbitrary Convolutional Neural Networks (CNNs) on HSA. Building on this implementation, we present two acceleration methods: updating weights online, and letting the CPU participate in the computation. Experimental results show that the CNN implementation on HSA is 4 to 10 times faster than on the CPU.
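
The abstract names online weight updating but does not describe it. As a rough illustration only, the sketch below contrasts a batch update with an online (per-sample) update on a toy one-weight model. It assumes "online" is meant in the usual sense of updating the weights after every training sample. The function names, the toy data, and the learning rate are hypothetical and are not taken from the paper, and this is not the authors' HSA implementation.

```python
def batch_update(w, samples, lr):
    """One epoch of batch gradient descent for y = w * x with squared error.

    The gradient is averaged over all samples and applied once.
    """
    grad = sum((w * x - y) * x for x, y in samples) / len(samples)
    return w - lr * grad


def online_update(w, samples, lr):
    """One epoch of online updates: the weight changes after every sample."""
    for x, y in samples:
        w -= lr * (w * x - y) * x
    return w


# Toy data (assumption for illustration): the true weight is 2.
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

w_batch = batch_update(0.0, samples, lr=0.05)
w_online = online_update(0.0, samples, lr=0.05)
print(w_batch, w_online)
```

On this toy data the online variant moves closer to the true weight within a single pass, because each sample already benefits from the previous sample's correction. Whether that carries over to a full CNN depends on the network and the data.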

Acknowledgement

This research is supported by the Natural Science Foundation of BJUT, the National Natural Science Foundation of China (Grants No. 91546111, 91646201), the Key Project of Beijing Municipal Education Commission (Grants No. KZ201610005009).

Author information

Authors and Affiliations

Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
Zhenshan Bao, Qi Luo & Wenbo Zhang

Corresponding author

Correspondence to Zhenshan Bao.

Editor information

Editors and Affiliations

Central South University, Changsha, China
Beiji Zou
Central South University, Changsha, China
Min Li
Harbin Institute of Technology, Harbin, China
Hongzhi Wang
Harbin University of Science and Technology, Harbin, China
Xianhua Song
Harbin University of Science and Technology, Harbin, China
Wei Xie
Harbin Sea of Clouds and Computer Technology, Harbin, China
Zeguang Lu

About this paper

Cite this paper

Bao, Z., Luo, Q., Zhang, W. (2017). An Implementation and Improvement of Convolutional Neural Networks on HSA Platform. In: Zou, B., Li, M., Wang, H., Song, X., Xie, W., Lu, Z. (eds) Data Science. ICPCSEE 2017. Communications in Computer and Information Science, vol 727. Springer, Singapore. https://doi.org/10.1007/978-981-10-6385-5_50

DOI: https://doi.org/10.1007/978-981-10-6385-5_50
Published: 16 September 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6384-8
Online ISBN: 978-981-10-6385-5
eBook Packages: Computer Science, Computer Science (R0)
