On Construction of a Caffe Deep Learning Framework based on Intel Xeon Phi

  • Conference paper
  • First Online:
Advances on P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC 2018)

Abstract

With the increase in processor computing power has come a substantial rise in the development of scientific applications such as weather forecasting, financial market analysis, and medical technology, and the need for intelligent data processing has grown significantly with it. Deep learning, a framework able to understand abstract information such as images, text, and sound, has become a challenging area in recent research. Both accuracy and speed are therefore essential when implementing a large neural network. In this paper, we implement the Caffe deep learning framework on Intel Xeon Phi and measure its performance in that environment. We conduct three experiments. First, we evaluate the accuracy of the Caffe framework at several numbers of training iterations on Intel Xeon Phi. Second, for speed, we compare training time before and after optimization on the Intel Xeon E5-2650 and the Intel Xeon Phi 7210, using vectorization, OpenMP parallel processing, and the Message Passing Interface (MPI). Third, we compare multinode execution results on two nodes of Intel Xeon E5-2650 and two nodes of Intel Xeon Phi 7210.
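The full paper is not reproduced on this page, so the authors' actual Caffe modifications are not shown here. As an illustration only, the C++ sketch below shows one common way the three techniques named in the abstract fit together in a training kernel: a loop that is OpenMP-threaded and SIMD-vectorized, plus an MPI all-reduce for multinode gradient averaging. The function names axpy_kernel and average_gradients are hypothetical and not taken from the paper.

#include <mpi.h>
#include <vector>

// Element-wise y += a * x: OpenMP splits the loop across threads and the
// "simd" clause asks the compiler to vectorize each thread's chunk.
// (Hypothetical stand-in for a Caffe math kernel, not the paper's code.)
void axpy_kernel(float a, const std::vector<float>& x, std::vector<float>& y) {
  const int n = static_cast<int>(x.size());
  #pragma omp parallel for simd
  for (int i = 0; i < n; ++i) {
    y[i] += a * x[i];
  }
}

// Synchronous data parallelism across nodes: sum local gradients over all
// MPI ranks in place, then divide by the rank count to obtain the mean.
void average_gradients(std::vector<float>& grad) {
  int world_size = 1;
  MPI_Comm_size(MPI_COMM_WORLD, &world_size);
  MPI_Allreduce(MPI_IN_PLACE, grad.data(), static_cast<int>(grad.size()),
                MPI_FLOAT, MPI_SUM, MPI_COMM_WORLD);
  #pragma omp parallel for simd
  for (int i = 0; i < static_cast<int>(grad.size()); ++i) {
    grad[i] /= static_cast<float>(world_size);
  }
}

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);
  std::vector<float> x(1 << 20, 1.0f), grad(1 << 20, 0.25f);
  axpy_kernel(0.5f, x, grad);   // threaded, vectorized compute on one node
  average_gradients(grad);      // gradient exchange across nodes
  MPI_Finalize();
  return 0;
}

Built with an MPI compiler wrapper and OpenMP enabled (for example, mpicxx -fopenmp) and launched with one rank per node, this mirrors the two-node setup the abstract describes; the vector sizes and constants are arbitrary.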



Acknowledgements

This work was supported in part by the Ministry of Science and Technology, Taiwan, under Grant MOST 107-2221-E-029-008 and MOST 106-3114-E-029-003.

Author information

Correspondence to Chao-Tung Yang.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Yang, CT., Liu, JC., Chan, YW., Kristiani, E., Kuo, CF. (2019). On Construction of a Caffe Deep Learning Framework based on Intel Xeon Phi. In: Xhafa, F., Leu, FY., Ficco, M., Yang, CT. (eds) Advances on P2P, Parallel, Grid, Cloud and Internet Computing. 3PGCIC 2018. Lecture Notes on Data Engineering and Communications Technologies, vol 24. Springer, Cham. https://doi.org/10.1007/978-3-030-02607-3_9


  • DOI: https://doi.org/10.1007/978-3-030-02607-3_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-02606-6

  • Online ISBN: 978-3-030-02607-3

  • eBook Packages: Engineering, Engineering (R0)
