Skip to main content

Performance Optimization Strategies of High Performance Computing on GPU

  • Conference paper
Advanced Parallel Processing Technologies (APPT 2009)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 5737))

Included in the following conference series:

Abstract

Recently GPU is widely utilized in scientific computing and engineering applications, owing primarily to the evolution of GPU architecture. Firstly, we analyze some key performance characters of GPU in detail, and the relationships among GPU architecture, programming model and memory hierarchy. Secondly, we present three performance optimization strategies: Prefetching, Streamlizing, and Task Division. Adequate experiments have been done to abstract the relationships among different factors and efficiency. Finally, we map the HPL benchmark to testify our strategies and achieve certain speedup.

This work is supported by National High Technology Development 863 Program of China under Grant No.2009AA01Z102 and National Natural Science Foundation of China under Grant No.60873016.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Ghuloum, A., Sprangle, E., Fang, J., Wu, G., Zhou, X.: Ct: A Flexible Parallel Programming Model for Tera-scale Architectures. Technical report, Intel Research (2007)

    Google Scholar 

  2. Gutowitz. H.: A tutorial introduction to Swarm. Technical report, The Santa Fe Institute (1993)

    Google Scholar 

  3. Monteyne, M.: RapidMind: Multi-Core Develpment Platform, RapidMind Official Page (2007), http://www.rapidmind.net/

  4. Dongarra, J.J., Luszczek, P., Petitet, A.: The LINPACK Benchmark: Past, Present, and Future. Concurrency and Computation: Practice and Experience 15, 803–820 (2003)

    Article  Google Scholar 

  5. http://www.netlib.org/benchmark/hpl/index.html

  6. Halfhill, T.R.: Parallel Processing With CUDA. Microprocessor Report (January 2008)

    Google Scholar 

  7. Stone, J.: Accelerating Computational Biology by 100x with CUDA. In: NVISION (2008) (presentation)

    Google Scholar 

  8. Hartley, T.D.R., Catalyurek, U., Ruiz, A., Igual, F., Mayo, R., Ujaldon, M.: Biomedical image analysis on a cooperative cluster of gpus and multicores. In: ICS 2008: Proceedings of the 22nd annual international conference on Supercomputing, pp. 15–25. ACM, New York (2008)

    Google Scholar 

  9. Bond, A.: Havok FX: GPU-accelerated physics for PC games. In: Proceedings of Game Developers Conference 2006 (2006)

    Google Scholar 

  10. Hagen, T.R., Lle, K.-A., Natvig, J.R.: Solving the Euler equations on graphics processing units. In: Alexandrov, V.N., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds.) ICCS 2006. LNCS, vol. 3994, pp. 220–227. Springer, Heidelberg (2006)

    Chapter  Google Scholar 

  11. Zeller, C.: Cloth simulation on the GPU. In: ACM SIGGRAPH 2005 Conference Abstracts and Applications (2005)

    Google Scholar 

  12. Elsen, E., Houston, M., Vishal, V., Darve, E., Hanrahan, P., Pande, V.S.: N-Body simulation on GPUs. In: Proc. 2006 ACM/IEEE Conf. on Supercomputing, p. 188 (2006)

    Google Scholar 

  13. Phillips, J.C., Braun, R., Wang, W., Gumbart, J., Tajkhorshid, E., Villa, E., Chipot, C., Skeel, R.D., Kale, L., Schulten, K.: Scalable molecular dynamics with NAMD. J. Comp. Chem. 26, 1781–1802 (2005)

    Article  Google Scholar 

  14. Stone, J.E., Phillips, J.C., Freddolino, P.L., Hardy, D.J., Trabuco, L.G., Schulten, K.: Accelerating molecular modeling applications with graphics processors. J. Comp. Chem. 28, 2618–2640 (2007)

    Article  Google Scholar 

  15. Stone, S.S., Haldar, J.P., Tsao, S.C., Hwu, W.W., Liang, Z., Sutton, B.P.: Accelerating advanced MRI reconstructions on GPUs. In: ACM Computing Frontier Conference (2008)

    Google Scholar 

  16. openVIDIA, http://openvidia.sourceforge.net/

  17. Volkov, V., Demmel, J.W.: Benchmarking GPUs to tune dense linear algebra. In: SC 2008: Proceedings of the 2008 ACM/IEEE conference on Super-computing, pp. 1–11. IEEE Press, Los Alamitos (2008)

    Google Scholar 

  18. Fatica, M.: Accelerating Linpack with CUDA on heterogenous clusters. In: GPGPU 2009. ACM, New york (2009)

    Google Scholar 

  19. Castillo, M., Chan, E., Igual, F.D., Mayo, R., Quintanaorti, E.S., Quintana-orti, G., Van De Geijn, R., Van Zee, F.G.: Making Programming Synonymous with Programming for Linear Algebra Libraries, FLAME Working Note #31. The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-08-20 (April 17, 2008)

    Google Scholar 

  20. Quintana-Orti, G., Igual, F.D., Quintana-Orti, E.S., van de Geijn, R.: Solving Dense Linear Systems on Platforms with Multiple Hardware Accelerators. In: PPoPP, pp. 121–129 (2009)

    Google Scholar 

  21. decuda, http://www.cs.rug.nl/~wladimir/decuda/

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ma, A., Cai, J., Cheng, Y., Ni, X., Tang, Y., Xing, Z. (2009). Performance Optimization Strategies of High Performance Computing on GPU. In: Dou, Y., Gruber, R., Joller, J.M. (eds) Advanced Parallel Processing Technologies. APPT 2009. Lecture Notes in Computer Science, vol 5737. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03644-6_12

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-03644-6_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03643-9

  • Online ISBN: 978-3-642-03644-6

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics