Heterogeneous Programming and Optimization of Gyrokinetic Toroidal Code Using Directives

  • Conference paper
Accelerator Programming Using Directives (WACCPD 2018)

Abstract

The latest production version of the fusion particle simulation code, Gyrokinetic Toroidal Code (GTC), has been ported to and optimized for the next-generation exascale GPU supercomputing platform. Heterogeneous programming using directives has been used to balance the continuously added physics capabilities against rapidly evolving software and hardware systems. The original code has been refactored into a set of unified functions/calls to enable acceleration for all particle species. Extensive GPU optimization has been performed on GTC to boost the performance of the particle push and shift operations. To identify the hotspots, the code was first benchmarked on up to 8000 nodes of the Tianhe-1A supercomputer, showing an overall speedup of about 2–3 times for NVidia M2050 GPUs relative to Intel Xeon X5670 CPUs. This Phase I optimization was followed by further optimizations in Phase II, where single-node tests show an overall speedup of about 34 times on SummitDev and 7.9 times on Titan. Real physics tests on the Summit machine showed good scaling, reaching roughly 50% efficiency on 928 nodes. The GPU + CPU speedup over the CPU-only run is over 20 times, leading to an unprecedented speed.

Notes

  1. The timings for the TITAN CPU w/PETSc case in Table 1 assume ideal scaling in OMP threads from 8 to 16 threads; i.e., the times presented in Table 1 for this case are those of the 8-thread case divided by 2. The motivation is to set a lower bound on the GPU speedup attainable on Titan.

Acknowledgments

The authors would like to thank Eduardo D’Azevedo for his many useful suggestions on the optimizations. This work was supported by the US Department of Energy (DOE) CAAR project, the DOE SciDAC ISEP center, the National MCF Energy R&D Program under Grant Nos. 2018YFE0304100 and 2017YFE0301300, the National Natural Science Foundation of China under Grant No. 11675257, and the External Cooperation Program of Chinese Academy of Sciences under Grant No. 112111KYSB20160039. This research used resources of the Oak Ridge Leadership Computing Facility (OLCF) at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.

Author information

Corresponding author

Correspondence to Zhihong Lin.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, W. et al. (2019). Heterogeneous Programming and Optimization of Gyrokinetic Toroidal Code Using Directives. In: Chandrasekaran, S., Juckeland, G., Wienke, S. (eds) Accelerator Programming Using Directives. WACCPD 2018. Lecture Notes in Computer Science, vol 11381. Springer, Cham. https://doi.org/10.1007/978-3-030-12274-4_1

  • DOI: https://doi.org/10.1007/978-3-030-12274-4_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-12273-7

  • Online ISBN: 978-3-030-12274-4

  • eBook Packages: Computer Science (R0)
