Energy-Efficiency Tuning of a Lattice Boltzmann Simulation Using MERIC

  • Conference paper
Parallel Processing and Applied Mathematics (PPAM 2019)

Abstract

Energy-efficiency is already of paramount importance for the operation of High Performance Computing (HPC) systems, and tools to monitor power usage and tune relevant hardware parameters are already available and in use at major supercomputing centres. HPC application developers and users, on the other hand, still usually focus only on performance, even though they will probably soon be required to consider the energy-efficiency of their jobs as well. Only a few software tools can energy-profile a generic application, and even fewer can tune energy-related hardware parameters from within the application itself. In this work we use the MERIC library and the RADAR analyzer, developed within the EU READEX project, to profile a real-life Lattice Boltzmann code and tune its execution parameters for energy-efficiency. The profiling methodology is described in detail, and the results are presented and compared with those measured in a previous work using different methodologies and tools.

Notes

  1. https://github.com/LLNL/libmsr

  2. https://github.com/LLNL/msr-safe

Acknowledgements

This work was done in the framework of the COKA and COSA projects of INFN. We thank Università degli Studi di Ferrara for access to their HPC systems. Enrico Calore was partially funded by “Contributo 5 per mille assegnato all’Università degli Studi di Ferrara - dichiarazione dei redditi dell’anno 2014”.

Author information

Corresponding author

Correspondence to Enrico Calore.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Calore, E., Gabbana, A., Schifano, S.F., Tripiccione, R. (2020). Energy-Efficiency Tuning of a Lattice Boltzmann Simulation Using MERIC. In: Wyrzykowski, R., Deelman, E., Dongarra, J., Karczewski, K. (eds) Parallel Processing and Applied Mathematics. PPAM 2019. Lecture Notes in Computer Science, vol 12044. Springer, Cham. https://doi.org/10.1007/978-3-030-43222-5_15

  • DOI: https://doi.org/10.1007/978-3-030-43222-5_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-43221-8

  • Online ISBN: 978-3-030-43222-5

  • eBook Packages: Computer Science, Computer Science (R0)
