Unifying the Analysis of Performance Event Streams at the Consumer Interface Level

Besnard, Jean-Baptiste; Malony, Allen D.; Shende, Sameer; Pérache, Marc; Carribault, Patrick; Jaeger, Julien

doi:10.1007/978-3-030-11987-4_4

Included in the following conference series:

International Workshop on Parallel Tools for High Performance Computing

Abstract

Several instrumentation interfaces have been developed for parallel programs to make the actions that take place during execution observable and to make information about the program’s behavior and performance accessible. Following in the footsteps of the successful profiling interface for MPI (PMPI), new rich interfaces that expose the internal operation of MPI (MPI-T) and OpenMP (OMPT) runtimes are now in the standards. Taking advantage of these interfaces requires tools to selectively collect events from multiple interfaces using various techniques: function interposition (PMPI), value reads (MPI-T), and callbacks (OMPT). In this paper, we present the unified instrumentation pipeline proposed by the MALP infrastructure, which can forward a variety of fine-grained events from multiple interfaces online to multi-threaded analysis processes implemented orthogonally with plugins. In essence, our contribution complements “front-end” instrumentation mechanisms with a generic “back-end” event consumption interface that allows “consumer” callbacks to generate performance measurements in various formats for analysis and transport. With such support, online and post-mortem cases become similar from an analysis point of view, making it possible to build more unified and consistent analysis frameworks. The paper describes the approach and demonstrates its benefits with several use cases.
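The split between front-end collection and back-end consumption can be illustrated with a minimal, self-contained Python sketch. It is not MALP's API and it calls neither MPI nor OpenMP: every name in it (`Event`, `EventPipeline`, `interpose`, `read_variable`, `make_callback`, `ProfileConsumer`, `TraceConsumer`) is hypothetical, and the three front-ends only mimic the collection styles named in the abstract (function interposition, value read, callback). A fake clock stands in for real timestamps.

```python
import itertools
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Event:
    source: str       # collection technique that produced the event
    name: str         # event name, e.g. the wrapped function's name
    timestamp: float  # supplied by the front-end clock
    value: float      # duration, counter value, ...


class EventPipeline:
    """Hypothetical back-end: fans each event out to all consumers."""

    def __init__(self) -> None:
        self._consumers: List[Callable[[Event], None]] = []

    def register(self, consumer: Callable[[Event], None]) -> None:
        self._consumers.append(consumer)

    def emit(self, event: Event) -> None:
        for consumer in self._consumers:
            consumer(event)


def interpose(pipeline, func, clock):
    """Function interposition: wrap func, emit its duration per call."""
    def wrapper(*args, **kwargs):
        start = clock()
        result = func(*args, **kwargs)
        pipeline.emit(Event("interposition", func.__name__, start, clock() - start))
        return result
    return wrapper


def read_variable(pipeline, name, getter, clock):
    """Value read: sample a variable exposed by a runtime."""
    pipeline.emit(Event("value-read", name, clock(), float(getter())))


def make_callback(pipeline, clock):
    """Callback: build a function that a runtime would invoke itself."""
    def on_event(name, value):
        pipeline.emit(Event("callback", name, clock(), float(value)))
    return on_event


class ProfileConsumer:
    """Consumer that sums event values per (source, name)."""

    def __init__(self) -> None:
        self.totals = defaultdict(float)

    def __call__(self, event: Event) -> None:
        self.totals[(event.source, event.name)] += event.value


class TraceConsumer:
    """Consumer that records every event in arrival order."""

    def __init__(self) -> None:
        self.events: List[Event] = []

    def __call__(self, event: Event) -> None:
        self.events.append(event)


def run_demo():
    ticks = itertools.count()
    clock = lambda: float(next(ticks))  # deterministic fake clock

    # "Online" case: consumers receive events as they are produced.
    pipeline = EventPipeline()
    online_profile, trace = ProfileConsumer(), TraceConsumer()
    pipeline.register(online_profile)
    pipeline.register(trace)

    def send(payload):  # stand-in for an instrumented library call
        return len(payload)

    wrapped_send = interpose(pipeline, send, clock)
    wrapped_send(b"abc")
    wrapped_send(b"de")
    read_variable(pipeline, "queue_length", lambda: 4, clock)
    make_callback(pipeline, clock)("parallel_begin", 1.0)

    # "Post-mortem" case: replay the recorded trace into the same
    # kind of consumer through a fresh pipeline.
    offline_pipeline = EventPipeline()
    offline_profile = ProfileConsumer()
    offline_pipeline.register(offline_profile)
    for event in trace.events:
        offline_pipeline.emit(event)

    return online_profile.totals, offline_profile.totals, trace.events
```

In this sketch the profile consumer is unaware of which technique produced an event or whether it arrived live or from a replayed trace, which is the sense in which the online and post-mortem cases look alike to the analysis code.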

Author information

Authors and Affiliations

ParaTools SAS, Arpajon, France
Jean-Baptiste Besnard
ParaTools Inc., Eugene, USA
Allen D. Malony & Sameer Shende
CEA, Arpajon, France
Marc Pérache, Patrick Carribault & Julien Jaeger

Corresponding author

Correspondence to Jean-Baptiste Besnard.

Editor information

Editors and Affiliations

Höchstleistungsrechenzentrum Stuttgart (HLRS), Universität Stuttgart, Stuttgart, Germany
Christoph Niethammer & Michael M. Resch
Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH), Technische Universität Dresden, Dresden, Germany
Wolfgang E. Nagel, Holger Brunst & Hartmut Mix

About this paper

Cite this paper

Besnard, JB., Malony, A.D., Shende, S., Pérache, M., Carribault, P., Jaeger, J. (2019). Unifying the Analysis of Performance Event Streams at the Consumer Interface Level. In: Niethammer, C., Resch, M., Nagel, W., Brunst, H., Mix, H. (eds) Tools for High Performance Computing 2017. PTHPC 2017. Springer, Cham. https://doi.org/10.1007/978-3-030-11987-4_4

DOI: https://doi.org/10.1007/978-3-030-11987-4_4
Published: 15 February 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-11986-7
Online ISBN: 978-3-030-11987-4
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
