Construction and Evaluation of Coordinated Performance Skeletons

Xu, Qiang; Subhlok, Jaspal

doi:10.1007/978-3-540-89894-8_10


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5374)

Included in the following conference series:

International Conference on High-Performance Computing


Abstract

Performance prediction is particularly challenging for dynamic environments that cannot be modeled well, for reasons such as resource sharing and foreign system components. The approach to performance prediction taken in this work is based on the concept of a performance skeleton, which is a short-running program whose execution time in any scenario reflects the estimated execution time of the application it represents. The fundamental technical challenge addressed in this paper is the automatic construction of performance skeletons for parallel MPI programs. The steps in the skeleton construction procedure are 1) generation of process execution traces and conversion to a single coordinated logical program trace, 2) compression of the logical program trace, and 3) conversion to an executable parallel skeleton program. Results are presented to validate the construction methodology and the prediction power of performance skeletons. The execution scenarios analyzed involve network sharing, different architectures, and different MPI libraries. The emphasis is on identifying the strengths and limitations of this approach to performance prediction.
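Step 2 of the procedure, compression of the logical program trace, can be illustrated with a small sketch. The abstract does not describe the compression algorithm itself, so the code below is only an assumption-laden illustration: it greedily folds immediately repeated subsequences of a trace into loop records, and the event names (`send`, `recv`, `compute`, `allreduce`) and the functions `compress_trace` and `expand` are hypothetical, not taken from the paper.

```python
from typing import List, Tuple, Union

# A trace element is either an event name or a loop record (repeat_count, body).
# Hypothetical representation, chosen only for this illustration.
Element = Union[str, Tuple[int, list]]


def compress_trace(trace: List[Element]) -> List[Element]:
    """Greedily fold consecutive repeats of a subsequence into (count, body) loops."""
    out: List[Element] = []
    i, n = 0, len(trace)
    while i < n:
        best_len, best_count = 0, 1
        # Try every candidate loop-body length starting at position i.
        for length in range(1, (n - i) // 2 + 1):
            body = trace[i:i + length]
            count = 1
            while trace[i + count * length:i + (count + 1) * length] == body:
                count += 1
            # Keep the candidate that covers the most events; ties keep the shorter body.
            if count > 1 and count * length > best_len * best_count:
                best_len, best_count = length, count
        if best_count > 1:
            # Compress the loop body recursively so nested loops are folded too.
            out.append((best_count, compress_trace(trace[i:i + best_len])))
            i += best_len * best_count
        else:
            out.append(trace[i])
            i += 1
    return out


def expand(compressed: List[Element]) -> List[str]:
    """Inverse of compress_trace: unroll every loop record back into flat events."""
    flat: List[str] = []
    for element in compressed:
        if isinstance(element, tuple):
            count, body = element
            flat.extend(expand(body) * count)
        else:
            flat.append(element)
    return flat


if __name__ == "__main__":
    trace = ["send", "recv", "compute"] * 3 + ["allreduce"]
    print(compress_trace(trace))
```

In this toy representation, a skeleton generator could shorten execution by emitting each loop with a reduced repeat count, which is one plausible reading of how a compressed trace leads to a short-running program; the paper's actual conversion step may differ.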



Author information

Authors and Affiliations

Department of Computer Science, University of Houston, Houston, TX 77204, USA
Qiang Xu & Jaspal Subhlok


Editor information

Editors and Affiliations

Department of Computer Science and Engineering, The Ohio State University, 2015 Neil Avenue, Columbus, OH 43210, USA
Ponnuswamy Sadayappan
Department of Electrical and Computer Engineering, Rutgers, the State University of New Jersey, 94 Brett Road, Piscataway, NJ 08854, USA
Manish Parashar
Hewlett-Packard ISO, Sy 192, Whitefield Road, Mahadevapura Post, Bangalore 560048, India
Ramamurthy Badrinath
Department of Electrical Engineering, University of Southern California, Los Angeles, CA 90089-2562, USA
Viktor K. Prasanna


About this paper

Cite this paper

Xu, Q., Subhlok, J. (2008). Construction and Evaluation of Coordinated Performance Skeletons. In: Sadayappan, P., Parashar, M., Badrinath, R., Prasanna, V.K. (eds) High Performance Computing - HiPC 2008. HiPC 2008. Lecture Notes in Computer Science, vol 5374. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89894-8_10


DOI: https://doi.org/10.1007/978-3-540-89894-8_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89893-1
Online ISBN: 978-3-540-89894-8
eBook Packages: Computer Science, Computer Science (R0)
