
Network-Aware Optimization of MPDATA on Homogeneous Multi-core Clusters with Heterogeneous Network

  • Conference paper
  • First Online:
Algorithms and Architectures for Parallel Processing (ICA3PP 2016)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10049)

Abstract

The communication layer of modern HPC platforms is becoming increasingly heterogeneous and hierarchical. As a result, even on platforms with homogeneous processors, the communication cost of many parallel applications varies significantly depending on the mapping of their processes to the processors of the platform. The optimal mapping, which minimizes the communication cost of the application, depends strongly on the network structure and performance as well as on the logical communication flow of the application. In our previous work, we proposed a general approach and two approximate heuristic algorithms for minimizing the communication cost of data-parallel applications with a two-dimensional symmetric communication pattern on heterogeneous hierarchical networks, and tested these algorithms in the context of a parallel matrix multiplication application. In this paper, we develop a new algorithm, built on one of these heuristic approaches, in the context of a real-life application: MPDATA, one of the major parts of the EULAG geophysical model. We carefully study the communication flow of MPDATA and discover that, even under the assumption of a perfectly homogeneous communication network, the logical communication links of this application have different bandwidths, which makes the optimization of its communication cost particularly challenging. We propose a new algorithm based on the cost functions of one of our general heuristic algorithms and apply it to the optimization of the communication cost of MPDATA, which has an asymmetric heterogeneous communication pattern. We also present experimental results demonstrating the performance gains due to this optimization.
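A minimal, self-contained sketch (not the authors' implementation) of the idea described above: given a two-level network model in which intra-node links are faster than inter-node links, and a logical 2D halo-exchange pattern whose horizontal and vertical volumes differ (as in MPDATA), the communication cost of the application depends on the process-to-processor mapping. All names and parameters (N_NODES, CORES_PER_NODE, the bandwidths, and the halo volumes) are illustrative assumptions, and exhaustive search stands in for the paper's heuristic, cost-function-based algorithms.

import itertools

# Platform model: homogeneous cores, two-level (heterogeneous) network.
N_NODES, CORES_PER_NODE = 2, 2      # four processors in total
INTRA_BW, INTER_BW = 10.0, 1.0      # link bandwidths, arbitrary units

def link_bandwidth(p, q):
    # Processors on the same node communicate over the fast link.
    return INTRA_BW if p // CORES_PER_NODE == q // CORES_PER_NODE else INTER_BW

# Application model: a 2x2 grid of processes exchanging halos.
# Horizontal and vertical halo volumes differ, so even on a homogeneous
# network the logical links would carry different amounts of traffic.
GRID_R, GRID_C = 2, 2
H_VOL, V_VOL = 4.0, 1.0             # halo volumes per step, arbitrary units

def logical_traffic():
    # Yield (process_i, process_j, volume) for neighbouring grid processes.
    for r in range(GRID_R):
        for c in range(GRID_C):
            me = r * GRID_C + c
            if c + 1 < GRID_C:
                yield me, r * GRID_C + (c + 1), H_VOL
            if r + 1 < GRID_R:
                yield me, (r + 1) * GRID_C + c, V_VOL

def mapping_cost(mapping):
    # Total transfer time if logical process i runs on processor mapping[i].
    return sum(vol / link_bandwidth(mapping[i], mapping[j])
               for i, j, vol in logical_traffic())

# Exhaustive search over all mappings; the paper's heuristics address the
# configurations where such enumeration is infeasible.
best = min(itertools.permutations(range(N_NODES * CORES_PER_NODE)),
           key=mapping_cost)
print("best mapping:", best, "cost:", mapping_cost(best))

On this toy instance the search places the heavily communicating horizontal neighbours on the same node; the contribution of the paper is to achieve the analogous effect for MPDATA's asymmetric pattern without enumerating all mappings.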

A. Lastovetsky—This publication has emanated from research conducted with the financial support of Science Foundation Ireland (SFI) under Grant Number 14/IA/2474. This research was conducted with the financial support of NCN under grant no. UMO-2015/17/D/ST6/04059. This work is partially supported by the EU under the COST Program Action IC1305: Network for Sustainable Ultrascale Computing (NESUS). Experiments were carried out on Grid’5000, developed under the INRIA ALADDIN development action with support from CNRS, RENATER and several universities as well as other funding bodies (see https://www.grid5000.fr).



Author information

Correspondence to Tania Malik or Lukasz Szustak.


Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Malik, T., Szustak, L., Wyrzykowski, R., Lastovetsky, A. (2016). Network-Aware Optimization of MPDATA on Homogeneous Multi-core Clusters with Heterogeneous Network. In: Carretero, J., et al. (eds.) Algorithms and Architectures for Parallel Processing. ICA3PP 2016. Lecture Notes in Computer Science, vol. 10049. Springer, Cham. https://doi.org/10.1007/978-3-319-49956-7_3


  • DOI: https://doi.org/10.1007/978-3-319-49956-7_3

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49955-0

  • Online ISBN: 978-3-319-49956-7

  • eBook Packages: Computer Science, Computer Science (R0)
