A Tunable Implementation of Quality-of-Service Classes for HPC Networks

Brown, Kevin A.; McGlohon, Neil; Chunduri, Sudheer; Borch, Eric; Ross, Robert B.; Carothers, Christopher D.; Harms, Kevin

doi:10.1007/978-3-030-78713-4_8

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12728)

Included in the following conference series:

International Conference on High Performance Computing

Abstract

High-performance computing (HPC) networks are often shared by traffic from multiple applications with varying communication characteristics and resource requirements. These applications contend for shared network buffers and channels, which can cause significant performance variation and slow down critical communication operations such as low-latency MPI collectives. To ensure predictable communication performance, network resources must be allocated according to the communication requirements of applications.

Quality of Service (QoS) solutions can regulate the allocation of resources by defining traffic classes with specified resource allocations and assigning applications to these classes, thus improving application performance predictability. However, it is difficult to accomplish facility-level goals of ensuring efficient application communication when constrained to a limited number of classes.

We propose a practical QoS implementation for large-scale, low-diameter networks, such as the dragonfly topology, using flexible bandwidth shaping along with traffic prioritization to reduce the impact of interference on communication performance. Our design gives facilities more control over tuning QoS classes to meet application- and site-specific performance guarantees. The results show that our solution effectively eliminates the slowdown of high-priority traffic due to interference with lower-priority traffic, significantly reducing run-to-run variability. We also demonstrate how port counters can be used to detect when a job-to-class assignment is inappropriate for a given system and when a workload is exceeding the bandwidth limits of its class.
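
To make the combination of bandwidth shaping and traffic prioritization concrete, the following is a minimal Python sketch. It is not the authors' implementation: it assumes one token bucket per class and strict-priority service on a single link, and the names `TrafficClass`, `step`, and `over_limit`, as well as all rates and sizes, are hypothetical. The `over_limit` counter loosely mirrors the idea of using counters to notice a workload pressing against its class's bandwidth limit.

```python
from __future__ import annotations

from collections import deque
from dataclasses import dataclass, field


@dataclass
class TrafficClass:
    """Hypothetical QoS class: a token bucket plus a FIFO of packet sizes."""
    name: str
    priority: int                 # lower value is served first
    rate: float                   # bytes of credit added per time step
    burst: float                  # maximum credit the bucket can hold
    tokens: float = 0.0
    queue: deque = field(default_factory=deque)  # pending packet sizes (bytes)
    over_limit: int = 0           # steps in which a packet waited for credit

    def refill(self) -> None:
        # Add one step of credit, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + self.rate)


def step(classes: list[TrafficClass], link_capacity: float) -> dict[str, float]:
    """Advance one time step and return the bytes sent per class."""
    sent = {c.name: 0.0 for c in classes}
    budget = link_capacity
    for c in classes:
        c.refill()
    # Strict priority: higher-priority classes use the link first,
    # but each class is still capped by its own token bucket.
    for c in sorted(classes, key=lambda tc: tc.priority):
        while c.queue and c.queue[0] <= c.tokens and c.queue[0] <= budget:
            size = c.queue.popleft()
            c.tokens -= size
            budget -= size
            sent[c.name] += size
        # Count steps where the class, not the link, was the bottleneck.
        if c.queue and c.queue[0] > c.tokens:
            c.over_limit += 1
    return sent


if __name__ == "__main__":
    high = TrafficClass("high", priority=0, rate=60, burst=60)
    low = TrafficClass("low", priority=1, rate=40, burst=40)
    high.queue.extend([10] * 10)   # ten 10-byte packets per class
    low.queue.extend([10] * 10)
    print(step([high, low], link_capacity=100))  # -> {'high': 60.0, 'low': 40.0}
```

In this toy setup both classes offer more traffic than their shares allow, so each is held to its configured rate and each `over_limit` counter is incremented once. A real switch would enforce such limits in hardware with its own counters.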

Notes

1. Switch hardware can support only a limited number of active classes because of resource limitations.
2. The number of traffic classes that can be configured on a given switch is limited by how many class buffers and rate-limiting counters that switch's hardware supports.

Acknowledgement

This work was supported by the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357, and by the Exascale Computing Project – learn more at https://www.exascaleproject.org/. We also gratefully acknowledge the computing resources provided and operated by the Joint Laboratory for System Evaluation (JLSE) at Argonne National Laboratory.

Author information

Authors and Affiliations

Argonne National Laboratory, Lemont, USA
Kevin A. Brown, Sudheer Chunduri, Robert B. Ross & Kevin Harms
Rensselaer Polytechnic Institute, Troy, USA
Neil McGlohon & Christopher D. Carothers
Hewlett Packard Enterprise, Houston, USA
Eric Borch

Corresponding author

Correspondence to Kevin A. Brown.

Editor information

Editors and Affiliations

Hewlett Packard Enterprise, Seattle, WA, USA
Bradford L. Chamberlain
University of Amsterdam, Amsterdam, The Netherlands
Ana-Lucia Varbanescu
Extreme Computing Research Center, Thuwal Jeddah, Saudi Arabia
Hatem Ltaief
The University of Tennessee, Knoxville, Knoxville, TN, USA
Piotr Luszczek

About this paper

Cite this paper

Brown, K.A. et al. (2021). A Tunable Implementation of Quality-of-Service Classes for HPC Networks. In: Chamberlain, B.L., Varbanescu, AL., Ltaief, H., Luszczek, P. (eds) High Performance Computing. ISC High Performance 2021. Lecture Notes in Computer Science, vol 12728. Springer, Cham. https://doi.org/10.1007/978-3-030-78713-4_8

DOI: https://doi.org/10.1007/978-3-030-78713-4_8
Published: 17 June 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-78712-7
Online ISBN: 978-3-030-78713-4
eBook Packages: Computer Science, Computer Science (R0)
