Collective Communication Patterns on the Quadrics Network

  • Chapter
Performance Analysis and Grid Computing

Abstract

The efficient implementation of collective communication is a key factor in providing good performance and scalability for communication patterns that involve global data movement and global control. It is also essential for enhancing the fault tolerance of a parallel computer, for instance to check the status of the nodes, run a distributed load-balancing algorithm, synchronize the local clocks, or monitor performance. Support for multicast communication can therefore improve the performance and resource utilization of a parallel computer. The Quadrics interconnect (QsNET), which is being used in some of the largest machines in the world, provides hardware support for multicast. The basic mechanism allows a message to be sent to any set of contiguous nodes in the same time it takes to send a unicast message. The two main collective communication primitives provided by the network software are barrier synchronization and broadcast. Each is implemented in two ways: with the hardware support when the nodes are contiguous, and with a balanced tree and unicast messaging otherwise. This paper presents performance results for these collective communication services. They show, on the one hand, the outstanding performance of the hardware-based primitives even in the presence of heavy background network traffic and, on the other hand, the limited performance achieved with the software-based implementation.

This work was supported by the Spanish CICYT through contract TIC2000–1151–C07–05.
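
The choice the abstract describes, hardware multicast for a contiguous set of nodes and a balanced tree of unicast messages otherwise, can be illustrated with a small sketch. This is not the Quadrics software: the function names (`is_contiguous`, `tree_children`, `plan_broadcast`), the integer node IDs, the default arity of 2, and the convention that the first node in the list is the root are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def is_contiguous(nodes: Sequence[int]) -> bool:
    """True when the node IDs form one unbroken range, e.g. 4, 5, 6, 7."""
    ordered = sorted(set(nodes))
    return bool(ordered) and ordered[-1] - ordered[0] + 1 == len(ordered)


def tree_children(nodes: Sequence[int], arity: int = 2) -> Dict[int, List[int]]:
    """Map each node to its children in a balanced tree rooted at nodes[0].

    Nodes are laid out level by level (heap order), so the node at list
    position i has its parent at position (i - 1) // arity.
    """
    children: Dict[int, List[int]] = {n: [] for n in nodes}
    for i in range(1, len(nodes)):
        parent = nodes[(i - 1) // arity]
        children[parent].append(nodes[i])
    return children


def tree_depth(count: int, arity: int = 2) -> int:
    """Number of tree levels below the root for `count` nodes in heap order."""
    depth, i = 0, count - 1
    while i > 0:
        i = (i - 1) // arity
        depth += 1
    return depth


def plan_broadcast(nodes: Sequence[int], arity: int = 2) -> dict:
    """Pick the hardware path for contiguous nodes, the tree path otherwise."""
    if is_contiguous(nodes):
        # Assumption: one multicast covers the whole range [min, max].
        return {"method": "hardware", "range": (min(nodes), max(nodes))}
    return {
        "method": "software",
        "children": tree_children(list(nodes), arity),
        "depth": tree_depth(len(nodes), arity),
    }


if __name__ == "__main__":
    print(plan_broadcast([4, 5, 6, 7]))
    print(plan_broadcast([0, 1, 2, 5, 8, 9, 12]))
```

In this sketch a single gap in the node range is enough to force the software path, where each parent forwards the message to its children with unicast sends, so the number of forwarding levels grows with the logarithm of the node count instead of staying constant.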




Copyright information

© 2004 Springer Science+Business Media New York

About this chapter

Cite this chapter

Coll, S., Duato, J., Mora, F.J., Petrini, F., Hoisie, A. (2004). Collective Communication Patterns on the Quadrics Network. In: Getov, V., Gerndt, M., Hoisie, A., Malony, A., Miller, B. (eds) Performance Analysis and Grid Computing. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-0361-3_6

  • DOI: https://doi.org/10.1007/978-1-4615-0361-3_6

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-5038-5

  • Online ISBN: 978-1-4615-0361-3

  • eBook Packages: Springer Book Archive
