Energy-Aware On-Chip Networks

  • Chapter
Energy-Aware System Design

Abstract

As technology continues to evolve, communication is becoming the bottleneck of future systems because it significantly affects overall performance and cost. The energy consumed by communication will be critical to achieving a scalable many-core processor. In this chapter, we present energy-aware on-chip network architectures that aim for ideal on-chip network behavior, in which the latency and energy of transmitting data from source to destination approach those of the wires alone. We present techniques spanning topology, flow control, and router microarchitecture that work toward this ideal by minimizing the energy and latency overhead of intermediate routers as packets traverse the network. We also describe an alternative approach, bufferless on-chip networks, which minimizes the number of network buffers to reduce energy consumption.

Notes

  1. The latency calculations were based on Intel TeraFlop [18] parameters (t_r = 1.25 ns, b = 16 GB/s, L = 320 bits) and an estimated value of wire delay for 65 nm (t_w = 250 ps/mm).

  2. Some amount of storage is still needed; for example, pipeline registers are still needed and, if the router microarchitecture requires multiple cycles, additional internal storage would be required.

References

  1. Intel: From a few cores to many: a tera-scale computing research overview (2006)

  2. Kongetira, P., Aingaran, K., Olukotun, K.: Niagara: a 32-way multithreaded Sparc processor. IEEE MICRO 25, 21–29 (2005). 10.1109/MM.2005.35. http://portal.acm.org/citation.cfm?id=1069597.1069758

  3. Nickolls, J., Dally, W.J.: The GPU computing era. IEEE MICRO 30, 56–69 (2010). http://doi.ieeecomputersociety.org/10.1109/MM.2010.41

  4. Owens, J.D., Dally, W.J., Ho, R., Jayasimha, D.N., Keckler, S.W., Peh, L.S.: Research challenges for on-chip interconnection networks. IEEE MICRO 27(5), 96–108 (2007)

  5. Dally, W.J., Towles, B.: Principles and Practices of Interconnection Networks. Morgan Kaufmann, San Francisco (2004)

  6. Kumar, A., Peh, L.S., Kundu, P., Jha, N.K.: Express virtual channels: towards the ideal interconnection fabric. In: Proc. of the International Symposium on Computer Architecture, ISCA, San Diego, CA (2007)

  7. Dally, W.J., Towles, B.: Route packets, not wires: on-chip interconnection networks. In: Proc. of the 38th Conference on Design Automation, DAC, pp. 684–689 (2001)

  8. Kumar, A., Kundu, P., Singh, A., Peh, L.S., Jha, N.: A 4.6 Tbits/s 3.6 GHz single-cycle NOC router with a novel switch allocator in 65 nm CMOS. In: International Conference on Computer Design, ICCD (2007). http://www.gigascale.org/pubs/1218.html

  9. Nicopoulos, C.A., Park, D., Kim, J., Vijaykrishnan, N., Yousif, M.S., Das, C.R.: Vichar: a dynamic virtual channel regulator for network-on-chip routers. In: Annual IEEE/ACM International Symposium on Microarchitecture, MICRO, Orlando, FL (2006)

  10. Hoskote, Y., Vangal, S., Singh, A., Borkar, N., Borkar, S.: A 5-GHz mesh interconnect for a teraflops processor. IEEE MICRO 27(5), 51–61 (2007)

  11. Kodi, A.K., Sarathy, A., Louri, A.: iDEAL: inter-router dual-function energy and area-efficient links for network-on-chip. In: Proc. of the International Symposium on Computer Architecture, ISCA, Beijing, China (2008)

  12. Kim, J., Balfour, J., Dally, W.J.: Flattened butterfly for on-chip networks. In: Proc. of the 40th Annual IEEE International Symposium on Microarchitecture, MICRO, Chicago, IL, Dec. 2007, pp. 172–182 (2007)

  13. Kim, J.: Low-cost router microarchitecture for on-chip networks. In: Proc. of the 42nd Annual IEEE International Symposium on Microarchitecture, MICRO 42, New York, NY, Dec. 2009, pp. 255–266 (2009)

  14. Hayenga, M., Jerger, N.E., Lipasti, M.: Scarab: a single cycle adaptive routing and bufferless network. In: Proc. of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 42, pp. 244–254 (2009)

  15. Moscibroda, T., Mutlu, O.: A case for bufferless routing in on-chip networks. In: Proc. of the 36th Annual International Symposium on Computer Architecture, pp. 196–207 (2009)

  16. Kim, J., Dally, W.J., Towles, B., Gupta, A.K.: Microarchitecture of a high-radix router. In: Proc. of the 32nd IEEE International Symposium on Computer Architecture, ISCA, Madison, WI, pp. 420–431 (2005)

  17. Wang, H., Peh, L.S., Malik, S.: Power-driven design of router microarchitectures in on-chip networks. In: Proc. of the 36th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 105–116 (2003)

  18. Vangal, S., Howard, J., Ruhl, G., Dighe, S., Wilson, H., Tschanz, J., Finan, D., Singh, A., Jacob, T., Jain, S., Erraguntla, V., Roberts, C., Hoskote, Y., Borkar, N., Borkar, S.: An 80-tile sub-100-W TeraFLOPS processor in 65-nm CMOS. IEEE J. Solid-State Circuits 43(1), 29–41 (2008)

  19. Balfour, J., Dally, W.J.: Design tradeoffs for tiled CMP on-chip networks. In: ICS’06: Proc. of the 20th Annual International Conference on Supercomputing, pp. 187–198 (2006)

  20. Das, R., Eachempati, S., Mishra, A.K., Vijaykrishnan, N., Das, C.R.: Design and evaluation of a hierarchical on-chip interconnect for next-generation CMPs. In: International Symposium on High-Performance Computer Architecture, HPCA, Raleigh, NC, pp. 175–186 (2009)

  21. Grot, B., Hestness, J., Keckler, S.W., Mutlu, O.: Express cube topologies for on-chip interconnects. In: International Symposium on High-Performance Computer Architecture, HPCA, Raleigh, NC, pp. 163–174 (2009)

  22. Bhuyan, L.N., Agrawal, D.P.: Generalized hypercube and hyperbus structures for a computer network. IEEE Trans. Comput. 33(4), 323–333 (1984)

  23. Dally, W.J.: Express cubes: improving the performance of k-ary n-cube interconnection networks. IEEE Trans. Comput. 40, 1016–1023 (1991)

  24. Kumar, P., Pan, Y., Kim, J., Memik, G., Choudhary, A.N.: Exploring concentration and channel slicing in on-chip network router. In: International Symposium on Networks-on-Chips, NOCs, La Jolla, CA, pp. 276–285 (2009)

  25. Cho, S., Jin, L.: Managing distributed, shared L2 caches through OS-level page allocation. In: Annual IEEE/ACM International Symposium on Microarchitecture, MICRO, Orlando, FL, pp. 455–468 (2006)

  26. Kim, J., Dally, W.J., Abts, D.: Flattened butterfly: a cost-efficient topology for high-radix networks. In: Proc. of the International Symposium on Computer Architecture, ISCA, San Diego, CA (2007)

  27. Dally, W.J.: Virtual-channel flow control. IEEE Trans. Parallel Distrib. Syst. 3(2), 194–205 (1992)

  28. Seo, D., Ali, A., Lim, W.T., Rafique, N., Thottethodi, M.: Near-optimal worst-case throughput routing for two-dimensional mesh networks. In: Proc. of the International Symposium on Computer Architecture, ISCA, pp. 432–443 (2005)

  29. Valiant, L.G.: A scheme for fast parallel communication. SIAM J. Comput. 11(2), 350–361 (1982)

  30. Singh, A.: Load-balanced routing in interconnection networks. Ph.D. thesis, Stanford University, Palo Alto, CA (2005)

  31. Chen, C.H.O., Agarwal, N., Krishna, T., Koo, K.H., Peh, L.S., Saraswat, K.C.: Physical vs. virtual express topologies with low-swing links for future many-core NoCs. In: Proc. of the 2010 4th ACM/IEEE International Symposium on Networks-on-Chip, NOCS’10, pp. 173–180. IEEE Comput. Soc., Washington (2010)

  32. Kessler, R., Schwarzmeier, J.: Cray T3D: a new dimension for Cray research. Compcon Spring’93, Digest of Papers, pp. 176–182 (22–26 Feb. 1993)

  33. Kumar, A., Peh, L.S., Jha, N.K.: Token flow control. In: Annual IEEE/ACM International Symposium on Microarchitecture, MICRO, Lake Como, Italy, pp. 342–353 (2008)

Author information

Corresponding author

Correspondence to John Kim.

Copyright information

© 2011 Springer Science+Business Media B.V.

About this chapter

Cite this chapter

Kim, J. (2011). Energy-Aware On-Chip Networks. In: Kyung, CM., Yoo, S. (eds) Energy-Aware System Design. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-1679-7_5
