Architectural Considerations for Exascale Supercomputing

  • Conference paper
  • In: Sustained Simulation Performance 2012

Abstract

On the road to exascale supercomputing, both academia and industry have begun investigating future HPC technologies. One of the most difficult challenges is improving the energy efficiency of the overall system. In this paper, we discuss the energy efficiency of existing architectures. Based on an analysis of the performance of dense matrix–matrix multiplication (DGEMM), we propose a DGEMM-specialized Vector-SIMD architecture that requires only a small number of processor cores and low memory bandwidth. This architecture can outperform existing architectures on several metrics, as long as it is dedicated to a limited class of workloads such as DGEMM. We conclude that this type of analysis will be essential in designing future computer architectures.
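The memory-bandwidth argument behind this proposal can be illustrated with a simple roofline-style estimate. The sketch below is only illustrative and is not taken from the paper: the block size and the 1 Tflop/s target are assumed values, used to show that the arithmetic intensity of a blocked DGEMM update grows with the block size, so a design that keeps large blocks on chip needs comparatively little off-chip bandwidth.

    # Illustrative estimate only; the block size and flop rate are assumptions,
    # not the paper's design point.
    def dgemm_intensity(block: int) -> float:
        """Flops per byte of a block x block x block update C += A * B:
        2*b^3 flops over roughly 3*b^2 double-precision (8-byte) elements moved."""
        flops = 2.0 * block ** 3
        bytes_moved = 3 * block ** 2 * 8
        return flops / bytes_moved

    def bandwidth_needed(peak_flops: float, block: int) -> float:
        """Off-chip bandwidth (bytes/s) required to keep peak_flops busy."""
        return peak_flops / dgemm_intensity(block)

    # Example: with 256 x 256 blocks held on chip, sustaining 1 Tflop/s
    # needs less than 50 GB/s of off-chip bandwidth.
    print(bandwidth_needed(1e12, 256) / 1e9, "GB/s")

Because the intensity grows linearly with the block size, enlarging on-chip storage directly lowers the memory bandwidth a DGEMM-specialized design has to provide.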


Notes

  1. Each fused multiply-add operation requires two memory accesses (16 B).
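The figure in this footnote implies that, without on-chip data reuse, the kernel is severely bandwidth-bound: 16 B per fused multiply-add is 8 B per flop. A minimal sketch of that arithmetic follows; the 1 Tflop/s target is only an illustrative assumption.

    # 16 B per FMA (two 8-byte operand accesses) and 2 flops per FMA => 8 B/flop.
    BYTES_PER_FMA = 16
    FLOPS_PER_FMA = 2

    def bandwidth_without_reuse(flops_per_second: float) -> float:
        """Memory bandwidth (bytes/s) needed when every FMA fetches both operands."""
        return flops_per_second / FLOPS_PER_FMA * BYTES_PER_FMA

    # Example: sustaining 1 Tflop/s with no reuse would require 8 TB/s.
    print(bandwidth_without_reuse(1e12) / 1e12, "TB/s")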


Author information

Correspondence to Yasuo Ishii.



Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ishii, Y. (2013). Architectural Considerations for Exascale Supercomputing. In: Resch, M., Wang, X., Bez, W., Focht, E., Kobayashi, H. (eds) Sustained Simulation Performance 2012. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32454-3_2

