Part of the book series: Lecture Notes in Computer Science (THIPEAC, volume 6760)

Abstract

Conventional stream architectures focus on exploiting instruction-level parallelism (ILP) and data-level parallelism (DLP) in applications, although the stream model also exposes abundant thread-level parallelism (TLP) at kernel granularity. At the same time, with the development of modern VLSI technology, growing application demands and scalability requirements challenge conventional stream architectures. In this paper, we present a novel Tiled Multi-Core Stream Architecture called TiSA. TiSA introduces the tile, which consists of multiple stream cores, as a new category of architectural resource, and provides an on-chip network to support stream transfer among tiles. In TiSA, multiple levels of parallelism are exploited at different granularities of processing elements. Besides the hardware modules, this paper also discusses other key issues of the TiSA architecture, including the programming model, various execution patterns, and resource allocation. We then evaluate the hardware scalability of TiSA by scaling it from tens to thousands of ALUs and estimating its area and delay cost. We also evaluate the software scalability of TiSA by simulating 6 stream applications and comparing sustained performance against other stream processors, general-purpose processors, and different TiSA configurations. A 256-ALU TiSA with 4 tiles and 4 stream cores per tile is shown to be feasible in 45-nanometer technology, sustaining 100~350 GFLOP/s on most stream benchmarks and providing ~10x speedup over a 16-ALU TiSA with a 5% degradation in area per ALU. The results show that TiSA is a VLSI- and performance-efficient architecture for the billion-transistor era.
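
The configuration figures quoted in the abstract can be put through some simple arithmetic. The sketch below is illustrative only: it assumes the 256 ALUs are divided evenly among the stream cores, and it defines "scaling efficiency" as reported speedup divided by the increase in ALU count. Neither the even split nor that metric is stated in the abstract, and the variable names are hypothetical.

```python
# Figures taken from the abstract.
TILES = 4
CORES_PER_TILE = 4
TOTAL_ALUS = 256
BASELINE_ALUS = 16
REPORTED_SPEEDUP = 10.0  # the abstract says "~10x", so this is approximate

# Derived values. ASSUMPTION: ALUs are spread evenly across stream cores.
cores = TILES * CORES_PER_TILE
alus_per_core = TOTAL_ALUS // cores

# ASSUMPTION: efficiency is measured as speedup relative to the ALU ratio.
alu_ratio = TOTAL_ALUS / BASELINE_ALUS
scaling_efficiency = REPORTED_SPEEDUP / alu_ratio

print(f"stream cores:       {cores}")
print(f"ALUs per core:      {alus_per_core}")
print(f"ALU ratio:          {alu_ratio:.0f}x")
print(f"scaling efficiency: {scaling_efficiency:.1%}")
```

Under these assumptions, a 16-fold increase in ALUs yielding roughly 10x speedup corresponds to a scaling efficiency of a little over 60%. Because the speedup is only given approximately, this is a rough indication, not a measured result.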

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Wu, N. et al. (2011). Tiled Multi-Core Stream Architecture. In: Stenström, P. (eds) Transactions on High-Performance Embedded Architectures and Compilers IV. Lecture Notes in Computer Science, vol 6760. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24568-8_14

  • DOI: https://doi.org/10.1007/978-3-642-24568-8_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24567-1

  • Online ISBN: 978-3-642-24568-8

  • eBook Packages: Computer Science, Computer Science (R0)
