
Abstract

The architecture of future high-performance computer systems will respond both to the possibilities offered by technology and to the increasing demand for programmability. Multithreaded processing element architectures are a promising alternative to RISC architecture and its multiple-instruction-issue extensions: VLIW, superscalar, and superpipelined designs.

This paper presents an overview of multithreaded computer architectures and the technical issues affecting their prospective evolution. We introduce the basic concepts of multithreaded computer architecture and describe several architectures representative of the design space of multithreaded parallel computers. We then review design issues for multithreaded processing elements intended to serve as node processors of parallel computers for scientific computing: choosing an appropriate program execution model, organizing the processing element to achieve good utilization of its major resources, supporting fine-grain interprocessor communication and global memory access, compiling machine code for multithreaded processors, and implementing virtual memory in large-scale multiprocessor systems.
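The latency-hiding idea behind multithreaded processing elements can be illustrated with a toy cycle-level model. This is a hypothetical sketch, not from the chapter: a pipeline that switches among ready threads whenever one issues a long-latency global memory access keeps issuing instructions, while a single-threaded pipeline stalls. All parameters (memory latency, thread counts, operation mix) are assumed values chosen only for illustration.

```python
# Hypothetical sketch: a toy cycle-level model of a multithreaded processing
# element that interleaves ready threads to hide memory latency, versus a
# single-threaded pipeline that stalls on every long-latency access.
MEM_LATENCY = 10  # assumed cycles for a global/remote memory access


def run(num_threads, ops_per_thread, mem_period=4):
    """Return pipeline utilization (fraction of cycles an op was issued).

    Each thread issues a long-latency memory access every `mem_period`
    operations; a thread waiting on memory is not ready until the reply
    arrives, but any other ready thread may issue in the meantime.
    """
    remaining = [ops_per_thread] * num_threads
    ready_at = [0] * num_threads   # cycle at which each thread becomes ready
    issued = [0] * num_threads     # ops issued per thread
    cycle = 0
    busy = 0                       # cycles in which some op was issued
    while any(r > 0 for r in remaining):
        # Pick the least-advanced ready thread with work left (round-robin-like).
        ready = [t for t in range(num_threads)
                 if remaining[t] > 0 and ready_at[t] <= cycle]
        if ready:
            t = min(ready, key=lambda k: issued[k])
            remaining[t] -= 1
            issued[t] += 1
            busy += 1
            if issued[t] % mem_period == 0:
                ready_at[t] = cycle + MEM_LATENCY  # thread stalls on memory
        cycle += 1
    return busy / cycle


single = run(num_threads=1, ops_per_thread=64)
multi = run(num_threads=8, ops_per_thread=8)
print(f"1 thread:  {single:.2f} utilization")
print(f"8 threads: {multi:.2f} utilization")
```

With these assumed numbers the eight-thread run keeps the pipeline busy nearly every cycle, while the single-threaded run idles through most of each memory stall — the utilization argument the chapter develops for multithreaded node processors.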




Copyright information

© 1994 Springer Science+Business Media New York


Cite this chapter

Dennis, J.B., Gao, G.R. (1994). Multithreaded Architectures: Principles, Projects, and Issues. In: Iannucci, R.A., Gao, G.R., Halstead, R.H., Smith, B. (eds) Multithreaded Computer Architecture: A Summary of the State of the ART. The Springer International Series in Engineering and Computer Science, vol 281. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-2698-8_1


  • DOI: https://doi.org/10.1007/978-1-4615-2698-8_1

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-6161-9

  • Online ISBN: 978-1-4615-2698-8

