
Abstract

The architecture of future high-performance computer systems will respond both to the possibilities offered by technology and to the increasing demand for programmability. Multithreaded processing element architectures are a promising alternative to RISC architecture and its multiple-instruction-issue extensions: VLIW, superscalar, and superpipelined designs.

This paper presents an overview of multithreaded computer architectures and the technical issues affecting their prospective evolution. We introduce the basic concepts of multithreaded computer architecture and describe several architectures representative of the design space of multithreaded parallel computers. We then review design issues for multithreaded processing elements intended to serve as node processors of parallel computers for scientific computing: choosing an appropriate program execution model, organizing the processing element to achieve good utilization of its major resources, supporting fine-grain interprocessor communication and global memory access, compiling machine code for multithreaded processors, and implementing virtual memory in large-scale multiprocessor systems.
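The latency-hiding idea behind multithreaded processing elements can be illustrated with a toy cycle-level model. This is a hypothetical sketch, not from the chapter: a pipeline that switches among ready threads whenever one issues a long-latency global memory access keeps issuing instructions, while a single-threaded pipeline stalls. All parameters (memory latency, thread counts, operation mix) are assumed values chosen only for illustration.

```python
# Hypothetical sketch: a toy cycle-level model of a multithreaded processing
# element that interleaves ready threads to hide memory latency, versus a
# single-threaded pipeline that stalls on every long-latency access.
MEM_LATENCY = 10  # assumed cycles for a global/remote memory access


def run(num_threads, ops_per_thread, mem_period=4):
    """Return pipeline utilization (fraction of cycles an op was issued).

    Each thread issues a long-latency memory access every `mem_period`
    operations; a thread waiting on memory is not ready until the reply
    arrives, but any other ready thread may issue in the meantime.
    """
    remaining = [ops_per_thread] * num_threads
    ready_at = [0] * num_threads   # cycle at which each thread becomes ready
    issued = [0] * num_threads     # ops issued per thread
    cycle = 0
    busy = 0                       # cycles in which some op was issued
    while any(r > 0 for r in remaining):
        # Pick the least-advanced ready thread with work left (round-robin-like).
        ready = [t for t in range(num_threads)
                 if remaining[t] > 0 and ready_at[t] <= cycle]
        if ready:
            t = min(ready, key=lambda k: issued[k])
            remaining[t] -= 1
            issued[t] += 1
            busy += 1
            if issued[t] % mem_period == 0:
                ready_at[t] = cycle + MEM_LATENCY  # thread stalls on memory
        cycle += 1
    return busy / cycle


single = run(num_threads=1, ops_per_thread=64)
multi = run(num_threads=8, ops_per_thread=8)
print(f"1 thread:  {single:.2f} utilization")
print(f"8 threads: {multi:.2f} utilization")
```

With these assumed numbers the eight-thread run keeps the pipeline busy nearly every cycle, while the single-threaded run idles through most of each memory stall — the utilization argument the chapter develops for multithreaded node processors.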




Copyright information

© 1994 Springer Science+Business Media New York


Cite this chapter

Dennis, J.B., Gao, G.R. (1994). Multithreaded Architectures: Principles, Projects, and Issues. In: Iannucci, R.A., Gao, G.R., Halstead, R.H., Smith, B. (eds) Multithreaded Computer Architecture: A Summary of the State of the ART. The Springer International Series in Engineering and Computer Science, vol 281. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-2698-8_1


  • DOI: https://doi.org/10.1007/978-1-4615-2698-8_1

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-6161-9

  • Online ISBN: 978-1-4615-2698-8

