A Prototype Processing-In-Memory (PIM) Chip for the Data-Intensive Architecture (DIVA) System

Journal of VLSI signal processing systems for signal, image and video technology

Abstract

The Data-Intensive Architecture (DIVA) system employs Processing-In-Memory (PIM) chips as smart-memory coprocessors. This architecture exploits inherent memory bandwidth both on chip and across the system to target several classes of bandwidth-limited applications, including multimedia applications and pointer-based and sparse-matrix computations. The DIVA project has built a prototype development system using PIM chips in place of standard DRAMs to demonstrate these concepts. We have recently ported several demonstration kernels to this platform and have exhibited a speedup of 35X on a matrix transpose operation.

This paper focuses on the 32-bit scalar and 256-bit WideWord integer processing components of the first DIVA prototype PIM chip, which was fabricated in TSMC 0.18 μm technology. In conjunction with other publications, this paper demonstrates that impressive gains can be achieved with very little “smart” logic added to memory devices. A second PIM prototype that includes WideWord floating-point capability is scheduled to tape out in August 2003.
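As a rough illustration of what a 256-bit wide-word integer operation means next to a 32-bit scalar one, the sketch below models a 256-bit value as eight independent 32-bit lanes and adds two such values lane by lane, with no carry crossing lane boundaries. The function names, the lane layout, and the wrap-around behavior are assumptions made for this example only; they are not taken from the DIVA instruction set.

```python
WIDE_BITS = 256  # WideWord width stated in the abstract


def split_lanes(word, lane_bits=32):
    """Split a 256-bit integer into lanes, least-significant lane first."""
    mask = (1 << lane_bits) - 1
    return [(word >> (i * lane_bits)) & mask for i in range(WIDE_BITS // lane_bits)]


def join_lanes(lanes, lane_bits=32):
    """Pack a list of lane values back into one 256-bit integer."""
    mask = (1 << lane_bits) - 1
    word = 0
    for i, lane in enumerate(lanes):
        word |= (lane & mask) << (i * lane_bits)
    return word


def wide_add(a, b, lane_bits=32):
    """Hypothetical lane-wise add: each lane wraps on overflow (assumed),
    and no carry propagates into the neighboring lane."""
    mask = (1 << lane_bits) - 1
    sums = [(x + y) & mask
            for x, y in zip(split_lanes(a, lane_bits), split_lanes(b, lane_bits))]
    return join_lanes(sums, lane_bits)


a = join_lanes([1, 2, 3, 4, 5, 6, 7, 0xFFFFFFFF])
b = join_lanes([10] * 8)
print(split_lanes(wide_add(a, b)))  # [11, 12, 13, 14, 15, 16, 17, 9]
```

In this model a single wide operation covers eight 32-bit elements, which is the kind of data-level parallelism a wide datapath placed next to memory can exploit; the actual operations and lane widths supported by the chip are described in the full paper, not here.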




Author information

Correspondence to Jeffrey Draper.

Additional information

Jeffrey Draper is a Research Assistant Professor in the Department of Electrical Engineering at the University of Southern California. He holds this appointment in conjunction with a Project Leader position at the Information Sciences Institute of the University of Southern California. Dr. Draper’s research group has participated in many DARPA-sponsored large-scale VLSI development efforts. He is a member of the IEEE Computer Society and has conducted research in the areas of processing-in-memory architectures, thermal management, VLSI, interconnection networks, and modeling/performance evaluation. Dr. Draper received a BSEE from Texas A&M University and an MS and PhD from the University of Texas at Austin.

J. Tim Barrett is a Senior Electrical Engineer at the Information Sciences Institute of the University of Southern California. Mr. Barrett has managed, designed, and implemented the hardware, low-level software, and integration of many computer systems. Applications of these systems include scalable supercomputers at USC Information Sciences Institute, the long distance telephone switch at AT&T Bell Labs, building energy management at Barber-Colman Company, and laser entertainment performance instruments at Aura Technologies and Laser Images Inc. He is a member of the IEEE Solid-State Circuits Society and received an MSCS from the University of Illinois Chicago and a BSEE from the University of Iowa.

Jeff Sondeen is a Research Associate at the Information Sciences Institute of the University of Southern California, where he supports and maintains CAD technology files, libraries, and tools for implementing VLSI designs. Previously he has worked at Silicon Compilers and Hewlett-Packard in CAD tool and test chip development. He received an MSEE from the University of Michigan.

Sumit Mediratta is currently pursuing a PhD in Electrical Engineering at the University of Southern California. He received a Bachelor of Engineering degree in Electronics and Telecommunication from the Shri Govind Ram Sekseria Institute of Technology and Science, India. His research interests include interconnection networks, VLSI, processing-in-memory architectures, high-speed data communication and synchronization techniques, and network interfaces for high-performance architectures.

Chang Woo Kang received a BS in electrical engineering from Chung-ang University, Seoul, South Korea, in 1997 and an MS in electrical engineering from the University of Southern California, Los Angeles, in 1999. He is currently pursuing a PhD in electrical engineering at the University of Southern California. His research includes VLSI system design and algorithms for low-power logic synthesis and physical design.

Ihn Kim is a PhD student in the Department of Electrical Engineering at the University of Southern California. He is also a Staff Engineer at QLogic. His research interests include user-level network interfaces, network processor architectures, and modeling/performance evaluation of system area networks. He is a member of the IEEE Computer Society. He received an MS from KAIST (Korea Advanced Institute of Science and Technology).

Gokhan Daglikoca is an Application Engineer at Cadence Design Systems, Inc., where he specializes in High-Performance ASIC and Microprocessor Design Methodologies. He is a member of the IEEE. He received a BS from Istanbul Technical University and an MS from the University of Southern California.

About this article

Cite this article

Draper, J., Barrett, J.T., Sondeen, J. et al. A Prototype Processing-In-Memory (PIM) Chip for the Data-Intensive Architecture (DIVA) System. J VLSI Sign Process Syst Sign Image Video Technol 40, 73–84 (2005). https://doi.org/10.1007/s11265-005-4939-1
