LLM: Realizing Low-Latency Memory by Exploiting Embedded Silicon Photonics for Irregular Workloads

  • Conference paper
High Performance Computing (ISC High Performance 2022)

Abstract

As emerging workloads exhibit irregular memory access patterns with poor data reuse and locality, they would benefit from a DRAM that achieves low latency without sacrificing bandwidth or energy efficiency. We propose LLM (Low Latency Memory), a co-design of the DRAM microarchitecture, the memory controller, and the LLC/DRAM interconnect that leverages embedded silicon photonics in 2.5D/3D integrated systems on chip. LLM relies on Wavelength Division Multiplexing (WDM)-based photonic interconnects to reduce contention throughout the memory subsystem. LLM also increases bank-level parallelism, eliminates bus conflicts by using dedicated optical data paths, and reduces the access energy per bit with shorter global bitlines and smaller row buffers. We evaluate the design space of LLM for a variety of synthetic benchmarks and representative graph workloads on a full-system simulator (gem5). LLM exhibits low memory access latency for traffic with both regular and irregular access patterns. For irregular traffic, LLM achieves high bandwidth utilization (over 80% of peak throughput, compared to 20% for HBM2.0). For real workloads, LLM achieves 3\(\times\) and 1.8\(\times\) lower execution time compared to HBM2.0 and a state-of-the-art memory system with high memory-level parallelism, respectively. This study also demonstrates that, by reducing queuing on the data path, LLM achieves on average 3.4\(\times\) lower memory latency variation than HBM2.0.
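To give some intuition for why bank-level parallelism matters under irregular traffic, the following is a minimal toy sketch, not the paper's gem5 model and not a model of the photonic interconnect. It assumes an in-order request stream with uniformly random bank targets, at most one request issued per cycle, and a fixed bank occupancy per access. The function name `simulate` and all parameter values are hypothetical, chosen only for illustration.

```python
import random


def simulate(num_banks, bank_busy_cycles, num_requests, seed=0):
    """Toy model: achieved requests per cycle for uniformly random bank targets.

    Assumptions (illustrative only, not taken from the paper):
      - requests are issued strictly in order, at most one per cycle;
      - each access occupies its bank for `bank_busy_cycles` cycles;
      - a request whose bank is busy stalls every request behind it.
    """
    rng = random.Random(seed)
    bank_free_at = [0] * num_banks  # cycle at which each bank can accept an access
    next_issue = 0                  # earliest cycle at which the next request may issue
    for _ in range(num_requests):
        bank = rng.randrange(num_banks)
        issue = max(next_issue, bank_free_at[bank])  # stall until the bank is free
        bank_free_at[bank] = issue + bank_busy_cycles
        next_issue = issue + 1
    finish = max(bank_free_at)      # cycle at which the last access completes
    return num_requests / finish


if __name__ == "__main__":
    # Hypothetical parameters: 40-cycle bank occupancy, 10,000 random requests.
    for banks in (8, 32, 128, 256):
        print(banks, round(simulate(banks, 40, 10_000), 3))
```

In this toy setting, throughput with few banks is capped by bank occupancy (at most `num_banks / bank_busy_cycles` requests per cycle), while a larger number of independent banks leaves random accesses less likely to collide. The real design additionally removes shared-bus conflicts, which this sketch does not model.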

This work was supported in part by ARO award W911NF1910470.

Author information

Corresponding author

Correspondence to Marjan Fariborz.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Fariborz, M. et al. (2022). LLM: Realizing Low-Latency Memory by Exploiting Embedded Silicon Photonics for Irregular Workloads. In: Varbanescu, AL., Bhatele, A., Luszczek, P., Marc, B. (eds) High Performance Computing. ISC High Performance 2022. Lecture Notes in Computer Science, vol 13289. Springer, Cham. https://doi.org/10.1007/978-3-031-07312-0_3

  • DOI: https://doi.org/10.1007/978-3-031-07312-0_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-07311-3

  • Online ISBN: 978-3-031-07312-0

  • eBook Packages: Computer Science, Computer Science (R0)