Memory Bandwidth Conservation for SpMV Kernels Through Adaptive Lossy Data Compression

  • Conference paper
Parallel and Distributed Computing, Applications and Technologies (PDCAT 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13798)

Abstract

Sparse matrix-vector multiplication (SpMV) is a common kernel in linear algebra and is now widely used in machine learning applications. In particular, fully-connected MLP layers account for many of the SpMV tasks behind diverse services, so a large fraction of data center cycles is spent on them. Even with sparse matrix storage formats such as CSR/CSC, SpMV remains limited by memory bandwidth during data transfer because of the architecture of modern computing systems. We find that both the integer and the floating-point data used in matrix-vector multiplication are handled as-is, without any pre-processing. We add compression and decompression steps between main memory and the Last Level Cache (LLC), which may substantially reduce memory bandwidth consumption. We also observe that convergence speed in some typical scientific computing benchmarks is not degraded when compressed floating-point data replaces the original double-precision data. Based on these findings, we propose a simple yet effective compression approach that can be implemented in general computing architectures and, preferably, in HPC systems. With this technique, performance improves by up to 1.92x in the best case.
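
To make the idea of lossy value compression in a CSR SpMV concrete, the following is a minimal software sketch, not the scheme proposed in the paper (the abstract does not specify it). It assumes a simple fixed truncation that keeps only the upper 32 bits of each IEEE 754 double; the helper names `compress_double`, `decompress_double`, and `spmv_csr`, and the 3x3 example matrix, are hypothetical and chosen only for illustration. The paper places compression between main memory and the LLC, whereas this sketch merely emulates the numerical effect in plain Python.

```python
import struct

def compress_double(x: float) -> int:
    """Assumed scheme: keep the upper 32 bits of a double
    (sign, 11 exponent bits, top 20 mantissa bits)."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return bits >> 32

def decompress_double(hi: int) -> float:
    """Rebuild a double by zero-filling the 32 discarded low mantissa bits."""
    return struct.unpack("<d", struct.pack("<Q", hi << 32))[0]

def spmv_csr(row_ptr, col_idx, values, x, decode=lambda v: v):
    """Compute y = A @ x for a CSR matrix; `decode` is applied to each stored value."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += decode(values[k]) * x[col_idx[k]]
        y[i] = acc
    return y

# Hypothetical 3x3 matrix [[4, 0, 1], [0, 0.1, 0], [2.5, 0, 3]] in CSR form.
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 0.1, 2.5, 3.0]
x = [1.0, 2.0, 3.0]

# Each packed value fits in 4 bytes instead of 8, halving value traffic.
packed = [compress_double(v) for v in values]

y_exact = spmv_csr(row_ptr, col_idx, values, x)
y_lossy = spmv_csr(row_ptr, col_idx, packed, x, decode=decompress_double)

# Largest relative error introduced by the truncation across output rows.
max_rel_err = max(abs(a - b) / abs(a) for a, b in zip(y_exact, y_lossy))
```

Under this assumed truncation, each stored value carries a relative error below 2^-20, and values that need no more than 20 mantissa bits (such as 4.0 or 2.5) round-trip exactly. Whether an iterative solver tolerates such an error depends on the problem, which is the kind of question the convergence observation in the abstract addresses.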



Acknowledgment

First and foremost, we would like to sincerely thank the anonymous reviewers for their valuable comments. This work was supported, in part, by JST CREST Grant Number JPMJCR18K1, Japan.

Author information

Corresponding author

Correspondence to Siyi Hu.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Hu, S., Ito, M., Yoshikawa, T., He, Y., Kondo, M. (2023). Memory Bandwidth Conservation for SpMV Kernels Through Adaptive Lossy Data Compression. In: Takizawa, H., Shen, H., Hanawa, T., Hyuk Park, J., Tian, H., Egawa, R. (eds) Parallel and Distributed Computing, Applications and Technologies. PDCAT 2022. Lecture Notes in Computer Science, vol 13798. Springer, Cham. https://doi.org/10.1007/978-3-031-29927-8_36

  • DOI: https://doi.org/10.1007/978-3-031-29927-8_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-29926-1

  • Online ISBN: 978-3-031-29927-8

  • eBook Packages: Computer Science, Computer Science (R0)
