The I/O Complexity of Strassen’s Matrix Multiplication with Recomputation

  • Conference paper
Algorithms and Data Structures (WADS 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10389)

Abstract

A tight \(\varOmega ((n/\sqrt{M})^{\log _2 7}M)\) lower bound is derived on the I/O complexity of Strassen’s algorithm for multiplying two \(n \times n\) matrices in a two-level storage hierarchy with \(M\) words of fast memory. A proof technique is introduced that exploits Grigoriev’s flow of the matrix multiplication function as well as some combinatorial properties of the Strassen computational directed acyclic graph (CDAG). Applications to parallel computation are also developed. The result generalizes a similar bound previously obtained under the no-recomputation constraint, that is, the assumption that intermediate results cannot be computed more than once.
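To get a feel for how the bound in the abstract scales, the following minimal sketch evaluates the expression \((n/\sqrt{M})^{\log _2 7}M\) numerically. It is only an illustration: the function name `strassen_io_bound` is hypothetical and does not come from the paper, and the constant factor hidden by the \(\varOmega\) notation is ignored, so the values are meaningful only relative to one another.

```python
import math

# Exponent appearing in the bound: log2(7), roughly 2.807.
LOG2_7 = math.log2(7)


def strassen_io_bound(n: int, M: int) -> float:
    """Evaluate (n / sqrt(M)) ** log2(7) * M.

    n is the matrix dimension and M the number of words of fast memory.
    The hidden constant of the asymptotic bound is NOT included
    (assumption: we only want to study how the expression scales).
    """
    if n <= 0 or M <= 0:
        raise ValueError("n and M must be positive")
    return (n / math.sqrt(M)) ** LOG2_7 * M


if __name__ == "__main__":
    # Illustrative sizes only; they are not taken from the paper.
    for n in (512, 1024, 2048):
        print(n, strassen_io_bound(n, M=256))
```

Under this expression, doubling \(n\) with \(M\) fixed multiplies the value by 7, while quadrupling \(M\) with \(n\) fixed multiplies it by 4/7, so extra fast memory reduces the bound only slowly.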

This work was supported, in part, by MIUR of Italy under project AMANDA 2012C4E3KT 004 and by the University of Padova under projects CPDA121378/12 and CPDA152255/15.


Author information

Corresponding author

Correspondence to Lorenzo De Stefani.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Bilardi, G., De Stefani, L. (2017). The I/O Complexity of Strassen’s Matrix Multiplication with Recomputation. In: Ellen, F., Kolokolova, A., Sack, JR. (eds) Algorithms and Data Structures. WADS 2017. Lecture Notes in Computer Science, vol 10389. Springer, Cham. https://doi.org/10.1007/978-3-319-62127-2_16

  • DOI: https://doi.org/10.1007/978-3-319-62127-2_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-62126-5

  • Online ISBN: 978-3-319-62127-2

  • eBook Packages: Computer Science, Computer Science (R0)
