Abstract
We present recent progress in using recursion as a general technique for producing dense linear algebra library software for today’s memory-tiered computer systems. To utilize a memory hierarchy efficiently, our approach applies hierarchical blocking. Its success rests on novel recursive blocked algorithms, hybrid data formats, and superscalar kernels.
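As a rough illustration of how recursion can yield hierarchical blocking, the sketch below multiplies two matrices by repeatedly halving the largest dimension until the subproblem is small. It is not the algorithm from the paper: the names `recursive_gemm`, `matmul`, and the `leaf` parameter are hypothetical, and the plain triple loop at the leaf only stands in for the kind of tuned superscalar kernel the abstract mentions.

```python
def recursive_gemm(A, B, C, i0, i1, j0, j1, k0, k1, leaf=2):
    """Accumulate C[i0:i1][j0:j1] += A[i0:i1][k0:k1] * B[k0:k1][j0:j1].

    Matrices are lists of lists. `leaf` (assumed >= 1) is the size at
    which recursion stops and a direct kernel takes over.
    """
    m, n, k = i1 - i0, j1 - j0, k1 - k0
    if m == 0 or n == 0 or k == 0:
        return
    if max(m, n, k) <= leaf:
        # Leaf kernel: a plain triple loop standing in for a tuned kernel.
        for i in range(i0, i1):
            for j in range(j0, j1):
                s = C[i][j]
                for p in range(k0, k1):
                    s += A[i][p] * B[p][j]
                C[i][j] = s
        return
    # Split the largest dimension in half. Each level of recursion
    # produces smaller blocks, so some level fits each memory tier.
    if m >= n and m >= k:
        mid = i0 + m // 2
        recursive_gemm(A, B, C, i0, mid, j0, j1, k0, k1, leaf)
        recursive_gemm(A, B, C, mid, i1, j0, j1, k0, k1, leaf)
    elif n >= k:
        mid = j0 + n // 2
        recursive_gemm(A, B, C, i0, i1, j0, mid, k0, k1, leaf)
        recursive_gemm(A, B, C, i0, i1, mid, j1, k0, k1, leaf)
    else:
        mid = k0 + k // 2
        recursive_gemm(A, B, C, i0, i1, j0, j1, k0, mid, leaf)
        recursive_gemm(A, B, C, i0, i1, j0, j1, mid, k1, leaf)


def matmul(A, B, leaf=2):
    """Return A * B as a new list of lists using recursive_gemm."""
    m, k, n = len(A), len(B), len(B[0])
    C = [[0] * n for _ in range(m)]
    recursive_gemm(A, B, C, 0, m, 0, n, 0, k, leaf)
    return C
```

Because the block sizes shrink geometrically, no block size has to be tuned per cache level in this sketch; only the leaf size is a parameter. A production library would also pair the recursion with a blocked data layout, which this example does not attempt.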
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kågström, B. (2006). Management of Deep Memory Hierarchies – Recursive Blocked Algorithms and Hybrid Data Structures for Dense Matrix Computations. In: Dongarra, J., Madsen, K., Waśniewski, J. (eds) Applied Parallel Computing. State of the Art in Scientific Computing. PARA 2004. Lecture Notes in Computer Science, vol 3732. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11558958_3
DOI: https://doi.org/10.1007/11558958_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29067-4
Online ISBN: 978-3-540-33498-9
eBook Packages: Computer Science, Computer Science (R0)