A Normalization Scheme for the Non-symmetric s-Step Lanczos Algorithm

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8286)

Abstract

The Lanczos algorithm is among the most frequently used techniques for computing a few dominant eigenvalues of a large sparse non-symmetric matrix. When variants of this algorithm are implemented on distributed-memory computers, the synchronization time spent computing dot products increasingly limits parallel scalability. The goal of s-step algorithms is to reduce the harmful influence of dot products on parallel performance by grouping several of these operations for joint execution, thereby cutting synchronization time when a large number of processes is used. This paper extends the non-symmetric s-step Lanczos method introduced by Kim and Chronopoulos (J. Comput. Appl. Math., 42(3), 357–374, 1992) with a novel normalization scheme. Compared to the unnormalized algorithm, the normalized variant improves numerical stability and reduces the possibility of breakdowns.
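
The grouping idea behind s-step methods can be illustrated with a small sketch. It is not the algorithm from the paper: it only simulates, in plain Python, how several dot products on distributed vectors can share one global reduction instead of each triggering its own. The names `split`, `ReductionCounter`, `dots_one_by_one`, and `dots_grouped` are hypothetical helpers introduced for this illustration, and the block layout of the vectors is an assumption.

```python
# Illustrative sketch only; not the s-step Lanczos algorithm itself.
# Each hypothetical "process" owns a slice of every vector. A global
# reduction sums per-process partial results and acts as the
# synchronization point.

def split(vec, nprocs):
    """Distribute a vector over nprocs processes (assumed block layout)."""
    size = (len(vec) + nprocs - 1) // nprocs
    return [vec[i * size:(i + 1) * size] for i in range(nprocs)]


class ReductionCounter:
    """Stand-in for a global sum; counts how many synchronizations occur."""

    def __init__(self):
        self.calls = 0

    def allreduce(self, partials):
        # partials: one equally long list of numbers per process
        self.calls += 1
        return [sum(column) for column in zip(*partials)]


def local_dot(xs, ys):
    """Dot product of the locally owned slices; needs no communication."""
    return sum(a * b for a, b in zip(xs, ys))


def dots_one_by_one(pairs, nprocs, comm):
    """Classical pattern: one global reduction per dot product."""
    results = []
    for x, y in pairs:
        partials = [[local_dot(xs, ys)]
                    for xs, ys in zip(split(x, nprocs), split(y, nprocs))]
        results.append(comm.allreduce(partials)[0])
    return results


def dots_grouped(pairs, nprocs, comm):
    """Grouped pattern: pack all local partial sums, then reduce once."""
    partials = [[] for _ in range(nprocs)]
    for x, y in pairs:
        slices = zip(split(x, nprocs), split(y, nprocs))
        for p, (xs, ys) in enumerate(slices):
            partials[p].append(local_dot(xs, ys))
    return comm.allreduce(partials)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    y = [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
    pairs = [(x, y), (x, x), (y, y)]
    seq, grp = ReductionCounter(), ReductionCounter()
    print(dots_one_by_one(pairs, 4, seq), seq.calls)
    print(dots_grouped(pairs, 4, grp), grp.calls)
```

Both functions return the same values, but the grouped variant synchronizes once regardless of how many dot products are requested. In a real distributed code the counter would correspond to a collective sum across processes, whose latency grows with the process count.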

References

  1. Lanczos, C.: An iteration method for the solution of the eigenvalue problem of linear differential and integral operators. J. Res. Nat. Bur. Stand. 45(4), 255–282 (1950)

  2. Ghysels, P., Ashby, T.J., Meerbergen, K., Vanroose, W.: Hiding global communication latency in the GMRES algorithm on massively parallel machines. SIAM J. Sci. Comput. 35(1), C48–C71 (2013)

  3. Ghysels, P., Vanroose, W.: Hiding global synchronization latency in the preconditioned Conjugate Gradient algorithm. In: Parallel Computing (in press, 2013)

  4. Mohiyuddin, M., Hoemmen, M., Demmel, J., Yelick, K.: Minimizing communication in sparse matrix solvers. In: Proc. Conf. High Perf. Comput. Networking, Storage and Analysis, SC 2009, pp. 36:1–36:12. ACM, New York (2009)

  5. Carson, E., Knight, N., Demmel, J.: Avoiding communication in two-sided Krylov subspace methods. SIAM J. Sci. Comput. 35(5), S42–S61 (2013)

  6. Fischer, B., Freund, R.: An inner product-free conjugate gradient-like algorithm for Hermitian positive definite systems. In: Brown, J., et al. (eds.) Proc. Cornelius Lanczos Intern. Centenary Conf., pp. 288–290. SIAM (1994)

  7. Meurant, G.: The conjugate gradient method on supercomputers. Supercomputer 13, 9–17 (1986)

  8. Van Rosendale, J.: Minimizing inner product data dependencies in conjugate gradient iteration. NASA Contractor Report NASA–CR–172178, NASA Langley Research Center, Hampton, VA (1983)

  9. Bücker, H.M., Sauren, M.: A Variant of the Biconjugate Gradient Method Suitable for Massively Parallel Computing. In: Bilardi, G., Ferreira, A., Lüling, R., Rolim, J. (eds.) IRREGULAR 1997. LNCS, vol. 1253, pp. 72–79. Springer, Heidelberg (1997)

  10. Bücker, H.M., Sauren, M.: A Parallel Version of the Quasi-Minimal Residual Method Based on Coupled Two-Term Recurrences. In: Waśniewski, J., Dongarra, J., Madsen, K., Olesen, D. (eds.) PARA 1996. LNCS, vol. 1184, pp. 157–165. Springer, Heidelberg (1996)

  11. Bücker, H.M., Sauren, M.: Reducing global synchronization in the biconjugate gradient method. In: Yang, T. (ed.) Parallel Numerical Computations with Applications, pp. 63–76. Kluwer Academic Publishers, Norwell (1999)

  12. Chronopoulos, A.T.: A Class of Parallel Iterative Methods Implemented on Multiprocessors. Technical report UIUCDCS–R–86–1267, Department of Computer Science, University of Illinois, Urbana, Illinois (1986)

  13. Chronopoulos, A.T., Gear, C.W.: s-step iterative methods for symmetric linear systems. J. Comput. Appl. Math. 25(2), 153–168 (1989)

  14. Chronopoulos, A.T., Swanson, C.D.: Parallel iterative s-step methods for unsymmetric linear systems. Parallel Computing 22(5), 623–641 (1996)

  15. Kim, S.K., Chronopoulos, A.T.: A class of Lanczos-like algorithms implemented on parallel computers. Parallel Computing 17(6-7), 763–778 (1991)

  16. Kim, S.K., Chronopoulos, A.T.: An efficient nonsymmetric Lanczos method on parallel vector computers. J. Comput. Appl. Math. 42(3), 357–374 (1992)

  17. Feuerriegel, S.: Lanczos-based Algorithms for the Parallel Solution of Large Sparse Linear Systems. Master’s thesis, RWTH Aachen University, Aachen (2011)

  18. Kim, S.K.: Efficient biorthogonal Lanczos algorithm on message passing parallel computer. In: Hsu, C.-H., Malyshkin, V. (eds.) MTPP 2010. LNCS, vol. 6083, pp. 293–299. Springer, Heidelberg (2010)

  19. Carson, E., Demmel, J.: A residual replacement strategy for improving the maximum attainable accuracy of s-step Krylov subspace methods. Technical Report UCB/EECS–2012–197, University of California, Berkeley (2012)

  20. Gustafsson, M., Kormann, K., Holmgren, S.: Communication-efficient algorithms for numerical quantum dynamics. In: Jónasson, K. (ed.) PARA 2010, Part II. LNCS, vol. 7134, pp. 368–378. Springer, Heidelberg (2012)

  21. Kim, S.K., Kim, T.H.: A study on the efficient parallel block Lanczos method. In: Zhang, J., He, J.-H., Fu, Y. (eds.) CIS 2004. LNCS, vol. 3314, pp. 231–237. Springer, Heidelberg (2004)

  22. Balay, S., Gropp, W.D., McInnes, L.C., Smith, B.F.: Efficient management of parallelism in object oriented numerical software libraries. In: Arge, E., et al. (eds.) Modern Software Tools in Scientific Computing, pp. 163–202. Birkhäuser Press (1997)

  23. Hernandez, V., Roman, J.E., Vidal, V.: SLEPc: A scalable and flexible toolkit for the solution of eigenvalue problems. ACM Trans. Math. Softw. 31(3), 351–362 (2005)

Copyright information

© 2013 Springer International Publishing Switzerland

About this paper

Cite this paper

Feuerriegel, S., Bücker, H.M. (2013). A Normalization Scheme for the Non-symmetric s-Step Lanczos Algorithm. In: Aversa, R., Kołodziej, J., Zhang, J., Amato, F., Fortino, G. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2013. Lecture Notes in Computer Science, vol 8286. Springer, Cham. https://doi.org/10.1007/978-3-319-03889-6_4

  • DOI: https://doi.org/10.1007/978-3-319-03889-6_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-03888-9

  • Online ISBN: 978-3-319-03889-6

  • eBook Packages: Computer Science, Computer Science (R0)
