A Cache-Optimal Alternative to the Unidirectional Hierarchization Algorithm

Hupp, Philipp; Jacob, Riko

doi:10.1007/978-3-319-28262-6_5

Philipp Hupp⁹ &
Riko Jacob¹⁰

Part of the book series: Lecture Notes in Computational Science and Engineering ((LNCSE,volume 109))

931 Accesses
1 Citations

Abstract

The sparse grid combination technique provides a framework to solve high-dimensional numerical problems with standard solvers by assembling a sparse grid from many coarse and anisotropic full grids called component grids. Hierarchization is one of the most fundamental tasks for sparse grids. It describes the transformation from the nodal basis to the hierarchical basis. In settings where the component grids have to be frequently combined and distributed in a massively parallel compute environment, hierarchization on component grids is relevant to minimize communication overhead.

We present a cache-oblivious hierarchization algorithm for component grids of the combination technique. It causes \(\left \vert \mathbf{G}_{\boldsymbol{\ell}}\right \vert \cdot \left ( \frac{1} {B} + \mathcal{O}\left ( \frac{1} {\root{d}\of{M}}\right )\right )\) cache misses under the tall cache assumption \(M =\omega \left (B^{d}\right )\). Here, \(\mathbf{G}_{\boldsymbol{\ell}}\) denotes the component grid, d the dimension, M the size of the cache and B the cache line size. This algorithm decreases the leading term of the cache misses by a factor of d compared to the unidirectional algorithm which is the common standard up to now. The new algorithm is also optimal in the sense that the leading term of the cache misses is reduced to scanning complexity, i.e., every degree of freedom has to be touched once. We also present a variant of the algorithm that causes \(\left \vert \mathbf{G}_{\boldsymbol{\ell}}\right \vert \cdot \left ( \frac{2} {B} + \mathcal{O}\left ( \frac{1} {\root{d-1}\of{M\cdot B^{d-2}}} \right )\right )\) cache misses under the assumption \(M =\omega \left (B\right )\). The new algorithms have been implemented and outperform previously existing software. In several cases the measured performance is close to the best possible.

The dimension d is assumed to be constant in the \(\mathcal{O}\)-notation.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 84.99; Price excludes VAT (USA)

Softcover Book: USD 109.99; Price excludes VAT (USA)

Hardcover Book: USD 109.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

1.
We use the terms internal memory and cache as well as cache line size and block size synonymously.

References

A. Aggarwal, J.S. Vitter, The input/output complexity of sorting and related problems. Commun. ACM 31(9), 1116–1127 (1988)
Article MathSciNet Google Scholar
G. Ballard, J. Demmel, O. Holtz, O. Schwartz, Minimizing communication in numerical linear algebra. SIAM J. Matrix Anal. Appl. 32(3), 866–901 (2011)
Article MathSciNet MATH Google Scholar
H.-J. Bungartz, M. Griebel, Sparse grids. Acta Numer. 13, 147–269 (2004)
Article MathSciNet MATH Google Scholar
H.-J. Bungartz, A. Heinecke, D. Pflüger, S. Schraufstetter, Option pricing with a direct adaptive sparse grid approach. J. Comput. Appl. Math. 236(15), 3741–3750 (2011). Online Okt. 2011
Google Scholar
H.-J. Bungartz, D. Pflüger, S. Zimmer, Adaptive sparse grid techniques for data mining, in Modelling, Simulation and Optimization of Complex Processes 2006, Proceedings of the International Conference on HPSC, Hanoi, ed. by H. Bock, E. Kostina, X. Hoang, R. Rannacher (Springer, 2008), pp. 121–130
Google Scholar
G. Buse, R. Jacob, D. Pflüger, A. Murarasu, A non-static data layout enhancing parallelism and vectorization in sparse grid algorithms, in Proceedings of the 11th International Symposium on Parallel and Distributed Computing (ISPDC), Munich, 25–29 June 2012 (IEEE, 2012), pp. 195–202
Google Scholar
D. Butnaru, D. Pflüger, H.-J. Bungartz, Towards high-dimensional computational steering of precomputed simulation data using sparse grids, in Proceedings of the International Conference on Computational Science (ICCS), Tsukaba. Volume 4 of Procedia CS (Springer, 2011), pp. 56–65
Google Scholar
P. Butz, Effiziente verteilte Hierarchisierung und Dehierarchisierung auf vollen Gittern, Bachelor’s thesis, University of Stuttgart, 2014, http://d-nb.info/1063333806
Google Scholar
C. Feuersänger, Sparse grid methods for higher dimensional approximation, PhD thesis, Universität Bonn, 2010
Google Scholar
M. Frigo, C. E. Leiserson, H. Prokop, S. Ramachandran, Cache-oblivious algorithms, in Proceedings of the 40th Annual Symposium on Foundations of Computer Science (FOCS’99), New York (IEEE Computer Society Press, 1999), pp. 285–297
Google Scholar
J. Garcke, Maschinelles Lernen durch Funktionsrekonstruktion mit verallgemeinerten dünnen Gittern, PhD thesis, Universität Bonn, 2004
Google Scholar
J. Garcke, M. Griebel, On the parallelization of the sparse grid approach for data mining, in Large-Scale Scientific Computing, ed. by S. Margenov, J. Waśniewski, P. Yalamov. Volume 2179 of Lecture Notes in Computer Science (Springer, Berlin/Heidelberg, 2001), pp. 22–32
Google Scholar
E. Georganas, J. González-Domínguez, E. Solomonik, Y. Zheng, J. Touriño, K. Yelick, Communication avoiding and overlapping for numerical linear algebra, in Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC’12), Salt Lake City (IEEE Computer Society Press, Los Alamitos, 2012), pp. 100:1–100:11
Google Scholar
M. Griebel, The combination technique for the sparse grid solution of PDE’s on multiprocessor machines. Parallel Process. Lett. 2, 61–70 (1992)
Article Google Scholar
M. Griebel, H. Harbrecht, On the convergence of the combination technique, in Sparse Grids and Applications. Volume 97 of Lecture Notes in Computational Science and Engineering (Springer, Cham/New York, 2014), pp. 55–74
Google Scholar
M. Griebel, W. Huber, Turbulence simulation on sparse grids using the combination method, in ed. by N. Satofuka, J. Periaux, A. Ecer, Proceedings Parallel Computational Fluid Dynamics, New Algorithms and Applications (CFD’94), Kyoto, Wiesbaden Braunschweig (Vieweg, 1995), pp. 75–84
Google Scholar
M. Griebel, W. Huber, C. Zenger, Numerical turbulence simulation on a parallel computer using the combination method, in Flow Simulation on High Performance Computers II, Notes on Numerical Fluid Mechanics 52, pp. 34–47 (Vieweg, Wiesbaden 1996) DOI:10.1007/978-3-322-89849-4_4
Google Scholar
M. Griebel, M. Schneider, C. Zenger, A combination technique for the solution of sparse grid problems, in Iterative Methods in Linear Algebra (IMACS/Elsevier, Amsterdam 1992), pp. 263–281
MATH Google Scholar
M. Griebel, V. Thurner, The efficient solution of fluid dynamics problems by the combination technique. Int. J. Numer. Methods Heat Fluid Flow 5, 51–69 (1995)
Article MathSciNet MATH Google Scholar
B. Harding, M. Hegland, A robust combination technique, in CTAC-2012. Volume 54 of ANZIAM Journal, 2013, pp. C394–C411
Google Scholar
M. Holtz, Sparse Grid Quadrature in High Dimensions with Applications in Finance and Insurance. Volume 77 of Lecture Notes in Computational Science and Engineering (Springer, Heidelberg, 2011)
Google Scholar
J.-W. Hong, H.-T. Kung, I/O complexity: The red-blue pebble game, in Proceedings of STOC’81, New York (ACM, 1981), pp. 326–333
Google Scholar
P. Hupp, Communication efficient algorithms for numerical problems on full and sparse grids, PhD thesis, ETH Zurich, 2014
Google Scholar
P. Hupp, Performance of unidirectional hierarchization for component grids virtually maximized, in International Conference on Computational Science. Volume 29 of Procedia Computer Science (Elsevier, Amsterdam 2014), pp. 2272–2283
Google Scholar
P. Hupp, M. Heene, R. Jacob, D. Pflüger, Global communication schemes for the numerical solution of high-dimensional PDEs. Parallel Comput. (2016). DOI:10.1016/j.parco.2015.12.006
Google Scholar
P. Hupp, R. Jacob, M. Heene, D. Pflüger, M. Hegland, Global communication schemes for the sparse grid combination technique. in Parallel Computing – Accelerating Computational Science and Engineering (CSE). Volume 25 of Advances in Parallel Computing (IOS Press, 2014), pp. 564–573
Google Scholar
D. Irony, S. Toledo, A. Tiskin, Communication lower bounds for distributed-memory matrix multiplication. J. Parallel Distrib. Comput. 64(9), 1017–1026 (2004)
Article MATH Google Scholar
R. Jacob, Efficient regular sparse grid hierarchization by a dynamic memory layout, in Sparse Grids and Applications 2012, Munich, ed. by J. Garcke, D. Pflüger. Volume 97 of Lecture Notes in Computational Science and Engineering (Springer, Cham/New York, 2014)pp. 195–219
Google Scholar
C. Kowitz, M. Hegland, The sparse grid combination technique for computing eigenvalues in linear gyrokinetics. Procedia Comput. Sci. 18, 449–458 (2013). International Conference on Computational Science.
Google Scholar
M.D. Lam, E.E. Rothberg, M.E. Wolf, The cache performance and optimizations of blocked algorithms. SIGPLAN Not. 26(4), 63–74 (1991)
Article Google Scholar
A. Maheshwari, N. Zeh, A survey of techniques for designing I/O-efficient algorithms, in Algorithms for Memory Hierarchies. ed. by U. Meyer, P. Sanders, J. Sibeyn. Volume 2625 of Lecture Notes in Computer Science, pp. 36–61 (Springer, Berlin/Heidelberg, 2003)
Google Scholar
A. Murarasu, J. Weidendorfer, G. Buse, D. Butnaru, D. Pflüger, Compact data structure and scalable algorithms for the sparse grid technique, in Proceedings of the 16th ACM Symposium on Principles and Practice of Parallel Programming (PPoPP), San Antonio (ACM, 2011), pp. 25–34
Google Scholar
A. F. Murarasu, G. Buse, D. Pflüger, J. Weidendorfer, A. Bode, fastsg: A fast routines library for sparse grids. Procedia CS 9, 354–363 (2012)
Google Scholar
C. Pflaum, Convergence of the combination technique for second-order elliptic differential equations. SIAM J. Numer. Anal. 34(6), 2431–2455 (1997)
Article MathSciNet MATH Google Scholar
C. Pflaum, A. Zhou, Error analysis of the combination technique. Numer. Math. 84(2), 327–350 (1999)
Article MathSciNet MATH Google Scholar
D. Pflüger, Spatially adaptive sparse grids for high-dimensional problems, PhD thesis, Institut für Informatik, Technische Universität München, 2010
Google Scholar
D. Pflüger, H.-J. Bungartz, M. Griebel, F. Jenko, T. Dannert, M. Heene, A. Parra Hinojosa, C. Kowitz, and P. Zaspel, Exahd: An exa-scalable two-level sparse grid approach for higher-dimensional problems in plasma physics and beyond, in Euro-Par 2014: Parallel Processing Workshops. Volume 8806 of Lecture Notes in Computer Science (Springer, Cham 2014), pp. 565–576
Google Scholar
H. Prokop, Cache-oblivious algorithms, Master’s thesis, Massachusetts Institute of Technology, 1999
MATH Google Scholar
C. Reisinger, Analysis of linear difference schemes in the sparse grid combination technique. IMA J. Numer. Anal. 33(2), 544–581 (2013)
Article MathSciNet MATH Google Scholar
S. Smolyak, Quadrature and interpolation formulas for tensor products of certain classes of functions. Sov. Math. Dokl. 4, 240–243 (1963)
MATH Google Scholar
C. Zenger, Sparse grids, in Parallel Algorithms for Partial Differential Equations. Volume 31 of Notes on Numerical Fluid Mechanics (Vieweg, Wiesbaden 1991), pp. 241–251
Google Scholar

Download references

Acknowledgements

We would like to thank Dirk Pflüger and Mario Heene for support and discussions, in particular for enabling the experiments on Hornet. We also thank two anonymous referees for detailed feedback on an earlier draft.

Author information

Authors and Affiliations

ETH Zürich, Zürich, Switzerland
Philipp Hupp
IT University of Copenhagen, Rued Langgaards Vej 7, DK-2300, København S, Denmark
Riko Jacob

Authors

Philipp Hupp
View author publications
You can also search for this author in PubMed Google Scholar
Riko Jacob
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Philipp Hupp .

Editor information

Editors and Affiliations

Institut für Numerische Simulation, Universität Bonn, Bonn, Germany
Jochen Garcke
Institute for Parallel and Distributed S, Universität Stuttgart, Stuttgart, Germany
Dirk Pflüger

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Hupp, P., Jacob, R. (2016). A Cache-Optimal Alternative to the Unidirectional Hierarchization Algorithm. In: Garcke, J., Pflüger, D. (eds) Sparse Grids and Applications - Stuttgart 2014. Lecture Notes in Computational Science and Engineering, vol 109. Springer, Cham. https://doi.org/10.1007/978-3-319-28262-6_5

Download citation

DOI: https://doi.org/10.1007/978-3-319-28262-6_5
Published: 17 March 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-28260-2
Online ISBN: 978-3-319-28262-6
eBook Packages: Mathematics and StatisticsMathematics and Statistics (R0)

Publish with us

Policies and ethics