Abstract
Kernel methods are a broad class of algorithms that find application in approximation theory and non-parametric statistics. In this article, we review the literature with a focus on methods for uncertainty quantification, and we discuss the computational challenges of kernel methods. In particular, we focus on approximating kernel matrices, one of the main computational bottlenecks in kernel methods. The most popular approach for approximating kernel matrices is the Nyström method, which uses randomized sampling to construct a low-rank factorization of a kernel matrix. We present a parallel implementation of the Nyström method using the Elemental parallel linear algebra library and discuss an efficient variant called the one-shot Nyström method. We conclude with examples of regression problems for binary classification in high dimensions that illustrate the capabilities and limitations of Nyström methods. In our largest test, we consider a dataset from high-energy physics in 28 dimensions with ten million points.
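The low-rank factorization mentioned in the abstract can be made concrete with a short sketch. Below is a minimal NumPy implementation of the standard Nyström approximation \(K \approx C W^{+} C^{T}\) built from uniformly sampled columns; it is not the chapter's parallel Elemental implementation or its one-shot variant, and the Gaussian kernel, the sampling scheme, and all names are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(X, Y, bandwidth=1.0):
    """Pairwise Gaussian kernel values K(x_i, y_j) = exp(-||x_i - y_j||^2 / (2 h^2))."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))

def nystrom_factors(X, m, bandwidth=1.0, seed=None):
    """Standard Nystrom approximation K ~ C @ pinv(W) @ C.T from m uniformly sampled columns."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=m, replace=False)  # indices of the sampled (landmark) points
    C = gaussian_kernel(X, X[idx], bandwidth)            # n x m block: all points vs. landmarks
    W_pinv = np.linalg.pinv(C[idx, :])                    # pseudoinverse of the m x m landmark block
    return C, W_pinv

# Usage: apply the approximate kernel matrix to a weight vector without forming the n x n matrix.
X = np.random.default_rng(0).standard_normal((2000, 28))  # 2000 points in 28 dimensions
C, W_pinv = nystrom_factors(X, m=200, bandwidth=2.0, seed=1)
w = np.ones(X.shape[0])
y_approx = C @ (W_pinv @ (C.T @ w))                       # ~ K w in O(n m) work and storage
```

The point of the factorization is that the approximate kernel matrix is only ever applied through the thin factors \(C\) and \(W^{+}\), which is what makes the method attractive when the full kernel matrix does not fit in memory.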
Notes
1. In mathematical physics, the kernel is the Green's function of the partial differential equation (PDE) that models the target application, and the weights are the right-hand side of the PDE.
2. Throughout, we refer to a point \(\underline{x}_{i}\) for which we compute \(y_{i}\) as a target, and to a point \(\underline{x}_{j}\) with weight \(w_{j}\) as a source (see the sketch after these notes).
3. ASKIT stands for Approximate Skeletonization Kernel Independent Treecode.
4. For example, the intrinsic dimension of a set of points distributed on a curve in three dimensions is one.
5. We use the term interaction between two points \(\underline{x}_{i}\) and \(\underline{x}_{j}\) to refer to \(K(\underline{x}_{i},\underline{x}_{j})\).
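To make the source/target terminology of these notes concrete, the following is a small sketch of the dense kernel summation \(y_{i} = \sum_{j} K(\underline{x}_{i},\underline{x}_{j})\, w_{j}\) that fast kernel methods approximate; the Gaussian kernel and all names below are assumptions for illustration, not taken from the chapter.

```python
import numpy as np

def kernel_sum(targets, sources, weights, bandwidth=1.0):
    """Dense evaluation of y_i = sum_j K(x_i, x_j) w_j with a Gaussian kernel.
    Cost is O(N_targets * N_sources); this is the operation that fast methods approximate."""
    sq_dists = (
        np.sum(targets**2, axis=1)[:, None]
        + np.sum(sources**2, axis=1)[None, :]
        - 2.0 * targets @ sources.T
    )
    K = np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))  # K[i, j] = K(x_i, x_j)
    return K @ weights

rng = np.random.default_rng(0)
sources = rng.standard_normal((500, 3))   # source points x_j
weights = rng.standard_normal(500)        # source weights w_j
targets = rng.standard_normal((200, 3))   # target points x_i
y = kernel_sum(targets, sources, weights, bandwidth=0.5)  # y[i] = sum_j K(x_i, x_j) w_j
```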
Acknowledgements
This material is based upon work supported by AFOSR grants FA9550-12-10484 and FA9550-11-10339; by NSF grants CCF-1337393 and OCI-1029022; by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Applied Mathematics program under Award Numbers DE-SC0010518, DE-SC0009286, and DE-FG02-08ER2585; by NIH grant 10042242; and by the Technische Universität München—Institute for Advanced Study, funded by the German Excellence Initiative (and the European Union Seventh Framework Programme under grant agreement 291763). Any opinions, findings, and conclusions or recommendations expressed herein are those of the authors and do not necessarily reflect the views of the AFOSR or the NSF. Computing time on the Texas Advanced Computing Center's Stampede system was provided by an allocation from TACC and the NSF.
Copyright information
© 2015 Springer International Publishing Switzerland
Cite this chapter
Tharakan, S., March, W.B., Biros, G. (2015). Scalable Kernel Methods for Uncertainty Quantification. In: Mehl, M., Bischoff, M., Schäfer, M. (eds) Recent Trends in Computational Engineering - CE2014. Lecture Notes in Computational Science and Engineering, vol 105. Springer, Cham. https://doi.org/10.1007/978-3-319-22997-3_1
DOI: https://doi.org/10.1007/978-3-319-22997-3_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22996-6
Online ISBN: 978-3-319-22997-3