
Approximating a Gram Matrix for Improved Kernel-Based Learning

(Extended Abstract)

  • Conference paper
Learning Theory (COLT 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3559)


Abstract

A problem for many kernel-based methods is that the amount of computation required to find the solution scales as \(O(n^3)\), where n is the number of training examples. We develop and analyze an algorithm to compute an easily interpretable low-rank approximation to an n × n Gram matrix G such that computations of interest may be performed more rapidly. The approximation is of the form \({\tilde G}_{k} = CW^{+}_{k}C^{T}\), where C is a matrix consisting of a small number c of columns of G and \(W_k\) is the best rank-k approximation to W, the matrix formed by the intersection between those c columns of G and the corresponding c rows of G. An important aspect of the algorithm is the probability distribution used to randomly sample the columns; we use a judiciously chosen, data-dependent nonuniform probability distribution. Let \(\|\cdot\|_2\) and \(\|\cdot\|_F\) denote the spectral norm and the Frobenius norm, respectively, of a matrix, and let \(G_k\) be the best rank-k approximation to G. We prove that by choosing \(O(k/\epsilon^4)\) columns,

$${\left\|G - CW^{+}_{k}C^{T}\right\|_{\xi}} \leq \|G - G_{k}\|_{\xi} + \epsilon \sum\limits_{i=1}^{n} G^{2}_{ii},$$

both in expectation and with high probability, for both ξ = 2, F, and for all k: 0 ≤ k ≤ rank(W). This approximation can be computed using O(n) additional space and time, after making two passes over the data from external storage.
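
As a concrete illustration, here is a minimal NumPy sketch of the construction described above. It is a reading of the abstract under stated assumptions, not the authors' exact algorithm: the sampling probabilities p_i ∝ G²_ii and the 1/√(c·p_i) rescaling of the sampled columns follow conventions standard for this family of Monte Carlo column-sampling algorithms, and the function name is ours.

```python
import numpy as np

def approximate_gram(G, c, k, rng=None):
    """Sketch of the approximation G ~ C W_k^+ C^T from the abstract.

    Assumptions (not fixed by the abstract itself): columns are sampled
    with replacement using p_i = G_ii^2 / sum_j G_jj^2, and both C and W
    are rescaled by 1/sqrt(c * p_i), as is standard for Monte Carlo
    column-sampling matrix algorithms of this type.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = G.shape[0]

    # Data-dependent nonuniform sampling probabilities: p_i proportional to G_ii^2.
    diag = np.diag(G)
    p = diag**2 / np.sum(diag**2)

    # Sample c column indices; form the rescaled n x c matrix C of sampled
    # columns and the rescaled c x c intersection matrix W.
    idx = rng.choice(n, size=c, replace=True, p=p)
    s = 1.0 / np.sqrt(c * p[idx])
    C = G[:, idx] * s
    W = G[np.ix_(idx, idx)] * s * s[:, None]

    # Best rank-k approximation W_k via the eigendecomposition of the
    # symmetric matrix W, then its Moore-Penrose pseudoinverse W_k^+.
    vals, vecs = np.linalg.eigh(W)
    top = np.argsort(vals)[::-1][:k]
    vals_k, vecs_k = vals[top], vecs[:, top]
    keep = vals_k > 1e-12          # drop numerically zero eigenvalues
    Wk_pinv = (vecs_k[:, keep] / vals_k[keep]) @ vecs_k[:, keep].T

    # The rank-(at most k) approximation to G, built from only c columns of G.
    return C @ Wk_pinv @ C.T
```

For instance, approximate_gram(G, c=200, k=20) touches only 200 columns of an n × n Gram matrix; the bound above says that c = O(k/ε⁴) columns suffice to come within an additive ε·Σᵢ G²ᵢᵢ of the best rank-k error.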





Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Drineas, P., Mahoney, M.W. (2005). Approximating a Gram Matrix for Improved Kernel-Based Learning. In: Auer, P., Meir, R. (eds) Learning Theory. COLT 2005. Lecture Notes in Computer Science, vol 3559. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11503415_22


  • DOI: https://doi.org/10.1007/11503415_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-26556-6

  • Online ISBN: 978-3-540-31892-7

  • eBook Packages: Computer Science, Computer Science (R0)
