Notes
1. If the input dimensionality is higher than 2, the line has to be replaced with a plane or a hyperplane.
2. The number of solutions is (at least) \(\infty ^1\).
3. The signum function \(sgn(u)\) is defined as follows: \(sgn(u)=1\) if \(u>0\); \(sgn(u)=-1\) if \(u<0\); \(sgn(u)=0\) if \(u=0\).
4. This convention is adopted in the rest of the chapter.
5. The term regularization constant is motivated in Sect. 9.3.6.
6. \(\theta (\beta )\) is 1 if \(\beta >0\), 0 otherwise.
7. In [102] the continuity requirement is replaced with stability.
8. \(\delta _{ij}\) is 1 if \(i=j\), 0 otherwise.
9. \(\mathrm{MATLAB}^{\circledR}\) is a registered trademark of The MathWorks, Inc.
References
M. Aizerman, E. Braverman, and L. Rozonoer. Theoretical foundations of the potential function method in pattern recognition learning. Automation and Remote Control, 25:821–837, 1964.
F. R. Bach and M. I. Jordan. Learning spectral clustering. Technical report, EECS Department, University of California, 2003.
A. Barla, E. Franceschi, F. Odone, and F. Verri. Image kernels. In Proceedings of SVM2002, pages 83–96, 2002.
A. Ben-Hur, D. Horn, H.T. Siegelmann, and V. Vapnik. A support vector method for clustering. In Advances in Neural Information Processing Systems, volume 12, pages 125–137, 2000.
A. Ben-Hur, D. Horn, H.T. Siegelmann, and V. Vapnik. Support vector clustering. Journal of Machine Learning Research, 2(2):125–137, 2001.
Y. Bengio, O. Delalleau, N. Le Roux, J.-F. Paiement, P. Vincent, and M. Ouimet. Learning eigenfunctions links spectral embedding and kernel PCA. Neural Computation, 16(10):2197–2219, 2004.
Y. Bengio, P. Vincent, and J.-F. Paiement. Spectral clustering and kernel PCA are learning eigenfunctions. Technical report, CIRANO, 2003.
C. Berg, J.P.R. Christensen, and P. Ressel. Harmonic analysis on semigroups. Springer-Verlag, 1984.
C.M. Bishop. Neural Networks for Pattern Recognition. Cambridge University Press, 1995.
M. Brand and K. Huang. A unifying theorem for spectral embedding and clustering. In Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics, 2003.
L. M. Bregman. The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Computational Mathematics and Mathematical Physics, 7:200–217, 1967.
F. Camastra and A. Verri. A novel kernel method for clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(5):801–805, 2005.
N. Cancedda, E. Gaussier, C. Goutte, and J.-M. Renders. Word-sequence kernels. Journal of Machine Learning Research, 3(1):1059–1082, 2003.
S. Canu, Y. Grandvalet, V. Guigue, and A. Rakotomamonjy. SVM and kernel methods Matlab toolbox. Technical report, Perception Systemes et Information, INSA de Rouen, 2005.
Y. Censor. Row-action methods for huge and sparse systems and their applications. SIAM Reviews, 23(4):444–467, 1981.
Y. Censor and A. Lent. An iterative row-action method for interval convex programming. Journal of Optimization Theory and Application, 34(3):321–353, 1981.
P.K. Chan, M. Schlag, and J.Y. Zien. Spectral k-way ratio-cut partitioning and clustering. In Proceedings of the 1993 International Symposium on Research on Integrated Systems, pages 123–142. MIT Press, 1993.
J.H. Chiang. A new kernel-based fuzzy clustering approach: support vector clustering with cell growing. IEEE Transactions on Fuzzy Systems, 11(4):518–527, 2003.
F.R.K. Chung. Spectral Graph Theory. American Mathematical Society, 1997.
R. Collobert and S. Bengio. SVMTorch: Support vector machines for large-scale regression problems. Journal of Machine Learning Research, 1(2):143–160, 2001.
R. Collobert, S. Bengio, and J. Mariethoz. Torch: a modular machine learning software library. Technical report, IDIAP, 2002.
C. Cortes and V. Vapnik. Support-vector networks. Machine Learning, 20(3):273–297, 1995.
N. Cressie. Statistics for Spatial Data. John Wiley, 1993.
N. Cristianini, J. Shawe-Taylor, and J.S. Kandola. Spectral kernel methods for clustering. In Advances in Neural Information Processing Systems 14, pages 649–655. MIT Press, 2001.
A.P. Dempster, N.M. Laird, and D.B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society B, 39(1):1–38, 1977.
I.S. Dhillon, Y. Guan, and B. Kulis. Kernel k-means: spectral clustering and normalized cuts. In Proceedings of the \(10^{th}\) ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 551–556. ACM Press, 2004.
I.S. Dhillon, Y. Guan, and B. Kulis. A unified view of kernel k-means, spectral clustering and graph partitioning. Technical report, UTCS, 2005.
I.S. Dhillon, Y. Guan, and B. Kulis. Weighted graph cuts without eigenvectors: A multilevel approach. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(11):1944–1957, 2007.
R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. John Wiley, 2001.
T. Evgeniou, M. Pontil, and T. Poggio. Regularization networks and support vector machines. Advances in Computational Mathematics, 13(1):1–50, 2001.
R.-E. Fan, P.-H. Chen, and C.-J. Lin. Working set selection using the second order information for training SVM. Journal of Machine Learning Research, 6:1889–1918, 2005.
P. Fermat. Methodus ad disquirendam maximam et minimam. In Oeuvres de Fermat. MIT Press, 1891 (First Edition 1679).
M. Ferris and T. Munson. Interior point method for massive support vector machines. Technical report, Computer Sciences Department, University of Wisconsin, Madison, Wisconsin, 2000.
M. Ferris and T. Munson. Semi-smooth support vector machines. Technical report, Computer Sciences Department, University of Wisconsin, Madison, Wisconsin, 2000.
M. Fiedler. Algebraic connectivity of graphs. Czechoslovak Math. J., 23(98):298–305, 1973.
M. Filippone, F. Camastra, F. Masulli, and S. Rovetta. A survey of spectral and kernel methods for clustering. Pattern Recognition, 41(1):176–190, 2008.
I. Fischer and I. Poland. New methods for spectral clustering. Technical report, IDSIA, 2004.
R. A. Fisher. The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7(2):179–188, 1936.
J. Friedman. Regularized discriminant analysis. Journal of the American Statistical Association, 84(405):165–175, 1989.
T.T. Friess, N. Cristianini, and C. Campbell. The kernel adatron algorithm: a fast and simple learning procedure for support vector machines. In Proceedings of \(15^{th}\) International Conference on Machine Learning, pages 188–196. Morgan Kaufmann Publishers, 1998.
K. Fukunaga. An Introduction to Statistical Pattern Recognition. Academic Press, 1990.
T. Gärtner, J.W. Lloyd, and P.A. Flach. Kernels and distances for structured data. Machine Learning, 57(3):205–232, 2004.
M. Girolami. Mercer kernel based clustering in feature space. IEEE Transactions on Neural Networks, 13(3):780–784, 2002.
F. Girosi, M. Jones, and T. Poggio. Regularization theory and neural network architectures. Neural Computation, 7(2):219–269, 1995.
G.H. Golub and C.F. Van Loan. Matrix Computations. The Johns Hopkins University Press, 1996.
T. Graepel and K. Obermayer. Fuzzy topographic kernel clustering. In Proceedings of the Fifth GI Workshop Fuzzy Neuro Systems’98, pages 90–97, 1998.
J. Hadamard. Sur les problèmes aux dérivées partielles et leur signification physique. Bull. Univ. Princeton, 13:49–52, 1902.
T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. Springer-Verlag, 2001.
R. Herbrich. Learning Kernel Classifiers: Theory and Algorithms. MIT Press, 2004.
R. Inokuchi and S. Miyamoto. LVQ clustering and SOM using a kernel function. In Proceedings of IEEE International Conference on Fuzzy Systems, pages 367–373, 2004.
T. Joachims. Making large-scale SVM learning practical. In Advances in Kernel Methods, pages 169–184. MIT Press, 1999.
T. Joachims, N. Cristianini, and J. Shawe-Taylor. Composite kernels for hypertext classification. In Proceedings of the \(18^{th}\) International Conference on Machine Learning, pages 250–257. IEEE Press, 2001.
R. Kannan, S. Vempala, and A. Vetta. On clusterings: Good, bad and spectral. In Proceedings of the 41\(^{st}\) Annual Symposium on the Foundations of Computer Science, pages 367–380. IEEE Press, 2000.
A. Karatzoglou, A. Smola, K. Hornik, and A. Zeileis. kernlab – An S4 package for kernel methods in R. Journal of Statistical Software, 11(9):1–20, 2004.
S. Keerthi, S. Shevade, C. Bhattacharyya, and K. Murthy. Improvements to Platt's SMO algorithm for SVM classifier design. Technical report, Department of CSA, Bangalore, India, 1999.
S. Keerthi, S. Shevade, C. Bhattacharyya, and K. Murthy. A fast iterative nearest point algorithm for support vector machine design. IEEE Transactions on Neural Networks, 11(1):124–136, 2000.
B.W. Kernighan and S. Lin. An efficient heuristic procedure for partitioning graphs. Bell System Technical Journal, 49(1):291–307, 1970.
G.A. Korn and T.M. Korn. Mathematical Handbook for Scientists and Engineers. Mc Graw-Hill, 1968.
R. Krishnapuram and J.M. Keller. A possibilistic approach to clustering. IEEE Transactions on Fuzzy Systems, 1(2):98–110, 1993.
R. Krishnapuram and J.M. Keller. The possibilistic c-means algorithm: insights and recommendations. IEEE Transactions on Fuzzy Systems, 4(3):385–393, 1996.
H.W. Kuhn and A.W. Tucker. Nonlinear programming. In Proceedings of \(2^{nd}\) Berkeley Symposium on Mathematical Statistics and Probabilistics, pages 367–380. University of California Press, 1951.
J.-L. Lagrange. Mécanique analytique. Chez La Veuve Desaint Libraire, 1788.
D. Lee. An improved cluster labeling method for support vector clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(3):461–464, 2005.
C. Leslie, E. Eskin, A. Cohen, J. Weston, and A. Noble. Mismatch string kernels for discriminative protein classification. Bioinformatics, 20(4):467–476, 2004.
D. Luenberger. Linear and Nonlinear Programming. Addison-Wesley, 1984.
D. Macdonald and C. Fyfe. The kernel self-organizing map. In Fourth International Conference on Knowledge-based Intelligent Engineering Systems and Allied Technologies, pages 317–320, 2000.
D.J.C. MacKay. A practical Bayesian framework for backpropagation networks. Neural Computation, 4(3):448–472, 1992.
O.L. Mangasarian. Linear and non-linear separation of patterns by linear programming. Operations Research, 13(3):444–452, 1965.
O.L. Mangasarian and D. Musicant. Lagrangian support vector regression. Technical report, Computer Sciences Department, University of Wisconsin, Madison, Wisconsin, June 2000.
G. Matheron. Principles of geostatistics. Economic Geology, 58:1246–1266, 1963.
M. Meila and J. Shi. Spectral methods for clustering. In Advances in Neural Information Processing Systems 12, pages 873–879. MIT Press, 2000.
S. Mika, G. Rätsch, J. Weston, B. Schölkopf, and K.R. Müller. Fisher discriminant analysis with kernels. In Proceedings of IEEE Neural Networks for Signal Processing Workshop, pages 41–48. IEEE Press, 2001.
M.L. Minsky and S.A. Papert. Perceptrons. MIT Press, 1969.
J. Moody and C. Darken. Fast learning in networks of locally-tuned processing units. Neural Computation, 1(2):281–294, 1989.
R. Neal. Bayesian Learning in Neural Networks. Springer-Verlag, 1996.
A.Y. Ng, M.I. Jordan, and Y. Weiss. On spectral clustering: Analysis and an algorithm. In Advances in Neural Information Processing Systems 14, pages 849–856. MIT Press, 2002.
E. Osuna, R. Freund, and F. Girosi. An improved training algorithm for support vector machines. In Neural Networks for Signal Processing VII, Proceedings of the 1997 IEEE Workshop, pages 276–285. IEEE Press, 1997.
E. Osuna and F. Girosi. Reducing the run-time complexity in support vector machines. In Advances in Kernel Methods, pages 271–284. MIT Press, 1999.
A. Paccanaro, C. Chennubhotla, J.A. Casbon, and M.A.S. Saqi. Spectral clustering of protein sequences. In Proceedings of International Joint Conference on Neural Networks, pages 3083–3088. IEEE Press, 2003.
J.C. Platt. Fast training of support vector machines using sequential minimal optimization. In Advances in Kernel Methods, pages 185–208. MIT Press, 1999.
J.C. Platt, N. Cristianini, and J. Shawe-Taylor. Large margin DAGs for multiclass classification. In Advances in Neural Information Processing Systems 12, pages 547–553. MIT Press, 2000.
T. Poggio and F. Girosi. Networks for approximation and learning. Proceedings of the IEEE, 78(9):1481–1497, 1990.
M. Pontil and A. Verri. Support vector machines for 3-d object recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(6):637–646, 1998.
M.J.D. Powell. Radial basis functions for multivariable interpolation: A review. In Algorithms for Approximation, pages 143–167. Clarendon Press, 1987.
A.K. Qin and P.N. Suganthan. Kernel neural gas algorithms with application to cluster analysis. In Proceedings of the 17th International Conference on Pattern Recognition, ICPR 2004, pages 617–620. IEEE, 2004.
C.E. Rasmussen and C.K.I. Williams. Gaussian Processes for Machine Learning. MIT Press, 2006.
K. Rose. Deterministic annealing for clustering, compression, classification, regression, and related optimization problems. Proceedings of the IEEE, 86(11):2210–2239, 1998.
R. Rosipal and M. Girolami. An expectation maximization approach to nonlinear component analysis. Neural Computation, 13(3):505–510, 2001.
V. Roth, J. Laub, M. Kawanabe, and J.M. Buhmann. Optimal cluster preserving embedding of nonmetric proximity data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(12):1540–1551, 2003.
B. Schölkopf and A.J. Smola. Learning with Kernels. MIT Press, 2002.
B. Schölkopf, A.J. Smola, and K.R. Müller. Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation, 10(5):1299–1319, 1998.
B. Schölkopf, A.J. Smola, and K.R. Müller. Nonlinear component analysis as a kernel eigenvalue problem. Technical report, Max Planck Institut für Biologische Kybernetik, 1998.
B. Schölkopf, R.C. Williamson, A.J. Smola, J. Shawe-Taylor, and J. Platt. Support vector method for novelty detection. In Advances in Neural Information Processing Systems 12, pages 526–532. MIT Press, 2000.
J. Shawe-Taylor and N. Cristianini. Kernel Methods for Pattern Analysis. Cambridge University Press, 2004.
J. Shi and J. Malik. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):888–905, 2000.
D.M.J. Tax and R.P.W. Duin. Support vector domain description. Pattern Recognition Letters, 20(11–13):1191–1199, 1999.
A.N. Tikhonov. On solving ill-posed problem and method of regularization. Dokl. Acad. Nauk USSR, 153:501–504, 1963.
A.N. Tikhonov and V.Y. Arsenin. Solutions of Ill-Posed Problems. W.H. Winston, 1977.
I. Tsochantaridis, T. Hofmann, T. Joachims, and Y. Altun. Support vector machine learning for interdependent and structured output spaces. In Proceedings of ICML04, 2004.
C.J. Twining and C.J. Taylor. The use of kernel principal component analysis to model data distributions. Pattern Recognition, 36(1):217–227, 2003.
V.N. Vapnik. The Nature of Statistical Learning Theory. Springer-Verlag, 1995.
V.N. Vapnik. Statistical Learning Theory. John Wiley, 1998.
V.N. Vapnik and A.Ya. Chervonenkis. A note on one class of perceptrons. Automation and Remote Control, 25:103–109, 1964.
V.N. Vapnik and A. Lerner. Pattern recognition using generalized portrait method. Automation and Remote Control, 24:774–780, 1963.
S. Vishwanathan and A.J. Smola. Fast kernels for string and tree matching. In Advances in Neural Information Processing Systems 15, pages 569–576. MIT Press, 2003.
U. von Luxburg, M. Belkin, and O. Bousquet. Consistency of spectral clustering. Technical report, Max Planck Institut für Biologische Kybernetik, 2004.
U. von Luxburg, M. Belkin, and O. Bousquet. Limits of spectral clustering. In Advances in Neural Information Processing Systems 17. MIT Press, 2005.
D. Wagner and F. Wagner. Between min cut and graph bisection. In Proceedings of the International Symposium on Mathematical Foundations of Computer Science, pages 744–750, 1993.
G. Wahba. Spline Models for Observational Data. SIAM, 1990.
J. Weston, A. Gammerman, M. Stitson, V. Vapnik, V. Vovk, and C. Watkins. Support vector density estimation. In Advances in Kernel Methods, pages 293–306. MIT Press, 1999.
J. Weston and C. Watkins. Multi-class support vector machines. In Proceedings of ESANN99, pages 219–224. D. Facto Press, 1999.
C.K.I. Williams and D. Barber. Bayesian classification with Gaussian processes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(12):1342–1351, 1998.
W.H. Wolberg and O. Mangasarian. Multisurface method of pattern separation for medical diagnosis applied to breast cytology. Proceedings of the National Academy of Sciences, U.S.A., 87:9193–9196, 1990.
Z.D. Wu, W.X. Xie, and J.P. Yu. Fuzzy c-means clustering algorithm based on kernel method. In Proceedings of the Fifth International Conference on Computational Intelligence and Multimedia Applications, ICCIMA 2003, pages 49–54. IEEE, 2003.
J. Yang, V. Estivill-Castro, and S.K. Chalup. Support vector clustering through proximity graph modelling. In Neural Information Processing 2002, ICONIP'02, pages 898–903, 2002.
S.X. Yu and J. Shi. Multiclass spectral clustering. In ICCV’03: Proceedings of the Ninth IEEE Conference on Computer Vision. IEEE Computer Society, 2003.
D.-Q. Zhang and S.-C. Chen. Fuzzy clustering using kernel method. In The 2002 International Conference on Control and Automation, pages 162–163, 2002.
D.-Q. Zhang and S.-C. Chen. Kernel based fuzzy and possibilistic c-means clustering. In Proceedings of the Fifth International Conference on Artificial Neural Networks, ICANN 2003, pages 122–125, 2003.
D.-Q. Zhang and S.-C. Chen. A novel kernelized fuzzy c-means algorithms with applications in image segmentation. Artificial Intelligence in Medicine, 32(1):37–50, 2004.
Problems
9.1
Consider the function \(K: X \times X \rightarrow \mathbb {R}\), where \(X \subseteq \mathbb {R}^n\). Prove that if \(K(\mathbf {x}, \mathbf {y}) = \varPhi ( \mathbf {x}) \cdot \varPhi (\mathbf {y})\) then \(K(\cdot )\) is a Mercer kernel.
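A sketch of the central step, in the finite-sample form of Mercer's condition (symmetry plus positive semidefiniteness of every Gram matrix): for any \(\mathbf{x}_1, \ldots, \mathbf{x}_m \in X\) and any \(\mathbf{c} \in \mathbb{R}^m\),

```latex
\begin{aligned}
\sum_{i=1}^{m}\sum_{j=1}^{m} c_i c_j K(\mathbf{x}_i, \mathbf{x}_j)
  &= \sum_{i=1}^{m}\sum_{j=1}^{m} c_i c_j \,\varPhi(\mathbf{x}_i)\cdot\varPhi(\mathbf{x}_j) \\
  &= \Bigl\Vert \sum_{i=1}^{m} c_i \,\varPhi(\mathbf{x}_i) \Bigr\Vert^{2} \;\ge\; 0 ,
\end{aligned}
```

while the symmetry of \(K\) follows immediately from the symmetry of the inner product.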
9.2
Prove that the Cauchy kernel \(C(\mathbf {x}, \mathbf {y})= \alpha (1 + \Vert \mathbf {x}- \mathbf {y}\Vert ^2)^{-1}\) is positive definite for \(\alpha > 0\). (Hint: Read Appendix D).
9.3
Prove that the Epanechnikov kernel, defined by \(K(\mathbf {x}, \mathbf {y}) = \frac{3}{4}\,(1 - \Vert \mathbf {x}- \mathbf {y}\Vert ^2)\) for \(\Vert \mathbf {x}- \mathbf {y}\Vert \le 1\) and 0 otherwise,
is conditionally positive definite. (Hint: Read Appendix D).
9.4
Prove that the optimal hyperplane is unique.
9.5
Consider the SMO algorithm for classification. What is the minimum number of Lagrange multipliers which can be optimized in an iteration? Explain your answer.
9.6
Consider the SMO algorithm for classification. Show that in the case of unconstrained maximum we obtain the following updating rule
\(\alpha _2^{new} = \alpha _2 + \frac{y_2 (E_1 - E_2)}{\eta }\),
where \(\eta = K(\mathbf {x}_1, \mathbf {x}_1) + K(\mathbf {x}_2, \mathbf {x}_2) - 2 K(\mathbf {x}_1, \mathbf {x}_2)\) and \(E_i = f(\mathbf {x}_i) - y_i\).
9.7
Consider the data set A of the Santa Fe time series competition. Using a public domain SVM regression package and the four preceding values of the time series as input, predict the actual value of the time series. Data set A can be downloaded from http://www-psych.stanford.edu/~andreas/Time-Series/SantaFe.html. Implement a Gaussian process for regression and repeat the exercise replacing the SVM with the Gaussian process. Discuss the results.
9.8
Using the o-v-r method and a public domain SVM binary classifier (e.g., SVMLight or SVMTorch), test a multiclass SVM on the Iris data [38], which can be downloaded from ftp.ics.uci.edu/pub/machine-learning-databases/iris. Repeat the same experiment replacing the o-v-r method with the o-v-o strategy. Discuss the results.
9.9
Implement kernel PCA and test it on a dataset (e.g., the Iris data). Use the Gaussian as Mercer kernel and verify Twining and Taylor's result [100], that is, that for large values of the variance the kernel PCA eigenspectrum tends to the PCA eigenspectrum.
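A minimal NumPy sketch of kernel PCA with a Gaussian kernel (the function name and parameters are illustrative, not from the chapter): build the kernel matrix, center it in feature space, and take the leading eigenpairs of the centered matrix.

```python
import numpy as np

def kernel_pca(X, sigma, n_components=2):
    """Sketch of kernel PCA with a Gaussian (RBF) Mercer kernel."""
    n = X.shape[0]
    # Gaussian kernel matrix K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / (2.0 * sigma ** 2))
    # Center the kernel matrix in feature space: Kc = J K J with J = I - 1/n
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J
    # Eigendecomposition (eigh returns eigenvalues in ascending order)
    vals, vecs = np.linalg.eigh(Kc)
    vals, vecs = vals[::-1], vecs[:, ::-1]
    # Eigenvalues scaled by 1/n play the role of variances along components
    return vals[:n_components] / n, vecs[:, :n_components]
```

For large \(\sigma\) the Gaussian kernel behaves like \(1 - \Vert \mathbf{x}-\mathbf{y}\Vert^2/(2\sigma^2)\), so the centered kernel matrix reduces to the centered linear Gram matrix scaled by \(1/\sigma^2\) and the eigenvalue ratios converge to those of ordinary PCA, which is the behaviour the exercise asks to verify.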
9.10
Consider the one-class SVM. Prove that there are no bounded support vectors when the regularization constant \(C\) is equal to 1.
9.11
Implement kernel K-Means and test your implementation on a dataset (e.g., the Iris data). Verify that when you choose the inner product as Mercer kernel you obtain the same results as batch K-Means.
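One possible sketch of kernel K-Means over a precomputed Mercer kernel matrix (names and the random initialization are illustrative choices): squared distances to cluster means in feature space are expanded purely in terms of kernel evaluations, so with the linear kernel \(K = XX^T\) the assignments coincide with batch K-Means.

```python
import numpy as np

def kernel_distances(K, labels, k):
    """Squared feature-space distances ||phi(x_i) - m_c||^2 computed from K alone:
    K_ii - (2/|c|) sum_{j in c} K_ij + (1/|c|^2) sum_{j,l in c} K_jl."""
    n = K.shape[0]
    dist = np.full((n, k), np.inf)  # empty clusters keep infinite distance
    for c in range(k):
        mask = labels == c
        nc = mask.sum()
        if nc == 0:
            continue
        dist[:, c] = (np.diag(K)
                      - 2.0 * K[:, mask].sum(1) / nc
                      + K[np.ix_(mask, mask)].sum() / nc ** 2)
    return dist

def kernel_kmeans(K, k, n_iter=100, seed=0):
    """Kernel K-Means: random initial labels, then nearest-mean reassignment."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, size=K.shape[0])
    for _ in range(n_iter):
        new = kernel_distances(K, labels, k).argmin(1)
        if np.array_equal(new, labels):
            break
        labels = new
    return labels
```

With the linear kernel the distance expansion above is exactly the squared Euclidean distance to each cluster centroid, which is the equivalence the exercise asks to verify.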
9.12
Implement the Ng-Jordan algorithm using a mathematical toolbox. Test your implementation on the Iris data. Compare your results with those reported in [12].
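A compact NumPy sketch of the Ng-Jordan(-Weiss) procedure, assuming the standard NIPS 2002 formulation (the function name and the farthest-point K-means initialization are illustrative choices): Gaussian affinity with zero diagonal, symmetric normalization, row-normalized top-\(k\) eigenvectors, then K-means on the embedded rows.

```python
import numpy as np

def ng_jordan(X, sigma, k, n_iter=50):
    """Sketch of Ng-Jordan-Weiss spectral clustering."""
    # 1. Gaussian affinity matrix with zero diagonal
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    A = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    # 2. Symmetric normalization L = D^{-1/2} A D^{-1/2}
    dinv = 1.0 / np.sqrt(A.sum(1))
    L = dinv[:, None] * A * dinv[None, :]
    # 3. Top-k eigenvectors (eigh: ascending order), rows scaled to unit length
    _, vecs = np.linalg.eigh(L)
    Y = vecs[:, -k:]
    Y = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    # 4. K-means on the embedded rows; farthest-point init (a design choice)
    centers = np.empty((k, Y.shape[1]))
    centers[0] = Y[0]
    for c in range(1, k):
        d2 = ((Y[:, None, :] - centers[None, :c, :]) ** 2).sum(-1).min(1)
        centers[c] = Y[d2.argmax()]
    for _ in range(n_iter):
        labels = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
        for c in range(k):
            if (labels == c).any():
                centers[c] = Y[labels == c].mean(0)
    return labels
```

On well-separated groups the normalized affinity matrix is nearly block diagonal, the embedded rows collapse to \(k\) tight point masses on the unit sphere, and the final K-means step becomes trivial.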
Copyright information
© 2015 Springer-Verlag London
Cite this chapter
Camastra, F., Vinciarelli, A. (2015). Kernel Methods. In: Machine Learning for Audio, Image and Video Analysis. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/978-1-4471-6735-8_9
Print ISBN: 978-1-4471-6734-1
Online ISBN: 978-1-4471-6735-8