Abstract
We present a general study of learning and linear separability with rational kernels, the sequence kernels commonly used in computational biology and natural language processing. We give a characterization of the class of all languages linearly separable with rational kernels and prove several properties of the class of languages linearly separable with a fixed rational kernel. In particular, we show that for kernels with transducer values in a finite set, these languages are necessarily finite Boolean combinations of preimages by a transducer of a single sequence. We also analyze the margin properties of linear separation with rational kernels and show that kernels with transducer values in a finite set guarantee a positive margin and lead to better learning guarantees. Creating a rational kernel with values in a finite set is often non-trivial even for relatively simple cases. However, we present a novel and general algorithm, double-tape disambiguation, that takes as input a transducer mapping sequences to sequence features, and yields an associated transducer that defines a finite range rational kernel. We describe the algorithm in detail and show its application to several cases of interest.
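The distinction the abstract draws between unbounded and finite-range transducer values can be illustrated with a toy n-gram kernel. The sketch below is not the double-tape disambiguation algorithm and uses no transducer library; the function names (`ngram_counts`, `count_kernel`, `presence_kernel`) are hypothetical and chosen only for illustration. It assumes the feature map sends a sequence to its contiguous n-grams: with counts as feature values the range is unbounded, whereas with presence indicators every feature value lies in the finite set {0, 1}.

```python
from collections import Counter


def ngram_counts(seq: str, n: int) -> Counter:
    """Count each contiguous n-gram of seq (feature values are unbounded)."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def count_kernel(x: str, y: str, n: int = 2) -> int:
    """Inner product of n-gram count vectors (illustrative, unbounded features)."""
    cx, cy = ngram_counts(x, n), ngram_counts(y, n)
    return sum(c * cy[g] for g, c in cx.items() if g in cy)


def presence_kernel(x: str, y: str, n: int = 2) -> int:
    """Inner product of n-gram indicator vectors (features take values in {0, 1})."""
    return len(set(ngram_counts(x, n)) & set(ngram_counts(y, n)))


if __name__ == "__main__":
    print(count_kernel("abab", "abab"))     # counts: ab=2, ba=1
    print(presence_kernel("abab", "abab"))  # shared bigrams: ab, ba
```

In the indicator version, repeating a pattern in a sequence does not change its feature vector, which is the kind of finite-range behaviour the abstract associates with a positive margin.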
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Cortes, C., Kontorovich, L., Mohri, M. (2007). Learning Languages with Rational Kernels. In: Bshouty, N.H., Gentile, C. (eds) Learning Theory. COLT 2007. Lecture Notes in Computer Science, vol 4539. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72927-3_26
DOI: https://doi.org/10.1007/978-3-540-72927-3_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72925-9
Online ISBN: 978-3-540-72927-3
eBook Packages: Computer Science (R0)