Distributional Models for Lexical Semantics: An Investigation of Different Representations for Natural Language Learning

Croce, Danilo; Filice, Simone; Basili, Roberto

doi:10.1007/978-3-319-14206-7_6

Part of the book series: Studies in Computational Intelligence (SCI, volume 589)

Abstract

Language learning systems usually generalize linguistic observations into rules and patterns, which act as statistical models of higher-level semantic inferences. When training data are scarce, lexical information is limited by data sparseness, and generalization is therefore needed. Distributional models represent lexical semantic information in terms of the basic co-occurrences between words in large-scale text collections. As recent work has shown, defining proper distributional models, together with methods that express the meaning of phrases or sentences as operations on lexical representations, is a complex problem and still a largely open issue. In this paper, a perspective centered on Convolution Kernels is discussed, and the formulation of a Partial Tree Kernel that integrates syntactic information and lexical generalization is studied. Moreover, a large-scale investigation of different representation spaces, each capturing a different linguistic relation, is provided.
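As an illustration of the co-occurrence idea mentioned in the abstract, the following is a minimal sketch of a window-based distributional model, not the chapter's actual setup. The toy corpus, the window size, the use of raw counts, and the function names `cooccurrence_vectors` and `cosine` are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[target][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[word] for word, count in u.items() if word in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus, invented for illustration only (assumption, not from the chapter).
corpus = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "the cat chases the dog".split(),
]
vectors = cooccurrence_vectors(corpus, window=2)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_milk = cosine(vectors["cat"], vectors["milk"])
```

In this toy setting, words that occur in similar contexts ("cat" and "dog") end up closer than words that merely co-occur ("cat" and "milk"). A realistic space would typically be built from a large corpus and use a weighting scheme rather than raw counts.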

Notes

1. Note that SVD emphasizes the directions of maximal covariance in \(M\), i.e. term clusters for which the difference between contexts (short syntagmatic patterns) is maximal.
2. When \(n_1\) and \(n_2\) are not lexical nodes, \(\sigma\) is 0 whenever \(n_1 \ne n_2\).
3. http://cogcomp.cs.illinois.edu/Data/QA/QC/
4. http://disi.unitn.it/moschitti/Tree-Kernel.htm
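
The behaviour of \(\sigma\) described in note 2 can be sketched as a small node-similarity function. This is a hedged illustration only: the `Node` class, the `is_lexical` flag, the value 1 for identical nodes, and the pluggable `word_similarity` function are assumptions introduced here. The one property taken from the note is that two different non-lexical nodes get similarity 0.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Node:
    """Hypothetical tree node: a label plus a flag marking lexical (word) nodes."""
    label: str
    is_lexical: bool = False


def sigma(n1: Node, n2: Node, word_similarity: Callable[[str, str], float]) -> float:
    """Node similarity in the spirit of note 2 (details beyond the note are assumed)."""
    if n1 == n2:
        # Assumption: identical nodes match fully.
        return 1.0
    if n1.is_lexical and n2.is_lexical:
        # Assumption: lexical nodes are compared through a word-level similarity,
        # for example cosine similarity in a distributional space.
        return word_similarity(n1.label, n2.label)
    # From note 2: different nodes that are not both lexical contribute 0.
    return 0.0


# Toy similarity table, invented for illustration only.
toy_scores = {frozenset(["cat", "dog"]): 0.8}


def toy_similarity(w1: str, w2: str) -> float:
    return toy_scores.get(frozenset([w1, w2]), 0.0)


same_label = sigma(Node("NP"), Node("NP"), toy_similarity)
different_label = sigma(Node("NP"), Node("VP"), toy_similarity)
lexical_pair = sigma(Node("cat", True), Node("dog", True), toy_similarity)
```

Passing the word-level similarity as a parameter keeps the sketch independent of any particular distributional space.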

Author information

Authors and Affiliations

Department of Computer Science, Systems and Production, University of Rome Tor Vergata, Via Del Politecnico 1, 00133, Rome, Italy
Danilo Croce, Simone Filice & Roberto Basili

Corresponding author

Correspondence to Danilo Croce.

Editor information

Editors and Affiliations

Department of Computer Science, Systems and Production, University of Rome Tor Vergata, Rome, Italy
Roberto Basili
Department of Computer Science, University of Turin, Turin, Italy
Cristina Bosco
Department of Language and Cultural Studies, Department of Computer Science, Ca’ Foscari University of Venice, Venezia, Italy
Rodolfo Delmonte
Department of Computer Science and Information Engineering, University of Trento, Trento, Italy
Alessandro Moschitti
Department of Computer Science, University of Pisa, Pisa, Italy
Maria Simi

About this chapter

Cite this chapter

Croce, D., Filice, S., Basili, R. (2015). Distributional Models for Lexical Semantics: An Investigation of Different Representations for Natural Language Learning. In: Basili, R., Bosco, C., Delmonte, R., Moschitti, A., Simi, M. (eds) Harmonization and Development of Resources and Tools for Italian Natural Language Processing within the PARLI Project. Studies in Computational Intelligence, vol 589. Springer, Cham. https://doi.org/10.1007/978-3-319-14206-7_6

DOI: https://doi.org/10.1007/978-3-319-14206-7_6
Published: 15 January 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14205-0
Online ISBN: 978-3-319-14206-7
eBook Packages: Engineering, Engineering (R0)
