Searching for Part of Speech Tags That Improve Parsing Models

  • Conference paper
Advances in Natural Language Processing (GoTAL 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5221)

Abstract

We introduce a technique for inducing a refinement of the set of part of speech tags related to verbs. We cluster verbs according to their syntactic behavior in a dependency structure setting. The set of clusters is determined automatically by means of a quality measure over the probabilistic automata that describe words in a bilexical grammar. Each resulting cluster defines a new part of speech tag. We evaluate the resulting tag set in a state-of-the-art phrase structure parser and show that the induced part of speech tags significantly improve the parser's accuracy.

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Domínguez, M.A., Infante-Lopez, G. (2008). Searching for Part of Speech Tags That Improve Parsing Models. In: Nordström, B., Ranta, A. (eds) Advances in Natural Language Processing. GoTAL 2008. Lecture Notes in Computer Science, vol 5221. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85287-2_13

  • DOI: https://doi.org/10.1007/978-3-540-85287-2_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85286-5

  • Online ISBN: 978-3-540-85287-2

  • eBook Packages: Computer Science, Computer Science (R0)
