
Regularized Least-Squares for Parse Ranking

  • Conference paper
Advances in Intelligent Data Analysis VI (IDA 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3646)


Abstract

We present an adaptation of the Regularized Least-Squares algorithm for the rank learning problem and an application of the method to reranking of the parses produced by the Link Grammar (LG) dependency parser. We study the use of several grammatically motivated features extracted from parses and evaluate the ranker with individual features and the combination of all features on a set of biomedical sentences annotated for syntactic dependencies. Using a parse goodness function based on the F-score, we demonstrate that our method produces a statistically significant increase in rank correlation from 0.18 to 0.42 compared to the built-in ranking heuristics of the LG parser. Further, we analyze the performance of our ranker with respect to the number of sentences and parses per sentence used for training and illustrate that the method is applicable to sparse datasets, showing improved performance with as few as 100 training sentences.
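The pipeline the abstract describes can be illustrated with a minimal sketch, under several assumptions that are not taken from the paper: a plain linear regularized least-squares regressor fitted directly to F-score goodness values, Kendall's tau-a as the rank correlation measure, and invented toy parses and feature vectors. All function names and the regularization value are hypothetical.

```python
from itertools import combinations


def f_score(predicted, gold):
    """F-score (harmonic mean of precision and recall) of a set of dependencies."""
    correct = len(predicted & gold)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def rls_fit(features, targets, lam=0.1):
    """Linear RLS: minimise sum_i (w.x_i - y_i)^2 + lam * ||w||^2.

    The minimiser solves (X^T X + lam * I) w = X^T y.
    """
    d = len(features[0])
    gram = [[sum(x[i] * x[j] for x in features) + (lam if i == j else 0.0)
             for j in range(d)] for i in range(d)]
    rhs = [sum(x[i] * y for x, y in zip(features, targets)) for i in range(d)]
    return solve(gram, rhs)


def predict(w, x):
    """Score a parse as the inner product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))


def kendall_tau(a, b):
    """Kendall's tau-a: (concordant pairs - discordant pairs) / all pairs."""
    pairs = list(combinations(range(len(a)), 2))
    total = 0
    for i, j in pairs:
        s = (a[i] - a[j]) * (b[i] - b[j])
        total += (s > 0) - (s < 0)
    return total / len(pairs)


# Toy data (invented): dependencies are (head, dependent) index pairs.
gold = {(1, 2), (2, 3), (3, 4)}
parses = [
    {(1, 2), (2, 3), (3, 4)},
    {(1, 2), (2, 3), (2, 4)},
    {(1, 3), (2, 3), (2, 4)},
    {(1, 3), (1, 4), (2, 4)},
]
# Hypothetical features per parse: a bias term and one numeric parse feature.
features = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]

targets = [f_score(p, gold) for p in parses]    # goodness values to regress on
w = rls_fit(features, targets, lam=0.1)         # learn the scoring function
scores = [predict(w, x) for x in features]      # predicted goodness per parse
tau = kendall_tau(scores, targets)              # agreement of the two rankings
```

Here the ranker is evaluated on its own training parses only to keep the sketch short; a realistic evaluation would compute the rank correlation on held-out sentences, and a kernelized formulation could replace the linear one.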




Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tsivtsivadze, E., Pahikkala, T., Pyysalo, S., Boberg, J., Mylläri, A., Salakoski, T. (2005). Regularized Least-Squares for Parse Ranking. In: Famili, A.F., Kok, J.N., Peña, J.M., Siebes, A., Feelders, A. (eds) Advances in Intelligent Data Analysis VI. IDA 2005. Lecture Notes in Computer Science, vol 3646. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11552253_42


  • DOI: https://doi.org/10.1007/11552253_42

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28795-7

  • Online ISBN: 978-3-540-31926-9

  • eBook Packages: Computer Science (R0)
