Inference and estimation of a long-range trigram model

  • Conference paper
Grammatical Inference and Applications (ICGI 1994)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 862)

Abstract

We describe an implementation of a simple probabilistic link grammar. This probabilistic language model extends trigrams by allowing a word to be predicted not only from the two immediately preceding words, but potentially from any preceding pair of adjacent words that lie within the same sentence. In this way, the trigram model can skip over less informative words to make its predictions. The underlying “grammar” is nothing more than a list of pairs of words that can be linked together with one or more intervening words; this word-pair grammar is automatically inferred from a corpus of training text. We present a novel technique for indexing the model parameters that allows us to avoid all sorting in the M-step of the training algorithm. This yields significant savings in computation time and is applicable to the training of a general probabilistic link grammar. We also present results of preliminary experiments carried out for this class of models.
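
The prediction contexts described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `candidate_contexts`, the `linkable` set standing in for the inferred word-pair grammar, and the rule that a long-range context (w[j-1], w[j]) is allowed whenever the pair (w[j], target word) appears in that set are all assumptions made for illustration. Sentence-boundary padding and the probabilities themselves are left out.

```python
from typing import Iterator, List, Set, Tuple

Pair = Tuple[str, str]


def candidate_contexts(
    sentence: List[str], position: int, linkable: Set[Pair]
) -> Iterator[Tuple[int, Pair]]:
    """Yield (j, (w[j-1], w[j])) contexts that could predict sentence[position].

    `linkable` is a hypothetical stand-in for the word-pair grammar: a set of
    (left_word, right_word) pairs that may be linked across intervening words.
    """
    # Ordinary trigram context: the two immediately preceding words.
    if position >= 2:
        yield position - 1, (sentence[position - 2], sentence[position - 1])

    # Long-range contexts: any earlier adjacent pair whose second word may be
    # linked to the target, with at least one intervening word (j <= position - 2).
    target = sentence[position]
    for j in range(1, position - 1):
        if (sentence[j], target) in linkable:
            yield j, (sentence[j - 1], sentence[j])


sentence = "the dog that I saw yesterday barked".split()
linkable = {("dog", "barked")}  # assumed grammar entry, for illustration only
contexts = list(candidate_contexts(sentence, 6, linkable))
print(contexts)
```

Here "barked" can be predicted from the adjacent pair ("saw", "yesterday") as in a standard trigram model, or from the more informative pair ("the", "dog") by skipping over the relative clause.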

Research supported in part by NSF and ARPA under grants IRI-9314969 and N00014-92-C-0189.

Editor information

Rafael C. Carrasco, Jose Oncina

Copyright information

© 1994 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Della Pietra, S., Della Pietra, V., Gillett, J., Lafferty, J., Printz, H., Ureš, L. (1994). Inference and estimation of a long-range trigram model. In: Carrasco, R.C., Oncina, J. (eds) Grammatical Inference and Applications. ICGI 1994. Lecture Notes in Computer Science, vol 862. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-58473-0_139

  • DOI: https://doi.org/10.1007/3-540-58473-0_139

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-58473-5

  • Online ISBN: 978-3-540-48985-6

  • eBook Packages: Springer Book Archive
