
Constrained Stochastic Language Models

  • Conference paper
Image Models (and their Speech Model Cousins)

Part of the book series: The IMA Volumes in Mathematics and its Applications (IMA, volume 80)

Abstract

Stochastic language models incorporating both n-grams and context-free grammars are proposed. A constrained context-free model, specified by a stochastic context-free prior distribution with superimposed n-gram frequency constraints, is derived, and the resulting maximum-entropy distribution is shown to induce a Markov random field whose neighborhood structure at the leaves is determined by the relative n-gram frequencies. A computationally efficient version with identical neighborhood structure, the mixed tree/chain graph model, is also derived. In this model, a word-tree derivation is given by a stochastic context-free prior on trees down to the preterminal (part-of-speech) level, and word attachment is made by a nonstationary Markov chain. The mixed tree/chain graph model is compared with both the n-gram and context-free models on the Penn TreeBank using entropy measures. The model entropy of the mixed tree/chain graph model is shown to be lower than that of both the bigram and context-free models.
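
To make the two-stage structure of the mixed tree/chain graph model concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy grammar, a toy lexicon, and made-up probabilities; the names `RULES`, `LEXICON`, `BIGRAM`, `sample_tags`, `sample_words`, and `words_log_prob` are hypothetical. A stochastic context-free grammar generates the tree down to the part-of-speech tags, and words are then attached left to right by a chain whose transition law depends on the previous word and on the tag at the current position, which is one simple way to read "nonstationary".

```python
import math
import random

# Toy stochastic context-free grammar down to the preterminal
# (part-of-speech) level. Rules and probabilities are illustrative
# assumptions, not values from the paper.
RULES = {
    "S": [(("NP", "VP"), 1.0)],
    "NP": [(("DT", "NN"), 0.7), (("NN",), 0.3)],
    "VP": [(("VB", "NP"), 0.6), (("VB",), 0.4)],
}
PRETERMINALS = {"DT", "NN", "VB"}

# Default word distribution for each tag (hypothetical numbers).
LEXICON = {
    "DT": {"the": 0.6, "a": 0.4},
    "NN": {"dog": 0.5, "cat": 0.5},
    "VB": {"sees": 0.5, "sleeps": 0.5},
}
# Distributions conditioned on (previous word, current tag); these
# override the default and play the role of bigram dependence.
BIGRAM = {
    ("a", "NN"): {"dog": 0.2, "cat": 0.8},
    ("cat", "VB"): {"sees": 0.1, "sleeps": 0.9},
}


def sample_tags(symbol, rng):
    """Expand `symbol` top-down; return (tag list, log prob of the subtree)."""
    if symbol in PRETERMINALS:
        return [symbol], 0.0
    options = RULES[symbol]
    weights = [p for _, p in options]
    rhs, p = rng.choices(options, weights=weights)[0]
    tags, logp = [], math.log(p)
    for child in rhs:
        child_tags, child_logp = sample_tags(child, rng)
        tags.extend(child_tags)
        logp += child_logp
    return tags, logp


def word_dist(prev_word, tag):
    """Transition law of the chain at a position carrying `tag`."""
    return BIGRAM.get((prev_word, tag), LEXICON[tag])


def sample_words(tags, rng):
    """Attach one word per tag, left to right; return (words, log prob)."""
    words, logp, prev = [], 0.0, "<s>"
    for tag in tags:
        dist = word_dist(prev, tag)
        word = rng.choices(list(dist), weights=list(dist.values()))[0]
        logp += math.log(dist[word])
        words.append(word)
        prev = word
    return words, logp


def words_log_prob(tags, words):
    """Log probability of `words` given `tags` under the word chain."""
    logp, prev = 0.0, "<s>"
    for tag, word in zip(tags, words):
        logp += math.log(word_dist(prev, tag)[word])
        prev = word
    return logp


if __name__ == "__main__":
    rng = random.Random(0)
    tags, tree_logp = sample_tags("S", rng)
    words, chain_logp = sample_words(tags, rng)
    print(tags, words, tree_logp + chain_logp)
```

The joint log probability of a sampled sentence is the sum of the tree term and the chain term. Averaging its negative over many samples would give a Monte Carlo estimate of the toy model's entropy, which is the kind of quantity the abstract uses to compare models.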




© 1996 Springer-Verlag New York, Inc.

About this paper

Cite this paper

Mark, K.E., Miller, M.I., Grenander, U. (1996). Constrained Stochastic Language Models. In: Levinson, S.E., Shepp, L. (eds) Image Models (and their Speech Model Cousins). The IMA Volumes in Mathematics and its Applications, vol 80. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-4056-3_7


  • DOI: https://doi.org/10.1007/978-1-4612-4056-3_7

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4612-8482-6

  • Online ISBN: 978-1-4612-4056-3

  • eBook Packages: Springer Book Archive
