Skip to main content

Cross-Lingual Text Mining

  • Reference work entry
Encyclopedia of Machine Learning

Definition

Cross-lingual text mining is a general category denoting tasks and methods for accessing the information in sets of documents written in several languages, or whenever the language used to express an information need is different from the language of the documents. A distinguishing feature of cross-lingual text mining is the necessity to overcome some language translation barrier.

Motivation and Background

Advances in mass storage and network connectivity make enormous amounts of information easily accessible to an increasingly large fraction of the world population. Such information is mostly encoded in the form of running text which, in most cases, is written in a language different from the native language of the user. This state of affairs creates many situations in which the main barrier to the fulfillment of an information need is not technological but linguistic. For example, in some cases the user has some knowledge of the language in which the text containing a...

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Institutional subscriptions

Recommended Reading

  • Brown, P. E., Della Pietra, V. J., Della Pietra, S. A., & Mercer, R. L. (1993). The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics, 12(2), 263–311.

    Google Scholar 

  • Gaussier, E., Renders, J.-M., Matveeva, I., Goutte, C., & Déjean, H. (2004). A geometric view on bilingual lexicon extraction from comparable corpora. In Proceedings of the 42nd annual meeting of the association for computational linguistics, Barcelona, Spain. Morristown, NJ: Association for Computational Linguistics.

    Google Scholar 

  • Savoy, J., & Berger, P. Y. (2005). Report on CLEF-2005 evaluation campaign: Monolingual, bilingual and GIRT information retrieval. In Proceedings of the cross-language evaluation forum (CLEF) (pp. 131–140). Heidelberg: Springer.

    Google Scholar 

  • Zhang, Y., & Vines, P. (2005). Using the web for translation disambiguation. In Proceedings of the NTCIR-5 workshop meeting, Tokyo, Japan.

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2011 Springer Science+Business Media, LLC

About this entry

Cite this entry

Cancedda, N., Renders, JM. (2011). Cross-Lingual Text Mining. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-30164-8_189

Download citation

Publish with us

Policies and ethics