Cross-Lingual Information Retrieval by Feature Vectors

  • Conference paper
Natural Language Processing and Information Systems (NLDB 2007)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4592)

Abstract

This paper investigates query translation in cross-lingual information retrieval, in particular the challenges caused by ambiguity and polysemy. Our approach is based on feature vectors and uses context when translating queries. Good query translation is difficult to achieve because short queries carry little context. We argue that information external to the query, such as ontologies and document collections, can reduce the effect of ambiguity and polysemy. We propose and discuss different approaches for translating these feature vectors.
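
To make the idea of context-aware translation concrete, the following is a minimal sketch and not the method of the paper. It assumes a feature vector is a mapping from terms to weights, and it resolves an ambiguous term by choosing the translation candidate that co-occurs most often with the candidates of the other terms in the vector. The dictionary, the co-occurrence counts, and the function name `translate_vector` are all hypothetical and stand in for a real bilingual lexicon and a target-language document collection.

```python
from collections import Counter

# Hypothetical toy bilingual dictionary (English -> German candidates).
DICTIONARY = {
    "bank": ["Bank", "Ufer"],   # financial institution vs. river bank
    "money": ["Geld"],
    "river": ["Fluss"],
}

# Hypothetical co-occurrence counts, standing in for statistics
# gathered from a target-language document collection.
COOCCURRENCE = Counter({
    frozenset(("Bank", "Geld")): 9,
    frozenset(("Ufer", "Fluss")): 7,
    frozenset(("Bank", "Fluss")): 1,
})


def translate_vector(feature_vector):
    """Translate a term->weight vector, using the other terms as context.

    For each term, the candidate with the highest total co-occurrence
    with the candidates of all other terms is kept (assumed heuristic).
    """
    translated = {}
    for term, weight in feature_vector.items():
        # Unknown terms are passed through untranslated.
        candidates = DICTIONARY.get(term, [term])
        # Context: every translation candidate of every other term.
        context = [
            c
            for other in feature_vector
            if other != term
            for c in DICTIONARY.get(other, [other])
        ]

        def support(candidate):
            # Counter returns 0 for pairs that were never observed.
            return sum(COOCCURRENCE[frozenset((candidate, c))] for c in context)

        best = max(candidates, key=support)
        translated[best] = translated.get(best, 0.0) + weight
    return translated


finance = translate_vector({"bank": 1.0, "money": 0.5})
nature = translate_vector({"bank": 1.0, "river": 0.5})
```

With these toy counts, the same source term "bank" ends up as "Bank" next to "money" and as "Ufer" next to "river", which is the kind of effect the abstract attributes to using information external to the query.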

Editor information

Zoubida Kedad, Nadira Lammari, Elisabeth Métais, Farid Meziane, Yacine Rezgui

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lilleng, J., Tomassen, S.L. (2007). Cross-Lingual Information Retrieval by Feature Vectors. In: Kedad, Z., Lammari, N., Métais, E., Meziane, F., Rezgui, Y. (eds) Natural Language Processing and Information Systems. NLDB 2007. Lecture Notes in Computer Science, vol 4592. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73351-5_20

  • DOI: https://doi.org/10.1007/978-3-540-73351-5_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73350-8

  • Online ISBN: 978-3-540-73351-5

  • eBook Packages: Computer Science, Computer Science (R0)
