Abstract
This paper investigates query translation in cross-lingual information retrieval, focusing on the challenges caused by ambiguity and polysemy. Our approach is based on feature vectors and uses context when translating queries. Good query translation is difficult to achieve because queries are typically short and carry little context. We argue that information external to the query, such as ontologies and document collections, can reduce the effect of ambiguity and polysemy. Different approaches to translating these feature vectors are proposed and discussed.
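To make the idea of context-aware translation concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes that a feature vector maps associated terms to weights, that a bilingual dictionary is available for those associated terms, and that each candidate translation has its own feature vector in the target language. All names (CANDIDATES, CONTEXT_DICT, TARGET_VECTORS, pick_translation) and all data values are hypothetical and invented for illustration.

```python
from math import sqrt

# Hypothetical toy data, invented for illustration only.
# Candidate German translations of an ambiguous English query term.
CANDIDATES = {"bank": ["Bank", "Ufer"]}

# Assumed bilingual dictionary for the terms found in source feature vectors.
CONTEXT_DICT = {"money": "Geld", "loan": "Kredit",
                "river": "Fluss", "water": "Wasser"}

# Assumed target-language feature vectors (associated term -> weight).
TARGET_VECTORS = {
    "Bank": {"Geld": 0.8, "Kredit": 0.6, "Konto": 0.5},
    "Ufer": {"Fluss": 0.9, "Wasser": 0.7, "See": 0.4},
}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def translate_vector(vector):
    """Map each associated term to the target language; drop unknown terms."""
    translated = {}
    for term, weight in vector.items():
        if term in CONTEXT_DICT:
            target = CONTEXT_DICT[term]
            translated[target] = translated.get(target, 0.0) + weight
    return translated


def pick_translation(term, source_vector):
    """Choose the candidate whose target vector best matches the context."""
    translated = translate_vector(source_vector)
    return max(CANDIDATES[term],
               key=lambda c: cosine(translated, TARGET_VECTORS[c]))


print(pick_translation("bank", {"river": 0.9, "water": 0.5}))
print(pick_translation("bank", {"money": 0.7, "loan": 0.4}))
```

In this sketch the context comes from the feature vector rather than from the query itself, which is one way a short query could still be disambiguated. In practice the vectors would be derived from an ontology or a document collection rather than written by hand.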
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lilleng, J., Tomassen, S.L. (2007). Cross-Lingual Information Retrieval by Feature Vectors. In: Kedad, Z., Lammari, N., Métais, E., Meziane, F., Rezgui, Y. (eds) Natural Language Processing and Information Systems. NLDB 2007. Lecture Notes in Computer Science, vol 4592. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73351-5_20
DOI: https://doi.org/10.1007/978-3-540-73351-5_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73350-8
Online ISBN: 978-3-540-73351-5
eBook Packages: Computer Science, Computer Science (R0)