Abstract
This paper presents experiments carried out for the Spanish monolingual task at CLEF 2002, continuing our research on term expansion. Last year we presented results on stemming; this year our effort centres on term expansion using thesauri. Many words that derive from the same stem have closely related meanings. However, words with very different stems can also be semantically close. In this case, analysing the relationships between words in a document collection can be used to construct a thesaurus of related terms, which can then be used to expand a term with its best related terms. We describe experiments carried out to study term expansion using association and similarity thesauri.
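The general idea of building a thesaurus from a document collection and using it to expand a term can be illustrated with a minimal sketch. It is not the authors' implementation: the Dice coefficient over document-level co-occurrence, the whitespace tokenisation, the toy documents, and the function names `build_association_thesaurus` and `expand_term` are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_association_thesaurus(docs):
    """Map each term to {related_term: score} using document co-occurrence.

    Assumption: association strength is the Dice coefficient,
    2 * df(a, b) / (df(a) + df(b)); the paper may use a different measure.
    """
    doc_freq = defaultdict(int)  # number of documents containing a term
    co_freq = defaultdict(int)   # number of documents containing both terms
    for doc in docs:
        terms = sorted(set(doc.lower().split()))  # naive tokenisation
        for term in terms:
            doc_freq[term] += 1
        for a, b in combinations(terms, 2):
            co_freq[(a, b)] += 1

    thesaurus = defaultdict(dict)
    for (a, b), n_ab in co_freq.items():
        score = 2 * n_ab / (doc_freq[a] + doc_freq[b])
        thesaurus[a][b] = score
        thesaurus[b][a] = score
    return thesaurus


def expand_term(term, thesaurus, k=2):
    """Return the k best related terms (ties broken alphabetically)."""
    related = thesaurus.get(term, {})
    ranked = sorted(related.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:k]]


# Toy collection, purely illustrative.
docs = [
    "coche motor gasolina",
    "automóvil motor gasolina",
    "coche carretera",
    "playa sol arena",
]
thesaurus = build_association_thesaurus(docs)
expanded = expand_term("coche", thesaurus, k=2)
```

In this toy collection "coche" and "automóvil" never appear in the same document, so a direct co-occurrence measure like this one does not relate them, even though they share the context terms "motor" and "gasolina".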
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zazo, Á.F., Figuerola, C.G., Berrocal, J.L.A., Rodríguez, E., Gómez, R. (2003). Experiments in Term Expansion Using Thesauri in Spanish. In: Peters, C., Braschler, M., Gonzalo, J., Kluck, M. (eds) Advances in Cross-Language Information Retrieval. CLEF 2002. Lecture Notes in Computer Science, vol 2785. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45237-9_25
DOI: https://doi.org/10.1007/978-3-540-45237-9_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40830-7
Online ISBN: 978-3-540-45237-9
eBook Packages: Springer Book Archive