Abstract
This paper reports on our first participation in the CLEF 2001 monolingual retrieval task, using French. The experimental findings indicate that Okapi, the text retrieval system used, can be applied successfully to non-English text retrieval. The basic search system requires considerable internal preprocessing to convert the data into Okapi access formats. Various shell scripts were written to perform this conversion in a UNIX environment; without it, overall performance would have been significantly impeded. Our experiments with Okapi, which was originally designed for English, made clear that although most European languages share conventional word boundaries and form variant word morphemes by adding suffixes, French and English retrieval differ significantly depending on how the indexing and search strategies are adapted. No sophisticated methods for improving recall and precision, such as stemming, phrase translation or de-compounding, were employed in the experiment, and our results were poor, plausibly as a consequence. Future participation would include more refined query translation tools.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Matoyo, E., Valsamidis, T. (2002). Across the Bridge: CLEF 2001 — Non-english Monolingual Retrieval. The French Task. In: Peters, C., Braschler, M., Gonzalo, J., Kluck, M. (eds) Evaluation of Cross-Language Information Retrieval Systems. CLEF 2001. Lecture Notes in Computer Science, vol 2406. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45691-0_27
DOI: https://doi.org/10.1007/3-540-45691-0_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44042-0
Online ISBN: 978-3-540-45691-9
eBook Packages: Springer Book Archive