A Baseline for NLP in Domain-Specific IR

  • Conference paper
Accessing Multilingual Information Repositories (CLEF 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4022)

Abstract

The information retrieval (IR) methods employed for the third participation of the University of Hagen in the domain-specific task of the Cross Language Evaluation Campaign (CLEF 2005) provide a better baseline for experiments with natural language processing (NLP) methods in domain-specific IR than the methods employed in our previous participations. The baseline consists of a combination of state-of-the-art IR methods with NLP methods for document and query processing.

Our monolingual experiments with German documents combine several methods to achieve better performance, including an entry vocabulary module (EVM), query expansion with semantically related concepts, and a blind feedback technique. The monolingual experiments focus on comparing two techniques for constructing database queries: creating a ‘bag of words’ and creating a semantic network by means of deep linguistic analysis of the query.
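To make the combination of a 'bag of words' query with blind feedback concrete, the following is a minimal Python sketch. It is not the system described in the paper: the function name `blind_feedback`, its parameters, and the choice of raw term frequency for selecting expansion terms are assumptions made purely for illustration.

```python
from collections import Counter

def blind_feedback(query_terms, ranked_docs, top_docs=2, new_terms=2):
    """Expand a bag-of-words query with frequent terms from the top-ranked documents.

    Assumption: documents are already tokenized and ranked by an initial
    retrieval run; expansion terms are picked by raw frequency.
    """
    counts = Counter()
    for doc in ranked_docs[:top_docs]:
        # Count only terms that are not already part of the query.
        counts.update(term for term in doc if term not in query_terms)
    expansion = [term for term, _ in counts.most_common(new_terms)]
    return list(query_terms) + expansion

# Hypothetical toy data: three tokenized German documents in rank order.
ranked = [
    ["jugend", "arbeit", "markt", "markt"],
    ["markt", "ausbildung", "jugend"],
    ["wahl"],
]
expanded = blind_feedback(["jugend"], ranked)
```

The expanded query would then be submitted for a second retrieval pass. A real system would typically weight candidate terms (for example by an inverse document frequency factor) rather than count them.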

For the bilingual experiments, the English topics are translated into German queries with several publicly available machine translation (MT) services. Each set of translated topics is processed separately with the same techniques as in the monolingual experiments. Evaluation results are presented for the official experiments with a staged logistic regression and for additional experiments with BM25.
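For readers unfamiliar with BM25, the ranking function named above can be sketched as follows. This is a generic Okapi BM25 variant, not the configuration used in the experiments: the parameter values `k1=1.2` and `b=0.75`, the smoothed IDF form, and the function name `bm25_scores` are assumptions.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms` with BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF variant (an assumption) that stays non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical toy collection of tokenized German documents.
docs = [
    ["arbeit", "markt", "jugend"],
    ["jugend", "kultur"],
    ["wahl", "politik"],
]
scores = bm25_scores(["jugend", "arbeit"], docs)
```

In this toy collection the first document matches both query terms and therefore scores highest, the second matches one term, and the third matches none and scores zero.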

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Leveling, J. (2006). A Baseline for NLP in Domain-Specific IR. In: Peters, C., et al. Accessing Multilingual Information Repositories. CLEF 2005. Lecture Notes in Computer Science, vol 4022. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11878773_26

  • DOI: https://doi.org/10.1007/11878773_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-45697-1

  • Online ISBN: 978-3-540-45700-8

  • eBook Packages: Computer Science, Computer Science (R0)