
IR Evaluation without a Common Set of Topics

  • Conference paper
Advances in Information Retrieval Theory (ICTIR 2009)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 5766)


Abstract

System effectiveness evaluation in a TREC-like environment is usually performed on a common set of topics. We show that a reliable evaluation can be obtained even when different systems are evaluated on different topics, and that reliability increases further with appropriate topic selection strategies and metric normalizations.
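
The setting can be made concrete with a minimal sketch (the score matrix, the topic subsets, and the particular normalization below are hypothetical, chosen only for illustration; the paper itself studies which selection strategies and normalizations work best): once per-topic scores are normalized so that topic difficulty no longer dominates, a system evaluated on one subset of topics can be compared with a system evaluated on another.

```python
# Illustrative sketch only: the score matrix, the topic subsets, and the
# z-score normalization are assumptions for demonstration, not taken from
# the paper.
import numpy as np

# Rows = systems, columns = topics; entries are per-topic effectiveness
# scores (e.g. average precision) from some existing runs.
ap = np.array([
    [0.30, 0.55, 0.10, 0.70, 0.42],   # system A
    [0.25, 0.60, 0.05, 0.65, 0.50],   # system B
    [0.40, 0.45, 0.20, 0.60, 0.35],   # system C
])

# Per-topic standardization: subtract the topic mean and divide by the
# topic standard deviation, so that scores from easy and hard topics
# become comparable.
z = (ap - ap.mean(axis=0)) / ap.std(axis=0)

# Compare two systems evaluated on *different* topic subsets:
# system A on topics {0, 2, 4}, system B on topics {1, 3}.
score_a = z[0, [0, 2, 4]].mean()
score_b = z[1, [1, 3]].mean()
print(f"normalized score, system A (topics 0,2,4): {score_a:.3f}")
print(f"normalized score, system B (topics 1,3):   {score_b:.3f}")
```

Without a per-topic normalization of this kind, whichever system happened to receive the easier topics would be favored, which is why normalization matters when the topic sets differ.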

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cattelan, M., Mizzaro, S. (2009). IR Evaluation without a Common Set of Topics. In: Azzopardi, L., et al. Advances in Information Retrieval Theory. ICTIR 2009. Lecture Notes in Computer Science, vol 5766. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04417-5_35

  • DOI: https://doi.org/10.1007/978-3-642-04417-5_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04416-8

  • Online ISBN: 978-3-642-04417-5

  • eBook Packages: Computer Science, Computer Science (R0)
