A Search Log-Based Approach to Evaluation

  • Conference paper
Research and Advanced Technology for Digital Libraries (ECDL 2010)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6273)

Abstract

Anyone offering content in a digital library is naturally interested in assessing its performance: how well does my system meet the users’ information needs? Standard evaluation benchmarks have been developed in information retrieval that can be used to test retrieval effectiveness. However, these generic benchmarks focus on a single document genre, language, media type, and searcher stereotype that is radically different from the unique content and user community of a particular digital library. This paper proposes to derive a domain-specific test collection from readily available interaction data in search log files, capturing the domain-specificity of digital libraries. As a case study, we use an archival institution’s complete search log, which spans multiple years, and derive a large-scale test collection from it. For validation, we manually derive a set of topics judged by human experts, based on a set of e-mail reference questions and responses from archivists. Our main finding is that we can derive a reliable and domain-specific test collection from search log files.
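
The general idea of turning logged interactions into relevance judgments can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not say which interaction signals are used, so the sketch assumes that a click on a result counts as evidence of relevance. The record layout, the sample data, the `derive_qrels` function, and the `min_sessions` threshold are all hypothetical.

```python
from collections import defaultdict

# Hypothetical log records: (session_id, query, clicked_document_id).
# The layout and the values are illustrative assumptions, not the paper's data.
LOG = [
    ("s1", "ship passenger lists", "doc-12"),
    ("s1", "ship passenger lists", "doc-40"),
    ("s2", "Ship Passenger Lists ", "doc-12"),
    ("s3", "land registry maps", "doc-77"),
]


def derive_qrels(records, min_sessions=1):
    """Map each normalized query to {doc_id: number of distinct sessions that clicked it}."""
    sessions = defaultdict(lambda: defaultdict(set))
    for session_id, query, doc_id in records:
        # Normalize case and whitespace so trivially different queries share a topic.
        topic = " ".join(query.lower().split())
        sessions[topic][doc_id].add(session_id)

    qrels = {}
    for topic, docs in sessions.items():
        # Keep only documents clicked in at least `min_sessions` distinct sessions.
        judged = {d: len(s) for d, s in docs.items() if len(s) >= min_sessions}
        if judged:
            qrels[topic] = judged
    return qrels


qrels = derive_qrels(LOG)
```

Raising `min_sessions` trades coverage for agreement across users: with `min_sessions=2`, only documents clicked in at least two sessions for the same query remain as judgments.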

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, J., Kamps, J. (2010). A Search Log-Based Approach to Evaluation. In: Lalmas, M., Jose, J., Rauber, A., Sebastiani, F., Frommholz, I. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2010. Lecture Notes in Computer Science, vol 6273. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15464-5_26

  • DOI: https://doi.org/10.1007/978-3-642-15464-5_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15463-8

  • Online ISBN: 978-3-642-15464-5

  • eBook Packages: Computer Science, Computer Science (R0)
