Abstract
Europeana is a long-term project funded by the European Commission with the goal of making Europe’s cultural and scientific heritage accessible to the public. Since 2008, about 1500 institutions have contributed to Europeana, enabling people to explore the digital resources of Europe’s museums, libraries and archives. The huge amount of collected multilingual, multimedia data is made available today through the Europeana portal, a search engine that lets users explore this content through textual queries. One of the most important techniques for enhancing users’ search experience in large information spaces is the exploitation of the knowledge contained in query logs. In this paper we present a characterization of the Europeana query log, showing statistics on common behavioral patterns of Europeana users. Our analysis highlights some significant differences between the Europeana query log and the historical data collected in general-purpose Web search engine logs. In particular, we find that both the query and the search session distributions behave differently. Finally, we use this information to design a query recommendation technique aimed at enhancing the functionality of the Europeana portal.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ceccarelli, D., Gordea, S., Lucchese, C., Nardini, F.M., Tolomei, G. (2011). Improving Europeana Search Experience Using Query Logs. In: Gradmann, S., Borri, F., Meghini, C., Schuldt, H. (eds) Research and Advanced Technology for Digital Libraries. TPDL 2011. Lecture Notes in Computer Science, vol 6966. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24469-8_39
DOI: https://doi.org/10.1007/978-3-642-24469-8_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24468-1
Online ISBN: 978-3-642-24469-8
eBook Packages: Computer Science (R0)