Dynamic Clustering of Web Search Results

Yang, Li; Rahi, Adnan

doi:10.1007/3-540-44839-X_17

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2667)

Included in the following conference series:

International Conference on Computational Science and Its Applications

Abstract

A recurring problem in Web search is helping users quickly find useful links in a long list of returned URLs. Document clustering addresses this by organizing retrieval results into meaningful groups. Because a word in a document is naturally correlated with its neighboring words, document clustering often uses phrases rather than individual words to determine clusters. We have designed a system that clusters Web search results based on phrases containing one or more search keywords. We show that clustering on such phrases, rather than on whole documents, often gives more accurate and informative clusters. Algorithms and experimental results are discussed.
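
The abstract does not give the algorithms themselves, so the following is only a minimal sketch of the general idea of grouping results by shared phrases that contain a search keyword. It is not the authors' method. The function names `keyword_phrases` and `cluster_by_phrase`, the parameters `max_len` and `min_docs`, the choice of word n-grams as "phrases", and the sample snippets are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def keyword_phrases(text, keywords, max_len=3):
    """Return the set of word n-grams (2..max_len words) that contain a keyword.

    Assumption: a "phrase" is approximated here by a contiguous word n-gram.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    kws = {k.lower() for k in keywords}
    phrases = set()
    for n in range(2, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if kws.intersection(gram):  # keep only n-grams touching a keyword
                phrases.add(" ".join(gram))
    return phrases


def cluster_by_phrase(snippets, keywords, min_docs=2):
    """Group snippet indices by shared keyword phrases.

    Returns a dict mapping each phrase to the sorted list of snippet indices
    containing it, keeping only phrases found in at least `min_docs` snippets.
    Clusters may overlap: one snippet can appear under several phrases.
    """
    index = defaultdict(set)
    for doc_id, text in enumerate(snippets):
        for phrase in keyword_phrases(text, keywords):
            index[phrase].add(doc_id)
    return {p: sorted(ids) for p, ids in index.items() if len(ids) >= min_docs}


# Hypothetical search-result snippets for the query "jaguar".
snippets = [
    "Jaguar cars for sale in Michigan",
    "Used jaguar cars and parts",
    "The jaguar cat lives in the rainforest",
    "Facts about the jaguar cat",
]
clusters = cluster_by_phrase(snippets, ["jaguar"])
for phrase, doc_ids in sorted(clusters.items()):
    print(phrase, doc_ids)
```

In this toy input, the phrase shared by the first two snippets separates the automotive results from the animal results, which is the kind of keyword-centered grouping the abstract describes. A real system would also need to merge overlapping clusters and rank phrases, neither of which is attempted here.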

Author information

Authors and Affiliations

Department of Computer Science, Western Michigan University, Kalamazoo, MI, 49008, USA
Li Yang & Adnan Rahi

Editor information

Editors and Affiliations

Army High Performance Computing Research Center, USA
Vipin Kumar
Department of Computer Science, University of Calgary, Calgary, AB, T2N1N4, Canada
Marina L. Gavrilova
Heuchera Technologies Inc., 122 9251-8 Yonge Street, Richmond Hill, ON, Canada, L4C 9T3
Chih Jeng Kenneth Tan
Département d’informatique et de recherche opérationnelle, Université de Montréal, Montréal, Québec, H3C 3J7, Canada
Pierre L’Ecuyer
Department of Computer Science and Engineering, University of Minnesota, MN, 55455, USA
Vipin Kumar
The Queen’s University of Belfast, School of Computer Science, Belfast BT7 1NN, Northern Ireland, UK
Chih Jeng Kenneth Tan

About this paper

Cite this paper

Yang, L., Rahi, A. (2003). Dynamic Clustering of Web Search Results. In: Kumar, V., Gavrilova, M.L., Tan, C.J.K., L’Ecuyer, P. (eds) Computational Science and Its Applications — ICCSA 2003. ICCSA 2003. Lecture Notes in Computer Science, vol 2667. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44839-X_17

DOI: https://doi.org/10.1007/3-540-44839-X_17
Published: 18 June 2003
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40155-1
Online ISBN: 978-3-540-44839-6
eBook Packages: Springer Book Archive
