An Efficient Algorithm for Clustering Search Engine Results

Zhang, Hui; Pang, Bin; Xie, Ke; Wu, Hui

doi:10.1007/978-3-540-74377-4_69

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4456)

Included in the following conference series:

International Conference on Computational and Information Science

Abstract

As the number of Web documents on the Internet grows, popular keyword-matching search engines such as Google often return a long list of results ranked by relevance and importance to the query. Clustering these results groups them into several collections, which makes it easier for users to locate the results they actually need. In this paper, we propose a new Key-Feature Clustering (KFC) algorithm, which first extracts significant keywords from the results as key features and clusters them, and then clusters the documents based on these clustered key features. Finally, the paper presents and analyzes the results of experiments conducted to test and validate the algorithm.
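
The abstract names three stages (extract key features, cluster the key features, cluster the documents by those feature clusters) but gives no details. The Python sketch below is one possible reading of that pipeline, not the authors' KFC algorithm. Every concrete choice in it is an assumption made for illustration: document frequency as the significance measure, Jaccard overlap of document sets as the feature similarity, single-link merging, and all function names, parameters, and thresholds.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stopword list; the paper's preprocessing is not described in the abstract.
STOPWORDS = {"the", "a", "an", "of", "and", "for", "in", "to", "is", "on", "with"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def extract_key_features(docs, top_k=6, min_df=2, max_df_ratio=0.8):
    """Stage 1 (assumed): keep terms that appear in at least min_df documents
    but not in nearly all of them (e.g. the query term itself)."""
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    limit = max_df_ratio * len(docs)
    candidates = [(t, n) for t, n in df.items() if min_df <= n <= limit]
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [t for t, _ in candidates[:top_k]]


def cluster_features(docs, features, threshold=0.5):
    """Stage 2 (assumed): single-link grouping of features whose document
    sets have Jaccard similarity >= threshold, using union-find."""
    postings = {f: {i for i, d in enumerate(docs) if f in tokenize(d)} for f in features}
    parent = {f: f for f in features}

    def find(f):
        while parent[f] != f:
            parent[f] = parent[parent[f]]  # path halving
            f = parent[f]
        return f

    for i, a in enumerate(features):
        for b in features[i + 1:]:
            union = postings[a] | postings[b]
            if union and len(postings[a] & postings[b]) / len(union) >= threshold:
                parent[find(a)] = find(b)

    groups = defaultdict(set)
    for f in features:
        groups[find(f)].add(f)
    # Sort for a deterministic cluster order.
    return sorted(groups.values(), key=lambda g: sorted(g))


def cluster_documents(docs, feature_clusters):
    """Stage 3 (assumed): assign each document to the feature cluster it
    shares the most key features with; -1 if it shares none."""
    labels = []
    for doc in docs:
        tokens = set(tokenize(doc))
        overlaps = [len(tokens & g) for g in feature_clusters]
        best = max(range(len(overlaps)), key=overlaps.__getitem__, default=-1)
        labels.append(best if best >= 0 and overlaps[best] > 0 else -1)
    return labels


# Toy result snippets for an ambiguous query (made-up data, not from the paper).
docs = [
    "Jaguar car dealer and luxury car prices",
    "New Jaguar car models with luxury interior",
    "Jaguar cat habitat in the rainforest",
    "The jaguar is a big cat of the rainforest",
]
features = extract_key_features(docs)
feature_clusters = cluster_features(docs, features)
labels = cluster_documents(docs, feature_clusters)
print(features, feature_clusters, labels)
```

Clustering the small set of key features first, rather than the full document vectors, keeps the pairwise comparison step proportional to the number of features instead of the vocabulary size; whether the paper exploits this in the same way cannot be determined from the abstract.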

Author information

Authors and Affiliations

National Laboratory of Software Development Environment, Beihang University, Beijing 100083, China
Hui Zhang, Bin Pang, Ke Xie & Hui Wu

Editor information

Editors and Affiliations

School of Computer Science and Technology, Xidian University, 710071, Xi’an, China
Yuping Wang
Department of Computer Science, Hong Kong Baptist University, Hong Kong, China
Yiu-ming Cheung
Faculty of Applied Mathematics, Guangdong University of Technology, 5100006, Guangzhou, China
Hailin Liu

About this paper

Cite this paper

Zhang, H., Pang, B., Xie, K., Wu, H. (2007). An Efficient Algorithm for Clustering Search Engine Results. In: Wang, Y., Cheung, Ym., Liu, H. (eds) Computational Intelligence and Security. CIS 2006. Lecture Notes in Computer Science, vol 4456. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74377-4_69

DOI: https://doi.org/10.1007/978-3-540-74377-4_69
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74376-7
Online ISBN: 978-3-540-74377-4
eBook Packages: Computer Science, Computer Science (R0)
