An Efficient Algorithm for Clustering Search Engine Results

  • Conference paper
Computational Intelligence and Security (CIS 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4456)

Abstract

As the number of Web documents on the Internet grows, the most popular keyword-matching search engines, such as Google, often return a long list of results ranked by relevance and importance to the query. Clustering these results groups them into several collections, which makes it easier for users to locate the results they actually need. In this paper, we propose a new Key-Feature Clustering (KFC) algorithm that first extracts significant keywords from the results as key features and clusters them, and then clusters the documents based on these clustered key features. Finally, the paper presents and analyzes the results of experiments conducted to test and validate the algorithm.
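The abstract gives only the outline of KFC, so the following Python sketch is an illustration of the two-stage idea and not the authors' implementation. Every concrete choice in it is an assumption: the stopword list, the document-frequency filter used to pick key features (`min_df`, `max_df_ratio`), the Jaccard co-occurrence similarity with single-link merging used to cluster the features, and the overlap rule used to assign documents.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical stopword list; the paper does not specify one.
STOPWORDS = {"a", "an", "and", "the", "of", "for", "in", "on", "to", "is", "with"}


def tokenize(text):
    """Return the set of lowercase alphabetic tokens, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def extract_key_features(docs, min_df=2, max_df_ratio=0.8):
    """Stage 1 (assumed): keep words that occur in at least min_df results
    but not in almost all of them (such words are usually the query itself)."""
    df = Counter(w for d in docs for w in tokenize(d))
    return {w for w, n in df.items()
            if n >= min_df and n / len(docs) <= max_df_ratio}


def cluster_key_features(docs, features, threshold=0.5):
    """Stage 2 (assumed): single-link clustering of features whose sets of
    containing documents have Jaccard similarity >= threshold."""
    postings = {f: {i for i, d in enumerate(docs) if f in tokenize(d)}
                for f in features}
    parent = {f: f for f in features}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(sorted(features), 2):
        union = postings[a] | postings[b]
        if union and len(postings[a] & postings[b]) / len(union) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for f in features:
        groups.setdefault(find(f), set()).add(f)
    return sorted(sorted(g) for g in groups.values())


def cluster_documents(docs, feature_clusters):
    """Stage 3 (assumed): assign each result to the feature cluster it shares
    the most words with, or to None if it shares none."""
    assignment = {}
    for i, d in enumerate(docs):
        tokens = tokenize(d)
        overlaps = [len(tokens & set(c)) for c in feature_clusters]
        best = max(range(len(overlaps)), key=overlaps.__getitem__, default=None)
        assignment[i] = best if best is not None and overlaps[best] > 0 else None
    return assignment


if __name__ == "__main__":
    # Made-up result snippets for an ambiguous query.
    snippets = [
        "jaguar car dealer price",
        "jaguar car engine review",
        "jaguar cat habitat rainforest",
        "jaguar cat prey rainforest",
    ]
    features = extract_key_features(snippets)
    clusters = cluster_key_features(snippets, features)
    print(clusters)
    print(cluster_documents(snippets, clusters))
```

Clustering the features before the documents keeps the expensive pairwise comparison on a small vocabulary of key features instead of on the full result list, which is one plausible reading of why the paper calls the approach efficient.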




Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, H., Pang, B., Xie, K., Wu, H. (2007). An Efficient Algorithm for Clustering Search Engine Results. In: Wang, Y., Cheung, Ym., Liu, H. (eds) Computational Intelligence and Security. CIS 2006. Lecture Notes in Computer Science, vol 4456. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74377-4_69

  • DOI: https://doi.org/10.1007/978-3-540-74377-4_69

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74376-7

  • Online ISBN: 978-3-540-74377-4

  • eBook Packages: Computer Science, Computer Science (R0)
