A simple implementation of distributed vertical search and information integration technology

Wuhan University Journal of Natural Sciences

Abstract

This paper presents research on distributed vertical search and information integration technology based on Web mining, aimed at meeting the requirements of domain-specific applications. Mining, analyzing, and integrating Web content has become an important trend in everyday use. The techniques involved include the Map/Reduce model, depth search, and the basic principles of information integration. The paper focuses on how to implement a distributed vertical search engine based on Map/Reduce technology, together with an information integration system. A system optimization mechanism and system tests are also presented.
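To make the Map/Reduce idea concrete, the following is a minimal single-process sketch of building and querying an inverted index in the map/shuffle/reduce style. It is not the authors' implementation: the function names (`map_phase`, `shuffle`, `reduce_phase`, `build_index`, `search`), the tokenization rule, and the sample documents are all assumptions made for illustration, and a real deployment would run the map and reduce steps on a cluster framework rather than in one Python process.

```python
import re
from collections import defaultdict


def map_phase(doc_id, text):
    """Map step: emit one (term, doc_id) pair per distinct term in a page."""
    terms = set(re.findall(r"[a-z0-9]+", text.lower()))
    return [(term, doc_id) for term in terms]


def shuffle(pairs):
    """Group emitted values by key, as a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(term, doc_ids):
    """Reduce step: collapse all postings for one term into a sorted list."""
    return term, sorted(set(doc_ids))


def build_index(docs):
    """Run map, shuffle, and reduce sequentially over a dict of {doc_id: text}."""
    pairs = []
    for doc_id, text in docs.items():
        pairs.extend(map_phase(doc_id, text))
    return dict(reduce_phase(t, ids) for t, ids in shuffle(pairs).items())


def search(index, query):
    """Return the documents that contain every query term (boolean AND)."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)
```

Because each `map_phase` call depends only on its own document and each `reduce_phase` call only on one term's postings, both steps can in principle be distributed across workers, which is the property a Map/Reduce-based search engine relies on.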

Author information

Corresponding author

Correspondence to Juan Deng.

Additional information

Foundation item: Supported by China National Earthquake Project (201008007)

Biography: LIU Jinshuo, female, Associate professor, research direction: data mining.

About this article

Cite this article

Liu, J., Yang, N., Liu, Y. et al. A simple implementation of distributed vertical search and information integration technology. Wuhan Univ. J. Nat. Sci. 18, 511–516 (2013). https://doi.org/10.1007/s11859-013-0965-1

  • DOI: https://doi.org/10.1007/s11859-013-0965-1
