Abstract
This paper studies distributed vertical search and information integration technology based on Web mining, with the aim of meeting the requirements of domain-specific applications. Mining, analyzing, and integrating Web content has become an important trend in everyday use. The underlying techniques include the Map/Reduce model, deep search, and the basic principles of information integration. The paper focuses on how to implement a distributed vertical search engine based on Map/Reduce technology, together with an information integration system. A system optimization mechanism and system tests are also presented.
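As an illustration of how the Map/Reduce model applies to search indexing, the following is a minimal single-process sketch, not the authors' implementation. It assumes the index is a simple inverted index from terms to page identifiers; the function names (`map_phase`, `shuffle`, `reduce_phase`, `build_index`, `search`) and the sample pages are hypothetical, and a real deployment would run the map and reduce steps on a cluster framework such as Hadoop.

```python
import re
from collections import defaultdict

TOKEN = re.compile(r"[a-z0-9]+")  # assumed tokenizer: lowercase alphanumeric runs


def map_phase(doc_id, text):
    """Map step: emit one (term, doc_id) pair per token of a crawled page."""
    for term in TOKEN.findall(text.lower()):
        yield term, doc_id


def shuffle(pairs):
    """Group intermediate pairs by key, as the framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(term, doc_ids):
    """Reduce step: collapse the values of one term into a sorted posting list."""
    return term, sorted(set(doc_ids))


def build_index(docs):
    """Run map, shuffle, and reduce in sequence over a dict of doc_id -> text."""
    pairs = (pair for doc_id, text in docs.items()
             for pair in map_phase(doc_id, text))
    return dict(reduce_phase(term, ids) for term, ids in shuffle(pairs).items())


def search(index, query):
    """Return the ids of pages that contain every term of the query."""
    terms = TOKEN.findall(query.lower())
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


if __name__ == "__main__":
    # Hypothetical pages from a domain-specific (vertical) crawl.
    docs = {
        "d1": "earthquake early warning",
        "d2": "earthquake damage report",
        "d3": "weather report",
    }
    index = build_index(docs)
    print(search(index, "earthquake report"))
```

Because each page is mapped independently and each term is reduced independently, both steps can be spread over many machines, which is what makes the model suitable for a distributed search engine.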
Additional information
Foundation item: Supported by China National Earthquake Project (201008007)
Biography: LIU Jinshuo, female, Associate professor, research direction: data mining.
About this article
Cite this article
Liu, J., Yang, N., Liu, Y. et al. A simple implementation of distributed vertical search and information integration technology. Wuhan Univ. J. Nat. Sci. 18, 511–516 (2013). https://doi.org/10.1007/s11859-013-0965-1