Abstract
In recent years, peer-to-peer (P2P) environments have attracted significant interest in the data management community. However, almost all work so far has focused on exact query processing in P2P data systems, and the autonomy of peers has not been given enough consideration. In addition, system cost is very high because shared data is published per document rather than per document set. In this paper, abstract indices (AbIx) are presented to implement content-based approximate queries in centralized, distributed, and structured P2P data systems. They can be used to search as few peers as possible while returning as many results satisfying users’ queries as possible, without sacrificing the high autonomy of peers. Abstract indices also have low system cost, improve query processing speed, and support very frequent updates and set-based information publishing. To verify the effectiveness of abstract indices, a simulator with 10,000 peers and over 3 million documents was built, and several metrics are proposed. The experimental results show that abstract indices work well in various P2P data systems.
Additional information
Supported by the National Natural Science Foundation of China under Grant No. 60473077 and the Program for New Century Excellent Talents in University.
About this article
Cite this article
Wang, CK., Wang, JM., Sun, JG. et al. AbIx: An Approach to Content-Based Approximate Query Processing in Peer-to-Peer Data Systems. J Comput Sci Technol 22, 280–286 (2007). https://doi.org/10.1007/s11390-007-9035-5
DOI: https://doi.org/10.1007/s11390-007-9035-5