Abstract
We focus on range query processing on large-scale, typically distributed infrastructures, such as clouds of thousands of nodes in shared datacenters, p2p distributed overlays, etc. In such distributed environments, efficient range query processing is key both to managing the distributed data sets themselves and to monitoring the infrastructure's resources. We wish to develop an architecture that supports range queries in such large-scale decentralized environments and scales with both the number of nodes and the number of data items stored. In the last few years a number of solutions have been proposed (mostly by researchers in the p2p domain) for designing such large-scale systems. However, these are inadequate for our purposes: at the envisaged scales the classic logarithmic complexity (for point queries) is still too expensive, and for range queries the cost is even higher. In this paper we go one step further and achieve sub-logarithmic complexity. We contribute the ART (Autonomous Range Tree) structure, which outperforms the most popular decentralized structures, including Chord (and some of its successors), BATON (and its successor) and Skip-Graphs. We contribute a theoretical analysis, backed up by detailed experimental results, showing that the communication cost of query and update operations is \(O(\log_{b}^{2} \log N)\) hops, where the base \(b\) is a double-exponential power of two and \(N\) is the total number of nodes. Moreover, ART is a fully dynamic and fault-tolerant structure, which supports node join/leave operations in an expected \(O(\log \log N)\) number of hops, with high probability (w.h.p.). Our experimental studies include a detailed performance comparison which showcases the improved performance, scalability, and robustness of ART.
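As a rough numeric illustration of how a sub-logarithmic hop bound compares with a classic logarithmic one, the Python sketch below evaluates both expressions for a few network sizes. It is only a sketch under stated assumptions: the function names are hypothetical, the bound is read as \((\log_{b} \log_{2} N)^{2}\), the inner logarithm is taken base 2, and all hidden constants are dropped, so the printed values indicate growth rates rather than actual hop counts of ART or any other structure.

```python
import math


def logarithmic_bound(n: int) -> float:
    """Classic logarithmic growth, log2(N), with constants ignored."""
    return math.log2(n)


def sublogarithmic_bound(n: int, b: int) -> float:
    """(log_b(log2 N))^2, one reading of O(log_b^2 log N), constants ignored.

    Assumption: the inner logarithm is base 2 and b > 1.
    """
    return math.log(math.log2(n), b) ** 2


if __name__ == "__main__":
    # Bases of the form b = 2^(2^i), i.e. double-exponential powers of two.
    for i in (1, 2, 3):
        b = 2 ** (2 ** i)
        for n in (10**3, 10**6, 10**9):
            print(
                f"b={b:>3} N={n:>10} "
                f"log2(N)={logarithmic_bound(n):6.2f} "
                f"(log_b log2 N)^2={sublogarithmic_bound(n, b):6.3f}"
            )
```

Larger bases shrink the second expression further, which is the trend a double-exponential choice of \(b\) is meant to exploit.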
References
Andrzejak, A., Xu, Z.: Scalable, efficient range queries for grid information services. In: Proceedings 2nd International Conference on Peer-to-Peer Computing (P2P), Linkoping, Sweden, pp. 33–40 (2002)
Aspnes, J., Shah, G.: Skip graphs. In: Proceedings 14th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), Baltimore, MD, pp. 384–393 (2003)
Bharambe, A.R., Agrawal, M., Seshan, S.: Mercury: supporting scalable multi-attribute range queries. In: Proceedings ACM Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication (SIGCOMM), Portland, OR, pp. 353–366 (2004)
Cai, M., Frank, M., Chen, J., Szekely, P.: Maan: a multi-attribute addressable network for grid information services. In: Proceedings 4th International Workshop on Grid Computing (GRID), Phoenix, AZ, pp. 184–191 (2003)
Crainiceanu, A., Linga, P., Gehrke, J., Shanmugasundaram, J.: Querying peer-to-peer networks using p-trees. In: Proceedings 7th International Workshop on Web and Databases (WebDB), Paris, France, pp. 25–30 (2004)
Gupta, A., Agrawal, D., El Abbadi, A.: Approximate range selection queries in peer-to-peer systems. In: Proceedings 1st Biennial Conference on Innovative Data Systems Research, Asilomar, CA (2003)
Gupta, I., Birman, K., Linga, P., Demers, A., van Renesse, R.: Kelips: building an efficient and stable P2P DHT through increased memory and background overhead. In: Proceedings of the 2nd International Workshop on Peer-to-Peer Systems (IPTPS ’03), Berkeley, CA, USA (2003)
Goodrich, M.T., Nelson, M.J., Sun, J.Z.: The rainbow skip graph: a fault-tolerant constant-degree distributed data structure. In: Proceedings 14th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), Miami, FL, pp. 384–393 (2006)
Huebsch, R., Hellerstein, J.M., Lanham, N., Loo, B.T., Shenker, S., Stoica, I.: Querying the internet with PIER. In: Proc. 29th Int. Conf. on Very Large Data Bases, pp. 321–332 (2003)
Harvey, N.J.A., Jones, M.B., Saroiu, S., Theimer, M., Wolman, A.: Skipnet: a scalable overlay network with practical locality properties. In: Proceedings USENIX Symposium on Internet Technologies and Systems, Seattle, WA (2003)
Jagadish, H.V., Ooi, B.C., Vu, Q.H.: Baton: a balanced tree structure for peer-to-peer networks. In: Proceedings 31st International Conference on Very Large Data Bases (VLDB), Trondheim, Norway, pp. 661–672 (2005)
Jagadish, H.V., Ooi, B.C., Tan, K.L., Vu, Q.H., Zhang, R.: Speeding up search in P2P networks with a multi-way tree structure. In: Proceedings ACM International Conference on Management of Data (SIGMOD), Chicago, IL, pp. 1–12 (2006)
Karger, D., Kaashoek, F., Stoica, I., Morris, R., Balakrishnan, H.: Chord: a scalable peer-to-peer lookup service for internet applications. In: Proceedings ACM Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication (SIGCOMM), San Diego, CA, pp. 149–160 (2001)
Kaporis, A., Makris, Ch., Sioutas, S., Tsakalidis, A., Tsichlas, K., Zaroliagis, Ch.: Improved bounds for finger search on a RAM. In: Proceedings 11th Annual European Symposium on Algorithms (ESA), Budapest, Hungary, pp. 325–336 (2003)
Li, X., Kim, Y.J., Govindan, R., Hong, W.: Multi-dimensional range queries in sensor networks. In: Proceedings 1st International Conference on Embedded Networked Sensor Systems (SenSys), Los Angeles, CA, pp. 63–75 (2003)
Liau, C.Y., Ng, W.S., Shu, Y., Tan, K.L., Bressan, S.: Efficient range queries and fast lookup services for scalable P2P networks. In: Proceedings 2nd International Workshop on Databases, Information Systems, and Peer-to-Peer Computing (DBISP2P), Toronto, Canada, pp. 93–106 (2004)
Maymounkov, P., Mazieres, D.: Kademlia: a peer-to-peer information system based on the XOR metric. In: Proceedings 1st International Workshop on Peer-to-Peer Systems (IPTPS), Cambridge, MA, pp. 53–65 (2002)
Perpinan, M.: A review of dimension reduction techniques. Technical report CS-96-09, University of Sheffield (1997)
Prada, C., Villamil, M., Roncancio, C.: Join queries in P2P DHT systems. In: DBISP2P 2008, pp. 93–105 (2008)
Ratnasamy, S., Francis, P., Handley, M., Karp, R., Shenker, S.: A scalable content addressable network. In: Proceedings ACM Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication (SIGCOMM), San Diego, CA, pp. 161–172 (2001)
Rowstron, A., Druschel, P.: Pastry: a scalable, decentralized object location, and routing for large-scale peer-to-peer systems. In: Proceedings IFIP/ACM International Conference on Distributed Systems Platforms (MIDDLEWARE), Heidelberg, Germany, pp. 329–350 (2001)
Ramabhadran, S., Ratnasamy, S., Hellerstein, J.M., Shenker, S.: Prefix hash tree. In: Proceedings 23rd Annual ACM Symposium on Principles of Distributed Computing (PODC), Brief Announcement, Newfoundland, Canada, p. 368 (2004)
Sahin, O.D., Gupta, A., Agrawal, D., El Abbadi, A.: A peer-to-peer framework for caching range queries. In: Proceedings 20th International Conference on Data Engineering (ICDE), Boston, MA, pp. 165–176 (2004)
Stoica, I., Morris, R., Liben-Nowell, D., Karger, D.R., Kaashoek, M.F., Dabek, F., Balakrishnan, H.: Chord: a scalable peer-to-peer lookup protocol for internet applications. IEEE/ACM Trans. Netw. 11(1), 17–32 (2003)
Sioutas, S., Papaloukopoulos, G., Sakkopoulos, E., Tsichlas, K., Manolopoulos, Y.: A novel distributed P2P simulator architecture: D-P2P-Sim. In: ACM CIKM 2009, pp. 2069–2070 (2009)
Sioutas, S., Papaloukopoulos, G., Sakkopoulos, E., Tsichlas, K., Manolopoulos, Y.: Brief announcement: ART–sub-logarithmic decentralized range query processing with probabilistic guarantees. In: ACM PODC 2010, pp. 118–119 (2010)
Triantafillou, P., Pitoura, T.: Towards a unifying framework for complex query processing over structured peer-to-peer data networks. In: VLDB 03 Workshop on Databases, Information Systems, and Peer-to-Peer Computing (2003)
Zhang, H., Goel, A., Govindan, R.: Incrementally improving lookup latency in distributed hash table systems. In: SIGMETRICS, San Diego, CA, pp. 114–125 (2003)
Zhao, B.Y., Huang, L., Stribling, J., Rhea, S.C., Joseph, A.D., Kubiatowicz, J.D.: Tapestry: a resilient global-scale overlay for service deployment. IEEE J. Sel. Areas Commun. 22(1), 41–53 (2004)
Additional information
Communicated by: Beng Chin Ooi.
A limited and preliminary version of this work was presented as a brief announcement at the Twenty-Ninth Annual ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, Zurich, Switzerland, July 25–28, 2010 [26].
This work was partially supported by the Thales project entitled “Cloud9: A multidisciplinary, holistic approach to internet-scale cloud computing”. For more details see the following URL: https://sites.google.com/site/thaliscloud9/home.
Cite this article
Sioutas, S., Triantafillou, P., Papaloukopoulos, G. et al. ART: sub-logarithmic decentralized range query processing with probabilistic guarantees. Distrib Parallel Databases 31, 71–109 (2013). https://doi.org/10.1007/s10619-012-7112-4
DOI: https://doi.org/10.1007/s10619-012-7112-4