Top-K Aggregate Queries on Continuous Probabilistic Datasets

Chen, Jianwen; Feng, Ling; Zhang, Jun

doi:10.1007/978-3-642-38562-9_22

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7923)

Included in the following conference series:

International Conference on Web-Age Information Management

Abstract

Top-K aggregate queries, which rank groups of tuples by their aggregate values and return the K groups with the highest aggregates, are a crucial requirement in many domains such as information extraction, data integration, and sensor data processing. In this paper, we formulate top-K aggregate queries for the case where tuple scores are given as continuous probability distributions, and we present algorithms for answering them. To further improve performance, we develop pruning techniques and an adaptive strategy that avoid computing the exact aggregate values of groups that are guaranteed not to be in the top-K. Our experimental study shows the efficiency of our techniques on several datasets with continuous attribute uncertainty.
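
The abstract does not spell out the algorithms, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a SUM aggregate, ranks groups by the expected value of that aggregate, and models each tuple score as a continuous distribution on a bounded interval (uniform in the example). The names `simpson`, `uniform_pdf`, `expected_score`, and `top_k_groups` are hypothetical. Cheap interval bounds are used to skip the numerical integration for groups that cannot reach the top-K.

```python
import heapq

def simpson(f, a, b, n=200):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def uniform_pdf(lo, hi):
    """Density of a uniform score on [lo, hi] (an assumed score model)."""
    return lambda x: 1.0 / (hi - lo)

def expected_score(lo, hi, pdf):
    """E[X] = integral of x * pdf(x) over the support [lo, hi]."""
    return simpson(lambda x: x * pdf(x), lo, hi)

def top_k_groups(groups, k):
    """groups maps a group name to a list of (lo, hi, pdf) tuple scores.

    Returns (top, pruned): the k groups with the largest expected SUM,
    and the groups whose exact aggregate was never computed.
    """
    # The SUM of a group always lies in [sum of lo, sum of hi],
    # so its expected value lies in that interval as well.
    bounds = {
        g: (sum(t[0] for t in ts), sum(t[1] for t in ts))
        for g, ts in groups.items()
    }
    lowers = sorted((b[0] for b in bounds.values()), reverse=True)
    # k-th largest lower bound: at least k groups are guaranteed to reach it.
    threshold = lowers[k - 1] if len(lowers) >= k else float("-inf")

    exact, pruned = {}, []
    for g, ts in groups.items():
        if bounds[g][1] < threshold:
            pruned.append(g)  # cannot be in the top-K; skip the integration
            continue
        exact[g] = sum(expected_score(lo, hi, pdf) for lo, hi, pdf in ts)
    top = heapq.nlargest(k, exact.items(), key=lambda kv: kv[1])
    return top, pruned

groups = {
    "A": [(8, 10, uniform_pdf(8, 10)), (9, 11, uniform_pdf(9, 11))],
    "B": [(1, 2, uniform_pdf(1, 2)), (0, 1, uniform_pdf(0, 1))],
    "C": [(5, 9, uniform_pdf(5, 9)), (6, 8, uniform_pdf(6, 8))],
}
top, pruned = top_k_groups(groups, k=2)
```

The pruning rule is safe under these assumptions: if a group's upper bound is below the k-th largest lower bound, at least k other groups have a strictly larger expected aggregate. Other aggregates or ranking semantics would need different bounds.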

Author information

Authors and Affiliations

Dept. of Computer Science & Technology, Tsinghua University, Beijing, China
Jianwen Chen & Ling Feng
No. 145, Erqi Road, Jiangan District, Wuhan City, Hubei Prov., China
Jun Zhang

Editor information

Editors and Affiliations

Department of Computer Science and Technology, Tsinghua University, 100084, Beijing, China
Jianyong Wang
Management Science and Information Systems Department, Rutgers, the State University of New Jersey, 1, Washington Park, 07102, Newark, NJ, USA
Hui Xiong
Department of Information Engineering, Nagoya University, 464-8601, Nagoya, Japan
Yoshiharu Ishikawa
Department of Computer Science, Hong Kong Baptist University, Hong Kong
Jianliang Xu
School of Information Science and Engineering, Yanshan University, Qinhuangdao, China
Junfeng Zhou

About this paper

Cite this paper

Chen, J., Feng, L., Zhang, J. (2013). Top-K Aggregate Queries on Continuous Probabilistic Datasets. In: Wang, J., Xiong, H., Ishikawa, Y., Xu, J., Zhou, J. (eds) Web-Age Information Management. WAIM 2013. Lecture Notes in Computer Science, vol 7923. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38562-9_22

DOI: https://doi.org/10.1007/978-3-642-38562-9_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38561-2
Online ISBN: 978-3-642-38562-9
eBook Packages: Computer Science, Computer Science (R0)