RoK: Roll-Up with the K-Means Clustering Method for Recommending OLAP Queries

Bentayeb, Fadila; Favre, Cécile

doi:10.1007/978-3-642-03573-9_43


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5690)

Included in the following conference series:

International Conference on Database and Expert Systems Applications


Abstract

Dimension hierarchies are a substantial part of the data warehouse model: they allow decision makers to examine data at different levels of detail with On-Line Analytical Processing (OLAP) operators such as drill-down and roll-up. The granularity levels that compose a dimension hierarchy are usually fixed when the data warehouse is designed, according to the analysis needs identified for its users. In practice, however, these needs may evolve and grow over time. To take this evolution into account in the data warehouse, we propose to integrate personalization techniques into the OLAP process. We propose two kinds of OLAP personalization in the data warehouse: (1) adaptation and (2) recommendation.
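As a point of reference, a classical roll-up along a hierarchy fixed at design time can be sketched as follows. This is only an illustration, not the authors' implementation: the fact rows, the city-to-region mapping, and the `roll_up` helper are all hypothetical names and values.

```python
from collections import defaultdict

# Hypothetical fact rows: (city, sales amount). Illustrative values only.
FACTS = [("Lyon", 120.0), ("Bron", 30.0), ("Paris", 200.0), ("Lyon", 80.0)]

# Hypothetical hierarchy fixed at design time: city (child) -> region (parent).
CITY_TO_REGION = {"Lyon": "Rhone", "Bron": "Rhone", "Paris": "Ile-de-France"}


def roll_up(facts, child_to_parent):
    """Sum the measure at the parent level of the hierarchy."""
    totals = defaultdict(float)
    for member, measure in facts:
        totals[child_to_parent[member]] += measure
    return dict(totals)


by_region = roll_up(FACTS, CITY_TO_REGION)
print(by_region)
```

The only levels a user can roll up to here are the ones present in the mapping, which is the limitation that motivates personalization.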

Adaptation allows users to express their own needs as aggregation rules defined from a child level (an existing level) to a parent level (a new level). The system then adapts itself by including the new hierarchy level in the data warehouse schema. For recommending new OLAP queries, we provide a new OLAP operator based on the K-means method. Users choose the K-means parameters according to their preferences about the resulting clusters, which may form a new granularity level in the dimension hierarchy under consideration. We use the K-means clustering method to highlight aggregates that are semantically richer than those provided by classical OLAP operators. With both adaptation and recommendation, the new data warehouse schema supports new and more elaborate OLAP queries.
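The idea of deriving a new granularity level from K-means clusters can be sketched as below. This is a minimal illustration under stated assumptions, not the operator described in the paper: it assumes each member of the existing level (a hypothetical store) is described by a single numeric attribute, uses a plain Lloyd-style K-means with a deterministic initialisation, and all names and values (`STORE_SURFACE`, `FACTS`, `kmeans_1d`, `group_*`) are invented for the example.

```python
from collections import defaultdict


def kmeans_1d(values, k, max_iter=100):
    """Plain Lloyd-style K-means on 1-D values; returns one cluster index per value."""
    distinct = sorted(set(values))
    if not 1 <= k <= len(distinct):
        raise ValueError("k must be between 1 and the number of distinct values")
    # Assumption: deterministic start from k evenly spaced distinct values.
    step = (len(distinct) - 1) / max(k - 1, 1)
    centroids = [distinct[round(i * step)] for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels


# Hypothetical existing level: stores described by one numeric attribute.
STORE_SURFACE = {"S1": 10.0, "S2": 12.0, "S3": 11.0, "S4": 50.0, "S5": 52.0, "S6": 51.0}
# Hypothetical fact rows: (store, sales amount).
FACTS = [("S1", 100.0), ("S2", 150.0), ("S3", 50.0),
         ("S4", 900.0), ("S5", 1100.0), ("S6", 1000.0)]

# K is chosen by the user; the clusters become members of a new parent level.
stores = list(STORE_SURFACE)
labels = kmeans_1d([STORE_SURFACE[s] for s in stores], k=2)
store_to_group = {s: f"group_{lab}" for s, lab in zip(stores, labels)}

# Roll up the measure to the newly created level.
totals = defaultdict(float)
for store, amount in FACTS:
    totals[store_to_group[store]] += amount
totals = dict(totals)
print(store_to_group, totals)
```

Changing `k` changes the members of the new level, which is how user preferences about the clusters would translate into different candidate queries in this sketch.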

Our approach to OLAP personalization is implemented within Oracle 10g as a prototype that allows the creation of new granularity levels in the dimension hierarchies of the data warehouse. Moreover, we carried out experiments that validate the relevance of our approach.



Author information

Authors and Affiliations

Université de Lyon (ERIC - Lyon 2), 5 av. Pierre Mendès-France, 69676, Bron Cedex, France
Fadila Bentayeb & Cécile Favre


Editor information

Editors and Affiliations

Nanyang Technological University, 50 Nanyang Avenue, 639798, Singapore
Sourav S. Bhowmick
University of Linz, Altenbergerstraße 69, 4040, Linz, Austria
Josef Küng & Roland Wagner

About this paper

Cite this paper

Bentayeb, F., Favre, C. (2009). RoK: Roll-Up with the K-Means Clustering Method for Recommending OLAP Queries. In: Bhowmick, S.S., Küng, J., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2009. Lecture Notes in Computer Science, vol 5690. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03573-9_43


DOI: https://doi.org/10.1007/978-3-642-03573-9_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03572-2
Online ISBN: 978-3-642-03573-9
eBook Packages: Computer Science, Computer Science (R0)
