Abstract
Dimension hierarchies represent a substantial part of the data warehouse model. Indeed, they allow decision makers to examine data at different levels of detail with On-Line Analytical Processing (OLAP) operators such as drill-down and roll-up. The granularity levels that compose a dimension hierarchy are usually fixed during the design step of the data warehouse, according to the identified analysis needs of the users. In practice, however, users' needs may evolve and grow over time. To take this evolution of users' analysis needs into account in the data warehouse, we propose to integrate personalization techniques within the OLAP process. We propose two kinds of OLAP personalization in the data warehouse: (1) adaptation and (2) recommendation.
Adaptation allows users to express their own needs in terms of aggregation rules defined from a child level (existing level) to a parent level (new level). The system then adapts itself by including the new hierarchy level in the data warehouse schema. For recommending new OLAP queries, we provide a new OLAP operator based on the K-means method. Users are asked to choose the K-means parameters according to their preferences about the resulting clusters, which may form a new granularity level in the considered dimension hierarchy. We use the K-means clustering method in order to highlight aggregates that are semantically richer than those provided by classical OLAP operators. In both the adaptation and recommendation techniques, the new data warehouse schema allows new and more elaborate OLAP queries.
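The idea of turning K-means clusters into a new granularity level can be illustrated with a minimal sketch. It is not the authors' Oracle prototype: the function names (`kmeans`, `roll_up_with_kmeans`), the toy fact rows, the choice of the total measure per member as the clustering feature, and the deterministic centroid initialisation are all assumptions made for illustration only.

```python
from collections import defaultdict


def kmeans(points, k, max_iter=100):
    """Lloyd-style K-means on equal-length tuples; returns one cluster index per point."""
    distinct = sorted(set(points))
    if not 1 <= k <= len(distinct):
        raise ValueError("k must be between 1 and the number of distinct points")
    # Deterministic start (an assumption of this sketch): spread centroids over the sorted points.
    step = (len(distinct) - 1) / max(k - 1, 1)
    centroids = [distinct[round(i * step)] for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the clustering has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def roll_up_with_kmeans(facts, k):
    """Cluster child-level members by total measure, then aggregate facts per cluster."""
    totals = defaultdict(float)
    for member, measure in facts:
        totals[member] += measure
    members = sorted(totals)
    labels = kmeans([(totals[m],) for m in members], k)
    # The clusters play the role of a new parent level above the child level.
    new_level = {m: f"cluster_{lab}" for m, lab in zip(members, labels)}
    aggregates = defaultdict(float)
    for member, measure in facts:
        aggregates[new_level[member]] += measure
    return new_level, dict(aggregates)


# Hypothetical fact rows: (city, sales amount).
facts = [("Lyon", 70.0), ("Lyon", 50.0), ("Paris", 950.0),
         ("Nice", 100.0), ("Lille", 130.0), ("Marseille", 900.0)]
new_level, aggregates = roll_up_with_kmeans(facts, k=2)
print(new_level)
print(aggregates)
```

Here the user-chosen parameter is only the number of clusters `k`; in this toy data the cities split into a low-sales group and a high-sales group, and the roll-up sums the measure over each group instead of over a predefined parent level.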
Our approach for OLAP personalization is implemented within Oracle 10g as a prototype which allows the creation of new granularity levels in the dimension hierarchies of the data warehouse. Moreover, we carried out some experiments which validate the relevance of our approach.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bentayeb, F., Favre, C. (2009). RoK: Roll-Up with the K-Means Clustering Method for Recommending OLAP Queries. In: Bhowmick, S.S., Küng, J., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2009. Lecture Notes in Computer Science, vol 5690. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03573-9_43
DOI: https://doi.org/10.1007/978-3-642-03573-9_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03572-2
Online ISBN: 978-3-642-03573-9
eBook Packages: Computer Science, Computer Science (R0)