
Hybrid Fragmentation of XML Data Warehouse Using K-Means Algorithm

  • Conference paper
Advances in Databases and Information Systems (ADBIS 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8716)

Abstract

The efficiency of the decision-making process in an XML data warehouse environment is closely tied to the performance of decision-support queries. Optimizing this performance therefore contributes directly to better decision making. One of the important performance optimization techniques for XML data warehouses is fragmentation, in its different variants (horizontal fragmentation and vertical fragmentation). In this paper, we develop a hybrid fragmentation algorithm that combines a vertical fragmentation based on XPath expressions with a horizontal fragmentation based on selection predicates. To control the number of fragments, we use the K-Means algorithm. Finally, we validate our approach on Oracle Berkeley DB XML through several experiments on XML data derived from the XWB benchmark.
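To illustrate how K-Means can bound the number of fragments, here is a minimal sketch. It is not the paper's algorithm: the predicate names, the predicate-by-query usage matrix, and the choice to cluster predicates by the queries that use them are all assumptions made for this example. Each selection predicate is described by a binary vector (1 if a query uses it), and K-Means groups the predicates into exactly k clusters, so k fixes the number of horizontal fragments.

```python
from typing import Dict, List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points: List[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def k_means(points: List[Sequence[float]], k: int, max_iter: int = 100) -> List[int]:
    """Plain K-Means; returns one cluster label per input point.

    Initialisation is deterministic (the first k points become the
    initial centroids) so that the example is reproducible.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [-1] * len(points)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # converged: assignments are stable
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = centroid(members)
    return labels


# Hypothetical selection predicates (rows) and their usage by four
# hypothetical queries q1..q4 (columns); 1 means "query uses predicate".
predicates = ["year = 2013", "region = 'US'", "year = 2014", "region = 'EU'"]
usage = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
]

labels = k_means(usage, k=2)

# Group predicates by cluster: each cluster would define one fragment.
fragments: Dict[int, List[str]] = {}
for predicate, label in zip(predicates, labels):
    fragments.setdefault(label, []).append(predicate)

print(fragments)
```

With k = 2 the four predicates are split into two groups. Raising or lowering k changes the number of fragments directly, which is the control the abstract attributes to K-Means.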

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Kechar, M., Bahloul, S.N. (2014). Hybrid Fragmentation of XML Data Warehouse Using K-Means Algorithm. In: Manolopoulos, Y., Trajcevski, G., Kon-Popovska, M. (eds) Advances in Databases and Information Systems. ADBIS 2014. Lecture Notes in Computer Science, vol 8716. Springer, Cham. https://doi.org/10.1007/978-3-319-10933-6_6

  • DOI: https://doi.org/10.1007/978-3-319-10933-6_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-10932-9

  • Online ISBN: 978-3-319-10933-6

  • eBook Packages: Computer Science, Computer Science (R0)
