Abstract
Lossless data compression is attractive in database systems because it may reduce storage requirements and improve query performance. Many compression techniques hold the whole database in main memory, but problems arise when the amount of data grows over time or when the data has high cardinality. Managing a large, rapidly evolving volume of data in a scalable way is very challenging. This paper describes a disk-based Single Vector Large Data Cardinality approach that incorporates data compression in a distributed environment. The approach provides a substantial improvement in storage performance compared with other high-performance database systems. The compressed database structure presented is directly addressable in a distributed environment, which reduces retrieval latency when handling large volumes of data.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Alom, B.M.M., Henskens, F., Hannaford, M. (2009). Single Vector Large Data Cardinality Structure to Handle Compressed Database in a Distributed Environment. In: Cordeiro, J., Shishkov, B., Ranchordas, A., Helfert, M. (eds) Software and Data Technologies. ICSOFT 2008. Communications in Computer and Information Science, vol 47. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05201-9_12
DOI: https://doi.org/10.1007/978-3-642-05201-9_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05200-2
Online ISBN: 978-3-642-05201-9
eBook Packages: Computer Science, Computer Science (R0)