Split Dictionaries for In-memory Column Stores in Mixed Workload Environments

Schwalb, David; Dreseler, Markus; Faust, Martin; Wust, Johannes; Plattner, Hasso

doi:10.1007/978-3-319-08608-8_16

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8506)

Included in the following conference series:

Australasian Database Conference

Abstract

Columnar in-memory databases use dictionary encoding as a compression technique, replacing long and frequently occurring values with short integers. Sorted dictionaries allow for more efficient query processing, as comparisons can be performed directly on the compressed data, whereas unsorted dictionaries are faster when inserting new values.

In this work, we propose a new type of dictionary compression called Split Dictionaries. These organize their values in fixed-size splits, enabling fast inserts and comparable query performance while significantly reducing maintenance costs. We present a detailed performance analysis regarding inserts, range queries, and the merge process, as well as a memory usage model. We argue that adjusting the dictionary size allows for a more balanced trade-off, especially in mixed workload environments.
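
The trade-off between sorted and unsorted dictionaries that motivates this work can be illustrated with a minimal Python sketch. It shows only the two baseline encodings named in the abstract, not the Split Dictionary itself, whose internal layout is not described here. All function names are hypothetical and chosen for illustration.

```python
from bisect import bisect_left, bisect_right


def encode_sorted(values):
    """Dictionary-encode a column using a sorted dictionary (illustrative only)."""
    dictionary = sorted(set(values))
    ids = {v: i for i, v in enumerate(dictionary)}
    return dictionary, [ids[v] for v in values]


def range_scan_sorted(dictionary, column, low, high):
    """Return row positions with low <= value <= high.

    Because the dictionary is sorted, the value range maps to a contiguous
    range of integer IDs, so the scan compares compressed IDs only.
    """
    lo_id = bisect_left(dictionary, low)
    hi_id = bisect_right(dictionary, high) - 1
    return [row for row, vid in enumerate(column) if lo_id <= vid <= hi_id]


def insert_sorted(dictionary, column, value):
    """Insert into a sorted dictionary; a new value may shift existing IDs."""
    pos = bisect_left(dictionary, value)
    if pos == len(dictionary) or dictionary[pos] != value:
        dictionary.insert(pos, value)
        # Every existing ID at or above pos is now off by one: re-encode.
        column[:] = [vid + 1 if vid >= pos else vid for vid in column]
    column.append(pos)
    return pos


def insert_unsorted(dictionary, index, column, value):
    """Insert into an unsorted dictionary; existing IDs never change."""
    vid = index.get(value)
    if vid is None:
        vid = len(dictionary)
        dictionary.append(value)
        index[value] = vid
    column.append(vid)
    return vid
```

In this sketch, inserting a new value into the sorted dictionary forces a pass over the whole encoded column, while the unsorted variant only appends. The unsorted variant, in turn, cannot answer a range predicate by comparing IDs, because ID order no longer reflects value order.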

Author information

Authors and Affiliations

Hasso Plattner Institute, Potsdam, Germany
David Schwalb, Markus Dreseler, Martin Faust, Johannes Wust & Hasso Plattner

Editor information

Editors and Affiliations

Centre for Applied Informatics (CAI), College of Engineering and Science, Victoria University, Ballarat Road, 8001, Footscray, VIC, Australia
Hua Wang
Faculty of Engineering, Architecture and Information Technology, School of Information Technology and Electrical Engineering, The University of Queensland, St. Lucia, 4072, Brisbane, QLD, Australia
Mohamed A. Sharaf

Cite this paper

Schwalb, D., Dreseler, M., Faust, M., Wust, J., Plattner, H. (2014). Split Dictionaries for In-memory Column Stores in Mixed Workload Environments. In: Wang, H., Sharaf, M.A. (eds) Databases Theory and Applications. ADC 2014. Lecture Notes in Computer Science, vol 8506. Springer, Cham. https://doi.org/10.1007/978-3-319-08608-8_16

DOI: https://doi.org/10.1007/978-3-319-08608-8_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08607-1
Online ISBN: 978-3-319-08608-8
eBook Packages: Computer Science, Computer Science (R0)
