Abstract
The analytics database dbX is a cloud-agnostic, massively parallel processing (MPP) SQL product with both DSM (column) and NSM (row) stores. Runtime code generation (RTCG) with JIT compilation is one technique for micro-optimizing SQL query processing. We propose an RTCG model that is both query-aware and hardware-conscious, extending analytics SQL query processing to a high degree of intra-query parallelism. At the system level, our approach aims to maximize the benefits of modern hardware; at the usage level, it focuses on typical industry SQL, which differs somewhat from standard benchmarks. We describe the model, highlighting its novel aspects, the techniques implemented, and the product engineering decisions made in dbX. To evaluate the efficacy of the RTCG model, we run experiments on desktop and cloud clusters, with standard and synthetic benchmarks, on data whose size is more commensurate with industry applications.
Acknowledgment
We thank several people; at Bangalore: Pramod Sahu for testing JIT modules and SQL code; Dipanjan Deb and Prajeesh for operational cloud support; at Schaumburg: Jim Benbow for dbX cloud deployment scripts.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Sridhar, K.T., Sakkeer, M.A., Andrews, S., Johnson, J. (2018). MPP SQL Query Optimization with RTCG. In: Mondal, A., Gupta, H., Srivastava, J., Reddy, P., Somayajulu, D. (eds) Big Data Analytics. BDA 2018. Lecture Notes in Computer Science, vol 11297. Springer, Cham. https://doi.org/10.1007/978-3-030-04780-1_16
DOI: https://doi.org/10.1007/978-3-030-04780-1_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04779-5
Online ISBN: 978-3-030-04780-1
eBook Packages: Computer Science (R0)