Abstract
In this article, we use the Coq proof assistant to specify and verify the low-level layer of SQL’s execution engines. To this end, we first design a high-level Coq specification for data-centric operators, intended to capture their essence. We then provide two Coq implementations of this specification. The first, the physical algebra, consists of the low-level operators found in systems such as PostgreSQL or Oracle. The second, SQL algebra, is an extended relational algebra that provides a semantics for SQL. Last, we formally relate the physical algebra and SQL algebra. By proving that the physical algebra implements SQL algebra, we give high-level assurances that physical algebra and SQL algebra expressions enjoy the same semantics. To the best of our knowledge, this yields the first formalisation and verification of the low-level layer of an RDBMS, as well as of the physical optimisation step of SQL’s compilation: fundamental steps towards mechanising SQL’s compilation chain.
Work funded by the DataCert ANR project: ANR-15-CE39-0009.
Notes
- 1. The model exploits system-collected statistics about the data stored in the database.
- 2. The IJ nodes are expressed in PostgreSQL as a Nested Loop combined with an Index Scan, but they correspond to an index-based join.
- 3. will be according to the various types of elements and various implementations for the collection. A particular case of is which denotes the number of occurrences in a list.
- 4. We could also use a module type, but the syntax would be heavier and less general.
- 5. This construction is similar to the exception monad. There is no point in writing the standard “return” and “bind” operators. The sequential scan and the nested loop, respectively, can be seen as online versions of them.
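The remark in note 5 can be illustrated with a minimal Python sketch. This is not the paper's Coq development: the names `seq_scan`, `nested_loop` and `bind` are hypothetical, and Python generators stand in for the operators' one-element-at-a-time behaviour. Under those assumptions, a sequential scan loosely plays the role of "return" by lifting stored data into a stream, and a nested loop loosely plays the role of "bind" by re-running an inner scan for each outer element.

```python
from typing import Callable, Iterable, Iterator, List, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def seq_scan(rows: Iterable[A]) -> Iterator[A]:
    """Hypothetical sequential scan: yield stored rows one at a time."""
    for row in rows:
        yield row


def nested_loop(
    outer: Iterator[A], inner: Callable[[A], Iterator[B]]
) -> Iterator[Tuple[A, B]]:
    """Hypothetical nested loop: restart the inner scan for each outer row."""
    for a in outer:
        for b in inner(a):
            yield (a, b)


def bind(xs: List[A], f: Callable[[A], List[B]]) -> List[B]:
    """Offline list-monad bind, used only as a point of comparison."""
    return [y for x in xs for y in f(x)]


left = [1, 2]
right = ["a", "b"]

# Online evaluation: pairs are produced on demand.
online = nested_loop(seq_scan(left), lambda _a: seq_scan(right))
first = next(online)   # only the first pair has been computed here
rest = list(online)    # drain the remaining pairs

# Offline evaluation: the whole result is materialised at once.
offline = bind(left, lambda a: [(a, b) for b in right])
```

In this sketch `first` is available before the remaining pairs are computed, and draining the generator yields the same pairs, in the same order, as the offline `bind`.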
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Benzaken, V., Contejean, É., Keller, C., Martins, E. (2018). A Coq Formalisation of SQL’s Execution Engines. In: Avigad, J., Mahboubi, A. (eds) Interactive Theorem Proving. ITP 2018. Lecture Notes in Computer Science, vol 10895. Springer, Cham. https://doi.org/10.1007/978-3-319-94821-8_6
DOI: https://doi.org/10.1007/978-3-319-94821-8_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-94820-1
Online ISBN: 978-3-319-94821-8
eBook Packages: Computer Science (R0)