Scaling up the performance of more powerful Datalog systems on multicore machines

Yang, Mohan; Shkapsky, Alexander; Zaniolo, Carlo

doi:10.1007/s00778-016-0448-z

Regular Paper
Published: 01 December 2016

The VLDB Journal, Volume 26, pages 229–248 (2017)

Abstract

Extending RDBMS technology to achieve performance and scalability for queries that are much more powerful than those of SQL-2 has been the goal of deductive database research for more than thirty years. The \(\mathcal {D}e\mathcal {A}\mathcal {L}\mathcal {S}\) system has made major progress toward this goal, by (1) Datalog extensions that support the more powerful recursive queries needed in advanced applications, and (2) superior performance for both traditional recursive queries and those made possible by the new extensions, while (3) delivering competitive performance with commercial RDBMSs on non-recursive queries. In this paper, we focus on the techniques used to support the in-memory evaluation of Datalog programs on multicore machines. In \(\mathcal {D}e\mathcal {A}\mathcal {L}\mathcal {S}\), a Datalog program is represented as an AND/OR tree, and multiple copies of the same AND/OR tree are used to access the tables in the database concurrently during the parallel evaluation. We describe compilation techniques that (1) recognize when the given program is lock-free, (2) transform a locking program into a lock-free program, and (3) find an efficient parallel plan that correctly evaluates the program while minimizing the use of locks and other overhead required for parallel evaluation. Extensive experiments demonstrate the effectiveness of the proposed techniques.
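
To make the idea of lock-free parallel evaluation concrete, here is a minimal sketch of a partitioned semi-naive evaluation of transitive closure. It is not the system's implementation: the round-robin partitioning by source vertex, the function names, and the use of Python threads are all assumptions chosen for illustration. Each worker is the only writer of the tuples whose first argument falls in its partition, and the base relation is only read, so no locks are needed in this particular program.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def build_index(arcs):
    """Index the base relation arc(X, Y) by X; it is read-only afterwards."""
    index = defaultdict(set)
    for x, y in arcs:
        index[x].add(y)
    return index


def eval_partition(sources, arc_index):
    """Semi-naive evaluation of tc restricted to the given source vertices.

        tc(X, Y) <- arc(X, Y).
        tc(X, Y) <- tc(X, Z), arc(Z, Y).

    Only this worker produces tuples whose first argument is in `sources`.
    """
    tc = set()
    delta = {(x, y) for x in sources for y in arc_index.get(x, ())}
    while delta:
        tc |= delta
        # Join the newly derived tuples with arc, then keep only unseen ones.
        derived = {(x, y) for x, z in delta for y in arc_index.get(z, ())}
        delta = derived - tc
    return tc


def parallel_tc(arcs, workers=4):
    """Evaluate tc with one worker per partition of the source vertices."""
    arc_index = build_index(arcs)
    vertices = sorted(arc_index)
    # Hypothetical partitioning scheme: round-robin over source vertices.
    partitions = [vertices[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(eval_partition, partitions, [arc_index] * workers)
        closure = set()
        for part in results:
            closure |= part
    return closure
```

Calling `parallel_tc([(1, 2), (2, 3), (3, 1), (4, 5)])` returns the same set of pairs for any number of workers, because the partitions never write to each other's tuples. A program whose recursive rule joined on a different argument would not partition this cleanly, which is the kind of case where a compiler would have to transform the program or fall back to locks.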

Notes

The actual name “Datalog” was introduced by David Maier several years later.
Another way to implement this query is to use recursive common table expressions, but the WHILE-loop approach performs significantly better in our experiments.
There are other possible partitioning strategies, and the choice will be discussed later in the section.
Currently, the user determines when to force the materialization of a relation with an annotation in the program.
The idea of this optimization is that there is no need to consider any value but the maximum value produced by the mcount goals, i.e., the current count values, when certain conditions are satisfied. \(\mathcal {D}e\mathcal {A}\mathcal {L}\mathcal {S}\) uses simple sufficient conditions that can be easily checked by a compiler, including (i) the values produced by the mcount goals are tested against some monotonic Boolean conditions which evaluate to true iff they are true for the max values or (ii) the values produced by the mcount term are fed to the final extraction rule which disregards all the values but the max ones. Similar conditions apply for msum.
For a predicate p, R(p) denotes the relation that stores all tuples corresponding to facts about p; \(p[\overline{X}]\) denotes a tuple of arity \(|\overline{X}|\) by retrieving the arguments in p whose positions belong to \(\overline{X}\), and it is treated as a multiset of arguments when involved in equality checking.
count(distinct) is replaced with count in \(\mathtt{query16}\). order by and limit are ignored in our program. The evaluation time will not change significantly if we add these constructs since most queries return very few results except \(\mathtt{query3}\) and \(\mathtt{query10}\).
\(\mathcal {D}e\mathcal {A}\mathcal {L}\mathcal {S}\) is about 2\(\times \) faster than the version used in [55] on the TPC-H benchmark, owing to a function inlining optimization.
We treat the graph as a directed graph for \(\mathtt{4cycle}\), and as an undirected graph for \(\mathtt{3clique}\) and \(\mathtt{4clique}\).
The single-processor version of DLV was downloaded from [10]. Although a parallel version [9] is available, it is either much slower than the single-processor version or it fails, since it is a 32-bit executable that cannot address the more than 4 GB of memory required by the evaluation.
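
The note on mcount above relies on the fact that a monotonic Boolean condition holds for some produced count value exactly when it holds for the maximum one. The following sketch illustrates that equivalence for a threshold test; the helper names and the threshold condition are hypothetical and are not taken from the system.

```python
def running_counts(items):
    """Successive values a monotonic count would produce: 1, 2, ..., n."""
    return list(range(1, len(set(items)) + 1))


def holds_for_some_value(counts, threshold):
    """Test the condition count >= threshold against every produced value."""
    return any(c >= threshold for c in counts)


def holds_for_max(counts, threshold):
    """Test the same condition against the maximum value only."""
    return bool(counts) and max(counts) >= threshold
```

For a condition of the form `count >= threshold`, the two functions always agree, so checking only the current maximum is enough. A non-monotonic condition such as `count == threshold` would break the equivalence, which is why the sufficient conditions in the note are restricted to monotonic tests.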

References

Aref, M., ten Cate, B., Green, T.J., Kimelfeld, B., et al.: Design and implementation of the LogicBlox system. In: SIGMOD, pp. 1371–1382. ACM, New York (2015)
Arni, F., Ong, K., Tsur, S., Wang, H., Zaniolo, C.: The deductive database system LDL++. TPLP 3(1), 61–94 (2003)
Bell, D.A., Shao, J., Hull, M.E.C.: A pipelined strategy for processing recursive queries in parallel. Data Knowl. Eng. 6(5), 367–391 (1991)
Boncz, P.A., Zukowski, M., Nes, N.: MonetDB/X100: hyper-pipelining query execution. CIDR 5, 225–237 (2005)
Bravenboer, M., Smaragdakis, Y.: Strictly declarative specification of sophisticated points-to analyses. In: OOPSLA, pp. 243–262. ACM, New York (2009)
Chimenti, D., Gamboa, R., Krishnamurthy, R., Naqvi, S., et al.: The LDL system prototype. TKDE 2(1), 76–90 (1990)
Cohen, S., Wolfson, O.: Why a single parallelization strategy is not enough in knowledge bases. In: PODS, pp. 200–216. ACM, New York (1989)
Deductive application language system. http://wis.cs.ucla.edu/deals/
DLV (parallel version). http://www.mat.unical.it/ricca/downloads/parallelground10.zip
DLV (single-processor version). http://www.dlvsystem.com/files/dlv.x86-64-linux-elf-static.bin
DLV with recursive aggregates. http://www.dbai.tuwien.ac.at/proj/dlv/dlvRecAggr/dl-recagg-snapshot-2007-04-14.zip
Dees, J., Sanders, P.: Efficient many-core query execution in main memory column-stores. In: ICDE, pp. 350–361. IEEE, New York (2013)
Eisner, J., Filardo, N.W.: Dyna: extending Datalog for modern AI. In: Datalog Reloaded, pp. 181–220. Springer, Berlin (2011)
Fogel, A., Fung, S., Pedrosa, L., Walraed-Sullivan, M., et al.: A general approach to network configuration analysis. In: NSDI, pp. 469–483 (2015)
Ganguly, S., Silberschatz, A., Tsur, S.: Parallel bottom-up processing of Datalog queries. J. Logic Program. 14(1), 101–126 (1992)
Ganguly, S., Silberschatz, A., Tsur, S.: Mapping Datalog program execution to networks of processors. TKDE 7(3), 351–361 (1995)
Gebser, M., Kaminski, R., Kaufmann, B., Schaub, T.: Clingo=ASP+Control: preliminary report. arXiv preprint arXiv:1405.3694
Hulin, G.: Parallel processing of recursive queries in distributed architectures. In: VLDB, pp. 87–96. Morgan Kaufmann, Los Altos (1989)
Lattner, C.: LLVM and Clang: next generation compiler technology. In: The BSD Conference, pp. 1–2 (2008)
Leone, N., Pfeifer, G., Faber, W., Eiter, T., et al.: The DLV system for knowledge representation and reasoning. TOCL 7(3), 499–562 (2006)
Leskovec, J., Krevl, A.: SNAP datasets: Stanford large network dataset collection. http://snap.stanford.edu/data (2014)
Mazuran, M., Serra, E., Zaniolo, C.: Extending the power of Datalog recursion. VLDB J. 22(4), 471–493 (2013)
Mazuran, M., Serra, E., Zaniolo, C.: A declarative extension of Horn clauses, and its significance for Datalog and its applications. TPLP 13(4–5), 609–623 (2013)
Morris, K., Ullman, J.D., Van Gelder, A.: Design overview of the NAIL! system. In: ICLP, pp. 554–568. Springer, Berlin (1986)
Nguyen, D., Aref, M., Bravenboer, M., Kollias, G., et al.: Join processing for graph patterns: an old dog with new tricks. arXiv preprint arXiv:1503.04169 (2015)
Perri, S., Ricca, F., Sirianni, M.: Parallel instantiation of ASP programs: techniques and experiments. TPLP 13(02), 253–278 (2013)
Ramakrishnan, R., Srivastava, D., Sudarshan, S.: CORAL—control, relations and logic. In: VLDB, pp. 238–250. Morgan Kaufmann, Los Altos (1992)
Raschid, L., Su, S.Y.W.: A parallel processing strategy for evaluating recursive queries. In: VLDB, pp. 412–419. Morgan Kaufmann, Los Altos (1986)
Ross, K.A., Sagiv, Y.: Monotonic aggregation in deductive databases. In: PODS, pp. 114–126. ACM, New York (1992)
SociaLite. http://github.com/socialite-lang/socialite
SPEC\(^{\textregistered }\) CINT2006 Result. Cisco Systems: Cisco UCS C460 M4 (Intel Xeon E7-4890 v2, 2.80 GHz). http://www.spec.org/cpu2006/results/res2014q1/cpu2006-20140224-28687
SPEC\(^{\textregistered }\) CINT2006 Result. Dell Inc.: PowerEdge R720 (Intel Xeon E5-2690, 2.90 GHz). http://www.spec.org/cpu2006/results/res2012q1/cpu2006-20120228-19541
SPEC\(^{\textregistered }\) CINT2006 Result. Supermicro: Supermicro A+ Server 2042G-6RF (AMD Opteron 6376, 2.30 GHz). http://www.spec.org/cpu2006/results/res2012q4/cpu2006-20121005-24693
SQL Server 2014. http://www.microsoft.com/en-us/server-cloud/products/sql-server/
Seib, J., Lausen, G.: Parallelizing Datalog programs by generalized pivoting. In: PODS, pp. 241–251. ACM, New York (1991)
Selman, B., Kautz, H.: Domain-independent extensions to GSAT: Solving large structured satisfiability problems. In: IJCAI, pp. 290–295. Morgan Kaufmann, Los Altos (1993)
Selman, B., Kautz, H., Cohen, B.: Local search strategies for satisfiability testing. Cliques Color. Satisf.: Second DIMACS Implement. Chall. 26, 521–532 (1993)
Selman, B., Levesque, H.J., Mitchell, D.G.: A new method for solving hard satisfiability problems. In: AAAI, pp. 440–446. AAAI Press/MIT Press, Cambridge (1992)
Seo, J., Guo, S., Lam, M.S.: SociaLite: Datalog extensions for efficient social network analysis. In: ICDE, pp. 278–289. IEEE, New York (2013)
Seo, J., Park, J., Shin, J., Lam, M.S.: Distributed SociaLite: a Datalog-based language for large-scale graph analysis. PVLDB 6(14), 1906–1917 (2013)
Shkapsky, A., Yang, M., Interlandi, M., Chiu, H., Condie, T., Zaniolo, C.: Big data analytics with Datalog queries on Spark. In: SIGMOD, pp. 1135–1149. ACM, New York (2016)
Shkapsky, A., Yang, M., Zaniolo, C.: Optimizing recursive queries with monotonic aggregates in DeALS. In: ICDE, pp. 867–878. IEEE, New York (2015)
Shkapsky, A., Zeng, K., Zaniolo, C.: Graph queries in a next-generation Datalog system. PVLDB 6(12), 1258–1261 (2013)
Spears, W.M.: Simulated annealing for hard satisfiability problems. Cliques Color. Satisf.: Second DIMACS Implement. Chall. 26, 533–558 (1993)
TPC-H. http://www.tpc.org/tpch/
TPC-H Result on Cisco UCS C460 M4 Server. http://www.tpc.org/3311
TPC-H Result on Dell PowerEdge R720. http://www.tpc.org/3282
Ullman, J.D.: Implementation of logical query languages for databases. TODS 10(3), 289–321 (1985)
Vectorwise. http://www.actian.com/
Van Gelder, A.: Foundations of aggregation in deductive databases. In: DOOD, pp. 13–34. Springer, Berlin (1993)
Veldhuizen, T.L.: Triejoin: A simple, worst-case optimal join algorithm. In: ICDT, pp. 96–106 (2014)
Wang, J., Balazinska, M., Halperin, D.: Asynchronous and fault-tolerant recursive Datalog evaluation in shared-nothing engines. PVLDB 8(12), 1542–1553 (2015)
Wolfson, O.: Sharing the load of logic-program evaluation. In: DPDS, pp. 46–55. IEEE, New York (1988)
Wolfson, O., Silberschatz, A.: Distributed processing of logic programs. In: SIGMOD, pp. 329–336. ACM, New York (1988)
Yang, M., Shkapsky, A., Zaniolo, C.: Parallel bottom-up evaluation of logic programs: DeALS on shared-memory multicore machines. In: Technical Communications of ICLP (2015)
Yang, M., Zaniolo, C.: Main memory evaluation of recursive queries on multicore machines. In: IEEE BigData, pp. 251–260. IEEE, New York (2014)
Zaniolo, C.: Logical foundations of continuous query languages for data streams. In: Datalog in Academia and Industry, pp. 177–189. Springer, Berlin (2012)
Zhang, W., Wang, K., Chau, S.C.: Data partition and parallel evaluation of Datalog programs. TKDE 7(1), 163–176 (1995)

Acknowledgements

This work was supported by NSF Grants IIS 1218471 and IIS 1118107. We would like to thank the reviewers and Matteo Interlandi for their comments. We also thank LogicBlox, especially Martin Bravenboer, Dung Nguyen, and Yannis Smaragdakis, for their assistance with the LogicBlox comparison.

Author information

Authors and Affiliations

University of California, Los Angeles, Los Angeles, CA, USA
Mohan Yang, Alexander Shkapsky & Carlo Zaniolo

Corresponding author

Correspondence to Mohan Yang.

About this article

Cite this article

Yang, M., Shkapsky, A. & Zaniolo, C. Scaling up the performance of more powerful Datalog systems on multicore machines. The VLDB Journal 26, 229–248 (2017). https://doi.org/10.1007/s00778-016-0448-z

Received: 17 February 2016
Revised: 22 July 2016
Accepted: 02 November 2016
Published: 01 December 2016
Issue Date: April 2017
DOI: https://doi.org/10.1007/s00778-016-0448-z
