Abstract
We present an Isabelle/HOL formalization of the first half of Bachmair and Ganzinger’s chapter on resolution theorem proving, culminating with a refutationally complete first-order prover based on ordered resolution with literal selection. We developed general infrastructure and methodology that can form the basis of completeness proofs for related calculi, including superposition. Our work clarifies fine points in the chapter, emphasizing the value of formal proofs in the field of automated reasoning.
References
Bachmair, L., Dershowitz, N., Plaisted, D.A.: Completion without failure. In: Aït-Kaci, H., Nivat, M. (eds.) Rewriting Techniques—Resolution of Equations in Algebraic Structures, vol. 2, pp. 1–30. Academic Press, London (1989)
Bachmair, L., Ganzinger, H.: Rewrite-based equational theorem proving with selection and simplification. J. Log. Comput. 4(3), 217–247 (1994)
Bachmair, L., Ganzinger, H.: Ordered chaining calculi for first-order theories of transitive relations. J. ACM 45(6), 1007–1049 (1998)
Bachmair, L., Ganzinger, H.: Resolution theorem proving. In: Robinson, A., Voronkov, A. (eds.) Handbook of Automated Reasoning, vol. I, pp. 19–99. Elsevier, Amsterdam (2001)
Ballarin, C.: Locales: a module system for mathematical theories. J. Autom. Reason. 52(2), 123–153 (2014)
Baumgartner, P., Waldmann, U.: Hierarchic superposition revisited. In: Lutz, C., Sattler, U., Tinelli, C., Turhan, A., Wolter, F. (eds.) Description Logic, Theory Combination, and All That—Essays Dedicated to Franz Baader on the Occasion of His 60th Birthday. LNCS, vol. 11560, pp. 15–56. Springer, Berlin (2019)
Bentkamp, A., Blanchette, J., Tourret, S., Vukmirović, P., Waldmann, U.: Superposition with lambdas. In: Fontaine, P. (ed.) CADE-27, LNCS, vol. 11716, pp. 55–73. Springer, Berlin (2019)
Biendarra, J., Blanchette, J.C., Bouzy, A., Desharnais, M., Fleury, M., Hölzl, J., Kuncar, O., Lochbihler, A., Meier, F., Panny, L., Popescu, A., Sternagel, C., Thiemann, R., Traytel, D.: Foundational (co)datatypes and (co)recursion for higher-order logic. In: Dixon, C., Finger, M. (eds.) FroCoS 2017, LNCS, vol. 10483, pp. 3–21. Springer, Berlin (2017)
Blanchette, J.C.: Formalizing the metatheory of logical calculi and automatic provers in Isabelle/HOL (invited talk). In: Mahboubi, A., Myreen, M.O. (eds.) CPP 2019, pp. 1–13. ACM (2019)
Blanchette, J.C., Fleury, M., Lammich, P., Weidenbach, C.: A verified SAT solver framework with learn, forget, restart, and incrementality. J. Autom. Reason. 61(3), 333–366 (2018)
Blanchette, J.C., Fleury, M., Traytel, D.: Nested multisets, hereditary multisets, and syntactic ordinals in Isabelle/HOL. In: Miller, D. (ed.) FSCD 2017, LIPIcs, vol. 84, pp. 11:1–11:18. Schloss Dagstuhl—Leibniz-Zentrum für Informatik (2017)
Blanchette, J.C., Kaliszyk, C., Paulson, L.C., Urban, J.: Hammering towards QED. J. Formaliz. Reason. 9(1), 101–148 (2016)
Blanchette, J.C., Popescu, A., Traytel, D.: Soundness and completeness proofs by coinductive methods. J. Autom. Reason. 58(1), 149–179 (2017)
Brand, D.: Proving theorems with the modification method. SIAM J. Comput. 4(4), 412–430 (1975)
Cruanes, S.: Logtk: A logic toolkit for automated reasoning and its implementation. In: Schulz, S., de Moura, L., Konev, B. (eds.) PAAR-2014, EPiC Series in Computing, vol. 31, pp. 39–49. EasyChair (2014)
Denzinger, J., Kronenburg, M., Schulz, S.: DISCOUNT—a distributed and learning equational prover. J. Autom. Reason. 18(2), 189–198 (1997)
Dershowitz, N., Manna, Z.: Proving termination with multiset orderings. Commun. ACM 22(8), 465–476 (1979)
Fleury, M., Blanchette, J.C., Lammich, P.: A verified SAT solver with watched literals using Imperative HOL. In: Andronick, J., Felty, A.P. (eds.) CPP 2018, pp. 158–171. ACM (2018)
Godoy, G., Nieuwenhuis, R.: Superposition with completely built-in abelian groups. J. Symb. Comput. 37(1), 1–33 (2004)
Gordon, M.J.C., Melham, T.F. (eds.): Introduction to HOL: A Theorem Proving Environment for Higher Order Logic. Cambridge University Press, Cambridge (1993)
Hirokawa, N., Middeldorp, A., Sternagel, C., Winkler, S.: Infinite runs in abstract completion. In: Miller, D. (ed.) FSCD 2017, LIPIcs, vol. 84, pp. 19:1–19:16. Schloss Dagstuhl—Leibniz-Zentrum für Informatik (2017)
Krauss, A.: Partial recursive functions in higher-order logic. In: Furbach, U., Shankar, N. (eds.) IJCAR 2006, LNCS, vol. 4130, pp. 589–603. Springer, Berlin (2006)
McCune, W.: Otter 2.0. In: Stickel, M.E. (ed.) CADE-10, LNCS, vol. 449, pp. 663–664. Springer, Berlin (1990)
Nieuwenhuis, R., Rubio, A.: Theorem proving with ordering and equality constrained clauses. J. Symb. Comput. 19(4), 321–351 (1995)
Nieuwenhuis, R., Rubio, A.: Paramodulation-based theorem proving. In: Robinson, A., Voronkov, A. (eds.) Handbook of Automated Reasoning, vol. I, pp. 371–443. Elsevier, Amsterdam (2001)
Nipkow, T.: Teaching semantics with a proof assistant: no more LSD trip proofs. In: Kuncak, V., Rybalchenko, A. (eds.) VMCAI 2012, LNCS, vol. 7148, pp. 24–38. Springer, Berlin (2012)
Nipkow, T., Klein, G.: Concrete Semantics: With Isabelle/HOL. Springer, Berlin (2014)
Nipkow, T., Paulson, L.C., Wenzel, M.: Isabelle/HOL: A Proof Assistant for Higher-Order Logic, LNCS, vol. 2283. Springer, Berlin (2002)
O’Connor, R.: Essential incompleteness of arithmetic verified by Coq. In: Hurd, J., Melham, T.F. (eds.) TPHOLs 2005, LNCS, vol. 3603, pp. 245–260. Springer, Berlin (2005)
Paulson, L.C.: A machine-assisted proof of Gödel’s incompleteness theorems for the theory of hereditarily finite sets. Rev. Symb. Logic 7(3), 484–498 (2014)
Peltier, N.: A variant of the superposition calculus. Archive of Formal Proofs 2016 (2016). https://www.isa-afp.org/entries/SuperCalc.shtml. Accessed 22 May 2020
Persson, H.: Constructive completeness of intuitionistic predicate logic—a formalisation in type theory. Licentiate thesis, Chalmers tekniska högskola and Göteborgs universitet (1996)
Pierce, B.C.: Lambda, the ultimate TA: Using a proof assistant to teach programming language foundations. In: Hutton, G., Tolmach, A.P. (eds.) ICFP 2009, pp. 121–122. ACM (2009)
Popescu, A., Traytel, D.: A formally verified abstract account of Gödel’s incompleteness theorems. In: Fontaine, P. (ed.) CADE-27, LNCS, vol. 11716, pp. 442–461. Springer, Berlin (2019)
Reger, G., Suda, M.: Checkable proofs for first-order theorem proving. In: Reger, G., Traytel, D. (eds.) ARCADE 2017, EPiC Series in Computing, vol. 51, pp. 55–63. EasyChair (2017)
Schlichtkrull, A.: Formalization of the resolution calculus for first-order logic. J. Autom. Reason. 61(4), 455–484 (2018)
Schlichtkrull, A., Blanchette, J.C., Traytel, D.: A verified prover based on ordered resolution. In: Mahboubi, A., Myreen, M.O. (eds.) CPP 2019, pp. 152–165. ACM (2019)
Schlichtkrull, A., Blanchette, J.C., Traytel, D., Waldmann, U.: Formalization of Bachmair and Ganzinger’s ordered resolution prover. Archive of Formal Proofs 2018 (2018). https://www.isa-afp.org/entries/Ordered_Resolution_Prover.html. Accessed 22 May 2020
Schlichtkrull, A., Blanchette, J.C., Traytel, D., Waldmann, U.: Formalizing Bachmair and Ganzinger’s ordered resolution prover. In: Galmiche, D., Schulz, S., Sebastiani, R. (eds.) IJCAR 2018, LNCS, vol. 10900, pp. 89–107. Springer, Berlin (2018)
Shankar, N.: Towards mechanical metamathematics. J. Autom. Reason. 1(4), 407–434 (1985)
Shankar, N.: Metamathematics, Machines, and Gödel’s Proof, Cambridge Tracts in Theoretical Computer Science, vol. 38. Cambridge University Press, Cambridge (1994)
Sutcliffe, G., Zimmer, J., Schulz, S.: TSTP data-exchange formats for automated theorem proving tools. In: Zhang, W., Sorge, V. (eds.) Distributed Constraint Problem Solving and Reasoning in Multi-Agent Systems, Frontiers in Artificial Intelligence and Applications, vol. 112, pp. 201–215. IOS Press, Amsterdam (2004)
Thiemann, R., Sternagel, C.: Certification of termination proofs using CeTA. In: Berghofer, S., Nipkow, T., Urban, C., Wenzel, M. (eds.) TPHOLs 2009, LNCS, vol. 5674, pp. 452–468. Springer, Berlin (2009)
Tourret, S.: A comprehensive framework for saturation theorem proving. Archive of Formal Proofs 2020 (2020). https://www.isa-afp.org/entries/Saturation_Framework.shtml. Accessed 22 May 2020
Voronkov, A.: AVATAR: the architecture for first-order theorem provers. In: Biere, A., Bloem, R. (eds.) CAV 2014, LNCS, vol. 8559, pp. 696–710. Springer, Berlin (2014)
Waldmann, U.: Cancellative abelian monoids and related structures in refutational theorem proving (part I/II). J. Symb. Comput. 33(6), 777–829/831–861 (2002)
Waldmann, U., Tourret, S., Robillard, S., Blanchette, J.: A comprehensive framework for saturation theorem proving. In: Peltier, N., Sofronie-Stokkermans, V. (eds.) IJCAR 2020. LNCS. Springer, Berlin (2020)
Wand, D.: Polymorphic + typeclass superposition. In: Schulz, S., de Moura, L., Konev, B. (eds.) PAAR-2014, EPiC Series in Computing, vol. 31, pp. 105–119. EasyChair (2014)
Weidenbach, C.: Combining superposition, sorts and splitting. In: Robinson, A., Voronkov, A. (eds.) Handbook of Automated Reasoning, vol. II, pp. 1965–2013. Elsevier, Amsterdam (2001)
Wenzel, M.: Isabelle/Isar—a generic framework for human-readable proof documents. In: Matuszewski, R. , Zalewska, A. (eds.) From Insight to Proof: Festschrift in Honour of Andrzej Trybulec, Studies in Logic, Grammar, and Rhetoric, vol. 10(23). University of Białystok (2007)
Wenzel, M.: Isabelle/jEdit–a prover IDE within the PIDE framework. In: Jeuring, J., Campbell, J.A., Carette, J., Reis, G.D., Sojka, P., Wenzel, M., Sorge, V. (eds.) CICM 2012, LNCS, vol. 7362, pp. 468–471. Springer, Berlin (2012)
Zhang, H., Kapur, D.: First-order theorem proving using conditional rewrite rules. In: Lusk, E.L., Overbeek, R.A. (eds.) CADE-9, LNCS, vol. 310, pp. 1–20. Springer, Berlin (1988)
Acknowledgements
Christoph Weidenbach repeatedly discussed Bachmair and Ganzinger’s chapter with us and hosted Schlichtkrull at the Max-Planck-Institut in Saarbrücken. Christian Sternagel and René Thiemann answered our questions about IsaFoR. Mathias Fleury, Florian Haftmann, and Tobias Nipkow helped enrich and reorganize Isabelle’s multiset library. Mathias Fleury, Robert Lewis, Simon Robillard, Mark Summerfield, Sophie Tourret, and the anonymous reviewers suggested many textual improvements.
Funding
Schlichtkrull was supported by a Ph.D. scholarship in the Algorithms, Logic and Graphs section of DTU Compute and by the \(\mathrm {LIGHT}^{ est }\) project, which is partially funded by the European Commission as an Innovation Act as part of the Horizon 2020 research and innovation program (Grant Agreement No. 700321, LIGHTest). Blanchette was partly supported by the Deutsche Forschungsgemeinschaft (DFG) project Hardening the Hammer (Grant NI 491/14-1). He also received funding from the ERC under the European Union’s Horizon 2020 program (Grant Agreement No. 713999, Matryoshka) and from the Netherlands Organization for Scientific Research (NWO) under the Vidi program (Project No. 016.Vidi.189.037, Lean Forward). Traytel was partly supported by the DFG program Program and Model Analysis (PUMA, doctorate Program 1480).
Appendix A: Errors and Imprecisions Discovered in the Chapter
In the chapter, we encountered several mathematical errors and imprecisions of various levels of severity. We also found lemmas that were stated but not explicitly applied afterwards. For reference, this appendix provides an exhaustive list of our findings. This list illustrates how difficult it is to write paper proofs correctly, and reminds us that we cannot rely on reviewers or second readers to catch all mistakes. We hope that our corrections will further increase the chapter’s value to the research community.
Regarding the errors and imprecisions, we have ignored infelicities that are not mathematical in nature, such as typos and LaTeX macros gone wrong (e.g., “by the defn[candidate model]candidate model for N” on page 34); for such errors, careful reading, not formalization, is the remedy. We have also ignored minor ambiguities if they can easily be resolved by appealing to the context and the reader’s common sense (e.g., whether the clause \(C \vee A \vee \cdots \vee A\) may contain zero occurrences of A).
- One of Lemma 3.4’s claims is that if clause C is true in \(I^D\), then C is also true in \(I_{D'}\), where \(C \preceq D \preceq D'\). This does not hold if \(C = D = D'\) and C is productive. Similarly, the first sentence of the proof is wrong if \(D = D'\) and D is productive: “First, observe that \(I_D \subseteq I^D \subseteq I_{D'} \subseteq I^{D'} \subseteq I_N\), whenever \(D' \succeq D\).”
- The last occurrence of \(D'\) in the statement of Lemma 3.7 should be changed to C. In addition, it is not clear whether the phrase “another clause C” implies that \(C \ne D\), but the counterexample we gave in Sect. 4 works in both cases. Correspondingly, in the proof, the case distinction is incomplete, as can be seen by specializing the proof for the counterexample.
- In the chapter’s Figure 2, in Sect. 3, the selection function is wrongly applied: References to \(S(D)\) should be changed to \(S(\lnot\,A_1 \vee \cdots \vee \lnot\,A_n \vee D)\). Moreover, in condition (iii), it is not clear with respect to which clause the “selected atom” must be considered, the two candidates being \(S(\lnot\,A_1 \vee \cdots \vee \lnot\,A_n \vee D)\) and \(S(C_i \vee A_i \vee \cdots \vee A_i)\). We assume the latter is meant. Finally, phrases like “\(A_1\) is maximal with respect to D” (here and in Figure 4) are slightly ambiguous, because it is unclear whether \(A_1\) denotes an atom or a (positive) literal, and whether it must be maximal with respect to D’s atoms or literals. From the context, we infer that an atom-with-atom comparison is meant.
- Soundness is required in the chapter’s Sect. 4.1, even though it is claimed in Sect. 2.4 that only consistency-preserving inference systems will be considered.
- In Sect. 4.1, it is claimed that “a fair derivation can be constructed by exhaustively applying inferences to persisting formulas.” However, this construction is circular: The notion of persisting formula (i.e., the formulas that belong to the limit) depends itself on the derivation.
- In the proof of Theorem 4.3, the case where \(\gamma \in \smash {\mathcal {R}_{\scriptscriptstyle \mathcal {I}}}(N_\infty \setminus \smash {\mathcal {R}_{\scriptscriptstyle \mathcal {F}}}(N_\infty ))\) is not covered.
- In Sect. 4.2, the phrase “side premises that are true in N” must be understood as meaning that the side premises both belong to N and are true in \(I_N\).
- Lemma 4.5 states the basic properties of the redundant clause operator \(\smash {\mathcal {R}_{\scriptscriptstyle \mathcal {F}}}\) (monotonicity and independence). Lemma 4.6 states the corresponding properties of the redundant inference operator \(\smash {\mathcal {R}_{\scriptscriptstyle \mathcal {I}}}\). As justification for Lemma 4.6, the authors tell us that “the proof uses Lemma 4.5,” but redundant inferences are a more general concept than redundant clauses, and we see no way to bridge the gap.
- Similarly, in the proof of Theorem 4.9, the application of Lemma 4.5 does not fit. What is needed is a generalization of Lemma 4.6.
- In condition (ii) of Figure 4, Sect. 4.2, \(A_{ii}\sigma \) should be changed to \(A_{i\!j}\sigma \).
- In the nth side premise of Figure 4, Sect. 4.2, \(A_{1n}\) should be changed to \(A_{n1}\).
- In Figure 4, Sect. 4.2, the same mistakes occur as in Figure 2 regarding the application of the selection function.
- Sect. 4.3 states “Subsumption defines a well-founded ordering on clauses.” A simple counterexample is an infinite sequence repeating some clause. “Subsumption” should be replaced by “proper subsumption.”
- In Lemma 4.10, it is not clear which selection function is used. When the lemma is applied in the proofs of Lemma 4.11 and Theorem 4.13, it must be \(S_{\mathcal {O}_\infty }\).
- In Lemma 4.10, \(G(\mathcal {S})\) and \(G(\mathcal {S'})\) are related by \(\mathrel {\rhd }^*\), but \(\rhd \) is needed in the proofs of Lemma 4.11 and Theorem 4.13 since then derivations in RP, which are possibly infinite, can be projected to theorem proving processes. However, \(G(\mathcal {S}) \rhd G(\mathcal {S'})\) does not hold in one of the cases since a combination of deduction and deletion is required. A solution is to change the definition of \(\rhd \) to allow such combinations.
- In Lemma 4.10, it is not clear that the extension used should be the same between any considered pair of states. Otherwise, the lemma cannot be used to project derivations in RP to theorem proving processes.
- In Lemma 4.11, it is not clear which selection function is used. When the lemma is applied in the proof of Theorem 4.13, it must be \(S_{\mathcal {O}_\infty }\).
- A step in the proof of Lemma 4.11 considers a clause \(D \in \mathcal {P}_l\) which has a nonredundant instance C. It is claimed that when D is removed from \(\mathcal {P}\), another clause \(D'\) with C as instance appears in some \(\mathcal {O}_l'\). That, however, does not follow if D was removed by backward subsumption. The problem can be resolved by choosing D as minimal, with respect to subsumption, among the clauses that generalize C in the derivation. This can be done since proper subsumption is well founded.
- In Lemma 4.11, a minor inconsistency is that the described first-order derivation is indexed from 1 instead of 0.
- In the proof of Theorem 4.13, the conclusion of Lemma 4.11 is stated as \(N_\infty \setminus \mathcal {R}(N_\infty ) \subseteq \mathcal {O}_\infty \), but it should have been \(N_\infty \setminus \mathcal {R}(N_\infty ) \subseteq G(\mathcal {O}_\infty )\). Furthermore, when Lemma 4.11 was first stated, the conclusion was \(N_\infty \setminus \smash {\mathcal {R}_{\scriptscriptstyle \mathcal {F}}}(N_\infty ) \subseteq G(\mathcal {S}_\infty )\). By fairness, the two are equivalent, but we find \(N_\infty \setminus \mathcal {R}(N_\infty ) \subseteq G(\mathcal {O}_\infty )\) more intuitive since it more clearly expresses that all nonredundant clauses become old.
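The first item in the list above (the failure of Lemma 3.4 for a productive clause) can be replayed on a one-clause example. The following Python fragment is only a minimal sketch of the candidate model construction, not the formalization itself. It makes several simplifying assumptions: clauses are ground and represented as sets of literals (so duplicate literals are ignored), atoms are integers ordered numerically, a negative literal is larger than the positive literal on the same atom, and there is no selection function. All names (`candidate_models`, `is_true`, and so on) are hypothetical and do not correspond to identifiers in the chapter or in the Isabelle development.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

Literal = Tuple[int, bool]      # (atom, is_positive); atoms are ordered as integers
Clause = FrozenSet[Literal]     # assumption: a ground clause is a set of literals
Interp = FrozenSet[int]         # an interpretation is the set of true atoms


def lit_key(lit: Literal) -> Tuple[int, bool]:
    """Order literals by atom; for the same atom, the negative literal is larger."""
    atom, positive = lit
    return (atom, not positive)


def clause_key(clause: Clause) -> List[Tuple[int, bool]]:
    """Compare clauses by their literals sorted in descending order.

    For sets over a total order this agrees with the multiset extension.
    """
    return sorted((lit_key(l) for l in clause), reverse=True)


def is_true(clause: Clause, interp: Iterable[int]) -> bool:
    """A clause is true if at least one of its literals is true."""
    atoms = set(interp)
    return any((atom in atoms) == positive for atom, positive in clause)


def candidate_models(clauses: Iterable[Clause]) -> Dict[Clause, Tuple[Interp, Interp]]:
    """Map each clause C to the pair (I_C, I^C), processing clauses in increasing order."""
    models: Dict[Clause, Tuple[Interp, Interp]] = {}
    interp: set = set()
    for c in sorted(clauses, key=clause_key):
        below = frozenset(interp)                    # I_C: atoms produced by smaller clauses
        if c and not is_true(c, below):
            atom, positive = max(c, key=lit_key)
            if positive:                             # C is productive and produces its maximal atom
                interp.add(atom)
        models[c] = (below, frozenset(interp))       # I^C = I_C plus the atom produced by C, if any
    return models


# Counterexample shape: N = {A}, with C = D = D' = A and C productive.
A = 1
C: Clause = frozenset({(A, True)})
I_below, I_upto = candidate_models({C})[C]
print(is_true(C, I_upto), is_true(C, I_below))
```

Under these assumptions, the clause is true in \(I^C\) but false in \(I_C\), and \(I^C \subseteq I_C\) fails, which is the situation described for \(C = D = D'\).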
Chief among the factors that contribute to making the chapter hard to follow is that many lemmas are stated (and usually proved) but not referenced later. We already mentioned the unfortunate Lemma 3.7. Sect. 4 contains several other specimens:
- Theorem 4.3 (fair_derive_saturated_upto) states a completeness theorem for fair derivations. However, in Sect. 4.3, fairness is defined differently, and neither the text nor the formalization applies this theorem.
- For the same reason, the property stated in the next-to-last sentence of Sect. 4.1 (standard_redundancy_criterion_extension_fair_iff), which lifts fairness with respect to \((\smash {\mathcal {R}_{\scriptscriptstyle \mathcal {F}}}, \smash {\mathcal {R}_{\scriptscriptstyle \mathcal {I}}})\) to a standard extension \((\smash {\mathcal {R}_{\scriptscriptstyle \mathcal {F}}}, \smash {\mathcal {R}'_{\scriptscriptstyle \mathcal {I}}})\), is not needed later.
- Lemma 4.2 (sat_deriv_Liminf_iff, Ri_limit_Sup, Rf_limit_Sup) is not referenced in the text, but we need it (sat_deriv_Liminf_iff, Ri_limit_Sup) to prove Theorem 4.13 (fair_state_seq_complete). We also need it (Rf_limit_Sup) to prove Lemma 4.11 (fair_imp_Liminf_minus_Rf_subset_ground_Liminf_state).
- Lemma 4.6 (saturated_upto_complete_if) is not referenced in the text, but we need it to prove Lemma 4.10 (resolution_prover_ground_derivation), Lemma 4.11 (fair_imp_Liminf_minus_Rf_subset_ground_Liminf_state), and Theorem 4.13 (fair_state_seq_complete).
- Theorem 4.8 (Ri_effective) is not referenced in the text, but we need it to prove Theorem 4.13 (fair_state_seq_complete).
- Theorem 4.9 (saturated_upto_complete) is invoked implicitly in the next-to-last sentence in the proof of Theorem 4.13 (fair_state_seq_complete).
Cite this article
Schlichtkrull, A., Blanchette, J., Traytel, D. et al. Formalizing Bachmair and Ganzinger’s Ordered Resolution Prover. J Autom Reasoning 64, 1169–1195 (2020). https://doi.org/10.1007/s10817-020-09561-0
DOI: https://doi.org/10.1007/s10817-020-09561-0