E.W.Ayers
Chapter 3
A development calculus

Now that we have reviewed the requisite background material, I can define the moving parts of a human-like theorem prover. The driving principle is to find ways of representing proofs at the same level of detail that a human mathematician would use to communicate to colleagues.

The contributions of this chapter are:

  • The Box datastructure, a development calculus (Section 3.3) designed to better capture how humans reason about proofs while also being formally sound.

  • A set of transformations (moves) on Box which preserve this soundness (Section 3.5).

  • A natural language write-up component converting proof objects created with this layer to an interactive piece of text (Section 3.6).

  • An 'escape hatch' from the Box datastructure to the metavariable-oriented goal state system used by Lean (Section 3.4.4, Appendix A); this enables compatibility between Box-style proofs and existing automation and verification within the proof assistant.

I wish to make sure that the system integrates with an existing proof assistant (in this case Lean). This is because, by plugging in to an existing prover, it is possible to gain leverage by utilising the already developed infrastructure for that prover such as parsers, tactics and automation. Using an existing prover also means that the verification of proofs can be outsourced to the prover's kernel.

The first challenge is to find a suitable way of determining what it means to be 'human-like'. This is the first research question of Section 1.2 and I provided a review in Section 2.7. Humans think differently to each other, and I do not wish to state that there is a 'right' way to perform mathematics. However I do wish to argue that there are certain ways in which the current methods for performing ITP should be closer to the general cluster of ways in which humans talk about and solve problems.

In this chapter I will investigate some ways in which the inference rules that provers take could be made more human-like and then introduce a new proving abstraction layer, HumanProof, written in the Lean 3 theorem prover, implementing these ideas. Later, in Chapter 6 I will gather thoughts and ratings from real mathematicians about the extent to which the developed system achieves these goals.

In Section 3.1, I will first present an example proof produced by a human to highlight the key features of 'human-like' reasoning that I wish to emulate. Then in Section 3.2 I will give an overview of the resulting design and underline the primary design decisions and the evidence that drives them. In Section 3.4 I will provide the details and theory of how the system works, defining the key Box structure and the 'moves' on Boxes, as well as how to run standard tactics within Boxes (Section 3.4.4). This theoretical basis will then be used to define the human-like moves in Section 3.5. Finally, I will detail the natural language generation pipeline for HumanProof in Section 3.6.

3.1. Motivation

3.1.1. The need for human-like systems

In Section 1.1 I noted that non-specialist mathematicians have yet to widely accept proof assistants despite the adoption of other tools such as computer algebra systems. One way in which to improve this situation is to reduce the cost of learning to use proof assistants through making the way in which they process proofs more similar to how a human would. Doing so would reduce the learning curve for a new user by making the proofs more closely match what the mathematician already knows.

This rules out many automated reasoning methods such as resolution [BG01] (compare with non-resolution theorem proving [Ble81], discussed further in Section 2.7). This is because typically the original statement of the proposition to be proved will first be reduced to a normal form and mechanically manipulated with a small set of inference rules. The resulting proof will be scarcely recognisable to a mathematician as a proof of the proposition, even if it is accepted by the kernel of a proof assistant. As discussed in Section 1.1 and Section 2.6, and as will be confirmed in Chapter 6, mathematicians care not just about a certificate that a statement is correct but also about the way in which the statement is correct.

Given some new way of creating proofs, how can we determine whether the created proofs are more 'human-like' than those of some other system? The way I propose here is to require that the program be able to imitate the reasoning of humans at least well enough to produce convincing natural language write-ups of the proofs that it generates, and then to test the convincingness of these write-ups by asking mathematicians. This approach is shared by the previous work of Gowers and Ganesalingam [GG17] (henceforth abbreviated G&G), who use a framework similar to the HumanProof system presented in this thesis to produce natural language write-ups of proofs for some lemmas in the domain of metric space topology. The work presented in this thesis expands significantly on the work of G&G.

3.1.2. Modelling human-like reasoning

Building on the background where I explored the literature on the definition of 'human-like' (Section 2.7) and 'understandable' (Section 2.6.1) proofs, my goal in this section is to find some specific improvements to the way in which computer-aided mathematics is done.

One of the key insights of Gowers and Ganesalingam is that humans reason with a different 'basis' of methods than the logical operations and tactics that are provided to the user of an ITP. For example, a hypothesis such as a function being continuous expands to a formula (3.1) with interlaced quantifiers.

(3.1).

f is continuous ≔ ∀ x ∈ X, ∀ ε > 0, ∃ δ > 0, ∀ y ∈ X, d(x, y) < δ → d(f(x), f(y)) < ε

Definition of a continuous function f : X → Y for metric spaces X, Y. Here d is the distance metric for X or Y as appropriate.

However, in a mathematical text, a hypothesis such as 'f is continuous' will be applied in one go, whereas in an ITP this process will need to be separated into numerous steps.

Another example, with the opposite problem, is automated tactics such as the tableaux prover blast [Pau99]. The issue with these is that their process is opaque and leaves little explanation for why they succeed or fail. They may also step over multiple stages that a human would rather see spelled out in full. The most common occurrence of this is definition expansion: two terms may be identical modulo definition expansion, but a proof found in a textbook will often take the time to point out when such an expansion takes place.
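To illustrate the definition-expansion point concretely, here is a small Lean 3 example (the definition `double` is invented for illustration): the kernel accepts `rfl` because both sides are identical once `double` is unfolded, a step that a textbook proof would typically spell out.

```lean
-- A definition that the proof below silently unfolds.
def double (n : ℕ) : ℕ := n + n

-- `double 2` and `4` are identical modulo definition expansion,
-- so reflexivity closes the goal with no visible intermediate step.
example : double 2 = 4 := rfl
```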

This points towards creating a new set of 'moves' for constructing proofs, each corresponding more closely to a reasoning step as might be performed by a human mathematician.

3.1.3. Structural sharing

When humans reason about mathematical proofs, they will often flip between forwards reasoning and backwards reasoning. (Broadly speaking, forwards reasoning is any mode of modifying the goal state that acts only on the hypotheses of the proof state, whereas backwards reasoning modifies the targets.) The goal-centric proof state used by ITPs can make this kind of reasoning difficult. In the simplest example, suppose that the goal is P ∧ Q → Q ∧ P; that is, given the hypothesis P ∧ Q, prove Q ∧ P, where P and Q are propositions and ∧ is the logical-and operation. One solution is to perform a split on the target to produce P ∧ Q ⊢ Q and P ∧ Q ⊢ P. However, a conjunction elimination on the P ∧ Q hypothesis will then need to be performed on both of the new targets. This is avoided if the elimination is performed before splitting the target. In this simplified example it is clear in which order the forwards and backwards reasoning should be performed. But in more complex proofs, it may be difficult to see ahead how to proceed; a series of backwards reasoning steps may provide a clue as to how forwards reasoning should be applied. The usual way that this problem is solved is for the human to edit an earlier part of the proof script with the forwards reasoning step on discovering this. I reject this solution because it means that the resulting proof script no longer represents the reasoning process of the creator: the fact that the forwards reasoning step was motivated by the goal state at a later point is lost.
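The duplication is visible in a conventional Lean 3 tactic proof of this goal, where the elimination step must be repeated in both branches created by `split` (a minimal sketch, not HumanProof output):

```lean
example (P Q : Prop) (h : P ∧ Q) : Q ∧ P :=
begin
  split,
  { cases h with hp hq,  -- ∧-elimination performed here…
    exact hq },
  { cases h with hp hq,  -- …and duplicated here
    exact hp }
end
```

Performing `cases h` once before `split` avoids the duplication, but discovering this usually means retrospectively editing the script, which is exactly the problem described above.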

The need to share structure among objects in the name of efficiency has been studied at least as far back as Boyer and Moore [BM72]; however, the motivation for introducing it here is purely for the purpose of creating human-like proofs.

The solution that I use here is a different representation of the goal state that allows for structural sharing. This alteration puts the proof state calculus more in the camp of OLEG [McB00] and the G&G prover. The details of the implementation of structural sharing are presented later in Section 3.5.4.

Structural sharing can also be used to implement backtracking and counterfactuals. For example, suppose that we need to prove P ∨ Q. One could apply the ∨-left-introduction rule P → P ∨ Q, but then one might need to backtrack later in the event that really the right-introduction rule Q → P ∨ Q should have been used instead. Structural sharing lets us split a target into two counterfactuals.
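In a conventional Lean 3 tactic proof this choice must be committed to up front (a minimal sketch): choosing the wrong introduction rule leaves an unprovable goal, and the step must be undone rather than kept alongside its counterfactual.

```lean
example (P Q : Prop) (hq : Q) : P ∨ Q :=
begin
  -- `left` here would leave the unprovable goal `P` and force backtracking;
  -- a disjunctive pair of boxes instead keeps both alternatives alive.
  right,
  exact hq
end
```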

3.1.4. Verification

One of the key benefits of proof assistants is that they can rigorously check whether a proof is correct. This distinguishes the HumanProof project from the prior work of G&G, where no formal proof checking was present. While I have argued in Section 2.6 (and as will later be suggested by the results of my user study in Section 6.6) that this guarantee of correctness is less important for attracting working mathematicians, I wish to demonstrate here that there need not be a conflict between having a prover which is easy for non-specialists to understand and one which is formally verified.

3.1.5. What about proof planning?

Proof planning is the process of creating proofs using abstract proof methods that are assembled with the use of classical AI planning algorithms (an introduction to classical AI planning can be found in Russell and Norvig [RN10, Pt. III]). The concept of proof planning was first introduced by Bundy [Bun88]. A review of proof planning is given in Section 2.7.2.

The primary issue with proof planning is that there is a sharp learning curve. In order to get started with proof plans, one must learn a great deal of terminology and a new way of thinking about formalised mathematics. Bundy presents his own critique of proof planning [Bun02], which goes into more detail on this point.

The study of proof planning has fallen out of favour in the 21st century so far, possibly in relation to the rise of practical automated provers such as E (https://wwwlehre.dhbw-stuttgart.de/~sschulz/E/E.html) [SCV19] and the SMT solver Z3 (https://github.com/Z3Prover/z3) [MB08], and their incorporation into ITP through the use of 'hammer' software like Isabelle's Sledgehammer [BN10]. I share a great deal of the ideals that directed proof planning, and the equational reasoning system presented in Chapter 4 is inspired by it. However, I wish to take a more practical stance: the additional abstractions that are placed atop the underlying tactic system should be transparent, in that they are understandable without needing to be familiar with proof planning, and with easy 'escape hatches' back to the tactic world if needed.

3.2. Overview of the software

The software implementation of the work presented in this thesis is called 'HumanProof' and is implemented using the Lean 3 prover (https://leanprover-community.github.io). The source code can be found here (coming soon). In this section I will give a high-level overview of the system and some example screenshots. A general overview of the system and how it relates to the underlying Lean theorem prover is shown in Figure 3.2.

Figure 3.2.

High-level overview of the main modules that comprise the HumanProof system and how these interface with Lean, ProofWidgets and the VSCode text editor. The green parts of the diagram are contributions of this thesis. ProofWidgets (Chapter 5) was spun out from HumanProof for use as a general-purpose GUI system so that it could be used in other community projects (see the discussion of community-built widgets in Chapter 5).

Given a theorem to prove, HumanProof is invoked by marking the lemma with a special begin [hp] script block in the proof document (see Figure 3.3). This initialises HumanProof's Box datastructure with the assumptions and target proposition of the proof. The initial state of the prover is shown in the goal view of the development environment, here called the Info View. Using the ProofWidgets framework (developed in Chapter 5), this display of the state is interactive: the user can click at various points in the document to determine their next steps. Users can then manipulate this datastructure either through the use of interactive buttons or by typing commands into the proof script in the editor. In the event of clicking the buttons, the commands are immediately added to the proof script sourcefile as if the user had typed them themselves (the left panel of Figure 3.3). In this way, the user can create proofs interactively whilst still preserving the plaintext proof document as the single source of truth; this ensures that there is no hidden state in the interactive view that is needed for Lean to reconstruct a proof of the statement. While the proof is being created, the system also produces a natural language write-up of the proof (Section 3.6) that is displayed alongside the proof state; as the proof progresses, users can see the natural language proof grow too.

The system also comes equipped with a module for solving equalities using the 'subtasks algorithm' (Chapter 4). The subtasks algorithm uses a hierarchical planning system to produce an equality proof that is intended to match the way that a human would create the proof, as opposed to a more machine-like approach such as E-matching [BN98, Ch. 10]. The output of this subsystem is a chain of equations that is inserted into the natural language write-up.

Figure 3.3.

Screenshot of HumanProof in action on a test lemma. To the left is the code editor. The user invokes HumanProof with the begin [hp] command. The blue apply H button can be clicked to automatically insert more proof script.

3.3. The Box datastructure

At the heart of HumanProof is a development calculus using a datastructure called Box. The considerations from Section 3.1.3 led to the development of an 'on-tree' development calculus. Rather than storing a flat list of goals and a metavariable context alongside the result, the entire development state is stored in a recursive tree structure which I will call a box tree. The box tree, to be defined in Section 3.3.2, stores the proof state as an incomplete proof tree with on-tree metavariable declarations which is then presented to the user as a nested set of boxes.

As we shall investigate in Section 3.3.5, this design puts the Box calculus in the company of McBride's OLEG [McB00] and G&G's Robot [GG17]. A more abstract treatment can be found in the work of Sterling and Harper [SH17], implemented within the RedPRL theorem prover (https://redprl.org).

The novel contribution of the development calculus developed here is that it works within a Spiwack-style [Spi11] flat metavariable context model as is used in Lean (see Section 2.4 for more background information). That is, it is a layer atop the existing metavariable context system detailed in Section 2.4.3. This means that it is possible for the new calculus to work alongside an existing prover, rather than having to develop an entirely new one as was required for OLEG and Robot. This choice opens many possibilities: one can leverage many of the advanced features that Lean offers, such as a full-fledged modern editor and metaprogramming toolchain [EUR+17]. This approach also reduces some of the burden of correctness pressed upon alternative theorem provers, because we can outsource correctness checking to the Lean kernel. Even with this protection, it is still frustrating when a development calculus produces an incorrect proof, and so I will also provide some theoretical results here and in Appendix A on conditions that must be met for a proof step to be sound. The design of the Box calculus is also independent of any particular features of Lean, and so a variant of it may also be implemented in other systems.

The central datatype is the Box. This performs the role of holding a partially constructed proof object and a representation of the goals that remain to be solved. As discussed in Section 3.1.3, the purpose is to have a structurally shared tree of goals and assumptions that is also compatible with Lean tactics.

3.3.1. An example of Box in action.

Boxes are visualised as a tree of natural-deduction-style goal states. Let's start with a minimal example to get a feel for the general progression of a proof with the Box architecture. Let's prove P ∨ Q → Q ∨ P using Boxes. The initial box takes the form (3.4).

(3.4).

?𝑡 : P ∨ Q → Q ∨ P

And we can read (3.4) as saying "we need to show P ∨ Q → Q ∨ P". The ?𝑡 is the name of the metavariable that the proof of this will be assigned to. The first action is to perform an intro step to get (3.5).

(3.5).
𝑕 : P ∨ Q

?𝑡 : Q ∨ P

To be read as "given P ∨ Q, we need to show Q ∨ P". So far the structure is the same as would be observed in a flat goal list structure. The idea is that everything above a horizontal line is a hypothesis (something that we have) and everything below is a target (something we want). When all of the targets are solved, we should have a valid proof of the original target. At this point, we would typically perform an elimination step on 𝑕 (e.g., cases in Lean) (3.6).

(3.6).

𝑕₁ : P

?𝑡₁ : Q ∨ P
𝑕₂ : Q

?𝑡₂ : Q ∨ P

Here in (3.6) we can see nested boxes; each nested box below the horizontal line must be solved to solve the parent box. However, in the box architecture there is an additional move available: branching on the goal (3.7).

(3.7).
𝑕 : P ∨ Q

?𝑡₁ : Q
∨
?𝑡₂ : P

If a pair of boxes appears with an ∨ between them, then either of the boxes can be solved to solve the parent box. And then we can eliminate 𝑕 on the branched box:

(3.8).

𝑕₁ : P

?𝑡₁₁ : Q
∨
?𝑡₁₂ : P

𝑕₂ : Q

?𝑡₂₁ : Q
∨
?𝑡₂₂ : P

Now at this point, we can directly match 𝑕₁ with ?𝑡₁₂ and 𝑕₂ with ?𝑡₂₁ to solve the box. Behind the scenes, the box is also producing a result proof term that can be checked by the proof assistant's kernel.
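For this example, the resulting proof term corresponds to the following kernel-checkable Lean 3 term (names chosen to echo the boxes above; this is a rendering for illustration, not literal system output):

```lean
-- 𝑕₁ is matched with a proof of P (the right disjunct of Q ∨ P),
-- 𝑕₂ with a proof of Q (the left disjunct).
example (P Q : Prop) : P ∨ Q → Q ∨ P :=
λ h, or.elim h (λ h₁, or.inr h₁) (λ h₂, or.inl h₂)
```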

3.3.2. Definition of Box

The above formulation is intended to match the architecture designed in G&G, so that all of the same 'moves' developed in G&G are available. Unlike G&G, the system also interfaces with a flat goal-based development calculus, and so it is possible to use both G&G moves and Lean tactics within the same development calculus. To do this, let's formalise the system presented above in Section 3.3.1 with the following Box datatype (3.9). Define Binder := (name : Name) × (type : Expr) to be a pair of a name identifier and a type, with notation (name ∶ type), using a smaller colon to keep the distinction from a meta-level type annotation.

(3.9).

Inductive definition of Box.

Box ::=
| ℐ (x : Binder) (b : Box) : Box
| 𝒯 (m : Binder) (b : Box) : Box
| 𝒭 (r : Expr) : Box
| 𝒜 (b₁ : Box) (r : Binder) (b₂ : Box) : Box
| 𝒪 (b₁ : Box) (b₂ : Box) : Box
| 𝒱 (x : Binder) (t : Expr) (b : Box) : Box
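As a concrete, lightly simplified sketch, the datatype (3.9) could be rendered in Lean 3 meta code as follows, with ASCII constructor names standing in for the calligraphic letters; the `Binder` structure is hand-rolled here for the sketch rather than taken from any particular library:

```lean
meta structure Binder :=
(name : name)
(type : expr)

meta inductive Box : Type
| I : Binder → Box → Box         -- ℐ: introduce a variable/hypothesis
| T : Binder → Box → Box         -- 𝒯: introduce a target metavariable
| R : expr → Box                 -- 𝒭: the result proof term
| A : Box → Binder → Box → Box   -- 𝒜: conjunctive pair (both must be solved)
| O : Box → Box → Box            -- 𝒪: disjunctive pair (either may be solved)
| V : Binder → expr → Box → Box  -- 𝒱: value binder (x := t)
```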

I will represent instances of the Box type with a 2D box notation defined in (3.10) to make the connotations of the datastructure more apparent.

(3.10).

Visualisation rules for the Box type. Each rule is given as a pair 𝐿 ↦ 𝑅, where 𝐿 is a constructor for Box and 𝑅 is its visualisation. The idea is that everything above the horizontal line in the box is a hypothesis or a value that we have. Everything below a line within a box is a target, something that we need to find through the proof discovery process. This visualisation is also implemented in Lean using the widgets framework presented in Section 5.7.

ℐ (𝑥 ∶ α) 𝑏 ↦
𝑥 : α

...𝑏

𝒯 (𝑥 ∶ α) 𝑏 ↦
?𝑥 : α
...𝑏

𝒭 𝑟 ↦
𝑟

𝒜 𝑏₁ (𝑥 ∶ α) 𝑏₂ ↦
𝑥 := ...𝑏₁
...𝑏₂

𝒪 𝑏₁ 𝑏₂ ↦
...𝑏₁
∨
...𝑏₂

𝒱 (𝑥 ∶ α) 𝑡 𝑏 ↦
𝑥 := 𝑡
...𝑏

These visualisations are also presented directly to the user through the use of the widgets UI framework presented in Chapter 5. The details of this visualisation are given in Section 5.7.

To summarise the roles for each constructor:

  • ℐ 𝑥 𝑏 is a variable introduction binder; it does the same job as a lambda binder for expressions and is used to introduce new hypotheses and variables.

  • 𝒯 𝑚 𝑏 is a target binder; it introduces a new metavariable ?𝑚 that the child box 𝑏 depends on.

  • 𝒭 𝑟 is the result box; it depends on all of the variables and targets that are declared above it. It represents the proof term that is returned once all of the target metavariables are solved. Extracting a proof term from a well-formed box will be discussed in Section 3.4.

  • 𝒜 𝑏₁ (𝑥 ∶ α) 𝑏₂ is a conjunctive pair of boxes. Both boxes have to be solved to complete the proof. Box 𝑏₂ depends on the variable 𝑥; when 𝑏₁ is solved, 𝑥 will be replaced with the resulting proof term of 𝑏₁.

  • 𝒪 𝑏₁ 𝑏₂ is a disjunctive pair; if either of the child boxes is solved, then so is the whole box. This is used to implement branching and backtracking.

  • 𝒱 𝑥 𝑡 𝑏 is a value binder. It introduces a new variable 𝑥 assigned to the value 𝑡.

Boxes also have a set of well-formed conditions designed to follow the typing judgements of the underlying proof-assistant development calculus. This will be developed in Section 3.4.
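For example, writing ℐ for the introduction constructor, the intermediate state (3.5) of the running example corresponds to the following term (a notation sketch in the Box grammar of (3.9)):

```
-- State (3.5) as a Box term: introduce the hypothesis 𝑕, then a target
-- metavariable ?𝑡, whose eventual assignment is returned as the result.
ℐ (𝑕 ∶ P ∨ Q) (𝒯 (𝑡 ∶ Q ∨ P) (𝒭 ?𝑡))
```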

3.3.3. Initialising and terminating a Box

Given an expression representing a theorem statement P : Expr with ∅ ⊢ P ∶ Prop, we can initialise a box to solve P as 𝑏₀ := 𝒯 (𝑡 ∶ P) (𝒭 ?𝑡) (3.11).

(3.11).

Initial 𝑏₀ : Box given P ∶ Prop.


?𝑡 : P
?𝑡

In the case that P also depends on a telescope of hypotheses Γ ⊢ P ∶ Prop, these can be incorporated by prepending to the initial 𝑏₀ in (3.11) an ℐ box for each 𝑕 ∈ Γ.
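For instance, with a hypothetical telescope Γ = [𝑕₁ ∶ A, 𝑕₂ ∶ B], the initialisation would be (a notation sketch, with ℐ the introduction constructor from (3.9)):

```
-- Initial box for Γ ⊢ P: one ℐ binder per hypothesis,
-- then the target metavariable and its result.
𝑏₀ := ℐ (𝑕₁ ∶ A) (ℐ (𝑕₂ ∶ B) (𝒯 (𝑡 ∶ P) (𝒭 ?𝑡)))
```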

Say that a Box is solved when there are no 𝒯-binders remaining in the Box. At this point, the proving process ceases and a proof term and natural language proof may be generated.

3.3.4. Transforming a box

The aim of the game is to solve a box through the use of 'moves'. A move is a partial function on boxes: Move := Box → Option Box. In Section 3.3.1 we saw some example moves used to advance the box state and eventually complete it. A complete set of the moves implemented in the system will be given in Section 3.5. Some moves will of course be nonsense and not produce sound proofs. In Section 3.4 I will define what it means for a move to be sound and to produce a correct proof that can be checked by the ITP's kernel.
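A minimal Lean 3 sketch of the move type, together with two generic combinators for alternation and sequencing; the `Box` type is assumed (declared here as an opaque constant) and the combinator names are hypothetical:

```lean
meta constant Box : Type

-- A move is a partial function on boxes.
meta def Move : Type := Box → option Box

-- Try m₁; if it fails, fall back to m₂.
meta def Move.orelse (m₁ m₂ : Move) : Move :=
λ b, m₁ b <|> m₂ b

-- Apply m₁ and then m₂ to the result, failing if either fails.
meta def Move.and_then (m₁ m₂ : Move) : Move :=
λ b, m₁ b >>= m₂
```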

3.3.5. Relation to other development calculi

Let's pause now to highlight the similarities and differences of this approach with some other development calculi.

McBride's OLEG [McB00] is the most similar to the design presented here. OLEG 'holes' are functionally the same as metavariables. That is, they are specially tagged variables that will eventually be assigned with expressions. OLEG provides an additional constructor for expressions called 'hole-bindings'. Because OLEG is a ground-up implementation of a new theorem prover, hole-bindings can be added directly as a constructor for expressions, which is not available in Lean without reimplementing Lean expressions and all of the algorithms upon them. (It might be possible to use Lean's expression macro system to implement hole-bindings, but doing so would still require reimplementing a large number of context-centric algorithms such as unification [SB01].) These hole-bindings perform the same role as the 𝒯 constructor, in that they provide the context of variables that the hole/metavariable is allowed to depend on. But if the only purpose of a hole-binding is to give a context, then why not just explicitly name that context as is done in other theorem provers? The Box architecture given above is intended to give the best of both worlds: you still get a shared goal-tree structure without needing to explicitly bind metavariables within the expression tree; instead they are bound in a structure on top of it.

Lean and Coq's proof construction systems make use of the metavariable context approach outlined in Section 2.4. The metavariable context there performs the same role as the 𝒯 target boxes; however, this set of targets is flattened into a list structure rather than stored in a tree as in Box. This makes many aspects such as unification easier, but means that structural sharing (Section 3.1.3) is lost. In Section 3.4.4 I show that we do not have to forgo use of the algorithms implemented for a flat metavariable structure in order to use Boxes.

In Isabelle, proofs are constructed by manipulating the proof state directly through an LCF-style [Mil72] kernel of available functions (as can be seen in the implementation sourcecode). Schematic variables are used to create partially constructed terms.

Sterling and Harper [SH17] provide a category-theoretical theory of partially constructed proofs and use these principles in the implementation of RedPRL (https://redprl.readthedocs.io/en/latest/). They are motivated by the need for a principled way of performing refinement of proofs in a dependently-typed foundation. They develop a judgement-independent framework for describing development calculi within a category-theoretical setting.

Another hierarchical proof system is HiProof [ADL10]. HiProof makes use of a tree to write proofs. The nodes of the tree are invocations of inference rules and axioms, and an edge denotes the flow of evidence in the proof. These nodes may be grouped to provide varying levels of detail. These hierarchies are used to describe a completed proof, whereas a Box here describes a partially completed proof together with a specification of the hypotheses and targets that must be solved to construct the proof.

3.4. Creating valid proof terms from a Box

Note that, because we are using a trusted kernel, the result of producing an invalid proof with Box is a mere inconvenience, because the kernel will simply reject it. However, in order for the Box structure defined in Section 3.3.2 to be useful within a proof assistant such as Lean, as motivated by Section 3.1.4, it is important to make sure that a solved Box produces a valid proof for the underlying trusted kernel. To do this, I will define a typing judgement 𝑀;Γ ⊢ 𝑏 ∶ α and then present a method for extracting from 𝑏 a proof term 𝑟 with 𝑀;Γ ⊢ 𝑟 ∶ α, provided 𝑏 is solved.

3.4.1. Assignability for Boxes

In Section 2.4.2, I introduced the concept of an assignable datastructure for generalising variable-manipulation operations to datatypes other than expressions. We can equip a datatype containing expressions with an assignability structure assign (3.12). This is a variable-context-aware traversal over the expressions present in the datatype. For Box, this traversal amounts to visiting the expressions in each box, adding to the local context whenever the subtree lies below a binder. The definition of assign induces definitions of variable substitution and abstraction on Boxes.

(3.12).

Definition of assign for Box. See Section 2.4.2 for a description of assignability. The <*> operator is the applicative product for some applicative functor M (see Section 2.2.2). Note that target 𝒯 declarations are bound, so for the purposes of assignment they are treated as simple variable binders.

assign (𝑓 : Context → Expr → M Expr) (Γ : Context)
: Box → M Box
| ℐ 𝑥 𝑏 ↦ pure ℐ <*> assign 𝑓 Γ 𝑥 <*> assign 𝑓 [..Γ, 𝑥] 𝑏
| 𝒯 𝑚 𝑏 ↦ pure 𝒯 <*> assign 𝑓 Γ 𝑚 <*> assign 𝑓 [..Γ, 𝑚] 𝑏
| 𝒭 𝑟 ↦ pure 𝒭 <*> assign 𝑓 Γ 𝑟
| 𝒜 𝑏₁ 𝑥 𝑏₂ ↦ pure 𝒜 <*> assign 𝑓 Γ 𝑏₁ <*> assign 𝑓 Γ 𝑥 <*> assign 𝑓 [..Γ, 𝑥] 𝑏₂
| 𝒪 𝑏₁ 𝑏₂ ↦ pure 𝒪 <*> assign 𝑓 Γ 𝑏₁ <*> assign 𝑓 Γ 𝑏₂
| 𝒱 𝑥 𝑡 𝑏 ↦ pure 𝒱 <*> assign 𝑓 Γ 𝑥 <*> assign 𝑓 Γ 𝑡 <*> assign 𝑓 [..Γ, 𝑥 ≔ 𝑡] 𝑏

3.4.2. Typing judgements for Box

In Section 2.4, I defined contexts Γ and metavariable contexts 𝑀. As covered in Carneiro's thesis [Car19], Lean's type theory affords a set of inference rules on typing judgements Γ ⊢ 𝑡 ∶ α, stating that the expression 𝑡 has the type α in the context Γ. However, these inference rules are only defined for expressions 𝑡 : Expr that do not contain metavariables. In Section A.1, I extend these judgements (A.3), (A.4) to expressions containing metavariables, written 𝑀;Γ ⊢ 𝑡 ∶ α.

In a similar way, we can repeat this for Box: given contexts 𝑀 and Γ, we can define a typing judgement 𝑀;Γ ⊢ 𝑏 ∶ β, where 𝑏 : Box and β is a type. The inference rules for this are given in (3.13). They have been chosen to mirror the typings given in Section 2.4.3.

(3.13).

Typing judgement rules for Box. Compare with (A.3) and (A.4) in Section A.1.

𝑀;[..Γ, (𝑥 ∶ α)] ⊢ 𝑏 ∶ β

ℐ-typing
𝑀;Γ ⊢ (ℐ (𝑥 ∶ α) 𝑏) ∶ (Π (𝑥 ∶ α), β)

𝑀;Γ ⊢ 𝑡 ∶ α

𝒭-typing
𝑀;Γ ⊢ (𝒭 𝑡) ∶ α

[..𝑀, ⟨𝑥, α, Γ⟩];Γ ⊢ 𝑏 ∶ β

𝒯-typing
𝑀;Γ ⊢ (𝒯 (?𝑥 ∶ α) 𝑏) ∶ β

𝑀;Γ ⊢ 𝑏₁ ∶ α
𝑀;[..Γ, (𝑥 ∶ α)] ⊢ 𝑏₂ ∶ β

𝒜-typing
𝑀;Γ ⊢ (𝒜 𝑏₁ (𝑥 ∶ α) 𝑏₂) ∶ β

𝑀;Γ ⊢ 𝑏₁ ∶ α
𝑀;Γ ⊢ 𝑏₂ ∶ α

𝒪-typing
𝑀;Γ ⊢ (𝒪 𝑏₁ 𝑏₂) ∶ α

𝑀;Γ ⊢ 𝑣 ∶ α
𝑀;[..Γ, (𝑥 ∶ α)] ⊢ 𝑏 ∶ β

𝒱-typing
𝑀;Γ ⊢ (𝒱 (𝑥 ∶ α ≔ 𝑣) 𝑏) ∶ β

These typing rules have been designed to match the typing rules (A.3) of the underlying proof terms that a Box will produce when solved, as I will show next.

3.4.3. Results of Boxes

The structure of Box is designed to represent a partially complete expression without the use of unbound metavariables. Boxes can be converted to expressions containing unbound metavariables using results : Box → Set Expr, as defined in (3.14).

(3.14).

Definition of results. 𝑟[𝑥] denotes a delayed abstraction (Section A.4) needed in the case that 𝑟 contains metavariables.

results
: Box → Set Expr
| (ℒ (𝑥 ∶ α) 𝑏) ↦ {(Expr.λ (𝑥 ∶ α) 𝑟[𝑥]) for 𝑟 in results 𝑏}
| (𝒯 (𝑥 ∶ α) 𝑏) ↦ results 𝑏
| (𝒭 𝑡) ↦ {𝑡}
| (𝒜 𝑏₁ (𝑥 ∶ α) 𝑏₂) ↦
    {𝑠 for 𝑠 in results (⦃𝑥 ↦ 𝑟⦄ 𝑏₂)
       for 𝑟 in results 𝑏₁}
| (𝒪 𝑏₁ 𝑏₂) ↦ results 𝑏₁ ∪ results 𝑏₂
| (𝒱 (𝑥 ∶ α ≔ 𝑣) 𝑏) ↦ {(Expr.let 𝑥 𝑣 𝑟) for 𝑟 in results 𝑏}

Say that a 𝑏 : Box is solved when there are no remaining 𝒯 entries in it. Then in this case, the set of results for 𝑏 will not contain any metavariables and so can be checked by the kernel. In the case that 𝑏 is unsolved, the results of 𝑏 will contain unbound variables for each 𝒯-binder that need to be assigned. The claim to make here is that the typing system I've placed on Boxes in (3.13) is compatible with the results of these expressions as expressed by (3.15).
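
To make the recursion in (3.14) concrete, the following is a minimal Python sketch of Box and results. The encoding here is mine and purely illustrative: expressions are plain strings, substitution is string replacement, and delayed abstractions are omitted; the actual implementation operates on Lean Expr terms.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class L:
    """ℒ: lambda-binder box."""
    var: str
    ty: str
    body: "Box"

@dataclass(frozen=True)
class T:
    """𝒯: target (metavariable) box."""
    mvar: str
    ty: str
    body: "Box"

@dataclass(frozen=True)
class R:
    """𝒭: result box holding a candidate expression."""
    expr: str

@dataclass(frozen=True)
class A:
    """𝒜: conjunctive box; `var` is bound to a result of b1 inside b2."""
    b1: "Box"
    var: str
    ty: str
    b2: "Box"

@dataclass(frozen=True)
class O:
    """𝒪: disjunctive box offering two alternatives."""
    b1: "Box"
    b2: "Box"

Box = Union[L, T, R, A, O]

def subst(b: Box, x: str, e: str) -> Box:
    # String-level stand-in for the substitution ⦃x ↦ e⦄ applied to a box.
    if isinstance(b, R):
        return R(b.expr.replace(x, e))
    if isinstance(b, L):
        return L(b.var, b.ty.replace(x, e), subst(b.body, x, e))
    if isinstance(b, T):
        return T(b.mvar, b.ty.replace(x, e), subst(b.body, x, e))
    if isinstance(b, A):
        return A(subst(b.b1, x, e), b.var, b.ty.replace(x, e), subst(b.b2, x, e))
    return O(subst(b.b1, x, e), subst(b.b2, x, e))

def results(b: Box) -> set:
    # All candidate proof terms a Box can produce, following (3.14).
    if isinstance(b, R):
        return {b.expr}
    if isinstance(b, L):
        return {f"(λ ({b.var} : {b.ty}), {r})" for r in results(b.body)}
    if isinstance(b, T):
        return results(b.body)  # an unsolved target leaves its ?metavariable in the result
    if isinstance(b, A):
        return {s for r in results(b.b1) for s in results(subst(b.b2, b.var, r))}
    return results(b.b1) | results(b.b2)

# A solved box proving P → P, namely ℒ (h : P), 𝒭 h:
print(results(L("h", "P", R("h"))))  # {'(λ (h : P), h)'}
```

Note how the 𝒜 case produces the cross product of results and the 𝒪 case the union, mirroring (3.14).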

(3.15).

Statement of the compatibility lemma. That is, take a 𝑏 : Box and α : Expr; then if 𝑏 ∶ α in the context 𝑀;Γ and 𝑟 : Expr is a result of 𝑏 (3.14), then 𝑟 ∶ α in the context 𝑀;Γ with additional metavariables added for each of the targets in 𝑏.

𝑀;Γ ⊢ 𝑏 ∶ α
𝑟 ∈ results 𝑏

[..𝑀, ..targets 𝑏];Γ ⊢ 𝑟 ∶ α

Here in (3.15), targets 𝑏 is the set of metavariable declarations formed by accumulating all of the 𝒯-binders in 𝑏. (3.15) states that, given a box 𝑏 and an expression 𝑟 that is a result of 𝑏, if 𝑏 is a valid box with type α then 𝑟 will type to α too in the metavariable context including all of the targets in 𝑏.

We need to derive (3.15) because it ensures that our Box will produce well-typed expressions when solved. Using (3.15), we can find moves m : Box → Option Box (partial functions from Box to Box) such that 𝑀;Γ ⊢ 𝑏 ∶ α implies 𝑀;Γ ⊢ m 𝑏 ∶ α whenever 𝑏 ∈ dom m. Hence a chain of such moves will produce a result that satisfies the initial goal.

3.4.3.1. Proof of (3.15)

Without loss of generality, we need only prove (3.15) for a 𝑏 : Box with no 𝒪 boxes and a single result [𝑟] = results 𝑏. To see why, note that any box containing an 𝒪 can be split as in (3.16) until each Box has one result. Then we may prove (3.15) for each of these in turn.

(3.16).
results (𝑝 [𝒪 𝑏₁ 𝑏₂]) = results (𝑝 [𝑏₁]) ∪ results (𝑝 [𝑏₂])

where 𝑝 [–] denotes a Box with a single hole that is filled with the given sub-Box.

Write result 𝑏 to mean this single result 𝑟. Performing induction on the typing judgements for boxes, the most difficult case is 𝒜-typing, where we have to show (3.17).

(3.17).

The induction step that must be proven for the 𝒜-box case of (3.15)

𝑀;Γ ⊢ 𝑏₁ ∶ α
𝑀;[..Γ, (𝑥 ∶ α)] ⊢ 𝑏₂ ∶ β
𝑀';Γ ⊢ result 𝑏₁ ∶ α
𝑀';[..Γ, (𝑥 ∶ α)] ⊢ result 𝑏₂ ∶ β

𝑀';Γ ⊢ result (𝒜 𝑏₁ (𝑥 ∶ α) 𝑏₂) ∶ β

Where 𝑀' := [..𝑀, ..targets (𝒜 𝑏₁ (𝑥 ∶ α) 𝑏₂)]. To derive this it suffices to show that result is a 'substitution homomorphism':

(3.18).

result is a substitution homomorphism

𝑀;Γ ⊢ σ ok

𝑀;Γ ⊢ σ (result 𝑏) ≡ result (σ 𝑏)

where σ is a substitution in context Γ (see Section 2.4.1; a substitution is a partial map from variables to expressions) and ≡ is the definitional equality judgement under Γ. Because then we have

(3.19).

Here, ⦃𝑥 ↦ 𝑒⦄ 𝑏 is used to denote substitution applied to 𝑏. That is, replace each occurrence of 𝑥 in 𝑏 with 𝑒.

𝑀';Γ ⊢
  result (𝒜 𝑏₁ (𝑥 ∶ α) 𝑏₂)
  ≡ result (⦃𝑥 ↦ result 𝑏₁⦄ 𝑏₂)
  ≡ ⦃𝑥 ↦ result 𝑏₁⦄ (result 𝑏₂)
  ≡ (λ (𝑥 ∶ α), result 𝑏₂) (result 𝑏₁)

We can see that the substitution homomorphism property of result holds by inspection on the equations of result, observing that each LHS expression behaves correctly. Here is the case for ℒ:

(3.20).

result and σ obey the 'substitution homomorphism' property in the case of ℒ. Here λ is used to denote the internal lambda constructor for expressions. Note here we are assuming dom σ ⊆ Γ, so that 𝑥 ∉ dom σ.

𝑀';Γ ⊢
  result (σ (ℒ (𝑥 ∶ α) 𝑏))
  ≡ result (ℒ (𝑥 ∶ (σ α)) (σ 𝑏))
  ≡ (λ (𝑥 ∶ (σ α)), (result (σ 𝑏))[𝑥])
  ≡ (λ (𝑥 ∶ (σ α)), (σ (result 𝑏))[𝑥]) -- ∵ induction hypothesis
  ≡ σ (λ (𝑥 ∶ α), (result 𝑏)[𝑥])
  ≡ σ (result (ℒ (𝑥 ∶ α) 𝑏))

This completes the proof of type compatibility (3.15). By using compatibility, we can determine whether a given move m : Box → Option Box will be sound. Define a move m to be sound when for all 𝑏 ∈ dom m we have some α such that 𝑀;Γ ⊢ (m 𝑏) ∶ α whenever 𝑀;Γ ⊢ 𝑏 ∶ α.

Hence, to prove a starting proposition P (or, in general, to inhabit a type), start with an initial box 𝑏₀ := 𝒯 (?t₀ ∶ P) (𝒭 ?t₀). Then if we only map 𝑏₀ with sound moves to produce a solved box 𝑏, each member of results 𝑏 will have type P and hence will be accepted by Lean's kernel.
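
This chain-of-sound-moves discipline can be sketched as a toy driver loop in Python. The dictionary encoding of boxes, the by_assumption move, and the run function here are all hypothetical simplifications for illustration; the real system operates on the Lean Box datastructure with full typing.

```python
from typing import Callable, Dict, List, Optional

# Toy encoding of a box: hypotheses, outstanding targets, and a result
# expression containing ?metavariables for the unsolved targets.
Box = Dict[str, object]
Move = Callable[[Box], Optional[Box]]

def by_assumption(box: Box) -> Optional[Box]:
    # A sample sound move: close a target with a hypothesis of the same type.
    for mvar, ty in box["targets"]:
        for h, hty in box["hyps"].items():
            if hty == ty:
                return {"hyps": box["hyps"],
                        "targets": [t for t in box["targets"] if t[0] != mvar],
                        "result": box["result"].replace(mvar, h)}
    return None

def run(box: Box, moves: List[Move]) -> Box:
    # Apply moves until none applies or the box is solved (no targets left).
    # Soundness of each move guarantees the final result keeps the initial type.
    progress = True
    while progress and box["targets"]:
        progress = False
        for m in moves:
            nxt = m(box)
            if nxt is not None:
                box, progress = nxt, True
                break
    return box

# 𝑏₀ = 𝒯 (?t₀ : P) (𝒭 ?t₀) with one hypothesis h : P.
b0 = {"hyps": {"h": "P"}, "targets": [("?t0", "P")], "result": "?t0"}
print(run(b0, [by_assumption]))  # {'hyps': {'h': 'P'}, 'targets': [], 'result': 'h'}
```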

Given a move m that is sound on 𝑏, we can construct a sound move on ℒ (𝑥 ∶ α) 𝑏 too that acts on the nested box 𝑏.

3.4.4. Escape-hatch to tactics

As discussed in Section 2.4.4, many provers, including Lean 3, come with a tactic combinator language to construct proofs through mutating an object called the TacticState, comprising a metavariable context and a list of metavariables called the goals. In Section 3.1 I highlighted some of the issues of this approach, but there are many built-in and community-made tactics which can still find use within a HumanProof proof. For this reason, it is important for HumanProof to provide an 'escape hatch' allowing these tactics to be used seamlessly within the context of a HumanProof proof. I have achieved this compatibility between Boxes and tactics by defining a zipper [Hue97] structure on Boxes (Section A.2) and then a set of shim operations for soundly converting an underlying TacticState to and from a Box object. The details of this mechanism can be found in Section A.2. It is used to implement some of the moves presented next in Section 3.5, since in some cases a move is the same as its tactic-based equivalent.

3.5. Moves for Box.

Using the framework presented above we can start defining sound moves on Boxes and use Box to actualise the kinds of reasoning discussed in Section 3.1. Many of the moves here will be similar to inference rules that one would find in a usual system, and so I will not cover those in great detail. I will also skip many of the soundness proofs, because in Appendix A I instead provide an 'escape hatch' for creating sound moves from tactics in the underlying metavariable-oriented development calculus. Some of the moves below are implemented using this escape hatch.

3.5.1. Simplifying moves

We have the following moves for reducing Boxes; these should be considered as tidying moves.

(3.21).

Reduction moves for Box. These are moves which should always be applied if they can, and act as a set of reductions on a box. Note that these are not confluent; for example, 𝒪-reduce₁ and 𝒪-reduce₂ on 𝒪 (𝒭 𝑒₁) (𝒭 𝑒₂) produce different terminals.

𝒪-reduce₁ := (𝒪 (𝒭 𝑒) 𝑏₂) ↝ (𝒭 𝑒)

𝒪-reduce₂ := (𝒪 𝑏₁ (𝒭 𝑒)) ↝ (𝒭 𝑒)

𝒜-reduce := (𝒜 (𝒭 𝑒) (𝑡₀ ∶ α) 𝑏) ↝ ⦃𝑡₀ ↦ 𝑒⦄ 𝑏

𝒯-reduce := (𝒯 (?𝑡₀ ∶ α) (𝒭 𝑒)) ↝ (𝒭 𝑒)   if ?𝑡₀ ∉ 𝑒
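
To illustrate how such tidying moves can be treated uniformly as partial functions on boxes, here is a hypothetical Python sketch of three of the reductions in (3.21). The encoding is mine: boxes are dataclasses over string expressions, and a move returns None when it does not apply.

```python
from dataclasses import dataclass
from typing import Optional, Union

@dataclass(frozen=True)
class R:
    """𝒭 result box."""
    expr: str

@dataclass(frozen=True)
class T:
    """𝒯 target box."""
    mvar: str
    ty: str
    body: "Box"

@dataclass(frozen=True)
class A:
    """𝒜 conjunctive box binding `var` to the result of b1 inside b2."""
    b1: "Box"
    var: str
    ty: str
    b2: "Box"

@dataclass(frozen=True)
class O:
    """𝒪 disjunctive box."""
    b1: "Box"
    b2: "Box"

Box = Union[R, T, A, O]

def subst(b: Box, x: str, e: str) -> Box:
    # String-level stand-in for ⦃x ↦ e⦄ b.
    if isinstance(b, R):
        return R(b.expr.replace(x, e))
    if isinstance(b, T):
        return T(b.mvar, b.ty.replace(x, e), subst(b.body, x, e))
    if isinstance(b, A):
        return A(subst(b.b1, x, e), b.var, b.ty.replace(x, e), subst(b.b2, x, e))
    return O(subst(b.b1, x, e), subst(b.b2, x, e))

def o_reduce1(b: Box) -> Optional[Box]:
    # 𝒪-reduce₁: if the first alternative is already a result, keep it.
    return b.b1 if isinstance(b, O) and isinstance(b.b1, R) else None

def a_reduce(b: Box) -> Optional[Box]:
    # 𝒜-reduce: if b1 is solved, substitute its result for the bound variable in b2.
    if isinstance(b, A) and isinstance(b.b1, R):
        return subst(b.b2, b.var, b.b1.expr)
    return None

def t_reduce(b: Box) -> Optional[Box]:
    # 𝒯-reduce: drop a target whose result no longer mentions it.
    if isinstance(b, T) and isinstance(b.body, R) and b.mvar not in b.body.expr:
        return b.body
    return None

# 𝒜 (𝒭 e) (t₀ : P) (𝒭 (f t₀))  ↝  𝒭 (f e)
print(a_reduce(A(R("e"), "t0", "P", R("f t0"))))  # R(expr='f e')
```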

3.5.2. Deleting moves

These are moves that cause a Box to simplify but which are not always 'safe' to do, in the sense that they may lead to a Box which is impossible to solve.

(3.22).

Deletion moves for Box.

𝒪-revert₁ := (𝒪 𝑏₁ 𝑏₂) ↝ 𝑏₂

𝒪-revert₂ := (𝒪 𝑏₁ 𝑏₂) ↝ 𝑏₁

𝒱-delete := (𝒱 (𝑥 ∶ α ≔ 𝑒), 𝑏) ↝ ⦃𝑥 ↦ 𝑒⦄ 𝑏

3.5.3. Lambda introduction

In normal tactics, an intro tactic is used to introduce Π-binders. (Π-binders Π (𝑥 : α), β are the dependent generalisation of the function type α → β, where the return type β may depend on the input value 𝑥 : α.) That is, if the goal state is ⊢ Π (𝑥 : α), β[𝑥], the intro tactic will produce a new state (𝑥 : α) ⊢ β[𝑥]. To perform this, it assigns the target metavariable ?t₀ ∶ Π (𝑥 : α), β[𝑥] with the expression λ (𝑥 : α), ?t₁, where ?t₁ ∶ β[𝑥] is the new target metavariable whose context includes the additional local variable 𝑥 : α.

The intro move on Box is analogous, although some additional steps are required to ensure that contexts are preserved correctly. The simplified case simple_intro (3.23) performs the same steps as the tactic version of intro.

(3.23).

A simple variable introduction move. Note that the new target ?t₁ is not wrapped in a lambda abstraction, because it is abstracted earlier by the ℒ box.

simple_intro :=

  ?t₀ ∶ Π (𝑥 : α), β
  ?t₀

↝

  𝑥 : α
  ?t₁ ∶ β
  ?t₁

The full version (3.24) is used in the case that the 𝒯-box is not immediately followed by an 𝒭-box. In this case, a conjunctive 𝒜-box must be created in order to have a separate context for the new (𝑥 : α) variable.

(3.24).

The full version of the lambda introduction move. The box on the rhs of ↝ is an 𝒜-box: 𝒜 (ℒ (𝑥 : α), 𝒯 (?t₁ ∶ β), 𝒭 ?t₁) (t₀ ∶ Π (𝑥 : α), β) 𝑏.

intro :=

  ?t₀ ∶ Π (𝑥 : α), β
  ...𝑏

↝

  t₀ :=
    𝑥 : α
    ?t₁ ∶ β
    ?t₁
  ...𝑏

The fact that intro is sound follows mainly from the design of the definitions of ℒ and 𝒜: define 𝑏' to be ℒ (𝑥 : α), 𝒯 (?t₁ ∶ β), 𝒭 ?t₁, represented graphically in (3.25). The typing judgement (3.25) follows from the typing rules (3.13).

(3.25).

The judgement that 𝑏' has type Π (𝑥 : α), β. β may possibly depend on 𝑥.

⊢ (ℒ (𝑥 : α), 𝒯 (?t₁ ∶ β), 𝒭 ?t₁) ∶ Π (𝑥 : α), β

By the definition of a sound move we may assume ⊢ (𝒯 ?t₀, 𝑏) ∶ γ for some type γ. From the 𝒯 typing rule (3.13) we then have [?t₀];∅ ⊢ 𝑏 ∶ γ. Then it follows from 𝒜 typing (3.13) that ⊢ 𝒜 𝑏' (t₀ ∶ Π (𝑥 : α), β) 𝑏 ∶ γ where 𝑏' := ℒ (𝑥 : α), 𝒯 (?t₁ ∶ β), 𝒭 ?t₁.

3.5.4. Split and cases moves

Here I present some moves for performing introduction and elimination of the ∧ type. The Box version of split performs the same operation as split in Lean: introducing a conjunction. A target ?t₀ ∶ P ∧ Q is replaced with a pair of new targets ⟨?t₁, ?t₂⟩. These can be readily generalised to other inductive datatypes with one constructor. (One caveat is that the use of ∃ requires a non-constructive axiom of choice with this method. This is addressed in Section 3.5.8.) In the implementation, these are implemented using the tactic escape-hatch described in Appendix A.

(3.26).

Move for introducing conjunctions.

split :=

  ?t₀ ∶ P ∧ Q
  ...𝑏

↝

  ?t₁ ∶ P
  ?t₂ ∶ Q
  ...(⦃?t₀ ↦ ⟨?t₁, ?t₂⟩⦄ 𝑏)

Similarly we can eliminate a conjunction with cases.

(3.27).

Move for eliminating conjunctions. fst : P ∧ Q → P and snd : P ∧ Q → Q are the ∧-projections. In the implementation, h₀ is hidden from the visualisation to give the impression that the hypothesis h₀ has been 'split' into h₁ and h₂.

cases :=

  h₀ : P ∧ Q
  ...𝑏

↝

  h₀ : P ∧ Q
  h₁ : P := fst h₀
  h₂ : Q := snd h₀
  ...𝑏

3.5.5. Induction moves

∧-elimination (3.27) from the previous section can be seen as a special case of induction on datatypes. Most forms of dependent type theory use inductive datatypes (see Section 2.2.3) to represent data and propositions, and use induction to eliminate them. To implement induction in CIC (the Calculus of Inductive Constructions, the foundation (Section 2.1.3) used by Lean 3 and Coq; inductive datastructures for the Calculus of Constructions [CH88] were first introduced by Pfenning and Paulin-Mohring [PP89], and [Car19 §2.6] gives the axiomatisation of inductive types within Lean 3's type system), each inductive datatype comes equipped with a special constant called the recursor. This paradigm broadens the use of the words 'recursion' and 'induction' to include datastructures that are not recursive.

For example, we can view the conjunction A ∧ B : Prop as an inductive datatype with one constructor mk : A → B → A ∧ B. Similarly, a disjunction A ∨ B has two constructors inl : A → A ∨ B and inr : B → A ∨ B. Interpreting → as implication, we recover the basic introduction axioms for conjunction and disjunction. The eliminators for ∧ and ∨ are implemented using the recursors given in (3.28).

(3.28).

Recursors for conjunction and disjunction.

∧-rec : ∀ (A B C : Prop), (A → B → C) → (A ∧ B) → C
∨-rec : ∀ (A B C : Prop), (A → C) → (B → C) → (A ∨ B) → C

Performing an induction step in a CIC theorem prover such as Lean amounts to the application of the relevant recursor. Case analysis on a disjunctive hypothesis makes for a good example of recursion: the recursor ∨-rec : (P → C) → (Q → C) → (P ∨ Q) → C is used. Given a box ℒ (h₀ : P ∨ Q), 𝑏 where (h₀ : P ∨ Q) ⊢ 𝑏 ∶ α, the ∨-cases move sends this to the box defined in (3.29). This is visualised in (3.30).

(3.29).

Explicit datastructure showing the resulting Box after performing ∨-cases on ℒ (h₀ : P ∨ Q), 𝑏.

𝒜 (ℒ (h₁ ∶ P), 𝑏₁) (𝑐₁ ∶ P → α) (
  𝒜 (ℒ (h₂ ∶ Q), 𝑏₂) (𝑐₂ ∶ Q → α) (
    𝒭 (∨-rec 𝑐₁ 𝑐₂ h₀)
  )
)
where 𝑏₁ := ⦃h₀ ↦ inl h₁⦄ 𝑏
      𝑏₂ := ⦃h₀ ↦ inr h₂⦄ 𝑏
(3.30).

Special case of recursion for eliminating ∨ statements. The right-hand side of ↝ is simplified for the user, but is represented as a nested set of 𝒜 boxes as explicitly written in (3.29). 𝑏₁ and 𝑏₂ are defined in (3.29).

cases :=

  h₀ : P ∨ Q
  ...𝑏

↝

  h₁ : P
  ...𝑏₁

  h₂ : Q
  ...𝑏₂

Note that the 𝑏 : Box in (3.30) may contain multiple targets. When the cases move is applied to ℒ (h₀ ∶ P ∨ Q), 𝑏, the resulting Box on the rhs of (3.30) contains two copies of these targets. This implements the design requirement of structural sharing of targets as motivated in Section 3.1.3. Structural sharing is a significant advantage over the goal-state style approach to tactics, where the equivalent cases tactic would have to be applied separately to each goal if there were multiple targets.

This structurally-shared induction step also works on recursive datastructures such as lists and natural numbers, as shown in (3.32).

(3.31).

Recursor for natural numbers. ℕ-rec can be seen to have the same signature as mathematical induction on the natural numbers.

ℕ-rec :
    (𝒞 : ℕ → Type)                   -- motive
  → (𝒞 0)                            -- zero case
  → ((𝑖 : ℕ) → 𝒞 𝑖 → 𝒞 (𝑖 + 1))     -- successor case
  → (𝑖 : ℕ) → 𝒞 𝑖
(3.32).

Induction move on natural numbers, implemented using the 'escape hatch' detailed in Appendix A. Here, α is the result type of 𝑏 (Section 3.4.2). That is, (𝑛 : ℕ) ⊢ 𝑏 ∶ α.

induction :=

  𝑛 : ℕ
  ...𝑏

↝

  ...⦃𝑛 ↦ 0⦄ 𝑏

  𝑛 : ℕ
  𝑕 : α
  ...⦃𝑛 ↦ 𝑛 + 1⦄ 𝑏
(3.33).

Detail on the rhs of (3.32). The signature for ℕ-rec is given in (3.31).

𝒜 (⦃𝑛 ↦ 0⦄ 𝑏) (𝑐₁ ∶ ⦃𝑛 ↦ 0⦄ α) (
  𝒜 (ℒ (𝑛 ∶ ℕ), ℒ (𝑕 ∶ α), ⦃𝑛 ↦ 𝑛 + 1⦄ 𝑏) (𝑐₂ ∶ ⦃𝑛 ↦ 𝑛 + 1⦄ α) (
    𝒭 (ℕ-rec (λ 𝑛, α) 𝑐₁ 𝑐₂ 𝑛)
  )
)

In general, finding the appropriate motive 𝒞 for an induction step amounts to a higher-order unification problem, which was shown to be undecidable [Dow01 §3]. However, in many practical cases 𝒞 can be found, and higher-order provers come equipped with heuristics for these cases, an early example being Huet's semidecidable algorithm. Rather than reimplementing these heuristics, I implement induction moves on Box by using the 'escape hatch' feature (Section 3.4.4).

3.5.6. Introducing 𝒪 boxes

The purpose of 𝒪 boxes is to enable backtracking and branching on Boxes while retaining structural sharing. The G&G prover [GG17] makes use of a similar approach. For example, suppose that we had a target 𝑥 ∈ A ∪ B for some sets A, B. We might have some lemmas of the form P → 𝑥 ∈ A and Q → 𝑥 ∈ B, but we are not yet sure which one to use. Currently in Lean, if you don't yet know which injection to use, you have to guess and manually backtrack. However, there may be some clues on which lemma is correct that only become apparent after applying an injection. Automation usually takes this into account either through symbol counting or by using a representation of goals, such as sequent calculus, that avoids this problem. The problem with this is that you can't use these representations without departing from the human-like structure of the goal state.

The 𝒪 box allows us to explore various counterfactuals without having to perform any user-level backtracking (that is, having to rewrite proofs). The primitive move that creates new 𝒪-boxes is shown in (3.34); this is used to make more 'human-like' moves such as ∨-intro (3.35).

(3.34).

Move for introducing an 𝒪-box by duplication.

𝒪-intro := 𝑏 ↝ (𝒪 𝑏 𝑏)
(3.35).

Move for introducing an 𝒪-box to split a disjunctive target.

∨-intro :=

  ?𝑡 : P ∨ Q
  ...𝑏

↝

  ?𝑡 : P
  ...𝑏

  ?𝑡 : Q
  ...𝑏

3.5.7. Unification under Boxes

Unification is the process of taking a pair of expressions 𝑙 𝑟 : Expr within a joint context 𝑀;Γ and finding a valid set of assignments of metavariables σ in 𝑀 such that (𝑀 + σ);Γ ⊢ 𝑙 ≡ 𝑟. Rather than develop a whole calculus of sound unification for the Box system, I can use the 'escape hatch' tactic compatibility layer developed in Appendix A to transform a sub-Box to a metavariable context and then use the theory of unification of the underlying development calculus of the theorem prover (in this case Lean). This is the correct approach because unifiers and matchers for theorem provers are usually very well developed in terms of both features and optimisation, so it wouldn't make sense to make a unifier from scratch when the host proof assistant has a perfectly good one already. This approach also has the benefit of dramatically reducing the size of this chapter.

3.5.8. Apply

In textbook proofs of mathematics, an application of a lemma will often also act under binders. For example, let's look at the application of the continuity of fs 𝑛 from earlier.

(3.36).

An example lemma h₁ to apply. h₁ is a proof that fs 𝑛 is continuous.

h₁ :
∀ (𝑥 : X) (ε : ℝ) (h₀ : ε > 0),
∃ (δ : ℝ) (h₁ : δ > 0),
∀ (𝑦 : X) (h₂ : dist 𝑥 𝑦 < δ), dist (f 𝑥) (f 𝑦) < ε

In the example, the application of h₁ with 𝑁, ε, h₃, then eliminating an existential quantifier δ and then applying more arguments y, all happens in one step and without much exposition in terms of what δ depends on. A similar phenomenon occurs in backwards reasoning. If the target is dist (f 𝑥) (f 𝑦) < ε, in proof texts the continuity of f will be applied in one step to replace this goal with dist 𝑥 𝑦 < δ, where δ is understood to be an 'output' of applying the continuity of f.

Contrast this with the logically equivalent Lean tactic script fragment (3.37):

(3.37).

A Lean tactic-mode proof fragment that is usually expressed in one step by a human but which requires two steps in Lean. The show lines can be omitted and are provided for clarity to show the goal state before and after the obtain and apply steps. The obtain ⟨_, _, _⟩ : 𝑃 tactic creates a new goal t : 𝑃 and then, after this goal is solved, performs elimination on t to produce the bracketed hypotheses.

...
show dist (f x) (f y) < ε,
obtain ⟨δ, δ_pos, h₁⟩ : ∃ δ, δ > 0 ∧ ∀ y, dist x y < δ → dist (f x) (f y) < ε,
apply ‹continuous f›,
apply h,
show dist x y < δ,
...

In order to reproduce this human-like proof step, we need to develop a theory for considering these 'complex applications'. A further property we desire is that results of the complex application must be stored such that we can recover a natural language write-up to explain it later (e.g., creating "Since f is continuous at x, there is some δ...").

The apply subsystem works by performing a recursive descent on the type of the assumption being applied. For example, applying the lemma given in (3.36) to a target 𝑡 : P will attempt to unify P with dist (f ?𝑥) (f ?𝑦) < ?ε with new metavariables ?𝑥 ?𝑦 : X and ?ε : ℝ. If the match is successful, it will create a new target for each variable in a Π-binder (note that ∀ is sugar for Π) above the matching expression, and a new 𝒱-binder for each introduced ∃-variable and each conjunct. These newly introduced nested boxes appear in the same order as they appear in the applied lemma.
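
The recursive descent can be sketched in Python as follows. The encoding of lemma types as nested tuples and all names here are hypothetical simplifications; the real apply move unifies against Lean expressions.

```python
# A lemma type is modelled as nested nodes: ("pi", x, ty, body) for a Π/∀
# binder, ("ex", x, ty, body) for an ∃ binder, and a string leaf for the
# conclusion. Walking the type collects a new target for each Π-binder and a
# new 𝒱-binder for each ∃-variable; unification against the goal is elided.

def apply_descent(lemma_ty):
    targets, values = [], []
    node = lemma_ty
    while not isinstance(node, str):
        kind, x, ty, body = node
        if kind == "pi":
            targets.append((f"?{x}", ty))   # argument to supply: becomes a target
        elif kind == "ex":
            values.append((x, ty))          # witness produced: becomes a 𝒱-binder
        node = body
    return targets, values, node            # node is the conclusion to match with

# Shape of the continuity lemma (3.36): ∀ 𝑥 ε h₀, ∃ δ h₁, ∀ 𝑦 h₂, ...
continuity = ("pi", "x", "X",
              ("pi", "eps", "ℝ",
               ("pi", "h0", "eps > 0",
                ("ex", "delta", "ℝ",
                 ("ex", "h1", "delta > 0",
                  "∀ y, dist x y < delta → dist (f x) (f y) < eps")))))

targets, values, concl = apply_descent(continuity)
print(targets)  # [('?x', 'X'), ('?eps', 'ℝ'), ('?h0', 'eps > 0')]
print(values)   # [('delta', 'ℝ'), ('h1', 'delta > 0')]
```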

An example of applying (3.36) to the target dist (f x) (f y) < ε can be seen in (3.38).

(3.38).

An example of applying (3.36) to t. It produces a set of nested targets in accordance with the structure of the binders in (3.36). Result Boxes are omitted.

apply ‹continuous f› :

  𝑥 𝑦 : X
  ε : ℝ
  ?t₀ ∶ dist (f 𝑥) (f 𝑦) < ε

↝

  𝑥 𝑦 : X
  ε : ℝ
  ?t₁ ∶ ε > 0
  δ : ℝ := _
  h₂ : δ > 0 := _
  ?t₂ ∶ dist 𝑥 𝑦 < δ

One complication with this approach to apply is that it performs many logical inference steps in one go when applying a lemma. There is a technical caveat with non-projectable structures such as ∃ (δ : ℝ), P. By default, Lean is a non-classical theorem prover, which here amounts to saying that the axiom of choice is not assumed automatically. Without the axiom of choice, it is not generally possible to construct a function ε : (∃ (𝑥 : α), P[𝑥]) → α such that P[ε 𝑕] is true for all 𝑕 : ∃ (𝑥 : α), P[𝑥]. To produce the 𝒱-binders for δ and h₂ in (3.38), the implementation therefore makes use of the non-constructive axiom of choice mentioned in Section 3.5.4.

This apply system can be used for both forwards and backwards reasoning moves. The above deals with the backwards case; in the forwards case the task is reversed, with a variable bound by a Π-binder now being the subject to match against the forwards-applied hypothesis.

3.6. Natural language generation of proofs

In this section I will detail how the above box architecture is used to produce natural language writeups as the proof progresses. The general field is known as Natural Language Generation (NLG). You can find a background survey of NLG both broadly and within the context of generating proofs in Section 2.8.

In this section I will lean on the work of Ganesalingam, who in his thesis [Gan10] has specified a working theory of the linguistics of natural language mathematics. As well as generating a formally verifiable result of a proof, I also extend G&G by providing some new mechanisms for converting Lean predicates and typeclasses into English language sentences. That is, in the implementation of the Robot theorem prover, many natural language constructs such as "X is a metric space" were hard-coded into the system. In this work I provide a general framework for attaching verbalisations of these kinds of statements to typeclasses and predicates within Lean. I also extend the work by making the resulting write-up interactive: a partial proof write-up is emitted if the proof-state is not yet solved, and the natural language write-up can be inspected through the widgets system. In contrast, G&G's output was a static file.

The goal of this section is to demonstrate that the Box architecture above is representative of human-like reasoning by constructing natural language writeups of the proofs created using Boxes. As such the NLG used here is very simple compared to the state of the art and doesn't make use of any modern techniques such as deep learning. The output of this system is evaluated by real, human mathematicians in Chapter 6. An example of a proof generated by the system is shown below in Output 3.39. There are some challenges in converting a Box proof to something that reads like a mathematical proof that I will detail here.

Output 3.39.

Output from the HumanProof natural language write-up system for a proof that the composition of continuous functions is continuous.

Let X, Y and Z be metric spaces, let f be a function X → Y and let g be a function Y → Z. Suppose f is continuous and g is continuous. We need to show that g ∘ f is continuous. Let ε > 0 and let x : X. We must choose δ > 0 such that dist (g (f x)) (g (f y)) < ε whenever dist x y < δ. Since g is continuous, there exists a δ₁ > 0 such that dist (g a) (g b) < ε whenever dist a b < δ₁. Since f is continuous, there exists a δ₂ > 0 such that dist (f x) (f y) < δ₁ whenever dist x y < δ₂. Since dist (g (f x)) (g (f y)) < ε whenever dist x y < δ₂, we are done by choosing δ to be δ₂.

3.6.1. Overview

The architecture of the NLG component is given in Figure 3.40. The design is similar to the standard architecture discussed in Section 2.8.1. In Section 3.1.2 I explained the decision to design the system to permit only a restricted set of moves on a Box representing the goal state of the prover. To create the natural language write-up from these moves, each move also emits an Act object. This is an inductive datatype representing the kind of move that occurred. So for example there is an Intro : List Binder → Act constructor that is emitted whenever the intro move is performed, storing the list of binders that were introduced. A list of these Acts is held in the state monad for the interactive proof session. This list of acts is then fed to a micro-planner, which converts the list of acts to an abstract representation of sentences (sometimes referred to as a phrase specification). These sentences are then converted to a realised sentence with the help of Run, which is a form of S-expression [McC60] containing text and expressions for interactive formatting. This natural language proof is then rendered in the output window using the widgets system (Chapter 5).

Figure 3.40.

Overview of the pipeline for the NLG component of HumanProof. A Box has a series of moves performed upon it, each producing an instance of Act, an abstract representation of what the move did. A list of all of the Acts from the session is then converted into a list of sentences, which is finally converted to an S-expression-like structure called Run. Compare this with the standard architecture given in Figure 2.30; the main difference is that the macroplanning phase is performed by the choice of moves performed on boxes, as detailed in Section 3.5.

3.6.2. Interlude: Grice's laws of implicature

One resource that has proven useful in creating human-like proofs is the work of Grice on implicature in linguistics [Gri75]. To review: Grice states that there is an unwritten rule in natural languages that one should only provide as much detail as is needed to convey the desired message. For example, the statement "I can't find my keys" has the implicature "Do you know where my keys are?"; it implies that the keys may have been lost at the current location and not in a different part of town, and so on. If superfluous detail is included, the reader will pick this up and try to use it to infer additional information. Saying "I can't find my keys at the moment", interpreted literally, has the same meaning as "I can't find my keys", but implicitly means that I have only just realised the key loss or that I will be able to find them soon. Grice posits four maxims that should be maintained in order for a sentence or phrase to be useful:

  1. Quantity The contribution should contain no more or less than what is required. Examples: "Since 𝑝 > 2 and 𝑝 is prime, 𝑝 is odd". "Let ε be a positive real such that ε < 1."

  2. Quality Do not say things for which you don't have enough evidence or things that are not true. An example here would be a false proof.

  3. Relation The contributed sentence should be related to the task at hand. Example: putting a true but irrelevant statement in the middle of a proof is bad.

  4. Manner The message should avoid being obscure, ambiguous and long-winded.

Mathematical texts are shielded from the more oblique forms of implicature that may be found in general texts, but Grice's maxims are still important to consider in the construction of human-readable proofs and serve as a useful rule-of-thumb in determining when a generated sentence will be jarring to read.

With respect to the quantity maxim, it is important to remember also that what counts as superfluous detail can depend on the context of the problem and the skill-level of the reader. For example, one may write:

Suppose U and V are open subsets of X. Since f is continuous, f⁻¹(U) is open.

A more introductory text will need to also mention that X is a topological space and so f⁻¹(U) is open. Generally these kinds of implicit lemma-chains can become arbitrarily complex, but it is typically assumed that these implicit applications are entirely familiar to the reader. Mapping ability level to detail is not a model that I will attempt to write explicitly here. One simple way around this is to allow the proof creator to explicitly tag steps in the proof as 'trivial' so that their application is suppressed in the natural language write-up. Determining the correct level of detail is a problem in which ML models may have a role to play.

3.6.3. Microplanning symbolic mathematics

From a linguistic perspective, a remarkable property of mathematical texts is the interlacing of mathematical symbols and natural language. In the vast majority of cases, each symbolic construct has a natural language equivalent (else verbalising that symbol in conversation would be difficult). For example: "𝑥 + 𝑦" versus "𝑥 plus 𝑦". Sometimes multiple verbalisations are possible: P ⟹ Q can be "P implies Q" or "Q whenever P". Sometimes the symbolic form of a statement is not used as frequently: "𝑝 is prime" versus prime(𝑝). In making text flow well, the decision of when to move between symbolic and textual renderings of a mathematical proof is important. The rule-of-thumb that I have arrived at is to render the general flow of the proof's reasoning using text and to render the objects that are being reasoned about using symbols. The idea here is that one should be able to follow the rough course of argument whilst only skimming the symbolic parts of the proof.

3.6.4. Microplanning binders with class predicate collections

In mathematics, it is common that a variable will be introduced in a sentence and then referenced in later sentences. For example, one will often read sentences such as "Let X be a metric space and let x and y be points in X". This corresponds to the following telescope of binders: (X : Type) (_ : metric_space X) (x y : X). (A telescope is a list of binders where the type of a binder may depend on variables declared earlier in the list. Telescopes are equivalent to a well-formed context (see Section 2.1.3), but the term telescope is also used to discuss lists of binders that appear in expressions such as lambda and forall bindings.) These effectively act as 'linguistic variable binders'.

In this subsection I will highlight how to convert lists of binders to natural language phrases of this form. To the best of my knowledge this is an original contribution so I will explain this mechanism in more detail. Related work to the approach here is discussed in Section 3.6.4.1. Table 3.41 presents some examples of this process.

Table 3.41.

Examples of generating natural language renderings of variable introductions from type-theory telescopes. Square brackets on a binder such as [group G] denote a typeclass binder. This typeclass binder is equivalent to the binder (𝔤 : group G) where the binder name 𝔤 is omitted. Typeclasses were first introduced by Hall et al for use with the Haskell programming language [HHPW96]. Typeclasses are used extensively in the Lean 3 theorem prover. A description of their implementation can be found at [MAKR15 §2.4].

Telescope Generated text
(X : Type) [metric_space X] (𝑥 𝑦 : X) Let X be a metric space and let 𝑥 and 𝑦 be points in X.
(G : Type) [group G] (𝑥 𝑦 : G) Let G be a group and let 𝑥 and 𝑦 be elements of G.
(G : Type) [group G] (H : set G) (h₁ : subgroup.normal G H) Let G be a group and H be a normal subgroup of G.
(𝑎 𝑏 : ℤ) (h₁ : coprime 𝑎 𝑏) Let 𝑎 and 𝑏 be coprime integers.
(𝑓 : X → Y) (h₁ : continuous 𝑓) Let 𝑓 : X → Y be a continuous function.
(T : Type) [topological_space T] (U : set T) (h₁ : open U) Let T be a topological space and let U be an open set in T.
(ε : ℝ) (h₁ : ε > 0) Let ε > 0.

These variable introduction sentences in Table 3.41 take the role of a variable binder for mathematical discourse; the variable is then implicitly 'in scope' until its last mention in the text. Some variables introduced in this way can remain in scope for an entire book, for example the choice of underlying field k in a book on linear algebra. As Ganesalingam notes [Gan10 §2.5.2], "If mathematicians were not able to use variables in this way, they would need to write extremely long sentences!"

Let's frame the problem as follows: take as input a telescope of binders (e.g. [(𝑎 : ℤ), (𝑏 : ℤ), (h₁ : coprime 𝑎 𝑏)]) and produce a 'variable introduction text' string as shown in the above table. The problem involves a number of challenges:

  • There is not a 1-1 map between binders and pieces of text: in "Let 𝑎, 𝑏 be coprime", the binder h₁ : coprime 𝑎 𝑏 is not named but instead treated as a property of 𝑎 and 𝑏.

  • The words that are used to describe a variable can depend on which typeclass their type belongs to (see the caption of Table 3.41 for more information on typeclasses [HHPW96]). For instance, we write "let 𝑥 and 𝑦 be points" or "let 𝑥 and 𝑦 be elements of G" depending on whether the type of 𝑥 and 𝑦 is an instance of metric_space or group.

  • Compare "𝑥 and 𝑦 are prime" versus "𝑥 and 𝑦 are coprime". The first arises from (𝑥 𝑦 : ℤ) (h₁ : prime 𝑥) (h₂ : prime 𝑦) whereas the second from (𝑥 𝑦 : ℤ) (h₁ : coprime 𝑥 𝑦). Hence we need to model the adjectives "prime" and "coprime" as belonging to distinct categories.

To solve this I introduce a schema of class predicate collections. Each binder in the input telescope is converted to two pieces of data: the subject expression 𝑥 and the class predicate 𝑐𝑝, which is made from one of the following constructors.

  • adjective: "continuous", "prime", "positive"

  • fold_adjective: "coprime", "parallel"

  • symbolic_postfix: "∈ A", "> 0", ": X → Y"

  • class_noun: "number", "group", "points in X", "elements of G", "function", "open set in T"

  • none: a failure case. For example if the binder is just for a proposition that should be realised as an assumption instead of a predicate about the binder.
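In Lean 3, these constructors might be modelled with an inductive datatype along the following lines. This is a sketch only; the constructor arguments shown here are illustrative assumptions rather than the implementation's exact fields.

```lean
-- Sketch of a datatype for class predicates (hypothetical field layout).
meta inductive class_predicate
| adjective        : string → class_predicate             -- "continuous", "prime", "positive"
| fold_adjective   : string → class_predicate             -- "coprime", "parallel"
| symbolic_postfix : string → class_predicate             -- "∈ A", "> 0", ": X → Y"
| class_noun       : string → list expr → class_predicate -- noun text plus parameters, e.g. G in "elements of G"
| none             : class_predicate                      -- failure case
```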

The subject expression and the class predicate for a given binder in the input telescope are assigned by consulting a lookup table which pattern-matches the binder type expressions to determine the subject expression and any additional parameters (for example T in "open set in T"). Each pair 𝑥, 𝑐𝑝 is mapped to ⟨[𝑥], [𝑐𝑝]⟩ : List Expr × List ClassPredicate. Call this a class predicate collection (CPC). The resulting list of CPCs is then reduced by aggregating [DH93] adjacent pairs of CPCs according to the rules in (3.42).

(3.42).

Rules for aggregating class predicate collections.

⟨𝑥𝑠, 𝑐𝑝𝑠⟩, ⟨𝑦𝑠, 𝑐𝑝𝑠⟩ ↝ ⟨𝑥𝑠 ++ 𝑦𝑠, 𝑐𝑝𝑠⟩
⟨𝑥𝑠, 𝑐𝑝𝑠₁⟩, ⟨𝑥𝑠, 𝑐𝑝𝑠₂⟩ ↝ ⟨𝑥𝑠, 𝑐𝑝𝑠₁ ++ 𝑐𝑝𝑠₂⟩
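The aggregation rules above amount to a single left-to-right pass that merges adjacent collections. A minimal Lean 3 sketch, under the simplifying assumption that a CPC is just a pair of subject expressions and predicate strings (the real datastructure is richer):

```lean
-- Sketch: aggregate adjacent class predicate collections.
-- A CPC is modelled as (subjects, predicate texts); strings suffice to
-- illustrate the two rewrite rules.
meta def aggregate : list (list expr × list string) → list (list expr × list string)
| ((xs, ps) :: (ys, qs) :: rest) :=
  if ps = qs then aggregate ((xs ++ ys, ps) :: rest)      -- same predicates: merge subjects
  else if xs = ys then aggregate ((xs, ps ++ qs) :: rest) -- same subjects: merge predicates
  else (xs, ps) :: aggregate ((ys, qs) :: rest)           -- no rule applies: move on
| l := l
```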

In certain cases, the merging operation can also delete class predicates that are superseded by later ones. An example is that if we have (𝑥 : X) (h₁ : 𝑥 ∈ A), this can be condensed directly to ⟨[𝑥], [symbolic_postfix "∈ A"]⟩ which realises to "Let 𝑥 ∈ A" instead of the redundant "Let 𝑥 ∈ A be an element of X", which violates Grice's maxim of quantity (Section 3.6.2).

Additionally, the resulting class predicate collection list is partitioned into two lists so that only the first mention of each subject appears in the first list. For example, 𝑥 : X and h : 𝑥 ∈ A both have the subject 𝑥, but a sentence such as "Let 𝑥 be a point and let 𝑥 ∈ A" would introduce the same variable twice; only the first mention should introduce 𝑥.
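This first-mention partitioning can be sketched as a simple recursion over the CPC list. Here each collection is simplified to carry a single string subject, an illustrative assumption rather than the implementation's representation:

```lean
-- Sketch: split a CPC list so that each subject is introduced only once.
-- A CPC is simplified to (subject, predicate text); `seen` accumulates the
-- subjects that have already been introduced.
def partition_first : list string → list (string × string) →
    list (string × string) × list (string × string)
| seen [] := ([], [])
| seen ((x, p) :: cs) :=
  if x ∈ seen then
    let r := partition_first seen cs in (r.1, (x, p) :: r.2)        -- repeat mention: second list
  else
    let r := partition_first (x :: seen) cs in ((x, p) :: r.1, r.2) -- first mention: first list
```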

These class predicate collections can then be realised for a number of binder cases:

  • Let: "Let U be open in X"

  • Forall: "For all U open in X"

  • Exists: "For some U open in X"

class_noun can be compared to the concept of a 'notion' in ForTheL and Naproche/SAD and a 'non-extensional type' in Ganesalingam [Gan10]. It takes the role of a noun that the introduced variable will belong to, and is usually preceded with an indefinite article: "let 𝑥 be an element of G".

The approach presented above is inspired by the idea of 'notions' as first used in the ForTheL controlled natural language parser for the SAD project [VLP07, Pas07, VLPA08], also used by Naproche/SAD [DKL20]. Ganesalingam [Gan10] refers to these as non-extensional types and Ranta [Ran94] as syntactic categories. The PROVERB system [HF97] and Gowers and Ganesalingam's system [GG17] provide a mechanism for generating natural language texts using a similar technique for aggregating assumptions; however, these approaches do not allow for the handling of the more complex telescopes found in dependent type theory.

3.6.4.2. Some closing thoughts on CPCs

Will some mechanism like CPCs be necessary in the future, or are they a cultural artefact of the way that mathematics has been done in the past? When designing mathematical definitions in formalised mathematics, one must often choose how new datatypes are defined: should there be a new type 'positive real number', or should one use the real numbers together with a hypothesis ε > 0? In natural language mathematics, one is free to move between these bundled and unbundled representations without concern. The CPC structure reflects this: "ε is a positive real" can be interpreted either as a "real that is positive" or as a single semantic type "positive real". Natural mathematics does not disambiguate between these two because they are both equivalent within its informal rules, similar to how the representation of 𝑎 + 𝑏 + 𝑐 does not need to disambiguate between (𝑎 + 𝑏) + 𝑐 and 𝑎 + (𝑏 + 𝑐) since they are equal.

3.6.5. Handling 'multi-apply' steps

The specialised apply move discussed in Section 3.5.8 requires some additional processing. The apply move returns a datatype called ApplyTree that indicates how a given lemma was applied, together with the parameters, goals and values obtained through eliminating existential statements. These are converted into "since" sentences:

"Since f is continuous, there exists some δ > 0 such that d (f 𝑥) (f 𝑦) < ε whenever d 𝑥 𝑦 < δ"

The code that produces this style of reasoning breaks down into a Reason component indicating where the fact came from and a restatement of the fact with the parameters set to be relevant to the goal. In most cases, the Reason can simply be a restatement of the fact being used. However it is also possible to produce more elaborate reasons. For example, apply for some hypothesis will also match preconditions if they appear in the context. That is, if h₀ : P → Q, then apply h₀ in the box in Figure 3.43 will automatically include the propositional assumption h₁ : P to solve the Box, instead of resulting in a new target ?t : P. This will produce the reason "Since P → Q and P, we have Q".
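The assembly of such a sentence from a list of reasons and a restated fact can be sketched as straightforward string concatenation (the function name since_sentence is hypothetical; string.intercalate is from Lean 3's core library):

```lean
-- Sketch: assemble a "since" sentence from reasons and a restated fact.
def since_sentence (reasons : list string) (fact : string) : string :=
"Since " ++ string.intercalate " and " reasons ++ ", " ++ fact

-- e.g. since_sentence ["P → Q", "P"] "we have Q." gives
-- "Since P → Q and P, we have Q."
```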

Figure 3.43.
h₀ : P → Q
h₁ : P

?t : Q

3.6.6. Multiple cases

Some problems branch into multiple cases, for example the AB problem. Here, some additional macroplanning needs to occur, since it usually makes sense to place each of the cases in its own paragraph. When cases is performed, the resulting 𝒜-box contains a separate branch for each case as discussed in (3.27).

When a move performed within one of these case blocks creates an Act, the Act is tagged with the corresponding case. This tag is then used to partition the resulting rendered string into multiple paragraphs.

3.6.7. Realisation

As shown in Figure 3.40, the set of Acts is compiled into a sequence of Sentence objects, which are then 'realised' to a run of text. In this phase each sentence is converted to a piece of text containing embedded mathematics. Each statement is constructed by recursively assembling 'canned' phrases representing each sentence. This means that longer proofs can become monotonous, although the application of synonymous phrases could be used to add variation. However, the purpose of this NLG system is to produce 'human-like' reasoning, and so if the proofs read as too monotonous, it suggests that less detail should have been included in the Act list structure.

When realising logical statements, the prose would become unnatural or ambiguous after a certain depth, so after a depth of two these statements switch to being entirely symbolic. For example, ((P → Q) ∧ X) → Y would naïvely render in natural language as "Y whenever X and Q whenever P"; even with some more sophisticated algorithm to remove the clunkiness, writing "Y whenever X and P → Q" is just much clearer.
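A depth-limited realiser of this kind can be sketched in Lean 3 as follows, here handling only ∧ and ∨ and falling back to Lean's pretty printer once the depth budget is exhausted. The name prose_of and the exact phrase choices are illustrative, not the implementation's:

```lean
-- Sketch: render a proposition as prose down to a depth limit, then fall
-- back to the symbolic pretty printer.
meta def prose_of : ℕ → expr → tactic string
| (d+1) `(%%P ∧ %%Q) := do
    sp ← prose_of d P, sq ← prose_of d Q,
    pure (sp ++ " and " ++ sq)
| (d+1) `(%%P ∨ %%Q) := do
    sp ← prose_of d P, sq ← prose_of d Q,
    pure (sp ++ " or " ++ sq)
| _ e := format.to_string <$> tactic.pp e  -- depth exhausted or unrecognised: symbolic
```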

Mathematical expressions were pretty printed using Lean's pretty printing engine. However, the Lean 3 pretty printer needs a metavariable context in order to render, so it was necessary to store a tactic state object alongside the Act objects. This context had to be stored separately for each act because some metavariables would become solved through the course of the proof, causing confusing statements such as "by setting ε to be ε" where it should read "by setting η to be ε". Another issue was the printing of values created through destructuring existential variables, which would be rendered as classical.some.

3.7. Conclusion

In this chapter, I have introduced a new development calculus for human-like reasoning and demonstrated its compatibility with the development calculus of the Lean theorem prover. I have outlined the structure of a set of 'moves' within this calculus that allow for the creation of both formal proofs and natural-language write-ups of those proofs.

I then detailed the natural language generation component of HumanProof. The component can produce readable proofs of simple lemmas, but there is plenty of work still to do before the system can support larger projects.

In the next chapter, we will discuss a new component to enhance the Box system for use with equational reasoning. I will make use of the work presented in this chapter in the evaluation (Chapter 6).

I will finish this chapter with some thoughts on future directions for the Box datastructure. A more general outlook on future work can be found in Section 7.2, where I also discuss potential future directions in applying deep learning to natural language generation.

3.7.1. Future work: 𝒪-critics

An avenue for future research is the definition of some additional moves for the Box datastructure that allow it to work in a similar fashion to Ireland's proof critics [Ire92]. Recall from Section 2.7.2 that proof critics (broadly speaking) are a proof planning technique that can revise a proof plan in light of information gained from executing a failed plan. 𝒪-boxes can support a similar idea as I will now exemplify in (3.44), where the statement to prove is ∀ 𝑎 𝑏 : ℝ, ∃ 𝑥 : ℝ, (𝑎 ≤ 𝑥) ∧ (𝑏 ≤ 𝑥). The proof requires spotting the trichotomy property of real numbers, ∀ 𝑥 𝑦 : ℝ, 𝑥 ≤ 𝑦 ∨ 𝑦 < 𝑥; however, it is difficult to see whether this will apply from the goal state.

(3.44).

Sketch of some future work making use of 𝒪-boxes to perform a speculative application of the lemma 𝑎 = 𝑥 → 𝑎 ≤ 𝑥 (highlighted). The moves are: ① 𝒪-intro (3.35); ② apply 𝑎 = 𝑥 → 𝑎 ≤ 𝑥 to the left instance of ?𝑡₁; ③ apply reflexivity to the left ?𝑡₁, causing 𝑎 and ?𝑥 to be unified (see Section 3.5.7).

𝑎 𝑏 : ℝ

?𝑥 : ℝ
?𝑡₁ : 𝑎 ≤ ?𝑥
?𝑡₂ : 𝑏 ≤ ?𝑥
①↝
𝑎 𝑏 : ℝ


?𝑥 : ℝ
?𝑡₁ : 𝑎 ≤ ?𝑥
?𝑡₂ : 𝑏 ≤ ?𝑥

?𝑥 : ℝ
?𝑡₁ : 𝑎 ≤ ?𝑥
?𝑡₂ : 𝑏 ≤ ?𝑥
②↝
𝑎 𝑏 : ℝ


?𝑥 : ℝ
?𝑡₁ : 𝑎 = ?𝑥
?𝑡₂ : 𝑏 ≤ ?𝑥

?𝑥 : ℝ
?𝑡₁ : 𝑎 ≤ ?𝑥
?𝑡₂ : 𝑏 ≤ ?𝑥
③↝
𝑎 𝑏 : ℝ


?𝑡₂ : 𝑏 ≤ 𝑎

?𝑥 : ℝ
?𝑡₁ : 𝑎 ≤ ?𝑥
?𝑡₂ : 𝑏 ≤ ?𝑥

At this point, one can spot that the left-hand Box is no longer possible to solve unless one assumes 𝑏 ≤ 𝑎. However, rather than deleting the left-hand box, we can instead use this information as in (3.45).

(3.45).

Continuation of (3.44) to perform an 'informed backtracking'. The key step is ④, the inclusion of an instance of the LEM axiom triggered by the insolubility of the target ?𝑡₂ : 𝑏 ≤ 𝑎 on the left-hand branch of the 𝒪-box. ⑤ is an amalgamation of two moves, ∨-cases (3.30) and 𝒪-hoisting (A.18), as described in Section A.3.1. ⑥ is an application of ℎ : 𝑏 ≤ 𝑎 in the left-hand box and 𝒪-reduce (3.21). ⑦ is an application of ¬(𝑏 ≤ 𝑎) → 𝑎 < 𝑏 and ⑧ is an application of 𝑏 ≤ 𝑏.

③↝
𝑎 𝑏 : ℝ


?𝑡₂ : 𝑏 ≤ 𝑎

?𝑥 : ℝ
?𝑡₁ : 𝑎 ≤ ?𝑥
?𝑡₂ : 𝑏 ≤ ?𝑥
④↝
𝑎 𝑏 : ℝ
ℎ : 𝑏 ≤ 𝑎 ∨ ¬ 𝑏 ≤ 𝑎


?𝑡₂ : 𝑏 ≤ 𝑎

?𝑥 : ℝ
?𝑡₁ : 𝑎 ≤ ?𝑥
?𝑡₂ : 𝑏 ≤ ?𝑥
⑤↝
𝑎 𝑏 : ℝ

ℎ : 𝑏 ≤ 𝑎

?𝑡₂ : 𝑏 ≤ 𝑎
ℎ : 𝑎 < 𝑏

?𝑥 : ℝ
?𝑡₁ : 𝑎 ≤ ?𝑥
?𝑡₂ : 𝑏 ≤ ?𝑥
⑥↝
𝑎 𝑏 : ℝ
ℎ : 𝑎 < 𝑏

?𝑥 : ℝ
?𝑡₁ : 𝑎 ≤ ?𝑥
?𝑡₂ : 𝑏 ≤ ?𝑥
⑦↝
𝑎 𝑏 : ℝ
ℎ : 𝑎 < 𝑏

?𝑡₂ : 𝑏 ≤ 𝑏
⑧↝
done!

The remaining research question for putting (3.44) and (3.45) into practice is to determine some heuristics for when it is appropriate to perform 𝒪-intro (step ①) and step ④, where an instance ℎ : 𝑏 ≤ 𝑎 ∨ ¬ 𝑏 ≤ 𝑎 is introduced. What is an appropriate trigger for suggesting the manoeuvre in ④, ⑤ and ⑥ to the user?

Bibliography for this chapter

  • [ADL10] Aspinall, David; Denney, Ewen; Lüth, Christoph. Tactics for hierarchical proof (2010). Mathematics in Computer Science, volume 3, number 3, pages 309–330. Springer.
  • [BG01] Bachmair, Leo; Ganzinger, Harald. Resolution theorem proving (2001). Handbook of Automated Reasoning, pages 19–99. Elsevier.
  • [BM72] Boyer, R. S.; Moore, J. S. The sharing structure in theorem-proving programs (1972). Machine Intelligence, volume 7, pages 101–116. Edinburgh University Press.
  • [BN10] Böhme, Sascha; Nipkow, Tobias. Sledgehammer: judgement day (2010). International Joint Conference on Automated Reasoning, pages 107–121. Springer.
  • [BN98] Baader, Franz; Nipkow, Tobias. Term rewriting and all that (1998). Cambridge University Press. ISBN 978-0-521-45520-6.
  • [Ble81] Bledsoe, Woodrow W. Non-resolution theorem proving (1981). Readings in Artificial Intelligence, pages 91–108. Elsevier.
  • [Bun02] Bundy, Alan. A critique of proof planning (2002). Computational Logic: Logic Programming and Beyond, pages 160–177. Springer.
  • [Bun88] Bundy, Alan. The use of explicit plans to guide inductive proofs (1988). International Conference on Automated Deduction, pages 111–120. Springer.
  • [CH88] Coquand, Thierry; Huet, Gérard P. The Calculus of Constructions (1988). Information and Computation, volume 76, number 2/3, pages 95–120. doi 10.1016/0890-5401(88)90005-3.
  • [Car19] Carneiro, Mario. Lean's Type Theory (2019).
  • [DH93] Dalianis, Hercules; Hovy, Eduard. Aggregation in natural language generation (1993). European Workshop on Trends in Natural Language Generation, pages 88–105. Springer.
  • [DKL20] De Lon, Adrian; Koepke, Peter; Lorenzen, Anton. Interpreting Mathematical Texts in Naproche-SAD (2020). Intelligent Computer Mathematics, pages 284–289. Springer International Publishing. ISBN 978-3-030-53518-6.
  • [Dow01] Dowek, Gilles. Higher-order unification and matching (2001). Handbook of Automated Reasoning, volume 2, pages 1009–1063. Elsevier.
  • [EUR+17] Ebner, Gabriel; Ullrich, Sebastian; Roesch, Jared; Avigad, Jeremy; de Moura, Leonardo. A metaprogramming framework for formal verification (2017). Proceedings of the ACM on Programming Languages, volume 1, number ICFP, pages 1–29. ACM.
  • [GG17] Ganesalingam, Mohan; Gowers, W. T. A fully automatic theorem prover with human-style output (2017). Journal of Automated Reasoning, volume 58, number 2, pages 253–291. doi 10.1007/s10817-016-9377-1.
  • [Gan10] Ganesalingam, Mohan. The language of mathematics (2010). Springer.
  • [Gri75] Grice, Herbert P. Logic and conversation (1975). Speech Acts, pages 41–58. Brill.
  • [HF97] Huang, Xiaorong; Fiedler, Armin. Proof Verbalization as an Application of NLG (1997). IJCAI (2), pages 965–972.
  • [HHPW96] Hall, Cordelia V; Hammond, Kevin; Peyton Jones, Simon L; Wadler, Philip L. Type classes in Haskell (1996). ACM Transactions on Programming Languages and Systems (TOPLAS), volume 18, number 2, pages 109–138. ACM.
  • [Hue97] Huet, Gérard. Functional Pearl: The Zipper (1997). Journal of Functional Programming, volume 7, number 5, pages 549–554. Cambridge University Press.
  • [Ire92] Ireland, Andrew. The use of planning critics in mechanizing inductive proofs (1992). International Conference on Logic for Programming Artificial Intelligence and Reasoning, pages 178–189. Springer.
  • [MAKR15] de Moura, Leonardo; Avigad, Jeremy; Kong, Soonho; Roux, Cody. Elaboration in Dependent Type Theory (2015). CoRR, volume abs/1505.04324.
  • [MB08] de Moura, Leonardo; Bjørner, Nikolaj. Z3: An efficient SMT solver (2008). International Conference on Tools and Algorithms for the Construction and Analysis of Systems, pages 337–340. Springer.
  • [McB00] McBride, Conor. Dependently typed functional programs and their proofs (2000).
  • [McC60] McCarthy, John. Recursive functions of symbolic expressions and their computation by machine, Part I (1960). Communications of the ACM, volume 3, number 4, pages 184–195. ACM.
  • [Mil72] Milner, Robin. Logic for computable functions: description of a machine implementation (1972). Stanford University.
  • [PP89] Pfenning, Frank; Paulin-Mohring, Christine. Inductively defined types in the Calculus of Constructions (1989). International Conference on Mathematical Foundations of Programming Semantics, pages 209–228. Springer.
  • [Pas07] Paskevich, Andrei. The syntax and semantics of the ForTheL language (2007).
  • [Pau99] Paulson, Lawrence C. A generic tableau prover and its integration with Isabelle (1999). Journal of Universal Computer Science, volume 5, number 3, pages 73–87.
  • [RN10] Russell, Stuart J.; Norvig, Peter. Artificial Intelligence: A Modern Approach (2010). Pearson Education. ISBN 978-0-13-207148-2.
  • [Ran94] Ranta, Aarne. Syntactic categories in the language of mathematics (1994). International Workshop on Types for Proofs and Programs, pages 162–182. Springer.
  • [SB01] Snyder, Wayne; Baader, Franz. Unification theory (2001). Handbook of Automated Reasoning, volume 1, pages 447–533. Elsevier.
  • [SCV19] Schulz, Stephan; Cruanes, Simon; Vukmirović, Petar. Faster, Higher, Stronger: E 2.3 (2019). Proc. of the 27th CADE, Natal, Brasil, number 11716, pages 495–507. Springer.
  • [SH17] Sterling, Jonathan; Harper, Robert. Algebraic Foundations of Proof Refinement (2017). arXiv preprint arXiv:1703.05215.
  • [Spi11] Spiwack, Arnaud. Verified computing in homological algebra, a journey exploring the power and limits of dependent type theory (2011).
  • [VLP07] Verchinine, Konstantin; Lyaletski, Alexander; Paskevich, Andrei. System for Automated Deduction (SAD): a tool for proof verification (2007). International Conference on Automated Deduction, pages 398–403. Springer.
  • [VLPA08] Verchinine, Konstantin; Lyaletski, Alexander; Paskevich, Andrei; Anisimov, Anatoly. On correctness of mathematical texts from a logical and practical point of view (2008). International Conference on Intelligent Computer Mathematics, pages 583–598. Springer.
© 2021 E.W.Ayers.