\chapter{The Classical Reasoner}\label{chap:classical}
\index{classical reasoner|(}
\newcommand\ainfer[2]{\begin{array}{r@{\,}l}#2\\ \hline#1\end{array}}

Although Isabelle is generic, many users will be working in some extension
of classical first-order logic.  Isabelle's set theory~{\tt ZF} is built
upon theory~{\tt FOL}, while higher-order logic contains first-order logic
as a fragment.  Theorem-proving in predicate logic is undecidable, but many
researchers have developed strategies to assist in this task.
Isabelle's classical reasoner is an \ML{} functor that accepts certain
information about a logic and delivers a suite of automatic tactics.  Each
tactic takes a collection of rules and executes a simple, non-clausal proof
procedure.  They are slow and simplistic compared with resolution theorem
provers, but they can save considerable time and effort.  They can prove
theorems such as Pelletier's~\cite{pelletier86} problems~40 and~41 in
seconds:
\[ (\exists y. \forall x. J(y,x) \bimp \neg J(x,x))
   \imp \neg (\forall x. \exists y. \forall z. J(z,y) \bimp \neg J(z,x)) \]
\[ (\forall z. \exists y. \forall x. F(x,y) \bimp F(x,z) \conj \neg F(x,x))
   \imp \neg (\exists z. \forall x. F(x,z)) \]
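In {\tt FOL}, for instance, such theorems can typically be proved by a
single call to {\tt fast_tac} with the standard classical set {\tt FOL_cs};
the following session sketches the idea (the rule and theory names are
those of Isabelle's {\tt FOL}):
\begin{ttbox}
goal FOL.thy "(EX y. ALL x. J(y,x) <-> ~J(x,x)) --> ~(ALL x. EX y. ALL z. J(z,y) <-> ~J(z,x))";
by (fast_tac FOL_cs 1);
\end{ttbox}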
The tactics are generic.  They are not restricted to first-order logic, and
have been heavily used in the development of Isabelle's set theory.  Few
interactive proof assistants provide this much automation.  The tactics can
be traced, and their components can be called directly; in this manner,
any proof can be viewed interactively.
We shall first discuss the underlying principles, then consider how to use
the classical reasoner.  Finally, we shall see how to instantiate it for
new logics.  The logics {\tt FOL}, {\tt HOL} and {\tt ZF} have it already
installed.
\section{The sequent calculus}
\index{sequent calculus}
Isabelle supports natural deduction, which is easy to use for interactive
proof.  But natural deduction does not easily lend itself to automation,
and has a bias towards intuitionism.  For certain proofs in classical
logic, it cannot be called natural.  The {\bf sequent calculus}, a
generalization of natural deduction, is easier to automate.
A {\bf sequent} has the form $\Gamma\turn\Delta$, where $\Gamma$
and~$\Delta$ are sets of formulae.%
\footnote{For first-order logic, sequents can equivalently be made from
  lists or multisets of formulae.} The sequent
\[ P@1,\ldots,P@m\turn Q@1,\ldots,Q@n \]
is {\bf valid} if $P@1\conj\ldots\conj P@m$ implies $Q@1\disj\ldots\disj
Q@n$.  Thus $P@1,\ldots,P@m$ represent assumptions, each of which is true,
while $Q@1,\ldots,Q@n$ represent alternative goals.  A sequent is {\bf
basic} if its left and right sides have a common formula, as in $P,Q\turn
Q,R$; basic sequents are trivially valid.
Sequent rules are classified as {\bf right} or {\bf left}, indicating which
side of the $\turn$~symbol they operate on.  Rules that operate on the
right side are analogous to natural deduction's introduction rules, and
left rules are analogous to elimination rules.
Recall the natural deduction rules for first-order logic,
\iflabelundefined{fol-fig}{from {\it Introduction to Isabelle}}%
                          {Fig.\ts\ref{fol-fig}}.
The sequent calculus analogue of~$({\imp}I)$ is the rule
$$ \ainfer{\Gamma &\turn \Delta, P\imp Q}{P,\Gamma &\turn \Delta,Q} $$
This breaks down some implication on the right side of a sequent; $\Gamma$
and $\Delta$ stand for the sets of formulae that are unaffected by the
inference.  The analogue of the pair~$({\disj}I1)$ and~$({\disj}I2)$ is the
single rule
$$ \ainfer{\Gamma &\turn \Delta, P\disj Q}{\Gamma &\turn \Delta,P,Q} $$
This breaks down some disjunction on the right side, replacing it by both
disjuncts.  Thus, the sequent calculus is a kind of multiple-conclusion logic.

To illustrate the use of multiple formulae on the right, let us prove
the classical theorem $(P\imp Q)\disj(Q\imp P)$.  Working backwards, we
reduce this formula to a basic sequent:
\[ \infer[(\disj)R]{\turn(P\imp Q)\disj(Q\imp P)}
   {\infer[(\imp)R]{\turn(P\imp Q), (Q\imp P)\;}
    {\infer[(\imp)R]{P \turn Q, (Q\imp P)\qquad}
                    {P, Q \turn Q, P\qquad\qquad}}}
\]
This example is typical of the sequent calculus: start with the desired
theorem and apply rules backwards in a fairly arbitrary manner.  This yields a
surprisingly effective proof procedure.  Quantifiers add few complications,
since Isabelle handles parameters and schematic variables.  See Chapter~10
of {\em ML for the Working Programmer}~\cite{paulson91} for further
discussion and examples.
\section{Simulating sequents by natural deduction}
Isabelle can represent sequents directly, as in the object-logic~{\tt LK}\@.
But natural deduction is easier to work with, and most object-logics employ
it.  Fortunately, we can simulate the sequent $P@1,\ldots,P@m\turn
Q@1,\ldots,Q@n$ by the Isabelle formula
\[ \List{P@1;\ldots;P@m; \neg Q@2;\ldots; \neg Q@n}\Imp Q@1, \]
where the order of the assumptions and the choice of~$Q@1$ are arbitrary.
Elim-resolution plays a key role in simulating sequent proofs.
We can easily handle reasoning on the left.
As discussed in
\iflabelundefined{destruct}{{\it Introduction to Isabelle}}{\S\ref{destruct}},
elim-resolution with the rules $(\disj E)$, $(\bot E)$ and $(\exists E)$
achieves an effect similar to that of the corresponding sequent rules.  For
the other connectives, we use sequent-style elimination rules instead of
destruction rules such as $({\conj}E1,2)$ and $(\forall E)$.  But note that
the rule $(\neg L)$ has no effect under our representation of sequents!
$$ \ainfer{\neg P,\Gamma &\turn \Delta}{\Gamma &\turn \Delta,P} \eqno(\neg L) $$

What about reasoning on the right?  Introduction rules can only affect the
formula in the conclusion, namely~$Q@1$.  The other right-side formulae are
represented as negated assumptions, $\neg Q@2$, \ldots,~$\neg Q@n$.
\index{assumptions!negated}
In order to operate on one of these, it must first be exchanged with~$Q@1$.
Elim-resolution with the {\bf swap} rule has this effect:
$$ \List{\neg P; \; \neg R\Imp P} \Imp R \eqno(swap)$$
To ensure that swaps occur only when necessary, each introduction rule is
converted into a swapped form: it is resolved with the second premise
of~$(swap)$.  The swapped form of~$({\conj}I)$, which might be
called~$({\neg\conj}E)$, is
\[ \List{\neg(P\conj Q); \; \neg R\Imp P; \; \neg R\Imp Q} \Imp R. \]
Similarly, the swapped form of~$({\imp}I)$ is
\[ \List{\neg(P\imp Q); \; \List{\neg R;P}\Imp Q} \Imp R \]
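Resolving~$({\disj}I1)$ with the second premise of~$(swap)$ likewise yields
\[ \List{\neg(P\disj Q); \; \neg R\Imp P} \Imp R. \]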
Swapped introduction rules are applied using elim-resolution, which deletes
the negated formula.  Our representation of sequents also requires the use
of ordinary introduction rules.  If we had no regard for readability, we
could treat the right side more uniformly by representing sequents as
\[ \List{P@1;\ldots;P@m; \neg Q@1;\ldots; \neg Q@n}\Imp \bot. \]


\section{Extra rules for the sequent calculus}
As mentioned, destruction rules such as $({\conj}E1,2)$ and $(\forall E)$
must be replaced by sequent-style elimination rules.  In addition, we need
rules to embody the classical equivalence between $P\imp Q$ and $\neg P\disj
Q$.  The introduction rules~$({\disj}I1,2)$ are replaced by a rule that
simulates $({\disj}R)$:
\[ (\neg Q\Imp P) \Imp P\disj Q \]
The destruction rule $({\imp}E)$ is replaced by
\[ \List{P\imp Q;\; \neg P\Imp R;\; Q\Imp R} \Imp R. \]
Quantifier replication also requires special rules.  In classical logic,
$\exists x{.}P$ is equivalent to $\neg\forall x{.}\neg P$; the rules
$(\exists R)$ and $(\forall L)$ are dual:
\[ \ainfer{\Gamma &\turn \Delta, \exists x{.}P}
          {\Gamma &\turn \Delta, \exists x{.}P, P[t/x]} \; (\exists R)
   \qquad
   \ainfer{\forall x{.}P, \Gamma &\turn \Delta}
          {P[t/x], \forall x{.}P, \Gamma &\turn \Delta} \; (\forall L)
\]
Thus both kinds of quantifier may be replicated.  Theorems requiring
multiple uses of a universal formula are easy to invent; consider
\[ (\forall x.P(x)\imp P(f(x))) \conj P(a) \imp P(f^n(a)), \]
for any~$n>1$.  Natural examples of the multiple use of an existential
formula are rare; a standard one is $\exists x.\forall y. P(x)\imp P(y)$.
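The standard existential example just mentioned is within reach of the
automatic tactics, since they replicate quantifiers; a typical {\tt FOL}
session (using the standard classical set {\tt FOL_cs}) is:
\begin{ttbox}
goal FOL.thy "EX x. ALL y. P(x) --> P(y)";
by (best_tac FOL_cs 1);
\end{ttbox}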
Forgoing quantifier replication loses completeness, but gains decidability,
since the search space becomes finite.  Many useful theorems can be proved
without replication, and the search generally delivers its verdict in a
reasonable time.  To adopt this approach, represent the sequent rules
$(\exists R)$, $(\exists L)$ and $(\forall R)$ by $(\exists I)$, $(\exists
E)$ and $(\forall I)$, respectively, and put $(\forall E)$ into elimination
form:
$$ \List{\forall x{.}P(x); P(t)\Imp Q} \Imp Q \eqno(\forall E@2) $$
Elim-resolution with this rule will delete the universal formula after a
single use.  To replicate universal quantifiers, replace the rule by
$$ \List{\forall x{.}P(x);\; \List{P(t); \forall x{.}P(x)}\Imp Q} \Imp Q.
   \eqno(\forall E@3) $$
To replicate existential quantifiers, replace $(\exists I)$ by
\[ \List{\neg(\exists x{.}P(x)) \Imp P(t)} \Imp \exists x{.}P(x). \]
All introduction rules mentioned above are also useful in swapped form.
Replication makes the search space infinite; we must apply the rules with
care.  The classical reasoner distinguishes between safe and unsafe
rules, applying the latter only when there is no alternative.  Depth-first
search may well go down a blind alley; best-first search is better behaved
in an infinite search space.  However, quantifier replication is too
expensive for the search to prove any but the simplest theorems.
\section{Classical rule sets}
\index{classical sets}
Each automatic tactic takes a {\bf classical set} --- a collection of
rules, classified as introduction or elimination and as {\bf safe} or {\bf
unsafe}.  In general, safe rules can be attempted blindly, while unsafe
rules must be used with care.  A safe rule must never reduce a provable
goal to an unprovable set of subgoals.
The rule~$({\disj}I1)$ is unsafe because it reduces $P\disj Q$ to~$P$.  Any
rule whose premises contain new unknowns is unsafe.  The elimination
rule~$(\forall E@2)$ is unsafe, since it is applied via elim-resolution,
which discards the assumption $\forall x{.}P(x)$ and replaces it by the
weaker assumption~$P(\Var{t})$.  The rule $({\exists}I)$ is unsafe for
similar reasons.  The rule~$(\forall E@3)$ is unsafe in a different sense:
since it keeps the assumption $\forall x{.}P(x)$, it is prone to looping.
In classical first-order logic, all rules are safe except those mentioned
above.
The safe/unsafe distinction is vague, and may be regarded merely as a way
of giving some rules priority over others.  One could argue that
$({\disj}E)$ is unsafe, because repeated application of it could generate
exponentially many subgoals.  Induction rules are unsafe because inductive
proofs are difficult to set up automatically.  Any inference that
instantiates an unknown in the proof state is unsafe --- thus
\ttindex{match_tac} must be used, rather than \ttindex{resolve_tac}.  Even
proof by assumption is unsafe if it instantiates unknowns shared with other
subgoals --- thus \ttindex{eq_assume_tac} must be used, rather than
\ttindex{assume_tac}.
Classical rule sets belong to the abstract type \mltydx{claset}, which
supports the following operations (provided the classical reasoner is
installed):
\begin{ttbox} 
empty_cs : claset
addSIs   : claset * thm list -> claset                 \hfill{\bf infix 4}
addSEs   : claset * thm list -> claset                 \hfill{\bf infix 4}
addSDs   : claset * thm list -> claset                 \hfill{\bf infix 4}
addIs    : claset * thm list -> claset                 \hfill{\bf infix 4}
addEs    : claset * thm list -> claset                 \hfill{\bf infix 4}
addDs    : claset * thm list -> claset                 \hfill{\bf infix 4}
print_cs : claset -> unit
\end{ttbox}
There are no operations for deletion from a classical set.  The add
operations do not check for repetitions.
\begin{ttdescription}
\item[\ttindexbold{empty_cs}] is the empty classical set.

\item[$cs$ addSIs $rules$] \indexbold{*addSIs}
adds safe introduction~$rules$ to~$cs$.

\item[$cs$ addSEs $rules$] \indexbold{*addSEs}
adds safe elimination~$rules$ to~$cs$.

\item[$cs$ addSDs $rules$] \indexbold{*addSDs}
adds safe destruction~$rules$ to~$cs$.

\item[$cs$ addIs $rules$] \indexbold{*addIs}
adds unsafe introduction~$rules$ to~$cs$.

\item[$cs$ addEs $rules$] \indexbold{*addEs}
adds unsafe elimination~$rules$ to~$cs$.

\item[$cs$ addDs $rules$] \indexbold{*addDs}
adds unsafe destruction~$rules$ to~$cs$.

\item[\ttindexbold{print_cs} $cs$] prints the rules of~$cs$.
\end{ttdescription}
Introduction rules are those that can be applied using ordinary resolution.
The classical set automatically generates their swapped forms, which will
be applied using elim-resolution.  Elimination rules are applied using
elim-resolution.  In a classical set, rules are sorted by the number of new
subgoals they will yield; rules that generate the fewest subgoals will be
tried first (see \S\ref{biresolve_tac}).
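For example, a classical set for a logic with a new connective might be
built as follows, where {\tt myI} and {\tt myE} are hypothetical
introduction and elimination theorems:
\begin{ttbox}
val my_cs = FOL_cs addSIs [myI] addEs [myE];
print_cs my_cs;
\end{ttbox}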
For a given classical set, the proof strategy is simple.  Perform as many
safe inferences as possible; or else, apply certain safe rules, allowing
instantiation of unknowns; or else, apply an unsafe rule.  The tactics may
also apply {\tt hyp_subst_tac}, if they have been set up to do so (see
below).  They may perform a form of Modus Ponens: if there are assumptions
$P\imp Q$ and~$P$, then replace $P\imp Q$ by~$Q$.


\section{The classical tactics}
\index{classical reasoner!tactics}
If installed, the classical module provides several tactics (and other
operations) for simulating the classical sequent calculus.
\subsection{The automatic tactics}
\begin{ttbox} 
fast_tac      : claset -> int -> tactic
best_tac      : claset -> int -> tactic
slow_tac      : claset -> int -> tactic
slow_best_tac : claset -> int -> tactic
\end{ttbox}
These tactics work by applying {\tt step_tac} or {\tt slow_step_tac}
repeatedly.  Their effect is restricted (by {\tt SELECT_GOAL}) to one subgoal;
they either solve this subgoal or fail.  The {\tt slow_} versions are more
powerful but can be much slower.
The best-first tactics are guided by a heuristic function: typically, the
total size of the proof state.  This function is supplied in the functor call
that sets up the classical reasoner.
\begin{ttdescription}
\item[\ttindexbold{fast_tac} $cs$ $i$] applies {\tt step_tac} using
depth-first search, to solve subgoal~$i$.

\item[\ttindexbold{best_tac} $cs$ $i$] applies {\tt step_tac} using
best-first search, to solve subgoal~$i$.

\item[\ttindexbold{slow_tac} $cs$ $i$] applies {\tt slow_step_tac} using
depth-first search, to solve subgoal~$i$.

\item[\ttindexbold{slow_best_tac} $cs$ $i$] applies {\tt slow_step_tac} using
best-first search, to solve subgoal~$i$.
\end{ttdescription}
\subsection{Depth-limited tactics}
\begin{ttbox} 
depth_tac  : claset -> int -> int -> tactic
deepen_tac : claset -> int -> int -> tactic
\end{ttbox}
These work by exhaustive search up to a specified depth.  Unsafe rules are
modified to preserve the formula they act on, so that it can be used
repeatedly.  They can prove more goals than {\tt fast_tac}, etc., but are
much slower, for example when the assumptions have many universal
quantifiers.

The depth limits the number of unsafe steps.  If you can estimate the minimum
number of unsafe steps needed, supply this value as~$m$ to save time.
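For example, if at least two unsafe steps appear necessary, a reasonable
call (for some classical set {\tt my_cs}) is
\begin{ttbox}
by (deepen_tac my_cs 2 1);
\end{ttbox}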
\begin{ttdescription}
\item[\ttindexbold{depth_tac} $cs$ $m$ $i$]
tries to solve subgoal~$i$ by exhaustive search up to depth~$m$.

\item[\ttindexbold{deepen_tac} $cs$ $m$ $i$]
tries to solve subgoal~$i$ by iterative deepening.  It calls {\tt depth_tac}
repeatedly with increasing depths, starting with~$m$.
\end{ttdescription}


\subsection{Single-step tactics}
\begin{ttbox} 
safe_step_tac : claset -> int -> tactic
safe_tac      : claset -> tactic
inst_step_tac : claset -> int -> tactic
step_tac      : claset -> int -> tactic
slow_step_tac : claset -> int -> tactic
\end{ttbox}
The automatic proof procedures call these tactics.  By calling them
yourself, you can execute these procedures one step at a time.
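A typical interactive pattern is to open up the goal using {\tt safe_tac}
and then to proceed one step at a time (here {\tt my_cs} stands for some
classical set):
\begin{ttbox}
by (safe_tac my_cs);
by (step_tac my_cs 1);
\end{ttbox}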
\begin{ttdescription}
\item[\ttindexbold{safe_step_tac} $cs$ $i$] performs a safe step on
subgoal~$i$.  This may include proof by assumption or Modus Ponens (taking
care not to instantiate unknowns), or {\tt hyp_subst_tac}.

\item[\ttindexbold{safe_tac} $cs$] repeatedly performs safe steps on all
subgoals.  It is deterministic, with at most one outcome.  If the automatic
tactics fail, try using {\tt safe_tac} to open up your formula; then you
can replicate certain quantifiers explicitly by applying appropriate rules.

\item[\ttindexbold{inst_step_tac} $cs$ $i$] is like {\tt safe_step_tac},
but allows unknowns to be instantiated.

\item[\ttindexbold{step_tac} $cs$ $i$] tries {\tt safe_tac}.  If this
fails, it tries {\tt inst_step_tac}, or applies an unsafe rule from~$cs$.
This is the basic step of the proof procedure.

\item[\ttindexbold{slow_step_tac}]
resembles {\tt step_tac}, but allows backtracking between using safe
rules with instantiation ({\tt inst_step_tac}) and using unsafe rules.
The resulting search space is larger.
\end{ttdescription}


\subsection{Other useful tactics}
\index{tactics!for contradiction}
\index{tactics!for Modus Ponens}
\begin{ttbox} 
contr_tac    :             int -> tactic
mp_tac       :             int -> tactic
eq_mp_tac    :             int -> tactic
swap_res_tac : thm list -> int -> tactic
\end{ttbox}
These can be used in the body of a specialized search.
\begin{ttdescription}
\item[\ttindexbold{contr_tac} {\it i}]\index{assumptions!contradictory}
solves subgoal~$i$ by detecting a contradiction among two assumptions of
the form $P$ and~$\neg P$, or fails.  It may instantiate unknowns.  The
tactic can produce multiple outcomes, enumerating all possible
contradictions.

\item[\ttindexbold{mp_tac} {\it i}]
is like {\tt contr_tac}, but also attempts to perform Modus Ponens in
subgoal~$i$.  If there are assumptions $P\imp Q$ and~$P$, then it replaces
$P\imp Q$ by~$Q$.  It may instantiate unknowns.  It fails if it can do
nothing.

\item[\ttindexbold{eq_mp_tac} {\it i}]
is like {\tt mp_tac} {\it i}, but may not instantiate unknowns --- thus, it
is safe.

\item[\ttindexbold{swap_res_tac} {\it thms} {\it i}] refines subgoal~$i$ of
the proof state using {\it thms}, which should be a list of introduction
rules.  First, it attempts to solve the goal using {\tt assume_tac} or
{\tt contr_tac}.  It then attempts to apply each rule in turn, attempting
resolution and also elim-resolution with the swapped form.
\end{ttdescription}
\subsection{Creating swapped rules}
\begin{ttbox} 
swapify   : thm list -> thm list
joinrules : thm list * thm list -> (bool * thm) list
\end{ttbox}
\begin{ttdescription}
\item[\ttindexbold{swapify} {\it thms}] returns a list consisting of the
swapped versions of~{\it thms}, regarded as introduction rules.

\item[\ttindexbold{joinrules} ({\it intrs}, {\it elims})]
joins introduction rules, their swapped versions, and elimination rules for
use with \ttindex{biresolve_tac}.  Each rule is paired with~{\tt false}
(indicating ordinary resolution) or~{\tt true} (indicating
elim-resolution).
\end{ttdescription}
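For instance, the pairs produced by {\tt joinrules} may be supplied
directly to {\tt biresolve_tac}; assuming the {\tt FOL} rules {\tt conjI}
and {\tt disjE}, one might write
\begin{ttbox}
by (biresolve_tac (joinrules ([conjI], [disjE])) 1);
\end{ttbox}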
\section{Setting up the classical reasoner}
\index{classical reasoner!setting up}
Isabelle's classical object-logics, including {\tt FOL} and {\tt HOL}, have
the classical reasoner already set up.  When defining a new classical logic,
you should set up the reasoner yourself.  It consists of the \ML{} functor
\ttindex{ClassicalFun}, which takes the argument
signature {\tt CLASSICAL_DATA}:
\begin{ttbox} 
signature CLASSICAL_DATA =
  sig
  val mp             : thm
  val not_elim       : thm
  val swap           : thm
  val sizef          : thm -> int
  val hyp_subst_tacs : (int -> tactic) list
  end;
\end{ttbox}
Thus, the functor requires the following items:
\begin{ttdescription}
\item[\tdxbold{mp}] should be the Modus Ponens rule
$\List{\Var{P}\imp\Var{Q};\; \Var{P}} \Imp \Var{Q}$.

\item[\tdxbold{not_elim}] should be the contradiction rule
$\List{\neg\Var{P};\; \Var{P}} \Imp \Var{R}$.

\item[\tdxbold{swap}] should be the swap rule
$\List{\neg \Var{P}; \; \neg \Var{R}\Imp \Var{P}} \Imp \Var{R}$.

\item[\ttindexbold{sizef}] is the heuristic function used for best-first
search.  It should estimate the size of the remaining subgoals.  A good
heuristic function is \ttindex{size_of_thm}, which measures the size of the
proof state.  Another size function might ignore certain subgoals (say,
those concerned with type checking).  A heuristic function might simply
count the subgoals.
\item[\ttindexbold{hyp_subst_tacs}] is a list of tactics for substitution in
the hypotheses, typically created by \ttindex{HypsubstFun} (see
Chapter~\ref{substitution}).  This list can, of course, be empty.  The
tactics are assumed to be safe!
\end{ttdescription}
The functor is not at all sensitive to the formalization of the
object-logic.  It does not even examine the rules, but merely applies them
according to its fixed strategy.  The functor resides in {\tt
Provers/classical.ML} in the Isabelle distribution directory.
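An instantiation for a logic resembling {\tt FOL} might look like the
following sketch, where {\tt mp}, {\tt notE} and {\tt swap} are assumed to
be the corresponding theorems of the object-logic:
\begin{ttbox}
structure Classical = ClassicalFun
 (struct
    val mp             = mp
    val not_elim       = notE
    val swap           = swap
    val sizef          = size_of_thm
    val hyp_subst_tacs = []
  end);
\end{ttbox}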
\index{classical reasoner|)}