%% $Id$

\part{Foundations} 
The following sections discuss Isabelle's logical foundations in detail:
representing logical syntax in the typed $\lambda$-calculus; expressing
inference rules in Isabelle's meta-logic; combining rules by resolution.

If you wish to use Isabelle immediately, please turn to
page~\pageref{chap:getting}.  You can always read about foundations later,
either by returning to this point or by looking up particular items in the
index.

\begin{figure} 
\begin{eqnarray*}
  \neg P   & \hbox{abbreviates} & P\imp\bot \\
  P\bimp Q & \hbox{abbreviates} & (P\imp Q) \conj (Q\imp P)
\end{eqnarray*}
\vskip 4ex

\(\begin{array}{c@{\qquad\qquad}c}
  \infer[({\conj}I)]{P\conj Q}{P & Q}  &
  \infer[({\conj}E1)]{P}{P\conj Q} \qquad 
  \infer[({\conj}E2)]{Q}{P\conj Q} \\[4ex]
  \infer[({\disj}I1)]{P\disj Q}{P} \qquad 
  \infer[({\disj}I2)]{P\disj Q}{Q} &
  \infer[({\disj}E)]{R}{P\disj Q & \infer*{R}{[P]} & \infer*{R}{[Q]}}\\[4ex]
  \infer[({\imp}I)]{P\imp Q}{\infer*{Q}{[P]}} &
  \infer[({\imp}E)]{Q}{P\imp Q & P}  \\[4ex]
  &
  \infer[({\bot}E)]{P}{\bot}\\[4ex]
  \infer[({\forall}I)*]{\forall x.P}{P} &
  \infer[({\forall}E)]{P[t/x]}{\forall x.P} \\[3ex]
  \infer[({\exists}I)]{\exists x.P}{P[t/x]} &
  \infer[({\exists}E)*]{Q}{{\exists x.P} & \infer*{Q}{[P]} } \\[3ex]
  {t=t} \,(refl)   &  \vcenter{\infer[(subst)]{P[u/x]}{t=u & P[t/x]}} 
\end{array} \)

\bigskip\bigskip
*{\em Eigenvariable conditions\/}:

$\forall I$: provided $x$ is not free in the assumptions

$\exists E$: provided $x$ is not free in $Q$ or any assumption except $P$
\caption{Intuitionistic first-order logic} \label{fol-fig}
\end{figure}

\section{Formalizing logical syntax in Isabelle}\label{sec:logical-syntax}
\index{first-order logic}

Figure~\ref{fol-fig} presents intuitionistic first-order logic,
including equality.  Let us see how to formalize
this logic in Isabelle, illustrating the main features of Isabelle's
polymorphic meta-logic.

\index{lambda calc@$\lambda$-calculus} 
Isabelle represents syntax using the simply typed $\lambda$-calculus.  We
declare a type for each syntactic category of the logic.  We declare a
constant for each symbol of the logic, giving each $n$-place operation an
$n$-argument curried function type.  Most importantly,
$\lambda$-abstraction represents variable binding in quantifiers.

\index{types!syntax of}\index{types!function}\index{*fun type} 
\index{type constructors}
Isabelle has \ML-style polymorphic types such as~$(\alpha)list$, where
$list$ is a type constructor and $\alpha$ is a type variable; for example,
$(bool)list$ is the type of lists of booleans.  Function types have the
form $(\sigma,\tau)fun$ or $\sigma\To\tau$, where $\sigma$ and $\tau$ are
types.  Curried function types may be abbreviated:
\[  \sigma@1\To (\cdots \sigma@n\To \tau\cdots)  \quad \hbox{as} \quad
[\sigma@1, \ldots, \sigma@n] \To \tau \]
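For instance, the curried type of a two-place operation on the natural
numbers may be written in either form:
\[ nat\To(nat\To nat)  \quad \hbox{abbreviated as} \quad  [nat,nat]\To nat. \]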

\index{terms!syntax of} The syntax for terms is summarised below.
Note that there are two versions of function application syntax
available in Isabelle: either $t\,u$, which is the usual form for
higher-order languages, or $t(u)$, trying to look more like
first-order.  The latter syntax is used throughout the manual.
\[ 
\index{lambda abs@$\lambda$-abstractions}\index{function applications}
\begin{array}{ll}
  t :: \tau   & \hbox{type constraint, on a term or bound variable} \\
  \lambda x.t   & \hbox{abstraction} \\
  \lambda x@1\ldots x@n.t
        & \hbox{curried abstraction, $\lambda x@1. \ldots \lambda x@n.t$} \\
  t(u)          & \hbox{application} \\
  t (u@1, \ldots, u@n) & \hbox{curried application, $t(u@1)\ldots(u@n)$} 
\end{array}
\]


\subsection{Simple types and constants}\index{types!simple|bold} 

The syntactic categories of our logic (Fig.\ts\ref{fol-fig}) are {\bf
  formulae} and {\bf terms}.  Formulae denote truth values, so (following
tradition) let us call their type~$o$.  To allow~0 and~$Suc(t)$ as terms,
let us declare a type~$nat$ of natural numbers.  Later, we shall see
how to admit terms of other types.

\index{constants}\index{*nat type}\index{*o type}
After declaring the types~$o$ and~$nat$, we may declare constants for the
symbols of our logic.  Since $\bot$ denotes a truth value (falsity) and 0
denotes a number, we put \begin{eqnarray*}
  \bot  & :: & o \\
  0     & :: & nat.
\end{eqnarray*}
If a symbol requires operands, the corresponding constant must have a
function type.  In our logic, the successor function
($Suc$) is from natural numbers to natural numbers, negation ($\neg$) is a
function from truth values to truth values, and the binary connectives are
curried functions taking two truth values as arguments: 
\begin{eqnarray*}
  Suc    & :: & nat\To nat  \\
  {\neg} & :: & o\To o      \\
  \conj,\disj,\imp,\bimp  & :: & [o,o]\To o 
\end{eqnarray*}
The binary connectives can be declared as infixes, with appropriate
precedences, so that we write $P\conj Q\disj R$ instead of
$\disj(\conj(P,Q), R)$.
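
Under these declarations, terms and formulae are type-checked like any
other $\lambda$-terms; for instance, $Suc(Suc(0))$ receives type $nat$,
while $\neg(\bot\conj\bot)$ receives type $o$:
\[ Suc(Suc(0)) :: nat  \qquad\qquad  \neg(\bot\conj\bot) :: o. \]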

Section~\ref{sec:defining-theories} below describes the syntax of Isabelle
theory files and illustrates it by extending our logic with mathematical
induction.


\subsection{Polymorphic types and constants} \label{polymorphic}
\index{types!polymorphic|bold}
\index{equality!polymorphic}
\index{constants!polymorphic}

Which type should we assign to the equality symbol?  If we tried
$[nat,nat]\To o$, then equality would be restricted to the natural
numbers; we should have to declare different equality symbols for each
type.  Isabelle's type system is polymorphic, so we could declare
\begin{eqnarray*}
  {=}  & :: & [\alpha,\alpha]\To o,
\end{eqnarray*}
where the type variable~$\alpha$ ranges over all types.
But this is also wrong.  The declaration is too polymorphic; $\alpha$
includes types like~$o$ and $nat\To nat$.  Thus, it admits
$\bot=\neg(\bot)$ and $Suc=Suc$ as formulae, which is acceptable in
higher-order logic but not in first-order logic.

Isabelle's {\bf type classes}\index{classes} control
polymorphism~\cite{nipkow-prehofer}.  Each type variable belongs to a
class, which denotes a set of types.  Classes are partially ordered by the
subclass relation, which is essentially the subset relation on the sets of
types.  They closely resemble the classes of the functional language
Haskell~\cite{haskell-tutorial,haskell-report}.

\index{*logic class}\index{*term class}
Isabelle provides the built-in class $logic$, which consists of the logical
types: the ones we want to reason about.  Let us declare a class $term$, to
consist of all legal types of terms in our logic.  The subclass structure
is now $term\le logic$.

\index{*nat type}
We put $nat$ in class $term$ by declaring $nat{::}term$.  We declare the
equality constant by
\begin{eqnarray*}
  {=}  & :: & [\alpha{::}term,\alpha]\To o 
\end{eqnarray*}
where $\alpha{::}term$ constrains the type variable~$\alpha$ to class
$term$.  Such type variables resemble Standard~\ML's equality type
variables.

We give~$o$ and function types the class $logic$ rather than~$term$, since
they are not legal types for terms.  We may introduce new types of class
$term$ --- for instance, type $string$ or $real$ --- at any time.  We can
even declare type constructors such as~$list$, and state that type
$(\tau)list$ belongs to class~$term$ provided $\tau$ does; equality
applies to lists of natural numbers but not to lists of formulae.  We may
summarize this paragraph by a set of {\bf arity declarations} for type
constructors:\index{arities!declaring}
\begin{eqnarray*}\index{*o type}\index{*fun type}
  o             & :: & logic \\
  fun           & :: & (logic,logic)logic \\
  nat, string, real     & :: & term \\
  list          & :: & (term)term
\end{eqnarray*}
(Recall that $fun$ is the type constructor for function types.)
In \rmindex{higher-order logic}, equality does apply to truth values and
functions;  this requires the arity declarations ${o::term}$
and ${fun::(term,term)term}$.  The class system can also handle
overloading.\index{overloading|bold} We could declare $arith$ to be the
subclass of $term$ consisting of the `arithmetic' types, such as~$nat$.
Then we could declare the operators
\begin{eqnarray*}
  {+},{-},{\times},{/}  & :: & [\alpha{::}arith,\alpha]\To \alpha
\end{eqnarray*}
If we declare new types $real$ and $complex$ of class $arith$, then we
in effect have three sets of operators:
\begin{eqnarray*}
  {+},{-},{\times},{/}  & :: & [nat,nat]\To nat \\
  {+},{-},{\times},{/}  & :: & [real,real]\To real \\
  {+},{-},{\times},{/}  & :: & [complex,complex]\To complex 
\end{eqnarray*}
Isabelle will regard these as distinct constants, each of which can be defined
separately.  We could even introduce the type $(\alpha)vector$ and declare
its arity as $(arith)arith$.  Then we could declare the constant
\begin{eqnarray*}
  {+}  & :: & [(\alpha)vector,(\alpha)vector]\To (\alpha)vector 
\end{eqnarray*}
and specify it in terms of ${+} :: [\alpha,\alpha]\To \alpha$.
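
As a small illustrative instance of the arity declarations above,
$(nat)list$ belongs to class $term$ because $nat$ does, so the equality
constant may be used at the instance
\begin{eqnarray*}
  {=}  & :: & [(nat)list,(nat)list]\To o,
\end{eqnarray*}
whereas instantiating $\alpha$ to $(o)list$ is rejected, since $o$ belongs
only to class $logic$; this is why equality applies to lists of natural
numbers but not to lists of formulae.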

A type variable may belong to any finite number of classes.  Suppose that
we had declared yet another class $ord \le term$, the class of all
`ordered' types, and a constant
\begin{eqnarray*}
  {\le}  & :: & [\alpha{::}ord,\alpha]\To o.
\end{eqnarray*}
In this context the variable $x$ in $x \le (x+x)$ will be assigned type
$\alpha{::}\{arith,ord\}$, which means $\alpha$ belongs to both $arith$ and
$ord$.  Semantically the set $\{arith,ord\}$ should be understood as the
intersection of the sets of types represented by $arith$ and $ord$.  Such
intersections of classes are called \bfindex{sorts}.  The empty
intersection of classes, $\{\}$, contains all types and is thus the {\bf
  universal sort}.

Even with overloading, each term has a unique, most general type.  For this
to be possible, the class and type declarations must satisfy certain
technical constraints; see 
\iflabelundefined{sec:ref-defining-theories}%
           {Sect.\ Defining Theories in the {\em Reference Manual}}%
           {\S\ref{sec:ref-defining-theories}}.


\subsection{Higher types and quantifiers}
\index{types!higher|bold}\index{quantifiers}
Quantifiers are regarded as operations upon functions.  Ignoring polymorphism
for the moment, consider the formula $\forall x. P(x)$, where $x$ ranges
over type~$nat$.  This is true if $P(x)$ is true for all~$x$.  Abstracting
$P(x)$ into a function, this is the same as saying that $\lambda x.P(x)$
returns true for all arguments.  Thus, the universal quantifier can be
represented by a constant
\begin{eqnarray*}
  \forall  & :: & (nat\To o) \To o,
\end{eqnarray*}
which is essentially an infinitary truth table.  The representation of $\forall
x. P(x)$ is $\forall(\lambda x. P(x))$.  

The existential quantifier is treated
in the same way.  Other binding operators are also easily handled; for
instance, the summation operator $\Sigma@{k=i}^j f(k)$ can be represented as
$\Sigma(i,j,\lambda k.f(k))$, where
\begin{eqnarray*}
  \Sigma  & :: & [nat,nat, nat\To nat] \To nat.
\end{eqnarray*}
Quantifiers may be polymorphic.  We may define $\forall$ and~$\exists$ over
all legal types of terms, not just the natural numbers, and
allow summations over all arithmetic types:
\begin{eqnarray*}
   \forall,\exists      & :: & (\alpha{::}term\To o) \To o \\
   \Sigma               & :: & [nat,nat, nat\To \alpha{::}arith] \To \alpha
\end{eqnarray*}
Observe that the index variables still have type $nat$, while the values
being summed may belong to any arithmetic type.
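
For example, a nested quantification is represented by nested abstractions:
the formula $\forall x.\exists y. x\le y$ becomes the term
\[ \forall(\lambda x. \exists(\lambda y. x\le y)), \]
in which each quantifier is applied to a $\lambda$-abstraction over its
bound variable.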


\section{Formalizing logical rules in Isabelle}
\index{meta-implication|bold}
\index{meta-quantifiers|bold}
\index{meta-equality|bold}

Object-logics are formalized by extending Isabelle's
meta-logic~\cite{paulson-found}, which is intuitionistic higher-order logic.
The meta-level connectives are {\bf implication}, the {\bf universal
  quantifier}, and {\bf equality}.
\begin{itemize}
  \item The implication \(\phi\Imp \psi\) means `\(\phi\) implies
\(\psi\)', and expresses logical {\bf entailment}.  

  \item The quantification \(\Forall x.\phi\) means `\(\phi\) is true for
all $x$', and expresses {\bf generality} in rules and axiom schemes. 

\item The equality \(a\equiv b\) means `$a$ equals $b$', for expressing
  {\bf definitions} (see~\S\ref{definitions}).\index{definitions}
  Equalities left over from the unification process, so-called {\bf
    flex-flex constraints},\index{flex-flex constraints} are written $a\qeq
  b$.  The two equality symbols have the same logical meaning.

\end{itemize}
The syntax of the meta-logic is formalized in the same manner
as object-logics, using the simply typed $\lambda$-calculus.  Analogous to
type~$o$ above, there is a built-in type $prop$ of meta-level truth values.
Meta-level formulae will have this type.  Type $prop$ belongs to
class~$logic$; also, $\sigma\To\tau$ belongs to $logic$ provided $\sigma$
and $\tau$ do.  Here are the types of the built-in connectives:
\begin{eqnarray*}\index{*prop type}\index{*logic class}
  \Imp     & :: & [prop,prop]\To prop \\
  \Forall  & :: & (\alpha{::}logic\To prop) \To prop \\
  {\equiv} & :: & [\alpha{::}\{\},\alpha]\To prop \\
  \qeq & :: & [\alpha{::}\{\},\alpha]\To prop
\end{eqnarray*}
The polymorphism in $\Forall$ is restricted to class~$logic$ to exclude
certain types, those used just for parsing.  The type variable
$\alpha{::}\{\}$ ranges over the universal sort.

In our formalization of first-order logic, we declared a type~$o$ of
object-level truth values, rather than using~$prop$ for this purpose.  If
we declared the object-level connectives to have types such as
${\neg}::prop\To prop$, then these connectives would be applicable to
meta-level formulae.  Keeping $prop$ and $o$ as separate types maintains
the distinction between the meta-level and the object-level.  To formalize
the inference rules, we shall need to relate the two levels; accordingly,
we declare the constant
\index{*Trueprop constant}
\begin{eqnarray*}
  Trueprop & :: & o\To prop.
\end{eqnarray*}
We may regard $Trueprop$ as a meta-level predicate, reading $Trueprop(P)$ as
`$P$ is true at the object-level.'  Put another way, $Trueprop$ is a coercion
from $o$ to $prop$.


\subsection{Expressing propositional rules}
\index{rules!propositional}
We shall illustrate the use of the meta-logic by formalizing the rules of
Fig.\ts\ref{fol-fig}.  Each object-level rule is expressed as a meta-level
axiom. 

One of the simplest rules is $(\conj E1)$.  Making
everything explicit, its formalization in the meta-logic is
$$
\Forall P\;Q. Trueprop(P\conj Q) \Imp Trueprop(P).   \eqno(\conj E1)
$$
This may look formidable, but it has an obvious reading: for all object-level
truth values $P$ and~$Q$, if $P\conj Q$ is true then so is~$P$.  The
reading is correct because the meta-logic has simple models, where
types denote sets and $\Forall$ really means `for all.'

\index{*Trueprop constant}
Isabelle adopts notational conventions to ease the writing of rules.  We may
hide the occurrences of $Trueprop$ by making it an implicit coercion.
Outer universal quantifiers may be dropped.  Finally, the nested implication
\index{meta-implication}
\[  \phi@1\Imp(\cdots \phi@n\Imp\psi\cdots) \]
may be abbreviated as $\List{\phi@1; \ldots; \phi@n} \Imp \psi$, which
formalizes a rule of $n$~premises.

Using these conventions, the conjunction rules become the following axioms.
These fully specify the properties of~$\conj$:
$$ \List{P; Q} \Imp P\conj Q                 \eqno(\conj I) $$
$$ P\conj Q \Imp P  \qquad  P\conj Q \Imp Q  \eqno(\conj E1,2) $$
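To see the conventions at work, note that the first of these axioms is
merely an abbreviation: restoring the hidden coercion and the outer
quantifiers, $(\conj I)$ stands for the meta-level formula
\[ \Forall P\;Q.\; Trueprop(P)\Imp (Trueprop(Q)\Imp Trueprop(P\conj Q)). \]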

\noindent
Next, consider the disjunction rules.  The discharge of assumptions in
$(\disj E)$ is expressed  using $\Imp$:
\index{assumptions!discharge of}%
$$ P \Imp P\disj Q  \qquad  Q \Imp P\disj Q  \eqno(\disj I1,2) $$
$$ \List{P\disj Q; P\Imp R; Q\Imp R} \Imp R  \eqno(\disj E) $$
%
To understand this treatment of assumptions in natural
deduction, look at implication.  The rule $({\imp}I)$ is the classic
example of natural deduction: to prove that $P\imp Q$ is true, assume $P$
is true and show that $Q$ must then be true.  More concisely, if $P$
implies $Q$ (at the meta-level), then $P\imp Q$ is true (at the
object-level).  Showing the coercion explicitly, this is formalized as
\[ (Trueprop(P)\Imp Trueprop(Q)) \Imp Trueprop(P\imp Q). \]
The rule $({\imp}E)$ is straightforward; hiding $Trueprop$, the axioms to
specify $\imp$ are 
$$ (P \Imp Q)  \Imp  P\imp Q   \eqno({\imp}I) $$
$$ \List{P\imp Q; P}  \Imp Q.  \eqno({\imp}E) $$

\noindent
Finally, the intuitionistic contradiction rule is formalized as the axiom
$$ \bot \Imp P.   \eqno(\bot E) $$

\begin{warn}
Earlier versions of Isabelle, and certain
papers~\cite{paulson-found,paulson700}, use $\List{P}$ to mean $Trueprop(P)$.
\end{warn}

\subsection{Quantifier rules and substitution}
\index{quantifiers}\index{rules!quantifier}\index{substitution|bold}
\index{variables!bound}\index{lambda abs@$\lambda$-abstractions}
\index{function applications}

Isabelle expresses variable binding using $\lambda$-abstraction; for instance,
$\forall x.P$ is formalized as $\forall(\lambda x.P)$.  Recall that $F(t)$
is Isabelle's syntax for application of the function~$F$ to the argument~$t$;
it is not a meta-notation for substitution.  On the other hand, a substitution
will take place if $F$ has the form $\lambda x.P$;  Isabelle transforms
$(\lambda x.P)(t)$ to~$P[t/x]$ by $\beta$-conversion.  Thus, we can express
inference rules that involve substitution for bound variables.
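For example, if $F$ is the abstraction $\lambda x. P(x)\conj Q$ then the
application $F(t)$ undergoes $\beta$-conversion:
\[ (\lambda x. P(x)\conj Q)(t) \quad\hbox{reduces to}\quad P(t)\conj Q, \]
substituting $t$ for the bound variable~$x$ while leaving $Q$ untouched.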

\index{parameters|bold}\index{eigenvariables|see{parameters}}
A logic may attach provisos to certain of its rules, especially quantifier
rules.  We cannot hope to formalize arbitrary provisos.  Fortunately, those
typical of quantifier rules always have the same form, namely `$x$ not free in
\ldots {\it (some set of formulae)},' where $x$ is a variable (called a {\bf
parameter} or {\bf eigenvariable}) in some premise.  Isabelle treats
provisos using~$\Forall$, its inbuilt notion of `for all'.
\index{meta-quantifiers}

The purpose of the proviso `$x$ not free in \ldots' is
to ensure that the premise may not make assumptions about the value of~$x$,
and therefore holds for all~$x$.  We formalize $(\forall I)$ by
\[ \left(\Forall x. Trueprop(P(x))\right) \Imp Trueprop(\forall x.P(x)). \]
This means, `if $P(x)$ is true for all~$x$, then $\forall x.P(x)$ is true.'
The $\forall E$ rule exploits $\beta$-conversion.  Hiding $Trueprop$, the
$\forall$ axioms are
$$ \left(\Forall x. P(x)\right)  \Imp  \forall x.P(x)   \eqno(\forall I) $$
$$ (\forall x.P(x))  \Imp P(t).  \eqno(\forall E) $$

\noindent
We have defined the object-level universal quantifier~($\forall$)
using~$\Forall$.  But we do not require meta-level counterparts of all the
connectives of the object-logic!  Consider the existential quantifier: 
$$ P(t)  \Imp  \exists x.P(x)  \eqno(\exists I) $$
$$ \List{\exists x.P(x);\; \Forall x. P(x)\Imp Q} \Imp Q  \eqno(\exists E) $$
Let us verify $(\exists E)$ semantically.  Suppose that the premises
hold; since $\exists x.P(x)$ is true, we may choose an~$a$ such that $P(a)$ is
true.  Instantiating $\Forall x. P(x)\Imp Q$ with $a$ yields $P(a)\Imp Q$, and
we obtain the desired conclusion, $Q$.

The treatment of substitution deserves mention.  The rule
\[ \infer{P[u/t]}{t=u & P} \]
would be hard to formalize in Isabelle.  It calls for replacing~$t$ by $u$
throughout~$P$, which cannot be expressed using $\beta$-conversion.  Our
rule~$(subst)$ uses~$P$ as a template for substitution, inferring $P[u/x]$
from~$P[t/x]$.  When we formalize this as an axiom, the template becomes a
function variable:
$$ \List{t=u; P(t)} \Imp P(u).  \eqno(subst) $$
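For instance, taking the template to be $P \equiv \lambda x.\, x+0=x$, the
axiom $(subst)$ instantiates (after $\beta$-conversion) to
\[ \List{t=u;\; t+0=t} \Imp u+0=u, \]
where both occurrences of the template argument are replaced, because both
arise from applying~$P$.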


\subsection{Signatures and theories}
\index{signatures|bold}

A {\bf signature} contains the information necessary for type-checking,
parsing and pretty printing a term.  It specifies type classes and their
relationships, types and their arities, constants and their types, etc.  It
also contains grammar rules, specified using mixfix declarations.

Two signatures can be merged provided their specifications are compatible ---
they must not, for example, assign different types to the same constant.
Under similar conditions, a signature can be extended.  Signatures are
managed internally by Isabelle; users seldom encounter them.

\index{theories|bold} A {\bf theory} consists of a signature plus a collection
of axioms.  The Pure theory contains only the meta-logic.  Theories can be
combined provided their signatures are compatible.  A theory definition
extends an existing theory with further signature specifications --- classes,
types, constants and mixfix declarations --- plus lists of axioms and
definitions etc., expressed as strings to be parsed.  A theory can formalize a
small piece of mathematics, such as lists and their operations, or an entire
logic.  A mathematical development typically involves many theories in a
hierarchy.  For example, the Pure theory could be extended to form a theory
for Fig.\ts\ref{fol-fig}; this could be extended in two separate ways to form
a theory for natural numbers and a theory for lists; the union of these two
could be extended into a theory defining the length of a list:
\begin{tt}
\[
\begin{array}{c@{}c@{}c@{}c@{}c}
     {}   &     {}   &\hbox{Pure}&     {}  &     {}  \\
     {}   &     {}   &  \downarrow &     {}   &     {}   \\
     {}   &     {}   &\hbox{FOL} &     {}   &     {}   \\
     {}   & \swarrow &     {}    & \searrow &     {}   \\
 \hbox{Nat} &   {}   &     {}    &     {}   & \hbox{List} \\
     {}   & \searrow &     {}    & \swarrow &     {}   \\
     {}   &     {} &\hbox{Nat}+\hbox{List}&  {}   &     {}   \\
     {}   &     {}   &  \downarrow &     {}   &     {}   \\
     {}   &     {} & \hbox{Length} &  {}   &     {}
\end{array}
\]
\end{tt}%
Each Isabelle proof typically works within a single theory, which is
associated with the proof state.  However, many different theories may
coexist at the same time, and you may work in each of these during a single
session.  

\begin{warn}\index{constants!clashes with variables}%
  Confusing problems arise if you work in the wrong theory.  Each theory
  defines its own syntax.  An identifier may be regarded in one theory as a
  constant and in another as a variable, for example.
\end{warn}

\section{Proof construction in Isabelle}
I have elsewhere described the meta-logic and demonstrated it by
formalizing first-order logic~\cite{paulson-found}.  There is a one-to-one
correspondence between meta-level proofs and object-level proofs.  To each
use of a meta-level axiom, such as $(\forall I)$, there is a use of the
corresponding object-level rule.  Object-level assumptions and parameters
have meta-level counterparts.  The meta-level formalization is {\bf
  faithful}, admitting no incorrect object-level inferences, and {\bf
  adequate}, admitting all correct object-level inferences.  These
properties must be demonstrated separately for each object-logic.

The meta-logic is defined by a collection of inference rules, including
equational rules for the $\lambda$-calculus and logical rules.  The rules
for~$\Imp$ and~$\Forall$ resemble those for~$\imp$ and~$\forall$ in
Fig.\ts\ref{fol-fig}.  Proofs performed using the primitive meta-rules
would be lengthy; Isabelle proofs normally use certain derived rules.
{\bf Resolution}, in particular, is convenient for backward proof.

Unification is central to theorem proving.  It supports quantifier
reasoning by allowing certain `unknown' terms to be instantiated later,
possibly in stages.  When proving that the time required to sort $n$
integers is proportional to~$n^2$, we need not state the constant of
proportionality; when proving that a hardware adder will deliver the sum of
its inputs, we need not state how many clock ticks will be required.  Such
quantities often emerge from the proof.

Isabelle provides {\bf schematic variables}, or {\bf
  unknowns},\index{unknowns} for unification.  Logically, unknowns are free
variables.  But while ordinary variables remain fixed, unification may
instantiate unknowns.  Unknowns are written with a ?\ prefix and are
frequently subscripted: $\Var{a}$, $\Var{a@1}$, $\Var{a@2}$, \ldots,
$\Var{P}$, $\Var{P@1}$, \ldots.

Recall that an inference rule of the form
\[ \infer{\phi}{\phi@1 & \ldots & \phi@n} \]
is formalized in Isabelle's meta-logic as the axiom
$\List{\phi@1; \ldots; \phi@n} \Imp \phi$.\index{resolution}
Such axioms resemble Prolog's Horn clauses, and can be combined by
resolution --- Isabelle's principal proof method.  Resolution yields both
forward and backward proof.  Backward proof works by unifying a goal with
the conclusion of a rule, whose premises become new subgoals.  Forward proof
works by unifying theorems with the premises of a rule, deriving a new theorem.

Isabelle formulae require an extended notion of resolution.
They differ from Horn clauses in two major respects:
\begin{itemize}
  \item They are written in the typed $\lambda$-calculus, and therefore must be
resolved using higher-order unification.

\item The constituents of a clause need not be atomic formulae.  Any
  formula of the form $Trueprop(\cdots)$ is atomic, but axioms such as
  ${\imp}I$ and $\forall I$ contain non-atomic formulae.
\end{itemize}
Isabelle has little in common with classical resolution theorem provers
such as Otter~\cite{wos-bledsoe}.  At the meta-level, Isabelle proves
theorems in their positive form, not by refutation.  However, an
object-logic that includes a contradiction rule may employ a refutation
proof procedure.


\subsection{Higher-order unification}
\index{unification!higher-order|bold}
Unification is equation solving.  The solution of $f(\Var{x},c) \qeq
f(d,\Var{y})$ is $\Var{x}\equiv d$ and $\Var{y}\equiv c$.  {\bf
Higher-order unification} is equation solving for typed $\lambda$-terms.
To handle $\beta$-conversion, it must reduce $(\lambda x.t)u$ to $t[u/x]$.
That is easy --- in the typed $\lambda$-calculus, all reduction sequences
terminate at a normal form.  But it must guess the unknown
function~$\Var{f}$ in order to solve the equation
\begin{equation} \label{hou-eqn}
 \Var{f}(t) \qeq g(u@1,\ldots,u@k).
\end{equation}
Huet's~\cite{huet75} search procedure solves equations by imitation and
projection.  {\bf Imitation} makes~$\Var{f}$ apply the leading symbol (if a
constant) of the right-hand side.  To solve equation~(\ref{hou-eqn}), it
guesses
\[ \Var{f} \equiv \lambda x. g(\Var{h@1}(x),\ldots,\Var{h@k}(x)), \]
where $\Var{h@1}$, \ldots, $\Var{h@k}$ are new unknowns.  Assuming there are no
other occurrences of~$\Var{f}$, equation~(\ref{hou-eqn}) simplifies to the
set of equations
\[ \Var{h@1}(t)\qeq u@1 \quad\ldots\quad \Var{h@k}(t)\qeq u@k. \]
If the procedure solves these equations, instantiating $\Var{h@1}$, \ldots,
$\Var{h@k}$, then it yields an instantiation for~$\Var{f}$.

{\bf Projection} makes $\Var{f}$ apply one of its arguments.  To solve
equation~(\ref{hou-eqn}), if $t$ expects~$m$ arguments and delivers a
result of suitable type, it guesses
\[ \Var{f} \equiv \lambda x. x(\Var{h@1}(x),\ldots,\Var{h@m}(x)), \]
where $\Var{h@1}$, \ldots, $\Var{h@m}$ are new unknowns.  Assuming there are no
other occurrences of~$\Var{f}$, equation~(\ref{hou-eqn}) simplifies to the 
equation 
\[ t(\Var{h@1}(t),\ldots,\Var{h@m}(t)) \qeq g(u@1,\ldots,u@k). \]
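
To see both modes at work, consider the equation $\Var{f}(a) \qeq Suc(a)$.
Imitation guesses $\Var{f} \equiv \lambda x. Suc(\Var{h@1}(x))$, reducing
the problem to $\Var{h@1}(a) \qeq a$; this in turn is solved either by
projection ($\Var{h@1}\equiv\lambda x. x$) or by imitation
($\Var{h@1}\equiv\lambda x. a$), yielding the two unifiers
\[ \Var{f}\equiv \lambda x. Suc(x)  \qquad
   \Var{f}\equiv \lambda x. Suc(a). \]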

\begin{warn}\index{unification!incompleteness of}%
Huet's unification procedure is complete.  Isabelle's polymorphic version,
which solves for type unknowns as well as for term unknowns, is incomplete.
The problem is that projection requires type information.  In
equation~(\ref{hou-eqn}), if the type of~$t$ is unknown, then projections
are possible for all~$m\geq0$, and the types of the $\Var{h@i}$ will be
similarly unconstrained.  Therefore, Isabelle never attempts such
projections, and may fail to find unifiers where a type unknown turns out
to be a function type.
\end{warn}

\index{unknowns!function|bold}
Given $\Var{f}(t@1,\ldots,t@n)\qeq u$, Huet's procedure could make up to
$n+1$ guesses.  The search tree and set of unifiers may be infinite.  But
higher-order unification can work effectively, provided you are careful
with {\bf function unknowns}:
\begin{itemize}
  \item Equations with no function unknowns are solved using first-order
unification, extended to treat bound variables.  For example, $\lambda x.x
\qeq \lambda x.\Var{y}$ has no solution because $\Var{y}\equiv x$ would
capture the free variable~$x$.

  \item An occurrence of the term $\Var{f}(x,y,z)$, where the arguments are
distinct bound variables, causes no difficulties.  Its projections can only
match the corresponding variables.

  \item Even an equation such as $\Var{f}(a)\qeq a+a$ is all right.  It has
four solutions, but Isabelle evaluates them lazily, trying projection before
imitation.  The first solution is usually the one desired:
\[ \Var{f}\equiv \lambda x. x+x \quad
   \Var{f}\equiv \lambda x. a+x \quad
   \Var{f}\equiv \lambda x. x+a \quad
   \Var{f}\equiv \lambda x. a+a \]
  \item  Equations such as $\Var{f}(\Var{x},\Var{y})\qeq t$ and
$\Var{f}(\Var{g}(x))\qeq t$ admit vast numbers of unifiers, and must be
avoided. 
\end{itemize}
In problematic cases, you may have to instantiate some unknowns before
invoking unification. 


\subsection{Joining rules by resolution} \label{joining}
\index{resolution|bold}
Let $\List{\psi@1; \ldots; \psi@m} \Imp \psi$ and $\List{\phi@1; \ldots;
\phi@n} \Imp \phi$ be two Isabelle theorems, representing object-level rules. 
Choosing some~$i$ from~1 to~$n$, suppose that $\psi$ and $\phi@i$ have a
higher-order unifier.  Writing $Xs$ for the application of substitution~$s$ to
expression~$X$, this means there is some~$s$ such that $\psi s\equiv \phi@i s$.
By resolution, we may conclude
\[ (\List{\phi@1; \ldots; \phi@{i-1}; \psi@1; \ldots; \psi@m;
          \phi@{i+1}; \ldots; \phi@n} \Imp \phi)s.
\]
The substitution~$s$ may instantiate unknowns in both rules.  In short,
resolution is the following rule:
\[ \infer[(\psi s\equiv \phi@i s)]
         {(\List{\phi@1; \ldots; \phi@{i-1}; \psi@1; \ldots; \psi@m;
          \phi@{i+1}; \ldots; \phi@n} \Imp \phi)s}
         {\List{\psi@1; \ldots; \psi@m} \Imp \psi & &
          \List{\phi@1; \ldots; \phi@n} \Imp \phi}
\]
It operates at the meta-level, on Isabelle theorems, and is justified by
the properties of $\Imp$ and~$\Forall$.  It takes the number~$i$ (for
$1\leq i\leq n$) as a parameter and may yield infinitely many conclusions,
one for each unifier of $\psi$ with $\phi@i$.  Isabelle returns these
conclusions as a sequence (lazy list).

Resolution expects the rules to have no outer quantifiers~($\Forall$).
It may rename or instantiate any schematic variables, but leaves free
variables unchanged.  When constructing a theory, Isabelle puts the
rules into a standard form with all free variables converted into
schematic ones; for instance, $({\imp}E)$ becomes
\[ \List{\Var{P}\imp \Var{Q}; \Var{P}}  \Imp \Var{Q}. 
\]
When resolving two rules, the unknowns in the first rule are renamed, by
subscripting, to make them distinct from the unknowns in the second rule.  To
resolve $({\imp}E)$ with itself, the first copy of the rule becomes
\[ \List{\Var{P@1}\imp \Var{Q@1}; \Var{P@1}}  \Imp \Var{Q@1}. \]
Resolving this with $({\imp}E)$ in the first premise, unifying $\Var{Q@1}$ with
$\Var{P}\imp \Var{Q}$, is the meta-level inference
\[ \infer{\List{\Var{P@1}\imp (\Var{P}\imp \Var{Q}); \Var{P@1}; \Var{P}} 
           \Imp\Var{Q}.}
         {\List{\Var{P@1}\imp \Var{Q@1}; \Var{P@1}}  \Imp \Var{Q@1} & &
          \List{\Var{P}\imp \Var{Q}; \Var{P}}  \Imp \Var{Q}}
\]
Renaming the unknowns in the resolvent, we have derived the
object-level rule\index{rules!derived}
\[ \infer{Q.}{R\imp (P\imp Q)  &  R  &  P}  \]
Joining rules in this fashion is a simple way of proving theorems.  The
derived rules are conservative extensions of the object-logic, and may permit
simpler proofs.  Let us consider another example.  Suppose we have the axiom
$$ \forall x\,y. Suc(x)=Suc(y)\imp x=y. \eqno (inject) $$

\noindent 
The standard form of $(\forall E)$ is
$(\forall x.\Var{P}(x))  \Imp \Var{P}(\Var{t})$.
Resolving $(inject)$ with $(\forall E)$ replaces $\Var{P}$ by
$\lambda x. \forall y. Suc(x)=Suc(y)\imp x=y$ and leaves $\Var{t}$
unchanged, yielding  
\[ \forall y. Suc(\Var{t})=Suc(y)\imp \Var{t}=y. \]
Resolving this with $(\forall E)$ puts a subscript on~$\Var{t}$
and yields
\[ Suc(\Var{t@1})=Suc(\Var{t})\imp \Var{t@1}=\Var{t}. \]
Resolving this with $({\imp}E)$ increases the subscripts and yields
\[ Suc(\Var{t@2})=Suc(\Var{t@1})\Imp \Var{t@2}=\Var{t@1}. 
\]
We have derived the rule
\[ \infer{m=n,}{Suc(m)=Suc(n)} \]
which goes directly from $Suc(m)=Suc(n)$ to $m=n$.  It is handy for simplifying
an equation like $Suc(Suc(Suc(m)))=Suc(Suc(Suc(0)))$.  


\section{Lifting a rule into a context}
The rules $({\imp}I)$ and $(\forall I)$ may seem unsuitable for
resolution.  They have non-atomic premises, namely $P\Imp Q$ and $\Forall
x.P(x)$, while the conclusions of all the rules are atomic (they have the form
$Trueprop(\cdots)$).  Isabelle gets round the problem through a meta-inference
called \bfindex{lifting}.  Let us consider how to construct proofs such as
\[ \infer[({\imp}I)]{P\imp(Q\imp R)}
         {\infer[({\imp}I)]{Q\imp R}
                        {\infer*{R}{[P,Q]}}}
   \qquad
   \infer[(\forall I)]{\forall x\,y.P(x,y)}
         {\infer[(\forall I)]{\forall y.P(x,y)}{P(x,y)}}
\]

\subsection{Lifting over assumptions}
\index{assumptions!lifting over}
Lifting over $\theta\Imp{}$ is the following meta-inference rule:
\[ \infer{\List{\theta\Imp\phi@1; \ldots; \theta\Imp\phi@n} \Imp
          (\theta \Imp \phi)}
         {\List{\phi@1; \ldots; \phi@n} \Imp \phi} \]
This is clearly sound: if $\List{\phi@1; \ldots; \phi@n} \Imp \phi$ is true and
$\theta\Imp\phi@1$, \ldots, $\theta\Imp\phi@n$ and $\theta$ are all true then
$\phi$ must be true.  Iterated lifting over a series of meta-formulae
$\theta@k$, \ldots, $\theta@1$ yields an object-rule whose conclusion is
$\List{\theta@1; \ldots; \theta@k} \Imp \phi$.  Typically the $\theta@i$ are
the assumptions in a natural deduction proof; lifting copies them into a rule's
premises and conclusion.
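For instance, lifting $(\conj I)$ over the assumption $\theta$ yields
\[ \List{\theta\Imp \Var{P};\; \theta\Imp \Var{Q}} \Imp
   (\theta \Imp \Var{P}\conj \Var{Q}), \]
an instance of the meta-inference above with $n=2$.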

When resolving two rules, Isabelle lifts the first one if necessary.  The
standard form of $({\imp}I)$ is
\[ (\Var{P} \Imp \Var{Q})  \Imp  \Var{P}\imp \Var{Q}.   \]
To resolve this rule with itself, Isabelle modifies one copy as follows: it
renames the unknowns to $\Var{P@1}$ and $\Var{Q@1}$, then lifts the rule over
$\Var{P}\Imp{}$ to obtain
\[ (\Var{P}\Imp (\Var{P@1} \Imp \Var{Q@1})) \Imp (\Var{P} \Imp 
   (\Var{P@1}\imp \Var{Q@1})).   \]
Using the $\List{\cdots}$ abbreviation, this can be written as
\[ \List{\List{\Var{P}; \Var{P@1}} \Imp \Var{Q@1}; \Var{P}} 
   \Imp  \Var{P@1}\imp \Var{Q@1}.   \]
Unifying $\Var{P}\Imp \Var{P@1}\imp\Var{Q@1}$ with $\Var{P} \Imp
\Var{Q}$ instantiates $\Var{Q}$ to ${\Var{P@1}\imp\Var{Q@1}}$.
Resolution yields
\[ (\List{\Var{P}; \Var{P@1}} \Imp \Var{Q@1}) \Imp
\Var{P}\imp(\Var{P@1}\imp\Var{Q@1}).   \]
This represents the derived rule
\[ \infer{P\imp(Q\imp R).}{\infer*{R}{[P,Q]}} \]

\subsection{Lifting over parameters}
\index{parameters!lifting over}
An analogous form of lifting handles premises of the form $\Forall x\ldots\,$. 
Here, lifting prefixes an object-rule's premises and conclusion with $\Forall
x$.  At the same time, lifting introduces a dependence upon~$x$.  It replaces
each unknown $\Var{a}$ in the rule by $\Var{a'}(x)$, where $\Var{a'}$ is a new
unknown (by subscripting) of suitable type --- necessarily a function type.  In
short, lifting is the meta-inference
\[ \infer{\List{\Forall x.\phi@1^x; \ldots; \Forall x.\phi@n^x} 
           \Imp \Forall x.\phi^x,}
         {\List{\phi@1; \ldots; \phi@n} \Imp \phi} \]
%
where $\phi^x$ stands for the result of lifting unknowns over~$x$ in
$\phi$.  It is not hard to verify that this meta-inference is sound.  If
$\phi\Imp\psi$ then $\phi^x\Imp\psi^x$ for all~$x$; so if $\phi^x$ is true
for all~$x$ then so is $\psi^x$.  Thus, from $\phi\Imp\psi$ we conclude
$(\Forall x.\phi^x) \Imp (\Forall x.\psi^x)$.

For example, $(\disj I)$ might be lifted to
\[ (\Forall x.\Var{P@1}(x)) \Imp (\Forall x. \Var{P@1}(x)\disj \Var{Q@1}(x))\]
and $(\forall I)$ to
\[ (\Forall x\,y.\Var{P@1}(x,y)) \Imp (\Forall x. \forall y.\Var{P@1}(x,y)). \]
Isabelle has renamed a bound variable in $(\forall I)$ from $x$ to~$y$,
avoiding a clash.  Resolving the above with $(\forall I)$ is the meta-inference
\[ \infer{(\Forall x\,y.\Var{P@1}(x,y)) \Imp (\forall x\,y.\Var{P@1}(x,y))}
         {(\Forall x\,y.\Var{P@1}(x,y)) \Imp 
               (\Forall x. \forall y.\Var{P@1}(x,y))  &
          (\Forall x.\Var{P}(x)) \Imp (\forall x.\Var{P}(x))} \]
Here, $\Var{P}$ is replaced by $\lambda x.\forall y.\Var{P@1}(x,y)$; the
resolvent expresses the derived rule
\[ \vcenter{ \infer{\forall x\,y.Q(x,y)}{Q(x,y)} }
   \quad\hbox{provided $x$, $y$ not free in the assumptions} 
\] 
I discuss lifting and parameters at length elsewhere~\cite{paulson-found}.
Miller goes into even greater detail~\cite{miller-mixed}.


\section{Backward proof by resolution}
\index{resolution!in backward proof}

Resolution is convenient for deriving simple rules and for reasoning
forward from facts.  It can also support backward proof, where we start
with a goal and refine it to progressively simpler subgoals until all have
been solved.  {\sc lcf} and its descendants {\sc hol} and Nuprl provide
tactics and tacticals, which constitute a sophisticated language for
expressing proof searches.  {\bf Tactics} refine subgoals while {\bf
  tacticals} combine tactics.

\index{LCF system}
Isabelle's tactics and tacticals work differently from {\sc lcf}'s.  An
Isabelle rule is bidirectional: there is no distinction between
inputs and outputs.  {\sc lcf} has a separate tactic for each rule;
Isabelle performs refinement by any rule in a uniform fashion, using
resolution.

Isabelle works with meta-level theorems of the form
\( \List{\phi@1; \ldots; \phi@n} \Imp \phi \).
We have viewed this as the {\bf rule} with premises
$\phi@1$,~\ldots,~$\phi@n$ and conclusion~$\phi$.  It can also be viewed as
the {\bf proof state}\index{proof state}
with subgoals $\phi@1$,~\ldots,~$\phi@n$ and main
goal~$\phi$.

To prove the formula~$\phi$, take $\phi\Imp \phi$ as the initial proof
state.  This assertion is, trivially, a theorem.  At a later stage in the
backward proof, a typical proof state is $\List{\phi@1; \ldots; \phi@n}
\Imp \phi$.  This proof state is a theorem, ensuring that the subgoals
$\phi@1$,~\ldots,~$\phi@n$ imply~$\phi$.  If $n=0$ then we have
proved~$\phi$ outright.  If $\phi$ contains unknowns, they may become
instantiated during the proof; a proof state may be $\List{\phi@1; \ldots;
\phi@n} \Imp \phi'$, where $\phi'$ is an instance of~$\phi$.

\subsection{Refinement by resolution}
To refine subgoal~$i$ of a proof state by a rule, perform the following
resolution: 
\[ \infer{\hbox{new proof state}}{\hbox{rule} & & \hbox{proof state}} \]
Suppose the rule is $\List{\psi'@1; \ldots; \psi'@m} \Imp \psi'$ after
lifting over subgoal~$i$'s assumptions and parameters.  If the proof state
is $\List{\phi@1; \ldots; \phi@n} \Imp \phi$, then the new proof state is
(for~$1\leq i\leq n$)
\[ (\List{\phi@1; \ldots; \phi@{i-1}; \psi'@1;
          \ldots; \psi'@m; \phi@{i+1}; \ldots; \phi@n} \Imp \phi)s.  \]
Substitution~$s$ unifies $\psi'$ with~$\phi@i$.  In the proof state,
subgoal~$i$ is replaced by $m$ new subgoals, the rule's instantiated premises.
If some of the rule's unknowns are left un-instantiated, they become new
unknowns in the proof state.  Refinement by~$(\exists I)$, namely
\[ \Var{P}(\Var{t}) \Imp \exists x. \Var{P}(x), \]
inserts a new unknown derived from~$\Var{t}$ by subscripting and lifting.
We do not have to specify an `existential witness' when
applying~$(\exists I)$.  Further resolutions may instantiate unknowns in
the proof state.
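
As a small worked instance, consider proving $\exists x. x=0$ from the
initial state $(\exists x. x=0) \Imp (\exists x. x=0)$.  Refining
subgoal~1 by $(\exists I)$ unifies $\exists x.\Var{P}(x)$ with
$\exists x. x=0$, instantiating $\Var{P}$ to $\lambda x. x=0$ and
introducing a new unknown for the witness:
\[ \Var{t@1}=0 \;\Imp\; \exists x. x=0. \]
A further refinement of the new subgoal by $(refl)$, whose standard form is
$\Var{t}=\Var{t}$, instantiates $\Var{t@1}$ to~0 and proves the goal.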

\subsection{Proof by assumption}
\index{assumptions!use of}
In the course of a natural deduction proof, parameters $x@1$, \ldots,~$x@l$ and
assumptions $\theta@1$, \ldots, $\theta@k$ accumulate, forming a context for
each subgoal.  Repeated lifting steps can lift a rule into any context.  To
aid readability, Isabelle puts contexts into a normal form, gathering the
parameters at the front:
\begin{equation} \label{context-eqn}
\Forall x@1 \ldots x@l. \List{\theta@1; \ldots; \theta@k}\Imp\theta. 
\end{equation}
Under the usual reading of the connectives, this expresses that $\theta$
follows from $\theta@1$,~\ldots~$\theta@k$ for arbitrary
$x@1$,~\ldots,~$x@l$.  It is trivially true if $\theta$ equals any of
$\theta@1$,~\ldots~$\theta@k$, or is unifiable with any of them.  This
models proof by assumption in natural deduction.

Isabelle automates the meta-inference for proof by assumption.  Its arguments
are the meta-theorem $\List{\phi@1; \ldots; \phi@n} \Imp \phi$, and some~$i$
from~1 to~$n$, where $\phi@i$ has the form~(\ref{context-eqn}).  Its results
are meta-theorems of the form
\[ (\List{\phi@1; \ldots; \phi@{i-1}; \phi@{i+1}; \ldots; \phi@n} \Imp \phi)s \]
for each $s$ and~$j$ such that $s$ unifies $\lambda x@1 \ldots x@l. \theta@j$
with $\lambda x@1 \ldots x@l. \theta$.  Isabelle supplies the parameters
$x@1$,~\ldots,~$x@l$ to higher-order unification as bound variables, which
regards them as unique constants with a limited scope --- this enforces
parameter provisos~\cite{paulson-found}.

The premise represents a proof state with~$n$ subgoals, of which the~$i$th
is to be solved by assumption.  Isabelle searches the subgoal's context for
an assumption~$\theta@j$ that can solve it.  For each unifier, the
meta-inference returns an instantiated proof state from which the $i$th
subgoal has been removed.  Isabelle searches for a unifying assumption; for
readability and robustness, proofs do not refer to assumptions by number.

Consider the proof state 
\[ (\List{P(a); P(b)} \Imp P(\Var{x})) \Imp Q(\Var{x}). \]
Proof by assumption (with $i=1$, the only possibility) yields two results:
\begin{itemize}
  \item $Q(a)$, instantiating $\Var{x}\equiv a$
  \item $Q(b)$, instantiating $\Var{x}\equiv b$
\end{itemize}
Here, proof by assumption affects the main goal.  It could also affect
other subgoals; if we also had the subgoal ${\List{P(b); P(c)} \Imp
  P(\Var{x})}$, then $\Var{x}\equiv a$ would transform it to ${\List{P(b);
    P(c)} \Imp P(a)}$, which might be unprovable.

\subsection{A propositional proof} \label{prop-proof}
lcp@105
   873
\index{examples!propositional}
lcp@105
   874
Our first example avoids quantifiers.  Given the main goal $P\disj P\imp
lcp@105
   875
P$, Isabelle creates the initial state
lcp@296
   876
\[ (P\disj P\imp P)\Imp (P\disj P\imp P). \] 
lcp@296
   877
%
lcp@105
   878
Bear in mind that every proof state we derive will be a meta-theorem,
lcp@296
   879
expressing that the subgoals imply the main goal.  Our aim is to reach the
lcp@296
   880
state $P\disj P\imp P$; this meta-theorem is the desired result.
lcp@296
   881
lcp@296
   882
The first step is to refine subgoal~1 by (${\imp}I)$, creating a new state
lcp@296
   883
where $P\disj P$ is an assumption:
lcp@105
   884
\[ (P\disj P\Imp P)\Imp (P\disj P\imp P) \]
lcp@105
   885
The next step is $(\disj E)$, which replaces subgoal~1 by three new subgoals. 
lcp@105
   886
Because of lifting, each subgoal contains a copy of the context --- the
lcp@105
   887
assumption $P\disj P$.  (In fact, this assumption is now redundant; we shall
lcp@105
   888
shortly see how to get rid of it!)  The new proof state is the following
lcp@105
   889
meta-theorem, laid out for clarity:
lcp@105
   890
\[ \begin{array}{l@{}l@{\qquad\qquad}l} 
lcp@105
   891
  \lbrakk\;& P\disj P\Imp \Var{P@1}\disj\Var{Q@1}; & \hbox{(subgoal 1)} \\
lcp@105
   892
           & \List{P\disj P; \Var{P@1}} \Imp P;    & \hbox{(subgoal 2)} \\
lcp@105
   893
           & \List{P\disj P; \Var{Q@1}} \Imp P     & \hbox{(subgoal 3)} \\
lcp@105
   894
  \rbrakk\;& \Imp (P\disj P\imp P)                 & \hbox{(main goal)}
lcp@105
   895
   \end{array} 
lcp@105
   896
\]
lcp@105
   897
Notice the unknowns in the proof state.  Because we have applied $(\disj E)$,
lcp@105
   898
we must prove some disjunction, $\Var{P@1}\disj\Var{Q@1}$.  Of course,
lcp@105
   899
subgoal~1 is provable by assumption.  This instantiates both $\Var{P@1}$ and
lcp@105
   900
$\Var{Q@1}$ to~$P$ throughout the proof state:
lcp@105
   901
\[ \begin{array}{l@{}l@{\qquad\qquad}l} 
lcp@105
   902
    \lbrakk\;& \List{P\disj P; P} \Imp P; & \hbox{(subgoal 1)} \\
lcp@105
   903
             & \List{P\disj P; P} \Imp P  & \hbox{(subgoal 2)} \\
lcp@105
   904
    \rbrakk\;& \Imp (P\disj P\imp P)      & \hbox{(main goal)}
lcp@105
   905
   \end{array} \]
lcp@105
   906
Both of the remaining subgoals can be proved by assumption.  After two such
lcp@296
   907
steps, the proof state is $P\disj P\imp P$.
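
The same proof can be conducted at the ML top-level with the tactics
introduced below.  The following sketch (in the style of the sessions of
later chapters, with {\tt impI} and {\tt disjE} denoting $({\imp}I)$ and
$(\disj E)$ in theory {\tt FOL}) performs exactly the steps just described:
\begin{ttbox}
goal FOL.thy "P|P --> P";
by (resolve_tac [impI] 1);    (*refine subgoal 1 with (imp I)*)
by (resolve_tac [disjE] 1);   (*refine with (disj E): three subgoals*)
by (assume_tac 1);            (*instantiates ?P1 and ?Q1 to P*)
by (assume_tac 1);            (*the two remaining subgoals are*)
by (assume_tac 1);            (*  proved by assumption*)
\end{ttbox}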
\subsection{A quantifier proof}
\index{examples!with quantifiers}
To illustrate quantifiers and $\Forall$-lifting, let us prove
$(\exists x.P(f(x)))\imp(\exists x.P(x))$.  The initial proof
state is the trivial meta-theorem
\[ (\exists x.P(f(x)))\imp(\exists x.P(x)) \Imp 
   (\exists x.P(f(x)))\imp(\exists x.P(x)). \]
As above, the first step is refinement by $({\imp}I)$:
\[ (\exists x.P(f(x))\Imp \exists x.P(x)) \Imp 
   (\exists x.P(f(x)))\imp(\exists x.P(x)) 
\]
The next step is $(\exists E)$, which replaces subgoal~1 by two new subgoals.
Both have the assumption $\exists x.P(f(x))$.  The new proof
state is the meta-theorem
\[ \begin{array}{l@{}l@{\qquad\qquad}l} 
   \lbrakk\;& \exists x.P(f(x)) \Imp \exists x.\Var{P@1}(x); & \hbox{(subgoal 1)} \\
            & \Forall x.\List{\exists x.P(f(x)); \Var{P@1}(x)} \Imp 
                       \exists x.P(x)  & \hbox{(subgoal 2)} \\
    \rbrakk\;& \Imp (\exists x.P(f(x)))\imp(\exists x.P(x))  & \hbox{(main goal)}
   \end{array} 
\]
The unknown $\Var{P@1}$ appears in both subgoals.  Because we have applied
$(\exists E)$, we must prove $\exists x.\Var{P@1}(x)$, where $\Var{P@1}(x)$ may
become any formula possibly containing~$x$.  Proving subgoal~1 by assumption
instantiates $\Var{P@1}$ to~$\lambda x.P(f(x))$:
\[ \left(\Forall x.\List{\exists x.P(f(x)); P(f(x))} \Imp 
         \exists x.P(x)\right) 
   \Imp (\exists x.P(f(x)))\imp(\exists x.P(x)) 
\]
The next step is refinement by $(\exists I)$.  The rule is lifted into the
context of the parameter~$x$ and the assumption $P(f(x))$.  This copies
the context to the subgoal and allows the existential witness to
depend upon~$x$:
\[ \left(\Forall x.\List{\exists x.P(f(x)); P(f(x))} \Imp 
         P(\Var{x@2}(x))\right) 
   \Imp (\exists x.P(f(x)))\imp(\exists x.P(x)) 
\]
The existential witness, $\Var{x@2}(x)$, consists of an unknown
applied to a parameter.  Proof by assumption unifies $\lambda x.P(f(x))$
with $\lambda x.P(\Var{x@2}(x))$, instantiating $\Var{x@2}$ to $f$.  The final
proof state contains no subgoals: $(\exists x.P(f(x)))\imp(\exists x.P(x))$.
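
As an ML session in the style of later chapters, this proof might be
scripted as follows; the sketch mirrors the steps above, with {\tt exE} and
{\tt exI} standing for $(\exists E)$ and $(\exists I)$ in theory {\tt FOL}:
\begin{ttbox}
goal FOL.thy "(EX x. P(f(x))) --> (EX x. P(x))";
by (resolve_tac [impI] 1);
by (resolve_tac [exE] 1);     (*two subgoals, as above*)
by (assume_tac 1);            (*instantiates ?P1 to %x. P(f(x))*)
by (resolve_tac [exI] 1);     (*the witness becomes ?x2(x)*)
by (assume_tac 1);            (*unification sets ?x2 to f*)
\end{ttbox}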
\subsection{Tactics and tacticals}
\index{tactics|bold}\index{tacticals|bold}
{\bf Tactics} perform backward proof.  Isabelle tactics differ from those
of {\sc lcf}, {\sc hol} and Nuprl by operating on entire proof states,
rather than on individual subgoals.  An Isabelle tactic is a function that
takes a proof state and returns a sequence (lazy list) of possible
successor states.  Lazy lists are coded in ML as functions, a standard
technique~\cite{paulson-ml2}.  Isabelle represents proof states by theorems.

Basic tactics execute the meta-rules described above, operating on a
given subgoal.  The {\bf resolution tactics} take a list of rules and
return next states for each combination of rule and unifier.  The {\bf
assumption tactic} examines the subgoal's assumptions and returns next
states for each combination of assumption and unifier.  Lazy lists are
essential because higher-order resolution may return infinitely many
unifiers.  If there are no matching rules or assumptions then no next
states are generated; a tactic application that returns an empty list is
said to {\bf fail}.

Sequences realize their full potential with {\bf tacticals} --- operators
for combining tactics.  Depth-first search, breadth-first search and
best-first search (where a heuristic function selects the best state to
explore) return their outcomes as a sequence.  Isabelle provides such
procedures in the form of tacticals.  Simpler procedures can be expressed
directly using the basic tacticals {\tt THEN}, {\tt ORELSE} and {\tt REPEAT}:
\begin{ttdescription}
\item[$tac1$ THEN $tac2$] is a tactic for sequential composition.  Applied
to a proof state, it returns all states reachable in two steps by applying
$tac1$ followed by~$tac2$.

\item[$tac1$ ORELSE $tac2$] is a choice tactic.  Applied to a state, it
tries~$tac1$ and returns the result if non-empty; otherwise, it uses~$tac2$.

\item[REPEAT $tac$] is a repetition tactic.  Applied to a state, it
returns all states reachable by applying~$tac$ as long as possible --- until
it would fail.
\end{ttdescription}
For instance, this tactic repeatedly applies $tac1$ and~$tac2$, giving
$tac1$ priority:
\begin{center} \tt
REPEAT($tac1$ ORELSE $tac2$)
\end{center}
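
The following ML fragment is not Isabelle's implementation --- the proof
state is left abstract and all names are merely illustrative --- but it
sketches how tactics and these three tacticals can be coded using lazy
lists:
\begin{ttbox}
(*a sketch: tactics as functions from a proof state to a lazy
  sequence of successor states*)
infix THEN ORELSE;

datatype 'a seq = Nil | Cons of 'a * (unit -> 'a seq);
type 'a tactic = 'a -> 'a seq;

(*lazily append two sequences*)
fun append (Nil, yf) = yf ()
  | append (Cons (x, xf), yf) = Cons (x, fn () => append (xf (), yf));

(*apply f to every element, concatenating the resulting sequences*)
fun flatmap f Nil = Nil
  | flatmap f (Cons (x, xf)) = append (f x, fn () => flatmap f (xf ()));

(*all states reachable by tac1 followed by tac2*)
fun tac1 THEN tac2 = fn st => flatmap tac2 (tac1 st);

(*the states returned by tac1, unless there are none*)
fun tac1 ORELSE tac2 =
  fn st => (case tac1 st of Nil => tac2 st | res => res);

(*succeed, leaving the state unchanged*)
fun all_tac st = Cons (st, fn () => Nil);

(*apply tac as long as possible, then stop*)
fun REPEAT tac = fn st => ((tac THEN REPEAT tac) ORELSE all_tac) st;
\end{ttbox}
Isabelle's actual tacticals behave analogously, but work on sequences of
theorems representing proof states.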
\section{Variations on resolution}
In principle, resolution and proof by assumption suffice to prove all
theorems.  However, specialized forms of resolution are helpful for working
with elimination rules.  Elim-resolution applies an elimination rule to an
assumption; destruct-resolution is similar, but applies a rule in a forward
style.

The last part of the section shows how the techniques for proving theorems
can also serve to derive rules.
\subsection{Elim-resolution}
\index{elim-resolution|bold}\index{assumptions!deleting}

Consider proving the theorem $((R\disj R)\disj R)\disj R\imp R$.  By
$({\imp}I)$, we prove~$R$ from the assumption $((R\disj R)\disj R)\disj R$.
Applying $(\disj E)$ to this assumption yields two subgoals, one that
assumes~$R$ (and is therefore trivial) and one that assumes $(R\disj
R)\disj R$.  This subgoal admits another application of $(\disj E)$.  Since
natural deduction never discards assumptions, we eventually generate a
subgoal containing much that is redundant:
\[ \List{((R\disj R)\disj R)\disj R; (R\disj R)\disj R; R\disj R; R} \Imp R. \]
In general, using $(\disj E)$ on the assumption $P\disj Q$ creates two new
subgoals with the additional assumption $P$ or~$Q$.  In these subgoals,
$P\disj Q$ is redundant.  Other elimination rules behave
similarly.  In first-order logic, only universally quantified
assumptions are sometimes needed more than once --- say, to prove
$P(f(f(a)))$ from the assumptions $\forall x.P(x)\imp P(f(x))$ and~$P(a)$.

Many logics can be formulated as sequent calculi that delete redundant
assumptions after use.  The rule $(\disj E)$ might become
\[ \infer[\disj\hbox{-left}]
         {\Gamma,P\disj Q,\Delta \turn \Theta}
         {\Gamma,P,\Delta \turn \Theta && \Gamma,Q,\Delta \turn \Theta}  \] 
In backward proof, a goal containing $P\disj Q$ on the left of the~$\turn$
(that is, as an assumption) splits into two subgoals, replacing $P\disj Q$
by $P$ or~$Q$.  But the sequent calculus, with its explicit handling of
assumptions, can be tiresome to use.

Elim-resolution is Isabelle's way of getting sequent calculus behaviour
from natural deduction rules.  It lets an elimination rule consume an
assumption.  Elim-resolution combines two meta-theorems:
\begin{itemize}
  \item a rule $\List{\psi@1; \ldots; \psi@m} \Imp \psi$
  \item a proof state $\List{\phi@1; \ldots; \phi@n} \Imp \phi$
\end{itemize}
The rule must have at least one premise, thus $m>0$.  Write the rule's
lifted form as $\List{\psi'@1; \ldots; \psi'@m} \Imp \psi'$.  Suppose we
wish to change subgoal number~$i$.

Ordinary resolution would attempt to reduce~$\phi@i$,
replacing subgoal~$i$ by $m$ new ones.  Elim-resolution tries
simultaneously to reduce~$\phi@i$ and to solve~$\psi'@1$ by assumption; it
returns a sequence of next states.  Each of these replaces subgoal~$i$ by
instances of $\psi'@2$, \ldots, $\psi'@m$ from which the selected
assumption has been deleted.  Suppose $\phi@i$ has the parameter~$x$ and
assumptions $\theta@1$,~\ldots,~$\theta@k$.  Then $\psi'@1$, the rule's first
premise after lifting, will be
\( \Forall x. \List{\theta@1; \ldots; \theta@k}\Imp \psi^{x}@1 \).
Elim-resolution tries to unify $\psi'\qeq\phi@i$ and
$\lambda x. \theta@j \qeq \lambda x. \psi^{x}@1$ simultaneously, for
$j=1$,~\ldots,~$k$.

Let us redo the example from~\S\ref{prop-proof}.  The elimination rule
is~$(\disj E)$,
\[ \List{\Var{P}\disj \Var{Q};\; \Var{P}\Imp \Var{R};\; \Var{Q}\Imp \Var{R}}
      \Imp \Var{R}  \]
and the proof state is $(P\disj P\Imp P)\Imp (P\disj P\imp P)$.  The
lifted rule is
\[ \begin{array}{l@{}l}
  \lbrakk\;& P\disj P \Imp \Var{P@1}\disj\Var{Q@1}; \\
           & \List{P\disj P ;\; \Var{P@1}} \Imp \Var{R@1};    \\
           & \List{P\disj P ;\; \Var{Q@1}} \Imp \Var{R@1}     \\
  \rbrakk\;& \Imp (P\disj P \Imp \Var{R@1})
  \end{array} 
\]
Unification takes the simultaneous equations
$P\disj P \qeq \Var{P@1}\disj\Var{Q@1}$ and $\Var{R@1} \qeq P$, yielding
$\Var{P@1}\equiv\Var{Q@1}\equiv\Var{R@1} \equiv P$.  The new proof state
is simply
\[ \List{P \Imp P;\; P \Imp P} \Imp (P\disj P\imp P). 
\]
Elim-resolution's simultaneous unification gives better control
than ordinary resolution.  Recall the substitution rule:
$$ \List{\Var{t}=\Var{u}; \Var{P}(\Var{t})} \Imp \Var{P}(\Var{u}) 
\eqno(subst) $$
Unsuitable for ordinary resolution because $\Var{P}(\Var{u})$ admits many
unifiers, $(subst)$ works well with elim-resolution.  It deletes some
assumption of the form $x=y$ and replaces every~$y$ by~$x$ in the subgoal
formula.  The simultaneous unification instantiates $\Var{u}$ to~$y$; if
$y$ is not an unknown, then $\Var{P}(y)$ can easily be unified with another
formula.

In logical parlance, the premise containing the connective to be eliminated
is called the \bfindex{major premise}.  Elim-resolution expects the major
premise to come first.  The order of the premises is significant in
Isabelle.
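
At the level of tactics, elim-resolution is provided by {\tt eresolve\_tac},
which is described in later chapters.  A sketch of the proof of
$P\disj P\imp P$ by elim-resolution, as redone above:
\begin{ttbox}
goal FOL.thy "P|P --> P";
by (resolve_tac [impI] 1);
by (eresolve_tac [disjE] 1);   (*consumes the assumption P|P*)
by (assume_tac 1);
by (assume_tac 1);
\end{ttbox}
The redundant-assumption example at the start of this section could
similarly be finished by repeating such steps, say by
{\tt REPEAT (eresolve\_tac [disjE] 1 ORELSE assume\_tac 1)}.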
\subsection{Destruction rules} \label{destruct}
\index{rules!destruction}\index{rules!elimination}
\index{forward proof}

Looking back to Fig.\ts\ref{fol-fig}, notice that there are two kinds of
elimination rule.  The rules $({\conj}E1)$, $({\conj}E2)$, $({\imp}E)$ and
$({\forall}E)$ extract the conclusion from the major premise.  In Isabelle
parlance, such rules are called {\bf destruction rules}; they are readable
and easy to use in forward proof.  The rules $({\disj}E)$, $({\bot}E)$ and
$({\exists}E)$ work by discharging assumptions; they support backward proof
in a style reminiscent of the sequent calculus.

The latter style is the most general form of elimination rule.  In natural
deduction, there is no way to recast $({\disj}E)$, $({\bot}E)$ or
$({\exists}E)$ as destruction rules.  But we can write general elimination
rules for $\conj$, $\imp$ and~$\forall$:
\[
\infer{R}{P\conj Q & \infer*{R}{[P,Q]}} \qquad
\infer{R}{P\imp Q & P & \infer*{R}{[Q]}}  \qquad
\infer{Q}{\forall x.P & \infer*{Q}{[P[t/x]]}} 
\]
Because they are concise, destruction rules are simpler to derive than the
corresponding elimination rules.  To facilitate their use in backward
proof, Isabelle provides a means of transforming a destruction rule such as
\[ \infer[\quad\hbox{to the elimination rule}\quad]{Q}{P@1 & \ldots & P@m} 
   \infer{R.}{P@1 & \ldots & P@m & \infer*{R}{[Q]}} 
\]
{\bf Destruct-resolution}\index{destruct-resolution} combines this
transformation with elim-resolution.  It applies a destruction rule to some
assumption of a subgoal.  Given the rule above, it replaces the
assumption~$P@1$ by~$Q$, with new subgoals of showing instances of $P@2$,
\ldots,~$P@m$.  Destruct-resolution works forward from a subgoal's
assumptions.  Ordinary resolution performs forward reasoning from theorems,
as illustrated in~\S\ref{joining}.
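
At the tactic level, destruct-resolution is provided by {\tt dresolve\_tac}
(see later chapters).  A minimal sketch, using {\tt conjunct2} --- the name
of $({\conj}E2)$ in theory {\tt FOL} --- as the destruction rule:
\begin{ttbox}
goal FOL.thy "P & Q --> Q";
by (resolve_tac [impI] 1);
by (dresolve_tac [conjunct2] 1);   (*replaces assumption P&Q by Q*)
by (assume_tac 1);
\end{ttbox}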
\subsection{Deriving rules by resolution}  \label{deriving}
\index{rules!derived|bold}\index{meta-assumptions!syntax of}
The meta-logic, itself a form of the predicate calculus, is defined by a
system of natural deduction rules.  Each theorem may depend upon
meta-assumptions.  The theorem that~$\phi$ follows from the assumptions
$\phi@1$, \ldots, $\phi@n$ is written
\[ \phi \quad [\phi@1,\ldots,\phi@n]. \]
A more conventional notation might be $\phi@1,\ldots,\phi@n \turn \phi$,
but Isabelle's notation is more readable with large formulae.

Meta-level natural deduction provides a convenient mechanism for deriving
new object-level rules.  To derive the rule
\[ \infer{\phi,}{\theta@1 & \ldots & \theta@k} \]
assume the premises $\theta@1$,~\ldots,~$\theta@k$ at the
meta-level.  Then prove $\phi$, possibly using these assumptions.
Starting with a proof state $\phi\Imp\phi$, assumptions may accumulate,
reaching a final state such as
\[ \phi \quad [\theta@1,\ldots,\theta@k]. \]
The meta-rule for $\Imp$ introduction discharges an assumption.
Discharging them in the order $\theta@k,\ldots,\theta@1$ yields the
meta-theorem $\List{\theta@1; \ldots; \theta@k} \Imp \phi$, with no
assumptions.  This represents the desired rule.
Let us derive the general $\conj$ elimination rule:
$$ \infer{R}{P\conj Q & \infer*{R}{[P,Q]}}  \eqno(\conj E) $$
We assume $P\conj Q$ and $\List{P;Q}\Imp R$, and commence backward proof in
the state $R\Imp R$.  Resolving this with the second assumption yields the
state 
\[ \phantom{\List{P\conj Q;\; P\conj Q}}
   \llap{$\List{P;Q}$}\Imp R \quad [\,\List{P;Q}\Imp R\,]. \]
Resolving subgoals~1 and~2 with~$({\conj}E1)$ and~$({\conj}E2)$,
respectively, yields the state
\[ \List{P\conj \Var{Q@1};\; \Var{P@2}\conj Q}\Imp R 
   \quad [\,\List{P;Q}\Imp R\,]. 
\]
The unknowns $\Var{Q@1}$ and~$\Var{P@2}$ arise from unconstrained
subformulae in the premises of~$({\conj}E1)$ and~$({\conj}E2)$.  Resolving
both subgoals with the assumption $P\conj Q$ instantiates the unknowns to yield
\[ R \quad [\, \List{P;Q}\Imp R, P\conj Q \,]. \]
The proof may use the meta-assumptions in any order, and as often as
necessary; when finished, we discharge them in the correct order to
obtain the desired form:
\[ \List{P\conj Q;\; \List{P;Q}\Imp R} \Imp R \]
We have derived the rule using free variables, which prevents their
premature instantiation during the proof; we may now replace them by
schematic variables.
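
In the ML sessions of later chapters, such a derivation can be carried out
with the {\tt goal} command, which returns the meta-assumptions as
theorems.  The following is only a sketch, mirroring the steps above;
{\tt conjunct1} and {\tt conjunct2} name $({\conj}E1)$ and $({\conj}E2)$ in
theory {\tt FOL}, and the theorem name at the end is arbitrary:
\begin{ttbox}
val [major, minor] = goal FOL.thy
    "[| P&Q;  [| P; Q |] ==> R |] ==> R";
by (resolve_tac [minor] 1);       (*back from R via the second assumption*)
by (resolve_tac [conjunct1] 1);   (*subgoal P, by (conj E1)...*)
by (resolve_tac [major] 1);       (*...and the assumption P&Q*)
by (resolve_tac [conjunct2] 1);   (*subgoal Q, by (conj E2)...*)
by (resolve_tac [major] 1);       (*...and the assumption P&Q*)
qed "my_conjE";
\end{ttbox}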
\begin{warn}
  Schematic variables are not allowed in meta-assumptions, for a variety of
  reasons.  Meta-assumptions remain fixed throughout a proof.
\end{warn}