1.1 --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     1.2 +++ b/doc-src/ZF/ZF.tex	Wed Jan 13 16:36:36 1999 +0100
     1.3 @@ -0,0 +1,2616 @@
     1.4 +%% $Id$
     1.5 +\chapter{Zermelo-Fraenkel Set Theory}
     1.6 +\index{set theory|(}
     1.7 +
     1.8 +The theory~\thydx{ZF} implements Zermelo-Fraenkel set
     1.9 +theory~\cite{halmos60,suppes72} as an extension of~\texttt{FOL}, classical
    1.10 +first-order logic.  The theory includes a collection of derived natural
    1.11 +deduction rules, for use with Isabelle's classical reasoner.  Much
    1.12 +of it is based on the work of No\"el~\cite{noel}.
    1.13 +
    1.14 +A large body of set theory has been formally developed, including the
    1.15 +basic properties of relations, functions, ordinals and cardinals.  Significant
    1.16 +results have been proved, such as the Schr\"oder-Bernstein Theorem, the
    1.17 +Wellordering Theorem and a version of Ramsey's Theorem.  \texttt{ZF} provides
    1.18 +both the integers and the natural numbers.  General methods have been
    1.19 +developed for solving recursion equations over monotonic functors; these have
    1.20 +been applied to yield constructions of lists, trees, infinite lists, etc.
    1.21 +
    1.22 +\texttt{ZF} has a flexible package for handling inductive definitions,
    1.23 +such as inference systems, and datatype definitions, such as lists and
    1.24 +trees.  Moreover it handles coinductive definitions, such as
    1.25 +bisimulation relations, and codatatype definitions, such as streams.  It
    1.26 +provides a streamlined syntax for defining primitive recursive functions over
    1.27 +datatypes. 
    1.28 +
    1.29 +Because {\ZF} is an extension of {\FOL}, it provides the same
    1.30 +packages, namely \texttt{hyp_subst_tac}, the simplifier, and the
    1.31 +classical reasoner.  The default simpset and claset are usually
    1.32 +satisfactory.
    1.33 +
    1.34 +Published articles~\cite{paulson-set-I,paulson-set-II} describe \texttt{ZF}
    1.35 +less formally than this chapter.  Isabelle employs a novel treatment of
    1.36 +non-well-founded data structures within the standard {\sc zf} axioms, including
    1.37 +the Axiom of Foundation~\cite{paulson-final}.
    1.38 +
    1.39 +
    1.40 +\section{Which version of axiomatic set theory?}
    1.41 +The two main axiom systems for set theory are Bernays-G\"odel~({\sc bg})
    1.42 +and Zermelo-Fraenkel~({\sc zf}).  Resolution theorem provers can use {\sc
    1.43 +  bg} because it is finite~\cite{boyer86,quaife92}.  {\sc zf} does not
    1.44 +have a finite axiom system because of its Axiom Scheme of Replacement.
    1.45 +This makes it awkward to use with many theorem provers, since instances
    1.46 +of the axiom scheme have to be invoked explicitly.  Since Isabelle has no
    1.47 +difficulty with axiom schemes, we may adopt either axiom system.
    1.48 +
    1.49 +These two theories differ in their treatment of {\bf classes}, which are
    1.50 +collections that are `too big' to be sets.  The class of all sets,~$V$,
    1.51 +cannot be a set without admitting Russell's Paradox.  In {\sc bg}, both
    1.52 +classes and sets are individuals; $x\in V$ expresses that $x$ is a set.  In
    1.53 +{\sc zf}, all variables denote sets; classes are identified with unary
    1.54 +predicates.  The two systems define essentially the same sets and classes,
    1.55 +with similar properties.  In particular, a class cannot belong to another
    1.56 +class (let alone a set).
    1.57 +
    1.58 +Modern set theorists tend to prefer {\sc zf} because they are mainly concerned
    1.59 +with sets, rather than classes.  {\sc bg} requires tiresome proofs that various
    1.60 +collections are sets; for instance, showing $x\in\{x\}$ requires showing that
    1.61 +$x$ is a set.
    1.62 +
    1.63 +
    1.64 +\begin{figure} \small
    1.65 +\begin{center}
    1.66 +\begin{tabular}{rrr} 
    1.67 +  \it name      &\it meta-type  & \it description \\ 
    1.68 +  \cdx{Let}     & $[\alpha,\alpha\To\beta]\To\beta$ & let binder\\
    1.69 +  \cdx{0}       & $i$           & empty set\\
    1.70 +  \cdx{cons}    & $[i,i]\To i$  & finite set constructor\\
    1.71 +  \cdx{Upair}   & $[i,i]\To i$  & unordered pairing\\
    1.72 +  \cdx{Pair}    & $[i,i]\To i$  & ordered pairing\\
    1.73 +  \cdx{Inf}     & $i$   & infinite set\\
    1.74 +  \cdx{Pow}     & $i\To i$      & powerset\\
    1.75 +  \cdx{Union} \cdx{Inter} & $i\To i$    & set union/intersection \\
    1.76 +  \cdx{split}   & $[[i,i]\To i, i] \To i$ & generalized projection\\
    1.77 +  \cdx{fst} \cdx{snd}   & $i\To i$      & projections\\
    1.78 +  \cdx{converse}& $i\To i$      & converse of a relation\\
    1.79 +  \cdx{succ}    & $i\To i$      & successor\\
    1.80 +  \cdx{Collect} & $[i,i\To o]\To i$     & separation\\
    1.81 +  \cdx{Replace} & $[i, [i,i]\To o] \To i$       & replacement\\
    1.82 +  \cdx{PrimReplace} & $[i, [i,i]\To o] \To i$   & primitive replacement\\
    1.83 +  \cdx{RepFun}  & $[i, i\To i] \To i$   & functional replacement\\
    1.84 +  \cdx{Pi} \cdx{Sigma}  & $[i,i\To i]\To i$     & general product/sum\\
    1.85 +  \cdx{domain}  & $i\To i$      & domain of a relation\\
    1.86 +  \cdx{range}   & $i\To i$      & range of a relation\\
    1.87 +  \cdx{field}   & $i\To i$      & field of a relation\\
    1.88 +  \cdx{Lambda}  & $[i, i\To i]\To i$    & $\lambda$-abstraction\\
    1.89 +  \cdx{restrict}& $[i, i] \To i$        & restriction of a function\\
    1.90 +  \cdx{The}     & $[i\To o]\To i$       & definite description\\
    1.91 +  \cdx{if}      & $[o,i,i]\To i$        & conditional\\
    1.92 +  \cdx{Ball} \cdx{Bex}  & $[i, i\To o]\To o$    & bounded quantifiers
    1.93 +\end{tabular}
    1.94 +\end{center}
    1.95 +\subcaption{Constants}
    1.96 +
    1.97 +\begin{center}
    1.98 +\index{*"`"` symbol}
    1.99 +\index{*"-"`"` symbol}
   1.100 +\index{*"` symbol}\index{function applications!in \ZF}
   1.101 +\index{*"- symbol}
   1.102 +\index{*": symbol}
   1.103 +\index{*"<"= symbol}
   1.104 +\begin{tabular}{rrrr} 
   1.105 +  \it symbol  & \it meta-type & \it priority & \it description \\ 
   1.106 +  \tt ``        & $[i,i]\To i$  &  Left 90      & image \\
   1.107 +  \tt -``       & $[i,i]\To i$  &  Left 90      & inverse image \\
   1.108 +  \tt `         & $[i,i]\To i$  &  Left 90      & application \\
   1.109 +  \sdx{Int}     & $[i,i]\To i$  &  Left 70      & intersection ($\int$) \\
   1.110 +  \sdx{Un}      & $[i,i]\To i$  &  Left 65      & union ($\un$) \\
   1.111 +  \tt -         & $[i,i]\To i$  &  Left 65      & set difference ($-$) \\[1ex]
   1.112 +  \tt:          & $[i,i]\To o$  &  Left 50      & membership ($\in$) \\
   1.113 +  \tt <=        & $[i,i]\To o$  &  Left 50      & subset ($\subseteq$) 
   1.114 +\end{tabular}
   1.115 +\end{center}
   1.116 +\subcaption{Infixes}
   1.117 +\caption{Constants of {\ZF}} \label{zf-constants}
   1.118 +\end{figure} 
   1.119 +
   1.120 +
   1.121 +\section{The syntax of set theory}
   1.122 +The language of set theory, as studied by logicians, has no constants.  The
   1.123 +traditional axioms merely assert the existence of empty sets, unions,
   1.124 +powersets, etc.; this would be intolerable for practical reasoning.  The
   1.125 +Isabelle theory declares constants for primitive sets.  It also extends
   1.126 +\texttt{FOL} with additional syntax for finite sets, ordered pairs,
   1.127 +comprehension, general union/intersection, general sums/products, and
   1.128 +bounded quantifiers.  In most other respects, Isabelle implements precisely
   1.129 +Zermelo-Fraenkel set theory.
   1.130 +
   1.131 +Figure~\ref{zf-constants} lists the constants and infixes of~\ZF, while
   1.132 +Figure~\ref{zf-trans} presents the syntax translations.  Finally,
   1.133 +Figure~\ref{zf-syntax} presents the full grammar for set theory, including
   1.134 +the constructs of \FOL.
   1.135 +
   1.136 +Local abbreviations can be introduced by a \texttt{let} construct whose
   1.137 +syntax appears in Fig.\ts\ref{zf-syntax}.  Internally it is translated into
   1.138 +the constant~\cdx{Let}.  It can be expanded by rewriting with its
   1.139 +definition, \tdx{Let_def}.
   1.140 +
   1.141 +Apart from \texttt{let}, set theory does not use polymorphism.  All terms in
   1.142 +{\ZF} have type~\tydx{i}, which is the type of individuals and has class~{\tt
   1.143 +  term}.  The type of first-order formulae, remember, is~\textit{o}.
   1.144 +
   1.145 +Infix operators include binary union and intersection ($A\un B$ and
   1.146 +$A\int B$), set difference ($A-B$), and the subset and membership
   1.147 +relations.  Note that $a$\verb|~:|$b$ is translated to $\neg(a\in b)$.  The
   1.148 +union and intersection operators ($\bigcup A$ and $\bigcap A$) form the
   1.149 +union or intersection of a set of sets; $\bigcup A$ means the same as
   1.150 +$\bigcup@{x\in A}x$.  Of these operators, only $\bigcup A$ is primitive.
   1.151 +
   1.152 +The constant \cdx{Upair} constructs unordered pairs; thus {\tt
   1.153 +  Upair($A$,$B$)} denotes the set~$\{A,B\}$ and \texttt{Upair($A$,$A$)}
   1.154 +denotes the singleton~$\{A\}$.  General union is used to define binary
   1.155 +union.  The Isabelle version goes on to define the constant
   1.156 +\cdx{cons}:
   1.157 +\begin{eqnarray*}
   1.158 +   A\un B               & \equiv &       \bigcup(\texttt{Upair}(A,B)) \\
   1.159 +   \texttt{cons}(a,B)      & \equiv &        \texttt{Upair}(a,a) \un B
   1.160 +\end{eqnarray*}
   1.161 +The $\{a@1, \ldots\}$ notation abbreviates finite sets constructed in the
   1.162 +obvious manner using~\texttt{cons} and~$\emptyset$ (the empty set):
   1.163 +\begin{eqnarray*}
   1.164 + \{a,b,c\} & \equiv & \texttt{cons}(a,\texttt{cons}(b,\texttt{cons}(c,\emptyset)))
   1.165 +\end{eqnarray*}
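         +
         +As a sketch of how these definitions unfold (it uses nothing beyond the
         +two defining equations for binary union and \texttt{cons}), the
         +singleton~$\{a\}$ reduces to general union and unordered pairing alone:
         +\begin{eqnarray*}
         + \{a\} & \equiv & \texttt{cons}(a,\emptyset) \\
         +       & \equiv & \texttt{Upair}(a,a) \un \emptyset \\
         +       & \equiv & \bigcup(\texttt{Upair}(\texttt{Upair}(a,a),\emptyset))
         +\end{eqnarray*}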
   1.166 +
   1.167 +The constant \cdx{Pair} constructs ordered pairs, as in {\tt
   1.168 +Pair($a$,$b$)}.  Ordered pairs may also be written within angle brackets,
   1.169 +as {\tt<$a$,$b$>}.  The $n$-tuple {\tt<$a@1$,\ldots,$a@{n-1}$,$a@n$>}
   1.170 +abbreviates the nest of pairs\par\nobreak
   1.171 +\centerline{\texttt{Pair($a@1$,\ldots,Pair($a@{n-1}$,$a@n$)\ldots).}}
   1.172 +
   1.173 +In {\ZF}, a function is a set of pairs.  A {\ZF} function~$f$ is simply an
   1.174 +individual as far as Isabelle is concerned: its Isabelle type is~$i$, not
   1.175 +say $i\To i$.  The infix operator~{\tt`} denotes the application of a
   1.176 +function set to its argument; we must write~$f{\tt`}x$, not~$f(x)$.  The
   1.177 +syntax for image is~$f{\tt``}A$ and that for inverse image is~$f{\tt-``}A$.
   1.178 +
   1.179 +
   1.180 +\begin{figure} 
   1.181 +\index{lambda abs@$\lambda$-abstractions!in \ZF}
   1.182 +\index{*"-"> symbol}
   1.183 +\index{*"* symbol}
   1.184 +\begin{center} \footnotesize\tt\frenchspacing
   1.185 +\begin{tabular}{rrr} 
   1.186 +  \it external          & \it internal  & \it description \\ 
   1.187 +  $a$ \ttilde: $b$      & \ttilde($a$ : $b$)    & \rm negated membership\\
   1.188 +  \ttlbrace$a@1$, $\ldots$, $a@n$\ttrbrace  &  cons($a@1$,$\ldots$,cons($a@n$,0)) &
   1.189 +        \rm finite set \\
   1.190 +  <$a@1$, $\ldots$, $a@{n-1}$, $a@n$> & 
   1.191 +        Pair($a@1$,\ldots,Pair($a@{n-1}$,$a@n$)\ldots) &
   1.192 +        \rm ordered $n$-tuple \\
   1.193 +  \ttlbrace$x$:$A . P[x]$\ttrbrace    &  Collect($A$,$\lambda x. P[x]$) &
   1.194 +        \rm separation \\
   1.195 +  \ttlbrace$y . x$:$A$, $Q[x,y]$\ttrbrace  &  Replace($A$,$\lambda x\,y. Q[x,y]$) &
   1.196 +        \rm replacement \\
   1.197 +  \ttlbrace$b[x] . x$:$A$\ttrbrace  &  RepFun($A$,$\lambda x. b[x]$) &
   1.198 +        \rm functional replacement \\
   1.199 +  \sdx{INT} $x$:$A . B[x]$      & Inter(\ttlbrace$B[x] . x$:$A$\ttrbrace) &
   1.200 +        \rm general intersection \\
   1.201 +  \sdx{UN}  $x$:$A . B[x]$      & Union(\ttlbrace$B[x] . x$:$A$\ttrbrace) &
   1.202 +        \rm general union \\
   1.203 +  \sdx{PROD} $x$:$A . B[x]$     & Pi($A$,$\lambda x. B[x]$) & 
   1.204 +        \rm general product \\
   1.205 +  \sdx{SUM}  $x$:$A . B[x]$     & Sigma($A$,$\lambda x. B[x]$) & 
   1.206 +        \rm general sum \\
   1.207 +  $A$ -> $B$            & Pi($A$,$\lambda x. B$) & 
   1.208 +        \rm function space \\
   1.209 +  $A$ * $B$             & Sigma($A$,$\lambda x. B$) & 
   1.210 +        \rm binary product \\
   1.211 +  \sdx{THE}  $x . P[x]$ & The($\lambda x. P[x]$) & 
   1.212 +        \rm definite description \\
   1.213 +  \sdx{lam}  $x$:$A . b[x]$     & Lambda($A$,$\lambda x. b[x]$) & 
   1.214 +        \rm $\lambda$-abstraction\\[1ex]
   1.215 +  \sdx{ALL} $x$:$A . P[x]$      & Ball($A$,$\lambda x. P[x]$) & 
   1.216 +        \rm bounded $\forall$ \\
   1.217 +  \sdx{EX}  $x$:$A . P[x]$      & Bex($A$,$\lambda x. P[x]$) & 
   1.218 +        \rm bounded $\exists$
   1.219 +\end{tabular}
   1.220 +\end{center}
   1.221 +\caption{Translations for {\ZF}} \label{zf-trans}
   1.222 +\end{figure} 
   1.223 +
   1.224 +
   1.225 +\begin{figure} 
   1.226 +\index{*let symbol}
   1.227 +\index{*in symbol}
   1.228 +\dquotes
   1.229 +\[\begin{array}{rcl}
   1.230 +    term & = & \hbox{expression of type~$i$} \\
   1.231 +         & | & "let"~id~"="~term";"\dots";"~id~"="~term~"in"~term \\
   1.232 +         & | & "if"~term~"then"~term~"else"~term \\
   1.233 +         & | & "{\ttlbrace} " term\; ("," term)^* " {\ttrbrace}" \\
   1.234 +         & | & "< "  term\; ("," term)^* " >"  \\
   1.235 +         & | & "{\ttlbrace} " id ":" term " . " formula " {\ttrbrace}" \\
   1.236 +         & | & "{\ttlbrace} " id " . " id ":" term ", " formula " {\ttrbrace}" \\
   1.237 +         & | & "{\ttlbrace} " term " . " id ":" term " {\ttrbrace}" \\
   1.238 +         & | & term " `` " term \\
   1.239 +         & | & term " -`` " term \\
   1.240 +         & | & term " ` " term \\
   1.241 +         & | & term " * " term \\
   1.242 +         & | & term " Int " term \\
   1.243 +         & | & term " Un " term \\
   1.244 +         & | & term " - " term \\
   1.245 +         & | & term " -> " term \\
   1.246 +         & | & "THE~~"  id  " . " formula\\
   1.247 +         & | & "lam~~"  id ":" term " . " term \\
   1.248 +         & | & "INT~~"  id ":" term " . " term \\
   1.249 +         & | & "UN~~~"  id ":" term " . " term \\
   1.250 +         & | & "PROD~"  id ":" term " . " term \\
   1.251 +         & | & "SUM~~"  id ":" term " . " term \\[2ex]
   1.252 + formula & = & \hbox{expression of type~$o$} \\
   1.253 +         & | & term " : " term \\
   1.254 +         & | & term " \ttilde: " term \\
   1.255 +         & | & term " <= " term \\
   1.256 +         & | & term " = " term \\
   1.257 +         & | & term " \ttilde= " term \\
   1.258 +         & | & "\ttilde\ " formula \\
   1.259 +         & | & formula " \& " formula \\
   1.260 +         & | & formula " | " formula \\
   1.261 +         & | & formula " --> " formula \\
   1.262 +         & | & formula " <-> " formula \\
   1.263 +         & | & "ALL " id ":" term " . " formula \\
   1.264 +         & | & "EX~~" id ":" term " . " formula \\
   1.265 +         & | & "ALL~" id~id^* " . " formula \\
   1.266 +         & | & "EX~~" id~id^* " . " formula \\
   1.267 +         & | & "EX!~" id~id^* " . " formula
   1.268 +  \end{array}
   1.269 +\]
   1.270 +\caption{Full grammar for {\ZF}} \label{zf-syntax}
   1.271 +\end{figure} 
   1.272 +
   1.273 +
   1.274 +\section{Binding operators}
   1.275 +The constant \cdx{Collect} constructs sets by the principle of {\bf
   1.276 +  separation}.  The syntax for separation is
   1.277 +\hbox{\tt\ttlbrace$x$:$A$.\ $P[x]$\ttrbrace}, where $P[x]$ is a formula
   1.278 +that may contain free occurrences of~$x$.  It abbreviates the set {\tt
   1.279 +  Collect($A$,$\lambda x. P[x]$)}, which consists of all $x\in A$ that
   1.280 +satisfy~$P[x]$.  Note that \texttt{Collect} is an unfortunate choice of
   1.281 +name: some set theories adopt a set-formation principle, related to
   1.282 +replacement, called collection.
   1.283 +
   1.284 +The constant \cdx{Replace} constructs sets by the principle of {\bf
   1.285 +  replacement}.  The syntax
   1.286 +\hbox{\tt\ttlbrace$y$.\ $x$:$A$,$Q[x,y]$\ttrbrace} denotes the set {\tt
   1.287 +  Replace($A$,$\lambda x\,y. Q[x,y]$)}, which consists of all~$y$ such
   1.288 +that there exists $x\in A$ satisfying~$Q[x,y]$.  The Replacement Axiom
   1.289 +has the condition that $Q$ must be single-valued over~$A$: for
   1.290 +all~$x\in A$ there exists at most one $y$ satisfying~$Q[x,y]$.  A
   1.291 +single-valued binary predicate is also called a {\bf class function}.
   1.292 +
   1.293 +The constant \cdx{RepFun} expresses a special case of replacement,
   1.294 +where $Q[x,y]$ has the form $y=b[x]$.  Such a $Q$ is trivially
   1.295 +single-valued, since it is just the graph of the meta-level
   1.296 +function~$\lambda x. b[x]$.  The resulting set consists of all $b[x]$
   1.297 +for~$x\in A$.  This is analogous to the \ML{} functional \texttt{map},
   1.298 +since it applies a function to every element of a set.  The syntax is
   1.299 +\hbox{\tt\ttlbrace$b[x]$.\ $x$:$A$\ttrbrace}, which expands to {\tt
   1.300 +  RepFun($A$,$\lambda x. b[x]$)}.
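         +
         +As an illustrative sketch, take $b[x]$ to be $\texttt{succ}(x)$; the
         +choice of \texttt{succ} is only an example.  Unfolding the definition of
         +\texttt{RepFun} (Fig.\ts\ref{zf-rules}) and then the meaning of
         +\texttt{Replace} gives
         +\begin{eqnarray*}
         + \{\texttt{succ}(x). x\in A\} & \equiv & \{y. x\in A, y=\texttt{succ}(x)\} \\
         + c\in\{\texttt{succ}(x). x\in A\} & \bimp & \exists x\in A. c=\texttt{succ}(x)
         +\end{eqnarray*}
         +The second line is the membership condition that the derived rules
         +\texttt{RepFunI} and \texttt{RepFunE} (Fig.\ts\ref{zf-lemmas2}) express in
         +natural deduction form.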
   1.301 +
   1.302 +\index{*INT symbol}\index{*UN symbol} 
   1.303 +General unions and intersections of indexed
   1.304 +families of sets, namely $\bigcup@{x\in A}B[x]$ and $\bigcap@{x\in A}B[x]$,
   1.305 +are written \hbox{\tt UN $x$:$A$.\ $B[x]$} and \hbox{\tt INT $x$:$A$.\ $B[x]$}.
   1.306 +Their meaning is expressed using \texttt{RepFun} as
   1.307 +\[
   1.308 +\bigcup(\{B[x]. x\in A\}) \qquad\hbox{and}\qquad 
   1.309 +\bigcap(\{B[x]. x\in A\}). 
   1.310 +\]
   1.311 +General sums $\sum@{x\in A}B[x]$ and products $\prod@{x\in A}B[x]$ can be
   1.312 +constructed in set theory, where $B[x]$ is a family of sets over~$A$.  They
   1.313 +have as special cases $A\times B$ and $A\to B$, where $B$ is simply a set.
   1.314 +This is similar to the situation in Constructive Type Theory (set theory
   1.315 +has `dependent sets') and calls for similar syntactic conventions.  The
   1.316 +constants~\cdx{Sigma} and~\cdx{Pi} construct general sums and
   1.317 +products.  Instead of \texttt{Sigma($A$,$B$)} and \texttt{Pi($A$,$B$)} we may
   1.318 +write 
   1.319 +\hbox{\tt SUM $x$:$A$.\ $B[x]$} and \hbox{\tt PROD $x$:$A$.\ $B[x]$}.  
   1.320 +\index{*SUM symbol}\index{*PROD symbol}%
   1.321 +The special cases \hbox{\tt$A$*$B$} and \hbox{\tt$A$->$B$} abbreviate
   1.322 +general sums and products over a constant family.\footnote{Unlike normal
   1.323 +infix operators, {\tt*} and {\tt->} merely define abbreviations; there are
   1.324 +no constants~\texttt{op~*} and~\hbox{\tt op~->}.} Isabelle accepts these
   1.325 +abbreviations in parsing and uses them whenever possible for printing.
   1.326 +
   1.327 +\index{*THE symbol} 
   1.328 +As mentioned above, whenever the axioms assert the existence and uniqueness
   1.329 +of a set, Isabelle's set theory declares a constant for that set.  These
   1.330 +constants can express the {\bf definite description} operator~$\iota
   1.331 +x. P[x]$, which stands for the unique~$a$ satisfying~$P[a]$, if such exists.
   1.332 +Since all terms in {\ZF} denote something, a description is always
   1.333 +meaningful, but we do not know its value unless $P[x]$ defines it uniquely.
   1.334 +Using the constant~\cdx{The}, we may write descriptions as {\tt
   1.335 +  The($\lambda x. P[x]$)} or use the syntax \hbox{\tt THE $x$.\ $P[x]$}.
   1.336 +
   1.337 +\index{*lam symbol}
   1.338 +Function sets may be written in $\lambda$-notation; $\lambda x\in A. b[x]$
   1.339 +stands for the set of all pairs $\pair{x,b[x]}$ for $x\in A$.  In order for
   1.340 +this to be a set, the function's domain~$A$ must be given.  Using the
   1.341 +constant~\cdx{Lambda}, we may express function sets as {\tt
   1.342 +Lambda($A$,$\lambda x. b[x]$)} or use the syntax \hbox{\tt lam $x$:$A$.\ $b[x]$}.
   1.343 +
   1.344 +Isabelle's set theory defines two {\bf bounded quantifiers}:
   1.345 +\begin{eqnarray*}
   1.346 +   \forall x\in A. P[x] &\hbox{abbreviates}& \forall x. x\in A\imp P[x] \\
   1.347 +   \exists x\in A. P[x] &\hbox{abbreviates}& \exists x. x\in A\conj P[x]
   1.348 +\end{eqnarray*}
   1.349 +The constants~\cdx{Ball} and~\cdx{Bex} are defined
   1.350 +accordingly.  Instead of \texttt{Ball($A$,$P$)} and \texttt{Bex($A$,$P$)} we may
   1.351 +write
   1.352 +\hbox{\tt ALL $x$:$A$.\ $P[x]$} and \hbox{\tt EX $x$:$A$.\ $P[x]$}.
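         +
         +For example, the subset relation is defined using a bounded quantifier
         +(\texttt{subset_def}, Fig.\ts\ref{zf-rules}).  As a sketch, expanding the
         +abbreviation recovers the familiar first-order statement:
         +\begin{eqnarray*}
         +   A\subseteq B & \equiv & \forall x\in A. x\in B \\
         +                & \equiv & \forall x. x\in A\imp x\in B
         +\end{eqnarray*}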
   1.353 +
   1.354 +
   1.355 +%%%% ZF.thy
   1.356 +
   1.357 +\begin{figure}
   1.358 +\begin{ttbox}
   1.359 +\tdx{Let_def}            Let(s, f) == f(s)
   1.360 +
   1.361 +\tdx{Ball_def}           Ball(A,P) == ALL x. x:A --> P(x)
   1.362 +\tdx{Bex_def}            Bex(A,P)  == EX x. x:A & P(x)
   1.363 +
   1.364 +\tdx{subset_def}         A <= B  == ALL x:A. x:B
   1.365 +\tdx{extension}          A = B  <->  A <= B & B <= A
   1.366 +
   1.367 +\tdx{Union_iff}          A : Union(C) <-> (EX B:C. A:B)
   1.368 +\tdx{Pow_iff}            A : Pow(B) <-> A <= B
   1.369 +\tdx{foundation}         A=0 | (EX x:A. ALL y:x. ~ y:A)
   1.370 +
   1.371 +\tdx{replacement}        (ALL x:A. ALL y z. P(x,y) & P(x,z) --> y=z) ==>
   1.372 +                   b : PrimReplace(A,P) <-> (EX x:A. P(x,b))
   1.373 +\subcaption{The Zermelo-Fraenkel Axioms}
   1.374 +
   1.375 +\tdx{Replace_def}  Replace(A,P) == 
   1.376 +                   PrimReplace(A, \%x y. (EX!z. P(x,z)) & P(x,y))
   1.377 +\tdx{RepFun_def}   RepFun(A,f)  == {\ttlbrace}y . x:A, y=f(x)\ttrbrace
   1.378 +\tdx{the_def}      The(P)       == Union({\ttlbrace}y . x:{\ttlbrace}0{\ttrbrace}, P(y){\ttrbrace})
   1.379 +\tdx{if_def}       if(P,a,b)    == THE z. P & z=a | ~P & z=b
   1.380 +\tdx{Collect_def}  Collect(A,P) == {\ttlbrace}y . x:A, x=y & P(x){\ttrbrace}
   1.381 +\tdx{Upair_def}    Upair(a,b)   == 
   1.382 +                 {\ttlbrace}y. x:Pow(Pow(0)), (x=0 & y=a) | (x=Pow(0) & y=b){\ttrbrace}
   1.383 +\subcaption{Consequences of replacement}
   1.384 +
   1.385 +\tdx{Inter_def}    Inter(A) == {\ttlbrace}x:Union(A) . ALL y:A. x:y{\ttrbrace}
   1.386 +\tdx{Un_def}       A Un  B  == Union(Upair(A,B))
   1.387 +\tdx{Int_def}      A Int B  == Inter(Upair(A,B))
   1.388 +\tdx{Diff_def}     A - B    == {\ttlbrace}x:A . x~:B{\ttrbrace}
   1.389 +\subcaption{Union, intersection, difference}
   1.390 +\end{ttbox}
   1.391 +\caption{Rules and axioms of {\ZF}} \label{zf-rules}
   1.392 +\end{figure}
   1.393 +
   1.394 +
   1.395 +\begin{figure}
   1.396 +\begin{ttbox}
   1.397 +\tdx{cons_def}     cons(a,A) == Upair(a,a) Un A
   1.398 +\tdx{succ_def}     succ(i) == cons(i,i)
   1.399 +\tdx{infinity}     0:Inf & (ALL y:Inf. succ(y): Inf)
   1.400 +\subcaption{Finite and infinite sets}
   1.401 +
   1.402 +\tdx{Pair_def}       <a,b>      == {\ttlbrace}{\ttlbrace}a,a{\ttrbrace}, {\ttlbrace}a,b{\ttrbrace}{\ttrbrace}
   1.403 +\tdx{split_def}      split(c,p) == THE y. EX a b. p=<a,b> & y=c(a,b)
   1.404 +\tdx{fst_def}        fst(p)     == split(\%x y. x, p)
   1.405 +\tdx{snd_def}        snd(p)     == split(\%x y. y, p)
   1.406 +\tdx{Sigma_def}      Sigma(A,B) == UN x:A. UN y:B(x). {\ttlbrace}<x,y>{\ttrbrace}
   1.407 +\subcaption{Ordered pairs and Cartesian products}
   1.408 +
   1.409 +\tdx{converse_def}   converse(r) == {\ttlbrace}z. w:r, EX x y. w=<x,y> & z=<y,x>{\ttrbrace}
   1.410 +\tdx{domain_def}     domain(r)   == {\ttlbrace}x. w:r, EX y. w=<x,y>{\ttrbrace}
   1.411 +\tdx{range_def}      range(r)    == domain(converse(r))
   1.412 +\tdx{field_def}      field(r)    == domain(r) Un range(r)
   1.413 +\tdx{image_def}      r `` A      == {\ttlbrace}y : range(r) . EX x:A. <x,y> : r{\ttrbrace}
   1.414 +\tdx{vimage_def}     r -`` A     == converse(r)``A
   1.415 +\subcaption{Operations on relations}
   1.416 +
   1.417 +\tdx{lam_def}    Lambda(A,b) == {\ttlbrace}<x,b(x)> . x:A{\ttrbrace}
   1.418 +\tdx{apply_def}  f`a         == THE y. <a,y> : f
   1.419 +\tdx{Pi_def}     Pi(A,B) == {\ttlbrace}f: Pow(Sigma(A,B)). ALL x:A. EX! y. <x,y>: f{\ttrbrace}
   1.420 +\tdx{restrict_def}   restrict(f,A) == lam x:A. f`x
   1.421 +\subcaption{Functions and general product}
   1.422 +\end{ttbox}
   1.423 +\caption{Further definitions of {\ZF}} \label{zf-defs}
   1.424 +\end{figure}
   1.425 +
   1.426 +
   1.427 +
   1.428 +\section{The Zermelo-Fraenkel axioms}
   1.429 +The axioms appear in Fig.\ts \ref{zf-rules}.  They resemble those
   1.430 +presented by Suppes~\cite{suppes72}.  Most of the theory consists of
   1.431 +definitions.  In particular, bounded quantifiers and the subset relation
   1.432 +appear in other axioms.  Object-level quantifiers and implications have
   1.433 +been replaced by meta-level ones wherever possible, to simplify use of the
   1.434 +axioms.  See the file \texttt{ZF/ZF.thy} for details.
   1.435 +
   1.436 +The traditional replacement axiom asserts
   1.437 +\[ y \in \texttt{PrimReplace}(A,P) \bimp (\exists x\in A. P(x,y)) \]
   1.438 +subject to the condition that $P(x,y)$ is single-valued for all~$x\in A$.
   1.439 +The Isabelle theory defines \cdx{Replace} to apply
   1.440 +\cdx{PrimReplace} to the single-valued part of~$P$, namely
   1.441 +\[ (\exists!z. P(x,z)) \conj P(x,y). \]
   1.442 +Thus $y\in \texttt{Replace}(A,P)$ if and only if there is some~$x\in A$ such that
   1.443 +$P(x,-)$ holds uniquely for~$y$.  Because the equivalence is unconditional,
   1.444 +\texttt{Replace} is much easier to use than \texttt{PrimReplace}; it defines the
   1.445 +same set, if $P(x,y)$ is single-valued.  The nice syntax for replacement
   1.446 +expands to \texttt{Replace}.
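         +
         +The claim that the equivalence is unconditional can be checked by a short
         +argument, sketched here informally rather than quoted from the Isabelle
         +sources.  Write $Q(x,y)$ for the single-valued part
         +$(\exists!z. P(x,z)) \conj P(x,y)$; the name~$Q$ is introduced only for
         +this sketch.  If $Q(x,y)$ and $Q(x,y')$ both hold, then $P(x,y)$ and
         +$P(x,y')$ hold while $P(x,-)$ has a unique solution, so $y=y'$.  Hence
         +$Q$ satisfies the premise of the replacement axiom for every~$A$, and the
         +axiom yields
         +\[ b \in \texttt{Replace}(A,P) \bimp
         +   (\exists x\in A. (\exists!z. P(x,z)) \conj P(x,b)) \]
         +with no side condition.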
   1.447 +
   1.448 +Other consequences of replacement include functional replacement
   1.449 +(\cdx{RepFun}) and definite descriptions (\cdx{The}).
   1.450 +Axioms for separation (\cdx{Collect}) and unordered pairs
   1.451 +(\cdx{Upair}) are traditionally assumed, but they actually follow
   1.452 +from replacement~\cite[pages 237--8]{suppes72}.
   1.453 +
   1.454 +The definitions of general intersection, etc., are straightforward.  Note
   1.455 +the definition of \texttt{cons}, which underlies the finite set notation.
   1.456 +The axiom of infinity gives us a set that contains~0 and is closed under
   1.457 +successor (\cdx{succ}).  Although this set is not uniquely defined,
   1.458 +the theory names it (\cdx{Inf}) in order to simplify the
   1.459 +construction of the natural numbers.
   1.460 +
   1.461 +Further definitions appear in Fig.\ts\ref{zf-defs}.  Ordered pairs are
   1.462 +defined in the standard way, $\pair{a,b}\equiv\{\{a\},\{a,b\}\}$.  Recall
   1.463 +that \cdx{Sigma}$(A,B)$ generalizes the Cartesian product of two
   1.464 +sets.  It is defined to be the union of all singleton sets
   1.465 +$\{\pair{x,y}\}$, for $x\in A$ and $y\in B(x)$.  This is a typical usage of
   1.466 +general union.
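         +
         +As a sketch, consider a constant family, where $B(x)$ is the same
         +set~$B$ for every~$x$.  Then \texttt{Sigma_def} specializes to the
         +Cartesian product, and membership reduces to a pair of bounded
         +quantifiers:
         +\begin{eqnarray*}
         +   A\times B & \equiv & \bigcup@{x\in A}\bigcup@{y\in B}\{\pair{x,y}\} \\
         +   c\in A\times B & \bimp & \exists x\in A. \exists y\in B. c=\pair{x,y}
         +\end{eqnarray*}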
   1.467 +
   1.468 +The projections \cdx{fst} and~\cdx{snd} are defined in terms of the
   1.469 +generalized projection \cdx{split}.  The latter has been borrowed from
   1.470 +Martin-L\"of's Type Theory, and is often easier to use than \cdx{fst}
   1.471 +and~\cdx{snd}.
   1.472 +
   1.473 +Operations on relations include converse, domain, range, and image.  The
   1.474 +set ${\tt Pi}(A,B)$ generalizes the space of functions between two sets.
   1.475 +Note the simple definitions of $\lambda$-abstraction (using
   1.476 +\cdx{RepFun}) and application (using a definite description).  The
   1.477 +function \cdx{restrict}$(f,A)$ has the same values as~$f$, but only
   1.478 +over the domain~$A$.
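         +
         +To see how these definitions combine, here is an informal sketch of the
         +$\beta$-reduction property, assuming $a\in A$.  It uses only
         +\texttt{lam_def} and \texttt{apply_def} from Fig.\ts\ref{zf-defs}:
         +\begin{eqnarray*}
         +   (\lambda x\in A. b(x))\,{\tt`}\,a
         +     & \equiv & \iota y. \pair{a,y}\in\{\pair{x,b(x)}. x\in A\} \\
         +     & =      & b(a)
         +\end{eqnarray*}
         +The last step holds because $\pair{a,y}$ belongs to the set just when
         +$y=b(a)$, so the description is uniquely determined.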
   1.479 +
   1.480 +
   1.481 +%%%% zf.ML
   1.482 +
   1.483 +\begin{figure}
   1.484 +\begin{ttbox}
   1.485 +\tdx{ballI}       [| !!x. x:A ==> P(x) |] ==> ALL x:A. P(x)
   1.486 +\tdx{bspec}       [| ALL x:A. P(x);  x: A |] ==> P(x)
   1.487 +\tdx{ballE}       [| ALL x:A. P(x);  P(x) ==> Q;  ~ x:A ==> Q |] ==> Q
   1.488 +
   1.489 +\tdx{ball_cong}   [| A=A';  !!x. x:A' ==> P(x) <-> P'(x) |] ==> 
   1.490 +            (ALL x:A. P(x)) <-> (ALL x:A'. P'(x))
   1.491 +
   1.492 +\tdx{bexI}        [| P(x);  x: A |] ==> EX x:A. P(x)
   1.493 +\tdx{bexCI}       [| ALL x:A. ~P(x) ==> P(a);  a: A |] ==> EX x:A. P(x)
   1.494 +\tdx{bexE}        [| EX x:A. P(x);  !!x. [| x:A; P(x) |] ==> Q |] ==> Q
   1.495 +
   1.496 +\tdx{bex_cong}    [| A=A';  !!x. x:A' ==> P(x) <-> P'(x) |] ==> 
   1.497 +            (EX x:A. P(x)) <-> (EX x:A'. P'(x))
   1.498 +\subcaption{Bounded quantifiers}
   1.499 +
   1.500 +\tdx{subsetI}       (!!x. x:A ==> x:B) ==> A <= B
   1.501 +\tdx{subsetD}       [| A <= B;  c:A |] ==> c:B
   1.502 +\tdx{subsetCE}      [| A <= B;  ~(c:A) ==> P;  c:B ==> P |] ==> P
   1.503 +\tdx{subset_refl}   A <= A
   1.504 +\tdx{subset_trans}  [| A<=B;  B<=C |] ==> A<=C
   1.505 +
   1.506 +\tdx{equalityI}     [| A <= B;  B <= A |] ==> A = B
   1.507 +\tdx{equalityD1}    A = B ==> A<=B
   1.508 +\tdx{equalityD2}    A = B ==> B<=A
   1.509 +\tdx{equalityE}     [| A = B;  [| A<=B; B<=A |] ==> P |]  ==>  P
   1.510 +\subcaption{Subsets and extensionality}
   1.511 +
   1.512 +\tdx{emptyE}          a:0 ==> P
   1.513 +\tdx{empty_subsetI}   0 <= A
   1.514 +\tdx{equals0I}        [| !!y. y:A ==> False |] ==> A=0
   1.515 +\tdx{equals0D}        [| A=0;  a:A |] ==> P
   1.516 +
   1.517 +\tdx{PowI}            A <= B ==> A : Pow(B)
   1.518 +\tdx{PowD}            A : Pow(B)  ==>  A<=B
   1.519 +\subcaption{The empty set; power sets}
   1.520 +\end{ttbox}
   1.521 +\caption{Basic derived rules for {\ZF}} \label{zf-lemmas1}
   1.522 +\end{figure}
   1.523 +
   1.524 +
   1.525 +\section{From basic lemmas to function spaces}
   1.526 +With so many definitions, proving lemmas is essential.  Even
   1.527 +trivial theorems like $A \int B = B \int A$ would be difficult to
   1.528 +prove from the definitions alone.  Isabelle's set theory derives many
   1.529 +rules using a natural deduction style.  Ideally, a natural deduction
   1.530 +rule should introduce or eliminate just one operator, but this is not
   1.531 +always practical.  For most operators, we may forget the definition
   1.532 +and use the derived rules instead.
   1.533 +
   1.534 +\subsection{Fundamental lemmas}
   1.535 +Figure~\ref{zf-lemmas1} presents the derived rules for the most basic
   1.536 +operators.  The rules for the bounded quantifiers resemble those for the
   1.537 +ordinary quantifiers, but note that \tdx{ballE} uses a negated assumption
   1.538 +in the style of Isabelle's classical reasoner.  The \rmindex{congruence
   1.539 +  rules} \tdx{ball_cong} and \tdx{bex_cong} are required by Isabelle's
   1.540 +simplifier, but have few other uses.  Congruence rules must be specially
   1.541 +derived for all binding operators, and henceforth will not be shown.
   1.542 +
   1.543 +Figure~\ref{zf-lemmas1} also shows rules for the subset and equality
   1.544 +relations (proof by extensionality), and rules about the empty set and the
   1.545 +power set operator.
   1.546 +
   1.547 +Figure~\ref{zf-lemmas2} presents rules for replacement and separation.
   1.548 +The rules for \cdx{Replace} and \cdx{RepFun} are much simpler than
   1.549 +comparable rules for \texttt{PrimReplace} would be.  The principle of
   1.550 +separation is proved explicitly, although most proofs should use the
   1.551 +natural deduction rules for \texttt{Collect}.  The elimination rule
   1.552 +\tdx{CollectE} is equivalent to the two destruction rules
   1.553 +\tdx{CollectD1} and \tdx{CollectD2}, but each rule is suited to
   1.554 +particular circumstances.  Although too many rules can be confusing, there
   1.555 +is no reason to aim for a minimal set of rules.  See the file
   1.556 +\texttt{ZF/ZF.ML} for a complete listing.
   1.557 +
   1.558 +Figure~\ref{zf-lemmas3} presents rules for general union and intersection.
   1.559 +The empty intersection should be undefined.  We cannot have
   1.560 +$\bigcap(\emptyset)=V$ because $V$, the universal class, is not a set.  All
   1.561 +expressions denote something in {\ZF} set theory; the definition of
   1.562 +intersection implies $\bigcap(\emptyset)=\emptyset$, but this value is
   1.563 +arbitrary.  The rule \tdx{InterI} must have a premise to exclude
   1.564 +the empty intersection.  Some of the laws governing intersections require
   1.565 +similar premises.
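         +
         +The value of the empty intersection can be checked by a short
         +calculation.  The sketch below assumes the definition
         +$\bigcap(C) \equiv \{x\in\bigcup(C). \forall y\in C. x\in y\}$, which
         +should be checked against the definitions given earlier in this chapter:
         +\[ \bigcap(\emptyset)
         +   = \{x\in\bigcup(\emptyset). \forall y\in\emptyset. x\in y\}
         +   = \{x\in\emptyset. \forall y\in\emptyset. x\in y\}
         +   = \emptyset. \]
         +The bounded quantifier is vacuously true, so the result is determined
         +entirely by $\bigcup(\emptyset)=\emptyset$.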
   1.566 +
   1.567 +
   1.568 +%the [p] gives better page breaking for the book
   1.569 +\begin{figure}[p]
   1.570 +\begin{ttbox}
   1.571 +\tdx{ReplaceI}      [| x: A;  P(x,b);  !!y. P(x,y) ==> y=b |] ==> 
   1.572 +              b : {\ttlbrace}y. x:A, P(x,y){\ttrbrace}
   1.573 +
   1.574 +\tdx{ReplaceE}      [| b : {\ttlbrace}y. x:A, P(x,y){\ttrbrace};  
   1.575 +                 !!x. [| x: A;  P(x,b);  ALL y. P(x,y)-->y=b |] ==> R 
   1.576 +              |] ==> R
   1.577 +
   1.578 +\tdx{RepFunI}       [| a : A |] ==> f(a) : {\ttlbrace}f(x). x:A{\ttrbrace}
   1.579 +\tdx{RepFunE}       [| b : {\ttlbrace}f(x). x:A{\ttrbrace};  
   1.580 +                 !!x.[| x:A;  b=f(x) |] ==> P |] ==> P
   1.581 +
   1.582 +\tdx{separation}     a : {\ttlbrace}x:A. P(x){\ttrbrace} <-> a:A & P(a)
   1.583 +\tdx{CollectI}       [| a:A;  P(a) |] ==> a : {\ttlbrace}x:A. P(x){\ttrbrace}
   1.584 +\tdx{CollectE}       [| a : {\ttlbrace}x:A. P(x){\ttrbrace};  [| a:A; P(a) |] ==> R |] ==> R
   1.585 +\tdx{CollectD1}      a : {\ttlbrace}x:A. P(x){\ttrbrace} ==> a:A
   1.586 +\tdx{CollectD2}      a : {\ttlbrace}x:A. P(x){\ttrbrace} ==> P(a)
   1.587 +\end{ttbox}
   1.588 +\caption{Replacement and separation} \label{zf-lemmas2}
   1.589 +\end{figure}
   1.590 +
   1.591 +
   1.592 +\begin{figure}
   1.593 +\begin{ttbox}
   1.594 +\tdx{UnionI}    [| B: C;  A: B |] ==> A: Union(C)
   1.595 +\tdx{UnionE}    [| A : Union(C);  !!B.[| A: B;  B: C |] ==> R |] ==> R
   1.596 +
   1.597 +\tdx{InterI}    [| !!x. x: C ==> A: x;  c:C |] ==> A : Inter(C)
   1.598 +\tdx{InterD}    [| A : Inter(C);  B : C |] ==> A : B
   1.599 +\tdx{InterE}    [| A : Inter(C);  A:B ==> R;  ~ B:C ==> R |] ==> R
   1.600 +
   1.601 +\tdx{UN_I}      [| a: A;  b: B(a) |] ==> b: (UN x:A. B(x))
   1.602 +\tdx{UN_E}      [| b : (UN x:A. B(x));  !!x.[| x: A;  b: B(x) |] ==> R 
   1.603 +          |] ==> R
   1.604 +
   1.605 +\tdx{INT_I}     [| !!x. x: A ==> b: B(x);  a: A |] ==> b: (INT x:A. B(x))
   1.606 +\tdx{INT_E}     [| b : (INT x:A. B(x));  a: A |] ==> b : B(a)
   1.607 +\end{ttbox}
   1.608 +\caption{General union and intersection} \label{zf-lemmas3}
   1.609 +\end{figure}
   1.610 +
   1.611 +
   1.612 +%%% upair.ML
   1.613 +
   1.614 +\begin{figure}
   1.615 +\begin{ttbox}
   1.616 +\tdx{pairing}      a:Upair(b,c) <-> (a=b | a=c)
   1.617 +\tdx{UpairI1}      a : Upair(a,b)
   1.618 +\tdx{UpairI2}      b : Upair(a,b)
   1.619 +\tdx{UpairE}       [| a : Upair(b,c);  a = b ==> P;  a = c ==> P |] ==> P
   1.620 +\end{ttbox}
   1.621 +\caption{Unordered pairs} \label{zf-upair1}
   1.622 +\end{figure}
   1.623 +
   1.624 +
   1.625 +\begin{figure}
   1.626 +\begin{ttbox}
   1.627 +\tdx{UnI1}         c : A ==> c : A Un B
   1.628 +\tdx{UnI2}         c : B ==> c : A Un B
   1.629 +\tdx{UnCI}         (~c : B ==> c : A) ==> c : A Un B
   1.630 +\tdx{UnE}          [| c : A Un B;  c:A ==> P;  c:B ==> P |] ==> P
   1.631 +
   1.632 +\tdx{IntI}         [| c : A;  c : B |] ==> c : A Int B
   1.633 +\tdx{IntD1}        c : A Int B ==> c : A
   1.634 +\tdx{IntD2}        c : A Int B ==> c : B
   1.635 +\tdx{IntE}         [| c : A Int B;  [| c:A; c:B |] ==> P |] ==> P
   1.636 +
   1.637 +\tdx{DiffI}        [| c : A;  ~ c : B |] ==> c : A - B
   1.638 +\tdx{DiffD1}       c : A - B ==> c : A
   1.639 +\tdx{DiffD2}       c : A - B ==> c ~: B
   1.640 +\tdx{DiffE}        [| c : A - B;  [| c:A; ~ c:B |] ==> P |] ==> P
   1.641 +\end{ttbox}
   1.642 +\caption{Union, intersection, difference} \label{zf-Un}
   1.643 +\end{figure}
   1.644 +
   1.645 +
   1.646 +\begin{figure}
   1.647 +\begin{ttbox}
   1.648 +\tdx{consI1}       a : cons(a,B)
   1.649 +\tdx{consI2}       a : B ==> a : cons(b,B)
   1.650 +\tdx{consCI}       (~ a:B ==> a=b) ==> a: cons(b,B)
   1.651 +\tdx{consE}        [| a : cons(b,A);  a=b ==> P;  a:A ==> P |] ==> P
   1.652 +
   1.653 +\tdx{singletonI}   a : {\ttlbrace}a{\ttrbrace}
   1.654 +\tdx{singletonE}   [| a : {\ttlbrace}b{\ttrbrace}; a=b ==> P |] ==> P
   1.655 +\end{ttbox}
   1.656 +\caption{Finite and singleton sets} \label{zf-upair2}
   1.657 +\end{figure}
   1.658 +
   1.659 +
   1.660 +\begin{figure}
   1.661 +\begin{ttbox}
   1.662 +\tdx{succI1}       i : succ(i)
   1.663 +\tdx{succI2}       i : j ==> i : succ(j)
   1.664 +\tdx{succCI}       (~ i:j ==> i=j) ==> i: succ(j)
   1.665 +\tdx{succE}        [| i : succ(j);  i=j ==> P;  i:j ==> P |] ==> P
   1.666 +\tdx{succ_neq_0}   [| succ(n)=0 |] ==> P
   1.667 +\tdx{succ_inject}  succ(m) = succ(n) ==> m=n
   1.668 +\end{ttbox}
   1.669 +\caption{The successor function} \label{zf-succ}
   1.670 +\end{figure}
   1.671 +
   1.672 +
   1.673 +\begin{figure}
   1.674 +\begin{ttbox}
   1.675 +\tdx{the_equality}     [| P(a);  !!x. P(x) ==> x=a |] ==> (THE x. P(x)) = a
   1.676 +\tdx{theI}             EX! x. P(x) ==> P(THE x. P(x))
   1.677 +
   1.678 +\tdx{if_P}              P ==> (if P then a else b) = a
   1.679 +\tdx{if_not_P}         ~P ==> (if P then a else b) = b
   1.680 +
   1.681 +\tdx{mem_asym}         [| a:b;  b:a |] ==> P
   1.682 +\tdx{mem_irrefl}       a:a ==> P
   1.683 +\end{ttbox}
   1.684 +\caption{Descriptions; non-circularity} \label{zf-the}
   1.685 +\end{figure}
   1.686 +
   1.687 +
   1.688 +\subsection{Unordered pairs and finite sets}
   1.689 +Figure~\ref{zf-upair1} presents the principle of unordered pairing, along
   1.690 +with its derived rules.  Binary union and intersection are defined in terms
   1.691 +of unordered pairs (Fig.\ts\ref{zf-Un}).  Set difference is also included.  The
   1.692 +rule \tdx{UnCI} is useful for classical reasoning about unions,
   1.693 +like \texttt{disjCI}\@; it supersedes \tdx{UnI1} and
   1.694 +\tdx{UnI2}, but these rules are often easier to work with.  For
   1.695 +intersection and difference we have both elimination and destruction rules.
   1.696 +Again, there is no reason to provide a minimal rule set.
   1.697 +
   1.698 +Figure~\ref{zf-upair2} is concerned with finite sets: it presents rules
   1.699 +for~\texttt{cons}, the finite set constructor, and rules for singleton
   1.700 +sets.  Figure~\ref{zf-succ} presents derived rules for the successor
   1.701 +function, which is defined in terms of~\texttt{cons}.  The proof that {\tt
   1.702 +  succ} is injective appears to require the Axiom of Foundation.
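         +
         +The role of Foundation can be seen in the following informal sketch.
         +It assumes the definition $\texttt{succ}(i)\equiv\texttt{cons}(i,i)$ and
         +is not a transcript of the proof in the theory files.  Suppose
         +$\texttt{succ}(m)=\texttt{succ}(n)$.  Since $m\in\texttt{succ}(m)$ by
         +\texttt{succI1}, we have $m\in\texttt{succ}(n)$, so \texttt{succE} gives
         +$m=n$ or $m\in n$.  Symmetrically, $n=m$ or $n\in m$.  If $m\neq n$ then
         +both $m\in n$ and $n\in m$ hold, which \texttt{mem_asym}
         +(Fig.\ts\ref{zf-the}) forbids; hence $m=n$.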
   1.703 +
   1.704 +Definite descriptions (\sdx{THE}) are defined in terms of the singleton
   1.705 +set~$\{0\}$, but their derived rules fortunately hide this
   1.706 +(Fig.\ts\ref{zf-the}).  The rule~\tdx{theI} is difficult to apply
   1.707 +because of the two occurrences of~$\Var{P}$.  However,
   1.708 +\tdx{the_equality} does not have this problem and the files contain
   1.709 +many examples of its use.
   1.710 +
   1.711 +Finally, the impossibility of having both $a\in b$ and $b\in a$
   1.712 +(\tdx{mem_asym}) is proved by applying the Axiom of Foundation to
   1.713 +the set $\{a,b\}$.  The impossibility of $a\in a$ is a trivial consequence.
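         +
         +In outline, the argument runs as follows.  This is a sketch, which
         +assumes that the Axiom of Foundation is stated in the form
         +$A=\emptyset \lor (\exists x\in A. \forall y\in x. y\notin A)$.  The set
         +$\{a,b\}$ is non-empty, so it has an element~$x$ none of whose members
         +belongs to $\{a,b\}$.  If $x=a$ then $b\notin a$; if $x=b$ then
         +$a\notin b$.  In either case $a\in b$ and $b\in a$ cannot both hold.
         +Taking $b$ to be~$a$ shows that $a\in a$ is impossible.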
   1.714 +
   1.715 +See the file \texttt{ZF/upair.ML} for full proofs of the rules discussed in
   1.716 +this section.
   1.717 +
   1.718 +
   1.719 +%%% subset.ML
   1.720 +
   1.721 +\begin{figure}
   1.722 +\begin{ttbox}
   1.723 +\tdx{Union_upper}       B:A ==> B <= Union(A)
   1.724 +\tdx{Union_least}       [| !!x. x:A ==> x<=C |] ==> Union(A) <= C
   1.725 +
   1.726 +\tdx{Inter_lower}       B:A ==> Inter(A) <= B
   1.727 +\tdx{Inter_greatest}    [| a:A;  !!x. x:A ==> C<=x |] ==> C <= Inter(A)
   1.728 +
   1.729 +\tdx{Un_upper1}         A <= A Un B
   1.730 +\tdx{Un_upper2}         B <= A Un B
   1.731 +\tdx{Un_least}          [| A<=C;  B<=C |] ==> A Un B <= C
   1.732 +
   1.733 +\tdx{Int_lower1}        A Int B <= A
   1.734 +\tdx{Int_lower2}        A Int B <= B
   1.735 +\tdx{Int_greatest}      [| C<=A;  C<=B |] ==> C <= A Int B
   1.736 +
   1.737 +\tdx{Diff_subset}       A-B <= A
   1.738 +\tdx{Diff_contains}     [| C<=A;  C Int B = 0 |] ==> C <= A-B
   1.739 +
   1.740 +\tdx{Collect_subset}    Collect(A,P) <= A
   1.741 +\end{ttbox}
   1.742 +\caption{Subset and lattice properties} \label{zf-subset}
   1.743 +\end{figure}
   1.744 +
   1.745 +
   1.746 +\subsection{Subset and lattice properties}
   1.747 +The subset relation is a complete lattice.  Unions form least upper bounds;
   1.748 +non-empty intersections form greatest lower bounds.  Figure~\ref{zf-subset}
   1.749 +shows the corresponding rules.  A few other laws involving subsets are
   1.750 +included.  Proofs are in the file \texttt{ZF/subset.ML}.
   1.751 +
   1.752 +Reasoning directly about subsets often yields clearer proofs than
   1.753 +reasoning about the membership relation.  Section~\ref{sec:ZF-pow-example}
   1.754 +below presents an example of this, proving the equation ${{\tt Pow}(A)\cap
   1.755 +  {\tt Pow}(B)}= {\tt Pow}(A\cap B)$.
   1.756 +
   1.757 +%%% pair.ML
   1.758 +
   1.759 +\begin{figure}
   1.760 +\begin{ttbox}
   1.761 +\tdx{Pair_inject1}    <a,b> = <c,d> ==> a=c
   1.762 +\tdx{Pair_inject2}    <a,b> = <c,d> ==> b=d
   1.763 +\tdx{Pair_inject}     [| <a,b> = <c,d>;  [| a=c; b=d |] ==> P |] ==> P
   1.764 +\tdx{Pair_neq_0}      <a,b>=0 ==> P
   1.765 +
   1.766 +\tdx{fst_conv}        fst(<a,b>) = a
   1.767 +\tdx{snd_conv}        snd(<a,b>) = b
   1.768 +\tdx{split}           split(\%x y. c(x,y), <a,b>) = c(a,b)
   1.769 +
   1.770 +\tdx{SigmaI}          [| a:A;  b:B(a) |] ==> <a,b> : Sigma(A,B)
   1.771 +
   1.772 +\tdx{SigmaE}          [| c: Sigma(A,B);  
   1.773 +                   !!x y.[| x:A; y:B(x); c=<x,y> |] ==> P |] ==> P
   1.774 +
   1.775 +\tdx{SigmaE2}         [| <a,b> : Sigma(A,B);    
   1.776 +                   [| a:A;  b:B(a) |] ==> P   |] ==> P
   1.777 +\end{ttbox}
   1.778 +\caption{Ordered pairs; projections; general sums} \label{zf-pair}
   1.779 +\end{figure}
   1.780 +
   1.781 +
   1.782 +\subsection{Ordered pairs} \label{sec:pairs}
   1.783 +
   1.784 +Figure~\ref{zf-pair} presents the rules governing ordered pairs,
   1.785 +projections and general sums.  File \texttt{ZF/pair.ML} contains the
   1.786 +full (and tedious) proof that $\{\{a\},\{a,b\}\}$ functions as an ordered
   1.787 +pair.  This property is expressed as two destruction rules,
   1.788 +\tdx{Pair_inject1} and \tdx{Pair_inject2}, and equivalently
   1.789 +as the elimination rule \tdx{Pair_inject}.
   1.790 +
   1.791 +The rule \tdx{Pair_neq_0} asserts $\pair{a,b}\neq\emptyset$.  This
   1.792 +is a property of $\{\{a\},\{a,b\}\}$, and need not hold for other 
   1.793 +encodings of ordered pairs.  The non-standard ordered pairs mentioned below
   1.794 +satisfy $\pair{\emptyset;\emptyset}=\emptyset$.
   1.795 +
   1.796 +The natural deduction rules \tdx{SigmaI} and \tdx{SigmaE}
   1.797 +assert that \cdx{Sigma}$(A,B)$ consists of all pairs of the form
   1.798 +$\pair{x,y}$, for $x\in A$ and $y\in B(x)$.  The rule \tdx{SigmaE2}
   1.799 +merely states that $\pair{a,b}\in \texttt{Sigma}(A,B)$ implies $a\in A$ and
   1.800 +$b\in B(a)$.
   1.801 +
   1.802 +In addition, it is possible to use tuples as patterns in abstractions:
   1.803 +\begin{center}
   1.804 +{\tt\%<$x$,$y$>. $t$} \quad stands for\quad \texttt{split(\%$x$ $y$.\ $t$)}
   1.805 +\end{center}
   1.806 +Nested patterns are translated recursively:
   1.807 +{\tt\%<$x$,$y$,$z$>. $t$} $\leadsto$ {\tt\%<$x$,<$y$,$z$>>. $t$} $\leadsto$
   1.808 +\texttt{split(\%$x$.\%<$y$,$z$>. $t$)} $\leadsto$ \texttt{split(\%$x$. split(\%$y$
   1.809 +  $z$.\ $t$))}.  The reverse translation is performed upon printing.
   1.810 +\begin{warn}
   1.811 +  The translation between patterns and \texttt{split} is performed automatically
   1.812 +  by the parser and printer.  Thus the internal and external form of a term
   1.813 +  may differ, which affects proofs.  For example, the term {\tt
   1.814 +    (\%<x,y>.<y,x>)<a,b>} requires the theorem \texttt{split} to rewrite to
   1.815 +  {\tt<b,a>}.
   1.816 +\end{warn}
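         +To make the warning concrete, here is the rewriting step written out.
         +It is a sketch: the equation is the instance of theorem \texttt{split}
         +(Fig.\ts\ref{zf-pair}) obtained by putting {\tt\%x y.<y,x>} for
         +\texttt{c}, and its left-hand side is the internal form of the term above.
         +\begin{ttbox}
         +split(\%x y. <y,x>, <a,b>) = <b,a>
         +\end{ttbox}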
   1.817 +In addition to explicit $\lambda$-abstractions, patterns can be used in any
   1.818 +variable binding construct which is internally described by a
   1.819 +$\lambda$-abstraction.  Here are some important examples:
   1.820 +\begin{description}
   1.821 +\item[Let:] \texttt{let {\it pattern} = $t$ in $u$}
   1.822 +\item[Choice:] \texttt{THE~{\it pattern}~.~$P$}
   1.823 +\item[Set operations:] \texttt{UN~{\it pattern}:$A$.~$B$}
   1.824 +\item[Comprehension:] \texttt{{\ttlbrace}~{\it pattern}:$A$~.~$P$~{\ttrbrace}}
   1.825 +\end{description}
   1.826 +
   1.827 +
   1.828 +%%% domrange.ML
   1.829 +
   1.830 +\begin{figure}
   1.831 +\begin{ttbox}
   1.832 +\tdx{domainI}        <a,b>: r ==> a : domain(r)
   1.833 +\tdx{domainE}        [| a : domain(r);  !!y. <a,y>: r ==> P |] ==> P
   1.834 +\tdx{domain_subset}  domain(Sigma(A,B)) <= A
   1.835 +
   1.836 +\tdx{rangeI}         <a,b>: r ==> b : range(r)
   1.837 +\tdx{rangeE}         [| b : range(r);  !!x. <x,b>: r ==> P |] ==> P
   1.838 +\tdx{range_subset}   range(A*B) <= B
   1.839 +
   1.840 +\tdx{fieldI1}        <a,b>: r ==> a : field(r)
   1.841 +\tdx{fieldI2}        <a,b>: r ==> b : field(r)
   1.842 +\tdx{fieldCI}        (~ <c,a>:r ==> <a,b>: r) ==> a : field(r)
   1.843 +
   1.844 +\tdx{fieldE}         [| a : field(r);  
   1.845 +                  !!x. <a,x>: r ==> P;  
   1.846 +                  !!x. <x,a>: r ==> P      
   1.847 +               |] ==> P
   1.848 +
   1.849 +\tdx{field_subset}   field(A*A) <= A
   1.850 +\end{ttbox}
   1.851 +\caption{Domain, range and field of a relation} \label{zf-domrange}
   1.852 +\end{figure}
   1.853 +
   1.854 +\begin{figure}
   1.855 +\begin{ttbox}
   1.856 +\tdx{imageI}         [| <a,b>: r;  a:A |] ==> b : r``A
   1.857 +\tdx{imageE}         [| b: r``A;  !!x.[| <x,b>: r;  x:A |] ==> P |] ==> P
   1.858 +
   1.859 +\tdx{vimageI}        [| <a,b>: r;  b:B |] ==> a : r-``B
   1.860 +\tdx{vimageE}        [| a: r-``B;  !!x.[| <a,x>: r;  x:B |] ==> P |] ==> P
   1.861 +\end{ttbox}
   1.862 +\caption{Image and inverse image} \label{zf-domrange2}
   1.863 +\end{figure}
   1.864 +
   1.865 +
   1.866 +\subsection{Relations}
   1.867 +Figure~\ref{zf-domrange} presents rules involving relations, which are sets
   1.868 +of ordered pairs.  The converse of a relation~$r$ is the set of all pairs
   1.869 +$\pair{y,x}$ such that $\pair{x,y}\in r$; if $r$ is an injective function, then
   1.870 +{\cdx{converse}$(r)$} is its inverse.  The rules for the domain
   1.871 +operation, namely \tdx{domainI} and~\tdx{domainE}, assert that
   1.872 +\cdx{domain}$(r)$ consists of all~$x$ such that $r$ contains
   1.873 +some pair of the form~$\pair{x,y}$.  The range operation is similar, and
   1.874 +the field of a relation is merely the union of its domain and range.  
   1.875 +
   1.876 +Figure~\ref{zf-domrange2} presents rules for images and inverse images.
   1.877 +Note that these operations are generalisations of range and domain,
   1.878 +respectively.  See the file \texttt{ZF/domrange.ML} for derivations of the
   1.879 +rules.
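         +
         +As an illustration, let \texttt{r} be the relation consisting of the
         +two pairs \texttt{<a,b>} and \texttt{<b,c>}, where \texttt{a}, \texttt{b}
         +and~\texttt{c} are assumed to be distinct sets.  The equations below are
         +a hand calculation for this particular~\texttt{r}, not theorems of the
         +library:
         +\begin{ttbox}
         +domain(r)    = {\ttlbrace}a,b{\ttrbrace}
         +range(r)     = {\ttlbrace}b,c{\ttrbrace}
         +field(r)     = {\ttlbrace}a,b,c{\ttrbrace}
         +converse(r)  = {\ttlbrace}<b,a>, <c,b>{\ttrbrace}
         +r``{\ttlbrace}a{\ttrbrace}       = {\ttlbrace}b{\ttrbrace}
         +r-``{\ttlbrace}c{\ttrbrace}      = {\ttlbrace}b{\ttrbrace}
         +r``domain(r)  = range(r)
         +r-``range(r)  = domain(r)
         +\end{ttbox}
         +The last two lines show the sense in which image and inverse image
         +generalise range and domain.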
   1.880 +
   1.881 +
   1.882 +%%% func.ML
   1.883 +
   1.884 +\begin{figure}
   1.885 +\begin{ttbox}
   1.886 +\tdx{fun_is_rel}      f: Pi(A,B) ==> f <= Sigma(A,B)
   1.887 +
   1.888 +\tdx{apply_equality}  [| <a,b>: f;  f: Pi(A,B) |] ==> f`a = b
   1.889 +\tdx{apply_equality2} [| <a,b>: f;  <a,c>: f;  f: Pi(A,B) |] ==> b=c
   1.890 +
   1.891 +\tdx{apply_type}      [| f: Pi(A,B);  a:A |] ==> f`a : B(a)
   1.892 +\tdx{apply_Pair}      [| f: Pi(A,B);  a:A |] ==> <a,f`a>: f
   1.893 +\tdx{apply_iff}       f: Pi(A,B) ==> <a,b>: f <-> a:A & f`a = b
   1.894 +
   1.895 +\tdx{fun_extension}   [| f : Pi(A,B);  g: Pi(A,D);
   1.896 +                   !!x. x:A ==> f`x = g`x     |] ==> f=g
   1.897 +
   1.898 +\tdx{domain_type}     [| <a,b> : f;  f: Pi(A,B) |] ==> a : A
   1.899 +\tdx{range_type}      [| <a,b> : f;  f: Pi(A,B) |] ==> b : B(a)
   1.900 +
   1.901 +\tdx{Pi_type}         [| f: A->C;  !!x. x:A ==> f`x: B(x) |] ==> f: Pi(A,B)
   1.902 +\tdx{domain_of_fun}   f: Pi(A,B) ==> domain(f)=A
   1.903 +\tdx{range_of_fun}    f: Pi(A,B) ==> f: A->range(f)
   1.904 +
   1.905 +\tdx{restrict}        a : A ==> restrict(f,A) ` a = f`a
   1.906 +\tdx{restrict_type}   [| !!x. x:A ==> f`x: B(x) |] ==> 
   1.907 +                restrict(f,A) : Pi(A,B)
   1.908 +\end{ttbox}
   1.909 +\caption{Functions} \label{zf-func1}
   1.910 +\end{figure}
   1.911 +
   1.912 +
   1.913 +\begin{figure}
   1.914 +\begin{ttbox}
   1.915 +\tdx{lamI}         a:A ==> <a,b(a)> : (lam x:A. b(x))
   1.916 +\tdx{lamE}         [| p: (lam x:A. b(x));  !!x.[| x:A; p=<x,b(x)> |] ==> P 
   1.917 +             |] ==>  P
   1.918 +
   1.919 +\tdx{lam_type}     [| !!x. x:A ==> b(x): B(x) |] ==> (lam x:A. b(x)) : Pi(A,B)
   1.920 +
   1.921 +\tdx{beta}         a : A ==> (lam x:A. b(x)) ` a = b(a)
   1.922 +\tdx{eta}          f : Pi(A,B) ==> (lam x:A. f`x) = f
   1.923 +\end{ttbox}
   1.924 +\caption{$\lambda$-abstraction} \label{zf-lam}
   1.925 +\end{figure}
   1.926 +
   1.927 +
   1.928 +\begin{figure}
   1.929 +\begin{ttbox}
   1.930 +\tdx{fun_empty}            0: 0->0
   1.931 +\tdx{fun_single}           {\ttlbrace}<a,b>{\ttrbrace} : {\ttlbrace}a{\ttrbrace} -> {\ttlbrace}b{\ttrbrace}
   1.932 +
   1.933 +\tdx{fun_disjoint_Un}      [| f: A->B;  g: C->D;  A Int C = 0  |] ==>  
   1.934 +                     (f Un g) : (A Un C) -> (B Un D)
   1.935 +
   1.936 +\tdx{fun_disjoint_apply1}  [| a:A;  f: A->B;  g: C->D;  A Int C = 0 |] ==>  
   1.937 +                     (f Un g)`a = f`a
   1.938 +
   1.939 +\tdx{fun_disjoint_apply2}  [| c:C;  f: A->B;  g: C->D;  A Int C = 0 |] ==>  
   1.940 +                     (f Un g)`c = g`c
   1.941 +\end{ttbox}
   1.942 +\caption{Constructing functions from smaller sets} \label{zf-func2}
   1.943 +\end{figure}
   1.944 +
   1.945 +
   1.946 +\subsection{Functions}
   1.947 +Functions, represented by graphs, are notoriously difficult to reason
   1.948 +about.  The file \texttt{ZF/func.ML} derives many rules, which overlap more
   1.949 +than they ought.  This section presents the more important rules.
   1.950 +
   1.951 +Figure~\ref{zf-func1} presents the basic properties of \cdx{Pi}$(A,B)$,
   1.952 +the generalized function space.  For example, if $f$ is a function and
   1.953 +$\pair{a,b}\in f$, then $f`a=b$ (\tdx{apply_equality}).  Two functions
   1.954 +are equal provided they have equal domains and deliver equal results
   1.955 +(\tdx{fun_extension}).
   1.956 +
   1.957 +By \tdx{Pi_type}, a function typing of the form $f\in A\to C$ can be
   1.958 +refined to the dependent typing $f\in\prod@{x\in A}B(x)$, given a suitable
   1.959 +family of sets $\{B(x)\}@{x\in A}$.  Conversely, by \tdx{range_of_fun},
   1.960 +any dependent typing can be flattened to yield a function type of the form
   1.961 +$A\to C$; here, $C={\tt range}(f)$.
   1.962 +
   1.963 +Among the laws for $\lambda$-abstraction, \tdx{lamI} and \tdx{lamE}
   1.964 +describe the graph of the generated function, while \tdx{beta} and
   1.965 +\tdx{eta} are the standard conversions.  We essentially have a
   1.966 +dependently-typed $\lambda$-calculus (Fig.\ts\ref{zf-lam}).
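         +
         +For instance, consider the diagonal function on a set~\texttt{A}, an
         +illustrative term that is not a constant of the theory.  Putting
         +{\tt\%x.<x,x>} for \texttt{b} in \texttt{lam_type} and \texttt{beta},
         +and taking the constant family \texttt{A*A} in \texttt{lam_type}, gives
         +the following instances (a sketch, not output from Isabelle):
         +\begin{ttbox}
         +(!!x. x:A ==> <x,x> : A*A) ==> (lam x:A. <x,x>) : A -> A*A
         +a : A ==> (lam x:A. <x,x>) ` a = <a,a>
         +\end{ttbox}
         +The premise of the first instance follows from \texttt{SigmaI}
         +(Fig.\ts\ref{zf-pair}).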
   1.967 +
   1.968 +Figure~\ref{zf-func2} presents some rules that can be used to construct
   1.969 +functions explicitly.  We start with functions consisting of at most one
   1.970 +pair, and may form the union of two functions provided their domains are
   1.971 +disjoint.  
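         +
         +For example, a two-element function can be assembled as follows (a
         +sketch; the sets $a$, $b$, $c$, $d$ are arbitrary, and $a\neq c$ is an
         +assumption made for the illustration).  By \texttt{fun_single},
         +\[ \{\pair{a,b}\}\in\{a\}\to\{b\} \quad\mbox{and}\quad
         +   \{\pair{c,d}\}\in\{c\}\to\{d\}. \]
         +Since $a\neq c$ we have $\{a\}\cap\{c\}=\emptyset$, so
         +\texttt{fun_disjoint_Un} yields
         +\[ \{\pair{a,b}\}\cup\{\pair{c,d}\} \in
         +   \{a\}\cup\{c\} \to \{b\}\cup\{d\}. \]
         +Finally, \texttt{fun_disjoint_apply1} reduces application of the union
         +at~$a$ to application of $\{\pair{a,b}\}$ at~$a$, which equals~$b$ by
         +\texttt{apply_equality}.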
   1.972 +
   1.973 +
   1.974 +\begin{figure}
   1.975 +\begin{ttbox}
   1.976 +\tdx{Int_absorb}         A Int A = A
   1.977 +\tdx{Int_commute}        A Int B = B Int A
   1.978 +\tdx{Int_assoc}          (A Int B) Int C  =  A Int (B Int C)
   1.979 +\tdx{Int_Un_distrib}     (A Un B) Int C  =  (A Int C) Un (B Int C)
   1.980 +
   1.981 +\tdx{Un_absorb}          A Un A = A
   1.982 +\tdx{Un_commute}         A Un B = B Un A
   1.983 +\tdx{Un_assoc}           (A Un B) Un C  =  A Un (B Un C)
   1.984 +\tdx{Un_Int_distrib}     (A Int B) Un C  =  (A Un C) Int (B Un C)
   1.985 +
   1.986 +\tdx{Diff_cancel}        A-A = 0
   1.987 +\tdx{Diff_disjoint}      A Int (B-A) = 0
   1.988 +\tdx{Diff_partition}     A<=B ==> A Un (B-A) = B
   1.989 +\tdx{double_complement}  [| A<=B; B<= C |] ==> (B - (C-A)) = A
   1.990 +\tdx{Diff_Un}            A - (B Un C) = (A-B) Int (A-C)
   1.991 +\tdx{Diff_Int}           A - (B Int C) = (A-B) Un (A-C)
   1.992 +
   1.993 +\tdx{Union_Un_distrib}   Union(A Un B) = Union(A) Un Union(B)
   1.994 +\tdx{Inter_Un_distrib}   [| a:A;  b:B |] ==> 
   1.995 +                   Inter(A Un B) = Inter(A) Int Inter(B)
   1.996 +
   1.997 +\tdx{Int_Union_RepFun}   A Int Union(B) = (UN C:B. A Int C)
   1.998 +
   1.999 +\tdx{Un_Inter_RepFun}    b:B ==> 
  1.1000 +                   A Un Inter(B) = (INT C:B. A Un C)
  1.1001 +
  1.1002 +\tdx{SUM_Un_distrib1}    (SUM x:A Un B. C(x)) = 
  1.1003 +                   (SUM x:A. C(x)) Un (SUM x:B. C(x))
  1.1004 +
  1.1005 +\tdx{SUM_Un_distrib2}    (SUM x:C. A(x) Un B(x)) =
  1.1006 +                   (SUM x:C. A(x))  Un  (SUM x:C. B(x))
  1.1007 +
  1.1008 +\tdx{SUM_Int_distrib1}   (SUM x:A Int B. C(x)) =
  1.1009 +                   (SUM x:A. C(x)) Int (SUM x:B. C(x))
  1.1010 +
  1.1011 +\tdx{SUM_Int_distrib2}   (SUM x:C. A(x) Int B(x)) =
  1.1012 +                   (SUM x:C. A(x)) Int (SUM x:C. B(x))
  1.1013 +\end{ttbox}
  1.1014 +\caption{Equalities} \label{zf-equalities}
  1.1015 +\end{figure}
  1.1016 +
  1.1017 +
  1.1018 +\begin{figure}
  1.1019 +%\begin{constants} 
  1.1020 +%  \cdx{1}       & $i$           &       & $\{\emptyset\}$       \\
  1.1021 +%  \cdx{bool}    & $i$           &       & the set $\{\emptyset,1\}$     \\
  1.1022 +%  \cdx{cond}   & $[i,i,i]\To i$ &       & conditional for \texttt{bool}    \\
  1.1023 +%  \cdx{not}    & $i\To i$       &       & negation for \texttt{bool}       \\
  1.1024 +%  \sdx{and}    & $[i,i]\To i$   & Left 70 & conjunction for \texttt{bool}  \\
  1.1025 +%  \sdx{or}     & $[i,i]\To i$   & Left 65 & disjunction for \texttt{bool}  \\
  1.1026 +%  \sdx{xor}    & $[i,i]\To i$   & Left 65 & exclusive-or for \texttt{bool}
  1.1027 +%\end{constants}
  1.1028 +%
  1.1029 +\begin{ttbox}
  1.1030 +\tdx{bool_def}       bool == {\ttlbrace}0,1{\ttrbrace}
  1.1031 +\tdx{cond_def}       cond(b,c,d) == if b=1 then c else d
  1.1032 +\tdx{not_def}        not(b)  == cond(b,0,1)
  1.1033 +\tdx{and_def}        a and b == cond(a,b,0)
  1.1034 +\tdx{or_def}         a or b  == cond(a,1,b)
  1.1035 +\tdx{xor_def}        a xor b == cond(a,not(b),b)
  1.1036 +
  1.1037 +\tdx{bool_1I}        1 : bool
  1.1038 +\tdx{bool_0I}        0 : bool
  1.1039 +\tdx{boolE}          [| c: bool;  c=1 ==> P;  c=0 ==> P |] ==> P
  1.1040 +\tdx{cond_1}         cond(1,c,d) = c
  1.1041 +\tdx{cond_0}         cond(0,c,d) = d
  1.1042 +\end{ttbox}
  1.1043 +\caption{The booleans} \label{zf-bool}
  1.1044 +\end{figure}
  1.1045 +
  1.1046 +
  1.1047 +\section{Further developments}
  1.1048 +The next group of developments is complex and extensive, and only
  1.1049 +highlights can be covered here.  It involves many theories and ML files of
  1.1050 +proofs. 
  1.1051 +
  1.1052 +Figure~\ref{zf-equalities} presents commutative, associative, distributive,
  1.1053 +and idempotency laws of union and intersection, along with other equations.
  1.1054 +See file \texttt{ZF/equalities.ML}.
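          +
          +Each of these equations can be justified by extensionality.  As an
          +illustration (a hand derivation, not the proof in
          +\texttt{ZF/equalities.ML}), here is the membership argument behind
          +\texttt{Diff_Un}:
          +\begin{eqnarray*}
          +  x\in A-(B\cup C)
          +    & \iff & x\in A \land \neg(x\in B \lor x\in C) \\
          +    & \iff & (x\in A \land x\notin B) \land (x\in A \land x\notin C) \\
          +    & \iff & x\in (A-B)\cap(A-C)
          +\end{eqnarray*}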
  1.1055 +
  1.1056 +Theory \thydx{Bool} defines $\{0,1\}$ as a set of booleans, with the usual
  1.1057 +operators including a conditional (Fig.\ts\ref{zf-bool}).  Although {\ZF} is a
  1.1058 +first-order theory, you can obtain the effect of higher-order logic using
  1.1059 +\texttt{bool}-valued functions, for example.  The constant~\texttt{1} is
  1.1060 +translated to \texttt{succ(0)}.
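          +
          +The definitions in Fig.\ts\ref{zf-bool} can be evaluated by hand using
          +\texttt{cond_1} and \texttt{cond_0}.  The following calculations are
          +illustrations only, not named theorems:
          +\begin{ttbox}
          +1 xor 1  =  cond(1, not(1), 1)  =  not(1)  =  cond(1,0,1)  =  0
          +0 xor 1  =  cond(0, not(1), 1)  =  1
          +1 and 0  =  cond(1,0,0)  =  0
          +0 or 1   =  cond(0,1,1)  =  1
          +\end{ttbox}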
  1.1061 +
  1.1062 +\begin{figure}
  1.1063 +\index{*"+ symbol}
  1.1064 +\begin{constants}
  1.1065 +  \it symbol    & \it meta-type & \it priority & \it description \\ 
  1.1066 +  \tt +         & $[i,i]\To i$  &  Right 65     & disjoint union operator\\
  1.1067 +  \cdx{Inl}~~\cdx{Inr}  & $i\To i$      &       & injections\\
  1.1068 +  \cdx{case}    & $[i\To i,i\To i, i]\To i$ &   & conditional for $A+B$
  1.1069 +\end{constants}
  1.1070 +\begin{ttbox}
  1.1071 +\tdx{sum_def}        A+B == {\ttlbrace}0{\ttrbrace}*A Un {\ttlbrace}1{\ttrbrace}*B
  1.1072 +\tdx{Inl_def}        Inl(a) == <0,a>
  1.1073 +\tdx{Inr_def}        Inr(b) == <1,b>
  1.1074 +\tdx{case_def}       case(c,d,u) == split(\%y z. cond(y, d(z), c(z)), u)
  1.1075 +
  1.1076 +\tdx{sum_InlI}       a : A ==> Inl(a) : A+B
  1.1077 +\tdx{sum_InrI}       b : B ==> Inr(b) : A+B
  1.1078 +
  1.1079 +\tdx{Inl_inject}     Inl(a)=Inl(b) ==> a=b
  1.1080 +\tdx{Inr_inject}     Inr(a)=Inr(b) ==> a=b
  1.1081 +\tdx{Inl_neq_Inr}    Inl(a)=Inr(b) ==> P
  1.1082 +
  1.1083 +\tdx{sumE2}   u: A+B ==> (EX x. x:A & u=Inl(x)) | (EX y. y:B & u=Inr(y))
  1.1084 +
  1.1085 +\tdx{case_Inl}       case(c,d,Inl(a)) = c(a)
  1.1086 +\tdx{case_Inr}       case(c,d,Inr(b)) = d(b)
  1.1087 +\end{ttbox}
  1.1088 +\caption{Disjoint unions} \label{zf-sum}
  1.1089 +\end{figure}
  1.1090 +
  1.1091 +
  1.1092 +Theory \thydx{Sum} defines the disjoint union of two sets, with
  1.1093 +injections and a case analysis operator (Fig.\ts\ref{zf-sum}).  Disjoint
  1.1094 +unions play a role in datatype definitions, particularly when there is
  1.1095 +mutual recursion~\cite{paulson-set-II}.
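          +
          +The two equations for \texttt{case} follow from the definitions by
          +direct calculation.  The sketch below unfolds \texttt{case_def} together
          +with \texttt{Inl_def} or \texttt{Inr_def}, then applies theorem
          +\texttt{split} (Fig.\ts\ref{zf-pair}), and finally \texttt{cond_0} or
          +\texttt{cond_1} (Fig.\ts\ref{zf-bool}):
          +\begin{ttbox}
          +case(c,d,Inl(a)) = split(\%y z. cond(y,d(z),c(z)), <0,a>)
          +                 = cond(0, d(a), c(a))
          +                 = c(a)
          +case(c,d,Inr(b)) = split(\%y z. cond(y,d(z),c(z)), <1,b>)
          +                 = cond(1, d(b), c(b))
          +                 = d(b)
          +\end{ttbox}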
  1.1096 +
  1.1097 +\begin{figure}
  1.1098 +\begin{ttbox}
  1.1099 +\tdx{QPair_def}       <a;b> == a+b
  1.1100 +\tdx{qsplit_def}      qsplit(c,p)  == THE y. EX a b. p=<a;b> & y=c(a,b)
  1.1101 +\tdx{qfsplit_def}     qfsplit(R,z) == EX x y. z=<x;y> & R(x,y)
  1.1102 +\tdx{qconverse_def}   qconverse(r) == {\ttlbrace}z. w:r, EX x y. w=<x;y> & z=<y;x>{\ttrbrace}
  1.1103 +\tdx{QSigma_def}      QSigma(A,B)  == UN x:A. UN y:B(x). {\ttlbrace}<x;y>{\ttrbrace}
  1.1104 +
  1.1105 +\tdx{qsum_def}        A <+> B      == ({\ttlbrace}0{\ttrbrace} <*> A) Un ({\ttlbrace}1{\ttrbrace} <*> B)
  1.1106 +\tdx{QInl_def}        QInl(a)      == <0;a>
  1.1107 +\tdx{QInr_def}        QInr(b)      == <1;b>
  1.1108 +\tdx{qcase_def}       qcase(c,d)   == qsplit(\%y z. cond(y, d(z), c(z)))
  1.1109 +\end{ttbox}
  1.1110 +\caption{Non-standard pairs, products and sums} \label{zf-qpair}
  1.1111 +\end{figure}
  1.1112 +
  1.1113 +Theory \thydx{QPair} defines a notion of ordered pair that admits
  1.1114 +non-well-founded tupling (Fig.\ts\ref{zf-qpair}).  Such pairs are written
  1.1115 +{\tt<$a$;$b$>}.  It also defines the eliminator \cdx{qsplit}, the
  1.1116 +converse operator \cdx{qconverse}, and the summation operator
  1.1117 +\cdx{QSigma}.  These are completely analogous to the corresponding
  1.1118 +versions for standard ordered pairs.  The theory goes on to define a
  1.1119 +non-standard notion of disjoint sum using non-standard pairs.  All of these
  1.1120 +concepts satisfy the same properties as their standard counterparts; in
  1.1121 +addition, {\tt<$a$;$b$>} is continuous.  The theory supports coinductive
  1.1122 +definitions, for example of infinite lists~\cite{paulson-final}.
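          +
          +One consequence of \texttt{QPair_def} can be checked by hand.  Unfolding
          +it together with \texttt{sum_def} (Fig.\ts\ref{zf-sum}), and using the
          +fact that a Cartesian product with the empty set is empty, gives
          +\[ \pair{\emptyset;\emptyset} = \emptyset+\emptyset
          +   = (\{0\}\times\emptyset) \cup (\{1\}\times\emptyset)
          +   = \emptyset\cup\emptyset = \emptyset. \]
          +This is the equation quoted in the discussion of \texttt{Pair_neq_0}
          +in Sect.\ts\ref{sec:pairs}.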
  1.1123 +
  1.1124 +\begin{figure}
  1.1125 +\begin{ttbox}
  1.1126 +\tdx{bnd_mono_def}   bnd_mono(D,h) == 
  1.1127 +                 h(D)<=D & (ALL W X. W<=X --> X<=D --> h(W) <= h(X))
  1.1128 +
  1.1129 +\tdx{lfp_def}        lfp(D,h) == Inter({\ttlbrace}X: Pow(D). h(X) <= X{\ttrbrace})
  1.1130 +\tdx{gfp_def}        gfp(D,h) == Union({\ttlbrace}X: Pow(D). X <= h(X){\ttrbrace})
  1.1131 +
  1.1132 +
  1.1133 +\tdx{lfp_lowerbound} [| h(A) <= A;  A<=D |] ==> lfp(D,h) <= A
  1.1134 +
  1.1135 +\tdx{lfp_subset}     lfp(D,h) <= D
  1.1136 +
  1.1137 +\tdx{lfp_greatest}   [| bnd_mono(D,h);  
  1.1138 +                  !!X. [| h(X) <= X;  X<=D |] ==> A<=X 
  1.1139 +               |] ==> A <= lfp(D,h)
  1.1140 +
  1.1141 +\tdx{lfp_Tarski}     bnd_mono(D,h) ==> lfp(D,h) = h(lfp(D,h))
  1.1142 +
  1.1143 +\tdx{induct}         [| a : lfp(D,h);  bnd_mono(D,h);
  1.1144 +                  !!x. x : h(Collect(lfp(D,h),P)) ==> P(x)
  1.1145 +               |] ==> P(a)
  1.1146 +
  1.1147 +\tdx{lfp_mono}       [| bnd_mono(D,h);  bnd_mono(E,i);
  1.1148 +                  !!X. X<=D ==> h(X) <= i(X)  
  1.1149 +               |] ==> lfp(D,h) <= lfp(E,i)
  1.1150 +
  1.1151 +\tdx{gfp_upperbound} [| A <= h(A);  A<=D |] ==> A <= gfp(D,h)
  1.1152 +
  1.1153 +\tdx{gfp_subset}     gfp(D,h) <= D
  1.1154 +
  1.1155 +\tdx{gfp_least}      [| bnd_mono(D,h);  
  1.1156 +                  !!X. [| X <= h(X);  X<=D |] ==> X<=A
  1.1157 +               |] ==> gfp(D,h) <= A
  1.1158 +
  1.1159 +\tdx{gfp_Tarski}     bnd_mono(D,h) ==> gfp(D,h) = h(gfp(D,h))
  1.1160 +
  1.1161 +\tdx{coinduct}       [| bnd_mono(D,h); a: X; X <= h(X Un gfp(D,h)); X <= D 
  1.1162 +               |] ==> a : gfp(D,h)
  1.1163 +
  1.1164 +\tdx{gfp_mono}       [| bnd_mono(D,h);  D <= E;
  1.1165 +                  !!X. X<=D ==> h(X) <= i(X)  
  1.1166 +               |] ==> gfp(D,h) <= gfp(E,i)
  1.1167 +\end{ttbox}
  1.1168 +\caption{Least and greatest fixedpoints} \label{zf-fixedpt}
  1.1169 +\end{figure}
  1.1170 +
  1.1171 +The Knaster-Tarski Theorem states that every monotone function over a
  1.1172 +complete lattice has a fixedpoint.  Theory \thydx{Fixedpt} proves the
  1.1173 +Theorem only for a particular lattice, namely the lattice of subsets of a
  1.1174 +set (Fig.\ts\ref{zf-fixedpt}).  The theory defines least and greatest
  1.1175 +fixedpoint operators with corresponding induction and coinduction rules.
  1.1176 +These are essential to many definitions that follow, including the natural
  1.1177 +numbers and the transitive closure operator.  The (co)inductive definition
  1.1178 +package also uses the fixedpoint operators~\cite{paulson-CADE}.  See
  1.1179 +Davey and Priestley~\cite{davey&priestley} for more on the Knaster-Tarski
  1.1180 +Theorem and my paper~\cite{paulson-set-II} for discussion of the Isabelle
  1.1181 +proofs.
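          +
          +The fixedpoint property of \texttt{lfp} can be understood through the
          +following informal argument.  It is a sketch in terms of the rules of
          +Fig.\ts\ref{zf-fixedpt}, not a transcript of the proof in the theory
          +files.  Assume \texttt{bnd_mono(D,h)} and write $L$ for
          +\texttt{lfp(D,h)}.
          +
          +First, $h(L)\subseteq L$.  Let $X\subseteq D$ satisfy $h(X)\subseteq X$.
          +Then $L\subseteq X$ by \texttt{lfp_lowerbound}, so
          +$h(L)\subseteq h(X)\subseteq X$ by monotonicity.  Since $X$ was
          +arbitrary, \texttt{lfp_greatest} yields $h(L)\subseteq L$.
          +
          +Conversely, $L\subseteq D$ by \texttt{lfp_subset}, so monotonicity
          +applied to $h(L)\subseteq L$ gives $h(h(L))\subseteq h(L)$.  Moreover
          +$h(L)\subseteq L\subseteq D$.  Hence \texttt{lfp_lowerbound}, with
          +$h(L)$ for~$A$, yields $L\subseteq h(L)$.  The two inclusions together
          +give $L=h(L)$, which is \texttt{lfp_Tarski}.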
  1.1182 +
  1.1183 +Monotonicity properties are proved for most of the set-forming operations:
  1.1184 +union, intersection, Cartesian product, image, domain, range, etc.  These
  1.1185 +are useful for applying the Knaster-Tarski Fixedpoint Theorem.  The proofs
  1.1186 +themselves are trivial applications of Isabelle's classical reasoner.  See
  1.1187 +file \texttt{ZF/mono.ML}.
  1.1188 +
  1.1189 +
  1.1190 +\begin{figure}
  1.1191 +\begin{constants} 
  1.1192 +  \it symbol  & \it meta-type & \it priority & \it description \\ 
  1.1193 +  \sdx{O}       & $[i,i]\To i$  &  Right 60     & composition ($\circ$) \\
  1.1194 +  \cdx{id}      & $i\To i$      &       & identity function \\
  1.1195 +  \cdx{inj}     & $[i,i]\To i$  &       & injective function space\\
  1.1196 +  \cdx{surj}    & $[i,i]\To i$  &       & surjective function space\\
  1.1197 +  \cdx{bij}     & $[i,i]\To i$  &       & bijective function space
  1.1198 +\end{constants}
  1.1199 +
  1.1200 +\begin{ttbox}
  1.1201 +\tdx{comp_def}  r O s     == {\ttlbrace}xz : domain(s)*range(r) . 
  1.1202 +                        EX x y z. xz=<x,z> & <x,y>:s & <y,z>:r{\ttrbrace}
  1.1203 +\tdx{id_def}    id(A)     == (lam x:A. x)
  1.1204 +\tdx{inj_def}   inj(A,B)  == {\ttlbrace} f: A->B. ALL w:A. ALL x:A. f`w=f`x --> w=x {\ttrbrace}
  1.1205 +\tdx{surj_def}  surj(A,B) == {\ttlbrace} f: A->B . ALL y:B. EX x:A. f`x=y {\ttrbrace}
  1.1206 +\tdx{bij_def}   bij(A,B)  == inj(A,B) Int surj(A,B)
  1.1207 +
  1.1208 +
  1.1209 +\tdx{left_inverse}     [| f: inj(A,B);  a: A |] ==> converse(f)`(f`a) = a
  1.1210 +\tdx{right_inverse}    [| f: inj(A,B);  b: range(f) |] ==> 
  1.1211 +                 f`(converse(f)`b) = b
  1.1212 +
  1.1213 +\tdx{inj_converse_inj} f: inj(A,B) ==> converse(f): inj(range(f), A)
  1.1214 +\tdx{bij_converse_bij} f: bij(A,B) ==> converse(f): bij(B,A)
  1.1215 +
  1.1216 +\tdx{comp_type}        [| s<=A*B;  r<=B*C |] ==> (r O s) <= A*C
  1.1217 +\tdx{comp_assoc}       (r O s) O t = r O (s O t)
  1.1218 +
  1.1219 +\tdx{left_comp_id}     r<=A*B ==> id(B) O r = r
  1.1220 +\tdx{right_comp_id}    r<=A*B ==> r O id(A) = r
  1.1221 +
  1.1222 +\tdx{comp_func}        [| g:A->B; f:B->C |] ==> (f O g):A->C
  1.1223 +\tdx{comp_func_apply}  [| g:A->B; f:B->C; a:A |] ==> (f O g)`a = f`(g`a)
  1.1224 +
  1.1225 +\tdx{comp_inj}         [| g:inj(A,B);  f:inj(B,C)  |] ==> (f O g):inj(A,C)
  1.1226 +\tdx{comp_surj}        [| g:surj(A,B); f:surj(B,C) |] ==> (f O g):surj(A,C)
  1.1227 +\tdx{comp_bij}         [| g:bij(A,B); f:bij(B,C) |] ==> (f O g):bij(A,C)
  1.1228 +
  1.1229 +\tdx{left_comp_inverse}     f: inj(A,B) ==> converse(f) O f = id(A)
  1.1230 +\tdx{right_comp_inverse}    f: surj(A,B) ==> f O converse(f) = id(B)
  1.1231 +
  1.1232 +\tdx{bij_disjoint_Un}   
  1.1233 +    [| f: bij(A,B);  g: bij(C,D);  A Int C = 0;  B Int D = 0 |] ==> 
  1.1234 +    (f Un g) : bij(A Un C, B Un D)
  1.1235 +
  1.1236 +\tdx{restrict_bij}  [| f:inj(A,B);  C<=A |] ==> restrict(f,C): bij(C, f``C)
  1.1237 +\end{ttbox}
  1.1238 +\caption{Permutations} \label{zf-perm}
  1.1239 +\end{figure}
  1.1240 +
  1.1241 +The theory \thydx{Perm} is concerned with permutations (bijections) and
  1.1242 +related concepts.  These include composition of relations, the identity
  1.1243 +relation, and three specialized function spaces: injective, surjective and
  1.1244 +bijective.  Figure~\ref{zf-perm} displays many of their properties that
  1.1245 +have been proved.  These results are fundamental to a treatment of
  1.1246 +equipollence and cardinality.
  1.1247 +
  1.1248 +\begin{figure}\small
  1.1249 +\index{#*@{\tt\#*} symbol}
  1.1250 +\index{*div symbol}
  1.1251 +\index{*mod symbol}
  1.1252 +\index{#+@{\tt\#+} symbol}
  1.1253 +\index{#-@{\tt\#-} symbol}
  1.1254 +\begin{constants}
  1.1255 +  \it symbol  & \it meta-type & \it priority & \it description \\ 
  1.1256 +  \cdx{nat}     & $i$                   &       & set of natural numbers \\
  1.1257 +  \cdx{nat_case}& $[i,i\To i,i]\To i$     &     & conditional for $nat$\\
  1.1258 +  \tt \#*       & $[i,i]\To i$  &  Left 70      & multiplication \\
  1.1259 +  \tt div       & $[i,i]\To i$  &  Left 70      & division\\
  1.1260 +  \tt mod       & $[i,i]\To i$  &  Left 70      & modulus\\
  1.1261 +  \tt \#+       & $[i,i]\To i$  &  Left 65      & addition\\
  1.1262 +  \tt \#-       & $[i,i]\To i$  &  Left 65      & subtraction
  1.1263 +\end{constants}
  1.1264 +
  1.1265 +\begin{ttbox}
  1.1266 +\tdx{nat_def}  nat == lfp(lam r: Pow(Inf). {\ttlbrace}0{\ttrbrace} Un {\ttlbrace}succ(x). x:r{\ttrbrace})
  1.1267 +
  1.1268 +\tdx{mod_def}  m mod n == transrec(m, \%j f. if j:n then j else f`(j#-n))
  1.1269 +\tdx{div_def}  m div n == transrec(m, \%j f. if j:n then 0 else succ(f`(j#-n)))
  1.1270 +
  1.1271 +\tdx{nat_case_def}  nat_case(a,b,k) == 
  1.1272 +              THE y. k=0 & y=a | (EX x. k=succ(x) & y=b(x))
  1.1273 +
  1.1274 +\tdx{nat_0I}        0 : nat
  1.1275 +\tdx{nat_succI}     n : nat ==> succ(n) : nat
  1.1276 +
  1.1277 +\tdx{nat_induct}        
  1.1278 +    [| n: nat;  P(0);  !!x. [| x: nat;  P(x) |] ==> P(succ(x)) 
  1.1279 +    |] ==> P(n)
  1.1280 +
  1.1281 +\tdx{nat_case_0}    nat_case(a,b,0) = a
  1.1282 +\tdx{nat_case_succ} nat_case(a,b,succ(m)) = b(m)
  1.1283 +
  1.1284 +\tdx{add_0}        0 #+ n = n
  1.1285 +\tdx{add_succ}     succ(m) #+ n = succ(m #+ n)
  1.1286 +
  1.1287 +\tdx{mult_type}     [| m:nat;  n:nat |] ==> m #* n : nat
  1.1288 +\tdx{mult_0}        0 #* n = 0
  1.1289 +\tdx{mult_succ}     succ(m) #* n = n #+ (m #* n)
  1.1290 +\tdx{mult_commute}  [| m:nat; n:nat |] ==> m #* n = n #* m
  1.1291 +\tdx{add_mult_dist} [| m:nat; k:nat |] ==> (m #+ n) #* k = (m #* k){\thinspace}#+{\thinspace}(n #* k)
  1.1292 +\tdx{mult_assoc}
  1.1293 +    [| m:nat;  n:nat;  k:nat |] ==> (m #* n) #* k = m #* (n #* k)
  1.1294 +\tdx{mod_quo_equality}
  1.1295 +    [| 0:n;  m:nat;  n:nat |] ==> (m div n)#*n #+ m mod n = m
  1.1296 +\end{ttbox}
  1.1297 +\caption{The natural numbers} \label{zf-nat}
  1.1298 +\end{figure}
  1.1299 +
  1.1300 +Theory \thydx{Nat} defines the natural numbers and mathematical
  1.1301 +induction, along with a case analysis operator.  The set of natural
  1.1302 +numbers, here called \texttt{nat}, is known in set theory as the ordinal~$\omega$.
  1.1303 +
  1.1304 +Theory \thydx{Arith} develops arithmetic on the natural numbers
  1.1305 +(Fig.\ts\ref{zf-nat}).  Addition, multiplication and subtraction are defined
  1.1306 +by primitive recursion.  Division and remainder are defined by repeated
  1.1307 +subtraction, which requires well-founded recursion; the termination argument
  1.1308 +relies on the divisor's being non-zero.  Many properties are proved:
  1.1309 +commutative, associative and distributive laws, identity and cancellation
  1.1310 +laws, etc.  The most interesting result is perhaps the theorem $a \bmod b +
  1.1311 +(a/b)\times b = a$.
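         +
         +As a small worked instance of this identity (the numbers are chosen purely
         +for illustration), take $a=7$ and $b=2$.  Repeated subtraction of the divisor
         +gives $7$, $5$, $3$, $1$, stopping once the value is smaller than~$2$; three
         +subtractions were needed and $1$ is left over:
         +\[ 7 \bmod 2 = 1, \qquad 7/2 = 3, \qquad 1 + 3\times 2 = 7. \]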
  1.1312 +
  1.1313 +Theory \thydx{Univ} defines a `universe' $\texttt{univ}(A)$, which is used by
  1.1314 +the datatype package.  This set contains $A$ and the
  1.1315 +natural numbers.  Vitally, it is closed under finite products: ${\tt
  1.1316 +  univ}(A)\times{\tt univ}(A)\subseteq{\tt univ}(A)$.  This theory also
  1.1317 +defines the cumulative hierarchy of axiomatic set theory, which
  1.1318 +traditionally is written $V@\alpha$ for an ordinal~$\alpha$.  The
  1.1319 +`universe' is a simple generalization of~$V@\omega$.
  1.1320 +
  1.1321 +Theory \thydx{QUniv} defines a `universe' ${\tt quniv}(A)$, which is used by
  1.1322 +the datatype package to construct codatatypes such as streams.  It is
  1.1323 +analogous to ${\tt univ}(A)$ (and is defined in terms of it) but is closed
  1.1324 +under the non-standard product and sum.
  1.1325 +
  1.1326 +Theory \texttt{Finite} (Figure~\ref{zf-fin}) defines the finite set operator;
  1.1327 +${\tt Fin}(A)$ is the set of all finite sets over~$A$.  The theory employs
  1.1328 +Isabelle's inductive definition package, which proves various rules
  1.1329 +automatically.  The induction rule shown is stronger than the one proved by
  1.1330 +the package.  The theory also defines the set of all finite functions
  1.1331 +between two given sets.
  1.1332 +
  1.1333 +\begin{figure}
  1.1334 +\begin{ttbox}
  1.1335 +\tdx{Fin.emptyI}      0 : Fin(A)
  1.1336 +\tdx{Fin.consI}       [| a: A;  b: Fin(A) |] ==> cons(a,b) : Fin(A)
  1.1337 +
  1.1338 +\tdx{Fin_induct}
  1.1339 +    [| b: Fin(A);
  1.1340 +       P(0);
  1.1341 +       !!x y. [| x: A;  y: Fin(A);  x~:y;  P(y) |] ==> P(cons(x,y))
  1.1342 +    |] ==> P(b)
  1.1343 +
  1.1344 +\tdx{Fin_mono}        A<=B ==> Fin(A) <= Fin(B)
  1.1345 +\tdx{Fin_UnI}         [| b: Fin(A);  c: Fin(A) |] ==> b Un c : Fin(A)
  1.1346 +\tdx{Fin_UnionI}      C : Fin(Fin(A)) ==> Union(C) : Fin(A)
  1.1347 +\tdx{Fin_subset}      [| c<=b;  b: Fin(A) |] ==> c: Fin(A)
  1.1348 +\end{ttbox}
  1.1349 +\caption{The finite set operator} \label{zf-fin}
  1.1350 +\end{figure}
  1.1351 +
  1.1352 +\begin{figure}
  1.1353 +\begin{constants}
  1.1354 +  \it symbol  & \it meta-type & \it priority & \it description \\ 
  1.1355 +  \cdx{list}    & $i\To i$      && lists over some set\\
  1.1356 +  \cdx{list_case} & $[i, [i,i]\To i, i] \To i$  && conditional for $list(A)$ \\
  1.1357 +  \cdx{map}     & $[i\To i, i] \To i$   &       & mapping functional\\
  1.1358 +  \cdx{length}  & $i\To i$              &       & length of a list\\
  1.1359 +  \cdx{rev}     & $i\To i$              &       & reverse of a list\\
  1.1360 +  \tt \at       & $[i,i]\To i$  &  Right 60     & append for lists\\
  1.1361 +  \cdx{flat}    & $i\To i$   &                  & append of list of lists
  1.1362 +\end{constants}
  1.1363 +
  1.1364 +\underscoreon %%because @ is used here
  1.1365 +\begin{ttbox}
  1.1366 +\tdx{NilI}            Nil : list(A)
  1.1367 +\tdx{ConsI}           [| a: A;  l: list(A) |] ==> Cons(a,l) : list(A)
  1.1368 +
  1.1369 +\tdx{list.induct}
  1.1370 +    [| l: list(A);
  1.1371 +       P(Nil);
  1.1372 +       !!x y. [| x: A;  y: list(A);  P(y) |] ==> P(Cons(x,y))
  1.1373 +    |] ==> P(l)
  1.1374 +
  1.1375 +\tdx{Cons_iff}        Cons(a,l)=Cons(a',l') <-> a=a' & l=l'
  1.1376 +\tdx{Nil_Cons_iff}    ~ Nil=Cons(a,l)
  1.1377 +
  1.1378 +\tdx{list_mono}       A<=B ==> list(A) <= list(B)
  1.1379 +
  1.1380 +\tdx{map_ident}       l: list(A) ==> map(\%u. u, l) = l
  1.1381 +\tdx{map_compose}     l: list(A) ==> map(h, map(j,l)) = map(\%u. h(j(u)), l)
  1.1382 +\tdx{map_app_distrib} xs: list(A) ==> map(h, xs@ys) = map(h,xs) @ map(h,ys)
  1.1383 +\tdx{map_type}
  1.1384 +    [| l: list(A);  !!x. x: A ==> h(x): B |] ==> map(h,l) : list(B)
  1.1385 +\tdx{map_flat}
  1.1386 +    ls: list(list(A)) ==> map(h, flat(ls)) = flat(map(map(h),ls))
  1.1387 +\end{ttbox}
  1.1388 +\caption{Lists} \label{zf-list}
  1.1389 +\end{figure}
  1.1390 +
  1.1391 +
  1.1392 +Figure~\ref{zf-list} presents the set of lists over~$A$, ${\tt list}(A)$.  The
  1.1393 +definition employs Isabelle's datatype package, which defines the introduction
  1.1394 +and induction rules automatically, as well as the constructors, case operator
  1.1395 +(\verb|list_case|) and recursion operator.  The theory then defines the usual
  1.1396 +list functions by primitive recursion.  See theory \texttt{List}.
  1.1397 +
  1.1398 +
  1.1399 +\section{Simplification and classical reasoning}
  1.1400 +
  1.1401 +{\ZF} inherits simplification from {\FOL} but adapts it for set theory.  The
  1.1402 +extraction of rewrite rules takes the {\ZF} primitives into account.  It can
  1.1403 +strip bounded universal quantifiers from a formula; for example, ${\forall
  1.1404 +  x\in A. f(x)=g(x)}$ yields the conditional rewrite rule $x\in A \Imp
  1.1405 +f(x)=g(x)$.  Given $a\in\{x\in A. P(x)\}$ it extracts rewrite rules from $a\in
  1.1406 +A$ and~$P(a)$.  It can also break down $a\in A\int B$ and $a\in A-B$.
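         +
         +As an illustrative sketch of this extraction (the assumption below is
         +hypothetical, and the decomposition is assumed to apply recursively to nested
         +set expressions), an assumption of the form $a\in \{x\in A. P(x)\}\int B$
         +would be broken down into three simpler facts, each of which can then serve
         +as a rewrite rule:
         +\[ a\in A, \qquad P(a), \qquad a\in B. \]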
  1.1407 +
  1.1408 +Simplification tactics such as \texttt{Asm_simp_tac} and
  1.1409 +\texttt{Full_simp_tac} use the default simpset (\texttt{simpset()}), which
  1.1410 +works for most purposes.  A small simplification set for set theory is
  1.1411 +called~\ttindexbold{ZF_ss}, and you can even use \ttindex{FOL_ss} as a minimal
  1.1412 +starting point.  \texttt{ZF_ss} contains congruence rules for all the binding
  1.1413 +operators of {\ZF}\@.  It contains all the conversion rules, such as
  1.1414 +\texttt{fst} and \texttt{snd}, as well as the rewrites shown in
  1.1415 +Fig.\ts\ref{zf-simpdata}.  See the file \texttt{ZF/simpdata.ML} for a fuller
  1.1416 +list.
  1.1417 +
  1.1418 +As for the classical reasoner, tactics such as \texttt{Blast_tac} and {\tt
  1.1419 +  Best_tac} refer to the default claset (\texttt{claset()}).  This works for
  1.1420 +most purposes.  Named clasets include \ttindexbold{ZF_cs} (basic set theory)
  1.1421 +and \ttindexbold{le_cs} (useful for reasoning about the relations $<$ and
  1.1422 +$\le$).  You can use \ttindex{FOL_cs} as a minimal basis for building your own
  1.1423 +clasets.  See \iflabelundefined{chap:classical}{the {\em Reference Manual\/}}%
  1.1424 +{Chap.\ts\ref{chap:classical}} for more discussion of classical proof methods.
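         +
         +As a minimal sketch of the default claset in use (the goal is chosen only for
         +illustration, and the session output is omitted), a simple set-theoretic
         +inclusion can be handed directly to the classical reasoner:
         +\begin{ttbox}
         +Goal "A Int B <= A Un C";
         +by (Blast_tac 1);
         +\end{ttbox}
         +The lower-case variants of these tactics take the claset as an explicit
         +argument, so the same step could presumably be written as
         +\texttt{blast_tac ZF_cs 1}.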
  1.1425 +
  1.1426 +
  1.1427 +\begin{figure}
  1.1428 +\begin{eqnarray*}
  1.1429 +  a\in \emptyset        & \bimp &  \bot\\
  1.1430 +  a \in A \un B      & \bimp &  a\in A \disj a\in B\\
  1.1431 +  a \in A \int B      & \bimp &  a\in A \conj a\in B\\
  1.1432 +  a \in A-B             & \bimp &  a\in A \conj \neg (a\in B)\\
  1.1433 +  \pair{a,b}\in {\tt Sigma}(A,B)
  1.1434 +                        & \bimp &  a\in A \conj b\in B(a)\\
  1.1435 +  a \in {\tt Collect}(A,P)      & \bimp &  a\in A \conj P(a)\\
  1.1436 +  (\forall x \in \emptyset. P(x)) & \bimp &  \top\\
  1.1437 +  (\forall x \in A. \top)       & \bimp &  \top
  1.1438 +\end{eqnarray*}
  1.1439 +\caption{Some rewrite rules for set theory} \label{zf-simpdata}
  1.1440 +\end{figure}
  1.1441 +
  1.1442 +
  1.1443 +\section{Datatype definitions}
  1.1444 +\label{sec:ZF:datatype}
  1.1445 +\index{*datatype|(}
  1.1446 +
  1.1447 +The \ttindex{datatype} definition package of \ZF\ constructs inductive
  1.1448 +datatypes similar to those of \ML.  It can also construct coinductive
  1.1449 +datatypes (codatatypes), which are non-well-founded structures such as
  1.1450 +streams.  It defines the set using a fixed-point construction and proves
  1.1451 +induction rules, as well as theorems for recursion and case combinators.  It
  1.1452 +supplies mechanisms for reasoning about freeness.  The datatype package can
  1.1453 +handle both mutual and indirect recursion.
  1.1454 +
  1.1455 +
  1.1456 +\subsection{Basics}
  1.1457 +\label{subsec:datatype:basics}
  1.1458 +
  1.1459 +A \texttt{datatype} definition has the following form:
  1.1460 +\[
  1.1461 +\begin{array}{llcl}
  1.1462 +\mathtt{datatype} & t@1(A@1,\ldots,A@h) & = &
  1.1463 +  constructor^1@1 ~\mid~ \ldots ~\mid~ constructor^1@{k@1} \\
  1.1464 + & & \vdots \\
  1.1465 +\mathtt{and} & t@n(A@1,\ldots,A@h) & = &
  1.1466 +  constructor^n@1 ~\mid~ \ldots ~\mid~ constructor^n@{k@n}
  1.1467 +\end{array}
  1.1468 +\]
  1.1469 +Here $t@1$, \ldots,~$t@n$ are identifiers and $A@1$, \ldots,~$A@h$ are
  1.1470 +variables: the datatype's parameters.  Each constructor specification has the
  1.1471 +form \dquotesoff
  1.1472 +\[ C \hbox{\tt~( } \hbox{\tt"} x@1 \hbox{\tt:} T@1 \hbox{\tt"},\;
  1.1473 +                   \ldots,\;
  1.1474 +                   \hbox{\tt"} x@m \hbox{\tt:} T@m \hbox{\tt"}
  1.1475 +     \hbox{\tt~)}
  1.1476 +\]
  1.1477 +Here $C$ is the constructor name, and variables $x@1$, \ldots,~$x@m$ are the
  1.1478 +constructor arguments, belonging to the sets $T@1$, \ldots, $T@m$,
  1.1479 +respectively.  Typically each $T@j$ is either a constant set, a datatype
  1.1480 +parameter (one of $A@1$, \ldots, $A@h$) or a recursive occurrence of one of
  1.1481 +the datatypes, say $t@i(A@1,\ldots,A@h)$.  More complex possibilities exist,
  1.1482 +but they are much harder to realize.  Often, additional information must be
  1.1483 +supplied in the form of theorems.
  1.1484 +
  1.1485 +A datatype can occur recursively as the argument of some function~$F$.  This
  1.1486 +is called a {\em nested} (or \emph{indirect}) occurrence.  It is only allowed
  1.1487 +if the datatype package is given a theorem asserting that $F$ is monotonic.
  1.1488 +If the datatype has indirect occurrences, then Isabelle/ZF does not support
  1.1489 +recursive function definitions.
  1.1490 +
  1.1491 +A simple example of a datatype is \texttt{list}, which is built-in, and is
  1.1492 +defined by
  1.1493 +\begin{ttbox}
  1.1494 +consts     list :: i=>i
  1.1495 +datatype  "list(A)" = Nil | Cons ("a:A", "l: list(A)")
  1.1496 +\end{ttbox}
  1.1497 +Note that the datatype operator must be declared as a constant first.
  1.1498 +However, the package declares the constructors.  Here, \texttt{Nil} gets type
  1.1499 +$i$ and \texttt{Cons} gets type $[i,i]\To i$.
  1.1500 +
  1.1501 +Trees and forests can be modelled by the mutually recursive datatype
  1.1502 +definition
  1.1503 +\begin{ttbox}
  1.1504 +consts     tree, forest, tree_forest :: i=>i
  1.1505 +datatype  "tree(A)"   = Tcons ("a: A",  "f: forest(A)")
  1.1506 +and       "forest(A)" = Fnil  |  Fcons ("t: tree(A)",  "f: forest(A)")
  1.1507 +\end{ttbox}
  1.1508 +Here $\texttt{tree}(A)$ is the set of trees over $A$, $\texttt{forest}(A)$ is
  1.1509 +the set of forests over $A$, and  $\texttt{tree_forest}(A)$ is the union of
  1.1510 +the previous two sets.  All three operators must be declared first.
  1.1511 +
  1.1512 +The datatype \texttt{term}, which is defined by
  1.1513 +\begin{ttbox}
  1.1514 +consts     term :: i=>i
  1.1515 +datatype  "term(A)" = Apply ("a: A", "l: list(term(A))")
  1.1516 +  monos "[list_mono]"
  1.1517 +\end{ttbox}
  1.1518 +is an example of nested recursion.  (The theorem \texttt{list_mono} is proved
  1.1519 +in file \texttt{List.ML}, and the \texttt{term} example is developed in theory
  1.1520 +\thydx{ex/Term}.)
  1.1521 +
  1.1522 +\subsubsection{Freeness of the constructors}
  1.1523 +
  1.1524 +Constructors satisfy {\em freeness} properties.  Constructors are distinct,
  1.1525 +for example $\texttt{Nil}\not=\texttt{Cons}(a,l)$, and they are injective, for
  1.1526 +example $\texttt{Cons}(a,l)=\texttt{Cons}(a',l') \bimp a=a' \conj l=l'$.
  1.1527 +Because the number of freeness theorems is quadratic in the number of constructors, the
  1.1528 +datatype package does not prove them, but instead provides several means of
  1.1529 +proving them dynamically.  For the \texttt{list} datatype, freeness reasoning
  1.1530 +can be done in two ways: by simplifying with the theorems
  1.1531 +\texttt{list.free_iffs} or by invoking the classical reasoner with
  1.1532 +\texttt{list.free_SEs} as safe elimination rules.  Occasionally this exposes
  1.1533 +the underlying representation of some constructor, which can be rectified
  1.1534 +using the command \hbox{\tt fold_tac list.con_defs}.
  1.1535 +
  1.1536 +\subsubsection{Structural induction}
  1.1537 +
  1.1538 +The datatype package also provides structural induction rules.  For datatypes
  1.1539 +without mutual or nested recursion, the rule has the form exemplified by
  1.1540 +\texttt{list.induct} in Fig.\ts\ref{zf-list}.  For mutually recursive
  1.1541 +datatypes, the induction rule is supplied in two forms.  Consider the datatype of
  1.1542 +trees and forests above.  The rule \texttt{tree_forest.induct} performs induction over a
  1.1543 +single predicate~\texttt{P}, which is presumed to be defined for both trees
  1.1544 +and forests:
  1.1545 +\begin{ttbox}
  1.1546 +[| x : tree_forest(A);
  1.1547 +   !!a f. [| a : A; f : forest(A); P(f) |] ==> P(Tcons(a, f)); P(Fnil);
  1.1548 +   !!f t. [| t : tree(A); P(t); f : forest(A); P(f) |]
  1.1549 +          ==> P(Fcons(t, f)) 
  1.1550 +|] ==> P(x)
  1.1551 +\end{ttbox}
  1.1552 +The rule \texttt{tree_forest.mutual_induct} performs induction over two
  1.1553 +distinct predicates, \texttt{P_tree} and \texttt{P_forest}.
  1.1554 +\begin{ttbox}
  1.1555 +[| !!a f.
  1.1556 +      [| a : A; f : forest(A); P_forest(f) |] ==> P_tree(Tcons(a, f));
  1.1557 +   P_forest(Fnil);
  1.1558 +   !!f t. [| t : tree(A); P_tree(t); f : forest(A); P_forest(f) |]
  1.1559 +          ==> P_forest(Fcons(t, f)) 
  1.1560 +|] ==> (ALL za. za : tree(A) --> P_tree(za)) &
  1.1561 +    (ALL za. za : forest(A) --> P_forest(za))
  1.1562 +\end{ttbox}
  1.1563 +
  1.1564 +For datatypes with nested recursion, such as the \texttt{term} example from
  1.1565 +above, things are a bit more complicated.  The rule \texttt{term.induct}
  1.1566 +refers to the monotonic operator, \texttt{list}:
  1.1567 +\begin{ttbox}
  1.1568 +[| x : term(A);
  1.1569 +   !!a l. [| a : A; l : list(Collect(term(A), P)) |] ==> P(Apply(a, l)) 
  1.1570 +|] ==> P(x)
  1.1571 +\end{ttbox}
  1.1572 +The file \texttt{ex/Term.ML} derives two higher-level induction rules, one of
  1.1573 +which is particularly useful for proving equations:
  1.1574 +\begin{ttbox}
  1.1575 +[| t : term(A);
  1.1576 +   !!x zs. [| x : A; zs : list(term(A)); map(f, zs) = map(g, zs) |]
  1.1577 +           ==> f(Apply(x, zs)) = g(Apply(x, zs)) 
  1.1578 +|] ==> f(t) = g(t)  
  1.1579 +\end{ttbox}
  1.1580 +How this can be generalized to other nested datatypes is a matter for future
  1.1581 +research.
  1.1582 +
  1.1583 +
  1.1584 +\subsubsection{The \texttt{case} operator}
  1.1585 +
  1.1586 +The package defines an operator for performing case analysis over the
  1.1587 +datatype.  For \texttt{list}, it is called \texttt{list_case} and satisfies
  1.1588 +the equations
  1.1589 +\begin{ttbox}
  1.1590 +list_case(f_Nil, f_Cons, []) = f_Nil
  1.1591 +list_case(f_Nil, f_Cons, Cons(a, l)) = f_Cons(a, l)
  1.1592 +\end{ttbox}
  1.1593 +Here \texttt{f_Nil} is the value to return if the argument is \texttt{Nil} and
  1.1594 +\texttt{f_Cons} is a function that computes the value to return if the
  1.1595 +argument has the form $\texttt{Cons}(a,l)$.  The function can be expressed as
  1.1596 +an abstraction, over patterns if desired (\S\ref{sec:pairs}).
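         +
         +For instance (a sketch using a hypothetical abstraction, not taken from the
         +library), a head-like function that returns \texttt{0} on the empty list can
         +be written with \texttt{list_case}; the two equations above then give
         +\begin{ttbox}
         +list_case(0, \%a l. a, [])         = 0
         +list_case(0, \%a l. a, Cons(x,xs)) = (\%a l. a)(x,xs) = x
         +\end{ttbox}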
  1.1597 +
  1.1598 +For mutually recursive datatypes, there is a single \texttt{case} operator.
  1.1599 +In the tree/forest example, the constant \texttt{tree_forest_case} handles all
  1.1600 +of the constructors of the two datatypes.
  1.1601 +
  1.1602 +
  1.1603 +
  1.1604 +
  1.1605 +\subsection{Defining datatypes}
  1.1606 +
  1.1607 +The theory syntax for datatype definitions is shown in
  1.1608 +Fig.~\ref{datatype-grammar}.  In order to be well-formed, a datatype
  1.1609 +definition has to obey the rules stated in the previous section.  As a result
  1.1610 +the theory is extended with the new types, the constructors, and the theorems
  1.1611 +listed in the previous section.  The quotation marks are necessary because
  1.1612 +they enclose general Isabelle formul\ae.
  1.1613 +
  1.1614 +\begin{figure}
  1.1615 +\begin{rail}
  1.1616 +datatype : ( 'datatype' | 'codatatype' ) datadecls;
  1.1617 +
  1.1618 +datadecls: ( '"' id arglist '"' '=' (constructor + '|') ) + 'and'
  1.1619 +         ;
  1.1620 +constructor : name ( () | consargs )  ( () | ( '(' mixfix ')' ) )
  1.1621 +         ;
  1.1622 +consargs : '(' ('"' var ':' term '"' + ',') ')'
  1.1623 +         ;
  1.1624 +\end{rail}
  1.1625 +\caption{Syntax of datatype declarations}
  1.1626 +\label{datatype-grammar}
  1.1627 +\end{figure}
  1.1628 +
  1.1629 +Codatatypes are declared like datatypes and are identical to them in every
  1.1630 +respect except that they have a coinduction rule instead of an induction rule.
  1.1631 +Note that while an induction rule has the effect of limiting the values
  1.1632 +contained in the set, a coinduction rule gives a way of constructing new
  1.1633 +values of the set.
  1.1634 +
  1.1635 +Most of the theorems about datatypes become part of the default simpset.  You
  1.1636 +never need to see them again because the simplifier applies them
  1.1637 +automatically.  Add freeness properties (\texttt{free_iffs}) to the simpset
  1.1638 +when you want them.  Induction and exhaustion are usually invoked by hand,
  1.1639 +via these special-purpose tactics:
  1.1640 +\begin{ttdescription}
  1.1641 +\item[\ttindexbold{induct_tac} {\tt"}$x${\tt"} $i$] applies structural
  1.1642 +  induction on variable $x$ to subgoal $i$, provided the type of $x$ is a
  1.1643 +  datatype.  The induction variable should not occur among other assumptions
  1.1644 +  of the subgoal.
  1.1645 +\end{ttdescription}
  1.1646 +In some cases, induction is overkill and a case distinction over all
  1.1647 +constructors of the datatype suffices.
  1.1648 +\begin{ttdescription}
  1.1649 +\item[\ttindexbold{exhaust_tac} {\tt"}$x${\tt"} $i$]
  1.1650 + performs an exhaustive case analysis for the variable~$x$.
  1.1651 +\end{ttdescription}
  1.1652 +
  1.1653 +Both tactics can only be applied to a variable, whose typing must be given in
  1.1654 +some assumption, for example the assumption \texttt{x:\ list(A)}.  The tactics
  1.1655 +also work for the natural numbers (\texttt{nat}) and disjoint sums, although
  1.1656 +these sets were not defined using the datatype package.  (Disjoint sums are
  1.1657 +not recursive, so only \texttt{exhaust_tac} is available.)
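         +
         +As a sketch of how these tactics are typically combined with the simplifier
         +(session output is omitted, and the proof is assumed to go through with the
         +default simpset), here is theorem \texttt{map_ident} of Fig.\ts\ref{zf-list}
         +approached by structural induction on~\texttt{l}.  The premise
         +\texttt{l:\ list(A)} supplies the typing that \texttt{induct_tac} requires:
         +\begin{ttbox}
         +Goal "l: list(A) ==> map(\%u. u, l) = l";
         +by (induct_tac "l" 1);
         +by (ALLGOALS Asm_simp_tac);
         +\end{ttbox}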
  1.1658 +
  1.1659 +\bigskip
  1.1660 +Here are some more details for the technically minded.  Processing the
  1.1661 +theory file produces an \ML\ structure which, in addition to the usual
  1.1662 +components, contains a structure named $t$ for each datatype $t$ defined in
  1.1663 +the file.  Each structure $t$ contains the following elements:
  1.1664 +\begin{ttbox}
  1.1665 +val intrs         : thm list  \textrm{the introduction rules}
  1.1666 +val elim          : thm       \textrm{the elimination (case analysis) rule}
  1.1667 +val induct        : thm       \textrm{the standard induction rule}
  1.1668 +val mutual_induct : thm       \textrm{the mutual induction rule, or \texttt{True}}
  1.1669 +val case_eqns     : thm list  \textrm{equations for the case operator}
  1.1670 +val recursor_eqns : thm list  \textrm{equations for the recursor}
  1.1671 +val con_defs      : thm list  \textrm{definitions of the case operator and constructors}
  1.1672 +val free_iffs     : thm list  \textrm{logical equivalences for proving freeness}
  1.1673 +val free_SEs      : thm list  \textrm{elimination rules for proving freeness}
  1.1674 +val mk_free       : string -> thm  \textrm{A function for proving freeness theorems}
  1.1675 +val mk_cases      : thm list -> string -> thm  \textrm{case analysis, see below}
  1.1676 +val defs          : thm list  \textrm{definitions of operators}
  1.1677 +val bnd_mono      : thm list  \textrm{monotonicity property}
  1.1678 +val dom_subset    : thm list  \textrm{inclusion in `bounding set'}
  1.1679 +\end{ttbox}
  1.1680 +Furthermore there is the theorem $C$\texttt{_I} for every constructor~$C$; for
  1.1681 +example, the \texttt{list} datatype's introduction rules are bound to the
  1.1682 +identifiers \texttt{Nil_I} and \texttt{Cons_I}.
  1.1683 +
  1.1684 +For a codatatype, the component \texttt{coinduct} is the coinduction rule,
  1.1685 +replacing the \texttt{induct} component.
  1.1686 +
  1.1687 +See the theories \texttt{ex/Ntree} and \texttt{ex/Brouwer} for examples of
  1.1688 +infinitely branching datatypes.  See theory \texttt{ex/LList} for an example
  1.1689 +of a codatatype.  Some of these theories illustrate the use of additional,
  1.1690 +undocumented features of the datatype package.  Datatype definitions are
  1.1691 +reduced to inductive definitions, and the advanced features should be
  1.1692 +understood in that light.
  1.1693 +
  1.1694 +
  1.1695 +\subsection{Examples}
  1.1696 +
  1.1697 +\subsubsection{The datatype of binary trees}
  1.1698 +
  1.1699 +Let us define the set $\texttt{bt}(A)$ of binary trees over~$A$.  The theory
  1.1700 +must contain these lines:
  1.1701 +\begin{ttbox}
  1.1702 +consts   bt :: i=>i
  1.1703 +datatype "bt(A)"  =  Lf  |  Br ("a: A",  "t1: bt(A)",  "t2: bt(A)")
  1.1704 +\end{ttbox}
  1.1705 +After loading the theory, we can prove, for example, that no tree equals its
  1.1706 +left branch.  To ease the induction, we state the goal using quantifiers.
  1.1707 +\begin{ttbox}
  1.1708 +Goal "l : bt(A) ==> ALL x r. Br(x,l,r) ~= l";
  1.1709 +{\out Level 0}
  1.1710 +{\out l : bt(A) ==> ALL x r. Br(x, l, r) ~= l}
  1.1711 +{\out  1. l : bt(A) ==> ALL x r. Br(x, l, r) ~= l}
  1.1712 +\end{ttbox}
  1.1713 +This can be proved by the structural induction tactic:
  1.1714 +\begin{ttbox}
  1.1715 +by (induct_tac "l" 1);
  1.1716 +{\out Level 1}
  1.1717 +{\out l : bt(A) ==> ALL x r. Br(x, l, r) ~= l}
  1.1718 +{\out  1. ALL x r. Br(x, Lf, r) ~= Lf}
  1.1719 +{\out  2. !!a t1 t2.}
  1.1720 +{\out        [| a : A; t1 : bt(A); ALL x r. Br(x, t1, r) ~= t1; t2 : bt(A);}
  1.1721 +{\out           ALL x r. Br(x, t2, r) ~= t2 |]}
  1.1722 +{\out        ==> ALL x r. Br(x, Br(a, t1, t2), r) ~= Br(a, t1, t2)}
  1.1723 +\end{ttbox}
  1.1724 +Both subgoals are proved using the simplifier.  We use
  1.1725 +\texttt{asm_full_simp_tac}, which also rewrites the assumptions.
  1.1726 +This is because simplification using the freeness properties can unfold the
  1.1727 +definition of constructor~\texttt{Br}, so we arrange that all occurrences are
  1.1728 +unfolded. 
  1.1729 +\begin{ttbox}
  1.1730 +by (ALLGOALS (asm_full_simp_tac (simpset() addsimps bt.free_iffs)));
  1.1731 +{\out Level 2}
  1.1732 +{\out l : bt(A) ==> ALL x r. Br(x, l, r) ~= l}
  1.1733 +{\out No subgoals!}
  1.1734 +\end{ttbox}
  1.1735 +To remove the quantifiers from the induction formula, we save the theorem using
  1.1736 +\ttindex{qed_spec_mp}.
  1.1737 +\begin{ttbox}
  1.1738 +qed_spec_mp "Br_neq_left";
  1.1739 +{\out val Br_neq_left = "?l : bt(?A) ==> Br(?x, ?l, ?r) ~= ?l" : thm}
  1.1740 +\end{ttbox}
  1.1741 +
  1.1742 +When there are only a few constructors, we might prefer to prove the freeness
  1.1743 +theorems for each constructor.  This is trivial, using the function given us
  1.1744 +for that purpose:
  1.1745 +\begin{ttbox}
  1.1746 +val Br_iff = bt.mk_free "Br(a,l,r)=Br(a',l',r') <-> a=a' & l=l' & r=r'";
  1.1747 +{\out val Br_iff =}
  1.1748 +{\out   "Br(?a, ?l, ?r) = Br(?a', ?l', ?r') <->}
  1.1749 +{\out                     ?a = ?a' & ?l = ?l' & ?r = ?r'" : thm}
  1.1750 +\end{ttbox}
  1.1751 +
  1.1752 +The purpose of \ttindex{mk_cases} is to generate simplified instances of the
  1.1753 +elimination (case analysis) rule.  Its theorem list argument is a list of
  1.1754 +constructor definitions, which it uses for freeness reasoning.  For example,
  1.1755 +this instance of the elimination rule propagates type-checking information
  1.1756 +from the premise $\texttt{Br}(a,l,r)\in\texttt{bt}(A)$:
  1.1757 +\begin{ttbox}
  1.1758 +val BrE = bt.mk_cases bt.con_defs "Br(a,l,r) : bt(A)";
  1.1759 +{\out val BrE =}
  1.1760 +{\out   "[| Br(?a, ?l, ?r) : bt(?A);}
  1.1761 +{\out       [| ?a : ?A; ?l : bt(?A); ?r : bt(?A) |] ==> ?Q |] ==> ?Q" : thm}
  1.1762 +\end{ttbox}
  1.1763 +
  1.1764 +
  1.1765 +\subsubsection{Mixfix syntax in datatypes}
  1.1766 +
  1.1767 +Mixfix syntax is sometimes convenient.  The theory \texttt{ex/PropLog} makes a
  1.1768 +deep embedding of propositional logic:
  1.1769 +\begin{ttbox}
  1.1770 +consts     prop :: i
  1.1771 +datatype  "prop" = Fls
  1.1772 +                 | Var ("n: nat")                ("#_" [100] 100)
  1.1773 +                 | "=>" ("p: prop", "q: prop")   (infixr 90)
  1.1774 +\end{ttbox}
  1.1775 +The second constructor has a special $\#n$ syntax, while the third constructor
  1.1776 +is an infixed arrow.
  1.1777 +
  1.1778 +
  1.1779 +\subsubsection{A giant enumeration type}
  1.1780 +
  1.1781 +This example shows a datatype that consists of 60 constructors:
  1.1782 +\begin{ttbox}
  1.1783 +consts  enum :: i
  1.1784 +datatype
  1.1785 +  "enum" = C00 | C01 | C02 | C03 | C04 | C05 | C06 | C07 | C08 | C09
  1.1786 +         | C10 | C11 | C12 | C13 | C14 | C15 | C16 | C17 | C18 | C19
  1.1787 +         | C20 | C21 | C22 | C23 | C24 | C25 | C26 | C27 | C28 | C29
  1.1788 +         | C30 | C31 | C32 | C33 | C34 | C35 | C36 | C37 | C38 | C39
  1.1789 +         | C40 | C41 | C42 | C43 | C44 | C45 | C46 | C47 | C48 | C49
  1.1790 +         | C50 | C51 | C52 | C53 | C54 | C55 | C56 | C57 | C58 | C59
  1.1791 +end
  1.1792 +\end{ttbox}
  1.1793 +The datatype package scales well.  Even though all properties are proved
  1.1794 +rather than assumed, full processing of this definition takes under 15 seconds
  1.1795 +(on a 300 MHz Pentium).  The constructors have a balanced representation,
  1.1796 +essentially binary notation, so freeness properties can be proved fast.
  1.1797 +\begin{ttbox}
  1.1798 +Goal "C00 ~= C01";
  1.1799 +by (simp_tac (simpset() addsimps enum.free_iffs) 1);
  1.1800 +\end{ttbox}
  1.1801 +You need not derive such inequalities explicitly.  The simplifier will dispose
  1.1802 +of them automatically, given the theorem list \texttt{free_iffs}.
  1.1803 +
  1.1804 +\index{*datatype|)}
  1.1805 +
  1.1806 +
  1.1807 +\subsection{Recursive function definitions}\label{sec:ZF:recursive}
  1.1808 +\index{recursive functions|see{recursion}}
  1.1809 +\index{*primrec|(}
  1.1810 +
  1.1811 +Datatypes come with a uniform way of defining functions, {\bf primitive
  1.1812 +  recursion}.  Such definitions rely on the recursion operator defined by the
  1.1813 +datatype package.  Isabelle proves the desired recursion equations as
  1.1814 +theorems.
  1.1815 +
  1.1816 +In principle, one could introduce primitive recursive functions by asserting
  1.1817 +their reduction rules as new axioms.  Here is a dangerous way of defining the
  1.1818 +append function for lists:
  1.1819 +\begin{ttbox}\slshape
  1.1820 +consts  "\at" :: [i,i]=>i                        (infixr 60)
  1.1821 +rules 
  1.1822 +   app_Nil   "[] \at ys = ys"
  1.1823 +   app_Cons  "(Cons(a,l)) \at ys = Cons(a, l \at ys)"
  1.1824 +\end{ttbox}
  1.1825 +Asserting axioms brings the danger of accidentally asserting nonsense.  It
  1.1826 +should be avoided at all costs!
  1.1827 +
  1.1828 +The \ttindex{primrec} declaration is a safe means of defining primitive
  1.1829 +recursive functions on datatypes:
  1.1830 +\begin{ttbox}
  1.1831 +consts  "\at" :: [i,i]=>i                        (infixr 60)
  1.1832 +primrec 
  1.1833 +   "[] \at ys = ys"
  1.1834 +   "(Cons(a,l)) \at ys = Cons(a, l \at ys)"
  1.1835 +\end{ttbox}
  1.1836 +Isabelle will now check that the two rules do indeed form a primitive
  1.1837 +recursive definition.  For example, the declaration
  1.1838 +\begin{ttbox}
  1.1839 +primrec
  1.1840 +   "[] \at ys = us"
  1.1841 +\end{ttbox}
  1.1842 +is rejected with an error message ``\texttt{Extra variables on rhs}''.
  1.1843 +
  1.1844 +
  1.1845 +\subsubsection{Syntax of recursive definitions}
  1.1846 +
  1.1847 +The general form of a primitive recursive definition is
  1.1848 +\begin{ttbox}
  1.1849 +primrec
  1.1850 +    {\it reduction rules}
  1.1851 +\end{ttbox}
  1.1852 +where \textit{reduction rules} specify one or more equations of the form
  1.1853 +\[ f \, x@1 \, \dots \, x@m \, (C \, y@1 \, \dots \, y@k) \, z@1 \,
  1.1854 +\dots \, z@n = r \] such that $C$ is a constructor of the datatype, $r$
  1.1855 +contains only the free variables on the left-hand side, and all recursive
  1.1856 +calls in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$.  
  1.1857 +There must be at most one reduction rule for each constructor.  The order is
  1.1858 +immaterial.  For missing constructors, the function is defined to return zero.
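         +
         +As an illustration, the second equation of the append declaration above can
         +be read as an instance of this scheme with $m=0$, $k=2$ and $n=1$.  This is an
         +informal reading, which treats the infix operator as an ordinary two-argument
         +function:
         +\[ f = \texttt{\at}, \qquad C = \texttt{Cons}, \qquad y@1 = a, \qquad y@2 = l,
         +   \qquad z@1 = ys, \qquad r = \texttt{Cons}(a,\; l \mathbin{\texttt{\at}} ys) \]
         +The only recursive call in~$r$ applies $f$ to $y@2$ in the recursive position,
         +and the free variables of~$r$ ($a$, $l$ and $ys$) all occur on the left-hand
         +side, so both conditions are met.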
  1.1859 +
  1.1860 +All reduction rules are added to the default simpset.
  1.1861 +If you would like to refer to some rule by name, then you must prefix
  1.1862 +the rule with an identifier.  These identifiers, like those in the
  1.1863 +\texttt{rules} section of a theory, will be visible at the \ML\ level.
  1.1864 +
  1.1865 +The reduction rules for {\tt\at} become part of the default simpset, which
  1.1866 +leads to short proof scripts:
  1.1867 +\begin{ttbox}\underscoreon
  1.1868 +Goal "xs: list(A) ==> (xs @ ys) @ zs = xs @ (ys @ zs)";
  1.1869 +by (induct\_tac "xs" 1);
  1.1870 +by (ALLGOALS Asm\_simp\_tac);
  1.1871 +\end{ttbox}
  1.1872 +
  1.1873 +You can even use the \texttt{primrec} form with non-recursive datatypes and
  1.1874 +with codatatypes.  Recursion is not allowed there, but the form still provides
  1.1875 +a convenient syntax for defining functions by cases.
  1.1876 +
  1.1877 +
  1.1878 +\subsubsection{Example: varying arguments}
  1.1879 +
  1.1880 +All arguments, other than the recursive one, must be the same in each equation
  1.1881 +and in each recursive call.  To get around this restriction, use explicit
  1.1882 +$\lambda$-abstraction and function application.  Here is an example, drawn
  1.1883 +from the theory \texttt{Resid/Substitution}.  The type of redexes is declared
  1.1884 +as follows:
  1.1885 +\begin{ttbox}
  1.1886 +consts  redexes :: i
  1.1887 +datatype
  1.1888 +  "redexes" = Var ("n: nat")            
  1.1889 +            | Fun ("t: redexes")
  1.1890 +            | App ("b:bool" ,"f:redexes" , "a:redexes")
  1.1891 +\end{ttbox}
  1.1892 +
  1.1893 +The function \texttt{lift} takes a second argument, $k$, which varies in
  1.1894 +recursive calls.
  1.1895 +\begin{ttbox}
  1.1896 +primrec
  1.1897 +  "lift(Var(i)) = (lam k:nat. if i<k then Var(i) else Var(succ(i)))"
  1.1898 +  "lift(Fun(t)) = (lam k:nat. Fun(lift(t) ` succ(k)))"
  1.1899 +  "lift(App(b,f,a)) = (lam k:nat. App(b, lift(f)`k, lift(a)`k))"
  1.1900 +\end{ttbox}
  1.1901 +Now \texttt{lift(r)`k} satisfies the required recursion equations.
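         +
         +Spelled out, and assuming $k\in\texttt{nat}$ so that each abstraction over
         +\texttt{nat} can be applied by $\beta$-reduction, the three declarations above
         +give the following equations.  This is a sketch of the intended reading, not
         +output copied from Isabelle:
         +\begin{ttbox}
         +lift(Var(i)) ` k     = if i<k then Var(i) else Var(succ(i))
         +lift(Fun(t)) ` k     = Fun(lift(t) ` succ(k))
         +lift(App(b,f,a)) ` k = App(b, lift(f) ` k, lift(a) ` k)
         +\end{ttbox}
         +The second equation is the one in which the extra argument varies: the
         +recursive call is made at \texttt{succ(k)} rather than at \texttt{k}.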
  1.1902 +
  1.1903 +\index{recursion!primitive|)}
  1.1904 +\index{*primrec|)}
  1.1905 +
  1.1906 +
  1.1907 +\section{Inductive and coinductive definitions}
  1.1908 +\index{*inductive|(}
  1.1909 +\index{*coinductive|(}
  1.1910 +
  1.1911 +An {\bf inductive definition} specifies the least set~$R$ closed under given
  1.1912 +rules.  (Applying a rule to elements of~$R$ yields a result within~$R$.)  For
  1.1913 +example, a structural operational semantics is an inductive definition of an
  1.1914 +evaluation relation.  Dually, a {\bf coinductive definition} specifies the
  1.1915 +greatest set~$R$ consistent with given rules.  (Every element of~$R$ can be
  1.1916 +seen as arising by applying a rule to elements of~$R$.)  An important example
  1.1917 +is using bisimulation relations to formalise equivalence of processes and
  1.1918 +infinite data structures.
  1.1919 +
  1.1920 +A theory file may contain any number of inductive and coinductive
  1.1921 +definitions.  They may be intermixed with other declarations; in
  1.1922 +particular, the (co)inductive sets {\bf must} be declared separately as
  1.1923 +constants, and may have mixfix syntax or be subject to syntax translations.
  1.1924 +
  1.1925 +Each (co)inductive definition adds definitions to the theory and also
  1.1926 +proves some theorems.  Each definition creates an \ML\ structure, which is a
  1.1927 +substructure of the main theory structure.
  1.1928 +This package is described in detail in a separate paper,%
  1.1929 +\footnote{It appeared in CADE~\cite{paulson-CADE}; a longer version is
  1.1930 +  distributed with Isabelle as \emph{A Fixedpoint Approach to 
  1.1931 + (Co)Inductive and (Co)Datatype Definitions}.}  %
  1.1932 +which you might refer to for background information.
  1.1933 +
  1.1934 +
  1.1935 +\subsection{The syntax of a (co)inductive definition}
  1.1936 +An inductive definition has the form
  1.1937 +\begin{ttbox}
  1.1938 +inductive
  1.1939 +  domains    {\it domain declarations}
  1.1940 +  intrs      {\it introduction rules}
  1.1941 +  monos      {\it monotonicity theorems}
  1.1942 +  con_defs   {\it constructor definitions}
  1.1943 +  type_intrs {\it introduction rules for type-checking}
  1.1944 +  type_elims {\it elimination rules for type-checking}
  1.1945 +\end{ttbox}
  1.1946 +A coinductive definition is identical, but starts with the keyword
  1.1947 +{\tt coinductive}.  
  1.1948 +
  1.1949 +The {\tt monos}, {\tt con\_defs}, {\tt type\_intrs} and {\tt type\_elims}
  1.1950 +sections are optional.  If present, each is specified either as a list of
  1.1951 +identifiers or as a string.  If the latter, then the string must be a valid
  1.1952 +\textsc{ml} expression of type {\tt thm list}.  The string is simply inserted
  1.1953 +into the {\tt _thy.ML} file; if it is ill-formed, it will trigger \textsc{ml}
  1.1954 +error messages.  You can then inspect the file in the temporary directory.
  1.1955 +
  1.1956 +\begin{description}
  1.1957 +\item[\it domain declarations] consist of one or more items of the form
  1.1958 +  {\it string\/}~{\tt <=}~{\it string}, associating each recursive set with
  1.1959 +  its domain.  (The domain is some existing set that is large enough to
  1.1960 +  hold the new set being defined.)
  1.1961 +
  1.1962 +\item[\it introduction rules] specify one or more introduction rules in
  1.1963 +  the form {\it ident\/}~{\it string}, where the identifier gives the name of
  1.1964 +  the rule in the result structure.
  1.1965 +
  1.1966 +\item[\it monotonicity theorems] are required for each operator applied to
  1.1967 +  a recursive set in the introduction rules.  There \textbf{must} be a theorem
  1.1968 +  of the form $A\subseteq B\Imp M(A)\subseteq M(B)$, for each premise $t\in M(R_i)$
  1.1969 +  in an introduction rule!
  1.1970 +
  1.1971 +\item[\it constructor definitions] contain definitions of constants
  1.1972 +  appearing in the introduction rules.  The (co)datatype package supplies
  1.1973 +  the constructors' definitions here.  Most (co)inductive definitions omit
  1.1974 +  this section; one exception is the primitive recursive functions example;
  1.1975 +  see theory \texttt{ex/Primrec}.
  1.1976 +  
  1.1977 +\item[\it type\_intrs] consists of introduction rules for type-checking the
  1.1978 +  definition: for demonstrating that the new set is included in its domain.
  1.1979 +  (The proof uses depth-first search.)
  1.1980 +
  1.1981 +\item[\it type\_elims] consists of elimination rules for type-checking the
  1.1982 +  definition.  They are presumed to be safe and are applied as often as
  1.1983 +  possible prior to the {\tt type\_intrs} search.
  1.1984 +\end{description}
  1.1985 +
  1.1986 +The package has a few restrictions:
  1.1987 +\begin{itemize}
  1.1988 +\item The theory must separately declare the recursive sets as
  1.1989 +  constants.
  1.1990 +
  1.1991 +\item The names of the recursive sets must be identifiers, not infix
  1.1992 +operators.  
  1.1993 +
  1.1994 +\item Side-conditions must not be conjunctions.  However, an introduction rule
  1.1995 +may contain any number of side-conditions.
  1.1996 +
  1.1997 +\item Side-conditions of the form $x=t$, where the variable~$x$ does not
  1.1998 +  occur in~$t$, will be substituted through the rule \verb|mutual_induct|.
  1.1999 +\end{itemize}
  1.2000 +
  1.2001 +
  1.2002 +\subsection{Example of an inductive definition}
  1.2003 +
  1.2004 +Two declarations, included in a theory file, define the finite powerset
  1.2005 +operator.  First we declare the constant~\texttt{Fin}.  Then we define it
  1.2006 +inductively, with two introduction rules:
  1.2007 +\begin{ttbox}
  1.2008 +consts  Fin :: i=>i
  1.2009 +
  1.2010 +inductive
  1.2011 +  domains   "Fin(A)" <= "Pow(A)"
  1.2012 +  intrs
  1.2013 +    emptyI  "0 : Fin(A)"
  1.2014 +    consI   "[| a: A;  b: Fin(A) |] ==> cons(a,b) : Fin(A)"
  1.2015 +  type_intrs empty_subsetI, cons_subsetI, PowI
  1.2016 +  type_elims "[make_elim PowD]"
  1.2017 +\end{ttbox}
  1.2018 +The resulting theory structure contains a substructure, called~\texttt{Fin}.
  1.2019 +It contains the introduction rules for $\texttt{Fin}(A)$ as the list
  1.2020 +\texttt{Fin.intrs}, and also individually as \texttt{Fin.emptyI} and
  1.2021 +\texttt{Fin.consI}.  The induction rule is \texttt{Fin.induct}.
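         +
         +As a sketch of what the induction rule expresses, here it is in ordinary
         +mathematical notation.  This is not copied from Isabelle, so the exact form
         +and variable names of \texttt{Fin.induct} may differ.  Because
         +$\texttt{Fin}(A)$ is the least set closed under the two introduction rules, a
         +property~$P$ holds of all its elements once it is preserved by both rules:
         +\[ P(\emptyset) \quad\mbox{and}\quad
         +   \forall a\in A.\; \forall b\in\texttt{Fin}(A).\;
         +      P(b) \rightarrow P(\texttt{cons}(a,b)) \]
         +together imply $P(x)$ for every $x\in\texttt{Fin}(A)$.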
  1.2022 +
  1.2023 +The chief problem with making (co)inductive definitions involves type-checking
  1.2024 +the rules.  Sometimes, additional theorems need to be supplied under
  1.2025 +\texttt{type_intrs} or \texttt{type_elims}.  If the package fails when trying
  1.2026 +to prove your introduction rules, then set the flag \ttindexbold{trace_induct}
  1.2027 +to \texttt{true} and try again.  (See the manual \emph{A Fixedpoint Approach
  1.2028 +  \ldots} for more discussion of type-checking.)
  1.2029 +
  1.2030 +In the example above, $\texttt{Pow}(A)$ is given as the domain of
  1.2031 +$\texttt{Fin}(A)$, for obviously every finite subset of~$A$ is a subset
  1.2032 +of~$A$.  However, the inductive definition package can prove that only when
  1.2033 +given a few hints.
  1.2034 +Here is the output that results (with the flag set) when the
  1.2035 +\texttt{type_intrs} and \texttt{type_elims} are omitted from the inductive
  1.2036 +definition above:
  1.2037 +\begin{ttbox}
  1.2038 +Inductive definition Finite.Fin
  1.2039 +Fin(A) ==
  1.2040 +lfp(Pow(A),
  1.2041 +    \%X. {z: Pow(A) . z = 0 | (EX a b. z = cons(a, b) & a : A & b : X)})
  1.2042 +  Proving monotonicity...
  1.2043 +\ttbreak
  1.2044 +  Proving the introduction rules...
  1.2045 +The typechecking subgoal:
  1.2046 +0 : Fin(A)
  1.2047 + 1. 0 : Pow(A)
  1.2048 +\ttbreak
  1.2049 +The subgoal after monos, type_elims:
  1.2050 +0 : Fin(A)
  1.2051 + 1. 0 : Pow(A)
  1.2052 +*** prove_goal: tactic failed
  1.2053 +\end{ttbox}
  1.2054 +We see the need to supply theorems to let the package prove
  1.2055 +$\emptyset\in\texttt{Pow}(A)$.  Restoring the \texttt{type_intrs} but not the
  1.2056 +\texttt{type_elims}, we again get an error message:
  1.2057 +\begin{ttbox}
  1.2058 +The typechecking subgoal:
  1.2059 +0 : Fin(A)
  1.2060 + 1. 0 : Pow(A)
  1.2061 +\ttbreak
  1.2062 +The subgoal after monos, type_elims:
  1.2063 +0 : Fin(A)
  1.2064 + 1. 0 : Pow(A)
  1.2065 +\ttbreak
  1.2066 +The typechecking subgoal:
  1.2067 +cons(a, b) : Fin(A)
  1.2068 + 1. [| a : A; b : Fin(A) |] ==> cons(a, b) : Pow(A)
  1.2069 +\ttbreak
  1.2070 +The subgoal after monos, type_elims:
  1.2071 +cons(a, b) : Fin(A)
  1.2072 + 1. [| a : A; b : Pow(A) |] ==> cons(a, b) : Pow(A)
  1.2073 +*** prove_goal: tactic failed
  1.2074 +\end{ttbox}
  1.2075 +The first rule has been type-checked, but the second one has failed.  The
  1.2076 +simplest solution to such problems is to prove the failed subgoal separately
  1.2077 +and to supply it under \texttt{type_intrs}.  The solution actually used is
  1.2078 +to supply, under \texttt{type_elims}, a rule that changes
  1.2079 +$b\in\texttt{Pow}(A)$ to $b\subseteq A$; together with \texttt{cons_subsetI}
  1.2080 +and \texttt{PowI}, it is enough to complete the type-checking.
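         +
         +To see why these theorems suffice, here is the reasoning for the second rule
         +as an informal derivation, using the assumption $a\in A$.  The roles given to
         +\texttt{cons_subsetI} and \texttt{PowI} follow their usual statements, which
         +should be checked against the theory sources:
         +\begin{ttbox}
         +b : Pow(A)            \textrm{assumption in the failed subgoal}
         +b <= A                \textrm{by the rule supplied under \texttt{type_elims}}
         +cons(a,b) <= A        \textrm{by \texttt{cons_subsetI}, using} a : A
         +cons(a,b) : Pow(A)    \textrm{by \texttt{PowI}}
         +\end{ttbox}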
  1.2081 +
  1.2082 +
  1.2083 +
  1.2084 +\subsection{Further examples}
  1.2085 +
  1.2086 +An inductive definition may involve arbitrary monotonic operators.  Here is a
  1.2087 +standard example: the accessible part of a relation.  Note the use
  1.2088 +of~\texttt{Pow} in the introduction rule and the corresponding mention of the
  1.2089 +rule \verb|Pow_mono| in the \texttt{monos} list.  If the desired rule has a
  1.2090 +universally quantified premise, usually the effect can be obtained using
  1.2091 +\texttt{Pow}.
  1.2092 +\begin{ttbox}
  1.2093 +consts  acc :: i=>i
  1.2094 +inductive
  1.2095 +  domains "acc(r)" <= "field(r)"
  1.2096 +  intrs
  1.2097 +    vimage  "[| r-``{a}: Pow(acc(r)); a: field(r) |] ==> a: acc(r)"
  1.2098 +  monos      Pow_mono
  1.2099 +\end{ttbox}
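         +To see how \texttt{Pow} stands in for a universal quantifier here, recall that
         +the inverse image of $\{a\}$ under~$r$ is the set of all $b$ such that
         +$\langle b,a\rangle\in r$.  The first premise of the rule says that this set
         +is a subset of $\texttt{acc}(r)$.  The following equivalence is an explanatory
         +sketch in ordinary notation, not an Isabelle theorem quoted from the sources:
         +\[ \{b \mid \langle b,a\rangle\in r\} \subseteq \texttt{acc}(r)
         +   \quad\Longleftrightarrow\quad
         +   \forall b.\; \langle b,a\rangle\in r \rightarrow b\in\texttt{acc}(r) \]
         +So the rule states that $a$ is accessible provided all its predecessors are.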
  1.2100 +
  1.2101 +Finally, here is a coinductive definition.  It captures (as a bisimulation)
  1.2102 +the notion of equality on lazy lists, which are first defined as a codatatype:
  1.2103 +\begin{ttbox}
  1.2104 +consts  llist :: i=>i
  1.2105 +codatatype  "llist(A)" = LNil | LCons ("a: A", "l: llist(A)")
  1.2106 +\ttbreak
  1.2107 +
  1.2108 +consts  lleq :: i=>i
  1.2109 +coinductive
  1.2110 +  domains "lleq(A)" <= "llist(A) * llist(A)"
  1.2111 +  intrs
  1.2112 +    LNil  "<LNil, LNil> : lleq(A)"
  1.2113 +    LCons "[| a:A; <l,l'>: lleq(A) |] 
  1.2114 +           ==> <LCons(a,l), LCons(a,l')>: lleq(A)"
  1.2115 +  type_intrs  "llist.intrs"
  1.2116 +\end{ttbox}
  1.2117 +This use of \texttt{type_intrs} is typical: the relation concerns the
  1.2118 +codatatype \texttt{llist}, so naturally the introduction rules for that
  1.2119 +codatatype will be required for type-checking the rules.
  1.2120 +
  1.2121 +The Isabelle distribution contains many other inductive definitions.  Simple
  1.2122 +examples are collected in subdirectory \texttt{ZF/ex}.  The directory
  1.2123 +\texttt{Coind} and the theory \texttt{ZF/ex/LList} contain coinductive
  1.2124 +definitions.  Larger examples may be found in other subdirectories of
  1.2125 +\texttt{ZF}, such as \texttt{IMP} and \texttt{Resid}.
  1.2126 +
  1.2127 +
  1.2128 +\subsection{The result structure}
  1.2129 +
  1.2130 +Each (co)inductive set defined in a theory file generates an \ML\ substructure
  1.2131 +having the same name.  The substructure contains the following elements:
  1.2132 +
  1.2133 +\begin{ttbox}
  1.2134 +val intrs         : thm list  \textrm{the introduction rules}
  1.2135 +val elim          : thm       \textrm{the elimination (case analysis) rule}
  1.2136 +val mk_cases      : thm list -> string -> thm  \textrm{case analysis, see below}
  1.2137 +val induct        : thm       \textrm{the standard induction rule}
  1.2138 +val mutual_induct : thm       \textrm{the mutual induction rule, or \texttt{True}}
  1.2139 +val defs          : thm list  \textrm{definitions of operators}
  1.2140 +val bnd_mono      : thm       \textrm{monotonicity property}
  1.2141 +val dom_subset    : thm       \textrm{inclusion in `bounding set'}
  1.2142 +\end{ttbox}
  1.2143 +Furthermore there is the theorem $C$\texttt{_I} for every constructor~$C$; for
  1.2144 +example, the \texttt{list} datatype's introduction rules are bound to the
  1.2145 +identifiers \texttt{Nil_I} and \texttt{Cons_I}.
  1.2146 +
  1.2147 +For a codatatype, the component \texttt{coinduct} is the coinduction rule,
  1.2148 +replacing the \texttt{induct} component.
  1.2149 +
  1.2150 +Recall that \ttindex{mk_cases} generates simplified instances of the
  1.2151 +elimination (case analysis) rule.  It is as useful for inductive definitions
  1.2152 +as it is for datatypes.  There are many examples in the theory
  1.2153 +\texttt{ex/Comb}, which is discussed at length
  1.2154 +elsewhere~\cite{paulson-generic}.  The theory first defines the datatype
  1.2155 +\texttt{comb} of combinators:
  1.2156 +\begin{ttbox}
  1.2157 +consts comb :: i
  1.2158 +datatype  "comb" = K
  1.2159 +                 | S
  1.2160 +                 | "#" ("p: comb", "q: comb")   (infixl 90)
  1.2161 +\end{ttbox}
  1.2162 +The theory goes on to define contraction and parallel contraction
  1.2163 +inductively.  Then the file \texttt{ex/Comb.ML} defines special cases of
  1.2164 +contraction using \texttt{mk_cases}:
  1.2165 +\begin{ttbox}
  1.2166 +val K_contractE = contract.mk_cases comb.con_defs "K -1-> r";
  1.2167 +{\out val K_contractE = "K -1-> ?r ==> ?Q" : thm}
  1.2168 +\end{ttbox}
  1.2169 +We can read this as saying that the combinator \texttt{K} cannot reduce to
  1.2170 +anything.  Similar elimination rules for \texttt{S} and application are also
  1.2171 +generated and are supplied to the classical reasoner.  Note that
  1.2172 +\texttt{comb.con_defs} is given to \texttt{mk_cases} to allow freeness
  1.2173 +reasoning on datatype \texttt{comb}.
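         +
         +By analogy with \texttt{K_contractE}, the rules for \texttt{S} and for
         +application could be generated along the following lines.  This is a sketch:
         +the identifiers \texttt{S_contractE} and \texttt{Ap_contractE} are names
         +assumed for illustration, and the resulting theorems are not shown because
         +they have not been reproduced from an Isabelle session.
         +\begin{ttbox}
         +val S_contractE  = contract.mk_cases comb.con_defs "S -1-> r";
         +val Ap_contractE = contract.mk_cases comb.con_defs "p#q -1-> r";
         +\end{ttbox}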
  1.2174 +
  1.2175 +\index{*coinductive|)} \index{*inductive|)}
  1.2176 +
  1.2177 +
  1.2178 +
  1.2179 +
  1.2180 +\section{The outer reaches of set theory}
  1.2181 +
  1.2182 +The constructions of the natural numbers and lists use a suite of
  1.2183 +operators for handling recursive function definitions.  I have described
  1.2184 +the developments in detail elsewhere~\cite{paulson-set-II}.  Here is a brief
  1.2185 +summary:
  1.2186 +\begin{itemize}
  1.2187 +  \item Theory \texttt{Trancl} defines the transitive closure of a relation
  1.2188 +    (as a least fixedpoint).
  1.2189 +
  1.2190 +  \item Theory \texttt{WF} proves the Well-Founded Recursion Theorem, using an
  1.2191 +    elegant approach of Tobias Nipkow.  This theorem permits general
  1.2192 +    recursive definitions within set theory.
  1.2193 +
  1.2194 +  \item Theory \texttt{Ord} defines the notions of transitive set and ordinal
  1.2195 +    number.  It derives transfinite induction.  A key definition is {\bf
  1.2196 +      less than}: $i<j$ if and only if $i$ and $j$ are both ordinals and
  1.2197 +    $i\in j$.  As a special case, it includes less than on the natural
  1.2198 +    numbers.
  1.2199 +    
  1.2200 +  \item Theory \texttt{Epsilon} derives $\varepsilon$-induction and
  1.2201 +    $\varepsilon$-recursion, which are generalisations of transfinite
  1.2202 +    induction and recursion.  It also defines \cdx{rank}$(x)$, which
  1.2203 +    is the least ordinal $\alpha$ such that $x$ is constructed at
  1.2204 +    stage $\alpha$ of the cumulative hierarchy (thus $x\in
  1.2205 +    V@{\alpha+1}$).
  1.2206 +\end{itemize}
  1.2207 +
  1.2208 +Other important theories lead to a theory of cardinal numbers.  They have
  1.2209 +not yet been written up anywhere.  Here is a summary:
  1.2210 +\begin{itemize}
  1.2211 +\item Theory \texttt{Rel} defines the basic properties of relations, such as
  1.2212 +  (ir)reflexivity, (a)symmetry, and transitivity.
  1.2213 +
  1.2214 +\item Theory \texttt{EquivClass} develops a theory of equivalence
  1.2215 +  classes, not using the Axiom of Choice.
  1.2216 +
  1.2217 +\item Theory \texttt{Order} defines partial orderings, total orderings and
  1.2218 +  wellorderings.
  1.2219 +
  1.2220 +\item Theory \texttt{OrderArith} defines orderings on sum and product sets.
  1.2221 +  These can be used to define ordinal arithmetic and have applications to
  1.2222 +  cardinal arithmetic.
  1.2223 +
  1.2224 +\item Theory \texttt{OrderType} defines order types.  Every wellordering is
  1.2225 +  equivalent to a unique ordinal, which is its order type.
  1.2226 +
  1.2227 +\item Theory \texttt{Cardinal} defines equipollence and cardinal numbers.
  1.2228 + 
  1.2229 +\item Theory \texttt{CardinalArith} defines cardinal addition and
  1.2230 +  multiplication, and proves their elementary laws.  It proves that there
  1.2231 +  is no greatest cardinal.  It also proves a deep result, namely
  1.2232 +  $\kappa\otimes\kappa=\kappa$ for every infinite cardinal~$\kappa$; see
  1.2233 +  Kunen~\cite[page 29]{kunen80}.  None of these results assume the Axiom of
  1.2234 +  Choice, which complicates their proofs considerably.  
  1.2235 +\end{itemize}
  1.2236 +
  1.2237 +The following developments involve the Axiom of Choice (AC):
  1.2238 +\begin{itemize}
  1.2239 +\item Theory \texttt{AC} asserts the Axiom of Choice and proves some simple
  1.2240 +  equivalent forms.
  1.2241 +
  1.2242 +\item Theory \texttt{Zorn} proves Hausdorff's Maximal Principle, Zorn's Lemma
  1.2243 +  and the Wellordering Theorem, following Abrial and
  1.2244 +  Laffitte~\cite{abrial93}.
  1.2245 +
  1.2246 +\item Theory \verb|Cardinal_AC| uses AC to prove simplified theorems about
  1.2247 +  the cardinals.  It also proves a theorem needed to justify
  1.2248 +  infinitely branching datatype declarations: if $\kappa$ is an infinite
  1.2249 +  cardinal and $|X(\alpha)| \le \kappa$ for all $\alpha<\kappa$ then
  1.2250 +  $|\union\sb{\alpha<\kappa} X(\alpha)| \le \kappa$.
  1.2251 +
  1.2252 +\item Theory \texttt{InfDatatype} proves theorems to justify infinitely
  1.2253 +  branching datatypes.  Arbitrary index sets are allowed, provided their
  1.2254 +  cardinalities have an upper bound.  The theory also justifies some
  1.2255 +  unusual cases of finite branching, involving the finite powerset operator
  1.2256 +  and the finite function space operator.
  1.2257 +\end{itemize}
  1.2258 +
  1.2259 +
  1.2260 +
  1.2261 +\section{The examples directories}
  1.2262 +Directory \texttt{ZF/IMP} contains a mechanised version of a semantic
  1.2263 +equivalence proof taken from Winskel~\cite{winskel93}.  It formalises the
  1.2264 +denotational and operational semantics of a simple while-language, then
  1.2265 +proves the two equivalent.  It contains several datatype and inductive
  1.2266 +definitions, and demonstrates their use.
  1.2267 +
  1.2268 +The directory \texttt{ZF/ex} contains further developments in {\ZF} set
  1.2269 +theory.  Here is an overview; see the files themselves for more details.  I
  1.2270 +describe much of this material in other
  1.2271 +publications~\cite{paulson-set-I,paulson-set-II,paulson-CADE}. 
  1.2272 +\begin{itemize}
  1.2273 +\item File \texttt{misc.ML} contains miscellaneous examples such as
  1.2274 +  Cantor's Theorem, the Schr\"oder-Bernstein Theorem and the `Composition
  1.2275 +  of homomorphisms' challenge~\cite{boyer86}.
  1.2276 +
  1.2277 +\item Theory \texttt{Ramsey} proves the finite exponent 2 version of
  1.2278 +  Ramsey's Theorem, following Basin and Kaufmann's
  1.2279 +  presentation~\cite{basin91}.
  1.2280 +
  1.2281 +\item Theory \texttt{Integ} develops a theory of the integers as
  1.2282 +  equivalence classes of pairs of natural numbers.
  1.2283 +
  1.2284 +\item Theory \texttt{Primrec} develops some computation theory.  It
  1.2285 +  inductively defines the set of primitive recursive functions and presents a
  1.2286 +  proof that Ackermann's function is not primitive recursive.
  1.2287 +
  1.2288 +\item Theory \texttt{Primes} defines the Greatest Common Divisor of two
  1.2289 +  natural numbers and the ``divides'' relation.
  1.2290 +
  1.2291 +\item Theory \texttt{Bin} defines a datatype for two's complement binary
  1.2292 +  integers, then proves rewrite rules to perform binary arithmetic.  For
  1.2293 +  instance, $1359\times {-}2468 = {-}3354012$ takes under 14 seconds.
  1.2294 +
  1.2295 +\item Theory \texttt{BT} defines the recursive data structure ${\tt
  1.2296 +    bt}(A)$, labelled binary trees.
  1.2297 +
  1.2298 +\item Theory \texttt{Term} defines a recursive data structure for terms
  1.2299 +  and term lists.  These are simply finite branching trees.
  1.2300 +
  1.2301 +\item Theory \texttt{TF} defines primitives for solving mutually
  1.2302 +  recursive equations over sets.  It constructs sets of trees and forests
  1.2303 +  as an example, including induction and recursion rules that handle the
  1.2304 +  mutual recursion.
  1.2305 +
  1.2306 +\item Theory \texttt{Prop} proves soundness and completeness of
  1.2307 +  propositional logic~\cite{paulson-set-II}.  This illustrates datatype
  1.2308 +  definitions, inductive definitions, structural induction and rule
  1.2309 +  induction.
  1.2310 +
  1.2311 +\item Theory \texttt{ListN} inductively defines the lists of $n$
  1.2312 +  elements~\cite{paulin92}.
  1.2313 +
  1.2314 +\item Theory \texttt{Acc} inductively defines the accessible part of a
  1.2315 +  relation~\cite{paulin92}.
  1.2316 +
  1.2317 +\item Theory \texttt{Comb} defines the datatype of combinators and
  1.2318 +  inductively defines contraction and parallel contraction.  It goes on to
  1.2319 +  prove the Church-Rosser Theorem.  This case study follows Camilleri and
  1.2320 +  Melham~\cite{camilleri92}.
  1.2321 +
  1.2322 +\item Theory \texttt{LList} defines lazy lists and a coinduction
  1.2323 +  principle for proving equations between them.
  1.2324 +\end{itemize}
  1.2325 +
  1.2326 +
  1.2327 +\section{A proof about powersets}\label{sec:ZF-pow-example}
  1.2328 +To demonstrate high-level reasoning about subsets, let us prove the
  1.2329 +equation ${\tt Pow}(A\cap B) = {{\tt Pow}(A)\cap {\tt Pow}(B)}$.  Compared
  1.2330 +with first-order logic, set theory involves a maze of rules, and theorems
  1.2331 +have many different proofs.  Attempting other proofs of the theorem might
  1.2332 +be instructive.  This proof exploits the lattice properties of
  1.2333 +intersection.  It also uses the monotonicity of the powerset operation,
  1.2334 +from \texttt{ZF/mono.ML}:
  1.2335 +\begin{ttbox}
  1.2336 +\tdx{Pow_mono}      A<=B ==> Pow(A) <= Pow(B)
  1.2337 +\end{ttbox}
  1.2338 +We enter the goal and make the first step, which breaks the equation into
  1.2339 +two inclusions by extensionality:\index{*equalityI theorem}
  1.2340 +\begin{ttbox}
  1.2341 +Goal "Pow(A Int B) = Pow(A) Int Pow(B)";
  1.2342 +{\out Level 0}
  1.2343 +{\out Pow(A Int B) = Pow(A) Int Pow(B)}
  1.2344 +{\out  1. Pow(A Int B) = Pow(A) Int Pow(B)}
  1.2345 +\ttbreak
  1.2346 +by (resolve_tac [equalityI] 1);
  1.2347 +{\out Level 1}
  1.2348 +{\out Pow(A Int B) = Pow(A) Int Pow(B)}
  1.2349 +{\out  1. Pow(A Int B) <= Pow(A) Int Pow(B)}
  1.2350 +{\out  2. Pow(A) Int Pow(B) <= Pow(A Int B)}
  1.2351 +\end{ttbox}
  1.2352 +Both inclusions could be tackled straightforwardly using \texttt{subsetI}.
  1.2353 +A shorter proof results from noting that intersection forms the greatest
  1.2354 +lower bound:\index{*Int_greatest theorem}
  1.2355 +\begin{ttbox}
  1.2356 +by (resolve_tac [Int_greatest] 1);
  1.2357 +{\out Level 2}
  1.2358 +{\out Pow(A Int B) = Pow(A) Int Pow(B)}
  1.2359 +{\out  1. Pow(A Int B) <= Pow(A)}
  1.2360 +{\out  2. Pow(A Int B) <= Pow(B)}
  1.2361 +{\out  3. Pow(A) Int Pow(B) <= Pow(A Int B)}
  1.2362 +\end{ttbox}
  1.2363 +Subgoal~1 follows by applying the monotonicity of \texttt{Pow} to $A\cap
  1.2364 +B\subseteq A$; subgoal~2 follows similarly:
  1.2365 +\index{*Int_lower1 theorem}\index{*Int_lower2 theorem}
  1.2366 +\begin{ttbox}
  1.2367 +by (resolve_tac [Int_lower1 RS Pow_mono] 1);
  1.2368 +{\out Level 3}
  1.2369 +{\out Pow(A Int B) = Pow(A) Int Pow(B)}
  1.2370 +{\out  1. Pow(A Int B) <= Pow(B)}
  1.2371 +{\out  2. Pow(A) Int Pow(B) <= Pow(A Int B)}
  1.2372 +\ttbreak
  1.2373 +by (resolve_tac [Int_lower2 RS Pow_mono] 1);
  1.2374 +{\out Level 4}
  1.2375 +{\out Pow(A Int B) = Pow(A) Int Pow(B)}
  1.2376 +{\out  1. Pow(A) Int Pow(B) <= Pow(A Int B)}
  1.2377 +\end{ttbox}
  1.2378 +We are left with the opposite inclusion, which we tackle in the
  1.2379 +straightforward way:\index{*subsetI theorem}
  1.2380 +\begin{ttbox}
  1.2381 +by (resolve_tac [subsetI] 1);
  1.2382 +{\out Level 5}
  1.2383 +{\out Pow(A Int B) = Pow(A) Int Pow(B)}
  1.2384 +{\out  1. !!x. x : Pow(A) Int Pow(B) ==> x : Pow(A Int B)}
  1.2385 +\end{ttbox}
  1.2386 +The subgoal is to show $x\in {\tt Pow}(A\cap B)$ assuming $x\in{\tt
  1.2387 +Pow}(A)\cap {\tt Pow}(B)$; eliminating this assumption produces two
  1.2388 +subgoals.  The rule \tdx{IntE} treats the intersection like a conjunction
  1.2389 +instead of unfolding its definition.
  1.2390 +\begin{ttbox}
  1.2391 +by (eresolve_tac [IntE] 1);
  1.2392 +{\out Level 6}
  1.2393 +{\out Pow(A Int B) = Pow(A) Int Pow(B)}
  1.2394 +{\out  1. !!x. [| x : Pow(A); x : Pow(B) |] ==> x : Pow(A Int B)}
  1.2395 +\end{ttbox}
  1.2396 +The next step replaces the \texttt{Pow} by the subset
  1.2397 +relation~($\subseteq$).\index{*PowI theorem}
  1.2398 +\begin{ttbox}
  1.2399 +by (resolve_tac [PowI] 1);
  1.2400 +{\out Level 7}
  1.2401 +{\out Pow(A Int B) = Pow(A) Int Pow(B)}
  1.2402 +{\out  1. !!x. [| x : Pow(A); x : Pow(B) |] ==> x <= A Int B}
  1.2403 +\end{ttbox}
  1.2404 +We perform the same replacement in the assumptions.  This is a good
  1.2405 +demonstration of the tactic \ttindex{dresolve_tac}:\index{*PowD theorem}
  1.2406 +\begin{ttbox}
  1.2407 +by (REPEAT (dresolve_tac [PowD] 1));
  1.2408 +{\out Level 8}
  1.2409 +{\out Pow(A Int B) = Pow(A) Int Pow(B)}
  1.2410 +{\out  1. !!x. [| x <= A; x <= B |] ==> x <= A Int B}
  1.2411 +\end{ttbox}
  1.2412 +The assumptions are that $x$ is a lower bound of both $A$ and~$B$, but
  1.2413 +$A\cap B$ is the greatest lower bound:\index{*Int_greatest theorem}
  1.2414 +\begin{ttbox}
  1.2415 +by (resolve_tac [Int_greatest] 1);
  1.2416 +{\out Level 9}
  1.2417 +{\out Pow(A Int B) = Pow(A) Int Pow(B)}
  1.2418 +{\out  1. !!x. [| x <= A; x <= B |] ==> x <= A}
  1.2419 +{\out  2. !!x. [| x <= A; x <= B |] ==> x <= B}
  1.2420 +\end{ttbox}
  1.2421 +To conclude the proof, we clear up the trivial subgoals:
  1.2422 +\begin{ttbox}
  1.2423 +by (REPEAT (assume_tac 1));
  1.2424 +{\out Level 10}
  1.2425 +{\out Pow(A Int B) = Pow(A) Int Pow(B)}
  1.2426 +{\out No subgoals!}
  1.2427 +\end{ttbox}
  1.2428 +\medskip
  1.2429 +We could have performed this proof in one step by applying
  1.2430 +\ttindex{Blast_tac}.  Let us
  1.2431 +go back to the start:
  1.2432 +\begin{ttbox}
  1.2433 +choplev 0;
  1.2434 +{\out Level 0}
  1.2435 +{\out Pow(A Int B) = Pow(A) Int Pow(B)}
  1.2436 +{\out  1. Pow(A Int B) = Pow(A) Int Pow(B)}
  1.2437 +by (Blast_tac 1);
  1.2438 +{\out Depth = 0}
  1.2439 +{\out Depth = 1}
  1.2440 +{\out Depth = 2}
  1.2441 +{\out Depth = 3}
  1.2442 +{\out Level 1}
  1.2443 +{\out Pow(A Int B) = Pow(A) Int Pow(B)}
  1.2444 +{\out No subgoals!}
  1.2445 +\end{ttbox}
  1.2446 +Past researchers regarded this as a difficult proof, as indeed it is if all
  1.2447 +the symbols are replaced by their definitions.
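         +
         +For comparison, here is an element-wise argument for the same equation in
         +ordinary notation.  It is offered as one of the alternative proofs mentioned
         +at the start of this section, not as a transcript of what
         +\texttt{Blast_tac} does internally:
         +\begin{eqnarray*}
         +  x\in{\tt Pow}(A\cap B) & \iff & x\subseteq A\cap B \\
         +    & \iff & x\subseteq A \;\wedge\; x\subseteq B \\
         +    & \iff & x\in{\tt Pow}(A) \;\wedge\; x\in{\tt Pow}(B) \\
         +    & \iff & x\in{\tt Pow}(A)\cap{\tt Pow}(B)
         +\end{eqnarray*}
         +The two sets have the same elements, so they are equal by extensionality.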
  1.2448 +\goodbreak
  1.2449 +
  1.2450 +\section{Monotonicity of the union operator}
  1.2451 +For another example, we prove that general union is monotonic:
  1.2452 +${C\subseteq D}$ implies $\bigcup(C)\subseteq \bigcup(D)$.  To begin, we
  1.2453 +tackle the inclusion using \tdx{subsetI}:
  1.2454 +\begin{ttbox}
  1.2455 +Goal "C<=D ==> Union(C) <= Union(D)";
  1.2456 +{\out Level 0}
  1.2457 +{\out C <= D ==> Union(C) <= Union(D)}
  1.2458 +{\out  1. C <= D ==> Union(C) <= Union(D)}
  1.2459 +\ttbreak
  1.2460 +by (resolve_tac [subsetI] 1);
  1.2461 +{\out Level 1}
  1.2462 +{\out C <= D ==> Union(C) <= Union(D)}
  1.2463 +{\out  1. !!x. [| C <= D; x : Union(C) |] ==> x : Union(D)}
  1.2464 +\end{ttbox}
  1.2465 +Big union is like an existential quantifier: its occurrence in the
  1.2466 +assumptions must be eliminated early, since eliminating it creates parameters.
  1.2467 +\index{*UnionE theorem}
  1.2468 +\begin{ttbox}
  1.2469 +by (eresolve_tac [UnionE] 1);
  1.2470 +{\out Level 2}
  1.2471 +{\out C <= D ==> Union(C) <= Union(D)}
  1.2472 +{\out  1. !!x B. [| C <= D; x : B; B : C |] ==> x : Union(D)}
  1.2473 +\end{ttbox}
  1.2474 +Now we may apply \tdx{UnionI}, which creates an unknown involving the
  1.2475 +parameters.  To show $x\in \bigcup(D)$ it suffices to show that $x$ belongs
  1.2476 +to some element, say~$\Var{B2}(x,B)$, of~$D$.
  1.2477 +\begin{ttbox}
  1.2478 +by (resolve_tac [UnionI] 1);
  1.2479 +{\out Level 3}
  1.2480 +{\out C <= D ==> Union(C) <= Union(D)}
  1.2481 +{\out  1. !!x B. [| C <= D; x : B; B : C |] ==> ?B2(x,B) : D}
  1.2482 +{\out  2. !!x B. [| C <= D; x : B; B : C |] ==> x : ?B2(x,B)}
  1.2483 +\end{ttbox}
  1.2484 +Combining \tdx{subsetD} with the assumption $C\subseteq D$ yields 
  1.2485 +$\Var{a}\in C \Imp \Var{a}\in D$, which reduces subgoal~1.  Note that
  1.2486 +\texttt{eresolve_tac} has removed that assumption from subgoal~1.
  1.2487 +\begin{ttbox}
  1.2488 +by (eresolve_tac [subsetD] 1);
  1.2489 +{\out Level 4}
  1.2490 +{\out C <= D ==> Union(C) <= Union(D)}
  1.2491 +{\out  1. !!x B. [| x : B; B : C |] ==> ?B2(x,B) : C}
  1.2492 +{\out  2. !!x B. [| C <= D; x : B; B : C |] ==> x : ?B2(x,B)}
  1.2493 +\end{ttbox}
  1.2494 +The rest is routine.  Observe how~$\Var{B2}(x,B)$ is instantiated to~$B$.
  1.2495 +\begin{ttbox}
  1.2496 +by (assume_tac 1);
  1.2497 +{\out Level 5}
  1.2498 +{\out C <= D ==> Union(C) <= Union(D)}
  1.2499 +{\out  1. !!x B. [| C <= D; x : B; B : C |] ==> x : B}
  1.2500 +by (assume_tac 1);
  1.2501 +{\out Level 6}
  1.2502 +{\out C <= D ==> Union(C) <= Union(D)}
  1.2503 +{\out No subgoals!}
  1.2504 +\end{ttbox}
  1.2505 +Again, \ttindex{Blast_tac} can prove the theorem in one step, starting from level~0.
  1.2506 +\begin{ttbox}
  1.2507 +by (Blast_tac 1);
  1.2508 +{\out Depth = 0}
  1.2509 +{\out Depth = 1}
  1.2510 +{\out Depth = 2}
  1.2511 +{\out Level 1}
  1.2512 +{\out C <= D ==> Union(C) <= Union(D)}
  1.2513 +{\out No subgoals!}
  1.2514 +\end{ttbox}
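         +
         +The interactive steps above can be collected into one script.  This is only
         +a sketch: it replays the tactics already shown, uses \texttt{REPEAT} for the
         +two final calls to \texttt{assume_tac}, and stores the result under the
         +hypothetical name \texttt{Union_mono_steps}, chosen purely for illustration.
         +\begin{ttbox}
         +Goal "C<=D ==> Union(C) <= Union(D)";
         +by (resolve_tac [subsetI] 1);      (*reduce inclusion to membership*)
         +by (eresolve_tac [UnionE] 1);      (*eliminate Union(C), creating parameter B*)
         +by (resolve_tac [UnionI] 1);       (*introduces the unknown ?B2(x,B)*)
         +by (eresolve_tac [subsetD] 1);     (*uses and removes C<=D in subgoal 1*)
         +by (REPEAT (assume_tac 1));        (*instantiates ?B2(x,B) and finishes*)
         +qed "Union_mono_steps";            (*hypothetical theorem name*)
         +\end{ttbox}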
  1.2515 +
  1.2516 +The file \texttt{ZF/equalities.ML} has many similar proofs.  Reasoning about
  1.2517 +general intersection can be difficult because of its anomalous behaviour on
  1.2518 +the empty set.  However, \ttindex{Blast_tac} copes well with such goals.  Here is
  1.2519 +a typical example, borrowed from Devlin~\cite[page 12]{devlin79}:
  1.2520 +\begin{ttbox}
  1.2521 +a:C ==> (INT x:C. A(x) Int B(x)) = (INT x:C. A(x)) Int (INT x:C. B(x))
  1.2522 +\end{ttbox}
  1.2523 +In traditional notation this is
  1.2524 +\[ a\in C \,\Imp\, \inter@{x\in C} \Bigl(A(x) \int B(x)\Bigr) =        
  1.2525 +       \Bigl(\inter@{x\in C} A(x)\Bigr)  \int  
  1.2526 +       \Bigl(\inter@{x\in C} B(x)\Bigr)  \]
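         +
         +Such a goal would be stated and attacked in the same way as the earlier
         +examples.  The following is a sketch only: the output is omitted, and it
         +assumes that the default claset is in force.
         +\begin{ttbox}
         +Goal "a:C ==> (INT x:C. A(x) Int B(x)) = \ttback
         +\ttback        (INT x:C. A(x)) Int (INT x:C. B(x))";
         +by (Blast_tac 1);                  (*expected to succeed, as noted above*)
         +\end{ttbox}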
  1.2527 +
  1.2528 +\section{Low-level reasoning about functions}
  1.2529 +The derived rules \texttt{lamI}, \texttt{lamE}, \texttt{lam_type}, \texttt{beta}
  1.2530 +and \texttt{eta} support reasoning about functions in a
  1.2531 +$\lambda$-calculus style.  This is generally easier than regarding
  1.2532 +functions as sets of ordered pairs.  But sometimes we must look at the
  1.2533 +underlying representation, as in the following proof
  1.2534 +of~\tdx{fun_disjoint_apply1}.  This states that if $f$ and~$g$ are
  1.2535 +functions with disjoint domains~$A$ and~$C$, and if $a\in A$, then
  1.2536 +$(f\un g)`a = f`a$:
  1.2537 +\begin{ttbox}
  1.2538 +Goal "[| a:A;  f: A->B;  g: C->D;  A Int C = 0 |] ==>  \ttback
  1.2539 +\ttback    (f Un g)`a = f`a";
  1.2540 +{\out Level 0}
  1.2541 +{\out [| a : A; f : A -> B; g : C -> D; A Int C = 0 |]}
  1.2542 +{\out ==> (f Un g) ` a = f ` a}
  1.2543 +{\out  1. [| a : A; f : A -> B; g : C -> D; A Int C = 0 |]}
  1.2544 +{\out     ==> (f Un g) ` a = f ` a}
  1.2545 +\end{ttbox}
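         +Informally, the argument to be formalized runs as follows.  This sketch
         +uses ordinary mathematical notation and is not Isabelle output:
         +\[ a\in A \;\mbox{and}\; f\in A\to B \quad\Imp\quad \pair{a,f`a}\in f
         +   \quad\Imp\quad \pair{a,f`a}\in f\un g. \]
         +Because the domains are disjoint, $f\un g$ is again a function, so the pair
         +$\pair{a,f`a}$ determines its value at~$a$, namely $(f\un g)`a = f`a$.
         +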
  1.2546 +Using \tdx{apply_equality}, we reduce the equality to reasoning about
  1.2547 +ordered pairs.  The second subgoal is to verify that $f\un g$ is a function.
  1.2548 +To save space, the assumptions will be abbreviated below.
  1.2549 +\begin{ttbox}
  1.2550 +by (resolve_tac [apply_equality] 1);
  1.2551 +{\out Level 1}
  1.2552 +{\out [| \ldots |] ==> (f Un g) ` a = f ` a}
  1.2553 +{\out  1. [| \ldots |] ==> <a,f ` a> : f Un g}
  1.2554 +{\out  2. [| \ldots |] ==> f Un g : (PROD x:?A. ?B(x))}
  1.2555 +\end{ttbox}
  1.2556 +We must show that the pair belongs to~$f$ or~$g$; by~\tdx{UnI1} we
  1.2557 +choose~$f$:
  1.2558 +\begin{ttbox}
  1.2559 +by (resolve_tac [UnI1] 1);
  1.2560 +{\out Level 2}
  1.2561 +{\out [| \ldots |] ==> (f Un g) ` a = f ` a}
  1.2562 +{\out  1. [| \ldots |] ==> <a,f ` a> : f}
  1.2563 +{\out  2. [| \ldots |] ==> f Un g : (PROD x:?A. ?B(x))}
  1.2564 +\end{ttbox}
  1.2565 +To show $\pair{a,f`a}\in f$ we use \tdx{apply_Pair}, which is
  1.2566 +essentially the converse of \tdx{apply_equality}:
  1.2567 +\begin{ttbox}
  1.2568 +by (resolve_tac [apply_Pair] 1);
  1.2569 +{\out Level 3}
  1.2570 +{\out [| \ldots |] ==> (f Un g) ` a = f ` a}
  1.2571 +{\out  1. [| \ldots |] ==> f : (PROD x:?A2. ?B2(x))}
  1.2572 +{\out  2. [| \ldots |] ==> a : ?A2}
  1.2573 +{\out  3. [| \ldots |] ==> f Un g : (PROD x:?A. ?B(x))}
  1.2574 +\end{ttbox}
  1.2575 +Using the assumptions $f\in A\to B$ and $a\in A$, we solve the two subgoals
  1.2576 +from \tdx{apply_Pair}.  Recall that a $\Pi$-set is merely a generalized
  1.2577 +function space, and observe that~{\tt?A2} is instantiated to~\texttt{A}.
  1.2578 +\begin{ttbox}
  1.2579 +by (assume_tac 1);
  1.2580 +{\out Level 4}
  1.2581 +{\out [| \ldots |] ==> (f Un g) ` a = f ` a}
  1.2582 +{\out  1. [| \ldots |] ==> a : A}
  1.2583 +{\out  2. [| \ldots |] ==> f Un g : (PROD x:?A. ?B(x))}
  1.2584 +by (assume_tac 1);
  1.2585 +{\out Level 5}
  1.2586 +{\out [| \ldots |] ==> (f Un g) ` a = f ` a}
  1.2587 +{\out  1. [| \ldots |] ==> f Un g : (PROD x:?A. ?B(x))}
  1.2588 +\end{ttbox}
  1.2589 +To construct functions of the form $f\un g$, we apply
  1.2590 +\tdx{fun_disjoint_Un}:
  1.2591 +\begin{ttbox}
  1.2592 +by (resolve_tac [fun_disjoint_Un] 1);
  1.2593 +{\out Level 6}
  1.2594 +{\out [| \ldots |] ==> (f Un g) ` a = f ` a}
  1.2595 +{\out  1. [| \ldots |] ==> f : ?A3 -> ?B3}
  1.2596 +{\out  2. [| \ldots |] ==> g : ?C3 -> ?D3}
  1.2597 +{\out  3. [| \ldots |] ==> ?A3 Int ?C3 = 0}
  1.2598 +\end{ttbox}
  1.2599 +The remaining subgoals are instances of the assumptions.  Again, observe how
  1.2600 +unknowns are instantiated:
  1.2601 +\begin{ttbox}
  1.2602 +by (assume_tac 1);
  1.2603 +{\out Level 7}
  1.2604 +{\out [| \ldots |] ==> (f Un g) ` a = f ` a}
  1.2605 +{\out  1. [| \ldots |] ==> g : ?C3 -> ?D3}
  1.2606 +{\out  2. [| \ldots |] ==> A Int ?C3 = 0}
  1.2607 +by (assume_tac 1);
  1.2608 +{\out Level 8}
  1.2609 +{\out [| \ldots |] ==> (f Un g) ` a = f ` a}
  1.2610 +{\out  1. [| \ldots |] ==> A Int C = 0}
  1.2611 +by (assume_tac 1);
  1.2612 +{\out Level 9}
  1.2613 +{\out [| \ldots |] ==> (f Un g) ` a = f ` a}
  1.2614 +{\out No subgoals!}
  1.2615 +\end{ttbox}
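         +For reference, the steps of this section can be gathered into a single
         +script.  It is a sketch that merely replays the tactics shown above; the
         +name \texttt{fun_disjoint_apply1_steps} is hypothetical, chosen so as not
         +to clash with the library theorem.
         +\begin{ttbox}
         +Goal "[| a:A;  f: A->B;  g: C->D;  A Int C = 0 |] ==>  \ttback
         +\ttback    (f Un g)`a = f`a";
         +by (resolve_tac [apply_equality] 1);   (*reduce to a membership of pairs*)
         +by (resolve_tac [UnI1] 1);             (*choose f rather than g*)
         +by (resolve_tac [apply_Pair] 1);       (*needs f : A->B and a : A*)
         +by (assume_tac 1);
         +by (assume_tac 1);
         +by (resolve_tac [fun_disjoint_Un] 1);  (*f Un g is a function*)
         +by (assume_tac 1);
         +by (assume_tac 1);
         +by (assume_tac 1);
         +qed "fun_disjoint_apply1_steps";       (*hypothetical theorem name*)
         +\end{ttbox}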
  1.2616 +See the files \texttt{ZF/func.ML} and \texttt{ZF/WF.ML} for more
  1.2617 +examples of reasoning about functions.
  1.2618 +
  1.2619 +\index{set theory|)}