doc-src/Inductive/ind-defs.tex
author Walther Neuper <neuper@ist.tugraz.at>
Thu, 12 Aug 2010 15:03:34 +0200
branchisac-from-Isabelle2009-2
changeset 37913 20e3616b2d9c
parent 9695 ec7d7f877712
child 43508 381fdcab0f36
permissions -rw-r--r--
prepare reactivation of isac-update-Isa09-2
     1 %% $Id$
     2 \documentclass[12pt,a4paper]{article}
     3 \usepackage{latexsym,../iman,../extra,../ttbox,../proof,../pdfsetup}
     4 
     5 \newif\ifshort%''Short'' means a published version, not the documentation
     6 \shortfalse%%%%%\shorttrue
     7 
     8 \title{A Fixedpoint Approach to\\ 
     9   (Co)Inductive and (Co)Datatype Definitions%
    10   \thanks{J. Grundy and S. Thompson made detailed comments.  Mads Tofte and
    11     the referees were also helpful.  The research was funded by the SERC
    12     grants GR/G53279, GR/H40570 and by the ESPRIT Project 6453 ``Types''.}}
    13 
    14 \author{Lawrence C. Paulson\\{\tt lcp@cl.cam.ac.uk}\\
    15         Computer Laboratory, University of Cambridge, England}
    16 \date{\today} 
    17 \setcounter{secnumdepth}{2} \setcounter{tocdepth}{2}
    18 
    19 \newcommand\sbs{\subseteq}
    20 \let\To=\Rightarrow
    21 
    22 \newcommand\defn[1]{{\bf#1}}
    23 
    24 \newcommand\pow{{\cal P}}
    25 \newcommand\RepFun{\hbox{\tt RepFun}}
    26 \newcommand\cons{\hbox{\tt cons}}
    27 \def\succ{\hbox{\tt succ}}
    28 \newcommand\split{\hbox{\tt split}}
    29 \newcommand\fst{\hbox{\tt fst}}
    30 \newcommand\snd{\hbox{\tt snd}}
    31 \newcommand\converse{\hbox{\tt converse}}
    32 \newcommand\domain{\hbox{\tt domain}}
    33 \newcommand\range{\hbox{\tt range}}
    34 \newcommand\field{\hbox{\tt field}}
    35 \newcommand\lfp{\hbox{\tt lfp}}
    36 \newcommand\gfp{\hbox{\tt gfp}}
    37 \newcommand\id{\hbox{\tt id}}
    38 \newcommand\trans{\hbox{\tt trans}}
    39 \newcommand\wf{\hbox{\tt wf}}
    40 \newcommand\nat{\hbox{\tt nat}}
    41 \newcommand\rank{\hbox{\tt rank}}
    42 \newcommand\univ{\hbox{\tt univ}}
    43 \newcommand\Vrec{\hbox{\tt Vrec}}
    44 \newcommand\Inl{\hbox{\tt Inl}}
    45 \newcommand\Inr{\hbox{\tt Inr}}
    46 \newcommand\case{\hbox{\tt case}}
    47 \newcommand\lst{\hbox{\tt list}}
    48 \newcommand\Nil{\hbox{\tt Nil}}
    49 \newcommand\Cons{\hbox{\tt Cons}}
    50 \newcommand\lstcase{\hbox{\tt list\_case}}
    51 \newcommand\lstrec{\hbox{\tt list\_rec}}
    52 \newcommand\length{\hbox{\tt length}}
    53 \newcommand\listn{\hbox{\tt listn}}
    54 \newcommand\acc{\hbox{\tt acc}}
    55 \newcommand\primrec{\hbox{\tt primrec}}
    56 \newcommand\SC{\hbox{\tt SC}}
    57 \newcommand\CONST{\hbox{\tt CONST}}
    58 \newcommand\PROJ{\hbox{\tt PROJ}}
    59 \newcommand\COMP{\hbox{\tt COMP}}
    60 \newcommand\PREC{\hbox{\tt PREC}}
    61 
    62 \newcommand\quniv{\hbox{\tt quniv}}
    63 \newcommand\llist{\hbox{\tt llist}}
    64 \newcommand\LNil{\hbox{\tt LNil}}
    65 \newcommand\LCons{\hbox{\tt LCons}}
    66 \newcommand\lconst{\hbox{\tt lconst}}
    67 \newcommand\lleq{\hbox{\tt lleq}}
    68 \newcommand\map{\hbox{\tt map}}
    69 \newcommand\term{\hbox{\tt term}}
    70 \newcommand\Apply{\hbox{\tt Apply}}
    71 \newcommand\termcase{\hbox{\tt term\_case}}
    72 \newcommand\rev{\hbox{\tt rev}}
    73 \newcommand\reflect{\hbox{\tt reflect}}
    74 \newcommand\tree{\hbox{\tt tree}}
    75 \newcommand\forest{\hbox{\tt forest}}
    76 \newcommand\Part{\hbox{\tt Part}}
    77 \newcommand\TF{\hbox{\tt tree\_forest}}
    78 \newcommand\Tcons{\hbox{\tt Tcons}}
    79 \newcommand\Fcons{\hbox{\tt Fcons}}
    80 \newcommand\Fnil{\hbox{\tt Fnil}}
    81 \newcommand\TFcase{\hbox{\tt TF\_case}}
    82 \newcommand\Fin{\hbox{\tt Fin}}
    83 \newcommand\QInl{\hbox{\tt QInl}}
    84 \newcommand\QInr{\hbox{\tt QInr}}
    85 \newcommand\qsplit{\hbox{\tt qsplit}}
    86 \newcommand\qcase{\hbox{\tt qcase}}
    87 \newcommand\Con{\hbox{\tt Con}}
    88 \newcommand\data{\hbox{\tt data}}
    89 
    90 \binperiod     %%%treat . like a binary operator
    91 
    92 \begin{document}
    93 \pagestyle{empty}
    94 \begin{titlepage}
    95 \maketitle 
    96 \begin{abstract}
    97   This paper presents a fixedpoint approach to inductive definitions.
    98   Instead of using a syntactic test such as ``strictly positive,'' the
    99   approach lets definitions involve any operators that have been proved
   100   monotone.  It is conceptually simple, which has allowed the easy
   101   implementation of mutual recursion and iterated definitions.  It also
   102   handles coinductive definitions: simply replace the least fixedpoint by a
   103   greatest fixedpoint.  
   104 
   105   The method has been implemented in two of Isabelle's logics, \textsc{zf} set
   106   theory and higher-order logic.  It should be applicable to any logic in
   107   which the Knaster-Tarski theorem can be proved.  Examples include lists of
   108   $n$ elements, the accessible part of a relation and the set of primitive
   109   recursive functions.  One example of a coinductive definition is
   110   bisimulations for lazy lists.  Recursive datatypes are examined in detail,
   111   as well as one example of a \defn{codatatype}: lazy lists.
   112 
   113   The Isabelle package has been applied in several large case studies,
   114   including two proofs of the Church-Rosser theorem and a coinductive proof of
   115   semantic consistency.  The package can be trusted because it proves theorems
   116   from definitions, instead of asserting desired properties as axioms.
   117 \end{abstract}
   118 %
   119 \bigskip
   120 \centerline{Copyright \copyright{} \number\year{} by Lawrence C. Paulson}
   121 \thispagestyle{empty} 
   122 \end{titlepage}
   123 \tableofcontents\cleardoublepage\pagestyle{plain}
   124 
   125 \setcounter{page}{1}
   126 
   127 \section{Introduction}
   128 Several theorem provers provide commands for formalizing recursive data
   129 structures, like lists and trees.  Robin Milner implemented one of the first
   130 of these, for Edinburgh \textsc{lcf}~\cite{milner-ind}.  Given a description
   131 of the desired data structure, Milner's package formulated appropriate
   132 definitions and proved the characteristic theorems.  Similar is Melham's
   133 recursive type package for the Cambridge \textsc{hol} system~\cite{melham89}.
   134 Such data structures are called \defn{datatypes}
   135 below, by analogy with datatype declarations in Standard~\textsc{ml}\@.
   136 Some logics take datatypes as primitive; consider Boyer and Moore's shell
   137 principle~\cite{bm79} and the Coq type theory~\cite{paulin-tlca}.
   138 
   139 A datatype is but one example of an \defn{inductive definition}.  Such a
   140 definition~\cite{aczel77} specifies the least set~$R$ \defn{closed under}
   141 given rules: applying a rule to elements of~$R$ yields a result within~$R$.
   142 Inductive definitions have many applications.  The collection of theorems in a
   143 logic is inductively defined.  A structural operational
   144 semantics~\cite{hennessy90} is an inductive definition of a reduction or
   145 evaluation relation on programs.  A few theorem provers provide commands for
   146 formalizing inductive definitions; these include Coq~\cite{paulin-tlca} and
   147 again the \textsc{hol} system~\cite{camilleri92}.
   148 
   149 The dual notion is that of a \defn{coinductive definition}.  Such a definition
   150 specifies the greatest set~$R$ \defn{consistent with} given rules: every
   151 element of~$R$ can be seen as arising by applying a rule to elements of~$R$.
   152 Important examples include using bisimulation relations to formalize
   153 equivalence of processes~\cite{milner89} or lazy functional
   154 programs~\cite{abramsky90}.  Other examples include lazy lists and other
   155 infinite data structures; these are called \defn{codatatypes} below.
   156 
   157 Not all inductive definitions are meaningful.  \defn{Monotone} inductive
   158 definitions are a large, well-behaved class.  Monotonicity can be enforced
   159 by syntactic conditions such as ``strictly positive,'' but this could lead to
   160 monotone definitions being rejected on the grounds of their syntactic form.
   161 More flexible is to formalize monotonicity within the logic and allow users
   162 to prove it.
   163 
   164 This paper describes a package based on a fixedpoint approach.  Least
   165 fixedpoints yield inductive definitions; greatest fixedpoints yield
   166 coinductive definitions.  Most of the discussion below applies equally to
   167 inductive and coinductive definitions, and most of the code is shared.  
   168 
   169 The package supports mutual recursion and infinitely-branching datatypes and
   170 codatatypes.  It allows use of any operators that have been proved monotone,
   171 thus accepting all provably monotone inductive definitions, including
   172 iterated definitions.
   173 
   174 The package has been implemented in
   175 Isabelle~\cite{paulson-markt,paulson-isa-book} using 
   176 \textsc{zf} set theory \cite{paulson-set-I,paulson-set-II}; part of it has
   177 since been ported to Isabelle/\textsc{hol} (higher-order logic).  The
   178 recursion equations are specified as introduction rules for the mutually
   179 recursive sets.  The package transforms these rules into a mapping over sets,
   180 and attempts to prove that the mapping is monotonic and well-typed.  If
   181 successful, the package makes fixedpoint definitions and proves the
   182 introduction, elimination and (co)induction rules.  Users invoke the package
   183 by making simple declarations in Isabelle theory files.
   184 
   185 Most datatype packages equip the new datatype with some means of expressing
   186 recursive functions.  This is the main omission from my package.  Its
   187 fixedpoint operators define only recursive sets.  The Isabelle/\textsc{zf}
   188 theory provides well-founded recursion~\cite{paulson-set-II}, which is harder
   189 to use than structural recursion but considerably more general.
   190 Slind~\cite{slind-tfl} has written a package to automate the definition of
   191 well-founded recursive functions in Isabelle/\textsc{hol}.
   192 
   193 \paragraph*{Outline.} Section~2 introduces the least and greatest fixedpoint
   194 operators.  Section~3 discusses the form of introduction rules, mutual
   195 recursion and other points common to inductive and coinductive definitions.
   196 Section~4 discusses induction and coinduction rules separately.  Section~5
   197 presents several examples, including a coinductive definition.  Section~6
   198 describes datatype definitions.  Section~7 presents related work.
   199 Section~8 draws brief conclusions.  \ifshort\else The appendices are simple
   200 user's manuals for this Isabelle package.\fi
   201 
   202 Most of the definitions and theorems shown below have been generated by the
   203 package.  I have renamed some variables to improve readability.
   204  
   205 \section{Fixedpoint operators}
   206 In set theory, the least and greatest fixedpoint operators are defined as
   207 follows:
   208 \begin{eqnarray*}
   209    \lfp(D,h)  & \equiv & \inter\{X\sbs D. h(X)\sbs X\} \\
   210    \gfp(D,h)  & \equiv & \union\{X\sbs D. X\sbs h(X)\}
   211 \end{eqnarray*}   
   212 Let $D$ be a set.  Say that $h$ is \defn{bounded by}~$D$ if $h(D)\sbs D$, and
   213 \defn{monotone below~$D$} if
   214 $h(A)\sbs h(B)$ for all $A$ and $B$ such that $A\sbs B\sbs D$.  If $h$ is
   215 bounded by~$D$ and monotone then both operators yield fixedpoints:
   216 \begin{eqnarray*}
   217    \lfp(D,h)  & = & h(\lfp(D,h)) \\
   218    \gfp(D,h)  & = & h(\gfp(D,h)) 
   219 \end{eqnarray*}   
   220 These equations are instances of the Knaster-Tarski theorem, which states
   221 that every monotonic function over a complete lattice has a
   222 fixedpoint~\cite{davey-priestley}.  It is obvious from their definitions
   223 that $\lfp$ must be the least fixedpoint, and $\gfp$ the greatest.
   224 
   225 This fixedpoint theory is simple.  The Knaster-Tarski theorem is easy to
   226 prove.  Showing monotonicity of~$h$ is trivial, in typical cases.  We must
   227 also exhibit a bounding set~$D$ for~$h$.  Frequently this is trivial, as when
   228 a set of theorems is (co)inductively defined over some previously existing set
   229 of formul{\ae}.  Isabelle/\textsc{zf} provides suitable bounding sets for
   230 infinitely-branching (co)datatype definitions; see~\S\ref{univ-sec}.  Bounding
   231 sets are also called \defn{domains}.
   232 
   233 The powerset operator is monotone, but by Cantor's theorem there is no
   234 set~$A$ such that $A=\pow(A)$.  We cannot put $A=\lfp(D,\pow)$ because
   235 there is no suitable domain~$D$.  But \S\ref{acc-sec} demonstrates
   236 that~$\pow$ is still useful in inductive definitions.
   237 
   238 \section{Elements of an inductive or coinductive definition}\label{basic-sec}
   239 Consider a (co)inductive definition of the sets $R_1$, \ldots,~$R_n$, in
   240 mutual recursion.  They will be constructed from domains $D_1$,
   241 \ldots,~$D_n$, respectively.  The construction yields not $R_i\sbs D_i$ but
   242 $R_i\sbs D_1+\cdots+D_n$, where $R_i$ is contained in the image of~$D_i$
   243 under an injection.  Reasons for this are discussed
   244 elsewhere~\cite[\S4.5]{paulson-set-II}.
   245 
   246 The definition may involve arbitrary parameters $\vec{p}=p_1$,
   247 \ldots,~$p_k$.  Each recursive set then has the form $R_i(\vec{p})$.  The
   248 parameters must be identical every time they occur within a definition.  This
   249 would appear to be a serious restriction compared with other systems such as
   250 Coq~\cite{paulin-tlca}.  For instance, we cannot define the lists of
   251 $n$ elements as the set $\listn(A,n)$ using rules where the parameter~$n$
   252 varies.  Section~\ref{listn-sec} describes how to express this set using the
   253 inductive definition package.
   254 
   255 To avoid clutter below, the recursive sets are shown as simply $R_i$
   256 instead of~$R_i(\vec{p})$.
   257 
   258 \subsection{The form of the introduction rules}\label{intro-sec}
   259 The body of the definition consists of the desired introduction rules.  The
   260 conclusion of each rule must have the form $t\in R_i$, where $t$ is any term.
   261 Premises typically have the same form, but they can have the more general form
   262 $t\in M(R_i)$ or express arbitrary side-conditions.
   263 
   264 The premise $t\in M(R_i)$ is permitted if $M$ is a monotonic operator on
   265 sets, satisfying the rule 
   266 \[ \infer{M(A)\sbs M(B)}{A\sbs B} \]
   267 The user must supply the package with monotonicity rules for all such premises.
   268 
   269 The ability to introduce new monotone operators makes the approach
   270 flexible.  A suitable choice of~$M$ and~$t$ can express a lot.  The
   271 powerset operator $\pow$ is monotone, and the premise $t\in\pow(R)$
   272 expresses $t\sbs R$; see \S\ref{acc-sec} for an example.  The \emph{list of}
   273 operator is monotone, as is easily proved by induction.  The premise
   274 $t\in\lst(R)$ avoids having to encode the effect of~$\lst(R)$ using mutual
   275 recursion; see \S\ref{primrec-sec} and also my earlier
   276 paper~\cite[\S4.4]{paulson-set-II}.
   277 
   278 Introduction rules may also contain \defn{side-conditions}.  These are
   279 premises consisting of arbitrary formul{\ae} not mentioning the recursive
   280 sets. Side-conditions typically involve type-checking.  One example is the
   281 premise $a\in A$ in the following rule from the definition of lists:
   282 \[ \infer{\Cons(a,l)\in\lst(A)}{a\in A & l\in\lst(A)} \]
   283 
   284 \subsection{The fixedpoint definitions}
   285 The package translates the list of desired introduction rules into a fixedpoint
   286 definition.  Consider, as a running example, the finite powerset operator
   287 $\Fin(A)$: the set of all finite subsets of~$A$.  It can be
   288 defined as the least set closed under the rules
   289 \[  \emptyset\in\Fin(A)  \qquad 
   290     \infer{\{a\}\un b\in\Fin(A)}{a\in A & b\in\Fin(A)} 
   291 \]
   292 
   293 The domain in a (co)inductive definition must be some existing set closed
   294 under the rules.  A suitable domain for $\Fin(A)$ is $\pow(A)$, the set of all
   295 subsets of~$A$.  The package generates the definition
   296 \[  \Fin(A) \equiv \lfp(\pow(A), \,
   297   \begin{array}[t]{r@{\,}l}
   298       \lambda X. \{z\in\pow(A). & z=\emptyset \disj{} \\
   299                   &(\exists a\,b. z=\{a\}\un b\conj a\in A\conj b\in X)\})
   300   \end{array}
   301 \]
   302 The contribution of each rule to the definition of $\Fin(A)$ should be
   303 obvious.  A coinductive definition is similar but uses $\gfp$ instead
   304 of~$\lfp$.
   305 
   306 The package must prove that the fixedpoint operator is applied to a
   307 monotonic function.  If the introduction rules have the form described
   308 above, and if the package is supplied a monotonicity theorem for every
   309 $t\in M(R_i)$ premise, then this proof is trivial.\footnote{Due to the
   310   presence of logical connectives in the fixedpoint's body, the
   311   monotonicity proof requires some unusual rules.  These state that the
   312   connectives $\conj$, $\disj$ and $\exists$ preserve monotonicity with respect
   313   to the partial ordering on unary predicates given by $P\sqsubseteq Q$ if and
   314   only if $\forall x.P(x)\imp Q(x)$.}
   315 
   316 The package returns its result as an \textsc{ml} structure, which consists of named
   317 components; we may regard it as a record.  The result structure contains
   318 the definitions of the recursive sets as a theorem list called {\tt defs}.
   319 It also contains some theorems; {\tt dom\_subset} is an inclusion such as 
   320 $\Fin(A)\sbs\pow(A)$, while {\tt bnd\_mono} asserts that the fixedpoint
   321 definition is monotonic.
   322 
   323 Internally the package uses the theorem {\tt unfold}, a fixedpoint equation
   324 such as
   325 \[
   326   \begin{array}[t]{r@{\,}l}
   327      \Fin(A) = \{z\in\pow(A). & z=\emptyset \disj{} \\
   328              &(\exists a\,b. z=\{a\}\un b\conj a\in A\conj b\in \Fin(A))\}
   329   \end{array}
   330 \]
   331 In order to save space, this theorem is not exported.  
   332 
   333 
   334 \subsection{Mutual recursion} \label{mutual-sec}
   335 In a mutually recursive definition, the domain of the fixedpoint construction
   336 is the disjoint sum of the domain~$D_i$ of each~$R_i$, for $i=1$,
   337 \ldots,~$n$.  The package uses the injections of the
   338 binary disjoint sum, typically $\Inl$ and~$\Inr$, to express injections
   339 $h_{1n}$, \ldots, $h_{nn}$ for the $n$-ary disjoint sum $D_1+\cdots+D_n$.
   340 
   341 As discussed elsewhere \cite[\S4.5]{paulson-set-II}, Isabelle/\textsc{zf} defines the
   342 operator $\Part$ to support mutual recursion.  The set $\Part(A,h)$
   343 contains those elements of~$A$ having the form~$h(z)$:
   344 \[ \Part(A,h)  \equiv \{x\in A. \exists z. x=h(z)\}. \]   
   345 For mutually recursive sets $R_1$, \ldots,~$R_n$ with
   346 $n>1$, the package makes $n+1$ definitions.  The first defines a set $R$ using
   347 a fixedpoint operator. The remaining $n$ definitions have the form
   348 \[ R_i \equiv \Part(R,h_{in}), \qquad i=1,\ldots, n.  \] 
   349 It follows that $R=R_1\un\cdots\un R_n$, where the $R_i$ are pairwise disjoint.
   350 
   351 
   352 \subsection{Proving the introduction rules}
   353 The user supplies the package with the desired form of the introduction
   354 rules.  Once it has derived the theorem {\tt unfold}, it attempts
   355 to prove those rules.  From the user's point of view, this is the
   356 trickiest stage; the proofs often fail.  The task is to show that the domain 
   357 $D_1+\cdots+D_n$ of the combined set $R_1\un\cdots\un R_n$ is
   358 closed under all the introduction rules.  This essentially involves replacing
   359 each~$R_i$ by $D_1+\cdots+D_n$ in each of the introduction rules and
   360 attempting to prove the result.
   361 
   362 Consider the $\Fin(A)$ example.  After substituting $\pow(A)$ for $\Fin(A)$
   363 in the rules, the package must prove
   364 \[  \emptyset\in\pow(A)  \qquad 
   365     \infer{\{a\}\un b\in\pow(A)}{a\in A & b\in\pow(A)} 
   366 \]
   367 Such proofs can be regarded as type-checking the definition.\footnote{The
   368   Isabelle/\textsc{hol} version does not require these proofs, as \textsc{hol}
   369   has implicit type-checking.} The user supplies the package with
   370 type-checking rules to apply.  Usually these are general purpose rules from
   371 the \textsc{zf} theory.  They could however be rules specifically proved for a
   372 particular inductive definition; sometimes this is the easiest way to get the
   373 definition through!
   374 
   375 The result structure contains the introduction rules as the theorem list {\tt
   376 intrs}.
   377 
   378 \subsection{The case analysis rule}
   379 The elimination rule, called {\tt elim}, performs case analysis.  It is a
   380 simple consequence of {\tt unfold}.  There is one case for each introduction
   381 rule.  If $x\in\Fin(A)$ then either $x=\emptyset$ or else $x=\{a\}\un b$ for
   382 some $a\in A$ and $b\in\Fin(A)$.  Formally, the elimination rule for $\Fin(A)$
   383 is written
   384 \[ \infer{Q}{x\in\Fin(A) & \infer*{Q}{[x=\emptyset]}
   385                  & \infer*{Q}{[x=\{a\}\un b & a\in A &b\in\Fin(A)]_{a,b}} }
   386 \]
   387 The subscripted variables $a$ and~$b$ above the third premise are
   388 eigenvariables, subject to the usual ``not free in \ldots'' proviso.
   389 
   390 
   391 \section{Induction and coinduction rules}
   392 Here we must consider inductive and coinductive definitions separately.  For
   393 an inductive definition, the package returns an induction rule derived
   394 directly from the properties of least fixedpoints, as well as a modified rule
   395 for mutual recursion.  For a coinductive definition, the package returns a
   396 basic coinduction rule.
   397 
   398 \subsection{The basic induction rule}\label{basic-ind-sec}
   399 The basic rule, called {\tt induct}, is appropriate in most situations.
   400 For inductive definitions, it is strong rule induction~\cite{camilleri92}; for
   401 datatype definitions (see below), it is just structural induction.  
   402 
   403 The induction rule for an inductively defined set~$R$ has the form described
   404 below.  For the time being, assume that $R$'s domain is not a Cartesian
   405 product; inductively defined relations are treated slightly differently.
   406 
   407 The major premise is $x\in R$.  There is a minor premise for each
   408 introduction rule:
   409 \begin{itemize}
   410 \item If the introduction rule concludes $t\in R_i$, then the minor premise
   411 is~$P(t)$.
   412 
   413 \item The minor premise's eigenvariables are precisely the introduction
   414 rule's free variables that are not parameters of~$R$.  For instance, the
   415 eigenvariables in the $\Fin(A)$ rule below are $a$ and $b$, but not~$A$.
   416 
   417 \item If the introduction rule has a premise $t\in R_i$, then the minor
   418 premise discharges the assumption $t\in R_i$ and the induction
   419 hypothesis~$P(t)$.  If the introduction rule has a premise $t\in M(R_i)$
   420 then the minor premise discharges the single assumption
   421 \[ t\in M(\{z\in R_i. P(z)\}). \] 
   422 Because $M$ is monotonic, this assumption implies $t\in M(R_i)$.  The
   423 occurrence of $P$ gives the effect of an induction hypothesis, which may be
   424 exploited by appealing to properties of~$M$.
   425 \end{itemize}
   426 The induction rule for $\Fin(A)$ resembles the elimination rule shown above,
   427 but includes an induction hypothesis:
   428 \[ \infer{P(x)}{x\in\Fin(A) & P(\emptyset)
   429         & \infer*{P(\{a\}\un b)}{[a\in A & b\in\Fin(A) & P(b)]_{a,b}} }
   430 \] 
   431 Stronger induction rules often suggest themselves.  We can derive a rule for
   432 $\Fin(A)$ whose third premise discharges the extra assumption $a\not\in b$.
   433 The package provides rules for mutual induction and inductive relations.  The
   434 Isabelle/\textsc{zf} theory also supports well-founded induction and recursion
   435 over datatypes, by reasoning about the \defn{rank} of a
   436 set~\cite[\S3.4]{paulson-set-II}.
   437 
   438 
   439 \subsection{Modified induction rules}
   440 
   441 If the domain of $R$ is a Cartesian product $A_1\times\cdots\times A_m$
   442 (however nested), then the corresponding predicate $P_i$ takes $m$ arguments.
   443 The major premise becomes $\pair{z_1,\ldots,z_m}\in R$ instead of $x\in R$;
   444 the conclusion becomes $P(z_1,\ldots,z_m)$.  This simplifies reasoning about
   445 inductively defined relations, eliminating the need to express properties of
   446 $z_1$, \ldots,~$z_m$ as properties of the tuple $\pair{z_1,\ldots,z_m}$.
   447 Occasionally it may require you to split up the induction variable
   448 using {\tt SigmaE} and {\tt dom\_subset}, especially if the constant {\tt
   449   split} appears in the rule.
   450 
   451 The mutual induction rule is called {\tt
   452 mutual\_induct}.  It differs from the basic rule in two respects:
   453 \begin{itemize}
   454 \item Instead of a single predicate~$P$, it uses $n$ predicates $P_1$,
   455 \ldots,~$P_n$: one for each recursive set.
   456 
   457 \item There is no major premise such as $x\in R_i$.  Instead, the conclusion
   458 refers to all the recursive sets:
   459 \[ (\forall z.z\in R_1\imp P_1(z))\conj\cdots\conj
   460    (\forall z.z\in R_n\imp P_n(z))
   461 \]
   462 Proving the premises establishes $P_i(z)$ for $z\in R_i$ and $i=1$,
   463 \ldots,~$n$.
   464 \end{itemize}
   465 %
   466 If the domain of some $R_i$ is a Cartesian product, then the mutual induction
   467 rule is modified accordingly.  The predicates are made to take $m$ separate
   468 arguments instead of a tuple, and the quantification in the conclusion is over
   469 the separate variables $z_1$, \ldots, $z_m$.
   470 
   471 \subsection{Coinduction}\label{coind-sec}
   472 A coinductive definition yields a primitive coinduction rule, with no
   473 refinements such as those for the induction rules.  (Experience may suggest
   474 refinements later.)  Consider the codatatype of lazy lists as an example.  For
   475 suitable definitions of $\LNil$ and $\LCons$, lazy lists may be defined as the
   476 greatest set consistent with the rules
   477 \[  \LNil\in\llist(A)  \qquad 
   478     \infer[(-)]{\LCons(a,l)\in\llist(A)}{a\in A & l\in\llist(A)}
   479 \]
   480 The $(-)$ tag stresses that this is a coinductive definition.  A suitable
   481 domain for $\llist(A)$ is $\quniv(A)$; this set is closed under the variant
   482 forms of sum and product that are used to represent non-well-founded data
   483 structures (see~\S\ref{univ-sec}).
   484 
   485 The package derives an {\tt unfold} theorem similar to that for $\Fin(A)$. 
   486 Then it proves the theorem {\tt coinduct}, which expresses that $\llist(A)$
   487 is the greatest solution to this equation contained in $\quniv(A)$:
   488 \[ \infer{x\in\llist(A)}{x\in X & X\sbs \quniv(A) &
   489     \infer*{
   490      \begin{array}[b]{r@{}l}
   491        z=\LNil\disj 
   492        \bigl(\exists a\,l.\, & z=\LCons(a,l) \conj a\in A \conj{}\\
   493                              & l\in X\un\llist(A) \bigr)
   494      \end{array}  }{[z\in X]_z}}
   495 \]
   496 This rule complements the introduction rules; it provides a means of showing
   497 $x\in\llist(A)$ when $x$ is infinite.  For instance, if $x=\LCons(0,x)$ then
   498 applying the rule with $X=\{x\}$ proves $x\in\llist(\nat)$.  (Here $\nat$
   499 is the set of natural numbers.)
   500 
   501 Having $X\un\llist(A)$ instead of simply $X$ in the third premise above
   502 represents a slight strengthening of the greatest fixedpoint property.  I
   503 discuss several forms of coinduction rules elsewhere~\cite{paulson-coind}.
   504 
   505 The clumsy form of the third premise makes the rule hard to use, especially in
   506 large definitions.  Probably a constant should be declared to abbreviate the
   507 large disjunction, and rules derived to allow proving the separate disjuncts.
   508 
   509 
   510 \section{Examples of inductive and coinductive definitions}\label{ind-eg-sec}
   511 This section presents several examples from the literature: the finite
   512 powerset operator, lists of $n$ elements, bisimulations on lazy lists, the
   513 well-founded part of a relation, and the primitive recursive functions.
   514 
   515 \subsection{The finite powerset operator}
   516 This operator has been discussed extensively above.  Here is the
   517 corresponding invocation in an Isabelle theory file.  Note that
   518 $\cons(a,b)$ abbreviates $\{a\}\un b$ in Isabelle/\textsc{zf}.
   519 \begin{ttbox}
   520 Finite = Arith + 
   521 consts      Fin :: i=>i
   522 inductive
   523   domains   "Fin(A)" <= "Pow(A)"
   524   intrs
   525     emptyI  "0 : Fin(A)"
   526     consI   "[| a: A;  b: Fin(A) |] ==> cons(a,b) : Fin(A)"
   527   type_intrs  empty_subsetI, cons_subsetI, PowI
   528   type_elims "[make_elim PowD]"
   529 end
   530 \end{ttbox}
   531 Theory {\tt Finite} extends the parent theory {\tt Arith} by declaring the
   532 unary function symbol~$\Fin$, which is defined inductively.  Its domain is
   533 specified as $\pow(A)$, where $A$ is the parameter appearing in the
   534 introduction rules.  For type-checking, we supply two introduction
   535 rules:
   536 \[ \emptyset\sbs A              \qquad
   537    \infer{\{a\}\un B\sbs C}{a\in C & B\sbs C}
   538 \]
   539 A further introduction rule and an elimination rule express both
   540 directions of the equivalence $A\in\pow(B)\bimp A\sbs B$.  Type-checking
   541 involves mostly introduction rules.  
   542 
   543 Like all Isabelle theory files, this one yields a structure containing the
   544 new theory as an \textsc{ml} value.  Structure {\tt Finite} also has a
   545 substructure, called~{\tt Fin}.  After declaring \hbox{\tt open Finite;} we
   546 can refer to the $\Fin(A)$ introduction rules as the list {\tt Fin.intrs}
   547 or individually as {\tt Fin.emptyI} and {\tt Fin.consI}.  The induction
   548 rule is {\tt Fin.induct}.
   549 
   550 
   551 \subsection{Lists of $n$ elements}\label{listn-sec}
   552 This has become a standard example of an inductive definition.  Following
   553 Paulin-Mohring~\cite{paulin-tlca}, we could attempt to define a new datatype
   554 $\listn(A,n)$, for lists of length~$n$, as an $n$-indexed family of sets.
   555 But her introduction rules
   556 \[ \hbox{\tt Niln}\in\listn(A,0)  \qquad
   557    \infer{\hbox{\tt Consn}(n,a,l)\in\listn(A,\succ(n))}
   558          {n\in\nat & a\in A & l\in\listn(A,n)}
   559 \]
   560 are not acceptable to the inductive definition package:
   561 $\listn$ occurs with three different parameter lists in the definition.
   562 
   563 The Isabelle version of this example suggests a general treatment of
   564 varying parameters.  It uses the existing datatype definition of
   565 $\lst(A)$, with constructors $\Nil$ and~$\Cons$, and incorporates the
   566 parameter~$n$ into the inductive set itself.  It defines $\listn(A)$ as a
   567 relation consisting of pairs $\pair{n,l}$ such that $n\in\nat$
   568 and~$l\in\lst(A)$ and $l$ has length~$n$.  In fact, $\listn(A)$ is the
   569 converse of the length function on~$\lst(A)$.  The Isabelle/\textsc{zf} introduction
   570 rules are
   571 \[ \pair{0,\Nil}\in\listn(A)  \qquad
   572    \infer{\pair{\succ(n),\Cons(a,l)}\in\listn(A)}
   573          {a\in A & \pair{n,l}\in\listn(A)}
   574 \]
   575 The Isabelle theory file takes, as parent, the theory~{\tt List} of lists.
   576 We declare the constant~$\listn$ and supply an inductive definition,
   577 specifying the domain as $\nat\times\lst(A)$:
   578 \begin{ttbox}
   579 ListN = List +
   580 consts  listn :: i=>i
   581 inductive
   582   domains   "listn(A)" <= "nat*list(A)"
   583   intrs
   584     NilI  "<0,Nil>: listn(A)"
   585     ConsI "[| a:A; <n,l>:listn(A) |] ==> <succ(n), Cons(a,l)>: listn(A)"
   586   type_intrs "nat_typechecks @ list.intrs"
   587 end
   588 \end{ttbox}
   589 The type-checking rules include those for 0, $\succ$, $\Nil$ and $\Cons$.
   590 Because $\listn(A)$ is a set of pairs, type-checking requires the
   591 equivalence $\pair{a,b}\in A\times B \bimp a\in A \conj b\in B$.  The
   592 package always includes the rules for ordered pairs.
   593 
   594 The package returns introduction, elimination and induction rules for
   595 $\listn$.  The basic induction rule, {\tt listn.induct}, is
   596 \[ \infer{P(z_1,z_2)}{\pair{z_1,z_2}\in\listn(A) & P(0,\Nil) &
   597              \infer*{P(\succ(n),\Cons(a,l))}
   598                 {[a\in A & \pair{n,l}\in\listn(A) & P(n,l)]_{a,l,n}}}
   599 \]
   600 This rule lets the induction formula to be a 
   601 binary property of pairs, $P(n,l)$.  
   602 It is now a simple matter to prove theorems about $\listn(A)$, such as
   603 \[ \forall l\in\lst(A). \pair{\length(l),\, l}\in\listn(A) \]
   604 \[ \listn(A)``\{n\} = \{l\in\lst(A). \length(l)=n\} \]
   605 This latter result --- here $r``X$ denotes the image of $X$ under $r$
   606 --- asserts that the inductive definition agrees with the obvious notion of
   607 $n$-element list.  
   608 
   609 A ``list of $n$ elements'' really is a list, namely an element of ~$\lst(A)$.
   610 It is subject to list operators such as append (concatenation).  For example,
   611 a trivial induction on $\pair{m,l}\in\listn(A)$ yields
   612 \[ \infer{\pair{m\mathbin{+} m',\, l@l'}\in\listn(A)}
   613          {\pair{m,l}\in\listn(A) & \pair{m',l'}\in\listn(A)} 
   614 \]
   615 where $+$ denotes addition on the natural numbers and @ denotes append.
   616 
   617 \subsection{Rule inversion: the function \texttt{mk\_cases}}
   618 The elimination rule, {\tt listn.elim}, is cumbersome:
   619 \[ \infer{Q}{x\in\listn(A) & 
   620           \infer*{Q}{[x = \pair{0,\Nil}]} &
   621           \infer*{Q}
   622              {\left[\begin{array}{l}
   623                x = \pair{\succ(n),\Cons(a,l)} \\
   624                a\in A \\
   625                \pair{n,l}\in\listn(A)
   626                \end{array} \right]_{a,l,n}}}
   627 \]
   628 The \textsc{ml} function {\tt listn.mk\_cases} generates simplified instances of
   629 this rule.  It works by freeness reasoning on the list constructors:
   630 $\Cons(a,l)$ is injective in its two arguments and differs from~$\Nil$.  If
   631 $x$ is $\pair{i,\Nil}$ or $\pair{i,\Cons(a,l)}$ then {\tt listn.mk\_cases}
   632 deduces the corresponding form of~$i$;  this is called rule inversion.  
   633 Here is a sample session: 
   634 \begin{ttbox}
   635 listn.mk_cases "<i,Nil> : listn(A)";
   636 {\out "[| <?i, []> : listn(?A); ?i = 0 ==> ?Q |] ==> ?Q" : thm}
   637 listn.mk_cases "<i,Cons(a,l)> : listn(A)";
   638 {\out "[| <?i, Cons(?a, ?l)> : listn(?A);}
   639 {\out     !!n. [| ?a : ?A; <n, ?l> : listn(?A); ?i = succ(n) |] ==> ?Q }
   640 {\out  |] ==> ?Q" : thm}
   641 \end{ttbox}
   642 Each of these rules has only two premises.  In conventional notation, the
   643 second rule is
   644 \[ \infer{Q}{\pair{i, \Cons(a,l)}\in\listn(A) & 
   645           \infer*{Q}
   646              {\left[\begin{array}{l}
   647                a\in A \\ \pair{n,l}\in\listn(A) \\ i = \succ(n)
   648                \end{array} \right]_{n}}}
   649 \]
   650 The package also has built-in rules for freeness reasoning about $0$
   651 and~$\succ$.  So if $x$ is $\pair{0,l}$ or $\pair{\succ(i),l}$, then {\tt
   652 listn.mk\_cases} can deduce the corresponding form of~$l$. 
   653 
   654 The function {\tt mk\_cases} is also useful with datatype definitions.  The
   655 instance from the definition of lists, namely {\tt list.mk\_cases}, can
   656 prove that $\Cons(a,l)\in\lst(A)$ implies $a\in A $ and $l\in\lst(A)$:
   657 \[ \infer{Q}{\Cons(a,l)\in\lst(A) & 
   658                  & \infer*{Q}{[a\in A &l\in\lst(A)]} }
   659 \]
   660 A typical use of {\tt mk\_cases} concerns inductive definitions of evaluation
   661 relations.  Then rule inversion yields case analysis on possible evaluations.
   662 For example, Isabelle/\textsc{zf} includes a short proof of the
   663 diamond property for parallel contraction on combinators.  Ole Rasmussen used
   664 {\tt mk\_cases} extensively in his development of the theory of
   665 residuals~\cite{rasmussen95}.
   666 
   667 
   668 \subsection{A coinductive definition: bisimulations on lazy lists}
   669 This example anticipates the definition of the codatatype $\llist(A)$, which
   670 consists of finite and infinite lists over~$A$.  Its constructors are $\LNil$
   671 and~$\LCons$, satisfying the introduction rules shown in~\S\ref{coind-sec}.  
   672 Because $\llist(A)$ is defined as a greatest fixedpoint and uses the variant
   673 pairing and injection operators, it contains non-well-founded elements such as
   674 solutions to $\LCons(a,l)=l$.
   675 
   676 The next step in the development of lazy lists is to define a coinduction
   677 principle for proving equalities.  This is done by showing that the equality
   678 relation on lazy lists is the greatest fixedpoint of some monotonic
   679 operation.  The usual approach~\cite{pitts94} is to define some notion of 
   680 bisimulation for lazy lists, define equivalence to be the greatest
   681 bisimulation, and finally to prove that two lazy lists are equivalent if and
   682 only if they are equal.  The coinduction rule for equivalence then yields a
   683 coinduction principle for equalities.
   684 
   685 A binary relation $R$ on lazy lists is a \defn{bisimulation} provided $R\sbs
   686 R^+$, where $R^+$ is the relation
   687 \[ \{\pair{\LNil,\LNil}\} \un 
   688    \{\pair{\LCons(a,l),\LCons(a,l')} . a\in A \conj \pair{l,l'}\in R\}.
   689 \]
   690 A pair of lazy lists are \defn{equivalent} if they belong to some
   691 bisimulation.  Equivalence can be coinductively defined as the greatest
   692 fixedpoint for the introduction rules
   693 \[  \pair{\LNil,\LNil} \in\lleq(A)  \qquad 
   694     \infer[(-)]{\pair{\LCons(a,l),\LCons(a,l')} \in\lleq(A)}
   695           {a\in A & \pair{l,l'}\in \lleq(A)}
   696 \]
   697 To make this coinductive definition, the theory file includes (after the
   698 declaration of $\llist(A)$) the following lines:
   699 \bgroup\leftmargini=\parindent
   700 \begin{ttbox}
   701 consts    lleq :: i=>i
   702 coinductive
   703   domains "lleq(A)" <= "llist(A) * llist(A)"
   704   intrs
   705     LNil  "<LNil,LNil> : lleq(A)"
   706     LCons "[| a:A; <l,l'>:lleq(A) |] ==> <LCons(a,l),LCons(a,l')>: lleq(A)"
   707   type_intrs  "llist.intrs"
   708 \end{ttbox}
   709 \egroup
   710 The domain of $\lleq(A)$ is $\llist(A)\times\llist(A)$.  The type-checking
   711 rules include the introduction rules for $\llist(A)$, whose 
   712 declaration is discussed below (\S\ref{lists-sec}).
   713 
   714 The package returns the introduction rules and the elimination rule, as
   715 usual.  But instead of induction rules, it returns a coinduction rule.
   716 The rule is too big to display in the usual notation; its conclusion is
   717 $x\in\lleq(A)$ and its premises are $x\in X$, 
   718 ${X\sbs\llist(A)\times\llist(A)}$ and
   719 \[ \infer*{z=\pair{\LNil,\LNil}\disj \bigl(\exists a\,l\,l'.\,
   720      \begin{array}[t]{@{}l}
   721        z=\pair{\LCons(a,l),\LCons(a,l')} \conj a\in A \conj{}\\
   722        \pair{l,l'}\in X\un\lleq(A) \bigr)
   723      \end{array}  
   724     }{[z\in X]_z}
   725 \]
   726 Thus if $x\in X$, where $X$ is a bisimulation contained in the
   727 domain of $\lleq(A)$, then $x\in\lleq(A)$.  It is easy to show that
   728 $\lleq(A)$ is reflexive: the equality relation is a bisimulation.  And
   729 $\lleq(A)$ is symmetric: its converse is a bisimulation.  But showing that
   730 $\lleq(A)$ coincides with the equality relation takes some work.
   731 
   732 \subsection{The accessible part of a relation}\label{acc-sec}
   733 Let $\prec$ be a binary relation on~$D$; in short, $(\prec)\sbs D\times D$.
   734 The \defn{accessible} or \defn{well-founded} part of~$\prec$, written
   735 $\acc(\prec)$, is essentially that subset of~$D$ for which $\prec$ admits
   736 no infinite decreasing chains~\cite{aczel77}.  Formally, $\acc(\prec)$ is
   737 inductively defined to be the least set that contains $a$ if it contains
   738 all $\prec$-predecessors of~$a$, for $a\in D$.  Thus we need an
   739 introduction rule of the form 
   740 \[ \infer{a\in\acc(\prec)}{\forall y.y\prec a\imp y\in\acc(\prec)} \]
   741 Paulin-Mohring treats this example in Coq~\cite{paulin-tlca}, but it causes
   742 difficulties for other systems.  Its premise is not acceptable to the
   743 inductive definition package of the Cambridge \textsc{hol}
   744 system~\cite{camilleri92}.  It is also unacceptable to the Isabelle package
   745 (recall \S\ref{intro-sec}), but fortunately can be transformed into the
   746 acceptable form $t\in M(R)$.
   747 
   748 The powerset operator is monotonic, and $t\in\pow(R)$ is equivalent to
   749 $t\sbs R$.  This in turn is equivalent to $\forall y\in t. y\in R$.  To
   750 express $\forall y.y\prec a\imp y\in\acc(\prec)$ we need only find a
   751 term~$t$ such that $y\in t$ if and only if $y\prec a$.  A suitable $t$ is
   752 the inverse image of~$\{a\}$ under~$\prec$.
   753 
   754 The definition below follows this approach.  Here $r$ is~$\prec$ and
   755 $\field(r)$ refers to~$D$, the domain of $\acc(r)$.  (The field of a
   756 relation is the union of its domain and range.)  Finally $r^{-}``\{a\}$
   757 denotes the inverse image of~$\{a\}$ under~$r$.  We supply the theorem {\tt
   758   Pow\_mono}, which asserts that $\pow$ is monotonic.
   759 \begin{ttbox}
   760 consts    acc :: i=>i
   761 inductive
   762   domains "acc(r)" <= "field(r)"
   763   intrs
   764     vimage  "[| r-``\{a\}: Pow(acc(r)); a: field(r) |] ==> a: acc(r)"
   765   monos      Pow_mono
   766 \end{ttbox}
   767 The Isabelle theory proceeds to prove facts about $\acc(\prec)$.  For
   768 instance, $\prec$ is well-founded if and only if its field is contained in
   769 $\acc(\prec)$.  
   770 
   771 As mentioned in~\S\ref{basic-ind-sec}, a premise of the form $t\in M(R)$
   772 gives rise to an unusual induction hypothesis.  Let us examine the
   773 induction rule, {\tt acc.induct}:
   774 \[ \infer{P(x)}{x\in\acc(r) &
   775      \infer*{P(a)}{\left[
   776                    \begin{array}{r@{}l}
   777                      r^{-}``\{a\} &\, \in\pow(\{z\in\acc(r).P(z)\}) \\
   778                                 a &\, \in\field(r)
   779                    \end{array}
   780                    \right]_a}}
   781 \]
   782 The strange induction hypothesis is equivalent to
   783 $\forall y. \pair{y,a}\in r\imp y\in\acc(r)\conj P(y)$.
   784 Therefore the rule expresses well-founded induction on the accessible part
   785 of~$\prec$.
   786 
   787 The use of inverse image is not essential.  The Isabelle package can accept
   788 introduction rules with arbitrary premises of the form $\forall
   789 \vec{y}.P(\vec{y})\imp f(\vec{y})\in R$.  The premise can be expressed
   790 equivalently as 
   791 \[ \{z\in D. P(\vec{y}) \conj z=f(\vec{y})\} \in \pow(R) \] 
   792 provided $f(\vec{y})\in D$ for all $\vec{y}$ such that~$P(\vec{y})$.  The
   793 following section demonstrates another use of the premise $t\in M(R)$,
   794 where $M=\lst$. 
   795 
   796 \subsection{The primitive recursive functions}\label{primrec-sec}
   797 The primitive recursive functions are traditionally defined inductively, as
   798 a subset of the functions over the natural numbers.  One difficulty is that
   799 functions of all arities are taken together, but this is easily
   800 circumvented by regarding them as functions on lists.  Another difficulty,
   801 the notion of composition, is less easily circumvented.
   802 
   803 Here is a more precise definition.  Letting $\vec{x}$ abbreviate
   804 $x_0,\ldots,x_{n-1}$, we can write lists such as $[\vec{x}]$,
   805 $[y+1,\vec{x}]$, etc.  A function is \defn{primitive recursive} if it
   806 belongs to the least set of functions in $\lst(\nat)\to\nat$ containing
   807 \begin{itemize}
   808 \item The \defn{successor} function $\SC$, such that $\SC[y,\vec{x}]=y+1$.
   809 \item All \defn{constant} functions $\CONST(k)$, such that
   810   $\CONST(k)[\vec{x}]=k$. 
   811 \item All \defn{projection} functions $\PROJ(i)$, such that
   812   $\PROJ(i)[\vec{x}]=x_i$ if $0\leq i<n$. 
   813 \item All \defn{compositions} $\COMP(g,[f_0,\ldots,f_{m-1}])$, 
   814 where $g$ and $f_0$, \ldots, $f_{m-1}$ are primitive recursive,
   815 such that
   816 \[ \COMP(g,[f_0,\ldots,f_{m-1}])[\vec{x}] = 
   817    g[f_0[\vec{x}],\ldots,f_{m-1}[\vec{x}]]. \] 
   818 
   819 \item All \defn{recursions} $\PREC(f,g)$, where $f$ and $g$ are primitive
   820   recursive, such that
   821 \begin{eqnarray*}
   822   \PREC(f,g)[0,\vec{x}] & = & f[\vec{x}] \\
   823   \PREC(f,g)[y+1,\vec{x}] & = & g[\PREC(f,g)[y,\vec{x}],\, y,\, \vec{x}].
   824 \end{eqnarray*} 
   825 \end{itemize}
   826 Composition is awkward because it combines not two functions, as is usual,
   827 but $m+1$ functions.  In her proof that Ackermann's function is not
   828 primitive recursive, Nora Szasz was unable to formalize this definition
   829 directly~\cite{szasz93}.  So she generalized primitive recursion to
   830 tuple-valued functions.  This modified the inductive definition such that
   831 each operation on primitive recursive functions combined just two functions.
   832 
   833 \begin{figure}
   834 \begin{ttbox}
   835 Primrec_defs = Main +
   836 consts SC :: i
   837  \(\vdots\)
   838 defs
   839  SC_def    "SC == lam l:list(nat).list_case(0, \%x xs.succ(x), l)"
   840  \(\vdots\)
   841 end
   842 
   843 Primrec = Primrec_defs +
   844 consts prim_rec :: i
   845 inductive
   846  domains "primrec" <= "list(nat)->nat"
   847  intrs
   848    SC     "SC : primrec"
   849    CONST  "k: nat ==> CONST(k) : primrec"
   850    PROJ   "i: nat ==> PROJ(i) : primrec"
   851    COMP   "[| g: primrec; fs: list(primrec) |] ==> COMP(g,fs): primrec"
   852    PREC   "[| f: primrec; g: primrec |] ==> PREC(f,g): primrec"
   853  monos       list_mono
   854  con_defs    SC_def, CONST_def, PROJ_def, COMP_def, PREC_def
   855  type_intrs "nat_typechecks @ list.intrs @                      
   856              [lam_type, list_case_type, drop_type, map_type,    
   857               apply_type, rec_type]"
   858 end
   859 \end{ttbox}
   860 \hrule
   861 \caption{Inductive definition of the primitive recursive functions} 
   862 \label{primrec-fig}
   863 \end{figure}
   864 \def\fs{{\it fs}} 
   865 
   866 Szasz was using \textsc{alf}, but Coq and \textsc{hol} would also have
   867 problems accepting this definition.  Isabelle's package accepts it easily
   868 since $[f_0,\ldots,f_{m-1}]$ is a list of primitive recursive functions and
   869 $\lst$ is monotonic.  There are five introduction rules, one for each of the
   870 five forms of primitive recursive function.  Let us examine the one for
   871 $\COMP$:
   872 \[ \infer{\COMP(g,\fs)\in\primrec}{g\in\primrec & \fs\in\lst(\primrec)} \]
   873 The induction rule for $\primrec$ has one case for each introduction rule.
   874 Due to the use of $\lst$ as a monotone operator, the composition case has
   875 an unusual induction hypothesis:
   876  \[ \infer*{P(\COMP(g,\fs))}
   877           {[g\in\primrec & \fs\in\lst(\{z\in\primrec.P(z)\})]_{\fs,g}} 
   878 \]
   879 The hypothesis states that $\fs$ is a list of primitive recursive functions,
   880 each satisfying the induction formula.  Proving the $\COMP$ case typically
   881 requires structural induction on lists, yielding two subcases: either
   882 $\fs=\Nil$ or else $\fs=\Cons(f,\fs')$, where $f\in\primrec$, $P(f)$, and
   883 $\fs'$ is another list of primitive recursive functions satisfying~$P$.
   884 
   885 Figure~\ref{primrec-fig} presents the theory file.  Theory {\tt Primrec}
   886 defines the constants $\SC$, $\CONST$, etc.  These are not constructors of
   887 a new datatype, but functions over lists of numbers.  Their definitions,
   888 most of which are omitted, consist of routine list programming.  In
   889 Isabelle/\textsc{zf}, the primitive recursive functions are defined as a subset of
   890 the function set $\lst(\nat)\to\nat$.
   891 
   892 The Isabelle theory goes on to formalize Ackermann's function and prove
   893 that it is not primitive recursive, using the induction rule {\tt
   894   primrec.induct}.  The proof follows Szasz's excellent account.
   895 
   896 
   897 \section{Datatypes and codatatypes}\label{data-sec}
   898 A (co)datatype definition is a (co)inductive definition with automatically
   899 defined constructors and a case analysis operator.  The package proves that
   900 the case operator inverts the constructors and can prove freeness theorems
   901 involving any pair of constructors.
   902 
   903 
   904 \subsection{Constructors and their domain}\label{univ-sec}
   905 A (co)inductive definition selects a subset of an existing set; a (co)datatype
   906 definition creates a new set.  The package reduces the latter to the former.
   907 Isabelle/\textsc{zf} supplies sets having strong closure properties to serve
   908 as domains for (co)inductive definitions.
   909 
   910 Isabelle/\textsc{zf} defines the Cartesian product $A\times
   911 B$, containing ordered pairs $\pair{a,b}$; it also defines the
   912 disjoint sum $A+B$, containing injections $\Inl(a)\equiv\pair{0,a}$ and
   913 $\Inr(b)\equiv\pair{1,b}$.  For use below, define the $m$-tuple
   914 $\pair{x_1,\ldots,x_m}$ to be the empty set~$\emptyset$ if $m=0$, simply $x_1$
   915 if $m=1$ and $\pair{x_1,\pair{x_2,\ldots,x_m}}$ if $m\geq2$.
   916 
   917 A datatype constructor $\Con(x_1,\ldots,x_m)$ is defined to be
   918 $h(\pair{x_1,\ldots,x_m})$, where $h$ is composed of $\Inl$ and~$\Inr$.
   919 In a mutually recursive definition, all constructors for the set~$R_i$ have
   920 the outer form~$h_{in}$, where $h_{in}$ is the injection described
   921 in~\S\ref{mutual-sec}.  Further nested injections ensure that the
   922 constructors for~$R_i$ are pairwise distinct.  
   923 
   924 Isabelle/\textsc{zf} defines the set $\univ(A)$, which contains~$A$ and
   925 furthermore contains $\pair{a,b}$, $\Inl(a)$ and $\Inr(b)$ for $a$,
   926 $b\in\univ(A)$.  In a typical datatype definition with set parameters
   927 $A_1$, \ldots, $A_k$, a suitable domain for all the recursive sets is
   928 $\univ(A_1\un\cdots\un A_k)$.  This solves the problem for
   929 datatypes~\cite[\S4.2]{paulson-set-II}.
   930 
   931 The standard pairs and injections can only yield well-founded
   932 constructions.  This eases the (manual!) definition of recursive functions
   933 over datatypes.  But they are unsuitable for codatatypes, which typically
   934 contain non-well-founded objects.
   935 
   936 To support codatatypes, Isabelle/\textsc{zf} defines a variant notion of
   937 ordered pair, written~$\pair{a;b}$.  It also defines the corresponding variant
   938 notion of Cartesian product $A\otimes B$, variant injections $\QInl(a)$
   939 and~$\QInr(b)$ and variant disjoint sum $A\oplus B$.  Finally it defines the
   940 set $\quniv(A)$, which contains~$A$ and furthermore contains $\pair{a;b}$,
   941 $\QInl(a)$ and $\QInr(b)$ for $a$, $b\in\quniv(A)$.  In a typical codatatype
   942 definition with set parameters $A_1$, \ldots, $A_k$, a suitable domain is
   943 $\quniv(A_1\un\cdots\un A_k)$.  Details are published
   944 elsewhere~\cite{paulson-mscs}.
   945 
   946 \subsection{The case analysis operator}
   947 The (co)datatype package automatically defines a case analysis operator,
   948 called {\tt$R$\_case}.  A mutually recursive definition still has only one
   949 operator, whose name combines those of the recursive sets: it is called
   950 {\tt$R_1$\_\ldots\_$R_n$\_case}.  The case operator is analogous to those
   951 for products and sums.
   952 
   953 Datatype definitions employ standard products and sums, whose operators are
   954 $\split$ and $\case$ and satisfy the equations
   955 \begin{eqnarray*}
   956   \split(f,\pair{x,y})  & = &  f(x,y) \\
   957   \case(f,g,\Inl(x))    & = &  f(x)   \\
   958   \case(f,g,\Inr(y))    & = &  g(y)
   959 \end{eqnarray*}
   960 Suppose the datatype has $k$ constructors $\Con_1$, \ldots,~$\Con_k$.  Then
   961 its case operator takes $k+1$ arguments and satisfies an equation for each
   962 constructor:
   963 \[ R\hbox{\_case}(f_1,\ldots,f_k, {\tt Con}_i(\vec{x})) = f_i(\vec{x}),
   964     \qquad i = 1, \ldots, k
   965 \]
   966 The case operator's definition takes advantage of Isabelle's representation of
   967 syntax in the typed $\lambda$-calculus; it could readily be adapted to a
   968 theorem prover for higher-order logic.  If $f$ and~$g$ have meta-type $i\To i$
   969 then so do $\split(f)$ and $\case(f,g)$.  This works because $\split$ and
   970 $\case$ operate on their last argument.  They are easily combined to make
   971 complex case analysis operators.  For example, $\case(f,\case(g,h))$ performs
   972 case analysis for $A+(B+C)$; let us verify one of the three equations:
   973 \[ \case(f,\case(g,h), \Inr(\Inl(b))) = \case(g,h,\Inl(b)) = g(b) \]
   974 Codatatype definitions are treated in precisely the same way.  They express
   975 case operators using those for the variant products and sums, namely
   976 $\qsplit$ and~$\qcase$.
   977 
   978 \medskip
   979 
   980 To see how constructors and the case analysis operator are defined, let us
   981 examine some examples.  Further details are available
   982 elsewhere~\cite{paulson-set-II}.
   983 
   984 
   985 \subsection{Example: lists and lazy lists}\label{lists-sec}
   986 Here is a declaration of the datatype of lists, as it might appear in a theory
   987 file:
   988 \begin{ttbox} 
   989 consts  list :: i=>i
   990 datatype "list(A)" = Nil | Cons ("a:A", "l: list(A)")
   991 \end{ttbox}
   992 And here is a declaration of the codatatype of lazy lists:
   993 \begin{ttbox}
   994 consts  llist :: i=>i
   995 codatatype "llist(A)" = LNil | LCons ("a: A", "l: llist(A)")
   996 \end{ttbox}
   997 
   998 Each form of list has two constructors, one for the empty list and one for
   999 adding an element to a list.  Each takes a parameter, defining the set of
  1000 lists over a given set~$A$.  Each is automatically given the appropriate
  1001 domain: $\univ(A)$ for $\lst(A)$ and $\quniv(A)$ for $\llist(A)$.  The default
  1002 can be overridden.
  1003 
  1004 \ifshort
  1005 Now $\lst(A)$ is a datatype and enjoys the usual induction rule.
  1006 \else
  1007 Since $\lst(A)$ is a datatype, it has a structural induction rule, {\tt
  1008   list.induct}:
  1009 \[ \infer{P(x)}{x\in\lst(A) & P(\Nil)
  1010         & \infer*{P(\Cons(a,l))}{[a\in A & l\in\lst(A) & P(l)]_{a,l}} }
  1011 \] 
  1012 Induction and freeness yield the law $l\not=\Cons(a,l)$.  To strengthen this,
  1013 Isabelle/\textsc{zf} defines the rank of a set and proves that the standard
  1014 pairs and injections have greater rank than their components.  An immediate
  1015 consequence, which justifies structural recursion on lists
  1016 \cite[\S4.3]{paulson-set-II}, is
  1017 \[ \rank(l) < \rank(\Cons(a,l)). \]
  1018 \par
  1019 \fi
  1020 But $\llist(A)$ is a codatatype and has no induction rule.  Instead it has
  1021 the coinduction rule shown in \S\ref{coind-sec}.  Since variant pairs and
  1022 injections are monotonic and need not have greater rank than their
  1023 components, fixedpoint operators can create cyclic constructions.  For
  1024 example, the definition
  1025 \[ \lconst(a) \equiv \lfp(\univ(a), \lambda l. \LCons(a,l)) \]
  1026 yields $\lconst(a) = \LCons(a,\lconst(a))$.
  1027 
  1028 \ifshort
  1029 \typeout{****SHORT VERSION}
  1030 \typeout{****Omitting discussion of constructors!}
  1031 \else
  1032 \medskip
  1033 It may be instructive to examine the definitions of the constructors and
  1034 case operator for $\lst(A)$.  The definitions for $\llist(A)$ are similar.
  1035 The list constructors are defined as follows:
  1036 \begin{eqnarray*}
  1037   \Nil       & \equiv & \Inl(\emptyset) \\
  1038   \Cons(a,l) & \equiv & \Inr(\pair{a,l})
  1039 \end{eqnarray*}
  1040 The operator $\lstcase$ performs case analysis on these two alternatives:
  1041 \[ \lstcase(c,h) \equiv \case(\lambda u.c, \split(h)) \]
  1042 Let us verify the two equations:
  1043 \begin{eqnarray*}
  1044     \lstcase(c, h, \Nil) & = & 
  1045        \case(\lambda u.c, \split(h), \Inl(\emptyset)) \\
  1046      & = & (\lambda u.c)(\emptyset) \\
  1047      & = & c\\[1ex]
  1048     \lstcase(c, h, \Cons(x,y)) & = & 
  1049        \case(\lambda u.c, \split(h), \Inr(\pair{x,y})) \\
  1050      & = & \split(h, \pair{x,y}) \\
  1051      & = & h(x,y)
  1052 \end{eqnarray*} 
  1053 \fi
  1054 
  1055 
  1056 \ifshort
  1057 \typeout{****Omitting mutual recursion example!}
  1058 \else
  1059 \subsection{Example: mutual recursion}
  1060 In mutually recursive trees and forests~\cite[\S4.5]{paulson-set-II}, trees
  1061 have the one constructor $\Tcons$, while forests have the two constructors
  1062 $\Fnil$ and~$\Fcons$:
  1063 \begin{ttbox}
  1064 consts  tree, forest, tree_forest    :: i=>i
  1065 datatype "tree(A)"   = Tcons ("a: A",  "f: forest(A)")
  1066 and      "forest(A)" = Fnil  |  Fcons ("t: tree(A)",  "f: forest(A)")
  1067 \end{ttbox}
  1068 The three introduction rules define the mutual recursion.  The
  1069 distinguishing feature of this example is its two induction rules.
  1070 
  1071 The basic induction rule is called {\tt tree\_forest.induct}:
  1072 \[ \infer{P(x)}{x\in\TF(A) & 
  1073      \infer*{P(\Tcons(a,f))}
  1074         {\left[\begin{array}{l} a\in A \\ 
  1075                                 f\in\forest(A) \\ P(f)
  1076                \end{array}
  1077          \right]_{a,f}}
  1078      & P(\Fnil)
  1079      & \infer*{P(\Fcons(t,f))}
  1080         {\left[\begin{array}{l} t\in\tree(A)   \\ P(t) \\
  1081                                 f\in\forest(A) \\ P(f)
  1082                 \end{array}
  1083          \right]_{t,f}} }
  1084 \] 
  1085 This rule establishes a single predicate for $\TF(A)$, the union of the
  1086 recursive sets.  Although such reasoning can be useful
  1087 \cite[\S4.5]{paulson-set-II}, a proper mutual induction rule should establish
  1088 separate predicates for $\tree(A)$ and $\forest(A)$.  The package calls this
  1089 rule {\tt tree\_forest.mutual\_induct}.  Observe the usage of $P$ and $Q$ in
  1090 the induction hypotheses:
  1091 \[ \infer{(\forall z. z\in\tree(A)\imp P(z)) \conj
  1092           (\forall z. z\in\forest(A)\imp Q(z))}
  1093      {\infer*{P(\Tcons(a,f))}
  1094         {\left[\begin{array}{l} a\in A \\ 
  1095                                 f\in\forest(A) \\ Q(f)
  1096                \end{array}
  1097          \right]_{a,f}}
  1098      & Q(\Fnil)
  1099      & \infer*{Q(\Fcons(t,f))}
  1100         {\left[\begin{array}{l} t\in\tree(A)   \\ P(t) \\
  1101                                 f\in\forest(A) \\ Q(f)
  1102                 \end{array}
  1103          \right]_{t,f}} }
  1104 \] 
  1105 Elsewhere I describe how to define mutually recursive functions over trees and
  1106 forests \cite[\S4.5]{paulson-set-II}.
  1107 
  1108 Both forest constructors have the form $\Inr(\cdots)$,
  1109 while the tree constructor has the form $\Inl(\cdots)$.  This pattern would
  1110 hold regardless of how many tree or forest constructors there were.
  1111 \begin{eqnarray*}
  1112   \Tcons(a,l)  & \equiv & \Inl(\pair{a,l}) \\
  1113   \Fnil        & \equiv & \Inr(\Inl(\emptyset)) \\
  1114   \Fcons(a,l)  & \equiv & \Inr(\Inr(\pair{a,l}))
  1115 \end{eqnarray*} 
  1116 There is only one case operator; it works on the union of the trees and
  1117 forests:
  1118 \[ {\tt tree\_forest\_case}(f,c,g) \equiv 
  1119     \case(\split(f),\, \case(\lambda u.c, \split(g))) 
  1120 \]
  1121 \fi
  1122 
  1123 
  1124 \subsection{Example: a four-constructor datatype}
  1125 A bigger datatype will illustrate some efficiency 
  1126 refinements.  It has four constructors $\Con_0$, \ldots, $\Con_3$, with the
  1127 corresponding arities.
  1128 \begin{ttbox}
  1129 consts    data :: [i,i] => i
  1130 datatype  "data(A,B)" = Con0
  1131                       | Con1 ("a: A")
  1132                       | Con2 ("a: A", "b: B")
  1133                       | Con3 ("a: A", "b: B", "d: data(A,B)")
  1134 \end{ttbox}
  1135 Because this datatype has two set parameters, $A$ and~$B$, the package
  1136 automatically supplies $\univ(A\un B)$ as its domain.  The structural
  1137 induction rule has four minor premises, one per constructor, and only the last
  1138 has an induction hypothesis.  (Details are left to the reader.)
  1139 
  1140 The constructors are defined by the equations
  1141 \begin{eqnarray*}
  1142   \Con_0         & \equiv & \Inl(\Inl(\emptyset)) \\
  1143   \Con_1(a)      & \equiv & \Inl(\Inr(a)) \\
  1144   \Con_2(a,b)    & \equiv & \Inr(\Inl(\pair{a,b})) \\
  1145   \Con_3(a,b,c)  & \equiv & \Inr(\Inr(\pair{a,b,c})).
  1146 \end{eqnarray*} 
  1147 The case analysis operator is
  1148 \[ {\tt data\_case}(f_0,f_1,f_2,f_3) \equiv 
  1149     \case(\begin{array}[t]{@{}l}
  1150           \case(\lambda u.f_0,\; f_1),\, \\
  1151           \case(\split(f_2),\; \split(\lambda v.\split(f_3(v)))) )
  1152    \end{array} 
  1153 \]
  1154 This may look cryptic, but the case equations are trivial to verify.
  1155 
  1156 In the constructor definitions, the injections are balanced.  A more naive
  1157 approach is to define $\Con_3(a,b,c)$ as $\Inr(\Inr(\Inr(\pair{a,b,c})))$;
  1158 instead, each constructor has two injections.  The difference here is small.
  1159 But the \textsc{zf} examples include a 60-element enumeration type, where each
  1160 constructor has 5 or~6 injections.  The naive approach would require 1 to~59
  1161 injections; the definitions would be quadratic in size.  It is like the
  1162 advantage of binary notation over unary.
  1163 
  1164 The result structure contains the case operator and constructor definitions as
  1165 the theorem list \verb|con_defs|. It contains the case equations, such as 
  1166 \[ {\tt data\_case}(f_0,f_1,f_2,f_3,\Con_3(a,b,c)) = f_3(a,b,c), \]
  1167 as the theorem list \verb|case_eqns|.  There is one equation per constructor.
  1168 
  1169 \subsection{Proving freeness theorems}
  1170 There are two kinds of freeness theorems:
  1171 \begin{itemize}
  1172 \item \defn{injectiveness} theorems, such as
  1173 \[ \Con_2(a,b) = \Con_2(a',b') \bimp a=a' \conj b=b' \]
  1174 
  1175 \item \defn{distinctness} theorems, such as
  1176 \[ \Con_1(a) \not= \Con_2(a',b')  \]
  1177 \end{itemize}
  1178 Since the number of such theorems is quadratic in the number of constructors,
  1179 the package does not attempt to prove them all.  Instead it returns tools for
  1180 proving desired theorems --- either manually or during
  1181 simplification or classical reasoning.
  1182 
  1183 The theorem list \verb|free_iffs| enables the simplifier to perform freeness
  1184 reasoning.  This works by incremental unfolding of constructors that appear in
  1185 equations.  The theorem list contains logical equivalences such as
  1186 \begin{eqnarray*}
  1187   \Con_0=c      & \bimp &  c=\Inl(\Inl(\emptyset))     \\
  1188   \Con_1(a)=c   & \bimp &  c=\Inl(\Inr(a))             \\
  1189                 & \vdots &                             \\
  1190   \Inl(a)=\Inl(b)   & \bimp &  a=b                     \\
  1191   \Inl(a)=\Inr(b)   & \bimp &  {\tt False}             \\
  1192   \pair{a,b} = \pair{a',b'} & \bimp & a=a' \conj b=b'
  1193 \end{eqnarray*}
  1194 For example, these rewrite $\Con_1(a)=\Con_1(b)$ to $a=b$ in four steps.
  1195 
  1196 The theorem list \verb|free_SEs| enables the classical
  1197 reasoner to perform similar replacements.  It consists of elimination rules
  1198 to replace $\Con_0=c$ by $c=\Inl(\Inl(\emptyset))$ and so forth, in the
  1199 assumptions.
  1200 
  1201 Such incremental unfolding combines freeness reasoning with other proof
  1202 steps.  It has the unfortunate side-effect of unfolding definitions of
  1203 constructors in contexts such as $\exists x.\Con_1(a)=x$, where they should
  1204 be left alone.  Calling the Isabelle tactic {\tt fold\_tac con\_defs}
  1205 restores the defined constants.
  1206 
  1207 
  1208 \section{Related work}\label{related}
  1209 The use of least fixedpoints to express inductive definitions seems
  1210 obvious.  Why, then, has this technique so seldom been implemented?
  1211 
  1212 Most automated logics can only express inductive definitions by asserting
  1213 axioms.  Little would be left of Boyer and Moore's logic~\cite{bm79} if their
  1214 shell principle were removed.  With \textsc{alf} the situation is more
  1215 complex; earlier versions of Martin-L\"of's type theory could (using
  1216 wellordering types) express datatype definitions, but the version underlying
  1217 \textsc{alf} requires new rules for each definition~\cite{dybjer91}.  With Coq
  1218 the situation is subtler still; its underlying Calculus of Constructions can
  1219 express inductive definitions~\cite{huet88}, but cannot quite handle datatype
  1220 definitions~\cite{paulin-tlca}.  It seems that researchers tried hard to
  1221 circumvent these problems before finally extending the Calculus with rule
  1222 schemes for strictly positive operators.  Recently Gim{\'e}nez has extended
  1223 the Calculus of Constructions with inductive and coinductive
  1224 types~\cite{gimenez-codifying}, with mechanized support in Coq.
  1225 
  1226 Higher-order logic can express inductive definitions through quantification
  1227 over unary predicates.  The following formula expresses that~$i$ belongs to the
  1228 least set containing~0 and closed under~$\succ$:
  1229 \[ \forall P. P(0)\conj (\forall x.P(x)\imp P(\succ(x))) \imp P(i) \] 
  1230 This technique can be used to prove the Knaster-Tarski theorem, which (in its
  1231 general form) is little used in the Cambridge \textsc{hol} system.
  1232 Melham~\cite{melham89} describes the development.  The natural numbers are
  1233 defined as shown above, but lists are defined as functions over the natural
  1234 numbers.  Unlabelled trees are defined using G\"odel numbering; a labelled
  1235 tree consists of an unlabelled tree paired with a list of labels.  Melham's
  1236 datatype package expresses the user's datatypes in terms of labelled trees.
  1237 It has been highly successful, but a fixedpoint approach might have yielded
  1238 greater power with less effort.
  1239 
  1240 Elsa Gunter~\cite{gunter-trees} reports an ongoing project to generalize the
  1241 Cambridge \textsc{hol} system with mutual recursion and infinitely-branching
  1242 trees.  She retains many features of Melham's approach.
  1243 
  1244 Melham's inductive definition package~\cite{camilleri92} also uses
  1245 quantification over predicates.  But instead of formalizing the notion of
  1246 monotone function, it requires definitions to consist of finitary rules, a
  1247 syntactic form that excludes many monotone inductive definitions.
  1248 
  1249 \textsc{pvs}~\cite{pvs-language} is another proof assistant based on
  1250 higher-order logic.  It supports both inductive definitions and datatypes,
  1251 apparently by asserting axioms.  Datatypes may not be iterated in general, but
  1252 may use recursion over the built-in $\lst$ type.
  1253 
  1254 The earliest use of least fixedpoints is probably Robin Milner's.  Brian
  1255 Monahan extended this package considerably~\cite{monahan84}, as did I in
  1256 unpublished work.\footnote{The datatype package described in my \textsc{lcf}
  1257   book~\cite{paulson87} does {\it not\/} make definitions, but merely asserts
  1258   axioms.} \textsc{lcf} is a first-order logic of domain theory; the relevant
  1259 fixedpoint theorem is not Knaster-Tarski but concerns fixedpoints of
  1260 continuous functions over domains.  \textsc{lcf} is too weak to express
  1261 recursive predicates.  The Isabelle package might be the first to be based on
  1262 the Knaster-Tarski theorem.
  1263 
  1264 
  1265 \section{Conclusions and future work}
  1266 Higher-order logic and set theory are both powerful enough to express
  1267 inductive definitions.  A growing number of theorem provers implement one
  1268 of these~\cite{IMPS,saaltink-fme}.  The easiest sort of inductive
  1269 definition package to write is one that asserts new axioms, not one that
  1270 makes definitions and proves theorems about them.  But asserting axioms
  1271 could introduce unsoundness.
  1272 
  1273 The fixedpoint approach makes it fairly easy to implement a package for
  1274 (co)in\-duc\-tive definitions that does not assert axioms.  It is efficient:
  1275 it processes most definitions in seconds and even a 60-constructor datatype
  1276 requires only a few minutes.  It is also simple: The first working version took
  1277 under a week to code, consisting of under 1100 lines (35K bytes) of Standard
  1278 \textsc{ml}.
  1279 
  1280 In set theory, care is needed to ensure that the inductive definition yields
  1281 a set (rather than a proper class).  This problem is inherent to set theory,
  1282 whether or not the Knaster-Tarski theorem is employed.  We must exhibit a
  1283 bounding set (called a domain above).  For inductive definitions, this is
  1284 often trivial.  For datatype definitions, I have had to formalize much set
  1285 theory.  To justify infinitely-branching datatype definitions, I have had to
  1286 develop a theory of cardinal arithmetic~\cite{paulson-gr}, such as the theorem
  1287 that if $\kappa$ is an infinite cardinal and $|X(\alpha)| \le \kappa$ for all
  1288 $\alpha<\kappa$ then $|\union\sb{\alpha<\kappa} X(\alpha)| \le \kappa$.  
  1289 The need for such efforts is not a drawback of the fixedpoint approach, for
  1290 the alternative is to take such definitions on faith.
  1291 
  1292 Care is also needed to ensure that the greatest fixedpoint really yields a
  1293 coinductive definition.  In set theory, standard pairs admit only well-founded
  1294 constructions.  Aczel's anti-foundation axiom~\cite{aczel88} could be used to
  1295 get non-well-founded objects, but it does not seem easy to mechanize.
  1296 Isabelle/\textsc{zf} instead uses a variant notion of ordered pairing, which
  1297 can be generalized to a variant notion of function.  Elsewhere I have
  1298 proved that this simple approach works (yielding final coalgebras) for a broad
  1299 class of definitions~\cite{paulson-mscs}.
  1300 
  1301 Several large studies make heavy use of inductive definitions.  L\"otzbeyer
  1302 and Sandner have formalized two chapters of a semantics book~\cite{winskel93},
  1303 proving the equivalence between the operational and denotational semantics of
  1304 a simple imperative language.  A single theory file contains three datatype
  1305 definitions (of arithmetic expressions, boolean expressions and commands) and
  1306 three inductive definitions (the corresponding operational rules).  Using
  1307 different techniques, Nipkow~\cite{nipkow-CR} and Rasmussen~\cite{rasmussen95}
  1308 have both proved the Church-Rosser theorem; inductive definitions specify
  1309 several reduction relations on $\lambda$-terms.  Recently, I have applied
  1310 inductive definitions to the analysis of cryptographic
  1311 protocols~\cite{paulson-markt}. 
  1312 
  1313 To demonstrate coinductive definitions, Frost~\cite{frost95} has proved the
  1314 consistency of the dynamic and static semantics for a small functional
  1315 language.  The example is due to Milner and Tofte~\cite{milner-coind}.  It
  1316 concerns an extended correspondence relation, which is defined coinductively.
  1317 A codatatype definition specifies values and value environments in mutual
  1318 recursion.  Non-well-founded values represent recursive functions.  Value
  1319 environments are variant functions from variables into values.  This one key
  1320 definition uses most of the package's novel features.
  1321 
  1322 The approach is not restricted to set theory.  It should be suitable for any
  1323 logic that has some notion of set and the Knaster-Tarski theorem.  I have
  1324 ported the (co)inductive definition package from Isabelle/\textsc{zf} to
  1325 Isabelle/\textsc{hol} (higher-order logic).  
  1326 
  1327 
  1328 \begin{footnotesize}
  1329 \bibliographystyle{plain}
  1330 \bibliography{../manual}
  1331 \end{footnotesize}
  1332 %%%%%\doendnotes
  1333 
  1334 \end{document}