doc-src/ind-defs.tex
author berghofe
Fri, 02 May 1997 16:18:49 +0200
changeset 3096 ccc2c92bb232
parent 2974 bb55cab416af
permissions -rw-r--r--
Updated to LaTeX 2e
     1 \documentstyle[a4,alltt,iman,extra,proof209,12pt]{article}
     2 \newif\ifshort%''Short'' means a published version, not the documentation
     3 \shortfalse%%%%%\shorttrue
     4 
     5 \title{A Fixedpoint Approach to\\ 
     6   (Co)Inductive and (Co)Datatype Definitions%
     7   \thanks{J. Grundy and S. Thompson made detailed comments.  Mads Tofte and
     8     the referees were also helpful.  The research was funded by the SERC
     9     grants GR/G53279, GR/H40570 and by the ESPRIT Project 6453 ``Types''.}}
    10 
    11 \author{Lawrence C. Paulson\\{\tt lcp@cl.cam.ac.uk}\\
    12         Computer Laboratory, University of Cambridge, England}
    13 \date{\today} 
    14 \setcounter{secnumdepth}{2} \setcounter{tocdepth}{2}
    15 
    16 \newcommand\sbs{\subseteq}
    17 \let\To=\Rightarrow
    18 
    19 \newcommand\emph[1]{{\em#1\/}}
    20 \newcommand\defn[1]{{\bf#1}}
    21 \newcommand\textsc[1]{{\sc#1}}
    22 \newcommand\texttt[1]{{\tt#1}}
    23 
    24 \newcommand\pow{{\cal P}}
    25 %%%\let\pow=\wp
    26 \newcommand\RepFun{\hbox{\tt RepFun}}
    27 \newcommand\cons{\hbox{\tt cons}}
    28 \def\succ{\hbox{\tt succ}}
    29 \newcommand\split{\hbox{\tt split}}
    30 \newcommand\fst{\hbox{\tt fst}}
    31 \newcommand\snd{\hbox{\tt snd}}
    32 \newcommand\converse{\hbox{\tt converse}}
    33 \newcommand\domain{\hbox{\tt domain}}
    34 \newcommand\range{\hbox{\tt range}}
    35 \newcommand\field{\hbox{\tt field}}
    36 \newcommand\lfp{\hbox{\tt lfp}}
    37 \newcommand\gfp{\hbox{\tt gfp}}
    38 \newcommand\id{\hbox{\tt id}}
    39 \newcommand\trans{\hbox{\tt trans}}
    40 \newcommand\wf{\hbox{\tt wf}}
    41 \newcommand\nat{\hbox{\tt nat}}
    42 \newcommand\rank{\hbox{\tt rank}}
    43 \newcommand\univ{\hbox{\tt univ}}
    44 \newcommand\Vrec{\hbox{\tt Vrec}}
    45 \newcommand\Inl{\hbox{\tt Inl}}
    46 \newcommand\Inr{\hbox{\tt Inr}}
    47 \newcommand\case{\hbox{\tt case}}
    48 \newcommand\lst{\hbox{\tt list}}
    49 \newcommand\Nil{\hbox{\tt Nil}}
    50 \newcommand\Cons{\hbox{\tt Cons}}
    51 \newcommand\lstcase{\hbox{\tt list\_case}}
    52 \newcommand\lstrec{\hbox{\tt list\_rec}}
    53 \newcommand\length{\hbox{\tt length}}
    54 \newcommand\listn{\hbox{\tt listn}}
    55 \newcommand\acc{\hbox{\tt acc}}
    56 \newcommand\primrec{\hbox{\tt primrec}}
    57 \newcommand\SC{\hbox{\tt SC}}
    58 \newcommand\CONST{\hbox{\tt CONST}}
    59 \newcommand\PROJ{\hbox{\tt PROJ}}
    60 \newcommand\COMP{\hbox{\tt COMP}}
    61 \newcommand\PREC{\hbox{\tt PREC}}
    62 
    63 \newcommand\quniv{\hbox{\tt quniv}}
    64 \newcommand\llist{\hbox{\tt llist}}
    65 \newcommand\LNil{\hbox{\tt LNil}}
    66 \newcommand\LCons{\hbox{\tt LCons}}
    67 \newcommand\lconst{\hbox{\tt lconst}}
    68 \newcommand\lleq{\hbox{\tt lleq}}
    69 \newcommand\map{\hbox{\tt map}}
    70 \newcommand\term{\hbox{\tt term}}
    71 \newcommand\Apply{\hbox{\tt Apply}}
    72 \newcommand\termcase{\hbox{\tt term\_case}}
    73 \newcommand\rev{\hbox{\tt rev}}
    74 \newcommand\reflect{\hbox{\tt reflect}}
    75 \newcommand\tree{\hbox{\tt tree}}
    76 \newcommand\forest{\hbox{\tt forest}}
    77 \newcommand\Part{\hbox{\tt Part}}
    78 \newcommand\TF{\hbox{\tt tree\_forest}}
    79 \newcommand\Tcons{\hbox{\tt Tcons}}
    80 \newcommand\Fcons{\hbox{\tt Fcons}}
    81 \newcommand\Fnil{\hbox{\tt Fnil}}
    82 \newcommand\TFcase{\hbox{\tt TF\_case}}
    83 \newcommand\Fin{\hbox{\tt Fin}}
    84 \newcommand\QInl{\hbox{\tt QInl}}
    85 \newcommand\QInr{\hbox{\tt QInr}}
    86 \newcommand\qsplit{\hbox{\tt qsplit}}
    87 \newcommand\qcase{\hbox{\tt qcase}}
    88 \newcommand\Con{\hbox{\tt Con}}
    89 \newcommand\data{\hbox{\tt data}}
    90 
    91 \binperiod     %%%treat . like a binary operator
    92 
    93 \begin{document}
    94 \pagestyle{empty}
    95 \begin{titlepage}
    96 \maketitle 
    97 \begin{abstract}
    98   This paper presents a fixedpoint approach to inductive definitions.
    99   Instead of using a syntactic test such as ``strictly positive,'' the
   100   approach lets definitions involve any operators that have been proved
   101   monotone.  It is conceptually simple, which has allowed the easy
   102   implementation of mutual recursion and iterated definitions.  It also
   103   handles coinductive definitions: simply replace the least fixedpoint by a
   104   greatest fixedpoint.  
   105 
   106   The method has been implemented in two of Isabelle's logics, \textsc{zf} set
   107   theory and higher-order logic.  It should be applicable to any logic in
   108   which the Knaster-Tarski theorem can be proved.  Examples include lists of
   109   $n$ elements, the accessible part of a relation and the set of primitive
   110   recursive functions.  One example of a coinductive definition is
   111   bisimulations for lazy lists.  Recursive datatypes are examined in detail,
   112   as well as one example of a \defn{codatatype}: lazy lists.
   113 
   114   The Isabelle package has been applied in several large case studies,
   115   including two proofs of the Church-Rosser theorem and a coinductive proof of
   116   semantic consistency.  The package can be trusted because it proves theorems
   117   from definitions, instead of asserting desired properties as axioms.
   118 \end{abstract}
   119 %
   120 \bigskip
   121 \centerline{Copyright \copyright{} \number\year{} by Lawrence C. Paulson}
   122 \thispagestyle{empty} 
   123 \end{titlepage}
   124 \tableofcontents\cleardoublepage\pagestyle{plain}
   125 
   126 \setcounter{page}{1}
   127 
   128 \section{Introduction}
   129 Several theorem provers provide commands for formalizing recursive data
   130 structures, like lists and trees.  Robin Milner implemented one of the first
   131 of these, for Edinburgh \textsc{lcf}~\cite{milner-ind}.  Given a description
   132 of the desired data structure, Milner's package formulated appropriate
   133 definitions and proved the characteristic theorems.  Similar is Melham's
   134 recursive type package for the Cambridge \textsc{hol} system~\cite{melham89}.
   135 Such data structures are called \defn{datatypes}
   136 below, by analogy with datatype declarations in Standard~\textsc{ml}\@.
   137 Some logics take datatypes as primitive; consider Boyer and Moore's shell
   138 principle~\cite{bm79} and the Coq type theory~\cite{paulin-tlca}.
   139 
   140 A datatype is but one example of an \defn{inductive definition}.  Such a
   141 definition~\cite{aczel77} specifies the least set~$R$ \defn{closed under}
   142 given rules: applying a rule to elements of~$R$ yields a result within~$R$.
   143 Inductive definitions have many applications.  The collection of theorems in a
   144 logic is inductively defined.  A structural operational
   145 semantics~\cite{hennessy90} is an inductive definition of a reduction or
   146 evaluation relation on programs.  A few theorem provers provide commands for
   147 formalizing inductive definitions; these include Coq~\cite{paulin-tlca} and
   148 again the \textsc{hol} system~\cite{camilleri92}.
   149 
   150 The dual notion is that of a \defn{coinductive definition}.  Such a definition
   151 specifies the greatest set~$R$ \defn{consistent with} given rules: every
   152 element of~$R$ can be seen as arising by applying a rule to elements of~$R$.
   153 Important examples include using bisimulation relations to formalize
   154 equivalence of processes~\cite{milner89} or lazy functional
   155 programs~\cite{abramsky90}.  Other examples include lazy lists and other
   156 infinite data structures; these are called \defn{codatatypes} below.
   157 
   158 Not all inductive definitions are meaningful.  \defn{Monotone} inductive
   159 definitions are a large, well-behaved class.  Monotonicity can be enforced
   160 by syntactic conditions such as ``strictly positive,'' but this could lead to
   161 monotone definitions being rejected on the grounds of their syntactic form.
   162 More flexible is to formalize monotonicity within the logic and allow users
   163 to prove it.
   164 
   165 This paper describes a package based on a fixedpoint approach.  Least
   166 fixedpoints yield inductive definitions; greatest fixedpoints yield
   167 coinductive definitions.  Most of the discussion below applies equally to
   168 inductive and coinductive definitions, and most of the code is shared.  
   169 
   170 The package supports mutual recursion and infinitely-branching datatypes and
   171 codatatypes.  It allows use of any operators that have been proved monotone,
   172 thus accepting all provably monotone inductive definitions, including
   173 iterated definitions.
   174 
   175 The package has been implemented in
   176 Isabelle~\cite{paulson-markt,paulson-isa-book} using 
   177 \textsc{zf} set theory \cite{paulson-set-I,paulson-set-II}; part of it has
   178 since been ported to Isabelle/\textsc{hol} (higher-order logic).  The
   179 recursion equations are specified as introduction rules for the mutually
   180 recursive sets.  The package transforms these rules into a mapping over sets,
   181 and attempts to prove that the mapping is monotonic and well-typed.  If
   182 successful, the package makes fixedpoint definitions and proves the
   183 introduction, elimination and (co)induction rules.  Users invoke the package
   184 by making simple declarations in Isabelle theory files.
   185 
   186 Most datatype packages equip the new datatype with some means of expressing
   187 recursive functions.  This is the main omission from my package.  Its
   188 fixedpoint operators define only recursive sets.  The Isabelle/\textsc{zf}
   189 theory provides well-founded recursion~\cite{paulson-set-II}, which is harder
   190 to use than structural recursion but considerably more general.
   191 Slind~\cite{slind-tfl} has written a package to automate the definition of
   192 well-founded recursive functions in Isabelle/\textsc{hol}.
   193 
   194 \paragraph*{Outline.} Section~2 introduces the least and greatest fixedpoint
   195 operators.  Section~3 discusses the form of introduction rules, mutual
   196 recursion and other points common to inductive and coinductive definitions.
   197 Section~4 discusses induction and coinduction rules separately.  Section~5
   198 presents several examples, including a coinductive definition.  Section~6
   199 describes datatype definitions.  Section~7 presents related work.
   200 Section~8 draws brief conclusions.  \ifshort\else The appendices are simple
   201 user's manuals for this Isabelle package.\fi
   202 
   203 Most of the definitions and theorems shown below have been generated by the
   204 package.  I have renamed some variables to improve readability.
   205  
   206 \section{Fixedpoint operators}
   207 In set theory, the least and greatest fixedpoint operators are defined as
   208 follows:
   209 \begin{eqnarray*}
   210    \lfp(D,h)  & \equiv & \inter\{X\sbs D. h(X)\sbs X\} \\
   211    \gfp(D,h)  & \equiv & \union\{X\sbs D. X\sbs h(X)\}
   212 \end{eqnarray*}   
   213 Let $D$ be a set.  Say that $h$ is \defn{bounded by}~$D$ if $h(D)\sbs D$, and
   214 \defn{monotone below~$D$} if
   215 $h(A)\sbs h(B)$ for all $A$ and $B$ such that $A\sbs B\sbs D$.  If $h$ is
   216 bounded by~$D$ and monotone then both operators yield fixedpoints:
   217 \begin{eqnarray*}
   218    \lfp(D,h)  & = & h(\lfp(D,h)) \\
   219    \gfp(D,h)  & = & h(\gfp(D,h)) 
   220 \end{eqnarray*}   
   221 These equations are instances of the Knaster-Tarski theorem, which states
   222 that every monotonic function over a complete lattice has a
   223 fixedpoint~\cite{davey&priestley}.  It is obvious from their definitions
   224 that $\lfp$ must be the least fixedpoint, and $\gfp$ the greatest.
   225 
   226 This fixedpoint theory is simple.  The Knaster-Tarski theorem is easy to
   227 prove.  Showing monotonicity of~$h$ is trivial, in typical cases.  We must
   228 also exhibit a bounding set~$D$ for~$h$.  Frequently this is trivial, as when
   229 a set of theorems is (co)inductively defined over some previously existing set
   230 of formul{\ae}.  Isabelle/\textsc{zf} provides suitable bounding sets for
   231 infinitely-branching (co)datatype definitions; see~\S\ref{univ-sec}.  Bounding
   232 sets are also called \defn{domains}.
   233 
   234 The powerset operator is monotone, but by Cantor's theorem there is no
   235 set~$A$ such that $A=\pow(A)$.  We cannot put $A=\lfp(D,\pow)$ because
   236 there is no suitable domain~$D$.  But \S\ref{acc-sec} demonstrates
   237 that~$\pow$ is still useful in inductive definitions.
   238 
   239 \section{Elements of an inductive or coinductive definition}\label{basic-sec}
   240 Consider a (co)inductive definition of the sets $R_1$, \ldots,~$R_n$, in
   241 mutual recursion.  They will be constructed from domains $D_1$,
   242 \ldots,~$D_n$, respectively.  The construction yields not $R_i\sbs D_i$ but
   243 $R_i\sbs D_1+\cdots+D_n$, where $R_i$ is contained in the image of~$D_i$
   244 under an injection.  Reasons for this are discussed
   245 elsewhere~\cite[\S4.5]{paulson-set-II}.
   246 
   247 The definition may involve arbitrary parameters $\vec{p}=p_1$,
   248 \ldots,~$p_k$.  Each recursive set then has the form $R_i(\vec{p})$.  The
   249 parameters must be identical every time they occur within a definition.  This
   250 would appear to be a serious restriction compared with other systems such as
   251 Coq~\cite{paulin-tlca}.  For instance, we cannot define the lists of
   252 $n$ elements as the set $\listn(A,n)$ using rules where the parameter~$n$
   253 varies.  Section~\ref{listn-sec} describes how to express this set using the
   254 inductive definition package.
   255 
   256 To avoid clutter below, the recursive sets are shown as simply $R_i$
   257 instead of~$R_i(\vec{p})$.
   258 
   259 \subsection{The form of the introduction rules}\label{intro-sec}
   260 The body of the definition consists of the desired introduction rules.  The
   261 conclusion of each rule must have the form $t\in R_i$, where $t$ is any term.
   262 Premises typically have the same form, but they can have the more general form
   263 $t\in M(R_i)$ or express arbitrary side-conditions.
   264 
   265 The premise $t\in M(R_i)$ is permitted if $M$ is a monotonic operator on
   266 sets, satisfying the rule 
   267 \[ \infer{M(A)\sbs M(B)}{A\sbs B} \]
   268 The user must supply the package with monotonicity rules for all such premises.
   269 
   270 The ability to introduce new monotone operators makes the approach
   271 flexible.  A suitable choice of~$M$ and~$t$ can express a lot.  The
   272 powerset operator $\pow$ is monotone, and the premise $t\in\pow(R)$
   273 expresses $t\sbs R$; see \S\ref{acc-sec} for an example.  The \emph{list of}
   274 operator is monotone, as is easily proved by induction.  The premise
   275 $t\in\lst(R)$ avoids having to encode the effect of~$\lst(R)$ using mutual
   276 recursion; see \S\ref{primrec-sec} and also my earlier
   277 paper~\cite[\S4.4]{paulson-set-II}.
   278 
   279 Introduction rules may also contain \defn{side-conditions}.  These are
   280 premises consisting of arbitrary formul{\ae} not mentioning the recursive
   281 sets. Side-conditions typically involve type-checking.  One example is the
   282 premise $a\in A$ in the following rule from the definition of lists:
   283 \[ \infer{\Cons(a,l)\in\lst(A)}{a\in A & l\in\lst(A)} \]
   284 
   285 \subsection{The fixedpoint definitions}
   286 The package translates the list of desired introduction rules into a fixedpoint
   287 definition.  Consider, as a running example, the finite powerset operator
   288 $\Fin(A)$: the set of all finite subsets of~$A$.  It can be
   289 defined as the least set closed under the rules
   290 \[  \emptyset\in\Fin(A)  \qquad 
   291     \infer{\{a\}\un b\in\Fin(A)}{a\in A & b\in\Fin(A)} 
   292 \]
   293 
   294 The domain in a (co)inductive definition must be some existing set closed
   295 under the rules.  A suitable domain for $\Fin(A)$ is $\pow(A)$, the set of all
   296 subsets of~$A$.  The package generates the definition
   297 \[  \Fin(A) \equiv \lfp(\pow(A), \,
   298   \begin{array}[t]{r@{\,}l}
   299       \lambda X. \{z\in\pow(A). & z=\emptyset \disj{} \\
   300                   &(\exists a\,b. z=\{a\}\un b\conj a\in A\conj b\in X)\})
   301   \end{array}
   302 \]
   303 The contribution of each rule to the definition of $\Fin(A)$ should be
   304 obvious.  A coinductive definition is similar but uses $\gfp$ instead
   305 of~$\lfp$.
   306 
   307 The package must prove that the fixedpoint operator is applied to a
   308 monotonic function.  If the introduction rules have the form described
   309 above, and if the package is supplied a monotonicity theorem for every
   310 $t\in M(R_i)$ premise, then this proof is trivial.\footnote{Due to the
   311   presence of logical connectives in the fixedpoint's body, the
   312   monotonicity proof requires some unusual rules.  These state that the
   313   connectives $\conj$, $\disj$ and $\exists$ preserve monotonicity with respect
   314   to the partial ordering on unary predicates given by $P\sqsubseteq Q$ if and
   315   only if $\forall x.P(x)\imp Q(x)$.}
   316 
   317 The package returns its result as an \textsc{ml} structure, which consists of named
   318 components; we may regard it as a record.  The result structure contains
   319 the definitions of the recursive sets as a theorem list called {\tt defs}.
   320 It also contains some theorems; {\tt dom\_subset} is an inclusion such as 
   321 $\Fin(A)\sbs\pow(A)$, while {\tt bnd\_mono} asserts that the fixedpoint
   322 definition is monotonic.
   323 
   324 Internally the package uses the theorem {\tt unfold}, a fixedpoint equation
   325 such as
   326 \[
   327   \begin{array}[t]{r@{\,}l}
   328      \Fin(A) = \{z\in\pow(A). & z=\emptyset \disj{} \\
   329              &(\exists a\,b. z=\{a\}\un b\conj a\in A\conj b\in \Fin(A))\}
   330   \end{array}
   331 \]
   332 In order to save space, this theorem is not exported.  
   333 
   334 
   335 \subsection{Mutual recursion} \label{mutual-sec}
   336 In a mutually recursive definition, the domain of the fixedpoint construction
   337 is the disjoint sum of the domain~$D_i$ of each~$R_i$, for $i=1$,
   338 \ldots,~$n$.  The package uses the injections of the
   339 binary disjoint sum, typically $\Inl$ and~$\Inr$, to express injections
   340 $h_{1n}$, \ldots, $h_{nn}$ for the $n$-ary disjoint sum $D_1+\cdots+D_n$.
   341 
   342 As discussed elsewhere \cite[\S4.5]{paulson-set-II}, Isabelle/\textsc{zf} defines the
   343 operator $\Part$ to support mutual recursion.  The set $\Part(A,h)$
   344 contains those elements of~$A$ having the form~$h(z)$:
   345 \[ \Part(A,h)  \equiv \{x\in A. \exists z. x=h(z)\}. \]   
   346 For mutually recursive sets $R_1$, \ldots,~$R_n$ with
   347 $n>1$, the package makes $n+1$ definitions.  The first defines a set $R$ using
   348 a fixedpoint operator. The remaining $n$ definitions have the form
   349 \[ R_i \equiv \Part(R,h_{in}), \qquad i=1,\ldots, n.  \] 
   350 It follows that $R=R_1\un\cdots\un R_n$, where the $R_i$ are pairwise disjoint.
   351 
   352 
   353 \subsection{Proving the introduction rules}
   354 The user supplies the package with the desired form of the introduction
   355 rules.  Once it has derived the theorem {\tt unfold}, it attempts
   356 to prove those rules.  From the user's point of view, this is the
   357 trickiest stage; the proofs often fail.  The task is to show that the domain 
   358 $D_1+\cdots+D_n$ of the combined set $R_1\un\cdots\un R_n$ is
   359 closed under all the introduction rules.  This essentially involves replacing
   360 each~$R_i$ by $D_1+\cdots+D_n$ in each of the introduction rules and
   361 attempting to prove the result.
   362 
   363 Consider the $\Fin(A)$ example.  After substituting $\pow(A)$ for $\Fin(A)$
   364 in the rules, the package must prove
   365 \[  \emptyset\in\pow(A)  \qquad 
   366     \infer{\{a\}\un b\in\pow(A)}{a\in A & b\in\pow(A)} 
   367 \]
   368 Such proofs can be regarded as type-checking the definition.\footnote{The
   369   Isabelle/\textsc{hol} version does not require these proofs, as \textsc{hol}
   370   has implicit type-checking.} The user supplies the package with
   371 type-checking rules to apply.  Usually these are general purpose rules from
   372 the \textsc{zf} theory.  They could however be rules specifically proved for a
   373 particular inductive definition; sometimes this is the easiest way to get the
   374 definition through!
   375 
   376 The result structure contains the introduction rules as the theorem list {\tt
   377 intrs}.
   378 
   379 \subsection{The case analysis rule}
   380 The elimination rule, called {\tt elim}, performs case analysis.  It is a
   381 simple consequence of {\tt unfold}.  There is one case for each introduction
   382 rule.  If $x\in\Fin(A)$ then either $x=\emptyset$ or else $x=\{a\}\un b$ for
   383 some $a\in A$ and $b\in\Fin(A)$.  Formally, the elimination rule for $\Fin(A)$
   384 is written
   385 \[ \infer{Q}{x\in\Fin(A) & \infer*{Q}{[x=\emptyset]}
   386                  & \infer*{Q}{[x=\{a\}\un b & a\in A &b\in\Fin(A)]_{a,b}} }
   387 \]
   388 The subscripted variables $a$ and~$b$ above the third premise are
   389 eigenvariables, subject to the usual ``not free in \ldots'' proviso.
   390 
   391 
   392 \section{Induction and coinduction rules}
   393 Here we must consider inductive and coinductive definitions separately.  For
   394 an inductive definition, the package returns an induction rule derived
   395 directly from the properties of least fixedpoints, as well as a modified rule
   396 for mutual recursion.  For a coinductive definition, the package returns a
   397 basic coinduction rule.
   398 
   399 \subsection{The basic induction rule}\label{basic-ind-sec}
   400 The basic rule, called {\tt induct}, is appropriate in most situations.
   401 For inductive definitions, it is strong rule induction~\cite{camilleri92}; for
   402 datatype definitions (see below), it is just structural induction.  
   403 
   404 The induction rule for an inductively defined set~$R$ has the form described
   405 below.  For the time being, assume that $R$'s domain is not a Cartesian
   406 product; inductively defined relations are treated slightly differently.
   407 
   408 The major premise is $x\in R$.  There is a minor premise for each
   409 introduction rule:
   410 \begin{itemize}
   411 \item If the introduction rule concludes $t\in R_i$, then the minor premise
   412 is~$P(t)$.
   413 
   414 \item The minor premise's eigenvariables are precisely the introduction
   415 rule's free variables that are not parameters of~$R$.  For instance, the
   416 eigenvariables in the $\Fin(A)$ rule below are $a$ and $b$, but not~$A$.
   417 
   418 \item If the introduction rule has a premise $t\in R_i$, then the minor
   419 premise discharges the assumption $t\in R_i$ and the induction
   420 hypothesis~$P(t)$.  If the introduction rule has a premise $t\in M(R_i)$
   421 then the minor premise discharges the single assumption
   422 \[ t\in M(\{z\in R_i. P(z)\}). \] 
   423 Because $M$ is monotonic, this assumption implies $t\in M(R_i)$.  The
   424 occurrence of $P$ gives the effect of an induction hypothesis, which may be
   425 exploited by appealing to properties of~$M$.
   426 \end{itemize}
   427 The induction rule for $\Fin(A)$ resembles the elimination rule shown above,
   428 but includes an induction hypothesis:
   429 \[ \infer{P(x)}{x\in\Fin(A) & P(\emptyset)
   430         & \infer*{P(\{a\}\un b)}{[a\in A & b\in\Fin(A) & P(b)]_{a,b}} }
   431 \] 
   432 Stronger induction rules often suggest themselves.  We can derive a rule for
   433 $\Fin(A)$ whose third premise discharges the extra assumption $a\not\in b$.
   434 The package provides rules for mutual induction and inductive relations.  The
   435 Isabelle/\textsc{zf} theory also supports well-founded induction and recursion
   436 over datatypes, by reasoning about the \defn{rank} of a
   437 set~\cite[\S3.4]{paulson-set-II}.
   438 
   439 
   440 \subsection{Modified induction rules}
   441 
   442 If the domain of $R$ is a Cartesian product $A_1\times\cdots\times A_m$
   443 (however nested), then the corresponding predicate $P_i$ takes $m$ arguments.
   444 The major premise becomes $\pair{z_1,\ldots,z_m}\in R$ instead of $x\in R$;
   445 the conclusion becomes $P(z_1,\ldots,z_m)$.  This simplifies reasoning about
   446 inductively defined relations, eliminating the need to express properties of
   447 $z_1$, \ldots,~$z_m$ as properties of the tuple $\pair{z_1,\ldots,z_m}$.
   448 Occasionally it may require you to split up the induction variable
   449 using {\tt SigmaE} and {\tt dom\_subset}, especially if the constant {\tt
   450   split} appears in the rule.
   451 
   452 The mutual induction rule is called {\tt
   453 mutual\_induct}.  It differs from the basic rule in two respects:
   454 \begin{itemize}
   455 \item Instead of a single predicate~$P$, it uses $n$ predicates $P_1$,
   456 \ldots,~$P_n$: one for each recursive set.
   457 
   458 \item There is no major premise such as $x\in R_i$.  Instead, the conclusion
   459 refers to all the recursive sets:
   460 \[ (\forall z.z\in R_1\imp P_1(z))\conj\cdots\conj
   461    (\forall z.z\in R_n\imp P_n(z))
   462 \]
   463 Proving the premises establishes $P_i(z)$ for $z\in R_i$ and $i=1$,
   464 \ldots,~$n$.
   465 \end{itemize}
   466 %
   467 If the domain of some $R_i$ is a Cartesian product, then the mutual induction
   468 rule is modified accordingly.  The predicates are made to take $m$ separate
   469 arguments instead of a tuple, and the quantification in the conclusion is over
   470 the separate variables $z_1$, \ldots, $z_m$.
   471 
   472 \subsection{Coinduction}\label{coind-sec}
   473 A coinductive definition yields a primitive coinduction rule, with no
   474 refinements such as those for the induction rules.  (Experience may suggest
   475 refinements later.)  Consider the codatatype of lazy lists as an example.  For
   476 suitable definitions of $\LNil$ and $\LCons$, lazy lists may be defined as the
   477 greatest set consistent with the rules
   478 \[  \LNil\in\llist(A)  \qquad 
   479     \infer[(-)]{\LCons(a,l)\in\llist(A)}{a\in A & l\in\llist(A)}
   480 \]
   481 The $(-)$ tag stresses that this is a coinductive definition.  A suitable
   482 domain for $\llist(A)$ is $\quniv(A)$; this set is closed under the variant
   483 forms of sum and product that are used to represent non-well-founded data
   484 structures (see~\S\ref{univ-sec}).
   485 
   486 The package derives an {\tt unfold} theorem similar to that for $\Fin(A)$. 
   487 Then it proves the theorem {\tt coinduct}, which expresses that $\llist(A)$
   488 is the greatest solution to this equation contained in $\quniv(A)$:
   489 \[ \infer{x\in\llist(A)}{x\in X & X\sbs \quniv(A) &
   490     \infer*{
   491      \begin{array}[b]{r@{}l}
   492        z=\LNil\disj 
   493        \bigl(\exists a\,l.\, & z=\LCons(a,l) \conj a\in A \conj{}\\
   494                              & l\in X\un\llist(A) \bigr)
   495      \end{array}  }{[z\in X]_z}}
   496 \]
   497 This rule complements the introduction rules; it provides a means of showing
   498 $x\in\llist(A)$ when $x$ is infinite.  For instance, if $x=\LCons(0,x)$ then
   499 applying the rule with $X=\{x\}$ proves $x\in\llist(\nat)$.  (Here $\nat$
   500 is the set of natural numbers.)
   501 
   502 Having $X\un\llist(A)$ instead of simply $X$ in the third premise above
   503 represents a slight strengthening of the greatest fixedpoint property.  I
   504 discuss several forms of coinduction rules elsewhere~\cite{paulson-coind}.
   505 
   506 The clumsy form of the third premise makes the rule hard to use, especially in
   507 large definitions.  Probably a constant should be declared to abbreviate the
   508 large disjunction, and rules derived to allow proving the separate disjuncts.
   509 
   510 
   511 \section{Examples of inductive and coinductive definitions}\label{ind-eg-sec}
   512 This section presents several examples from the literature: the finite
   513 powerset operator, lists of $n$ elements, bisimulations on lazy lists, the
   514 well-founded part of a relation, and the primitive recursive functions.
   515 
   516 \subsection{The finite powerset operator}
   517 This operator has been discussed extensively above.  Here is the
   518 corresponding invocation in an Isabelle theory file.  Note that
   519 $\cons(a,b)$ abbreviates $\{a\}\un b$ in Isabelle/\textsc{zf}.
   520 \begin{ttbox}
   521 Finite = Arith + 
   522 consts      Fin :: i=>i
   523 inductive
   524   domains   "Fin(A)" <= "Pow(A)"
   525   intrs
   526     emptyI  "0 : Fin(A)"
   527     consI   "[| a: A;  b: Fin(A) |] ==> cons(a,b) : Fin(A)"
   528   type_intrs "[empty_subsetI, cons_subsetI, PowI]"
   529   type_elims "[make_elim PowD]"
   530 end
   531 \end{ttbox}
   532 Theory {\tt Finite} extends the parent theory {\tt Arith} by declaring the
   533 unary function symbol~$\Fin$, which is defined inductively.  Its domain is
   534 specified as $\pow(A)$, where $A$ is the parameter appearing in the
   535 introduction rules.  For type-checking, we supply two introduction
   536 rules:
   537 \[ \emptyset\sbs A              \qquad
   538    \infer{\{a\}\un B\sbs C}{a\in C & B\sbs C}
   539 \]
   540 A further introduction rule and an elimination rule express both
   541 directions of the equivalence $A\in\pow(B)\bimp A\sbs B$.  Type-checking
   542 involves mostly introduction rules.  
   543 
   544 Like all Isabelle theory files, this one yields a structure containing the
   545 new theory as an \textsc{ml} value.  Structure {\tt Finite} also has a
   546 substructure, called~{\tt Fin}.  After declaring \hbox{\tt open Finite;} we
   547 can refer to the $\Fin(A)$ introduction rules as the list {\tt Fin.intrs}
   548 or individually as {\tt Fin.emptyI} and {\tt Fin.consI}.  The induction
   549 rule is {\tt Fin.induct}.
   550 
   551 
   552 \subsection{Lists of $n$ elements}\label{listn-sec}
   553 This has become a standard example of an inductive definition.  Following
   554 Paulin-Mohring~\cite{paulin-tlca}, we could attempt to define a new datatype
   555 $\listn(A,n)$, for lists of length~$n$, as an $n$-indexed family of sets.
   556 But her introduction rules
   557 \[ \hbox{\tt Niln}\in\listn(A,0)  \qquad
   558    \infer{\hbox{\tt Consn}(n,a,l)\in\listn(A,\succ(n))}
   559          {n\in\nat & a\in A & l\in\listn(A,n)}
   560 \]
   561 are not acceptable to the inductive definition package:
   562 $\listn$ occurs with three different parameter lists in the definition.
   563 
   564 The Isabelle version of this example suggests a general treatment of
   565 varying parameters.  It uses the existing datatype definition of
   566 $\lst(A)$, with constructors $\Nil$ and~$\Cons$, and incorporates the
   567 parameter~$n$ into the inductive set itself.  It defines $\listn(A)$ as a
   568 relation consisting of pairs $\pair{n,l}$ such that $n\in\nat$
   569 and~$l\in\lst(A)$ and $l$ has length~$n$.  In fact, $\listn(A)$ is the
   570 converse of the length function on~$\lst(A)$.  The Isabelle/\textsc{zf} introduction
   571 rules are
   572 \[ \pair{0,\Nil}\in\listn(A)  \qquad
   573    \infer{\pair{\succ(n),\Cons(a,l)}\in\listn(A)}
   574          {a\in A & \pair{n,l}\in\listn(A)}
   575 \]
   576 The Isabelle theory file takes, as parent, the theory~{\tt List} of lists.
   577 We declare the constant~$\listn$ and supply an inductive definition,
   578 specifying the domain as $\nat\times\lst(A)$:
   579 \begin{ttbox}
   580 ListN = List +
   581 consts  listn :: i=>i
   582 inductive
   583   domains   "listn(A)" <= "nat*list(A)"
   584   intrs
   585     NilI  "<0,Nil>: listn(A)"
   586     ConsI "[| a: A;  <n,l>: listn(A) |] ==> <succ(n), Cons(a,l)>: listn(A)"
   587   type_intrs "nat_typechecks @ list.intrs"
   588 end
   589 \end{ttbox}
   590 The type-checking rules include those for 0, $\succ$, $\Nil$ and $\Cons$.
   591 Because $\listn(A)$ is a set of pairs, type-checking requires the
   592 equivalence $\pair{a,b}\in A\times B \bimp a\in A \conj b\in B$.  The
   593 package always includes the rules for ordered pairs.
   594 
   595 The package returns introduction, elimination and induction rules for
   596 $\listn$.  The basic induction rule, {\tt listn.induct}, is
   597 \[ \infer{P(z_1,z_2)}{\pair{z_1,z_2}\in\listn(A) & P(0,\Nil) &
   598              \infer*{P(\succ(n),\Cons(a,l))}
   599                 {[a\in A & \pair{n,l}\in\listn(A) & P(n,l)]_{a,l,n}}}
   600 \]
   601 This rule lets the induction formula to be a 
   602 binary property of pairs, $P(n,l)$.  
   603 It is now a simple matter to prove theorems about $\listn(A)$, such as
   604 \[ \forall l\in\lst(A). \pair{\length(l),\, l}\in\listn(A) \]
   605 \[ \listn(A)``\{n\} = \{l\in\lst(A). \length(l)=n\} \]
   606 This latter result --- here $r``X$ denotes the image of $X$ under $r$
   607 --- asserts that the inductive definition agrees with the obvious notion of
   608 $n$-element list.  
   609 
   610 A ``list of $n$ elements'' really is a list, namely an element of ~$\lst(A)$.
   611 It is subject to list operators such as append (concatenation).  For example,
   612 a trivial induction on $\pair{m,l}\in\listn(A)$ yields
   613 \[ \infer{\pair{m\mathbin{+} m',\, l@l'}\in\listn(A)}
   614          {\pair{m,l}\in\listn(A) & \pair{m',l'}\in\listn(A)} 
   615 \]
   616 where $+$ denotes addition on the natural numbers and @ denotes append.
   617 
   618 \subsection{Rule inversion: the function {\tt mk\_cases}}
   619 The elimination rule, {\tt listn.elim}, is cumbersome:
   620 \[ \infer{Q}{x\in\listn(A) & 
   621           \infer*{Q}{[x = \pair{0,\Nil}]} &
   622           \infer*{Q}
   623              {\left[\begin{array}{l}
   624                x = \pair{\succ(n),\Cons(a,l)} \\
   625                a\in A \\
   626                \pair{n,l}\in\listn(A)
   627                \end{array} \right]_{a,l,n}}}
   628 \]
   629 The \textsc{ml} function {\tt listn.mk\_cases} generates simplified instances of
   630 this rule.  It works by freeness reasoning on the list constructors:
   631 $\Cons(a,l)$ is injective in its two arguments and differs from~$\Nil$.  If
   632 $x$ is $\pair{i,\Nil}$ or $\pair{i,\Cons(a,l)}$ then {\tt listn.mk\_cases}
   633 deduces the corresponding form of~$i$;  this is called rule inversion.  
   634 Here is a sample session: 
   635 \begin{ttbox}
   636 listn.mk_cases list.con_defs "<i,Nil> : listn(A)";
   637 {\out "[| <?i, []> : listn(?A); ?i = 0 ==> ?Q |] ==> ?Q" : thm}
   638 listn.mk_cases list.con_defs "<i,Cons(a,l)> : listn(A)";
   639 {\out "[| <?i, Cons(?a, ?l)> : listn(?A);}
   640 {\out     !!n. [| ?a : ?A; <n, ?l> : listn(?A); ?i = succ(n) |] ==> ?Q }
   641 {\out  |] ==> ?Q" : thm}
   642 \end{ttbox}
   643 Each of these rules has only two premises.  In conventional notation, the
   644 second rule is
   645 \[ \infer{Q}{\pair{i, \Cons(a,l)}\in\listn(A) & 
   646           \infer*{Q}
   647              {\left[\begin{array}{l}
   648                a\in A \\ \pair{n,l}\in\listn(A) \\ i = \succ(n)
   649                \end{array} \right]_{n}}}
   650 \]
   651 The package also has built-in rules for freeness reasoning about $0$
   652 and~$\succ$.  So if $x$ is $\pair{0,l}$ or $\pair{\succ(i),l}$, then {\tt
   653 listn.mk\_cases} can deduce the corresponding form of~$l$. 
   654 
   655 The function {\tt mk\_cases} is also useful with datatype definitions.  The
   656 instance from the definition of lists, namely {\tt list.mk\_cases}, can
   657 prove that $\Cons(a,l)\in\lst(A)$ implies $a\in A $ and $l\in\lst(A)$:
   658 \[ \infer{Q}{\Cons(a,l)\in\lst(A) & 
   659                  & \infer*{Q}{[a\in A &l\in\lst(A)]} }
   660 \]
   661 A typical use of {\tt mk\_cases} concerns inductive definitions of evaluation
   662 relations.  Then rule inversion yields case analysis on possible evaluations.
   663 For example, Isabelle/\textsc{zf} includes a short proof of the
   664 diamond property for parallel contraction on combinators.  Ole Rasmussen used
   665 {\tt mk\_cases} extensively in his development of the theory of
   666 residuals~\cite{rasmussen95}.
   667 
   668 
   669 \subsection{A coinductive definition: bisimulations on lazy lists}
   670 This example anticipates the definition of the codatatype $\llist(A)$, which
   671 consists of finite and infinite lists over~$A$.  Its constructors are $\LNil$
   672 and~$\LCons$, satisfying the introduction rules shown in~\S\ref{coind-sec}.  
   673 Because $\llist(A)$ is defined as a greatest fixedpoint and uses the variant
   674 pairing and injection operators, it contains non-well-founded elements such as
   675 solutions to $\LCons(a,l)=l$.
   676 
   677 The next step in the development of lazy lists is to define a coinduction
   678 principle for proving equalities.  This is done by showing that the equality
   679 relation on lazy lists is the greatest fixedpoint of some monotonic
   680 operation.  The usual approach~\cite{pitts94} is to define some notion of 
   681 bisimulation for lazy lists, define equivalence to be the greatest
   682 bisimulation, and finally to prove that two lazy lists are equivalent if and
   683 only if they are equal.  The coinduction rule for equivalence then yields a
   684 coinduction principle for equalities.
   685 
   686 A binary relation $R$ on lazy lists is a \defn{bisimulation} provided $R\sbs
   687 R^+$, where $R^+$ is the relation
   688 \[ \{\pair{\LNil,\LNil}\} \un 
   689    \{\pair{\LCons(a,l),\LCons(a,l')} . a\in A \conj \pair{l,l'}\in R\}.
   690 \]
   691 A pair of lazy lists are \defn{equivalent} if they belong to some
   692 bisimulation.  Equivalence can be coinductively defined as the greatest
   693 fixedpoint for the introduction rules
   694 \[  \pair{\LNil,\LNil} \in\lleq(A)  \qquad 
   695     \infer[(-)]{\pair{\LCons(a,l),\LCons(a,l')} \in\lleq(A)}
   696           {a\in A & \pair{l,l'}\in \lleq(A)}
   697 \]
   698 To make this coinductive definition, the theory file includes (after the
   699 declaration of $\llist(A)$) the following lines:
   700 \begin{ttbox}
   701 consts    lleq :: i=>i
   702 coinductive
   703   domains "lleq(A)" <= "llist(A) * llist(A)"
   704   intrs
   705     LNil  "<LNil,LNil> : lleq(A)"
   706     LCons "[| a:A; <l,l'>: lleq(A) |] ==> <LCons(a,l),LCons(a,l')>: lleq(A)"
   707   type_intrs  "llist.intrs"
   708 \end{ttbox}
   709 The domain of $\lleq(A)$ is $\llist(A)\times\llist(A)$.  The type-checking
   710 rules include the introduction rules for $\llist(A)$, whose 
   711 declaration is discussed below (\S\ref{lists-sec}).
   712 
   713 The package returns the introduction rules and the elimination rule, as
   714 usual.  But instead of induction rules, it returns a coinduction rule.
   715 The rule is too big to display in the usual notation; its conclusion is
   716 $x\in\lleq(A)$ and its premises are $x\in X$, 
   717 ${X\sbs\llist(A)\times\llist(A)}$ and
   718 \[ \infer*{z=\pair{\LNil,\LNil}\disj \bigl(\exists a\,l\,l'.\,
   719      \begin{array}[t]{@{}l}
   720        z=\pair{\LCons(a,l),\LCons(a,l')} \conj a\in A \conj{}\\
   721        \pair{l,l'}\in X\un\lleq(A) \bigr)
   722      \end{array}  
   723     }{[z\in X]_z}
   724 \]
   725 Thus if $x\in X$, where $X$ is a bisimulation contained in the
   726 domain of $\lleq(A)$, then $x\in\lleq(A)$.  It is easy to show that
   727 $\lleq(A)$ is reflexive: the equality relation is a bisimulation.  And
   728 $\lleq(A)$ is symmetric: its converse is a bisimulation.  But showing that
   729 $\lleq(A)$ coincides with the equality relation takes some work.
   730 
   731 \subsection{The accessible part of a relation}\label{acc-sec}
   732 Let $\prec$ be a binary relation on~$D$; in short, $(\prec)\sbs D\times D$.
   733 The \defn{accessible} or \defn{well-founded} part of~$\prec$, written
   734 $\acc(\prec)$, is essentially that subset of~$D$ for which $\prec$ admits
   735 no infinite decreasing chains~\cite{aczel77}.  Formally, $\acc(\prec)$ is
   736 inductively defined to be the least set that contains $a$ if it contains
   737 all $\prec$-predecessors of~$a$, for $a\in D$.  Thus we need an
   738 introduction rule of the form 
   739 \[ \infer{a\in\acc(\prec)}{\forall y.y\prec a\imp y\in\acc(\prec)} \]
   740 Paulin-Mohring treats this example in Coq~\cite{paulin-tlca}, but it causes
   741 difficulties for other systems.  Its premise is not acceptable to the
   742 inductive definition package of the Cambridge \textsc{hol}
   743 system~\cite{camilleri92}.  It is also unacceptable to the Isabelle package
   744 (recall \S\ref{intro-sec}), but fortunately can be transformed into the
   745 acceptable form $t\in M(R)$.
   746 
   747 The powerset operator is monotonic, and $t\in\pow(R)$ is equivalent to
   748 $t\sbs R$.  This in turn is equivalent to $\forall y\in t. y\in R$.  To
   749 express $\forall y.y\prec a\imp y\in\acc(\prec)$ we need only find a
   750 term~$t$ such that $y\in t$ if and only if $y\prec a$.  A suitable $t$ is
   751 the inverse image of~$\{a\}$ under~$\prec$.
   752 
   753 The definition below follows this approach.  Here $r$ is~$\prec$ and
   754 $\field(r)$ refers to~$D$, the domain of $\acc(r)$.  (The field of a
   755 relation is the union of its domain and range.)  Finally $r^{-}``\{a\}$
   756 denotes the inverse image of~$\{a\}$ under~$r$.  We supply the theorem {\tt
   757   Pow\_mono}, which asserts that $\pow$ is monotonic.
   758 \begin{ttbox}
   759 consts    acc :: i=>i
   760 inductive
   761   domains "acc(r)" <= "field(r)"
   762   intrs
   763     vimage  "[| r-``\{a\}: Pow(acc(r)); a: field(r) |] ==> a: acc(r)"
   764   monos     "[Pow_mono]"
   765 \end{ttbox}
   766 The Isabelle theory proceeds to prove facts about $\acc(\prec)$.  For
   767 instance, $\prec$ is well-founded if and only if its field is contained in
   768 $\acc(\prec)$.  
   769 
   770 As mentioned in~\S\ref{basic-ind-sec}, a premise of the form $t\in M(R)$
   771 gives rise to an unusual induction hypothesis.  Let us examine the
   772 induction rule, {\tt acc.induct}:
   773 \[ \infer{P(x)}{x\in\acc(r) &
   774      \infer*{P(a)}{\left[
   775                    \begin{array}{r@{}l}
   776                      r^{-}``\{a\} &\, \in\pow(\{z\in\acc(r).P(z)\}) \\
   777                                 a &\, \in\field(r)
   778                    \end{array}
   779                    \right]_a}}
   780 \]
   781 The strange induction hypothesis is equivalent to
   782 $\forall y. \pair{y,a}\in r\imp y\in\acc(r)\conj P(y)$.
   783 Therefore the rule expresses well-founded induction on the accessible part
   784 of~$\prec$.
   785 
   786 The use of inverse image is not essential.  The Isabelle package can accept
   787 introduction rules with arbitrary premises of the form $\forall
   788 \vec{y}.P(\vec{y})\imp f(\vec{y})\in R$.  The premise can be expressed
   789 equivalently as 
   790 \[ \{z\in D. P(\vec{y}) \conj z=f(\vec{y})\} \in \pow(R) \] 
   791 provided $f(\vec{y})\in D$ for all $\vec{y}$ such that~$P(\vec{y})$.  The
   792 following section demonstrates another use of the premise $t\in M(R)$,
   793 where $M=\lst$. 
   794 
   795 \subsection{The primitive recursive functions}\label{primrec-sec}
   796 The primitive recursive functions are traditionally defined inductively, as
   797 a subset of the functions over the natural numbers.  One difficulty is that
   798 functions of all arities are taken together, but this is easily
   799 circumvented by regarding them as functions on lists.  Another difficulty,
   800 the notion of composition, is less easily circumvented.
   801 
   802 Here is a more precise definition.  Letting $\vec{x}$ abbreviate
   803 $x_0,\ldots,x_{n-1}$, we can write lists such as $[\vec{x}]$,
   804 $[y+1,\vec{x}]$, etc.  A function is \defn{primitive recursive} if it
   805 belongs to the least set of functions in $\lst(\nat)\to\nat$ containing
   806 \begin{itemize}
   807 \item The \defn{successor} function $\SC$, such that $\SC[y,\vec{x}]=y+1$.
   808 \item All \defn{constant} functions $\CONST(k)$, such that
   809   $\CONST(k)[\vec{x}]=k$. 
   810 \item All \defn{projection} functions $\PROJ(i)$, such that
   811   $\PROJ(i)[\vec{x}]=x_i$ if $0\leq i<n$. 
   812 \item All \defn{compositions} $\COMP(g,[f_0,\ldots,f_{m-1}])$, 
   813 where $g$ and $f_0$, \ldots, $f_{m-1}$ are primitive recursive,
   814 such that
   815 \[ \COMP(g,[f_0,\ldots,f_{m-1}])[\vec{x}] = 
   816    g[f_0[\vec{x}],\ldots,f_{m-1}[\vec{x}]]. \] 
   817 
   818 \item All \defn{recursions} $\PREC(f,g)$, where $f$ and $g$ are primitive
   819   recursive, such that
   820 \begin{eqnarray*}
   821   \PREC(f,g)[0,\vec{x}] & = & f[\vec{x}] \\
   822   \PREC(f,g)[y+1,\vec{x}] & = & g[\PREC(f,g)[y,\vec{x}],\, y,\, \vec{x}].
   823 \end{eqnarray*} 
   824 \end{itemize}
   825 Composition is awkward because it combines not two functions, as is usual,
   826 but $m+1$ functions.  In her proof that Ackermann's function is not
   827 primitive recursive, Nora Szasz was unable to formalize this definition
   828 directly~\cite{szasz93}.  So she generalized primitive recursion to
   829 tuple-valued functions.  This modified the inductive definition such that
   830 each operation on primitive recursive functions combined just two functions.
   831 
   832 \begin{figure}
   833 \begin{ttbox}
   834 Primrec = List +
   835 consts
   836   primrec :: i
   837   SC      :: i
   838   \(\vdots\)
   839 defs
   840   SC_def    "SC == lam l:list(nat).list_case(0, \%x xs.succ(x), l)"
   841   \(\vdots\)
   842 inductive
   843   domains "primrec" <= "list(nat)->nat"
   844   intrs
   845     SC       "SC : primrec"
   846     CONST    "k: nat ==> CONST(k) : primrec"
   847     PROJ     "i: nat ==> PROJ(i) : primrec"
   848     COMP     "[| g: primrec; fs: list(primrec) |] ==> COMP(g,fs): primrec"
   849     PREC     "[| f: primrec; g: primrec |] ==> PREC(f,g): primrec"
   850   monos      "[list_mono]"
   851   con_defs   "[SC_def,CONST_def,PROJ_def,COMP_def,PREC_def]"
   852   type_intrs "nat_typechecks @ list.intrs @                      
   853               [lam_type, list_case_type, drop_type, map_type,    
   854                apply_type, rec_type]"
   855 end
   856 \end{ttbox}
   857 \hrule
   858 \caption{Inductive definition of the primitive recursive functions} 
   859 \label{primrec-fig}
   860 \end{figure}
   861 \def\fs{{\it fs}} 
   862 
   863 Szasz was using \textsc{alf}, but Coq and \textsc{hol} would also have
   864 problems accepting this definition.  Isabelle's package accepts it easily
   865 since $[f_0,\ldots,f_{m-1}]$ is a list of primitive recursive functions and
   866 $\lst$ is monotonic.  There are five introduction rules, one for each of the
   867 five forms of primitive recursive function.  Let us examine the one for
   868 $\COMP$:
   869 \[ \infer{\COMP(g,\fs)\in\primrec}{g\in\primrec & \fs\in\lst(\primrec)} \]
   870 The induction rule for $\primrec$ has one case for each introduction rule.
   871 Due to the use of $\lst$ as a monotone operator, the composition case has
   872 an unusual induction hypothesis:
   873  \[ \infer*{P(\COMP(g,\fs))}
   874           {[g\in\primrec & \fs\in\lst(\{z\in\primrec.P(z)\})]_{\fs,g}} 
   875 \]
   876 The hypothesis states that $\fs$ is a list of primitive recursive functions,
   877 each satisfying the induction formula.  Proving the $\COMP$ case typically
   878 requires structural induction on lists, yielding two subcases: either
   879 $\fs=\Nil$ or else $\fs=\Cons(f,\fs')$, where $f\in\primrec$, $P(f)$, and
   880 $\fs'$ is another list of primitive recursive functions satisfying~$P$.
   881 
   882 Figure~\ref{primrec-fig} presents the theory file.  Theory {\tt Primrec}
   883 defines the constants $\SC$, $\CONST$, etc.  These are not constructors of
   884 a new datatype, but functions over lists of numbers.  Their definitions,
   885 most of which are omitted, consist of routine list programming.  In
   886 Isabelle/\textsc{zf}, the primitive recursive functions are defined as a subset of
   887 the function set $\lst(\nat)\to\nat$.
   888 
   889 The Isabelle theory goes on to formalize Ackermann's function and prove
   890 that it is not primitive recursive, using the induction rule {\tt
   891   primrec.induct}.  The proof follows Szasz's excellent account.
   892 
   893 
   894 \section{Datatypes and codatatypes}\label{data-sec}
   895 A (co)datatype definition is a (co)inductive definition with automatically
   896 defined constructors and a case analysis operator.  The package proves that
   897 the case operator inverts the constructors and can prove freeness theorems
   898 involving any pair of constructors.
   899 
   900 
   901 \subsection{Constructors and their domain}\label{univ-sec}
   902 A (co)inductive definition selects a subset of an existing set; a (co)datatype
   903 definition creates a new set.  The package reduces the latter to the former.
   904 Isabelle/\textsc{zf} supplies sets having strong closure properties to serve
   905 as domains for (co)inductive definitions.
   906 
   907 Isabelle/\textsc{zf} defines the Cartesian product $A\times
   908 B$, containing ordered pairs $\pair{a,b}$; it also defines the
   909 disjoint sum $A+B$, containing injections $\Inl(a)\equiv\pair{0,a}$ and
   910 $\Inr(b)\equiv\pair{1,b}$.  For use below, define the $m$-tuple
   911 $\pair{x_1,\ldots,x_m}$ to be the empty set~$\emptyset$ if $m=0$, simply $x_1$
   912 if $m=1$ and $\pair{x_1,\pair{x_2,\ldots,x_m}}$ if $m\geq2$.
   913 
   914 A datatype constructor $\Con(x_1,\ldots,x_m)$ is defined to be
   915 $h(\pair{x_1,\ldots,x_m})$, where $h$ is composed of $\Inl$ and~$\Inr$.
   916 In a mutually recursive definition, all constructors for the set~$R_i$ have
   917 the outer form~$h_{in}$, where $h_{in}$ is the injection described
   918 in~\S\ref{mutual-sec}.  Further nested injections ensure that the
   919 constructors for~$R_i$ are pairwise distinct.  
   920 
   921 Isabelle/\textsc{zf} defines the set $\univ(A)$, which contains~$A$ and
   922 furthermore contains $\pair{a,b}$, $\Inl(a)$ and $\Inr(b)$ for $a$,
   923 $b\in\univ(A)$.  In a typical datatype definition with set parameters
   924 $A_1$, \ldots, $A_k$, a suitable domain for all the recursive sets is
   925 $\univ(A_1\un\cdots\un A_k)$.  This solves the problem for
   926 datatypes~\cite[\S4.2]{paulson-set-II}.
   927 
   928 The standard pairs and injections can only yield well-founded
   929 constructions.  This eases the (manual!) definition of recursive functions
   930 over datatypes.  But they are unsuitable for codatatypes, which typically
   931 contain non-well-founded objects.
   932 
   933 To support codatatypes, Isabelle/\textsc{zf} defines a variant notion of
   934 ordered pair, written~$\pair{a;b}$.  It also defines the corresponding variant
   935 notion of Cartesian product $A\otimes B$, variant injections $\QInl(a)$
   936 and~$\QInr(b)$ and variant disjoint sum $A\oplus B$.  Finally it defines the
   937 set $\quniv(A)$, which contains~$A$ and furthermore contains $\pair{a;b}$,
   938 $\QInl(a)$ and $\QInr(b)$ for $a$, $b\in\quniv(A)$.  In a typical codatatype
   939 definition with set parameters $A_1$, \ldots, $A_k$, a suitable domain is
   940 $\quniv(A_1\un\cdots\un A_k)$.  Details are published
   941 elsewhere~\cite{paulson-final}.
   942 
   943 \subsection{The case analysis operator}
   944 The (co)datatype package automatically defines a case analysis operator,
   945 called {\tt$R$\_case}.  A mutually recursive definition still has only one
   946 operator, whose name combines those of the recursive sets: it is called
   947 {\tt$R_1$\_\ldots\_$R_n$\_case}.  The case operator is analogous to those
   948 for products and sums.
   949 
   950 Datatype definitions employ standard products and sums, whose operators are
   951 $\split$ and $\case$ and satisfy the equations
   952 \begin{eqnarray*}
   953   \split(f,\pair{x,y})  & = &  f(x,y) \\
   954   \case(f,g,\Inl(x))    & = &  f(x)   \\
   955   \case(f,g,\Inr(y))    & = &  g(y)
   956 \end{eqnarray*}
   957 Suppose the datatype has $k$ constructors $\Con_1$, \ldots,~$\Con_k$.  Then
   958 its case operator takes $k+1$ arguments and satisfies an equation for each
   959 constructor:
   960 \[ R\hbox{\_case}(f_1,\ldots,f_k, {\tt Con}_i(\vec{x})) = f_i(\vec{x}),
   961     \qquad i = 1, \ldots, k
   962 \]
   963 The case operator's definition takes advantage of Isabelle's representation of
   964 syntax in the typed $\lambda$-calculus; it could readily be adapted to a
   965 theorem prover for higher-order logic.  If $f$ and~$g$ have meta-type $i\To i$
   966 then so do $\split(f)$ and $\case(f,g)$.  This works because $\split$ and
   967 $\case$ operate on their last argument.  They are easily combined to make
   968 complex case analysis operators.  For example, $\case(f,\case(g,h))$ performs
   969 case analysis for $A+(B+C)$; let us verify one of the three equations:
   970 \[ \case(f,\case(g,h), \Inr(\Inl(b))) = \case(g,h,\Inl(b)) = g(b) \]
   971 Codatatype definitions are treated in precisely the same way.  They express
   972 case operators using those for the variant products and sums, namely
   973 $\qsplit$ and~$\qcase$.
   974 
   975 \medskip
   976 
   977 To see how constructors and the case analysis operator are defined, let us
   978 examine some examples.  Further details are available
   979 elsewhere~\cite{paulson-set-II}.
   980 
   981 
   982 \subsection{Example: lists and lazy lists}\label{lists-sec}
   983 Here is a declaration of the datatype of lists, as it might appear in a theory
   984 file:
   985 \begin{ttbox} 
   986 consts  list :: i=>i
   987 datatype "list(A)" = Nil | Cons ("a:A", "l: list(A)")
   988 \end{ttbox}
   989 And here is a declaration of the codatatype of lazy lists:
   990 \begin{ttbox}
   991 consts  llist :: i=>i
   992 codatatype "llist(A)" = LNil | LCons ("a: A", "l: llist(A)")
   993 \end{ttbox}
   994 
   995 Each form of list has two constructors, one for the empty list and one for
   996 adding an element to a list.  Each takes a parameter, defining the set of
   997 lists over a given set~$A$.  Each is automatically given the appropriate
   998 domain: $\univ(A)$ for $\lst(A)$ and $\quniv(A)$ for $\llist(A)$.  The default
   999 can be overridden.
  1000 
  1001 \ifshort
  1002 Now $\lst(A)$ is a datatype and enjoys the usual induction rule.
  1003 \else
  1004 Since $\lst(A)$ is a datatype, it enjoys a structural induction rule, {\tt
  1005   list.induct}:
  1006 \[ \infer{P(x)}{x\in\lst(A) & P(\Nil)
  1007         & \infer*{P(\Cons(a,l))}{[a\in A & l\in\lst(A) & P(l)]_{a,l}} }
  1008 \] 
  1009 Induction and freeness yield the law $l\not=\Cons(a,l)$.  To strengthen this,
  1010 Isabelle/\textsc{zf} defines the rank of a set and proves that the standard
  1011 pairs and injections have greater rank than their components.  An immediate
  1012 consequence, which justifies structural recursion on lists
  1013 \cite[\S4.3]{paulson-set-II}, is
  1014 \[ \rank(l) < \rank(\Cons(a,l)). \]
  1015 \par
  1016 \fi
  1017 But $\llist(A)$ is a codatatype and has no induction rule.  Instead it has
  1018 the coinduction rule shown in \S\ref{coind-sec}.  Since variant pairs and
  1019 injections are monotonic and need not have greater rank than their
  1020 components, fixedpoint operators can create cyclic constructions.  For
  1021 example, the definition
  1022 \[ \lconst(a) \equiv \lfp(\univ(a), \lambda l. \LCons(a,l)) \]
  1023 yields $\lconst(a) = \LCons(a,\lconst(a))$.
  1024 
  1025 \ifshort
  1026 \typeout{****SHORT VERSION}
  1027 \typeout{****Omitting discussion of constructors!}
  1028 \else
  1029 \medskip
  1030 It may be instructive to examine the definitions of the constructors and
  1031 case operator for $\lst(A)$.  The definitions for $\llist(A)$ are similar.
  1032 The list constructors are defined as follows:
  1033 \begin{eqnarray*}
  1034   \Nil       & \equiv & \Inl(\emptyset) \\
  1035   \Cons(a,l) & \equiv & \Inr(\pair{a,l})
  1036 \end{eqnarray*}
  1037 The operator $\lstcase$ performs case analysis on these two alternatives:
  1038 \[ \lstcase(c,h) \equiv \case(\lambda u.c, \split(h)) \]
  1039 Let us verify the two equations:
  1040 \begin{eqnarray*}
  1041     \lstcase(c, h, \Nil) & = & 
  1042        \case(\lambda u.c, \split(h), \Inl(\emptyset)) \\
  1043      & = & (\lambda u.c)(\emptyset) \\
  1044      & = & c\\[1ex]
  1045     \lstcase(c, h, \Cons(x,y)) & = & 
  1046        \case(\lambda u.c, \split(h), \Inr(\pair{x,y})) \\
  1047      & = & \split(h, \pair{x,y}) \\
  1048      & = & h(x,y)
  1049 \end{eqnarray*} 
  1050 \fi
  1051 
  1052 
  1053 \ifshort
  1054 \typeout{****Omitting mutual recursion example!}
  1055 \else
  1056 \subsection{Example: mutual recursion}
  1057 In mutually recursive trees and forests~\cite[\S4.5]{paulson-set-II}, trees
  1058 have the one constructor $\Tcons$, while forests have the two constructors
  1059 $\Fnil$ and~$\Fcons$:
  1060 \begin{ttbox}
  1061 consts  tree, forest, tree_forest    :: i=>i
  1062 datatype "tree(A)"   = Tcons ("a: A",  "f: forest(A)")
  1063 and      "forest(A)" = Fnil  |  Fcons ("t: tree(A)",  "f: forest(A)")
  1064 \end{ttbox}
  1065 The three introduction rules define the mutual recursion.  The
  1066 distinguishing feature of this example is its two induction rules.
  1067 
  1068 The basic induction rule is called {\tt tree\_forest.induct}:
  1069 \[ \infer{P(x)}{x\in\TF(A) & 
  1070      \infer*{P(\Tcons(a,f))}
  1071         {\left[\begin{array}{l} a\in A \\ 
  1072                                 f\in\forest(A) \\ P(f)
  1073                \end{array}
  1074          \right]_{a,f}}
  1075      & P(\Fnil)
  1076      & \infer*{P(\Fcons(t,f))}
  1077         {\left[\begin{array}{l} t\in\tree(A)   \\ P(t) \\
  1078                                 f\in\forest(A) \\ P(f)
  1079                 \end{array}
  1080          \right]_{t,f}} }
  1081 \] 
  1082 This rule establishes a single predicate for $\TF(A)$, the union of the
  1083 recursive sets.  Although such reasoning is sometimes useful
  1084 \cite[\S4.5]{paulson-set-II}, a proper mutual induction rule should establish
  1085 separate predicates for $\tree(A)$ and $\forest(A)$.  The package calls this
  1086 rule {\tt tree\_forest.mutual\_induct}.  Observe the usage of $P$ and $Q$ in
  1087 the induction hypotheses:
  1088 \[ \infer{(\forall z. z\in\tree(A)\imp P(z)) \conj
  1089           (\forall z. z\in\forest(A)\imp Q(z))}
  1090      {\infer*{P(\Tcons(a,f))}
  1091         {\left[\begin{array}{l} a\in A \\ 
  1092                                 f\in\forest(A) \\ Q(f)
  1093                \end{array}
  1094          \right]_{a,f}}
  1095      & Q(\Fnil)
  1096      & \infer*{Q(\Fcons(t,f))}
  1097         {\left[\begin{array}{l} t\in\tree(A)   \\ P(t) \\
  1098                                 f\in\forest(A) \\ Q(f)
  1099                 \end{array}
  1100          \right]_{t,f}} }
  1101 \] 
  1102 Elsewhere I describe how to define mutually recursive functions over trees and
  1103 forests \cite[\S4.5]{paulson-set-II}.
  1104 
  1105 Both forest constructors have the form $\Inr(\cdots)$,
  1106 while the tree constructor has the form $\Inl(\cdots)$.  This pattern would
  1107 hold regardless of how many tree or forest constructors there were.
  1108 \begin{eqnarray*}
  1109   \Tcons(a,l)  & \equiv & \Inl(\pair{a,l}) \\
  1110   \Fnil        & \equiv & \Inr(\Inl(\emptyset)) \\
  1111   \Fcons(a,l)  & \equiv & \Inr(\Inr(\pair{a,l}))
  1112 \end{eqnarray*} 
  1113 There is only one case operator; it works on the union of the trees and
  1114 forests:
  1115 \[ {\tt tree\_forest\_case}(f,c,g) \equiv 
  1116     \case(\split(f),\, \case(\lambda u.c, \split(g))) 
  1117 \]
  1118 \fi
  1119 
  1120 
  1121 \subsection{Example: a four-constructor datatype}
  1122 A bigger datatype will illustrate some efficiency 
  1123 refinements.  It has four constructors $\Con_0$, \ldots, $\Con_3$, with the
  1124 corresponding arities.
  1125 \begin{ttbox}
  1126 consts    data :: [i,i] => i
  1127 datatype  "data(A,B)" = Con0
  1128                       | Con1 ("a: A")
  1129                       | Con2 ("a: A", "b: B")
  1130                       | Con3 ("a: A", "b: B", "d: data(A,B)")
  1131 \end{ttbox}
  1132 Because this datatype has two set parameters, $A$ and~$B$, the package
  1133 automatically supplies $\univ(A\un B)$ as its domain.  The structural
  1134 induction rule has four minor premises, one per constructor, and only the last
  1135 has an induction hypothesis.  (Details are left to the reader.)
  1136 
  1137 The constructors are defined by the equations
  1138 \begin{eqnarray*}
  1139   \Con_0         & \equiv & \Inl(\Inl(\emptyset)) \\
  1140   \Con_1(a)      & \equiv & \Inl(\Inr(a)) \\
  1141   \Con_2(a,b)    & \equiv & \Inr(\Inl(\pair{a,b})) \\
  1142   \Con_3(a,b,c)  & \equiv & \Inr(\Inr(\pair{a,b,c})).
  1143 \end{eqnarray*} 
  1144 The case analysis operator is
  1145 \[ {\tt data\_case}(f_0,f_1,f_2,f_3) \equiv 
  1146     \case(\begin{array}[t]{@{}l}
  1147           \case(\lambda u.f_0,\; f_1),\, \\
  1148           \case(\split(f_2),\; \split(\lambda v.\split(f_3(v)))) )
  1149    \end{array} 
  1150 \]
  1151 This may look cryptic, but the case equations are trivial to verify.
  1152 
  1153 In the constructor definitions, the injections are balanced.  A more naive
  1154 approach is to define $\Con_3(a,b,c)$ as $\Inr(\Inr(\Inr(\pair{a,b,c})))$;
  1155 instead, each constructor has two injections.  The difference here is small.
  1156 But the \textsc{zf} examples include a 60-element enumeration type, where each
  1157 constructor has 5 or~6 injections.  The naive approach would require 1 to~59
  1158 injections; the definitions would be quadratic in size.  It is like the
  1159 advantage of binary notation over unary.
  1160 
  1161 The result structure contains the case operator and constructor definitions as
  1162 the theorem list \verb|con_defs|. It contains the case equations, such as 
  1163 \[ {\tt data\_case}(f_0,f_1,f_2,f_3,\Con_3(a,b,c)) = f_3(a,b,c), \]
  1164 as the theorem list \verb|case_eqns|.  There is one equation per constructor.
  1165 
  1166 \subsection{Proving freeness theorems}
  1167 There are two kinds of freeness theorems:
  1168 \begin{itemize}
  1169 \item \defn{injectiveness} theorems, such as
  1170 \[ \Con_2(a,b) = \Con_2(a',b') \bimp a=a' \conj b=b' \]
  1171 
  1172 \item \defn{distinctness} theorems, such as
  1173 \[ \Con_1(a) \not= \Con_2(a',b')  \]
  1174 \end{itemize}
  1175 Since the number of such theorems is quadratic in the number of constructors,
  1176 the package does not attempt to prove them all.  Instead it returns tools for
  1177 proving desired theorems --- either manually or during
  1178 simplification or classical reasoning.
  1179 
  1180 The theorem list \verb|free_iffs| enables the simplifier to perform freeness
  1181 reasoning.  This works by incremental unfolding of constructors that appear in
  1182 equations.  The theorem list contains logical equivalences such as
  1183 \begin{eqnarray*}
  1184   \Con_0=c      & \bimp &  c=\Inl(\Inl(\emptyset))     \\
  1185   \Con_1(a)=c   & \bimp &  c=\Inl(\Inr(a))             \\
  1186                 & \vdots &                             \\
  1187   \Inl(a)=\Inl(b)   & \bimp &  a=b                     \\
  1188   \Inl(a)=\Inr(b)   & \bimp &  {\tt False}             \\
  1189   \pair{a,b} = \pair{a',b'} & \bimp & a=a' \conj b=b'
  1190 \end{eqnarray*}
  1191 For example, these rewrite $\Con_1(a)=\Con_1(b)$ to $a=b$ in four steps.
  1192 
  1193 The theorem list \verb|free_SEs| enables the classical
  1194 reasoner to perform similar replacements.  It consists of elimination rules
  1195 to replace $\Con_0=c$ by $c=\Inl(\Inl(\emptyset))$ and so forth, in the
  1196 assumptions.
  1197 
  1198 Such incremental unfolding combines freeness reasoning with other proof
  1199 steps.  It has the unfortunate side-effect of unfolding definitions of
  1200 constructors in contexts such as $\exists x.\Con_1(a)=x$, where they should
  1201 be left alone.  Calling the Isabelle tactic {\tt fold\_tac con\_defs}
  1202 restores the defined constants.
  1203 
  1204 
  1205 \section{Related work}\label{related}
  1206 The use of least fixedpoints to express inductive definitions seems
  1207 obvious.  Why, then, has this technique so seldom been implemented?
  1208 
  1209 Most automated logics can only express inductive definitions by asserting
  1210 axioms.  Little would be left of Boyer and Moore's logic~\cite{bm79} if their
  1211 shell principle were removed.  With \textsc{alf} the situation is more
  1212 complex; earlier versions of Martin-L\"of's type theory could (using
  1213 wellordering types) express datatype definitions, but the version underlying
  1214 \textsc{alf} requires new rules for each definition~\cite{dybjer91}.  With Coq
  1215 the situation is subtler still; its underlying Calculus of Constructions can
  1216 express inductive definitions~\cite{huet88}, but cannot quite handle datatype
  1217 definitions~\cite{paulin-tlca}.  It seems that researchers tried hard to
  1218 circumvent these problems before finally extending the Calculus with rule
  1219 schemes for strictly positive operators.  Recently Gim{\'e}nez has extended
  1220 the Calculus of Constructions with inductive and coinductive
  1221 types~\cite{gimenez-codifying}, with mechanized support in Coq.
  1222 
  1223 Higher-order logic can express inductive definitions through quantification
  1224 over unary predicates.  The following formula expresses that~$i$ belongs to the
  1225 least set containing~0 and closed under~$\succ$:
  1226 \[ \forall P. P(0)\conj (\forall x.P(x)\imp P(\succ(x))) \imp P(i) \] 
  1227 This technique can be used to prove the Knaster-Tarski theorem, which (in its
  1228 general form) is little used in the Cambridge \textsc{hol} system.
  1229 Melham~\cite{melham89} describes the development.  The natural numbers are
  1230 defined as shown above, but lists are defined as functions over the natural
  1231 numbers.  Unlabelled trees are defined using G\"odel numbering; a labelled
  1232 tree consists of an unlabelled tree paired with a list of labels.  Melham's
  1233 datatype package expresses the user's datatypes in terms of labelled trees.
  1234 It has been highly successful, but a fixedpoint approach might have yielded
  1235 greater power with less effort.
  1236 
  1237 Elsa Gunter~\cite{gunter-trees} reports an ongoing project to generalize the
  1238 Cambridge \textsc{hol} system with mutual recursion and infinitely-branching
  1239 trees.  She retains many features of Melham's approach.
  1240 
  1241 Melham's inductive definition package~\cite{camilleri92} also uses
  1242 quantification over predicates.  But instead of formalizing the notion of
  1243 monotone function, it requires definitions to consist of finitary rules, a
  1244 syntactic form that excludes many monotone inductive definitions.
  1245 
  1246 \textsc{pvs}~\cite{pvs-language} is another proof assistant based on
  1247 higher-order logic.  It supports both inductive definitions and datatypes,
  1248 apparently by asserting axioms.  Datatypes may not be iterated in general, but
  1249 may use recursion over the built-in $\lst$ type.
  1250 
  1251 The earliest use of least fixedpoints is probably Robin Milner's.  Brian
  1252 Monahan extended this package considerably~\cite{monahan84}, as did I in
  1253 unpublished work.\footnote{The datatype package described in my \textsc{lcf}
  1254   book~\cite{paulson87} does {\it not\/} make definitions, but merely asserts
  1255   axioms.} \textsc{lcf} is a first-order logic of domain theory; the relevant
  1256 fixedpoint theorem is not Knaster-Tarski but concerns fixedpoints of
  1257 continuous functions over domains.  \textsc{lcf} is too weak to express
  1258 recursive predicates.  The Isabelle package might be the first to be based on
  1259 the Knaster-Tarski theorem.
  1260 
  1261 
  1262 \section{Conclusions and future work}
  1263 Higher-order logic and set theory are both powerful enough to express
  1264 inductive definitions.  A growing number of theorem provers implement one
  1265 of these~\cite{IMPS,saaltink-fme}.  The easiest sort of inductive
  1266 definition package to write is one that asserts new axioms, not one that
  1267 makes definitions and proves theorems about them.  But asserting axioms
  1268 could introduce unsoundness.
  1269 
  1270 The fixedpoint approach makes it fairly easy to implement a package for
  1271 (co)in\-duc\-tive definitions that does not assert axioms.  It is efficient:
  1272 it processes most definitions in seconds and even a 60-constructor datatype
  1273 requires only a few minutes.  It is also simple: The first working version took
  1274 under a week to code, consisting of under 1100 lines (35K bytes) of Standard
  1275 \textsc{ml}.
  1276 
  1277 In set theory, care is needed to ensure that the inductive definition yields
  1278 a set (rather than a proper class).  This problem is inherent to set theory,
  1279 whether or not the Knaster-Tarski theorem is employed.  We must exhibit a
  1280 bounding set (called a domain above).  For inductive definitions, this is
  1281 often trivial.  For datatype definitions, I have had to formalize much set
  1282 theory.  To justify infinitely-branching datatype definitions, I have had to
  1283 develop a theory of cardinal arithmetic~\cite{paulson-gr}, such as the theorem
  1284 that if $\kappa$ is an infinite cardinal and $|X(\alpha)| \le \kappa$ for all
  1285 $\alpha<\kappa$ then $|\union\sb{\alpha<\kappa} X(\alpha)| \le \kappa$.  
  1286 The need for such efforts is not a drawback of the fixedpoint approach, for
  1287 the alternative is to take such definitions on faith.
  1288 
  1289 Care is also needed to ensure that the greatest fixedpoint really yields a
  1290 coinductive definition.  In set theory, standard pairs admit only well-founded
  1291 constructions.  Aczel's anti-foundation axiom~\cite{aczel88} could be used to
  1292 get non-well-founded objects, but it does not seem easy to mechanize.
  1293 Isabelle/\textsc{zf} instead uses a variant notion of ordered pairing, which
  1294 can be generalized to a variant notion of function.  Elsewhere I have
  1295 proved that this simple approach works (yielding final coalgebras) for a broad
  1296 class of definitions~\cite{paulson-final}.
  1297 
  1298 Several large studies make heavy use of inductive definitions.  L\"otzbeyer
  1299 and Sandner have formalized two chapters of a semantics book~\cite{winskel93},
  1300 proving the equivalence between the operational and denotational semantics of
  1301 a simple imperative language.  A single theory file contains three datatype
  1302 definitions (of arithmetic expressions, boolean expressions and commands) and
  1303 three inductive definitions (the corresponding operational rules).  Using
  1304 different techniques, Nipkow~\cite{nipkow-CR} and Rasmussen~\cite{rasmussen95}
  1305 have both proved the Church-Rosser theorem; inductive definitions specify
  1306 several reduction relations on $\lambda$-terms.  Recently, I have applied
  1307 inductive definitions to the analysis of cryptographic
  1308 protocols~\cite{paulson-markt}. 
  1309 
  1310 To demonstrate coinductive definitions, Frost~\cite{frost95} has proved the
  1311 consistency of the dynamic and static semantics for a small functional
  1312 language.  The example is due to Milner and Tofte~\cite{milner-coind}.  It
  1313 concerns an extended correspondence relation, which is defined coinductively.
  1314 A codatatype definition specifies values and value environments in mutual
  1315 recursion.  Non-well-founded values represent recursive functions.  Value
  1316 environments are variant functions from variables into values.  This one key
  1317 definition uses most of the package's novel features.
  1318 
  1319 The approach is not restricted to set theory.  It should be suitable for any
  1320 logic that has some notion of set and the Knaster-Tarski theorem.  I have
  1321 ported the (co)inductive definition package from Isabelle/\textsc{zf} to
  1322 Isabelle/\textsc{hol} (higher-order logic).  V\"olker~\cite{voelker95}
  1323 is investigating how to port the (co)datatype package.  \textsc{hol}
  1324 represents sets by unary predicates; defining the corresponding types may
  1325 cause complications.
  1326 
  1327 
  1328 \begin{footnotesize}
  1329 \bibliographystyle{springer}
  1330 \bibliography{string-abbrv,atp,theory,funprog,isabelle,crossref}
  1331 \end{footnotesize}
  1332 %%%%%\doendnotes
  1333 
  1334 \ifshort\typeout{****Omitting appendices}
  1335 \else
  1336 \newpage
  1337 \appendix
  1338 \section{Inductive and coinductive definitions: users guide}
  1339 A theory file may contain any number of inductive and coinductive
  1340 definitions.  They may be intermixed with other declarations; in
  1341 particular, the (co)inductive sets \defn{must} be declared separately as
  1342 constants, and may have mixfix syntax or be subject to syntax translations.
  1343 
  1344 The syntax is rather complicated.  Please consult the examples above and the
  1345 theory files on the \textsc{zf} source directory.  
  1346 
  1347 Each (co)inductive definition adds definitions to the theory and also proves
  1348 some theorems.  Each definition creates an \textsc{ml} structure, which is a
  1349 substructure of the main theory structure.
  1350 
  1351 Inductive and datatype definitions can take up considerable storage.  The
  1352 introduction rules are replicated in slightly different forms as fixedpoint
  1353 definitions, elimination rules and induction rules.  L\"otzbeyer and Sandner's
  1354 six definitions occupy over 600K in total.  Defining the 60-constructor
  1355 datatype requires nearly 560K\@.
  1356 
  1357 \subsection{The result structure}
  1358 Many of the result structure's components have been discussed
  1359 in~\S\ref{basic-sec}; others are self-explanatory.
  1360 \begin{description}
  1361 \item[\tt thy] is the new theory containing the recursive sets.
  1362 
  1363 \item[\tt defs] is the list of definitions of the recursive sets.
  1364 
  1365 \item[\tt bnd\_mono] is a monotonicity theorem for the fixedpoint operator.
  1366 
  1367 \item[\tt dom\_subset] is a theorem stating inclusion in the domain.
  1368 
  1369 \item[\tt intrs] is the list of introduction rules, now proved as theorems, for
  1370 the recursive sets.  The rules are also available individually, using the
  1371 names given them in the theory file. 
  1372 
  1373 \item[\tt elim] is the elimination rule.
  1374 
  1375 \item[\tt mk\_cases] is a function to create simplified instances of {\tt
  1376 elim}, using freeness reasoning on some underlying datatype.
  1377 \end{description}
  1378 
  1379 For an inductive definition, the result structure contains two induction
  1380 rules, {\tt induct} and \verb|mutual_induct|.  (To save storage, the latter
  1381 rule is just {\tt True} unless more than one set is being defined.)  For a
  1382 coinductive definition, it contains the rule \verb|coinduct|.
  1383 
  1384 Figure~\ref{def-result-fig} summarizes the two result signatures,
  1385 specifying the types of all these components.
  1386 
  1387 \begin{figure}
  1388 \begin{ttbox}
  1389 sig
  1390 val thy          : theory
  1391 val defs         : thm list
  1392 val bnd_mono     : thm
  1393 val dom_subset   : thm
  1394 val intrs        : thm list
  1395 val elim         : thm
  1396 val mk_cases     : thm list -> string -> thm
  1397 {\it(Inductive definitions only)} 
  1398 val induct       : thm
  1399 val mutual_induct: thm
  1400 {\it(Coinductive definitions only)}
  1401 val coinduct    : thm
  1402 end
  1403 \end{ttbox}
  1404 \hrule
  1405 \caption{The result of a (co)inductive definition} \label{def-result-fig}
  1406 \end{figure}
  1407 
  1408 \subsection{The syntax of a (co)inductive definition}
  1409 An inductive definition has the form
  1410 \begin{ttbox}
  1411 inductive
  1412   domains    {\it domain declarations}
  1413   intrs      {\it introduction rules}
  1414   monos      {\it monotonicity theorems}
  1415   con_defs   {\it constructor definitions}
  1416   type_intrs {\it introduction rules for type-checking}
  1417   type_elims {\it elimination rules for type-checking}
  1418 \end{ttbox}
  1419 A coinductive definition is identical, but starts with the keyword
  1420 {\tt coinductive}.  
  1421 
  1422 The {\tt monos}, {\tt con\_defs}, {\tt type\_intrs} and {\tt type\_elims}
  1423 sections are optional.  If present, each is specified as a string, which
  1424 must be a valid \textsc{ml} expression of type {\tt thm list}.  It is simply
  1425 inserted into the {\tt .thy.ML} file; if it is ill-formed, it will trigger
  1426 \textsc{ml} error messages.  You can then inspect the file on your directory.
  1427 
  1428 \begin{description}
  1429 \item[\it domain declarations] consist of one or more items of the form
  1430   {\it string\/}~{\tt <=}~{\it string}, associating each recursive set with
  1431   its domain.
  1432 
  1433 \item[\it introduction rules] specify one or more introduction rules in
  1434   the form {\it ident\/}~{\it string}, where the identifier gives the name of
  1435   the rule in the result structure.
  1436 
  1437 \item[\it monotonicity theorems] are required for each operator applied to
  1438   a recursive set in the introduction rules.  There \defn{must} be a theorem
  1439   of the form $A\sbs B\Imp M(A)\sbs M(B)$, for each premise $t\in M(R_i)$
  1440   in an introduction rule!
  1441 
  1442 \item[\it constructor definitions] contain definitions of constants
  1443   appearing in the introduction rules.  The (co)datatype package supplies
  1444   the constructors' definitions here.  Most (co)inductive definitions omit
  1445   this section; one exception is the primitive recursive functions example
  1446   (\S\ref{primrec-sec}).
  1447 
  1448 \item[\it type\_intrs] consists of introduction rules for type-checking the
  1449   definition, as discussed in~\S\ref{basic-sec}.  They are applied using
  1450   depth-first search; you can trace the proof by setting
  1451 
  1452   \verb|trace_DEPTH_FIRST := true|.
  1453 
  1454 \item[\it type\_elims] consists of elimination rules for type-checking the
  1455   definition.  They are presumed to be safe and are applied as much as
  1456   possible, prior to the {\tt type\_intrs} search.
  1457 \end{description}
  1458 
  1459 The package has a few notable restrictions:
  1460 \begin{itemize}
  1461 \item The theory must separately declare the recursive sets as
  1462   constants.
  1463 
  1464 \item The names of the recursive sets must be identifiers, not infix
  1465 operators.  
  1466 
  1467 \item Side-conditions must not be conjunctions.  However, an introduction rule
  1468 may contain any number of side-conditions.
  1469 
  1470 \item Side-conditions of the form $x=t$, where the variable~$x$ does not
  1471   occur in~$t$, will be substituted through the rule \verb|mutual_induct|.
  1472 \end{itemize}
  1473 
  1474 Isabelle/\textsc{hol} uses a simplified syntax for inductive definitions,
  1475 thanks to type-checking.  There are no \texttt{domains}, \texttt{type\_intrs}
  1476 or \texttt{type\_elims} parts.
  1477 
  1478 
  1479 \section{Datatype and codatatype definitions: users guide}
  1480 This section explains how to include (co)datatype declarations in a theory
  1481 file.  Please include {\tt Datatype} as a parent theory; this makes available
  1482 the definitions of $\univ$ and $\quniv$.
  1483 
  1484 
  1485 \subsection{The result structure}
  1486 The result structure extends that of (co)inductive definitions
  1487 (Figure~\ref{def-result-fig}) with several additional items:
  1488 \begin{ttbox}
  1489 val con_defs  : thm list
  1490 val case_eqns : thm list
  1491 val free_iffs : thm list
  1492 val free_SEs  : thm list
  1493 val mk_free   : string -> thm
  1494 \end{ttbox}
  1495 Most of these have been discussed in~\S\ref{data-sec}.  Here is a summary:
  1496 \begin{description}
  1497 \item[\tt con\_defs] is a list of definitions: the case operator followed by
  1498 the constructors.  This theorem list can be supplied to \verb|mk_cases|, for
  1499 example.
  1500 
  1501 \item[\tt case\_eqns] is a list of equations, stating that the case operator
  1502 inverts each constructor.
  1503 
  1504 \item[\tt free\_iffs] is a list of logical equivalences to perform freeness
  1505 reasoning by rewriting.  A typical application has the form
  1506 \begin{ttbox}
  1507 by (asm_simp_tac (ZF_ss addsimps free_iffs) 1);
  1508 \end{ttbox}
  1509 
  1510 \item[\tt free\_SEs] is a list of safe elimination rules to perform freeness
  1511 reasoning.  It can be supplied to \verb|eresolve_tac| or to the classical
  1512 reasoner:
  1513 \begin{ttbox} 
  1514 by (fast_tac (ZF_cs addSEs free_SEs) 1);
  1515 \end{ttbox}
  1516 
  1517 \item[\tt mk\_free] is a function to prove freeness properties, specified as
  1518 strings.  The theorems can be expressed in various forms, such as logical
  1519 equivalences or elimination rules.
  1520 \end{description}
  1521 
  1522 The result structure also inherits everything from the underlying
  1523 (co)inductive definition, such as the introduction rules, elimination rule,
  1524 and (co)induction rule.
  1525 
  1526 
  1527 \subsection{The syntax of a (co)datatype definition}
  1528 A datatype definition has the form
  1529 \begin{ttbox}
  1530 datatype <={\it domain}
  1531  {\it datatype declaration} and {\it datatype declaration} and \ldots
  1532   monos      {\it monotonicity theorems}
  1533   type_intrs {\it introduction rules for type-checking}
  1534   type_elims {\it elimination rules for type-checking}
  1535 \end{ttbox}
  1536 A codatatype definition is identical save that it starts with the keyword {\tt
  1537   codatatype}.
  1538 
  1539 The {\tt monos}, {\tt type\_intrs} and {\tt type\_elims} sections are
  1540 optional.  They are treated like their counterparts in a (co)inductive
  1541 definition, as described above.  The package supplements your type-checking
  1542 rules (if any) with additional ones that should cope with any
  1543 finitely-branching (co)datatype definition.
  1544 
  1545 \begin{description}
  1546 \item[\it domain] specifies a single domain to use for all the mutually
  1547   recursive (co)datatypes.  If it (and the preceeding~{\tt <=}) are
  1548   omitted, the package supplies a domain automatically.  Suppose the
  1549   definition involves the set parameters $A_1$, \ldots, $A_k$.  Then
  1550   $\univ(A_1\un\cdots\un A_k)$ is used for a datatype definition and
  1551   $\quniv(A_1\un\cdots\un A_k)$ is used for a codatatype definition.
  1552 
  1553   These choices should work for all finitely-branching (co)datatype
  1554   definitions.  For examples of infinitely-branching datatypes, 
  1555   see file {\tt ZF/ex/Brouwer.thy}.
  1556 
  1557 \item[\it datatype declaration] has the form
  1558 \begin{quote}
  1559  {\it string\/} {\tt =} {\it constructor} {\tt|} {\it constructor} {\tt|}
  1560  \ldots 
  1561 \end{quote}
  1562 The {\it string\/} is the datatype, say {\tt"list(A)"}.  Each
  1563 {\it constructor\/} has the form 
  1564 \begin{quote}
  1565  {\it name\/} {\tt(} {\it premise} {\tt,} {\it premise} {\tt,} \ldots {\tt)}
  1566  {\it mixfix\/}
  1567 \end{quote}
  1568 The {\it name\/} specifies a new constructor while the {\it premises\/} its
  1569 typing conditions.  The optional {\it mixfix\/} phrase may give
  1570 the constructor infix, for example.
  1571 
  1572 Mutually recursive {\it datatype declarations\/} are separated by the
  1573 keyword~{\tt and}.
  1574 \end{description}
  1575 
  1576 Isabelle/\textsc{hol}'s datatype definition package is (as of this writing)
  1577 entirely different from Isabelle/\textsc{zf}'s.  The syntax is different, and
  1578 instead of making an inductive definition it asserts axioms.
  1579 
  1580 \paragraph*{Note.}
  1581 In the definitions of the constructors, the right-hand sides may overlap.
  1582 For instance, the datatype of combinators has constructors defined by
  1583 \begin{eqnarray*}
  1584   {\tt K} & \equiv & \Inl(\emptyset) \\
  1585   {\tt S} & \equiv & \Inr(\Inl(\emptyset)) \\
  1586   p{\tt\#}q & \equiv & \Inr(\Inl(\pair{p,q})) 
  1587 \end{eqnarray*}
  1588 Unlike in previous versions of Isabelle, \verb|fold_tac| now ensures that the
  1589 longest right-hand sides are folded first.
  1590 
  1591 \fi
  1592 \end{document}