The HOL tutorial.
authornipkow
Wed, 26 Aug 1998 16:57:49 +0200
changeset 53751463e182c533
parent 5374 6ef3742b6153
child 5376 60b31a24f1a6
The HOL tutorial.
doc-src/Tutorial/appendix.tex
doc-src/Tutorial/basics.tex
doc-src/Tutorial/extra.sty
doc-src/Tutorial/fp.tex
doc-src/Tutorial/ttbox.sty
doc-src/Tutorial/tutorial.bbl
doc-src/Tutorial/tutorial.ind
doc-src/Tutorial/tutorial.tex
     1.1 --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     1.2 +++ b/doc-src/Tutorial/appendix.tex	Wed Aug 26 16:57:49 1998 +0200
     1.3 @@ -0,0 +1,75 @@
     1.4 +\appendix
     1.5 +
     1.6 +\chapter{Appendix}
     1.7 +\label{sec:Appendix}
     1.8 +
     1.9 +\begin{figure}[htbp]
    1.10 +\begin{center}
    1.11 +\begin{tabular}{|llll|}
    1.12 +\hline
    1.13 +\texttt{arities} &
    1.14 +\texttt{binder} &
    1.15 +\texttt{classes} &
    1.16 +\texttt{consts} \\
    1.17 +\texttt{default} &
    1.18 +\texttt{defs} &
    1.19 +\texttt{end} &
    1.20 +\texttt{global} \\
    1.21 +\texttt{infixl} &
    1.22 +\texttt{infixr} &
    1.23 +\texttt{instance} &
    1.24 +\texttt{local} \\
    1.25 +\texttt{mixfix} &
    1.26 +\texttt{ML} &
    1.27 +\texttt{MLtext} &
    1.28 +\texttt{nonterminals} \\
    1.29 +\texttt{oracle} &
    1.30 +\texttt{output} &
    1.31 +\texttt{path} &
    1.32 +\texttt{rules} \\
    1.33 +\texttt{setup} &
    1.34 +\texttt{syntax} &
    1.35 +\texttt{translations} &
    1.36 +\texttt{types} \\
    1.37 +\texttt{constdefs} &
    1.38 +\texttt{axclass} &&\\
    1.39 +\hline
    1.40 +\end{tabular}
    1.41 +\end{center}
    1.42 +\caption{Keywords in theory files}
    1.43 +\label{fig:keywords}
    1.44 +\end{figure}
    1.45 +
    1.46 +\begin{figure}[htbp]
    1.47 +\begin{center}
    1.48 +\begin{tabular}{|lllll|}
    1.49 +\hline
    1.50 +\texttt{ALL} &
    1.51 +\texttt{case} &
    1.52 +\texttt{div} &
    1.53 +\texttt{dvd} &
    1.54 +\texttt{else} \\
    1.55 +\texttt{EX} &
    1.56 +\texttt{if} &
    1.57 +\texttt{in} &
    1.58 +\texttt{INT} &
    1.59 +\texttt{Int} \\
    1.60 +\texttt{LEAST} &
    1.61 +\texttt{let} &
    1.62 +\texttt{mod} &
    1.63 +\texttt{O} &
    1.64 +\texttt{o} \\
    1.65 +\texttt{of} &
    1.66 +\texttt{op} &
    1.67 +\texttt{PROP} &
    1.68 +\texttt{SIGMA} &
    1.69 +\texttt{then} \\
    1.70 +\texttt{Times} &
    1.71 +\texttt{UN} &
    1.72 +\texttt{Un} &&\\
    1.73 +\hline
    1.74 +\end{tabular}
    1.75 +\end{center}
    1.76 +\caption{Reserved words in HOL}
    1.77 +\label{fig:ReservedWords}
    1.78 +\end{figure}
     2.1 --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     2.2 +++ b/doc-src/Tutorial/basics.tex	Wed Aug 26 16:57:49 1998 +0200
     2.3 @@ -0,0 +1,261 @@
     2.4 +\chapter{Basic Concepts}
     2.5 +
     2.6 +\section{Introduction}
     2.7 +
     2.8 +This is a tutorial on how to use Isabelle/HOL as a specification and
     2.9 +verification system. Isabelle is a generic system for implementing logical
    2.10 +formalisms, and Isabelle/HOL is the specialization of Isabelle for
    2.11 +HOL, which abbreviates Higher-Order Logic. We introduce HOL step by step
    2.12 +following the equation
    2.13 +\[ \mbox{HOL} = \mbox{Functional Programming} + \mbox{Logic}. \]
    2.14 +We assume that the reader is familiar with the basic concepts of both fields.
    2.15 +For excellent introductions to functional programming consult the textbooks
    2.16 +by Bird and Wadler~\cite{Bird-Wadler} or Paulson~\cite{Paulson-ML}.  Although
    2.17 +this tutorial initially concentrates on functional programming, do not be
    2.18 +misled: HOL can express most mathematical concepts, and functional
    2.19 +programming is just one particularly simple and ubiquitous instance.
    2.20 +
    2.21 +A tutorial is by definition incomplete. To fully exploit the power of the
    2.22 +system you need to consult the Isabelle Reference Manual~\cite{Isa-Ref-Man}
    2.23 +for details about Isabelle and the HOL chapter of the Logics
    2.24 +manual~\cite{Isa-Logics-Man} for details relating to HOL. Both manuals have a
    2.25 +comprehensive index.
    2.26 +
    2.27 +\section{Theories, proofs and interaction}
    2.28 +\label{sec:Basic:Theories}
    2.29 +
    2.30 +Working with Isabelle means creating two different kinds of documents:
    2.31 +theories and proof scripts. Roughly speaking, a \bfindex{theory} is a named
    2.32 +collection of types and functions, much like a module in a programming
    2.33 +language or a specification in a specification language. In fact, theories in
    2.34 +HOL can be either. Theories must reside in files with the suffix
    2.35 +\texttt{.thy}. The general format of a theory file \texttt{T.thy} is
    2.36 +\begin{ttbox}
    2.37 +T = B\(@1\) + \(\cdots\) + B\(@n\) +
    2.38 +\({<}declarations{>}\)
    2.39 +end
    2.40 +\end{ttbox}
    2.41 +where \texttt{B}$@1$, \dots, \texttt{B}$@n$ are the names of existing
    2.42 +theories that \texttt{T} is based on and ${<}declarations{>}$ stands for the
    2.43 +newly introduced concepts (types, functions etc). The \texttt{B}$@i$ are the
    2.44 +direct \textbf{parent theories}\indexbold{parent theory} of \texttt{T}.
    2.45 +Everything defined in the parent theories (and their parents \dots) is
    2.46 +automatically visible. To avoid name clashes, identifiers can be qualified by
    2.47 +theory names as in \texttt{T.f} and \texttt{B.f}. HOL's theory library is
    2.48 +available online at \verb$http://www.in.tum.de/~isabelle/library/HOL/$ and is
    2.49 +recommended browsing.
    2.50 +\begin{warn}
    2.51 +  HOL contains a theory \ttindexbold{Main}, the union of all the basic
    2.52 +  predefined theories like arithmetic, lists, sets, etc.\ (see the online
    2.53 +  library).  Unless you know what you are doing, always include \texttt{Main}
    2.54 +  as a direct or indirect parent theory of all your theories.
    2.55 +\end{warn}
    2.56 +
    2.57 +This tutorial is concerned with introducing you to the different linguistic
    2.58 +constructs that can fill ${<}declarations{>}$ in the above theory template.
    2.59 +A complete grammar of the basic constructs is found in Appendix~A
    2.60 +of~\cite{Isa-Ref-Man}, for reference in times of doubt.
    2.61 +
    2.62 +The tutorial is also concerned with showing you how to prove theorems about
    2.63 +the concepts in a theory. This involves invoking predefined theorem proving
    2.64 +commands. Because Isabelle is written in the programming language
    2.65 +ML,\footnote{Many concepts in HOL and ML are similar. Make sure you do not
    2.66 +  confuse the two levels.} interacting with Isabelle means calling ML
    2.67 +functions. Hence \bfindex{proof scripts} are sequences of calls to ML
    2.68 +functions that perform specific theorem proving tasks. Nevertheless,
    2.69 +familiarity with ML is absolutely not required.  All proof scripts for theory
    2.70 +\texttt{T} (defined in file \texttt{T.thy}) should be contained in file
    2.71 +\texttt{T.ML}. Theory and proof scripts are loaded (and checked!) by calling
    2.72 +the ML function \ttindexbold{use_thy}:
    2.73 +\begin{ttbox}
    2.74 +use_thy "T";
    2.75 +\end{ttbox}
    2.76 +
    2.77 +There are more advanced interfaces for Isabelle that hide the ML level from
    2.78 +you and replace function calls by menu selection. There is even a special
    2.79 +font with mathematical symbols. For details see the Isabelle home page.  This
    2.80 +tutorial concentrates on the bare essentials and ignores such niceties.
    2.81 +
    2.82 +\section{Types, terms and formulae}
    2.83 +\label{sec:TypesTermsForms}
    2.84 +
    2.85 +Embedded in the declarations of a theory are the types, terms and formulae of
    2.86 +HOL. HOL is a typed logic whose type system resembles that of functional
    2.87 +programming languages like ML or Haskell. Thus there are
    2.88 +\begin{description}
    2.89 +\item[base types,] in particular \ttindex{bool}, the type of truth values,
    2.90 +and \ttindex{nat}, the type of natural numbers.
    2.91 +\item[type constructors,] in particular \ttindex{list}, the type of
    2.92 +lists, and \ttindex{set}, the type of sets. Type constructors are written
    2.93 +postfix, e.g.\ \texttt{(nat)list} is the type of lists whose elements are
    2.94 +natural numbers. Parentheses around single arguments can be dropped (as in
    2.95 +\texttt{nat list}), multiple arguments are separated by commas (as in
    2.96 +\texttt{(bool,nat)foo}).
    2.97 +\item[function types,] denoted by \ttindexbold{=>}. In HOL
    2.98 +\texttt{=>} represents {\em total} functions only. As is customary,
    2.99 +\texttt{$\tau@1$ => $\tau@2$ => $\tau@3$} means
   2.100 +\texttt{$\tau@1$ => ($\tau@2$ => $\tau@3$)}. Isabelle also supports the
   2.101 +notation \texttt{[$\tau@1,\dots,\tau@n$] => $\tau$} which abbreviates
   2.102 +\texttt{$\tau@1$ => $\cdots$ => $\tau@n$ => $\tau$}.
   2.103 +\item[type variables,] denoted by \texttt{'a}, \texttt{'b} etc, just like in
   2.104 +ML. They give rise to polymorphic types like \texttt{'a => 'a}, the type of the
   2.105 +identity function.
   2.106 +\end{description}
   2.107 +\begin{warn}
   2.108 +  Types are extremely important because they prevent us from writing
   2.109 +  nonsense.  Isabelle insists that all terms and formulae must be well-typed
   2.110 +  and will print an error message if a type mismatch is encountered. To
   2.111 +  reduce the amount of explicit type information that needs to be provided by
   2.112 +  the user, Isabelle infers the type of all variables automatically (this is
   2.113 +  called \bfindex{type inference}) and keeps quiet about it. Occasionally
   2.114 +  this may lead to misunderstandings between you and the system. If anything
   2.115 +  strange happens, we recommend to set the flag \ttindexbold{show_types} that
   2.116 +  tells Isabelle to display type information that is usually suppressed:
   2.117 +  simply type
   2.118 +\begin{ttbox}
   2.119 +set show_types;
   2.120 +\end{ttbox}
   2.121 +
   2.122 +\noindent
   2.123 +at the ML-level. This can be reversed by \texttt{reset show_types;}.
   2.124 +\end{warn}
   2.125 +
   2.126 +
   2.127 +\textbf{Terms}\indexbold{term}
   2.128 +are formed as in functional programming by applying functions to
   2.129 +arguments. If \texttt{f} is a function of type \texttt{$\tau@1$ => $\tau@2$}
   2.130 +and \texttt{t} is a term of type $\tau@1$ then \texttt{f~t} is a term of type
   2.131 +$\tau@2$. HOL also supports infix functions like \texttt{+} and some basic
   2.132 +constructs from functional programming:
   2.133 +\begin{description}
   2.134 +\item[\texttt{if $b$ then $t@1$ else $t@2$}]\indexbold{*if}
   2.135 +means what you think it means and requires that
   2.136 +$b$ is of type \texttt{bool} and $t@1$ and $t@2$ are of the same type.
   2.137 +\item[\texttt{let $x$ = $t$ in $u$}]\indexbold{*let}
   2.138 +is equivalent to $u$ where all occurrences of $x$ have been replaced by
   2.139 +$t$. For example,
   2.140 +\texttt{let x = 0 in x+x} means \texttt{0+0}. Multiple bindings are separated
   2.141 +by semicolons: \texttt{let $x@1$ = $t@1$; \dots; $x@n$ = $t@n$ in $u$}.
   2.142 +\item[\texttt{case $e$ of $c@1$ => $e@1$ | \dots | $c@n$ => $e@n$}]
   2.143 +\indexbold{*case}
   2.144 +evaluates to $e@i$ if $e$ is of the form
   2.145 +$c@i$. See~\S\ref{sec:case-expressions} for details.
   2.146 +\end{description}
   2.147 +
   2.148 +Terms may also contain $\lambda$-abstractions. For example, $\lambda
   2.149 +x. x+1$ is the function that takes an argument $x$ and returns $x+1$. In
   2.150 +Isabelle we write \texttt{\%x.~x+1}.\index{==>@{\tt\%}|bold}
   2.151 +Instead of \texttt{\%x.~\%y.~\%z.~t} we can write \texttt{\%x~y~z.~t}.
   2.152 +
   2.153 +\textbf{Formulae}\indexbold{formula}
   2.154 +are terms of type \texttt{bool}. There are the basic
   2.155 +constants \ttindexbold{True} and \ttindexbold{False} and the usual logical
   2.156 +connectives (in decreasing order of priority):
   2.157 +\verb$~$\index{$HOL1@{\ttnot}|bold} (`not'),
   2.158 +\texttt{\&}\index{$HOL2@{\tt\&}|bold} (`and'),
   2.159 +\texttt{|}\index{$HOL2@{\ttor}|bold} (`or') and
   2.160 +\texttt{-->}\index{$HOL2@{\tt-->}|bold} (`implies'),
   2.161 +all of which (except the unary \verb$~$) associate to the right. In
   2.162 +particular \texttt{A --> B --> C} means \texttt{A --> (B --> C)} and is thus
   2.163 +logically equivalent with \texttt{A \& B --> C}
   2.164 +(which is \texttt{(A \& B) --> C}).
   2.165 +
   2.166 +Equality is available in the form of the infix function
   2.167 +\texttt{=} of type \texttt{'a => 'a => bool}. Thus \texttt{$t@1$ = $t@2$} is
   2.168 +a formula provided $t@1$ and $t@2$ are terms of the same type. In case $t@1$
   2.169 +and $t@2$ are of type \texttt{bool}, \texttt{=} acts as if-and-only-if.
   2.170 +
   2.171 +The syntax for quantifiers is
   2.172 +\texttt{!~$x$.$\,P$}\index{$HOLQ@{\ttall}|bold} (`for all $x$') and
   2.173 +\texttt{?~$x$.$\,P$}\index{$HOLQ@{\tt?}|bold} (`exists $x$').
   2.174 +There is even \texttt{?!~$x$.$\,P$}\index{$HOLQ@{\ttuniquex}|bold}, which
   2.175 +means that there exists exactly one $x$ that satisfies $P$. Instead of
   2.176 +\texttt{!} and \texttt{?} you may also write \texttt{ALL} and \texttt{EX}.
   2.177 +Nested quantifications can be abbreviated:
   2.178 +\texttt{!$x~y~z$.$\,P$} means \texttt{!$x$.~!$y$.~!$z$.$\,P$}.
   2.179 +
   2.180 +Despite type inference, it is sometimes necessary to attach explicit
   2.181 +\bfindex{type constraints} to a term.  The syntax is \texttt{$t$::$\tau$} as
   2.182 +in \texttt{x < (y::nat)}. Note that \texttt{::} binds weakly and should
   2.183 +therefore be enclosed in parentheses: \texttt{x < y::nat} is ill-typed
   2.184 +because it is interpreted as \texttt{(x < y)::nat}. The main reason for type
   2.185 +constraints are overloaded functions like \texttt{+}, \texttt{*} and
   2.186 +\texttt{<}. (See \S\ref{sec:TypeClasses} for a full discussion of
   2.187 +overloading.)
   2.188 +
   2.189 +\begin{warn}
   2.190 +In general, HOL's concrete syntax tries to follow the conventions of
   2.191 +functional programming and mathematics. Below we list the main rules that you
   2.192 +should be familiar with to avoid certain syntactic traps. A particular
   2.193 +problem for novices can be the priority of operators. If you are unsure, use
   2.194 +more rather than fewer parentheses. In those cases where Isabelle echoes your
   2.195 +input, you can see which parentheses are dropped---they were superfluous. If
   2.196 +you are unsure how to interpret Isabelle's output because you don't know
   2.197 +where the (dropped) parentheses go, set (and possibly reset) the flag
   2.198 +\ttindexbold{show_brackets}:
   2.199 +\begin{ttbox}
   2.200 +set show_brackets; \(\dots\); reset show_brackets;
   2.201 +\end{ttbox}
   2.202 +\end{warn}
   2.203 +
   2.204 +\begin{itemize}
   2.205 +\item
   2.206 +Remember that \texttt{f t u} means \texttt{(f t) u} and not \texttt{f(t u)}!
   2.207 +\item
   2.208 +Isabelle allows infix functions like \texttt{+}. The prefix form of function
   2.209 +application binds more strongly than anything else and hence \texttt{f~x + y}
   2.210 +means \texttt{(f~x)~+~y} and not \texttt{f(x+y)}.
   2.211 +\item
   2.212 +Remember that in HOL if-and-only-if is expressed using equality.  But equality
   2.213 +has a high priority, as befitting a  relation, while if-and-only-if typically
   2.214 +has the lowest priority.  Thus, \verb$~ ~ P = P$ means \verb$~ ~(P = P)$ and
   2.215 +not \verb$(~ ~P) = P$. When using \texttt{=} to mean logical equivalence,
   2.216 +enclose both operands in parentheses, as in \texttt{(A \& B) = (B \& A)}.
   2.217 +\item
   2.218 +Constructs with an opening but without a closing delimiter bind very weakly
   2.219 +and should therefore be enclosed in parentheses if they appear in subterms, as
   2.220 +in \texttt{f = (\%x.~x)}. This includes
   2.221 +\ttindex{if}, \ttindex{let}, \ttindex{case}, \verb$%$ and quantifiers.
   2.222 +\item
   2.223 +Never write \texttt{\%x.x} or \texttt{!x.x=x} because \texttt{x.x} is always
   2.224 +read as a single qualified identifier that refers to an item \texttt{x} in
   2.225 +theory \texttt{x}. Write \texttt{\%x.~x} and \texttt{!x.~x=x} instead.
   2.226 +\end{itemize}
   2.227 +
   2.228 +\section{Variables}
   2.229 +\label{sec:variables}
   2.230 +
   2.231 +Isabelle distinguishes free and bound variables just as is customary. Bound
   2.232 +variables are automatically renamed to avoid clashes with free variables. In
   2.233 +addition, Isabelle has a third kind of variable, called a \bfindex{schematic
   2.234 +  variable} or \bfindex{unknown}, which starts with a \texttt{?}.  Logically,
   2.235 +an unknown is a free variable. But it may be instantiated by another term
   2.236 +during the proof process. For example, the mathematical theorem $x = x$ is
   2.237 +represented in Isabelle as \texttt{?x = ?x}, which means that Isabelle can
   2.238 +instantiate it arbitrarily. This is in contrast to ordinary variables, which
   2.239 +remain fixed. The programming language Prolog calls unknowns {\em logical\/}
   2.240 +variables.
   2.241 +
   2.242 +Most of the time you can and should ignore unknowns and work with ordinary
   2.243 +variables. Just don't be surprised that after you have finished the
   2.244 +proof of a theorem, Isabelle will turn your free variables into unknowns: it
   2.245 +merely indicates that Isabelle will automatically instantiate those unknowns
   2.246 +suitably when the theorem is used in some other proof.
   2.247 +\begin{warn}
   2.248 +  The existential quantifier \texttt{?}\index{$HOLQ@{\tt?}} needs to be
   2.249 +  followed by a space. Otherwise \texttt{?x} is interpreted as a schematic
   2.250 +  variable.
   2.251 +\end{warn}
   2.252 +
   2.253 +\section{Getting started}
   2.254 +
   2.255 +Assuming you have installed Isabelle, you start it by typing \texttt{isabelle
   2.256 +  HOL} in a shell window.\footnote{Since you will always want to use HOL when
   2.257 +  studying this tutorial, you can set the shell variable
   2.258 +  \texttt{ISABELLE_LOGIC} to \texttt{HOL} once and for all and simply execute
   2.259 +  \texttt{isabelle}.} This presents you with Isabelle's most basic ASCII
   2.260 +interface. In addition you need to open an editor window to create theories
   2.261 +(\texttt{.thy} files) and proof scripts (\texttt{.ML} files). While you are
   2.262 +developing a proof, we recommend to type each proof command into the ML-file
   2.263 +first and then enter it into Isabelle by copy-and-paste, thus ensuring that
   2.264 +you have a complete record of your proof.
     3.1 --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     3.2 +++ b/doc-src/Tutorial/extra.sty	Wed Aug 26 16:57:49 1998 +0200
     3.3 @@ -0,0 +1,29 @@
     3.4 +% extra.sty : Isabelle Manual extra macros for non-Springer version
     3.5 +%
     3.6 +\typeout{Document Style extra. Released 17 February 1994}
     3.7 +
     3.8 +%%Euro-style date: 20 September 1955
     3.9 +\def\today{\number\day\space\ifcase\month\or
    3.10 +January\or February\or March\or April\or May\or June\or
    3.11 +July\or August\or September\or October\or November\or December\fi
    3.12 +\space\number\year}
    3.13 +
    3.14 +%%Borrowed from alltt.sty, but leaves % as the comment character
    3.15 +\def\docspecials{\do\ \do\$\do\&%
    3.16 +  \do\#\do\^\do\^^K\do\_\do\^^A\do\~}
    3.17 +
    3.18 +%%%Put first chapter on odd page, with arabic numbering; like \cleardoublepage
    3.19 +\newcommand\clearfirst{\clearpage\ifodd\c@page\else
    3.20 +    \hbox{}\newpage\if@twocolumn\hbox{}\newpage\fi\fi
    3.21 +    \pagenumbering{arabic}}
    3.22 +
    3.23 +%%%Ruled chapter headings 
    3.24 +\def\@rulehead#1{\hrule height1pt \vskip 14pt \Huge \bf 
    3.25 +   #1 \vskip 14pt\hrule height1pt}
    3.26 +\def\@makechapterhead#1{ { \parindent 0pt 
    3.27 + \ifnum\c@secnumdepth >\m@ne \raggedleft\large\bf\@chapapp{} \thechapter \par 
    3.28 + \vskip 20pt \fi \raggedright \@rulehead{#1} \par \nobreak \vskip 40pt } }
    3.29 +
    3.30 +\def\@makeschapterhead#1{ { \parindent 0pt \raggedright 
    3.31 + \@rulehead{#1} \par \nobreak \vskip 40pt } }
    3.32 +
     4.1 --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     4.2 +++ b/doc-src/Tutorial/fp.tex	Wed Aug 26 16:57:49 1998 +0200
     4.3 @@ -0,0 +1,1416 @@
     4.4 +\chapter{Functional Programming in HOL}
     4.5 +
     4.6 +Although on the surface this chapter is mainly concerned with how to write
     4.7 +functional programs in HOL and how to verify them, most of the
     4.8 +constructs and proof procedures introduced are general purpose and recur in
     4.9 +any specification or verification task.
    4.10 +
    4.11 +The dedicated functional programmer should be warned: HOL offers only what
    4.12 +could be called {\em total functional programming} --- all functions in HOL
    4.13 +must be total; lazy data structures are not directly available. On the
    4.14 +positive side, functions in HOL need not be computable: HOL is a
    4.15 +specification language that goes well beyond what can be expressed as a
    4.16 +program. However, for the time being we concentrate on the computable.
    4.17 +
    4.18 +\section{An introductory theory}
    4.19 +\label{sec:intro-theory}
    4.20 +
    4.21 +Functional programming needs datatypes and functions. Both of them can be
    4.22 +defined in a theory with a syntax reminiscent of languages like ML or
    4.23 +Haskell. As an example consider the theory in Fig.~\ref{fig:ToyList}.
    4.24 +
    4.25 +\begin{figure}[htbp]
    4.26 +\begin{ttbox}\makeatother
    4.27 +\input{ToyList/ToyList.thy}\end{ttbox}
    4.28 +\caption{A theory of lists}
    4.29 +\label{fig:ToyList}
    4.30 +\end{figure}
    4.31 +
    4.32 +HOL already has a predefined theory of lists called \texttt{List} ---
    4.33 +\texttt{ToyList} is merely a small fragment of it chosen as an example. In
    4.34 +contrast to what is recommended in \S\ref{sec:Basic:Theories},
    4.35 +\texttt{ToyList} is not based on \texttt{Main} but on \texttt{Datatype}, a
    4.36 +theory that contains everything required for datatype definitions but does
    4.37 +not have \texttt{List} as a parent, thus avoiding ambiguities caused by
    4.38 +defining lists twice.
    4.39 +
    4.40 +The \ttindexbold{datatype} \texttt{list} introduces two constructors
    4.41 +\texttt{Nil} and \texttt{Cons}, the empty list and the operator that adds an
    4.42 +element to the front of a list. For example, the term \texttt{Cons True (Cons
    4.43 +  False Nil)} is a value of type \texttt{bool~list}, namely the list with the
    4.44 +elements \texttt{True} and \texttt{False}. Because this notation becomes
    4.45 +unwieldy very quickly, the datatype declaration is annotated with an
    4.46 +alternative syntax: instead of \texttt{Nil} and \texttt{Cons}~$x$~$xs$ we can
    4.47 +write \index{#@{\tt[]}|bold}\texttt{[]} and
    4.48 +\texttt{$x$~\#~$xs$}\index{#@{\tt\#}|bold}. In fact, this alternative syntax
    4.49 +is the standard syntax. Thus the list \texttt{Cons True (Cons False Nil)}
    4.50 +becomes \texttt{True \# False \# []}. The annotation \ttindexbold{infixr}
    4.51 +means that \texttt{\#} associates to the right, i.e.\ the term \texttt{$x$ \#
    4.52 +  $y$ \# $z$} is read as \texttt{$x$ \# ($y$ \# $z$)} and not as \texttt{($x$
    4.53 +  \# $y$) \# $z$}.
    4.54 +
    4.55 +\begin{warn}
    4.56 +  Syntax annotations are a powerful but completely optional feature. You
    4.57 +  could drop them from theory \texttt{ToyList} and go back to the identifiers
    4.58 +  \texttt{Nil} and \texttt{Cons}. However, lists are such a central datatype
    4.59 +  that their syntax is highly customized. We recommend that novices should
    4.60 +  not use syntax annotations in their own theories.
    4.61 +\end{warn}
    4.62 +
    4.63 +Next, the functions \texttt{app} and \texttt{rev} are declared. In contrast
    4.64 +to ML, Isabelle insists on explicit declarations of all functions (keyword
    4.65 +\ttindexbold{consts}).  (Apart from the declaration-before-use restriction,
    4.66 +the order of items in a theory file is unconstrained.) Function \texttt{app}
    4.67 +is annotated with concrete syntax too. Instead of the prefix syntax
    4.68 +\texttt{app}~$xs$~$ys$ the infix $xs$~\texttt{\at}~$ys$ becomes the preferred
    4.69 +form.
    4.70 +
    4.71 +Both functions are defined recursively. The equations for \texttt{app} and
    4.72 +\texttt{rev} hardly need comments: \texttt{app} appends two lists and
    4.73 +\texttt{rev} reverses a list.  The keyword \texttt{primrec} indicates that
    4.74 +the recursion is of a particularly primitive kind where each recursive call
    4.75 +peels off a datatype constructor from one of the arguments (see
    4.76 +\S\ref{sec:datatype}).  Thus the recursion always terminates, i.e.\ the
    4.77 +function is \bfindex{total}.
    4.78 +
    4.79 +The termination requirement is absolutely essential in HOL, a logic of total
    4.80 +functions. If we were to drop it, inconsistencies could quickly arise: the
    4.81 +``definition'' $f(n) = f(n)+1$ immediately leads to $0 = 1$ by subtracting
    4.82 +$f(n)$ on both sides.
    4.83 +% However, this is a subtle issue that we cannot discuss here further.
    4.84 +
    4.85 +\begin{warn}
    4.86 +  As we have indicated, the desire for total functions is not a gratuitously
    4.87 +  imposed restriction but an essential characteristic of HOL. It is only
    4.88 +  because of totality that reasoning in HOL is comparatively easy.  More
    4.89 +  generally, the philosophy in HOL is not to allow arbitrary axioms (such as
    4.90 +  function definitions whose totality has not been proved) because they
    4.91 +  quickly lead to inconsistencies. Instead, fixed constructs for introducing
    4.92 +  types and functions are offered (such as \texttt{datatype} and
    4.93 +  \texttt{primrec}) which are guaranteed to preserve consistency.
    4.94 +\end{warn}
    4.95 +
    4.96 +A remark about syntax.  The textual definition of a theory follows a fixed
    4.97 +syntax with keywords like \texttt{datatype} and \texttt{end} (see
    4.98 +Fig.~\ref{fig:keywords} in Appendix~\ref{sec:Appendix} for a full list).
    4.99 +Embedded in this syntax are the types and formulae of HOL, whose syntax is
   4.100 +extensible, e.g.\ by new user-defined infix operators
   4.101 +(see~\ref{sec:infix-syntax}). To distinguish the two levels, everything
   4.102 +HOL-specific should be enclosed in \texttt{"}\dots\texttt{"}. The same holds
   4.103 +for identifiers that happen to be keywords, as in
   4.104 +\begin{ttbox}
   4.105 +consts "end" :: 'a list => 'a
   4.106 +\end{ttbox}
   4.107 +To lessen this burden, quotation marks around types can be dropped,
   4.108 +provided their syntax does not go beyond what is described in
   4.109 +\S\ref{sec:TypesTermsForms}. Types containing further operators, e.g.\
   4.110 +\texttt{*} for Cartesian products, need quotation marks.
   4.111 +
   4.112 +When Isabelle prints a syntax error message, it refers to the HOL syntax as
   4.113 +the \bfindex{inner syntax}.
   4.114 +
   4.115 +\section{An introductory proof}
   4.116 +\label{sec:intro-proof}
   4.117 +
   4.118 +Having defined \texttt{ToyList}, we load it with the ML command
   4.119 +\begin{ttbox}
   4.120 +use_thy "ToyList";
   4.121 +\end{ttbox}
   4.122 +and are ready to prove a few simple theorems. This will illustrate not just
   4.123 +the basic proof commands but also the typical proof process.
   4.124 +
   4.125 +\subsubsection*{Main goal: \texttt{rev(rev xs) = xs}}
   4.126 +
   4.127 +Our goal is to show that reversing a list twice produces the original
   4.128 +list. Typing
   4.129 +\begin{ttbox}
   4.130 +\input{ToyList/thm}\end{ttbox}
   4.131 +establishes a new goal to be proved in the context of the current theory,
   4.132 +which is the one we just loaded. Isabelle's response is to print the current proof state:
   4.133 +\begin{ttbox}
   4.134 +{\out Level 0}
   4.135 +{\out rev (rev xs) = xs}
   4.136 +{\out  1. rev (rev xs) = xs}
   4.137 +\end{ttbox}
   4.138 +Until we have finished a proof, the proof state always looks like this:
   4.139 +\begin{ttbox}
   4.140 +{\out Level \(i\)}
   4.141 +{\out \(G\)}
   4.142 +{\out  1. \(G@1\)}
   4.143 +{\out  \(\vdots\)}
   4.144 +{\out  \(n\). \(G@n\)}
   4.145 +\end{ttbox}
   4.146 +where \texttt{Level}~$i$ indicates that we are $i$ steps into the proof, $G$
   4.147 +is the overall goal that we are trying to prove, and the numbered lines
   4.148 +contain the subgoals $G@1$, \dots, $G@n$ that we need to prove to establish
   4.149 +$G$. At \texttt{Level 0} there is only one subgoal, which is identical with
   4.150 +the overall goal.  Normally $G$ is constant and only serves as a reminder.
   4.151 +Hence we rarely show it in this tutorial.
   4.152 +
   4.153 +Let us now get back to \texttt{rev(rev xs) = xs}. Properties of recursively
   4.154 +defined functions are best established by induction. In this case there is
   4.155 +not much choice except to induct on \texttt{xs}:
   4.156 +\begin{ttbox}
   4.157 +\input{ToyList/inductxs}\end{ttbox}
   4.158 +This tells Isabelle to perform induction on variable \texttt{xs} in subgoal
   4.159 +1. The new proof state contains two subgoals, namely the base case
   4.160 +(\texttt{Nil}) and the induction step (\texttt{Cons}):
   4.161 +\begin{ttbox}
   4.162 +{\out 1. rev (rev []) = []}
   4.163 +{\out 2. !!a list. rev (rev list) = list ==> rev (rev (a # list)) = a # list}
   4.164 +\end{ttbox}
   4.165 +The induction step is an example of the general format of a subgoal:
   4.166 +\begin{ttbox}
   4.167 +{\out  \(i\). !!\(x@1 \dots x@n\). {\it assumptions} ==> {\it conclusion}}
   4.168 +\end{ttbox}\index{==>@{\tt==>}|bold}
   4.169 +The prefix of bound variables \texttt{!!\(x@1 \dots x@n\)} can be ignored
   4.170 +most of the time, or simply treated as a list of variables local to this
   4.171 +subgoal. Their deeper significance is explained in \S\ref{sec:PCproofs}.  The
   4.172 +{\it assumptions} are the local assumptions for this subgoal and {\it
   4.173 +  conclusion} is the actual proposition to be proved. Typical proof steps
   4.174 +that add new assumptions are induction or case distinction. In our example
   4.175 +the only assumption is the induction hypothesis \texttt{rev (rev list) =
   4.176 +  list}, where \texttt{list} is a variable name chosen by Isabelle. If there
   4.177 +are multiple assumptions, they are enclosed in the bracket pair
   4.178 +\texttt{[|}\index{==>@\ttlbr|bold} and \texttt{|]}\index{==>@\ttrbr|bold}
   4.179 +and separated by semicolons.
   4.180 +
   4.181 +Let us try to solve both goals automatically:
   4.182 +\begin{ttbox}
   4.183 +\input{ToyList/autotac}\end{ttbox}
   4.184 +This command tells Isabelle to apply a proof strategy called
   4.185 +\texttt{Auto_tac} to all subgoals. Essentially, \texttt{Auto_tac} tries to
   4.186 +`simplify' the subgoals.  In our case, subgoal~1 is solved completely (thanks
   4.187 +to the equation \texttt{rev [] = []}) and disappears; the simplified version
   4.188 +of subgoal~2 becomes the new subgoal~1:
   4.189 +\begin{ttbox}\makeatother
   4.190 +{\out 1. !!a list. rev(rev list) = list ==> rev(rev list @ a # []) = a # list}
   4.191 +\end{ttbox}
   4.192 +In order to simplify this subgoal further, a lemma suggests itself.
   4.193 +
   4.194 +\subsubsection*{First lemma: \texttt{rev(xs \at~ys) = (rev ys) \at~(rev xs)}}
   4.195 +
   4.196 +We start the proof as usual:
   4.197 +\begin{ttbox}\makeatother
   4.198 +\input{ToyList/lemma1}\end{ttbox}
   4.199 +There are two variables that we could induct on: \texttt{xs} and
   4.200 +\texttt{ys}. Because \texttt{\at} is defined by recursion on
   4.201 +the first argument, \texttt{xs} is the correct one:
   4.202 +\begin{ttbox}
   4.203 +\input{ToyList/inductxs}\end{ttbox}
   4.204 +This time not even the base case is solved automatically:
   4.205 +\begin{ttbox}\makeatother
   4.206 +by(Auto_tac);
   4.207 +{\out 1. rev ys = rev ys @ []}
   4.208 +{\out 2. \dots}
   4.209 +\end{ttbox}
   4.210 +We need another lemma.
   4.211 +
   4.212 +\subsubsection*{Second lemma: \texttt{xs \at~[] = xs}}
   4.213 +
   4.214 +This time the canonical proof procedure
   4.215 +\begin{ttbox}\makeatother
   4.216 +\input{ToyList/lemma2}\input{ToyList/inductxs}\input{ToyList/autotac}\end{ttbox}
   4.217 +leads to the desired message \texttt{No subgoals!}:
   4.218 +\begin{ttbox}\makeatother
   4.219 +{\out Level 2}
   4.220 +{\out xs @ [] = xs}
   4.221 +{\out No subgoals!}
   4.222 +\end{ttbox}
   4.223 +Now we can give the lemma just proved a suitable name
   4.224 +\begin{ttbox}
   4.225 +\input{ToyList/qed2}\end{ttbox}
   4.226 +and tell Isabelle to use this lemma in all future proofs by simplification:
   4.227 +\begin{ttbox}
   4.228 +\input{ToyList/addsimps2}\end{ttbox}
   4.229 +Note that in the theorem \texttt{app_Nil2} the free variable \texttt{xs} has
   4.230 +been replaced by the unknown \texttt{?xs}, just as explained in
   4.231 +\S\ref{sec:variables}.
   4.232 +
   4.233 +Going back to the proof of the first lemma
   4.234 +\begin{ttbox}\makeatother
   4.235 +\input{ToyList/lemma1}\input{ToyList/inductxs}\input{ToyList/autotac}\end{ttbox}
   4.236 +we find that this time \texttt{Auto_tac} solves the base case, but the
   4.237 +induction step merely simplifies to
   4.238 +\begin{ttbox}\makeatother
   4.239 +{\out 1. !!a list.}
   4.240 +{\out       rev (list @ ys) = rev ys @ rev list}
   4.241 +{\out       ==> (rev ys @ rev list) @ a # [] = rev ys @ rev list @ a # []}
   4.242 +\end{ttbox}
   4.243 +Now we need to remember that \texttt{\at} associates to the right, and that
   4.244 +\texttt{\#} and \texttt{\at} have the same priority (namely the \texttt{65}
   4.245 +in the definition of \texttt{ToyList}). Thus the conclusion really is
   4.246 +\begin{ttbox}\makeatother
   4.247 +{\out     ==> (rev ys @ rev list) @ (a # []) = rev ys @ (rev list @ (a # []))}
   4.248 +\end{ttbox}
   4.249 +and the missing lemma is associativity of \texttt{\at}.
   4.250 +
   4.251 +\subsubsection*{Third lemma: \texttt{(xs \at~ys) \at~zs = xs \at~(ys \at~zs)}}
   4.252 +
   4.253 +This time the canonical proof procedure
   4.254 +\begin{ttbox}\makeatother
   4.255 +\input{ToyList/lemma3}\end{ttbox}
   4.256 +succeeds without further ado. Again we name the lemma and add it to
   4.257 +the set of lemmas used during simplification:
   4.258 +\begin{ttbox}
   4.259 +\input{ToyList/qed3}\end{ttbox}
   4.260 +Now we can go back and prove the first lemma
   4.261 +\begin{ttbox}\makeatother
   4.262 +\input{ToyList/lemma1}\input{ToyList/inductxs}\input{ToyList/autotac}\end{ttbox}
   4.263 +add it to the simplification lemmas
   4.264 +\begin{ttbox}
   4.265 +\input{ToyList/qed1}\end{ttbox}
   4.266 +and then solve our main theorem:
   4.267 +\begin{ttbox}\makeatother
   4.268 +\input{ToyList/thm}\input{ToyList/inductxs}\input{ToyList/autotac}\end{ttbox}
   4.269 +
   4.270 +\subsubsection*{Review}
   4.271 +
   4.272 +This is the end of our toy proof. It should have familiarized you with
   4.273 +\begin{itemize}
   4.274 +\item the standard theorem proving procedure:
   4.275 +state a goal; proceed with proof until a new lemma is required; prove that
   4.276 +lemma; come back to the original goal.
   4.277 +\item a specific procedure that works well for functional programs:
   4.278 +induction followed by all-out simplification via \texttt{Auto_tac}.
   4.279 +\item a basic repertoire of proof commands.
   4.280 +\end{itemize}
   4.281 +
   4.282 +
   4.283 +\section{Some helpful commands}
   4.284 +\label{sec:commands-and-hints}
   4.285 +
   4.286 +This section discusses a few basic commands for manipulating the proof state
   4.287 +and can be skipped by casual readers.
   4.288 +
   4.289 +There are two kinds of commands used during a proof: the actual proof
   4.290 +commands and auxiliary commands for examining the proof state and controlling
   4.291 +the display. Proof commands are always of the form
   4.292 +\texttt{by(\textit{tactic});}\indexbold{tactic} where \textbf{tactic} is a
   4.293 +synonym for ``theorem proving function''. Typical examples are
   4.294 +\texttt{induct_tac} and \texttt{Auto_tac} --- the suffix \texttt{_tac} is
   4.295 +merely a mnemonic. Further tactics are introduced throughout the tutorial.
   4.296 +
   4.297 +%Tactics can also be modified. For example,
   4.298 +%\begin{ttbox}
   4.299 +%by(ALLGOALS Asm_simp_tac);
   4.300 +%\end{ttbox}
   4.301 +%tells Isabelle to apply \texttt{Asm_simp_tac} to all subgoals. For more on
   4.302 +%tactics and how to combine them see~\S\ref{sec:Tactics}.
   4.303 +
   4.304 +The most useful auxiliary commands are:
   4.305 +\begin{description}
   4.306 +\item[Printing the current state]
   4.307 +Type \texttt{pr();} to redisplay the current proof state, for example when it
   4.308 +has disappeared off the screen.
   4.309 +\item[Limiting the number of subgoals]
   4.310 +Typing \texttt{prlim $k$;} tells Isabelle to print only the first $k$
   4.311 +subgoals from now on and redisplays the current proof state. This is helpful
   4.312 +when there are many subgoals.
   4.313 +\item[Undoing] Typing \texttt{undo();} undoes the effect of the last
   4.314 +tactic.
   4.315 +\item[Context switch] Every proof happens in the context of a
   4.316 +  \bfindex{current theory}. By default, this is the last theory loaded. If
   4.317 +  you want to prove a theorem in the context of a different theory
   4.318 +  \texttt{T}, you need to type \texttt{context T.thy;}\index{*context|bold}
   4.319 +  first. Of course you need to change the context again if you want to go
   4.320 +  back to your original theory.
   4.321 +\item[Displaying types] We have already mentioned the flag
   4.322 +  \ttindex{show_types} above. It can also be useful for detecting typos in
   4.323 +  formulae early on. For example, if \texttt{show_types} is set and the goal
   4.324 +  \texttt{rev(rev xs) = xs} is started, Isabelle prints the additional output
   4.325 +\begin{ttbox}
   4.326 +{\out Variables:}
   4.327 +{\out   xs :: 'a list}
   4.328 +\end{ttbox}
   4.329 +which tells us that Isabelle has correctly inferred that
   4.330 +\texttt{xs} is a variable of list type. On the other hand, had we
   4.331 +made a typo as in \texttt{rev(re xs) = xs}, the response
   4.332 +\begin{ttbox}
   4.333 +Variables:
   4.334 +  re :: 'a list => 'a list
   4.335 +  xs :: 'a list
   4.336 +\end{ttbox}
   4.337 +would have alerted us because of the unexpected variable \texttt{re}.
   4.338 +\item[(Re)loading theories]\index{loading theories}\index{reloading theories}
   4.339 +Initially you load theory \texttt{T} by typing \ttindex{use_thy}~\texttt{"T";},
   4.340 +which loads all parent theories of \texttt{T} automatically, if they are not
   4.341 +loaded already. If you modify \texttt{T.thy} or \texttt{T.ML}, you can
   4.342 +reload it by typing \texttt{use_thy~"T";} again. This time, however, only
   4.343 +\texttt{T} is reloaded. If some of \texttt{T}'s parents have changed as well,
   4.344 +type \ttindexbold{update}\texttt{();} to reload all theories that have
   4.345 +changed.
   4.346 +\end{description}
   4.347 +Further commands are found in the Reference Manual.
   4.348 +
   4.349 +
   4.350 +\section{Datatypes}
   4.351 +\label{sec:datatype}
   4.352 +
   4.353 +Inductive datatypes are part of almost every non-trivial application of HOL.
   4.354 +First we take another look at a very important example, the datatype of
   4.355 +lists, before we turn to datatypes in general. The section closes with a
   4.356 +case study.
   4.357 +
   4.358 +
   4.359 +\subsection{Lists}
   4.360 +
   4.361 +Lists are one of the essential datatypes in computing. Readers of this tutorial
   4.362 +and users of HOL need to be familiar with their basic operations. Theory
   4.363 +\texttt{ToyList} is only a small fragment of HOL's predefined theory
   4.364 +\texttt{List}\footnote{\texttt{http://www.in.tum.de/\~\relax
   4.365 +    isabelle/library/HOL/List.html}}.
   4.366 +The latter contains many further operations. For example, the functions
   4.367 +\ttindexbold{hd} (`head') and \ttindexbold{tl} (`tail') return the first
   4.368 +element and the remainder of a list. (However, pattern-matching is usually
   4.369 +preferable to \texttt{hd} and \texttt{tl}.)
   4.370 +Theory \texttt{List} also contains more syntactic sugar:
   4.371 +\texttt{[}$x@1$\texttt{,}\dots\texttt{,}$x@n$\texttt{]} abbreviates
   4.372 +$x@1$\texttt{\#}\dots\texttt{\#}$x@n$\texttt{\#[]}.
   4.373 +In the rest of the tutorial we always use HOL's predefined lists.
   4.374 +
   4.375 +
   4.376 +\subsection{The general format}
   4.377 +\label{sec:general-datatype}
   4.378 +
   4.379 +The general HOL \texttt{datatype} definition is of the form
   4.380 +\[
   4.381 +\mathtt{datatype}~(\alpha@1, \dots, \alpha@n) \, t ~=~
   4.382 +C@1~\tau@{11}~\dots~\tau@{1k@1} ~\mid~ \dots ~\mid~
   4.383 +C@m~\tau@{m1}~\dots~\tau@{mk@m}
   4.384 +\]
   4.385 +where $\alpha@i$ are type variables (the parameters), $C@i$ are distinct
   4.386 +constructor names and $\tau@{ij}$ are types; it is customary to capitalize
   4.387 +the first letter in constructor names. There are a number of
   4.388 +restrictions (such as the type should not be empty) detailed
   4.389 +elsewhere~\cite{Isa-Logics-Man}. Isabelle notifies you if you violate them.
   4.390 +
   4.391 +Laws about datatypes, such as \verb$[] ~= x#xs$ and \texttt{(x\#xs = y\#ys) =
   4.392 +  (x=y \& xs=ys)}, are used automatically during proofs by simplification.
   4.393 +The same is true for the equations in primitive recursive function
   4.394 +definitions.
   4.395 +
   4.396 +\subsection{Primitive recursion}
   4.397 +
   4.398 +Functions on datatypes are usually defined by recursion. In fact, most of the
   4.399 +time they are defined by what is called \bfindex{primitive recursion}.
   4.400 +The keyword \texttt{primrec} is followed by a list of equations
   4.401 +\[ f \, x@1 \, \dots \, (C \, y@1 \, \dots \, y@k)\, \dots \, x@n = r \]
   4.402 +such that $C$ is a constructor of the datatype $t$ and all recursive calls of
   4.403 +$f$ in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$. Thus
   4.404 +Isabelle immediately sees that $f$ terminates because one (fixed!) argument
   4.405 +becomes smaller with every recursive call. There must be exactly one equation
   4.406 +for each constructor.  Their order is immaterial.
   4.407 +A more general method for defining total recursive functions is explained in
   4.408 +\S\ref{sec:recdef}.
   4.409 +
   4.410 +\begin{exercise}
   4.411 +Given the datatype of binary trees
   4.412 +\begin{ttbox}
   4.413 +\input{Misc/tree}\end{ttbox}
   4.414 +define a function \texttt{mirror} that mirrors the structure of a binary tree
   4.415 +by swapping subtrees (recursively). Prove \texttt{mirror(mirror(t)) = t}.
   4.416 +\end{exercise}
   4.417 +
   4.418 +\subsection{\texttt{case}-expressions}
   4.419 +\label{sec:case-expressions}
   4.420 +
   4.421 +HOL also features \ttindexbold{case}-expressions for analyzing elements of a
   4.422 +datatype. For example,
   4.423 +\begin{ttbox}
   4.424 +case xs of [] => 0 | y#ys => y
   4.425 +\end{ttbox}
   4.426 +evaluates to \texttt{0} if \texttt{xs} is \texttt{[]} and to \texttt{y} if 
   4.427 +\texttt{xs} is \texttt{y\#ys}. (Since the result in both branches must be of
   4.428 +the same type, it follows that \texttt{y::nat} and hence
   4.429 +\texttt{xs::(nat)list}.)
   4.430 +
   4.431 +In general, if $e$ is a term of the datatype $t$ defined in
   4.432 +\S\ref{sec:general-datatype} above, the corresponding
   4.433 +\texttt{case}-expression analyzing $e$ is
   4.434 +\[
   4.435 +\begin{array}{rrcl}
   4.436 +\mbox{\tt case}~e~\mbox{\tt of} & C@1~x@{11}~\dots~x@{1k@1} & \To & e@1 \\
   4.437 +                           \vdots \\
   4.438 +                           \mid & C@m~x@{m1}~\dots~x@{mk@m} & \To & e@m
   4.439 +\end{array}
   4.440 +\]
   4.441 +
   4.442 +\begin{warn}
   4.443 +{\em All} constructors must be present, their order is fixed, and nested
   4.444 +patterns are not supported.  Violating these restrictions results in strange
   4.445 +error messages.
   4.446 +\end{warn}
   4.447 +\noindent
   4.448 +Nested patterns can be simulated by nested \texttt{case}-expressions: instead
   4.449 +of
   4.450 +\begin{ttbox}
   4.451 +case xs of [] => 0 | [x] => x | x#(y#zs) => y
   4.452 +\end{ttbox}
   4.453 +write
   4.454 +\begin{ttbox}
   4.455 +case xs of [] => 0 | x#ys => (case ys of [] => x | y#zs => y)
   4.456 +\end{ttbox}
   4.457 +Note that \texttt{case}-expressions should be enclosed in parentheses to
   4.458 +indicate their scope.
   4.459 +
   4.460 +\subsection{Structural induction}
   4.461 +
   4.462 +Almost all the basic laws about a datatype are applied automatically during
   4.463 +simplification. Only induction is invoked by hand via \texttt{induct_tac},
   4.464 +which works for any datatype. In some cases, induction is overkill and a case
   4.465 +distinction over all constructors of the datatype suffices. This is performed
   4.466 +by \ttindexbold{exhaust_tac}. A trivial example:
   4.467 +\begin{ttbox}
   4.468 +\input{Misc/exhaust.ML}{\out1. xs = [] ==> (case xs of [] => [] | y # ys => xs) = xs}
   4.469 +{\out2. !!a list. xs = a # list ==> (case xs of [] => [] | y # ys => xs) = xs}
   4.470 +\input{Misc/autotac.ML}\end{ttbox}
   4.471 +Note that this particular case distinction could have been automated
   4.472 +completely. See~\S\ref{sec:SimpFeatures}.
   4.473 +
   4.474 +\begin{warn}
   4.475 +  Induction is only allowed on a free variable that should not occur among
   4.476 +  the assumptions of the subgoal.  Exhaustion works for arbitrary terms.
   4.477 +\end{warn}
   4.478 +
   4.479 +\subsection{Case study: boolean expressions}
   4.480 +\label{sec:boolex}
   4.481 +
   4.482 +The aim of this case study is twofold: it shows how to model boolean
   4.483 +expressions and some algorithms for manipulating them, and it demonstrates
   4.484 +the constructs introduced above.
   4.485 +
   4.486 +\subsubsection{How can we model boolean expressions?}
   4.487 +
   4.488 +We want to represent boolean expressions built up from variables and
   4.489 +constants by negation and conjunction. The following datatype serves exactly
   4.490 +that purpose:
   4.491 +\begin{ttbox}
   4.492 +\input{Ifexpr/boolex}\end{ttbox}
   4.493 +The two constants are represented by the terms \texttt{Const~True} and
   4.494 +\texttt{Const~False}. Variables are represented by terms of the form
   4.495 +\texttt{Var}~$n$, where $n$ is a natural number (type \texttt{nat}).
   4.496 +For example, the formula $P@0 \land \neg P@1$ is represented by the term
   4.497 +\texttt{And~(Var~0)~(Neg(Var~1))}.
   4.498 +
   4.499 +\subsubsection{What is the value of boolean expressions?}
   4.500 +
   4.501 +The value of a boolean expressions depends on the value of its variables.
   4.502 +Hence the function \texttt{value} takes an additional parameter, an {\em
   4.503 +  environment} of type \texttt{nat~=>~bool}, which maps variables to their
   4.504 +values:
   4.505 +\begin{ttbox}
   4.506 +\input{Ifexpr/value}\end{ttbox}
   4.507 +
   4.508 +\subsubsection{If-expressions}
   4.509 +
   4.510 +An alternative and often more efficient (because in a certain sense
   4.511 +canonical) representation are so-called \textit{If-expressions\/} built up
   4.512 +from constants (\texttt{CIF}), variables (\texttt{VIF}) and conditionals
   4.513 +(\texttt{IF}):
   4.514 +\begin{ttbox}
   4.515 +\input{Ifexpr/ifex}\end{ttbox}
   4.516 +The evaluation if If-expressions proceeds as for \texttt{boolex}:
   4.517 +\begin{ttbox}
   4.518 +\input{Ifexpr/valif}\end{ttbox}
   4.519 +
   4.520 +\subsubsection{Transformation into and of If-expressions}
   4.521 +
   4.522 +The type \texttt{boolex} is close to the customary representation of logical
   4.523 +formulae, whereas \texttt{ifex} is designed for efficiency. Thus we need to
   4.524 +translate from \texttt{boolex} into \texttt{ifex}:
   4.525 +\begin{ttbox}
   4.526 +\input{Ifexpr/bool2if}\end{ttbox}
   4.527 +At last, we have something we can verify: that \texttt{bool2if} preserves the
   4.528 +value of its argument.
   4.529 +\begin{ttbox}
   4.530 +\input{Ifexpr/bool2if.ML}\end{ttbox}
   4.531 +The proof is canonical:
   4.532 +\begin{ttbox}
   4.533 +\input{Ifexpr/proof.ML}\end{ttbox}
   4.534 +In fact, all proofs in this case study look exactly like this. Hence we do
   4.535 +not show them below.
   4.536 +
   4.537 +More interesting is the transformation of If-expressions into a normal form
   4.538 +where the first argument of \texttt{IF} cannot be another \texttt{IF} but
   4.539 +must be a constant or variable. Such a normal form can be computed by
   4.540 +repeatedly replacing a subterm of the form \texttt{IF~(IF~b~x~y)~z~u} by
   4.541 +\texttt{IF b (IF x z u) (IF y z u)}, which has the same value. The following
   4.542 +primitive recursive functions perform this task:
   4.543 +\begin{ttbox}
   4.544 +\input{Ifexpr/normif}
   4.545 +\input{Ifexpr/norm}\end{ttbox}
   4.546 +Their interplay is a bit tricky, and we leave it to the reader to develop an
   4.547 +intuitive understanding. Fortunately, Isabelle can help us to verify that the
   4.548 +transformation preserves the value of the expression:
   4.549 +\begin{ttbox}
   4.550 +\input{Ifexpr/norm.ML}\end{ttbox}
   4.551 +The proof is canonical, provided we first show the following lemma (which
   4.552 +also helps to understand what \texttt{normif} does) and make it available
   4.553 +for simplification via \texttt{Addsimps}:
   4.554 +\begin{ttbox}
   4.555 +\input{Ifexpr/normif.ML}\end{ttbox}
   4.556 +
   4.557 +But how can we be sure that \texttt{norm} really produces a normal form in
   4.558 +the above sense? We have to prove
   4.559 +\begin{ttbox}
   4.560 +\input{Ifexpr/normal_norm.ML}\end{ttbox}
   4.561 +where \texttt{normal} expresses that an If-expression is in normal form:
   4.562 +\begin{ttbox}
   4.563 +\input{Ifexpr/normal}\end{ttbox}
   4.564 +Of course, this requires a lemma about normality of \texttt{normif}
   4.565 +\begin{ttbox}
   4.566 +\input{Ifexpr/normal_normif.ML}\end{ttbox}
   4.567 +that has to be made available for simplification via \texttt{Addsimps}.
   4.568 +
   4.569 +How does one come up with the required lemmas? Try to prove the main theorems
   4.570 +without them and study carefully what \texttt{Auto_tac} leaves unproved. This
   4.571 +has to provide the clue.
   4.572 +The necessity of universal quantification (\texttt{!t e}) in the two lemmas
   4.573 +is explained in \S\ref{sec:InductionHeuristics}
   4.574 +
   4.575 +\begin{exercise}
   4.576 +  We strengthen the definition of a {\em normal\/} If-expression as follows:
   4.577 +  the first argument of all \texttt{IF}s must be a variable. Adapt the above
   4.578 +  development to this changed requirement. (Hint: you may need to formulate
   4.579 +  some of the goals as implications (\texttt{-->}) rather than equalities
   4.580 +  (\texttt{=}).)
   4.581 +\end{exercise}
   4.582 +
   4.583 +\section{Some basic types}
   4.584 +
   4.585 +\subsection{Natural numbers}
   4.586 +
   4.587 +The type \ttindexbold{nat} of natural numbers is predefined and behaves like
   4.588 +\begin{ttbox}
   4.589 +datatype nat = 0 | Suc nat
   4.590 +\end{ttbox}
   4.591 +In particular, there are \texttt{case}-expressions, for example
   4.592 +\begin{ttbox}
   4.593 +case n of 0 => 0 | Suc m => m
   4.594 +\end{ttbox}
   4.595 +primitive recursion, for example
   4.596 +\begin{ttbox}
   4.597 +\input{Misc/natsum}\end{ttbox}
   4.598 +and induction, for example
   4.599 +\begin{ttbox}
   4.600 +\input{Misc/NatSum.ML}\ttbreak
   4.601 +{\out sum n + sum n = n * Suc n}
   4.602 +{\out No subgoals!}
   4.603 +\end{ttbox}
   4.604 +
   4.605 +The usual arithmetic operations \ttindexbold{+}, \ttindexbold{-},
   4.606 +\ttindexbold{*}, \ttindexbold{div} and \ttindexbold{mod} are predefined, as
   4.607 +are the relations \ttindexbold{<=} and \ttindexbold{<}. There is even a least
   4.608 +number operation \ttindexbold{LEAST}. For example, \texttt{(LEAST n.$\,$1 <
   4.609 +  n) = 2} (HOL does not prove this completely automatically).
   4.610 +
   4.611 +\begin{warn}
   4.612 +  The operations \ttindexbold{+}, \ttindexbold{-}, \ttindexbold{*} are
   4.613 +  overloaded, i.e.\ they are available not just for natural numbers but at
   4.614 +  other types as well (see \S\ref{sec:TypeClasses}). For example, given
   4.615 +  the goal \texttt{x+y = y+x}, there is nothing to indicate that you are
   4.616 +  talking about natural numbers. Hence Isabelle can only infer that
   4.617 +  \texttt{x} and \texttt{y} are of some arbitrary type where \texttt{+} is
   4.618 +  declared. As a consequence, you will be unable to prove the goal (although
   4.619 +  it may take you some time to realize what has happened if
   4.620 +  \texttt{show_types} is not set).  In this particular example, you need to
   4.621 +  include an explicit type constraint, for example \texttt{x+y = y+(x::nat)}.
   4.622 +  If there is enough contextual information this may not be necessary:
   4.623 +  \texttt{x+0 = x} automatically implies \texttt{x::nat}.
   4.624 +\end{warn}
   4.625 +
   4.626 +
   4.627 +\subsection{Products}
   4.628 +
   4.629 +HOL also has pairs: \texttt{($a@1$,$a@2$)} is of type \texttt{$\tau@1$ *
   4.630 +$\tau@2$} provided each $a@i$ is of type $\tau@i$. The components of a pair
   4.631 +are extracted by \texttt{fst} and \texttt{snd}:
   4.632 +\texttt{fst($x$,$y$) = $x$} and \texttt{snd($x$,$y$) = $y$}. Tuples
   4.633 +are simulated by pairs nested to the right: 
   4.634 +\texttt{($a@1$,$a@2$,$a@3$)} and \texttt{$\tau@1$ * $\tau@2$ * $\tau@3$}
   4.635 +stand for \texttt{($a@1$,($a@2$,$a@3$))} and \texttt{$\tau@1$ * ($\tau@2$ *
   4.636 +$\tau@3$)}. Therefore \texttt{fst(snd($a@1$,$a@2$,$a@3$)) = $a@2$}.
   4.637 +
   4.638 +It is possible to use (nested) tuples as patterns in abstractions, for
   4.639 +example \texttt{\%(x,y,z).x+y+z} and \texttt{\%((x,y),z).x+y+z}.
   4.640 +
   4.641 +In addition to explicit $\lambda$-abstractions, tuple patterns can be used in
   4.642 +most variable binding constructs. Typical examples are
   4.643 +\begin{ttbox}
   4.644 +let (x,y) = f z in (y,x)
   4.645 +
   4.646 +case xs of [] => 0 | (x,y)\#zs => x+y
   4.647 +\end{ttbox}
   4.648 +Further important examples are quantifiers and sets.
   4.649 +
   4.650 +\begin{warn}
   4.651 +Abstraction over pairs and tuples is merely a convenient shorthand for a more
   4.652 +complex internal representation.  Thus the internal and external form of a
   4.653 +term may differ, which can affect proofs. If you want to avoid this
   4.654 +complication, use \texttt{fst} and \texttt{snd}, i.e.\ write
   4.655 +\texttt{\%p.~fst p + snd p} instead of \texttt{\%(x,y).~x + y}.
   4.656 +See~\S\ref{} for theorem proving with tuple patterns.
   4.657 +\end{warn}
   4.658 +
   4.659 +
   4.660 +\section{Definitions}
   4.661 +\label{sec:Definitions}
   4.662 +
   4.663 +A definition is simply an abbreviation, i.e.\ a new name for an existing
   4.664 +construction. In particular, definitions cannot be recursive. Isabelle offers
   4.665 +definitions on the level of types and terms. Those on the type level are
   4.666 +called type synonyms, those on the term level are called (constant)
   4.667 +definitions.
   4.668 +
   4.669 +
   4.670 +\subsection{Type synonyms}
   4.671 +\indexbold{type synonyms}
   4.672 +
   4.673 +Type synonyms are similar to those found in ML. Their syntax is fairly self
   4.674 +explanatory:
   4.675 +\begin{ttbox}
   4.676 +\input{Misc/types}\end{ttbox}\indexbold{*types}
   4.677 +The synonym \texttt{alist} shows that in general the type on the right-hand
   4.678 +side needs to be enclosed in double quotation marks
   4.679 +(see the end of~\S\ref{sec:intro-theory}).
   4.680 +
   4.681 +Internally all synonyms are fully expanded.  As a consequence Isabelle's
   4.682 +output never contains synonyms.  Their main purpose is to improve the
   4.683 +readability of theory definitions.  Synonyms can be used just like any other
   4.684 +type:
   4.685 +\begin{ttbox}
   4.686 +\input{Misc/consts}\end{ttbox}
   4.687 +
   4.688 +\subsection{Constant definitions}
   4.689 +\label{sec:ConstDefinitions}
   4.690 +
   4.691 +The above constants \texttt{nand} and \texttt{exor} are non-recursive and can
   4.692 +therefore be defined directly by
   4.693 +\begin{ttbox}
   4.694 +\input{Misc/defs}\end{ttbox}\indexbold{*defs}
   4.695 +where \texttt{defs} is a keyword and \texttt{nand_def} and \texttt{exor_def}
   4.696 +are arbitrary user-supplied names.
   4.697 +The symbol \texttt{==}\index{==>@{\tt==}|bold} is a special form of equality
   4.698 +that should only be used in constant definitions.
   4.699 +Declarations and definitions can also be merged
   4.700 +\begin{ttbox}
   4.701 +\input{Misc/constdefs}\end{ttbox}\indexbold{*constdefs}
   4.702 +in which case the default name of each definition is $f$\texttt{_def}, where
   4.703 +$f$ is the name of the defined constant.
   4.704 +
   4.705 +Note that pattern-matching is not allowed, i.e.\ each definition must be of
   4.706 +the form $f\,x@1\,\dots\,x@n$\texttt{~==~}$t$.
   4.707 +
   4.708 +Section~\S\ref{sec:Simplification} explains how definitions are used
   4.709 +in proofs.
   4.710 +
   4.711 +\begin{warn}
   4.712 +A common mistake when writing definitions is to introduce extra free variables
   4.713 +on the right-hand side as in the following fictitious definition:
   4.714 +\begin{ttbox}
   4.715 +defs  prime_def "prime(p) == (m divides p) --> (m=1 | m=p)"
   4.716 +\end{ttbox}
   4.717 +Isabelle rejects this `definition' because of the extra {\tt m} on the
   4.718 +right-hand side, which would introduce an inconsistency.  What you should have
   4.719 +written is
   4.720 +\begin{ttbox}
   4.721 +defs  prime_def "prime(p) == !m. (m divides p) --> (m=1 | m=p)"
   4.722 +\end{ttbox}
   4.723 +\end{warn}
   4.724 +
   4.725 +
   4.726 +
   4.727 +
   4.728 +\chapter{More Functional Programming}
   4.729 +
   4.730 +The purpose of this chapter is to deepen the reader's understanding of the
   4.731 +concepts encountered so far and to introduce an advanced method for defining
   4.732 +recursive functions. The first two sections give a structured presentation of
   4.733 +theorem proving by simplification (\S\ref{sec:Simplification}) and
   4.734 +discuss important heuristics for induction (\S\ref{sec:InductionHeuristics}). They
   4.735 +can be skipped by readers less interested in proofs. They are followed by a
   4.736 +case study, a compiler for expressions (\S\ref{sec:ExprCompiler}).
   4.737 +Finally we present a very general method for defining recursive functions
   4.738 +that goes well beyond what \texttt{primrec} allows (\S\ref{sec:recdef}).
   4.739 +
   4.740 +
   4.741 +\section{Simplification}
   4.742 +\label{sec:Simplification}
   4.743 +
   4.744 +So far we have proved our theorems by \texttt{Auto_tac}, which
   4.745 +`simplifies' all subgoals. In fact, \texttt{Auto_tac} can do much more than
   4.746 +that, except that it did not need to so far. However, when you go beyond toy
   4.747 +examples, you need to understand the ingredients of \texttt{Auto_tac}.
   4.748 +This section covers the tactic that \texttt{Auto_tac} always applies first,
   4.749 +namely simplification.
   4.750 +
   4.751 +Simplification is one of the central theorem proving tools in Isabelle and
   4.752 +many other systems. The tool itself is called the \bfindex{simplifier}. The
   4.753 +purpose of this section is twofold: to introduce the many features of the
   4.754 +simplifier (\S\ref{sec:SimpFeatures}) and to explain a little bit how the
   4.755 +simplifier works (\S\ref{sec:SimpHow}).  Anybody intending to use HOL should
   4.756 +read \S\ref{sec:SimpFeatures}, and the serious student should read
   4.757 +\S\ref{sec:SimpHow} as well in order to understand what happened in case
   4.758 +things do not simplify as expected.
   4.759 +
   4.760 +
   4.761 +\subsection{Using the simplifier}
   4.762 +\label{sec:SimpFeatures}
   4.763 +
   4.764 +In its most basic form, simplification means repeated application of
   4.765 +equations from left to right. For example, taking the rules for \texttt{\at}
   4.766 +and applying them to the term \texttt{[0,1] \at\ []} results in a sequence of
   4.767 +simplification steps:
   4.768 +\begin{ttbox}\makeatother
   4.769 +(0#1#[]) @ []  \(\leadsto\)  0#((1#[]) @ [])  \(\leadsto\)  0#(1#([] @ []))  \(\leadsto\)  0#1#[]
   4.770 +\end{ttbox}
   4.771 +This is also known as {\em term rewriting} and the equations are referred
   4.772 +to as {\em rewrite rules}. This is more honest than `simplification' because
   4.773 +the terms do not necessarily become simpler in the process.
   4.774 +
   4.775 +\subsubsection{Simpsets}
   4.776 +
   4.777 +To facilitate simplification, each theory has an associated set of
   4.778 +simplification rules, known as a \bfindex{simpset}. Within a theory,
   4.779 +proofs by simplification refer to the associated simpset by default.
   4.780 +The simpset of a theory is built up as follows: starting with the union of
   4.781 +the simpsets of the parent theories, each occurrence of a \texttt{datatype}
   4.782 +or \texttt{primrec} construct augments the simpset. Explicit definitions are
   4.783 +not added automatically. Users can add new theorems via \texttt{Addsimps} and
   4.784 +delete them again later by \texttt{Delsimps}.
   4.785 +
   4.786 +You may augment a simpset not just by equations but by pretty much any
   4.787 +theorem. The simplifier will try to make sense of it.  For example, a theorem
   4.788 +\verb$~$$P$ is automatically turned into \texttt{$P$ = False}. The details are
   4.789 +explained in \S\ref{sec:SimpHow}.
   4.790 +
   4.791 +As a rule of thumb, rewrite rules that really simplify a term (like
   4.792 +\texttt{xs \at\ [] = xs} and \texttt{rev(rev xs) = xs}) should be added to the
   4.793 +current simpset right after they have been proved.  Those of a more specific
   4.794 +nature (e.g.\ distributivity laws, which alter the structure of terms
   4.795 +considerably) should only be added for specific proofs and deleted again
   4.796 +afterwards.  Conversely, it may also happen that a generally useful rule
   4.797 +needs to be removed for a certain proof and is added again afterwards.  The
   4.798 +need of frequent temporary additions or deletions may indicate a badly
   4.799 +designed simpset.
   4.800 +\begin{warn}
   4.801 +  Simplification may not terminate, for example if both $f(x) = g(x)$ and
   4.802 +  $g(x) = f(x)$ are in the simpset. It is the user's responsibility not to
   4.803 +  include rules that can lead to nontermination, either on their own or in
   4.804 +  combination with other rules.
   4.805 +\end{warn}
   4.806 +
   4.807 +\subsubsection{Simplification tactics}
   4.808 +
   4.809 +There are four main simplification tactics:
   4.810 +\begin{ttdescription}
   4.811 +\item[\ttindexbold{Simp_tac} $i$] simplifies the conclusion of subgoal~$i$
   4.812 +  using the theory's simpset.  It may solve the subgoal completely if it has
   4.813 +  become trivial. For example:
   4.814 +\begin{ttbox}\makeatother
   4.815 +{\out 1. [] @ [] = []}
   4.816 +by(Simp_tac 1);
   4.817 +{\out No subgoals!}
   4.818 +\end{ttbox}
   4.819 +
   4.820 +\item[\ttindexbold{Asm_simp_tac}]
   4.821 +  is like \verb$Simp_tac$, but extracts additional rewrite rules from
   4.822 +  the assumptions of the subgoal. For example, it solves
   4.823 +\begin{ttbox}\makeatother
   4.824 +{\out 1. xs = [] ==> xs @ xs = xs}
   4.825 +\end{ttbox}
   4.826 +which \texttt{Simp_tac} does not do.
   4.827 +  
   4.828 +\item[\ttindexbold{Full_simp_tac}] is like \verb$Simp_tac$, but also
   4.829 +  simplifies the assumptions (without using the assumptions to
   4.830 +  simplify each other or the actual goal).
   4.831 +
   4.832 +\item[\ttindexbold{Asm_full_simp_tac}] is like \verb$Asm_simp_tac$,
   4.833 +  but also simplifies the assumptions. In particular, assumptions can
   4.834 +  simplify each other. For example:
   4.835 +\begin{ttbox}\makeatother
   4.836 +\out{ 1. [| xs @ zs = ys @ xs; [] @ xs = [] @ [] |] ==> ys = zs}
   4.837 +by(Asm_full_simp_tac 1);
   4.838 +{\out No subgoals!}
   4.839 +\end{ttbox}
   4.840 +The second assumption simplifies to \texttt{xs = []}, which in turn
   4.841 +simplifies the first assumption to \texttt{zs = ys}, thus reducing the
   4.842 +conclusion to \texttt{ys = ys} and hence to \texttt{True}.
   4.843 +(See also the paragraph on tracing below.)
   4.844 +\end{ttdescription}
   4.845 +\texttt{Asm_full_simp_tac} is the most powerful of this quartet of
   4.846 +tactics. In fact, \texttt{Auto_tac} starts by applying
   4.847 +\texttt{Asm_full_simp_tac} to all subgoals. The only reason for the existence
   4.848 +of the other three tactics is that sometimes one wants to limit the amount of
   4.849 +simplification, for example to avoid nontermination:
   4.850 +\begin{ttbox}\makeatother
   4.851 +{\out  1. ! x. f x = g (f (g x)) ==> f [] = f [] @ []}
   4.852 +\end{ttbox}
   4.853 +is solved by \texttt{Simp_tac}, but \texttt{Asm_simp_tac} and
   4.854 +\texttt{Asm_full_simp_tac} loop because the rewrite rule \texttt{f x = g(f(g
   4.855 +x))} extracted from the assumption does not terminate.  Isabelle notices
   4.856 +certain simple forms of nontermination, but not this one.
   4.857 + 
   4.858 +\subsubsection{Modifying simpsets locally}
   4.859 +
   4.860 +If a certain theorem is merely needed in one proof by simplification, the
   4.861 +pattern
   4.862 +\begin{ttbox}
   4.863 +Addsimps [\(rare_theorem\)];
   4.864 +by(Simp_tac 1);
   4.865 +Delsimps [\(rare_theorem\)];
   4.866 +\end{ttbox}
   4.867 +is awkward. Therefore there are lower-case versions of the simplification
   4.868 +tactics (\ttindexbold{simp_tac}, \ttindexbold{asm_simp_tac},
   4.869 +\ttindexbold{full_simp_tac}, \ttindexbold{asm_full_simp_tac}) and of the
   4.870 +simpset modifiers (\ttindexbold{addsimps}, \ttindexbold{delsimps})
   4.871 +that do not access or modify the implicit simpset but explicitly take a
   4.872 +simpset as an argument. For example, the above three lines become
   4.873 +\begin{ttbox}
   4.874 +by(simp_tac (simpset() addsimps [\(rare_theorem\)]) 1);
   4.875 +\end{ttbox}
   4.876 +where the result of the function call \texttt{simpset()} is the simpset of
   4.877 +the current theory and \texttt{addsimps} is an infix function. The implicit
   4.878 +simpset is read once but not modified.
   4.879 +This is far preferable to pairs of \texttt{Addsimps} and \texttt{Delsimps}.
   4.880 +Local modifications can be stacked as in
   4.881 +\begin{ttbox}
   4.882 +by(simp_tac (simpset() addsimps [\(rare_theorem\)] delsimps [\(some_thm\)]) 1);
   4.883 +\end{ttbox}
   4.884 +
   4.885 +\subsubsection{Rewriting with definitions}
   4.886 +
   4.887 +Constant definitions (\S\ref{sec:ConstDefinitions}) are not automatically
   4.888 +included in the simpset of a theory. Hence such definitions are not expanded
   4.889 +automatically either, just as it should be: definitions are introduced for
   4.890 +the purpose of abbreviating complex concepts. Of course we need to expand the
   4.891 +definitions initially to derive enough lemmas that characterize the concept
   4.892 +sufficiently for us to forget the original definition completely. For
   4.893 +example, given the theory
   4.894 +\begin{ttbox}
   4.895 +\input{Misc/Exor.thy}\end{ttbox}
   4.896 +we may want to prove \verb$exor A (~A)$. Instead of \texttt{Goal} we use
   4.897 +\begin{ttbox}
   4.898 +\input{Misc/exorgoal.ML}\end{ttbox}
   4.899 +which tells Isabelle to expand the definition of \texttt{exor}---the first
   4.900 +argument of \texttt{Goalw} can be a list of definitions---in the initial goal:
   4.901 +\begin{ttbox}
   4.902 +{\out exor A (~ A)}
   4.903 +{\out  1. A & ~ ~ A | ~ A & ~ A}
   4.904 +\end{ttbox}
   4.905 +In this simple example, the goal is proved by \texttt{Simp_tac}.
   4.906 +Of course the resulting theorem is insufficient to characterize \texttt{exor}
   4.907 +completely.
   4.908 +
   4.909 +In case we want to expand a definition in the middle of a proof, we can
   4.910 +simply add the definition locally to the simpset:
   4.911 +\begin{ttbox}
   4.912 +\input{Misc/exorproof.ML}\end{ttbox}
   4.913 +You should normally not add the definition permanently to the simpset
   4.914 +using \texttt{Addsimps} because this defeats the whole purpose of an
   4.915 +abbreviation.
   4.916 +
   4.917 +\begin{warn}
   4.918 +If you have defined $f\,x\,y$\texttt{~==~}$t$ then you can only expand
   4.919 +occurrences of $f$ with at least two arguments. Thus it is safer to define
   4.920 +$f$\texttt{~==~\%$x\,y$.}$\;t$.
   4.921 +\end{warn}
   4.922 +
   4.923 +\subsubsection{Simplifying \texttt{let}-expressions}
   4.924 +
   4.925 +Proving a goal containing \ttindex{let}-expressions invariably requires the
   4.926 +\texttt{let}-constructs to be expanded at some point. Since
   4.927 +\texttt{let}-\texttt{in} is just syntactic sugar for a defined constant
   4.928 +(called \texttt{Let}), expanding \texttt{let}-constructs means rewriting with
   4.929 +\texttt{Let_def}:
   4.930 +%context List.thy;
   4.931 +%Goal "(let xs = [] in xs@xs) = ys";
   4.932 +\begin{ttbox}\makeatother
   4.933 +{\out  1. (let xs = [] in xs @ xs) = ys}
   4.934 +by(simp_tac (simpset() addsimps [Let_def]) 1);
   4.935 +{\out  1. [] = ys}
   4.936 +\end{ttbox}
   4.937 +If, in a particular context, there is no danger of a combinatorial explosion
   4.938 +of nested \texttt{let}s one could even add \texttt{Let_def} permanently via
   4.939 +\texttt{Addsimps}.
   4.940 +
   4.941 +\subsubsection{Conditional equations}
   4.942 +
   4.943 +So far all examples of rewrite rules were equations. The simplifier also
   4.944 +accepts {\em conditional\/} equations, for example
   4.945 +\begin{ttbox}
   4.946 +xs ~= []  ==>  hd xs # tl xs = xs \hfill \((*)\)
   4.947 +\end{ttbox}
   4.948 +(which is proved by \texttt{exhaust_tac} on \texttt{xs} followed by
   4.949 +\texttt{Asm_full_simp_tac} twice). Assuming that this theorem together with
   4.950 +%\begin{ttbox}\makeatother
   4.951 +\texttt{(rev xs = []) = (xs = [])}
   4.952 +%\end{ttbox}
   4.953 +are part of the simpset, the subgoal
   4.954 +\begin{ttbox}\makeatother
   4.955 +{\out 1. xs ~= [] ==> hd(rev xs) # tl(rev xs) = rev xs}
   4.956 +\end{ttbox}
   4.957 +is proved by simplification:
   4.958 +the conditional equation $(*)$ above
   4.959 +can simplify \texttt{hd(rev~xs)~\#~tl(rev~xs)} to \texttt{rev xs}
   4.960 +because the corresponding precondition \verb$rev xs ~= []$ simplifies to
   4.961 +\verb$xs ~= []$, which is exactly the local assumption of the subgoal.
   4.962 +
   4.963 +
   4.964 +\subsubsection{Automatic case splits}
   4.965 +
   4.966 +Goals containing \ttindex{if}-expressions are usually proved by case
   4.967 +distinction on the condition of the \texttt{if}. For example the goal
   4.968 +\begin{ttbox}
   4.969 +{\out 1. ! xs. if xs = [] then rev xs = [] else rev xs ~= []}
   4.970 +\end{ttbox}
   4.971 +can be split into
   4.972 +\begin{ttbox}
   4.973 +{\out 1. ! xs. (xs = [] --> rev xs = []) \& (xs ~= [] --> rev xs ~= [])}
   4.974 +\end{ttbox}
   4.975 +by typing
   4.976 +\begin{ttbox}
   4.977 +\input{Misc/splitif.ML}\end{ttbox}
   4.978 +Because this is almost always the right proof strategy, the simplifier
   4.979 +performs case-splitting on \texttt{if}s automatically. Try \texttt{Simp_tac}
   4.980 +on the initial goal above.
   4.981 +
   4.982 +This splitting idea generalizes from \texttt{if} to \ttindex{case}:
   4.983 +\begin{ttbox}\makeatother
   4.984 +{\out 1. (case xs of [] => zs | y#ys => y#(ys@zs)) = xs@zs}
   4.985 +\end{ttbox}
   4.986 +becomes
   4.987 +\begin{ttbox}\makeatother
   4.988 +{\out 1. (xs = [] --> zs = xs @ zs) &}
   4.989 +{\out    (! a list. xs = a # list --> a # list @ zs = xs @ zs)}
   4.990 +\end{ttbox}
   4.991 +by typing
   4.992 +\begin{ttbox}
   4.993 +\input{Misc/splitlist.ML}\end{ttbox}
   4.994 +In contrast to \texttt{if}-expressions, the simplifier does not split
   4.995 +\texttt{case}-expressions by default because this can lead to nontermination
   4.996 +in case of recursive datatypes.
   4.997 +Nevertheless the simplifier can be instructed to perform \texttt{case}-splits
   4.998 +by adding the appropriate rule to the simpset:
   4.999 +\begin{ttbox}
  4.1000 +by(simp_tac (simpset() addsplits [split_list_case]) 1);
  4.1001 +\end{ttbox}\indexbold{*addsplits}
  4.1002 +solves the initial goal outright, which \texttt{Simp_tac} alone will not do.
  4.1003 +
  4.1004 +In general, every datatype $t$ comes with a rule
  4.1005 +\texttt{$t$.split} that can be added to the simpset either
  4.1006 +locally via \texttt{addsplits} (see above), or permanently via
  4.1007 +\begin{ttbox}
  4.1008 +Addsplits [\(t\).split];
  4.1009 +\end{ttbox}\indexbold{*Addsplits}
  4.1010 +Split-rules can be removed globally via \ttindexbold{Delsplits} and locally
  4.1011 +via \ttindexbold{delsplits} as, for example, in
  4.1012 +\begin{ttbox}
  4.1013 +by(simp_tac (simpset() addsimps [\(\dots\)] delsplits [split_if]) 1);
  4.1014 +\end{ttbox}
  4.1015 +
  4.1016 +
  4.1017 +\subsubsection{Permutative rewrite rules}
  4.1018 +
  4.1019 +A rewrite rule is {\bf permutative} if the left-hand side and right-hand side
  4.1020 +are the same up to renaming of variables.  The most common permutative rule
  4.1021 +is commutativity: $x+y = y+x$.  Another example is $(x-y)-z = (x-z)-y$.  Such
  4.1022 +rules are problematic because once they apply, they can be used forever.
  4.1023 +The simplifier is aware of this danger and treats permutative rules
  4.1024 +separately. For details see~\cite{Isa-Ref-Man}.
  4.1025 +
  4.1026 +\subsubsection{Tracing}
  4.1027 +\indexbold{tracing the simplifier}
  4.1028 +
  4.1029 +Using the simplifier effectively may take a bit of experimentation.  Set the
  4.1030 +\verb$trace_simp$ flag to get a better idea of what is going on:
  4.1031 +\begin{ttbox}\makeatother
  4.1032 +{\out  1. rev [x] = []}
  4.1033 +\ttbreak
  4.1034 +set trace_simp;
  4.1035 +by(Simp_tac 1);
  4.1036 +\ttbreak\makeatother
  4.1037 +{\out Applying instance of rewrite rule:}
  4.1038 +{\out rev (?x # ?xs) == rev ?xs @ [?x]}
  4.1039 +{\out Rewriting:}
  4.1040 +{\out rev [x] == rev [] @ [x]}
  4.1041 +\ttbreak
  4.1042 +{\out Applying instance of rewrite rule:}
  4.1043 +{\out rev [] == []}
  4.1044 +{\out Rewriting:}
  4.1045 +{\out rev [] == []}
  4.1046 +\ttbreak\makeatother
  4.1047 +{\out Applying instance of rewrite rule:}
  4.1048 +{\out [] @ ?y == ?y}
  4.1049 +{\out Rewriting:}
  4.1050 +{\out [] @ [x] == [x]}
  4.1051 +\ttbreak
  4.1052 +{\out Applying instance of rewrite rule:}
  4.1053 +{\out ?x # ?t = ?t == False}
  4.1054 +{\out Rewriting:}
  4.1055 +{\out [x] = [] == False}
  4.1056 +\ttbreak
  4.1057 +{\out Level 1}
  4.1058 +{\out rev [x] = []}
  4.1059 +{\out  1. False}
  4.1060 +\end{ttbox}
  4.1061 +In more complicated cases, the trace can be enormous, especially since
  4.1062 +invocations of the simplifier are often nested (e.g.\ when solving conditions
  4.1063 +of rewrite rules).
  4.1064 +
  4.1065 +\subsection{How it works}
  4.1066 +\label{sec:SimpHow}
  4.1067 +
  4.1068 +\subsubsection{Higher-order patterns}
  4.1069 +
  4.1070 +\subsubsection{Local assumptions}
  4.1071 +
  4.1072 +\subsubsection{The preprocessor}
  4.1073 +
  4.1074 +\section{Induction heuristics}
  4.1075 +\label{sec:InductionHeuristics}
  4.1076 +
  4.1077 +The purpose of this section is to illustrate some simple heuristics for
  4.1078 +inductive proofs. The first one we have already mentioned in our initial
  4.1079 +example:
  4.1080 +\begin{quote}
  4.1081 +{\em 1. Theorems about recursive functions are proved by induction.}
  4.1082 +\end{quote}
  4.1083 +In case the function has more than one argument
  4.1084 +\begin{quote}
  4.1085 +{\em 2. Do induction on argument number $i$ if the function is defined by
  4.1086 +recursion in argument number $i$.}
  4.1087 +\end{quote}
  4.1088 +When we look at the proof of
  4.1089 +\begin{ttbox}\makeatother
  4.1090 +(xs @ ys) @ zs = xs @ (ys @ zs)
  4.1091 +\end{ttbox}
  4.1092 +in \S\ref{sec:intro-proof} we find (a) \texttt{\at} is recursive in the first
  4.1093 +argument, (b) \texttt{xs} occurs only as the first argument of \texttt{\at},
  4.1094 +and (c) both \texttt{ys} and \texttt{zs} occur at least once as the second
  4.1095 +argument of \texttt{\at}. Hence it is natural to perform induction on
  4.1096 +\texttt{xs}.
  4.1097 +
  4.1098 +The key heuristic, and the main point of this section, is to
  4.1099 +generalize the goal before induction. The reason is simple: if the goal is
  4.1100 +too specific, the induction hypothesis is too weak to allow the induction
  4.1101 +step to go through. Let us now illustrate the idea with an example.
  4.1102 +
  4.1103 +We define a tail-recursive version of list-reversal,
  4.1104 +i.e.\ one that can be compiled into a loop:
  4.1105 +\begin{ttbox}
  4.1106 +\input{Misc/Itrev.thy}\end{ttbox}
  4.1107 +The behaviour of \texttt{itrev} is simple: it reverses its first argument by
  4.1108 +stacking its elements onto the second argument, and returning that second
  4.1109 +argument when the first one becomes empty.
  4.1110 +We need to show that \texttt{itrev} does indeed reverse its first argument
  4.1111 +provided the second one is empty:
  4.1112 +\begin{ttbox}
  4.1113 +\input{Misc/itrev1.ML}\end{ttbox}
  4.1114 +There is no choice as to the induction variable, and we immediately simplify:
  4.1115 +\begin{ttbox}
  4.1116 +\input{Misc/induct_auto.ML}\ttbreak\makeatother
  4.1117 +{\out1. !!a list. itrev list [] = rev list\(\;\)==> itrev list [a] = rev list @ [a]}
  4.1118 +\end{ttbox}
  4.1119 +Just as predicted above, the overall goal, and hence the induction
  4.1120 +hypothesis, is too weak to solve the induction step because of the fixed
  4.1121 +\texttt{[]}. The corresponding heuristic:
  4.1122 +\begin{quote}
  4.1123 +{\em 3. Generalize goals for induction by replacing constants by variables.}
  4.1124 +\end{quote}
  4.1125 +Of course one cannot do this na\"{\i}vely: \texttt{itrev xs ys = rev xs} is
  4.1126 +just not true --- the correct generalization is
  4.1127 +\begin{ttbox}\makeatother
  4.1128 +\input{Misc/itrev2.ML}\end{ttbox}
  4.1129 +If \texttt{ys} is replaced by \texttt{[]}, the right-hand side simplifies to
  4.1130 +\texttt{rev xs}, just as required.
  4.1131 +
  4.1132 +In this particular instance it is easy to guess the right generalization,
  4.1133 +but in more complex situations a good deal of creativity is needed. This is
  4.1134 +the main source of complications in inductive proofs.
  4.1135 +
  4.1136 +Although we now have two variables, only \texttt{xs} is suitable for
  4.1137 +induction, and we repeat our above proof attempt. Unfortunately, we are still
  4.1138 +not there:
  4.1139 +\begin{ttbox}\makeatother
  4.1140 +{\out 1. !!a list.}
  4.1141 +{\out       itrev list ys = rev list @ ys}
  4.1142 +{\out       ==> itrev list (a # ys) = rev list @ a # ys}
  4.1143 +\end{ttbox}
  4.1144 +The induction hypothesis is still too weak, but this time it takes no
  4.1145 +intuition to generalize: the problem is that \texttt{ys} is fixed throughout
  4.1146 +the subgoal, but the induction hypothesis needs to be applied with
  4.1147 +\texttt{a \# ys} instead of \texttt{ys}. Hence we prove the theorem
  4.1148 +for all \texttt{ys} instead of a fixed one:
  4.1149 +\begin{ttbox}\makeatother
  4.1150 +\input{Misc/itrev3.ML}\end{ttbox}
  4.1151 +This time induction on \texttt{xs} followed by simplification succeeds. This
  4.1152 +leads to another heuristic for generalization:
  4.1153 +\begin{quote}
  4.1154 +{\em 4. Generalize goals for induction by universally quantifying all free
  4.1155 +variables {\em(except the induction variable itself!)}.}
  4.1156 +\end{quote}
  4.1157 +This prevents trivial failures like the above and does not change the
  4.1158 +provability of the goal. Because it is not always required, and may even
  4.1159 +complicate matters in some cases, this heuristic is often not
  4.1160 +applied blindly.
  4.1161 +
  4.1162 +A final point worth mentioning is the orientation of the equation we just
  4.1163 +proved: the more complex notion (\texttt{itrev}) is on the left-hand
  4.1164 +side, the simpler \texttt{rev} on the right-hand side. This constitutes
  4.1165 +another, albeit weak heuristic that is not restricted to induction:
  4.1166 +\begin{quote}
  4.1167 +  {\em 5. The right-hand side of an equation should (in some sense) be
  4.1168 +    simpler than the left-hand side.}
  4.1169 +\end{quote}
  4.1170 +The heuristic is tricky to apply because it is not obvious that
  4.1171 +\texttt{rev xs \at\ ys} is simpler than \texttt{itrev xs ys}. But see what
  4.1172 +happens if you try to prove the symmetric equation!
  4.1173 +
  4.1174 +
  4.1175 +\section{Case study: compiling expressions}
  4.1176 +\label{sec:ExprCompiler}
  4.1177 +
  4.1178 +The task is to develop a compiler from a generic type of expressions (built
  4.1179 +up from variables, constants and binary operations) to a stack machine.  This
  4.1180 +generic type of expressions is a generalization of the boolean expressions in
  4.1181 +\S\ref{sec:boolex}.  This time we do not commit ourselves to a particular
  4.1182 +type of variables or values but make them type parameters.  Neither is there
  4.1183 +a fixed set of binary operations: instead the expression contains the
  4.1184 +appropriate function itself.
  4.1185 +\begin{ttbox}
  4.1186 +\input{CodeGen/expr}\end{ttbox}
  4.1187 +The three constructors represent constants, variables and the combination of
  4.1188 +two subexpressions with a binary operation.
  4.1189 +
  4.1190 +The value of an expression w.r.t.\ an environment that maps variables to
  4.1191 +values is easily defined:
  4.1192 +\begin{ttbox}
  4.1193 +\input{CodeGen/value}\end{ttbox}
  4.1194 +
  4.1195 +The stack machine has three instructions: load a constant value onto the
  4.1196 +stack, load the contents of a certain address onto the stack, and apply a
  4.1197 +binary operation to the two topmost elements of the stack, replacing them by
  4.1198 +the result. As for \texttt{expr}, addresses and values are type parameters:
  4.1199 +\begin{ttbox}
  4.1200 +\input{CodeGen/instr}\end{ttbox}
  4.1201 +
  4.1202 +The execution of the stack machine is modelled by a function \texttt{exec}
  4.1203 +that takes a store (modelled as a function from addresses to values, just
  4.1204 +like the environment for evaluating expressions), a stack (modelled as a
  4.1205 +list) of values and a list of instructions and returns the stack at the end
  4.1206 +of the execution --- the store remains unchanged:
  4.1207 +\begin{ttbox}
  4.1208 +\input{CodeGen/exec}\end{ttbox}
  4.1209 +Recall that \texttt{hd} and \texttt{tl}
  4.1210 +return the first element and the remainder of a list.
  4.1211 +
  4.1212 +Because all functions are total, \texttt{hd} is defined even for the empty
  4.1213 +list, although we do not know what the result is. Thus our model of the
  4.1214 +machine always terminates properly, although the above definition does not
  4.1215 +tell us much about the result in situations where \texttt{Apply} was executed
  4.1216 +with fewer than two elements on the stack.
  4.1217 +
  4.1218 +The compiler is a function from expressions to a list of instructions. Its
  4.1219 +definition is pretty much obvious:
  4.1220 +\begin{ttbox}\makeatother
  4.1221 +\input{CodeGen/comp}\end{ttbox}
  4.1222 +
  4.1223 +Now we have to prove the correctness of the compiler, i.e.\ that the
  4.1224 +execution of a compiled expression results in the value of the expression:
  4.1225 +\begin{ttbox}
  4.1226 +exec s [] (comp e) = [value s e]
  4.1227 +\end{ttbox}
  4.1228 +This is generalized to
  4.1229 +\begin{ttbox}
  4.1230 +\input{CodeGen/goal2.ML}\end{ttbox}
  4.1231 +and proved by induction on \texttt{e} followed by simplification, once we
  4.1232 +have the following lemma about executing the concatenation of two instruction
  4.1233 +sequences:
  4.1234 +\begin{ttbox}\makeatother
  4.1235 +\input{CodeGen/goal2.ML}\end{ttbox}
  4.1236 +This requires induction on \texttt{xs} and ordinary simplification for the
  4.1237 +base cases. In the induction step, simplification leaves us with a formula
  4.1238 +that contains two \texttt{case}-expressions over instructions. Thus we add
  4.1239 +automatic case splitting as well, which finishes the proof:
  4.1240 +\begin{ttbox}
  4.1241 +\input{CodeGen/simpsplit.ML}\end{ttbox}
  4.1242 +
  4.1243 +We could now go back and prove \texttt{exec s [] (comp e) = [value s e]}
  4.1244 +merely by simplification with the generalized version we just proved.
  4.1245 +However, this is unnecessary because the generalized version fully subsumes
  4.1246 +its instance.
  4.1247 +
  4.1248 +\section{Total recursive functions}
  4.1249 +\label{sec:recdef}
  4.1250 +\index{*recdef|(}
  4.1251 +
  4.1252 +
  4.1253 +Although many total functions have a natural primitive recursive definition,
  4.1254 +this is not always the case. Arbitrary total recursive functions can be
  4.1255 +defined by means of \texttt{recdef}: you can use full pattern-matching,
  4.1256 +recursion need not involve datatypes, and termination is proved by showing
  4.1257 +that each recursive call makes the argument smaller in a suitable (user
  4.1258 +supplied) sense.
  4.1259 +
  4.1260 +\subsection{Defining recursive functions}
  4.1261 +
  4.1262 +Here is a simple example, the Fibonacci function:
  4.1263 +\begin{ttbox}
  4.1264 +consts fib  :: nat => nat
  4.1265 +recdef fib "measure(\%n. n)"
  4.1266 +    "fib 0 = 0"
  4.1267 +    "fib 1 = 1"
  4.1268 +    "fib (Suc(Suc x)) = fib x + fib (Suc x)"
  4.1269 +\end{ttbox}
  4.1270 +The definition of \texttt{fib} is accompanied by a \bfindex{measure function}
  4.1271 +\texttt{\%n.$\;$n} that maps the argument of \texttt{fib} to a natural
  4.1272 +number. The requirement is that in each equation the measure of the argument
  4.1273 +on the left-hand side is strictly greater than the measure of the argument of
  4.1274 +each recursive call. In the case of \texttt{fib} this is obviously true
  4.1275 +because the measure function is the identity and \texttt{Suc(Suc~x)} is
  4.1276 +strictly greater than both \texttt{x} and \texttt{Suc~x}.
  4.1277 +
  4.1278 +Slightly more interesting is the insertion of a fixed element
  4.1279 +between any two elements of a list:
  4.1280 +\begin{ttbox}
  4.1281 +consts sep :: "'a * 'a list => 'a list"
  4.1282 +recdef sep "measure (\%(a,xs). length xs)"
  4.1283 +    "sep(a, [])     = []"
  4.1284 +    "sep(a, [x])    = [x]"
  4.1285 +    "sep(a, x#y#zs) = x # a # sep(a,y#zs)"
  4.1286 +\end{ttbox}
  4.1287 +This time the measure is the length of the list, which decreases with the
  4.1288 +recursive call; the first component of the argument tuple is irrelevant.
  4.1289 +
  4.1290 +Pattern matching need not be exhaustive:
  4.1291 +\begin{ttbox}
  4.1292 +consts last :: 'a list => bool
  4.1293 +recdef last "measure (\%xs. length xs)"
  4.1294 +    "last [x]      = x"
  4.1295 +    "last (x#y#zs) = last (y#zs)"
  4.1296 +\end{ttbox}
  4.1297 +
  4.1298 +Overlapping patterns are disambiguated by taking the order of equations into
  4.1299 +account, just as in functional programming:
  4.1300 +\begin{ttbox}
  4.1301 +recdef sep "measure (\%(a,xs). length xs)"
  4.1302 +    "sep(a, x#y#zs) = x # a # sep(a,y#zs)"
  4.1303 +    "sep(a, xs)     = xs"
  4.1304 +\end{ttbox}
  4.1305 +This defines exactly the same function \texttt{sep} as further above.
  4.1306 +
  4.1307 +\begin{warn}
  4.1308 +Currently \texttt{recdef} only accepts functions with a single argument,
  4.1309 +possibly of tuple type.
  4.1310 +\end{warn}
  4.1311 +
  4.1312 +When loading a theory containing a \texttt{recdef} of a function $f$,
  4.1313 +Isabelle proves the recursion equations and stores the result as a list of
  4.1314 +theorems $f$.\texttt{rules}. It can be viewed by typing
  4.1315 +\begin{ttbox}
  4.1316 +prths \(f\).rules;
  4.1317 +\end{ttbox}
  4.1318 +All of the above examples are simple enough that Isabelle can determine
  4.1319 +automatically that the measure of the argument goes down in each recursive
  4.1320 +call. In that case $f$.\texttt{rules} contains precisely the defining
  4.1321 +equations.
  4.1322 +
  4.1323 +In general, Isabelle may not be able to prove all termination conditions
  4.1324 +automatically. For example, termination of
  4.1325 +\begin{ttbox}
  4.1326 +consts gcd :: "nat*nat => nat"
  4.1327 +recdef gcd "measure ((\%(m,n).n))"
  4.1328 +    "gcd (m, n) = (if n=0 then m else gcd(n, m mod n))"
  4.1329 +\end{ttbox}
  4.1330 +relies on the lemma \texttt{mod_less_divisor}
  4.1331 +\begin{ttbox}
  4.1332 +0 < n ==> m mod n < n
  4.1333 +\end{ttbox}
  4.1334 +that is not part of the default simpset. As a result, Isabelle prints a
  4.1335 +warning and \texttt{gcd.rules} contains a precondition:
  4.1336 +\begin{ttbox}
  4.1337 +(! m n. 0 < n --> m mod n < n) ==> gcd (m, n) = (if n=0 \dots)
  4.1338 +\end{ttbox}
  4.1339 +We need to instruct \texttt{recdef} to use an extended simpset to prove the
  4.1340 +termination condition:
  4.1341 +\begin{ttbox}
  4.1342 +recdef gcd "measure ((\%(m,n).n))"
  4.1343 +  simpset "simpset() addsimps [mod_less_divisor]"
  4.1344 +    "gcd (m, n) = (if n=0 then m else gcd(n, m mod n))"
  4.1345 +\end{ttbox}
  4.1346 +This time everything works fine and \texttt{gcd.rules} contains precisely the
  4.1347 +stated recursion equation for \texttt{gcd}.
  4.1348 +
  4.1349 +When defining some nontrivial total recursive function, the first attempt
  4.1350 +will usually generate a number of termination conditions, some of which may
  4.1351 +require new lemmas to be proved in some of the parent theories. Those lemmas
  4.1352 +can then be added to the simpset used by \texttt{recdef} for its
  4.1353 +proofs, as shown for \texttt{gcd}.
  4.1354 +
  4.1355 +Although all the above examples employ measure functions, \texttt{recdef}
  4.1356 +allows arbitrary wellfounded relations. For example, termination of
  4.1357 +Ackermann's function requires the lexicographic product \texttt{**}:
  4.1358 +\begin{ttbox}
  4.1359 +recdef ack "measure(\%m. m) ** measure(\%n. n)"
  4.1360 +    "ack(0,n)         = Suc n"
  4.1361 +    "ack(Suc m,0)     = ack(m, 1)"
  4.1362 +    "ack(Suc m,Suc n) = ack(m,ack(Suc m,n))"
  4.1363 +\end{ttbox}
  4.1364 +For details see~\cite{Isa-Logics-Man} and the examples in the library.
  4.1365 +
  4.1366 +
  4.1367 +\subsection{Deriving simplification rules}
  4.1368 +
  4.1369 +Once we have succeeded to prove all termination conditions, we can start to
  4.1370 +derive some theorems. In contrast to \texttt{primrec} definitions, which are
  4.1371 +automatically added to the simpset, \texttt{recdef} rules must be included
  4.1372 +explicitly, for example as in
  4.1373 +\begin{ttbox}
  4.1374 +Addsimps fib.rules;
  4.1375 +\end{ttbox}
  4.1376 +However, some care is necessary now, in contrast to \texttt{primrec}.
  4.1377 +Although \texttt{gcd} is a total function, its defining equation leads to
  4.1378 +nontermination of the simplifier, because the subterm \texttt{gcd(n, m mod
  4.1379 +  n)} on the right-hand side can again be simplified by the same equation,
  4.1380 +and so on. The reason: the simplifier rewrites the \texttt{then} and
  4.1381 +\texttt{else} branches of a conditional if the condition simplifies to
  4.1382 +neither \texttt{True} nor \texttt{False}.  Therefore it is recommended to
  4.1383 +derive an alternative formulation that replaces case distinctions on the
  4.1384 +right-hand side by conditional equations. For \texttt{gcd} it means we have
  4.1385 +to prove
  4.1386 +\begin{ttbox}
  4.1387 +           gcd (m, 0) = m
  4.1388 +n ~= 0 ==> gcd (m, n) = gcd(n, m mod n)
  4.1389 +\end{ttbox}
  4.1390 +To avoid nontermination during those proofs, we have to resort to some low
  4.1391 +level tactics:
  4.1392 +\begin{ttbox}
  4.1393 +Goal "gcd(m,0) = m";
  4.1394 +by(resolve_tac [trans] 1);
  4.1395 +by(resolve_tac gcd.rules 1);
  4.1396 +by(Simp_tac 1);
  4.1397 +\end{ttbox}
  4.1398 +At this point it is not necessary to understand what exactly
  4.1399 +\texttt{resolve_tac} is doing. The main point is that the above proof works
  4.1400 +not just for this one example but in general (except that we have to use
  4.1401 +\texttt{Asm_simp_tac} and $f$\texttt{.rules} in general). Try the second
  4.1402 +\texttt{gcd}-equation above!
  4.1403 +
  4.1404 +\subsection{Induction}
  4.1405 +
  4.1406 +Assuming we have added the recursion equations (or some suitable derived
  4.1407 +equations) to the simpset, we might like to prove something about our
  4.1408 +function. Since the function is recursive, the natural proof principle is
  4.1409 +again induction. But this time the structural form of induction that comes
  4.1410 +with datatypes is unlikely to work well---otherwise we could have defined the
  4.1411 +function by \texttt{primrec}. Therefore \texttt{recdef} automatically proves
  4.1412 +a suitable induction rule $f$\texttt{.induct} that follows the recursion
  4.1413 +pattern of the particular function $f$. Roughly speaking, it requires you to
  4.1414 +prove for each \texttt{recdef} equation that the property you are trying to
  4.1415 +establish holds for the left-hand side provided it holds for all recursive
  4.1416 +calls on the right-hand side. Applying $f$\texttt{.induct} requires its
  4.1417 +explicit instantiation. See \S\ref{sec:explicit-inst} for details.
  4.1418 +
  4.1419 +\index{*recdef|)}
     5.1 --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     5.2 +++ b/doc-src/Tutorial/ttbox.sty	Wed Aug 26 16:57:49 1998 +0200
     5.3 @@ -0,0 +1,20 @@
     5.4 +\ProvidesPackage{ttbox}[1997/06/25]
     5.5 +\RequirePackage{alltt}
     5.6 +
     5.7 +%%%Boxed terminal sessions
     5.8 +
     5.9 +%Redefines \{ and \} to be in \tt font and \| to make a BACKSLASH
    5.10 +\def\ttbraces{\chardef\{=`\{\chardef\}=`\}\chardef\|=`\\}
    5.11 +
    5.12 +%Restores % as the comment character (especially, to suppress line breaks)
    5.13 +\newcommand\comments{\catcode`\%=14\relax}
    5.14 +
    5.15 +%alltt* environment: like alltt but smaller, and with \{ \} and \| as in ttbox
    5.16 +\newenvironment{alltt*}{\begin{alltt}\footnotesize\ttbraces}{\end{alltt}}
    5.17 +
    5.18 +%Indented alltt* environment with small point size
    5.19 +%NO LINE BREAKS are allowed unless \pagebreak appears at START of a line
    5.20 +\newenvironment{ttbox}{\begin{quote}\samepage\begin{alltt*}}%
    5.21 +                      {\end{alltt*}\end{quote}}
    5.22 +
    5.23 +\endinput
     6.1 --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     6.2 +++ b/doc-src/Tutorial/tutorial.bbl	Wed Aug 26 16:57:49 1998 +0200
     6.3 @@ -0,0 +1,25 @@
     6.4 +\begin{thebibliography}{1}
     6.5 +
     6.6 +\bibitem{Bird-Wadler}
     6.7 +Richard Bird and Philip Wadler.
     6.8 +\newblock {\em Introduction to Functional Programming}.
     6.9 +\newblock Prentice-Hall, 1988.
    6.10 +
    6.11 +\bibitem{Isa-Ref-Man}
    6.12 +Lawrence~C. Paulson.
    6.13 +\newblock {\em The Isabelle Reference Manual}.
    6.14 +\newblock University of Cambridge, Computer Laboratory.
    6.15 +\newblock \verb$http://www.in.tum.de/~isabelle/dist/$.
    6.16 +
    6.17 +\bibitem{Isa-Logics-Man}
    6.18 +Lawrence~C. Paulson.
    6.19 +\newblock {\em Isabelle's Object-Logics}.
    6.20 +\newblock University of Cambridge, Computer Laboratory.
    6.21 +\newblock \verb$http://www.in.tum.de/~isabelle/dist/$.
    6.22 +
    6.23 +\bibitem{Paulson-ML}
    6.24 +Lawrence~C. Paulson.
    6.25 +\newblock {\em ML for the Working Programmer}.
    6.26 +\newblock Cambridge University Press, 2nd edition, 1996.
    6.27 +
    6.28 +\end{thebibliography}
     7.1 --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     7.2 +++ b/doc-src/Tutorial/tutorial.ind	Wed Aug 26 16:57:49 1998 +0200
     7.3 @@ -0,0 +1,135 @@
     7.4 +\begin{theindex}
     7.5 +
     7.6 +  \item {\tt[]}, \bold{7}
     7.7 +  \item {\tt\#}, \bold{7}
     7.8 +  \item {\ttnot}, \bold{3}
     7.9 +  \item {\tt-->}, \bold{3}
    7.10 +  \item {\tt\&}, \bold{3}
    7.11 +  \item {\ttor}, \bold{3}
    7.12 +  \item {\tt?}, \bold{3}, 4
    7.13 +  \item {\ttall}, \bold{3}
    7.14 +  \item {\ttuniquex}, \bold{3}
    7.15 +  \item {\tt *}, \bold{17}
    7.16 +  \item {\tt +}, \bold{17}
    7.17 +  \item {\tt -}, \bold{17}
    7.18 +  \item {\tt <}, \bold{17}
    7.19 +  \item {\tt <=}, \bold{17}
    7.20 +  \item \ttlbr, \bold{9}
    7.21 +  \item \ttrbr, \bold{9}
    7.22 +  \item {\tt==>}, \bold{9}
    7.23 +  \item {\tt==}, \bold{18}
    7.24 +  \item {\tt\%}, \bold{3}
    7.25 +  \item {\tt =>}, \bold{2}
    7.26 +
    7.27 +  \indexspace
    7.28 +
    7.29 +  \item {\tt addsimps}, \bold{22}
    7.30 +  \item {\tt Addsplits}, \bold{24}
    7.31 +  \item {\tt addsplits}, \bold{24}
    7.32 +  \item {\tt Asm_full_simp_tac}, \bold{21}
    7.33 +  \item {\tt asm_full_simp_tac}, \bold{22}
    7.34 +  \item {\tt Asm_simp_tac}, \bold{21}
    7.35 +  \item {\tt asm_simp_tac}, \bold{22}
    7.36 +
    7.37 +  \indexspace
    7.38 +
    7.39 +  \item {\tt bool}, 2
    7.40 +
    7.41 +  \indexspace
    7.42 +
    7.43 +  \item {\tt case}, \bold{3}, 4, \bold{13}, 24
    7.44 +  \item {\tt constdefs}, \bold{18}
    7.45 +  \item {\tt consts}, \bold{7}
    7.46 +  \item {\tt context}, \bold{11}
    7.47 +  \item current theory, \bold{11}
    7.48 +
    7.49 +  \indexspace
    7.50 +
    7.51 +  \item {\tt datatype}, \bold{7}
    7.52 +  \item {\tt defs}, \bold{18}
    7.53 +  \item {\tt delsimps}, \bold{22}
    7.54 +  \item {\tt Delsplits}, \bold{24}
    7.55 +  \item {\tt delsplits}, \bold{24}
    7.56 +  \item {\tt div}, \bold{17}
    7.57 +
    7.58 +  \indexspace
    7.59 +
    7.60 +  \item {\tt exhaust_tac}, \bold{14}
    7.61 +
    7.62 +  \indexspace
    7.63 +
    7.64 +  \item {\tt False}, \bold{3}
    7.65 +  \item formula, \bold{3}
    7.66 +  \item {\tt Full_simp_tac}, \bold{21}
    7.67 +  \item {\tt full_simp_tac}, \bold{22}
    7.68 +
    7.69 +  \indexspace
    7.70 +
    7.71 +  \item {\tt hd}, \bold{12}
    7.72 +
    7.73 +  \indexspace
    7.74 +
    7.75 +  \item {\tt if}, \bold{3}, 4, 24
    7.76 +  \item {\tt infixr}, \bold{7}
    7.77 +  \item inner syntax, \bold{8}
    7.78 +
    7.79 +  \indexspace
    7.80 +
    7.81 +  \item {\tt LEAST}, \bold{17}
    7.82 +  \item {\tt let}, \bold{3}, 4, 23
    7.83 +  \item {\tt list}, 2
    7.84 +  \item loading theories, 12
    7.85 +
    7.86 +  \indexspace
    7.87 +
    7.88 +  \item {\tt Main}, \bold{2}
    7.89 +  \item measure function, \bold{29}
    7.90 +  \item {\tt mod}, \bold{17}
    7.91 +
    7.92 +  \indexspace
    7.93 +
    7.94 +  \item {\tt nat}, 2, \bold{17}
    7.95 +
    7.96 +  \indexspace
    7.97 +
    7.98 +  \item parent theory, \bold{1}
    7.99 +  \item primitive recursion, \bold{13}
   7.100 +  \item proof scripts, \bold{2}
   7.101 +
   7.102 +  \indexspace
   7.103 +
   7.104 +  \item {\tt recdef}, 29--31
   7.105 +  \item reloading theories, 12
   7.106 +
   7.107 +  \indexspace
   7.108 +
   7.109 +  \item schematic variable, \bold{4}
   7.110 +  \item {\tt set}, 2
   7.111 +  \item {\tt show_brackets}, \bold{4}
   7.112 +  \item {\tt show_types}, \bold{3}, 11
   7.113 +  \item {\tt Simp_tac}, \bold{21}
   7.114 +  \item {\tt simp_tac}, \bold{22}
   7.115 +  \item simplifier, \bold{20}
   7.116 +  \item simpset, \bold{21}
   7.117 +
   7.118 +  \indexspace
   7.119 +
   7.120 +  \item tactic, \bold{11}
   7.121 +  \item term, \bold{3}
   7.122 +  \item theory, \bold{1}
   7.123 +  \item {\tt tl}, \bold{12}
   7.124 +  \item total, \bold{7}
   7.125 +  \item tracing the simplifier, \bold{25}
   7.126 +  \item {\tt True}, \bold{3}
   7.127 +  \item type constraints, \bold{3}
   7.128 +  \item type inference, \bold{3}
   7.129 +  \item type synonyms, \bold{18}
   7.130 +  \item {\tt types}, \bold{18}
   7.131 +
   7.132 +  \indexspace
   7.133 +
   7.134 +  \item unknown, \bold{4}
   7.135 +  \item {\tt update}, \bold{12}
   7.136 +  \item {\tt use_thy}, \bold{2}, 12
   7.137 +
   7.138 +\end{theindex}
     8.1 --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     8.2 +++ b/doc-src/Tutorial/tutorial.tex	Wed Aug 26 16:57:49 1998 +0200
     8.3 @@ -0,0 +1,69 @@
     8.4 +\documentclass[11pt]{report}
     8.5 +\usepackage{a4,latexsym}
     8.6 +\usepackage{graphicx}
     8.7 +
     8.8 +\makeatletter
     8.9 +\input{../iman.sty}
    8.10 +\input{extra.sty}
    8.11 +\makeatother
    8.12 +\usepackage{ttbox}
    8.13 +\newcommand\ttbreak{\vskip-10pt\pagebreak[0]}
    8.14 +
    8.15 +%\newtheorem{theorem}{Theorem}[section]
    8.16 +\newtheorem{Exercise}{Exercise}[section]
    8.17 +\newenvironment{exercise}{\begin{Exercise}\rm}{\end{Exercise}}
    8.18 +\newcommand{\ttlbr}{{\tt[|}}
    8.19 +\newcommand{\ttrbr}{{\tt|]}}
    8.20 +\newcommand{\ttnot}{\tt\~\relax}
    8.21 +\newcommand{\ttor}{\tt|}
    8.22 +\newcommand{\ttall}{\tt!}
    8.23 +\newcommand{\ttuniquex}{\tt?!}
    8.24 +
    8.25 +%% $Id$
    8.26 +%%%STILL NEEDS MODAL, LCF
    8.27 +%%%\includeonly{ZF}
    8.28 +%%% to index derived rls:  ^\([a-zA-Z0-9][a-zA-Z0-9_]*\)        \\tdx{\1}  
    8.29 +%%% to index rulenames:   ^ *(\([a-zA-Z0-9][a-zA-Z0-9_]*\),     \\tdx{\1}  
    8.30 +%%% to index constants:   \\tt \([a-zA-Z0-9][a-zA-Z0-9_]*\)     \\cdx{\1}  
    8.31 +%%% to deverbify:         \\verb|\([^|]*\)|     \\ttindex{\1}  
    8.32 +%% run    ../sedindex logics    to prepare index file
    8.33 +
    8.34 +\makeindex
    8.35 +
    8.36 +\underscoreoff
    8.37 +
    8.38 +\setcounter{secnumdepth}{2} \setcounter{tocdepth}{2}  %% {secnumdepth}{2}???
    8.39 +
    8.40 +\pagestyle{headings}
    8.41 +%\sloppy
    8.42 +%\binperiod     %%%treat . like a binary operator
    8.43 +
    8.44 +\begin{document}
    8.45 +\title{\includegraphics[scale=0.2,angle=-90]{isabelle_hol.ps}
    8.46 +       \\ \vspace{0.5cm} The Tutorial
    8.47 +       \\ --- DRAFT ---}
    8.48 +\author{Tobias Nipkow\\
    8.49 +Technische Universit\"at M\"unchen \\
    8.50 +Institut f\"ur Informatik \\
    8.51 +\texttt{http://www.in.tum.de/\~\relax nipkow/}}
    8.52 +\maketitle
    8.53 +
    8.54 +\pagenumbering{roman}
    8.55 +\tableofcontents
    8.56 +
    8.57 +\subsubsection*{Acknowledgements}
    8.58 +This tutorial owes a lot to the constant discussions with and the valuable
    8.59 +feedback from Larry Paulson and the Isabelle group at Munich: Olaf M\"uller,
    8.60 +Wolfgang Naraschewski, David von Oheimb, Leonor Prensa-Nieto, Cornelia Pusch
    8.61 +and Markus Wenzel. Stefan Merz was also kind enough to read and comment on a
    8.62 +draft version.
    8.63 +\clearfirst
    8.64 +
    8.65 +\input{basics}
    8.66 +\input{fp}
    8.67 +\input{appendix}
    8.68 +
    8.69 +\bibliographystyle{plain}
    8.70 +\bibliography{base}
    8.71 +\input{tutorial.ind}
    8.72 +\end{document}