% doc-src/Tutorial/fp.tex, Wed Aug 26 16:57:49 1998 +0200
\chapter{Functional Programming in HOL}

Although on the surface this chapter is mainly concerned with how to write
functional programs in HOL and how to verify them, most of the
constructs and proof procedures introduced are general purpose and recur in
any specification or verification task.

The dedicated functional programmer should be warned: HOL offers only what
could be called {\em total functional programming} --- all functions in HOL
must be total; lazy data structures are not directly available. On the
positive side, functions in HOL need not be computable: HOL is a
specification language that goes well beyond what can be expressed as a
program. However, for the time being we concentrate on the computable.

\section{An introductory theory}
\label{sec:intro-theory}

Functional programming needs datatypes and functions. Both of them can be
defined in a theory with a syntax reminiscent of languages like ML or
Haskell. As an example consider the theory in Fig.~\ref{fig:ToyList}.

\begin{figure}[htbp]
\begin{ttbox}\makeatother
\input{ToyList/ToyList.thy}\end{ttbox}
\caption{A theory of lists}
\label{fig:ToyList}
\end{figure}
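
The file included in Fig.~\ref{fig:ToyList} is not reproduced in this
source. As an approximation, a theory of the shape described in the
following paragraphs would look roughly like this (a sketch only; the
actual \texttt{ToyList.thy} may differ in detail):
\begin{ttbox}\makeatother
ToyList = Datatype +

datatype 'a list = Nil                          ("[]")
                 | Cons 'a ('a list)            (infixr "#" 65)

consts app :: 'a list => 'a list => 'a list     (infixr "@" 65)
       rev :: 'a list => 'a list

primrec
"[] @ ys = ys"
"(x # xs) @ ys = x # (xs @ ys)"

primrec
"rev [] = []"
"rev (x # xs) = rev xs @ [x]"

end
\end{ttbox}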

HOL already has a predefined theory of lists called \texttt{List} ---
\texttt{ToyList} is merely a small fragment of it chosen as an example. In
contrast to what is recommended in \S\ref{sec:Basic:Theories},
\texttt{ToyList} is not based on \texttt{Main} but on \texttt{Datatype}, a
theory that contains everything required for datatype definitions but does
not have \texttt{List} as a parent, thus avoiding ambiguities caused by
defining lists twice.

The \ttindexbold{datatype} \texttt{list} introduces two constructors
\texttt{Nil} and \texttt{Cons}, the empty list and the operator that adds an
element to the front of a list. For example, the term \texttt{Cons True (Cons
  False Nil)} is a value of type \texttt{bool~list}, namely the list with the
elements \texttt{True} and \texttt{False}. Because this notation becomes
unwieldy very quickly, the datatype declaration is annotated with an
alternative syntax: instead of \texttt{Nil} and \texttt{Cons}~$x$~$xs$ we can
write \index{#@{\tt[]}|bold}\texttt{[]} and
\texttt{$x$~\#~$xs$}\index{#@{\tt\#}|bold}. In fact, this alternative syntax
is the standard syntax. Thus the list \texttt{Cons True (Cons False Nil)}
becomes \texttt{True \# False \# []}. The annotation \ttindexbold{infixr}
means that \texttt{\#} associates to the right, i.e.\ the term \texttt{$x$ \#
  $y$ \# $z$} is read as \texttt{$x$ \# ($y$ \# $z$)} and not as \texttt{($x$
  \# $y$) \# $z$}.

\begin{warn}
  Syntax annotations are a powerful but completely optional feature. You
  could drop them from theory \texttt{ToyList} and go back to the identifiers
  \texttt{Nil} and \texttt{Cons}. However, lists are such a central datatype
  that their syntax is highly customized. We recommend that novices should
  not use syntax annotations in their own theories.
\end{warn}

Next, the functions \texttt{app} and \texttt{rev} are declared. In contrast
to ML, Isabelle insists on explicit declarations of all functions (keyword
\ttindexbold{consts}). (Apart from the declaration-before-use restriction,
the order of items in a theory file is unconstrained.) Function \texttt{app}
is annotated with concrete syntax too. Instead of the prefix syntax
\texttt{app}~$xs$~$ys$ the infix $xs$~\texttt{\at}~$ys$ becomes the preferred
form.

Both functions are defined recursively. The equations for \texttt{app} and
\texttt{rev} hardly need comments: \texttt{app} appends two lists and
\texttt{rev} reverses a list. The keyword \texttt{primrec} indicates that
the recursion is of a particularly primitive kind where each recursive call
peels off a datatype constructor from one of the arguments (see
\S\ref{sec:datatype}). Thus the recursion always terminates, i.e.\ the
function is \bfindex{total}.

The termination requirement is absolutely essential in HOL, a logic of total
functions. If we were to drop it, inconsistencies could quickly arise: the
``definition'' $f(n) = f(n)+1$ immediately leads to $0 = 1$ by subtracting
$f(n)$ on both sides.
% However, this is a subtle issue that we cannot discuss here further.

\begin{warn}
  As we have indicated, the desire for total functions is not a gratuitously
  imposed restriction but an essential characteristic of HOL. It is only
  because of totality that reasoning in HOL is comparatively easy. More
  generally, the philosophy in HOL is not to allow arbitrary axioms (such as
  function definitions whose totality has not been proved) because they
  quickly lead to inconsistencies. Instead, fixed constructs for introducing
  types and functions are offered (such as \texttt{datatype} and
  \texttt{primrec}) which are guaranteed to preserve consistency.
\end{warn}

A remark about syntax. The textual definition of a theory follows a fixed
syntax with keywords like \texttt{datatype} and \texttt{end} (see
Fig.~\ref{fig:keywords} in Appendix~\ref{sec:Appendix} for a full list).
Embedded in this syntax are the types and formulae of HOL, whose syntax is
extensible, e.g.\ by new user-defined infix operators
(see~\S\ref{sec:infix-syntax}). To distinguish the two levels, everything
HOL-specific should be enclosed in \texttt{"}\dots\texttt{"}. The same holds
for identifiers that happen to be keywords, as in
\begin{ttbox}
consts "end" :: 'a list => 'a
\end{ttbox}
To lessen this burden, quotation marks around types can be dropped,
provided their syntax does not go beyond what is described in
\S\ref{sec:TypesTermsForms}. Types containing further operators, e.g.\
\texttt{*} for Cartesian products, need quotation marks.
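For instance, a declaration whose type involves the product-type operator
\texttt{*} must quote that type (\texttt{swap} is a hypothetical constant,
shown only to illustrate the quoting rule):
\begin{ttbox}
consts swap :: "'a * 'b => 'b * 'a"
\end{ttbox}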

When Isabelle prints a syntax error message, it refers to the HOL syntax as
the \bfindex{inner syntax}.

\section{An introductory proof}
\label{sec:intro-proof}

Having defined \texttt{ToyList}, we load it with the ML command
\begin{ttbox}
use_thy "ToyList";
\end{ttbox}
and are ready to prove a few simple theorems. This will illustrate not just
the basic proof commands but also the typical proof process.

\subsubsection*{Main goal: \texttt{rev(rev xs) = xs}}

Our goal is to show that reversing a list twice produces the original
list. Typing
\begin{ttbox}
\input{ToyList/thm}\end{ttbox}
establishes a new goal to be proved in the context of the current theory,
which is the one we just loaded. Isabelle's response is to print the current
proof state:
\begin{ttbox}
{\out Level 0}
{\out rev (rev xs) = xs}
{\out 1. rev (rev xs) = xs}
\end{ttbox}
Until we have finished a proof, the proof state always looks like this:
\begin{ttbox}
{\out Level \(i\)}
{\out \(G\)}
{\out 1. \(G@1\)}
{\out \(\vdots\)}
{\out \(n\). \(G@n\)}
\end{ttbox}
where \texttt{Level}~$i$ indicates that we are $i$ steps into the proof, $G$
is the overall goal that we are trying to prove, and the numbered lines
contain the subgoals $G@1$, \dots, $G@n$ that we need to prove to establish
$G$. At \texttt{Level 0} there is only one subgoal, which is identical with
the overall goal. Normally $G$ is constant and only serves as a reminder.
Hence we rarely show it in this tutorial.

Let us now get back to \texttt{rev(rev xs) = xs}. Properties of recursively
defined functions are best established by induction. In this case there is
not much choice except to induct on \texttt{xs}:
\begin{ttbox}
\input{ToyList/inductxs}\end{ttbox}
This tells Isabelle to perform induction on variable \texttt{xs} in subgoal
1. The new proof state contains two subgoals, namely the base case
(\texttt{Nil}) and the induction step (\texttt{Cons}):
\begin{ttbox}
{\out 1. rev (rev []) = []}
{\out 2. !!a list. rev (rev list) = list ==> rev (rev (a # list)) = a # list}
\end{ttbox}
The induction step is an example of the general format of a subgoal:
\begin{ttbox}
{\out \(i\). !!\(x@1 \dots x@n\). {\it assumptions} ==> {\it conclusion}}
\end{ttbox}\index{==>@{\tt==>}|bold}
The prefix of bound variables \texttt{!!\(x@1 \dots x@n\)} can be ignored
most of the time, or simply treated as a list of variables local to this
subgoal. Their deeper significance is explained in \S\ref{sec:PCproofs}. The
{\it assumptions} are the local assumptions for this subgoal and {\it
  conclusion} is the actual proposition to be proved. Typical proof steps
that add new assumptions are induction or case distinction. In our example
the only assumption is the induction hypothesis \texttt{rev (rev list) =
  list}, where \texttt{list} is a variable name chosen by Isabelle. If there
are multiple assumptions, they are enclosed in the bracket pair
\texttt{[|}\index{==>@\ttlbr|bold} and \texttt{|]}\index{==>@\ttrbr|bold}
and separated by semicolons.

Let us try to solve both goals automatically:
\begin{ttbox}
\input{ToyList/autotac}\end{ttbox}
This command tells Isabelle to apply a proof strategy called
\texttt{Auto_tac} to all subgoals. Essentially, \texttt{Auto_tac} tries to
`simplify' the subgoals. In our case, subgoal~1 is solved completely (thanks
to the equation \texttt{rev [] = []}) and disappears; the simplified version
of subgoal~2 becomes the new subgoal~1:
\begin{ttbox}\makeatother
{\out 1. !!a list. rev(rev list) = list ==> rev(rev list @ a # []) = a # list}
\end{ttbox}
In order to simplify this subgoal further, a lemma suggests itself.

\subsubsection*{First lemma: \texttt{rev(xs \at~ys) = (rev ys) \at~(rev xs)}}

We start the proof as usual:
\begin{ttbox}\makeatother
\input{ToyList/lemma1}\end{ttbox}
There are two variables that we could induct on: \texttt{xs} and
\texttt{ys}. Because \texttt{\at} is defined by recursion on
the first argument, \texttt{xs} is the correct one:
\begin{ttbox}
\input{ToyList/inductxs}\end{ttbox}
This time not even the base case is solved automatically:
\begin{ttbox}\makeatother
by(Auto_tac);
{\out 1. rev ys = rev ys @ []}
{\out 2. \dots}
\end{ttbox}
We need another lemma.

\subsubsection*{Second lemma: \texttt{xs \at~[] = xs}}

This time the canonical proof procedure
\begin{ttbox}\makeatother
\input{ToyList/lemma2}\input{ToyList/inductxs}\input{ToyList/autotac}\end{ttbox}
leads to the desired message \texttt{No subgoals!}:
\begin{ttbox}\makeatother
{\out Level 2}
{\out xs @ [] = xs}
{\out No subgoals!}
\end{ttbox}
Now we can give the lemma just proved a suitable name
\begin{ttbox}
\input{ToyList/qed2}\end{ttbox}
and tell Isabelle to use this lemma in all future proofs by simplification:
\begin{ttbox}
\input{ToyList/addsimps2}\end{ttbox}
Note that in the theorem \texttt{app_Nil2} the free variable \texttt{xs} has
been replaced by the unknown \texttt{?xs}, just as explained in
\S\ref{sec:variables}.

Going back to the proof of the first lemma
\begin{ttbox}\makeatother
\input{ToyList/lemma1}\input{ToyList/inductxs}\input{ToyList/autotac}\end{ttbox}
we find that this time \texttt{Auto_tac} solves the base case, but the
induction step merely simplifies to
\begin{ttbox}\makeatother
{\out 1. !!a list.}
{\out rev (list @ ys) = rev ys @ rev list}
{\out ==> (rev ys @ rev list) @ a # [] = rev ys @ rev list @ a # []}
\end{ttbox}
Now we need to remember that \texttt{\at} associates to the right, and that
\texttt{\#} and \texttt{\at} have the same priority (namely the \texttt{65}
in the definition of \texttt{ToyList}). Thus the conclusion really is
\begin{ttbox}\makeatother
{\out ==> (rev ys @ rev list) @ (a # []) = rev ys @ (rev list @ (a # []))}
\end{ttbox}
and the missing lemma is associativity of \texttt{\at}.

\subsubsection*{Third lemma: \texttt{(xs \at~ys) \at~zs = xs \at~(ys \at~zs)}}

This time the canonical proof procedure
\begin{ttbox}\makeatother
\input{ToyList/lemma3}\end{ttbox}
succeeds without further ado. Again we name the lemma and add it to
the set of lemmas used during simplification:
\begin{ttbox}
\input{ToyList/qed3}\end{ttbox}
Now we can go back and prove the first lemma
\begin{ttbox}\makeatother
\input{ToyList/lemma1}\input{ToyList/inductxs}\input{ToyList/autotac}\end{ttbox}
add it to the simplification lemmas
\begin{ttbox}
\input{ToyList/qed1}\end{ttbox}
and then solve our main theorem:
\begin{ttbox}\makeatother
\input{ToyList/thm}\input{ToyList/inductxs}\input{ToyList/autotac}\end{ttbox}

\subsubsection*{Review}

This is the end of our toy proof. It should have familiarized you with
\begin{itemize}
\item the standard theorem proving procedure:
state a goal; proceed with proof until a new lemma is required; prove that
lemma; come back to the original goal.
\item a specific procedure that works well for functional programs:
induction followed by all-out simplification via \texttt{Auto_tac}.
\item a basic repertoire of proof commands.
\end{itemize}


\section{Some helpful commands}
\label{sec:commands-and-hints}

This section discusses a few basic commands for manipulating the proof state
and can be skipped by casual readers.

There are two kinds of commands used during a proof: the actual proof
commands and auxiliary commands for examining the proof state and controlling
the display. Proof commands are always of the form
\texttt{by(\textit{tactic});}\indexbold{tactic} where \textit{tactic} is a
synonym for ``theorem proving function''. Typical examples are
\texttt{induct_tac} and \texttt{Auto_tac} --- the suffix \texttt{_tac} is
merely a mnemonic. Further tactics are introduced throughout the tutorial.

%Tactics can also be modified. For example,
%\begin{ttbox}
%by(ALLGOALS Asm_simp_tac);
%\end{ttbox}
%tells Isabelle to apply \texttt{Asm_simp_tac} to all subgoals. For more on
%tactics and how to combine them see~\S\ref{sec:Tactics}.

The most useful auxiliary commands are:
\begin{description}
\item[Printing the current state]
Type \texttt{pr();} to redisplay the current proof state, for example when it
has disappeared off the screen.
\item[Limiting the number of subgoals]
Typing \texttt{prlim $k$;} tells Isabelle to print only the first $k$
subgoals from now on and redisplays the current proof state. This is helpful
when there are many subgoals.
\item[Undoing] Typing \texttt{undo();} undoes the effect of the last
tactic.
\item[Context switch] Every proof happens in the context of a
  \bfindex{current theory}. By default, this is the last theory loaded. If
  you want to prove a theorem in the context of a different theory
  \texttt{T}, you need to type \texttt{context T.thy;}\index{*context|bold}
  first. Of course you need to change the context again if you want to go
  back to your original theory.
\item[Displaying types] We have already mentioned the flag
  \ttindex{show_types} above. It can also be useful for detecting typos in
  formulae early on. For example, if \texttt{show_types} is set and the goal
  \texttt{rev(rev xs) = xs} is started, Isabelle prints the additional output
\begin{ttbox}
{\out Variables:}
{\out xs :: 'a list}
\end{ttbox}
which tells us that Isabelle has correctly inferred that
\texttt{xs} is a variable of list type. On the other hand, had we
made a typo as in \texttt{rev(re xs) = xs}, the response
\begin{ttbox}
Variables:
  re :: 'a list => 'a list
  xs :: 'a list
\end{ttbox}
would have alerted us because of the unexpected variable \texttt{re}.
\item[(Re)loading theories]\index{loading theories}\index{reloading theories}
Initially you load theory \texttt{T} by typing \ttindex{use_thy}~\texttt{"T";},
which loads all parent theories of \texttt{T} automatically, if they are not
loaded already. If you modify \texttt{T.thy} or \texttt{T.ML}, you can
reload it by typing \texttt{use_thy~"T";} again. This time, however, only
\texttt{T} is reloaded. If some of \texttt{T}'s parents have changed as well,
type \ttindexbold{update}\texttt{();} to reload all theories that have
changed.
\end{description}
Further commands are found in the Reference Manual.


\section{Datatypes}
\label{sec:datatype}

Inductive datatypes are part of almost every non-trivial application of HOL.
First we take another look at a very important example, the datatype of
lists, before we turn to datatypes in general. The section closes with a
case study.


\subsection{Lists}

Lists are one of the essential datatypes in computing. Readers of this tutorial
and users of HOL need to be familiar with their basic operations. Theory
\texttt{ToyList} is only a small fragment of HOL's predefined theory
\texttt{List}\footnote{\texttt{http://www.in.tum.de/\~\relax
    isabelle/library/HOL/List.html}}.
The latter contains many further operations. For example, the functions
\ttindexbold{hd} (`head') and \ttindexbold{tl} (`tail') return the first
element and the remainder of a list. (However, pattern-matching is usually
preferable to \texttt{hd} and \texttt{tl}.)
Theory \texttt{List} also contains more syntactic sugar:
\texttt{[}$x@1$\texttt{,}\dots\texttt{,}$x@n$\texttt{]} abbreviates
$x@1$\texttt{\#}\dots\texttt{\#}$x@n$\texttt{\#[]}.
In the rest of the tutorial we always use HOL's predefined lists.


\subsection{The general format}
\label{sec:general-datatype}

The general HOL \texttt{datatype} definition is of the form
\[
\mathtt{datatype}~(\alpha@1, \dots, \alpha@n) \, t ~=~
C@1~\tau@{11}~\dots~\tau@{1k@1} ~\mid~ \dots ~\mid~
C@m~\tau@{m1}~\dots~\tau@{mk@m}
\]
where $\alpha@i$ are type variables (the parameters), $C@i$ are distinct
constructor names and $\tau@{ij}$ are types; it is customary to capitalize
the first letter in constructor names. There are a number of
restrictions (such as the type should not be empty) detailed
elsewhere~\cite{Isa-Logics-Man}. Isabelle notifies you if you violate them.

Laws about datatypes, such as \verb$[] ~= x#xs$ and \texttt{(x\#xs = y\#ys) =
  (x=y \& xs=ys)}, are used automatically during proofs by simplification.
The same is true for the equations in primitive recursive function
definitions.

\subsection{Primitive recursion}

Functions on datatypes are usually defined by recursion. In fact, most of the
time they are defined by what is called \bfindex{primitive recursion}.
The keyword \texttt{primrec} is followed by a list of equations
\[ f \, x@1 \, \dots \, (C \, y@1 \, \dots \, y@k)\, \dots \, x@n = r \]
such that $C$ is a constructor of the datatype $t$ and all recursive calls of
$f$ in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$. Thus
Isabelle immediately sees that $f$ terminates because one (fixed!) argument
becomes smaller with every recursive call. There must be exactly one equation
for each constructor. Their order is immaterial.
A more general method for defining total recursive functions is explained in
\S\ref{sec:recdef}.
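
As a concrete instance of this schema (\texttt{add} is a hypothetical
function, shown only to illustrate the format), addition on natural numbers
can be defined by primitive recursion on the first argument; every recursive
call peels off one \texttt{Suc}:
\begin{ttbox}
consts add :: nat => nat => nat
primrec
"add 0 n = n"
"add (Suc m) n = Suc (add m n)"
\end{ttbox}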

\begin{exercise}
Given the datatype of binary trees
\begin{ttbox}
\input{Misc/tree}\end{ttbox}
define a function \texttt{mirror} that mirrors the structure of a binary tree
by swapping subtrees (recursively). Prove \texttt{mirror(mirror(t)) = t}.
\end{exercise}

\subsection{\texttt{case}-expressions}
\label{sec:case-expressions}

HOL also features \ttindexbold{case}-expressions for analyzing elements of a
datatype. For example,
\begin{ttbox}
case xs of [] => 0 | y#ys => y
\end{ttbox}
evaluates to \texttt{0} if \texttt{xs} is \texttt{[]} and to \texttt{y} if
\texttt{xs} is \texttt{y\#ys}. (Since the result in both branches must be of
the same type, it follows that \texttt{y::nat} and hence
\texttt{xs::(nat)list}.)

In general, if $e$ is a term of the datatype $t$ defined in
\S\ref{sec:general-datatype} above, the corresponding
\texttt{case}-expression analyzing $e$ is
\[
\begin{array}{rrcl}
\mbox{\tt case}~e~\mbox{\tt of} & C@1~x@{11}~\dots~x@{1k@1} & \To & e@1 \\
                         \vdots \\
                           \mid & C@m~x@{m1}~\dots~x@{mk@m} & \To & e@m
\end{array}
\]

\begin{warn}
{\em All} constructors must be present, their order is fixed, and nested
patterns are not supported. Violating these restrictions results in strange
error messages.
\end{warn}
\noindent
Nested patterns can be simulated by nested \texttt{case}-expressions: instead
of
\begin{ttbox}
case xs of [] => 0 | [x] => x | x#(y#zs) => y
\end{ttbox}
write
\begin{ttbox}
case xs of [] => 0 | x#ys => (case ys of [] => x | y#zs => y)
\end{ttbox}
Note that \texttt{case}-expressions should be enclosed in parentheses to
indicate their scope.

\subsection{Structural induction}

Almost all the basic laws about a datatype are applied automatically during
simplification. Only induction is invoked by hand via \texttt{induct_tac},
which works for any datatype. In some cases, induction is overkill and a case
distinction over all constructors of the datatype suffices. This is performed
by \ttindexbold{exhaust_tac}. A trivial example:
\begin{ttbox}
\input{Misc/exhaust.ML}{\out1. xs = [] ==> (case xs of [] => [] | y # ys => xs) = xs}
{\out2. !!a list. xs = a # list ==> (case xs of [] => [] | y # ys => xs) = xs}
\input{Misc/autotac.ML}\end{ttbox}
Note that this particular case distinction could have been automated
completely. See~\S\ref{sec:SimpFeatures}.

\begin{warn}
  Induction is only allowed on a free variable that should not occur among
  the assumptions of the subgoal. Exhaustion works for arbitrary terms.
\end{warn}

\subsection{Case study: boolean expressions}
\label{sec:boolex}

The aim of this case study is twofold: it shows how to model boolean
expressions and some algorithms for manipulating them, and it demonstrates
the constructs introduced above.

\subsubsection{How can we model boolean expressions?}

We want to represent boolean expressions built up from variables and
constants by negation and conjunction. The following datatype serves exactly
that purpose:
\begin{ttbox}
\input{Ifexpr/boolex}\end{ttbox}
The two constants are represented by the terms \texttt{Const~True} and
\texttt{Const~False}. Variables are represented by terms of the form
\texttt{Var}~$n$, where $n$ is a natural number (type \texttt{nat}).
For example, the formula $P@0 \land \neg P@1$ is represented by the term
\texttt{And~(Var~0)~(Neg(Var~1))}.
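
Since the included declaration is not reproduced in this source, a datatype
matching the description above can be sketched as follows (details may
differ from the actual \texttt{Ifexpr/boolex} file):
\begin{ttbox}
datatype boolex = Const bool | Var nat | Neg boolex
                | And boolex boolex
\end{ttbox}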

\subsubsection{What is the value of boolean expressions?}

The value of a boolean expression depends on the value of its variables.
Hence the function \texttt{value} takes an additional parameter, an {\em
  environment} of type \texttt{nat~=>~bool}, which maps variables to their
values:
\begin{ttbox}
\input{Ifexpr/value}\end{ttbox}

\subsubsection{If-expressions}

An alternative, and often more efficient (because in a certain sense
canonical), representation uses so-called \textit{If-expressions\/} built up
from constants (\texttt{CIF}), variables (\texttt{VIF}) and conditionals
(\texttt{IF}):
\begin{ttbox}
\input{Ifexpr/ifex}\end{ttbox}
The evaluation of If-expressions proceeds as for \texttt{boolex}:
\begin{ttbox}
\input{Ifexpr/valif}\end{ttbox}

\subsubsection{Transformation into and of If-expressions}

The type \texttt{boolex} is close to the customary representation of logical
formulae, whereas \texttt{ifex} is designed for efficiency. Thus we need to
translate from \texttt{boolex} into \texttt{ifex}:
\begin{ttbox}
\input{Ifexpr/bool2if}\end{ttbox}
At last, we have something we can verify: that \texttt{bool2if} preserves the
value of its argument.
\begin{ttbox}
\input{Ifexpr/bool2if.ML}\end{ttbox}
The proof is canonical:
\begin{ttbox}
\input{Ifexpr/proof.ML}\end{ttbox}
In fact, all proofs in this case study look exactly like this. Hence we do
not show them below.

More interesting is the transformation of If-expressions into a normal form
where the first argument of \texttt{IF} cannot be another \texttt{IF} but
must be a constant or variable. Such a normal form can be computed by
repeatedly replacing a subterm of the form \texttt{IF~(IF~b~x~y)~z~u} by
\texttt{IF b (IF x z u) (IF y z u)}, which has the same value. The following
primitive recursive functions perform this task:
\begin{ttbox}
\input{Ifexpr/normif}
\input{Ifexpr/norm}\end{ttbox}
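The two included definitions are not shown in this source. Based on the
replacement rule just described, they can be sketched as follows (a sketch
only; the actual files may differ in detail):
\begin{ttbox}
consts normif :: ifex => ifex => ifex => ifex
primrec
"normif (CIF b) t e = IF (CIF b) t e"
"normif (VIF x) t e = IF (VIF x) t e"
"normif (IF b t e) u f = normif b (normif t u f) (normif e u f)"

consts norm :: ifex => ifex
primrec
"norm (CIF b) = CIF b"
"norm (VIF x) = VIF x"
"norm (IF b t e) = normif b (norm t) (norm e)"
\end{ttbox}
Here \texttt{normif b t e} normalizes \texttt{IF b t e} under the assumption
that \texttt{t} and \texttt{e} are already normal; the third equation of
\texttt{normif} is exactly the replacement rule applied recursively.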
Their interplay is a bit tricky, and we leave it to the reader to develop an
intuitive understanding. Fortunately, Isabelle can help us to verify that the
transformation preserves the value of the expression:
\begin{ttbox}
\input{Ifexpr/norm.ML}\end{ttbox}
The proof is canonical, provided we first show the following lemma (which
also helps to understand what \texttt{normif} does) and make it available
for simplification via \texttt{Addsimps}:
\begin{ttbox}
\input{Ifexpr/normif.ML}\end{ttbox}

But how can we be sure that \texttt{norm} really produces a normal form in
the above sense? We have to prove
\begin{ttbox}
\input{Ifexpr/normal_norm.ML}\end{ttbox}
where \texttt{normal} expresses that an If-expression is in normal form:
\begin{ttbox}
\input{Ifexpr/normal}\end{ttbox}
Of course, this requires a lemma about normality of \texttt{normif}
\begin{ttbox}
\input{Ifexpr/normal_normif.ML}\end{ttbox}
that has to be made available for simplification via \texttt{Addsimps}.

How does one come up with the required lemmas? Try to prove the main theorems
without them and study carefully what \texttt{Auto_tac} leaves unproved. That
should provide the clue.
The necessity of universal quantification (\texttt{!t e}) in the two lemmas
is explained in \S\ref{sec:InductionHeuristics}.

\begin{exercise}
  We strengthen the definition of a {\em normal\/} If-expression as follows:
  the first argument of all \texttt{IF}s must be a variable. Adapt the above
  development to this changed requirement. (Hint: you may need to formulate
  some of the goals as implications (\texttt{-->}) rather than equalities
  (\texttt{=}).)
\end{exercise}

\section{Some basic types}

\subsection{Natural numbers}

The type \ttindexbold{nat} of natural numbers is predefined and behaves like
\begin{ttbox}
datatype nat = 0 | Suc nat
\end{ttbox}
In particular, there are \texttt{case}-expressions, for example
\begin{ttbox}
case n of 0 => 0 | Suc m => m
\end{ttbox}
primitive recursion, for example
\begin{ttbox}
\input{Misc/natsum}\end{ttbox}
and induction, for example
\begin{ttbox}
\input{Misc/NatSum.ML}\ttbreak
{\out sum n + sum n = n * Suc n}
{\out No subgoals!}
\end{ttbox}
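
The primitive recursive function included above is not reproduced in this
source. Judging from the final goal, \texttt{sum n} computes
$0+1+\cdots+n$, so a definition of the following shape would fit (a sketch;
the actual \texttt{Misc/natsum} may differ):
\begin{ttbox}
consts sum :: nat => nat
primrec
"sum 0 = 0"
"sum (Suc n) = Suc n + sum n"
\end{ttbox}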
1.604 +
1.605 +The usual arithmetic operations \ttindexbold{+}, \ttindexbold{-},
1.606 +\ttindexbold{*}, \ttindexbold{div} and \ttindexbold{mod} are predefined, as
1.607 +are the relations \ttindexbold{<=} and \ttindexbold{<}. There is even a least
1.608 +number operation \ttindexbold{LEAST}. For example, \texttt{(LEAST n.$\,$1 <
1.609 + n) = 2} (HOL does not prove this completely automatically).
1.610 +
1.611 +\begin{warn}
1.612 + The operations \ttindexbold{+}, \ttindexbold{-}, \ttindexbold{*} are
1.613 + overloaded, i.e.\ they are available not just for natural numbers but at
1.614 + other types as well (see \S\ref{sec:TypeClasses}). For example, given
1.615 + the goal \texttt{x+y = y+x}, there is nothing to indicate that you are
1.616 + talking about natural numbers. Hence Isabelle can only infer that
1.617 + \texttt{x} and \texttt{y} are of some arbitrary type where \texttt{+} is
1.618 + declared. As a consequence, you will be unable to prove the goal (although
1.619 + it may take you some time to realize what has happened if
1.620 + \texttt{show_types} is not set). In this particular example, you need to
1.621 + include an explicit type constraint, for example \texttt{x+y = y+(x::nat)}.
1.622 + If there is enough contextual information this may not be necessary:
1.623 + \texttt{x+0 = x} automatically implies \texttt{x::nat}.
1.624 +\end{warn}
1.625 +
1.626 +
1.627 +\subsection{Products}
1.628 +
1.629 +HOL also has pairs: \texttt{($a@1$,$a@2$)} is of type \texttt{$\tau@1$ *
1.630 +$\tau@2$} provided each $a@i$ is of type $\tau@i$. The components of a pair
1.631 +are extracted by \texttt{fst} and \texttt{snd}:
1.632 +\texttt{fst($x$,$y$) = $x$} and \texttt{snd($x$,$y$) = $y$}. Tuples
1.633 +are simulated by pairs nested to the right:
1.634 +\texttt{($a@1$,$a@2$,$a@3$)} and \texttt{$\tau@1$ * $\tau@2$ * $\tau@3$}
1.635 +stand for \texttt{($a@1$,($a@2$,$a@3$))} and \texttt{$\tau@1$ * ($\tau@2$ *
1.636 +$\tau@3$)}. Therefore \texttt{fst(snd($a@1$,$a@2$,$a@3$)) = $a@2$}.
1.637 +
1.638 +It is possible to use (nested) tuples as patterns in abstractions, for
1.639 +example \texttt{\%(x,y,z).x+y+z} and \texttt{\%((x,y),z).x+y+z}.
1.640 +
1.641 +In addition to explicit $\lambda$-abstractions, tuple patterns can be used in
1.642 +most variable binding constructs. Typical examples are
1.643 +\begin{ttbox}
1.644 +let (x,y) = f z in (y,x)
1.645 +
1.646 +case xs of [] => 0 | (x,y)\#zs => x+y
1.647 +\end{ttbox}
1.648 +Further important examples are quantifiers and sets.
1.649 +
1.650 +\begin{warn}
1.651 +Abstraction over pairs and tuples is merely a convenient shorthand for a more
1.652 +complex internal representation. Thus the internal and external form of a
1.653 +term may differ, which can affect proofs. If you want to avoid this
1.654 +complication, use \texttt{fst} and \texttt{snd}, i.e.\ write
1.655 +\texttt{\%p.~fst p + snd p} instead of \texttt{\%(x,y).~x + y}.
1.656 +Theorem proving with tuple patterns is discussed later in this tutorial.
1.657 +\end{warn}
1.658 +
1.659 +
1.660 +\section{Definitions}
1.661 +\label{sec:Definitions}
1.662 +
1.663 +A definition is simply an abbreviation, i.e.\ a new name for an existing
1.664 +construction. In particular, definitions cannot be recursive. Isabelle offers
1.665 +definitions on the level of types and terms. Those on the type level are
1.666 +called type synonyms, those on the term level are called (constant)
1.667 +definitions.
1.668 +
1.669 +
1.670 +\subsection{Type synonyms}
1.671 +\indexbold{type synonyms}
1.672 +
1.673 +Type synonyms are similar to those found in ML. Their syntax is fairly
1.674 +self-explanatory:
1.675 +\begin{ttbox}
1.676 +\input{Misc/types}\end{ttbox}\indexbold{*types}
1.677 +The synonym \texttt{alist} shows that in general the type on the right-hand
1.678 +side needs to be enclosed in double quotation marks
1.679 +(see the end of~\S\ref{sec:intro-theory}).
1.680 +
1.681 +Internally all synonyms are fully expanded. As a consequence Isabelle's
1.682 +output never contains synonyms. Their main purpose is to improve the
1.683 +readability of theory definitions. Synonyms can be used just like any other
1.684 +type:
1.685 +\begin{ttbox}
1.686 +\input{Misc/consts}\end{ttbox}
1.687 +
1.688 +\subsection{Constant definitions}
1.689 +\label{sec:ConstDefinitions}
1.690 +
1.691 +The above constants \texttt{nand} and \texttt{exor} are non-recursive and can
1.692 +therefore be defined directly by
1.693 +\begin{ttbox}
1.694 +\input{Misc/defs}\end{ttbox}\indexbold{*defs}
1.695 +where \texttt{defs} is a keyword and \texttt{nand_def} and \texttt{exor_def}
1.696 +are arbitrary user-supplied names.
1.697 +The symbol \texttt{==}\index{==@{\tt==}|bold} is a special form of equality
1.698 +that should only be used in constant definitions.
1.699 +Declarations and definitions can also be merged
1.700 +\begin{ttbox}
1.701 +\input{Misc/constdefs}\end{ttbox}\indexbold{*constdefs}
1.702 +in which case the default name of each definition is $f$\texttt{_def}, where
1.703 +$f$ is the name of the defined constant.
1.704 +
1.705 +Note that pattern-matching is not allowed, i.e.\ each definition must be of
1.706 +the form $f\,x@1\,\dots\,x@n$\texttt{~==~}$t$.
1.707 +
1.708 +\S\ref{sec:Simplification} explains how definitions are used
1.709 +in proofs.
1.710 +
1.711 +\begin{warn}
1.712 +A common mistake when writing definitions is to introduce extra free variables
1.713 +on the right-hand side as in the following fictitious definition:
1.714 +\begin{ttbox}
1.715 +defs prime_def "prime(p) == (m divides p) --> (m=1 | m=p)"
1.716 +\end{ttbox}
1.717 +Isabelle rejects this `definition' because of the extra {\tt m} on the
1.718 +right-hand side, which would introduce an inconsistency. What you should have
1.719 +written is
1.720 +\begin{ttbox}
1.721 +defs prime_def "prime(p) == !m. (m divides p) --> (m=1 | m=p)"
1.722 +\end{ttbox}
1.723 +\end{warn}
1.724 +
1.725 +
1.726 +
1.727 +
1.728 +\chapter{More Functional Programming}
1.729 +
1.730 +The purpose of this chapter is to deepen the reader's understanding of the
1.731 +concepts encountered so far and to introduce an advanced method for defining
1.732 +recursive functions. The first two sections give a structured presentation of
1.733 +theorem proving by simplification (\S\ref{sec:Simplification}) and
1.734 +discuss important heuristics for induction (\S\ref{sec:InductionHeuristics}). They
1.735 +can be skipped by readers less interested in proofs. They are followed by a
1.736 +case study, a compiler for expressions (\S\ref{sec:ExprCompiler}).
1.737 +Finally we present a very general method for defining recursive functions
1.738 +that goes well beyond what \texttt{primrec} allows (\S\ref{sec:recdef}).
1.739 +
1.740 +
1.741 +\section{Simplification}
1.742 +\label{sec:Simplification}
1.743 +
1.744 +So far we have proved our theorems by \texttt{Auto_tac}, which
1.745 +`simplifies' all subgoals. In fact, \texttt{Auto_tac} can do much more than
1.746 +that, although so far it has not needed to. However, when you go beyond toy
1.747 +examples, you need to understand the ingredients of \texttt{Auto_tac}.
1.748 +This section covers the tactic that \texttt{Auto_tac} always applies first,
1.749 +namely simplification.
1.750 +
1.751 +Simplification is one of the central theorem proving tools in Isabelle and
1.752 +many other systems. The tool itself is called the \bfindex{simplifier}. The
1.753 +purpose of this section is twofold: to introduce the many features of the
1.754 +simplifier (\S\ref{sec:SimpFeatures}) and to explain a little bit how the
1.755 +simplifier works (\S\ref{sec:SimpHow}). Anybody intending to use HOL should
1.756 +read \S\ref{sec:SimpFeatures}, and the serious student should read
1.757 +\S\ref{sec:SimpHow} as well in order to understand what happened in case
1.758 +things do not simplify as expected.
1.759 +
1.760 +
1.761 +\subsection{Using the simplifier}
1.762 +\label{sec:SimpFeatures}
1.763 +
1.764 +In its most basic form, simplification means repeated application of
1.765 +equations from left to right. For example, taking the rules for \texttt{\at}
1.766 +and applying them to the term \texttt{[0,1] \at\ []} results in a sequence of
1.767 +simplification steps:
1.768 +\begin{ttbox}\makeatother
1.769 +(0#1#[]) @ [] \(\leadsto\) 0#((1#[]) @ []) \(\leadsto\) 0#(1#([] @ [])) \(\leadsto\) 0#1#[]
1.770 +\end{ttbox}
1.771 +This is also known as {\em term rewriting}, and the equations are referred
1.772 +to as {\em rewrite rules}. `Rewriting' is the more honest term because
1.773 +the terms do not necessarily become simpler in the process.
1.774 +
1.775 +\subsubsection{Simpsets}
1.776 +
1.777 +To facilitate simplification, each theory has an associated set of
1.778 +simplification rules, known as a \bfindex{simpset}. Within a theory,
1.779 +proofs by simplification refer to the associated simpset by default.
1.780 +The simpset of a theory is built up as follows: starting with the union of
1.781 +the simpsets of the parent theories, each occurrence of a \texttt{datatype}
1.782 +or \texttt{primrec} construct augments the simpset. Explicit definitions are
1.783 +not added automatically. Users can add new theorems via \texttt{Addsimps} and
1.784 +delete them again later by \texttt{Delsimps}.
1.785 +
1.786 +You may augment a simpset not just by equations but by pretty much any
1.787 +theorem. The simplifier will try to make sense of it. For example, a theorem
1.788 +\verb$~$$P$ is automatically turned into \texttt{$P$ = False}. The details are
1.789 +explained in \S\ref{sec:SimpHow}.
1.790 +
1.791 +As a rule of thumb, rewrite rules that really simplify a term (like
1.792 +\texttt{xs \at\ [] = xs} and \texttt{rev(rev xs) = xs}) should be added to the
1.793 +current simpset right after they have been proved. Those of a more specific
1.794 +nature (e.g.\ distributivity laws, which alter the structure of terms
1.795 +considerably) should only be added for specific proofs and deleted again
1.796 +afterwards. Conversely, it may also happen that a generally useful rule
1.797 +needs to be removed for a certain proof and is added again afterwards. The
1.798 +need for frequent temporary additions or deletions may indicate a badly
1.799 +designed simpset.
1.800 +\begin{warn}
1.801 + Simplification may not terminate, for example if both $f(x) = g(x)$ and
1.802 + $g(x) = f(x)$ are in the simpset. It is the user's responsibility not to
1.803 + include rules that can lead to nontermination, either on their own or in
1.804 + combination with other rules.
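+ With these two rules in the simpset, rewriting the term \texttt{f(a)}
+ oscillates forever:
+\begin{ttbox}
+f(a) \(\leadsto\) g(a) \(\leadsto\) f(a) \(\leadsto\) g(a) \(\leadsto\) \(\dots\)
+\end{ttbox}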
1.805 +\end{warn}
1.806 +
1.807 +\subsubsection{Simplification tactics}
1.808 +
1.809 +There are four main simplification tactics:
1.810 +\begin{ttdescription}
1.811 +\item[\ttindexbold{Simp_tac} $i$] simplifies the conclusion of subgoal~$i$
1.812 + using the theory's simpset. It may solve the subgoal completely if it has
1.813 + become trivial. For example:
1.814 +\begin{ttbox}\makeatother
1.815 +{\out 1. [] @ [] = []}
1.816 +by(Simp_tac 1);
1.817 +{\out No subgoals!}
1.818 +\end{ttbox}
1.819 +
1.820 +\item[\ttindexbold{Asm_simp_tac}]
1.821 + is like \verb$Simp_tac$, but extracts additional rewrite rules from
1.822 + the assumptions of the subgoal. For example, it solves
1.823 +\begin{ttbox}\makeatother
1.824 +{\out 1. xs = [] ==> xs @ xs = xs}
1.825 +\end{ttbox}
1.826 +which \texttt{Simp_tac} does not do.
1.827 +
1.828 +\item[\ttindexbold{Full_simp_tac}] is like \verb$Simp_tac$, but also
1.829 + simplifies the assumptions (without using the assumptions to
1.830 + simplify each other or the actual goal).
1.831 +
1.832 +\item[\ttindexbold{Asm_full_simp_tac}] is like \verb$Asm_simp_tac$,
1.833 + but also simplifies the assumptions. In particular, assumptions can
1.834 + simplify each other. For example:
1.835 +\begin{ttbox}\makeatother
1.836 +{\out 1. [| xs @ zs = ys @ xs; [] @ xs = [] @ [] |] ==> ys = zs}
1.837 +by(Asm_full_simp_tac 1);
1.838 +{\out No subgoals!}
1.839 +\end{ttbox}
1.840 +The second assumption simplifies to \texttt{xs = []}, which in turn
1.841 +simplifies the first assumption to \texttt{zs = ys}, thus reducing the
1.842 +conclusion to \texttt{ys = ys} and hence to \texttt{True}.
1.843 +(See also the paragraph on tracing below.)
1.844 +\end{ttdescription}
1.845 +\texttt{Asm_full_simp_tac} is the most powerful of this quartet of
1.846 +tactics. In fact, \texttt{Auto_tac} starts by applying
1.847 +\texttt{Asm_full_simp_tac} to all subgoals. The only reason for the existence
1.848 +of the other three tactics is that sometimes one wants to limit the amount of
1.849 +simplification, for example to avoid nontermination:
1.850 +\begin{ttbox}\makeatother
1.851 +{\out 1. ! x. f x = g (f (g x)) ==> f [] = f [] @ []}
1.852 +\end{ttbox}
1.853 +is solved by \texttt{Simp_tac}, but \texttt{Asm_simp_tac} and
1.854 +\texttt{Asm_full_simp_tac} loop because the rewrite rule \texttt{f x = g(f(g
1.855 +x))} extracted from the assumption does not terminate. Isabelle notices
1.856 +certain simple forms of nontermination, but not this one.
1.857 +
1.858 +\subsubsection{Modifying simpsets locally}
1.859 +
1.860 +If a certain theorem is merely needed in one proof by simplification, the
1.861 +pattern
1.862 +\begin{ttbox}
1.863 +Addsimps [\(rare_theorem\)];
1.864 +by(Simp_tac 1);
1.865 +Delsimps [\(rare_theorem\)];
1.866 +\end{ttbox}
1.867 +is awkward. Therefore there are lower-case versions of the simplification
1.868 +tactics (\ttindexbold{simp_tac}, \ttindexbold{asm_simp_tac},
1.869 +\ttindexbold{full_simp_tac}, \ttindexbold{asm_full_simp_tac}) and of the
1.870 +simpset modifiers (\ttindexbold{addsimps}, \ttindexbold{delsimps})
1.871 +that do not access or modify the implicit simpset but explicitly take a
1.872 +simpset as an argument. For example, the above three lines become
1.873 +\begin{ttbox}
1.874 +by(simp_tac (simpset() addsimps [\(rare_theorem\)]) 1);
1.875 +\end{ttbox}
1.876 +where the result of the function call \texttt{simpset()} is the simpset of
1.877 +the current theory and \texttt{addsimps} is an infix function. The implicit
1.878 +simpset is read once but not modified.
1.879 +This is far preferable to pairs of \texttt{Addsimps} and \texttt{Delsimps}.
1.880 +Local modifications can be stacked as in
1.881 +\begin{ttbox}
1.882 +by(simp_tac (simpset() addsimps [\(rare_theorem\)] delsimps [\(some_thm\)]) 1);
1.883 +\end{ttbox}
1.884 +
1.885 +\subsubsection{Rewriting with definitions}
1.886 +
1.887 +Constant definitions (\S\ref{sec:ConstDefinitions}) are not automatically
1.888 +included in the simpset of a theory. Hence such definitions are not expanded
1.889 +automatically either, just as it should be: definitions are introduced for
1.890 +the purpose of abbreviating complex concepts. Of course we need to expand the
1.891 +definitions initially to derive enough lemmas that characterize the concept
1.892 +sufficiently for us to forget the original definition completely. For
1.893 +example, given the theory
1.894 +\begin{ttbox}
1.895 +\input{Misc/Exor.thy}\end{ttbox}
1.896 +we may want to prove \verb$exor A (~A)$. Instead of \texttt{Goal} we use
1.897 +\begin{ttbox}
1.898 +\input{Misc/exorgoal.ML}\end{ttbox}
1.899 +which tells Isabelle to expand the definition of \texttt{exor}---the first
1.900 +argument of \texttt{Goalw} can be a list of definitions---in the initial goal:
1.901 +\begin{ttbox}
1.902 +{\out exor A (~ A)}
1.903 +{\out 1. A & ~ ~ A | ~ A & ~ A}
1.904 +\end{ttbox}
1.905 +In this simple example, the goal is proved by \texttt{Simp_tac}.
1.906 +Of course the resulting theorem is insufficient to characterize \texttt{exor}
1.907 +completely.
1.908 +
1.909 +In case we want to expand a definition in the middle of a proof, we can
1.910 +simply add the definition locally to the simpset:
1.911 +\begin{ttbox}
1.912 +\input{Misc/exorproof.ML}\end{ttbox}
1.913 +You should normally not add the definition permanently to the simpset
1.914 +using \texttt{Addsimps} because this defeats the whole purpose of an
1.915 +abbreviation.
1.916 +
1.917 +\begin{warn}
1.918 +If you have defined $f\,x\,y$\texttt{~==~}$t$ then you can only expand
1.919 +occurrences of $f$ with at least two arguments. Thus it is safer to define
1.920 +$f$\texttt{~==~\%$x\,y$.}$\;t$.
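+For example, if \texttt{nand} had been defined by
+\verb$nand x y == ~(x & y)$ (a hypothetical right-hand side --- the actual
+one is in the elided \texttt{defs} above), the definition could only expand
+applications of \texttt{nand} to two arguments, whereas
+\begin{ttbox}
+nand == \%x y. ~(x & y)
+\end{ttbox}
+can be expanded wherever \texttt{nand} occurs, even unapplied.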
1.921 +\end{warn}
1.922 +
1.923 +\subsubsection{Simplifying \texttt{let}-expressions}
1.924 +
1.925 +Proving a goal containing \ttindex{let}-expressions invariably requires the
1.926 +\texttt{let}-constructs to be expanded at some point. Since
1.927 +\texttt{let}-\texttt{in} is just syntactic sugar for a defined constant
1.928 +(called \texttt{Let}), expanding \texttt{let}-constructs means rewriting with
1.929 +\texttt{Let_def}:
1.930 +%context List.thy;
1.931 +%Goal "(let xs = [] in xs@xs) = ys";
1.932 +\begin{ttbox}\makeatother
1.933 +{\out 1. (let xs = [] in xs @ xs) = ys}
1.934 +by(simp_tac (simpset() addsimps [Let_def]) 1);
1.935 +{\out 1. [] = ys}
1.936 +\end{ttbox}
1.937 +If, in a particular context, there is no danger of a combinatorial explosion
1.938 +of nested \texttt{let}s one could even add \texttt{Let_def} permanently via
1.939 +\texttt{Addsimps}.
1.940 +
1.941 +\subsubsection{Conditional equations}
1.942 +
1.943 +So far all examples of rewrite rules were equations. The simplifier also
1.944 +accepts {\em conditional\/} equations, for example
1.945 +\begin{ttbox}
1.946 +xs ~= [] ==> hd xs # tl xs = xs \hfill \((*)\)
1.947 +\end{ttbox}
1.948 +(which is proved by \texttt{exhaust_tac} on \texttt{xs} followed by
1.949 +\texttt{Asm_full_simp_tac} twice). Assuming that this theorem together with
1.950 +%\begin{ttbox}\makeatother
1.951 +\texttt{(rev xs = []) = (xs = [])}
1.952 +%\end{ttbox}
1.953 +are part of the simpset, the subgoal
1.954 +\begin{ttbox}\makeatother
1.955 +{\out 1. xs ~= [] ==> hd(rev xs) # tl(rev xs) = rev xs}
1.956 +\end{ttbox}
1.957 +is proved by simplification:
1.958 +the conditional equation $(*)$ above
1.959 +can simplify \texttt{hd(rev~xs)~\#~tl(rev~xs)} to \texttt{rev xs}
1.960 +because the corresponding precondition \verb$rev xs ~= []$ simplifies to
1.961 +\verb$xs ~= []$, which is exactly the local assumption of the subgoal.
1.962 +
1.963 +
1.964 +\subsubsection{Automatic case splits}
1.965 +
1.966 +Goals containing \ttindex{if}-expressions are usually proved by case
1.967 +distinction on the condition of the \texttt{if}. For example the goal
1.968 +\begin{ttbox}
1.969 +{\out 1. ! xs. if xs = [] then rev xs = [] else rev xs ~= []}
1.970 +\end{ttbox}
1.971 +can be split into
1.972 +\begin{ttbox}
1.973 +{\out 1. ! xs. (xs = [] --> rev xs = []) \& (xs ~= [] --> rev xs ~= [])}
1.974 +\end{ttbox}
1.975 +by typing
1.976 +\begin{ttbox}
1.977 +\input{Misc/splitif.ML}\end{ttbox}
1.978 +Because this is almost always the right proof strategy, the simplifier
1.979 +performs case-splitting on \texttt{if}s automatically. Try \texttt{Simp_tac}
1.980 +on the initial goal above.
1.981 +
1.982 +This splitting idea generalizes from \texttt{if} to \ttindex{case}:
1.983 +\begin{ttbox}\makeatother
1.984 +{\out 1. (case xs of [] => zs | y#ys => y#(ys@zs)) = xs@zs}
1.985 +\end{ttbox}
1.986 +becomes
1.987 +\begin{ttbox}\makeatother
1.988 +{\out 1. (xs = [] --> zs = xs @ zs) &}
1.989 +{\out (! a list. xs = a # list --> a # list @ zs = xs @ zs)}
1.990 +\end{ttbox}
1.991 +by typing
1.992 +\begin{ttbox}
1.993 +\input{Misc/splitlist.ML}\end{ttbox}
1.994 +In contrast to \texttt{if}-expressions, the simplifier does not split
1.995 +\texttt{case}-expressions by default because this can lead to nontermination
1.996 +in case of recursive datatypes.
1.997 +Nevertheless the simplifier can be instructed to perform \texttt{case}-splits
1.998 +by adding the appropriate rule to the simpset:
1.999 +\begin{ttbox}
1.1000 +by(simp_tac (simpset() addsplits [split_list_case]) 1);
1.1001 +\end{ttbox}\indexbold{*addsplits}
1.1002 +solves the initial goal outright, which \texttt{Simp_tac} alone will not do.
1.1003 +
1.1004 +In general, every datatype $t$ comes with a rule
1.1005 +\texttt{$t$.split} that can be added to the simpset either
1.1006 +locally via \texttt{addsplits} (see above), or permanently via
1.1007 +\begin{ttbox}
1.1008 +Addsplits [\(t\).split];
1.1009 +\end{ttbox}\indexbold{*Addsplits}
1.1010 +Split-rules can be removed globally via \ttindexbold{Delsplits} and locally
1.1011 +via \ttindexbold{delsplits} as, for example, in
1.1012 +\begin{ttbox}
1.1013 +by(simp_tac (simpset() addsimps [\(\dots\)] delsplits [split_if]) 1);
1.1014 +\end{ttbox}
1.1015 +
1.1016 +
1.1017 +\subsubsection{Permutative rewrite rules}
1.1018 +
1.1019 +A rewrite rule is {\bf permutative} if the left-hand side and right-hand side
1.1020 +are the same up to renaming of variables. The most common permutative rule
1.1021 +is commutativity: $x+y = y+x$. Another example is $(x-y)-z = (x-z)-y$. Such
1.1022 +rules are problematic because once they apply, they can be used forever.
1.1023 +The simplifier is aware of this danger and treats permutative rules
1.1024 +separately. For details see~\cite{Isa-Ref-Man}.
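+
+Roughly speaking, the simplifier applies a permutative rule only if doing so
+makes the term smaller with respect to a fixed term ordering. Assuming that
+ordering puts \texttt{a} before \texttt{b}, commutativity normalizes sums
+without looping:
+\begin{ttbox}
+b + a \(\leadsto\) a + b \hfill (smaller in the term ordering)
+a + b \hfill (left unchanged)
+\end{ttbox}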
1.1025 +
1.1026 +\subsubsection{Tracing}
1.1027 +\indexbold{tracing the simplifier}
1.1028 +
1.1029 +Using the simplifier effectively may take a bit of experimentation. Set the
1.1030 +\verb$trace_simp$ flag to get a better idea of what is going on:
1.1031 +\begin{ttbox}\makeatother
1.1032 +{\out 1. rev [x] = []}
1.1033 +\ttbreak
1.1034 +set trace_simp;
1.1035 +by(Simp_tac 1);
1.1036 +\ttbreak\makeatother
1.1037 +{\out Applying instance of rewrite rule:}
1.1038 +{\out rev (?x # ?xs) == rev ?xs @ [?x]}
1.1039 +{\out Rewriting:}
1.1040 +{\out rev [x] == rev [] @ [x]}
1.1041 +\ttbreak
1.1042 +{\out Applying instance of rewrite rule:}
1.1043 +{\out rev [] == []}
1.1044 +{\out Rewriting:}
1.1045 +{\out rev [] == []}
1.1046 +\ttbreak\makeatother
1.1047 +{\out Applying instance of rewrite rule:}
1.1048 +{\out [] @ ?y == ?y}
1.1049 +{\out Rewriting:}
1.1050 +{\out [] @ [x] == [x]}
1.1051 +\ttbreak
1.1052 +{\out Applying instance of rewrite rule:}
1.1053 +{\out ?x # ?t = ?t == False}
1.1054 +{\out Rewriting:}
1.1055 +{\out [x] = [] == False}
1.1056 +\ttbreak
1.1057 +{\out Level 1}
1.1058 +{\out rev [x] = []}
1.1059 +{\out 1. False}
1.1060 +\end{ttbox}
1.1061 +In more complicated cases, the trace can be enormous, especially since
1.1062 +invocations of the simplifier are often nested (e.g.\ when solving conditions
1.1063 +of rewrite rules).
1.1064 +
1.1065 +\subsection{How it works}
1.1066 +\label{sec:SimpHow}
1.1067 +
1.1068 +\subsubsection{Higher-order patterns}
1.1069 +
1.1070 +\subsubsection{Local assumptions}
1.1071 +
1.1072 +\subsubsection{The preprocessor}
1.1073 +
1.1074 +\section{Induction heuristics}
1.1075 +\label{sec:InductionHeuristics}
1.1076 +
1.1077 +The purpose of this section is to illustrate some simple heuristics for
1.1078 +inductive proofs. The first one we have already mentioned in our initial
1.1079 +example:
1.1080 +\begin{quote}
1.1081 +{\em 1. Theorems about recursive functions are proved by induction.}
1.1082 +\end{quote}
1.1083 +In case the function has more than one argument
1.1084 +\begin{quote}
1.1085 +{\em 2. Do induction on argument number $i$ if the function is defined by
1.1086 +recursion in argument number $i$.}
1.1087 +\end{quote}
1.1088 +When we look at the proof of
1.1089 +\begin{ttbox}\makeatother
1.1090 +(xs @ ys) @ zs = xs @ (ys @ zs)
1.1091 +\end{ttbox}
1.1092 +in \S\ref{sec:intro-proof} we find (a) \texttt{\at} is recursive in the first
1.1093 +argument, (b) \texttt{xs} occurs only as the first argument of \texttt{\at},
1.1094 +and (c) both \texttt{ys} and \texttt{zs} occur at least once as the second
1.1095 +argument of \texttt{\at}. Hence it is natural to perform induction on
1.1096 +\texttt{xs}.
1.1097 +
1.1098 +The key heuristic, and the main point of this section, is to
1.1099 +generalize the goal before induction. The reason is simple: if the goal is
1.1100 +too specific, the induction hypothesis is too weak to allow the induction
1.1101 +step to go through. Let us now illustrate the idea with an example.
1.1102 +
1.1103 +We define a tail-recursive version of list-reversal,
1.1104 +i.e.\ one that can be compiled into a loop:
1.1105 +\begin{ttbox}
1.1106 +\input{Misc/Itrev.thy}\end{ttbox}
1.1107 +The behaviour of \texttt{itrev} is simple: it reverses its first argument by
1.1108 +stacking its elements onto the second argument; it returns the second
1.1109 +argument once the first one becomes empty.
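+For example, reversing a three-element list moves one element per step onto
+the accumulating second argument:
+\begin{ttbox}
+itrev [1,2,3] [] \(\leadsto\) itrev [2,3] [1] \(\leadsto\) itrev [3] [2,1]
+                 \(\leadsto\) itrev [] [3,2,1] \(\leadsto\) [3,2,1]
+\end{ttbox}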
1.1110 +We need to show that \texttt{itrev} does indeed reverse its first argument
1.1111 +provided the second one is empty:
1.1112 +\begin{ttbox}
1.1113 +\input{Misc/itrev1.ML}\end{ttbox}
1.1114 +There is no choice as to the induction variable, and we immediately simplify:
1.1115 +\begin{ttbox}
1.1116 +\input{Misc/induct_auto.ML}\ttbreak\makeatother
1.1117 +{\out 1. !!a list. itrev list [] = rev list\(\;\)==> itrev list [a] = rev list @ [a]}
1.1118 +\end{ttbox}
1.1119 +Just as predicted above, the overall goal, and hence the induction
1.1120 +hypothesis, is too weak to solve the induction step because of the fixed
1.1121 +\texttt{[]}. The corresponding heuristic:
1.1122 +\begin{quote}
1.1123 +{\em 3. Generalize goals for induction by replacing constants by variables.}
1.1124 +\end{quote}
1.1125 +Of course one cannot do this na\"{\i}vely: \texttt{itrev xs ys = rev xs} is
1.1126 +just not true --- the correct generalization is
1.1127 +\begin{ttbox}\makeatother
1.1128 +\input{Misc/itrev2.ML}\end{ttbox}
1.1129 +If \texttt{ys} is replaced by \texttt{[]}, the right-hand side simplifies to
1.1130 +\texttt{rev xs}, just as required.
1.1131 +
1.1132 +In this particular instance it is easy to guess the right generalization,
1.1133 +but in more complex situations a good deal of creativity is needed. This is
1.1134 +the main source of complications in inductive proofs.
1.1135 +
1.1136 +Although we now have two variables, only \texttt{xs} is suitable for
1.1137 +induction, and we repeat our above proof attempt. Unfortunately, we are still
1.1138 +not there:
1.1139 +\begin{ttbox}\makeatother
1.1140 +{\out 1. !!a list.}
1.1141 +{\out itrev list ys = rev list @ ys}
1.1142 +{\out ==> itrev list (a # ys) = rev list @ a # ys}
1.1143 +\end{ttbox}
1.1144 +The induction hypothesis is still too weak, but this time it takes no
1.1145 +intuition to generalize: the problem is that \texttt{ys} is fixed throughout
1.1146 +the subgoal, but the induction hypothesis needs to be applied with
1.1147 +\texttt{a \# ys} instead of \texttt{ys}. Hence we prove the theorem
1.1148 +for all \texttt{ys} instead of a fixed one:
1.1149 +\begin{ttbox}\makeatother
1.1150 +\input{Misc/itrev3.ML}\end{ttbox}
1.1151 +This time induction on \texttt{xs} followed by simplification succeeds. This
1.1152 +leads to another heuristic for generalization:
1.1153 +\begin{quote}
1.1154 +{\em 4. Generalize goals for induction by universally quantifying all free
1.1155 +variables {\em(except the induction variable itself!)}.}
1.1156 +\end{quote}
1.1157 +This prevents trivial failures like the above and does not change the
1.1158 +provability of the goal. However, because it is not always required, and may
1.1159 +even complicate matters in some cases, this heuristic should not be
1.1160 +applied blindly.
1.1161 +
1.1162 +A final point worth mentioning is the orientation of the equation we just
1.1163 +proved: the more complex notion (\texttt{itrev}) is on the left-hand
1.1164 +side, the simpler \texttt{rev} on the right-hand side. This constitutes
1.1165 +another, albeit weak, heuristic that is not restricted to induction:
1.1166 +\begin{quote}
1.1167 + {\em 5. The right-hand side of an equation should (in some sense) be
1.1168 + simpler than the left-hand side.}
1.1169 +\end{quote}
1.1170 +The heuristic is tricky to apply because it is not obvious that
1.1171 +\texttt{rev xs \at\ ys} is simpler than \texttt{itrev xs ys}. But see what
1.1172 +happens if you try to prove the symmetric equation!
1.1173 +
1.1174 +
1.1175 +\section{Case study: compiling expressions}
1.1176 +\label{sec:ExprCompiler}
1.1177 +
1.1178 +The task is to develop a compiler from a generic type of expressions (built
1.1179 +up from variables, constants and binary operations) to a stack machine. This
1.1180 +generic type of expressions is a generalization of the boolean expressions in
1.1181 +\S\ref{sec:boolex}. This time we do not commit ourselves to a particular
1.1182 +type of variables or values but make them type parameters. Neither is there
1.1183 +a fixed set of binary operations: instead the expression contains the
1.1184 +appropriate function itself.
1.1185 +\begin{ttbox}
1.1186 +\input{CodeGen/expr}\end{ttbox}
1.1187 +The three constructors represent constants, variables and the combination of
1.1188 +two subexpressions with a binary operation.
1.1189 +
1.1190 +The value of an expression w.r.t.\ an environment that maps variables to
1.1191 +values is easily defined:
1.1192 +\begin{ttbox}
1.1193 +\input{CodeGen/value}\end{ttbox}
1.1194 +
1.1195 +The stack machine has three instructions: load a constant value onto the
1.1196 +stack, load the contents of a certain address onto the stack, and apply a
1.1197 +binary operation to the two topmost elements of the stack, replacing them by
1.1198 +the result. As for \texttt{expr}, addresses and values are type parameters:
1.1199 +\begin{ttbox}
1.1200 +\input{CodeGen/instr}\end{ttbox}
1.1201 +
1.1202 +The execution of the stack machine is modelled by a function \texttt{exec}
1.1203 +that takes a store (modelled as a function from addresses to values, just
1.1204 +like the environment for evaluating expressions), a stack (modelled as a
1.1205 +list) of values and a list of instructions and returns the stack at the end
1.1206 +of the execution --- the store remains unchanged:
1.1207 +\begin{ttbox}
1.1208 +\input{CodeGen/exec}\end{ttbox}
1.1209 +Recall that \texttt{hd} and \texttt{tl}
1.1210 +return the first element and the remainder of a list.
1.1211 +
1.1212 +Because all functions are total, \texttt{hd} is defined even for the empty
1.1213 +list, although we do not know what the result is. Thus our model of the
1.1214 +machine always terminates properly, although the above definition does not
1.1215 +tell us much about the result in situations where \texttt{Apply} was executed
1.1216 +with fewer than two elements on the stack.
1.1217 +
1.1218 +The compiler is a function from expressions to a list of instructions. Its
1.1219 +definition is pretty much obvious:
1.1220 +\begin{ttbox}\makeatother
1.1221 +\input{CodeGen/comp}\end{ttbox}
1.1222 +
1.1223 +Now we have to prove the correctness of the compiler, i.e.\ that the
1.1224 +execution of a compiled expression results in the value of the expression:
1.1225 +\begin{ttbox}
1.1226 +exec s [] (comp e) = [value s e]
1.1227 +\end{ttbox}
1.1228 +This is generalized to
1.1229 +\begin{ttbox}
1.1230 +\input{CodeGen/goal2.ML}\end{ttbox}
1.1231 +and proved by induction on \texttt{e} followed by simplification, once we
1.1232 +have the following lemma about executing the concatenation of two instruction
1.1233 +sequences:
1.1234 +\begin{ttbox}\makeatother
1.1235 +\input{CodeGen/goal2.ML}\end{ttbox}
1.1236 +This requires induction on \texttt{xs} and ordinary simplification for the
1.1237 +base cases. In the induction step, simplification leaves us with a formula
1.1238 +that contains two \texttt{case}-expressions over instructions. Thus we add
1.1239 +automatic case splitting as well, which finishes the proof:
1.1240 +\begin{ttbox}
1.1241 +\input{CodeGen/simpsplit.ML}\end{ttbox}
1.1242 +
1.1243 +We could now go back and prove \texttt{exec s [] (comp e) = [value s e]}
1.1244 +merely by simplification with the generalized version we just proved.
1.1245 +However, this is unnecessary because the generalized version fully subsumes
1.1246 +its instance.
1.1247 +
1.1248 +\section{Total recursive functions}
1.1249 +\label{sec:recdef}
1.1250 +\index{*recdef|(}
1.1251 +
1.1252 +
1.1253 +Although many total functions have a natural primitive recursive definition,
1.1254 +this is not always the case. Arbitrary total recursive functions can be
1.1255 +defined by means of \texttt{recdef}: you can use full pattern-matching,
1.1256 +recursion need not involve datatypes, and termination is proved by showing
1.1257 +that each recursive call makes the argument smaller in a suitable
1.1258 +(user-supplied) sense.
1.1259 +
1.1260 +\subsection{Defining recursive functions}
1.1261 +
1.1262 +Here is a simple example, the Fibonacci function:
1.1263 +\begin{ttbox}
1.1264 +consts fib :: nat => nat
1.1265 +recdef fib "measure(\%n. n)"
1.1266 + "fib 0 = 0"
1.1267 + "fib 1 = 1"
1.1268 + "fib (Suc(Suc x)) = fib x + fib (Suc x)"
1.1269 +\end{ttbox}
1.1270 +The definition of \texttt{fib} is accompanied by a \bfindex{measure function}
1.1271 +\texttt{\%n.$\;$n} that maps the argument of \texttt{fib} to a natural
1.1272 +number. The requirement is that in each equation the measure of the argument
1.1273 +on the left-hand side is strictly greater than the measure of the argument of
1.1274 +each recursive call. In the case of \texttt{fib} this is obviously true
1.1275 +because the measure function is the identity and \texttt{Suc(Suc~x)} is
1.1276 +strictly greater than both \texttt{x} and \texttt{Suc~x}.
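+
+Behind the scenes, the termination conditions Isabelle has to prove here
+amount to the two goals
+\begin{ttbox}
+x < Suc (Suc x) \hfill Suc x < Suc (Suc x)
+\end{ttbox}
+which the default simpset solves automatically.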
1.1277 +
1.1278 +Slightly more interesting is the insertion of a fixed element
1.1279 +between any two elements of a list:
1.1280 +\begin{ttbox}
1.1281 +consts sep :: "'a * 'a list => 'a list"
1.1282 +recdef sep "measure (\%(a,xs). length xs)"
1.1283 + "sep(a, []) = []"
1.1284 + "sep(a, [x]) = [x]"
1.1285 + "sep(a, x#y#zs) = x # a # sep(a,y#zs)"
1.1286 +\end{ttbox}
1.1287 +This time the measure is the length of the list, which decreases with the
1.1288 +recursive call; the first component of the argument tuple is irrelevant.
1.1289 +
1.1290 +Pattern matching need not be exhaustive:
1.1291 +\begin{ttbox}
1.1292 +consts last :: 'a list => 'a
1.1293 +recdef last "measure (\%xs. length xs)"
1.1294 + "last [x] = x"
1.1295 + "last (x#y#zs) = last (y#zs)"
1.1296 +\end{ttbox}
1.1297 +
1.1298 +Overlapping patterns are disambiguated by taking the order of equations into
1.1299 +account, just as in functional programming:
1.1300 +\begin{ttbox}
1.1301 +recdef sep "measure (\%(a,xs). length xs)"
1.1302 + "sep(a, x#y#zs) = x # a # sep(a,y#zs)"
1.1303 + "sep(a, xs) = xs"
1.1304 +\end{ttbox}
1.1305 +This defines exactly the same function \texttt{sep} as before.
1.1306 +
1.1307 +\begin{warn}
1.1308 +Currently \texttt{recdef} only accepts functions with a single argument,
1.1309 +possibly of tuple type.
1.1310 +\end{warn}
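+
+A function of several arguments must therefore be defined over a tuple, as in
+\texttt{sep} above. Should a curried version be required as well, it can be
+introduced afterwards by an ordinary, non-recursive definition (a sketch; the
+name \texttt{sep'} is merely illustrative):
+\begin{ttbox}
+consts sep' :: "'a => 'a list => 'a list"
+defs sep'_def "sep' a xs == sep(a,xs)"
+\end{ttbox}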
1.1311 +
1.1312 +When loading a theory containing a \texttt{recdef} of a function $f$,
1.1313 +Isabelle proves the recursion equations and stores the result as a list of
1.1314 +theorems $f$\texttt{.rules}. It can be viewed by typing
1.1315 +\begin{ttbox}
1.1316 +prths \(f\).rules;
1.1317 +\end{ttbox}
1.1318 +All of the above examples are simple enough that Isabelle can determine
1.1319 +automatically that the measure of the argument goes down in each recursive
1.1320 +call. In that case $f$.\texttt{rules} contains precisely the defining
1.1321 +equations.
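+For \texttt{fib}, for example, \texttt{prths fib.rules;} prints (up to the
+exact layout of the output) just the three defining equations:
+\begin{ttbox}
+fib 0 = 0
+fib 1 = 1
+fib (Suc (Suc x)) = fib x + fib (Suc x)
+\end{ttbox}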
1.1322 +
1.1323 +In general, Isabelle may not be able to prove all termination conditions
1.1324 +automatically. For example, termination of
1.1325 +\begin{ttbox}
1.1326 +consts gcd :: "nat*nat => nat"
1.1327 +recdef gcd "measure (\%(m,n). n)"
1.1328 + "gcd (m, n) = (if n=0 then m else gcd(n, m mod n))"
1.1329 +\end{ttbox}
1.1330 +relies on the lemma \texttt{mod_less_divisor}
1.1331 +\begin{ttbox}
1.1332 +0 < n ==> m mod n < n
1.1333 +\end{ttbox}
1.1334 +that is not part of the default simpset. As a result, Isabelle prints a
1.1335 +warning and \texttt{gcd.rules} contains a precondition:
1.1336 +\begin{ttbox}
1.1337 +(! m n. 0 < n --> m mod n < n) ==> gcd (m, n) = (if n=0 \dots)
1.1338 +\end{ttbox}
1.1339 +We need to instruct \texttt{recdef} to use an extended simpset to prove the
1.1340 +termination condition:
1.1341 +\begin{ttbox}
1.1342 +recdef gcd "measure (\%(m,n). n)"
1.1343 + simpset "simpset() addsimps [mod_less_divisor]"
1.1344 + "gcd (m, n) = (if n=0 then m else gcd(n, m mod n))"
1.1345 +\end{ttbox}
1.1346 +This time everything works fine and \texttt{gcd.rules} contains precisely the
1.1347 +stated recursion equation for \texttt{gcd}.
1.1348 +
1.1349 +When defining some nontrivial total recursive function, the first attempt
1.1350 +will usually generate a number of termination conditions, some of which may
1.1351 +require new lemmas to be proved in some of the parent theories. Those lemmas
1.1352 +can then be added to the simpset used by \texttt{recdef} for its
1.1353 +proofs, as shown for \texttt{gcd}.
1.1354 +
1.1355 +Although all the above examples employ measure functions, \texttt{recdef}
1.1356 +allows arbitrary wellfounded relations. For example, termination of
1.1357 +Ackermann's function requires the lexicographic product \texttt{**}:
1.1358 +\begin{ttbox}
+consts ack :: "nat*nat => nat"
1.1359 +recdef ack "measure(\%m. m) ** measure(\%n. n)"
1.1360 + "ack(0,n) = Suc n"
1.1361 + "ack(Suc m,0) = ack(m, 1)"
1.1362 + "ack(Suc m,Suc n) = ack(m,ack(Suc m,n))"
1.1363 +\end{ttbox}
1.1364 +For details see~\cite{Isa-Logics-Man} and the examples in the library.
1.1365 +
1.1366 +
1.1367 +\subsection{Deriving simplification rules}
1.1368 +
1.1369 +Once we have succeeded in proving all termination conditions, we can start to
1.1370 +derive some theorems. In contrast to \texttt{primrec} definitions, which are
1.1371 +automatically added to the simpset, \texttt{recdef} rules must be included
1.1372 +explicitly, for example as in
1.1373 +\begin{ttbox}
1.1374 +Addsimps fib.rules;
1.1375 +\end{ttbox}
1.1376 +However, some care is necessary now, in contrast to \texttt{primrec}.
1.1377 +Although \texttt{gcd} is a total function, its defining equation leads to
1.1378 +nontermination of the simplifier, because the subterm \texttt{gcd(n, m mod
1.1379 + n)} on the right-hand side can again be simplified by the same equation,
1.1380 +and so on. The reason: the simplifier rewrites the \texttt{then} and
1.1381 +\texttt{else} branches of a conditional if the condition simplifies to
1.1382 +neither \texttt{True} nor \texttt{False}. Therefore it is recommended to
1.1383 +derive an alternative formulation that replaces case distinctions on the
1.1384 +right-hand side by conditional equations. For \texttt{gcd} it means we have
1.1385 +to prove
1.1386 +\begin{ttbox}
1.1387 + gcd (m, 0) = m
1.1388 +n ~= 0 ==> gcd (m, n) = gcd(n, m mod n)
1.1389 +\end{ttbox}
1.1390 +To avoid nontermination during those proofs, we have to resort to some
1.1391 +low-level tactics:
1.1392 +\begin{ttbox}
1.1393 +Goal "gcd(m,0) = m";
1.1394 +by(resolve_tac [trans] 1);
1.1395 +by(resolve_tac gcd.rules 1);
1.1396 +by(Simp_tac 1);
1.1397 +\end{ttbox}
1.1398 +At this point it is not necessary to understand what exactly
1.1399 +\texttt{resolve_tac} is doing. The main point is that the above proof works
1.1400 +not just for this one example but in general; one merely has to use
1.1401 +\texttt{Asm_simp_tac} and $f$\texttt{.rules} instead. Try the second
1.1402 +\texttt{gcd}-equation above!
1.1403 +
1.1404 +\subsection{Induction}
1.1405 +
1.1406 +Assuming we have added the recursion equations (or some suitable derived
1.1407 +equations) to the simpset, we might like to prove something about our
1.1408 +function. Since the function is recursive, the natural proof principle is
1.1409 +again induction. But this time the structural form of induction that comes
1.1410 +with datatypes is unlikely to work well---otherwise we could have defined the
1.1411 +function by \texttt{primrec}. Therefore \texttt{recdef} automatically proves
1.1412 +a suitable induction rule $f$\texttt{.induct} that follows the recursion
1.1413 +pattern of the particular function $f$. Roughly speaking, it requires you to
1.1414 +prove for each \texttt{recdef} equation that the property you are trying to
1.1415 +establish holds for the left-hand side provided it holds for all recursive
1.1416 +calls on the right-hand side. Applying $f$\texttt{.induct} requires its
1.1417 +explicit instantiation. See \S\ref{sec:explicit-inst} for details.
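+
+For example, a simple distributivity property of \texttt{sep} could be proved
+along the following lines. This is only a sketch: it assumes that
+\texttt{sep.rules} have been added to the simpset, and that \texttt{u} and
+\texttt{v} are the variable names occurring in \texttt{sep.induct}:
+\begin{ttbox}
+Goal "map f (sep(x,xs)) = sep(f x, map f xs)";
+by(res_inst_tac [("u","x"),("v","xs")] sep.induct 1);
+by(ALLGOALS Asm_simp_tac);
+\end{ttbox}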
1.1418 +
1.1419 +\index{*recdef|)}