% doc-src/Tutorial/fp.tex, Wed Aug 26 16:57:49 1998 +0200
\chapter{Functional Programming in HOL}

Although on the surface this chapter is mainly concerned with how to write
functional programs in HOL and how to verify them, most of the
constructs and proof procedures introduced are general purpose and recur in
any specification or verification task.

The dedicated functional programmer should be warned: HOL offers only what
could be called {\em total functional programming} --- all functions in HOL
must be total; lazy data structures are not directly available. On the
positive side, functions in HOL need not be computable: HOL is a
specification language that goes well beyond what can be expressed as a
program. However, for the time being we concentrate on the computable.

\section{An introductory theory}
\label{sec:intro-theory}

Functional programming needs datatypes and functions. Both of them can be
defined in a theory with a syntax reminiscent of languages like ML or
Haskell. As an example consider the theory in Fig.~\ref{fig:ToyList}.

\begin{figure}[htbp]
\begin{ttbox}\makeatother
\input{ToyList/ToyList.thy}\end{ttbox}
\caption{A theory of lists}
\label{fig:ToyList}
\end{figure}
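
The file included in Fig.~\ref{fig:ToyList} is not reproduced in this
source. As an approximation, a theory of the shape described in the
following paragraphs would look roughly like this (a sketch only; the
actual \texttt{ToyList.thy} may differ in detail):
\begin{ttbox}\makeatother
ToyList = Datatype +

datatype 'a list = Nil                          ("[]")
                 | Cons 'a ('a list)            (infixr "#" 65)

consts app :: 'a list => 'a list => 'a list     (infixr "@" 65)
       rev :: 'a list => 'a list

primrec
"[] @ ys = ys"
"(x # xs) @ ys = x # (xs @ ys)"

primrec
"rev [] = []"
"rev (x # xs) = rev xs @ [x]"

end
\end{ttbox}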

HOL already has a predefined theory of lists called \texttt{List} ---
\texttt{ToyList} is merely a small fragment of it chosen as an example. In
contrast to what is recommended in \S\ref{sec:Basic:Theories},
\texttt{ToyList} is not based on \texttt{Main} but on \texttt{Datatype}, a
theory that contains everything required for datatype definitions but does
not have \texttt{List} as a parent, thus avoiding ambiguities caused by
defining lists twice.

The \ttindexbold{datatype} \texttt{list} introduces two constructors
\texttt{Nil} and \texttt{Cons}, the empty list and the operator that adds an
element to the front of a list. For example, the term \texttt{Cons True (Cons
  False Nil)} is a value of type \texttt{bool~list}, namely the list with the
elements \texttt{True} and \texttt{False}. Because this notation becomes
unwieldy very quickly, the datatype declaration is annotated with an
alternative syntax: instead of \texttt{Nil} and \texttt{Cons}~$x$~$xs$ we can
write \index{#@{\tt[]}|bold}\texttt{[]} and
\texttt{$x$~\#~$xs$}\index{#@{\tt\#}|bold}. In fact, this alternative syntax
is the standard syntax. Thus the list \texttt{Cons True (Cons False Nil)}
becomes \texttt{True \# False \# []}. The annotation \ttindexbold{infixr}
means that \texttt{\#} associates to the right, i.e.\ the term \texttt{$x$ \#
  $y$ \# $z$} is read as \texttt{$x$ \# ($y$ \# $z$)} and not as \texttt{($x$
  \# $y$) \# $z$}.

\begin{warn}
  Syntax annotations are a powerful but completely optional feature. You
  could drop them from theory \texttt{ToyList} and go back to the identifiers
  \texttt{Nil} and \texttt{Cons}. However, lists are such a central datatype
  that their syntax is highly customized. We recommend that novices should
  not use syntax annotations in their own theories.
\end{warn}

Next, the functions \texttt{app} and \texttt{rev} are declared. In contrast
to ML, Isabelle insists on explicit declarations of all functions (keyword
\ttindexbold{consts}). (Apart from the declaration-before-use restriction,
the order of items in a theory file is unconstrained.) Function \texttt{app}
is annotated with concrete syntax too. Instead of the prefix syntax
\texttt{app}~$xs$~$ys$ the infix $xs$~\texttt{\at}~$ys$ becomes the preferred
form.

Both functions are defined recursively. The equations for \texttt{app} and
\texttt{rev} hardly need comments: \texttt{app} appends two lists and
\texttt{rev} reverses a list. The keyword \texttt{primrec} indicates that
the recursion is of a particularly primitive kind where each recursive call
peels off a datatype constructor from one of the arguments (see
\S\ref{sec:datatype}). Thus the recursion always terminates, i.e.\ the
function is \bfindex{total}.

The termination requirement is absolutely essential in HOL, a logic of total
functions. If we were to drop it, inconsistencies could quickly arise: the
``definition'' $f(n) = f(n)+1$ immediately leads to $0 = 1$ by subtracting
$f(n)$ on both sides.
% However, this is a subtle issue that we cannot discuss here further.

\begin{warn}
  As we have indicated, the desire for total functions is not a gratuitously
  imposed restriction but an essential characteristic of HOL. It is only
  because of totality that reasoning in HOL is comparatively easy. More
  generally, the philosophy in HOL is not to allow arbitrary axioms (such as
  function definitions whose totality has not been proved) because they
  quickly lead to inconsistencies. Instead, fixed constructs for introducing
  types and functions are offered (such as \texttt{datatype} and
  \texttt{primrec}) which are guaranteed to preserve consistency.
\end{warn}

A remark about syntax. The textual definition of a theory follows a fixed
syntax with keywords like \texttt{datatype} and \texttt{end} (see
Fig.~\ref{fig:keywords} in Appendix~\ref{sec:Appendix} for a full list).
Embedded in this syntax are the types and formulae of HOL, whose syntax is
extensible, e.g.\ by new user-defined infix operators
(see~\S\ref{sec:infix-syntax}). To distinguish the two levels, everything
HOL-specific should be enclosed in \texttt{"}\dots\texttt{"}. The same holds
for identifiers that happen to be keywords, as in
\begin{ttbox}
consts "end" :: 'a list => 'a
\end{ttbox}
To lessen this burden, quotation marks around types can be dropped,
provided their syntax does not go beyond what is described in
\S\ref{sec:TypesTermsForms}. Types containing further operators, e.g.\
\texttt{*} for Cartesian products, need quotation marks.
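For instance, a declaration whose type involves the product-type operator
\texttt{*} must quote that type (\texttt{swap} is a hypothetical constant,
shown only to illustrate the quoting rule):
\begin{ttbox}
consts swap :: "'a * 'b => 'b * 'a"
\end{ttbox}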

When Isabelle prints a syntax error message, it refers to the HOL syntax as
the \bfindex{inner syntax}.

\section{An introductory proof}
\label{sec:intro-proof}

Having defined \texttt{ToyList}, we load it with the ML command
\begin{ttbox}
use_thy "ToyList";
\end{ttbox}
and are ready to prove a few simple theorems. This will illustrate not just
the basic proof commands but also the typical proof process.

\subsubsection*{Main goal: \texttt{rev(rev xs) = xs}}

Our goal is to show that reversing a list twice produces the original
list. Typing
\begin{ttbox}
\input{ToyList/thm}\end{ttbox}
establishes a new goal to be proved in the context of the current theory,
which is the one we just loaded. Isabelle's response is to print the current
proof state:
\begin{ttbox}
{\out Level 0}
{\out rev (rev xs) = xs}
{\out 1. rev (rev xs) = xs}
\end{ttbox}
Until we have finished a proof, the proof state always looks like this:
\begin{ttbox}
{\out Level \(i\)}
{\out \(G\)}
{\out 1. \(G@1\)}
{\out \(\vdots\)}
{\out \(n\). \(G@n\)}
\end{ttbox}
where \texttt{Level}~$i$ indicates that we are $i$ steps into the proof, $G$
is the overall goal that we are trying to prove, and the numbered lines
contain the subgoals $G@1$, \dots, $G@n$ that we need to prove to establish
$G$. At \texttt{Level 0} there is only one subgoal, which is identical with
the overall goal. Normally $G$ is constant and only serves as a reminder.
Hence we rarely show it in this tutorial.

Let us now get back to \texttt{rev(rev xs) = xs}. Properties of recursively
defined functions are best established by induction. In this case there is
not much choice except to induct on \texttt{xs}:
\begin{ttbox}
\input{ToyList/inductxs}\end{ttbox}
This tells Isabelle to perform induction on variable \texttt{xs} in subgoal
1. The new proof state contains two subgoals, namely the base case
(\texttt{Nil}) and the induction step (\texttt{Cons}):
\begin{ttbox}
{\out 1. rev (rev []) = []}
{\out 2. !!a list. rev (rev list) = list ==> rev (rev (a # list)) = a # list}
\end{ttbox}
The induction step is an example of the general format of a subgoal:
\begin{ttbox}
{\out \(i\). !!\(x@1 \dots x@n\). {\it assumptions} ==> {\it conclusion}}
\end{ttbox}\index{==>@{\tt==>}|bold}
The prefix of bound variables \texttt{!!\(x@1 \dots x@n\)} can be ignored
most of the time, or simply treated as a list of variables local to this
subgoal. Their deeper significance is explained in \S\ref{sec:PCproofs}. The
{\it assumptions} are the local assumptions for this subgoal and {\it
  conclusion} is the actual proposition to be proved. Typical proof steps
that add new assumptions are induction or case distinction. In our example
the only assumption is the induction hypothesis \texttt{rev (rev list) =
  list}, where \texttt{list} is a variable name chosen by Isabelle. If there
are multiple assumptions, they are enclosed in the bracket pair
\texttt{[|}\index{==>@\ttlbr|bold} and \texttt{|]}\index{==>@\ttrbr|bold}
and separated by semicolons.

Let us try to solve both goals automatically:
\begin{ttbox}
\input{ToyList/autotac}\end{ttbox}
This command tells Isabelle to apply a proof strategy called
\texttt{Auto_tac} to all subgoals. Essentially, \texttt{Auto_tac} tries to
`simplify' the subgoals. In our case, subgoal~1 is solved completely (thanks
to the equation \texttt{rev [] = []}) and disappears; the simplified version
of subgoal~2 becomes the new subgoal~1:
\begin{ttbox}\makeatother
{\out 1. !!a list. rev(rev list) = list ==> rev(rev list @ a # []) = a # list}
\end{ttbox}
In order to simplify this subgoal further, a lemma suggests itself.

\subsubsection*{First lemma: \texttt{rev(xs \at~ys) = (rev ys) \at~(rev xs)}}

We start the proof as usual:
\begin{ttbox}\makeatother
\input{ToyList/lemma1}\end{ttbox}
There are two variables that we could induct on: \texttt{xs} and
\texttt{ys}. Because \texttt{\at} is defined by recursion on
the first argument, \texttt{xs} is the correct one:
\begin{ttbox}
\input{ToyList/inductxs}\end{ttbox}
This time not even the base case is solved automatically:
\begin{ttbox}\makeatother
by(Auto_tac);
{\out 1. rev ys = rev ys @ []}
{\out 2. \dots}
\end{ttbox}
We need another lemma.

\subsubsection*{Second lemma: \texttt{xs \at~[] = xs}}

This time the canonical proof procedure
\begin{ttbox}\makeatother
\input{ToyList/lemma2}\input{ToyList/inductxs}\input{ToyList/autotac}\end{ttbox}
leads to the desired message \texttt{No subgoals!}:
\begin{ttbox}\makeatother
{\out Level 2}
{\out xs @ [] = xs}
{\out No subgoals!}
\end{ttbox}
Now we can give the lemma just proved a suitable name
\begin{ttbox}
\input{ToyList/qed2}\end{ttbox}
and tell Isabelle to use this lemma in all future proofs by simplification:
\begin{ttbox}
\input{ToyList/addsimps2}\end{ttbox}
Note that in the theorem \texttt{app_Nil2} the free variable \texttt{xs} has
been replaced by the unknown \texttt{?xs}, just as explained in
\S\ref{sec:variables}.

Going back to the proof of the first lemma
\begin{ttbox}\makeatother
\input{ToyList/lemma1}\input{ToyList/inductxs}\input{ToyList/autotac}\end{ttbox}
we find that this time \texttt{Auto_tac} solves the base case, but the
induction step merely simplifies to
\begin{ttbox}\makeatother
{\out 1. !!a list.}
{\out rev (list @ ys) = rev ys @ rev list}
{\out ==> (rev ys @ rev list) @ a # [] = rev ys @ rev list @ a # []}
\end{ttbox}
Now we need to remember that \texttt{\at} associates to the right, and that
\texttt{\#} and \texttt{\at} have the same priority (namely the \texttt{65}
in the definition of \texttt{ToyList}). Thus the conclusion really is
\begin{ttbox}\makeatother
{\out ==> (rev ys @ rev list) @ (a # []) = rev ys @ (rev list @ (a # []))}
\end{ttbox}
and the missing lemma is associativity of \texttt{\at}.

\subsubsection*{Third lemma: \texttt{(xs \at~ys) \at~zs = xs \at~(ys \at~zs)}}

This time the canonical proof procedure
\begin{ttbox}\makeatother
\input{ToyList/lemma3}\end{ttbox}
succeeds without further ado. Again we name the lemma and add it to
the set of lemmas used during simplification:
\begin{ttbox}
\input{ToyList/qed3}\end{ttbox}
Now we can go back and prove the first lemma
\begin{ttbox}\makeatother
\input{ToyList/lemma1}\input{ToyList/inductxs}\input{ToyList/autotac}\end{ttbox}
add it to the simplification lemmas
\begin{ttbox}
\input{ToyList/qed1}\end{ttbox}
and then solve our main theorem:
\begin{ttbox}\makeatother
\input{ToyList/thm}\input{ToyList/inductxs}\input{ToyList/autotac}\end{ttbox}

\subsubsection*{Review}

This is the end of our toy proof. It should have familiarized you with
\begin{itemize}
\item the standard theorem proving procedure:
state a goal; proceed with proof until a new lemma is required; prove that
lemma; come back to the original goal.
\item a specific procedure that works well for functional programs:
induction followed by all-out simplification via \texttt{Auto_tac}.
\item a basic repertoire of proof commands.
\end{itemize}


\section{Some helpful commands}
\label{sec:commands-and-hints}

This section discusses a few basic commands for manipulating the proof state
and can be skipped by casual readers.

There are two kinds of commands used during a proof: the actual proof
commands and auxiliary commands for examining the proof state and controlling
the display. Proof commands are always of the form
\texttt{by(\textit{tactic});}\indexbold{tactic} where \textit{tactic} is a
synonym for ``theorem proving function''. Typical examples are
\texttt{induct_tac} and \texttt{Auto_tac} --- the suffix \texttt{_tac} is
merely a mnemonic. Further tactics are introduced throughout the tutorial.

%Tactics can also be modified. For example,
%\begin{ttbox}
%by(ALLGOALS Asm_simp_tac);
%\end{ttbox}
%tells Isabelle to apply \texttt{Asm_simp_tac} to all subgoals. For more on
%tactics and how to combine them see~\S\ref{sec:Tactics}.

The most useful auxiliary commands are:
\begin{description}
\item[Printing the current state]
Type \texttt{pr();} to redisplay the current proof state, for example when it
has disappeared off the screen.
\item[Limiting the number of subgoals]
Typing \texttt{prlim $k$;} tells Isabelle to print only the first $k$
subgoals from now on and redisplays the current proof state. This is helpful
when there are many subgoals.
\item[Undoing] Typing \texttt{undo();} undoes the effect of the last
tactic.
\item[Context switch] Every proof happens in the context of a
  \bfindex{current theory}. By default, this is the last theory loaded. If
  you want to prove a theorem in the context of a different theory
  \texttt{T}, you need to type \texttt{context T.thy;}\index{*context|bold}
  first. Of course you need to change the context again if you want to go
  back to your original theory.
\item[Displaying types] We have already mentioned the flag
  \ttindex{show_types} above. It can also be useful for detecting typos in
  formulae early on. For example, if \texttt{show_types} is set and the goal
  \texttt{rev(rev xs) = xs} is started, Isabelle prints the additional output
\begin{ttbox}
{\out Variables:}
{\out xs :: 'a list}
\end{ttbox}
which tells us that Isabelle has correctly inferred that
\texttt{xs} is a variable of list type. On the other hand, had we
made a typo as in \texttt{rev(re xs) = xs}, the response
\begin{ttbox}
Variables:
  re :: 'a list => 'a list
  xs :: 'a list
\end{ttbox}
would have alerted us because of the unexpected variable \texttt{re}.
\item[(Re)loading theories]\index{loading theories}\index{reloading theories}
Initially you load theory \texttt{T} by typing \ttindex{use_thy}~\texttt{"T";},
which loads all parent theories of \texttt{T} automatically, if they are not
loaded already. If you modify \texttt{T.thy} or \texttt{T.ML}, you can
reload it by typing \texttt{use_thy~"T";} again. This time, however, only
\texttt{T} is reloaded. If some of \texttt{T}'s parents have changed as well,
type \ttindexbold{update}\texttt{();} to reload all theories that have
changed.
\end{description}
Further commands are found in the Reference Manual.


\section{Datatypes}
\label{sec:datatype}

Inductive datatypes are part of almost every non-trivial application of HOL.
First we take another look at a very important example, the datatype of
lists, before we turn to datatypes in general. The section closes with a
case study.


\subsection{Lists}

Lists are one of the essential datatypes in computing. Readers of this tutorial
and users of HOL need to be familiar with their basic operations. Theory
\texttt{ToyList} is only a small fragment of HOL's predefined theory
\texttt{List}\footnote{\texttt{http://www.in.tum.de/\~\relax
    isabelle/library/HOL/List.html}}.
The latter contains many further operations. For example, the functions
\ttindexbold{hd} (`head') and \ttindexbold{tl} (`tail') return the first
element and the remainder of a list. (However, pattern-matching is usually
preferable to \texttt{hd} and \texttt{tl}.)
Theory \texttt{List} also contains more syntactic sugar:
\texttt{[}$x@1$\texttt{,}\dots\texttt{,}$x@n$\texttt{]} abbreviates
$x@1$\texttt{\#}\dots\texttt{\#}$x@n$\texttt{\#[]}.
In the rest of the tutorial we always use HOL's predefined lists.


\subsection{The general format}
\label{sec:general-datatype}

The general HOL \texttt{datatype} definition is of the form
\[
\mathtt{datatype}~(\alpha@1, \dots, \alpha@n) \, t ~=~
C@1~\tau@{11}~\dots~\tau@{1k@1} ~\mid~ \dots ~\mid~
C@m~\tau@{m1}~\dots~\tau@{mk@m}
\]
where $\alpha@i$ are type variables (the parameters), $C@i$ are distinct
constructor names and $\tau@{ij}$ are types; it is customary to capitalize
the first letter in constructor names. There are a number of
restrictions (such as the type should not be empty) detailed
elsewhere~\cite{Isa-Logics-Man}. Isabelle notifies you if you violate them.

Laws about datatypes, such as \verb$[] ~= x#xs$ and \texttt{(x\#xs = y\#ys) =
  (x=y \& xs=ys)}, are used automatically during proofs by simplification.
The same is true for the equations in primitive recursive function
definitions.

\subsection{Primitive recursion}

Functions on datatypes are usually defined by recursion. In fact, most of the
time they are defined by what is called \bfindex{primitive recursion}.
The keyword \texttt{primrec} is followed by a list of equations
\[ f \, x@1 \, \dots \, (C \, y@1 \, \dots \, y@k)\, \dots \, x@n = r \]
such that $C$ is a constructor of the datatype $t$ and all recursive calls of
$f$ in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$. Thus
Isabelle immediately sees that $f$ terminates because one (fixed!) argument
becomes smaller with every recursive call. There must be exactly one equation
for each constructor. Their order is immaterial.
A more general method for defining total recursive functions is explained in
\S\ref{sec:recdef}.
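
As a concrete instance of this schema (\texttt{add} is a hypothetical
function, shown only to illustrate the format), addition on natural numbers
can be defined by primitive recursion on the first argument; every recursive
call peels off one \texttt{Suc}:
\begin{ttbox}
consts add :: nat => nat => nat
primrec
"add 0 n = n"
"add (Suc m) n = Suc (add m n)"
\end{ttbox}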

\begin{exercise}
Given the datatype of binary trees
\begin{ttbox}
\input{Misc/tree}\end{ttbox}
define a function \texttt{mirror} that mirrors the structure of a binary tree
by swapping subtrees (recursively). Prove \texttt{mirror(mirror(t)) = t}.
\end{exercise}

\subsection{\texttt{case}-expressions}
\label{sec:case-expressions}

HOL also features \ttindexbold{case}-expressions for analyzing elements of a
datatype. For example,
\begin{ttbox}
case xs of [] => 0 | y#ys => y
\end{ttbox}
evaluates to \texttt{0} if \texttt{xs} is \texttt{[]} and to \texttt{y} if
\texttt{xs} is \texttt{y\#ys}. (Since the result in both branches must be of
the same type, it follows that \texttt{y::nat} and hence
\texttt{xs::(nat)list}.)

In general, if $e$ is a term of the datatype $t$ defined in
\S\ref{sec:general-datatype} above, the corresponding
\texttt{case}-expression analyzing $e$ is
\[
\begin{array}{rrcl}
\mbox{\tt case}~e~\mbox{\tt of} & C@1~x@{11}~\dots~x@{1k@1} & \To & e@1 \\
                         \vdots \\
                           \mid & C@m~x@{m1}~\dots~x@{mk@m} & \To & e@m
\end{array}
\]

\begin{warn}
{\em All} constructors must be present, their order is fixed, and nested
patterns are not supported. Violating these restrictions results in strange
error messages.
\end{warn}
\noindent
Nested patterns can be simulated by nested \texttt{case}-expressions: instead
of
\begin{ttbox}
case xs of [] => 0 | [x] => x | x#(y#zs) => y
\end{ttbox}
write
\begin{ttbox}
case xs of [] => 0 | x#ys => (case ys of [] => x | y#zs => y)
\end{ttbox}
Note that \texttt{case}-expressions should be enclosed in parentheses to
indicate their scope.

\subsection{Structural induction}

Almost all the basic laws about a datatype are applied automatically during
simplification. Only induction is invoked by hand via \texttt{induct_tac},
which works for any datatype. In some cases, induction is overkill and a case
distinction over all constructors of the datatype suffices. This is performed
by \ttindexbold{exhaust_tac}. A trivial example:
\begin{ttbox}
\input{Misc/exhaust.ML}{\out1. xs = [] ==> (case xs of [] => [] | y # ys => xs) = xs}
{\out2. !!a list. xs = a # list ==> (case xs of [] => [] | y # ys => xs) = xs}
\input{Misc/autotac.ML}\end{ttbox}
Note that this particular case distinction could have been automated
completely. See~\S\ref{sec:SimpFeatures}.

\begin{warn}
  Induction is only allowed on a free variable that should not occur among
  the assumptions of the subgoal. Exhaustion works for arbitrary terms.
\end{warn}

\subsection{Case study: boolean expressions}
\label{sec:boolex}

The aim of this case study is twofold: it shows how to model boolean
expressions and some algorithms for manipulating them, and it demonstrates
the constructs introduced above.

\subsubsection{How can we model boolean expressions?}

We want to represent boolean expressions built up from variables and
constants by negation and conjunction. The following datatype serves exactly
that purpose:
\begin{ttbox}
\input{Ifexpr/boolex}\end{ttbox}
The two constants are represented by the terms \texttt{Const~True} and
\texttt{Const~False}. Variables are represented by terms of the form
\texttt{Var}~$n$, where $n$ is a natural number (type \texttt{nat}).
For example, the formula $P@0 \land \neg P@1$ is represented by the term
\texttt{And~(Var~0)~(Neg(Var~1))}.
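
Since the included declaration is not reproduced in this source, a datatype
matching the description above can be sketched as follows (details may
differ from the actual \texttt{Ifexpr/boolex} file):
\begin{ttbox}
datatype boolex = Const bool | Var nat | Neg boolex
                | And boolex boolex
\end{ttbox}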

\subsubsection{What is the value of boolean expressions?}

The value of a boolean expression depends on the value of its variables.
Hence the function \texttt{value} takes an additional parameter, an {\em
  environment} of type \texttt{nat~=>~bool}, which maps variables to their
values:
\begin{ttbox}
\input{Ifexpr/value}\end{ttbox}

\subsubsection{If-expressions}

An alternative, and often more efficient (because in a certain sense
canonical), representation uses so-called \textit{If-expressions\/} built up
from constants (\texttt{CIF}), variables (\texttt{VIF}) and conditionals
(\texttt{IF}):
\begin{ttbox}
\input{Ifexpr/ifex}\end{ttbox}
The evaluation of If-expressions proceeds as for \texttt{boolex}:
\begin{ttbox}
\input{Ifexpr/valif}\end{ttbox}

\subsubsection{Transformation into and of If-expressions}

The type \texttt{boolex} is close to the customary representation of logical
formulae, whereas \texttt{ifex} is designed for efficiency. Thus we need to
translate from \texttt{boolex} into \texttt{ifex}:
\begin{ttbox}
\input{Ifexpr/bool2if}\end{ttbox}
At last, we have something we can verify: that \texttt{bool2if} preserves the
value of its argument.
\begin{ttbox}
\input{Ifexpr/bool2if.ML}\end{ttbox}
The proof is canonical:
\begin{ttbox}
\input{Ifexpr/proof.ML}\end{ttbox}
In fact, all proofs in this case study look exactly like this. Hence we do
not show them below.

More interesting is the transformation of If-expressions into a normal form
where the first argument of \texttt{IF} cannot be another \texttt{IF} but
must be a constant or variable. Such a normal form can be computed by
repeatedly replacing a subterm of the form \texttt{IF~(IF~b~x~y)~z~u} by
\texttt{IF b (IF x z u) (IF y z u)}, which has the same value. The following
primitive recursive functions perform this task:
\begin{ttbox}
\input{Ifexpr/normif}
\input{Ifexpr/norm}\end{ttbox}
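The two included definitions are not shown in this source. Based on the
replacement rule just described, they can be sketched as follows (a sketch
only; the actual files may differ in detail):
\begin{ttbox}
consts normif :: ifex => ifex => ifex => ifex
primrec
"normif (CIF b) t e = IF (CIF b) t e"
"normif (VIF x) t e = IF (VIF x) t e"
"normif (IF b t e) u f = normif b (normif t u f) (normif e u f)"

consts norm :: ifex => ifex
primrec
"norm (CIF b) = CIF b"
"norm (VIF x) = VIF x"
"norm (IF b t e) = normif b (norm t) (norm e)"
\end{ttbox}
Here \texttt{normif b t e} normalizes \texttt{IF b t e} under the assumption
that \texttt{t} and \texttt{e} are already normal; the third equation of
\texttt{normif} is exactly the replacement rule applied recursively.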
Their interplay is a bit tricky, and we leave it to the reader to develop an
intuitive understanding. Fortunately, Isabelle can help us to verify that the
transformation preserves the value of the expression:
\begin{ttbox}
\input{Ifexpr/norm.ML}\end{ttbox}
The proof is canonical, provided we first show the following lemma (which
also helps to understand what \texttt{normif} does) and make it available
for simplification via \texttt{Addsimps}:
\begin{ttbox}
\input{Ifexpr/normif.ML}\end{ttbox}

But how can we be sure that \texttt{norm} really produces a normal form in
the above sense? We have to prove
\begin{ttbox}
\input{Ifexpr/normal_norm.ML}\end{ttbox}
where \texttt{normal} expresses that an If-expression is in normal form:
\begin{ttbox}
\input{Ifexpr/normal}\end{ttbox}
Of course, this requires a lemma about normality of \texttt{normif}
\begin{ttbox}
\input{Ifexpr/normal_normif.ML}\end{ttbox}
that has to be made available for simplification via \texttt{Addsimps}.

How does one come up with the required lemmas? Try to prove the main theorems
without them and study carefully what \texttt{Auto_tac} leaves unproved. That
should provide the clue.
The necessity of universal quantification (\texttt{!t e}) in the two lemmas
is explained in \S\ref{sec:InductionHeuristics}.

\begin{exercise}
  We strengthen the definition of a {\em normal\/} If-expression as follows:
  the first argument of all \texttt{IF}s must be a variable. Adapt the above
  development to this changed requirement. (Hint: you may need to formulate
  some of the goals as implications (\texttt{-->}) rather than equalities
  (\texttt{=}).)
\end{exercise}

\section{Some basic types}

\subsection{Natural numbers}

The type \ttindexbold{nat} of natural numbers is predefined and behaves like
\begin{ttbox}
datatype nat = 0 | Suc nat
\end{ttbox}
In particular, there are \texttt{case}-expressions, for example
\begin{ttbox}
case n of 0 => 0 | Suc m => m
\end{ttbox}
primitive recursion, for example
\begin{ttbox}
\input{Misc/natsum}\end{ttbox}
and induction, for example
\begin{ttbox}
\input{Misc/NatSum.ML}\ttbreak
{\out sum n + sum n = n * Suc n}
{\out No subgoals!}
\end{ttbox}
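
The primitive recursive function included above is not reproduced in this
source. Judging from the final goal, \texttt{sum n} computes
$0+1+\cdots+n$, so a definition of the following shape would fit (a sketch;
the actual \texttt{Misc/natsum} may differ):
\begin{ttbox}
consts sum :: nat => nat
primrec
"sum 0 = 0"
"sum (Suc n) = Suc n + sum n"
\end{ttbox}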
1.604 +
1.605 +The usual arithmetic operations \ttindexbold{+}, \ttindexbold{-},
1.606 +\ttindexbold{*}, \ttindexbold{div} and \ttindexbold{mod} are predefined, as
1.607 +are the relations \ttindexbold{<=} and \ttindexbold{<}. There is even a least
1.608 +number operation \ttindexbold{LEAST}. For example, \texttt{(LEAST n.$\,$1 <
1.609 + n) = 2} (HOL does not prove this completely automatically).
1.610 +
1.611 +\begin{warn}
1.612 + The operations \ttindexbold{+}, \ttindexbold{-}, \ttindexbold{*} are
1.613 + overloaded, i.e.\ they are available not just for natural numbers but at
1.614 + other types as well (see \S\ref{sec:TypeClasses}). For example, given
1.615 + the goal \texttt{x+y = y+x}, there is nothing to indicate that you are
1.616 + talking about natural numbers. Hence Isabelle can only infer that
1.617 + \texttt{x} and \texttt{y} are of some arbitrary type where \texttt{+} is
1.618 + declared. As a consequence, you will be unable to prove the goal (although
1.619 + it may take you some time to realize what has happened if
1.620 + \texttt{show_types} is not set). In this particular example, you need to
1.621 + include an explicit type constraint, for example \texttt{x+y = y+(x::nat)}.
1.622 + If there is enough contextual information this may not be necessary:
1.623 + \texttt{x+0 = x} automatically implies \texttt{x::nat}.
1.624 +\end{warn}
1.625 +
1.626 +
1.627 +\subsection{Products}
1.628 +
1.629 +HOL also has pairs: \texttt{($a@1$,$a@2$)} is of type \texttt{$\tau@1$ *
1.630 +$\tau@2$} provided each $a@i$ is of type $\tau@i$. The components of a pair
1.631 +are extracted by \texttt{fst} and \texttt{snd}:
1.632 +\texttt{fst($x$,$y$) = $x$} and \texttt{snd($x$,$y$) = $y$}. Tuples
1.633 +are simulated by pairs nested to the right:
1.634 +\texttt{($a@1$,$a@2$,$a@3$)} and \texttt{$\tau@1$ * $\tau@2$ * $\tau@3$}
1.635 +stand for \texttt{($a@1$,($a@2$,$a@3$))} and \texttt{$\tau@1$ * ($\tau@2$ *
1.636 +$\tau@3$)}. Therefore \texttt{fst(snd($a@1$,$a@2$,$a@3$)) = $a@2$}.
1.637 +
1.638 +It is possible to use (nested) tuples as patterns in abstractions, for
1.639 +example \texttt{\%(x,y,z).x+y+z} and \texttt{\%((x,y),z).x+y+z}.
1.640 +
1.641 +In addition to explicit $\lambda$-abstractions, tuple patterns can be used in
1.642 +most variable binding constructs. Typical examples are
1.643 +\begin{ttbox}
1.644 +let (x,y) = f z in (y,x)
1.645 +
1.646 +case xs of [] => 0 | (x,y)\#zs => x+y
1.647 +\end{ttbox}
1.648 +Further important examples are quantifiers and sets.
1.649 +
1.650 +\begin{warn}
1.651 +Abstraction over pairs and tuples is merely a convenient shorthand for a more
1.652 +complex internal representation. Thus the internal and external form of a
1.653 +term may differ, which can affect proofs. If you want to avoid this
1.654 +complication, use \texttt{fst} and \texttt{snd}, i.e.\ write
1.655 +\texttt{\%p.~fst p + snd p} instead of \texttt{\%(x,y).~x + y}.
1.656 +Theorem proving with tuple patterns is discussed later in this tutorial.
1.657 +\end{warn}
1.658 +
1.659 +
1.660 +\section{Definitions}
1.661 +\label{sec:Definitions}
1.662 +
1.663 +A definition is simply an abbreviation, i.e.\ a new name for an existing
1.664 +construction. In particular, definitions cannot be recursive. Isabelle offers
1.665 +definitions on the level of types and terms. Those on the type level are
1.666 +called type synonyms, those on the term level are called (constant)
1.667 +definitions.
1.668 +
1.669 +
1.670 +\subsection{Type synonyms}
1.671 +\indexbold{type synonyms}
1.672 +
1.673 +Type synonyms are similar to those found in ML. Their syntax is fairly
1.674 +self-explanatory:
1.675 +\begin{ttbox}
1.676 +\input{Misc/types}\end{ttbox}\indexbold{*types}
1.677 +The synonym \texttt{alist} shows that in general the type on the right-hand
1.678 +side needs to be enclosed in double quotation marks
1.679 +(see the end of~\S\ref{sec:intro-theory}).
1.680 +
1.681 +Internally all synonyms are fully expanded. As a consequence Isabelle's
1.682 +output never contains synonyms. Their main purpose is to improve the
1.683 +readability of theory definitions. Synonyms can be used just like any other
1.684 +type:
1.685 +\begin{ttbox}
1.686 +\input{Misc/consts}\end{ttbox}
1.687 +
1.688 +\subsection{Constant definitions}
1.689 +\label{sec:ConstDefinitions}
1.690 +
1.691 +The above constants \texttt{nand} and \texttt{exor} are non-recursive and can
1.692 +therefore be defined directly by
1.693 +\begin{ttbox}
1.694 +\input{Misc/defs}\end{ttbox}\indexbold{*defs}
1.695 +where \texttt{defs} is a keyword and \texttt{nand_def} and \texttt{exor_def}
1.696 +are arbitrary user-supplied names.
1.697 +The symbol \texttt{==}\index{==@{\tt==}|bold} is a special form of equality
1.698 +that should only be used in constant definitions.
1.699 +Declarations and definitions can also be merged
1.700 +\begin{ttbox}
1.701 +\input{Misc/constdefs}\end{ttbox}\indexbold{*constdefs}
1.702 +in which case the default name of each definition is $f$\texttt{_def}, where
1.703 +$f$ is the name of the defined constant.
1.704 +
1.705 +Note that pattern-matching is not allowed, i.e.\ each definition must be of
1.706 +the form $f\,x@1\,\dots\,x@n$\texttt{~==~}$t$.
1.707 +
1.708 +\S\ref{sec:Simplification} explains how definitions are used
1.709 +in proofs.
1.710 +
1.711 +\begin{warn}
1.712 +A common mistake when writing definitions is to introduce extra free variables
1.713 +on the right-hand side as in the following fictitious definition:
1.714 +\begin{ttbox}
1.715 +defs prime_def "prime(p) == (m divides p) --> (m=1 | m=p)"
1.716 +\end{ttbox}
1.717 +Isabelle rejects this `definition' because of the extra {\tt m} on the
1.718 +right-hand side, which would introduce an inconsistency. What you should have
1.719 +written is
1.720 +\begin{ttbox}
1.721 +defs prime_def "prime(p) == !m. (m divides p) --> (m=1 | m=p)"
1.722 +\end{ttbox}
1.723 +\end{warn}
1.724 +
1.725 +
1.726 +
1.727 +
1.728 +\chapter{More Functional Programming}
1.729 +
1.730 +The purpose of this chapter is to deepen the reader's understanding of the
1.731 +concepts encountered so far and to introduce an advanced method for defining
1.732 +recursive functions. The first two sections give a structured presentation of
1.733 +theorem proving by simplification (\S\ref{sec:Simplification}) and
1.734 +discuss important heuristics for induction (\S\ref{sec:InductionHeuristics}). They
1.735 +can be skipped by readers less interested in proofs. They are followed by a
1.736 +case study, a compiler for expressions (\S\ref{sec:ExprCompiler}).
1.737 +Finally we present a very general method for defining recursive functions
1.738 +that goes well beyond what \texttt{primrec} allows (\S\ref{sec:recdef}).
1.739 +
1.740 +
1.741 +\section{Simplification}
1.742 +\label{sec:Simplification}
1.743 +
1.744 +So far we have proved our theorems by \texttt{Auto_tac}, which
1.745 +`simplifies' all subgoals. In fact, \texttt{Auto_tac} can do much more than
1.746 +that, although so far it has not needed to. However, when you go beyond toy
1.747 +examples, you need to understand the ingredients of \texttt{Auto_tac}.
1.748 +This section covers the tactic that \texttt{Auto_tac} always applies first,
1.749 +namely simplification.
1.750 +
1.751 +Simplification is one of the central theorem proving tools in Isabelle and
1.752 +many other systems. The tool itself is called the \bfindex{simplifier}. The
1.753 +purpose of this section is twofold: to introduce the many features of the
1.754 +simplifier (\S\ref{sec:SimpFeatures}) and to explain a little bit how the
1.755 +simplifier works (\S\ref{sec:SimpHow}). Anybody intending to use HOL should
1.756 +read \S\ref{sec:SimpFeatures}, and the serious student should read
1.757 +\S\ref{sec:SimpHow} as well in order to understand what happened in case
1.758 +things do not simplify as expected.
1.759 +
1.760 +
1.761 +\subsection{Using the simplifier}
1.762 +\label{sec:SimpFeatures}
1.763 +
1.764 +In its most basic form, simplification means repeated application of
1.765 +equations from left to right. For example, taking the rules for \texttt{\at}
1.766 +and applying them to the term \texttt{[0,1] \at\ []} results in a sequence of
1.767 +simplification steps:
1.768 +\begin{ttbox}\makeatother
1.769 +(0#1#[]) @ [] \(\leadsto\) 0#((1#[]) @ []) \(\leadsto\) 0#(1#([] @ [])) \(\leadsto\) 0#1#[]
1.770 +\end{ttbox}
1.771 +This is also known as {\em term rewriting}, and the equations are referred
1.772 +to as {\em rewrite rules}. `Rewriting' is the more honest term because
1.773 +the terms do not necessarily become simpler in the process.
1.774 +
1.775 +\subsubsection{Simpsets}
1.776 +
1.777 +To facilitate simplification, each theory has an associated set of
1.778 +simplification rules, known as a \bfindex{simpset}. Within a theory,
1.779 +proofs by simplification refer to the associated simpset by default.
1.780 +The simpset of a theory is built up as follows: starting with the union of
1.781 +the simpsets of the parent theories, each occurrence of a \texttt{datatype}
1.782 +or \texttt{primrec} construct augments the simpset. Explicit definitions are
1.783 +not added automatically. Users can add new theorems via \texttt{Addsimps} and
1.784 +delete them again later by \texttt{Delsimps}.
1.785 +
1.786 +You may augment a simpset not just by equations but by pretty much any
1.787 +theorem. The simplifier will try to make sense of it. For example, a theorem
1.788 +\verb$~$$P$ is automatically turned into \texttt{$P$ = False}. The details are
1.789 +explained in \S\ref{sec:SimpHow}.
1.790 +
1.791 +As a rule of thumb, rewrite rules that really simplify a term (like
1.792 +\texttt{xs \at\ [] = xs} and \texttt{rev(rev xs) = xs}) should be added to the
1.793 +current simpset right after they have been proved. Those of a more specific
1.794 +nature (e.g.\ distributivity laws, which alter the structure of terms
1.795 +considerably) should only be added for specific proofs and deleted again
1.796 +afterwards. Conversely, it may also happen that a generally useful rule
1.797 +needs to be removed for a certain proof and is added again afterwards. The
1.798 +need for frequent temporary additions or deletions may indicate a badly
1.799 +designed simpset.
1.800 +\begin{warn}
1.801 + Simplification may not terminate, for example if both $f(x) = g(x)$ and
1.802 + $g(x) = f(x)$ are in the simpset. It is the user's responsibility not to
1.803 + include rules that can lead to nontermination, either on their own or in
1.804 + combination with other rules.
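+ With these two rules in the simpset, rewriting the term \texttt{f(a)}
+ oscillates forever:
+\begin{ttbox}
+f(a) \(\leadsto\) g(a) \(\leadsto\) f(a) \(\leadsto\) g(a) \(\leadsto\) \(\dots\)
+\end{ttbox}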
1.805 +\end{warn}
1.806 +
1.807 +\subsubsection{Simplification tactics}
1.808 +
1.809 +There are four main simplification tactics:
1.810 +\begin{ttdescription}
1.811 +\item[\ttindexbold{Simp_tac} $i$] simplifies the conclusion of subgoal~$i$
1.812 + using the theory's simpset. It may solve the subgoal completely if it has
1.813 + become trivial. For example:
1.814 +\begin{ttbox}\makeatother
1.815 +{\out 1. [] @ [] = []}
1.816 +by(Simp_tac 1);
1.817 +{\out No subgoals!}
1.818 +\end{ttbox}
1.819 +
1.820 +\item[\ttindexbold{Asm_simp_tac}]
1.821 + is like \verb$Simp_tac$, but extracts additional rewrite rules from
1.822 + the assumptions of the subgoal. For example, it solves
1.823 +\begin{ttbox}\makeatother
1.824 +{\out 1. xs = [] ==> xs @ xs = xs}
1.825 +\end{ttbox}
1.826 +which \texttt{Simp_tac} does not do.
1.827 +
1.828 +\item[\ttindexbold{Full_simp_tac}] is like \verb$Simp_tac$, but also
1.829 + simplifies the assumptions (without using the assumptions to
1.830 + simplify each other or the actual goal).
1.831 +
1.832 +\item[\ttindexbold{Asm_full_simp_tac}] is like \verb$Asm_simp_tac$,
1.833 + but also simplifies the assumptions. In particular, assumptions can
1.834 + simplify each other. For example:
1.835 +\begin{ttbox}\makeatother
1.836 +{\out 1. [| xs @ zs = ys @ xs; [] @ xs = [] @ [] |] ==> ys = zs}
1.837 +by(Asm_full_simp_tac 1);
1.838 +{\out No subgoals!}
1.839 +\end{ttbox}
1.840 +The second assumption simplifies to \texttt{xs = []}, which in turn
1.841 +simplifies the first assumption to \texttt{zs = ys}, thus reducing the
1.842 +conclusion to \texttt{ys = ys} and hence to \texttt{True}.
1.843 +(See also the paragraph on tracing below.)
1.844 +\end{ttdescription}
1.845 +\texttt{Asm_full_simp_tac} is the most powerful of this quartet of
1.846 +tactics. In fact, \texttt{Auto_tac} starts by applying
1.847 +\texttt{Asm_full_simp_tac} to all subgoals. The only reason for the existence
1.848 +of the other three tactics is that sometimes one wants to limit the amount of
1.849 +simplification, for example to avoid nontermination:
1.850 +\begin{ttbox}\makeatother
1.851 +{\out 1. ! x. f x = g (f (g x)) ==> f [] = f [] @ []}
1.852 +\end{ttbox}
1.853 +is solved by \texttt{Simp_tac}, but \texttt{Asm_simp_tac} and
1.854 +\texttt{Asm_full_simp_tac} loop because the rewrite rule \texttt{f x = g(f(g
1.855 +x))} extracted from the assumption does not terminate. Isabelle notices
1.856 +certain simple forms of nontermination, but not this one.
1.857 +
1.858 +\subsubsection{Modifying simpsets locally}
1.859 +
1.860 +If a certain theorem is merely needed in one proof by simplification, the
1.861 +pattern
1.862 +\begin{ttbox}
1.863 +Addsimps [\(rare_theorem\)];
1.864 +by(Simp_tac 1);
1.865 +Delsimps [\(rare_theorem\)];
1.866 +\end{ttbox}
1.867 +is awkward. Therefore there are lower-case versions of the simplification
1.868 +tactics (\ttindexbold{simp_tac}, \ttindexbold{asm_simp_tac},
1.869 +\ttindexbold{full_simp_tac}, \ttindexbold{asm_full_simp_tac}) and of the
1.870 +simpset modifiers (\ttindexbold{addsimps}, \ttindexbold{delsimps})
1.871 +that do not access or modify the implicit simpset but explicitly take a
1.872 +simpset as an argument. For example, the above three lines become
1.873 +\begin{ttbox}
1.874 +by(simp_tac (simpset() addsimps [\(rare_theorem\)]) 1);
1.875 +\end{ttbox}
1.876 +where the result of the function call \texttt{simpset()} is the simpset of
1.877 +the current theory and \texttt{addsimps} is an infix function. The implicit
1.878 +simpset is read once but not modified.
1.879 +This is far preferable to pairs of \texttt{Addsimps} and \texttt{Delsimps}.
1.880 +Local modifications can be stacked as in
1.881 +\begin{ttbox}
1.882 +by(simp_tac (simpset() addsimps [\(rare_theorem\)] delsimps [\(some_thm\)]) 1);
1.883 +\end{ttbox}
1.884 +
1.885 +\subsubsection{Rewriting with definitions}
1.886 +
1.887 +Constant definitions (\S\ref{sec:ConstDefinitions}) are not automatically
1.888 +included in the simpset of a theory. Hence such definitions are not expanded
1.889 +automatically either, just as it should be: definitions are introduced for
1.890 +the purpose of abbreviating complex concepts. Of course we need to expand the
1.891 +definitions initially to derive enough lemmas that characterize the concept
1.892 +sufficiently for us to forget the original definition completely. For
1.893 +example, given the theory
1.894 +\begin{ttbox}
1.895 +\input{Misc/Exor.thy}\end{ttbox}
1.896 +we may want to prove \verb$exor A (~A)$. Instead of \texttt{Goal} we use
1.897 +\begin{ttbox}
1.898 +\input{Misc/exorgoal.ML}\end{ttbox}
1.899 +which tells Isabelle to expand the definition of \texttt{exor}---the first
1.900 +argument of \texttt{Goalw} can be a list of definitions---in the initial goal:
1.901 +\begin{ttbox}
1.902 +{\out exor A (~ A)}
1.903 +{\out 1. A & ~ ~ A | ~ A & ~ A}
1.904 +\end{ttbox}
1.905 +In this simple example, the goal is proved by \texttt{Simp_tac}.
1.906 +Of course the resulting theorem is insufficient to characterize \texttt{exor}
1.907 +completely.
1.908 +
1.909 +In case we want to expand a definition in the middle of a proof, we can
1.910 +simply add the definition locally to the simpset:
1.911 +\begin{ttbox}
1.912 +\input{Misc/exorproof.ML}\end{ttbox}
1.913 +You should normally not add the definition permanently to the simpset
1.914 +using \texttt{Addsimps} because this defeats the whole purpose of an
1.915 +abbreviation.
1.916 +
1.917 +\begin{warn}
1.918 +If you have defined $f\,x\,y$\texttt{~==~}$t$ then you can only expand
1.919 +occurrences of $f$ with at least two arguments. Thus it is safer to define
1.920 +$f$\texttt{~==~\%$x\,y$.}$\;t$.
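+For example, if \texttt{nand} had been defined by
+\verb$nand x y == ~(x & y)$ (a hypothetical right-hand side --- the actual
+one is in the elided \texttt{defs} above), the definition could only expand
+applications of \texttt{nand} to two arguments, whereas
+\begin{ttbox}
+nand == \%x y. ~(x & y)
+\end{ttbox}
+can be expanded wherever \texttt{nand} occurs, even unapplied.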
1.921 +\end{warn}
1.922 +
1.923 +\subsubsection{Simplifying \texttt{let}-expressions}
1.924 +
1.925 +Proving a goal containing \ttindex{let}-expressions invariably requires the
1.926 +\texttt{let}-constructs to be expanded at some point. Since
1.927 +\texttt{let}-\texttt{in} is just syntactic sugar for a defined constant
1.928 +(called \texttt{Let}), expanding \texttt{let}-constructs means rewriting with
1.929 +\texttt{Let_def}:
1.930 +%context List.thy;
1.931 +%Goal "(let xs = [] in xs@xs) = ys";
1.932 +\begin{ttbox}\makeatother
1.933 +{\out 1. (let xs = [] in xs @ xs) = ys}
1.934 +by(simp_tac (simpset() addsimps [Let_def]) 1);
1.935 +{\out 1. [] = ys}
1.936 +\end{ttbox}
1.937 +If, in a particular context, there is no danger of a combinatorial explosion
1.938 +of nested \texttt{let}s one could even add \texttt{Let_def} permanently via
1.939 +\texttt{Addsimps}.
1.940 +
1.941 +\subsubsection{Conditional equations}
1.942 +
1.943 +So far all examples of rewrite rules were equations. The simplifier also
1.944 +accepts {\em conditional\/} equations, for example
1.945 +\begin{ttbox}
1.946 +xs ~= [] ==> hd xs # tl xs = xs \hfill \((*)\)
1.947 +\end{ttbox}
1.948 +(which is proved by \texttt{exhaust_tac} on \texttt{xs} followed by
1.949 +\texttt{Asm_full_simp_tac} twice). Assuming that this theorem together with
1.950 +%\begin{ttbox}\makeatother
1.951 +\texttt{(rev xs = []) = (xs = [])}
1.952 +%\end{ttbox}
1.953 +are part of the simpset, the subgoal
1.954 +\begin{ttbox}\makeatother
1.955 +{\out 1. xs ~= [] ==> hd(rev xs) # tl(rev xs) = rev xs}
1.956 +\end{ttbox}
1.957 +is proved by simplification:
1.958 +the conditional equation $(*)$ above
1.959 +can simplify \texttt{hd(rev~xs)~\#~tl(rev~xs)} to \texttt{rev xs}
1.960 +because the corresponding precondition \verb$rev xs ~= []$ simplifies to
1.961 +\verb$xs ~= []$, which is exactly the local assumption of the subgoal.
1.962 +
1.963 +
1.964 +\subsubsection{Automatic case splits}
1.965 +
1.966 +Goals containing \ttindex{if}-expressions are usually proved by case
1.967 +distinction on the condition of the \texttt{if}. For example the goal
1.968 +\begin{ttbox}
1.969 +{\out 1. ! xs. if xs = [] then rev xs = [] else rev xs ~= []}
1.970 +\end{ttbox}
1.971 +can be split into
1.972 +\begin{ttbox}
1.973 +{\out 1. ! xs. (xs = [] --> rev xs = []) \& (xs ~= [] --> rev xs ~= [])}
1.974 +\end{ttbox}
1.975 +by typing
1.976 +\begin{ttbox}
1.977 +\input{Misc/splitif.ML}\end{ttbox}
1.978 +Because this is almost always the right proof strategy, the simplifier
1.979 +performs case-splitting on \texttt{if}s automatically. Try \texttt{Simp_tac}
1.980 +on the initial goal above.
1.981 +
1.982 +This splitting idea generalizes from \texttt{if} to \ttindex{case}:
1.983 +\begin{ttbox}\makeatother
1.984 +{\out 1. (case xs of [] => zs | y#ys => y#(ys@zs)) = xs@zs}
1.985 +\end{ttbox}
1.986 +becomes
1.987 +\begin{ttbox}\makeatother
1.988 +{\out 1. (xs = [] --> zs = xs @ zs) &}
1.989 +{\out (! a list. xs = a # list --> a # list @ zs = xs @ zs)}
1.990 +\end{ttbox}
1.991 +by typing
1.992 +\begin{ttbox}
1.993 +\input{Misc/splitlist.ML}\end{ttbox}
1.994 +In contrast to \texttt{if}-expressions, the simplifier does not split
1.995 +\texttt{case}-expressions by default because this can lead to nontermination
1.996 +in case of recursive datatypes.
1.997 +Nevertheless the simplifier can be instructed to perform \texttt{case}-splits
1.998 +by adding the appropriate rule to the simpset:
1.999 +\begin{ttbox}
1.1000 +by(simp_tac (simpset() addsplits [split_list_case]) 1);
1.1001 +\end{ttbox}\indexbold{*addsplits}
1.1002 +solves the initial goal outright, which \texttt{Simp_tac} alone will not do.
1.1003 +
1.1004 +In general, every datatype $t$ comes with a rule
1.1005 +\texttt{$t$.split} that can be added to the simpset either
1.1006 +locally via \texttt{addsplits} (see above), or permanently via
1.1007 +\begin{ttbox}
1.1008 +Addsplits [\(t\).split];
1.1009 +\end{ttbox}\indexbold{*Addsplits}
1.1010 +Split-rules can be removed globally via \ttindexbold{Delsplits} and locally
1.1011 +via \ttindexbold{delsplits} as, for example, in
1.1012 +\begin{ttbox}
1.1013 +by(simp_tac (simpset() addsimps [\(\dots\)] delsplits [split_if]) 1);
1.1014 +\end{ttbox}
1.1015 +
1.1016 +
1.1017 +\subsubsection{Permutative rewrite rules}
1.1018 +
1.1019 +A rewrite rule is {\bf permutative} if the left-hand side and right-hand side
1.1020 +are the same up to renaming of variables. The most common permutative rule
1.1021 +is commutativity: $x+y = y+x$. Another example is $(x-y)-z = (x-z)-y$. Such
1.1022 +rules are problematic because once they apply, they can be used forever.
1.1023 +The simplifier is aware of this danger and treats permutative rules
1.1024 +separately. For details see~\cite{Isa-Ref-Man}.
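+
+Roughly speaking, the simplifier applies a permutative rule only if doing so
+makes the term smaller with respect to a fixed term ordering. Assuming that
+ordering puts \texttt{a} before \texttt{b}, commutativity normalizes sums
+without looping:
+\begin{ttbox}
+b + a \(\leadsto\) a + b \hfill (smaller in the term ordering)
+a + b \hfill (left unchanged)
+\end{ttbox}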
1.1025 +
1.1026 +\subsubsection{Tracing}
1.1027 +\indexbold{tracing the simplifier}
1.1028 +
1.1029 +Using the simplifier effectively may take a bit of experimentation. Set the
1.1030 +\verb$trace_simp$ flag to get a better idea of what is going on:
1.1031 +\begin{ttbox}\makeatother
1.1032 +{\out 1. rev [x] = []}
1.1033 +\ttbreak
1.1034 +set trace_simp;
1.1035 +by(Simp_tac 1);
1.1036 +\ttbreak\makeatother
1.1037 +{\out Applying instance of rewrite rule:}
1.1038 +{\out rev (?x # ?xs) == rev ?xs @ [?x]}
1.1039 +{\out Rewriting:}
1.1040 +{\out rev [x] == rev [] @ [x]}
1.1041 +\ttbreak
1.1042 +{\out Applying instance of rewrite rule:}
1.1043 +{\out rev [] == []}
1.1044 +{\out Rewriting:}
1.1045 +{\out rev [] == []}
1.1046 +\ttbreak\makeatother
1.1047 +{\out Applying instance of rewrite rule:}
1.1048 +{\out [] @ ?y == ?y}
1.1049 +{\out Rewriting:}
1.1050 +{\out [] @ [x] == [x]}
1.1051 +\ttbreak
1.1052 +{\out Applying instance of rewrite rule:}
1.1053 +{\out ?x # ?t = ?t == False}
1.1054 +{\out Rewriting:}
1.1055 +{\out [x] = [] == False}
1.1056 +\ttbreak
1.1057 +{\out Level 1}
1.1058 +{\out rev [x] = []}
1.1059 +{\out 1. False}
1.1060 +\end{ttbox}
1.1061 +In more complicated cases, the trace can be enormous, especially since
1.1062 +invocations of the simplifier are often nested (e.g.\ when solving conditions
1.1063 +of rewrite rules).
1.1064 +
1.1065 +\subsection{How it works}
1.1066 +\label{sec:SimpHow}
1.1067 +
1.1068 +\subsubsection{Higher-order patterns}
1.1069 +
1.1070 +\subsubsection{Local assumptions}
1.1071 +
1.1072 +\subsubsection{The preprocessor}
1.1073 +
1.1074 +\section{Induction heuristics}
1.1075 +\label{sec:InductionHeuristics}
1.1076 +
1.1077 +The purpose of this section is to illustrate some simple heuristics for
1.1078 +inductive proofs. The first one we have already mentioned in our initial
1.1079 +example:
1.1080 +\begin{quote}
1.1081 +{\em 1. Theorems about recursive functions are proved by induction.}
1.1082 +\end{quote}
1.1083 +In case the function has more than one argument
1.1084 +\begin{quote}
1.1085 +{\em 2. Do induction on argument number $i$ if the function is defined by
1.1086 +recursion in argument number $i$.}
1.1087 +\end{quote}
1.1088 +When we look at the proof of
1.1089 +\begin{ttbox}\makeatother
1.1090 +(xs @ ys) @ zs = xs @ (ys @ zs)
1.1091 +\end{ttbox}
1.1092 +in \S\ref{sec:intro-proof} we find (a) \texttt{\at} is recursive in the first
1.1093 +argument, (b) \texttt{xs} occurs only as the first argument of \texttt{\at},
1.1094 +and (c) both \texttt{ys} and \texttt{zs} occur at least once as the second
1.1095 +argument of \texttt{\at}. Hence it is natural to perform induction on
1.1096 +\texttt{xs}.
1.1097 +
1.1098 +The key heuristic, and the main point of this section, is to
1.1099 +generalize the goal before induction. The reason is simple: if the goal is
1.1100 +too specific, the induction hypothesis is too weak to allow the induction
1.1101 +step to go through. Let us now illustrate the idea with an example.
1.1102 +
1.1103 +We define a tail-recursive version of list-reversal,
1.1104 +i.e.\ one that can be compiled into a loop:
1.1105 +\begin{ttbox}
1.1106 +\input{Misc/Itrev.thy}\end{ttbox}
1.1107 +The behaviour of \texttt{itrev} is simple: it reverses its first argument by
1.1108 +stacking its elements onto the second argument; it returns the second
1.1109 +argument once the first one becomes empty.
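+For example, reversing a three-element list moves one element per step onto
+the accumulating second argument:
+\begin{ttbox}
+itrev [1,2,3] [] \(\leadsto\) itrev [2,3] [1] \(\leadsto\) itrev [3] [2,1]
+                 \(\leadsto\) itrev [] [3,2,1] \(\leadsto\) [3,2,1]
+\end{ttbox}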
1.1110 +We need to show that \texttt{itrev} does indeed reverse its first argument
1.1111 +provided the second one is empty:
1.1112 +\begin{ttbox}
1.1113 +\input{Misc/itrev1.ML}\end{ttbox}
1.1114 +There is no choice as to the induction variable, and we immediately simplify:
1.1115 +\begin{ttbox}
1.1116 +\input{Misc/induct_auto.ML}\ttbreak\makeatother
1.1117 +{\out 1. !!a list. itrev list [] = rev list\(\;\)==> itrev list [a] = rev list @ [a]}
1.1118 +\end{ttbox}
1.1119 +Just as predicted above, the overall goal, and hence the induction
1.1120 +hypothesis, is too weak to solve the induction step because of the fixed
1.1121 +\texttt{[]}. The corresponding heuristic:
1.1122 +\begin{quote}
1.1123 +{\em 3. Generalize goals for induction by replacing constants by variables.}
1.1124 +\end{quote}
1.1125 +Of course one cannot do this na\"{\i}vely: \texttt{itrev xs ys = rev xs} is
1.1126 +just not true --- the correct generalization is
1.1127 +\begin{ttbox}\makeatother
1.1128 +\input{Misc/itrev2.ML}\end{ttbox}
1.1129 +If \texttt{ys} is replaced by \texttt{[]}, the right-hand side simplifies to
1.1130 +\texttt{rev xs}, just as required.
1.1131 +
1.1132 +In this particular instance it is easy to guess the right generalization,
1.1133 +but in more complex situations a good deal of creativity is needed. This is
1.1134 +the main source of complications in inductive proofs.
1.1135 +
1.1136 +Although we now have two variables, only \texttt{xs} is suitable for
1.1137 +induction, and we repeat our above proof attempt. Unfortunately, we are still
1.1138 +not there:
1.1139 +\begin{ttbox}\makeatother
1.1140 +{\out 1. !!a list.}
1.1141 +{\out itrev list ys = rev list @ ys}
1.1142 +{\out ==> itrev list (a # ys) = rev list @ a # ys}
1.1143 +\end{ttbox}
1.1144 +The induction hypothesis is still too weak, but this time it takes no
1.1145 +intuition to generalize: the problem is that \texttt{ys} is fixed throughout
1.1146 +the subgoal, but the induction hypothesis needs to be applied with
1.1147 +\texttt{a \# ys} instead of \texttt{ys}. Hence we prove the theorem
1.1148 +for all \texttt{ys} instead of a fixed one:
1.1149 +\begin{ttbox}\makeatother
1.1150 +\input{Misc/itrev3.ML}\end{ttbox}
1.1151 +This time induction on \texttt{xs} followed by simplification succeeds. This
1.1152 +leads to another heuristic for generalization:
1.1153 +\begin{quote}
1.1154 +{\em 4. Generalize goals for induction by universally quantifying all free
1.1155 +variables {\em(except the induction variable itself!)}.}
1.1156 +\end{quote}
1.1157 +This prevents trivial failures like the above and does not change the
1.1158 +provability of the goal. However, because it is not always required, and may
1.1159 +even complicate matters in some cases, this heuristic should not be
1.1160 +applied blindly.
1.1161 +
1.1162 +A final point worth mentioning is the orientation of the equation we just
1.1163 +proved: the more complex notion (\texttt{itrev}) is on the left-hand
1.1164 +side, the simpler \texttt{rev} on the right-hand side. This constitutes
1.1165 +another, albeit weak, heuristic that is not restricted to induction:
1.1166 +\begin{quote}
1.1167 + {\em 5. The right-hand side of an equation should (in some sense) be
1.1168 + simpler than the left-hand side.}
1.1169 +\end{quote}
1.1170 +The heuristic is tricky to apply because it is not obvious that
1.1171 +\texttt{rev xs \at\ ys} is simpler than \texttt{itrev xs ys}. But see what
1.1172 +happens if you try to prove the symmetric equation!
1.1173 +
1.1174 +
1.1175 +\section{Case study: compiling expressions}
1.1176 +\label{sec:ExprCompiler}
1.1177 +
1.1178 +The task is to develop a compiler from a generic type of expressions (built
1.1179 +up from variables, constants and binary operations) to a stack machine. This
1.1180 +generic type of expressions is a generalization of the boolean expressions in
1.1181 +\S\ref{sec:boolex}. This time we do not commit ourselves to a particular
1.1182 +type of variables or values but make them type parameters. Neither is there
1.1183 +a fixed set of binary operations: instead the expression contains the
1.1184 +appropriate function itself.
1.1185 +\begin{ttbox}
1.1186 +\input{CodeGen/expr}\end{ttbox}
1.1187 +The three constructors represent constants, variables and the combination of
1.1188 +two subexpressions with a binary operation.
1.1189 +
1.1190 +The value of an expression w.r.t.\ an environment that maps variables to
1.1191 +values is easily defined:
1.1192 +\begin{ttbox}
1.1193 +\input{CodeGen/value}\end{ttbox}
1.1194 +
1.1195 +The stack machine has three instructions: load a constant value onto the
1.1196 +stack, load the contents of a certain address onto the stack, and apply a
1.1197 +binary operation to the two topmost elements of the stack, replacing them by
1.1198 +the result. As for \texttt{expr}, addresses and values are type parameters:
1.1199 +\begin{ttbox}
1.1200 +\input{CodeGen/instr}\end{ttbox}
1.1201 +
1.1202 +The execution of the stack machine is modelled by a function \texttt{exec}
1.1203 +that takes a store (modelled as a function from addresses to values, just
1.1204 +like the environment for evaluating expressions), a stack (modelled as a
1.1205 +list) of values and a list of instructions and returns the stack at the end
1.1206 +of the execution --- the store remains unchanged:
1.1207 +\begin{ttbox}
1.1208 +\input{CodeGen/exec}\end{ttbox}
1.1209 +Recall that \texttt{hd} and \texttt{tl}
1.1210 +return the first element and the remainder of a list.
1.1211 +
1.1212 +Because all functions are total, \texttt{hd} is defined even for the empty
1.1213 +list, although we do not know what the result is. Thus our model of the
1.1214 +machine always terminates properly, although the above definition does not
1.1215 +tell us much about the result in situations where \texttt{Apply} was executed
1.1216 +with fewer than two elements on the stack.
1.1217 +
1.1218 +The compiler is a function from expressions to a list of instructions. Its
1.1219 +definition is pretty much obvious:
1.1220 +\begin{ttbox}\makeatother
1.1221 +\input{CodeGen/comp}\end{ttbox}
1.1222 +
1.1223 +Now we have to prove the correctness of the compiler, i.e.\ that the
1.1224 +execution of a compiled expression results in the value of the expression:
1.1225 +\begin{ttbox}
1.1226 +exec s [] (comp e) = [value s e]
1.1227 +\end{ttbox}
1.1228 +This is generalized to
1.1229 +\begin{ttbox}
1.1230 +\input{CodeGen/goal2.ML}\end{ttbox}
1.1231 +and proved by induction on \texttt{e} followed by simplification, once we
1.1232 +have the following lemma about executing the concatenation of two instruction
1.1233 +sequences:
1.1234 +\begin{ttbox}\makeatother
1.1235 +\input{CodeGen/goal2.ML}\end{ttbox}
1.1236 +This requires induction on \texttt{xs} and ordinary simplification for the
1.1237 +base cases. In the induction step, simplification leaves us with a formula
1.1238 +that contains two \texttt{case}-expressions over instructions. Thus we add
1.1239 +automatic case splitting as well, which finishes the proof:
1.1240 +\begin{ttbox}
1.1241 +\input{CodeGen/simpsplit.ML}\end{ttbox}
1.1242 +
1.1243 +We could now go back and prove \texttt{exec s [] (comp e) = [value s e]}
1.1244 +merely by simplification with the generalized version we just proved.
1.1245 +However, this is unnecessary because the generalized version fully subsumes
1.1246 +its instance.
1.1247 +
1.1248 +\section{Total recursive functions}
1.1249 +\label{sec:recdef}
1.1250 +\index{*recdef|(}
1.1251 +
1.1252 +
1.1253 +Although many total functions have a natural primitive recursive definition,
1.1254 +this is not always the case. Arbitrary total recursive functions can be
1.1255 +defined by means of \texttt{recdef}: you can use full pattern-matching,
1.1256 +recursion need not involve datatypes, and termination is proved by showing
1.1257 +that each recursive call makes the argument smaller in a suitable
1.1258 +(user-supplied) sense.
1.1259 +
1.1260 +\subsection{Defining recursive functions}
1.1261 +
1.1262 +Here is a simple example, the Fibonacci function:
1.1263 +\begin{ttbox}
1.1264 +consts fib :: nat => nat
1.1265 +recdef fib "measure(\%n. n)"
1.1266 + "fib 0 = 0"
1.1267 + "fib 1 = 1"
1.1268 + "fib (Suc(Suc x)) = fib x + fib (Suc x)"
1.1269 +\end{ttbox}
1.1270 +The definition of \texttt{fib} is accompanied by a \bfindex{measure function}
1.1271 +\texttt{\%n.$\;$n} that maps the argument of \texttt{fib} to a natural
1.1272 +number. The requirement is that in each equation the measure of the argument
1.1273 +on the left-hand side is strictly greater than the measure of the argument of
1.1274 +each recursive call. In the case of \texttt{fib} this is obviously true
1.1275 +because the measure function is the identity and \texttt{Suc(Suc~x)} is
1.1276 +strictly greater than both \texttt{x} and \texttt{Suc~x}.
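+
+Behind the scenes, the termination conditions Isabelle has to prove here
+amount to the two goals
+\begin{ttbox}
+x < Suc (Suc x) \hfill Suc x < Suc (Suc x)
+\end{ttbox}
+which the default simpset solves automatically.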
1.1277 +
1.1278 +Slightly more interesting is the insertion of a fixed element
1.1279 +between any two elements of a list:
1.1280 +\begin{ttbox}
1.1281 +consts sep :: "'a * 'a list => 'a list"
1.1282 +recdef sep "measure (\%(a,xs). length xs)"
1.1283 + "sep(a, []) = []"
1.1284 + "sep(a, [x]) = [x]"
1.1285 + "sep(a, x#y#zs) = x # a # sep(a,y#zs)"
1.1286 +\end{ttbox}
1.1287 +This time the measure is the length of the list, which decreases with the
1.1288 +recursive call; the first component of the argument tuple is irrelevant.
1.1289 +
1.1290 +Pattern matching need not be exhaustive:
1.1291 +\begin{ttbox}
1.1292 +consts last :: 'a list => 'a
1.1293 +recdef last "measure (\%xs. length xs)"
1.1294 + "last [x] = x"
1.1295 + "last (x#y#zs) = last (y#zs)"
1.1296 +\end{ttbox}
1.1297 +
1.1298 +Overlapping patterns are disambiguated by taking the order of equations into
1.1299 +account, just as in functional programming:
1.1300 +\begin{ttbox}
1.1301 +recdef sep "measure (\%(a,xs). length xs)"
1.1302 + "sep(a, x#y#zs) = x # a # sep(a,y#zs)"
1.1303 + "sep(a, xs) = xs"
1.1304 +\end{ttbox}
1.1305 +This defines exactly the same function \texttt{sep} as before.
1.1306 +
1.1307 +\begin{warn}
1.1308 +Currently \texttt{recdef} only accepts functions with a single argument,
1.1309 +possibly of tuple type.
1.1310 +\end{warn}
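+
+A function of several arguments must therefore be defined over a tuple, as in
+\texttt{sep} above. Should a curried version be required as well, it can be
+introduced afterwards by an ordinary, non-recursive definition (a sketch; the
+name \texttt{sep'} is merely illustrative):
+\begin{ttbox}
+consts sep' :: "'a => 'a list => 'a list"
+defs sep'_def "sep' a xs == sep(a,xs)"
+\end{ttbox}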
1.1311 +
1.1312 +When loading a theory containing a \texttt{recdef} of a function $f$,
1.1313 +Isabelle proves the recursion equations and stores the result as a list of
1.1314 +theorems $f$\texttt{.rules}. It can be viewed by typing
1.1315 +\begin{ttbox}
1.1316 +prths \(f\).rules;
1.1317 +\end{ttbox}
1.1318 +All of the above examples are simple enough that Isabelle can determine
1.1319 +automatically that the measure of the argument goes down in each recursive
1.1320 +call. In that case $f$.\texttt{rules} contains precisely the defining
1.1321 +equations.
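+For \texttt{fib}, for example, \texttt{prths fib.rules;} prints (up to the
+exact layout of the output) just the three defining equations:
+\begin{ttbox}
+fib 0 = 0
+fib 1 = 1
+fib (Suc (Suc x)) = fib x + fib (Suc x)
+\end{ttbox}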
1.1322 +
1.1323 +In general, Isabelle may not be able to prove all termination conditions
1.1324 +automatically. For example, termination of
1.1325 +\begin{ttbox}
1.1326 +consts gcd :: "nat*nat => nat"
1.1327 +recdef gcd "measure (\%(m,n). n)"
1.1328 + "gcd (m, n) = (if n=0 then m else gcd(n, m mod n))"
1.1329 +\end{ttbox}
1.1330 +relies on the lemma \texttt{mod_less_divisor}
1.1331 +\begin{ttbox}
1.1332 +0 < n ==> m mod n < n
1.1333 +\end{ttbox}
1.1334 +that is not part of the default simpset. As a result, Isabelle prints a
1.1335 +warning and \texttt{gcd.rules} contains a precondition:
1.1336 +\begin{ttbox}
1.1337 +(! m n. 0 < n --> m mod n < n) ==> gcd (m, n) = (if n=0 \dots)
1.1338 +\end{ttbox}
1.1339 +We need to instruct \texttt{recdef} to use an extended simpset to prove the
1.1340 +termination condition:
1.1341 +\begin{ttbox}
1.1342 +recdef gcd "measure (\%(m,n). n)"
1.1343 + simpset "simpset() addsimps [mod_less_divisor]"
1.1344 + "gcd (m, n) = (if n=0 then m else gcd(n, m mod n))"
1.1345 +\end{ttbox}
1.1346 +This time everything works fine and \texttt{gcd.rules} contains precisely the
1.1347 +stated recursion equation for \texttt{gcd}.
1.1348 +
1.1349 +When defining some nontrivial total recursive function, the first attempt
1.1350 +will usually generate a number of termination conditions, some of which may
1.1351 +require new lemmas to be proved in some of the parent theories. Those lemmas
1.1352 +can then be added to the simpset used by \texttt{recdef} for its
1.1353 +proofs, as shown for \texttt{gcd}.
1.1354 +
1.1355 +Although all the above examples employ measure functions, \texttt{recdef}
1.1356 +allows arbitrary wellfounded relations. For example, termination of
1.1357 +Ackermann's function requires the lexicographic product \texttt{**}:
1.1358 +\begin{ttbox}
+consts ack :: "nat*nat => nat"
1.1359 +recdef ack "measure(\%m. m) ** measure(\%n. n)"
1.1360 + "ack(0,n) = Suc n"
1.1361 + "ack(Suc m,0) = ack(m, 1)"
1.1362 + "ack(Suc m,Suc n) = ack(m,ack(Suc m,n))"
1.1363 +\end{ttbox}
1.1364 +For details see~\cite{Isa-Logics-Man} and the examples in the library.
1.1365 +
1.1366 +
1.1367 +\subsection{Deriving simplification rules}
1.1368 +
1.1369 +Once we have succeeded in proving all termination conditions, we can start to
1.1370 +derive some theorems. In contrast to \texttt{primrec} definitions, which are
1.1371 +automatically added to the simpset, \texttt{recdef} rules must be included
1.1372 +explicitly, for example as in
1.1373 +\begin{ttbox}
1.1374 +Addsimps fib.rules;
1.1375 +\end{ttbox}
1.1376 +However, some care is necessary now, in contrast to \texttt{primrec}.
1.1377 +Although \texttt{gcd} is a total function, its defining equation leads to
1.1378 +nontermination of the simplifier, because the subterm \texttt{gcd(n, m mod
1.1379 + n)} on the right-hand side can again be simplified by the same equation,
1.1380 +and so on. The reason: the simplifier rewrites the \texttt{then} and
1.1381 +\texttt{else} branches of a conditional if the condition simplifies to
1.1382 +neither \texttt{True} nor \texttt{False}. Therefore it is recommended to
1.1383 +derive an alternative formulation that replaces case distinctions on the
1.1384 +right-hand side by conditional equations. For \texttt{gcd} it means we have
1.1385 +to prove
1.1386 +\begin{ttbox}
1.1387 + gcd (m, 0) = m
1.1388 +n ~= 0 ==> gcd (m, n) = gcd(n, m mod n)
1.1389 +\end{ttbox}
1.1390 +To avoid nontermination during those proofs, we have to resort to some
1.1391 +low-level tactics:
1.1392 +\begin{ttbox}
1.1393 +Goal "gcd(m,0) = m";
1.1394 +by(resolve_tac [trans] 1);
1.1395 +by(resolve_tac gcd.rules 1);
1.1396 +by(Simp_tac 1);
1.1397 +\end{ttbox}
1.1398 +At this point it is not necessary to understand what exactly
1.1399 +\texttt{resolve_tac} is doing. The main point is that the above proof works
1.1400 +not just for this one example but in general; one merely has to use
1.1401 +\texttt{Asm_simp_tac} and $f$\texttt{.rules} instead. Try the second
1.1402 +\texttt{gcd}-equation above!
1.1403 +
1.1404 +\subsection{Induction}
1.1405 +
1.1406 +Assuming we have added the recursion equations (or some suitable derived
1.1407 +equations) to the simpset, we might like to prove something about our
1.1408 +function. Since the function is recursive, the natural proof principle is
1.1409 +again induction. But this time the structural form of induction that comes
1.1410 +with datatypes is unlikely to work well---otherwise we could have defined the
1.1411 +function by \texttt{primrec}. Therefore \texttt{recdef} automatically proves
1.1412 +a suitable induction rule $f$\texttt{.induct} that follows the recursion
1.1413 +pattern of the particular function $f$. Roughly speaking, it requires you to
1.1414 +prove for each \texttt{recdef} equation that the property you are trying to
1.1415 +establish holds for the left-hand side provided it holds for all recursive
1.1416 +calls on the right-hand side. Applying $f$\texttt{.induct} requires its
1.1417 +explicit instantiation. See \S\ref{sec:explicit-inst} for details.
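+
+For example, a simple distributivity property of \texttt{sep} could be proved
+along the following lines. This is only a sketch: it assumes that
+\texttt{sep.rules} have been added to the simpset, and that \texttt{u} and
+\texttt{v} are the variable names occurring in \texttt{sep.induct}:
+\begin{ttbox}
+Goal "map f (sep(x,xs)) = sep(f x, map f xs)";
+by(res_inst_tac [("u","x"),("v","xs")] sep.induct 1);
+by(ALLGOALS Asm_simp_tac);
+\end{ttbox}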
1.1418 +
1.1419 +\index{*recdef|)}