|
1 \chapter{Functional Programming in HOL} |
|
2 |
|
3 Although on the surface this chapter is mainly concerned with how to write |
|
4 functional programs in HOL and how to verify them, most of the |
|
5 constructs and proof procedures introduced are general purpose and recur in |
|
6 any specification or verification task. |
|
7 |
|
8 The dedicated functional programmer should be warned: HOL offers only what |
|
9 could be called {\em total functional programming} --- all functions in HOL |
|
10 must be total; lazy data structures are not directly available. On the |
|
11 positive side, functions in HOL need not be computable: HOL is a |
|
12 specification language that goes well beyond what can be expressed as a |
|
13 program. However, for the time being we concentrate on the computable. |
|
14 |
|
15 \section{An introductory theory} |
|
16 \label{sec:intro-theory} |
|
17 |
|
18 Functional programming needs datatypes and functions. Both of them can be |
|
19 defined in a theory with a syntax reminiscent of languages like ML or |
|
20 Haskell. As an example consider the theory in Fig.~\ref{fig:ToyList}. |
|
21 |
|
22 \begin{figure}[htbp] |
|
23 \begin{ttbox}\makeatother |
|
24 \input{ToyList/ToyList.thy}\end{ttbox} |
|
25 \caption{A theory of lists} |
|
26 \label{fig:ToyList} |
|
27 \end{figure} |
|
28 |
|
29 HOL already has a predefined theory of lists called \texttt{List} --- |
|
30 \texttt{ToyList} is merely a small fragment of it chosen as an example. In |
|
31 contrast to what is recommended in \S\ref{sec:Basic:Theories}, |
|
32 \texttt{ToyList} is not based on \texttt{Main} but on \texttt{Datatype}, a |
|
33 theory that contains everything required for datatype definitions but does |
|
34 not have \texttt{List} as a parent, thus avoiding ambiguities caused by |
|
35 defining lists twice. |
|
36 |
|
37 The \ttindexbold{datatype} \texttt{list} introduces two constructors |
|
38 \texttt{Nil} and \texttt{Cons}, the empty list and the operator that adds an |
|
39 element to the front of a list. For example, the term \texttt{Cons True (Cons |
|
40 False Nil)} is a value of type \texttt{bool~list}, namely the list with the |
|
41 elements \texttt{True} and \texttt{False}. Because this notation becomes |
|
42 unwieldy very quickly, the datatype declaration is annotated with an |
|
43 alternative syntax: instead of \texttt{Nil} and \texttt{Cons}~$x$~$xs$ we can |
|
44 write \index{#@{\tt[]}|bold}\texttt{[]} and |
|
45 \texttt{$x$~\#~$xs$}\index{#@{\tt\#}|bold}. In fact, this alternative syntax |
|
46 is the standard syntax. Thus the list \texttt{Cons True (Cons False Nil)} |
|
47 becomes \texttt{True \# False \# []}. The annotation \ttindexbold{infixr} |
|
48 means that \texttt{\#} associates to the right, i.e.\ the term \texttt{$x$ \# |
|
49 $y$ \# $z$} is read as \texttt{$x$ \# ($y$ \# $z$)} and not as \texttt{($x$ |
|
50 \# $y$) \# $z$}. |
|
51 |
|
52 \begin{warn} |
|
53 Syntax annotations are a powerful but completely optional feature. You |
|
54 could drop them from theory \texttt{ToyList} and go back to the identifiers |
|
55 \texttt{Nil} and \texttt{Cons}. However, lists are such a central datatype |
|
  that their syntax is highly customized. We recommend that novices avoid
  syntax annotations in their own theories.
|
58 \end{warn} |
|
59 |
|
60 Next, the functions \texttt{app} and \texttt{rev} are declared. In contrast |
|
61 to ML, Isabelle insists on explicit declarations of all functions (keyword |
|
62 \ttindexbold{consts}). (Apart from the declaration-before-use restriction, |
|
63 the order of items in a theory file is unconstrained.) Function \texttt{app} |
|
64 is annotated with concrete syntax too. Instead of the prefix syntax |
|
65 \texttt{app}~$xs$~$ys$ the infix $xs$~\texttt{\at}~$ys$ becomes the preferred |
|
66 form. |
|
67 |
|
68 Both functions are defined recursively. The equations for \texttt{app} and |
|
69 \texttt{rev} hardly need comments: \texttt{app} appends two lists and |
|
70 \texttt{rev} reverses a list. The keyword \texttt{primrec} indicates that |
|
71 the recursion is of a particularly primitive kind where each recursive call |
|
72 peels off a datatype constructor from one of the arguments (see |
|
73 \S\ref{sec:datatype}). Thus the recursion always terminates, i.e.\ the |
|
74 function is \bfindex{total}. |
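For readers who do not have the figure at hand, a theory file matching the
description above might look roughly as follows. This is a sketch
reconstructed from the prose (the parent \texttt{Datatype}, the two
\texttt{infixr} annotations and the priority \texttt{65} are all mentioned in
this chapter); the exact layout of \texttt{ToyList.thy} may differ.
\begin{ttbox}\makeatother
ToyList = Datatype +

datatype 'a list = Nil                       ("[]")
                 | Cons 'a ('a list)         (infixr "#" 65)

consts app :: 'a list => 'a list => 'a list  (infixr "@" 65)
       rev :: 'a list => 'a list

primrec
"[] @ ys       = ys"
"(x # xs) @ ys = x # (xs @ ys)"

primrec
"rev []        = []"
"rev (x # xs)  = (rev xs) @ (x # [])"

end
\end{ttbox}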
|
75 |
|
76 The termination requirement is absolutely essential in HOL, a logic of total |
|
77 functions. If we were to drop it, inconsistencies could quickly arise: the |
|
78 ``definition'' $f(n) = f(n)+1$ immediately leads to $0 = 1$ by subtracting |
|
79 $f(n)$ on both sides. |
|
80 % However, this is a subtle issue that we cannot discuss here further. |
|
81 |
|
82 \begin{warn} |
|
83 As we have indicated, the desire for total functions is not a gratuitously |
|
84 imposed restriction but an essential characteristic of HOL. It is only |
|
85 because of totality that reasoning in HOL is comparatively easy. More |
|
86 generally, the philosophy in HOL is not to allow arbitrary axioms (such as |
|
87 function definitions whose totality has not been proved) because they |
|
88 quickly lead to inconsistencies. Instead, fixed constructs for introducing |
|
89 types and functions are offered (such as \texttt{datatype} and |
|
90 \texttt{primrec}) which are guaranteed to preserve consistency. |
|
91 \end{warn} |
|
92 |
|
93 A remark about syntax. The textual definition of a theory follows a fixed |
|
94 syntax with keywords like \texttt{datatype} and \texttt{end} (see |
|
95 Fig.~\ref{fig:keywords} in Appendix~\ref{sec:Appendix} for a full list). |
|
96 Embedded in this syntax are the types and formulae of HOL, whose syntax is |
|
97 extensible, e.g.\ by new user-defined infix operators |
|
(see~\S\ref{sec:infix-syntax}). To distinguish the two levels, everything
|
99 HOL-specific should be enclosed in \texttt{"}\dots\texttt{"}. The same holds |
|
100 for identifiers that happen to be keywords, as in |
|
101 \begin{ttbox} |
|
102 consts "end" :: 'a list => 'a |
|
103 \end{ttbox} |
|
104 To lessen this burden, quotation marks around types can be dropped, |
|
105 provided their syntax does not go beyond what is described in |
|
106 \S\ref{sec:TypesTermsForms}. Types containing further operators, e.g.\ |
|
107 \texttt{*} for Cartesian products, need quotation marks. |
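As a small illustration, the two declarations below contrast a type that may
be written without quotation marks with one that needs them. Both constants,
\texttt{len} and \texttt{zip2}, are hypothetical and serve only to show the
syntax.
\begin{ttbox}
consts len  :: 'a list => nat
       zip2 :: "'a list => 'b list => ('a * 'b) list"
\end{ttbox}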
|
108 |
|
109 When Isabelle prints a syntax error message, it refers to the HOL syntax as |
|
110 the \bfindex{inner syntax}. |
|
111 |
|
112 \section{An introductory proof} |
|
113 \label{sec:intro-proof} |
|
114 |
|
115 Having defined \texttt{ToyList}, we load it with the ML command |
|
116 \begin{ttbox} |
|
117 use_thy "ToyList"; |
|
118 \end{ttbox} |
|
119 and are ready to prove a few simple theorems. This will illustrate not just |
|
120 the basic proof commands but also the typical proof process. |
|
121 |
|
122 \subsubsection*{Main goal: \texttt{rev(rev xs) = xs}} |
|
123 |
|
124 Our goal is to show that reversing a list twice produces the original |
|
125 list. Typing |
|
126 \begin{ttbox} |
|
127 \input{ToyList/thm}\end{ttbox} |
|
128 establishes a new goal to be proved in the context of the current theory, |
|
129 which is the one we just loaded. Isabelle's response is to print the current proof state: |
|
130 \begin{ttbox} |
|
131 {\out Level 0} |
|
132 {\out rev (rev xs) = xs} |
|
133 {\out 1. rev (rev xs) = xs} |
|
134 \end{ttbox} |
|
135 Until we have finished a proof, the proof state always looks like this: |
|
136 \begin{ttbox} |
|
137 {\out Level \(i\)} |
|
138 {\out \(G\)} |
|
139 {\out 1. \(G@1\)} |
|
140 {\out \(\vdots\)} |
|
141 {\out \(n\). \(G@n\)} |
|
142 \end{ttbox} |
|
143 where \texttt{Level}~$i$ indicates that we are $i$ steps into the proof, $G$ |
|
144 is the overall goal that we are trying to prove, and the numbered lines |
|
145 contain the subgoals $G@1$, \dots, $G@n$ that we need to prove to establish |
|
146 $G$. At \texttt{Level 0} there is only one subgoal, which is identical with |
|
147 the overall goal. Normally $G$ is constant and only serves as a reminder. |
|
148 Hence we rarely show it in this tutorial. |
|
149 |
|
150 Let us now get back to \texttt{rev(rev xs) = xs}. Properties of recursively |
|
151 defined functions are best established by induction. In this case there is |
|
152 not much choice except to induct on \texttt{xs}: |
|
153 \begin{ttbox} |
|
154 \input{ToyList/inductxs}\end{ttbox} |
|
155 This tells Isabelle to perform induction on variable \texttt{xs} in subgoal |
|
156 1. The new proof state contains two subgoals, namely the base case |
|
157 (\texttt{Nil}) and the induction step (\texttt{Cons}): |
|
158 \begin{ttbox} |
|
159 {\out 1. rev (rev []) = []} |
|
160 {\out 2. !!a list. rev (rev list) = list ==> rev (rev (a # list)) = a # list} |
|
161 \end{ttbox} |
|
162 The induction step is an example of the general format of a subgoal: |
|
163 \begin{ttbox} |
|
164 {\out \(i\). !!\(x@1 \dots x@n\). {\it assumptions} ==> {\it conclusion}} |
|
165 \end{ttbox}\index{==>@{\tt==>}|bold} |
|
166 The prefix of bound variables \texttt{!!\(x@1 \dots x@n\)} can be ignored |
|
167 most of the time, or simply treated as a list of variables local to this |
|
168 subgoal. Their deeper significance is explained in \S\ref{sec:PCproofs}. The |
|
169 {\it assumptions} are the local assumptions for this subgoal and {\it |
|
170 conclusion} is the actual proposition to be proved. Typical proof steps |
|
171 that add new assumptions are induction or case distinction. In our example |
|
172 the only assumption is the induction hypothesis \texttt{rev (rev list) = |
|
173 list}, where \texttt{list} is a variable name chosen by Isabelle. If there |
|
174 are multiple assumptions, they are enclosed in the bracket pair |
|
175 \texttt{[|}\index{==>@\ttlbr|bold} and \texttt{|]}\index{==>@\ttrbr|bold} |
|
176 and separated by semicolons. |
|
177 |
|
178 Let us try to solve both goals automatically: |
|
179 \begin{ttbox} |
|
180 \input{ToyList/autotac}\end{ttbox} |
|
181 This command tells Isabelle to apply a proof strategy called |
|
182 \texttt{Auto_tac} to all subgoals. Essentially, \texttt{Auto_tac} tries to |
|
183 `simplify' the subgoals. In our case, subgoal~1 is solved completely (thanks |
|
184 to the equation \texttt{rev [] = []}) and disappears; the simplified version |
|
185 of subgoal~2 becomes the new subgoal~1: |
|
186 \begin{ttbox}\makeatother |
|
187 {\out 1. !!a list. rev(rev list) = list ==> rev(rev list @ a # []) = a # list} |
|
188 \end{ttbox} |
|
189 In order to simplify this subgoal further, a lemma suggests itself. |
|
190 |
|
191 \subsubsection*{First lemma: \texttt{rev(xs \at~ys) = (rev ys) \at~(rev xs)}} |
|
192 |
|
193 We start the proof as usual: |
|
194 \begin{ttbox}\makeatother |
|
195 \input{ToyList/lemma1}\end{ttbox} |
|
196 There are two variables that we could induct on: \texttt{xs} and |
|
197 \texttt{ys}. Because \texttt{\at} is defined by recursion on |
|
198 the first argument, \texttt{xs} is the correct one: |
|
199 \begin{ttbox} |
|
200 \input{ToyList/inductxs}\end{ttbox} |
|
201 This time not even the base case is solved automatically: |
|
202 \begin{ttbox}\makeatother |
|
203 by(Auto_tac); |
|
204 {\out 1. rev ys = rev ys @ []} |
|
205 {\out 2. \dots} |
|
206 \end{ttbox} |
|
207 We need another lemma. |
|
208 |
|
209 \subsubsection*{Second lemma: \texttt{xs \at~[] = xs}} |
|
210 |
|
211 This time the canonical proof procedure |
|
212 \begin{ttbox}\makeatother |
|
213 \input{ToyList/lemma2}\input{ToyList/inductxs}\input{ToyList/autotac}\end{ttbox} |
|
214 leads to the desired message \texttt{No subgoals!}: |
|
215 \begin{ttbox}\makeatother |
|
216 {\out Level 2} |
|
217 {\out xs @ [] = xs} |
|
218 {\out No subgoals!} |
|
219 \end{ttbox} |
|
220 Now we can give the lemma just proved a suitable name |
|
221 \begin{ttbox} |
|
222 \input{ToyList/qed2}\end{ttbox} |
|
223 and tell Isabelle to use this lemma in all future proofs by simplification: |
|
224 \begin{ttbox} |
|
225 \input{ToyList/addsimps2}\end{ttbox} |
|
226 Note that in the theorem \texttt{app_Nil2} the free variable \texttt{xs} has |
|
227 been replaced by the unknown \texttt{?xs}, just as explained in |
|
228 \S\ref{sec:variables}. |
|
229 |
|
230 Going back to the proof of the first lemma |
|
231 \begin{ttbox}\makeatother |
|
232 \input{ToyList/lemma1}\input{ToyList/inductxs}\input{ToyList/autotac}\end{ttbox} |
|
233 we find that this time \texttt{Auto_tac} solves the base case, but the |
|
234 induction step merely simplifies to |
|
235 \begin{ttbox}\makeatother |
|
236 {\out 1. !!a list.} |
|
237 {\out rev (list @ ys) = rev ys @ rev list} |
|
238 {\out ==> (rev ys @ rev list) @ a # [] = rev ys @ rev list @ a # []} |
|
239 \end{ttbox} |
|
240 Now we need to remember that \texttt{\at} associates to the right, and that |
|
241 \texttt{\#} and \texttt{\at} have the same priority (namely the \texttt{65} |
|
242 in the definition of \texttt{ToyList}). Thus the conclusion really is |
|
243 \begin{ttbox}\makeatother |
|
244 {\out ==> (rev ys @ rev list) @ (a # []) = rev ys @ (rev list @ (a # []))} |
|
245 \end{ttbox} |
|
246 and the missing lemma is associativity of \texttt{\at}. |
|
247 |
|
248 \subsubsection*{Third lemma: \texttt{(xs \at~ys) \at~zs = xs \at~(ys \at~zs)}} |
|
249 |
|
250 This time the canonical proof procedure |
|
251 \begin{ttbox}\makeatother |
|
252 \input{ToyList/lemma3}\end{ttbox} |
|
253 succeeds without further ado. Again we name the lemma and add it to |
|
254 the set of lemmas used during simplification: |
|
255 \begin{ttbox} |
|
256 \input{ToyList/qed3}\end{ttbox} |
|
257 Now we can go back and prove the first lemma |
|
258 \begin{ttbox}\makeatother |
|
259 \input{ToyList/lemma1}\input{ToyList/inductxs}\input{ToyList/autotac}\end{ttbox} |
|
260 add it to the simplification lemmas |
|
261 \begin{ttbox} |
|
262 \input{ToyList/qed1}\end{ttbox} |
|
263 and then solve our main theorem: |
|
264 \begin{ttbox}\makeatother |
|
265 \input{ToyList/thm}\input{ToyList/inductxs}\input{ToyList/autotac}\end{ttbox} |
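Collected in dependency order, the whole development might be replayed with a
script like the one below. It is a sketch assembled from the goals and
commands quoted in this section; the theorem names \texttt{app_assoc},
\texttt{rev_app} and \texttt{rev_rev} are assumptions (only
\texttt{app_Nil2} is named in the text).
\begin{ttbox}\makeatother
(* second lemma: [] is a right unit of @ *)
Goal "xs @ [] = xs";
by(induct_tac "xs" 1);
by(Auto_tac);
qed "app_Nil2";
Addsimps [app_Nil2];

(* third lemma: associativity of @ *)
Goal "(xs @ ys) @ zs = xs @ (ys @ zs)";
by(induct_tac "xs" 1);
by(Auto_tac);
qed "app_assoc";
Addsimps [app_assoc];

(* first lemma: rev distributes over @, swapping the arguments *)
Goal "rev(xs @ ys) = (rev ys) @ (rev xs)";
by(induct_tac "xs" 1);
by(Auto_tac);
qed "rev_app";
Addsimps [rev_app];

(* main goal *)
Goal "rev(rev xs) = xs";
by(induct_tac "xs" 1);
by(Auto_tac);
qed "rev_rev";
\end{ttbox}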
|
266 |
|
267 \subsubsection*{Review} |
|
268 |
|
269 This is the end of our toy proof. It should have familiarized you with |
|
270 \begin{itemize} |
|
271 \item the standard theorem proving procedure: |
|
272 state a goal; proceed with proof until a new lemma is required; prove that |
|
273 lemma; come back to the original goal. |
|
274 \item a specific procedure that works well for functional programs: |
|
275 induction followed by all-out simplification via \texttt{Auto_tac}. |
|
276 \item a basic repertoire of proof commands. |
|
277 \end{itemize} |
|
278 |
|
279 |
|
280 \section{Some helpful commands} |
|
281 \label{sec:commands-and-hints} |
|
282 |
|
283 This section discusses a few basic commands for manipulating the proof state |
|
284 and can be skipped by casual readers. |
|
285 |
|
286 There are two kinds of commands used during a proof: the actual proof |
|
287 commands and auxiliary commands for examining the proof state and controlling |
|
288 the display. Proof commands are always of the form |
|
289 \texttt{by(\textit{tactic});}\indexbold{tactic} where \textbf{tactic} is a |
|
290 synonym for ``theorem proving function''. Typical examples are |
|
291 \texttt{induct_tac} and \texttt{Auto_tac} --- the suffix \texttt{_tac} is |
|
292 merely a mnemonic. Further tactics are introduced throughout the tutorial. |
|
293 |
|
294 %Tactics can also be modified. For example, |
|
295 %\begin{ttbox} |
|
296 %by(ALLGOALS Asm_simp_tac); |
|
297 %\end{ttbox} |
|
298 %tells Isabelle to apply \texttt{Asm_simp_tac} to all subgoals. For more on |
|
299 %tactics and how to combine them see~\S\ref{sec:Tactics}. |
|
300 |
|
301 The most useful auxiliary commands are: |
|
302 \begin{description} |
|
303 \item[Printing the current state] |
|
304 Type \texttt{pr();} to redisplay the current proof state, for example when it |
|
305 has disappeared off the screen. |
|
306 \item[Limiting the number of subgoals] |
|
307 Typing \texttt{prlim $k$;} tells Isabelle to print only the first $k$ |
|
308 subgoals from now on and redisplays the current proof state. This is helpful |
|
309 when there are many subgoals. |
|
310 \item[Undoing] Typing \texttt{undo();} undoes the effect of the last |
|
311 tactic. |
|
312 \item[Context switch] Every proof happens in the context of a |
|
313 \bfindex{current theory}. By default, this is the last theory loaded. If |
|
314 you want to prove a theorem in the context of a different theory |
|
315 \texttt{T}, you need to type \texttt{context T.thy;}\index{*context|bold} |
|
316 first. Of course you need to change the context again if you want to go |
|
317 back to your original theory. |
|
318 \item[Displaying types] We have already mentioned the flag |
|
319 \ttindex{show_types} above. It can also be useful for detecting typos in |
|
320 formulae early on. For example, if \texttt{show_types} is set and the goal |
|
321 \texttt{rev(rev xs) = xs} is started, Isabelle prints the additional output |
|
322 \begin{ttbox} |
|
323 {\out Variables:} |
|
324 {\out xs :: 'a list} |
|
325 \end{ttbox} |
|
326 which tells us that Isabelle has correctly inferred that |
|
327 \texttt{xs} is a variable of list type. On the other hand, had we |
|
328 made a typo as in \texttt{rev(re xs) = xs}, the response |
|
329 \begin{ttbox} |
|
{\out Variables:}
{\out   re :: 'a list => 'a list}
{\out   xs :: 'a list}
|
333 \end{ttbox} |
|
334 would have alerted us because of the unexpected variable \texttt{re}. |
|
335 \item[(Re)loading theories]\index{loading theories}\index{reloading theories} |
|
336 Initially you load theory \texttt{T} by typing \ttindex{use_thy}~\texttt{"T";}, |
|
337 which loads all parent theories of \texttt{T} automatically, if they are not |
|
338 loaded already. If you modify \texttt{T.thy} or \texttt{T.ML}, you can |
|
339 reload it by typing \texttt{use_thy~"T";} again. This time, however, only |
|
340 \texttt{T} is reloaded. If some of \texttt{T}'s parents have changed as well, |
|
341 type \ttindexbold{update}\texttt{();} to reload all theories that have |
|
342 changed. |
|
343 \end{description} |
|
344 Further commands are found in the Reference Manual. |
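A short session combining these commands might look as follows. It is only a
sketch: the comments describe the effect each command is expected to have
according to the descriptions above, and \texttt{set show_types;} is assumed
to be the way the flag is switched on.
\begin{ttbox}
set show_types;         (* display the types of variables with each goal *)
context ToyList.thy;    (* make ToyList the current theory *)
Goal "rev(rev xs) = xs";
by(induct_tac "xs" 1);  (* two subgoals: base case and induction step *)
prlim 1;                (* from now on print only the first subgoal *)
undo();                 (* take back the induction *)
pr();                   (* redisplay the current proof state *)
\end{ttbox}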
|
345 |
|
346 |
|
347 \section{Datatypes} |
|
348 \label{sec:datatype} |
|
349 |
|
350 Inductive datatypes are part of almost every non-trivial application of HOL. |
|
351 First we take another look at a very important example, the datatype of |
|
352 lists, before we turn to datatypes in general. The section closes with a |
|
353 case study. |
|
354 |
|
355 |
|
356 \subsection{Lists} |
|
357 |
|
358 Lists are one of the essential datatypes in computing. Readers of this tutorial |
|
359 and users of HOL need to be familiar with their basic operations. Theory |
|
360 \texttt{ToyList} is only a small fragment of HOL's predefined theory |
|
361 \texttt{List}\footnote{\texttt{http://www.in.tum.de/\~\relax |
|
362 isabelle/library/HOL/List.html}}. |
|
363 The latter contains many further operations. For example, the functions |
|
364 \ttindexbold{hd} (`head') and \ttindexbold{tl} (`tail') return the first |
|
365 element and the remainder of a list. (However, pattern-matching is usually |
|
366 preferable to \texttt{hd} and \texttt{tl}.) |
|
367 Theory \texttt{List} also contains more syntactic sugar: |
|
368 \texttt{[}$x@1$\texttt{,}\dots\texttt{,}$x@n$\texttt{]} abbreviates |
|
369 $x@1$\texttt{\#}\dots\texttt{\#}$x@n$\texttt{\#[]}. |
|
370 In the rest of the tutorial we always use HOL's predefined lists. |
|
371 |
|
372 |
|
373 \subsection{The general format} |
|
374 \label{sec:general-datatype} |
|
375 |
|
376 The general HOL \texttt{datatype} definition is of the form |
|
377 \[ |
|
378 \mathtt{datatype}~(\alpha@1, \dots, \alpha@n) \, t ~=~ |
|
379 C@1~\tau@{11}~\dots~\tau@{1k@1} ~\mid~ \dots ~\mid~ |
|
380 C@m~\tau@{m1}~\dots~\tau@{mk@m} |
|
381 \] |
|
382 where $\alpha@i$ are type variables (the parameters), $C@i$ are distinct |
|
383 constructor names and $\tau@{ij}$ are types; it is customary to capitalize |
|
384 the first letter in constructor names. There are a number of |
|
restrictions (for example, the type must not be empty) detailed
|
386 elsewhere~\cite{Isa-Logics-Man}. Isabelle notifies you if you violate them. |
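For instance, the list datatype of \S\ref{sec:intro-theory} is the instance
with $n = 1$ and $m = 2$, where $C@1 = \mathtt{Nil}$ takes no arguments
($k@1 = 0$) and $C@2 = \mathtt{Cons}$ takes two ($k@2 = 2$):
\[
\mathtt{datatype}~(\alpha) \, \mathtt{list} ~=~
\mathtt{Nil} ~\mid~ \mathtt{Cons}~\alpha~((\alpha) \, \mathtt{list})
\]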
|
387 |
|
388 Laws about datatypes, such as \verb$[] ~= x#xs$ and \texttt{(x\#xs = y\#ys) = |
|
389 (x=y \& xs=ys)}, are used automatically during proofs by simplification. |
|
390 The same is true for the equations in primitive recursive function |
|
391 definitions. |
|
392 |
|
393 \subsection{Primitive recursion} |
|
394 |
|
395 Functions on datatypes are usually defined by recursion. In fact, most of the |
|
396 time they are defined by what is called \bfindex{primitive recursion}. |
|
397 The keyword \texttt{primrec} is followed by a list of equations |
|
398 \[ f \, x@1 \, \dots \, (C \, y@1 \, \dots \, y@k)\, \dots \, x@n = r \] |
|
399 such that $C$ is a constructor of the datatype $t$ and all recursive calls of |
|
400 $f$ in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$. Thus |
|
401 Isabelle immediately sees that $f$ terminates because one (fixed!) argument |
|
402 becomes smaller with every recursive call. There must be exactly one equation |
|
403 for each constructor. Their order is immaterial. |
|
404 A more general method for defining total recursive functions is explained in |
|
405 \S\ref{sec:recdef}. |
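As an illustration of the schema, a length function on lists could be defined
as shown below. The name \texttt{len} is hypothetical; the point is that the
only recursive call, \texttt{len xs}, is applied to an argument of the
constructor pattern \texttt{x \# xs}, and that there is one equation per
constructor.
\begin{ttbox}
consts len :: 'a list => nat
primrec
"len []       = 0"
"len (x # xs) = Suc (len xs)"
\end{ttbox}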
|
406 |
|
407 \begin{exercise} |
|
408 Given the datatype of binary trees |
|
409 \begin{ttbox} |
|
410 \input{Misc/tree}\end{ttbox} |
|
411 define a function \texttt{mirror} that mirrors the structure of a binary tree |
|
412 by swapping subtrees (recursively). Prove \texttt{mirror(mirror(t)) = t}. |
|
413 \end{exercise} |
|
414 |
|
415 \subsection{\texttt{case}-expressions} |
|
416 \label{sec:case-expressions} |
|
417 |
|
418 HOL also features \ttindexbold{case}-expressions for analyzing elements of a |
|
419 datatype. For example, |
|
420 \begin{ttbox} |
|
421 case xs of [] => 0 | y#ys => y |
|
422 \end{ttbox} |
|
423 evaluates to \texttt{0} if \texttt{xs} is \texttt{[]} and to \texttt{y} if |
|
424 \texttt{xs} is \texttt{y\#ys}. (Since the result in both branches must be of |
|
425 the same type, it follows that \texttt{y::nat} and hence |
|
426 \texttt{xs::(nat)list}.) |
|
427 |
|
428 In general, if $e$ is a term of the datatype $t$ defined in |
|
429 \S\ref{sec:general-datatype} above, the corresponding |
|
430 \texttt{case}-expression analyzing $e$ is |
|
431 \[ |
|
432 \begin{array}{rrcl} |
|
433 \mbox{\tt case}~e~\mbox{\tt of} & C@1~x@{11}~\dots~x@{1k@1} & \To & e@1 \\ |
|
434 \vdots \\ |
|
435 \mid & C@m~x@{m1}~\dots~x@{mk@m} & \To & e@m |
|
436 \end{array} |
|
437 \] |
|
438 |
|
439 \begin{warn} |
|
440 {\em All} constructors must be present, their order is fixed, and nested |
|
441 patterns are not supported. Violating these restrictions results in strange |
|
442 error messages. |
|
443 \end{warn} |
|
444 \noindent |
|
445 Nested patterns can be simulated by nested \texttt{case}-expressions: instead |
|
446 of |
|
447 \begin{ttbox} |
|
448 case xs of [] => 0 | [x] => x | x#(y#zs) => y |
|
449 \end{ttbox} |
|
450 write |
|
451 \begin{ttbox} |
|
452 case xs of [] => 0 | x#ys => (case ys of [] => x | y#zs => y) |
|
453 \end{ttbox} |
|
454 Note that \texttt{case}-expressions should be enclosed in parentheses to |
|
455 indicate their scope. |
|
456 |
|
457 \subsection{Structural induction} |
|
458 |
|
459 Almost all the basic laws about a datatype are applied automatically during |
|
460 simplification. Only induction is invoked by hand via \texttt{induct_tac}, |
|
461 which works for any datatype. In some cases, induction is overkill and a case |
|
462 distinction over all constructors of the datatype suffices. This is performed |
|
463 by \ttindexbold{exhaust_tac}. A trivial example: |
|
464 \begin{ttbox} |
|
465 \input{Misc/exhaust.ML}{\out1. xs = [] ==> (case xs of [] => [] | y # ys => xs) = xs} |
|
466 {\out2. !!a list. xs = a # list ==> (case xs of [] => [] | y # ys => xs) = xs} |
|
467 \input{Misc/autotac.ML}\end{ttbox} |
|
468 Note that this particular case distinction could have been automated |
|
469 completely. See~\S\ref{sec:SimpFeatures}. |
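Judging from the subgoals displayed above, the commands behind this example
are presumably close to the following sketch, in which \texttt{exhaust_tac}
takes the term to be analyzed and the subgoal number in the same way as
\texttt{induct_tac}:
\begin{ttbox}
Goal "(case xs of [] => [] | y#ys => xs) = xs";
by(exhaust_tac "xs" 1);   (* one subgoal per constructor of the list type *)
by(Auto_tac);
\end{ttbox}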
|
470 |
|
471 \begin{warn} |
|
  Induction is only allowed on a free variable, and that variable should not
  occur among the assumptions of the subgoal. Exhaustion works for arbitrary
  terms.
|
474 \end{warn} |
|
475 |
|
476 \subsection{Case study: boolean expressions} |
|
477 \label{sec:boolex} |
|
478 |
|
479 The aim of this case study is twofold: it shows how to model boolean |
|
480 expressions and some algorithms for manipulating them, and it demonstrates |
|
481 the constructs introduced above. |
|
482 |
|
483 \subsubsection{How can we model boolean expressions?} |
|
484 |
|
485 We want to represent boolean expressions built up from variables and |
|
486 constants by negation and conjunction. The following datatype serves exactly |
|
487 that purpose: |
|
488 \begin{ttbox} |
|
489 \input{Ifexpr/boolex}\end{ttbox} |
|
490 The two constants are represented by the terms \texttt{Const~True} and |
|
491 \texttt{Const~False}. Variables are represented by terms of the form |
|
492 \texttt{Var}~$n$, where $n$ is a natural number (type \texttt{nat}). |
|
493 For example, the formula $P@0 \land \neg P@1$ is represented by the term |
|
494 \texttt{And~(Var~0)~(Neg(Var~1))}. |
|
495 |
|
496 \subsubsection{What is the value of boolean expressions?} |
|
497 |
|
The value of a boolean expression depends on the value of its variables.
|
499 Hence the function \texttt{value} takes an additional parameter, an {\em |
|
500 environment} of type \texttt{nat~=>~bool}, which maps variables to their |
|
501 values: |
|
502 \begin{ttbox} |
|
503 \input{Ifexpr/value}\end{ttbox} |
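A datatype and evaluation function consistent with this description could be
written as follows. The sketch takes the constructor names from the example
term \texttt{And~(Var~0)~(Neg(Var~1))}; the variable names and the layout are
our own and need not agree with the files shown above.
\begin{ttbox}
datatype boolex = Const bool | Var nat
                | Neg boolex | And boolex boolex

consts value :: boolex => (nat => bool) => bool
primrec
"value (Const b) env = b"
"value (Var x)   env = env x"
"value (Neg b)   env = (~ value b env)"
"value (And b c) env = (value b env & value c env)"
\end{ttbox}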
|
504 |
|
505 \subsubsection{If-expressions} |
|
506 |
|
An alternative and often more efficient (because in a certain sense
canonical) representation is given by so-called \textit{If-expressions\/},
built up from constants (\texttt{CIF}), variables (\texttt{VIF}) and
conditionals (\texttt{IF}):
|
511 \begin{ttbox} |
|
512 \input{Ifexpr/ifex}\end{ttbox} |
|
The evaluation of If-expressions proceeds as for \texttt{boolex}:
|
514 \begin{ttbox} |
|
515 \input{Ifexpr/valif}\end{ttbox} |
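One plausible rendering of these two declarations is sketched below. Only the
constructor names \texttt{CIF}, \texttt{VIF} and \texttt{IF} are taken from
the text; the argument names are assumptions, and HOL's
\texttt{if}-\texttt{then}-\texttt{else} is used to evaluate a conditional.
\begin{ttbox}
datatype ifex = CIF bool | VIF nat | IF ifex ifex ifex

consts valif :: ifex => (nat => bool) => bool
primrec
"valif (CIF b)    env = b"
"valif (VIF x)    env = env x"
"valif (IF b t e) env = (if valif b env then valif t env else valif e env)"
\end{ttbox}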
|
516 |
|
517 \subsubsection{Transformation into and of If-expressions} |
|
518 |
|
519 The type \texttt{boolex} is close to the customary representation of logical |
|
520 formulae, whereas \texttt{ifex} is designed for efficiency. Thus we need to |
|
521 translate from \texttt{boolex} into \texttt{ifex}: |
|
522 \begin{ttbox} |
|
523 \input{Ifexpr/bool2if}\end{ttbox} |
|
524 At last, we have something we can verify: that \texttt{bool2if} preserves the |
|
525 value of its argument. |
|
526 \begin{ttbox} |
|
527 \input{Ifexpr/bool2if.ML}\end{ttbox} |
|
528 The proof is canonical: |
|
529 \begin{ttbox} |
|
530 \input{Ifexpr/proof.ML}\end{ttbox} |
|
531 In fact, all proofs in this case study look exactly like this. Hence we do |
|
532 not show them below. |
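For concreteness, a translation and its correctness proof along these lines
might read as follows. The equations for \texttt{Neg} and \texttt{And} encode
negation and conjunction as conditionals; they are a sketch that assumes the
datatypes \texttt{boolex} and \texttt{ifex} have the constructors named in
the text, and they need not coincide with the files shown above.
\begin{ttbox}
(* theory file *)
consts bool2if :: boolex => ifex
primrec
"bool2if (Const b) = CIF b"
"bool2if (Var x)   = VIF x"
"bool2if (Neg b)   = IF (bool2if b) (CIF False) (CIF True)"
"bool2if (And b c) = IF (bool2if b) (bool2if c) (CIF False)"

(* proof script *)
Goal "valif (bool2if b) env = value b env";
by(induct_tac "b" 1);
by(Auto_tac);
\end{ttbox}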
|
533 |
|
534 More interesting is the transformation of If-expressions into a normal form |
|
535 where the first argument of \texttt{IF} cannot be another \texttt{IF} but |
|
536 must be a constant or variable. Such a normal form can be computed by |
|
537 repeatedly replacing a subterm of the form \texttt{IF~(IF~b~x~y)~z~u} by |
|
538 \texttt{IF b (IF x z u) (IF y z u)}, which has the same value. The following |
|
539 primitive recursive functions perform this task: |
|
540 \begin{ttbox} |
|
541 \input{Ifexpr/normif} |
|
542 \input{Ifexpr/norm}\end{ttbox} |
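A pair of functions matching this description can be sketched as follows (the
argument names are ours). The third equation of \texttt{normif} is the
rewrite step quoted above, applied recursively; the first two cover a
condition that is already a constant or a variable.
\begin{ttbox}
consts normif :: ifex => ifex => ifex => ifex
primrec
"normif (CIF b)    t e = IF (CIF b) t e"
"normif (VIF x)    t e = IF (VIF x) t e"
"normif (IF b t e) u f = normif b (normif t u f) (normif e u f)"

consts norm :: ifex => ifex
primrec
"norm (CIF b)    = CIF b"
"norm (VIF x)    = VIF x"
"norm (IF b t e) = normif b (norm t) (norm e)"
\end{ttbox}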
|
543 Their interplay is a bit tricky, and we leave it to the reader to develop an |
|
544 intuitive understanding. Fortunately, Isabelle can help us to verify that the |
|
545 transformation preserves the value of the expression: |
|
546 \begin{ttbox} |
|
547 \input{Ifexpr/norm.ML}\end{ttbox} |
|
548 The proof is canonical, provided we first show the following lemma (which |
|
549 also helps to understand what \texttt{normif} does) and make it available |
|
550 for simplification via \texttt{Addsimps}: |
|
551 \begin{ttbox} |
|
552 \input{Ifexpr/normif.ML}\end{ttbox} |
|
553 |
|
554 But how can we be sure that \texttt{norm} really produces a normal form in |
|
555 the above sense? We have to prove |
|
556 \begin{ttbox} |
|
557 \input{Ifexpr/normal_norm.ML}\end{ttbox} |
|
558 where \texttt{normal} expresses that an If-expression is in normal form: |
|
559 \begin{ttbox} |
|
560 \input{Ifexpr/normal}\end{ttbox} |
|
561 Of course, this requires a lemma about normality of \texttt{normif} |
|
562 \begin{ttbox} |
|
563 \input{Ifexpr/normal_normif.ML}\end{ttbox} |
|
564 that has to be made available for simplification via \texttt{Addsimps}. |
|
565 |
|
566 How does one come up with the required lemmas? Try to prove the main theorems |
|
without them and study carefully what \texttt{Auto_tac} leaves unproved. This
usually provides the clue.
The necessity of universal quantification (\texttt{!t e}) in the two lemmas
is explained in \S\ref{sec:InductionHeuristics}.
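The predicate and the four goals discussed in this subsection might be stated
as follows. This is a sketch: the formulation in the accompanying files may
differ, the theorem names are assumptions, and the right-hand side of the
lemma about \texttt{normal} is our guess at a statement strong enough for the
induction to go through.
\begin{ttbox}
(* theory file *)
consts normal :: ifex => bool
primrec
"normal (CIF b)    = True"
"normal (VIF x)    = True"
"normal (IF b t e) = (normal t & normal e &
     (case b of CIF b => True | VIF x => True | IF x y z => False))"

(* proof script: each lemma precedes the theorem that needs it *)
Goal "!t e. valif (normif b t e) env = valif (IF b t e) env";
by(induct_tac "b" 1);
by(Auto_tac);
qed "valif_normif";
Addsimps [valif_normif];

Goal "valif (norm b) env = valif b env";
by(induct_tac "b" 1);
by(Auto_tac);
qed "valif_norm";

Goal "!t e. normal (normif b t e) = (normal t & normal e)";
by(induct_tac "b" 1);
by(Auto_tac);
qed "normal_normif";
Addsimps [normal_normif];

Goal "normal (norm b)";
by(induct_tac "b" 1);
by(Auto_tac);
qed "normal_norm";
\end{ttbox}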
|
571 |
|
572 \begin{exercise} |
|
573 We strengthen the definition of a {\em normal\/} If-expression as follows: |
|
574 the first argument of all \texttt{IF}s must be a variable. Adapt the above |
|
575 development to this changed requirement. (Hint: you may need to formulate |
|
576 some of the goals as implications (\texttt{-->}) rather than equalities |
|
577 (\texttt{=}).) |
|
578 \end{exercise} |
|
579 |
|
580 \section{Some basic types} |
|
581 |
|
582 \subsection{Natural numbers} |
|
583 |
|
584 The type \ttindexbold{nat} of natural numbers is predefined and behaves like |
|
585 \begin{ttbox} |
|
586 datatype nat = 0 | Suc nat |
|
587 \end{ttbox} |
|
588 In particular, there are \texttt{case}-expressions, for example |
|
589 \begin{ttbox} |
|
590 case n of 0 => 0 | Suc m => m |
|
591 \end{ttbox} |
|
592 primitive recursion, for example |
|
593 \begin{ttbox} |
|
594 \input{Misc/natsum}\end{ttbox} |
|
595 and induction, for example |
|
596 \begin{ttbox} |
|
597 \input{Misc/NatSum.ML}\ttbreak |
|
598 {\out sum n + sum n = n * Suc n} |
|
599 {\out No subgoals!} |
|
600 \end{ttbox} |
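The definition and proof behind this output are presumably close to the
following sketch. The recursion equation for \texttt{sum} is an assumption
chosen to be consistent with the theorem shown; the proof follows the usual
pattern of induction followed by \texttt{Auto_tac}.
\begin{ttbox}
(* theory file *)
consts sum :: nat => nat
primrec
"sum 0       = 0"
"sum (Suc n) = Suc n + sum n"

(* proof script *)
Goal "sum n + sum n = n * Suc n";
by(induct_tac "n" 1);
by(Auto_tac);
\end{ttbox}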
|
601 |
|
602 The usual arithmetic operations \ttindexbold{+}, \ttindexbold{-}, |
|
603 \ttindexbold{*}, \ttindexbold{div} and \ttindexbold{mod} are predefined, as |
|
604 are the relations \ttindexbold{<=} and \ttindexbold{<}. There is even a least |
|
605 number operation \ttindexbold{LEAST}. For example, \texttt{(LEAST n.$\,$1 < |
|
606 n) = 2} (HOL does not prove this completely automatically). |
|
607 |
|
608 \begin{warn} |
|
609 The operations \ttindexbold{+}, \ttindexbold{-}, \ttindexbold{*} are |
|
610 overloaded, i.e.\ they are available not just for natural numbers but at |
|
611 other types as well (see \S\ref{sec:TypeClasses}). For example, given |
|
612 the goal \texttt{x+y = y+x}, there is nothing to indicate that you are |
|
613 talking about natural numbers. Hence Isabelle can only infer that |
|
614 \texttt{x} and \texttt{y} are of some arbitrary type where \texttt{+} is |
|
615 declared. As a consequence, you will be unable to prove the goal (although |
|
616 it may take you some time to realize what has happened if |
|
617 \texttt{show_types} is not set). In this particular example, you need to |
|
618 include an explicit type constraint, for example \texttt{x+y = y+(x::nat)}. |
|
619 If there is enough contextual information this may not be necessary: |
|
620 \texttt{x+0 = x} automatically implies \texttt{x::nat}. |
|
621 \end{warn} |
|
622 |
|
623 |
|
624 \subsection{Products} |
|
625 |
|
626 HOL also has pairs: \texttt{($a@1$,$a@2$)} is of type \texttt{$\tau@1$ * |
|
627 $\tau@2$} provided each $a@i$ is of type $\tau@i$. The components of a pair |
|
628 are extracted by \texttt{fst} and \texttt{snd}: |
|
629 \texttt{fst($x$,$y$) = $x$} and \texttt{snd($x$,$y$) = $y$}. Tuples |
|
630 are simulated by pairs nested to the right: |
|
631 \texttt{($a@1$,$a@2$,$a@3$)} and \texttt{$\tau@1$ * $\tau@2$ * $\tau@3$} |
|
632 stand for \texttt{($a@1$,($a@2$,$a@3$))} and \texttt{$\tau@1$ * ($\tau@2$ * |
|
633 $\tau@3$)}. Therefore \texttt{fst(snd($a@1$,$a@2$,$a@3$)) = $a@2$}. |
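For instance, with the concrete triple \texttt{(1,2,3)}, which abbreviates
\texttt{(1,(2,3))}, the projections unfold as follows (a worked illustration
that uses only the equations for \texttt{fst} and \texttt{snd} given above):
\begin{ttbox}
snd(1,2,3)       =  snd(1,(2,3))  =  (2,3)
fst(snd(1,2,3))  =  fst(2,3)      =  2
snd(snd(1,2,3))  =  snd(2,3)      =  3
\end{ttbox}
Note that \texttt{snd} alone does not return the last component of a triple
but the pair of the remaining components.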
|
634 |
|
635 It is possible to use (nested) tuples as patterns in abstractions, for |
|
636 example \texttt{\%(x,y,z).x+y+z} and \texttt{\%((x,y),z).x+y+z}. |
|
637 |
|
638 In addition to explicit $\lambda$-abstractions, tuple patterns can be used in |
|
639 most variable binding constructs. Typical examples are |
|
640 \begin{ttbox} |
|
641 let (x,y) = f z in (y,x) |
|
642 |
|
643 case xs of [] => 0 | (x,y)\#zs => x+y |
|
644 \end{ttbox} |
|
645 Further important examples are quantifiers and sets. |
|
646 |
|
647 \begin{warn} |
|
648 Abstraction over pairs and tuples is merely a convenient shorthand for a more |
|
649 complex internal representation. Thus the internal and external form of a |
|
650 term may differ, which can affect proofs. If you want to avoid this |
|
651 complication, use \texttt{fst} and \texttt{snd}, i.e.\ write |
|
652 \texttt{\%p.~fst p + snd p} instead of \texttt{\%(x,y).~x + y}. |
|
653 See~\S\ref{} for theorem proving with tuple patterns. |
|
654 \end{warn} |
|
655 |
|
656 |
|
657 \section{Definitions} |
|
658 \label{sec:Definitions} |
|
659 |
|
660 A definition is simply an abbreviation, i.e.\ a new name for an existing |
|
661 construction. In particular, definitions cannot be recursive. Isabelle offers |
|
662 definitions on the level of types and terms. Those on the type level are |
|
663 called type synonyms, those on the term level are called (constant) |
|
664 definitions. |
|
665 |
|
666 |
|
667 \subsection{Type synonyms} |
|
668 \indexbold{type synonyms} |
|
669 |
|
Type synonyms are similar to those found in ML. Their syntax is fairly
self-explanatory:
|
672 \begin{ttbox} |
|
673 \input{Misc/types}\end{ttbox}\indexbold{*types} |
|
674 The synonym \texttt{alist} shows that in general the type on the right-hand |
|
675 side needs to be enclosed in double quotation marks |
|
676 (see the end of~\S\ref{sec:intro-theory}). |
|
677 |
|
678 Internally all synonyms are fully expanded. As a consequence Isabelle's |
|
679 output never contains synonyms. Their main purpose is to improve the |
|
680 readability of theory definitions. Synonyms can be used just like any other |
|
681 type: |
|
682 \begin{ttbox} |
|
683 \input{Misc/consts}\end{ttbox} |
|
684 |
|
685 \subsection{Constant definitions} |
|
686 \label{sec:ConstDefinitions} |
|
687 |
|
688 The above constants \texttt{nand} and \texttt{exor} are non-recursive and can |
|
689 therefore be defined directly by |
|
690 \begin{ttbox} |
|
691 \input{Misc/defs}\end{ttbox}\indexbold{*defs} |
|
692 where \texttt{defs} is a keyword and \texttt{nand_def} and \texttt{exor_def} |
|
693 are arbitrary user-supplied names. |
|
694 The symbol \texttt{==}\index{==>@{\tt==}|bold} is a special form of equality |
|
695 that should only be used in constant definitions. |
|
696 Declarations and definitions can also be merged |
|
697 \begin{ttbox} |
|
698 \input{Misc/constdefs}\end{ttbox}\indexbold{*constdefs} |
|
699 in which case the default name of each definition is $f$\texttt{_def}, where |
|
700 $f$ is the name of the defined constant. |
|
701 |
|
702 Note that pattern-matching is not allowed, i.e.\ each definition must be of |
|
703 the form $f\,x@1\,\dots\,x@n$\texttt{~==~}$t$. |
|
704 |
|
Section~\ref{sec:Simplification} explains how definitions are used
|
706 in proofs. |
|
707 |
|
708 \begin{warn} |
|
709 A common mistake when writing definitions is to introduce extra free variables |
|
710 on the right-hand side as in the following fictitious definition: |
|
711 \begin{ttbox} |
|
712 defs prime_def "prime(p) == (m divides p) --> (m=1 | m=p)" |
|
713 \end{ttbox} |
|
714 Isabelle rejects this `definition' because of the extra {\tt m} on the |
|
715 right-hand side, which would introduce an inconsistency. What you should have |
|
716 written is |
|
717 \begin{ttbox} |
|
718 defs prime_def "prime(p) == !m. (m divides p) --> (m=1 | m=p)" |
|
719 \end{ttbox} |
|
720 \end{warn} |
|
721 |
|
722 |
|
723 |
|
724 |
|
725 \chapter{More Functional Programming} |
|
726 |
|
727 The purpose of this chapter is to deepen the reader's understanding of the |
|
728 concepts encountered so far and to introduce an advanced method for defining |
|
729 recursive functions. The first two sections give a structured presentation of |
|
730 theorem proving by simplification (\S\ref{sec:Simplification}) and |
|
731 discuss important heuristics for induction (\S\ref{sec:InductionHeuristics}). They |
|
732 can be skipped by readers less interested in proofs. They are followed by a |
|
733 case study, a compiler for expressions (\S\ref{sec:ExprCompiler}). |
|
734 Finally we present a very general method for defining recursive functions |
|
735 that goes well beyond what \texttt{primrec} allows (\S\ref{sec:recdef}). |
|
736 |
|
737 |
|
738 \section{Simplification} |
|
739 \label{sec:Simplification} |
|
740 |
|
741 So far we have proved our theorems by \texttt{Auto_tac}, which |
|
`simplifies' all subgoals. In fact, \texttt{Auto_tac} can do much more than
that; the examples so far simply did not require it. However, when you go beyond toy
|
744 examples, you need to understand the ingredients of \texttt{Auto_tac}. |
|
745 This section covers the tactic that \texttt{Auto_tac} always applies first, |
|
746 namely simplification. |
|
747 |
|
748 Simplification is one of the central theorem proving tools in Isabelle and |
|
749 many other systems. The tool itself is called the \bfindex{simplifier}. The |
|
750 purpose of this section is twofold: to introduce the many features of the |
|
simplifier (\S\ref{sec:SimpFeatures}) and to explain briefly how the
|
752 simplifier works (\S\ref{sec:SimpHow}). Anybody intending to use HOL should |
|
753 read \S\ref{sec:SimpFeatures}, and the serious student should read |
|
\S\ref{sec:SimpHow} as well in order to understand what went wrong when
things do not simplify as expected.
|
756 |
|
757 |
|
758 \subsection{Using the simplifier} |
|
759 \label{sec:SimpFeatures} |
|
760 |
|
761 In its most basic form, simplification means repeated application of |
|
762 equations from left to right. For example, taking the rules for \texttt{\at} |
|
763 and applying them to the term \texttt{[0,1] \at\ []} results in a sequence of |
|
764 simplification steps: |
|
765 \begin{ttbox}\makeatother |
|
766 (0#1#[]) @ [] \(\leadsto\) 0#((1#[]) @ []) \(\leadsto\) 0#(1#([] @ [])) \(\leadsto\) 0#1#[] |
|
767 \end{ttbox} |
|
768 This is also known as {\em term rewriting} and the equations are referred |
|
769 to as {\em rewrite rules}. This is more honest than `simplification' because |
|
770 the terms do not necessarily become simpler in the process. |
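For example, a distributivity law used as a rewrite rule makes terms larger
rather than smaller. The following sketch assumes, purely for illustration,
that the equation \texttt{x*(y+z) = x*y + x*z} has been added as a rewrite
rule:
\begin{ttbox}
a*(b+(c+d)) \(\leadsto\) a*b + a*(c+d) \(\leadsto\) a*b + (a*c + a*d)
\end{ttbox}
Each step is a legitimate rewrite, yet the resulting term contains more
symbols than the one we started with.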
|
771 |
|
772 \subsubsection{Simpsets} |
|
773 |
|
774 To facilitate simplification, each theory has an associated set of |
|
775 simplification rules, known as a \bfindex{simpset}. Within a theory, |
|
776 proofs by simplification refer to the associated simpset by default. |
|
777 The simpset of a theory is built up as follows: starting with the union of |
|
778 the simpsets of the parent theories, each occurrence of a \texttt{datatype} |
|
779 or \texttt{primrec} construct augments the simpset. Explicit definitions are |
|
780 not added automatically. Users can add new theorems via \texttt{Addsimps} and |
|
781 delete them again later by \texttt{Delsimps}. |
|
782 |
|
783 You may augment a simpset not just by equations but by pretty much any |
|
784 theorem. The simplifier will try to make sense of it. For example, a theorem |
|
785 \verb$~$$P$ is automatically turned into \texttt{$P$ = False}. The details are |
|
786 explained in \S\ref{sec:SimpHow}. |
|
787 |
|
788 As a rule of thumb, rewrite rules that really simplify a term (like |
|
789 \texttt{xs \at\ [] = xs} and \texttt{rev(rev xs) = xs}) should be added to the |
|
790 current simpset right after they have been proved. Those of a more specific |
|
791 nature (e.g.\ distributivity laws, which alter the structure of terms |
|
792 considerably) should only be added for specific proofs and deleted again |
|
793 afterwards. Conversely, it may also happen that a generally useful rule |
|
794 needs to be removed for a certain proof and is added again afterwards. The |
|
need for frequent temporary additions or deletions may indicate a badly
|
796 designed simpset. |
|
797 \begin{warn} |
|
798 Simplification may not terminate, for example if both $f(x) = g(x)$ and |
|
799 $g(x) = f(x)$ are in the simpset. It is the user's responsibility not to |
|
800 include rules that can lead to nontermination, either on their own or in |
|
801 combination with other rules. |
|
802 \end{warn} |
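To see how such a loop arises, suppose (hypothetically) that both equations
of the warning are in the simpset. Any term of the form $f(a)$ is then
rewritten without end:
\begin{ttbox}
f(a) \(\leadsto\) g(a) \(\leadsto\) f(a) \(\leadsto\) g(a) \(\leadsto\) \(\cdots\)
\end{ttbox}
Neither rule is harmful on its own; it is the combination that fails to
terminate.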
|
803 |
|
804 \subsubsection{Simplification tactics} |
|
805 |
|
806 There are four main simplification tactics: |
|
807 \begin{ttdescription} |
|
808 \item[\ttindexbold{Simp_tac} $i$] simplifies the conclusion of subgoal~$i$ |
|
809 using the theory's simpset. It may solve the subgoal completely if it has |
|
810 become trivial. For example: |
|
811 \begin{ttbox}\makeatother |
|
812 {\out 1. [] @ [] = []} |
|
813 by(Simp_tac 1); |
|
814 {\out No subgoals!} |
|
815 \end{ttbox} |
|
816 |
|
817 \item[\ttindexbold{Asm_simp_tac}] |
|
818 is like \verb$Simp_tac$, but extracts additional rewrite rules from |
|
819 the assumptions of the subgoal. For example, it solves |
|
820 \begin{ttbox}\makeatother |
|
821 {\out 1. xs = [] ==> xs @ xs = xs} |
|
822 \end{ttbox} |
|
823 which \texttt{Simp_tac} does not do. |
|
824 |
|
825 \item[\ttindexbold{Full_simp_tac}] is like \verb$Simp_tac$, but also |
|
826 simplifies the assumptions (without using the assumptions to |
|
827 simplify each other or the actual goal). |
|
828 |
|
829 \item[\ttindexbold{Asm_full_simp_tac}] is like \verb$Asm_simp_tac$, |
|
830 but also simplifies the assumptions. In particular, assumptions can |
|
831 simplify each other. For example: |
|
832 \begin{ttbox}\makeatother |
|
{\out 1. [| xs @ zs = ys @ xs; [] @ xs = [] @ [] |] ==> ys = zs}
|
834 by(Asm_full_simp_tac 1); |
|
835 {\out No subgoals!} |
|
836 \end{ttbox} |
|
837 The second assumption simplifies to \texttt{xs = []}, which in turn |
|
838 simplifies the first assumption to \texttt{zs = ys}, thus reducing the |
|
839 conclusion to \texttt{ys = ys} and hence to \texttt{True}. |
|
840 (See also the paragraph on tracing below.) |
|
841 \end{ttdescription} |
|
842 \texttt{Asm_full_simp_tac} is the most powerful of this quartet of |
|
843 tactics. In fact, \texttt{Auto_tac} starts by applying |
|
844 \texttt{Asm_full_simp_tac} to all subgoals. The only reason for the existence |
|
845 of the other three tactics is that sometimes one wants to limit the amount of |
|
846 simplification, for example to avoid nontermination: |
|
847 \begin{ttbox}\makeatother |
|
848 {\out 1. ! x. f x = g (f (g x)) ==> f [] = f [] @ []} |
|
849 \end{ttbox} |
|
850 is solved by \texttt{Simp_tac}, but \texttt{Asm_simp_tac} and |
|
851 \texttt{Asm_full_simp_tac} loop because the rewrite rule \texttt{f x = g(f(g |
|
852 x))} extracted from the assumption does not terminate. Isabelle notices |
|
853 certain simple forms of nontermination, but not this one. |
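The difference can be traced by hand. The following sketch assumes that
\texttt{xs \at\ [] = xs} is in the simpset. \texttt{Simp_tac} ignores the
assumption and only rewrites the conclusion:
\begin{ttbox}\makeatother
f [] = f [] @ []  \(\leadsto\)  f [] = f []  \(\leadsto\)  True
\end{ttbox}
In contrast, the rule extracted from the assumption applies to its own
result, so rewriting with it never stops:
\begin{ttbox}
f []  \(\leadsto\)  g(f(g []))  \(\leadsto\)  g(g(f(g(g []))))  \(\leadsto\)  \(\cdots\)
\end{ttbox}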
|
854 |
|
855 \subsubsection{Modifying simpsets locally} |
|
856 |
|
857 If a certain theorem is merely needed in one proof by simplification, the |
|
858 pattern |
|
859 \begin{ttbox} |
|
860 Addsimps [\(rare_theorem\)]; |
|
861 by(Simp_tac 1); |
|
862 Delsimps [\(rare_theorem\)]; |
|
863 \end{ttbox} |
|
864 is awkward. Therefore there are lower-case versions of the simplification |
|
865 tactics (\ttindexbold{simp_tac}, \ttindexbold{asm_simp_tac}, |
|
866 \ttindexbold{full_simp_tac}, \ttindexbold{asm_full_simp_tac}) and of the |
|
867 simpset modifiers (\ttindexbold{addsimps}, \ttindexbold{delsimps}) |
|
868 that do not access or modify the implicit simpset but explicitly take a |
|
869 simpset as an argument. For example, the above three lines become |
|
870 \begin{ttbox} |
|
871 by(simp_tac (simpset() addsimps [\(rare_theorem\)]) 1); |
|
872 \end{ttbox} |
|
873 where the result of the function call \texttt{simpset()} is the simpset of |
|
874 the current theory and \texttt{addsimps} is an infix function. The implicit |
|
875 simpset is read once but not modified. |
|
876 This is far preferable to pairs of \texttt{Addsimps} and \texttt{Delsimps}. |
|
877 Local modifications can be stacked as in |
|
878 \begin{ttbox} |
|
879 by(simp_tac (simpset() addsimps [\(rare_theorem\)] delsimps [\(some_thm\)]) 1); |
|
880 \end{ttbox} |
|
881 |
|
882 \subsubsection{Rewriting with definitions} |
|
883 |
|
884 Constant definitions (\S\ref{sec:ConstDefinitions}) are not automatically |
|
885 included in the simpset of a theory. Hence such definitions are not expanded |
|
886 automatically either, just as it should be: definitions are introduced for |
|
887 the purpose of abbreviating complex concepts. Of course we need to expand the |
|
888 definitions initially to derive enough lemmas that characterize the concept |
|
889 sufficiently for us to forget the original definition completely. For |
|
890 example, given the theory |
|
891 \begin{ttbox} |
|
892 \input{Misc/Exor.thy}\end{ttbox} |
|
893 we may want to prove \verb$exor A (~A)$. Instead of \texttt{Goal} we use |
|
894 \begin{ttbox} |
|
895 \input{Misc/exorgoal.ML}\end{ttbox} |
|
896 which tells Isabelle to expand the definition of \texttt{exor}---the first |
|
897 argument of \texttt{Goalw} can be a list of definitions---in the initial goal: |
|
898 \begin{ttbox} |
|
899 {\out exor A (~ A)} |
|
900 {\out 1. A & ~ ~ A | ~ A & ~ A} |
|
901 \end{ttbox} |
|
902 In this simple example, the goal is proved by \texttt{Simp_tac}. |
|
903 Of course the resulting theorem is insufficient to characterize \texttt{exor} |
|
904 completely. |
|
905 |
|
906 In case we want to expand a definition in the middle of a proof, we can |
|
907 simply add the definition locally to the simpset: |
|
908 \begin{ttbox} |
|
909 \input{Misc/exorproof.ML}\end{ttbox} |
|
910 You should normally not add the definition permanently to the simpset |
|
911 using \texttt{Addsimps} because this defeats the whole purpose of an |
|
912 abbreviation. |
|
913 |
|
914 \begin{warn} |
|
915 If you have defined $f\,x\,y$\texttt{~==~}$t$ then you can only expand |
|
916 occurrences of $f$ with at least two arguments. Thus it is safer to define |
|
917 $f$\texttt{~==~\%$x\,y$.}$\;t$. |
|
918 \end{warn} |
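As an illustration, consider a hypothetical constant \texttt{twice} (it is
not part of any theory discussed here) and two alternative ways of defining
it:
\begin{ttbox}
defs twice_def "twice f x == f(f x)"
defs twice_def "twice == \%f x. f(f x)"
\end{ttbox}
With the first form, \texttt{twice g y} can be expanded to \texttt{g(g y)},
but the partial application \texttt{twice g}, as in
\texttt{map (twice g) xs}, cannot. The second form applies to every
occurrence of \texttt{twice}; the resulting $\beta$-redexes are contracted by
the simplifier.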
|
919 |
|
920 \subsubsection{Simplifying \texttt{let}-expressions} |
|
921 |
|
922 Proving a goal containing \ttindex{let}-expressions invariably requires the |
|
923 \texttt{let}-constructs to be expanded at some point. Since |
|
924 \texttt{let}-\texttt{in} is just syntactic sugar for a defined constant |
|
925 (called \texttt{Let}), expanding \texttt{let}-constructs means rewriting with |
|
926 \texttt{Let_def}: |
|
927 %context List.thy; |
|
928 %Goal "(let xs = [] in xs@xs) = ys"; |
|
929 \begin{ttbox}\makeatother |
|
930 {\out 1. (let xs = [] in xs @ xs) = ys} |
|
931 by(simp_tac (simpset() addsimps [Let_def]) 1); |
|
932 {\out 1. [] = ys} |
|
933 \end{ttbox} |
|
934 If, in a particular context, there is no danger of a combinatorial explosion |
|
935 of nested \texttt{let}s one could even add \texttt{Let_def} permanently via |
|
936 \texttt{Addsimps}. |
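The danger of an explosion comes from the fact that expanding a
\texttt{let} substitutes the bound term for every occurrence of the
variable. A small hand-made illustration, where \texttt{t} stands for an
arbitrary (possibly large) term:
\begin{ttbox}
let x = t in let y = (x,x) in (y,y)
  \(\leadsto\) let y = (t,t) in (y,y)
  \(\leadsto\) ((t,t),(t,t))
\end{ttbox}
Two nested \texttt{let}s have already produced four copies of \texttt{t}; in
general the number of copies can double with each level of nesting.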
|
937 |
|
938 \subsubsection{Conditional equations} |
|
939 |
|
940 So far all examples of rewrite rules were equations. The simplifier also |
|
941 accepts {\em conditional\/} equations, for example |
|
942 \begin{ttbox} |
|
943 xs ~= [] ==> hd xs # tl xs = xs \hfill \((*)\) |
|
944 \end{ttbox} |
|
945 (which is proved by \texttt{exhaust_tac} on \texttt{xs} followed by |
|
946 \texttt{Asm_full_simp_tac} twice). Assuming that this theorem together with |
|
947 %\begin{ttbox}\makeatother |
|
948 \texttt{(rev xs = []) = (xs = [])} |
|
949 %\end{ttbox} |
|
950 are part of the simpset, the subgoal |
|
951 \begin{ttbox}\makeatother |
|
952 {\out 1. xs ~= [] ==> hd(rev xs) # tl(rev xs) = rev xs} |
|
953 \end{ttbox} |
|
954 is proved by simplification: |
|
955 the conditional equation $(*)$ above |
|
956 can simplify \texttt{hd(rev~xs)~\#~tl(rev~xs)} to \texttt{rev xs} |
|
957 because the corresponding precondition \verb$rev xs ~= []$ simplifies to |
|
958 \verb$xs ~= []$, which is exactly the local assumption of the subgoal. |
|
959 |
|
960 |
|
961 \subsubsection{Automatic case splits} |
|
962 |
|
963 Goals containing \ttindex{if}-expressions are usually proved by case |
|
964 distinction on the condition of the \texttt{if}. For example the goal |
|
965 \begin{ttbox} |
|
966 {\out 1. ! xs. if xs = [] then rev xs = [] else rev xs ~= []} |
|
967 \end{ttbox} |
|
968 can be split into |
|
969 \begin{ttbox} |
|
970 {\out 1. ! xs. (xs = [] --> rev xs = []) \& (xs ~= [] --> rev xs ~= [])} |
|
971 \end{ttbox} |
|
972 by typing |
|
973 \begin{ttbox} |
|
974 \input{Misc/splitif.ML}\end{ttbox} |
|
975 Because this is almost always the right proof strategy, the simplifier |
|
976 performs case-splitting on \texttt{if}s automatically. Try \texttt{Simp_tac} |
|
977 on the initial goal above. |
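The case split is itself an instance of rewriting. The rule used for
\texttt{if}, called \texttt{split_if}, has essentially the following shape
(variable names may differ in the actual theorem):
\begin{ttbox}
P(if Q then x else y) = ((Q --> P x) & (~Q --> P y))
\end{ttbox}
In the example above the \texttt{if}-expression is itself the formula under
the quantifier, so \texttt{P} is instantiated with the identity, \texttt{Q}
with \texttt{xs = []}, \texttt{x} with \texttt{rev xs = []} and \texttt{y}
with \verb$rev xs ~= []$.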
|
978 |
|
979 This splitting idea generalizes from \texttt{if} to \ttindex{case}: |
|
980 \begin{ttbox}\makeatother |
|
981 {\out 1. (case xs of [] => zs | y#ys => y#(ys@zs)) = xs@zs} |
|
982 \end{ttbox} |
|
983 becomes |
|
984 \begin{ttbox}\makeatother |
|
985 {\out 1. (xs = [] --> zs = xs @ zs) &} |
|
986 {\out (! a list. xs = a # list --> a # list @ zs = xs @ zs)} |
|
987 \end{ttbox} |
|
988 by typing |
|
989 \begin{ttbox} |
|
990 \input{Misc/splitlist.ML}\end{ttbox} |
|
991 In contrast to \texttt{if}-expressions, the simplifier does not split |
|
992 \texttt{case}-expressions by default because this can lead to nontermination |
|
993 in case of recursive datatypes. |
|
994 Nevertheless the simplifier can be instructed to perform \texttt{case}-splits |
|
995 by adding the appropriate rule to the simpset: |
|
996 \begin{ttbox} |
|
997 by(simp_tac (simpset() addsplits [split_list_case]) 1); |
|
998 \end{ttbox}\indexbold{*addsplits} |
|
999 solves the initial goal outright, which \texttt{Simp_tac} alone will not do. |
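Once the goal has been split, the two conjuncts are within reach of
ordinary rewriting. The following hand trace assumes that the simpset
contains the defining equations of \texttt{\at}, namely
\texttt{[] \at\ ys = ys} and \texttt{(x\#xs) \at\ ys = x\#(xs \at\ ys)}; each
arrow shows how the conclusion of the implication changes:
\begin{ttbox}\makeatother
xs = [] --> zs = xs @ zs
  \(\leadsto\) zs = [] @ zs  \(\leadsto\)  zs = zs
xs = a # list --> a # list @ zs = xs @ zs
  \(\leadsto\) a # list @ zs = (a # list) @ zs  \(\leadsto\)  a # list @ zs = a # list @ zs
\end{ttbox}
In both cases the premise \texttt{xs = \dots} is used to replace \texttt{xs}
in the conclusion, after which both sides coincide.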
|
1000 |
|
1001 In general, every datatype $t$ comes with a rule |
|
1002 \texttt{$t$.split} that can be added to the simpset either |
|
1003 locally via \texttt{addsplits} (see above), or permanently via |
|
1004 \begin{ttbox} |
|
1005 Addsplits [\(t\).split]; |
|
1006 \end{ttbox}\indexbold{*Addsplits} |
|
1007 Split-rules can be removed globally via \ttindexbold{Delsplits} and locally |
|
1008 via \ttindexbold{delsplits} as, for example, in |
|
1009 \begin{ttbox} |
|
1010 by(simp_tac (simpset() addsimps [\(\dots\)] delsplits [split_if]) 1); |
|
1011 \end{ttbox} |
|
1012 |
|
1013 |
|
1014 \subsubsection{Permutative rewrite rules} |
|
1015 |
|
1016 A rewrite rule is {\bf permutative} if the left-hand side and right-hand side |
|
1017 are the same up to renaming of variables. The most common permutative rule |
|
1018 is commutativity: $x+y = y+x$. Another example is $(x-y)-z = (x-z)-y$. Such |
|
1019 rules are problematic because once they apply, they can be used forever. |
|
1020 The simplifier is aware of this danger and treats permutative rules |
|
1021 separately. For details see~\cite{Isa-Ref-Man}. |
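The problem is easy to reproduce by hand. If commutativity were applied
like an ordinary rewrite rule, every sum could be rewritten indefinitely:
\begin{ttbox}
a+b \(\leadsto\) b+a \(\leadsto\) a+b \(\leadsto\) \(\cdots\)
\end{ttbox}
Roughly speaking, the simplifier avoids this by applying a permutative rule
only when the result is smaller than the original term with respect to a
fixed term ordering, so that at most one of the two steps above is
performed.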
|
1022 |
|
1023 \subsubsection{Tracing} |
|
1024 \indexbold{tracing the simplifier} |
|
1025 |
|
1026 Using the simplifier effectively may take a bit of experimentation. Set the |
|
1027 \verb$trace_simp$ flag to get a better idea of what is going on: |
|
1028 \begin{ttbox}\makeatother |
|
1029 {\out 1. rev [x] = []} |
|
1030 \ttbreak |
|
1031 set trace_simp; |
|
1032 by(Simp_tac 1); |
|
1033 \ttbreak\makeatother |
|
1034 {\out Applying instance of rewrite rule:} |
|
1035 {\out rev (?x # ?xs) == rev ?xs @ [?x]} |
|
1036 {\out Rewriting:} |
|
1037 {\out rev [x] == rev [] @ [x]} |
|
1038 \ttbreak |
|
1039 {\out Applying instance of rewrite rule:} |
|
1040 {\out rev [] == []} |
|
1041 {\out Rewriting:} |
|
1042 {\out rev [] == []} |
|
1043 \ttbreak\makeatother |
|
1044 {\out Applying instance of rewrite rule:} |
|
1045 {\out [] @ ?y == ?y} |
|
1046 {\out Rewriting:} |
|
1047 {\out [] @ [x] == [x]} |
|
1048 \ttbreak |
|
1049 {\out Applying instance of rewrite rule:} |
|
1050 {\out ?x # ?t = ?t == False} |
|
1051 {\out Rewriting:} |
|
1052 {\out [x] = [] == False} |
|
1053 \ttbreak |
|
1054 {\out Level 1} |
|
1055 {\out rev [x] = []} |
|
1056 {\out 1. False} |
|
1057 \end{ttbox} |
|
1058 In more complicated cases, the trace can be enormous, especially since |
|
1059 invocations of the simplifier are often nested (e.g.\ when solving conditions |
|
1060 of rewrite rules). |
|
1061 |
|
1062 \subsection{How it works} |
|
1063 \label{sec:SimpHow} |
|
1064 |
|
1065 \subsubsection{Higher-order patterns} |
|
1066 |
|
1067 \subsubsection{Local assumptions} |
|
1068 |
|
1069 \subsubsection{The preprocessor} |
|
1070 |
|
1071 \section{Induction heuristics} |
|
1072 \label{sec:InductionHeuristics} |
|
1073 |
|
1074 The purpose of this section is to illustrate some simple heuristics for |
|
1075 inductive proofs. The first one we have already mentioned in our initial |
|
1076 example: |
|
1077 \begin{quote} |
|
1078 {\em 1. Theorems about recursive functions are proved by induction.} |
|
1079 \end{quote} |
|
1080 In case the function has more than one argument |
|
1081 \begin{quote} |
|
1082 {\em 2. Do induction on argument number $i$ if the function is defined by |
|
1083 recursion in argument number $i$.} |
|
1084 \end{quote} |
|
1085 When we look at the proof of |
|
1086 \begin{ttbox}\makeatother |
|
1087 (xs @ ys) @ zs = xs @ (ys @ zs) |
|
1088 \end{ttbox} |
|
1089 in \S\ref{sec:intro-proof} we find (a) \texttt{\at} is recursive in the first |
|
1090 argument, (b) \texttt{xs} occurs only as the first argument of \texttt{\at}, |
|
1091 and (c) both \texttt{ys} and \texttt{zs} occur at least once as the second |
|
1092 argument of \texttt{\at}. Hence it is natural to perform induction on |
|
1093 \texttt{xs}. |
|
1094 |
|
1095 The key heuristic, and the main point of this section, is to |
|
1096 generalize the goal before induction. The reason is simple: if the goal is |
|
1097 too specific, the induction hypothesis is too weak to allow the induction |
|
1098 step to go through. Let us now illustrate the idea with an example. |
|
1099 |
|
1100 We define a tail-recursive version of list-reversal, |
|
1101 i.e.\ one that can be compiled into a loop: |
|
1102 \begin{ttbox} |
|
1103 \input{Misc/Itrev.thy}\end{ttbox} |
|
1104 The behaviour of \texttt{itrev} is simple: it reverses its first argument by |
|
1105 stacking its elements onto the second argument, and returning that second |
|
1106 argument when the first one becomes empty. |
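For example, reading this description as the two equations
\texttt{itrev [] ys = ys} and \texttt{itrev (x\#xs) ys = itrev xs (x\#ys)}
(an assumption about the exact form of the definition), a call unfolds as
follows:
\begin{ttbox}
itrev [1,2,3] []
  \(\leadsto\) itrev [2,3] [1]
  \(\leadsto\) itrev [3] [2,1]
  \(\leadsto\) itrev [] [3,2,1]
  \(\leadsto\) [3,2,1]
\end{ttbox}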
|
1107 We need to show that \texttt{itrev} does indeed reverse its first argument |
|
1108 provided the second one is empty: |
|
1109 \begin{ttbox} |
|
1110 \input{Misc/itrev1.ML}\end{ttbox} |
|
1111 There is no choice as to the induction variable, and we immediately simplify: |
|
1112 \begin{ttbox} |
|
1113 \input{Misc/induct_auto.ML}\ttbreak\makeatother |
|
{\out 1. !!a list. itrev list [] = rev list\(\;\)==> itrev list [a] = rev list @ [a]}
|
1115 \end{ttbox} |
|
1116 Just as predicted above, the overall goal, and hence the induction |
|
1117 hypothesis, is too weak to solve the induction step because of the fixed |
|
1118 \texttt{[]}. The corresponding heuristic: |
|
1119 \begin{quote} |
|
1120 {\em 3. Generalize goals for induction by replacing constants by variables.} |
|
1121 \end{quote} |
|
1122 Of course one cannot do this na\"{\i}vely: \texttt{itrev xs ys = rev xs} is |
|
1123 just not true --- the correct generalization is |
|
1124 \begin{ttbox}\makeatother |
|
1125 \input{Misc/itrev2.ML}\end{ttbox} |
|
1126 If \texttt{ys} is replaced by \texttt{[]}, the right-hand side simplifies to |
|
1127 \texttt{rev xs}, just as required. |
|
1128 |
|
1129 In this particular instance it is easy to guess the right generalization, |
|
1130 but in more complex situations a good deal of creativity is needed. This is |
|
1131 the main source of complications in inductive proofs. |
|
1132 |
|
1133 Although we now have two variables, only \texttt{xs} is suitable for |
|
1134 induction, and we repeat our above proof attempt. Unfortunately, we are still |
|
1135 not there: |
|
1136 \begin{ttbox}\makeatother |
|
1137 {\out 1. !!a list.} |
|
1138 {\out itrev list ys = rev list @ ys} |
|
1139 {\out ==> itrev list (a # ys) = rev list @ a # ys} |
|
1140 \end{ttbox} |
|
1141 The induction hypothesis is still too weak, but this time it takes no |
|
1142 intuition to generalize: the problem is that \texttt{ys} is fixed throughout |
|
1143 the subgoal, but the induction hypothesis needs to be applied with |
|
1144 \texttt{a \# ys} instead of \texttt{ys}. Hence we prove the theorem |
|
1145 for all \texttt{ys} instead of a fixed one: |
|
1146 \begin{ttbox}\makeatother |
|
1147 \input{Misc/itrev3.ML}\end{ttbox} |
|
1148 This time induction on \texttt{xs} followed by simplification succeeds. This |
|
1149 leads to another heuristic for generalization: |
|
1150 \begin{quote} |
|
1151 {\em 4. Generalize goals for induction by universally quantifying all free |
|
1152 variables {\em(except the induction variable itself!)}.} |
|
1153 \end{quote} |
|
1154 This prevents trivial failures like the above and does not change the |
|
1155 provability of the goal. Because it is not always required, and may even |
|
1156 complicate matters in some cases, this heuristic is often not |
|
1157 applied blindly. |
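For the goal at hand, the effect of the quantifier on the induction step can
be spelled out by hand. The sketch below assumes the recursion equation
\texttt{itrev (x\#xs) ys = itrev xs (x\#ys)} and uses the induction
hypothesis \texttt{!ys.~itrev list ys = rev list \at\ ys} with \texttt{ys}
instantiated by \texttt{a \# ys}:
\begin{ttbox}\makeatother
itrev (a # list) ys = itrev list (a # ys)       (recursion equation)
                    = rev list @ (a # ys)       (induction hypothesis)
                    = (rev list @ [a]) @ ys     (associativity of @)
                    = rev (a # list) @ ys       (equation for rev)
\end{ttbox}
Without the quantifier the second step is not available, because the
hypothesis then speaks only about the fixed \texttt{ys}.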
|
1158 |
|
1159 A final point worth mentioning is the orientation of the equation we just |
|
1160 proved: the more complex notion (\texttt{itrev}) is on the left-hand |
|
1161 side, the simpler \texttt{rev} on the right-hand side. This constitutes |
|
1162 another, albeit weak heuristic that is not restricted to induction: |
|
1163 \begin{quote} |
|
1164 {\em 5. The right-hand side of an equation should (in some sense) be |
|
1165 simpler than the left-hand side.} |
|
1166 \end{quote} |
|
1167 The heuristic is tricky to apply because it is not obvious that |
|
1168 \texttt{rev xs \at\ ys} is simpler than \texttt{itrev xs ys}. But see what |
|
1169 happens if you try to prove the symmetric equation! |
|
1170 |
|
1171 |
|
1172 \section{Case study: compiling expressions} |
|
1173 \label{sec:ExprCompiler} |
|
1174 |
|
1175 The task is to develop a compiler from a generic type of expressions (built |
|
1176 up from variables, constants and binary operations) to a stack machine. This |
|
1177 generic type of expressions is a generalization of the boolean expressions in |
|
1178 \S\ref{sec:boolex}. This time we do not commit ourselves to a particular |
|
1179 type of variables or values but make them type parameters. Neither is there |
|
1180 a fixed set of binary operations: instead the expression contains the |
|
1181 appropriate function itself. |
|
1182 \begin{ttbox} |
|
1183 \input{CodeGen/expr}\end{ttbox} |
|
1184 The three constructors represent constants, variables and the combination of |
|
1185 two subexpressions with a binary operation. |
|
1186 |
|
1187 The value of an expression w.r.t.\ an environment that maps variables to |
|
1188 values is easily defined: |
|
1189 \begin{ttbox} |
|
1190 \input{CodeGen/value}\end{ttbox} |
|
1191 |
|
1192 The stack machine has three instructions: load a constant value onto the |
|
1193 stack, load the contents of a certain address onto the stack, and apply a |
|
1194 binary operation to the two topmost elements of the stack, replacing them by |
|
1195 the result. As for \texttt{expr}, addresses and values are type parameters: |
|
1196 \begin{ttbox} |
|
1197 \input{CodeGen/instr}\end{ttbox} |
|
1198 |
|
1199 The execution of the stack machine is modelled by a function \texttt{exec} |
|
1200 that takes a store (modelled as a function from addresses to values, just |
|
1201 like the environment for evaluating expressions), a stack (modelled as a |
|
1202 list) of values and a list of instructions and returns the stack at the end |
|
1203 of the execution --- the store remains unchanged: |
|
1204 \begin{ttbox} |
|
1205 \input{CodeGen/exec}\end{ttbox} |
|
1206 Recall that \texttt{hd} and \texttt{tl} |
|
1207 return the first element and the remainder of a list. |
|
1208 |
|
1209 Because all functions are total, \texttt{hd} is defined even for the empty |
|
1210 list, although we do not know what the result is. Thus our model of the |
|
1211 machine always terminates properly, although the above definition does not |
|
1212 tell us much about the result in situations where \texttt{Apply} was executed |
|
1213 with fewer than two elements on the stack. |
|
1214 |
|
1215 The compiler is a function from expressions to a list of instructions. Its |
|
1216 definition is pretty much obvious: |
|
1217 \begin{ttbox}\makeatother |
|
1218 \input{CodeGen/comp}\end{ttbox} |
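To get a feeling for the interplay of \texttt{comp} and \texttt{exec}, here
is a hand trace for the expression $x+1$ in a store \texttt{s} with
\texttt{s x = 2}. The instruction names \texttt{Const} and \texttt{Load} and
the order of the two loads are assumptions made for this illustration;
consult the definitions above for the actual names. Since addition is
commutative, the order of the operands does not affect the result.
\begin{ttbox}
exec s []    [Const 1, Load x, Apply (op +)]
  \(\leadsto\) exec s [1]   [Load x, Apply (op +)]
  \(\leadsto\) exec s [2,1] [Apply (op +)]
  \(\leadsto\) exec s [3]   []
  \(\leadsto\) [3]
\end{ttbox}
The final stack contains exactly the value of the expression, $2+1 = 3$.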
|
1219 |
|
1220 Now we have to prove the correctness of the compiler, i.e.\ that the |
|
1221 execution of a compiled expression results in the value of the expression: |
|
1222 \begin{ttbox} |
|
1223 exec s [] (comp e) = [value s e] |
|
1224 \end{ttbox} |
|
1225 This is generalized to |
|
1226 \begin{ttbox} |
|
1227 \input{CodeGen/goal2.ML}\end{ttbox} |
|
1228 and proved by induction on \texttt{e} followed by simplification, once we |
|
1229 have the following lemma about executing the concatenation of two instruction |
|
1230 sequences: |
|
1231 \begin{ttbox}\makeatother |
|
1232 \input{CodeGen/goal2.ML}\end{ttbox} |
|
1233 This requires induction on \texttt{xs} and ordinary simplification for the |
|
1234 base cases. In the induction step, simplification leaves us with a formula |
|
1235 that contains two \texttt{case}-expressions over instructions. Thus we add |
|
1236 automatic case splitting as well, which finishes the proof: |
|
1237 \begin{ttbox} |
|
1238 \input{CodeGen/simpsplit.ML}\end{ttbox} |
|
1239 |
|
1240 We could now go back and prove \texttt{exec s [] (comp e) = [value s e]} |
|
1241 merely by simplification with the generalized version we just proved. |
|
1242 However, this is unnecessary because the generalized version fully subsumes |
|
1243 its instance. |
|
1244 |
|
1245 \section{Total recursive functions} |
|
1246 \label{sec:recdef} |
|
1247 \index{*recdef|(} |
|
1248 |
|
1249 |
|
1250 Although many total functions have a natural primitive recursive definition, |
|
1251 this is not always the case. Arbitrary total recursive functions can be |
|
1252 defined by means of \texttt{recdef}: you can use full pattern-matching, |
|
1253 recursion need not involve datatypes, and termination is proved by showing |
|
1254 that each recursive call makes the argument smaller in a suitable (user |
|
1255 supplied) sense. |
|
1256 |
|
1257 \subsection{Defining recursive functions} |
|
1258 |
|
1259 Here is a simple example, the Fibonacci function: |
|
1260 \begin{ttbox} |
|
consts fib  :: "nat => nat"
|
1262 recdef fib "measure(\%n. n)" |
|
1263 "fib 0 = 0" |
|
1264 "fib 1 = 1" |
|
1265 "fib (Suc(Suc x)) = fib x + fib (Suc x)" |
|
1266 \end{ttbox} |
|
1267 The definition of \texttt{fib} is accompanied by a \bfindex{measure function} |
|
1268 \texttt{\%n.$\;$n} that maps the argument of \texttt{fib} to a natural |
|
1269 number. The requirement is that in each equation the measure of the argument |
|
1270 on the left-hand side is strictly greater than the measure of the argument of |
|
1271 each recursive call. In the case of \texttt{fib} this is obviously true |
|
1272 because the measure function is the identity and \texttt{Suc(Suc~x)} is |
|
1273 strictly greater than both \texttt{x} and \texttt{Suc~x}. |
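Unfolding the equations by hand for small arguments shows both the
computation and the decreasing measure; here \texttt{2}, \texttt{3} and
\texttt{4} abbreviate the corresponding \texttt{Suc}-terms:
\begin{ttbox}
fib 2 = fib 0 + fib 1 = 0 + 1 = 1
fib 3 = fib 1 + fib 2 = 1 + 1 = 2
fib 4 = fib 2 + fib 3 = 1 + 2 = 3
\end{ttbox}
In the last line the argument \texttt{4} has measure 4, whereas the recursive
calls have measures 2 and 3.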
|
Slightly more interesting is the insertion of a fixed element
between any two elements of a list:
\begin{ttbox}
consts sep :: "'a * 'a list => 'a list"
recdef sep "measure (\%(a,xs). length xs)"
    "sep(a, [])     = []"
    "sep(a, [x])    = [x]"
    "sep(a, x#y#zs) = x # a # sep(a,y#zs)"
\end{ttbox}
This time the measure is the length of the list, which decreases with the
recursive call; the first component of the argument tuple is irrelevant.
|
Pattern matching need not be exhaustive:
\begin{ttbox}
consts last :: "'a list => 'a"
recdef last "measure (\%xs. length xs)"
    "last [x]      = x"
    "last (x#y#zs) = last (y#zs)"
\end{ttbox}
|
Overlapping patterns are disambiguated by taking the order of equations into
account, just as in functional programming:
\begin{ttbox}
recdef sep "measure (\%(a,xs). length xs)"
    "sep(a, x#y#zs) = x # a # sep(a,y#zs)"
    "sep(a, xs)     = xs"
\end{ttbox}
This defines exactly the same function \texttt{sep} as further above.
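
That the two formulations of \texttt{sep} agree can be checked informally with
a Python sketch. It is only an analogue of the HOL definitions: Python lists
stand in for HOL lists, the name \texttt{sep\_ordered} is invented here, and the
first matching branch plays the role of the first matching equation.

```python
def sep(a, xs):
    # Three disjoint equations, selected by the shape of the list.
    if len(xs) == 0:
        return []
    if len(xs) == 1:
        return [xs[0]]
    x, rest = xs[0], xs[1:]  # xs = x # y # zs and rest = y # zs
    assert len(rest) < len(xs)  # the measure "length xs" decreases
    return [x, a] + sep(a, rest)


def sep_ordered(a, xs):
    # Two overlapping equations; the earlier one takes precedence.
    if len(xs) >= 2:
        return [xs[0], a] + sep_ordered(a, xs[1:])
    return list(xs)
```

On every list tried, both functions return the same result, for instance
\texttt{[1, 0, 2, 0, 3]} for the separator \texttt{0} and the list
\texttt{[1, 2, 3]}.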
|
\begin{warn}
  Currently \texttt{recdef} only accepts functions with a single argument,
  possibly of tuple type.
\end{warn}
|
When loading a theory containing a \texttt{recdef} of a function $f$,
Isabelle proves the recursion equations and stores the result as a list of
theorems $f$.\texttt{rules}. It can be viewed by typing
\begin{ttbox}
prths \(f\).rules;
\end{ttbox}
All of the above examples are simple enough that Isabelle can determine
automatically that the measure of the argument goes down in each recursive
call. In that case $f$.\texttt{rules} contains precisely the defining
equations.
|
In general, Isabelle may not be able to prove all termination conditions
automatically. For example, termination of
\begin{ttbox}
consts gcd :: "nat*nat => nat"
recdef gcd "measure ((\%(m,n).n))"
    "gcd (m, n) = (if n=0 then m else gcd(n, m mod n))"
\end{ttbox}
relies on the lemma \texttt{mod_less_divisor}
\begin{ttbox}
0 < n ==> m mod n < n
\end{ttbox}
which is not part of the default simpset. As a result, Isabelle prints a
warning and \texttt{gcd.rules} contains a precondition:
\begin{ttbox}
(! m n. 0 < n --> m mod n < n) ==> gcd (m, n) = (if n=0 \dots)
\end{ttbox}
We need to instruct \texttt{recdef} to use an extended simpset to prove the
termination condition:
\begin{ttbox}
recdef gcd "measure ((\%(m,n).n))"
  simpset "simpset() addsimps [mod_less_divisor]"
    "gcd (m, n) = (if n=0 then m else gcd(n, m mod n))"
\end{ttbox}
This time everything works fine and \texttt{gcd.rules} contains precisely the
stated recursion equation for \texttt{gcd}.
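
To see where \texttt{mod_less_divisor} enters, it can help to replay the
definition in an ordinary programming language. The Python sketch below is an
analogue, not HOL: it assumes natural number arguments and checks with an
assertion the very fact that \texttt{recdef} needs, namely that the measure
(the second component) drops from \texttt{n} to \texttt{m mod n} whenever
\texttt{0 < n}.

```python
def gcd(m, n):
    # Analogue of the recdef equation; m and n are assumed to be naturals.
    if n == 0:
        return m
    # Termination condition of the recursive call gcd(n, m mod n):
    # the measure is the second component, so m mod n < n must hold.
    # This is the instance of mod_less_divisor: 0 < n ==> m mod n < n.
    assert 0 < n and m % n < n
    return gcd(n, m % n)
```

The assertion is reached only in the \texttt{else} branch, where \texttt{n} is
known to be nonzero; this is precisely the precondition of the lemma.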
|
When defining some nontrivial total recursive function, the first attempt
will usually generate a number of termination conditions, some of which may
require new lemmas to be proved in some of the parent theories. Those lemmas
can then be added to the simpset used by \texttt{recdef} for its
proofs, as shown for \texttt{gcd}.
|
Although all the above examples employ measure functions, \texttt{recdef}
allows arbitrary wellfounded relations. For example, termination of
Ackermann's function requires the lexicographic product \texttt{**}:
\begin{ttbox}
recdef ack "measure(\%m. m) ** measure(\%n. n)"
    "ack(0,n)         = Suc n"
    "ack(Suc m,0)     = ack(m, 1)"
    "ack(Suc m,Suc n) = ack(m,ack(Suc m,n))"
\end{ttbox}
For details see~\cite{Isa-Logics-Man} and the examples in the library.
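
The role of the lexicographic product can be illustrated with a Python sketch,
again only as an analogue of the HOL definition. Python compares tuples
lexicographically, so the built-in comparison on pairs stands in for
\texttt{**} here; the helper \texttt{call} is invented for this illustration
and checks that every recursive call is made on a lexicographically smaller
pair.

```python
def ack(m, n):
    def call(m2, n2):
        # Tuples are compared lexicographically: first components first,
        # second components only if the first ones are equal.
        assert (m2, n2) < (m, n)
        return ack(m2, n2)

    if m == 0:
        return n + 1
    if n == 0:
        return call(m - 1, 1)
    # The inner call keeps m and decreases n; the outer call decreases m,
    # while its second argument may be much larger than n.
    return call(m - 1, call(m, n - 1))
```

The outer call in the last equation shows why the second component alone
cannot serve as a measure: it may grow, and termination rests on the first
component having decreased.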
|
\subsection{Deriving simplification rules}

Once we have succeeded in proving all termination conditions, we can start to
derive some theorems. In contrast to \texttt{primrec} definitions, which are
automatically added to the simpset, \texttt{recdef} rules must be included
explicitly, for example as in
\begin{ttbox}
Addsimps fib.rules;
\end{ttbox}
However, some care is necessary now, in contrast to \texttt{primrec}.
Although \texttt{gcd} is a total function, its defining equation leads to
nontermination of the simplifier, because the subterm \texttt{gcd(n, m mod
  n)} on the right-hand side can again be simplified by the same equation,
and so on. The reason: the simplifier rewrites the \texttt{then} and
\texttt{else} branches of a conditional if the condition simplifies to
neither \texttt{True} nor \texttt{False}. Therefore it is recommended to
derive an alternative formulation that replaces case distinctions on the
right-hand side by conditional equations. For \texttt{gcd} this means we have
to prove
\begin{ttbox}
           gcd (m, 0) = m
n ~= 0 ==> gcd (m, n) = gcd(n, m mod n)
\end{ttbox}
To avoid nontermination during those proofs, we have to resort to some
low-level tactics:
\begin{ttbox}
Goal "gcd(m,0) = m";
by(resolve_tac [trans] 1);
by(resolve_tac gcd.rules 1);
by(Simp_tac 1);
\end{ttbox}
At this point it is not necessary to understand what exactly
\texttt{resolve_tac} is doing. The main point is that the above proof works
not just for this one example but in general (except that we have to use
\texttt{Asm_simp_tac} and $f$\texttt{.rules} in general). Try the second
\texttt{gcd}-equation above!
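
The looping behaviour described in this subsection can be imitated with a
small Python sketch. It is not a model of Isabelle's simplifier; the function
\texttt{unfold} is invented here and simply applies the unconditional
\texttt{gcd} equation a given number of times to a term with symbolic
arguments, represented as a string.

```python
def unfold(m, n, steps):
    """Rewrite gcd(m, n) with the if-equation `steps` times, symbolically."""
    if steps == 0:
        return f"gcd({m}, {n})"
    # For symbolic n the condition n=0 is undecided, so a rewriter that
    # descends into the else branch finds a fresh gcd instance to rewrite.
    inner = unfold(n, f"({m} mod {n})", steps - 1)
    return f"(if {n}=0 then {m} else {inner})"
```

However many steps are taken, the result still contains one instance of
\texttt{gcd} to which the equation applies, and the term keeps growing. The
conditional equations avoid this because each of them is applicable only once
its premise about \texttt{n} has been established.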
|
\subsection{Induction}

Assuming we have added the recursion equations (or some suitable derived
equations) to the simpset, we might like to prove something about our
function. Since the function is recursive, the natural proof principle is
again induction. But this time the structural form of induction that comes
with datatypes is unlikely to work well---otherwise we could have defined the
function by \texttt{primrec}. Therefore \texttt{recdef} automatically proves
a suitable induction rule $f$\texttt{.induct} that follows the recursion
pattern of the particular function $f$. Roughly speaking, it requires you to
prove for each \texttt{recdef} equation that the property you are trying to
establish holds for the left-hand side provided it holds for all recursive
calls on the right-hand side. Applying $f$\texttt{.induct} requires its
explicit instantiation. See \S\ref{sec:explicit-inst} for details.
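
As an illustration, consider \texttt{sep} from above. Reading the informal
description against its three equations, the rule \texttt{sep.induct} would
have roughly the following shape, written here as an inference rule. This is a
sketch derived from that description; the exact form Isabelle generates (names
of bound variables, treatment of the argument pair) is an assumption and may
differ.
\[
\frac{\forall a.\ P\,a\,[\,] \qquad
      \forall a\,x.\ P\,a\,[x] \qquad
      \forall a\,x\,y\,\mathit{zs}.\ P\,a\,(y \mathbin{\#} \mathit{zs})
        \Longrightarrow P\,a\,(x \mathbin{\#} y \mathbin{\#} \mathit{zs})}
     {P\,u\,v}
\]
The first two premises correspond to the non-recursive equations. The third
lets you assume the property for the argument of the recursive call
\texttt{sep(a,y\#zs)} when proving it for \texttt{sep(a,x\#y\#zs)}.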
|
\index{*recdef|)}