1 \chapter{Functional Programming in HOL}
3 Although on the surface this chapter is mainly concerned with how to write
4 functional programs in HOL and how to verify them, most of the
5 constructs and proof procedures introduced are general purpose and recur in
6 any specification or verification task.
8 The dedicated functional programmer should be warned: HOL offers only
9 \emph{total functional programming} --- all functions in HOL must be total;
10 lazy data structures are not directly available. On the positive side,
11 functions in HOL need not be computable: HOL is a specification language that
12 goes well beyond what can be expressed as a program. However, for the time
13 being we concentrate on the computable.
15 \section{An introductory theory}
16 \label{sec:intro-theory}
18 Functional programming needs datatypes and functions. Both of them can be
19 defined in a theory with a syntax reminiscent of languages like ML or
20 Haskell. As an example consider the theory in figure~\ref{fig:ToyList}.
21 We will now examine it line by line.
24 \begin{ttbox}\makeatother
25 \input{ToyList2/ToyList1}\end{ttbox}
26 \caption{A theory of lists}
30 {\makeatother\input{ToyList/document/ToyList.tex}}
32 The complete proof script is shown in figure~\ref{fig:ToyList-proofs}. The
33 concatenation of figures \ref{fig:ToyList} and \ref{fig:ToyList-proofs}
34 constitutes the complete theory \texttt{ToyList} and should reside in file
35 \texttt{ToyList.thy}. It is good practice to present all declarations and
36 definitions at the beginning of a theory to facilitate browsing.
39 \begin{ttbox}\makeatother
40 \input{ToyList2/ToyList2}\end{ttbox}
41 \caption{Proofs about lists}
42 \label{fig:ToyList-proofs}
45 \subsubsection*{Review}
47 This is the end of our toy proof. It should have familiarized you with
49 \item the standard theorem proving procedure:
50 state a goal (lemma or theorem); proceed with proof until a separate lemma is
51 required; prove that lemma; come back to the original goal.
52 \item a specific procedure that works well for functional programs:
53 induction followed by all-out simplification via \isa{auto}.
54 \item a basic repertoire of proof commands.
58 \section{Some helpful commands}
59 \label{sec:commands-and-hints}
61 This section discusses a few basic commands for manipulating the proof state
62 and can be skipped by casual readers.
64 There are two kinds of commands used during a proof: the actual proof
65 commands and auxiliary commands for examining the proof state and controlling
66 the display. Simple proof commands are of the form
67 \isacommand{apply}\isa{(method)}\indexbold{apply} where \bfindex{method} is a
68 synonym for ``theorem proving function''. Typical examples are
69 \isa{induct_tac} and \isa{auto}. Further methods are introduced throughout
70 the tutorial. Unless stated otherwise you may assume that a method attacks
71 merely the first subgoal. An exception is \isa{auto} which tries to solve all
74 The most useful auxiliary commands are:
76 \item[Undoing:] \isacommand{undo}\indexbold{*undo} undoes the effect of the
77 last command; \isacommand{undo} can be undone by
78 \isacommand{redo}\indexbold{*redo}. Both are only needed at the shell
79 level and should not occur in the final theory.
80 \item[Printing the current state:] \isacommand{pr}\indexbold{*pr} redisplays
81 the current proof state, for example when it has disappeared off the
83 \item[Limiting the number of subgoals:] \isacommand{pr}~$n$ tells Isabelle to
84 print only the first $n$ subgoals from now on and redisplays the current
85 proof state. This is helpful when there are many subgoals.
86 \item[Modifying the order of subgoals:]
87 \isacommand{defer}\indexbold{*defer} moves the first subgoal to the end and
88 \isacommand{prefer}\indexbold{*prefer}~$n$ moves subgoal $n$ to the front.
89 \item[Printing theorems:]
90 \isacommand{thm}\indexbold{*thm}~\textit{name}$@1$~\dots~\textit{name}$@n$
91 prints the named theorems.
92 \item[Displaying types:] We have already mentioned the flag
93 \ttindex{show_types} above. It can also be useful for detecting typos in
94 formulae early on. For example, if \texttt{show_types} is set and the goal
95 \isa{rev(rev xs) = xs} is started, Isabelle prints the additional output
102 which tells us that Isabelle has correctly inferred that
103 \isa{xs} is a variable of list type. On the other hand, had we
104 made a typo as in \isa{rev(re xs) = xs}, the response
107 Variables:\isanewline
108 ~~re~::~'a~list~{\isasymRightarrow}~'a~list\isanewline
112 would have alerted us because of the unexpected variable \isa{re}.
113 \item[Reading terms and types:] \isacommand{term}\indexbold{*term}
114 \textit{string} reads, type-checks and prints the given string as a term in
115 the current context; the inferred type is output as well.
116 \isacommand{typ}\indexbold{*typ} \textit{string} reads and prints the given
117 string as a type in the current context.
118 \item[(Re)loading theories:] When you start your interaction you must open a
119 named theory with the line \isa{\isacommand{theory}~T~=~\dots~:}. Isabelle
120 automatically loads all the required parent theories from their respective
121 files (which may take a moment, unless the theories are already loaded and
122 the files have not been modified).
124 If you suddenly discover that you need to modify a parent theory of your
125 current theory you must first abandon your current theory\indexbold{abandon
126 theory}\indexbold{theory!abandon} (at the shell
127 level type \isacommand{kill}\indexbold{*kill}). After the parent theory has
128 been modified you go back to your original theory. When its first line
129 \isacommand{theory}\texttt{~T~=}~\dots~\texttt{:} is processed, the
130 modified parent is reloaded automatically.
132 The only time when you need to load a theory by hand is when you simply
133 want to check if it loads successfully without wanting to make use of the
134 theory itself. This you can do by typing
135 \isa{\isacommand{use\_thy}\indexbold{*use_thy}~"T"}.
137 Further commands are found in the Isabelle/Isar Reference Manual.
139 We now examine Isabelle's functional programming constructs systematically,
140 starting with inductive datatypes.
146 Inductive datatypes are part of almost every non-trivial application of HOL.
147 First we take another look at a very important example, the datatype of
148 lists, before we turn to datatypes in general. The section closes with a
155 Lists are one of the essential datatypes in computing. Readers of this
156 tutorial and users of HOL need to be familiar with their basic operations.
157 Theory \isa{ToyList} is only a small fragment of HOL's predefined theory
158 \isa{List}\footnote{\url{http://isabelle.in.tum.de/library/HOL/List.html}}.
159 The latter contains many further operations. For example, the functions
160 \isaindexbold{hd} (``head'') and \isaindexbold{tl} (``tail'') return the first
161 element and the remainder of a list. (However, pattern-matching is usually
162 preferable to \isa{hd} and \isa{tl}.) Theory \isa{List} also contains
163 more syntactic sugar: \isa{[}$x@1$\isa{,}\dots\isa{,}$x@n$\isa{]} abbreviates
164 $x@1$\isa{\#}\dots\isa{\#}$x@n$\isa{\#[]}. In the rest of the tutorial we
165 always use HOL's predefined lists.
168 \subsection{The general format}
169 \label{sec:general-datatype}
171 The general HOL \isacommand{datatype} definition is of the form
173 \isacommand{datatype}~(\alpha@1, \dots, \alpha@n) \, t ~=~
174 C@1~\tau@{11}~\dots~\tau@{1k@1} ~\mid~ \dots ~\mid~
175 C@m~\tau@{m1}~\dots~\tau@{mk@m}
177 where $\alpha@i$ are distinct type variables (the parameters), $C@i$ are distinct
178 constructor names and $\tau@{ij}$ are types; it is customary to capitalize
179 the first letter in constructor names. There are a number of
180 restrictions (such as that the type should not be empty) detailed
181 elsewhere~\cite{isabelle-HOL}. Isabelle notifies you if you violate them.
183 Laws about datatypes, such as \isa{[] \isasymnoteq~x\#xs} and
184 \isa{(x\#xs = y\#ys) = (x=y \isasymand~xs=ys)}, are used automatically
185 during proofs by simplification. The same is true for the equations in
186 primitive recursive function definitions.
188 Every datatype $t$ comes equipped with a \isa{size} function from $t$ into
189 the natural numbers (see~\S\ref{sec:nat} below). For lists, \isa{size} is
190 just the length, i.e.\ \isa{size [] = 0} and \isa{size(x \# xs) = size xs +
191 1}. In general, \isaindexbold{size} returns \isa{0} for all constructors
192 that do not have an argument of type $t$, and for all other constructors
193 \isa{1 +} the sum of the sizes of all arguments of type $t$. Note that because
194 \isa{size} is defined on every datatype, it is overloaded; on lists
195 \isa{size} is also called \isaindexbold{length}, which is not overloaded.
196 Isbelle will always show \isa{size} on lists as \isa{length}.
199 \subsection{Primitive recursion}
201 Functions on datatypes are usually defined by recursion. In fact, most of the
202 time they are defined by what is called \bfindex{primitive recursion}.
203 The keyword \isacommand{primrec}\indexbold{*primrec} is followed by a list of
205 \[ f \, x@1 \, \dots \, (C \, y@1 \, \dots \, y@k)\, \dots \, x@n = r \]
206 such that $C$ is a constructor of the datatype $t$ and all recursive calls of
207 $f$ in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$. Thus
208 Isabelle immediately sees that $f$ terminates because one (fixed!) argument
209 becomes smaller with every recursive call. There must be exactly one equation
210 for each constructor. Their order is immaterial.
211 A more general method for defining total recursive functions is introduced in
214 \begin{exercise}\label{ex:Tree}
215 \input{Misc/document/Tree.tex}%
218 \input{Misc/document/case_exprs.tex}
221 Induction is only allowed on free (or \isasymAnd-bound) variables that
222 should not occur among the assumptions of the subgoal; see
223 \S\ref{sec:ind-var-in-prems} for details. Case distinction
224 (\isa{case_tac}) works for arbitrary terms, which need to be
225 quoted if they are non-atomic.
229 \input{Ifexpr/document/Ifexpr.tex}
231 \section{Some basic types}
234 \subsection{Natural numbers}
238 \input{Misc/document/fakenat.tex}
239 \input{Misc/document/natsum.tex}
241 The usual arithmetic operations \ttindexboldpos{+}{$HOL2arithfun},
242 \ttindexboldpos{-}{$HOL2arithfun}, \ttindexboldpos{*}{$HOL2arithfun},
243 \isaindexbold{div}, \isaindexbold{mod}, \isaindexbold{min} and
244 \isaindexbold{max} are predefined, as are the relations
245 \indexboldpos{\isasymle}{$HOL2arithrel} and
246 \ttindexboldpos{<}{$HOL2arithrel}. There is even a least number operation
247 \isaindexbold{LEAST}. For example, \isa{(LEAST n.$\,$1 < n) = 2}, although
248 Isabelle does not prove this completely automatically. Note that \isa{1} and
249 \isa{2} are available as abbreviations for the corresponding
250 \isa{Suc}-expressions. If you need the full set of numerals,
251 see~\S\ref{nat-numerals}.
254 The constant \ttindexbold{0} and the operations
255 \ttindexboldpos{+}{$HOL2arithfun}, \ttindexboldpos{-}{$HOL2arithfun},
256 \ttindexboldpos{*}{$HOL2arithfun}, \isaindexbold{min}, \isaindexbold{max},
257 \indexboldpos{\isasymle}{$HOL2arithrel} and
258 \ttindexboldpos{<}{$HOL2arithrel} are overloaded, i.e.\ they are available
259 not just for natural numbers but at other types as well (see
260 \S\ref{sec:overloading}). For example, given the goal \isa{x+0 = x}, there
261 is nothing to indicate that you are talking about natural numbers. Hence
262 Isabelle can only infer that \isa{x} is of some arbitrary type where
263 \isa{0} and \isa{+} are declared. As a consequence, you will be unable to
264 prove the goal (although it may take you some time to realize what has
265 happened if \texttt{show_types} is not set). In this particular example,
266 you need to include an explicit type constraint, for example \isa{x+0 =
267 (x::nat)}. If there is enough contextual information this may not be
268 necessary: \isa{Suc x = x} automatically implies \isa{x::nat} because
269 \isa{Suc} is not overloaded.
272 Simple arithmetic goals are proved automatically by both \isa{auto}
273 and the simplification methods introduced in \S\ref{sec:Simplification}. For
276 \input{Misc/document/arith1.tex}%
277 is proved automatically. The main restriction is that only addition is taken
278 into account; other arithmetic operations and quantified formulae are ignored.
280 For more complex goals, there is the special method
281 \isaindexbold{arith} (which attacks the first subgoal). It proves
282 arithmetic goals involving the usual logical connectives (\isasymnot,
283 \isasymand, \isasymor, \isasymimp), the relations \isasymle\ and \isa{<}, and
284 the operations \isa{+}, \isa{-}, \isa{min} and \isa{max}. For example,
286 \input{Misc/document/arith2.tex}%
287 succeeds because \isa{k*k} can be treated as atomic.
290 \input{Misc/document/arith3.tex}%
291 is not even proved by \isa{arith} because the proof relies essentially
292 on properties of multiplication.
295 The running time of \isa{arith} is exponential in the number of occurrences
296 of \ttindexboldpos{-}{$HOL2arithfun}, \isaindexbold{min} and
297 \isaindexbold{max} because they are first eliminated by case distinctions.
299 \isa{arith} is incomplete even for the restricted class of formulae
300 described above (known as ``linear arithmetic''). If divisibility plays a
301 role, it may fail to prove a valid formula, for example $m+m \neq n+n+1$.
302 Fortunately, such examples are rare in practice.
309 \input{Misc/document/pairs.tex}
311 %FIXME move stuff below into section on proofs about products?
313 Abstraction over pairs and tuples is merely a convenient shorthand for a
314 more complex internal representation. Thus the internal and external form
315 of a term may differ, which can affect proofs. If you want to avoid this
316 complication, use \isa{fst} and \isa{snd}, i.e.\ write
317 \isa{\isasymlambda{}p.~fst p + snd p} instead of
318 \isa{\isasymlambda(x,y).~x + y}. See~\S\ref{sec:products} for
319 theorem proving with tuple patterns.
322 Note that products, like type \isa{nat}, are datatypes, which means
323 in particular that \isa{induct_tac} and \isa{case_tac} are applicable to
324 products (see \S\ref{sec:products}).
326 Instead of tuples with many components (where ``many'' is not much above 2),
327 it is far preferable to use records (see \S\ref{sec:records}).
329 \section{Definitions}
330 \label{sec:Definitions}
332 A definition is simply an abbreviation, i.e.\ a new name for an existing
333 construction. In particular, definitions cannot be recursive. Isabelle offers
334 definitions on the level of types and terms. Those on the type level are
335 called type synonyms, those on the term level are called (constant)
339 \subsection{Type synonyms}
340 \indexbold{type synonym}
342 Type synonyms are similar to those found in ML. Their syntax is fairly self
345 \input{Misc/document/types.tex}%
347 Note that pattern-matching is not allowed, i.e.\ each definition must be of
348 the form $f\,x@1\,\dots\,x@n~\isasymequiv~t$.
349 Section~\S\ref{sec:Simplification} explains how definitions are used
352 \input{Misc/document/prime_def.tex}
355 \chapter{More Functional Programming}
357 The purpose of this chapter is to deepen the reader's understanding of the
358 concepts encountered so far and to introduce advanced forms of datatypes and
359 recursive functions. The first two sections give a structured presentation of
360 theorem proving by simplification (\S\ref{sec:Simplification}) and discuss
361 important heuristics for induction (\S\ref{sec:InductionHeuristics}). They can
362 be skipped by readers less interested in proofs. They are followed by a case
363 study, a compiler for expressions (\S\ref{sec:ExprCompiler}). Advanced
364 datatypes, including those involving function spaces, are covered in
365 \S\ref{sec:advanced-datatypes}, which closes with another case study, search
366 trees (``tries''). Finally we introduce \isacommand{recdef}, a very general
367 form of recursive function definition which goes well beyond what
368 \isacommand{primrec} allows (\S\ref{sec:recdef}).
371 \section{Simplification}
372 \label{sec:Simplification}
373 \index{simplification|(}
375 So far we have proved our theorems by \isa{auto}, which ``simplifies''
376 \emph{all} subgoals. In fact, \isa{auto} can do much more than that, except
377 that it did not need to so far. However, when you go beyond toy examples, you
378 need to understand the ingredients of \isa{auto}. This section covers the
379 method that \isa{auto} always applies first, namely simplification.
381 Simplification is one of the central theorem proving tools in Isabelle and
382 many other systems. The tool itself is called the \bfindex{simplifier}. The
383 purpose of this section is introduce the many features of the simplifier.
384 Anybody intending to use HOL should read this section. Later on
385 (\S\ref{sec:simplification-II}) we explain some more advanced features and a
386 little bit of how the simplifier works. The serious student should read that
387 section as well, in particular in order to understand what happened if things
388 do not simplify as expected.
390 \subsubsection{What is simplification}
392 In its most basic form, simplification means repeated application of
393 equations from left to right. For example, taking the rules for \isa{\at}
394 and applying them to the term \isa{[0,1] \at\ []} results in a sequence of
395 simplification steps:
396 \begin{ttbox}\makeatother
397 (0#1#[]) @ [] \(\leadsto\) 0#((1#[]) @ []) \(\leadsto\) 0#(1#([] @ [])) \(\leadsto\) 0#1#[]
399 This is also known as \bfindex{term rewriting}\indexbold{rewriting} and the
400 equations are referred to as \textbf{rewrite rules}\indexbold{rewrite rule}.
401 ``Rewriting'' is more honest than ``simplification'' because the terms do not
402 necessarily become simpler in the process.
404 \input{Misc/document/simp.tex}
406 \index{simplification|)}
408 \input{Misc/document/Itrev.tex}
411 \input{Misc/document/Tree2.tex}%
414 \input{CodeGen/document/CodeGen.tex}
417 \section{Advanced datatypes}
418 \label{sec:advanced-datatypes}
423 This section presents advanced forms of \isacommand{datatype}s.
425 \subsection{Mutual recursion}
426 \label{sec:datatype-mut-rec}
428 \input{Datatype/document/ABexpr.tex}
430 \subsection{Nested recursion}
431 \label{sec:nested-datatype}
433 {\makeatother\input{Datatype/document/Nested.tex}}
436 \subsection{The limits of nested recursion}
438 How far can we push nested recursion? By the unfolding argument above, we can
439 reduce nested to mutual recursion provided the nested recursion only involves
440 previously defined datatypes. This does not include functions:
442 \isacommand{datatype} t = C "t \isasymRightarrow\ bool"
444 is a real can of worms: in HOL it must be ruled out because it requires a type
445 \isa{t} such that \isa{t} and its power set \isa{t \isasymFun\ bool} have the
446 same cardinality---an impossibility. For the same reason it is not possible
447 to allow recursion involving the type \isa{set}, which is isomorphic to
448 \isa{t \isasymFun\ bool}.
450 Fortunately, a limited form of recursion
451 involving function spaces is permitted: the recursive type may occur on the
452 right of a function arrow, but never on the left. Hence the above can of worms
453 is ruled out but the following example of a potentially infinitely branching tree is
457 \input{Datatype/document/Fundata.tex}
460 If you need nested recursion on the left of a function arrow, there are
461 alternatives to pure HOL: LCF~\cite{paulson87} is a logic where types like
463 \isacommand{datatype} lam = C "lam \isasymrightarrow\ lam"
465 do indeed make sense (but note the intentionally different arrow
466 \isa{\isasymrightarrow}). There is even a version of LCF on top of HOL,
467 called HOLCF~\cite{MuellerNvOS99}.
472 \subsection{Case study: Tries}
474 Tries are a classic search tree data structure~\cite{Knuth3-75} for fast
475 indexing with strings. Figure~\ref{fig:trie} gives a graphical example of a
476 trie containing the words ``all'', ``an'', ``ape'', ``can'', ``car'' and
477 ``cat''. When searching a string in a trie, the letters of the string are
478 examined sequentially. Each letter determines which subtrie to search next.
479 In this case study we model tries as a datatype, define a lookup and an
480 update function, and prove that they behave as expected.
485 \begin{picture}(60,30)
486 \put( 5, 0){\makebox(0,0)[b]{l}}
487 \put(25, 0){\makebox(0,0)[b]{e}}
488 \put(35, 0){\makebox(0,0)[b]{n}}
489 \put(45, 0){\makebox(0,0)[b]{r}}
490 \put(55, 0){\makebox(0,0)[b]{t}}
492 \put( 5, 9){\line(0,-1){5}}
493 \put(25, 9){\line(0,-1){5}}
494 \put(44, 9){\line(-3,-2){9}}
495 \put(45, 9){\line(0,-1){5}}
496 \put(46, 9){\line(3,-2){9}}
498 \put( 5,10){\makebox(0,0)[b]{l}}
499 \put(15,10){\makebox(0,0)[b]{n}}
500 \put(25,10){\makebox(0,0)[b]{p}}
501 \put(45,10){\makebox(0,0)[b]{a}}
503 \put(14,19){\line(-3,-2){9}}
504 \put(15,19){\line(0,-1){5}}
505 \put(16,19){\line(3,-2){9}}
506 \put(45,19){\line(0,-1){5}}
508 \put(15,20){\makebox(0,0)[b]{a}}
509 \put(45,20){\makebox(0,0)[b]{c}}
511 \put(30,30){\line(-3,-2){13}}
512 \put(30,30){\line(3,-2){13}}
515 \caption{A sample trie}
519 Proper tries associate some value with each string. Since the
520 information is stored only in the final node associated with the string, many
521 nodes do not carry any value. This distinction is captured by the
522 following predefined datatype (from theory \isa{Option}, which is a parent
525 \input{Trie/document/Option2.tex}
526 \indexbold{*option}\indexbold{*None}\indexbold{*Some}%
527 \input{Trie/document/Trie.tex}
530 Write an improved version of \isa{update} that does not suffer from the
531 space leak in the version above. Prove the main theorem for your improved
536 Write a function to \emph{delete} entries from a trie. An easy solution is
537 a clever modification of \isa{update} which allows both insertion and
538 deletion with a single function. Prove (a modified version of) the main
539 theorem above. Optimize you function such that it shrinks tries after
540 deletion, if possible.
543 \section{Total recursive functions}
547 Although many total functions have a natural primitive recursive definition,
548 this is not always the case. Arbitrary total recursive functions can be
549 defined by means of \isacommand{recdef}: you can use full pattern-matching,
550 recursion need not involve datatypes, and termination is proved by showing
551 that the arguments of all recursive calls are smaller in a suitable (user
554 \subsection{Defining recursive functions}
556 \input{Recdef/document/examples.tex}
558 \subsection{Proving termination}
560 \input{Recdef/document/termination.tex}
562 \subsection{Simplification with recdef}
563 \label{sec:recdef-simplification}
565 \input{Recdef/document/simplification.tex}
567 \subsection{Induction}
568 \index{induction!recursion|(}
569 \index{recursion induction|(}
571 \input{Recdef/document/Induction.tex}
572 \label{sec:recdef-induction}
574 \index{induction!recursion|)}
575 \index{recursion induction|)}