1 \chapter{Functional Programming in HOL}
3 This chapter describes how to write
4 functional programs in HOL and how to verify them. However,
5 most of the constructs and
6 proof procedures introduced are general and recur in any specification
7 or verification task. We really should speak of functional
8 \emph{modelling} rather than functional \emph{programming}:
10 to write programs but to design abstract models of systems. HOL is
11 a specification language that goes well beyond what can be expressed as a
12 program. However, for the time being we concentrate on the computable.
14 If you are a purist functional programmer, please note that all functions
16 they must terminate for all inputs. Lazy data structures are not
19 \section{An Introductory Theory}
20 \label{sec:intro-theory}
22 Functional programming needs datatypes and functions. Both of them can be
23 defined in a theory with a syntax reminiscent of languages like ML or
24 Haskell. As an example consider the theory in figure~\ref{fig:ToyList}.
25 We will now examine it line by line.
28 \begin{ttbox}\makeatother
29 \input{ToyList2/ToyList1}\end{ttbox}
30 \caption{A Theory of Lists}
34 \index{*ToyList example|(}
35 {\makeatother\input{ToyList/document/ToyList.tex}}
37 The complete proof script is shown in Fig.\ts\ref{fig:ToyList-proofs}. The
38 concatenation of Figs.\ts\ref{fig:ToyList} and~\ref{fig:ToyList-proofs}
39 constitutes the complete theory \texttt{ToyList} and should reside in file
41 % It is good practice to present all declarations and
42 %definitions at the beginning of a theory to facilitate browsing.%
43 \index{*ToyList example|)}
46 \begin{ttbox}\makeatother
47 \input{ToyList2/ToyList2}\end{ttbox}
48 \caption{Proofs about Lists}
49 \label{fig:ToyList-proofs}
52 \subsubsection*{Review}
54 This is the end of our toy proof. It should have familiarized you with
56 \item the standard theorem proving procedure:
57 state a goal (lemma or theorem); proceed with proof until a separate lemma is
58 required; prove that lemma; come back to the original goal.
59 \item a specific procedure that works well for functional programs:
60 induction followed by all-out simplification via \isa{auto}.
61 \item a basic repertoire of proof commands.
65 It is tempting to think that all lemmas should have the \isa{simp} attribute
66 just because this was the case in the example above. However, in that example
67 all lemmas were equations, and the right-hand side was simpler than the
68 left-hand side --- an ideal situation for simplification purposes. Unless
69 this is clearly the case, novices should refrain from awarding a lemma the
70 \isa{simp} attribute, which has a global effect. Instead, lemmas can be
71 applied locally where they are needed, which is discussed in the following
75 \section{Some Helpful Commands}
76 \label{sec:commands-and-hints}
78 This section discusses a few basic commands for manipulating the proof state
79 and can be skipped by casual readers.
81 There are two kinds of commands used during a proof: the actual proof
82 commands and auxiliary commands for examining the proof state and controlling
83 the display. Simple proof commands are of the form
84 \commdx{apply}(\textit{method}), where \textit{method} is typically
85 \isa{induct_tac} or \isa{auto}. All such theorem proving operations
86 are referred to as \bfindex{methods}, and further ones are
87 introduced throughout the tutorial. Unless stated otherwise, you may
88 assume that a method attacks merely the first subgoal. An exception is
89 \isa{auto}, which tries to solve all subgoals.
91 The most useful auxiliary commands are as follows:
93 \item[Undoing:] \commdx{undo} undoes the effect of
95 last command; \isacommand{undo} can be undone by
96 \commdx{redo}. Both are only needed at the shell
97 level and should not occur in the final theory.
98 \item[Printing the current state:] \commdx{pr}
100 the current proof state, for example when it has scrolled past the top of
102 \item[Limiting the number of subgoals:] \isacommand{pr}~$n$ tells Isabelle to
103 print only the first $n$ subgoals from now on and redisplays the current
104 proof state. This is helpful when there are many subgoals.
105 \item[Modifying the order of subgoals:]
106 \commdx{defer} moves the first subgoal to the end and
107 \commdx{prefer}~$n$ moves subgoal $n$ to the front.
108 \item[Printing theorems:]
109 \commdx{thm}~\textit{name}$@1$~\dots~\textit{name}$@n$
110 prints the named theorems.
111 \item[Displaying types:] We have already mentioned the flag
112 \texttt{show_types} above.\index{*show_types (flag)}
113 It can also be useful for detecting misspellings in
114 formulae. For example, if \texttt{show_types} is set and the goal
115 \isa{rev(rev xs) = xs} is started, Isabelle prints the additional output
118 variables:\isanewline
122 which tells us that Isabelle has correctly inferred that
123 \isa{xs} is a variable of list type. On the other hand, had we
124 made a typo as in \isa{rev(re xs) = xs}, the response
127 variables:\isanewline
128 ~~re~::~'a~list~{\isasymRightarrow}~'a~list\isanewline
132 would have alerted us because of the unexpected variable \isa{re}.
133 \item[Reading terms and types:] \commdx{term}
134 \textit{string} reads, type-checks and prints the given string as a term in
135 the current context; the inferred type is output as well.
136 \commdx{typ} \textit{string} reads and prints the given
137 string as a type in the current context.
138 \item[(Re)loading theories:] When you start your interaction you must open a
139 named theory with the line \commdx{theory}~\isa{T~=~\dots~:}. Isabelle
140 automatically loads all the required parent theories from their respective
141 files (which may take a moment, unless the theories are already loaded and
142 the files have not been modified).
144 If you suddenly discover that you need to modify a parent theory of your
145 current theory, you must first abandon your current theory%
146 \indexbold{abandoning a theory}\indexbold{theories!abandoning}
148 level type \commdx{kill}). After the parent theory has
149 been modified, you go back to your original theory. When its first line
150 \isa{\isacommand{theory}~T~=~\dots~:} is processed, the
151 modified parent is reloaded automatically.
153 % The only time when you need to load a theory by hand is when you simply
154 % want to check if it loads successfully without wanting to make use of the
155 % theory itself. This you can do by typing
156 % \isa{\commdx{use\_thy}~"T"}.
158 Further commands are found in the Isabelle/Isar Reference
159 Manual~\cite{isabelle-isar-ref}.
161 We now examine Isabelle's functional programming constructs systematically,
162 starting with inductive datatypes.
169 Inductive datatypes are part of almost every non-trivial application of HOL.
170 First we take another look at an important example, the datatype of
171 lists, before we turn to datatypes in general. The section closes with a
177 \index{*list (type)}%
178 Lists are one of the essential datatypes in computing. We expect that you
179 are already familiar with their basic operations.
180 Theory \isa{ToyList} is only a small fragment of HOL's predefined theory
181 \thydx{List}\footnote{\url{http://isabelle.in.tum.de/library/HOL/List.html}}.
182 The latter contains many further operations. For example, the functions
183 \cdx{hd} (``head'') and \cdx{tl} (``tail'') return the first
184 element and the remainder of a list. (However, pattern-matching is usually
185 preferable to \isa{hd} and \isa{tl}.)
186 Also available are higher-order functions like \isa{map} and \isa{filter}.
187 Theory \isa{List} also contains
188 more syntactic sugar: \isa{[}$x@1$\isa{,}\dots\isa{,}$x@n$\isa{]} abbreviates
189 $x@1$\isa{\#}\dots\isa{\#}$x@n$\isa{\#[]}. In the rest of the tutorial we
190 always use HOL's predefined lists by building on theory \isa{Main}.
193 \subsection{The General Format}
194 \label{sec:general-datatype}
196 The general HOL \isacommand{datatype} definition is of the form
198 \isacommand{datatype}~(\alpha@1, \dots, \alpha@n) \, t ~=~
199 C@1~\tau@{11}~\dots~\tau@{1k@1} ~\mid~ \dots ~\mid~
200 C@m~\tau@{m1}~\dots~\tau@{mk@m}
202 where $\alpha@i$ are distinct type variables (the parameters), $C@i$ are distinct
203 constructor names and $\tau@{ij}$ are types; it is customary to capitalize
204 the first letter in constructor names. There are a number of
205 restrictions (such as that the type should not be empty) detailed
206 elsewhere~\cite{isabelle-HOL}. Isabelle notifies you if you violate them.
208 Laws about datatypes, such as \isa{[] \isasymnoteq~x\#xs} and
209 \isa{(x\#xs = y\#ys) = (x=y \isasymand~xs=ys)}, are used automatically
210 during proofs by simplification. The same is true for the equations in
211 primitive recursive function definitions.
213 Every\footnote{Except for advanced datatypes where the recursion involves
214 ``\isasymRightarrow'' as in {\S}\ref{sec:nested-fun-datatype}.} datatype $t$
215 comes equipped with a \isa{size} function from $t$ into the natural numbers
216 (see~{\S}\ref{sec:nat} below). For lists, \isa{size} is just the length, i.e.\
217 \isa{size [] = 0} and \isa{size(x \# xs) = size xs + 1}. In general,
220 \item zero for all constructors
221 that do not have an argument of type $t$\\
222 \item one plus the sum of the sizes of all arguments of type~$t$,
223 for all other constructors
226 \isa{size} is defined on every datatype, it is overloaded; on lists
227 \isa{size} is also called \sdx{length}, which is not overloaded.
228 Isabelle will always show \isa{size} on lists as \isa{length}.
231 \subsection{Primitive Recursion}
233 \index{recursion!primitive}%
234 Functions on datatypes are usually defined by recursion. In fact, most of the
235 time they are defined by what is called \textbf{primitive recursion}.
236 The keyword \commdx{primrec} is followed by a list of
238 \[ f \, x@1 \, \dots \, (C \, y@1 \, \dots \, y@k)\, \dots \, x@n = r \]
239 such that $C$ is a constructor of the datatype $t$ and all recursive calls of
240 $f$ in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$. Thus
241 Isabelle immediately sees that $f$ terminates because one (fixed!) argument
242 becomes smaller with every recursive call. There must be at most one equation
243 for each constructor. Their order is immaterial.
244 A more general method for defining total recursive functions is introduced in
245 {\S}\ref{sec:recdef}.
247 \begin{exercise}\label{ex:Tree}
248 \input{Misc/document/Tree.tex}%
251 \input{Misc/document/case_exprs.tex}
253 \input{Ifexpr/document/Ifexpr.tex}
257 \section{Some Basic Types}
259 This section introduces the types of natural numbers and ordered pairs. Also
260 described is type \isa{option}, which is useful for modelling exceptional
263 \subsection{Natural Numbers}
264 \label{sec:nat}\index{natural numbers}%
265 \index{linear arithmetic|(}
267 \input{Misc/document/fakenat.tex}
268 \input{Misc/document/natsum.tex}
270 \index{linear arithmetic|)}
274 \input{Misc/document/pairs.tex}
276 \subsection{Datatype {\tt\slshape option}}
278 \input{Misc/document/Option2.tex}
280 \section{Definitions}
281 \label{sec:Definitions}
283 A definition is simply an abbreviation, i.e.\ a new name for an existing
284 construction. In particular, definitions cannot be recursive. Isabelle offers
285 definitions on the level of types and terms. Those on the type level are
286 called \textbf{type synonyms}; those on the term level are simply called
290 \subsection{Type Synonyms}
292 \index{type synonyms}%
293 Type synonyms are similar to those found in ML\@. They are created by a
294 \commdx{types} command:
296 \input{Misc/document/types.tex}
298 \input{Misc/document/prime_def.tex}
301 \section{The Definitional Approach}
302 \label{sec:definitional}
304 \index{Definitional Approach}%
305 As we pointed out at the beginning of the chapter, asserting arbitrary
306 axioms such as $f(n) = f(n) + 1$ can easily lead to contradictions. In order
307 to avoid this danger, we advocate the definitional rather than
308 the axiomatic approach: introduce new concepts by definitions. However, Isabelle/HOL seems to
309 support many richer definitional constructs, such as
310 \isacommand{primrec}. The point is that Isabelle reduces such constructs to first principles. For example, each
311 \isacommand{primrec} function definition is turned into a proper
312 (nonrecursive!) definition from which the user-supplied recursion equations are
313 automatically proved. This process is
314 hidden from the user, who does not have to understand the details. Other commands described
315 later, like \isacommand{recdef} and \isacommand{inductive}, work similarly.
316 This strict adherence to the definitional approach reduces the risk of
319 \chapter{More Functional Programming}
321 The purpose of this chapter is to deepen your understanding of the
322 concepts encountered so far and to introduce advanced forms of datatypes and
323 recursive functions. The first two sections give a structured presentation of
324 theorem proving by simplification ({\S}\ref{sec:Simplification}) and discuss
325 important heuristics for induction ({\S}\ref{sec:InductionHeuristics}). You can
326 skip them if you are not planning to perform proofs yourself.
327 We then present a case
328 study: a compiler for expressions ({\S}\ref{sec:ExprCompiler}). Advanced
329 datatypes, including those involving function spaces, are covered in
330 {\S}\ref{sec:advanced-datatypes}; it closes with another case study, search
331 trees (``tries''). Finally we introduce \isacommand{recdef}, a general
332 form of recursive function definition that goes well beyond
333 \isacommand{primrec} ({\S}\ref{sec:recdef}).
336 \section{Simplification}
337 \label{sec:Simplification}
338 \index{simplification|(}
340 So far we have proved our theorems by \isa{auto}, which simplifies
341 all subgoals. In fact, \isa{auto} can do much more than that.
342 To go beyond toy examples, you
343 need to understand the ingredients of \isa{auto}. This section covers the
344 method that \isa{auto} always applies first, simplification.
346 Simplification is one of the central theorem proving tools in Isabelle and
347 many other systems. The tool itself is called the \textbf{simplifier}.
348 This section introduces the many features of the simplifier
349 and is required reading if you intend to perform proofs. Later on,
350 {\S}\ref{sec:simplification-II} explains some more advanced features and a
351 little bit of how the simplifier works. The serious student should read that
352 section as well, in particular to understand why the simplifier did
353 something unexpected.
355 \subsection{What is Simplification?}
357 In its most basic form, simplification means repeated application of
358 equations from left to right. For example, taking the rules for \isa{\at}
359 and applying them to the term \isa{[0,1] \at\ []} results in a sequence of
360 simplification steps:
361 \begin{ttbox}\makeatother
362 (0#1#[]) @ [] \(\leadsto\) 0#((1#[]) @ []) \(\leadsto\) 0#(1#([] @ [])) \(\leadsto\) 0#1#[]
364 This is also known as \bfindex{term rewriting}\indexbold{rewriting} and the
365 equations are referred to as \bfindex{rewrite rules}.
366 ``Rewriting'' is more honest than ``simplification'' because the terms do not
367 necessarily become simpler in the process.
369 The simplifier proves arithmetic goals as described in
370 {\S}\ref{sec:nat} above. Arithmetic expressions are simplified using built-in
371 procedures that go beyond mere rewrite rules. New simplification procedures
372 can be coded and installed, but they are definitely not a matter for this
375 \input{Misc/document/simp.tex}
377 \index{simplification|)}
379 \input{Misc/document/Itrev.tex}
382 \input{Misc/document/Tree2.tex}%
385 \input{CodeGen/document/CodeGen.tex}
388 \section{Advanced Datatypes}
389 \label{sec:advanced-datatypes}
390 \index{datatype@\isacommand {datatype} (command)|(}
391 \index{primrec@\isacommand {primrec} (command)|(}
394 This section presents advanced forms of datatypes: mutual and nested
395 recursion. A series of examples will culminate in a treatment of the trie
399 \subsection{Mutual Recursion}
400 \label{sec:datatype-mut-rec}
402 \input{Datatype/document/ABexpr.tex}
404 \subsection{Nested Recursion}
405 \label{sec:nested-datatype}
407 {\makeatother\input{Datatype/document/Nested.tex}}
410 \subsection{The Limits of Nested Recursion}
411 \label{sec:nested-fun-datatype}
413 How far can we push nested recursion? By the unfolding argument above, we can
414 reduce nested to mutual recursion provided the nested recursion only involves
415 previously defined datatypes. This does not include functions:
417 \isacommand{datatype} t = C "t \isasymRightarrow\ bool"
419 This declaration is a real can of worms.
420 In HOL it must be ruled out because it requires a type
421 \isa{t} such that \isa{t} and its power set \isa{t \isasymFun\ bool} have the
422 same cardinality --- an impossibility. For the same reason it is not possible
423 to allow recursion involving the type \isa{t set}, which is isomorphic to
424 \isa{t \isasymFun\ bool}.
426 Fortunately, a limited form of recursion
427 involving function spaces is permitted: the recursive type may occur on the
428 right of a function arrow, but never on the left. Hence the above can of worms
429 is ruled out but the following example of a potentially
430 \index{infinitely branching trees}%
431 infinitely branching tree is accepted:
434 \input{Datatype/document/Fundata.tex}
436 If you need nested recursion on the left of a function arrow, there are
437 alternatives to pure HOL\@. In the Logic for Computable Functions
438 (\rmindex{LCF}), types like
440 \isacommand{datatype} lam = C "lam \isasymrightarrow\ lam"
442 do indeed make sense~\cite{paulson87}. Note the different arrow,
443 \isa{\isasymrightarrow} instead of \isa{\isasymRightarrow},
444 expressing the type of \emph{continuous} functions.
445 There is even a version of LCF on top of HOL,
446 called \rmindex{HOLCF}~\cite{MuellerNvOS99}.
447 \index{datatype@\isacommand {datatype} (command)|)}
448 \index{primrec@\isacommand {primrec} (command)|)}
451 \subsection{Case Study: Tries}
455 Tries are a classic search tree data structure~\cite{Knuth3-75} for fast
456 indexing with strings. Figure~\ref{fig:trie} gives a graphical example of a
457 trie containing the words ``all'', ``an'', ``ape'', ``can'', ``car'' and
458 ``cat''. When searching a string in a trie, the letters of the string are
459 examined sequentially. Each letter determines which subtrie to search next.
460 In this case study we model tries as a datatype, define a lookup and an
461 update function, and prove that they behave as expected.
466 \begin{picture}(60,30)
467 \put( 5, 0){\makebox(0,0)[b]{l}}
468 \put(25, 0){\makebox(0,0)[b]{e}}
469 \put(35, 0){\makebox(0,0)[b]{n}}
470 \put(45, 0){\makebox(0,0)[b]{r}}
471 \put(55, 0){\makebox(0,0)[b]{t}}
473 \put( 5, 9){\line(0,-1){5}}
474 \put(25, 9){\line(0,-1){5}}
475 \put(44, 9){\line(-3,-2){9}}
476 \put(45, 9){\line(0,-1){5}}
477 \put(46, 9){\line(3,-2){9}}
479 \put( 5,10){\makebox(0,0)[b]{l}}
480 \put(15,10){\makebox(0,0)[b]{n}}
481 \put(25,10){\makebox(0,0)[b]{p}}
482 \put(45,10){\makebox(0,0)[b]{a}}
484 \put(14,19){\line(-3,-2){9}}
485 \put(15,19){\line(0,-1){5}}
486 \put(16,19){\line(3,-2){9}}
487 \put(45,19){\line(0,-1){5}}
489 \put(15,20){\makebox(0,0)[b]{a}}
490 \put(45,20){\makebox(0,0)[b]{c}}
492 \put(30,30){\line(-3,-2){13}}
493 \put(30,30){\line(3,-2){13}}
496 \caption{A Sample Trie}
500 Proper tries associate some value with each string. Since the
501 information is stored only in the final node associated with the string, many
502 nodes do not carry any value. This distinction is modeled with the help
503 of the predefined datatype \isa{option} (see {\S}\ref{sec:option}).
504 \input{Trie/document/Trie.tex}
507 \section{Total Recursive Functions}
509 \index{recdef@\isacommand {recdef} (command)|(}\index{functions!total|(}
511 Although many total functions have a natural primitive recursive definition,
512 this is not always the case. Arbitrary total recursive functions can be
513 defined by means of \isacommand{recdef}: you can use full pattern-matching,
514 recursion need not involve datatypes, and termination is proved by showing
515 that the arguments of all recursive calls are smaller in a suitable (user
516 supplied) sense. In this section we restrict ourselves to measure functions;
517 more advanced termination proofs are discussed in {\S}\ref{sec:beyond-measure}.
519 \subsection{Defining Recursive Functions}
520 \label{sec:recdef-examples}
521 \input{Recdef/document/examples.tex}
523 \subsection{Proving Termination}
524 \input{Recdef/document/termination.tex}
526 \subsection{Simplification and Recursive Functions}
527 \label{sec:recdef-simplification}
528 \input{Recdef/document/simplification.tex}
530 \subsection{Induction and Recursive Functions}
531 \index{induction!recursion|(}
532 \index{recursion induction|(}
534 \input{Recdef/document/Induction.tex}
535 \label{sec:recdef-induction}
537 \index{induction!recursion|)}
538 \index{recursion induction|)}
539 \index{recdef@\isacommand {recdef} (command)|)}\index{functions!total|)}