wneuper/isa: doc-src/TutorialI/fp.tex@b85acd66f715

     1 \chapter{Functional Programming in HOL}

     3 This chapter describes how to write

     4 functional programs in HOL and how to verify them.  However,

     5 most of the constructs and

     6 proof procedures introduced are general and recur in any specification

     7 or verification task.  We really should speak of functional

     8 \emph{modelling} rather than functional \emph{programming}:

     9 our primary aim is not

    10 to write programs but to design abstract models of systems.  HOL is

    11 a specification language that goes well beyond what can be expressed as a

    12 program. However, for the time being we concentrate on the computable.

    14 If you are a purist functional programmer, please note that all functions

    15 in HOL must be total:

    16 they must terminate for all inputs.  Lazy data structures are not

    17 directly available.

    19 \section{An Introductory Theory}

    20 \label{sec:intro-theory}

    22 Functional programming needs datatypes and functions. Both of them can be

    23 defined in a theory with a syntax reminiscent of languages like ML or

    24 Haskell. As an example consider the theory in figure~\ref{fig:ToyList}.

    25 We will now examine it line by line.

    27 \begin{figure}[htbp]

    28 \begin{ttbox}\makeatother

    29 \input{ToyList2/ToyList1}\end{ttbox}

    30 \caption{A Theory of Lists}

    31 \label{fig:ToyList}

    32 \end{figure}

    34 \index{*ToyList example|(}

    35 {\makeatother\input{ToyList/document/ToyList.tex}}

    37 The complete proof script is shown in Fig.\ts\ref{fig:ToyList-proofs}. The

    38 concatenation of Figs.\ts\ref{fig:ToyList} and~\ref{fig:ToyList-proofs}

    39 constitutes the complete theory \texttt{ToyList} and should reside in file

    40 \texttt{ToyList.thy}.

    41 % It is good practice to present all declarations and

    42 %definitions at the beginning of a theory to facilitate browsing.%

    43 \index{*ToyList example|)}

    45 \begin{figure}[htbp]

    46 \begin{ttbox}\makeatother

    47 \input{ToyList2/ToyList2}\end{ttbox}

    48 \caption{Proofs about Lists}

    49 \label{fig:ToyList-proofs}

    50 \end{figure}

    52 \subsubsection*{Review}

    54 This is the end of our toy proof. It should have familiarized you with

    55 \begin{itemize}

    56 \item the standard theorem proving procedure:

    57 state a goal (lemma or theorem); proceed with proof until a separate lemma is

    58 required; prove that lemma; come back to the original goal.

    59 \item a specific procedure that works well for functional programs:

    60 induction followed by all-out simplification via \isa{auto}.

    61 \item a basic repertoire of proof commands.

    62 \end{itemize}

    64 \begin{warn}

    65 It is tempting to think that all lemmas should have the \isa{simp} attribute

    66 just because this was the case in the example above. However, in that example

    67 all lemmas were equations, and the right-hand side was simpler than the

    68 left-hand side --- an ideal situation for simplification purposes. Unless

    69 this is clearly the case, novices should refrain from awarding a lemma the

    70 \isa{simp} attribute, which has a global effect. Instead, lemmas can be

    71 applied locally where they are needed, which is discussed in the following

    72 chapter.

    73 \end{warn}

    75 \section{Some Helpful Commands}

    76 \label{sec:commands-and-hints}

    78 This section discusses a few basic commands for manipulating the proof state

    79 and can be skipped by casual readers.

    81 There are two kinds of commands used during a proof: the actual proof

    82 commands and auxiliary commands for examining the proof state and controlling

    83 the display. Simple proof commands are of the form

    84 \commdx{apply}(\textit{method}), where \textit{method} is typically

    85 \isa{induct_tac} or \isa{auto}.  All such theorem proving operations

    86 are referred to as \bfindex{methods}, and further ones are

    87 introduced throughout the tutorial.  Unless stated otherwise, you may

    88 assume that a method attacks merely the first subgoal. An exception is

    89 \isa{auto}, which tries to solve all subgoals.

    91 The most useful auxiliary commands are as follows:

    92 \begin{description}

    93 \item[Undoing:] \commdx{undo} undoes the effect of

    94 the

    95   last command; \isacommand{undo} can be undone by

    96   \commdx{redo}.  Both are only needed at the shell

    97   level and should not occur in the final theory.

    98 \item[Printing the current state:] \commdx{pr}

    99 redisplays

   100   the current proof state, for example when it has scrolled past the top of

   101   the screen.

   102 \item[Limiting the number of subgoals:] \isacommand{pr}~$n$ tells Isabelle to

   103   print only the first $n$ subgoals from now on and redisplays the current

   104   proof state. This is helpful when there are many subgoals.

   105 \item[Modifying the order of subgoals:]

   106 \commdx{defer} moves the first subgoal to the end and

   107 \commdx{prefer}~$n$ moves subgoal $n$ to the front.

   108 \item[Printing theorems:]

   109   \commdx{thm}~\textit{name}$@1$~\dots~\textit{name}$@n$

   110   prints the named theorems.

   111 \item[Displaying types:] We have already mentioned the flag

   112   \texttt{show_types} above.\index{*show_types (flag)}

   113   It can also be useful for detecting misspellings in

   114   formulae. For example, if \texttt{show_types} is set and the goal

   115   \isa{rev(rev xs) = xs} is started, Isabelle prints the additional output

   116 \par\noindent

   117 \begin{isabelle}%

   118 variables:\isanewline

   119 ~~xs~::~'a~list

   120 \end{isabelle}%

   121 \par\noindent

   122 which tells us that Isabelle has correctly inferred that

   123 \isa{xs} is a variable of list type. On the other hand, had we

   124 made a typo as in \isa{rev(re xs) = xs}, the response

   125 \par\noindent

   126 \begin{isabelle}%

   127 variables:\isanewline

   128 ~~re~::~'a~list~{\isasymRightarrow}~'a~list\isanewline

   129 ~~xs~::~'a~list%

   130 \end{isabelle}%

   131 \par\noindent

   132 would have alerted us because of the unexpected variable \isa{re}.

   133 \item[Reading terms and types:] \commdx{term}

   134   \textit{string} reads, type-checks and prints the given string as a term in

   135   the current context; the inferred type is output as well.

   136   \commdx{typ} \textit{string} reads and prints the given

   137   string as a type in the current context.

   138 \item[(Re)loading theories:] When you start your interaction you must open a

   139   named theory with the line \commdx{theory}~\isa{T~=~\dots~:}. Isabelle

   140   automatically loads all the required parent theories from their respective

   141   files (which may take a moment, unless the theories are already loaded and

   142   the files have not been modified).

   144   If you suddenly discover that you need to modify a parent theory of your

   145   current theory, you must first abandon your current theory%

   146   \indexbold{abandoning a theory}\indexbold{theories!abandoning}

   147   (at the shell

   148   level type \commdx{kill}). After the parent theory has

   149   been modified, you go back to your original theory. When its first line

   150   \isa{\isacommand{theory}~T~=~\dots~:} is processed, the

   151   modified parent is reloaded automatically.

   153 %  The only time when you need to load a theory by hand is when you simply

   154 %  want to check if it loads successfully without wanting to make use of the

   155 %  theory itself. This you can do by typing

   156 %  \isa{\commdx{use\_thy}~"T"}.

   157 \end{description}

   158 Further commands are found in the Isabelle/Isar Reference

   159 Manual~\cite{isabelle-isar-ref}.

   161 We now examine Isabelle's functional programming constructs systematically,

   162 starting with inductive datatypes.

   165 \section{Datatypes}

   166 \label{sec:datatype}

   168 \index{datatypes|(}%

   169 Inductive datatypes are part of almost every non-trivial application of HOL.

   170 First we take another look at an important example, the datatype of

   171 lists, before we turn to datatypes in general. The section closes with a

   172 case study.

   175 \subsection{Lists}

   177 \index{*list (type)}%

   178 Lists are one of the essential datatypes in computing.  We expect that you

   179 are already familiar with their basic operations.

   180 Theory \isa{ToyList} is only a small fragment of HOL's predefined theory

   181 \thydx{List}\footnote{\url{http://isabelle.in.tum.de/library/HOL/List.html}}.

   182 The latter contains many further operations. For example, the functions

   183 \cdx{hd} (``head'') and \cdx{tl} (``tail'') return the first

   184 element and the remainder of a list. (However, pattern-matching is usually

   185 preferable to \isa{hd} and \isa{tl}.)

   186 Also available are higher-order functions like \isa{map} and \isa{filter}.

   187 Theory \isa{List} also contains

   188 more syntactic sugar: \isa{[}$x@1$\isa{,}\dots\isa{,}$x@n$\isa{]} abbreviates

   189 $x@1$\isa{\#}\dots\isa{\#}$x@n$\isa{\#[]}.  In the rest of the tutorial we

   190 always use HOL's predefined lists by building on theory \isa{Main}.

   193 \subsection{The General Format}

   194 \label{sec:general-datatype}

   196 The general HOL \isacommand{datatype} definition is of the form

   197 \[

   198 \isacommand{datatype}~(\alpha@1, \dots, \alpha@n) \, t ~=~

   199 C@1~\tau@{11}~\dots~\tau@{1k@1} ~\mid~ \dots ~\mid~

   200 C@m~\tau@{m1}~\dots~\tau@{mk@m}

   201 \]

   202 where $\alpha@i$ are distinct type variables (the parameters), $C@i$ are distinct

   203 constructor names and $\tau@{ij}$ are types; it is customary to capitalize

   204 the first letter in constructor names. There are a number of

   205 restrictions (such as that the type should not be empty) detailed

   206 elsewhere~\cite{isabelle-HOL}. Isabelle notifies you if you violate them.

   208 Laws about datatypes, such as \isa{[] \isasymnoteq~x\#xs} and

   209 \isa{(x\#xs = y\#ys) = (x=y \isasymand~xs=ys)}, are used automatically

   210 during proofs by simplification.  The same is true for the equations in

   211 primitive recursive function definitions.

   213 Every\footnote{Except for advanced datatypes where the recursion involves

   214 ``\isasymRightarrow'' as in {\S}\ref{sec:nested-fun-datatype}.} datatype $t$

   215 comes equipped with a \isa{size} function from $t$ into the natural numbers

   216 (see~{\S}\ref{sec:nat} below). For lists, \isa{size} is just the length, i.e.\

   217 \isa{size [] = 0} and \isa{size(x \# xs) = size xs + 1}.  In general,

   218 \cdx{size} returns

   219 \begin{itemize}

   220 \item zero for all constructors

   221 that do not have an argument of type $t$\\

   222 \item one plus the sum of the sizes of all arguments of type~$t$,

   223 for all other constructors

   224 \end{itemize}

   225 Note that because

   226 \isa{size} is defined on every datatype, it is overloaded; on lists

   227 \isa{size} is also called \sdx{length}, which is not overloaded.

   228 Isabelle will always show \isa{size} on lists as \isa{length}.

   231 \subsection{Primitive Recursion}

   233 \index{recursion!primitive}%

   234 Functions on datatypes are usually defined by recursion. In fact, most of the

   235 time they are defined by what is called \textbf{primitive recursion}.

   236 The keyword \commdx{primrec} is followed by a list of

   237 equations

   238 \[ f \, x@1 \, \dots \, (C \, y@1 \, \dots \, y@k)\, \dots \, x@n = r \]

   239 such that $C$ is a constructor of the datatype $t$ and all recursive calls of

   240 $f$ in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$. Thus

   241 Isabelle immediately sees that $f$ terminates because one (fixed!) argument

   242 becomes smaller with every recursive call. There must be at most one equation

   243 for each constructor.  Their order is immaterial.

   244 A more general method for defining total recursive functions is introduced in

   245 {\S}\ref{sec:recdef}.

   247 \begin{exercise}\label{ex:Tree}

   248 \input{Misc/document/Tree.tex}%

   249 \end{exercise}

   251 \input{Misc/document/case_exprs.tex}

   253 \input{Ifexpr/document/Ifexpr.tex}

   254 \index{datatypes|)}

   257 \section{Some Basic Types}

   259 This section introduces the types of natural numbers and ordered pairs.  Also

   260 described is type \isa{option}, which is useful for modelling exceptional

   261 cases.

   263 \subsection{Natural Numbers}

   264 \label{sec:nat}\index{natural numbers}%

   265 \index{linear arithmetic|(}

   267 \input{Misc/document/fakenat.tex}

   268 \input{Misc/document/natsum.tex}

   270 \index{linear arithmetic|)}

   273 \subsection{Pairs}

   274 \input{Misc/document/pairs.tex}

   276 \subsection{Datatype {\tt\slshape option}}

   277 \label{sec:option}

   278 \input{Misc/document/Option2.tex}

   280 \section{Definitions}

   281 \label{sec:Definitions}

   283 A definition is simply an abbreviation, i.e.\ a new name for an existing

   284 construction. In particular, definitions cannot be recursive. Isabelle offers

   285 definitions on the level of types and terms. Those on the type level are

   286 called \textbf{type synonyms}; those on the term level are simply called

   287 definitions.

   290 \subsection{Type Synonyms}

   292 \index{type synonyms}%

   293 Type synonyms are similar to those found in ML\@. They are created by a

   294 \commdx{types} command:

   296 \input{Misc/document/types.tex}

   298 \input{Misc/document/prime_def.tex}

   301 \section{The Definitional Approach}

   302 \label{sec:definitional}

   304 \index{Definitional Approach}%

   305 As we pointed out at the beginning of the chapter, asserting arbitrary

   306 axioms such as $f(n) = f(n) + 1$ can easily lead to contradictions. In order

   307 to avoid this danger, we advocate the definitional rather than

   308 the axiomatic approach: introduce new concepts by definitions. However,  Isabelle/HOL seems to

   309 support many richer definitional constructs, such as

   310 \isacommand{primrec}. The point is that Isabelle reduces such constructs to first principles. For example, each

   311 \isacommand{primrec} function definition is turned into a proper

   312 (nonrecursive!) definition from which the user-supplied recursion equations are

   313 automatically proved.  This process is

   314 hidden from the user, who does not have to understand the details.  Other commands described

   315 later, like \isacommand{recdef} and \isacommand{inductive}, work similarly.

   316 This strict adherence to the definitional approach reduces the risk of

   317 soundness errors.

   319 \chapter{More Functional Programming}

   321 The purpose of this chapter is to deepen your understanding of the

   322 concepts encountered so far and to introduce advanced forms of datatypes and

   323 recursive functions. The first two sections give a structured presentation of

   324 theorem proving by simplification ({\S}\ref{sec:Simplification}) and discuss

   325 important heuristics for induction ({\S}\ref{sec:InductionHeuristics}).  You can

   326 skip them if you are not planning to perform proofs yourself.

   327 We then present a case

   328 study: a compiler for expressions ({\S}\ref{sec:ExprCompiler}). Advanced

   329 datatypes, including those involving function spaces, are covered in

   330 {\S}\ref{sec:advanced-datatypes}; it closes with another case study, search

   331 trees (``tries'').  Finally we introduce \isacommand{recdef}, a general

   332 form of recursive function definition that goes well beyond

   333 \isacommand{primrec} ({\S}\ref{sec:recdef}).

   336 \section{Simplification}

   337 \label{sec:Simplification}

   338 \index{simplification|(}

   340 So far we have proved our theorems by \isa{auto}, which simplifies

   341 all subgoals. In fact, \isa{auto} can do much more than that.

   342 To go beyond toy examples, you

   343 need to understand the ingredients of \isa{auto}.  This section covers the

   344 method that \isa{auto} always applies first, simplification.

   346 Simplification is one of the central theorem proving tools in Isabelle and

   347 many other systems. The tool itself is called the \textbf{simplifier}.

   348 This section introduces the many features of the simplifier

   349 and is required reading if you intend to perform proofs.  Later on,

   350 {\S}\ref{sec:simplification-II} explains some more advanced features and a

   351 little bit of how the simplifier works. The serious student should read that

   352 section as well, in particular to understand why the simplifier did

   353 something unexpected.

   355 \subsection{What is Simplification?}

   357 In its most basic form, simplification means repeated application of

   358 equations from left to right. For example, taking the rules for \isa{\at}

   359 and applying them to the term \isa{[0,1] \at\ []} results in a sequence of

   360 simplification steps:

   361 \begin{ttbox}\makeatother

   362 (0#1#[]) @ []  \(\leadsto\)  0#((1#[]) @ [])  \(\leadsto\)  0#(1#([] @ []))  \(\leadsto\)  0#1#[]

   363 \end{ttbox}

   364 This is also known as \bfindex{term rewriting}\indexbold{rewriting} and the

   365 equations are referred to as \bfindex{rewrite rules}.

   366 ``Rewriting'' is more honest than ``simplification'' because the terms do not

   367 necessarily become simpler in the process.

   369 The simplifier proves arithmetic goals as described in

   370 {\S}\ref{sec:nat} above.  Arithmetic expressions are simplified using built-in

   371 procedures that go beyond mere rewrite rules.  New simplification procedures

   372 can be coded and installed, but they are definitely not a matter for this

   373 tutorial.

   375 \input{Misc/document/simp.tex}

   377 \index{simplification|)}

   379 \input{Misc/document/Itrev.tex}

   381 \begin{exercise}

   382 \input{Misc/document/Tree2.tex}%

   383 \end{exercise}

   385 \input{CodeGen/document/CodeGen.tex}

   388 \section{Advanced Datatypes}

   389 \label{sec:advanced-datatypes}

   390 \index{datatype@\isacommand {datatype} (command)|(}

   391 \index{primrec@\isacommand {primrec} (command)|(}

   392 %|)

   394 This section presents advanced forms of datatypes: mutual and nested

   395 recursion.  A series of examples will culminate in a treatment of the trie

   396 data structure.

   399 \subsection{Mutual Recursion}

   400 \label{sec:datatype-mut-rec}

   402 \input{Datatype/document/ABexpr.tex}

   404 \subsection{Nested Recursion}

   405 \label{sec:nested-datatype}

   407 {\makeatother\input{Datatype/document/Nested.tex}}

   410 \subsection{The Limits of Nested Recursion}

   411 \label{sec:nested-fun-datatype}

   413 How far can we push nested recursion? By the unfolding argument above, we can

   414 reduce nested to mutual recursion provided the nested recursion only involves

   415 previously defined datatypes. This does not include functions:

   416 \begin{isabelle}

   417 \isacommand{datatype} t = C "t \isasymRightarrow\ bool"

   418 \end{isabelle}

   419 This declaration is a real can of worms.

   420 In HOL it must be ruled out because it requires a type

   421 \isa{t} such that \isa{t} and its power set \isa{t \isasymFun\ bool} have the

   422 same cardinality --- an impossibility. For the same reason it is not possible

   423 to allow recursion involving the type \isa{t set}, which is isomorphic to

   424 \isa{t \isasymFun\ bool}.

   426 Fortunately, a limited form of recursion

   427 involving function spaces is permitted: the recursive type may occur on the

   428 right of a function arrow, but never on the left. Hence the above can of worms

   429 is ruled out but the following example of a potentially

   430 \index{infinitely branching trees}%

   431 infinitely branching tree is accepted:

   432 \smallskip

   434 \input{Datatype/document/Fundata.tex}

   436 If you need nested recursion on the left of a function arrow, there are

   437 alternatives to pure HOL\@.  In the Logic for Computable Functions

   438 (\rmindex{LCF}), types like

   439 \begin{isabelle}

   440 \isacommand{datatype} lam = C "lam \isasymrightarrow\ lam"

   441 \end{isabelle}

   442 do indeed make sense~\cite{paulson87}.  Note the different arrow,

   443 \isa{\isasymrightarrow} instead of \isa{\isasymRightarrow},

   444 expressing the type of \emph{continuous} functions.

   445 There is even a version of LCF on top of HOL,

   446 called \rmindex{HOLCF}~\cite{MuellerNvOS99}.

   447 \index{datatype@\isacommand {datatype} (command)|)}

   448 \index{primrec@\isacommand {primrec} (command)|)}

   451 \subsection{Case Study: Tries}

   452 \label{sec:Trie}

   454 \index{tries|(}%

   455 Tries are a classic search tree data structure~\cite{Knuth3-75} for fast

   456 indexing with strings. Figure~\ref{fig:trie} gives a graphical example of a

   457 trie containing the words ``all'', ``an'', ``ape'', ``can'', ``car'' and

   458 ``cat''.  When searching a string in a trie, the letters of the string are

   459 examined sequentially. Each letter determines which subtrie to search next.

   460 In this case study we model tries as a datatype, define a lookup and an

   461 update function, and prove that they behave as expected.

   463 \begin{figure}[htbp]

   464 \begin{center}

   465 \unitlength1mm

   466 \begin{picture}(60,30)

   467 \put( 5, 0){\makebox(0,0)[b]{l}}

   468 \put(25, 0){\makebox(0,0)[b]{e}}

   469 \put(35, 0){\makebox(0,0)[b]{n}}

   470 \put(45, 0){\makebox(0,0)[b]{r}}

   471 \put(55, 0){\makebox(0,0)[b]{t}}

   472 %

   473 \put( 5, 9){\line(0,-1){5}}

   474 \put(25, 9){\line(0,-1){5}}

   475 \put(44, 9){\line(-3,-2){9}}

   476 \put(45, 9){\line(0,-1){5}}

   477 \put(46, 9){\line(3,-2){9}}

   478 %

   479 \put( 5,10){\makebox(0,0)[b]{l}}

   480 \put(15,10){\makebox(0,0)[b]{n}}

   481 \put(25,10){\makebox(0,0)[b]{p}}

   482 \put(45,10){\makebox(0,0)[b]{a}}

   483 %

   484 \put(14,19){\line(-3,-2){9}}

   485 \put(15,19){\line(0,-1){5}}

   486 \put(16,19){\line(3,-2){9}}

   487 \put(45,19){\line(0,-1){5}}

   488 %

   489 \put(15,20){\makebox(0,0)[b]{a}}

   490 \put(45,20){\makebox(0,0)[b]{c}}

   491 %

   492 \put(30,30){\line(-3,-2){13}}

   493 \put(30,30){\line(3,-2){13}}

   494 \end{picture}

   495 \end{center}

   496 \caption{A Sample Trie}

   497 \label{fig:trie}

   498 \end{figure}

   500 Proper tries associate some value with each string. Since the

   501 information is stored only in the final node associated with the string, many

   502 nodes do not carry any value. This distinction is modeled with the help

   503 of the predefined datatype \isa{option} (see {\S}\ref{sec:option}).

   504 \input{Trie/document/Trie.tex}

   505 \index{tries|)}

   507 \section{Total Recursive Functions}

   508 \label{sec:recdef}

   509 \index{recdef@\isacommand {recdef} (command)|(}\index{functions!total|(}

   511 Although many total functions have a natural primitive recursive definition,

   512 this is not always the case. Arbitrary total recursive functions can be

   513 defined by means of \isacommand{recdef}: you can use full pattern-matching,

   514 recursion need not involve datatypes, and termination is proved by showing

   515 that the arguments of all recursive calls are smaller in a suitable (user

   516 supplied) sense. In this section we restrict ourselves to measure functions;

   517 more advanced termination proofs are discussed in {\S}\ref{sec:beyond-measure}.

   519 \subsection{Defining Recursive Functions}

   520 \label{sec:recdef-examples}

   521 \input{Recdef/document/examples.tex}

   523 \subsection{Proving Termination}

   524 \input{Recdef/document/termination.tex}

   526 \subsection{Simplification and Recursive Functions}

   527 \label{sec:recdef-simplification}

   528 \input{Recdef/document/simplification.tex}

   530 \subsection{Induction and Recursive Functions}

   531 \index{induction!recursion|(}

   532 \index{recursion induction|(}

   534 \input{Recdef/document/Induction.tex}

   535 \label{sec:recdef-induction}

   537 \index{induction!recursion|)}

   538 \index{recursion induction|)}

   539 \index{recdef@\isacommand {recdef} (command)|)}\index{functions!total|)}

author	wenzelm
	Fri, 21 Dec 2001 19:55:39 +0100
changeset 12582	b85acd66f715
parent 12473	f41e477576b9
child 13305	f88d0c363582
permissions	-rw-r--r--