wneuper/isa: doc-src/TutorialI/fp.tex@8e4307d1207a

     1 \chapter{Functional Programming in HOL}

     3 Although on the surface this chapter is mainly concerned with how to write

     4 functional programs in HOL and how to verify them, most of the

     5 constructs and proof procedures introduced are general purpose and recur in

     6 any specification or verification task.

     8 The dedicated functional programmer should be warned: HOL offers only

     9 \emph{total functional programming} --- all functions in HOL must be total;

    10 lazy data structures are not directly available. On the positive side,

    11 functions in HOL need not be computable: HOL is a specification language that

    12 goes well beyond what can be expressed as a program. However, for the time

    13 being we concentrate on the computable.

    15 \section{An introductory theory}

    16 \label{sec:intro-theory}

    18 Functional programming needs datatypes and functions. Both of them can be

    19 defined in a theory with a syntax reminiscent of languages like ML or

    20 Haskell. As an example consider the theory in figure~\ref{fig:ToyList}.

    21 We will now examine it line by line.

    23 \begin{figure}[htbp]

    24 \begin{ttbox}\makeatother

    25 \input{ToyList2/ToyList1}\end{ttbox}

    26 \caption{A theory of lists}

    27 \label{fig:ToyList}

    28 \end{figure}

    30 {\makeatother\input{ToyList/document/ToyList.tex}}

    32 The complete proof script is shown in figure~\ref{fig:ToyList-proofs}. The

    33 concatenation of figures \ref{fig:ToyList} and \ref{fig:ToyList-proofs}

    34 constitutes the complete theory \texttt{ToyList} and should reside in file

    35 \texttt{ToyList.thy}. It is good practice to present all declarations and

    36 definitions at the beginning of a theory to facilitate browsing.

    38 \begin{figure}[htbp]

    39 \begin{ttbox}\makeatother

    40 \input{ToyList2/ToyList2}\end{ttbox}

    41 \caption{Proofs about lists}

    42 \label{fig:ToyList-proofs}

    43 \end{figure}

    45 \subsubsection*{Review}

    47 This is the end of our toy proof. It should have familiarized you with

    48 \begin{itemize}

    49 \item the standard theorem proving procedure:

    50 state a goal (lemma or theorem); proceed with proof until a separate lemma is

    51 required; prove that lemma; come back to the original goal.

    52 \item a specific procedure that works well for functional programs:

    53 induction followed by all-out simplification via \isa{auto}.

    54 \item a basic repertoire of proof commands.

    55 \end{itemize}

    58 \section{Some helpful commands}

    59 \label{sec:commands-and-hints}

    61 This section discusses a few basic commands for manipulating the proof state

    62 and can be skipped by casual readers.

    64 There are two kinds of commands used during a proof: the actual proof

    65 commands and auxiliary commands for examining the proof state and controlling

    66 the display. Simple proof commands are of the form

    67 \isacommand{apply}\isa{(method)}\indexbold{apply} where \bfindex{method} is a

    68 synonym for ``theorem proving function''. Typical examples are

    69 \isa{induct_tac} and \isa{auto}. Further methods are introduced throughout

    70 the tutorial.  Unless stated otherwise you may assume that a method attacks

    71 merely the first subgoal. An exception is \isa{auto} which tries to solve all

    72 subgoals.

    74 The most useful auxiliary commands are:

    75 \begin{description}

    76 \item[Undoing:] \isacommand{undo}\indexbold{*undo} undoes the effect of the

    77   last command; \isacommand{undo} can be undone by

    78   \isacommand{redo}\indexbold{*redo}.  Both are only needed at the shell

    79   level and should not occur in the final theory.

    80 \item[Printing the current state:] \isacommand{pr}\indexbold{*pr} redisplays

    81   the current proof state, for example when it has disappeared off the

    82   screen.

    83 \item[Limiting the number of subgoals:] \isacommand{pr}~$n$ tells Isabelle to

    84   print only the first $n$ subgoals from now on and redisplays the current

    85   proof state. This is helpful when there are many subgoals.

    86 \item[Modifying the order of subgoals:]

    87 \isacommand{defer}\indexbold{*defer} moves the first subgoal to the end and

    88 \isacommand{prefer}\indexbold{*prefer}~$n$ moves subgoal $n$ to the front.

    89 \item[Printing theorems:]

    90   \isacommand{thm}\indexbold{*thm}~\textit{name}$@1$~\dots~\textit{name}$@n$

    91   prints the named theorems.

    92 \item[Displaying types:] We have already mentioned the flag

    93   \ttindex{show_types} above. It can also be useful for detecting typos in

    94   formulae early on. For example, if \texttt{show_types} is set and the goal

    95   \isa{rev(rev xs) = xs} is started, Isabelle prints the additional output

    96 \par\noindent

    97 \begin{isabelle}%

    98 Variables:\isanewline

    99 ~~xs~::~'a~list

   100 \end{isabelle}%

   101 \par\noindent

   102 which tells us that Isabelle has correctly inferred that

   103 \isa{xs} is a variable of list type. On the other hand, had we

   104 made a typo as in \isa{rev(re xs) = xs}, the response

   105 \par\noindent

   106 \begin{isabelle}%

   107 Variables:\isanewline

   108 ~~re~::~'a~list~{\isasymRightarrow}~'a~list\isanewline

   109 ~~xs~::~'a~list%

   110 \end{isabelle}%

   111 \par\noindent

   112 would have alerted us because of the unexpected variable \isa{re}.

   113 \item[Reading terms and types:] \isacommand{term}\indexbold{*term}

   114   \textit{string} reads, type-checks and prints the given string as a term in

   115   the current context; the inferred type is output as well.

   116   \isacommand{typ}\indexbold{*typ} \textit{string} reads and prints the given

   117   string as a type in the current context.

   118 \item[(Re)loading theories:] When you start your interaction you must open a

   119   named theory with the line \isa{\isacommand{theory}~T~=~\dots~:}. Isabelle

   120   automatically loads all the required parent theories from their respective

   121   files (which may take a moment, unless the theories are already loaded and

   122   the files have not been modified).

   124   If you suddenly discover that you need to modify a parent theory of your

   125   current theory you must first abandon your current theory\indexbold{abandon

   126   theory}\indexbold{theory!abandon} (at the shell

   127   level type \isacommand{kill}\indexbold{*kill}). After the parent theory has

   128   been modified you go back to your original theory. When its first line

   129   \isacommand{theory}\texttt{~T~=}~\dots~\texttt{:} is processed, the

   130   modified parent is reloaded automatically.

   132   The only time when you need to load a theory by hand is when you simply

   133   want to check if it loads successfully without wanting to make use of the

   134   theory itself. This you can do by typing

   135   \isa{\isacommand{use\_thy}\indexbold{*use_thy}~"T"}.

   136 \end{description}

   137 Further commands are found in the Isabelle/Isar Reference Manual.

   139 We now examine Isabelle's functional programming constructs systematically,

   140 starting with inductive datatypes.

   143 \section{Datatypes}

   144 \label{sec:datatype}

   146 Inductive datatypes are part of almost every non-trivial application of HOL.

   147 First we take another look at a very important example, the datatype of

   148 lists, before we turn to datatypes in general. The section closes with a

   149 case study.

   152 \subsection{Lists}

   153 \indexbold{*list}

   155 Lists are one of the essential datatypes in computing. Readers of this

   156 tutorial and users of HOL need to be familiar with their basic operations.

   157 Theory \isa{ToyList} is only a small fragment of HOL's predefined theory

   158 \isa{List}\footnote{\url{http://isabelle.in.tum.de/library/HOL/List.html}}.

   159 The latter contains many further operations. For example, the functions

   160 \isaindexbold{hd} (``head'') and \isaindexbold{tl} (``tail'') return the first

   161 element and the remainder of a list. (However, pattern-matching is usually

   162 preferable to \isa{hd} and \isa{tl}.)  Theory \isa{List} also contains

   163 more syntactic sugar: \isa{[}$x@1$\isa{,}\dots\isa{,}$x@n$\isa{]} abbreviates

   164 $x@1$\isa{\#}\dots\isa{\#}$x@n$\isa{\#[]}.  In the rest of the tutorial we

   165 always use HOL's predefined lists.

   168 \subsection{The general format}

   169 \label{sec:general-datatype}

   171 The general HOL \isacommand{datatype} definition is of the form

   172 \[

   173 \isacommand{datatype}~(\alpha@1, \dots, \alpha@n) \, t ~=~

   174 C@1~\tau@{11}~\dots~\tau@{1k@1} ~\mid~ \dots ~\mid~

   175 C@m~\tau@{m1}~\dots~\tau@{mk@m}

   176 \]

   177 where $\alpha@i$ are distinct type variables (the parameters), $C@i$ are distinct

   178 constructor names and $\tau@{ij}$ are types; it is customary to capitalize

   179 the first letter in constructor names. There are a number of

   180 restrictions (such as that the type should not be empty) detailed

   181 elsewhere~\cite{isabelle-HOL}. Isabelle notifies you if you violate them.

   183 Laws about datatypes, such as \isa{[] \isasymnoteq~x\#xs} and

   184 \isa{(x\#xs = y\#ys) = (x=y \isasymand~xs=ys)}, are used automatically

   185 during proofs by simplification.  The same is true for the equations in

   186 primitive recursive function definitions.

   188 Every datatype $t$ comes equipped with a \isa{size} function from $t$ into

   189 the natural numbers (see~{\S}\ref{sec:nat} below). For lists, \isa{size} is

   190 just the length, i.e.\ \isa{size [] = 0} and \isa{size(x \# xs) = size xs +

   191   1}.  In general, \isaindexbold{size} returns \isa{0} for all constructors

   192 that do not have an argument of type $t$, and for all other constructors

   193 \isa{1 +} the sum of the sizes of all arguments of type $t$. Note that because

   194 \isa{size} is defined on every datatype, it is overloaded; on lists

   195 \isa{size} is also called \isaindexbold{length}, which is not overloaded.

   196 Isbelle will always show \isa{size} on lists as \isa{length}.

   199 \subsection{Primitive recursion}

   201 Functions on datatypes are usually defined by recursion. In fact, most of the

   202 time they are defined by what is called \bfindex{primitive recursion}.

   203 The keyword \isacommand{primrec}\indexbold{*primrec} is followed by a list of

   204 equations

   205 \[ f \, x@1 \, \dots \, (C \, y@1 \, \dots \, y@k)\, \dots \, x@n = r \]

   206 such that $C$ is a constructor of the datatype $t$ and all recursive calls of

   207 $f$ in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$. Thus

   208 Isabelle immediately sees that $f$ terminates because one (fixed!) argument

   209 becomes smaller with every recursive call. There must be exactly one equation

   210 for each constructor.  Their order is immaterial.

   211 A more general method for defining total recursive functions is introduced in

   212 {\S}\ref{sec:recdef}.

   214 \begin{exercise}\label{ex:Tree}

   215 \input{Misc/document/Tree.tex}%

   216 \end{exercise}

   218 \input{Misc/document/case_exprs.tex}

   220 \begin{warn}

   221   Induction is only allowed on free (or \isasymAnd-bound) variables that

   222   should not occur among the assumptions of the subgoal; see

   223   {\S}\ref{sec:ind-var-in-prems} for details. Case distinction

   224   (\isa{case_tac}) works for arbitrary terms, which need to be

   225   quoted if they are non-atomic.

   226 \end{warn}

   229 \input{Ifexpr/document/Ifexpr.tex}

   231 \section{Some basic types}

   234 \subsection{Natural numbers}

   235 \label{sec:nat}

   236 \index{arithmetic|(}

   238 \input{Misc/document/fakenat.tex}

   239 \input{Misc/document/natsum.tex}

   241 \index{arithmetic|)}

   244 \subsection{Pairs}

   245 \input{Misc/document/pairs.tex}

   247 \subsection{Datatype \emph{\texttt{option}}}

   248 \label{sec:option}

   249 \input{Misc/document/Option2.tex}

   251 \section{Definitions}

   252 \label{sec:Definitions}

   254 A definition is simply an abbreviation, i.e.\ a new name for an existing

   255 construction. In particular, definitions cannot be recursive. Isabelle offers

   256 definitions on the level of types and terms. Those on the type level are

   257 called type synonyms, those on the term level are called (constant)

   258 definitions.

   261 \subsection{Type synonyms}

   262 \indexbold{type synonym}

   264 Type synonyms are similar to those found in ML. Their syntax is fairly self

   265 explanatory:

   267 \input{Misc/document/types.tex}%

   269 Note that pattern-matching is not allowed, i.e.\ each definition must be of

   270 the form $f\,x@1\,\dots\,x@n~\isasymequiv~t$.

   271 Section~{\S}\ref{sec:Simplification} explains how definitions are used

   272 in proofs.

   274 \input{Misc/document/prime_def.tex}

   277 \chapter{More Functional Programming}

   279 The purpose of this chapter is to deepen the reader's understanding of the

   280 concepts encountered so far and to introduce advanced forms of datatypes and

   281 recursive functions. The first two sections give a structured presentation of

   282 theorem proving by simplification ({\S}\ref{sec:Simplification}) and discuss

   283 important heuristics for induction ({\S}\ref{sec:InductionHeuristics}). They can

   284 be skipped by readers less interested in proofs. They are followed by a case

   285 study, a compiler for expressions ({\S}\ref{sec:ExprCompiler}). Advanced

   286 datatypes, including those involving function spaces, are covered in

   287 {\S}\ref{sec:advanced-datatypes}, which closes with another case study, search

   288 trees (``tries'').  Finally we introduce \isacommand{recdef}, a very general

   289 form of recursive function definition which goes well beyond what

   290 \isacommand{primrec} allows ({\S}\ref{sec:recdef}).

   293 \section{Simplification}

   294 \label{sec:Simplification}

   295 \index{simplification|(}

   297 So far we have proved our theorems by \isa{auto}, which ``simplifies''

   298 \emph{all} subgoals. In fact, \isa{auto} can do much more than that, except

   299 that it did not need to so far. However, when you go beyond toy examples, you

   300 need to understand the ingredients of \isa{auto}.  This section covers the

   301 method that \isa{auto} always applies first, namely simplification.

   303 Simplification is one of the central theorem proving tools in Isabelle and

   304 many other systems. The tool itself is called the \bfindex{simplifier}. The

   305 purpose of this section is introduce the many features of the simplifier.

   306 Anybody intending to use HOL should read this section. Later on

   307 ({\S}\ref{sec:simplification-II}) we explain some more advanced features and a

   308 little bit of how the simplifier works. The serious student should read that

   309 section as well, in particular in order to understand what happened if things

   310 do not simplify as expected.

   312 \subsubsection{What is simplification}

   314 In its most basic form, simplification means repeated application of

   315 equations from left to right. For example, taking the rules for \isa{\at}

   316 and applying them to the term \isa{[0,1] \at\ []} results in a sequence of

   317 simplification steps:

   318 \begin{ttbox}\makeatother

   319 (0#1#[]) @ []  \(\leadsto\)  0#((1#[]) @ [])  \(\leadsto\)  0#(1#([] @ []))  \(\leadsto\)  0#1#[]

   320 \end{ttbox}

   321 This is also known as \bfindex{term rewriting}\indexbold{rewriting} and the

   322 equations are referred to as \textbf{rewrite rules}\indexbold{rewrite rule}.

   323 ``Rewriting'' is more honest than ``simplification'' because the terms do not

   324 necessarily become simpler in the process.

   326 \input{Misc/document/simp.tex}

   328 \index{simplification|)}

   330 \input{Misc/document/Itrev.tex}

   332 \begin{exercise}

   333 \input{Misc/document/Tree2.tex}%

   334 \end{exercise}

   336 \input{CodeGen/document/CodeGen.tex}

   339 \section{Advanced datatypes}

   340 \label{sec:advanced-datatypes}

   341 \index{*datatype|(}

   342 \index{*primrec|(}

   343 %|)

   345 This section presents advanced forms of \isacommand{datatype}s.

   347 \subsection{Mutual recursion}

   348 \label{sec:datatype-mut-rec}

   350 \input{Datatype/document/ABexpr.tex}

   352 \subsection{Nested recursion}

   353 \label{sec:nested-datatype}

   355 {\makeatother\input{Datatype/document/Nested.tex}}

   358 \subsection{The limits of nested recursion}

   360 How far can we push nested recursion? By the unfolding argument above, we can

   361 reduce nested to mutual recursion provided the nested recursion only involves

   362 previously defined datatypes. This does not include functions:

   363 \begin{isabelle}

   364 \isacommand{datatype} t = C "t \isasymRightarrow\ bool"

   365 \end{isabelle}

   366 is a real can of worms: in HOL it must be ruled out because it requires a type

   367 \isa{t} such that \isa{t} and its power set \isa{t \isasymFun\ bool} have the

   368 same cardinality---an impossibility. For the same reason it is not possible

   369 to allow recursion involving the type \isa{set}, which is isomorphic to

   370 \isa{t \isasymFun\ bool}.

   372 Fortunately, a limited form of recursion

   373 involving function spaces is permitted: the recursive type may occur on the

   374 right of a function arrow, but never on the left. Hence the above can of worms

   375 is ruled out but the following example of a potentially infinitely branching tree is

   376 accepted:

   377 \smallskip

   379 \input{Datatype/document/Fundata.tex}

   380 \bigskip

   382 If you need nested recursion on the left of a function arrow, there are

   383 alternatives to pure HOL: LCF~\cite{paulson87} is a logic where types like

   384 \begin{isabelle}

   385 \isacommand{datatype} lam = C "lam \isasymrightarrow\ lam"

   386 \end{isabelle}

   387 do indeed make sense (but note the intentionally different arrow

   388 \isa{\isasymrightarrow}). There is even a version of LCF on top of HOL,

   389 called HOLCF~\cite{MuellerNvOS99}.

   391 \index{*primrec|)}

   392 \index{*datatype|)}

   394 \subsection{Case study: Tries}

   395 \label{sec:Trie}

   397 Tries are a classic search tree data structure~\cite{Knuth3-75} for fast

   398 indexing with strings. Figure~\ref{fig:trie} gives a graphical example of a

   399 trie containing the words ``all'', ``an'', ``ape'', ``can'', ``car'' and

   400 ``cat''.  When searching a string in a trie, the letters of the string are

   401 examined sequentially. Each letter determines which subtrie to search next.

   402 In this case study we model tries as a datatype, define a lookup and an

   403 update function, and prove that they behave as expected.

   405 \begin{figure}[htbp]

   406 \begin{center}

   407 \unitlength1mm

   408 \begin{picture}(60,30)

   409 \put( 5, 0){\makebox(0,0)[b]{l}}

   410 \put(25, 0){\makebox(0,0)[b]{e}}

   411 \put(35, 0){\makebox(0,0)[b]{n}}

   412 \put(45, 0){\makebox(0,0)[b]{r}}

   413 \put(55, 0){\makebox(0,0)[b]{t}}

   414 %

   415 \put( 5, 9){\line(0,-1){5}}

   416 \put(25, 9){\line(0,-1){5}}

   417 \put(44, 9){\line(-3,-2){9}}

   418 \put(45, 9){\line(0,-1){5}}

   419 \put(46, 9){\line(3,-2){9}}

   420 %

   421 \put( 5,10){\makebox(0,0)[b]{l}}

   422 \put(15,10){\makebox(0,0)[b]{n}}

   423 \put(25,10){\makebox(0,0)[b]{p}}

   424 \put(45,10){\makebox(0,0)[b]{a}}

   425 %

   426 \put(14,19){\line(-3,-2){9}}

   427 \put(15,19){\line(0,-1){5}}

   428 \put(16,19){\line(3,-2){9}}

   429 \put(45,19){\line(0,-1){5}}

   430 %

   431 \put(15,20){\makebox(0,0)[b]{a}}

   432 \put(45,20){\makebox(0,0)[b]{c}}

   433 %

   434 \put(30,30){\line(-3,-2){13}}

   435 \put(30,30){\line(3,-2){13}}

   436 \end{picture}

   437 \end{center}

   438 \caption{A sample trie}

   439 \label{fig:trie}

   440 \end{figure}

   442 Proper tries associate some value with each string. Since the

   443 information is stored only in the final node associated with the string, many

   444 nodes do not carry any value. This distinction is modeled with the help

   445 of the predefined datatype \isa{option} (see {\S}\ref{sec:option}).

   446 \input{Trie/document/Trie.tex}

   448 \begin{exercise}

   449   Write an improved version of \isa{update} that does not suffer from the

   450   space leak in the version above. Prove the main theorem for your improved

   451   \isa{update}.

   452 \end{exercise}

   454 \begin{exercise}

   455   Write a function to \emph{delete} entries from a trie. An easy solution is

   456   a clever modification of \isa{update} which allows both insertion and

   457   deletion with a single function.  Prove (a modified version of) the main

   458   theorem above. Optimize you function such that it shrinks tries after

   459   deletion, if possible.

   460 \end{exercise}

   462 \section{Total recursive functions}

   463 \label{sec:recdef}

   464 \index{*recdef|(}

   466 Although many total functions have a natural primitive recursive definition,

   467 this is not always the case. Arbitrary total recursive functions can be

   468 defined by means of \isacommand{recdef}: you can use full pattern-matching,

   469 recursion need not involve datatypes, and termination is proved by showing

   470 that the arguments of all recursive calls are smaller in a suitable (user

   471 supplied) sense. In this section we ristrict ourselves to measure functions;

   472 more advanced termination proofs are discussed in {\S}\ref{sec:beyond-measure}.

   474 \subsection{Defining recursive functions}

   476 \input{Recdef/document/examples.tex}

   478 \subsection{Proving termination}

   480 \input{Recdef/document/termination.tex}

   482 \subsection{Simplification with recdef}

   483 \label{sec:recdef-simplification}

   485 \input{Recdef/document/simplification.tex}

   487 \subsection{Induction}

   488 \index{induction!recursion|(}

   489 \index{recursion induction|(}

   491 \input{Recdef/document/Induction.tex}

   492 \label{sec:recdef-induction}

   494 \index{induction!recursion|)}

   495 \index{recursion induction|)}

   496 \index{*recdef|)}

author	nipkow
	Thu, 30 Nov 2000 13:56:46 +0100
changeset 10543	8e4307d1207a
parent 10538	d1bf9ca9008d
child 10608	620647438780
permissions	-rw-r--r--