wneuper/isa: doc-src/TutorialI/fp.tex@07b13770c4d6

     1 \chapter{Functional Programming in HOL}

     3 Although on the surface this chapter is mainly concerned with how to write

     4 functional programs in HOL and how to verify them, most of the constructs and

     5 proof procedures introduced are general purpose and recur in any specification

     6 or verification task. In fact, it we should really speak of functional

     7 \emph{modelling} rather than \emph{programming} because our aim is not

     8 primarily to write programs but to design abstract models of systems.  HOL is

     9 a specification language that goes well beyond what can be expressed as a

    10 program. However, for the time being we concentrate on the computable.

    12 The dedicated functional programmer should be warned: HOL offers only

    13 \emph{total functional programming} --- all functions in HOL must be total,

    14 i.e.\ they must terminate for all inputs; lazy data structures are not

    15 directly available.

    17 \section{An Introductory Theory}

    18 \label{sec:intro-theory}

    20 Functional programming needs datatypes and functions. Both of them can be

    21 defined in a theory with a syntax reminiscent of languages like ML or

    22 Haskell. As an example consider the theory in figure~\ref{fig:ToyList}.

    23 We will now examine it line by line.

    25 \begin{figure}[htbp]

    26 \begin{ttbox}\makeatother

    27 \input{ToyList2/ToyList1}\end{ttbox}

    28 \caption{A theory of lists}

    29 \label{fig:ToyList}

    30 \end{figure}

    32 {\makeatother\input{ToyList/document/ToyList.tex}}

    34 The complete proof script is shown in Fig.\ts\ref{fig:ToyList-proofs}. The

    35 concatenation of Figs.\ts\ref{fig:ToyList} and~\ref{fig:ToyList-proofs}

    36 constitutes the complete theory \texttt{ToyList} and should reside in file

    37 \texttt{ToyList.thy}. It is good practice to present all declarations and

    38 definitions at the beginning of a theory to facilitate browsing.

    40 \begin{figure}[htbp]

    41 \begin{ttbox}\makeatother

    42 \input{ToyList2/ToyList2}\end{ttbox}

    43 \caption{Proofs about lists}

    44 \label{fig:ToyList-proofs}

    45 \end{figure}

    47 \subsubsection*{Review}

    49 This is the end of our toy proof. It should have familiarized you with

    50 \begin{itemize}

    51 \item the standard theorem proving procedure:

    52 state a goal (lemma or theorem); proceed with proof until a separate lemma is

    53 required; prove that lemma; come back to the original goal.

    54 \item a specific procedure that works well for functional programs:

    55 induction followed by all-out simplification via \isa{auto}.

    56 \item a basic repertoire of proof commands.

    57 \end{itemize}

    60 \section{Some Helpful Commands}

    61 \label{sec:commands-and-hints}

    63 This section discusses a few basic commands for manipulating the proof state

    64 and can be skipped by casual readers.

    66 There are two kinds of commands used during a proof: the actual proof

    67 commands and auxiliary commands for examining the proof state and controlling

    68 the display. Simple proof commands are of the form

    69 \isacommand{apply}\isa{(method)}\indexbold{apply} where \bfindex{method} is a

    70 synonym for ``theorem proving function''. Typical examples are

    71 \isa{induct_tac} and \isa{auto}. Further methods are introduced throughout

    72 the tutorial.  Unless stated otherwise you may assume that a method attacks

    73 merely the first subgoal. An exception is \isa{auto} which tries to solve all

    74 subgoals.

    76 The most useful auxiliary commands are:

    77 \begin{description}

    78 \item[Undoing:] \isacommand{undo}\indexbold{*undo} undoes the effect of the

    79   last command; \isacommand{undo} can be undone by

    80   \isacommand{redo}\indexbold{*redo}.  Both are only needed at the shell

    81   level and should not occur in the final theory.

    82 \item[Printing the current state:] \isacommand{pr}\indexbold{*pr} redisplays

    83   the current proof state, for example when it has disappeared off the

    84   screen.

    85 \item[Limiting the number of subgoals:] \isacommand{pr}~$n$ tells Isabelle to

    86   print only the first $n$ subgoals from now on and redisplays the current

    87   proof state. This is helpful when there are many subgoals.

    88 \item[Modifying the order of subgoals:]

    89 \isacommand{defer}\indexbold{*defer} moves the first subgoal to the end and

    90 \isacommand{prefer}\indexbold{*prefer}~$n$ moves subgoal $n$ to the front.

    91 \item[Printing theorems:]

    92   \isacommand{thm}\indexbold{*thm}~\textit{name}$@1$~\dots~\textit{name}$@n$

    93   prints the named theorems.

    94 \item[Displaying types:] We have already mentioned the flag

    95   \ttindex{show_types} above. It can also be useful for detecting typos in

    96   formulae early on. For example, if \texttt{show_types} is set and the goal

    97   \isa{rev(rev xs) = xs} is started, Isabelle prints the additional output

    98 \par\noindent

    99 \begin{isabelle}%

   100 Variables:\isanewline

   101 ~~xs~::~'a~list

   102 \end{isabelle}%

   103 \par\noindent

   104 which tells us that Isabelle has correctly inferred that

   105 \isa{xs} is a variable of list type. On the other hand, had we

   106 made a typo as in \isa{rev(re xs) = xs}, the response

   107 \par\noindent

   108 \begin{isabelle}%

   109 Variables:\isanewline

   110 ~~re~::~'a~list~{\isasymRightarrow}~'a~list\isanewline

   111 ~~xs~::~'a~list%

   112 \end{isabelle}%

   113 \par\noindent

   114 would have alerted us because of the unexpected variable \isa{re}.

   115 \item[Reading terms and types:] \isacommand{term}\indexbold{*term}

   116   \textit{string} reads, type-checks and prints the given string as a term in

   117   the current context; the inferred type is output as well.

   118   \isacommand{typ}\indexbold{*typ} \textit{string} reads and prints the given

   119   string as a type in the current context.

   120 \item[(Re)loading theories:] When you start your interaction you must open a

   121   named theory with the line \isa{\isacommand{theory}~T~=~\dots~:}. Isabelle

   122   automatically loads all the required parent theories from their respective

   123   files (which may take a moment, unless the theories are already loaded and

   124   the files have not been modified).

   126   If you suddenly discover that you need to modify a parent theory of your

   127   current theory, you must first abandon your current theory\indexbold{abandon

   128   theory}\indexbold{theory!abandon} (at the shell

   129   level type \isacommand{kill}\indexbold{*kill}). After the parent theory has

   130   been modified, you go back to your original theory. When its first line

   131   \isa{\isacommand{theory}~T~=~\dots~:} is processed, the

   132   modified parent is reloaded automatically.

   134 %  The only time when you need to load a theory by hand is when you simply

   135 %  want to check if it loads successfully without wanting to make use of the

   136 %  theory itself. This you can do by typing

   137 %  \isa{\isacommand{use\_thy}\indexbold{*use_thy}~"T"}.

   138 \end{description}

   139 Further commands are found in the Isabelle/Isar Reference Manual.

   141 We now examine Isabelle's functional programming constructs systematically,

   142 starting with inductive datatypes.

   145 \section{Datatypes}

   146 \label{sec:datatype}

   148 Inductive datatypes are part of almost every non-trivial application of HOL.

   149 First we take another look at a very important example, the datatype of

   150 lists, before we turn to datatypes in general. The section closes with a

   151 case study.

   154 \subsection{Lists}

   155 \indexbold{*list}

   157 Lists are one of the essential datatypes in computing. Readers of this

   158 tutorial and users of HOL need to be familiar with their basic operations.

   159 Theory \isa{ToyList} is only a small fragment of HOL's predefined theory

   160 \isa{List}\footnote{\url{http://isabelle.in.tum.de/library/HOL/List.html}}.

   161 The latter contains many further operations. For example, the functions

   162 \isaindexbold{hd} (``head'') and \isaindexbold{tl} (``tail'') return the first

   163 element and the remainder of a list. (However, pattern-matching is usually

   164 preferable to \isa{hd} and \isa{tl}.)

   165 Also available are higher-order functions like \isa{map} and \isa{filter}.

   166 Theory \isa{List} also contains

   167 more syntactic sugar: \isa{[}$x@1$\isa{,}\dots\isa{,}$x@n$\isa{]} abbreviates

   168 $x@1$\isa{\#}\dots\isa{\#}$x@n$\isa{\#[]}.  In the rest of the tutorial we

   169 always use HOL's predefined lists.

   172 \subsection{The General Format}

   173 \label{sec:general-datatype}

   175 The general HOL \isacommand{datatype} definition is of the form

   176 \[

   177 \isacommand{datatype}~(\alpha@1, \dots, \alpha@n) \, t ~=~

   178 C@1~\tau@{11}~\dots~\tau@{1k@1} ~\mid~ \dots ~\mid~

   179 C@m~\tau@{m1}~\dots~\tau@{mk@m}

   180 \]

   181 where $\alpha@i$ are distinct type variables (the parameters), $C@i$ are distinct

   182 constructor names and $\tau@{ij}$ are types; it is customary to capitalize

   183 the first letter in constructor names. There are a number of

   184 restrictions (such as that the type should not be empty) detailed

   185 elsewhere~\cite{isabelle-HOL}. Isabelle notifies you if you violate them.

   187 Laws about datatypes, such as \isa{[] \isasymnoteq~x\#xs} and

   188 \isa{(x\#xs = y\#ys) = (x=y \isasymand~xs=ys)}, are used automatically

   189 during proofs by simplification.  The same is true for the equations in

   190 primitive recursive function definitions.

   192 Every datatype $t$ comes equipped with a \isa{size} function from $t$ into

   193 the natural numbers (see~{\S}\ref{sec:nat} below). For lists, \isa{size} is

   194 just the length, i.e.\ \isa{size [] = 0} and \isa{size(x \# xs) = size xs +

   195   1}.  In general, \isaindexbold{size} returns \isa{0} for all constructors

   196 that do not have an argument of type $t$, and for all other constructors

   197 \isa{1 +} the sum of the sizes of all arguments of type $t$. Note that because

   198 \isa{size} is defined on every datatype, it is overloaded; on lists

   199 \isa{size} is also called \isaindexbold{length}, which is not overloaded.

   200 Isabelle will always show \isa{size} on lists as \isa{length}.

   203 \subsection{Primitive Recursion}

   205 Functions on datatypes are usually defined by recursion. In fact, most of the

   206 time they are defined by what is called \bfindex{primitive recursion}.

   207 The keyword \isacommand{primrec}\indexbold{*primrec} is followed by a list of

   208 equations

   209 \[ f \, x@1 \, \dots \, (C \, y@1 \, \dots \, y@k)\, \dots \, x@n = r \]

   210 such that $C$ is a constructor of the datatype $t$ and all recursive calls of

   211 $f$ in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$. Thus

   212 Isabelle immediately sees that $f$ terminates because one (fixed!) argument

   213 becomes smaller with every recursive call. There must be at most one equation

   214 for each constructor.  Their order is immaterial.

   215 A more general method for defining total recursive functions is introduced in

   216 {\S}\ref{sec:recdef}.

   218 \begin{exercise}\label{ex:Tree}

   219 \input{Misc/document/Tree.tex}%

   220 \end{exercise}

   222 \input{Misc/document/case_exprs.tex}

   224 \input{Ifexpr/document/Ifexpr.tex}

   226 \section{Some Basic Types}

   229 \subsection{Natural Numbers}

   230 \label{sec:nat}

   231 \index{arithmetic|(}

   233 \input{Misc/document/fakenat.tex}

   234 \input{Misc/document/natsum.tex}

   236 \index{arithmetic|)}

   239 \subsection{Pairs}

   240 \input{Misc/document/pairs.tex}

   242 \subsection{Datatype {\tt\slshape option}}

   243 \label{sec:option}

   244 \input{Misc/document/Option2.tex}

   246 \section{Definitions}

   247 \label{sec:Definitions}

   249 A definition is simply an abbreviation, i.e.\ a new name for an existing

   250 construction. In particular, definitions cannot be recursive. Isabelle offers

   251 definitions on the level of types and terms. Those on the type level are

   252 called type synonyms, those on the term level are called (constant)

   253 definitions.

   256 \subsection{Type Synonyms}

   257 \indexbold{type synonym}

   259 Type synonyms are similar to those found in ML\@. Their syntax is fairly self

   260 explanatory:

   262 \input{Misc/document/types.tex}%

   264 Note that pattern-matching is not allowed, i.e.\ each definition must be of

   265 the form $f\,x@1\,\dots\,x@n~\isasymequiv~t$.

   266 Section~{\S}\ref{sec:Simplification} explains how definitions are used

   267 in proofs.

   269 \input{Misc/document/prime_def.tex}

   272 \chapter{More Functional Programming}

   274 The purpose of this chapter is to deepen the reader's understanding of the

   275 concepts encountered so far and to introduce advanced forms of datatypes and

   276 recursive functions. The first two sections give a structured presentation of

   277 theorem proving by simplification ({\S}\ref{sec:Simplification}) and discuss

   278 important heuristics for induction ({\S}\ref{sec:InductionHeuristics}). They can

   279 be skipped by readers less interested in proofs. They are followed by a case

   280 study, a compiler for expressions ({\S}\ref{sec:ExprCompiler}). Advanced

   281 datatypes, including those involving function spaces, are covered in

   282 {\S}\ref{sec:advanced-datatypes}, which closes with another case study, search

   283 trees (``tries'').  Finally we introduce \isacommand{recdef}, a very general

   284 form of recursive function definition which goes well beyond what

   285 \isacommand{primrec} allows ({\S}\ref{sec:recdef}).

   288 \section{Simplification}

   289 \label{sec:Simplification}

   290 \index{simplification|(}

   292 So far we have proved our theorems by \isa{auto}, which simplifies

   293 \emph{all} subgoals. In fact, \isa{auto} can do much more than that, except

   294 that it did not need to so far. However, when you go beyond toy examples, you

   295 need to understand the ingredients of \isa{auto}.  This section covers the

   296 method that \isa{auto} always applies first, simplification.

   298 Simplification is one of the central theorem proving tools in Isabelle and

   299 many other systems. The tool itself is called the \bfindex{simplifier}. The

   300 purpose of this section is to introduce the many features of the simplifier.

   301 Anybody intending to perform proofs in HOL should read this section. Later on

   302 ({\S}\ref{sec:simplification-II}) we explain some more advanced features and a

   303 little bit of how the simplifier works. The serious student should read that

   304 section as well, in particular in order to understand what happened if things

   305 do not simplify as expected.

   307 \subsubsection{What is Simplification}

   309 In its most basic form, simplification means repeated application of

   310 equations from left to right. For example, taking the rules for \isa{\at}

   311 and applying them to the term \isa{[0,1] \at\ []} results in a sequence of

   312 simplification steps:

   313 \begin{ttbox}\makeatother

   314 (0#1#[]) @ []  \(\leadsto\)  0#((1#[]) @ [])  \(\leadsto\)  0#(1#([] @ []))  \(\leadsto\)  0#1#[]

   315 \end{ttbox}

   316 This is also known as \bfindex{term rewriting}\indexbold{rewriting} and the

   317 equations are referred to as \textbf{rewrite rules}\indexbold{rewrite rule}.

   318 ``Rewriting'' is more honest than ``simplification'' because the terms do not

   319 necessarily become simpler in the process.

   321 \input{Misc/document/simp.tex}

   323 \index{simplification|)}

   325 \input{Misc/document/Itrev.tex}

   327 \begin{exercise}

   328 \input{Misc/document/Tree2.tex}%

   329 \end{exercise}

   331 \input{CodeGen/document/CodeGen.tex}

   334 \section{Advanced Datatypes}

   335 \label{sec:advanced-datatypes}

   336 \index{*datatype|(}

   337 \index{*primrec|(}

   338 %|)

   340 This section presents advanced forms of \isacommand{datatype}s.

   342 \subsection{Mutual Recursion}

   343 \label{sec:datatype-mut-rec}

   345 \input{Datatype/document/ABexpr.tex}

   347 \subsection{Nested Recursion}

   348 \label{sec:nested-datatype}

   350 {\makeatother\input{Datatype/document/Nested.tex}}

   353 \subsection{The Limits of Nested Recursion}

   355 How far can we push nested recursion? By the unfolding argument above, we can

   356 reduce nested to mutual recursion provided the nested recursion only involves

   357 previously defined datatypes. This does not include functions:

   358 \begin{isabelle}

   359 \isacommand{datatype} t = C "t \isasymRightarrow\ bool"

   360 \end{isabelle}

   361 This declaration is a real can of worms.

   362 In HOL it must be ruled out because it requires a type

   363 \isa{t} such that \isa{t} and its power set \isa{t \isasymFun\ bool} have the

   364 same cardinality --- an impossibility. For the same reason it is not possible

   365 to allow recursion involving the type \isa{set}, which is isomorphic to

   366 \isa{t \isasymFun\ bool}.

   368 Fortunately, a limited form of recursion

   369 involving function spaces is permitted: the recursive type may occur on the

   370 right of a function arrow, but never on the left. Hence the above can of worms

   371 is ruled out but the following example of a potentially infinitely branching tree is

   372 accepted:

   373 \smallskip

   375 \input{Datatype/document/Fundata.tex}

   376 \bigskip

   378 If you need nested recursion on the left of a function arrow, there are

   379 alternatives to pure HOL: LCF~\cite{paulson87} is a logic where types like

   380 \begin{isabelle}

   381 \isacommand{datatype} lam = C "lam \isasymrightarrow\ lam"

   382 \end{isabelle}

   383 do indeed make sense.  Note the different arrow,

   384 \isa{\isasymrightarrow} instead of \isa{\isasymRightarrow},

   385 expressing the type of \emph{continuous} functions.

   386 There is even a version of LCF on top of HOL,

   387 called HOLCF~\cite{MuellerNvOS99}.

   389 \index{*primrec|)}

   390 \index{*datatype|)}

   392 \subsection{Case Study: Tries}

   393 \label{sec:Trie}

   395 Tries are a classic search tree data structure~\cite{Knuth3-75} for fast

   396 indexing with strings. Figure~\ref{fig:trie} gives a graphical example of a

   397 trie containing the words ``all'', ``an'', ``ape'', ``can'', ``car'' and

   398 ``cat''.  When searching a string in a trie, the letters of the string are

   399 examined sequentially. Each letter determines which subtrie to search next.

   400 In this case study we model tries as a datatype, define a lookup and an

   401 update function, and prove that they behave as expected.

   403 \begin{figure}[htbp]

   404 \begin{center}

   405 \unitlength1mm

   406 \begin{picture}(60,30)

   407 \put( 5, 0){\makebox(0,0)[b]{l}}

   408 \put(25, 0){\makebox(0,0)[b]{e}}

   409 \put(35, 0){\makebox(0,0)[b]{n}}

   410 \put(45, 0){\makebox(0,0)[b]{r}}

   411 \put(55, 0){\makebox(0,0)[b]{t}}

   412 %

   413 \put( 5, 9){\line(0,-1){5}}

   414 \put(25, 9){\line(0,-1){5}}

   415 \put(44, 9){\line(-3,-2){9}}

   416 \put(45, 9){\line(0,-1){5}}

   417 \put(46, 9){\line(3,-2){9}}

   418 %

   419 \put( 5,10){\makebox(0,0)[b]{l}}

   420 \put(15,10){\makebox(0,0)[b]{n}}

   421 \put(25,10){\makebox(0,0)[b]{p}}

   422 \put(45,10){\makebox(0,0)[b]{a}}

   423 %

   424 \put(14,19){\line(-3,-2){9}}

   425 \put(15,19){\line(0,-1){5}}

   426 \put(16,19){\line(3,-2){9}}

   427 \put(45,19){\line(0,-1){5}}

   428 %

   429 \put(15,20){\makebox(0,0)[b]{a}}

   430 \put(45,20){\makebox(0,0)[b]{c}}

   431 %

   432 \put(30,30){\line(-3,-2){13}}

   433 \put(30,30){\line(3,-2){13}}

   434 \end{picture}

   435 \end{center}

   436 \caption{A sample trie}

   437 \label{fig:trie}

   438 \end{figure}

   440 Proper tries associate some value with each string. Since the

   441 information is stored only in the final node associated with the string, many

   442 nodes do not carry any value. This distinction is modeled with the help

   443 of the predefined datatype \isa{option} (see {\S}\ref{sec:option}).

   444 \input{Trie/document/Trie.tex}

   446 \begin{exercise}

   447   Write an improved version of \isa{update} that does not suffer from the

   448   space leak in the version above. Prove the main theorem for your improved

   449   \isa{update}.

   450 \end{exercise}

   452 \begin{exercise}

   453   Write a function to \emph{delete} entries from a trie. An easy solution is

   454   a clever modification of \isa{update} which allows both insertion and

   455   deletion with a single function.  Prove (a modified version of) the main

   456   theorem above. Optimize you function such that it shrinks tries after

   457   deletion, if possible.

   458 \end{exercise}

   460 \section{Total Recursive Functions}

   461 \label{sec:recdef}

   462 \index{*recdef|(}

   464 Although many total functions have a natural primitive recursive definition,

   465 this is not always the case. Arbitrary total recursive functions can be

   466 defined by means of \isacommand{recdef}: you can use full pattern-matching,

   467 recursion need not involve datatypes, and termination is proved by showing

   468 that the arguments of all recursive calls are smaller in a suitable (user

   469 supplied) sense. In this section we restrict ourselves to measure functions;

   470 more advanced termination proofs are discussed in {\S}\ref{sec:beyond-measure}.

   472 \subsection{Defining Recursive Functions}

   473 \label{sec:recdef-examples}

   474 \input{Recdef/document/examples.tex}

   476 \subsection{Proving Termination}

   478 \input{Recdef/document/termination.tex}

   480 \subsection{Simplification with Recdef}

   481 \label{sec:recdef-simplification}

   483 \input{Recdef/document/simplification.tex}

   485 \subsection{Induction}

   486 \index{induction!recursion|(}

   487 \index{recursion induction|(}

   489 \input{Recdef/document/Induction.tex}

   490 \label{sec:recdef-induction}

   492 \index{induction!recursion|)}

   493 \index{recursion induction|)}

   494 \index{*recdef|)}

author	nipkow
	Tue, 20 Feb 2001 10:37:12 +0100
changeset 11159	07b13770c4d6
parent 10983	59961d32b1ae
child 11201	eddc69b55fac
permissions	-rw-r--r--