5 section {* Turning Theories into Programs \label{sec:program} *}
7 subsection {* The @{text "Isabelle/HOL"} default setup *}
10 We have already seen how by default equations stemming from
11 @{command definition}, @{command primrec} and @{command fun}
12 statements are used for code generation. This default behaviour
13 can be changed, e.g.\ by providing different code equations.
14 The customisations shown in this section are \emph{safe}
15 as regards correctness: all programs that can be generated are partially
19 subsection {* Selecting code equations *}
22 Coming back to our introductory example, we
23 could provide an alternative code equations for @{const dequeue}
28 "dequeue (AQueue xs []) =
29 (if xs = [] then (None, AQueue [] [])
30 else dequeue (AQueue [] (rev xs)))"
31 "dequeue (AQueue xs (y # ys)) =
32 (Some y, AQueue xs ys)"
33 by (cases xs, simp_all) (cases "rev xs", simp_all)
36 \noindent The annotation @{text "[code]"} is an @{text Isar}
37 @{text attribute} which states that the given theorems should be
38 considered as code equations for a @{text fun} statement --
39 the corresponding constant is determined syntactically. The resulting code:
42 text %quote {*@{code_stmts dequeue (consts) dequeue (Haskell)}*}
45 \noindent You may note that the equality test @{term "xs = []"} has been
46 replaced by the predicate @{term "null xs"}. This is due to the default
47 setup in the \qn{preprocessor} to be discussed further below (\secref{sec:preproc}).
49 Changing the default constructor set of datatypes is also
50 possible. See \secref{sec:datatypes} for an example.
52 As told in \secref{sec:concept}, code generation is based
53 on a structured collection of code theorems.
55 may be inspected using the @{command code_thms} command:
58 code_thms %quote dequeue
61 \noindent prints a table with \emph{all} code equations
62 for @{const dequeue}, including
63 \emph{all} code equations those equations depend
66 Similarly, the @{command code_deps} command shows a graph
67 visualising dependencies between code equations.
70 subsection {* @{text class} and @{text instantiation} *}
73 Concerning type classes and code generation, let us examine an example
74 from abstract algebra:
77 class %quote semigroup =
78 fixes mult :: "'a \<Rightarrow> 'a \<Rightarrow> 'a" (infixl "\<otimes>" 70)
79 assumes assoc: "(x \<otimes> y) \<otimes> z = x \<otimes> (y \<otimes> z)"
81 class %quote monoid = semigroup +
82 fixes neutral :: 'a ("\<one>")
83 assumes neutl: "\<one> \<otimes> x = x"
84 and neutr: "x \<otimes> \<one> = x"
86 instantiation %quote nat :: monoid
89 primrec %quote mult_nat where
90 "0 \<otimes> n = (0\<Colon>nat)"
91 | "Suc m \<otimes> n = n + m \<otimes> n"
93 definition %quote neutral_nat where
96 lemma %quote add_mult_distrib:
98 shows "(n + m) \<otimes> q = n \<otimes> q + m \<otimes> q"
99 by (induct n) simp_all
101 instance %quote proof
103 show "m \<otimes> n \<otimes> q = m \<otimes> (n \<otimes> q)"
104 by (induct m) (simp_all add: add_mult_distrib)
105 show "\<one> \<otimes> n = n"
106 by (simp add: neutral_nat_def)
107 show "m \<otimes> \<one> = m"
108 by (induct m) (simp_all add: neutral_nat_def)
114 \noindent We define the natural operation of the natural numbers
118 primrec %quote (in monoid) pow :: "nat \<Rightarrow> 'a \<Rightarrow> 'a" where
120 | "pow (Suc n) a = a \<otimes> pow n a"
123 \noindent This we use to define the discrete exponentiation function:
126 definition %quote bexp :: "nat \<Rightarrow> nat" where
127 "bexp n = pow n (Suc (Suc 0))"
130 \noindent The corresponding code in Haskell uses that language's native classes:
133 text %quote {*@{code_stmts bexp (Haskell)}*}
136 \noindent This is a convenient place to show how explicit dictionary construction
137 manifests in generated code (here, the same example in @{text SML})
138 \cite{Haftmann-Nipkow:2010:code}:
141 text %quote {*@{code_stmts bexp (SML)}*}
144 \noindent Note the parameters with trailing underscore (@{verbatim "A_"}),
145 which are the dictionary parameters.
148 subsection {* The preprocessor \label{sec:preproc} *}
151 Before selected function theorems are turned into abstract
152 code, a chain of definitional transformation steps is carried
153 out: \emph{preprocessing}. In essence, the preprocessor
154 consists of two components: a \emph{simpset} and \emph{function transformers}.
156 The \emph{simpset} can apply the full generality of the
157 Isabelle simplifier. Due to the interpretation of theorems as code
158 equations, rewrites are applied to the right hand side and the
159 arguments of the left hand side of an equation, but never to the
160 constant heading the left hand side. An important special case are
161 \emph{unfold theorems}, which may be declared and removed using
162 the @{attribute code_unfold} or \emph{@{attribute code_unfold} del}
163 attribute, respectively.
165 Some common applications:
173 \item replacing non-executable constructs by executable ones:
176 lemma %quote [code_unfold]:
177 "x \<in> set xs \<longleftrightarrow> member xs x" by (fact in_set_code)
180 \item eliminating superfluous constants:
183 lemma %quote [code_unfold]:
184 "1 = Suc 0" by (fact One_nat_def)
187 \item replacing executable but inconvenient constructs:
190 lemma %quote [code_unfold]:
191 "xs = [] \<longleftrightarrow> List.null xs" by (fact empty_null)
198 \noindent \emph{Function transformers} provide a very general interface,
199 transforming a list of function theorems to another
200 list of function theorems, provided that neither the heading
201 constant nor its type change. The @{term "0\<Colon>nat"} / @{const Suc}
202 pattern elimination implemented in
203 theory @{text Efficient_Nat} (see \secref{eff_nat}) uses this
206 \noindent The current setup of the preprocessor may be inspected using
207 the @{command print_codeproc} command.
208 @{command code_thms} provides a convenient
209 mechanism to inspect the impact of a preprocessor setup
214 Attribute @{attribute code_unfold} also applies to the
215 preprocessor of the ancient @{text "SML code generator"}; in case
216 this is not what you intend, use @{attribute code_inline} instead.
220 subsection {* Datatypes \label{sec:datatypes} *}
223 Conceptually, any datatype is spanned by a set of
224 \emph{constructors} of type @{text "\<tau> = \<dots> \<Rightarrow> \<kappa> \<alpha>\<^isub>1 \<dots> \<alpha>\<^isub>n"} where @{text
225 "{\<alpha>\<^isub>1, \<dots>, \<alpha>\<^isub>n}"} is exactly the set of \emph{all} type variables in
226 @{text "\<tau>"}. The HOL datatype package by default registers any new
227 datatype in the table of datatypes, which may be inspected using the
228 @{command print_codesetup} command.
230 In some cases, it is appropriate to alter or extend this table. As
231 an example, we will develop an alternative representation of the
232 queue example given in \secref{sec:intro}. The amortised
233 representation is convenient for generating code but exposes its
234 \qt{implementation} details, which may be cumbersome when proving
235 theorems about it. Therefore, here is a simple, straightforward
236 representation of queues:
239 datatype %quote 'a queue = Queue "'a list"
241 definition %quote empty :: "'a queue" where
244 primrec %quote enqueue :: "'a \<Rightarrow> 'a queue \<Rightarrow> 'a queue" where
245 "enqueue x (Queue xs) = Queue (xs @ [x])"
247 fun %quote dequeue :: "'a queue \<Rightarrow> 'a option \<times> 'a queue" where
248 "dequeue (Queue []) = (None, Queue [])"
249 | "dequeue (Queue (x # xs)) = (Some x, Queue xs)"
252 \noindent This we can use directly for proving; for executing,
253 we provide an alternative characterisation:
256 definition %quote AQueue :: "'a list \<Rightarrow> 'a list \<Rightarrow> 'a queue" where
257 "AQueue xs ys = Queue (ys @ rev xs)"
259 code_datatype %quote AQueue
262 \noindent Here we define a \qt{constructor} @{const "AQueue"} which
263 is defined in terms of @{text "Queue"} and interprets its arguments
264 according to what the \emph{content} of an amortised queue is supposed
265 to be. Equipped with this, we are able to prove the following equations
266 for our primitive queue operations which \qt{implement} the simple
267 queues in an amortised fashion:
270 lemma %quote empty_AQueue [code]:
271 "empty = AQueue [] []"
272 unfolding AQueue_def empty_def by simp
274 lemma %quote enqueue_AQueue [code]:
275 "enqueue x (AQueue xs ys) = AQueue (x # xs) ys"
276 unfolding AQueue_def by simp
278 lemma %quote dequeue_AQueue [code]:
279 "dequeue (AQueue xs []) =
280 (if xs = [] then (None, AQueue [] [])
281 else dequeue (AQueue [] (rev xs)))"
282 "dequeue (AQueue xs (y # ys)) = (Some y, AQueue xs ys)"
283 unfolding AQueue_def by simp_all
286 \noindent For completeness, we provide a substitute for the
287 @{text case} combinator on queues:
290 lemma %quote queue_case_AQueue [code]:
291 "queue_case f (AQueue xs ys) = f (ys @ rev xs)"
292 unfolding AQueue_def by simp
295 \noindent The resulting code looks as expected:
298 text %quote {*@{code_stmts empty enqueue dequeue (SML)}*}
301 \noindent From this example, it can be glimpsed that using own
302 constructor sets is a little delicate since it changes the set of
303 valid patterns for values of that type. Without going into much
304 detail, here some practical hints:
308 \item When changing the constructor set for datatypes, take care
309 to provide alternative equations for the @{text case} combinator.
311 \item Values in the target language need not to be normalised --
312 different values in the target language may represent the same
315 \item Usually, a good methodology to deal with the subtleties of
316 pattern matching is to see the type as an abstract type: provide
317 a set of operations which operate on the concrete representation
318 of the type, and derive further operations by combinations of
319 these primitive ones, without relying on a particular
326 subsection {* Equality *}
329 Surely you have already noticed how equality is treated
330 by the code generator:
333 primrec %quote collect_duplicates :: "'a list \<Rightarrow> 'a list \<Rightarrow> 'a list \<Rightarrow> 'a list" where
334 "collect_duplicates xs ys [] = xs"
335 | "collect_duplicates xs ys (z#zs) = (if z \<in> set xs
336 then if z \<in> set ys
337 then collect_duplicates xs ys zs
338 else collect_duplicates xs (z#ys) zs
339 else collect_duplicates (z#xs) (z#ys) zs)"
342 \noindent The membership test during preprocessing is rewritten,
343 resulting in @{const List.member}, which itself
344 performs an explicit equality check.
347 text %quote {*@{code_stmts collect_duplicates (SML)}*}
350 \noindent Obviously, polymorphic equality is implemented the Haskell
351 way using a type class. How is this achieved? HOL introduces
352 an explicit class @{class eq} with a corresponding operation
353 @{const eq_class.eq} such that @{thm eq [no_vars]}.
354 The preprocessing framework does the rest by propagating the
355 @{class eq} constraints through all dependent code equations.
356 For datatypes, instances of @{class eq} are implicitly derived
357 when possible. For other types, you may instantiate @{text eq}
358 manually like any other type class.
362 subsection {* Explicit partiality *}
365 Partiality usually enters the game by partial patterns, as
366 in the following example, again for amortised queues:
369 definition %quote strict_dequeue :: "'a queue \<Rightarrow> 'a \<times> 'a queue" where
370 "strict_dequeue q = (case dequeue q
371 of (Some x, q') \<Rightarrow> (x, q'))"
373 lemma %quote strict_dequeue_AQueue [code]:
374 "strict_dequeue (AQueue xs (y # ys)) = (y, AQueue xs ys)"
375 "strict_dequeue (AQueue xs []) =
376 (case rev xs of y # ys \<Rightarrow> (y, AQueue [] ys))"
377 by (simp_all add: strict_dequeue_def dequeue_AQueue split: list.splits)
380 \noindent In the corresponding code, there is no equation
381 for the pattern @{term "AQueue [] []"}:
384 text %quote {*@{code_stmts strict_dequeue (consts) strict_dequeue (Haskell)}*}
387 \noindent In some cases it is desirable to have this
388 pseudo-\qt{partiality} more explicitly, e.g.~as follows:
391 axiomatization %quote empty_queue :: 'a
393 definition %quote strict_dequeue' :: "'a queue \<Rightarrow> 'a \<times> 'a queue" where
394 "strict_dequeue' q = (case dequeue q of (Some x, q') \<Rightarrow> (x, q') | _ \<Rightarrow> empty_queue)"
396 lemma %quote strict_dequeue'_AQueue [code]:
397 "strict_dequeue' (AQueue xs []) = (if xs = [] then empty_queue
398 else strict_dequeue' (AQueue [] (rev xs)))"
399 "strict_dequeue' (AQueue xs (y # ys)) =
401 by (simp_all add: strict_dequeue'_def dequeue_AQueue split: list.splits)
404 Observe that on the right hand side of the definition of @{const
405 "strict_dequeue'"}, the unspecified constant @{const empty_queue} occurs.
407 Normally, if constants without any code equations occur in a
408 program, the code generator complains (since in most cases this is
409 indeed an error). But such constants can also be thought
410 of as function definitions which always fail,
411 since there is never a successful pattern match on the left hand
412 side. In order to categorise a constant into that category
413 explicitly, use @{command "code_abort"}:
416 code_abort %quote empty_queue
419 \noindent Then the code generator will just insert an error or
420 exception at the appropriate position:
423 text %quote {*@{code_stmts strict_dequeue' (consts) empty_queue strict_dequeue' (Haskell)}*}
426 \noindent This feature however is rarely needed in practice.
427 Note also that the @{text HOL} default setup already declares
428 @{const undefined} as @{command "code_abort"}, which is most
429 likely to be used in such situations.
432 subsection {* Inductive Predicates *}
439 | "append xs ys zs ==> append (x # xs) ys (x # zs)"
442 To execute inductive predicates, a special preprocessor, the predicate
443 compiler, generates code equations from the introduction rules of the predicates.
444 The mechanisms of this compiler are described in \cite{Berghofer-Bulwahn-Haftmann:2009:TPHOL}.
445 Consider the simple predicate @{const append} given by these two
449 @{thm append.intros(1)[of ys]}\\
450 \noindent@{thm append.intros(2)[of xs ys zs x]}
453 \noindent To invoke the compiler, simply use @{command_def "code_pred"}:
455 code_pred %quote append .
457 \noindent The @{command "code_pred"} command takes the name
458 of the inductive predicate and then you put a period to discharge
459 a trivial correctness proof.
460 The compiler infers possible modes
461 for the predicate and produces the derived code equations.
462 Modes annotate which (parts of the) arguments are to be taken as input,
463 and which output. Modes are similar to types, but use the notation @{text "i"}
464 for input and @{text "o"} for output.
466 For @{term "append"}, the compiler can infer the following modes:
468 \item @{text "i \<Rightarrow> i \<Rightarrow> i \<Rightarrow> bool"}
469 \item @{text "i \<Rightarrow> i \<Rightarrow> o \<Rightarrow> bool"}
470 \item @{text "o \<Rightarrow> o \<Rightarrow> i \<Rightarrow> bool"}
472 You can compute sets of predicates using @{command_def "values"}:
474 values %quote "{zs. append [(1::nat),2,3] [4,5] zs}"
475 text {* \noindent outputs @{text "{[1, 2, 3, 4, 5]}"}, and *}
476 values %quote "{(xs, ys). append xs ys [(2::nat),3]}"
477 text {* \noindent outputs @{text "{([], [2, 3]), ([2], [3]), ([2, 3], [])}"}. *}
479 \noindent If you are only interested in the first elements of the set
480 comprehension (with respect to a depth-first search on the introduction rules), you can
482 @{command "values"} to specify the number of elements you want:
485 values %quote 1 "{(xs, ys). append xs ys [(1::nat),2,3,4]}"
486 values %quote 3 "{(xs, ys). append xs ys [(1::nat),2,3,4]}"
489 \noindent The @{command "values"} command can only compute set
490 comprehensions for which a mode has been inferred.
492 The code equations for a predicate are made available as theorems with
493 the suffix @{text "equation"}, and can be inspected with:
495 thm %quote append.equation
497 \noindent More advanced options are described in the following subsections.
499 subsubsection {* Alternative names for functions *}
501 By default, the functions generated from a predicate are named after the predicate with the
502 mode mangled into the name (e.g., @{text "append_i_i_o"}).
503 You can specify your own names as follows:
505 code_pred %quote (modes: i => i => o => bool as concat,
506 o => o => i => bool as split,
507 i => o => i => bool as suffix) append .
509 subsubsection {* Alternative introduction rules *}
511 Sometimes the introduction rules of an predicate are not executable because they contain
512 non-executable constants or specific modes could not be inferred.
513 It is also possible that the introduction rules yield a function that loops forever
514 due to the execution in a depth-first search manner.
515 Therefore, you can declare alternative introduction rules for predicates with the attribute
516 @{attribute "code_pred_intro"}.
517 For example, the transitive closure is defined by:
520 @{thm tranclp.intros(1)[of r a b]}\\
521 \noindent @{thm tranclp.intros(2)[of r a b c]}
524 \noindent These rules do not suit well for executing the transitive closure
525 with the mode @{text "(i \<Rightarrow> o \<Rightarrow> bool) \<Rightarrow> i \<Rightarrow> o \<Rightarrow> bool"}, as the second rule will
526 cause an infinite loop in the recursive call.
527 This can be avoided using the following alternative rules which are
528 declared to the predicate compiler by the attribute @{attribute "code_pred_intro"}:
530 lemma %quote [code_pred_intro]:
531 "r a b \<Longrightarrow> r\<^sup>+\<^sup>+ a b"
532 "r a b \<Longrightarrow> r\<^sup>+\<^sup>+ b c \<Longrightarrow> r\<^sup>+\<^sup>+ a c"
535 \noindent After declaring all alternative rules for the transitive closure,
536 you invoke @{command "code_pred"} as usual.
537 As you have declared alternative rules for the predicate, you are urged to prove that these
538 introduction rules are complete, i.e., that you can derive an elimination rule for the
541 code_pred %quote tranclp
544 from this converse_tranclpE[OF this(1)] show thesis by metis
547 \noindent Alternative rules can also be used for constants that have not
548 been defined inductively. For example, the lexicographic order which
551 @{thm [display] lexord_def[of "r"]}
554 \noindent To make it executable, you can derive the following two rules and prove the
558 lemma append: "append xs ys zs = (xs @ ys = zs)"
559 by (induct xs arbitrary: ys zs) (auto elim: append.cases intro: append.intros)
561 lemma %quote [code_pred_intro]:
562 "append xs (a # v) ys \<Longrightarrow> lexord r (xs, ys)"
563 (*<*)unfolding lexord_def Collect_def by (auto simp add: append)(*>*)
565 lemma %quote [code_pred_intro]:
566 "append u (a # v) xs \<Longrightarrow> append u (b # w) ys \<Longrightarrow> r (a, b)
567 \<Longrightarrow> lexord r (xs, ys)"
568 (*<*)unfolding lexord_def Collect_def append mem_def apply simp
569 apply (rule disjI2) by auto
572 code_pred %quote lexord
576 assume lexord: "lexord r (xs, ys)"
577 assume 1: "\<And> r' xs' ys' a v. r = r' \<Longrightarrow> (xs, ys) = (xs', ys') \<Longrightarrow> append xs' (a # v) ys' \<Longrightarrow> thesis"
578 assume 2: "\<And> r' xs' ys' u a v b w. r = r' \<Longrightarrow> (xs, ys) = (xs', ys') \<Longrightarrow> append u (a # v) xs' \<Longrightarrow> append u (b # w) ys' \<Longrightarrow> r' (a, b) \<Longrightarrow> thesis"
580 assume "\<exists>a v. ys = xs @ a # v"
581 from this 1 have thesis
582 by (fastsimp simp add: append)
585 assume "\<exists>u a b v w. (a, b) \<in> r \<and> xs = u @ a # v \<and> ys = u @ b # w"
586 from this 2 have thesis by (fastsimp simp add: append mem_def)
589 ultimately show thesis
591 by (fastsimp simp add: Collect_def)
594 subsubsection {* Options for values *}
596 In the presence of higher-order predicates, multiple modes for some predicate could be inferred
597 that are not disambiguated by the pattern of the set comprehension.
598 To disambiguate the modes for the arguments of a predicate, you can state
599 the modes explicitly in the @{command "values"} command.
600 Consider the simple predicate @{term "succ"}:
602 inductive succ :: "nat \<Rightarrow> nat \<Rightarrow> bool"
605 | "succ x y \<Longrightarrow> succ (Suc x) (Suc y)"
610 \noindent For this, the predicate compiler can infer modes @{text "o \<Rightarrow> o \<Rightarrow> bool"}, @{text "i \<Rightarrow> o \<Rightarrow> bool"},
611 @{text "o \<Rightarrow> i \<Rightarrow> bool"} and @{text "i \<Rightarrow> i \<Rightarrow> bool"}.
612 The invocation of @{command "values"} @{text "{n. tranclp succ 10 n}"} loops, as multiple
613 modes for the predicate @{text "succ"} are possible and here the first mode @{text "o \<Rightarrow> o \<Rightarrow> bool"}
614 is chosen. To choose another mode for the argument, you can declare the mode for the argument between
615 the @{command "values"} and the number of elements.
617 values %quote [mode: i => o => bool] 20 "{n. tranclp succ 10 n}"
618 values %quote [mode: o => i => bool] 10 "{n. tranclp succ n 10}"
620 subsubsection {* Embedding into functional code within Isabelle/HOL *}
622 To embed the computation of an inductive predicate into functions that are defined in Isabelle/HOL,
623 you have a number of options:
625 \item You want to use the first-order predicate with the mode
626 where all arguments are input. Then you can use the predicate directly, e.g.
628 @{text "valid_suffix ys zs = "}\\
629 @{text "(if append [Suc 0, 2] ys zs then Some ys else None)"}
631 \item If you know that the execution returns only one value (it is deterministic), then you can
632 use the combinator @{term "Predicate.the"}, e.g., a functional concatenation of lists
635 @{term "functional_concat xs ys = Predicate.the (append_i_i_o xs ys)"}
637 Note that if the evaluation does not return a unique value, it raises a run-time error
638 @{term "not_unique"}.
641 subsubsection {* Further Examples *}
642 text {* Further examples for compiling inductive predicates can be found in
643 the @{text "HOL/ex/Predicate_Compile_ex"} theory file.
644 There are also some examples in the Archive of Formal Proofs, notably
645 in the @{text "POPLmark-deBruijn"} and the @{text "FeatherweightJava"} sessions.