5 section {* Turning Theories into Programs \label{sec:program} *}
7 subsection {* The @{text "Isabelle/HOL"} default setup *}
10 We have already seen how by default equations stemming from
11 @{command definition}, @{command primrec} and @{command fun}
12 statements are used for code generation. This default behaviour
13 can be changed, e.g.\ by providing different code equations.
14 The customisations shown in this section are \emph{safe}
15 as regards correctness: all programs that can be generated are partially
19 subsection {* Selecting code equations *}
22 Coming back to our introductory example, we
23 could provide an alternative code equations for @{const dequeue}
28 "dequeue (AQueue xs []) =
29 (if xs = [] then (None, AQueue [] [])
30 else dequeue (AQueue [] (rev xs)))"
31 "dequeue (AQueue xs (y # ys)) =
32 (Some y, AQueue xs ys)"
33 by (cases xs, simp_all) (cases "rev xs", simp_all)
36 \noindent The annotation @{text "[code]"} is an @{text Isar}
37 @{text attribute} which states that the given theorems should be
38 considered as code equations for a @{text fun} statement --
39 the corresponding constant is determined syntactically. The resulting code:
42 text %quote {*@{code_stmts dequeue (consts) dequeue (Haskell)}*}
45 \noindent You may note that the equality test @{term "xs = []"} has been
46 replaced by the predicate @{term "null xs"}. This is due to the default
47 setup in the \qn{preprocessor} to be discussed further below (\secref{sec:preproc}).
49 Changing the default constructor set of datatypes is also
50 possible. See \secref{sec:datatypes} for an example.
52 As told in \secref{sec:concept}, code generation is based
53 on a structured collection of code theorems.
55 may be inspected using the @{command code_thms} command:
58 code_thms %quote dequeue
61 \noindent prints a table with \emph{all} code equations
62 for @{const dequeue}, including
63 \emph{all} code equations those equations depend
66 Similarly, the @{command code_deps} command shows a graph
67 visualising dependencies between code equations.
70 subsection {* @{text class} and @{text instantiation} *}
73 Concerning type classes and code generation, let us examine an example
74 from abstract algebra:
77 class %quote semigroup =
78 fixes mult :: "'a \<Rightarrow> 'a \<Rightarrow> 'a" (infixl "\<otimes>" 70)
79 assumes assoc: "(x \<otimes> y) \<otimes> z = x \<otimes> (y \<otimes> z)"
81 class %quote monoid = semigroup +
82 fixes neutral :: 'a ("\<one>")
83 assumes neutl: "\<one> \<otimes> x = x"
84 and neutr: "x \<otimes> \<one> = x"
86 instantiation %quote nat :: monoid
89 primrec %quote mult_nat where
90 "0 \<otimes> n = (0\<Colon>nat)"
91 | "Suc m \<otimes> n = n + m \<otimes> n"
93 definition %quote neutral_nat where
96 lemma %quote add_mult_distrib:
98 shows "(n + m) \<otimes> q = n \<otimes> q + m \<otimes> q"
99 by (induct n) simp_all
101 instance %quote proof
103 show "m \<otimes> n \<otimes> q = m \<otimes> (n \<otimes> q)"
104 by (induct m) (simp_all add: add_mult_distrib)
105 show "\<one> \<otimes> n = n"
106 by (simp add: neutral_nat_def)
107 show "m \<otimes> \<one> = m"
108 by (induct m) (simp_all add: neutral_nat_def)
114 \noindent We define the natural operation of the natural numbers
118 primrec %quote (in monoid) pow :: "nat \<Rightarrow> 'a \<Rightarrow> 'a" where
120 | "pow (Suc n) a = a \<otimes> pow n a"
123 \noindent This we use to define the discrete exponentiation function:
126 definition %quote bexp :: "nat \<Rightarrow> nat" where
127 "bexp n = pow n (Suc (Suc 0))"
130 \noindent The corresponding code in Haskell uses that language's native classes:
133 text %quote {*@{code_stmts bexp (Haskell)}*}
136 \noindent This is a convenient place to show how explicit dictionary construction
137 manifests in generated code (here, the same example in @{text SML}):
140 text %quote {*@{code_stmts bexp (SML)}*}
143 \noindent Note the parameters with trailing underscore (@{verbatim "A_"}),
144 which are the dictionary parameters.
147 subsection {* The preprocessor \label{sec:preproc} *}
150 Before selected function theorems are turned into abstract
151 code, a chain of definitional transformation steps is carried
152 out: \emph{preprocessing}. In essence, the preprocessor
153 consists of two components: a \emph{simpset} and \emph{function transformers}.
155 The \emph{simpset} can apply the full generality of the
156 Isabelle simplifier. Due to the interpretation of theorems as code
157 equations, rewrites are applied to the right hand side and the
158 arguments of the left hand side of an equation, but never to the
159 constant heading the left hand side. An important special case are
160 \emph{unfold theorems}, which may be declared and removed using
161 the @{attribute code_unfold} or \emph{@{attribute code_unfold} del}
162 attribute, respectively.
164 Some common applications:
172 \item replacing non-executable constructs by executable ones:
175 lemma %quote [code_unfold]:
176 "x \<in> set xs \<longleftrightarrow> member xs x" by (fact in_set_code)
179 \item eliminating superfluous constants:
182 lemma %quote [code_unfold]:
183 "1 = Suc 0" by (fact One_nat_def)
186 \item replacing executable but inconvenient constructs:
189 lemma %quote [code_unfold]:
190 "xs = [] \<longleftrightarrow> List.null xs" by (fact empty_null)
197 \noindent \emph{Function transformers} provide a very general interface,
198 transforming a list of function theorems to another
199 list of function theorems, provided that neither the heading
200 constant nor its type change. The @{term "0\<Colon>nat"} / @{const Suc}
201 pattern elimination implemented in
202 theory @{text Efficient_Nat} (see \secref{eff_nat}) uses this
205 \noindent The current setup of the preprocessor may be inspected using
206 the @{command print_codeproc} command.
207 @{command code_thms} provides a convenient
208 mechanism to inspect the impact of a preprocessor setup
213 Attribute @{attribute code_unfold} also applies to the
214 preprocessor of the ancient @{text "SML code generator"}; in case
215 this is not what you intend, use @{attribute code_inline} instead.
219 subsection {* Datatypes \label{sec:datatypes} *}
222 Conceptually, any datatype is spanned by a set of
223 \emph{constructors} of type @{text "\<tau> = \<dots> \<Rightarrow> \<kappa> \<alpha>\<^isub>1 \<dots> \<alpha>\<^isub>n"} where @{text
224 "{\<alpha>\<^isub>1, \<dots>, \<alpha>\<^isub>n}"} is exactly the set of \emph{all} type variables in
225 @{text "\<tau>"}. The HOL datatype package by default registers any new
226 datatype in the table of datatypes, which may be inspected using the
227 @{command print_codesetup} command.
229 In some cases, it is appropriate to alter or extend this table. As
230 an example, we will develop an alternative representation of the
231 queue example given in \secref{sec:intro}. The amortised
232 representation is convenient for generating code but exposes its
233 \qt{implementation} details, which may be cumbersome when proving
234 theorems about it. Therefore, here is a simple, straightforward
235 representation of queues:
238 datatype %quote 'a queue = Queue "'a list"
240 definition %quote empty :: "'a queue" where
243 primrec %quote enqueue :: "'a \<Rightarrow> 'a queue \<Rightarrow> 'a queue" where
244 "enqueue x (Queue xs) = Queue (xs @ [x])"
246 fun %quote dequeue :: "'a queue \<Rightarrow> 'a option \<times> 'a queue" where
247 "dequeue (Queue []) = (None, Queue [])"
248 | "dequeue (Queue (x # xs)) = (Some x, Queue xs)"
251 \noindent This we can use directly for proving; for executing,
252 we provide an alternative characterisation:
255 definition %quote AQueue :: "'a list \<Rightarrow> 'a list \<Rightarrow> 'a queue" where
256 "AQueue xs ys = Queue (ys @ rev xs)"
258 code_datatype %quote AQueue
261 \noindent Here we define a \qt{constructor} @{const "AQueue"} which
262 is defined in terms of @{text "Queue"} and interprets its arguments
263 according to what the \emph{content} of an amortised queue is supposed
264 to be. Equipped with this, we are able to prove the following equations
265 for our primitive queue operations which \qt{implement} the simple
266 queues in an amortised fashion:
269 lemma %quote empty_AQueue [code]:
270 "empty = AQueue [] []"
271 unfolding AQueue_def empty_def by simp
273 lemma %quote enqueue_AQueue [code]:
274 "enqueue x (AQueue xs ys) = AQueue (x # xs) ys"
275 unfolding AQueue_def by simp
277 lemma %quote dequeue_AQueue [code]:
278 "dequeue (AQueue xs []) =
279 (if xs = [] then (None, AQueue [] [])
280 else dequeue (AQueue [] (rev xs)))"
281 "dequeue (AQueue xs (y # ys)) = (Some y, AQueue xs ys)"
282 unfolding AQueue_def by simp_all
285 \noindent For completeness, we provide a substitute for the
286 @{text case} combinator on queues:
289 lemma %quote queue_case_AQueue [code]:
290 "queue_case f (AQueue xs ys) = f (ys @ rev xs)"
291 unfolding AQueue_def by simp
294 \noindent The resulting code looks as expected:
297 text %quote {*@{code_stmts empty enqueue dequeue (SML)}*}
300 \noindent From this example, it can be glimpsed that using own
301 constructor sets is a little delicate since it changes the set of
302 valid patterns for values of that type. Without going into much
303 detail, here some practical hints:
307 \item When changing the constructor set for datatypes, take care
308 to provide alternative equations for the @{text case} combinator.
310 \item Values in the target language need not to be normalised --
311 different values in the target language may represent the same
314 \item Usually, a good methodology to deal with the subtleties of
315 pattern matching is to see the type as an abstract type: provide
316 a set of operations which operate on the concrete representation
317 of the type, and derive further operations by combinations of
318 these primitive ones, without relying on a particular
325 subsection {* Equality *}
328 Surely you have already noticed how equality is treated
329 by the code generator:
332 primrec %quote collect_duplicates :: "'a list \<Rightarrow> 'a list \<Rightarrow> 'a list \<Rightarrow> 'a list" where
333 "collect_duplicates xs ys [] = xs"
334 | "collect_duplicates xs ys (z#zs) = (if z \<in> set xs
335 then if z \<in> set ys
336 then collect_duplicates xs ys zs
337 else collect_duplicates xs (z#ys) zs
338 else collect_duplicates (z#xs) (z#ys) zs)"
341 \noindent The membership test during preprocessing is rewritten,
342 resulting in @{const List.member}, which itself
343 performs an explicit equality check.
346 text %quote {*@{code_stmts collect_duplicates (SML)}*}
349 \noindent Obviously, polymorphic equality is implemented the Haskell
350 way using a type class. How is this achieved? HOL introduces
351 an explicit class @{class eq} with a corresponding operation
352 @{const eq_class.eq} such that @{thm eq [no_vars]}.
353 The preprocessing framework does the rest by propagating the
354 @{class eq} constraints through all dependent code equations.
355 For datatypes, instances of @{class eq} are implicitly derived
356 when possible. For other types, you may instantiate @{text eq}
357 manually like any other type class.
361 subsection {* Explicit partiality *}
364 Partiality usually enters the game by partial patterns, as
365 in the following example, again for amortised queues:
368 definition %quote strict_dequeue :: "'a queue \<Rightarrow> 'a \<times> 'a queue" where
369 "strict_dequeue q = (case dequeue q
370 of (Some x, q') \<Rightarrow> (x, q'))"
372 lemma %quote strict_dequeue_AQueue [code]:
373 "strict_dequeue (AQueue xs (y # ys)) = (y, AQueue xs ys)"
374 "strict_dequeue (AQueue xs []) =
375 (case rev xs of y # ys \<Rightarrow> (y, AQueue [] ys))"
376 by (simp_all add: strict_dequeue_def dequeue_AQueue split: list.splits)
379 \noindent In the corresponding code, there is no equation
380 for the pattern @{term "AQueue [] []"}:
383 text %quote {*@{code_stmts strict_dequeue (consts) strict_dequeue (Haskell)}*}
386 \noindent In some cases it is desirable to have this
387 pseudo-\qt{partiality} more explicitly, e.g.~as follows:
390 axiomatization %quote empty_queue :: 'a
392 definition %quote strict_dequeue' :: "'a queue \<Rightarrow> 'a \<times> 'a queue" where
393 "strict_dequeue' q = (case dequeue q of (Some x, q') \<Rightarrow> (x, q') | _ \<Rightarrow> empty_queue)"
395 lemma %quote strict_dequeue'_AQueue [code]:
396 "strict_dequeue' (AQueue xs []) = (if xs = [] then empty_queue
397 else strict_dequeue' (AQueue [] (rev xs)))"
398 "strict_dequeue' (AQueue xs (y # ys)) =
400 by (simp_all add: strict_dequeue'_def dequeue_AQueue split: list.splits)
403 Observe that on the right hand side of the definition of @{const
404 "strict_dequeue'"}, the unspecified constant @{const empty_queue} occurs.
406 Normally, if constants without any code equations occur in a
407 program, the code generator complains (since in most cases this is
408 indeed an error). But such constants can also be thought
409 of as function definitions which always fail,
410 since there is never a successful pattern match on the left hand
411 side. In order to categorise a constant into that category
412 explicitly, use @{command "code_abort"}:
415 code_abort %quote empty_queue
418 \noindent Then the code generator will just insert an error or
419 exception at the appropriate position:
422 text %quote {*@{code_stmts strict_dequeue' (consts) empty_queue strict_dequeue' (Haskell)}*}
425 \noindent This feature however is rarely needed in practice.
426 Note also that the @{text HOL} default setup already declares
427 @{const undefined} as @{command "code_abort"}, which is most
428 likely to be used in such situations.
431 subsection {* Inductive Predicates *}
438 | "append xs ys zs ==> append (x # xs) ys (x # zs)"
441 To execute inductive predicates, a special preprocessor, the predicate
442 compiler, generates code equations from the introduction rules of the predicates.
443 The mechanisms of this compiler are described in \cite{Berghofer-Bulwahn-Haftmann:2009:TPHOL}.
444 Consider the simple predicate @{const append} given by these two
448 @{thm append.intros(1)[of ys]}\\
449 \noindent@{thm append.intros(2)[of xs ys zs x]}
452 \noindent To invoke the compiler, simply use @{command_def "code_pred"}:
454 code_pred %quote append .
456 \noindent The @{command "code_pred"} command takes the name
457 of the inductive predicate and then you put a period to discharge
458 a trivial correctness proof.
459 The compiler infers possible modes
460 for the predicate and produces the derived code equations.
461 Modes annotate which (parts of the) arguments are to be taken as input,
462 and which output. Modes are similar to types, but use the notation @{text "i"}
463 for input and @{text "o"} for output.
465 For @{term "append"}, the compiler can infer the following modes:
467 \item @{text "i \<Rightarrow> i \<Rightarrow> i \<Rightarrow> bool"}
468 \item @{text "i \<Rightarrow> i \<Rightarrow> o \<Rightarrow> bool"}
469 \item @{text "o \<Rightarrow> o \<Rightarrow> i \<Rightarrow> bool"}
471 You can compute sets of predicates using @{command_def "values"}:
473 values %quote "{zs. append [(1::nat),2,3] [4,5] zs}"
474 text {* \noindent outputs @{text "{[1, 2, 3, 4, 5]}"}, and *}
475 values %quote "{(xs, ys). append xs ys [(2::nat),3]}"
476 text {* \noindent outputs @{text "{([], [2, 3]), ([2], [3]), ([2, 3], [])}"}. *}
478 \noindent If you are only interested in the first elements of the set
479 comprehension (with respect to a depth-first search on the introduction rules), you can
481 @{command "values"} to specify the number of elements you want:
484 values %quote 1 "{(xs, ys). append xs ys [(1::nat),2,3,4]}"
485 values %quote 3 "{(xs, ys). append xs ys [(1::nat),2,3,4]}"
488 \noindent The @{command "values"} command can only compute set
489 comprehensions for which a mode has been inferred.
491 The code equations for a predicate are made available as theorems with
492 the suffix @{text "equation"}, and can be inspected with:
494 thm %quote append.equation
496 \noindent More advanced options are described in the following subsections.
498 subsubsection {* Alternative names for functions *}
500 By default, the functions generated from a predicate are named after the predicate with the
501 mode mangled into the name (e.g., @{text "append_i_i_o"}).
502 You can specify your own names as follows:
504 code_pred %quote (modes: i => i => o => bool as concat,
505 o => o => i => bool as split,
506 i => o => i => bool as suffix) append .
508 subsubsection {* Alternative introduction rules *}
510 Sometimes the introduction rules of an predicate are not executable because they contain
511 non-executable constants or specific modes could not be inferred.
512 It is also possible that the introduction rules yield a function that loops forever
513 due to the execution in a depth-first search manner.
514 Therefore, you can declare alternative introduction rules for predicates with the attribute
515 @{attribute "code_pred_intro"}.
516 For example, the transitive closure is defined by:
519 @{thm tranclp.intros(1)[of r a b]}\\
520 \noindent @{thm tranclp.intros(2)[of r a b c]}
523 \noindent These rules do not suit well for executing the transitive closure
524 with the mode @{text "(i \<Rightarrow> o \<Rightarrow> bool) \<Rightarrow> i \<Rightarrow> o \<Rightarrow> bool"}, as the second rule will
525 cause an infinite loop in the recursive call.
526 This can be avoided using the following alternative rules which are
527 declared to the predicate compiler by the attribute @{attribute "code_pred_intro"}:
529 lemma %quote [code_pred_intro]:
530 "r a b \<Longrightarrow> r\<^sup>+\<^sup>+ a b"
531 "r a b \<Longrightarrow> r\<^sup>+\<^sup>+ b c \<Longrightarrow> r\<^sup>+\<^sup>+ a c"
534 \noindent After declaring all alternative rules for the transitive closure,
535 you invoke @{command "code_pred"} as usual.
536 As you have declared alternative rules for the predicate, you are urged to prove that these
537 introduction rules are complete, i.e., that you can derive an elimination rule for the
540 code_pred %quote tranclp
543 from this converse_tranclpE[OF this(1)] show thesis by metis
546 \noindent Alternative rules can also be used for constants that have not
547 been defined inductively. For example, the lexicographic order which
550 @{thm [display] lexord_def[of "r"]}
553 \noindent To make it executable, you can derive the following two rules and prove the
557 lemma append: "append xs ys zs = (xs @ ys = zs)"
558 by (induct xs arbitrary: ys zs) (auto elim: append.cases intro: append.intros)
560 lemma %quote [code_pred_intro]:
561 "append xs (a # v) ys \<Longrightarrow> lexord r (xs, ys)"
562 (*<*)unfolding lexord_def Collect_def by (auto simp add: append)(*>*)
564 lemma %quote [code_pred_intro]:
565 "append u (a # v) xs \<Longrightarrow> append u (b # w) ys \<Longrightarrow> r (a, b)
566 \<Longrightarrow> lexord r (xs, ys)"
567 (*<*)unfolding lexord_def Collect_def append mem_def apply simp
568 apply (rule disjI2) by auto
571 code_pred %quote lexord
575 assume lexord: "lexord r (xs, ys)"
576 assume 1: "\<And> r' xs' ys' a v. r = r' \<Longrightarrow> (xs, ys) = (xs', ys') \<Longrightarrow> append xs' (a # v) ys' \<Longrightarrow> thesis"
577 assume 2: "\<And> r' xs' ys' u a v b w. r = r' \<Longrightarrow> (xs, ys) = (xs', ys') \<Longrightarrow> append u (a # v) xs' \<Longrightarrow> append u (b # w) ys' \<Longrightarrow> r' (a, b) \<Longrightarrow> thesis"
579 assume "\<exists>a v. ys = xs @ a # v"
580 from this 1 have thesis
581 by (fastsimp simp add: append)
584 assume "\<exists>u a b v w. (a, b) \<in> r \<and> xs = u @ a # v \<and> ys = u @ b # w"
585 from this 2 have thesis by (fastsimp simp add: append mem_def)
588 ultimately show thesis
590 by (fastsimp simp add: Collect_def)
593 subsubsection {* Options for values *}
595 In the presence of higher-order predicates, multiple modes for some predicate could be inferred
596 that are not disambiguated by the pattern of the set comprehension.
597 To disambiguate the modes for the arguments of a predicate, you can state
598 the modes explicitly in the @{command "values"} command.
599 Consider the simple predicate @{term "succ"}:
601 inductive succ :: "nat \<Rightarrow> nat \<Rightarrow> bool"
604 | "succ x y \<Longrightarrow> succ (Suc x) (Suc y)"
609 \noindent For this, the predicate compiler can infer modes @{text "o \<Rightarrow> o \<Rightarrow> bool"}, @{text "i \<Rightarrow> o \<Rightarrow> bool"},
610 @{text "o \<Rightarrow> i \<Rightarrow> bool"} and @{text "i \<Rightarrow> i \<Rightarrow> bool"}.
611 The invocation of @{command "values"} @{text "{n. tranclp succ 10 n}"} loops, as multiple
612 modes for the predicate @{text "succ"} are possible and here the first mode @{text "o \<Rightarrow> o \<Rightarrow> bool"}
613 is chosen. To choose another mode for the argument, you can declare the mode for the argument between
614 the @{command "values"} and the number of elements.
616 values %quote [mode: i => o => bool] 20 "{n. tranclp succ 10 n}"
617 values %quote [mode: o => i => bool] 10 "{n. tranclp succ n 10}"
619 subsubsection {* Embedding into functional code within Isabelle/HOL *}
621 To embed the computation of an inductive predicate into functions that are defined in Isabelle/HOL,
622 you have a number of options:
624 \item You want to use the first-order predicate with the mode
625 where all arguments are input. Then you can use the predicate directly, e.g.
627 @{text "valid_suffix ys zs = "}\\
628 @{text "(if append [Suc 0, 2] ys zs then Some ys else None)"}
630 \item If you know that the execution returns only one value (it is deterministic), then you can
631 use the combinator @{term "Predicate.the"}, e.g., a functional concatenation of lists
634 @{term "functional_concat xs ys = Predicate.the (append_i_i_o xs ys)"}
636 Note that if the evaluation does not return a unique value, it raises a run-time error
637 @{term "not_unique"}.
640 subsubsection {* Further Examples *}
641 text {* Further examples for compiling inductive predicates can be found in
642 the @{text "HOL/ex/Predicate_Compile_ex"} theory file.
643 There are also some examples in the Archive of Formal Proofs, notably
644 in the @{text "POPLmark-deBruijn"} and the @{text "FeatherweightJava"} sessions.