5 section {* Turning Theories into Programs \label{sec:program} *}
7 subsection {* The @{text "Isabelle/HOL"} default setup *}
10 We have already seen how by default equations stemming from
11 @{command definition}/@{command primrec}/@{command fun}
12 statements are used for code generation. This default behaviour
13 can be changed, e.g. by providing different code equations.
14 All kinds of customisation shown in this section is \emph{safe}
15 in the sense that the user does not have to worry about
16 correctness -- all programs generatable that way are partially
20 subsection {* Selecting code equations *}
23 Coming back to our introductory example, we
24 could provide an alternative code equations for @{const dequeue}
29 "dequeue (AQueue xs []) =
30 (if xs = [] then (None, AQueue [] [])
31 else dequeue (AQueue [] (rev xs)))"
32 "dequeue (AQueue xs (y # ys)) =
33 (Some y, AQueue xs ys)"
34 by (cases xs, simp_all) (cases "rev xs", simp_all)
37 \noindent The annotation @{text "[code]"} is an @{text Isar}
38 @{text attribute} which states that the given theorems should be
39 considered as code equations for a @{text fun} statement --
40 the corresponding constant is determined syntactically. The resulting code:
43 text %quote {*@{code_stmts dequeue (consts) dequeue (Haskell)}*}
46 \noindent You may note that the equality test @{term "xs = []"} has been
47 replaced by the predicate @{term "null xs"}. This is due to the default
48 setup in the \qn{preprocessor} to be discussed further below (\secref{sec:preproc}).
50 Changing the default constructor set of datatypes is also
51 possible. See \secref{sec:datatypes} for an example.
53 As told in \secref{sec:concept}, code generation is based
54 on a structured collection of code theorems.
55 For explorative purpose, this collection
56 may be inspected using the @{command code_thms} command:
59 code_thms %quote dequeue
62 \noindent prints a table with \emph{all} code equations
63 for @{const dequeue}, including
64 \emph{all} code equations those equations depend
67 Similarly, the @{command code_deps} command shows a graph
68 visualising dependencies between code equations.
71 subsection {* @{text class} and @{text instantiation} *}
74 Concerning type classes and code generation, let us examine an example
75 from abstract algebra:
78 class %quote semigroup =
79 fixes mult :: "'a \<Rightarrow> 'a \<Rightarrow> 'a" (infixl "\<otimes>" 70)
80 assumes assoc: "(x \<otimes> y) \<otimes> z = x \<otimes> (y \<otimes> z)"
82 class %quote monoid = semigroup +
83 fixes neutral :: 'a ("\<one>")
84 assumes neutl: "\<one> \<otimes> x = x"
85 and neutr: "x \<otimes> \<one> = x"
87 instantiation %quote nat :: monoid
90 primrec %quote mult_nat where
91 "0 \<otimes> n = (0\<Colon>nat)"
92 | "Suc m \<otimes> n = n + m \<otimes> n"
94 definition %quote neutral_nat where
97 lemma %quote add_mult_distrib:
99 shows "(n + m) \<otimes> q = n \<otimes> q + m \<otimes> q"
100 by (induct n) simp_all
102 instance %quote proof
104 show "m \<otimes> n \<otimes> q = m \<otimes> (n \<otimes> q)"
105 by (induct m) (simp_all add: add_mult_distrib)
106 show "\<one> \<otimes> n = n"
107 by (simp add: neutral_nat_def)
108 show "m \<otimes> \<one> = m"
109 by (induct m) (simp_all add: neutral_nat_def)
115 \noindent We define the natural operation of the natural numbers
119 primrec %quote (in monoid) pow :: "nat \<Rightarrow> 'a \<Rightarrow> 'a" where
121 | "pow (Suc n) a = a \<otimes> pow n a"
124 \noindent This we use to define the discrete exponentiation function:
127 definition %quote bexp :: "nat \<Rightarrow> nat" where
128 "bexp n = pow n (Suc (Suc 0))"
131 \noindent The corresponding code:
134 text %quote {*@{code_stmts bexp (Haskell)}*}
137 \noindent This is a convenient place to show how explicit dictionary construction
138 manifests in generated code (here, the same example in @{text SML}):
141 text %quote {*@{code_stmts bexp (SML)}*}
144 \noindent Note the parameters with trailing underscore (@{verbatim "A_"})
145 which are the dictionary parameters.
148 subsection {* The preprocessor \label{sec:preproc} *}
151 Before selected function theorems are turned into abstract
152 code, a chain of definitional transformation steps is carried
153 out: \emph{preprocessing}. In essence, the preprocessor
154 consists of two components: a \emph{simpset} and \emph{function transformers}.
156 The \emph{simpset} allows to employ the full generality of the Isabelle
157 simplifier. Due to the interpretation of theorems
158 as code equations, rewrites are applied to the right
159 hand side and the arguments of the left hand side of an
160 equation, but never to the constant heading the left hand side.
161 An important special case are \emph{inline theorems} which may be
162 declared and undeclared using the
163 \emph{code inline} or \emph{code inline del} attribute respectively.
165 Some common applications:
173 \item replacing non-executable constructs by executable ones:
176 lemma %quote [code inline]:
177 "x \<in> set xs \<longleftrightarrow> x mem xs" by (induct xs) simp_all
180 \item eliminating superfluous constants:
183 lemma %quote [code inline]:
187 \item replacing executable but inconvenient constructs:
190 lemma %quote [code inline]:
191 "xs = [] \<longleftrightarrow> List.null xs" by (induct xs) simp_all
198 \noindent \emph{Function transformers} provide a very general interface,
199 transforming a list of function theorems to another
200 list of function theorems, provided that neither the heading
201 constant nor its type change. The @{term "0\<Colon>nat"} / @{const Suc}
202 pattern elimination implemented in
203 theory @{text Efficient_Nat} (see \secref{eff_nat}) uses this
206 \noindent The current setup of the preprocessor may be inspected using
207 the @{command print_codesetup} command.
208 @{command code_thms} provides a convenient
209 mechanism to inspect the impact of a preprocessor setup
213 The attribute \emph{code unfold}
214 associated with the @{text "SML code generator"} also applies to
215 the @{text "generic code generator"}:
216 \emph{code unfold} implies \emph{code inline}.
220 subsection {* Datatypes \label{sec:datatypes} *}
223 Conceptually, any datatype is spanned by a set of
224 \emph{constructors} of type @{text "\<tau> = \<dots> \<Rightarrow> \<kappa> \<alpha>\<^isub>1 \<dots> \<alpha>\<^isub>n"} where @{text
225 "{\<alpha>\<^isub>1, \<dots>, \<alpha>\<^isub>n}"} is exactly the set of \emph{all} type variables in
226 @{text "\<tau>"}. The HOL datatype package by default registers any new
227 datatype in the table of datatypes, which may be inspected using the
228 @{command print_codesetup} command.
230 In some cases, it is appropriate to alter or extend this table. As
231 an example, we will develop an alternative representation of the
232 queue example given in \secref{sec:intro}. The amortised
233 representation is convenient for generating code but exposes its
234 \qt{implementation} details, which may be cumbersome when proving
235 theorems about it. Therefore, here a simple, straightforward
236 representation of queues:
239 datatype %quote 'a queue = Queue "'a list"
241 definition %quote empty :: "'a queue" where
244 primrec %quote enqueue :: "'a \<Rightarrow> 'a queue \<Rightarrow> 'a queue" where
245 "enqueue x (Queue xs) = Queue (xs @ [x])"
247 fun %quote dequeue :: "'a queue \<Rightarrow> 'a option \<times> 'a queue" where
248 "dequeue (Queue []) = (None, Queue [])"
249 | "dequeue (Queue (x # xs)) = (Some x, Queue xs)"
252 \noindent This we can use directly for proving; for executing,
253 we provide an alternative characterisation:
256 definition %quote AQueue :: "'a list \<Rightarrow> 'a list \<Rightarrow> 'a queue" where
257 "AQueue xs ys = Queue (ys @ rev xs)"
259 code_datatype %quote AQueue
262 \noindent Here we define a \qt{constructor} @{const "AQueue"} which
263 is defined in terms of @{text "Queue"} and interprets its arguments
264 according to what the \emph{content} of an amortised queue is supposed
265 to be. Equipped with this, we are able to prove the following equations
266 for our primitive queue operations which \qt{implement} the simple
267 queues in an amortised fashion:
270 lemma %quote empty_AQueue [code]:
271 "empty = AQueue [] []"
272 unfolding AQueue_def empty_def by simp
274 lemma %quote enqueue_AQueue [code]:
275 "enqueue x (AQueue xs ys) = AQueue (x # xs) ys"
276 unfolding AQueue_def by simp
278 lemma %quote dequeue_AQueue [code]:
279 "dequeue (AQueue xs []) =
280 (if xs = [] then (None, AQueue [] [])
281 else dequeue (AQueue [] (rev xs)))"
282 "dequeue (AQueue xs (y # ys)) = (Some y, AQueue xs ys)"
283 unfolding AQueue_def by simp_all
286 \noindent For completeness, we provide a substitute for the
287 @{text case} combinator on queues:
291 aqueue_case_def: "aqueue_case = queue_case"
293 lemma %quote aqueue_case [code, code inline]:
294 "queue_case = aqueue_case"
295 unfolding aqueue_case_def ..
297 lemma %quote case_AQueue [code]:
298 "aqueue_case f (AQueue xs ys) = f (ys @ rev xs)"
299 unfolding aqueue_case_def AQueue_def by simp
302 \noindent The resulting code looks as expected:
305 text %quote {*@{code_stmts empty enqueue dequeue (SML)}*}
308 \noindent From this example, it can be glimpsed that using own
309 constructor sets is a little delicate since it changes the set of
310 valid patterns for values of that type. Without going into much
311 detail, here some practical hints:
315 \item When changing the constructor set for datatypes, take care
316 to provide an alternative for the @{text case} combinator
317 (e.g.~by replacing it using the preprocessor).
319 \item Values in the target language need not to be normalised --
320 different values in the target language may represent the same
323 \item Usually, a good methodology to deal with the subtleties of
324 pattern matching is to see the type as an abstract type: provide
325 a set of operations which operate on the concrete representation
326 of the type, and derive further operations by combinations of
327 these primitive ones, without relying on a particular
334 subsection {* Equality and wellsortedness *}
337 Surely you have already noticed how equality is treated
338 by the code generator:
341 primrec %quote collect_duplicates :: "'a list \<Rightarrow> 'a list \<Rightarrow> 'a list \<Rightarrow> 'a list" where
342 "collect_duplicates xs ys [] = xs"
343 | "collect_duplicates xs ys (z#zs) = (if z \<in> set xs
344 then if z \<in> set ys
345 then collect_duplicates xs ys zs
346 else collect_duplicates xs (z#ys) zs
347 else collect_duplicates (z#xs) (z#ys) zs)"
350 \noindent The membership test during preprocessing is rewritten,
351 resulting in @{const List.member}, which itself
352 performs an explicit equality check.
355 text %quote {*@{code_stmts collect_duplicates (SML)}*}
358 \noindent Obviously, polymorphic equality is implemented the Haskell
359 way using a type class. How is this achieved? HOL introduces
360 an explicit class @{class eq} with a corresponding operation
361 @{const eq_class.eq} such that @{thm eq [no_vars]}.
362 The preprocessing framework does the rest by propagating the
363 @{class eq} constraints through all dependent code equations.
364 For datatypes, instances of @{class eq} are implicitly derived
365 when possible. For other types, you may instantiate @{text eq}
366 manually like any other type class.
368 Though this @{text eq} class is designed to get rarely in
370 enters the stage when definitions of overloaded constants
371 are dependent on operational equality. For example, let
372 us define a lexicographic ordering on tuples
373 (also see theory @{theory Product_ord}):
376 instantiation %quote "*" :: (order, order) order
379 definition %quote [code del]:
380 "x \<le> y \<longleftrightarrow> fst x < fst y \<or> fst x = fst y \<and> snd x \<le> snd y"
382 definition %quote [code del]:
383 "x < y \<longleftrightarrow> fst x < fst y \<or> fst x = fst y \<and> snd x < snd y"
385 instance %quote proof
386 qed (auto simp: less_eq_prod_def less_prod_def intro: order_less_trans)
390 lemma %quote order_prod [code]:
391 "(x1 \<Colon> 'a\<Colon>order, y1 \<Colon> 'b\<Colon>order) < (x2, y2) \<longleftrightarrow>
392 x1 < x2 \<or> x1 = x2 \<and> y1 < y2"
393 "(x1 \<Colon> 'a\<Colon>order, y1 \<Colon> 'b\<Colon>order) \<le> (x2, y2) \<longleftrightarrow>
394 x1 < x2 \<or> x1 = x2 \<and> y1 \<le> y2"
395 by (simp_all add: less_prod_def less_eq_prod_def)
398 \noindent Then code generation will fail. Why? The definition
399 of @{term "op \<le>"} depends on equality on both arguments,
400 which are polymorphic and impose an additional @{class eq}
401 class constraint, which the preprocessor does not propagate
402 (for technical reasons).
404 The solution is to add @{class eq} explicitly to the first sort arguments in the
408 lemma %quote order_prod_code [code]:
409 "(x1 \<Colon> 'a\<Colon>{order, eq}, y1 \<Colon> 'b\<Colon>order) < (x2, y2) \<longleftrightarrow>
410 x1 < x2 \<or> x1 = x2 \<and> y1 < y2"
411 "(x1 \<Colon> 'a\<Colon>{order, eq}, y1 \<Colon> 'b\<Colon>order) \<le> (x2, y2) \<longleftrightarrow>
412 x1 < x2 \<or> x1 = x2 \<and> y1 \<le> y2"
413 by (simp_all add: less_prod_def less_eq_prod_def)
416 \noindent Then code generation succeeds:
419 text %quote {*@{code_stmts "op \<le> \<Colon> _ \<times> _ \<Rightarrow> _ \<times> _ \<Rightarrow> bool" (SML)}*}
422 In some cases, the automatically derived code equations
423 for equality on a particular type may not be appropriate.
424 As example, watch the following datatype representing
425 monomorphic parametric types (where type constructors
426 are referred to by natural numbers):
429 datatype %quote monotype = Mono nat "monotype list"
432 "eq_class.eq (Mono tyco1 typargs1) (Mono tyco2 typargs2) \<equiv>
433 eq_class.eq tyco1 tyco2 \<and> eq_class.eq typargs1 typargs2" by (simp add: eq)
437 \noindent Then code generation for SML would fail with a message
438 that the generated code contains illegal mutual dependencies:
439 the theorem @{thm monotype_eq [no_vars]} already requires the
440 instance @{text "monotype \<Colon> eq"}, which itself requires
441 @{thm monotype_eq [no_vars]}; Haskell has no problem with mutually
442 recursive @{text instance} and @{text function} definitions,
443 but the SML serialiser does not support this.
445 In such cases, you have to provide your own equality equations
446 involving auxiliary constants. In our case,
447 @{const [show_types] list_all2} can do the job:
450 lemma %quote monotype_eq_list_all2 [code]:
451 "eq_class.eq (Mono tyco1 typargs1) (Mono tyco2 typargs2) \<longleftrightarrow>
452 eq_class.eq tyco1 tyco2 \<and> list_all2 eq_class.eq typargs1 typargs2"
453 by (simp add: eq list_all2_eq [symmetric])
456 \noindent does not depend on instance @{text "monotype \<Colon> eq"}:
459 text %quote {*@{code_stmts "eq_class.eq :: monotype \<Rightarrow> monotype \<Rightarrow> bool" (SML)}*}
462 subsection {* Explicit partiality *}
465 Partiality usually enters the game by partial patterns, as
466 in the following example, again for amortised queues:
469 definition %quote strict_dequeue :: "'a queue \<Rightarrow> 'a \<times> 'a queue" where
470 "strict_dequeue q = (case dequeue q
471 of (Some x, q') \<Rightarrow> (x, q'))"
473 lemma %quote strict_dequeue_AQueue [code]:
474 "strict_dequeue (AQueue xs (y # ys)) = (y, AQueue xs ys)"
475 "strict_dequeue (AQueue xs []) =
476 (case rev xs of y # ys \<Rightarrow> (y, AQueue [] ys))"
477 by (simp_all add: strict_dequeue_def dequeue_AQueue split: list.splits)
480 \noindent In the corresponding code, there is no equation
481 for the pattern @{term "AQueue [] []"}:
484 text %quote {*@{code_stmts strict_dequeue (consts) strict_dequeue (Haskell)}*}
487 \noindent In some cases it is desirable to have this
488 pseudo-\qt{partiality} more explicitly, e.g.~as follows:
491 axiomatization %quote empty_queue :: 'a
493 definition %quote strict_dequeue' :: "'a queue \<Rightarrow> 'a \<times> 'a queue" where
494 "strict_dequeue' q = (case dequeue q of (Some x, q') \<Rightarrow> (x, q') | _ \<Rightarrow> empty_queue)"
496 lemma %quote strict_dequeue'_AQueue [code]:
497 "strict_dequeue' (AQueue xs []) = (if xs = [] then empty_queue
498 else strict_dequeue' (AQueue [] (rev xs)))"
499 "strict_dequeue' (AQueue xs (y # ys)) =
501 by (simp_all add: strict_dequeue'_def dequeue_AQueue split: list.splits)
504 Observe that on the right hand side of the definition of @{const
505 "strict_dequeue'"} the constant @{const empty_queue} occurs
506 which is unspecified.
508 Normally, if constants without any code equations occur in a
509 program, the code generator complains (since in most cases this is
510 not what the user expects). But such constants can also be thought
511 of as function definitions with no equations which always fail,
512 since there is never a successful pattern match on the left hand
513 side. In order to categorise a constant into that category
514 explicitly, use @{command "code_abort"}:
517 code_abort %quote empty_queue
520 \noindent Then the code generator will just insert an error or
521 exception at the appropriate position:
524 text %quote {*@{code_stmts strict_dequeue' (consts) empty_queue strict_dequeue' (Haskell)}*}
527 \noindent This feature however is rarely needed in practice.
528 Note also that the @{text HOL} default setup already declares
529 @{const undefined} as @{command "code_abort"}, which is most
530 likely to be used in such situations.