haftmann@31050: theory Adaptation haftmann@28213: imports Setup haftmann@28213: begin haftmann@28213: haftmann@28679: setup %invisible {* Code_Target.extend_target ("\", ("SML", K I)) *} haftmann@28561: haftmann@31050: section {* Adaptation to target languages \label{sec:adaptation} *} haftmann@28213: haftmann@28561: subsection {* Adapting code generation *} haftmann@28561: haftmann@28561: text {* haftmann@28561: The aspects of code generation introduced so far have two aspects haftmann@28561: in common: haftmann@28561: haftmann@28561: \begin{itemize} haftmann@28561: \item They act uniformly, without reference to a specific haftmann@28561: target language. haftmann@28561: \item They are \emph{safe} in the sense that as long as you trust haftmann@28561: the code generator meta theory and implementation, you cannot haftmann@28561: produce programs that yield results which are not derivable haftmann@28561: in the logic. haftmann@28561: \end{itemize} haftmann@28561: haftmann@28561: \noindent In this section we will introduce means to \emph{adapt} the serialiser haftmann@28561: to a specific target language, i.e.~to print program fragments haftmann@28593: in a way which accommodates \qt{already existing} ingredients of haftmann@28561: a target language environment, for three reasons: haftmann@28561: haftmann@28561: \begin{itemize} haftmann@28593: \item improving readability and aesthetics of generated code haftmann@28561: \item gaining efficiency haftmann@28561: \item interface with language parts which have no direct counterpart haftmann@28561: in @{text "HOL"} (say, imperative data structures) haftmann@28561: \end{itemize} haftmann@28561: haftmann@28561: \noindent Generally, you should avoid using those features yourself haftmann@28561: \emph{at any cost}: haftmann@28561: haftmann@28561: \begin{itemize} haftmann@28561: \item The safe configuration methods act uniformly on every target language, paulson@34155: whereas for adaptation you have to treat each target language separately. haftmann@28561: \item Application is extremely tedious since there is no abstraction haftmann@28593: which would allow for a static check, making it easy to produce garbage. paulson@34155: \item Subtle errors can be introduced unconsciously. haftmann@28561: \end{itemize} haftmann@28561: haftmann@31050: \noindent However, even if you ought refrain from setting up adaptation haftmann@28561: yourself, already the @{text "HOL"} comes with some reasonable default haftmann@31050: adaptations (say, using target language list syntax). There also some haftmann@31050: common adaptation cases which you can setup by importing particular haftmann@28561: library theories. In order to understand these, we provide some clues here; haftmann@28561: these however are not supposed to replace a careful study of the sources. haftmann@28561: *} haftmann@28561: haftmann@31050: subsection {* The adaptation principle *} haftmann@28561: haftmann@28561: text {* haftmann@31050: Figure \ref{fig:adaptation} illustrates what \qt{adaptation} is conceptually haftmann@28601: supposed to be: haftmann@28601: haftmann@28601: \begin{figure}[here] haftmann@31050: \includegraphics{adaptation} haftmann@31050: \caption{The adaptation principle} haftmann@31050: \label{fig:adaptation} haftmann@28601: \end{figure} haftmann@28601: haftmann@28601: \noindent In the tame view, code generation acts as broker between haftmann@28601: @{text logic}, @{text "intermediate language"} and haftmann@28601: @{text "target language"} by means of @{text translation} and haftmann@28601: @{text serialisation}; for the latter, the serialiser has to observe haftmann@28601: the structure of the @{text language} itself plus some @{text reserved} haftmann@28601: keywords which have to be avoided for generated code. haftmann@31050: However, if you consider @{text adaptation} mechanisms, the code generated haftmann@28601: by the serializer is just the tip of the iceberg: haftmann@28601: haftmann@28601: \begin{itemize} haftmann@28635: \item @{text serialisation} can be \emph{parametrised} such that haftmann@28635: logical entities are mapped to target-specific ones haftmann@28635: (e.g. target-specific list syntax, haftmann@31050: see also \secref{sec:adaptation_mechanisms}) haftmann@28635: \item Such parametrisations can involve references to a haftmann@28635: target-specific standard @{text library} (e.g. using haftmann@28635: the @{text Haskell} @{verbatim Maybe} type instead haftmann@28635: of the @{text HOL} @{type "option"} type); haftmann@28635: if such are used, the corresponding identifiers haftmann@28635: (in our example, @{verbatim Maybe}, @{verbatim Nothing} haftmann@28635: and @{verbatim Just}) also have to be considered @{text reserved}. haftmann@28635: \item Even more, the user can enrich the library of the haftmann@28635: target-language by providing code snippets haftmann@28635: (\qt{@{text "includes"}}) which are prepended to haftmann@28635: any generated code (see \secref{sec:include}); this typically haftmann@28635: also involves further @{text reserved} identifiers. haftmann@28601: \end{itemize} haftmann@28635: haftmann@31050: \noindent As figure \ref{fig:adaptation} illustrates, all these adaptation mechanisms haftmann@28635: have to act consistently; it is at the discretion of the user haftmann@28635: to take care for this. haftmann@28561: *} haftmann@28561: haftmann@31050: subsection {* Common adaptation patterns *} haftmann@28419: haftmann@28419: text {* haftmann@28428: The @{theory HOL} @{theory Main} theory already provides a code haftmann@28419: generator setup haftmann@28593: which should be suitable for most applications. Common extensions haftmann@28419: and modifications are available by certain theories of the @{text HOL} haftmann@28419: library; beside being useful in applications, they may serve haftmann@28419: as a tutorial for customising the code generator setup (see below haftmann@31050: \secref{sec:adaptation_mechanisms}). haftmann@28419: haftmann@28419: \begin{description} haftmann@28419: haftmann@28419: \item[@{theory "Code_Integer"}] represents @{text HOL} integers by big haftmann@28419: integer literals in target languages. haftmann@28419: \item[@{theory "Code_Char"}] represents @{text HOL} characters by haftmann@28419: character literals in target languages. haftmann@28419: \item[@{theory "Code_Char_chr"}] like @{text "Code_Char"}, haftmann@28419: but also offers treatment of character codes; includes haftmann@28561: @{theory "Code_Char"}. haftmann@28419: \item[@{theory "Efficient_Nat"}] \label{eff_nat} implements natural numbers by integers, haftmann@28419: which in general will result in higher efficiency; pattern haftmann@28419: matching with @{term "0\nat"} / @{const "Suc"} haftmann@28561: is eliminated; includes @{theory "Code_Integer"} haftmann@31206: and @{theory "Code_Numeral"}. haftmann@31206: \item[@{theory "Code_Numeral"}] provides an additional datatype haftmann@28419: @{typ index} which is mapped to target-language built-in integers. haftmann@28419: Useful for code setups which involve e.g. indexing of haftmann@28419: target-language arrays. haftmann@31050: \item[@{theory "String"}] provides an additional datatype haftmann@31205: @{typ String.literal} which is isomorphic to strings; haftmann@31205: @{typ String.literal}s are mapped to target-language strings. haftmann@28419: Useful for code setups which involve e.g. printing (error) messages. haftmann@28419: haftmann@28419: \end{description} haftmann@28419: haftmann@28419: \begin{warn} haftmann@28419: When importing any of these theories, they should form the last haftmann@28419: items in an import list. Since these theories adapt the haftmann@28419: code generator setup in a non-conservative fashion, haftmann@28419: strange effects may occur otherwise. haftmann@28419: \end{warn} haftmann@28419: *} haftmann@28419: haftmann@28419: haftmann@31050: subsection {* Parametrising serialisation \label{sec:adaptation_mechanisms} *} haftmann@28419: haftmann@28419: text {* haftmann@28561: Consider the following function and its corresponding haftmann@28419: SML code: haftmann@28419: *} haftmann@28419: haftmann@28564: primrec %quote in_interval :: "nat \ nat \ nat \ bool" where haftmann@28419: "in_interval (k, l) n \ k \ n \ n \ l" haftmann@28447: (*<*) haftmann@28419: code_type %invisible bool haftmann@28419: (SML) haftmann@28419: code_const %invisible True and False and "op \" and Not haftmann@28419: (SML and and and) haftmann@28447: (*>*) haftmann@28564: text %quote {*@{code_stmts in_interval (SML)}*} haftmann@28419: haftmann@28419: text {* haftmann@28419: \noindent Though this is correct code, it is a little bit unsatisfactory: haftmann@28419: boolean values and operators are materialised as distinguished haftmann@28419: entities with have nothing to do with the SML-built-in notion haftmann@28419: of \qt{bool}. This results in less readable code; haftmann@28419: additionally, eager evaluation may cause programs to haftmann@28419: loop or break which would perfectly terminate when haftmann@28419: the existing SML @{verbatim "bool"} would be used. To map haftmann@28419: the HOL @{typ bool} on SML @{verbatim "bool"}, we may use haftmann@28419: \qn{custom serialisations}: haftmann@28419: *} haftmann@28419: haftmann@28564: code_type %quotett bool haftmann@28419: (SML "bool") haftmann@28564: code_const %quotett True and False and "op \" haftmann@28419: (SML "true" and "false" and "_ andalso _") haftmann@28419: haftmann@28419: text {* haftmann@28447: \noindent The @{command code_type} command takes a type constructor haftmann@28419: as arguments together with a list of custom serialisations. haftmann@28419: Each custom serialisation starts with a target language haftmann@28419: identifier followed by an expression, which during haftmann@28419: code serialisation is inserted whenever the type constructor haftmann@28419: would occur. For constants, @{command code_const} implements haftmann@28419: the corresponding mechanism. Each ``@{verbatim "_"}'' in haftmann@28419: a serialisation expression is treated as a placeholder haftmann@28419: for the type constructor's (the constant's) arguments. haftmann@28419: *} haftmann@28419: haftmann@28564: text %quote {*@{code_stmts in_interval (SML)}*} haftmann@28419: haftmann@28419: text {* haftmann@28419: \noindent This still is not perfect: the parentheses haftmann@28419: around the \qt{andalso} expression are superfluous. haftmann@28593: Though the serialiser haftmann@28419: by no means attempts to imitate the rich Isabelle syntax haftmann@28419: framework, it provides some common idioms, notably haftmann@28419: associative infixes with precedences which may be used here: haftmann@28419: *} haftmann@28419: haftmann@28564: code_const %quotett "op \" haftmann@28419: (SML infixl 1 "andalso") haftmann@28419: haftmann@28564: text %quote {*@{code_stmts in_interval (SML)}*} haftmann@28419: haftmann@28419: text {* haftmann@28561: \noindent The attentive reader may ask how we assert that no generated haftmann@28561: code will accidentally overwrite. For this reason the serialiser has haftmann@28561: an internal table of identifiers which have to be avoided to be used haftmann@28561: for new declarations. Initially, this table typically contains the haftmann@28561: keywords of the target language. It can be extended manually, thus avoiding haftmann@28561: accidental overwrites, using the @{command "code_reserved"} command: haftmann@28561: *} haftmann@28561: haftmann@28601: code_reserved %quote "\" bool true false andalso haftmann@28561: haftmann@28561: text {* haftmann@28447: \noindent Next, we try to map HOL pairs to SML pairs, using the haftmann@28419: infix ``@{verbatim "*"}'' type constructor and parentheses: haftmann@28419: *} haftmann@28447: (*<*) haftmann@28419: code_type %invisible * haftmann@28419: (SML) haftmann@28419: code_const %invisible Pair haftmann@28419: (SML) haftmann@28447: (*>*) haftmann@28564: code_type %quotett * haftmann@28419: (SML infix 2 "*") haftmann@28564: code_const %quotett Pair haftmann@28419: (SML "!((_),/ (_))") haftmann@28419: haftmann@28419: text {* haftmann@28593: \noindent The initial bang ``@{verbatim "!"}'' tells the serialiser haftmann@28561: never to put haftmann@28419: parentheses around the whole expression (they are already present), haftmann@28419: while the parentheses around argument place holders haftmann@28419: tell not to put parentheses around the arguments. haftmann@28419: The slash ``@{verbatim "/"}'' (followed by arbitrary white space) haftmann@28419: inserts a space which may be used as a break if necessary haftmann@28419: during pretty printing. haftmann@28419: haftmann@28419: These examples give a glimpse what mechanisms haftmann@28419: custom serialisations provide; however their usage haftmann@28419: requires careful thinking in order not to introduce haftmann@28419: inconsistencies -- or, in other words: haftmann@28419: custom serialisations are completely axiomatic. haftmann@28419: haftmann@28419: A further noteworthy details is that any special haftmann@28419: character in a custom serialisation may be quoted haftmann@28419: using ``@{verbatim "'"}''; thus, in haftmann@28419: ``@{verbatim "fn '_ => _"}'' the first haftmann@28419: ``@{verbatim "_"}'' is a proper underscore while the haftmann@28419: second ``@{verbatim "_"}'' is a placeholder. haftmann@28419: *} haftmann@28419: haftmann@28419: haftmann@28419: subsection {* @{text Haskell} serialisation *} haftmann@28419: haftmann@28419: text {* haftmann@28419: For convenience, the default haftmann@28419: @{text HOL} setup for @{text Haskell} maps the @{class eq} class to haftmann@28419: its counterpart in @{text Haskell}, giving custom serialisations haftmann@28419: for the class @{class eq} (by command @{command code_class}) and its operation haftmann@28419: @{const HOL.eq} haftmann@28419: *} haftmann@28419: haftmann@28564: code_class %quotett eq haftmann@28714: (Haskell "Eq") haftmann@28419: haftmann@28564: code_const %quotett "op =" haftmann@28419: (Haskell infixl 4 "==") haftmann@28419: haftmann@28419: text {* haftmann@28447: \noindent A problem now occurs whenever a type which haftmann@28419: is an instance of @{class eq} in @{text HOL} is mapped haftmann@28419: on a @{text Haskell}-built-in type which is also an instance haftmann@28419: of @{text Haskell} @{text Eq}: haftmann@28419: *} haftmann@28419: haftmann@28564: typedecl %quote bar haftmann@28419: haftmann@28564: instantiation %quote bar :: eq haftmann@28419: begin haftmann@28419: haftmann@28564: definition %quote "eq_class.eq (x\bar) y \ x = y" haftmann@28419: haftmann@28564: instance %quote by default (simp add: eq_bar_def) haftmann@28213: haftmann@30880: end %quote (*<*) haftmann@30880: haftmann@30880: (*>*) code_type %quotett bar haftmann@28419: (Haskell "Integer") haftmann@28419: haftmann@28419: text {* haftmann@28447: \noindent The code generator would produce haftmann@28593: an additional instance, which of course is rejected by the @{text Haskell} haftmann@28419: compiler. haftmann@28419: To suppress this additional instance, use haftmann@28419: @{text "code_instance"}: haftmann@28419: *} haftmann@28419: haftmann@28564: code_instance %quotett bar :: eq haftmann@28419: (Haskell -) haftmann@28419: haftmann@28561: haftmann@28635: subsection {* Enhancing the target language context \label{sec:include} *} haftmann@28561: haftmann@28561: text {* haftmann@28593: In rare cases it is necessary to \emph{enrich} the context of a haftmann@28561: target language; this is accomplished using the @{command "code_include"} haftmann@28561: command: haftmann@28561: *} haftmann@28561: haftmann@28564: code_include %quotett Haskell "Errno" haftmann@28561: {*errno i = error ("Error number: " ++ show i)*} haftmann@28561: haftmann@28564: code_reserved %quotett Haskell Errno haftmann@28561: haftmann@28561: text {* haftmann@28561: \noindent Such named @{text include}s are then prepended to every generated code. haftmann@28561: Inspect such code in order to find out how @{command "code_include"} behaves haftmann@28561: with respect to a particular target language. haftmann@28561: *} haftmann@28561: haftmann@28419: end