haftmann@31050: theory Adaptation
haftmann@28213: imports Setup
haftmann@28213: begin
haftmann@28213: 
haftmann@28679: setup %invisible {* Code_Target.extend_target ("\<SML>", ("SML", K I)) *}
haftmann@28561: 
haftmann@31050: section {* Adaptation to target languages \label{sec:adaptation} *}
haftmann@28213: 
haftmann@28561: subsection {* Adapting code generation *}
haftmann@28561: 
haftmann@28561: text {*
haftmann@28561:   The aspects of code generation introduced so far have two aspects
haftmann@28561:   in common:
haftmann@28561: 
haftmann@28561:   \begin{itemize}
haftmann@28561:     \item They act uniformly, without reference to a specific
haftmann@28561:        target language.
haftmann@28561:     \item They are \emph{safe} in the sense that as long as you trust
haftmann@28561:        the code generator meta theory and implementation, you cannot
haftmann@28561:        produce programs that yield results which are not derivable
haftmann@28561:        in the logic.
haftmann@28561:   \end{itemize}
haftmann@28561: 
haftmann@28561:   \noindent In this section we will introduce means to \emph{adapt} the serialiser
haftmann@28561:   to a specific target language, i.e.~to print program fragments
haftmann@28593:   in a way which accommodates \qt{already existing} ingredients of
haftmann@28561:   a target language environment, for three reasons:
haftmann@28561: 
haftmann@28561:   \begin{itemize}
haftmann@28593:     \item improving readability and aesthetics of generated code
haftmann@28561:     \item gaining efficiency
haftmann@28561:     \item interface with language parts which have no direct counterpart
haftmann@28561:       in @{text "HOL"} (say, imperative data structures)
haftmann@28561:   \end{itemize}
haftmann@28561: 
haftmann@28561:   \noindent Generally, you should avoid using those features yourself
haftmann@28561:   \emph{at any cost}:
haftmann@28561: 
haftmann@28561:   \begin{itemize}
haftmann@28561:     \item The safe configuration methods act uniformly on every target language,
paulson@34155:       whereas for adaptation you have to treat each target language separately.
haftmann@28561:     \item Application is extremely tedious since there is no abstraction
haftmann@28593:       which would allow for a static check, making it easy to produce garbage.
paulson@34155:     \item Subtle errors can be introduced unconsciously.
haftmann@28561:   \end{itemize}
haftmann@28561: 
haftmann@31050:   \noindent However, even if you ought refrain from setting up adaptation
haftmann@28561:   yourself, already the @{text "HOL"} comes with some reasonable default
haftmann@31050:   adaptations (say, using target language list syntax).  There also some
haftmann@31050:   common adaptation cases which you can setup by importing particular
haftmann@28561:   library theories.  In order to understand these, we provide some clues here;
haftmann@28561:   these however are not supposed to replace a careful study of the sources.
haftmann@28561: *}
haftmann@28561: 
haftmann@31050: subsection {* The adaptation principle *}
haftmann@28561: 
haftmann@28561: text {*
haftmann@31050:   Figure \ref{fig:adaptation} illustrates what \qt{adaptation} is conceptually
haftmann@28601:   supposed to be:
haftmann@28601: 
haftmann@28601:   \begin{figure}[here]
haftmann@31050:     \includegraphics{adaptation}
haftmann@31050:     \caption{The adaptation principle}
haftmann@31050:     \label{fig:adaptation}
haftmann@28601:   \end{figure}
haftmann@28601: 
haftmann@28601:   \noindent In the tame view, code generation acts as broker between
haftmann@28601:   @{text logic}, @{text "intermediate language"} and
haftmann@28601:   @{text "target language"} by means of @{text translation} and
haftmann@28601:   @{text serialisation};  for the latter, the serialiser has to observe
haftmann@28601:   the structure of the @{text language} itself plus some @{text reserved}
haftmann@28601:   keywords which have to be avoided for generated code.
haftmann@31050:   However, if you consider @{text adaptation} mechanisms, the code generated
haftmann@28601:   by the serializer is just the tip of the iceberg:
haftmann@28601: 
haftmann@28601:   \begin{itemize}
haftmann@28635:     \item @{text serialisation} can be \emph{parametrised} such that
haftmann@28635:       logical entities are mapped to target-specific ones
haftmann@28635:       (e.g. target-specific list syntax,
haftmann@31050:         see also \secref{sec:adaptation_mechanisms})
haftmann@28635:     \item Such parametrisations can involve references to a
haftmann@28635:       target-specific standard @{text library} (e.g. using
haftmann@28635:       the @{text Haskell} @{verbatim Maybe} type instead
haftmann@28635:       of the @{text HOL} @{type "option"} type);
haftmann@28635:       if such are used, the corresponding identifiers
haftmann@28635:       (in our example, @{verbatim Maybe}, @{verbatim Nothing}
haftmann@28635:       and @{verbatim Just}) also have to be considered @{text reserved}.
haftmann@28635:     \item Even more, the user can enrich the library of the
haftmann@28635:       target-language by providing code snippets
haftmann@28635:       (\qt{@{text "includes"}}) which are prepended to
haftmann@28635:       any generated code (see \secref{sec:include});  this typically
haftmann@28635:       also involves further @{text reserved} identifiers.
haftmann@28601:   \end{itemize}
haftmann@28635: 
haftmann@31050:   \noindent As figure \ref{fig:adaptation} illustrates, all these adaptation mechanisms
haftmann@28635:   have to act consistently;  it is at the discretion of the user
haftmann@28635:   to take care for this.
haftmann@28561: *}
haftmann@28561: 
haftmann@31050: subsection {* Common adaptation patterns *}
haftmann@28419: 
haftmann@28419: text {*
haftmann@28428:   The @{theory HOL} @{theory Main} theory already provides a code
haftmann@28419:   generator setup
haftmann@28593:   which should be suitable for most applications.  Common extensions
haftmann@28419:   and modifications are available by certain theories of the @{text HOL}
haftmann@28419:   library; beside being useful in applications, they may serve
haftmann@28419:   as a tutorial for customising the code generator setup (see below
haftmann@31050:   \secref{sec:adaptation_mechanisms}).
haftmann@28419: 
haftmann@28419:   \begin{description}
haftmann@28419: 
haftmann@28419:     \item[@{theory "Code_Integer"}] represents @{text HOL} integers by big
haftmann@28419:        integer literals in target languages.
haftmann@28419:     \item[@{theory "Code_Char"}] represents @{text HOL} characters by 
haftmann@28419:        character literals in target languages.
haftmann@28419:     \item[@{theory "Code_Char_chr"}] like @{text "Code_Char"},
haftmann@28419:        but also offers treatment of character codes; includes
haftmann@28561:        @{theory "Code_Char"}.
haftmann@28419:     \item[@{theory "Efficient_Nat"}] \label{eff_nat} implements natural numbers by integers,
haftmann@28419:        which in general will result in higher efficiency; pattern
haftmann@28419:        matching with @{term "0\<Colon>nat"} / @{const "Suc"}
haftmann@28561:        is eliminated;  includes @{theory "Code_Integer"}
haftmann@31206:        and @{theory "Code_Numeral"}.
haftmann@31206:     \item[@{theory "Code_Numeral"}] provides an additional datatype
haftmann@28419:        @{typ index} which is mapped to target-language built-in integers.
haftmann@28419:        Useful for code setups which involve e.g. indexing of
haftmann@28419:        target-language arrays.
haftmann@31050:     \item[@{theory "String"}] provides an additional datatype
haftmann@31205:        @{typ String.literal} which is isomorphic to strings;
haftmann@31205:        @{typ String.literal}s are mapped to target-language strings.
haftmann@28419:        Useful for code setups which involve e.g. printing (error) messages.
haftmann@28419: 
haftmann@28419:   \end{description}
haftmann@28419: 
haftmann@28419:   \begin{warn}
haftmann@28419:     When importing any of these theories, they should form the last
haftmann@28419:     items in an import list.  Since these theories adapt the
haftmann@28419:     code generator setup in a non-conservative fashion,
haftmann@28419:     strange effects may occur otherwise.
haftmann@28419:   \end{warn}
haftmann@28419: *}
haftmann@28419: 
haftmann@28419: 
haftmann@31050: subsection {* Parametrising serialisation \label{sec:adaptation_mechanisms} *}
haftmann@28419: 
haftmann@28419: text {*
haftmann@28561:   Consider the following function and its corresponding
haftmann@28419:   SML code:
haftmann@28419: *}
haftmann@28419: 
haftmann@28564: primrec %quote in_interval :: "nat \<times> nat \<Rightarrow> nat \<Rightarrow> bool" where
haftmann@28419:   "in_interval (k, l) n \<longleftrightarrow> k \<le> n \<and> n \<le> l"
haftmann@28447: (*<*)
haftmann@28419: code_type %invisible bool
haftmann@28419:   (SML)
haftmann@28419: code_const %invisible True and False and "op \<and>" and Not
haftmann@28419:   (SML and and and)
haftmann@28447: (*>*)
haftmann@28564: text %quote {*@{code_stmts in_interval (SML)}*}
haftmann@28419: 
haftmann@28419: text {*
haftmann@28419:   \noindent Though this is correct code, it is a little bit unsatisfactory:
haftmann@28419:   boolean values and operators are materialised as distinguished
haftmann@28419:   entities with have nothing to do with the SML-built-in notion
haftmann@28419:   of \qt{bool}.  This results in less readable code;
haftmann@28419:   additionally, eager evaluation may cause programs to
haftmann@28419:   loop or break which would perfectly terminate when
haftmann@28419:   the existing SML @{verbatim "bool"} would be used.  To map
haftmann@28419:   the HOL @{typ bool} on SML @{verbatim "bool"}, we may use
haftmann@28419:   \qn{custom serialisations}:
haftmann@28419: *}
haftmann@28419: 
haftmann@28564: code_type %quotett bool
haftmann@28419:   (SML "bool")
haftmann@28564: code_const %quotett True and False and "op \<and>"
haftmann@28419:   (SML "true" and "false" and "_ andalso _")
haftmann@28419: 
haftmann@28419: text {*
haftmann@28447:   \noindent The @{command code_type} command takes a type constructor
haftmann@28419:   as arguments together with a list of custom serialisations.
haftmann@28419:   Each custom serialisation starts with a target language
haftmann@28419:   identifier followed by an expression, which during
haftmann@28419:   code serialisation is inserted whenever the type constructor
haftmann@28419:   would occur.  For constants, @{command code_const} implements
haftmann@28419:   the corresponding mechanism.  Each ``@{verbatim "_"}'' in
haftmann@28419:   a serialisation expression is treated as a placeholder
haftmann@28419:   for the type constructor's (the constant's) arguments.
haftmann@28419: *}
haftmann@28419: 
haftmann@28564: text %quote {*@{code_stmts in_interval (SML)}*}
haftmann@28419: 
haftmann@28419: text {*
haftmann@28419:   \noindent This still is not perfect: the parentheses
haftmann@28419:   around the \qt{andalso} expression are superfluous.
haftmann@28593:   Though the serialiser
haftmann@28419:   by no means attempts to imitate the rich Isabelle syntax
haftmann@28419:   framework, it provides some common idioms, notably
haftmann@28419:   associative infixes with precedences which may be used here:
haftmann@28419: *}
haftmann@28419: 
haftmann@28564: code_const %quotett "op \<and>"
haftmann@28419:   (SML infixl 1 "andalso")
haftmann@28419: 
haftmann@28564: text %quote {*@{code_stmts in_interval (SML)}*}
haftmann@28419: 
haftmann@28419: text {*
haftmann@28561:   \noindent The attentive reader may ask how we assert that no generated
haftmann@28561:   code will accidentally overwrite.  For this reason the serialiser has
haftmann@28561:   an internal table of identifiers which have to be avoided to be used
haftmann@28561:   for new declarations.  Initially, this table typically contains the
haftmann@28561:   keywords of the target language.  It can be extended manually, thus avoiding
haftmann@28561:   accidental overwrites, using the @{command "code_reserved"} command:
haftmann@28561: *}
haftmann@28561: 
haftmann@28601: code_reserved %quote "\<SML>" bool true false andalso
haftmann@28561: 
haftmann@28561: text {*
haftmann@28447:   \noindent Next, we try to map HOL pairs to SML pairs, using the
haftmann@28419:   infix ``@{verbatim "*"}'' type constructor and parentheses:
haftmann@28419: *}
haftmann@28447: (*<*)
haftmann@28419: code_type %invisible *
haftmann@28419:   (SML)
haftmann@28419: code_const %invisible Pair
haftmann@28419:   (SML)
haftmann@28447: (*>*)
haftmann@28564: code_type %quotett *
haftmann@28419:   (SML infix 2 "*")
haftmann@28564: code_const %quotett Pair
haftmann@28419:   (SML "!((_),/ (_))")
haftmann@28419: 
haftmann@28419: text {*
haftmann@28593:   \noindent The initial bang ``@{verbatim "!"}'' tells the serialiser
haftmann@28561:   never to put
haftmann@28419:   parentheses around the whole expression (they are already present),
haftmann@28419:   while the parentheses around argument place holders
haftmann@28419:   tell not to put parentheses around the arguments.
haftmann@28419:   The slash ``@{verbatim "/"}'' (followed by arbitrary white space)
haftmann@28419:   inserts a space which may be used as a break if necessary
haftmann@28419:   during pretty printing.
haftmann@28419: 
haftmann@28419:   These examples give a glimpse what mechanisms
haftmann@28419:   custom serialisations provide; however their usage
haftmann@28419:   requires careful thinking in order not to introduce
haftmann@28419:   inconsistencies -- or, in other words:
haftmann@28419:   custom serialisations are completely axiomatic.
haftmann@28419: 
haftmann@28419:   A further noteworthy details is that any special
haftmann@28419:   character in a custom serialisation may be quoted
haftmann@28419:   using ``@{verbatim "'"}''; thus, in
haftmann@28419:   ``@{verbatim "fn '_ => _"}'' the first
haftmann@28419:   ``@{verbatim "_"}'' is a proper underscore while the
haftmann@28419:   second ``@{verbatim "_"}'' is a placeholder.
haftmann@28419: *}
haftmann@28419: 
haftmann@28419: 
haftmann@28419: subsection {* @{text Haskell} serialisation *}
haftmann@28419: 
haftmann@28419: text {*
haftmann@28419:   For convenience, the default
haftmann@28419:   @{text HOL} setup for @{text Haskell} maps the @{class eq} class to
haftmann@28419:   its counterpart in @{text Haskell}, giving custom serialisations
haftmann@28419:   for the class @{class eq} (by command @{command code_class}) and its operation
haftmann@28419:   @{const HOL.eq}
haftmann@28419: *}
haftmann@28419: 
haftmann@28564: code_class %quotett eq
haftmann@28714:   (Haskell "Eq")
haftmann@28419: 
haftmann@28564: code_const %quotett "op ="
haftmann@28419:   (Haskell infixl 4 "==")
haftmann@28419: 
haftmann@28419: text {*
haftmann@28447:   \noindent A problem now occurs whenever a type which
haftmann@28419:   is an instance of @{class eq} in @{text HOL} is mapped
haftmann@28419:   on a @{text Haskell}-built-in type which is also an instance
haftmann@28419:   of @{text Haskell} @{text Eq}:
haftmann@28419: *}
haftmann@28419: 
haftmann@28564: typedecl %quote bar
haftmann@28419: 
haftmann@28564: instantiation %quote bar :: eq
haftmann@28419: begin
haftmann@28419: 
haftmann@28564: definition %quote "eq_class.eq (x\<Colon>bar) y \<longleftrightarrow> x = y"
haftmann@28419: 
haftmann@28564: instance %quote by default (simp add: eq_bar_def)
haftmann@28213: 
haftmann@30880: end %quote (*<*)
haftmann@30880: 
haftmann@30880: (*>*) code_type %quotett bar
haftmann@28419:   (Haskell "Integer")
haftmann@28419: 
haftmann@28419: text {*
haftmann@28447:   \noindent The code generator would produce
haftmann@28593:   an additional instance, which of course is rejected by the @{text Haskell}
haftmann@28419:   compiler.
haftmann@28419:   To suppress this additional instance, use
haftmann@28419:   @{text "code_instance"}:
haftmann@28419: *}
haftmann@28419: 
haftmann@28564: code_instance %quotett bar :: eq
haftmann@28419:   (Haskell -)
haftmann@28419: 
haftmann@28561: 
haftmann@28635: subsection {* Enhancing the target language context \label{sec:include} *}
haftmann@28561: 
haftmann@28561: text {*
haftmann@28593:   In rare cases it is necessary to \emph{enrich} the context of a
haftmann@28561:   target language;  this is accomplished using the @{command "code_include"}
haftmann@28561:   command:
haftmann@28561: *}
haftmann@28561: 
haftmann@28564: code_include %quotett Haskell "Errno"
haftmann@28561: {*errno i = error ("Error number: " ++ show i)*}
haftmann@28561: 
haftmann@28564: code_reserved %quotett Haskell Errno
haftmann@28561: 
haftmann@28561: text {*
haftmann@28561:   \noindent Such named @{text include}s are then prepended to every generated code.
haftmann@28561:   Inspect such code in order to find out how @{command "code_include"} behaves
haftmann@28561:   with respect to a particular target language.
haftmann@28561: *}
haftmann@28561: 
haftmann@28419: end