doc-src/Codegen/Thy/Adaptation.thy
author Walther Neuper <neuper@ist.tugraz.at>
Thu, 12 Aug 2010 15:03:34 +0200
branchisac-from-Isabelle2009-2
changeset 37913 20e3616b2d9c
parent 34155 14aaccb399b3
child 37836 2bcce92be291
permissions -rw-r--r--
prepare reactivation of isac-update-Isa09-2
     1 theory Adaptation
     2 imports Setup
     3 begin
     4 
     5 setup %invisible {* Code_Target.extend_target ("\<SML>", ("SML", K I)) *}
     6 
     7 section {* Adaptation to target languages \label{sec:adaptation} *}
     8 
     9 subsection {* Adapting code generation *}
    10 
    11 text {*
    12   The aspects of code generation introduced so far have two aspects
    13   in common:
    14 
    15   \begin{itemize}
    16     \item They act uniformly, without reference to a specific
    17        target language.
    18     \item They are \emph{safe} in the sense that as long as you trust
    19        the code generator meta theory and implementation, you cannot
    20        produce programs that yield results which are not derivable
    21        in the logic.
    22   \end{itemize}
    23 
    24   \noindent In this section we will introduce means to \emph{adapt} the serialiser
    25   to a specific target language, i.e.~to print program fragments
    26   in a way which accommodates \qt{already existing} ingredients of
    27   a target language environment, for three reasons:
    28 
    29   \begin{itemize}
    30     \item improving readability and aesthetics of generated code
    31     \item gaining efficiency
    32     \item interface with language parts which have no direct counterpart
    33       in @{text "HOL"} (say, imperative data structures)
    34   \end{itemize}
    35 
    36   \noindent Generally, you should avoid using those features yourself
    37   \emph{at any cost}:
    38 
    39   \begin{itemize}
    40     \item The safe configuration methods act uniformly on every target language,
    41       whereas for adaptation you have to treat each target language separately.
    42     \item Application is extremely tedious since there is no abstraction
    43       which would allow for a static check, making it easy to produce garbage.
    44     \item Subtle errors can be introduced unconsciously.
    45   \end{itemize}
    46 
    47   \noindent However, even if you ought refrain from setting up adaptation
    48   yourself, already the @{text "HOL"} comes with some reasonable default
    49   adaptations (say, using target language list syntax).  There also some
    50   common adaptation cases which you can setup by importing particular
    51   library theories.  In order to understand these, we provide some clues here;
    52   these however are not supposed to replace a careful study of the sources.
    53 *}
    54 
    55 subsection {* The adaptation principle *}
    56 
    57 text {*
    58   Figure \ref{fig:adaptation} illustrates what \qt{adaptation} is conceptually
    59   supposed to be:
    60 
    61   \begin{figure}[here]
    62     \includegraphics{adaptation}
    63     \caption{The adaptation principle}
    64     \label{fig:adaptation}
    65   \end{figure}
    66 
    67   \noindent In the tame view, code generation acts as broker between
    68   @{text logic}, @{text "intermediate language"} and
    69   @{text "target language"} by means of @{text translation} and
    70   @{text serialisation};  for the latter, the serialiser has to observe
    71   the structure of the @{text language} itself plus some @{text reserved}
    72   keywords which have to be avoided for generated code.
    73   However, if you consider @{text adaptation} mechanisms, the code generated
    74   by the serializer is just the tip of the iceberg:
    75 
    76   \begin{itemize}
    77     \item @{text serialisation} can be \emph{parametrised} such that
    78       logical entities are mapped to target-specific ones
    79       (e.g. target-specific list syntax,
    80         see also \secref{sec:adaptation_mechanisms})
    81     \item Such parametrisations can involve references to a
    82       target-specific standard @{text library} (e.g. using
    83       the @{text Haskell} @{verbatim Maybe} type instead
    84       of the @{text HOL} @{type "option"} type);
    85       if such are used, the corresponding identifiers
    86       (in our example, @{verbatim Maybe}, @{verbatim Nothing}
    87       and @{verbatim Just}) also have to be considered @{text reserved}.
    88     \item Even more, the user can enrich the library of the
    89       target-language by providing code snippets
    90       (\qt{@{text "includes"}}) which are prepended to
    91       any generated code (see \secref{sec:include});  this typically
    92       also involves further @{text reserved} identifiers.
    93   \end{itemize}
    94 
    95   \noindent As figure \ref{fig:adaptation} illustrates, all these adaptation mechanisms
    96   have to act consistently;  it is at the discretion of the user
    97   to take care for this.
    98 *}
    99 
   100 subsection {* Common adaptation patterns *}
   101 
   102 text {*
   103   The @{theory HOL} @{theory Main} theory already provides a code
   104   generator setup
   105   which should be suitable for most applications.  Common extensions
   106   and modifications are available by certain theories of the @{text HOL}
   107   library; beside being useful in applications, they may serve
   108   as a tutorial for customising the code generator setup (see below
   109   \secref{sec:adaptation_mechanisms}).
   110 
   111   \begin{description}
   112 
   113     \item[@{theory "Code_Integer"}] represents @{text HOL} integers by big
   114        integer literals in target languages.
   115     \item[@{theory "Code_Char"}] represents @{text HOL} characters by 
   116        character literals in target languages.
   117     \item[@{theory "Code_Char_chr"}] like @{text "Code_Char"},
   118        but also offers treatment of character codes; includes
   119        @{theory "Code_Char"}.
   120     \item[@{theory "Efficient_Nat"}] \label{eff_nat} implements natural numbers by integers,
   121        which in general will result in higher efficiency; pattern
   122        matching with @{term "0\<Colon>nat"} / @{const "Suc"}
   123        is eliminated;  includes @{theory "Code_Integer"}
   124        and @{theory "Code_Numeral"}.
   125     \item[@{theory "Code_Numeral"}] provides an additional datatype
   126        @{typ index} which is mapped to target-language built-in integers.
   127        Useful for code setups which involve e.g. indexing of
   128        target-language arrays.
   129     \item[@{theory "String"}] provides an additional datatype
   130        @{typ String.literal} which is isomorphic to strings;
   131        @{typ String.literal}s are mapped to target-language strings.
   132        Useful for code setups which involve e.g. printing (error) messages.
   133 
   134   \end{description}
   135 
   136   \begin{warn}
   137     When importing any of these theories, they should form the last
   138     items in an import list.  Since these theories adapt the
   139     code generator setup in a non-conservative fashion,
   140     strange effects may occur otherwise.
   141   \end{warn}
   142 *}
   143 
   144 
   145 subsection {* Parametrising serialisation \label{sec:adaptation_mechanisms} *}
   146 
   147 text {*
   148   Consider the following function and its corresponding
   149   SML code:
   150 *}
   151 
   152 primrec %quote in_interval :: "nat \<times> nat \<Rightarrow> nat \<Rightarrow> bool" where
   153   "in_interval (k, l) n \<longleftrightarrow> k \<le> n \<and> n \<le> l"
   154 (*<*)
   155 code_type %invisible bool
   156   (SML)
   157 code_const %invisible True and False and "op \<and>" and Not
   158   (SML and and and)
   159 (*>*)
   160 text %quote {*@{code_stmts in_interval (SML)}*}
   161 
   162 text {*
   163   \noindent Though this is correct code, it is a little bit unsatisfactory:
   164   boolean values and operators are materialised as distinguished
   165   entities with have nothing to do with the SML-built-in notion
   166   of \qt{bool}.  This results in less readable code;
   167   additionally, eager evaluation may cause programs to
   168   loop or break which would perfectly terminate when
   169   the existing SML @{verbatim "bool"} would be used.  To map
   170   the HOL @{typ bool} on SML @{verbatim "bool"}, we may use
   171   \qn{custom serialisations}:
   172 *}
   173 
   174 code_type %quotett bool
   175   (SML "bool")
   176 code_const %quotett True and False and "op \<and>"
   177   (SML "true" and "false" and "_ andalso _")
   178 
   179 text {*
   180   \noindent The @{command code_type} command takes a type constructor
   181   as arguments together with a list of custom serialisations.
   182   Each custom serialisation starts with a target language
   183   identifier followed by an expression, which during
   184   code serialisation is inserted whenever the type constructor
   185   would occur.  For constants, @{command code_const} implements
   186   the corresponding mechanism.  Each ``@{verbatim "_"}'' in
   187   a serialisation expression is treated as a placeholder
   188   for the type constructor's (the constant's) arguments.
   189 *}
   190 
   191 text %quote {*@{code_stmts in_interval (SML)}*}
   192 
   193 text {*
   194   \noindent This still is not perfect: the parentheses
   195   around the \qt{andalso} expression are superfluous.
   196   Though the serialiser
   197   by no means attempts to imitate the rich Isabelle syntax
   198   framework, it provides some common idioms, notably
   199   associative infixes with precedences which may be used here:
   200 *}
   201 
   202 code_const %quotett "op \<and>"
   203   (SML infixl 1 "andalso")
   204 
   205 text %quote {*@{code_stmts in_interval (SML)}*}
   206 
   207 text {*
   208   \noindent The attentive reader may ask how we assert that no generated
   209   code will accidentally overwrite.  For this reason the serialiser has
   210   an internal table of identifiers which have to be avoided to be used
   211   for new declarations.  Initially, this table typically contains the
   212   keywords of the target language.  It can be extended manually, thus avoiding
   213   accidental overwrites, using the @{command "code_reserved"} command:
   214 *}
   215 
   216 code_reserved %quote "\<SML>" bool true false andalso
   217 
   218 text {*
   219   \noindent Next, we try to map HOL pairs to SML pairs, using the
   220   infix ``@{verbatim "*"}'' type constructor and parentheses:
   221 *}
   222 (*<*)
   223 code_type %invisible *
   224   (SML)
   225 code_const %invisible Pair
   226   (SML)
   227 (*>*)
   228 code_type %quotett *
   229   (SML infix 2 "*")
   230 code_const %quotett Pair
   231   (SML "!((_),/ (_))")
   232 
   233 text {*
   234   \noindent The initial bang ``@{verbatim "!"}'' tells the serialiser
   235   never to put
   236   parentheses around the whole expression (they are already present),
   237   while the parentheses around argument place holders
   238   tell not to put parentheses around the arguments.
   239   The slash ``@{verbatim "/"}'' (followed by arbitrary white space)
   240   inserts a space which may be used as a break if necessary
   241   during pretty printing.
   242 
   243   These examples give a glimpse what mechanisms
   244   custom serialisations provide; however their usage
   245   requires careful thinking in order not to introduce
   246   inconsistencies -- or, in other words:
   247   custom serialisations are completely axiomatic.
   248 
   249   A further noteworthy details is that any special
   250   character in a custom serialisation may be quoted
   251   using ``@{verbatim "'"}''; thus, in
   252   ``@{verbatim "fn '_ => _"}'' the first
   253   ``@{verbatim "_"}'' is a proper underscore while the
   254   second ``@{verbatim "_"}'' is a placeholder.
   255 *}
   256 
   257 
   258 subsection {* @{text Haskell} serialisation *}
   259 
   260 text {*
   261   For convenience, the default
   262   @{text HOL} setup for @{text Haskell} maps the @{class eq} class to
   263   its counterpart in @{text Haskell}, giving custom serialisations
   264   for the class @{class eq} (by command @{command code_class}) and its operation
   265   @{const HOL.eq}
   266 *}
   267 
   268 code_class %quotett eq
   269   (Haskell "Eq")
   270 
   271 code_const %quotett "op ="
   272   (Haskell infixl 4 "==")
   273 
   274 text {*
   275   \noindent A problem now occurs whenever a type which
   276   is an instance of @{class eq} in @{text HOL} is mapped
   277   on a @{text Haskell}-built-in type which is also an instance
   278   of @{text Haskell} @{text Eq}:
   279 *}
   280 
   281 typedecl %quote bar
   282 
   283 instantiation %quote bar :: eq
   284 begin
   285 
   286 definition %quote "eq_class.eq (x\<Colon>bar) y \<longleftrightarrow> x = y"
   287 
   288 instance %quote by default (simp add: eq_bar_def)
   289 
   290 end %quote (*<*)
   291 
   292 (*>*) code_type %quotett bar
   293   (Haskell "Integer")
   294 
   295 text {*
   296   \noindent The code generator would produce
   297   an additional instance, which of course is rejected by the @{text Haskell}
   298   compiler.
   299   To suppress this additional instance, use
   300   @{text "code_instance"}:
   301 *}
   302 
   303 code_instance %quotett bar :: eq
   304   (Haskell -)
   305 
   306 
   307 subsection {* Enhancing the target language context \label{sec:include} *}
   308 
   309 text {*
   310   In rare cases it is necessary to \emph{enrich} the context of a
   311   target language;  this is accomplished using the @{command "code_include"}
   312   command:
   313 *}
   314 
   315 code_include %quotett Haskell "Errno"
   316 {*errno i = error ("Error number: " ++ show i)*}
   317 
   318 code_reserved %quotett Haskell Errno
   319 
   320 text {*
   321   \noindent Such named @{text include}s are then prepended to every generated code.
   322   Inspect such code in order to find out how @{command "code_include"} behaves
   323   with respect to a particular target language.
   324 *}
   325 
   326 end