doc-src/TutorialI/CodeGen/CodeGen.thy
author haftmann
Fri, 17 Jun 2005 16:12:49 +0200
changeset 16417 9bc16273c2d4
parent 11458 09a6c44a48ea
child 17212 6859484b5b2b
permissions -rw-r--r--
migrated theory headers to new format
nipkow@8744
     1
(*<*)
haftmann@16417
     2
theory CodeGen imports Main begin
nipkow@8744
     3
(*>*)
nipkow@8744
     4
paulson@10885
     5
section{*Case Study: Compiling Expressions*}
nipkow@9844
     6
nipkow@9844
     7
text{*\label{sec:ExprCompiler}
paulson@11458
     8
\index{compiling expressions example|(}%
nipkow@8744
     9
The task is to develop a compiler from a generic type of expressions (built
paulson@10795
    10
from variables, constants and binary operations) to a stack machine.  This
nipkow@8744
    11
generic type of expressions is a generalization of the boolean expressions in
nipkow@8744
    12
\S\ref{sec:boolex}.  This time we do not commit ourselves to a particular
nipkow@8744
    13
type of variables or values but make them type parameters.  Neither is there
nipkow@8744
    14
a fixed set of binary operations: instead the expression contains the
nipkow@8744
    15
appropriate function itself.
nipkow@8744
    16
*}
nipkow@8744
    17
nipkow@10171
    18
types 'v binop = "'v \<Rightarrow> 'v \<Rightarrow> 'v";
nipkow@8744
    19
datatype ('a,'v)expr = Cex 'v
nipkow@8744
    20
                     | Vex 'a
nipkow@8744
    21
                     | Bex "'v binop"  "('a,'v)expr"  "('a,'v)expr";
nipkow@8744
    22
nipkow@8744
    23
text{*\noindent
nipkow@8771
    24
The three constructors represent constants, variables and the application of
nipkow@8771
    25
a binary operation to two subexpressions.
nipkow@8744
    26
paulson@10795
    27
The value of an expression with respect to an environment that maps variables to
nipkow@8744
    28
values is easily defined:
nipkow@8744
    29
*}
nipkow@8744
    30
nipkow@10171
    31
consts value :: "('a,'v)expr \<Rightarrow> ('a \<Rightarrow> 'v) \<Rightarrow> 'v";
nipkow@8744
    32
primrec
nipkow@8771
    33
"value (Cex v) env = v"
nipkow@8771
    34
"value (Vex a) env = env a"
nipkow@8771
    35
"value (Bex f e1 e2) env = f (value e1 env) (value e2 env)";
nipkow@8744
    36
nipkow@8744
    37
text{*
nipkow@8744
    38
The stack machine has three instructions: load a constant value onto the
paulson@10795
    39
stack, load the contents of an address onto the stack, and apply a
nipkow@8744
    40
binary operation to the two topmost elements of the stack, replacing them by
nipkow@9792
    41
the result. As for @{text"expr"}, addresses and values are type parameters:
nipkow@8744
    42
*}
nipkow@8744
    43
nipkow@8744
    44
datatype ('a,'v) instr = Const 'v
nipkow@8744
    45
                       | Load 'a
nipkow@8744
    46
                       | Apply "'v binop";
nipkow@8744
    47
nipkow@8744
    48
text{*
nipkow@8771
    49
The execution of the stack machine is modelled by a function
nipkow@9792
    50
@{text"exec"} that takes a list of instructions, a store (modelled as a
nipkow@8771
    51
function from addresses to values, just like the environment for
nipkow@8771
    52
evaluating expressions), and a stack (modelled as a list) of values,
nipkow@10971
    53
and returns the stack at the end of the execution --- the store remains
nipkow@8771
    54
unchanged:
nipkow@8744
    55
*}
nipkow@8744
    56
nipkow@10171
    57
consts exec :: "('a,'v)instr list \<Rightarrow> ('a\<Rightarrow>'v) \<Rightarrow> 'v list \<Rightarrow> 'v list";
nipkow@8744
    58
primrec
nipkow@8771
    59
"exec [] s vs = vs"
nipkow@8771
    60
"exec (i#is) s vs = (case i of
nipkow@10171
    61
    Const v  \<Rightarrow> exec is s (v#vs)
nipkow@10171
    62
  | Load a   \<Rightarrow> exec is s ((s a)#vs)
nipkow@10171
    63
  | Apply f  \<Rightarrow> exec is s ((f (hd vs) (hd(tl vs)))#(tl(tl vs))))";
nipkow@8744
    64
nipkow@8744
    65
text{*\noindent
nipkow@9792
    66
Recall that @{term"hd"} and @{term"tl"}
nipkow@8744
    67
return the first element and the remainder of a list.
paulson@11458
    68
Because all functions are total, \cdx{hd} is defined even for the empty
nipkow@8744
    69
list, although we do not know what the result is. Thus our model of the
paulson@10795
    70
machine always terminates properly, although the definition above does not
nipkow@9792
    71
tell us much about the result in situations where @{term"Apply"} was executed
nipkow@8744
    72
with fewer than two elements on the stack.
nipkow@8744
    73
nipkow@8744
    74
The compiler is a function from expressions to a list of instructions. Its
paulson@10795
    75
definition is obvious:
nipkow@8744
    76
*}
nipkow@8744
    77
nipkow@10171
    78
consts comp :: "('a,'v)expr \<Rightarrow> ('a,'v)instr list";
nipkow@8744
    79
primrec
nipkow@8744
    80
"comp (Cex v)       = [Const v]"
nipkow@8744
    81
"comp (Vex a)       = [Load a]"
nipkow@8744
    82
"comp (Bex f e1 e2) = (comp e2) @ (comp e1) @ [Apply f]";
nipkow@8744
    83
nipkow@8744
    84
text{*
nipkow@8744
    85
Now we have to prove the correctness of the compiler, i.e.\ that the
nipkow@8744
    86
execution of a compiled expression results in the value of the expression:
nipkow@8744
    87
*}
nipkow@8771
    88
theorem "exec (comp e) s [] = [value e s]";
nipkow@8744
    89
(*<*)oops;(*>*)
nipkow@8744
    90
text{*\noindent
paulson@11458
    91
This theorem needs to be generalized:
nipkow@8744
    92
*}
nipkow@8744
    93
nipkow@10171
    94
theorem "\<forall>vs. exec (comp e) s vs = (value e s) # vs";
nipkow@8744
    95
nipkow@8744
    96
txt{*\noindent
paulson@11458
    97
It will be proved by induction on @{term"e"} followed by simplification.  
paulson@11458
    98
First, we must prove a lemma about executing the concatenation of two
nipkow@8744
    99
instruction sequences:
nipkow@8744
   100
*}
nipkow@8744
   101
(*<*)oops;(*>*)
nipkow@8744
   102
lemma exec_app[simp]:
nipkow@10171
   103
  "\<forall>vs. exec (xs@ys) s vs = exec ys s (exec xs s vs)"; 
nipkow@8744
   104
nipkow@8744
   105
txt{*\noindent
nipkow@9792
   106
This requires induction on @{term"xs"} and ordinary simplification for the
nipkow@8744
   107
base cases. In the induction step, simplification leaves us with a formula
nipkow@9792
   108
that contains two @{text"case"}-expressions over instructions. Thus we add
paulson@11458
   109
automatic case splitting, which finishes the proof:
nipkow@8744
   110
*}
nipkow@10171
   111
apply(induct_tac xs, simp, simp split: instr.split);
nipkow@10171
   112
(*<*)done(*>*)
nipkow@8744
   113
text{*\noindent
paulson@11428
   114
Note that because both \methdx{simp_all} and \methdx{auto} perform simplification, they can
paulson@11428
   115
be modified in the same way as @{text simp}.  Thus the proof can be
nipkow@8744
   116
rewritten as
nipkow@8744
   117
*}
nipkow@8744
   118
(*<*)
nipkow@9933
   119
declare exec_app[simp del];
nipkow@10171
   120
lemma [simp]: "\<forall>vs. exec (xs@ys) s vs = exec ys s (exec xs s vs)"; 
nipkow@8744
   121
(*>*)
nipkow@10971
   122
apply(induct_tac xs, simp_all split: instr.split);
nipkow@10171
   123
(*<*)done(*>*)
nipkow@8744
   124
text{*\noindent
nipkow@8744
   125
Although this is more compact, it is less clear for the reader of the proof.
nipkow@8744
   126
nipkow@8771
   127
We could now go back and prove \isa{exec (comp e) s [] = [value e s]}
nipkow@8744
   128
merely by simplification with the generalized version we just proved.
nipkow@8744
   129
However, this is unnecessary because the generalized version fully subsumes
paulson@11458
   130
its instance.%
paulson@11458
   131
\index{compiling expressions example|)}
nipkow@8744
   132
*}
nipkow@8744
   133
(*<*)
nipkow@10171
   134
theorem "\<forall>vs. exec (comp e) s vs = (value e s) # vs";
nipkow@9458
   135
by(induct_tac e, auto);
nipkow@8744
   136
end
nipkow@8744
   137
(*>*)