added datatype section
authornipkow
Tue, 12 Jul 1994 09:28:00 +0200
changeset 464552717636da4
parent 463 afb7259aebb8
child 465 d4bf81734dfe
added datatype section
doc-src/Logics/Old_HOL.tex
     1.1 --- a/doc-src/Logics/Old_HOL.tex	Tue Jul 12 09:14:04 1994 +0200
     1.2 +++ b/doc-src/Logics/Old_HOL.tex	Tue Jul 12 09:28:00 1994 +0200
     1.3 @@ -1233,6 +1233,302 @@
     1.4  covers finite lists~\cite{paulson-coind}.
     1.5  
     1.6  
     1.7 +\section{Datatype declarations}
     1.8 +\index{*datatype|(}
     1.9 +
    1.10 +\underscoreon
    1.11 +
    1.12 +It is often necessary to extend a theory with \ML-like datatypes.  This
    1.13 +extension consists of the new type, declarations of its constructors and
    1.14 +rules that describe the new type. The theory definition section {\tt
    1.15 +  datatype} represents a compact way of doing this.
    1.16 +
    1.17 +
    1.18 +\subsection{Foundations}
    1.19 +
    1.20 +A datatype declaration has the following general structure:
    1.21 +\[ \mbox{\tt datatype}~ (\alpha_1,\dots,\alpha_n)t ~=~
    1.22 +      c_1(\tau_{11},\dots,\tau_{1k_1}) ~\mid~ \dots ~\mid~
    1.23 +      c_m(\tau_{m1},\dots,\tau_{mk_m}) 
    1.24 +\]
    1.25 +where $\alpha_i$ are type variables, $c_i$ are distinct constructor names and
    1.26 +$\tau_{ij}$ are one of the following:
    1.27 +\begin{itemize}
    1.28 +\item type variables $\alpha_1,\dots,\alpha_n$,
    1.29 +\item types $(\beta_1,\dots,\beta_l)s$ where $s$ is a previously declared
    1.30 +  type or type synonym and $\{\beta_1,\dots,\beta_l\} \subseteq
    1.31 +  \{\alpha_1,\dots,\alpha_n\}$,
    1.32 +\item the newly defined type $(\alpha_1,\dots,\alpha_n)t$ \footnote{This
    1.33 +    makes it a recursive type. To ensure that the new type is not empty at
    1.34 +    least one constructor must consist of only non-recursive type
    1.35 +    components.}
    1.36 +\end{itemize}
    1.37 +The constructors are automatically defined as functions of their respective
    1.38 +type:
    1.39 +\[ c_j : [\tau_{j1},\dots,\tau_{jk_j}] \To (\alpha_1,\dots,\alpha_n)t \]
    1.40 +These functions have certain {\em freeness} properties:
    1.41 +\begin{description}
    1.42 +\item[\tt inject] They are injective:
    1.43 +\[ (c_j(x_1,\dots,x_{k_j}) = c_j(y_1,\dots,y_{k_j})) =
    1.44 +   (x_1 = y_1 \land \dots \land x_{k_j} = y_{k_j})
    1.45 +\]
    1.46 +\item[\tt ineq] They are distinct:
    1.47 +\[ c_i(x_1,\dots,x_{k_i}) \neq c_j(y_1,\dots,y_{k_j}) \qquad
    1.48 +   \mbox{for all}~ i \neq j.
    1.49 +\]
    1.50 +\end{description}
    1.51 +Because the number of inequalities is quadratic in the number of
    1.52 +constructors, a different method is used if their number exceeds
    1.53 +a certain value, currently 4. In that case every constructor is mapped to a
    1.54 +natural number
    1.55 +\[
    1.56 +\begin{array}{lcl}
    1.57 +\mbox{\it t\_ord}(c_1(x_1,\dots,x_{k_1})) & = & 0 \\
    1.58 +& \vdots & \\
    1.59 +\mbox{\it t\_ord}(c_m(x_1,\dots,x_{k_m})) & = & m-1
    1.60 +\end{array}
    1.61 +\]
    1.62 +and inequality of constructors is expressed by:
    1.63 +\[
    1.64 +\mbox{\it t\_ord}(x) \neq \mbox{\it t\_ord}(y) \Imp x \neq y.
    1.65 +\]
    1.66 +In addition a structural induction axiom {\tt induct} is provided: 
    1.67 +\[
    1.68 +\infer{P(x)}
    1.69 +{\begin{array}{lcl}
    1.70 +\Forall x_1\dots x_{k_1}.
    1.71 +  \List{P(x_{r_{11}}); \dots; P(x_{r_{1l_1}})} &
    1.72 +  \Imp  & P(c_1(x_1,\dots,x_{k_1})) \\
    1.73 + & \vdots & \\
    1.74 +\Forall x_1\dots x_{k_m}.
    1.75 +  \List{P(x_{r_{m1}}); \dots; P(x_{r_{ml_m}})} &
    1.76 +  \Imp & P(c_m(x_1,\dots,x_{k_m}))
    1.77 +\end{array}}
    1.78 +\]
    1.79 +where $\{r_{j1},\dots,r_{jl_j}\} = \{i \in \{1,\dots k_j\} ~\mid~ \tau_{ji}
    1.80 += (\alpha_1,\dots,\alpha_n)t \}$, i.e.\ the property $P$ can be assumed for
    1.81 +all arguments of the recursive type.
    1.82 +
    1.83 +The type also comes with an \ML-like {\tt case}-construct:
    1.84 +\[
    1.85 +\begin{array}{rrcl}
    1.86 +\mbox{\tt case}~e~\mbox{\tt of} & c_1(y_{11},\dots,y_{1k_1}) & \To & e_1 \\
    1.87 +                           \vdots \\
    1.88 +                           \mid & c_m(y_{m1},\dots,y_{mk_m}) & \To & e_m
    1.89 +\end{array}
    1.90 +\]
    1.91 +In contrast to \ML, {\em all} constructors must be present, their order is
    1.92 +fixed, and nested patterns are not supported.
    1.93 +
    1.94 +
    1.95 +\subsection{Defining datatypes}
    1.96 +
    1.97 +A datatype is defined in a theory definition file using the keyword {\tt
    1.98 +  datatype}. The definition following {\tt datatype} must conform to the
    1.99 +syntax of {\em typedecl} specified in Fig.~\ref{datatype-grammar} and must
   1.100 +obey the rules in the previous section. As a result the theory is extended
   1.101 +with the new type, the constructors, and the theorems listed in the previous
   1.102 +section.
   1.103 +
   1.104 +\begin{figure}
   1.105 +\begin{rail}
   1.106 +typedecl : typevarlist id '=' (cons + '|')
   1.107 +         ;
   1.108 +cons     : (id | string) ( () | '(' (typ + ',') ')' ) ( () | mixfix )
   1.109 +         ;
   1.110 +typ      : typevarlist id
   1.111 +           | tid
   1.112 +	 ;
   1.113 +typevarlist : () | tid | '(' (tid + ',') ')'
   1.114 +         ;
   1.115 +\end{rail}
   1.116 +\caption{Syntax of datatype declarations}
   1.117 +\label{datatype-grammar}
   1.118 +\end{figure}
   1.119 +
   1.120 +Reading the theory file produces a structure which, in addtion to the usual
   1.121 +components, contains a structure named $t$ for each datatype $t$ defined in
   1.122 +the file\footnote{Otherwise multiple datatypes in the same theory file would
   1.123 +  lead to name clashes.}. Each structure $t$ contains the following elements:
   1.124 +\begin{ttbox}
   1.125 +val induct : thm
   1.126 +val inject : thm list
   1.127 +val ineq : thm list
   1.128 +val cases : thm list
   1.129 +val simps : thm list
   1.130 +val induct_tac : string -> int -> tactic
   1.131 +\end{ttbox}
   1.132 +{\tt induct}, {\tt inject} and {\tt ineq} contain the theorems described
   1.133 +above. For convenience {\tt ineq} contains inequalities in both directions.
   1.134 +\begin{warn}
   1.135 +  If there are five or more constructors, the {\em t\_ord} scheme is used for
   1.136 +  {\tt ineq}.  In this case the theory {\tt Arith} must be contained
   1.137 +  in the current theory, if necessary by adding it explicitly.
   1.138 +\end{warn}
   1.139 +The reduction rules of the {\tt case}-construct can be found in {\tt cases}.
   1.140 +All theorems from {\tt inject}, {\tt ineq} and {\tt cases} are combined in
   1.141 +{\tt simps} for use with the simplifier. The tactic ``{\verb$induct_tac$~{\em
   1.142 +    var i}\/}'' applies structural induction over variable {\em var} to
   1.143 +subgoal {\em i}.
   1.144 +
   1.145 +
   1.146 +\subsection{Examples}
   1.147 +
   1.148 +\subsubsection{The datatype $\alpha~list$}
   1.149 +
   1.150 +We want to define the type $\alpha~list$.\footnote{Of course there is
   1.151 +a list type in HOL already. But for now let us assume that we have to define
   1.152 +a new type.} To do this we have to build a new theory that contains the
   1.153 +type definition. We start from {\tt HOL}.
   1.154 +\begin{ttbox}
   1.155 +MyList = HOL +
   1.156 +  datatype 'a list = Nil | Cons ('a, 'a list)
   1.157 +end
   1.158 +\end{ttbox}
   1.159 +After loading the theory with \verb$use_thy "MyList"$, we can prove
   1.160 +$Cons(x,xs)\neq xs$ using the new theory.  First we build a suitable simpset
   1.161 +for the simplifier:
   1.162 +
   1.163 +\begin{ttbox}
   1.164 +val mylist_ss = HOL_ss addsimps MyList.list.simps;
   1.165 +goal MyList.thy "!x. Cons(x,xs) ~= xs";
   1.166 +{\out Level 0}
   1.167 +{\out ! x. Cons(x, xs) ~= xs}
   1.168 +{\out  1. ! x. Cons(x, xs) ~= xs}
   1.169 +\end{ttbox}
   1.170 +This can be proved by the structural induction tactic:
   1.171 +\begin{ttbox}
   1.172 +by (MyList.list.induct_tac "xs" 1);
   1.173 +{\out Level 1}
   1.174 +{\out ! x. Cons(x, xs) ~= xs}
   1.175 +{\out  1. ! x. Cons(x, Nil) ~= Nil}
   1.176 +{\out  2. !!a list.}
   1.177 +{\out        ! x. Cons(x, list) ~= list ==>}
   1.178 +{\out        ! x. Cons(x, Cons(a, list)) ~= Cons(a, list)}
   1.179 +\end{ttbox}
   1.180 +The first subgoal can be proved with the simplifier and the {\tt ineq} axiom
   1.181 +that is part of \verb$mylist_ss$.
   1.182 +\begin{ttbox}
   1.183 +by (simp_tac mylist_ss 1);
   1.184 +{\out Level 2}
   1.185 +{\out ! x. Cons(x, xs) ~= xs}
   1.186 +{\out  1. !!a list.}
   1.187 +{\out        ! x. Cons(x, list) ~= list ==>}
   1.188 +{\out        ! x. Cons(x, Cons(a, list)) ~= Cons(a, list)}
   1.189 +\end{ttbox}
   1.190 +Using the {\tt inject} axiom we can quickly prove the remaining goal.
   1.191 +\begin{ttbox}
   1.192 +by (asm_simp_tac mylist_ss 1);
   1.193 +{\out Level 3}
   1.194 +{\out ! x. Cons(x, xs) ~= xs}
   1.195 +{\out No subgoals!}
   1.196 +\end{ttbox}
   1.197 +Because both subgoals were proved by almost the same tactic we could have
   1.198 +done that in one step using
   1.199 +\begin{ttbox}
   1.200 +by (ALLGOALS (asm_simp_tac mylist_ss));
   1.201 +\end{ttbox}
   1.202 +
   1.203 +
   1.204 +\subsubsection{The datatype $\alpha~list$ with mixfix syntax}
   1.205 +
   1.206 +In this example we define the type $\alpha~list$ again but this time we want
   1.207 +to write {\tt []} instead of {\tt Nil} and we want to use the infix operator
   1.208 +\verb|#| instead of {\tt Cons}. To do this we simply add mixfix annotations
   1.209 +after the constructor declarations as follows:
   1.210 +\begin{ttbox}
   1.211 +MyList = HOL +
   1.212 +  datatype 'a list = "[]" ("[]") 
   1.213 +                   | "#" ('a, 'a list) (infixr 70)
   1.214 +end
   1.215 +\end{ttbox}
   1.216 +Now the theorem in the previous example can be written \verb|x#xs ~= xs|. The
   1.217 +proof is the same.
   1.218 +
   1.219 +\subsubsection{Defining functions on datatypes}
   1.220 +
   1.221 +Normally one wants to define functions dealing with a newly defined type and
   1.222 +prove properties of these functions. As an example let us define \verb|@| for
   1.223 +concatenation and {\tt ttl} for the (total) tail of a list, i.e.\ the list
   1.224 +without its first element. We do this in theory \verb|List_fun| which is an
   1.225 +extension of {\tt MyList}:
   1.226 +\begin{ttbox}
   1.227 +List_fun = MyList +
   1.228 +consts ttl  :: "'a list => 'a list"
   1.229 +       "@"  :: "['a list,'a list] => 'a list"  (infixr 60)
   1.230 +rules 
   1.231 +       ttl_def   "ttl (l)  == case l of [] => []  |  y#ys => ys"
   1.232 +
   1.233 +       app_Nil   "[] @ ys = ys"
   1.234 +       app_Cons  "(x#xs) @ ys = x#(xs @ ys)"
   1.235 +end
   1.236 +\end{ttbox}
   1.237 +Let us have a look at the use of {\tt case} in the definition of {\tt ttl}.
   1.238 +The expression {\tt case e of [] => [] | y\#ys => ys} evaluates to {\tt []}
   1.239 +if \verb|e=[]| and to {\tt ys} if \verb|e=y#ys|. These properties are
   1.240 +trivial to derive by simplification:
   1.241 +\begin{ttbox}
   1.242 +val mylist_ss = HOL_ss addsimps MyList.list.simps;
   1.243 +
   1.244 +goalw List_fun.thy [List_fun.ttl_def] "ttl([]) = []";
   1.245 +by (simp_tac mylist_ss 1);
   1.246 +val ttl_Nil = result();
   1.247 +
   1.248 +goalw List_fun.thy [List_fun.ttl_def] "ttl(y#ys) = ys";
   1.249 +by (simp_tac mylist_ss 1);
   1.250 +val ttl_Cons = result();
   1.251 +
   1.252 +val list_fun_ss = mylist_ss addsimps
   1.253 +      [ttl_Nil, ttl_Cons, List_fun.app_Nil, List_fun.app_Cons];
   1.254 +\end{ttbox}
   1.255 +Note that we include the derived properties in our simpset, not the original
   1.256 +definition of {\tt ttl}. The former are better behaved because they only
   1.257 +apply if the argument is already a constructor.
   1.258 +
   1.259 +One could have defined \verb$@$ with a single {\tt case}-construct
   1.260 +\begin{ttbox}
   1.261 +app_def "x @ y == case x of [] => y  |  z#zs => z # (zs @ y)"
   1.262 +\end{ttbox}  
   1.263 +and derived \verb$app_Nil$ and \verb$app_Cons$ from it. But \verb$app_def$ is
   1.264 +not easy to work with: the left-hand side matches a subterm of the right-hand
   1.265 +side, which causes the simplifier to loop.
   1.266 +
   1.267 +Here is a simple proof of the associativity of \verb$@$:
   1.268 +\begin{ttbox}
   1.269 +goal List_fun.thy "(x @ y) @ z = x @ (y @ z)";
   1.270 +by (MyList.list.induct_tac "x" 1);
   1.271 +by (simp_tac list_fun_ss 1);
   1.272 +by (asm_simp_tac list_fun_ss 1);
   1.273 +\end{ttbox}
   1.274 +The last two steps can again be combined using {\tt ALLGOALS}.
   1.275 +
   1.276 +
   1.277 +\subsubsection{A datatype for weekdays}
   1.278 +
   1.279 +This example shows a datatype that consists of more than four constructors:
   1.280 +\begin{ttbox}
   1.281 +Days = Arith +
   1.282 +  datatype days = Mo | Tu | We | Th | Fr | Sa | So
   1.283 +end
   1.284 +\end{ttbox}
   1.285 +Because there are more than four constructors, the theory must be based on
   1.286 +{\tt Arith}. Inequality is defined via a function \verb|days_ord|. Although
   1.287 +the expression \verb|Mo ~= Tu| is not directly contained in {\tt ineq}, it
   1.288 +can be proved by the simplifier if \verb$arith_ss$ is used:
   1.289 +\begin{ttbox}
   1.290 +val days_ss = arith_ss addsimps Days.days.simps;
   1.291 +
   1.292 +goal Days.thy "Mo ~= Tu";
   1.293 +by (simp_tac days_ss 1);
   1.294 +\end{ttbox}
   1.295 +Note that usually it is not necessary to derive these inequalities explicitly
   1.296 +because the simplifier will dispose of them automatically.
   1.297 +
   1.298 +\index{*datatype|)}
   1.299 +\underscoreoff
   1.300 +
   1.301 +
   1.302 +
   1.303  \section{The examples directories}
   1.304  Directory {\tt HOL/Subst} contains Martin Coen's mechanisation of a theory of
   1.305  substitutions and unifiers.  It is based on Paulson's previous