doc-src/IsarRef/syntax.tex
changeset 7050 c70d3402fef5
parent 7046 9f755ff43cff
child 7134 320b412e5800
     1.1 --- a/doc-src/IsarRef/syntax.tex	Tue Jul 20 10:34:17 1999 +0200
     1.2 +++ b/doc-src/IsarRef/syntax.tex	Tue Jul 20 18:50:46 1999 +0200
     1.3 @@ -1,28 +1,59 @@
     1.4 +
     1.5 +%FIXME
     1.6 +% - examples (!?)
     1.7 +
     1.8  
     1.9  \chapter{Isar document syntax}
    1.10  
    1.11 -\section{Inner versus outer syntax}
    1.12 +FIXME important note: inner versus outer syntax
    1.13  
    1.14  \section{Lexical matters}
    1.15  
    1.16  \section{Common syntax entities}
    1.17  
    1.18 -\subsection{Atoms}
    1.19 +The Isar proof and theory language syntax has been carefully designed with
    1.20 +orthogonality in mind.  Many common syntax entities such that those for names,
    1.21 +terms, types etc.\ have been factored out.  Some of these basic syntactic
    1.22 +entities usually represent the level of abstraction for error messages: e.g.\ 
    1.23 +some higher syntax element such as $\CONSTS$ referring to \railtoken{name} or
    1.24 +\railtoken{type}, would really report a missing \railtoken{name} or
    1.25 +\railtoken{type} rather than any of its constituent primitive tokens (as
    1.26 +defined below).  These quasi-tokens are represented in the syntax diagrams
    1.27 +below using the same font as actual tokens (such as \railtoken{string}).
    1.28  
    1.29 +
    1.30 +\subsection{Names}
    1.31 +
    1.32 +Entity \railtoken{name} usually refers to any name of types, constants,
    1.33 +theorems, etc.\ to be \emph{declared} or \emph{defined} (so qualified
    1.34 +identifiers are excluded).  Already existing objects are typically referenced
    1.35 +by \railtoken{nameref}.
    1.36 +
    1.37 +\indexoutertoken{name}\indexoutertoken{nameref}
    1.38  \begin{rail}
    1.39    name : ident | symident | string
    1.40    ;
    1.41 -
    1.42    nameref : name | longident
    1.43    ;
    1.44 -
    1.45 -  text : nameref | verbatim
    1.46 -  ;
    1.47  \end{rail}
    1.48  
    1.49 +
    1.50  \subsection{Comments}
    1.51  
    1.52 +Large chunks of verbatim \railtoken{text} are usually given
    1.53 +\railtoken{verbatim}, i.e.\ enclosed in \verb|{*|~\verb|*}|; for convenience,
    1.54 +any of the smaller text entities (\railtoken{ident}, \railtoken{string} etc.)
    1.55 +are admitted as well.  Almost any of the Isar commands may be annotated by a
    1.56 +marginal comment: \texttt{--} \railtoken{text}.  Note that this kind of
    1.57 +comment is actually part of the language, while source level comments
    1.58 +\verb|(*|~\verb|*)| are already stripped at the lexical level.  A few commands
    1.59 +such as $\PROOFNAME$ admit some parts to be mark with a ``level of interest'':
    1.60 +currently only \texttt{\%} for ``boring, don't read this''.
    1.61 +
    1.62 +\indexoutertoken{text}\indexouternonterm{comment}\indexouternonterm{interest}
    1.63  \begin{rail}
    1.64 +  text : verbatim | nameref
    1.65 +  ;
    1.66    comment : (() | '--' text)
    1.67    ;
    1.68    interest : (() | '\%')
    1.69 @@ -32,30 +63,141 @@
    1.70  
    1.71  \subsection{Sorts and arities}
    1.72  
    1.73 +The syntax of sorts and arities is given directly at the outer level.  Note
    1.74 +that this in contrast to that types and terms (see below).  Only few commands
    1.75 +ever refer to sorts or arities explicitly.
    1.76 +
    1.77 +\indexouternonterm{sort}\indexouternonterm{arity}\indexouternonterm{simplearity}
    1.78  \begin{rail}
    1.79    sort : nameref | lbrace (nameref * ',') rbrace
    1.80    ;
    1.81    arity : ( () | '(' (sort + ',') ')' ) sort
    1.82    ;
    1.83 -  simple\-arity : ( () | '(' (sort + ',') ')' ) nameref
    1.84 +  simplearity : ( () | '(' (sort + ',') ')' ) nameref
    1.85    ;
    1.86  \end{rail}
    1.87  
    1.88  
    1.89 -\subsection{Terms and Types}
    1.90 +\subsection{Types and terms}
    1.91  
    1.92 +The actual inner Isabelle syntax, i.e.\ that of types and terms, is far too
    1.93 +flexible in order to be modeled explicitly at the outer theory level.
    1.94 +Basically, any such entity would have to be quoted at the outer level to turn
    1.95 +it into a single token, with the actual parsing deferred to some functions
    1.96 +that read and type-check terms etc.\ (note that \railtoken{prop}s will be
    1.97 +handled differently from plain \railtoken{term}s here).  For convenience, the
    1.98 +quotes may be omitted for any \emph{atomic} term or type (e.g.\ a single
    1.99 +variable).
   1.100 +
   1.101 +\indexoutertoken{type}\indexoutertoken{term}\indexoutertoken{prop}
   1.102  \begin{rail}
   1.103 -  
   1.104 +  type : ident | longident | symident | typefree | typevar | string
   1.105 +  ;
   1.106 +  term : ident | longident | symident | var | textvar | nat | string
   1.107 +  ;
   1.108 +  prop : term
   1.109 +  ;
   1.110  \end{rail}
   1.111  
   1.112 +Type definitions etc.\ usually refer to \railnonterm{typespec} on the
   1.113 +left-hand side.  This models basic type constructor application at the outer
   1.114 +syntax level.  Note that only plain postfix notation is available here, but no
   1.115 +infixes.
   1.116 +
   1.117 +\indexouternonterm{typespec}
   1.118 +\begin{rail}
   1.119 +  typespec : (() | typefree | '(' ( typefree + ',' ) ')') name
   1.120 +  ;
   1.121 +\end{rail}
   1.122 +
   1.123 +
   1.124 +\subsection{Term patterns}
   1.125 +
   1.126 +Statements like $\SHOWNAME$ involve propositions, some others like $\DEFNAME$
   1.127 +plain terms.  Any of these usually admit automatic binding of schematic text
   1.128 +variables by giving (optional) patterns $\IS{p@1 \dots p@n}$.  For
   1.129 +\railtoken{prop}s the $\CONCLNAME$ part refers to the conclusion only, in case
   1.130 +actual rules are involved, rather than atomic propositions.
   1.131 +
   1.132 +\indexouternonterm{termpat}\indexouternonterm{proppat}
   1.133 +\begin{rail}
   1.134 +  termpat : '(' (term + 'is' ) ')'
   1.135 +  ;
   1.136 +  proppat : '(' (() | (prop + 'is' )) (() | 'concl' (prop + 'is' )) ')'
   1.137 +  ;
   1.138 +\end{rail}
   1.139 +
   1.140 +
   1.141  \subsection{Mixfix annotations}
   1.142  
   1.143 +Mixfix annotations specify concrete \emph{inner} syntax.  Some commands such
   1.144 +as $\TYPES$ admit infixes only, while $\CONSTS$ etc.\ support the full range
   1.145 +of general mixfixes and binders.
   1.146  
   1.147 -\subsection{}
   1.148 +\indexouternonterm{infix}\indexouternonterm{mixfix}
   1.149 +\begin{rail}
   1.150 +  infix : '(' ('infixl' | 'infixr') (() | string) nat ')'
   1.151 +  ;
   1.152  
   1.153 -\subsection{}
   1.154 +  mixfix : infix | string (() | '[' (nat + ',') ']') (() | nat) |
   1.155 +  'binder' string (() | '[' (nat + ',') ']') nat
   1.156 +  ;
   1.157 +\end{rail}
   1.158  
   1.159 -\subsection{}
   1.160 +
   1.161 +\subsection{Attributes and theorem specifications}\label{sec:syn-att}
   1.162 +
   1.163 +Attributes (and proof methods, see \S\ref{sec:syn-meth}) have their own
   1.164 +``semi-inner'' syntax, which does not have to be atomic at the outer level
   1.165 +unlike that of types and terms.  Instead, the attribute argument
   1.166 +specifications may be any sequence of atomic entities (identifiers, strings
   1.167 +etc.), or properly bracketed argument lists.  Below \railtoken{atom} refers to
   1.168 +any atomic entity (\railtoken{ident}, \railtoken{longident},
   1.169 +\railtoken{symident} etc.), including keywords that conform to
   1.170 +\railtoken{symident}, but do not coincide with actual command names.
   1.171 +
   1.172 +\indexoutertoken{atom}\indexouternonterm{args}\indexouternonterm{attributes}
   1.173 +\begin{rail}
   1.174 +  args : (atom | '(' (args *) ')' | '[' (args *) ']' | lbrace (args *) rbrace) *
   1.175 +  ;
   1.176 +  attributes : '[' (name args + ',') ']'
   1.177 +  ;
   1.178 +\end{rail}
   1.179 +
   1.180 +Theorem specifications come in three flavours: \railnonterm{thmdecl} usually
   1.181 +refers to the result of a goal statement (such as $\SHOWNAME$),
   1.182 +\railnonterm{thmdef} collects lists of existing theorems (as in $\NOTENAME$),
   1.183 +\railnonterm{thmref} refers to any list of existing theorems (e.g.\ occurring
   1.184 +as proof method arguments).  Any of these may include lists of attributes,
   1.185 +which are applied to the preceding theorem or list of theorems.
   1.186 +
   1.187 +\indexouternonterm{thmdecl}\indexouternonterm{thmdef}\indexouternonterm{thmref}
   1.188 +\begin{rail}
   1.189 +  thmdecl : (() | name) (() | attributes) ':'
   1.190 +  ;
   1.191 +  thmdef : (() | name) (() | attributes) '='
   1.192 +  ;
   1.193 +  thmref : (name (() | attributes) +)
   1.194 +  ;
   1.195 +\end{rail}
   1.196 +
   1.197 +
   1.198 +\subsection{Proof methods}\label{sec:syn-meth}
   1.199 +
   1.200 +Proof methods are either basic ones, or expressions composed of methods via
   1.201 +``\texttt{,}'' (sequential composition), ``\texttt{|}'' (alternatives),
   1.202 +``\texttt{?}'' (try), ``\texttt{*}'' (repeat, ${} \ge 0$ times),
   1.203 +``\texttt{+}'' (repeat, ${} > 0$ times).  In practice, proof methods are
   1.204 +typically just a comma separeted list of \railtoken{name}~\railtoken{args}
   1.205 +specifications.  Thus the syntax is similar to that of attributes, with plain
   1.206 +parentheses instead of square brackets (see also \S\ref{sec:syn-att}).  Note
   1.207 +that parentheses may be dropped for methods without arguments.
   1.208 +
   1.209 +\indexouternonterm{method}
   1.210 +\begin{rail}
   1.211 +  method : (name args | (name | '(' method ')') (() | '?' | '*' | '+')) + (',' | '|')
   1.212 +  ;
   1.213 +\end{rail}
   1.214  
   1.215  
   1.216  %%% Local Variables: