doc-src/IsarRef/syntax.tex
author wenzelm
Mon, 12 Feb 2001 20:44:02 +0100
changeset 11100 34d58b1818f4
parent 10858 479dad7b3b41
child 11651 201b3f76c7b7
permissions -rw-r--r--
\<subseteq> syntax for classes/classrel/axclass/instance;
     1 
     2 \chapter{Isar Syntax Primitives}
     3 
     4 We give a complete reference of all basic syntactic entities underlying the
     5 Isabelle/Isar document syntax.  Actual theory and proof commands will be
     6 introduced later on.
     7 
     8 \medskip
     9 
    10 In order to get started with writing well-formed Isabelle/Isar documents, the
    11 most important aspect to be noted is the difference of \emph{inner} versus
    12 \emph{outer} syntax.  Inner syntax is that of Isabelle types and terms of the
    13 logic, while outer syntax is that of Isabelle/Isar theories (including
    14 proofs).  As a general rule, inner syntax entities may occur only as
    15 \emph{atomic entities} within outer syntax.  For example, the string
    16 \texttt{"x + y"} and identifier \texttt{z} are legal term specifications
    17 within a theory, while \texttt{x + y} is not.
    18 
    19 \begin{warn}
    20   Note that classic Isabelle theories used to fake parts of the inner syntax
    21   of types, with rather complicated rules when quotes may be omitted.  Despite
    22   the minor drawback of requiring quotes more often, the syntax of
    23   Isabelle/Isar is much simpler and more robust in that respect.
    24 \end{warn}
    25 
    26 \medskip
    27 
    28 Isabelle/Isar input may contain any number of input termination characters
    29 ``\texttt{;}'' (semicolon) to separate commands explicitly.  This may be
    30 particularly useful in interactive shell sessions to make clear where the
    31 current command is intended to end.  Otherwise, the interactive loop will
    32 continue until end-of-command is clearly indicated from the input syntax,
    33 e.g.\ encounter of the next command keyword.
    34 
    35 Advanced interfaces such as Proof~General \cite{proofgeneral} do not require
    36 explicit semicolons, the amount of input text is determined automatically by
    37 inspecting the Emacs text buffer.  Also note that in the presentation of
    38 Isabelle/Isar documents, semicolons are omitted in any case to gain
    39 readability.
    40 
    41 
    42 \section{Lexical matters}\label{sec:lex-syntax}
    43 
    44 The Isabelle/Isar outer syntax provides token classes as presented below.
    45 Note that some of these coincide (by full intention) with the inner lexical
    46 syntax as presented in \cite{isabelle-ref}.  These different levels of syntax
    47 should not be confused, though.
    48 
    49 %FIXME keyword, command
    50 \indexoutertoken{ident}\indexoutertoken{longident}\indexoutertoken{symident}
    51 \indexoutertoken{nat}\indexoutertoken{var}\indexoutertoken{typefree}
    52 \indexoutertoken{typevar}\indexoutertoken{string}\indexoutertoken{verbatim}
    53 \begin{matharray}{rcl}
    54   ident & = & letter~quasiletter^* \\
    55   longident & = & ident\verb,.,ident~\dots~ident \\
    56   symident & = & sym^+ ~|~ symbol \\
    57   nat & = & digit^+ \\
    58   var & = & \verb,?,ident ~|~ \verb,?,ident\verb,.,nat \\
    59   typefree & = & \verb,',ident \\
    60   typevar & = & \verb,?,typefree ~|~ \verb,?,typefree\verb,.,nat \\
    61   string & = & \verb,", ~\dots~ \verb,", \\
    62   verbatim & = & \verb,{*, ~\dots~ \verb,*}, \\
    63 \end{matharray}
    64 \begin{matharray}{rcl}
    65   letter & = & \verb,a, ~|~ \dots ~|~ \verb,z, ~|~ \verb,A, ~|~ \dots ~|~ \verb,Z, \\
    66   digit & = & \verb,0, ~|~ \dots ~|~ \verb,9, \\
    67   quasiletter & = & letter ~|~ digit ~|~ \verb,_, ~|~ \verb,', \\
    68   sym & = & \verb,!, ~|~ \verb,#, ~|~ \verb,$, ~|~ \verb,%, ~|~ \verb,&, ~|~  %$
    69    \verb,*, ~|~ \verb,+, ~|~ \verb,-, ~|~ \verb,/, ~|~ \verb,:, ~|~ \\
    70   & & \verb,<, ~|~ \verb,=, ~|~ \verb,>, ~|~ \verb,?, ~|~ \texttt{\at} ~|~
    71   \verb,^, ~|~ \verb,_, ~|~ \verb,`, ~|~ \verb,|, ~|~ \verb,~, \\
    72   symbol & = & {\forall} ~|~ {\exists} ~|~ \dots
    73 \end{matharray}
    74 
    75 The syntax of \texttt{string} admits any characters, including newlines;
    76 ``\verb|"|'' (double-quote) and ``\verb|\|'' (backslash) have to be escaped by
    77 a backslash; newlines need not be escaped.  Note that ML-style control
    78 characters are \emph{not} supported.  The body of \texttt{verbatim} may
    79 consist of any text not containing ``\verb|*}|''.
    80 
    81 Comments take the form \texttt{(*~\dots~*)} and may be
    82 nested\footnote{Proof~General may occasionally get confused by nested
    83   comments.}, just as in ML. Note that these are \emph{source} comments only,
    84 which are stripped after lexical analysis of the input.  The Isar document
    85 syntax also provides \emph{formal comments} that are actually part of the text
    86 (see \S\ref{sec:comments}).
    87 
    88 Mathematical symbols such as ``$\forall$'' are represented in plain ASCII as
    89 ``\verb,\<forall>,''.
    90 
    91 
    92 \section{Common syntax entities}
    93 
    94 Subsequently, we introduce several basic syntactic entities, such as names,
    95 terms, and theorem specifications, which have been factored out of the actual
    96 Isar language elements to be described later.
    97 
    98 Note that some of the basic syntactic entities introduced below (e.g.\ 
    99 \railqtoken{name}) act much like tokens rather than plain nonterminals (e.g.\ 
   100 \railnonterm{sort}), especially for the sake of error messages.  E.g.\ syntax
   101 elements such as $\CONSTS$ referring to \railqtoken{name} or \railqtoken{type}
   102 would really report a missing name or type rather than any of the constituent
   103 primitive tokens such as \railtoken{ident} or \railtoken{string}.
   104 
   105 
   106 \subsection{Names}
   107 
   108 Entity \railqtoken{name} usually refers to any name of types, constants,
   109 theorems etc.\ that are to be \emph{declared} or \emph{defined} (so qualified
   110 identifiers are excluded here).  Quoted strings provide an escape for
   111 non-identifier names or those ruled out by outer syntax keywords (e.g.\ 
   112 \verb|"let"|).  Already existing objects are usually referenced by
   113 \railqtoken{nameref}.
   114 
   115 \indexoutertoken{name}\indexoutertoken{parname}\indexoutertoken{nameref}
   116 \indexoutertoken{int}
   117 \begin{rail}
   118   name: ident | symident | string | nat
   119   ;
   120   parname: '(' name ')'
   121   ;
   122   nameref: name | longident
   123   ;
   124   int: nat | '-' nat
   125   ;
   126 \end{rail}
   127 
   128 
   129 \subsection{Comments}\label{sec:comments}
   130 
   131 Large chunks of plain \railqtoken{text} are usually given
   132 \railtoken{verbatim}, i.e.\ enclosed in \verb|{*|~\dots~\verb|*}|.  For
   133 convenience, any of the smaller text units conforming to \railqtoken{nameref}
   134 are admitted as well.  Almost any of the Isar commands may be annotated by
   135 marginal \railnonterm{comment} of the form \texttt{--} \railqtoken{text}.
   136 Note that the latter kind of comment is actually part of the language, while
   137 source level comments \verb|(*|~\dots~\verb|*)| are stripped at the lexical
   138 level.  A few commands such as $\PROOFNAME$ admit additional markup with a
   139 ``level of interest'': \texttt{\%} followed by an optional number $n$ (default
   140 $n = 1$) indicates that the respective part of the document becomes $n$ levels
   141 more obscure; \texttt{\%\%} means that interest drops by $\infty$ --- abandon
   142 every hope, who enter here.
   143 
   144 \indexoutertoken{text}\indexouternonterm{comment}\indexouternonterm{interest}
   145 \begin{rail}
   146   text: verbatim | nameref
   147   ;
   148   comment: ('--' text +)
   149   ;
   150   interest: percent nat? | ppercent
   151   ;
   152 \end{rail}
   153 
   154 
   155 \subsection{Type classes, sorts and arities}
   156 
   157 Classes are specified by plain names.  Sorts have a very simple inner syntax,
   158 which is either a single class name $c$ or a list $\{c@1, \dots, c@n\}$
   159 referring to the intersection of these classes.  The syntax of type arities is
   160 given directly at the outer level.
   161 
   162 \railalias{subseteq}{\isasymsubseteq}
   163 \railterm{subseteq}
   164 
   165 \indexouternonterm{sort}\indexouternonterm{arity}\indexouternonterm{simplearity}
   166 \indexouternonterm{classdecl}
   167 \begin{rail}
   168   classdecl: name (('<' | subseteq) (nameref + ','))?
   169   ;
   170   sort: nameref
   171   ;
   172   arity: ('(' (sort + ',') ')')? sort
   173   ;
   174   simplearity: ('(' (sort + ',') ')')? nameref
   175   ;
   176 \end{rail}
   177 
   178 
   179 \subsection{Types and terms}\label{sec:types-terms}
   180 
   181 The actual inner Isabelle syntax, that of types and terms of the logic, is far
   182 too sophisticated in order to be modelled explicitly at the outer theory
   183 level.  Basically, any such entity has to be quoted to turn it into a single
   184 token (the parsing and type-checking is performed internally later).  For
   185 convenience, a slightly more liberal convention is adopted: quotes may be
   186 omitted for any type or term that is already \emph{atomic} at the outer level.
   187 For example, one may write just \texttt{x} instead of \texttt{"x"}.  Note that
   188 symbolic identifiers (e.g.\ \texttt{++} or $\forall$) are available as well,
   189 provided these are not superseded by commands or keywords (e.g.\ \texttt{+}).
   190 
   191 \indexoutertoken{type}\indexoutertoken{term}\indexoutertoken{prop}
   192 \begin{rail}
   193   type: nameref | typefree | typevar
   194   ;
   195   term: nameref | var
   196   ;
   197   prop: term
   198   ;
   199 \end{rail}
   200 
   201 Positional instantiations are indicated by giving a sequence of terms, or the
   202 placeholder ``$\_$'' (underscore), which means to skip a position.
   203 
   204 \indexoutertoken{inst}\indexoutertoken{insts}
   205 \begin{rail}
   206   inst: underscore | term
   207   ;
   208   insts: (inst *)
   209   ;
   210 \end{rail}
   211 
   212 Type declarations and definitions usually refer to \railnonterm{typespec} on
   213 the left-hand side.  This models basic type constructor application at the
   214 outer syntax level.  Note that only plain postfix notation is available here,
   215 but no infixes.
   216 
   217 \indexouternonterm{typespec}
   218 \begin{rail}
   219   typespec: (() | typefree | '(' ( typefree + ',' ) ')') name
   220   ;
   221 \end{rail}
   222 
   223 
   224 \subsection{Term patterns}\label{sec:term-pats}
   225 
   226 Assumptions and goal statements usually admit casual binding of schematic term
   227 variables by giving (optional) patterns of the form $\ISS{p@1\;\dots}{p@n}$.
   228 There are separate versions available for \railqtoken{term}s and
   229 \railqtoken{prop}s.  The latter provides a $\CONCLNAME$ part with patterns
   230 referring the (atomic) conclusion of a rule.
   231 
   232 \indexouternonterm{termpat}\indexouternonterm{proppat}
   233 \begin{rail}
   234   termpat: '(' ('is' term +) ')'
   235   ;
   236   proppat: '(' (('is' prop +) | 'concl' ('is' prop +) | ('is' prop +) 'concl' ('is' prop +)) ')'
   237   ;
   238 \end{rail}
   239 
   240 
   241 \subsection{Mixfix annotations}
   242 
   243 Mixfix annotations specify concrete \emph{inner} syntax of Isabelle types and
   244 terms (see also \cite{isabelle-ref}).  Some commands such as $\TYPES$ (see
   245 \S\ref{sec:types-pure}) admit infixes only, while $\CONSTS$ (see
   246 \S\ref{sec:consts}) and $\isarkeyword{syntax}$ (see \S\ref{sec:syn-trans})
   247 support the full range of general mixfixes and binders.
   248 
   249 \indexouternonterm{infix}\indexouternonterm{mixfix}
   250 \begin{rail}
   251   infix: '(' ('infixl' | 'infixr') string? nat ')'
   252   ;
   253   mixfix: infix | '(' string prios? nat? ')' | '(' 'binder' string prios? nat ')'
   254   ;
   255 
   256   prios: '[' (nat + ',') ']'
   257   ;
   258 \end{rail}
   259 
   260 
   261 \subsection{Attributes and theorems}\label{sec:syn-att}
   262 
   263 Attributes (and proof methods, see \S\ref{sec:syn-meth}) have their own
   264 ``semi-inner'' syntax, in the sense that input conforming to
   265 \railnonterm{args} below is parsed by the attribute a second time.  The
   266 attribute argument specifications may be any sequence of atomic entities
   267 (identifiers, strings etc.), or properly bracketed argument lists.  Below
   268 \railqtoken{atom} refers to any atomic entity, including any
   269 \railtoken{keyword} conforming to \railtoken{symident}.
   270 
   271 \indexoutertoken{atom}\indexouternonterm{args}\indexouternonterm{attributes}
   272 \begin{rail}
   273   atom: nameref | typefree | typevar | var | nat | keyword
   274   ;
   275   arg: atom | '(' args ')' | '[' args ']'
   276   ;
   277   args: arg *
   278   ;
   279   attributes: '[' (nameref args * ',') ']'
   280   ;
   281 \end{rail}
   282 
   283 Theorem specifications come in several flavors: \railnonterm{axmdecl} and
   284 \railnonterm{thmdecl} usually refer to axioms, assumptions or results of goal
   285 statements, while \railnonterm{thmdef} collects lists of existing theorems.
   286 Existing theorems are given by \railnonterm{thmref} and \railnonterm{thmrefs},
   287 the former requires an actual singleton result.  Any of these theorem
   288 specifications may include lists of attributes both on the left and right hand
   289 sides; attributes are applied to any immediately preceding theorem.  If names
   290 are omitted, the theorems are not stored within the theorem database of the
   291 theory or proof context; any given attributes are still applied, though.
   292 
   293 \indexouternonterm{thmdecl}\indexouternonterm{axmdecl}
   294 \indexouternonterm{thmdef}\indexouternonterm{thmrefs}
   295 \begin{rail}
   296   axmdecl: name attributes? ':'
   297   ;
   298   thmdecl: thmbind ':'
   299   ;
   300   thmdef: thmbind '='
   301   ;
   302   thmref: nameref attributes?
   303   ;
   304   thmrefs: thmref +
   305   ;
   306 
   307   thmbind: name attributes | name | attributes
   308   ;
   309 \end{rail}
   310 
   311 
   312 \subsection{Proof methods}\label{sec:syn-meth}
   313 
   314 Proof methods are either basic ones, or expressions composed of methods via
   315 ``\texttt{,}'' (sequential composition), ``\texttt{|}'' (alternative choices),
   316 ``\texttt{?}'' (try), ``\texttt{+}'' (repeat at least once).  In practice,
   317 proof methods are usually just a comma separated list of
   318 \railqtoken{nameref}~\railnonterm{args} specifications.  Note that parentheses
   319 may be dropped for single method specifications (with no arguments).
   320 
   321 \indexouternonterm{method}
   322 \begin{rail}
   323   method: (nameref | '(' methods ')') (() | '?' | '+')
   324   ;
   325   methods: (nameref args | method) + (',' | '|')
   326   ;
   327 \end{rail}
   328 
   329 Proper use of Isar proof methods does \emph{not} involve goal addressing.
   330 Nevertheless, specifying goal ranges may occasionally come in handy in
   331 emulating tactic scripts.  Note that $[n-]$ refers to all goals, starting from
   332 $n$.  All goals may be specified by $[!]$, which is the same as $[1-]$.
   333 
   334 \indexouternonterm{goalspec}
   335 \begin{rail}
   336   goalspec: '[' (nat '-' nat | nat '-' | nat | '!' ) ']'
   337   ;
   338 \end{rail}
   339 
   340 
   341 \subsection{Antiquotations}\label{sec:antiq}
   342 
   343 \begin{matharray}{rcl}
   344   thm & : & \isarantiq \\
   345   prop & : & \isarantiq \\
   346   term & : & \isarantiq \\
   347   typ & : & \isarantiq \\
   348   text & : & \isarantiq \\
   349   goals & : & \isarantiq \\
   350   subgoals & : & \isarantiq \\
   351 \end{matharray}
   352 
   353 The text body of formal comments (see also \S\ref{sec:comments}) may contain
   354 antiquotations of logical entities, such as theorems, terms and types, which
   355 are to be presented in the final output produced by the Isabelle document
   356 preparation system (see also \S\ref{sec:document-prep}).
   357 
   358 Thus embedding of
   359 \texttt{{\at}{\ttlbrace}term~[show_types]~"f(x)~=~a~+~x"{\ttrbrace}} within a
   360 text block would cause
   361 \isa{(f{\isasymColon}'a~{\isasymRightarrow}~'a)~(x{\isasymColon}'a)~=~(a{\isasymColon}'a)~+~x}
   362 to appear in the final {\LaTeX} document.  Also note that theorem
   363 antiquotations may involve attributes as well.  For example,
   364 \texttt{{\at}{\ttlbrace}thm~sym~[no_vars]{\ttrbrace}} would print the
   365 statement where all schematic variables have been replaced by fixed ones,
   366 which are better readable.
   367 
   368 \indexisarant{thm}\indexisarant{prop}\indexisarant{term}
   369 \indexisarant{typ}\indexisarant{text}\indexisarant{goals}\indexisarant{subgoals}
   370 \begin{rail}
   371   atsign lbrace antiquotation rbrace
   372   ;
   373 
   374   antiquotation:
   375     'thm' options thmrefs |
   376     'prop' options prop |
   377     'term' options term |
   378     'typ' options type |
   379     'text' options name |
   380     'goals' options |
   381     'subgoals' options
   382   ;
   383   options: '[' (option * ',') ']'
   384   ;
   385   option: name | name '=' name
   386   ;
   387 \end{rail}
   388 
   389 Note that the syntax of antiquotations may \emph{not} include source comments
   390 \texttt{(*~\dots~*)} or verbatim text \verb|{*|~\dots~\verb|*}|.
   391 
   392 \begin{descr}
   393 \item [$\at\{thm~\vec a\}$] prints theorems $\vec a$. Note that attribute
   394   specifications may be included as well (see also \S\ref{sec:syn-att}); the
   395   $no_vars$ operation (see \S\ref{sec:misc-methods}) would be particularly
   396   useful to suppress printing of schematic variables.
   397 \item [$\at\{prop~\phi\}$] prints a well-typed proposition $\phi$.
   398 \item [$\at\{term~t\}$] prints a well-typed term $t$.
   399 \item [$\at\{typ~\tau\}$] prints a well-formed type $\tau$.
   400 \item [$\at\{text~s\}$] prints uninterpreted source text $s$.  This is
   401   particularly useful to print portions of text according to the Isabelle
   402   {\LaTeX} output style, without demanding well-formedness (e.g.\ small pieces
   403   of terms that cannot be parsed or type-checked yet).
   404 \item [$\at\{goals\}$] prints the current \emph{dynamic} goal state.  This is
   405   only for support of tactic-emulation scripts within Isar --- presentation of
   406   goal states does not conform to actual human-readable proof documents.
   407   
   408   Please do not include goal states into document output unless you really
   409   know what you are doing!
   410 \item [$\at\{subgoals\}$] behaves almost like $goals$, except that it does not
   411   print the main goal.
   412 \end{descr}
   413 
   414 \medskip
   415 
   416 The following options are available to tune the output.  Note that most of
   417 these coincide with ML flags of the same names (see also \cite{isabelle-ref}).
   418 \begin{descr}
   419 \item[$show_types = bool$ and $show_sorts = bool$] control printing of
   420   explicit type and sort constraints.
   421 \item[$long_names = bool$] forces names of types and constants etc.\ to be
   422   printed in their fully qualified internal form.
   423 \item[$eta_contract = bool$] prints terms in $\eta$-contracted form.
   424 \item[$display = bool$] indicates if the text is to be output as multi-line
   425   ``display material'', rather than a small piece of text without line breaks
   426   (which is the default).
   427 \item[$quotes = bool$] indicates if the output should be enclosed in double
   428   quotes.
   429 \item[$mode = name$] adds $name$ to the print mode to be used for presentation
   430   (see also \cite{isabelle-ref}).  Note that the standard setup for {\LaTeX}
   431   output is already present by default, including the modes ``$latex$'',
   432   ``$xsymbols$'', ``$symbols$''.
   433 \item[$margin = nat$ and $indent = nat$] change the margin or indentation for
   434   pretty printing of display material.
   435 \item[$source = bool$] prints the source text of the antiquotation arguments,
   436   rather than the actual value.  Note that this does not affect
   437   well-formedness checks of $thm$, $term$, etc. (only the $text$ antiquotation
   438   admits arbitrary output).
   439 \item[$goals_limit = nat$] determines the maximum number of goals to be
   440   printed.
   441 \end{descr}
   442 
   443 For boolean flags, ``$name = true$'' may be abbreviated as ``$name$''.  All of
   444 the above flags are disabled by default, unless changed from ML.
   445 
   446 \medskip Note that antiquotations do not only spare the author from tedious
   447 typing, but also achieve some degree of consistency-checking of informal
   448 explanations with formal developments, since well-formedness of terms and
   449 types with respect to the current theory or proof context can be ensured.
   450 
   451 %%% Local Variables: 
   452 %%% mode: latex
   453 %%% TeX-master: "isar-ref"
   454 %%% End: