wneuper/isa: doc-src/IsarRef/syntax.tex@473807a5a436

     2 \chapter{Isar Syntax Primitives}

     4 We give a complete reference of all basic syntactic entities underlying the

     5 Isabelle/Isar document syntax.  Actual theory and proof commands will be

     6 introduced later on.

     8 \medskip

    10 In order to get started with writing well-formed Isabelle/Isar documents, the

    11 most important aspect to be noted is the difference of \emph{inner} versus

    12 \emph{outer} syntax.  Inner syntax is that of Isabelle types and terms of the

    13 logic, while outer syntax is that of Isabelle/Isar theories (including

    14 proofs).  As a general rule, inner syntax entities may occur only as

    15 \emph{atomic entities} within outer syntax.  For example, the string

    16 \texttt{"x + y"} and identifier \texttt{z} are legal term specifications

    17 within a theory, while \texttt{x + y} is not.

    19 \begin{warn}

    20   Note that classic Isabelle theories used to fake parts of the inner syntax

    21   of types, with rather complicated rules when quotes may be omitted.  Despite

    22   the minor drawback of requiring quotes more often, the syntax of

    23   Isabelle/Isar is much simpler and more robust in that respect.

    24 \end{warn}

    26 \medskip

    28 Isabelle/Isar input may contain any number of input termination characters

    29 ``\texttt{;}'' (semicolon) to separate commands explicitly.  This may be

    30 particularly useful in interactive shell sessions to make clear where the

    31 current command is intended to end.  Otherwise, the interactive loop will

    32 continue until end-of-command in clearly indicated from the input syntax,

    33 e.g.\ encounter of the next command keyword.

    35 Advanced interfaces such as Proof~General \cite{proofgeneral} do not require

    36 explicit semicolons, the amount of input text is determined automatically by

    37 inspecting the Emacs text buffer.  Also note that in the presentation of

    38 Isabelle/Isar documents, semicolons are omitted in any case to gain

    39 readability.

    42 \section{Lexical matters}\label{sec:lex-syntax}

    44 The Isabelle/Isar outer syntax provides token classes as presented below.

    45 Note that some of these coincide (by full intention) with the inner lexical

    46 syntax as presented in \cite{isabelle-ref}.  These different levels of syntax

    47 should not be confused, though.

    49 %FIXME keyword, command

    50 \indexoutertoken{ident}\indexoutertoken{longident}\indexoutertoken{symident}

    51 \indexoutertoken{nat}\indexoutertoken{var}\indexoutertoken{typefree}

    52 \indexoutertoken{typevar}\indexoutertoken{string}\indexoutertoken{verbatim}

    53 \begin{matharray}{rcl}

    54   ident & = & letter~quasiletter^* \\

    55   longident & = & ident\verb,.,ident~\dots~ident \\

    56   symident & = & sym^+ ~|~ symbol \\

    57   nat & = & digit^+ \\

    58   var & = & \verb,?,ident ~|~ \verb,?,ident\verb,.,nat \\

    59   typefree & = & \verb,',ident \\

    60   typevar & = & \verb,?,typefree ~|~ \verb,?,typefree\verb,.,nat \\

    61   string & = & \verb,", ~\dots~ \verb,", \\

    62   verbatim & = & \verb,{*, ~\dots~ \verb,*}, \\

    63 \end{matharray}

    64 \begin{matharray}{rcl}

    65   letter & = & \verb,a, ~|~ \dots ~|~ \verb,z, ~|~ \verb,A, ~|~ \dots ~|~ \verb,Z, \\

    66   digit & = & \verb,0, ~|~ \dots ~|~ \verb,9, \\

    67   quasiletter & = & letter ~|~ digit ~|~ \verb,_, ~|~ \verb,', \\

    68   sym & = & \verb,!, ~|~ \verb,#, ~|~ \verb,$, ~|~ \verb,%, ~|~ \verb,&, ~|~  %$

    69    \verb,*, ~|~ \verb,+, ~|~ \verb,-, ~|~ \verb,/, ~|~ \verb,:, ~|~

    70    \verb,<, ~|~ \verb,=, ~|~ \verb,>, ~|~ \verb,?, ~|~ \texttt{\at} ~|~ \\

    71   & & \verb,^, ~|~ \verb,_, ~|~ \verb,`, ~|~ \verb,|, ~|~ \verb,~, \\

    72   symbol & = & {\forall} ~|~ {\exists} ~|~ \dots

    73 \end{matharray}

    75 The syntax of \texttt{string} admits any characters, including newlines;

    76 ``\verb|"|'' (double-quote) and ``\verb|\|'' (backslash) have to be escaped by

    77 a backslash; newlines need not be escaped.  Note that ML-style control

    78 characters are \emph{not} supported.  The body of \texttt{verbatim} may

    79 consist of any text not containing ``\verb|*}|''.

    81 Comments take the form \texttt{(*~\dots~*)} and may be

    82 nested\footnote{Proof~General may occasionally get confused by nested

    83   comments.}, just as in ML. Note that these are \emph{source} comments only,

    84 which are stripped after lexical analysis of the input.  The Isar document

    85 syntax also provides \emph{formal comments} that are actually part of the text

    86 (see \S\ref{sec:comments}).

    89 \section{Common syntax entities}

    91 Subsequently, we introduce several basic syntactic entities, such as names,

    92 terms, and theorem specifications, which have been factored out of the actual

    93 Isar language elements to be described later.

    95 Note that some of the basic syntactic entities introduced below (e.g.\

    96 \railqtoken{name}) act much like tokens rather than plain nonterminals (e.g.\

    97 \railnonterm{sort}), especially for the sake of error messages.  E.g.\ syntax

    98 elements such as $\CONSTS$ referring to \railqtoken{name} or \railqtoken{type}

    99 would really report a missing name or type rather than any of the constituent

   100 primitive tokens such as \railtoken{ident} or \railtoken{string}.

   103 \subsection{Names}

   105 Entity \railqtoken{name} usually refers to any name of types, constants,

   106 theorems etc.\ that are to be \emph{declared} or \emph{defined} (so qualified

   107 identifiers are excluded here).  Quoted strings provide an escape for

   108 non-identifier names or those ruled out by outer syntax keywords (e.g.\

   109 \verb|"let"|).  Already existing objects are usually referenced by

   110 \railqtoken{nameref}.

   112 \indexoutertoken{name}\indexoutertoken{parname}\indexoutertoken{nameref}

   113 \indexoutertoken{int}

   114 \begin{rail}

   115   name: ident | symident | string | nat

   116   ;

   117   parname: '(' name ')'

   118   ;

   119   nameref: name | longident

   120   ;

   121   int: nat | '-' nat

   122   ;

   123 \end{rail}

   126 \subsection{Comments}\label{sec:comments}

   128 Large chunks of plain \railqtoken{text} are usually given

   129 \railtoken{verbatim}, i.e.\ enclosed in \verb|{*|~\dots~\verb|*}|.  For

   130 convenience, any of the smaller text units conforming to \railqtoken{nameref}

   131 are admitted as well.  Almost any of the Isar commands may be annotated by

   132 marginal \railnonterm{comment} of the form \texttt{--} \railqtoken{text}.

   133 Note that the latter kind of comment is actually part of the language, while

   134 source level comments \verb|(*|~\dots~\verb|*)| are stripped at the lexical

   135 level.  A few commands such as $\PROOFNAME$ admit additional markup with a

   136 ``level of interest'': \texttt{\%} followed by an optional number $n$ (default

   137 $n = 1$) indicates that the respective part of the document becomes $n$ levels

   138 more obscure; \texttt{\%\%} means that interest drops by $\infty$ --- abandon

   139 every hope, who enter here.

   141 \indexoutertoken{text}\indexouternonterm{comment}\indexouternonterm{interest}

   142 \begin{rail}

   143   text: verbatim | nameref

   144   ;

   145   comment: ('--' text +)

   146   ;

   147   interest: percent nat? | ppercent

   148   ;

   149 \end{rail}

   152 \subsection{Type classes, sorts and arities}

   154 Classes are specified by plain names.  Sorts have a very simple inner syntax,

   155 which is either a single class name $c$ or a list $\{c@1, \dots, c@n\}$

   156 referring to the intersection of these classes.  The syntax of type arities is

   157 given directly at the outer level.

   159 \indexouternonterm{sort}\indexouternonterm{arity}\indexouternonterm{simplearity}

   160 \indexouternonterm{classdecl}

   161 \begin{rail}

   162   classdecl: name ('<' (nameref + ','))?

   163   ;

   164   sort: nameref

   165   ;

   166   arity: ('(' (sort + ',') ')')? sort

   167   ;

   168   simplearity: ('(' (sort + ',') ')')? nameref

   169   ;

   170 \end{rail}

   173 \subsection{Types and terms}\label{sec:types-terms}

   175 The actual inner Isabelle syntax, that of types and terms of the logic, is far

   176 too sophisticated in order to be modelled explicitly at the outer theory

   177 level.  Basically, any such entity has to be quoted to turn it into a single

   178 token (the parsing and type-checking is performed internally later).  For

   179 convenience, a slightly more liberal convention is adopted: quotes may be

   180 omitted for any type or term that is already \emph{atomic} at the outer level.

   181 For example, one may write just \texttt{x} instead of \texttt{"x"}.  Note that

   182 symbolic identifiers (e.g.\ \texttt{++} or $\forall$) are available as well,

   183 provided these are not superseded by commands or keywords (e.g.\ \texttt{+}).

   185 \indexoutertoken{type}\indexoutertoken{term}\indexoutertoken{prop}

   186 \begin{rail}

   187   type: nameref | typefree | typevar

   188   ;

   189   term: nameref | var

   190   ;

   191   prop: term

   192   ;

   193 \end{rail}

   195 Positional instantiations are indicated by giving a sequence of terms, or the

   196 placeholder ``$\_$'' (underscore), which means to skip a position.

   198 \indexoutertoken{inst}\indexoutertoken{insts}

   199 \begin{rail}

   200   inst: underscore | term

   201   ;

   202   insts: (inst *)

   203   ;

   204 \end{rail}

   206 Type declarations and definitions usually refer to \railnonterm{typespec} on

   207 the left-hand side.  This models basic type constructor application at the

   208 outer syntax level.  Note that only plain postfix notation is available here,

   209 but no infixes.

   211 \indexouternonterm{typespec}

   212 \begin{rail}

   213   typespec: (() | typefree | '(' ( typefree + ',' ) ')') name

   214   ;

   215 \end{rail}

   218 \subsection{Term patterns}\label{sec:term-pats}

   220 Assumptions and goal statements usually admit casual binding of schematic term

   221 variables by giving (optional) patterns of the form $\ISS{p@1\;\dots}{p@n}$.

   222 There are separate versions available for \railqtoken{term}s and

   223 \railqtoken{prop}s.  The latter provides a $\CONCLNAME$ part with patterns

   224 referring the (atomic) conclusion of a rule.

   226 \indexouternonterm{termpat}\indexouternonterm{proppat}

   227 \begin{rail}

   228   termpat: '(' ('is' term +) ')'

   229   ;

   230   proppat: '(' (('is' prop +) | 'concl' ('is' prop +) | ('is' prop +) 'concl' ('is' prop +)) ')'

   231   ;

   232 \end{rail}

   235 \subsection{Mixfix annotations}

   237 Mixfix annotations specify concrete \emph{inner} syntax of Isabelle types and

   238 terms (see also \cite{isabelle-ref}).  Some commands such as $\TYPES$ (see

   239 \S\ref{sec:types-pure}) admit infixes only, while $\CONSTS$ (see

   240 \S\ref{sec:consts}) and $\isarkeyword{syntax}$ (see \S\ref{sec:syn-trans})

   241 support the full range of general mixfixes and binders.

   243 \indexouternonterm{infix}\indexouternonterm{mixfix}

   244 \begin{rail}

   245   infix: '(' ('infixl' | 'infixr') string? nat ')'

   246   ;

   247   mixfix: infix | '(' string prios? nat? ')' | '(' 'binder' string prios? nat ')'

   248   ;

   250   prios: '[' (nat + ',') ']'

   251   ;

   252 \end{rail}

   255 \subsection{Attributes and theorems}\label{sec:syn-att}

   257 Attributes (and proof methods, see \S\ref{sec:syn-meth}) have their own

   258 ``semi-inner'' syntax, in the sense that input conforming to

   259 \railnonterm{args} below is parsed by the attribute a second time.  The

   260 attribute argument specifications may be any sequence of atomic entities

   261 (identifiers, strings etc.), or properly bracketed argument lists.  Below

   262 \railqtoken{atom} refers to any atomic entity, including any

   263 \railtoken{keyword} conforming to \railtoken{symident}.

   265 \indexoutertoken{atom}\indexouternonterm{args}\indexouternonterm{attributes}

   266 \begin{rail}

   267   atom: nameref | typefree | typevar | var | nat | keyword

   268   ;

   269   arg: atom | '(' args ')' | '[' args ']'

   270   ;

   271   args: arg *

   272   ;

   273   attributes: '[' (nameref args * ',') ']'

   274   ;

   275 \end{rail}

   277 Theorem specifications come in several flavors: \railnonterm{axmdecl} and

   278 \railnonterm{thmdecl} usually refer to axioms, assumptions or results of goal

   279 statements, while \railnonterm{thmdef} collects lists of existing theorems.

   280 Existing theorems are given by \railnonterm{thmref} and \railnonterm{thmrefs},

   281 the former requires an actual singleton result.  Any of these theorem

   282 specifications may include lists of attributes both on the left and right hand

   283 sides; attributes are applied to any immediately preceding theorem.  If names

   284 are omitted, the theorems are not stored within the theorem database of the

   285 theory or proof context; any given attributes are still applied, though.

   287 \indexouternonterm{thmdecl}\indexouternonterm{axmdecl}

   288 \indexouternonterm{thmdef}\indexouternonterm{thmrefs}

   289 \begin{rail}

   290   axmdecl: name attributes? ':'

   291   ;

   292   thmdecl: thmbind ':'

   293   ;

   294   thmdef: thmbind '='

   295   ;

   296   thmref: nameref attributes?

   297   ;

   298   thmrefs: thmref +

   299   ;

   301   thmbind: name attributes | name | attributes

   302   ;

   303 \end{rail}

   306 \subsection{Proof methods}\label{sec:syn-meth}

   308 Proof methods are either basic ones, or expressions composed of methods via

   309 ``\texttt{,}'' (sequential composition), ``\texttt{|}'' (alternative choices),

   310 ``\texttt{?}'' (try), ``\texttt{+}'' (repeat at least once).  In practice,

   311 proof methods are usually just a comma separated list of

   312 \railqtoken{nameref}~\railnonterm{args} specifications.  Note that parentheses

   313 may be dropped for single method specifications (with no arguments).

   315 \indexouternonterm{method}

   316 \begin{rail}

   317   method: (nameref | '(' methods ')') (() | '?' | '+')

   318   ;

   319   methods: (nameref args | method) + (',' | '|')

   320   ;

   321 \end{rail}

   323 Proper use of Isar proof methods does \emph{not} involve goal addressing.

   324 Nevertheless, specifying goal ranges may occasionally come in handy in

   325 emulating tactic scripts.  Note that $[n-]$ refers to all goals, starting from

   326 $n$.  All goals may be specified by $[!]$, which is the same as $[1-]$.

   328 \indexouternonterm{goalspec}

   329 \begin{rail}

   330   goalspec: '[' (nat '-' nat | nat '-' | nat | '!' ) ']'

   331   ;

   332 \end{rail}

   335 \subsection{Antiquotations}\label{sec:antiq}

   337 The text body of formal comments (see also \S\ref{sec:comments}) may contain

   338 antiquotations of logical entities, such as theorems, terms and types, which

   339 are to be presented in the final output produced by the Isabelle document

   340 preparation system (see also \S\ref{sec:document-prep}).

   342 Thus embedding of

   343 \texttt{{\at}{\ttlbrace}term[show_types]~"f(x)~=~a~+~x"{\ttrbrace}} within a

   344 text block would cause

   345 \isa{(f{\isasymColon}'a~{\isasymRightarrow}~'a)~(x{\isasymColon}'a)~=~(a{\isasymColon}'a)~+~x}

   346 to appear in the final {\LaTeX} document.

   348 Antiquotations do not only spare the author from tedious typing, but also

   349 achieve some degree of consistency-checking of informal explanations with

   350 formal developments, since well-formedness of terms and types with respect to

   351 the current theory or proof context can be ensured.  The $text$ antiquotation

   352 is an exception to this rule: it prints an uninterpreted text argument that is

   353 not checked in any way.

   355 \indexisarant{thm}\indexisarant{prop}\indexisarant{term}

   356 \indexisarant{typ}\indexisarant{name}

   357 \begin{rail}

   358   atsign lbrace antiquotation rbrace

   359   ;

   361   antiquotation:

   362     'thm' options thmrefs |

   363     'prop' options prop |

   364     'term' options term |

   365     'typ' options type |

   366     'text' options name

   367   ;

   368   options: '[' (option * ',') ']'

   369   ;

   370   option: name | name '=' name

   371   ;

   372 \end{rail}

   374 Note that the syntax of antiquotations may \emph{not} include source comments

   375 \texttt{(*~\dots~*)} or verbatim text \verb|{*|~\dots~\verb|*}|.

   377 \medskip

   379 The following options are available to tune the output.  Note that some of

   380 these coincide with ML flags of the same names (see also \cite{isabelle-ref}).

   381 \begin{descr}

   382 \item[$show_types = bool$ and $show_sorts = bool$] control printing of

   383   explicit type and sort constraints.

   384 \item[$long_names = bool$] forces names of types and constants etc.\ to be

   385   printed in their fully qualified internal form.

   386 \item[$eta_contract = bool$] prints terms in $\eta$-contracted form.

   387 \item[$display = bool$] indicates if the text is to be output as multi-line

   388   ``display material'', rather than a small piece of text without line breaks

   389   (which is the default).

   390 \item[$quotes = bool$] indicates if the output should be enclosed in double

   391   quotes.

   392 \item[$mode = name$] adds $name$ to the print mode to be used for presentation

   393   (see also \cite{isabelle-ref}).  Note that the standard setup for {\LaTeX}

   394   output is already present by default, including the modes ``$latex$'',

   395   ``$xsymbols$'', ``$symbols$''.

   396 \item[$margin = nat$ and $indent = nat$] change the margin or indentation for

   397   pretty printing of display material.

   398 \item[$source = bool$] prints the source text of the antiquotation arguments,

   399   rather than the actual value.  Note that this does not affect

   400   well-formedness checks of $thm$, $term$, etc. (only the $text$ antiquotation

   401   admits arbitrary output).

   402 \end{descr}

   404 For boolean flags, ``$name = true$'' may be abbreviated as ``$name$''.  All of

   405 the above flags are disabled by default, unless changed from ML.

   408 %%% Local Variables:

   409 %%% mode: latex

   410 %%% TeX-master: "isar-ref"

   411 %%% End:

author	wenzelm
	Wed, 04 Oct 2000 21:05:10 +0200
changeset 10152	473807a5a436
parent 9752	a09f4a7accea
child 10160	bb8f9412fec6
permissions	-rw-r--r--