2 \chapter{Isar Syntax Primitives}
4 We give a complete reference of all basic syntactic entities underlying the
5 Isabelle/Isar document syntax. Actual theory and proof commands will be
10 In order to get started with writing well-formed Isabelle/Isar documents, the
11 most important aspect to be noted is the difference of \emph{inner} versus
12 \emph{outer} syntax. Inner syntax is that of Isabelle types and terms of the
13 logic, while outer syntax is that of Isabelle/Isar theories (including
14 proofs). As a general rule, inner syntax entities may occur only as
15 \emph{atomic entities} within outer syntax. For example, the string
16 \texttt{"x + y"} and identifier \texttt{z} are legal term specifications
17 within a theory, while \texttt{x + y} is not.
20 Note that classic Isabelle theories used to fake parts of the inner syntax
21 of types, with rather complicated rules when quotes may be omitted. Despite
22 the minor drawback of requiring quotes more often, the syntax of
23 Isabelle/Isar is much simpler and more robust in that respect.
28 Isabelle/Isar input may contain any number of input termination characters
29 ``\texttt{;}'' (semicolon) to separate commands explicitly. This may be
30 particularly useful in interactive shell sessions to make clear where the
31 current command is intended to end. Otherwise, the interactive loop will
32 continue until end-of-command is clearly indicated from the input syntax,
33 e.g.\ encounter of the next command keyword.
35 Advanced interfaces such as Proof~General \cite{proofgeneral} do not require
36 explicit semicolons, the amount of input text is determined automatically by
37 inspecting the Emacs text buffer. Also note that in the presentation of
38 Isabelle/Isar documents, semicolons are omitted in any case to gain
42 \section{Lexical matters}\label{sec:lex-syntax}
44 The Isabelle/Isar outer syntax provides token classes as presented below.
45 Note that some of these coincide (by full intention) with the inner lexical
46 syntax as presented in \cite{isabelle-ref}. These different levels of syntax
47 should not be confused, though.
49 %FIXME keyword, command
50 \indexoutertoken{ident}\indexoutertoken{longident}\indexoutertoken{symident}
51 \indexoutertoken{nat}\indexoutertoken{var}\indexoutertoken{typefree}
52 \indexoutertoken{typevar}\indexoutertoken{string}\indexoutertoken{verbatim}
53 \begin{matharray}{rcl}
54 ident & = & letter~quasiletter^* \\
55 longident & = & ident\verb,.,ident~\dots~ident \\
56 symident & = & sym^+ ~|~ symbol \\
58 var & = & \verb,?,ident ~|~ \verb,?,ident\verb,.,nat \\
59 typefree & = & \verb,',ident \\
60 typevar & = & \verb,?,typefree ~|~ \verb,?,typefree\verb,.,nat \\
61 string & = & \verb,", ~\dots~ \verb,", \\
62 verbatim & = & \verb,{*, ~\dots~ \verb,*}, \\
64 \begin{matharray}{rcl}
65 letter & = & \verb,a, ~|~ \dots ~|~ \verb,z, ~|~ \verb,A, ~|~ \dots ~|~ \verb,Z, \\
66 digit & = & \verb,0, ~|~ \dots ~|~ \verb,9, \\
67 quasiletter & = & letter ~|~ digit ~|~ \verb,_, ~|~ \verb,', \\
68 sym & = & \verb,!, ~|~ \verb,#, ~|~ \verb,$, ~|~ \verb,%, ~|~ \verb,&, ~|~ %$
69 \verb,*, ~|~ \verb,+, ~|~ \verb,-, ~|~ \verb,/, ~|~ \verb,:, ~|~ \\
70 & & \verb,<, ~|~ \verb,=, ~|~ \verb,>, ~|~ \verb,?, ~|~ \texttt{\at} ~|~
71 \verb,^, ~|~ \verb,_, ~|~ \verb,`, ~|~ \verb,|, ~|~ \verb,~, \\
72 symbol & = & {\forall} ~|~ {\exists} ~|~ \dots
75 The syntax of \texttt{string} admits any characters, including newlines;
76 ``\verb|"|'' (double-quote) and ``\verb|\|'' (backslash) have to be escaped by
77 a backslash; newlines need not be escaped. Note that ML-style control
78 characters are \emph{not} supported. The body of \texttt{verbatim} may
79 consist of any text not containing ``\verb|*}|''.
81 Comments take the form \texttt{(*~\dots~*)} and may be
82 nested\footnote{Proof~General may occasionally get confused by nested
83 comments.}, just as in ML. Note that these are \emph{source} comments only,
84 which are stripped after lexical analysis of the input. The Isar document
85 syntax also provides \emph{formal comments} that are actually part of the text
86 (see \S\ref{sec:comments}).
88 Mathematical symbols such as ``$\forall$'' are represented in plain ASCII as
92 \section{Common syntax entities}
94 Subsequently, we introduce several basic syntactic entities, such as names,
95 terms, and theorem specifications, which have been factored out of the actual
96 Isar language elements to be described later.
98 Note that some of the basic syntactic entities introduced below (e.g.\
99 \railqtoken{name}) act much like tokens rather than plain nonterminals (e.g.\
100 \railnonterm{sort}), especially for the sake of error messages. E.g.\ syntax
101 elements such as $\CONSTS$ referring to \railqtoken{name} or \railqtoken{type}
102 would really report a missing name or type rather than any of the constituent
103 primitive tokens such as \railtoken{ident} or \railtoken{string}.
108 Entity \railqtoken{name} usually refers to any name of types, constants,
109 theorems etc.\ that are to be \emph{declared} or \emph{defined} (so qualified
110 identifiers are excluded here). Quoted strings provide an escape for
111 non-identifier names or those ruled out by outer syntax keywords (e.g.\
112 \verb|"let"|). Already existing objects are usually referenced by
113 \railqtoken{nameref}.
115 \indexoutertoken{name}\indexoutertoken{parname}\indexoutertoken{nameref}
116 \indexoutertoken{int}
118 name: ident | symident | string | nat
120 parname: '(' name ')'
122 nameref: name | longident
129 \subsection{Comments}\label{sec:comments}
131 Large chunks of plain \railqtoken{text} are usually given
132 \railtoken{verbatim}, i.e.\ enclosed in \verb|{*|~\dots~\verb|*}|. For
133 convenience, any of the smaller text units conforming to \railqtoken{nameref}
134 are admitted as well. Almost any of the Isar commands may be annotated by
135 marginal \railnonterm{comment} of the form \texttt{--} \railqtoken{text}.
136 Note that the latter kind of comment is actually part of the language, while
137 source level comments \verb|(*|~\dots~\verb|*)| are stripped at the lexical
138 level. A few commands such as $\PROOFNAME$ admit additional markup with a
139 ``level of interest'': \texttt{\%} followed by an optional number $n$ (default
140 $n = 1$) indicates that the respective part of the document becomes $n$ levels
141 more obscure; \texttt{\%\%} means that interest drops by $\infty$ --- abandon
142 every hope, who enter here.
144 \indexoutertoken{text}\indexouternonterm{comment}\indexouternonterm{interest}
146 text: verbatim | nameref
148 comment: ('--' text +)
150 interest: percent nat? | ppercent
155 \subsection{Type classes, sorts and arities}
157 Classes are specified by plain names. Sorts have a very simple inner syntax,
158 which is either a single class name $c$ or a list $\{c@1, \dots, c@n\}$
159 referring to the intersection of these classes. The syntax of type arities is
160 given directly at the outer level.
162 \railalias{subseteq}{\isasymsubseteq}
165 \indexouternonterm{sort}\indexouternonterm{arity}\indexouternonterm{simplearity}
166 \indexouternonterm{classdecl}
168 classdecl: name (('<' | subseteq) (nameref + ','))?
172 arity: ('(' (sort + ',') ')')? sort
174 simplearity: ('(' (sort + ',') ')')? nameref
179 \subsection{Types and terms}\label{sec:types-terms}
181 The actual inner Isabelle syntax, that of types and terms of the logic, is far
182 too sophisticated in order to be modelled explicitly at the outer theory
183 level. Basically, any such entity has to be quoted to turn it into a single
184 token (the parsing and type-checking is performed internally later). For
185 convenience, a slightly more liberal convention is adopted: quotes may be
186 omitted for any type or term that is already \emph{atomic} at the outer level.
187 For example, one may write just \texttt{x} instead of \texttt{"x"}. Note that
188 symbolic identifiers (e.g.\ \texttt{++} or $\forall$) are available as well,
189 provided these are not superseded by commands or keywords (e.g.\ \texttt{+}).
191 \indexoutertoken{type}\indexoutertoken{term}\indexoutertoken{prop}
193 type: nameref | typefree | typevar
201 Positional instantiations are indicated by giving a sequence of terms, or the
202 placeholder ``$\_$'' (underscore), which means to skip a position.
204 \indexoutertoken{inst}\indexoutertoken{insts}
206 inst: underscore | term
212 Type declarations and definitions usually refer to \railnonterm{typespec} on
213 the left-hand side. This models basic type constructor application at the
214 outer syntax level. Note that only plain postfix notation is available here,
217 \indexouternonterm{typespec}
219 typespec: (() | typefree | '(' ( typefree + ',' ) ')') name
224 \subsection{Term patterns}\label{sec:term-pats}
226 Assumptions and goal statements usually admit casual binding of schematic term
227 variables by giving (optional) patterns of the form $\ISS{p@1\;\dots}{p@n}$.
228 There are separate versions available for \railqtoken{term}s and
229 \railqtoken{prop}s. The latter provides a $\CONCLNAME$ part with patterns
230 referring the (atomic) conclusion of a rule.
232 \indexouternonterm{termpat}\indexouternonterm{proppat}
234 termpat: '(' ('is' term +) ')'
236 proppat: '(' (('is' prop +) | 'concl' ('is' prop +) | ('is' prop +) 'concl' ('is' prop +)) ')'
241 \subsection{Mixfix annotations}
243 Mixfix annotations specify concrete \emph{inner} syntax of Isabelle types and
244 terms (see also \cite{isabelle-ref}). Some commands such as $\TYPES$ (see
245 \S\ref{sec:types-pure}) admit infixes only, while $\CONSTS$ (see
246 \S\ref{sec:consts}) and $\isarkeyword{syntax}$ (see \S\ref{sec:syn-trans})
247 support the full range of general mixfixes and binders.
249 \indexouternonterm{infix}\indexouternonterm{mixfix}
251 infix: '(' ('infixl' | 'infixr') string? nat ')'
253 mixfix: infix | '(' string prios? nat? ')' | '(' 'binder' string prios? nat ')'
256 prios: '[' (nat + ',') ']'
261 \subsection{Attributes and theorems}\label{sec:syn-att}
263 Attributes (and proof methods, see \S\ref{sec:syn-meth}) have their own
264 ``semi-inner'' syntax, in the sense that input conforming to
265 \railnonterm{args} below is parsed by the attribute a second time. The
266 attribute argument specifications may be any sequence of atomic entities
267 (identifiers, strings etc.), or properly bracketed argument lists. Below
268 \railqtoken{atom} refers to any atomic entity, including any
269 \railtoken{keyword} conforming to \railtoken{symident}.
271 \indexoutertoken{atom}\indexouternonterm{args}\indexouternonterm{attributes}
273 atom: nameref | typefree | typevar | var | nat | keyword
275 arg: atom | '(' args ')' | '[' args ']'
279 attributes: '[' (nameref args * ',') ']'
283 Theorem specifications come in several flavors: \railnonterm{axmdecl} and
284 \railnonterm{thmdecl} usually refer to axioms, assumptions or results of goal
285 statements, while \railnonterm{thmdef} collects lists of existing theorems.
286 Existing theorems are given by \railnonterm{thmref} and \railnonterm{thmrefs},
287 the former requires an actual singleton result. Any of these theorem
288 specifications may include lists of attributes both on the left and right hand
289 sides; attributes are applied to any immediately preceding theorem. If names
290 are omitted, the theorems are not stored within the theorem database of the
291 theory or proof context; any given attributes are still applied, though.
293 \indexouternonterm{thmdecl}\indexouternonterm{axmdecl}
294 \indexouternonterm{thmdef}\indexouternonterm{thmrefs}
296 axmdecl: name attributes? ':'
302 thmref: nameref attributes?
307 thmbind: name attributes | name | attributes
312 \subsection{Proof methods}\label{sec:syn-meth}
314 Proof methods are either basic ones, or expressions composed of methods via
315 ``\texttt{,}'' (sequential composition), ``\texttt{|}'' (alternative choices),
316 ``\texttt{?}'' (try), ``\texttt{+}'' (repeat at least once). In practice,
317 proof methods are usually just a comma separated list of
318 \railqtoken{nameref}~\railnonterm{args} specifications. Note that parentheses
319 may be dropped for single method specifications (with no arguments).
321 \indexouternonterm{method}
323 method: (nameref | '(' methods ')') (() | '?' | '+')
325 methods: (nameref args | method) + (',' | '|')
329 Proper use of Isar proof methods does \emph{not} involve goal addressing.
330 Nevertheless, specifying goal ranges may occasionally come in handy in
331 emulating tactic scripts. Note that $[n-]$ refers to all goals, starting from
332 $n$. All goals may be specified by $[!]$, which is the same as $[1-]$.
334 \indexouternonterm{goalspec}
336 goalspec: '[' (nat '-' nat | nat '-' | nat | '!' ) ']'
341 \subsection{Antiquotations}\label{sec:antiq}
343 \begin{matharray}{rcl}
344 thm & : & \isarantiq \\
345 prop & : & \isarantiq \\
346 term & : & \isarantiq \\
347 typ & : & \isarantiq \\
348 text & : & \isarantiq \\
349 goals & : & \isarantiq \\
350 subgoals & : & \isarantiq \\
353 The text body of formal comments (see also \S\ref{sec:comments}) may contain
354 antiquotations of logical entities, such as theorems, terms and types, which
355 are to be presented in the final output produced by the Isabelle document
356 preparation system (see also \S\ref{sec:document-prep}).
359 \texttt{{\at}{\ttlbrace}term~[show_types]~"f(x)~=~a~+~x"{\ttrbrace}} within a
360 text block would cause
361 \isa{(f{\isasymColon}'a~{\isasymRightarrow}~'a)~(x{\isasymColon}'a)~=~(a{\isasymColon}'a)~+~x}
362 to appear in the final {\LaTeX} document. Also note that theorem
363 antiquotations may involve attributes as well. For example,
364 \texttt{{\at}{\ttlbrace}thm~sym~[no_vars]{\ttrbrace}} would print the
365 statement where all schematic variables have been replaced by fixed ones,
366 which are better readable.
368 \indexisarant{thm}\indexisarant{prop}\indexisarant{term}
369 \indexisarant{typ}\indexisarant{text}\indexisarant{goals}\indexisarant{subgoals}
371 atsign lbrace antiquotation rbrace
375 'thm' options thmrefs |
376 'prop' options prop |
377 'term' options term |
379 'text' options name |
383 options: '[' (option * ',') ']'
385 option: name | name '=' name
389 Note that the syntax of antiquotations may \emph{not} include source comments
390 \texttt{(*~\dots~*)} or verbatim text \verb|{*|~\dots~\verb|*}|.
393 \item [$\at\{thm~\vec a\}$] prints theorems $\vec a$. Note that attribute
394 specifications may be included as well (see also \S\ref{sec:syn-att}); the
395 $no_vars$ operation (see \S\ref{sec:misc-methods}) would be particularly
396 useful to suppress printing of schematic variables.
397 \item [$\at\{prop~\phi\}$] prints a well-typed proposition $\phi$.
398 \item [$\at\{term~t\}$] prints a well-typed term $t$.
399 \item [$\at\{typ~\tau\}$] prints a well-formed type $\tau$.
400 \item [$\at\{text~s\}$] prints uninterpreted source text $s$. This is
401 particularly useful to print portions of text according to the Isabelle
402 {\LaTeX} output style, without demanding well-formedness (e.g.\ small pieces
403 of terms that cannot be parsed or type-checked yet).
404 \item [$\at\{goals\}$] prints the current \emph{dynamic} goal state. This is
405 only for support of tactic-emulation scripts within Isar --- presentation of
406 goal states does not conform to actual human-readable proof documents.
408 Please do not include goal states into document output unless you really
409 know what you are doing!
410 \item [$\at\{subgoals\}$] behaves almost like $goals$, except that it does not
416 The following options are available to tune the output. Note that most of
417 these coincide with ML flags of the same names (see also \cite{isabelle-ref}).
419 \item[$show_types = bool$ and $show_sorts = bool$] control printing of
420 explicit type and sort constraints.
421 \item[$long_names = bool$] forces names of types and constants etc.\ to be
422 printed in their fully qualified internal form.
423 \item[$eta_contract = bool$] prints terms in $\eta$-contracted form.
424 \item[$display = bool$] indicates if the text is to be output as multi-line
425 ``display material'', rather than a small piece of text without line breaks
426 (which is the default).
427 \item[$quotes = bool$] indicates if the output should be enclosed in double
429 \item[$mode = name$] adds $name$ to the print mode to be used for presentation
430 (see also \cite{isabelle-ref}). Note that the standard setup for {\LaTeX}
431 output is already present by default, including the modes ``$latex$'',
432 ``$xsymbols$'', ``$symbols$''.
433 \item[$margin = nat$ and $indent = nat$] change the margin or indentation for
434 pretty printing of display material.
435 \item[$source = bool$] prints the source text of the antiquotation arguments,
436 rather than the actual value. Note that this does not affect
437 well-formedness checks of $thm$, $term$, etc. (only the $text$ antiquotation
438 admits arbitrary output).
439 \item[$goals_limit = nat$] determines the maximum number of goals to be
443 For boolean flags, ``$name = true$'' may be abbreviated as ``$name$''. All of
444 the above flags are disabled by default, unless changed from ML.
446 \medskip Note that antiquotations do not only spare the author from tedious
447 typing, but also achieve some degree of consistency-checking of informal
448 explanations with formal developments, since well-formedness of terms and
449 types with respect to the current theory or proof context can be ensured.
453 %%% TeX-master: "isar-ref"