2 \chapter{Syntax primitives}
4 The rather generic framework of Isabelle/Isar syntax emerges from three main
5 syntactic categories: \emph{commands} of the top-level Isar engine (covering
6 theory and proof elements), \emph{methods} for general goal refinements
7 (analogous to traditional ``tactics''), and \emph{attributes} for operations
8 on facts (within a certain context). Here we give a reference of basic
9 syntactic entities underlying Isabelle/Isar syntax in a bottom-up manner.
10 Concrete theory and proof language elements will be introduced later on.
14 In order to get started with writing well-formed Isabelle/Isar documents, the
15 most important aspect to be noted is the difference of \emph{inner} versus
16 \emph{outer} syntax. Inner syntax is that of Isabelle types and terms of the
17 logic, while outer syntax is that of Isabelle/Isar theory sources (including
18 proofs). As a general rule, inner syntax entities may occur only as
19 \emph{atomic entities} within outer syntax. For example, the string
20 \texttt{"x + y"} and identifier \texttt{z} are legal term specifications
21 within a theory, while \texttt{x + y} is not.
24 Old-style Isabelle theories used to fake parts of the inner syntax of types,
25 with rather complicated rules when quotes may be omitted. Despite the minor
26 drawback of requiring quotes more often, the syntax of Isabelle/Isar is
27 somewhat simpler and more robust in that respect.
30 Printed theory documents usually omit quotes to gain readability (this is a
31 matter of {\LaTeX} macro setup, say via \verb,\isabellestyle,, see also
32 \cite{isabelle-sys}). Experienced users of Isabelle/Isar may easily
33 reconstruct the lost technical information, while mere readers need not care
38 Isabelle/Isar input may contain any number of input termination characters
39 ``\texttt{;}'' (semicolon) to separate commands explicitly. This is
40 particularly useful in interactive shell sessions to make clear where the
41 current command is intended to end. Otherwise, the interpreter loop will
42 continue to issue a secondary prompt ``\verb,#,'' until an end-of-command is
43 clearly recognized from the input syntax, e.g.\ encounter of the next command
46 Advanced interfaces such as Proof~General \cite{proofgeneral} do not require
47 explicit semicolons, the amount of input text is determined automatically by
48 inspecting the present content of the Emacs text buffer. In the printed
49 presentation of Isabelle/Isar documents semicolons are omitted altogether for
53 Proof~General requires certain syntax classification tables in order to
54 achieve properly synchronized interaction with the Isabelle/Isar process.
55 These tables need to be consistent with the Isabelle version and particular
56 logic image to be used in a running session (common object-logics may well
57 change the outer syntax). The standard setup should work correctly with any
58 of the ``official'' logic images derived from Isabelle/HOL (including HOLCF
59 etc.). Users of alternative logics may need to tell Proof~General
60 explicitly, e.g.\ by giving an option \verb,-k ZF, (in conjunction with
61 \verb,-l ZF, to specify the default logic image).
64 \section{Lexical matters}\label{sec:lex-syntax}
66 The Isabelle/Isar outer syntax provides token classes as presented below.
67 Note that some of these coincide (by full intention) with the inner lexical
68 syntax as presented in \cite{isabelle-ref}.
70 \indexoutertoken{ident}\indexoutertoken{longident}\indexoutertoken{symident}
71 \indexoutertoken{nat}\indexoutertoken{var}\indexoutertoken{typefree}
72 \indexoutertoken{typevar}\indexoutertoken{string}\indexoutertoken{verbatim}
73 \begin{matharray}{rcl}
74 ident & = & letter~quasiletter^* \\
75 longident & = & ident\verb,.,ident~\dots~ident \\
76 symident & = & sym^+ ~|~ symbol \\
78 var & = & \verb,?,ident ~|~ \verb,?,ident\verb,.,nat \\
79 typefree & = & \verb,',ident \\
80 typevar & = & \verb,?,typefree ~|~ \verb,?,typefree\verb,.,nat \\
81 string & = & \verb,", ~\dots~ \verb,", \\
82 verbatim & = & \verb,{*, ~\dots~ \verb,*}, \\
84 \begin{matharray}{rcl}
85 letter & = & \verb,a, ~|~ \dots ~|~ \verb,z, ~|~ \verb,A, ~|~ \dots ~|~ \verb,Z, \\
86 digit & = & \verb,0, ~|~ \dots ~|~ \verb,9, \\
87 quasiletter & = & letter ~|~ digit ~|~ \verb,_, ~|~ \verb,', \\
88 sym & = & \verb,!, ~|~ \verb,#, ~|~ \verb,$, ~|~ \verb,%, ~|~ \verb,&, ~|~ %$
89 \verb,*, ~|~ \verb,+, ~|~ \verb,-, ~|~ \verb,/, ~|~ \verb,:, ~|~ \\
90 & & \verb,<, ~|~ \verb,=, ~|~ \verb,>, ~|~ \verb,?, ~|~ \texttt{\at} ~|~
91 \verb,^, ~|~ \verb,_, ~|~ \verb,`, ~|~ \verb,|, ~|~ \verb,~, \\
92 symbol & = & {\forall} ~|~ {\exists} ~|~ {\land} ~|~ {\lor} ~|~ \dots
95 The syntax of $string$ admits any characters, including newlines; ``\verb|"|''
96 (double-quote) and ``\verb|\|'' (backslash) need to be escaped by a backslash.
97 Note that ML-style control characters are \emph{not} supported. The body of
98 $verbatim$ may consist of any text not containing ``\verb|*}|''; this allows
99 convenient inclusion of quotes without further escapes.
101 Comments take the form \texttt{(*~\dots~*)} and may in principle be nested,
102 just as in ML. Note that these are \emph{source} comments only, which are
103 stripped after lexical analysis of the input. The Isar document syntax also
104 provides \emph{formal comments} that are considered as part of the text (see
105 \S\ref{sec:comments}).
108 Proof~General does not handle nested comments properly; it is also unable to
109 keep \verb,(*,\,/\,\verb,{*, and \verb,*),\,/\,\verb,*}, apart, despite
110 their rather different meaning. These are inherent problems of Emacs
111 legacy. Users should not be overly aggressive about nesting or alternating
117 Mathematical symbols such as ``$\forall$'' are represented in plain ASCII as
118 ``\verb,\<forall>,''. Concerning Isabelle itself, any sequence of the form
119 \verb,\<,$ident$\verb,>, (or \verb,\\<,$ident$\verb,>,) is a legal symbol.
120 Display of appropriate glyphs is a matter of front-end tools, say the
121 user-interface of Proof~General plus the X-Symbol package, or the {\LaTeX}
122 macro setup of document output. A list of predefined Isabelle symbols is
123 given in \cite[appendix~A]{isabelle-sys}.
126 \section{Common syntax entities}
128 Subsequently, we introduce several basic syntactic entities, such as names,
129 terms, and theorem specifications, which have been factored out of the actual
130 Isar language elements to be described later.
132 Note that some of the basic syntactic entities introduced below (e.g.\
133 \railqtok{name}) act much like tokens rather than plain nonterminals (e.g.\
134 \railnonterm{sort}), especially for the sake of error messages. E.g.\ syntax
135 elements like $\CONSTS$ referring to \railqtok{name} or \railqtok{type} would
136 really report a missing name or type rather than any of the constituent
137 primitive tokens such as \railtok{ident} or \railtok{string}.
142 Entity \railqtok{name} usually refers to any name of types, constants,
143 theorems etc.\ that are to be \emph{declared} or \emph{defined} (so qualified
144 identifiers are excluded here). Quoted strings provide an escape for
145 non-identifier names or those ruled out by outer syntax keywords (e.g.\
146 \verb|"let"|). Already existing objects are usually referenced by
149 \indexoutertoken{name}\indexoutertoken{parname}\indexoutertoken{nameref}
150 \indexoutertoken{int}
152 name: ident | symident | string | nat
154 parname: '(' name ')'
156 nameref: name | longident
163 \subsection{Comments}\label{sec:comments}
165 Large chunks of plain \railqtok{text} are usually given \railtok{verbatim},
166 i.e.\ enclosed in \verb|{*|~\dots~\verb|*}|. For convenience, any of the
167 smaller text units conforming to \railqtok{nameref} are admitted as well. A
168 marginal \railnonterm{comment} is of the form \texttt{--} \railqtok{text}.
169 Any number of these may occur within Isabelle/Isar commands.
171 \indexoutertoken{text}\indexouternonterm{comment}
173 text: verbatim | nameref
180 \subsection{Type classes, sorts and arities}
182 Classes are specified by plain names. Sorts have a very simple inner syntax,
183 which is either a single class name $c$ or a list $\{c@1, \dots, c@n\}$
184 referring to the intersection of these classes. The syntax of type arities is
185 given directly at the outer level.
187 \railalias{subseteq}{\isasymsubseteq}
190 \indexouternonterm{sort}\indexouternonterm{arity}\indexouternonterm{simplearity}
191 \indexouternonterm{classdecl}
193 classdecl: name (('<' | subseteq) (nameref + ','))?
197 arity: ('(' (sort + ',') ')')? sort
199 simplearity: ('(' (sort + ',') ')')? nameref
204 \subsection{Types and terms}\label{sec:types-terms}
206 The actual inner Isabelle syntax, that of types and terms of the logic, is far
207 too sophisticated in order to be modelled explicitly at the outer theory
208 level. Basically, any such entity has to be quoted to turn it into a single
209 token (the parsing and type-checking is performed internally later). For
210 convenience, a slightly more liberal convention is adopted: quotes may be
211 omitted for any type or term that is already atomic at the outer level. For
212 example, one may just write \texttt{x} instead of \texttt{"x"}. Note that
213 symbolic identifiers (e.g.\ \texttt{++} or $\forall$) are available as well,
214 provided these have not been superseded by commands or other keywords already
215 (e.g.\ \texttt{=} or \texttt{+}).
217 \indexoutertoken{type}\indexoutertoken{term}\indexoutertoken{prop}
219 type: nameref | typefree | typevar
227 Positional instantiations are indicated by giving a sequence of terms, or the
228 placeholder ``$\_$'' (underscore), which means to skip a position.
230 \indexoutertoken{inst}\indexoutertoken{insts}
232 inst: underscore | term
238 Type declarations and definitions usually refer to \railnonterm{typespec} on
239 the left-hand side. This models basic type constructor application at the
240 outer syntax level. Note that only plain postfix notation is available here,
243 \indexouternonterm{typespec}
245 typespec: (() | typefree | '(' ( typefree + ',' ) ')') name
250 \subsection{Mixfix annotations}
252 Mixfix annotations specify concrete \emph{inner} syntax of Isabelle types and
253 terms. Some commands such as $\TYPES$ (see \S\ref{sec:types-pure}) admit
254 infixes only, while $\CONSTS$ (see \S\ref{sec:consts}) and
255 $\isarkeyword{syntax}$ (see \S\ref{sec:syn-trans}) support the full range of
256 general mixfixes and binders.
258 \indexouternonterm{infix}\indexouternonterm{mixfix}\indexouternonterm{structmixfix}
260 infix: '(' ('infix' | 'infixl' | 'infixr') string? nat ')'
262 mixfix: infix | '(' string prios? nat? ')' | '(' 'binder' string prios? nat ')'
264 structmixfix: mixfix | '(' 'structure' ')'
267 prios: '[' (nat + ',') ']'
271 Here the \railtok{string} specifications refer to the actual mixfix template
272 (see also \cite{isabelle-ref}), which may include literal text, spacing,
273 blocks, and arguments (denoted by ``$_$''); the special symbol \verb,\<index>,
274 (printed as ``\i'') represents an index argument that specifies an implicit
275 structure reference (see also \S\ref{sec:locale}). Infix and binder
276 declarations provide common abbreviations for particular mixfix declarations.
277 So in practice, mixfix templates mostly degenerate to literal text for
278 concrete syntax, such as ``\verb,++,'' for an infix symbol, or ``\verb,++,\i''
279 for an infix of an implicit structure.
283 \subsection{Proof methods}\label{sec:syn-meth}
285 Proof methods are either basic ones, or expressions composed of methods via
286 ``\texttt{,}'' (sequential composition), ``\texttt{|}'' (alternative choices),
287 ``\texttt{?}'' (try), ``\texttt{+}'' (repeat at least once). In practice,
288 proof methods are usually just a comma separated list of
289 \railqtok{nameref}~\railnonterm{args} specifications. Note that parentheses
290 may be dropped for single method specifications (with no arguments).
292 \indexouternonterm{method}
294 method: (nameref | '(' methods ')') (() | '?' | '+')
296 methods: (nameref args | method) + (',' | '|')
300 Proper use of Isar proof methods does \emph{not} involve goal addressing.
301 Nevertheless, specifying goal ranges may occasionally come in handy in
302 emulating tactic scripts. Note that $[n-]$ refers to all goals, starting from
303 $n$. All goals may be specified by $[!]$, which is the same as $[1-]$.
305 \indexouternonterm{goalspec}
307 goalspec: '[' (nat '-' nat | nat '-' | nat | '!' ) ']'
312 \subsection{Attributes and theorems}\label{sec:syn-att}
314 Attributes (and proof methods, see \S\ref{sec:syn-meth}) have their own
315 ``semi-inner'' syntax, in the sense that input conforming to
316 \railnonterm{args} below is parsed by the attribute a second time. The
317 attribute argument specifications may be any sequence of atomic entities
318 (identifiers, strings etc.), or properly bracketed argument lists. Below
319 \railqtok{atom} refers to any atomic entity, including any \railtok{keyword}
320 conforming to \railtok{symident}.
322 \indexoutertoken{atom}\indexouternonterm{args}\indexouternonterm{attributes}
324 atom: nameref | typefree | typevar | var | nat | keyword
326 arg: atom | '(' args ')' | '[' args ']'
330 attributes: '[' (nameref args * ',') ']'
334 Theorem specifications come in several flavors: \railnonterm{axmdecl} and
335 \railnonterm{thmdecl} usually refer to axioms, assumptions or results of goal
336 statements, while \railnonterm{thmdef} collects lists of existing theorems.
337 Existing theorems are given by \railnonterm{thmref} and \railnonterm{thmrefs},
338 the former requires an actual singleton result. Any of these theorem
339 specifications may include lists of attributes both on the left and right hand
340 sides; attributes are applied to any immediately preceding fact. If names are
341 omitted, the theorems are not stored within the theorem database of the theory
342 or proof context; any given attributes are still applied, though.
344 \indexouternonterm{thmdecl}\indexouternonterm{axmdecl}
345 \indexouternonterm{thmdef}\indexouternonterm{thmrefs}
347 axmdecl: name attributes? ':'
353 thmref: nameref attributes?
358 thmbind: name attributes | name | attributes
363 \subsection{Term patterns and declarations}\label{sec:term-decls}
365 Wherever explicit propositions (or term fragments) occur in a proof text,
366 casual binding of schematic term variables may be given specified via patterns
367 of the form ``$\ISS{p@1\;\dots}{p@n}$''. There are separate versions
368 available for \railqtok{term}s and \railqtok{prop}s. The latter provides a
369 $\CONCLNAME$ part with patterns referring the (atomic) conclusion of a rule.
371 \indexouternonterm{termpat}\indexouternonterm{proppat}
373 termpat: '(' ('is' term +) ')'
375 proppat: '(' (('is' prop +) | 'concl' ('is' prop +) | ('is' prop +) 'concl' ('is' prop +)) ')'
379 Declarations of local variables $x :: \tau$ and logical propositions $a :
380 \phi$ represent different views on the same principle of introducing a local
381 scope. In practice, one may usually omit the typing of $vars$ (due to
382 type-inference), and the naming of propositions (due to implicit references of
383 current facts). In any case, Isar proof elements usually admit to introduce
384 multiple such items simultaneously.
386 \indexouternonterm{vars}\indexouternonterm{props}
388 vars: (name+) ('::' type)?
390 props: thmdecl? (prop proppat? +)
394 The treatment of multiple declarations corresponds to the complementary focus
395 of $vars$ versus $props$: in ``$x@1~\dots~x@n :: \tau$'' the typing refers to
396 all variables, while in $a\colon \phi@1~\dots~\phi@n$ the naming refers to all
397 propositions collectively. Isar language elements that refer to $vars$ or
398 $props$ typically admit separate typings or namings via another level of
399 iteration, with explicit $\AND$ separators; e.g.\ see $\FIXNAME$ and
400 $\ASSUMENAME$ in \S\ref{sec:proof-context}.
403 \subsection{Antiquotations}\label{sec:antiq}
405 \begin{matharray}{rcl}
406 thm & : & \isarantiq \\
407 prop & : & \isarantiq \\
408 term & : & \isarantiq \\
409 typ & : & \isarantiq \\
410 text & : & \isarantiq \\
411 goals & : & \isarantiq \\
412 subgoals & : & \isarantiq \\
415 The text body of formal comments (see also \S\ref{sec:comments}) may contain
416 antiquotations of logical entities, such as theorems, terms and types, which
417 are to be presented in the final output produced by the Isabelle document
418 preparation system (see also \S\ref{sec:document-prep}).
421 ``\texttt{{\at}{\ttlbrace}term~[show_types]~"f(x)~=~a~+~x"{\ttrbrace}}''
422 within a text block would cause
423 \isa{(f{\isasymColon}'a~{\isasymRightarrow}~'a)~(x{\isasymColon}'a)~=~(a{\isasymColon}'a)~+~x}
424 to appear in the final {\LaTeX} document. Also note that theorem
425 antiquotations may involve attributes as well. For example,
426 \texttt{{\at}{\ttlbrace}thm~sym~[no_vars]{\ttrbrace}} would print the
427 statement where all schematic variables have been replaced by fixed ones,
428 which are easier to read.
430 \indexisarant{thm}\indexisarant{prop}\indexisarant{term}
431 \indexisarant{typ}\indexisarant{text}\indexisarant{goals}\indexisarant{subgoals}
433 atsign lbrace antiquotation rbrace
437 'thm' options thmrefs |
438 'prop' options prop |
439 'term' options term |
441 'text' options name |
445 options: '[' (option * ',') ']'
447 option: name | name '=' name
451 Note that the syntax of antiquotations may \emph{not} include source comments
452 \texttt{(*~\dots~*)} or verbatim text \verb|{*|~\dots~\verb|*}|.
456 \item [$\at\{thm~\vec a\}$] prints theorems $\vec a$. Note that attribute
457 specifications may be included as well (see also \S\ref{sec:syn-att}); the
458 $no_vars$ operation (see \S\ref{sec:misc-meth-att}) would be particularly
459 useful to suppress printing of schematic variables.
461 \item [$\at\{prop~\phi\}$] prints a well-typed proposition $\phi$.
463 \item [$\at\{term~t\}$] prints a well-typed term $t$.
465 \item [$\at\{typ~\tau\}$] prints a well-formed type $\tau$.
467 \item [$\at\{text~s\}$] prints uninterpreted source text $s$. This is
468 particularly useful to print portions of text according to the Isabelle
469 {\LaTeX} output style, without demanding well-formedness (e.g.\ small pieces
470 of terms that should not be parsed or type-checked yet).
472 \item [$\at\{goals\}$] prints the current \emph{dynamic} goal state. This is
473 mainly for support of tactic-emulation scripts within Isar --- presentation
474 of goal states does not conform to actual human-readable proof documents.
475 Please do not include goal states into document output unless you really
476 know what you are doing!
478 \item [$\at\{subgoals\}$] behaves almost like $goals$, except that it does not
485 The following options are available to tune the output. Note that most of
486 these coincide with ML flags of the same names (see also \cite{isabelle-ref}).
488 \item[$show_types = bool$ and $show_sorts = bool$] control printing of
489 explicit type and sort constraints.
490 \item[$long_names = bool$] forces names of types and constants etc.\ to be
491 printed in their fully qualified internal form.
492 \item[$eta_contract = bool$] prints terms in $\eta$-contracted form.
493 \item[$display = bool$] indicates if the text is to be output as multi-line
494 ``display material'', rather than a small piece of text without line breaks
495 (which is the default).
496 \item[$quotes = bool$] indicates if the output should be enclosed in double
498 \item[$mode = name$] adds $name$ to the print mode to be used for presentation
499 (see also \cite{isabelle-ref}). Note that the standard setup for {\LaTeX}
500 output is already present by default, including the modes ``$latex$'',
501 ``$xsymbols$'', ``$symbols$''.
502 \item[$margin = nat$ and $indent = nat$] change the margin or indentation for
503 pretty printing of display material.
504 \item[$source = bool$] prints the source text of the antiquotation arguments,
505 rather than the actual value. Note that this does not affect
506 well-formedness checks of $thm$, $term$, etc. (only the $text$ antiquotation
507 admits arbitrary output).
508 \item[$goals_limit = nat$] determines the maximum number of goals to be
512 For boolean flags, ``$name = true$'' may be abbreviated as ``$name$''. All of
513 the above flags are disabled by default, unless changed from ML.
515 \medskip Note that antiquotations do not only spare the author from tedious
516 typing of logical entities, but also achieve some degree of
517 consistency-checking of informal explanations with formal developments:
518 well-formedness of terms and types with respect to the current theory or proof
519 context is ensured here.
523 %%% TeX-master: "isar-ref"