\chapter{Syntax primitives}

The rather generic framework of Isabelle/Isar syntax emerges from three main
syntactic categories: \emph{commands} of the top-level Isar engine (covering
theory and proof elements), \emph{methods} for general goal refinements
(analogous to traditional ``tactics''), and \emph{attributes} for operations
on facts (within a certain context).  This chapter gives a reference for the
basic syntactic entities underlying Isabelle/Isar syntax, in a bottom-up
manner.  Concrete theory and proof language elements will be introduced later
on.

In order to get started with writing well-formed Isabelle/Isar documents, the
most important aspect to be noted is the difference between \emph{inner} and
\emph{outer} syntax.  Inner syntax is that of Isabelle types and terms of the
logic, while outer syntax is that of Isabelle/Isar theory sources (including
proofs).  As a general rule, inner syntax entities may occur only as
\emph{atomic entities} within outer syntax.  For example, the string
\texttt{"x + y"} and identifier \texttt{z} are legal term specifications
within a theory, while \texttt{x + y} is not.
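
As a minimal sketch of this rule, consider the following lines.  They assume
the diagnostic command \texttt{term}, which is not introduced in this chapter,
merely as a convenient place where a term specification is expected.

\begin{verbatim}
term "x + y"   (* compound inner syntax, quoted: accepted *)
term z         (* atomic at the outer level: quotes may be omitted *)
term x + y     (* unquoted compound term: rejected by the outer syntax *)
\end{verbatim}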

Old-style Isabelle theories used to fake parts of the inner syntax of types,
with rather complicated rules for when quotes may be omitted.  Despite the
minor drawback of requiring quotes more often, the syntax of Isabelle/Isar is
somewhat simpler and more robust in that respect.

Printed theory documents usually omit quotes to gain readability (this is a
matter of {\LaTeX} macro setup, say via \verb,\isabellestyle,, see also
\cite{isabelle-sys}).  Experienced users of Isabelle/Isar may easily
reconstruct the lost technical information, while mere readers need not care
about quotes at all.

Isabelle/Isar input may contain any number of input termination characters
``\texttt{;}'' (semicolon) to separate commands explicitly.  This is
particularly useful in interactive shell sessions to make clear where the
current command is intended to end.  Otherwise, the interpreter loop will
continue to issue a secondary prompt ``\verb,#,'' until an end-of-command is
clearly indicated from the input syntax, e.g.\ by encountering the next
command keyword.

Advanced interfaces such as Proof~General \cite{proofgeneral} do not require
explicit semicolons: the amount of input text is determined automatically by
inspecting the present content of the Emacs text buffer.  In the printed
presentation of Isabelle/Isar documents semicolons are omitted altogether for
readability.

Proof~General requires certain syntax classification tables in order to
achieve properly synchronized interaction with the Isabelle/Isar process.
These tables need to be consistent with the Isabelle version and particular
logic image to be used in a running session (common object-logics may well
change the outer syntax).  The standard setup should work correctly with any
of the ``official'' logic images derived from Isabelle/HOL (including HOLCF
etc.).  Users of alternative logics may need to tell Proof~General
explicitly, e.g.\ by giving an option \verb,-k ZF, (in conjunction with
\verb,-l ZF, to specify the default logic image).

\section{Lexical matters}\label{sec:lex-syntax}

The Isabelle/Isar outer syntax provides token classes as presented below.
Note that some of these coincide (intentionally) with the inner lexical
syntax as presented in \cite{isabelle-ref}.

\indexoutertoken{ident}\indexoutertoken{longident}\indexoutertoken{symident}
\indexoutertoken{nat}\indexoutertoken{var}\indexoutertoken{typefree}
\indexoutertoken{typevar}\indexoutertoken{string}\indexoutertoken{verbatim}
\begin{matharray}{rcl}
  ident & = & letter~quasiletter^* \\
  longident & = & ident\verb,.,ident~\dots~ident \\
  symident & = & sym^+ ~|~ symbol \\
  nat & = & digit^+ \\
  var & = & \verb,?,ident ~|~ \verb,?,ident\verb,.,nat \\
  typefree & = & \verb,',ident \\
  typevar & = & \verb,?,typefree ~|~ \verb,?,typefree\verb,.,nat \\
  string & = & \verb,", ~\dots~ \verb,", \\
  verbatim & = & \verb,{*, ~\dots~ \verb,*}, \\
\end{matharray}
\begin{matharray}{rcl}
  letter & = & \verb,a, ~|~ \dots ~|~ \verb,z, ~|~ \verb,A, ~|~ \dots ~|~ \verb,Z, \\
  digit & = & \verb,0, ~|~ \dots ~|~ \verb,9, \\
  quasiletter & = & letter ~|~ digit ~|~ \verb,_, ~|~ \verb,', \\
  sym & = & \verb,!, ~|~ \verb,#, ~|~ \verb,$, ~|~ \verb,%, ~|~ \verb,&, ~|~  %$
   \verb,*, ~|~ \verb,+, ~|~ \verb,-, ~|~ \verb,/, ~|~ \verb,:, ~|~ \\
  & & \verb,<, ~|~ \verb,=, ~|~ \verb,>, ~|~ \verb,?, ~|~ \texttt{\at} ~|~
  \verb,^, ~|~ \verb,_, ~|~ \verb,`, ~|~ \verb,|, ~|~ \verb,~, \\
  symbol & = & {\forall} ~|~ {\exists} ~|~ {\land} ~|~ {\lor} ~|~ \dots
\end{matharray}

The syntax of \railtoken{string} admits any characters, including newlines;
``\verb|"|'' (double-quote) and ``\verb|\|'' (backslash) need to be escaped by
a backslash.  Note that ML-style control characters are \emph{not} supported.
The body of \railtoken{verbatim} may consist of any text not containing
``\verb|*}|''; this allows convenient inclusion of quotes without further
escapes.

Comments take the form \texttt{(*~\dots~*)} and may in principle be nested,
just as in ML.  Note that these are \emph{source} comments only, which are
stripped after lexical analysis of the input.  The Isar document syntax also
provides \emph{formal comments} that are considered as part of the text (see
\S\ref{sec:comments}).
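
The token classes and comment forms above may be illustrated by a few sample
tokens, each followed by the class it belongs to according to the grammar.
This is only a sketch: the particular names are made up for illustration.

\begin{verbatim}
foo_bar'                   ident
List.rev                   longident
==>    ++                  symident (sequences of sym characters)
\<forall>                  symident (a single symbol)
42                         nat
?x     ?x.1                var
'a                         typefree
?'a    ?'a.1               typevar
"a \"quoted\" word"        string (inner quotes escaped by backslash)
{* text with "quotes" *}   verbatim (no escapes required)

(* a source comment (* with a nested one *) *)
\end{verbatim}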

Proof~General does not handle nested comments properly; it is also unable to
keep \verb,(*,\,/\,\verb,{*, and \verb,*),\,/\,\verb,*}, apart, despite
their rather different meaning.  These are inherent problems of Emacs.

Mathematical symbols such as ``$\forall$'' are represented in plain ASCII as
``\verb,\<forall>,''.  Concerning Isabelle itself, any sequence of the form
\verb,\<,$ident$\verb,>, (or \verb,\\<,$ident$\verb,>,) is a legal symbol.
Display of appropriate glyphs is a matter of front-end tools, say the
user-interface of Proof~General plus the X-Symbol package, or the {\LaTeX}
macro setup of document output.  A list of predefined Isabelle symbols is
given in \cite[appendix~A]{isabelle-sys}.
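
For instance, a quantified formula may be written in plain ASCII source as
sketched below.  This assumes a logic such as Isabelle/HOL in which these
symbols are part of the term syntax, and again uses the diagnostic command
\texttt{term} merely as a carrier; \texttt{P} and \texttt{Q} are arbitrary
free variables.

\begin{verbatim}
term "\<forall>x. P x \<and> Q x"
\end{verbatim}

How the symbols \verb,\<forall>, and \verb,\<and>, are rendered depends
entirely on the front-end in use.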

\section{Common syntax entities}

Below we introduce several basic syntactic entities, such as names, terms,
and theorem specifications, which have been factored out of the actual Isar
language elements to be described later.

Note that some of the basic syntactic entities introduced below (e.g.\
\railqtoken{name}) act much like tokens rather than plain nonterminals (e.g.\
\railnonterm{sort}), especially for the sake of error messages.  E.g.\ syntax
elements like $\CONSTS$ referring to \railqtoken{name} or \railqtoken{type}
would really report a missing name or type rather than any of the constituent
primitive tokens such as \railtoken{ident} or \railtoken{string}.

Entity \railqtoken{name} usually refers to any name of types, constants,
theorems etc.\ that are to be \emph{declared} or \emph{defined} (so qualified
identifiers are excluded here).  Quoted strings provide an escape for
non-identifier names or those ruled out by outer syntax keywords (e.g.\
\verb|"let"|).  Already existing objects are usually referenced by
\railqtoken{nameref}.

\indexoutertoken{name}\indexoutertoken{parname}\indexoutertoken{nameref}
\indexoutertoken{int}
\begin{rail}
  name: ident | symident | string | nat
  ;
  parname: '(' name ')'
  ;
  nameref: name | longident
  ;
\end{rail}
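
Read against the grammar above, some sample inputs classify as follows.  This
is a sketch only; \texttt{foo} and \texttt{HOL.foo} are hypothetical names.

\begin{verbatim}
foo        name (ident), hence also a nameref
++         name (symident)
"let"      name (string), quoted because let is a keyword
3          name (nat)
(foo)      parname
HOL.foo    nameref (longident), but not a name
\end{verbatim}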

\subsection{Comments}\label{sec:comments}

Large chunks of plain \railqtoken{text} are usually given as
\railtoken{verbatim}, i.e.\ enclosed in \verb|{*|~\dots~\verb|*}|.  For
convenience, any of the smaller text units conforming to \railqtoken{nameref}
are admitted as well.  A marginal \railnonterm{comment} is of the form
\texttt{--} \railqtoken{text}.  Any number of these may occur within
Isabelle/Isar commands.

\indexoutertoken{text}\indexouternonterm{comment}
\begin{rail}
  text: verbatim | nameref
  ;
  comment: '--' text
  ;
\end{rail}
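
A minimal sketch of marginal comments is given below.  It assumes the
$\CONSTS$ command with declarations of the form $c :: \tau$ and the type
\texttt{nat} of Isabelle/HOL; the constant names are hypothetical.

\begin{verbatim}
consts
  c :: nat    -- {* a marginal comment, given verbatim *}
  d :: nat    -- "a marginal comment, given as a string"
  e :: nat    -- short
\end{verbatim}

The last form works because a single identifier already conforms to
\railqtoken{nameref}.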

\subsection{Type classes, sorts and arities}

Classes are specified by plain names.  Sorts have a very simple inner syntax,
which is either a single class name $c$ or a list $\{c@1, \dots, c@n\}$
referring to the intersection of these classes.  The syntax of type arities is
given directly at the outer level.

\railalias{subseteq}{\isasymsubseteq}
\railterm{subseteq}

\indexouternonterm{sort}\indexouternonterm{arity}\indexouternonterm{simplearity}
\indexouternonterm{classdecl}
\begin{rail}
  classdecl: name (('<' | subseteq) (nameref + ','))?
  ;
  arity: ('(' (sort + ',') ')')? sort
  ;
  simplearity: ('(' (sort + ',') ')')? nameref
  ;
\end{rail}
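
The following lines sketch how these entities might look in a theory.  The
commands \texttt{classes}, \texttt{typedecl} and \texttt{arities} are
assumptions here (they are not described in this chapter), and all class and
type names are hypothetical.  Note that the compound sort has to be quoted,
since it belongs to the inner syntax.

\begin{verbatim}
classes c                       (* classdecl without superclasses *)
classes d < c                   (* classdecl with superclass c *)
typedecl ('a, 'b) t
arities t :: (c, "{c, d}") d    (* argument sorts, then result sort *)
\end{verbatim}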

\subsection{Types and terms}\label{sec:types-terms}

The actual inner Isabelle syntax, that of types and terms of the logic, is far
too sophisticated to be modelled explicitly at the outer theory level.
Basically, any such entity has to be quoted to turn it into a single token
(the parsing and type-checking is performed internally later).  For
convenience, a slightly more liberal convention is adopted: quotes may be
omitted for any type or term that is already \emph{atomic} at the outer level.
For example, one may just write \texttt{x} instead of \texttt{"x"}.  Note that
symbolic identifiers (e.g.\ \texttt{++} or $\forall$) are available as well,
provided these have not been superseded by commands or other keywords already
(e.g.\ \texttt{=} or \texttt{+}).

\indexoutertoken{type}\indexoutertoken{term}\indexoutertoken{prop}
\begin{rail}
  type: nameref | typefree | typevar
  ;
\end{rail}

Positional instantiations are indicated by giving a sequence of terms, or the
placeholder ``$\_$'' (underscore), which means to skip a position.

\indexoutertoken{inst}\indexoutertoken{insts}
\begin{rail}
  inst: underscore | term
  ;
\end{rail}

Type declarations and definitions usually refer to \railnonterm{typespec} on
the left-hand side.  This models basic type constructor application at the
outer syntax level.  Note that only plain postfix notation is available here,
but no infixes.

\indexouternonterm{typespec}
\begin{rail}
  typespec: (() | typefree | '(' ( typefree + ',' ) ')') name
  ;
\end{rail}
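
The three alternatives of \railnonterm{typespec} may be sketched as follows,
assuming a type declaration command \texttt{typedecl} that takes a
\railnonterm{typespec}; the type names are hypothetical.

\begin{verbatim}
typedecl point             (* no type arguments *)
typedecl 'a seq            (* a single type argument *)
typedecl ('a, 'b) graph    (* several arguments, in postfix position *)
\end{verbatim}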

\subsection{Mixfix annotations}

Mixfix annotations specify concrete \emph{inner} syntax of Isabelle types and
terms.  Some commands such as $\TYPES$ (see \S\ref{sec:types-pure}) admit
infixes only, while $\CONSTS$ (see \S\ref{sec:consts}) and
$\isarkeyword{syntax}$ (see \S\ref{sec:syn-trans}) support the full range of
general mixfixes and binders.

\indexouternonterm{infix}\indexouternonterm{mixfix}
\begin{rail}
  infix: '(' ('infix' | 'infixl' | 'infixr') string? nat ')'
  ;
  mixfix: infix | '(' string prios? nat? ')' | '(' 'binder' string prios? nat ')'
  ;

  prios: '[' (nat + ',') ']'
  ;
\end{rail}

Here the \railtoken{string} specifications refer to the actual mixfix template
(see also \cite{isabelle-ref}), which may include literal text, spacing,
blocks, and arguments (denoted by ``$\_$''); the special symbol \verb,\<index>,
(printed as ``\i'') represents an index argument that specifies an implicit
structure reference (see also \S\ref{sec:locale}).  Infix and binder
declarations provide common abbreviations for particular mixfix declarations.
So in practice, mixfix templates mostly degenerate to literal text for
concrete syntax, such as ``\verb,++,'' for an infix symbol, or ``\verb,++,\i''
for an infix of an implicit structure.
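
The three kinds of annotations may be sketched as follows, assuming the
$\CONSTS$ command and the types \texttt{bool}, \texttt{nat} and \texttt{list}
of Isabelle/HOL.  The constants, templates and priorities are made up for
illustration and are not taken from any existing theory.

\begin{verbatim}
consts
  app :: "'a list => 'a list => 'a list"    (infixr "++" 65)
  between :: "nat => nat => nat => bool"
    ("_ from _ upto _" [50, 51, 51] 50)
  Most :: "('a => bool) => bool"    (binder "MOST " 10)
\end{verbatim}

The first annotation is an infix, the second a general mixfix template with
argument priorities and a result priority, the third a binder.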

\subsection{Proof methods}\label{sec:syn-meth}

Proof methods are either basic ones, or expressions composed of methods via
``\texttt{,}'' (sequential composition), ``\texttt{|}'' (alternative choices),
``\texttt{?}'' (try), and ``\texttt{+}'' (repeat at least once).  In practice,
proof methods are usually just a comma-separated list of
\railqtoken{nameref}~\railnonterm{args} specifications.  Note that parentheses
may be dropped for single method specifications (with no arguments).

\indexouternonterm{method}
\begin{rail}
  method: (nameref | '(' methods ')') (() | '?' | '+')
  ;
  methods: (nameref args | method) + (',' | '|')
  ;
\end{rail}
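
Some method expressions conforming to this grammar are sketched below.  These
are isolated proof fragments rather than a complete proof; the methods
\texttt{simp}, \texttt{rule} and \texttt{assumption}, the rules
\texttt{conjI} and \texttt{impI}, and the commands \texttt{by} and
\texttt{apply} are assumed from Isabelle/HOL and later chapters.

\begin{verbatim}
by simp                                   (* single method, no parentheses *)
apply (rule conjI, assumption+)           (* sequential composition, repeat *)
apply ((rule conjI | rule impI)+, simp?)  (* alternatives, repeat, try *)
\end{verbatim}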

Proper use of Isar proof methods does \emph{not} involve goal addressing.
Nevertheless, specifying goal ranges may occasionally come in handy in
emulating tactic scripts.  Note that $[n-]$ refers to all goals, starting from
$n$.  All goals may be specified by $[!]$, which is the same as $[1-]$.

\indexouternonterm{goalspec}
\begin{rail}
  goalspec: '[' (nat '-' nat | nat '-' | nat | '!' ) ']'
  ;
\end{rail}
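
The four forms of \railnonterm{goalspec} may be summarized by example as
follows; reading $[n-m]$ as the range of goals from $n$ to $m$ is an
assumption here, by analogy with $[n-]$.

\begin{verbatim}
[2]      goal 2 only
[2-4]    goals 2 to 4
[3-]     all goals, starting from 3
[!]      all goals, the same as [1-]
\end{verbatim}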

\subsection{Attributes and theorems}\label{sec:syn-att}

Attributes (and proof methods, see \S\ref{sec:syn-meth}) have their own
``semi-inner'' syntax, in the sense that input conforming to
\railnonterm{args} below is parsed by the attribute a second time.  The
attribute argument specifications may be any sequence of atomic entities
(identifiers, strings etc.), or properly bracketed argument lists.  Below
\railqtoken{atom} refers to any atomic entity, including any
\railtoken{keyword} conforming to \railtoken{symident}.

\indexoutertoken{atom}\indexouternonterm{args}\indexouternonterm{attributes}
\begin{rail}
  atom: nameref | typefree | typevar | var | nat | keyword
  ;
  arg: atom | '(' args ')' | '[' args ']'
  ;
  attributes: '[' (nameref args * ',') ']'
  ;
\end{rail}
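
A few attribute lists conforming to this grammar are sketched below.  The
attribute names \texttt{simp}, \texttt{intro}, \texttt{of}, \texttt{OF} and
\texttt{symmetric}, and the rule \texttt{refl}, are assumed from later
chapters and Isabelle/HOL; the arguments are arbitrary.

\begin{verbatim}
[simp]                  (* one attribute without arguments *)
[simp, intro]           (* several attributes, separated by commas *)
[of x "f y"]            (* arguments: an identifier and a quoted term *)
[OF refl, symmetric]    (* arguments belong to the preceding name *)
\end{verbatim}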

Theorem specifications come in several flavors: \railnonterm{axmdecl} and
\railnonterm{thmdecl} usually refer to axioms, assumptions or results of goal
statements, while \railnonterm{thmdef} collects lists of existing theorems.
Existing theorems are given by \railnonterm{thmref} and \railnonterm{thmrefs};
the former requires an actual singleton result.  Any of these theorem
specifications may include lists of attributes both on the left and right hand
sides; attributes are applied to any immediately preceding theorem.  If names
are omitted, the theorems are not stored within the theorem database of the
theory or proof context; any given attributes are still applied, though.

\indexouternonterm{thmdecl}\indexouternonterm{axmdecl}
\indexouternonterm{thmdef}\indexouternonterm{thmrefs}
\begin{rail}
  axmdecl: name attributes? ':'
  ;
  thmref: nameref attributes?
  ;

  thmbind: name attributes | name | attributes
  ;
\end{rail}
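
The following sketch shows attributes on both sides of theorem
specifications.  It assumes the commands \texttt{lemma} and \texttt{lemmas}
and the Isabelle/HOL rules \texttt{impI}, \texttt{conjI} and \texttt{refl};
the names \texttt{triv}, \texttt{intros} and \texttt{refl\_x} are
hypothetical.

\begin{verbatim}
lemma triv [simp]: "A --> A"    (* name with attributes, then the colon *)
  by (rule impI)

lemma [iff]: "A --> A"          (* no name: attributes applied, not stored *)
  by (rule impI)

lemmas intros [intro] = conjI impI    (* named list of existing theorems *)
lemmas refl_x = refl [of x]           (* attribute on the right-hand side *)
\end{verbatim}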

\subsection{Term patterns and declarations}\label{sec:term-decls}

Wherever explicit propositions (or term fragments) occur in a proof text,
casual binding of schematic term variables may be specified via patterns
of the form $\ISS{p@1\;\dots}{p@n}$.  There are separate versions available
for \railqtoken{term}s and \railqtoken{prop}s.  The latter provides a
$\CONCLNAME$ part with patterns referring to the (atomic) conclusion of a
rule.

\indexouternonterm{termpat}\indexouternonterm{proppat}
\begin{rail}
  termpat: '(' ('is' term +) ')'
  ;
  proppat: '(' (('is' prop +) | 'concl' ('is' prop +) | ('is' prop +) 'concl' ('is' prop +)) ')'
  ;
\end{rail}
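
As a sketch, the following isolated proof fragments attach patterns to
propositions.  The commands \texttt{assume} and \texttt{have} are assumed
from later chapters, and the propositions are arbitrary.

\begin{verbatim}
assume "f x = g y"  (is "?lhs = ?rhs")      (* binds ?lhs and ?rhs *)
have "A ==> B & C"  (concl is "?P & ?Q")    (* matches the conclusion only *)
\end{verbatim}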

Declarations of local variables $x :: \tau$ and logical propositions $a :
\phi$ represent different views on the same principle of introducing a local
scope.  In practice, one may usually omit the typing of $vars$ (due to
type-inference), and the naming of propositions (due to implicit chaining of
emerging facts).  In any case, Isar proof elements usually allow several
such items to be introduced simultaneously.

\indexouternonterm{vars}\indexouternonterm{props}
\begin{rail}
  vars: (name+) ('::' type)?
  ;
  props: thmdecl? (prop proppat? +)
  ;
\end{rail}

The treatment of multiple declarations corresponds to the complementary focus
of $vars$ versus $props$: in ``$x@1~\dots~x@n :: \tau$'' the typing refers to
all variables, while in $a\colon \phi@1~\dots~\phi@n$ the naming refers to all
propositions collectively.  Isar language elements that refer to $vars$ or
$props$ typically admit separate typings or namings via another level of
iteration, with explicit $\AND$ separators; e.g.\ see $\FIXNAME$ and
$\ASSUMENAME$ in \S\ref{sec:proof-context}.
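
The two readings may be sketched by the following proof fragment, assuming
the concrete keywords \texttt{fix}, \texttt{assume} and \texttt{and} for
$\FIXNAME$, $\ASSUMENAME$ and $\AND$, and the types \texttt{nat} and
\texttt{list} of Isabelle/HOL; the fact names are hypothetical.

\begin{verbatim}
fix x y :: nat and z :: "nat list"
assume less: "x < y" "y < 10" and empty: "z = []"
\end{verbatim}

Here the typing \texttt{nat} applies to both \texttt{x} and \texttt{y}, and
the name \texttt{less} refers to the first two propositions collectively.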

\subsection{Antiquotations}\label{sec:antiq}

\begin{matharray}{rcl}
  thm & : & \isarantiq \\
  prop & : & \isarantiq \\
  term & : & \isarantiq \\
  typ & : & \isarantiq \\
  text & : & \isarantiq \\
  goals & : & \isarantiq \\
  subgoals & : & \isarantiq \\
\end{matharray}

The text body of formal comments (see also \S\ref{sec:comments}) may contain
antiquotations of logical entities, such as theorems, terms and types, which
are to be presented in the final output produced by the Isabelle document
preparation system (see also \S\ref{sec:document-prep}).  Embedding
\texttt{{\at}{\ttlbrace}term~[show_types]~"f(x)~=~a~+~x"{\ttrbrace}} within a
text block would cause
\isa{(f{\isasymColon}'a~{\isasymRightarrow}~'a)~(x{\isasymColon}'a)~=~(a{\isasymColon}'a)~+~x}
to appear in the final {\LaTeX} document.  Also note that theorem
antiquotations may involve attributes as well.  For example,
\texttt{{\at}{\ttlbrace}thm~sym~[no_vars]{\ttrbrace}} would print the
statement where all schematic variables have been replaced by fixed ones,
which are easier to read.

\indexisarant{thm}\indexisarant{prop}\indexisarant{term}
\indexisarant{typ}\indexisarant{text}\indexisarant{goals}\indexisarant{subgoals}
\begin{rail}
  atsign lbrace antiquotation rbrace
  ;

  antiquotation:
    'thm' options thmrefs |
    'prop' options prop |
    'term' options term |
    'typ' options type |
    'text' options name |
    'goals' options |
    'subgoals' options
  ;
  options: '[' (option * ',') ']'
  ;
  option: name | name '=' name
  ;
\end{rail}

Note that the syntax of antiquotations may \emph{not} include source comments
\texttt{(*~\dots~*)} or verbatim text \verb|{*|~\dots~\verb|*}|.

\begin{descr}
\item [$\at\{thm~\vec a\}$] prints theorems $\vec a$.  Note that attribute
  specifications may be included as well (see also \S\ref{sec:syn-att}); the
  $no_vars$ operation (see \S\ref{sec:misc-meth-att}) would be particularly
  useful to suppress printing of schematic variables.
\item [$\at\{prop~\phi\}$] prints a well-typed proposition $\phi$.
\item [$\at\{term~t\}$] prints a well-typed term $t$.
\item [$\at\{typ~\tau\}$] prints a well-formed type $\tau$.
\item [$\at\{text~s\}$] prints uninterpreted source text $s$.  This is
  particularly useful to print portions of text according to the Isabelle
  {\LaTeX} output style, without demanding well-formedness (e.g.\ small pieces
  of terms that cannot be parsed or type-checked yet).
\item [$\at\{goals\}$] prints the current \emph{dynamic} goal state.  This is
  only for support of tactic-emulation scripts within Isar; presentation of
  goal states does not conform to actual human-readable proof documents.

  Please do not include goal states in document output unless you really
  know what you are doing!
\item [$\at\{subgoals\}$] behaves almost like $goals$, except that it does not
  print the main goal.
\end{descr}

The following options are available to tune the output.  Note that most of
these coincide with ML flags of the same names (see also \cite{isabelle-ref}).

\begin{descr}
\item[$show_types = bool$ and $show_sorts = bool$] control printing of
  explicit type and sort constraints.
\item[$long_names = bool$] forces names of types and constants etc.\ to be
  printed in their fully qualified internal form.
\item[$eta_contract = bool$] prints terms in $\eta$-contracted form.
\item[$display = bool$] indicates if the text is to be output as multi-line
  ``display material'', rather than a small piece of text without line breaks
  (which is the default).
\item[$quotes = bool$] indicates if the output should be enclosed in double
  quotes.
\item[$mode = name$] adds $name$ to the print mode to be used for presentation
  (see also \cite{isabelle-ref}).  Note that the standard setup for {\LaTeX}
  output is already present by default, including the modes ``$latex$'',
  ``$xsymbols$'', ``$symbols$''.
\item[$margin = nat$ and $indent = nat$] change the margin or indentation for
  pretty printing of display material.
\item[$source = bool$] prints the source text of the antiquotation arguments,
  rather than the actual value.  Note that this does not affect
  well-formedness checks of $thm$, $term$, etc.\ (only the $text$ antiquotation
  admits arbitrary output).
\item[$goals_limit = nat$] determines the maximum number of goals to be
  printed.
\end{descr}

For boolean flags, ``$name = true$'' may be abbreviated as ``$name$''.  All of
the above flags are disabled by default, unless changed from ML.

\medskip Note that antiquotations not only spare the author from tedious
typing, but also achieve some degree of consistency-checking of informal
explanations with formal developments, since well-formedness of terms and
types with respect to the current theory or proof context can be ensured.
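
Putting these pieces together, a text block with antiquotations might look as
sketched below.  This assumes the \texttt{text} command for formal comments
and the Isabelle/HOL theorem \texttt{sym}; the terms, the proposition and the
option values are arbitrary.

\begin{verbatim}
text {*
  Symmetry of equality is expressed by @{thm sym [no_vars]}.  The term
  @{term [show_types] "f x"} is printed with its type constraints, while
  @{text "f x ="} is merely a fragment of source text.  A longer statement
  may be displayed: @{prop [display, margin = 60] "x = y ==> y = x"}
*}
\end{verbatim}

Only the argument of the $text$ antiquotation escapes the well-formedness
checks; the other three have to parse and type-check in the current context.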