2 \chapter{Isar Syntax Primitives}
4 We give a complete reference of all basic syntactic entities underlying the
5 Isabelle/Isar document syntax. Actual theory and proof commands will be
10 In order to get started with writing well-formed Isabelle/Isar documents, the
11 most important aspect to be noted is the difference of \emph{inner} versus
12 \emph{outer} syntax. Inner syntax is that of Isabelle types and terms of the
13 logic, while outer syntax is that of Isabelle/Isar theories (including
14 proofs). As a general rule, inner syntax entities may occur only as
15 \emph{atomic entities} within outer syntax. For example, the string
16 \texttt{"x + y"} and identifier \texttt{z} are legal term specifications
17 within a theory, while \texttt{x + y} is not.
20 Note that classic Isabelle theories used to fake parts of the inner syntax
21 of types, with rather complicated rules when quotes may be omitted. Despite
22 the minor drawback of requiring quotes more often, the syntax of
23 Isabelle/Isar is much simpler and more robust in that respect.
28 Another notable point is proper input termination. Proof~General demands any
29 command to be terminated by ``\texttt{;}''
30 (semicolon)\index{semicolon}\index{*;}. As far as plain Isabelle/Isar is
31 concerned, commands may be directly run together, though. In the presentation
32 of Isabelle/Isar documents, semicolons are omitted in order to gain
36 \section{Lexical matters}\label{sec:lex-syntax}
38 The Isabelle/Isar outer syntax provides token classes as presented below.
39 Note that some of these coincide (by full intention) with the inner lexical
40 syntax as presented in \cite{isabelle-ref}. These different levels of syntax
41 should not be confused, though.
43 %FIXME keyword, command
44 \begin{matharray}{rcl}
45 ident & = & letter~quasiletter^* \\
46 longident & = & ident\verb,.,ident~\dots~ident \\
47 symident & = & sym^+ ~|~ symbol \\
49 var & = & \verb,?,ident ~|~ \verb,?,ident\verb,.,nat \\
50 typefree & = & \verb,',ident \\
51 typevar & = & \verb,?,typefree ~|~ \verb,?,typefree\verb,.,nat \\
52 string & = & \verb,", ~\dots~ \verb,", \\
53 verbatim & = & \verb,{*, ~\dots~ \verb,*}, \\
55 \begin{matharray}{rcl}
56 letter & = & \verb,a, ~|~ \dots ~|~ \verb,z, ~|~ \verb,A, ~|~ \dots ~|~ \verb,Z, \\
57 digit & = & \verb,0, ~|~ \dots ~|~ \verb,9, \\
58 quasiletter & = & letter ~|~ digit ~|~ \verb,_, ~|~ \verb,', \\
59 sym & = & \verb,!, ~|~ \verb,#, ~|~ \verb,$, ~|~ \verb,%, ~|~ \verb,&, ~|~ %$
60 \verb,*, ~|~ \verb,+, ~|~ \verb,-, ~|~ \verb,/, ~|~ \verb,:, ~|~
61 \verb,<, ~|~ \verb,=, ~|~ \verb,>, ~|~ \verb,?, ~|~ \mathtt{\at} ~|~ \\
62 & & \verb,^, ~|~ \verb,_, ~|~ \verb,`, ~|~ \verb,|, ~|~ \verb,~, \\
63 symbol & = & {\forall} ~|~ {\exists} ~|~ \dots
66 The syntax of \texttt{string} admits any characters, including newlines;
67 ``\verb|"|'' (double-quote) and ``\verb|\|'' (backslash) have to be escaped by
68 a backslash. Note that ML-style control characters are \emph{not} supported.
69 The body of \texttt{verbatim} may consist of any text not containing
72 Comments take the form \texttt{(*~\dots~*)} and may be
73 nested\footnote{Proof~General may occasionally get confused by nested
74 comments.}, just as in ML. Note that these are \emph{source} comments only,
75 which are stripped after lexical analysis of the input. The Isar document
76 syntax also provides \emph{formal comments} that are actually part of the text
77 (see \S\ref{sec:comments}).
80 \section{Common syntax entities}
82 Subsequently, we introduce several basic syntactic entities, such as names,
83 terms, and theorem specifications, which have been factored out of the actual
84 Isar language elements to be described later.
86 Note that some of the basic syntactic entities introduced below (e.g.\
87 \railqtoken{name}) act much like tokens rather than plain nonterminals (e.g.\
88 \railnonterm{sort}), especially for the sake of error messages. E.g.\ syntax
89 elements such as $\CONSTS$ referring to \railqtoken{name} or \railqtoken{type}
90 would really report a missing name or type rather than any of the constituent
91 primitive tokens such as \railtoken{ident} or \railtoken{string}.
96 Entity \railqtoken{name} usually refers to any name of types, constants,
97 theorems etc.\ that are to be \emph{declared} or \emph{defined} (so qualified
98 identifiers are excluded here). Quoted strings provide an escape for
99 non-identifier names or those ruled out by outer syntax keywords (e.g.\
100 \verb|"let"|). Already existing objects are usually referenced by
101 \railqtoken{nameref}.
103 \indexoutertoken{name}\indexoutertoken{parname}\indexoutertoken{nameref}
105 name: ident | symident | string | nat
107 parname: '(' name ')'
109 nameref: name | longident
114 \subsection{Comments}\label{sec:comments}
116 Large chunks of plain \railqtoken{text} are usually given
117 \railtoken{verbatim}, i.e.\ enclosed in \verb|{*|~\dots~\verb|*}|. For
118 convenience, any of the smaller text units conforming to \railqtoken{nameref}
119 are admitted as well. Almost any of the Isar commands may be annotated by
120 marginal \railnonterm{comment} of the form \texttt{--} \railqtoken{text}.
121 Note that the latter kind of comment is actually part of the language, while
122 source level comments \verb|(*|~\dots~\verb|*)| are stripped at the lexical
123 level. A few commands such as $\PROOFNAME$ admit additional markup with a
124 ``level of interest'': \texttt{\%} followed by an optional number $n$ (default
125 $n = 1$) indicates that the respective part of the document becomes $n$ levels
126 more obscure; \texttt{\%\%} means that interest drops by $\infty$ --- abandon
127 every hope, who enter here.
129 \indexoutertoken{text}\indexouternonterm{comment}\indexouternonterm{interest}
131 text: verbatim | nameref
133 comment: ('--' text +)
135 interest: percent nat? | ppercent
140 \subsection{Type classes, sorts and arities}
142 Classes are specified by plain names. Sorts have a very simple inner syntax,
143 which is either a single class name $c$ or a list $\{c@1, \dots, c@n\}$
144 referring to the intersection of these classes. The syntax of type arities is
145 given directly at the outer level.
147 \indexouternonterm{sort}\indexouternonterm{arity}\indexouternonterm{simplearity}
148 \indexouternonterm{classdecl}
150 classdecl: name ('<' (nameref + ','))?
154 arity: ('(' (sort + ',') ')')? sort
156 simplearity: ('(' (sort + ',') ')')? nameref
161 \subsection{Types and terms}\label{sec:types-terms}
163 The actual inner Isabelle syntax, that of types and terms of the logic, is far
164 too sophisticated in order to be modelled explicitly at the outer theory
165 level. Basically, any such entity has to be quoted to turn it into a single
166 token (the parsing and type-checking is performed internally later). For
167 convenience, a slightly more liberal convention is adopted: quotes may be
168 omitted for any type or term that is already \emph{atomic} at the outer level.
169 For example, one may write just \texttt{x} instead of \texttt{"x"}. Note that
170 symbolic identifiers (e.g.\ \texttt{++} or $\forall$) are available as well,
171 provided these are not superseded by commands or keywords (e.g.\ \texttt{+}).
173 \indexoutertoken{type}\indexoutertoken{term}\indexoutertoken{prop}
175 type: nameref | typefree | typevar
183 Positional instantiations are indicated by giving a sequence of terms, or the
184 placeholder ``$\_$'' (underscore), which means to skip a position.
186 \indexoutertoken{inst}\indexoutertoken{insts}
188 inst: underscore | term
194 Type declarations and definitions usually refer to \railnonterm{typespec} on
195 the left-hand side. This models basic type constructor application at the
196 outer syntax level. Note that only plain postfix notation is available here,
199 \indexouternonterm{typespec}
201 typespec: (() | typefree | '(' ( typefree + ',' ) ')') name
206 \subsection{Term patterns}\label{sec:term-pats}
208 Assumptions and goal statements usually admit casual binding of schematic term
209 variables by giving (optional) patterns of the form $\ISS{p@1\;\dots}{p@n}$.
210 There are separate versions available for \railqtoken{term}s and
211 \railqtoken{prop}s. The latter provides a $\CONCLNAME$ part with patterns
212 referring the (atomic) conclusion of a rule.
214 \indexouternonterm{termpat}\indexouternonterm{proppat}
216 termpat: '(' ('is' term +) ')'
218 proppat: '(' (('is' prop +) | 'concl' ('is' prop +) | ('is' prop +) 'concl' ('is' prop +)) ')'
223 \subsection{Mixfix annotations}
225 Mixfix annotations specify concrete \emph{inner} syntax of Isabelle types and
226 terms (see also \cite{isabelle-ref}). Some commands such as $\TYPES$ (see
227 \S\ref{sec:types-pure}) admit infixes only, while $\CONSTS$ (see
228 \S\ref{sec:consts}) and $\isarkeyword{syntax}$ (see \S\ref{sec:syn-trans})
229 support the full range of general mixfixes and binders.
231 \indexouternonterm{infix}\indexouternonterm{mixfix}
233 infix: '(' ('infixl' | 'infixr') string? nat ')'
235 mixfix: infix | '(' string prios? nat? ')' | '(' 'binder' string prios? nat ')'
238 prios: '[' (nat + ',') ']'
243 \subsection{Attributes and theorems}\label{sec:syn-att}
245 Attributes (and proof methods, see \S\ref{sec:syn-meth}) have their own
246 ``semi-inner'' syntax, in the sense that input conforming to
247 \railnonterm{args} below is parsed by the attribute a second time. The
248 attribute argument specifications may be any sequence of atomic entities
249 (identifiers, strings etc.), or properly bracketed argument lists. Below
250 \railqtoken{atom} refers to any atomic entity, including any
251 \railtoken{keyword} conforming to \railtoken{symident}.
253 \indexoutertoken{atom}\indexouternonterm{args}\indexouternonterm{attributes}
255 atom: nameref | typefree | typevar | var | nat | keyword
257 arg: atom | '(' args ')' | '[' args ']'
261 attributes: '[' (nameref args * ',') ']'
265 Theorem specifications come in several flavors: \railnonterm{axmdecl} and
266 \railnonterm{thmdecl} usually refer to axioms, assumptions or results of goal
267 statements, while \railnonterm{thmdef} collects lists of existing theorems.
268 Existing theorems are given by \railnonterm{thmref} and \railnonterm{thmrefs},
269 the former requires an actual singleton result. Any of these theorem
270 specifications may include lists of attributes both on the left and right hand
271 sides; attributes are applied to any immediately preceding theorem. If names
272 are omitted, the theorems are not stored within the theorem database of the
273 theory or proof context; any given attributes are still applied, though.
275 \indexouternonterm{thmdecl}\indexouternonterm{axmdecl}
276 \indexouternonterm{thmdef}\indexouternonterm{thmrefs}
278 axmdecl: name attributes? ':'
284 thmref: nameref attributes?
289 thmname: name attributes | name | attributes
294 \subsection{Proof methods}\label{sec:syn-meth}
296 Proof methods are either basic ones, or expressions composed of methods via
297 ``\texttt{,}'' (sequential composition), ``\texttt{|}'' (alternative choices),
298 ``\texttt{?}'' (try), ``\texttt{+}'' (repeat at least once). In practice,
299 proof methods are usually just a comma separated list of
300 \railqtoken{nameref}~\railnonterm{args} specifications. Note that parentheses
301 may be dropped for single method specifications (with no arguments).
303 \indexouternonterm{method}
305 method: (nameref | '(' methods ')') (() | '?' | '+')
307 methods: (nameref args | method) + (',' | '|')
311 Proper use of Isar proof methods does \emph{not} involve goal addressing.
312 Nevertheless, specifying goal ranges may occasionally come in handy in
313 emulating tactic scripts. Note that $[n-]$ refers to all goals, starting from
314 $n$. All goals may be specified by $[!]$, which is the same as $[1-]$.
316 \indexouternonterm{goalspec}
318 goalspec: '[' (nat '-' nat | nat '-' | nat | '!' ) ']'
325 %%% TeX-master: "isar-ref"