7 chapter {* Outer syntax *}
10 The rather generic framework of Isabelle/Isar syntax emerges from
11 three main syntactic categories: \emph{commands} of the top-level
12 Isar engine (covering theory and proof elements), \emph{methods} for
13 general goal refinements (analogous to traditional ``tactics''), and
14 \emph{attributes} for operations on facts (within a certain
15 context). Subsequently we give a reference of basic syntactic
16 entities underlying Isabelle/Isar syntax in a bottom-up manner.
17 Concrete theory and proof language elements will be introduced later
20 \medskip In order to get started with writing well-formed
21 Isabelle/Isar documents, the most important aspect to be noted is
22 the difference of \emph{inner} versus \emph{outer} syntax. Inner
23 syntax is that of Isabelle types and terms of the logic, while outer
24 syntax is that of Isabelle/Isar theory sources (specifications and
25 proofs). As a general rule, inner syntax entities may occur only as
26 \emph{atomic entities} within outer syntax. For example, the string
27 @{verbatim "\"x + y\""} and identifier @{verbatim z} are legal term
28 specifications within a theory, while @{verbatim "x + y"} without
31 Printed theory documents usually omit quotes to gain readability
32 (this is a matter of {\LaTeX} macro setup, say via @{verbatim
33 "\\isabellestyle"}, see also \cite{isabelle-sys}). Experienced
34 users of Isabelle/Isar may easily reconstruct the lost technical
35 information, while mere readers need not care about quotes at all.
37 \medskip Isabelle/Isar input may contain any number of input
38 termination characters ``@{verbatim ";"}'' (semicolon) to separate
39 commands explicitly. This is particularly useful in interactive
40 shell sessions to make clear where the current command is intended
41 to end. Otherwise, the interpreter loop will continue to issue a
42 secondary prompt ``@{verbatim "#"}'' until an end-of-command is
43 clearly recognized from the input syntax, e.g.\ encounter of the
46 More advanced interfaces such as Proof~General \cite{proofgeneral}
47 do not require explicit semicolons, the amount of input text is
48 determined automatically by inspecting the present content of the
49 Emacs text buffer. In the printed presentation of Isabelle/Isar
50 documents semicolons are omitted altogether for readability.
53 Proof~General requires certain syntax classification tables in
54 order to achieve properly synchronized interaction with the
55 Isabelle/Isar process. These tables need to be consistent with
56 the Isabelle version and particular logic image to be used in a
57 running session (common object-logics may well change the outer
58 syntax). The standard setup should work correctly with any of the
59 ``official'' logic images derived from Isabelle/HOL (including
60 HOLCF etc.). Users of alternative logics may need to tell
61 Proof~General explicitly, e.g.\ by giving an option @{verbatim "-k ZF"}
62 (in conjunction with @{verbatim "-l ZF"}, to specify the default
63 logic image). Note that option @{verbatim "-L"} does both
64 of this at the same time.
69 section {* Lexical matters \label{sec:lex-syntax} *}
72 The Isabelle/Isar outer syntax provides token classes as presented
73 below; most of these coincide with the inner lexical syntax as
74 presented in \cite{isabelle-ref}.
76 \begin{matharray}{rcl}
77 @{syntax_def ident} & = & letter\,quasiletter^* \\
78 @{syntax_def longident} & = & ident (\verb,.,ident)^+ \\
79 @{syntax_def symident} & = & sym^+ ~|~ \verb,\,\verb,<,ident\verb,>, \\
80 @{syntax_def nat} & = & digit^+ \\
81 @{syntax_def var} & = & ident ~|~ \verb,?,ident ~|~ \verb,?,ident\verb,.,nat \\
82 @{syntax_def typefree} & = & \verb,',ident \\
83 @{syntax_def typevar} & = & typefree ~|~ \verb,?,typefree ~|~ \verb,?,typefree\verb,.,nat \\
84 @{syntax_def string} & = & \verb,", ~\dots~ \verb,", \\
85 @{syntax_def altstring} & = & \backquote ~\dots~ \backquote \\
86 @{syntax_def verbatim} & = & \verb,{*, ~\dots~ \verb,*,\verb,}, \\[1ex]
88 letter & = & latin ~|~ \verb,\,\verb,<,latin\verb,>, ~|~ \verb,\,\verb,<,latin\,latin\verb,>, ~|~ greek ~|~ \\
89 & & \verb,\<^isub>, ~|~ \verb,\<^isup>, \\
90 quasiletter & = & letter ~|~ digit ~|~ \verb,_, ~|~ \verb,', \\
91 latin & = & \verb,a, ~|~ \dots ~|~ \verb,z, ~|~ \verb,A, ~|~ \dots ~|~ \verb,Z, \\
92 digit & = & \verb,0, ~|~ \dots ~|~ \verb,9, \\
93 sym & = & \verb,!, ~|~ \verb,#, ~|~ \verb,$, ~|~ \verb,%, ~|~ \verb,&, ~|~
94 \verb,*, ~|~ \verb,+, ~|~ \verb,-, ~|~ \verb,/, ~|~ \\
95 & & \verb,<, ~|~ \verb,=, ~|~ \verb,>, ~|~ \verb,?, ~|~ \texttt{\at} ~|~
96 \verb,^, ~|~ \verb,_, ~|~ \verb,|, ~|~ \verb,~, \\
97 greek & = & \verb,\<alpha>, ~|~ \verb,\<beta>, ~|~ \verb,\<gamma>, ~|~ \verb,\<delta>, ~| \\
98 & & \verb,\<epsilon>, ~|~ \verb,\<zeta>, ~|~ \verb,\<eta>, ~|~ \verb,\<theta>, ~| \\
99 & & \verb,\<iota>, ~|~ \verb,\<kappa>, ~|~ \verb,\<mu>, ~|~ \verb,\<nu>, ~| \\
100 & & \verb,\<xi>, ~|~ \verb,\<pi>, ~|~ \verb,\<rho>, ~|~ \verb,\<sigma>, ~|~ \verb,\<tau>, ~| \\
101 & & \verb,\<upsilon>, ~|~ \verb,\<phi>, ~|~ \verb,\<chi>, ~|~ \verb,\<psi>, ~| \\
102 & & \verb,\<omega>, ~|~ \verb,\<Gamma>, ~|~ \verb,\<Delta>, ~|~ \verb,\<Theta>, ~| \\
103 & & \verb,\<Lambda>, ~|~ \verb,\<Xi>, ~|~ \verb,\<Pi>, ~|~ \verb,\<Sigma>, ~| \\
104 & & \verb,\<Upsilon>, ~|~ \verb,\<Phi>, ~|~ \verb,\<Psi>, ~|~ \verb,\<Omega>, \\
107 The syntax of @{syntax string} admits any characters, including
108 newlines; ``@{verbatim "\""}'' (double-quote) and ``@{verbatim
109 "\\"}'' (backslash) need to be escaped by a backslash; arbitrary
110 character codes may be specified as ``@{verbatim "\\"}@{text ddd}'',
111 with three decimal digits. Alternative strings according to
112 @{syntax altstring} are analogous, using single back-quotes instead.
113 The body of @{syntax verbatim} may consist of any text not
114 containing ``@{verbatim "*"}@{verbatim "}"}''; this allows
115 convenient inclusion of quotes without further escapes. The greek
116 letters do \emph{not} include @{verbatim "\<lambda>"}, which is already used
117 differently in the meta-logic.
119 Common mathematical symbols such as @{text \<forall>} are represented in
120 Isabelle as @{verbatim \<forall>}. There are infinitely many Isabelle
121 symbols like this, although proper presentation is left to front-end
122 tools such as {\LaTeX} or Proof~General with the X-Symbol package.
123 A list of standard Isabelle symbols that work well with these tools
124 is given in \cite[appendix~A]{isabelle-sys}.
126 Source comments take the form @{verbatim "(*"}~@{text
127 "\<dots>"}~@{verbatim "*)"} and may be nested, although user-interface
128 tools might prevent this. Note that this form indicates source
129 comments only, which are stripped after lexical analysis of the
130 input. The Isar document syntax also provides formal comments that
131 are considered as part of the text (see \secref{sec:comments}).
135 section {* Common syntax entities *}
138 We now introduce several basic syntactic entities, such as names,
139 terms, and theorem specifications, which are factored out of the
140 actual Isar language elements to be described later.
144 subsection {* Names *}
147 Entity \railqtok{name} usually refers to any name of types,
148 constants, theorems etc.\ that are to be \emph{declared} or
149 \emph{defined} (so qualified identifiers are excluded here). Quoted
150 strings provide an escape for non-identifier names or those ruled
151 out by outer syntax keywords (e.g.\ quoted @{verbatim "\"let\""}).
152 Already existing objects are usually referenced by
155 \indexoutertoken{name}\indexoutertoken{parname}\indexoutertoken{nameref}
156 \indexoutertoken{int}
158 name: ident | symident | string | nat
160 parname: '(' name ')'
162 nameref: name | longident
170 subsection {* Comments \label{sec:comments} *}
173 Large chunks of plain \railqtok{text} are usually given
174 \railtok{verbatim}, i.e.\ enclosed in @{verbatim "{"}@{verbatim
175 "*"}~@{text "\<dots>"}~@{verbatim "*"}@{verbatim "}"}. For convenience,
176 any of the smaller text units conforming to \railqtok{nameref} are
177 admitted as well. A marginal \railnonterm{comment} is of the form
178 @{verbatim "--"} \railqtok{text}. Any number of these may occur
179 within Isabelle/Isar commands.
181 \indexoutertoken{text}\indexouternonterm{comment}
183 text: verbatim | nameref
191 subsection {* Type classes, sorts and arities *}
194 Classes are specified by plain names. Sorts have a very simple
195 inner syntax, which is either a single class name @{text c} or a
196 list @{text "{c\<^sub>1, \<dots>, c\<^sub>n}"} referring to the
197 intersection of these classes. The syntax of type arities is given
198 directly at the outer level.
200 \indexouternonterm{sort}\indexouternonterm{arity}
201 \indexouternonterm{classdecl}
203 classdecl: name (('<' | subseteq) (nameref + ','))?
207 arity: ('(' (sort + ',') ')')? sort
213 subsection {* Types and terms \label{sec:types-terms} *}
216 The actual inner Isabelle syntax, that of types and terms of the
217 logic, is far too sophisticated in order to be modelled explicitly
218 at the outer theory level. Basically, any such entity has to be
219 quoted to turn it into a single token (the parsing and type-checking
220 is performed internally later). For convenience, a slightly more
221 liberal convention is adopted: quotes may be omitted for any type or
222 term that is already atomic at the outer level. For example, one
223 may just write @{verbatim x} instead of quoted @{verbatim "\"x\""}.
224 Note that symbolic identifiers (e.g.\ @{verbatim "++"} or @{text
225 "\<forall>"} are available as well, provided these have not been superseded
226 by commands or other keywords already (such as @{verbatim "="} or
229 \indexoutertoken{type}\indexoutertoken{term}\indexoutertoken{prop}
231 type: nameref | typefree | typevar
239 Positional instantiations are indicated by giving a sequence of
240 terms, or the placeholder ``@{text _}'' (underscore), which means to
243 \indexoutertoken{inst}\indexoutertoken{insts}
245 inst: underscore | term
251 Type declarations and definitions usually refer to
252 \railnonterm{typespec} on the left-hand side. This models basic
253 type constructor application at the outer syntax level. Note that
254 only plain postfix notation is available here, but no infixes.
256 \indexouternonterm{typespec}
258 typespec: (() | typefree | '(' ( typefree + ',' ) ')') name
264 subsection {* Mixfix annotations *}
267 Mixfix annotations specify concrete \emph{inner} syntax of Isabelle
268 types and terms. Some commands such as @{command "types"} (see
269 \secref{sec:types-pure}) admit infixes only, while @{command
270 "consts"} (see \secref{sec:consts}) and @{command "syntax"} (see
271 \secref{sec:syn-trans}) support the full range of general mixfixes
274 \indexouternonterm{infix}\indexouternonterm{mixfix}\indexouternonterm{structmixfix}
276 infix: '(' ('infix' | 'infixl' | 'infixr') string? nat ')'
278 mixfix: infix | '(' string prios? nat? ')' | '(' 'binder' string prios? nat ')'
280 structmixfix: mixfix | '(' 'structure' ')'
283 prios: '[' (nat + ',') ']'
287 Here the \railtok{string} specifications refer to the actual mixfix
288 template (see also \cite{isabelle-ref}), which may include literal
289 text, spacing, blocks, and arguments (denoted by ``@{text _}''); the
290 special symbol ``@{verbatim "\<index>"}'' (printed as ``@{text "\<index>"}'')
291 represents an index argument that specifies an implicit structure
292 reference (see also \secref{sec:locale}). Infix and binder
293 declarations provide common abbreviations for particular mixfix
294 declarations. So in practice, mixfix templates mostly degenerate to
295 literal text for concrete syntax, such as ``@{verbatim "++"}'' for
296 an infix symbol, or ``@{verbatim "++"}@{text "\<index>"}'' for an infix of
297 an implicit structure.
301 subsection {* Proof methods \label{sec:syn-meth} *}
304 Proof methods are either basic ones, or expressions composed of
305 methods via ``@{verbatim ","}'' (sequential composition),
306 ``@{verbatim "|"}'' (alternative choices), ``@{verbatim "?"}''
307 (try), ``@{verbatim "+"}'' (repeat at least once), ``@{verbatim
308 "["}@{text n}@{verbatim "]"}'' (restriction to first @{text n}
309 sub-goals, with default @{text "n = 1"}). In practice, proof
310 methods are usually just a comma separated list of
311 \railqtok{nameref}~\railnonterm{args} specifications. Note that
312 parentheses may be dropped for single method specifications (with no
315 \indexouternonterm{method}
317 method: (nameref | '(' methods ')') (() | '?' | '+' | '[' nat? ']')
319 methods: (nameref args | method) + (',' | '|')
323 Proper Isar proof methods do \emph{not} admit arbitrary goal
324 addressing, but refer either to the first sub-goal or all sub-goals
325 uniformly. The goal restriction operator ``@{text "[n]"}''
326 evaluates a method expression within a sandbox consisting of the
327 first @{text n} sub-goals (which need to exist). For example, the
328 method ``@{text "simp_all[3]"}'' simplifies the first three
329 sub-goals, while ``@{text "(rule foo, simp_all)[]"}'' simplifies all
330 new goals that emerge from applying rule @{text "foo"} to the
331 originally first one.
333 Improper methods, notably tactic emulations, offer a separate
334 low-level goal addressing scheme as explicit argument to the
335 individual tactic being involved. Here ``@{text "[!]"}'' refers to
336 all goals, and ``@{text "[n-]"}'' to all goals starting from @{text
339 \indexouternonterm{goalspec}
341 goalspec: '[' (nat '-' nat | nat '-' | nat | '!' ) ']'
347 subsection {* Attributes and theorems \label{sec:syn-att} *}
350 Attributes (and proof methods, see \secref{sec:syn-meth}) have their
351 own ``semi-inner'' syntax, in the sense that input conforming to
352 \railnonterm{args} below is parsed by the attribute a second time.
353 The attribute argument specifications may be any sequence of atomic
354 entities (identifiers, strings etc.), or properly bracketed argument
355 lists. Below \railqtok{atom} refers to any atomic entity, including
356 any \railtok{keyword} conforming to \railtok{symident}.
358 \indexoutertoken{atom}\indexouternonterm{args}\indexouternonterm{attributes}
360 atom: nameref | typefree | typevar | var | nat | keyword
362 arg: atom | '(' args ')' | '[' args ']'
366 attributes: '[' (nameref args * ',') ']'
370 Theorem specifications come in several flavors:
371 \railnonterm{axmdecl} and \railnonterm{thmdecl} usually refer to
372 axioms, assumptions or results of goal statements, while
373 \railnonterm{thmdef} collects lists of existing theorems. Existing
374 theorems are given by \railnonterm{thmref} and
375 \railnonterm{thmrefs}, the former requires an actual singleton
378 There are three forms of theorem references:
381 \item named facts @{text "a"},
383 \item selections from named facts @{text "a(i)"} or @{text "a(j - k)"},
385 \item literal fact propositions using @{syntax_ref altstring} syntax
386 @{verbatim "`"}@{text "\<phi>"}@{verbatim "`"} (see also method
387 @{method_ref fact} in \secref{sec:pure-meth-att}).
391 Any kind of theorem specification may include lists of attributes
392 both on the left and right hand sides; attributes are applied to any
393 immediately preceding fact. If names are omitted, the theorems are
394 not stored within the theorem database of the theory or proof
395 context, but any given attributes are applied nonetheless.
397 An extra pair of brackets around attributes (like ``@{text
398 "[[simproc a]]"}'') abbreviates a theorem reference involving an
399 internal dummy fact, which will be ignored later on. So only the
400 effect of the attribute on the background context will persist.
401 This form of in-place declarations is particularly useful with
402 commands like @{command "declare"} and @{command "using"}.
404 \indexouternonterm{axmdecl}\indexouternonterm{thmdecl}
405 \indexouternonterm{thmdef}\indexouternonterm{thmref}
406 \indexouternonterm{thmrefs}\indexouternonterm{selection}
408 axmdecl: name attributes? ':'
414 thmref: (nameref selection? | altstring) attributes? | '[' attributes ']'
419 thmbind: name attributes | name | attributes
421 selection: '(' ((nat | nat '-' nat?) + ',') ')'
427 subsection {* Term patterns and declarations \label{sec:term-decls} *}
430 Wherever explicit propositions (or term fragments) occur in a proof
431 text, casual binding of schematic term variables may be given
432 specified via patterns of the form ``@{text "(\<IS> p\<^sub>1 \<dots>
433 p\<^sub>n)"}''. This works both for \railqtok{term} and \railqtok{prop}.
435 \indexouternonterm{termpat}\indexouternonterm{proppat}
437 termpat: '(' ('is' term +) ')'
439 proppat: '(' ('is' prop +) ')'
443 \medskip Declarations of local variables @{text "x :: \<tau>"} and
444 logical propositions @{text "a : \<phi>"} represent different views on
445 the same principle of introducing a local scope. In practice, one
446 may usually omit the typing of \railnonterm{vars} (due to
447 type-inference), and the naming of propositions (due to implicit
448 references of current facts). In any case, Isar proof elements
449 usually admit to introduce multiple such items simultaneously.
451 \indexouternonterm{vars}\indexouternonterm{props}
453 vars: (name+) ('::' type)?
455 props: thmdecl? (prop proppat? +)
459 The treatment of multiple declarations corresponds to the
460 complementary focus of \railnonterm{vars} versus
461 \railnonterm{props}. In ``@{text "x\<^sub>1 \<dots> x\<^sub>n :: \<tau>"}''
462 the typing refers to all variables, while in @{text "a: \<phi>\<^sub>1 \<dots>
463 \<phi>\<^sub>n"} the naming refers to all propositions collectively.
464 Isar language elements that refer to \railnonterm{vars} or
465 \railnonterm{props} typically admit separate typings or namings via
466 another level of iteration, with explicit @{keyword_ref "and"}
467 separators; e.g.\ see @{command "fix"} and @{command "assume"} in
468 \secref{sec:proof-context}.