7 chapter {* Outer syntax *}
10 The rather generic framework of Isabelle/Isar syntax emerges from
11 three main syntactic categories: \emph{commands} of the top-level
12 Isar engine (covering theory and proof elements), \emph{methods} for
13 general goal refinements (analogous to traditional ``tactics''), and
14 \emph{attributes} for operations on facts (within a certain
15 context). Subsequently we give a reference of basic syntactic
16 entities underlying Isabelle/Isar syntax in a bottom-up manner.
17 Concrete theory and proof language elements will be introduced later
20 \medskip In order to get started with writing well-formed
21 Isabelle/Isar documents, the most important aspect to be noted is
22 the difference of \emph{inner} versus \emph{outer} syntax. Inner
23 syntax is that of Isabelle types and terms of the logic, while outer
24 syntax is that of Isabelle/Isar theory sources (specifications and
25 proofs). As a general rule, inner syntax entities may occur only as
26 \emph{atomic entities} within outer syntax. For example, the string
27 @{verbatim "\"x + y\""} and identifier @{verbatim z} are legal term
28 specifications within a theory, while @{verbatim "x + y"} without
31 Printed theory documents usually omit quotes to gain readability
32 (this is a matter of {\LaTeX} macro setup, say via @{verbatim
33 "\\isabellestyle"}, see also \cite{isabelle-sys}). Experienced
34 users of Isabelle/Isar may easily reconstruct the lost technical
35 information, while mere readers need not care about quotes at all.
37 \medskip Isabelle/Isar input may contain any number of input
38 termination characters ``@{verbatim ";"}'' (semicolon) to separate
39 commands explicitly. This is particularly useful in interactive
40 shell sessions to make clear where the current command is intended
41 to end. Otherwise, the interpreter loop will continue to issue a
42 secondary prompt ``@{verbatim "#"}'' until an end-of-command is
43 clearly recognized from the input syntax, e.g.\ encounter of the
46 More advanced interfaces such as Proof~General \cite{proofgeneral}
47 do not require explicit semicolons, the amount of input text is
48 determined automatically by inspecting the present content of the
49 Emacs text buffer. In the printed presentation of Isabelle/Isar
50 documents semicolons are omitted altogether for readability.
53 Proof~General requires certain syntax classification tables in
54 order to achieve properly synchronized interaction with the
55 Isabelle/Isar process. These tables need to be consistent with
56 the Isabelle version and particular logic image to be used in a
57 running session (common object-logics may well change the outer
58 syntax). The standard setup should work correctly with any of the
59 ``official'' logic images derived from Isabelle/HOL (including
60 HOLCF etc.). Users of alternative logics may need to tell
61 Proof~General explicitly, e.g.\ by giving an option @{verbatim "-k ZF"}
62 (in conjunction with @{verbatim "-l ZF"}, to specify the default
63 logic image). Note that option @{verbatim "-L"} does both
64 of this at the same time.
69 section {* Lexical matters \label{sec:lex-syntax} *}
72 The Isabelle/Isar outer syntax provides token classes as presented
73 below; most of these coincide with the inner lexical syntax as
74 presented in \cite{isabelle-ref}.
76 \begin{matharray}{rcl}
77 @{syntax_def ident} & = & letter\,quasiletter^* \\
78 @{syntax_def longident} & = & ident (\verb,.,ident)^+ \\
79 @{syntax_def symident} & = & sym^+ ~|~ \verb,\,\verb,<,ident\verb,>, \\
80 @{syntax_def nat} & = & digit^+ \\
81 @{syntax_def var} & = & ident ~|~ \verb,?,ident ~|~ \verb,?,ident\verb,.,nat \\
82 @{syntax_def typefree} & = & \verb,',ident \\
83 @{syntax_def typevar} & = & typefree ~|~ \verb,?,typefree ~|~ \verb,?,typefree\verb,.,nat \\
84 @{syntax_def string} & = & \verb,", ~\dots~ \verb,", \\
85 @{syntax_def altstring} & = & \backquote ~\dots~ \backquote \\
86 @{syntax_def verbatim} & = & \verb,{*, ~\dots~ \verb,*,\verb,}, \\[1ex]
88 letter & = & latin ~|~ \verb,\,\verb,<,latin\verb,>, ~|~ \verb,\,\verb,<,latin\,latin\verb,>, ~|~ greek ~|~ \\
89 & & \verb,\<^isub>, ~|~ \verb,\<^isup>, \\
90 quasiletter & = & letter ~|~ digit ~|~ \verb,_, ~|~ \verb,', \\
91 latin & = & \verb,a, ~|~ \dots ~|~ \verb,z, ~|~ \verb,A, ~|~ \dots ~|~ \verb,Z, \\
92 digit & = & \verb,0, ~|~ \dots ~|~ \verb,9, \\
93 sym & = & \verb,!, ~|~ \verb,#, ~|~ \verb,$, ~|~ \verb,%, ~|~ \verb,&, ~|~
94 \verb,*, ~|~ \verb,+, ~|~ \verb,-, ~|~ \verb,/, ~|~ \\
95 & & \verb,<, ~|~ \verb,=, ~|~ \verb,>, ~|~ \verb,?, ~|~ \texttt{\at} ~|~
96 \verb,^, ~|~ \verb,_, ~|~ \verb,|, ~|~ \verb,~, \\
97 greek & = & \verb,\<alpha>, ~|~ \verb,\<beta>, ~|~ \verb,\<gamma>, ~|~ \verb,\<delta>, ~| \\
98 & & \verb,\<epsilon>, ~|~ \verb,\<zeta>, ~|~ \verb,\<eta>, ~|~ \verb,\<theta>, ~| \\
99 & & \verb,\<iota>, ~|~ \verb,\<kappa>, ~|~ \verb,\<mu>, ~|~ \verb,\<nu>, ~| \\
100 & & \verb,\<xi>, ~|~ \verb,\<pi>, ~|~ \verb,\<rho>, ~|~ \verb,\<sigma>, ~|~ \verb,\<tau>, ~| \\
101 & & \verb,\<upsilon>, ~|~ \verb,\<phi>, ~|~ \verb,\<chi>, ~|~ \verb,\<psi>, ~| \\
102 & & \verb,\<omega>, ~|~ \verb,\<Gamma>, ~|~ \verb,\<Delta>, ~|~ \verb,\<Theta>, ~| \\
103 & & \verb,\<Lambda>, ~|~ \verb,\<Xi>, ~|~ \verb,\<Pi>, ~|~ \verb,\<Sigma>, ~| \\
104 & & \verb,\<Upsilon>, ~|~ \verb,\<Phi>, ~|~ \verb,\<Psi>, ~|~ \verb,\<Omega>, \\
107 The syntax of @{syntax string} admits any characters, including
108 newlines; ``@{verbatim "\""}'' (double-quote) and ``@{verbatim
109 "\\"}'' (backslash) need to be escaped by a backslash; arbitrary
110 character codes may be specified as ``@{verbatim "\\"}@{text ddd}'',
111 with three decimal digits. Alternative strings according to
112 @{syntax altstring} are analogous, using single back-quotes instead.
113 The body of @{syntax verbatim} may consist of any text not
114 containing ``@{verbatim "*"}@{verbatim "}"}''; this allows
115 convenient inclusion of quotes without further escapes. The greek
116 letters do \emph{not} include @{verbatim "\<lambda>"}, which is already used
117 differently in the meta-logic.
119 Common mathematical symbols such as @{text \<forall>} are represented in
120 Isabelle as @{verbatim \<forall>}. There are infinitely many Isabelle
121 symbols like this, although proper presentation is left to front-end
122 tools such as {\LaTeX} or Proof~General with the X-Symbol package.
123 A list of standard Isabelle symbols that work well with these tools
124 is given in \cite[appendix~A]{isabelle-sys}.
126 Source comments take the form @{verbatim "(*"}~@{text
127 "\<dots>"}~@{verbatim "*)"} and may be nested, although user-interface
128 tools might prevent this. Note that this form indicates source
129 comments only, which are stripped after lexical analysis of the
130 input. The Isar syntax also provides proper \emph{document
131 comments} that are considered as part of the text (see
132 \secref{sec:comments}).
136 section {* Common syntax entities *}
139 We now introduce several basic syntactic entities, such as names,
140 terms, and theorem specifications, which are factored out of the
141 actual Isar language elements to be described later.
145 subsection {* Names *}
148 Entity \railqtok{name} usually refers to any name of types,
149 constants, theorems etc.\ that are to be \emph{declared} or
150 \emph{defined} (so qualified identifiers are excluded here). Quoted
151 strings provide an escape for non-identifier names or those ruled
152 out by outer syntax keywords (e.g.\ quoted @{verbatim "\"let\""}).
153 Already existing objects are usually referenced by
156 \indexoutertoken{name}\indexoutertoken{parname}\indexoutertoken{nameref}
157 \indexoutertoken{int}
159 name: ident | symident | string | nat
161 parname: '(' name ')'
163 nameref: name | longident
171 subsection {* Comments \label{sec:comments} *}
174 Large chunks of plain \railqtok{text} are usually given
175 \railtok{verbatim}, i.e.\ enclosed in @{verbatim "{"}@{verbatim
176 "*"}~@{text "\<dots>"}~@{verbatim "*"}@{verbatim "}"}. For convenience,
177 any of the smaller text units conforming to \railqtok{nameref} are
178 admitted as well. A marginal \railnonterm{comment} is of the form
179 @{verbatim "--"} \railqtok{text}. Any number of these may occur
180 within Isabelle/Isar commands.
182 \indexoutertoken{text}\indexouternonterm{comment}
184 text: verbatim | nameref
192 subsection {* Type classes, sorts and arities *}
195 Classes are specified by plain names. Sorts have a very simple
196 inner syntax, which is either a single class name @{text c} or a
197 list @{text "{c\<^sub>1, \<dots>, c\<^sub>n}"} referring to the
198 intersection of these classes. The syntax of type arities is given
199 directly at the outer level.
201 \indexouternonterm{sort}\indexouternonterm{arity}
202 \indexouternonterm{classdecl}
204 classdecl: name (('<' | subseteq) (nameref + ','))?
208 arity: ('(' (sort + ',') ')')? sort
214 subsection {* Types and terms \label{sec:types-terms} *}
217 The actual inner Isabelle syntax, that of types and terms of the
218 logic, is far too sophisticated in order to be modelled explicitly
219 at the outer theory level. Basically, any such entity has to be
220 quoted to turn it into a single token (the parsing and type-checking
221 is performed internally later). For convenience, a slightly more
222 liberal convention is adopted: quotes may be omitted for any type or
223 term that is already atomic at the outer level. For example, one
224 may just write @{verbatim x} instead of quoted @{verbatim "\"x\""}.
225 Note that symbolic identifiers (e.g.\ @{verbatim "++"} or @{text
226 "\<forall>"} are available as well, provided these have not been superseded
227 by commands or other keywords already (such as @{verbatim "="} or
230 \indexoutertoken{type}\indexoutertoken{term}\indexoutertoken{prop}
232 type: nameref | typefree | typevar
240 Positional instantiations are indicated by giving a sequence of
241 terms, or the placeholder ``@{text _}'' (underscore), which means to
244 \indexoutertoken{inst}\indexoutertoken{insts}
246 inst: underscore | term
252 Type declarations and definitions usually refer to
253 \railnonterm{typespec} on the left-hand side. This models basic
254 type constructor application at the outer syntax level. Note that
255 only plain postfix notation is available here, but no infixes.
257 \indexouternonterm{typespec}
259 typespec: (() | typefree | '(' ( typefree + ',' ) ')') name
265 subsection {* Term patterns and declarations \label{sec:term-decls} *}
268 Wherever explicit propositions (or term fragments) occur in a proof
269 text, casual binding of schematic term variables may be given
270 specified via patterns of the form ``@{text "(\<IS> p\<^sub>1 \<dots>
271 p\<^sub>n)"}''. This works both for \railqtok{term} and \railqtok{prop}.
273 \indexouternonterm{termpat}\indexouternonterm{proppat}
275 termpat: '(' ('is' term +) ')'
277 proppat: '(' ('is' prop +) ')'
281 \medskip Declarations of local variables @{text "x :: \<tau>"} and
282 logical propositions @{text "a : \<phi>"} represent different views on
283 the same principle of introducing a local scope. In practice, one
284 may usually omit the typing of \railnonterm{vars} (due to
285 type-inference), and the naming of propositions (due to implicit
286 references of current facts). In any case, Isar proof elements
287 usually admit to introduce multiple such items simultaneously.
289 \indexouternonterm{vars}\indexouternonterm{props}
291 vars: (name+) ('::' type)?
293 props: thmdecl? (prop proppat? +)
297 The treatment of multiple declarations corresponds to the
298 complementary focus of \railnonterm{vars} versus
299 \railnonterm{props}. In ``@{text "x\<^sub>1 \<dots> x\<^sub>n :: \<tau>"}''
300 the typing refers to all variables, while in @{text "a: \<phi>\<^sub>1 \<dots>
301 \<phi>\<^sub>n"} the naming refers to all propositions collectively.
302 Isar language elements that refer to \railnonterm{vars} or
303 \railnonterm{props} typically admit separate typings or namings via
304 another level of iteration, with explicit @{keyword_ref "and"}
305 separators; e.g.\ see @{command "fix"} and @{command "assume"} in
306 \secref{sec:proof-context}.
310 subsection {* Mixfix annotations *}
313 Mixfix annotations specify concrete \emph{inner} syntax of Isabelle
314 types and terms. Some commands such as @{command "types"} (see
315 \secref{sec:types-pure}) admit infixes only, while @{command
316 "consts"} (see \secref{sec:consts}) and @{command "syntax"} (see
317 \secref{sec:syn-trans}) support the full range of general mixfixes
320 \indexouternonterm{infix}\indexouternonterm{mixfix}\indexouternonterm{structmixfix}
322 infix: '(' ('infix' | 'infixl' | 'infixr') string nat ')'
324 mixfix: infix | '(' string prios? nat? ')' | '(' 'binder' string prios? nat ')'
326 structmixfix: mixfix | '(' 'structure' ')'
329 prios: '[' (nat + ',') ']'
333 Here the \railtok{string} specifications refer to the actual mixfix
334 template, which may include literal text, spacing, blocks, and
335 arguments (denoted by ``@{text _}''); the special symbol
336 ``@{verbatim "\<index>"}'' (printed as ``@{text "\<index>"}'') represents an index
337 argument that specifies an implicit structure reference (see also
338 \secref{sec:locale}). Infix and binder declarations provide common
339 abbreviations for particular mixfix declarations. So in practice,
340 mixfix templates mostly degenerate to literal text for concrete
341 syntax, such as ``@{verbatim "++"}'' for an infix symbol.
343 \medskip In full generality, mixfix declarations work as follows.
344 Suppose a constant @{text "c :: \<tau>\<^sub>1 \<Rightarrow> \<dots> \<tau>\<^sub>n \<Rightarrow> \<tau>"} is
345 annotated by @{text "(mixfix [p\<^sub>1, \<dots>, p\<^sub>n] p)"}, where @{text
346 "mixfix"} is a string @{text "d\<^sub>0 _ d\<^sub>1 _ \<dots> _ d\<^sub>n"} consisting of
347 delimiters that surround argument positions as indicated by
350 Altogether this determines a production for a context-free priority
351 grammar, where for each argument @{text "i"} the syntactic category
352 is determined by @{text "\<tau>\<^sub>i"} (with priority @{text "p\<^sub>i"}), and
353 the result category is determined from @{text "\<tau>"} (with
354 priority @{text "p"}). Priority specifications are optional, with
355 default 0 for arguments and 1000 for the result.
357 Since @{text "\<tau>"} may be again a function type, the constant
358 type scheme may have more argument positions than the mixfix
359 pattern. Printing a nested application @{text "c t\<^sub>1 \<dots> t\<^sub>m"} for
360 @{text "m > n"} works by attaching concrete notation only to the
361 innermost part, essentially by printing @{text "(c t\<^sub>1 \<dots> t\<^sub>n) \<dots> t\<^sub>m"}
362 instead. If a term has fewer arguments than specified in the mixfix
363 template, the concrete syntax is ignored.
365 \medskip A mixfix template may also contain additional directives
366 for pretty printing, notably spaces, blocks, and breaks. The
367 general template format is a sequence over any of the following
372 \item @{text "\<^bold>d"} is a delimiter, namely a non-empty
373 sequence of characters other than the special characters @{text "'"}
374 (single quote), @{text "_"} (underscore), @{text "\<index>"} (index
375 symbol), @{text "/"} (slash), @{text "("} and @{text ")"}
378 A single quote escapes the special meaning of these meta-characters,
379 producing a literal version of the following character, unless that
380 is a blank. A single quote followed by a blank separates
381 delimiters, without affecting printing, but input tokens may have
382 additional white space here.
384 \item @{text "_"} is an argument position, which stands for a
385 certain syntactic category in the underlying grammar.
387 \item @{text "\<index>"} is an indexed argument position; this is
388 the place where implicit structure arguments can be attached.
390 \item @{text "\<^bold>s"} is a non-empty sequence of spaces for
391 printing. This and the following specifications do not affect
394 \item @{text "(\<^bold>n"} opens a pretty printing block. The
395 optional number specifies how much indentation to add when a line
396 break occurs within the block. If the parenthesis is not followed
397 by digits, the indentation defaults to 0. A block specified via
398 @{text "(00"} is unbreakable.
400 \item @{text ")"} closes a pretty printing block.
402 \item @{text "//"} forces a line break.
404 \item @{text "/\<^bold>s"} allows a line break. Here @{text
405 "\<^bold>s"} stands for the string of spaces (zero or more) right
406 after the slash. These spaces are printed if the break is
411 For example, the template @{text "(_ +/ _)"} specifies an infix
412 operator. There are two argument positions; the delimiter @{text
413 "+"} is preceded by a space and followed by a space or line break;
414 the entire phrase is a pretty printing block.
416 The general idea of pretty printing with blocks and breaks is also
417 described in \cite{paulson-ml2}.
421 subsection {* Proof methods \label{sec:syn-meth} *}
424 Proof methods are either basic ones, or expressions composed of
425 methods via ``@{verbatim ","}'' (sequential composition),
426 ``@{verbatim "|"}'' (alternative choices), ``@{verbatim "?"}''
427 (try), ``@{verbatim "+"}'' (repeat at least once), ``@{verbatim
428 "["}@{text n}@{verbatim "]"}'' (restriction to first @{text n}
429 sub-goals, with default @{text "n = 1"}). In practice, proof
430 methods are usually just a comma separated list of
431 \railqtok{nameref}~\railnonterm{args} specifications. Note that
432 parentheses may be dropped for single method specifications (with no
435 \indexouternonterm{method}
437 method: (nameref | '(' methods ')') (() | '?' | '+' | '[' nat? ']')
439 methods: (nameref args | method) + (',' | '|')
443 Proper Isar proof methods do \emph{not} admit arbitrary goal
444 addressing, but refer either to the first sub-goal or all sub-goals
445 uniformly. The goal restriction operator ``@{text "[n]"}''
446 evaluates a method expression within a sandbox consisting of the
447 first @{text n} sub-goals (which need to exist). For example, the
448 method ``@{text "simp_all[3]"}'' simplifies the first three
449 sub-goals, while ``@{text "(rule foo, simp_all)[]"}'' simplifies all
450 new goals that emerge from applying rule @{text "foo"} to the
451 originally first one.
453 Improper methods, notably tactic emulations, offer a separate
454 low-level goal addressing scheme as explicit argument to the
455 individual tactic being involved. Here ``@{text "[!]"}'' refers to
456 all goals, and ``@{text "[n-]"}'' to all goals starting from @{text
459 \indexouternonterm{goalspec}
461 goalspec: '[' (nat '-' nat | nat '-' | nat | '!' ) ']'
467 subsection {* Attributes and theorems \label{sec:syn-att} *}
470 Attributes (and proof methods, see \secref{sec:syn-meth}) have their
471 own ``semi-inner'' syntax, in the sense that input conforming to
472 \railnonterm{args} below is parsed by the attribute a second time.
473 The attribute argument specifications may be any sequence of atomic
474 entities (identifiers, strings etc.), or properly bracketed argument
475 lists. Below \railqtok{atom} refers to any atomic entity, including
476 any \railtok{keyword} conforming to \railtok{symident}.
478 \indexoutertoken{atom}\indexouternonterm{args}\indexouternonterm{attributes}
480 atom: nameref | typefree | typevar | var | nat | keyword
482 arg: atom | '(' args ')' | '[' args ']'
486 attributes: '[' (nameref args * ',') ']'
490 Theorem specifications come in several flavors:
491 \railnonterm{axmdecl} and \railnonterm{thmdecl} usually refer to
492 axioms, assumptions or results of goal statements, while
493 \railnonterm{thmdef} collects lists of existing theorems. Existing
494 theorems are given by \railnonterm{thmref} and
495 \railnonterm{thmrefs}, the former requires an actual singleton
498 There are three forms of theorem references:
501 \item named facts @{text "a"},
503 \item selections from named facts @{text "a(i)"} or @{text "a(j - k)"},
505 \item literal fact propositions using @{syntax_ref altstring} syntax
506 @{verbatim "`"}@{text "\<phi>"}@{verbatim "`"} (see also method
507 @{method_ref fact} in \secref{sec:pure-meth-att}).
511 Any kind of theorem specification may include lists of attributes
512 both on the left and right hand sides; attributes are applied to any
513 immediately preceding fact. If names are omitted, the theorems are
514 not stored within the theorem database of the theory or proof
515 context, but any given attributes are applied nonetheless.
517 An extra pair of brackets around attributes (like ``@{text
518 "[[simproc a]]"}'') abbreviates a theorem reference involving an
519 internal dummy fact, which will be ignored later on. So only the
520 effect of the attribute on the background context will persist.
521 This form of in-place declarations is particularly useful with
522 commands like @{command "declare"} and @{command "using"}.
524 \indexouternonterm{axmdecl}\indexouternonterm{thmdecl}
525 \indexouternonterm{thmdef}\indexouternonterm{thmref}
526 \indexouternonterm{thmrefs}\indexouternonterm{selection}
528 axmdecl: name attributes? ':'
534 thmref: (nameref selection? | altstring) attributes? | '[' attributes ']'
539 thmbind: name attributes | name | attributes
541 selection: '(' ((nat | nat '-' nat?) + ',') ')'