2 \chapter{Defining Logics} \label{Defining-Logics}
4 \section{Mixfix declarations} \label{sec:mixfix}
5 \index{mixfix declarations|(}
7 When defining a theory, you declare new constants by giving their names,
8 their type, and an optional {\bf mixfix annotation}. Mixfix annotations
9 allow you to extend Isabelle's basic $\lambda$-calculus syntax with
10 readable notation. They can express any context-free priority grammar.
11 Isabelle syntax definitions are inspired by \OBJ~\cite{OBJ}; they are more
12 general than the priority declarations of \ML\ and Prolog.
14 A mixfix annotation defines a production of the priority grammar. It
15 describes the concrete syntax, the translation to abstract syntax, and the
16 pretty printing. Special case annotations provide a simple means of
17 specifying infix operators and binders.
19 \subsection{The general mixfix form}
20 Here is a detailed account of mixfix declarations. Suppose the following
21 line occurs within a {\tt consts} or {\tt syntax} section of a {\tt .thy}
24 {\tt $c$ ::\ "$\sigma$" ("$template$" $ps$ $p$)}
26 This constant declaration and mixfix annotation are interpreted as follows:
27 \begin{itemize}\index{productions}
28 \item The string {\tt $c$} is the name of the constant associated with the
29 production; unless it is a valid identifier, it must be enclosed in
30 quotes. If $c$ is empty (given as~{\tt ""}) then this is a copy
31 production.\index{productions!copy} Otherwise, parsing an instance of the
32 phrase $template$ generates the \AST{} {\tt ("$c$" $a@1$ $\ldots$
33 $a@n$)}, where $a@i$ is the \AST{} generated by parsing the $i$-th
36 \item The constant $c$, if non-empty, is declared to have type $\sigma$
37 ({\tt consts} section only).
39 \item The string $template$ specifies the right-hand side of
40 the production. It has the form
41 \[ w@0 \;_\; w@1 \;_\; \ldots \;_\; w@n, \]
42 where each occurrence of {\tt_} denotes an argument position and
43 the~$w@i$ do not contain~{\tt _}. (If you want a literal~{\tt _} in
44 the concrete syntax, you must escape it as described below.) The $w@i$
45 may consist of \rmindex{delimiters}, spaces or
46 \rmindex{pretty printing} annotations (see below).
48 \item The type $\sigma$ specifies the production's nonterminal symbols
49 (or name tokens). If $template$ is of the form above then $\sigma$
50 must be a function type with at least~$n$ argument positions, say
51 $\sigma = [\tau@1, \dots, \tau@n] \To \tau$. Nonterminal symbols are
52 derived from the types $\tau@1$, \ldots,~$\tau@n$, $\tau$ as described
53 below. Any of these may be function types.
55 \item The optional list~$ps$ may contain at most $n$ integers, say {\tt
56 [$p@1$, $\ldots$, $p@m$]}, where $p@i$ is the minimal
57 priority\indexbold{priorities} required of any phrase that may appear
58 as the $i$-th argument. Missing priorities default to~0.
60 \item The integer $p$ is the priority of this production. If
61 omitted, it defaults to the maximal priority. Priorities range
62 between 0 and \ttindexbold{max_pri} (= 1000).
66 The resulting production is \[ A^{(p)}= w@0\, A@1^{(p@1)}\, w@1\,
67 A@2^{(p@2)}\, \dots\, A@n^{(p@n)}\, w@n \] where $A$ and the $A@i$ are the
68 nonterminals corresponding to the types $\tau$ and $\tau@i$ respectively.
69 The nonterminal symbol associated with a type $(\ldots)ty$ is {\tt logic}, if
70 this is a logical type (namely one of class {\tt logic} excluding {\tt
71 prop}). Otherwise it is $ty$ (note that only the outermost type constructor
72 is taken into account). Finally, the nonterminal of a type variable is {\tt
76 Theories must sometimes declare types for purely syntactic purposes ---
77 merely playing the role of nonterminals. One example is \tydx{type}, the
78 built-in type of types. This is a `type of all types' in the syntactic
79 sense only. Do not declare such types under {\tt arities} as belonging to
80 class {\tt logic}\index{*logic class}, for that would make them useless as
81 separate nonterminal symbols.
84 Associating nonterminals with types allows a constant's type to specify
85 syntax as well. We can declare the function~$f$ to have type $[\tau@1,
86 \ldots, \tau@n]\To \tau$ and, through a mixfix annotation, specify the layout
87 of the function's $n$ arguments. The constant's name, in this case~$f$, will
88 also serve as the label in the abstract syntax tree.
90 You may also declare mixfix syntax without adding constants to the theory's
91 signature, by using a {\tt syntax} section instead of {\tt consts}. Thus a
92 production need not map directly to a logical function (this typically
93 requires additional syntactic translations, see also
94 Chapter~\ref{chap:syntax}).
98 As a special case of the general mixfix declaration, the form
100 {\tt $c$ ::\ "$\sigma$" ("$template$")}
102 specifies no priorities. The resulting production puts no priority
103 constraints on any of its arguments and has maximal priority itself.
104 Omitting priorities in this manner is prone to syntactic ambiguities unless
105 the production's right-hand side is fully bracketed, as in
106 \verb|"if _ then _ else _ fi"|.
108 Omitting the mixfix annotation completely, as in {\tt $c$ ::\ "$\sigma$"},
109 is sensible only if~$c$ is an identifier. Otherwise you will be unable to
110 write terms involving~$c$.
113 \subsection{Example: arithmetic expressions}
114 \index{examples!of mixfix declarations}
115 This theory specification contains a {\tt syntax} section with mixfix
116 declarations encoding the priority grammar from
117 \S\ref{sec:priority_grammars}:
124 "+" :: [exp, exp] => exp ("_ + _" [0, 1] 0)
125 "*" :: [exp, exp] => exp ("_ * _" [3, 2] 2)
126 "-" :: exp => exp ("- _" [3] 3)
129 Executing {\tt Syntax.print_gram} reveals the productions derived from the
130 above mixfix declarations (lots of additional information deleted):
132 Syntax.print_gram (syn_of ExpSyntax.thy);
133 {\out exp = "0" => "0" (9)}
134 {\out exp = exp[0] "+" exp[1] => "+" (0)}
135 {\out exp = exp[3] "*" exp[2] => "*" (2)}
136 {\out exp = "-" exp[3] => "-" (3)}
139 Note that because {\tt exp} is not of class {\tt logic}, it has been
140 retained as a separate nonterminal. This also entails that the syntax
141 does not provide for identifiers or paranthesized expressions.
142 Normally you would also want to add the declaration {\tt arities
143 exp::logic} after {\tt types} and use {\tt consts} instead of {\tt
144 syntax}. Try this as an exercise and study the changes in the
151 Infix operators associating to the left or right can be declared using
152 {\tt infixl} or {\tt infixr}. Basically, the form {\tt $c$ ::\
153 $\sigma$ (infixl $p$)} abbreviates the mixfix declarations
155 "op \(c\)" :: \(\sigma\) ("(_ \(c\)/ _)" [\(p\), \(p+1\)] \(p\))
156 "op \(c\)" :: \(\sigma\) ("op \(c\)")
158 and {\tt $c$ ::\ $\sigma$ (infixr $p$)} abbreviates the mixfix declarations
160 "op \(c\)" :: \(\sigma\) ("(_ \(c\)/ _)" [\(p+1\), \(p\)] \(p\))
161 "op \(c\)" :: \(\sigma\) ("op \(c\)")
163 The infix operator is declared as a constant with the prefix {\tt op}.
164 Thus, prefixing infixes with \sdx{op} makes them behave like ordinary
165 function symbols, as in \ML. Special characters occurring in~$c$ must be
166 escaped, as in delimiters, using a single quote.
168 A slightly more general form of infix declarations allows constant
169 names to be independent from their concrete syntax, namely \texttt{$c$
170 ::\ $\sigma$\ (infixl "$sy$" $p$)}, the same for \texttt{infixr}. As
173 and :: [bool, bool] => bool (infixr "&" 35)
175 The internal constant name will then be just \texttt{and}, without any
176 \texttt{op} prefixed.
183 A {\bf binder} is a variable-binding construct such as a quantifier. The
186 \(c\) :: \(\sigma\) (binder "\(\Q\)" [\(pb\)] \(p\))
188 introduces a constant~$c$ of type~$\sigma$, which must have the form
189 $(\tau@1 \To \tau@2) \To \tau@3$. Its concrete syntax is $\Q~x.P$, where
190 $x$ is a bound variable of type~$\tau@1$, the body~$P$ has type $\tau@2$
191 and the whole term has type~$\tau@3$. The optional integer $pb$
192 specifies the body's priority, by default~$p$. Special characters
193 in $\Q$ must be escaped using a single quote.
195 The declaration is expanded internally to something like
197 \(c\)\hskip3pt :: (\(\tau@1\) => \(\tau@2\)) => \(\tau@3\)
198 "\(\Q\)" :: [idts, \(\tau@2\)] => \(\tau@3\) ("(3\(\Q\)_./ _)" [0, \(pb\)] \(p\))
200 Here \ndx{idts} is the nonterminal symbol for a list of identifiers with
201 \index{type constraints}
202 optional type constraints (see Fig.\ts\ref{fig:pure_gram}). The
203 declaration also installs a parse translation\index{translations!parse}
204 for~$\Q$ and a print translation\index{translations!print} for~$c$ to
205 translate between the internal and external forms.
207 A binder of type $(\sigma \To \tau) \To \tau$ can be nested by giving a
208 list of variables. The external form $\Q~x@1~x@2 \ldots x@n. P$
209 corresponds to the internal form
210 \[ c(\lambda x@1. c(\lambda x@2. \ldots c(\lambda x@n. P) \ldots)). \]
213 For example, let us declare the quantifier~$\forall$:\index{quantifiers}
215 All :: ('a => o) => o (binder "ALL " 10)
217 This lets us write $\forall x.P$ as either {\tt All(\%$x$.$P$)} or {\tt ALL
218 $x$.$P$}. When printing, Isabelle prefers the latter form, but must fall
219 back on ${\tt All}(P)$ if $P$ is not an abstraction. Both $P$ and {\tt ALL
220 $x$.$P$} have type~$o$, the type of formulae, while the bound variable
224 \index{mixfix declarations|)}
227 \section{*Alternative print modes} \label{sec:prmodes}
228 \index{print modes|(}
230 Isabelle's pretty printer supports alternative output syntaxes. These
231 may be used independently or in cooperation. The currently active
232 print modes (with precedence from left to right) are determined by a
234 \begin{ttbox}\index{*print_mode}
235 print_mode: string list ref
237 Initially this may already contain some print mode identifiers,
238 depending on how Isabelle has been invoked (e.g.\ by some user
239 interface). So changes should be incremental --- adding or deleting
240 modes relative to the current value.
242 Any \ML{} string is a legal print mode identifier, without any predeclaration
243 required. The following names should be considered reserved, though:
244 \texttt{""} (the empty string), \texttt{symbols}, \texttt{xsymbols}, and
247 There is a separate table of mixfix productions for pretty printing
248 associated with each print mode. The currently active ones are
249 conceptually just concatenated from left to right, with the standard
250 syntax output table always coming last as default. Thus mixfix
251 productions of preceding modes in the list may override those of later
252 ones. Also note that token translations are always relative to some
253 print mode (see \S\ref{sec:tok_tr}).
255 \medskip The canonical application of print modes is optional printing
256 of mathematical symbols from a special screen font instead of {\sc
257 ascii}. Another example is to re-use Isabelle's advanced
258 $\lambda$-term printing mechanisms to generate completely different
259 output, say for interfacing external tools like \rmindex{model
260 checkers} (see also \texttt{HOL/Modelcheck}).
262 \index{print modes|)}
265 \section{Ambiguity of parsed expressions} \label{sec:ambiguity}
266 \index{ambiguity!of parsed expressions}
268 To keep the grammar small and allow common productions to be shared
269 all logical types (except {\tt prop}) are internally represented
270 by one nonterminal, namely {\tt logic}. This and omitted or too freely
271 chosen priorities may lead to ways of parsing an expression that were
272 not intended by the theory's maker. In most cases Isabelle is able to
273 select one of multiple parse trees that an expression has lead
274 to by checking which of them can be typed correctly. But this may not
275 work in every case and always slows down parsing.
276 The warning and error messages that can be produced during this process are
279 If an ambiguity can be resolved by type inference the following
280 warning is shown to remind the user that parsing is (unnecessarily)
281 slowed down. In cases where it's not easily possible to eliminate the
282 ambiguity the frequency of the warning can be controlled by changing
283 the value of {\tt Syntax.ambiguity_level} which has type {\tt int
284 ref}. Its default value is 1 and by increasing it one can control how
285 many parse trees are necessary to generate the warning.
288 {\out Ambiguous input "\dots"}
289 {\out produces the following parse trees:}
291 {\out Fortunately, only one parse tree is type correct.}
292 {\out You may still want to disambiguate your grammar or your input.}
295 The following message is normally caused by using the same
296 syntax in two different productions:
299 {\out Ambiguous input "..."}
300 {\out produces the following parse trees:}
302 {\out More than one term is type correct:}
306 Ambiguities occuring in syntax translation rules cannot be resolved by
307 type inference because it is not necessary for these rules to be type
308 correct. Therefore Isabelle always generates an error message and the
309 ambiguity should be eliminated by changing the grammar or the rule.
314 %%% TeX-master: "ref"