doc-src/TutorialI/Documents/Documents.thy
author wenzelm
Sat, 05 Jan 2002 01:15:12 +0100
changeset 12635 e2d44df29c94
parent 12629 281aa36829d8
child 12642 40fbd988b59b
permissions -rw-r--r--
more on concrete syntax;
(*<*)
theory Documents = Main:
(*>*)

section {* Concrete syntax \label{sec:concrete-syntax} *}

text {*
  Concerning Isabelle's ``inner'' language of simply-typed @{text
  \<lambda>}-calculus, the core concept of Isabelle's elaborate infrastructure
  for concrete syntax is that of general \emph{mixfix
  annotations}\index{mixfix annotations|bold}.  Associated with any
  kind of name and type declaration, mixfixes give rise both to
  grammar productions for the parser and output templates for the
  pretty printer.

  In full generality, the whole affair of parser and pretty printer
  configuration is rather subtle.  Any syntax specifications given by
  end-users need to interact properly with the existing setup of
  Isabelle/Pure and Isabelle/HOL; see \cite{isabelle-ref} for further
  details.  It is particularly important to get the precedence of new
  syntactic constructs right, avoiding ambiguities with existing
  elements.

  \medskip Subsequently we introduce a few simple declaration forms
  that already cover the most common situations fairly well.
*}


subsection {* Infixes *}

text {*
  Syntax annotations may be included wherever constants are declared
  directly or indirectly, including \isacommand{consts},
  \isacommand{constdefs}, or \isacommand{datatype} (for the
  constructor operations).  Type-constructors may be annotated as
  well, although this is less frequently encountered in practice
  (@{text "*"} and @{text "+"} types may come to mind).

  Infix declarations\index{infix annotations|bold} provide a useful
  special case of mixfixes, where users need not care about the full
  details of priorities, nesting, spacing, etc.  The subsequent
  example of the exclusive-or operation on boolean values illustrates
  typical infix declarations.
*}

constdefs
  xor :: "bool \<Rightarrow> bool \<Rightarrow> bool"    (infixl "[+]" 60)
  "A [+] B \<equiv> (A \<and> \<not> B) \<or> (\<not> A \<and> B)"

text {*
  Any curried function with at least two arguments may be associated
  with infix syntax: @{text "xor A B"} and @{text "A [+] B"} refer to
  the same expression internally.  In partial applications with fewer
  than two operands there is a special notation with \isa{op} prefix:
  @{text xor} without arguments is represented as @{text "op [+]"};
  combined with plain prefix application this turns @{text "xor A"}
  into @{text "op [+] A"}.

  \medskip The string @{text [source] "[+]"} in the above declaration
  refers to the bit of concrete syntax used to represent the operator,
  while the number @{text 60} determines the precedence of the whole
  construct.

  As it happens, Isabelle/HOL already reserves many popular combinations
  of ASCII symbols for its own use, including both @{text "+"} and
  @{text "++"}.  Slightly more awkward combinations like the present
  @{text "[+]"} tend to be available for user extensions.  The current
  arrangement of inner syntax may be inspected via
  \commdx{print\protect\_syntax}, although its output is enormous.

  Operator precedence also needs some special consideration.  The
  admissible range is 0--1000.  Very low or high priorities are
  basically reserved for the meta-logic.  Syntax of Isabelle/HOL
  mainly uses the range of 10--100: the equality infix @{text "="} is
  centered at 50, logical connectives (like @{text "\<or>"} and @{text
  "\<and>"}) are below 50, and algebraic ones (like @{text "+"} and @{text
  "*"}) above 50.  User syntax should strive to coexist with common
  HOL forms, or use the mostly unused range 100--900.

  \medskip The keyword \isakeyword{infixl} specifies an operator that
  is nested to the \emph{left}: in iterated applications the more
  complex expression appears on the left-hand side, so @{term "A [+] B
  [+] C"} stands for @{text "(A [+] B) [+] C"}.  Similarly,
  \isakeyword{infixr} refers to nesting to the \emph{right}, reading
  @{term "A [+] B [+] C"} as @{text "A [+] (B [+] C)"}.  In contrast,
  a \emph{non-oriented} declaration via \isakeyword{infix} would
  always demand explicit parentheses.

  Many binary operations observe the associative law, so the exact
  grouping does not matter.  Nevertheless, formal statements need to be
  given in a particular format, so associativity has to be treated
  explicitly within the logic.  Exclusive-or happens to be
  associative, as shown below.
*}

lemma xor_assoc: "(A [+] B) [+] C = A [+] (B [+] C)"
  by (auto simp add: xor_def)

text {*
  Such rules may be used in simplification to regroup nested
  expressions as required.  Note that the system would actually print
  the above statement as @{term "A [+] B [+] C = A [+] (B [+] C)"}
  (due to nesting to the left).  We have preferred to give the fully
  parenthesized form in the text for clarity.  Only in rare situations
  should one consider forcing parentheses by use of non-oriented infix
  syntax; equality would probably be a typical candidate.
*}
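
text {*
  \medskip The nesting and precedence conventions described above may
  be checked directly.  The following is a sketch for illustration
  only: the constant @{text impl} and its notation @{text "[>]"} are
  hypothetical names that occur nowhere else in this theory, and the
  priority @{text 25} is merely an assumption, chosen to lie below
  that of @{text "\<or>"}.  In each statement both sides are intended to
  parse to the same internal term, so plain reflexivity should
  suffice as a proof.
*}

(* priority 60 of [+] binds tighter than conjunction *)
lemma "(A [+] B \<and> C) = ((A [+] B) \<and> C)"
  by (rule refl)

(* priority 60 of [+] binds tighter than equality at 50 *)
lemma "(A = B [+] C) = (A = (B [+] C))"
  by (rule refl)

(* prefix notation with "op" versus infix notation *)
lemma "op [+] A B = (A [+] B)"
  by (rule refl)

(* hypothetical right-nested operator; name, notation and priority are assumptions *)
constdefs
  impl :: "bool \<Rightarrow> bool \<Rightarrow> bool"    (infixr "[>]" 25)
  "A [>] B \<equiv> \<not> A \<or> B"

(* with infixr, iterated applications group to the right *)
lemma "(A [>] B [>] C) = (A [>] (B [>] C))"
  by (rule refl)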


subsection {* Mathematical symbols \label{sec:thy-present-symbols} *}

text {*
  Concrete syntax based on plain ASCII characters has its inherent
  limitations.  Rich mathematical notation demands a larger repertoire
  of symbols.  Several standards of extended character sets have been
  proposed over the decades, but none has become universally available
  so far, not even Unicode\index{Unicode}.

  Isabelle supports a generic notion of
  \emph{symbols}\index{symbols|bold} as the smallest entities of
  source text, without referring to internal encodings.  Such
  ``generalized characters'' may be of one of the following three
  kinds:

  \begin{enumerate}

  \item Traditional 7-bit ASCII characters.

  \item Named symbols: \verb,\,\verb,<,$ident$\verb,>, (or
  \verb,\\,\verb,<,$ident$\verb,>,).

  \item Named control symbols: \verb,\,\verb,<^,$ident$\verb,>, (or
  \verb,\\,\verb,<^,$ident$\verb,>,).

  \end{enumerate}

  Here $ident$ may be any identifier according to the usual Isabelle
  conventions.  This results in an infinite store of symbols, whose
  interpretation is left to further front-end tools.  For example, the
  \verb,\,\verb,<forall>, symbol of Isabelle is really displayed as
  $\forall$, both by the user-interface of Proof~General + X-Symbol
  and by the Isabelle document processor (see \S\ref{FIXME}).

  A list of standard Isabelle symbols is given in
  \cite[appendix~A]{isabelle-sys}.  Users may introduce their own
  interpretation of further symbols by configuring the appropriate
  front-end tool accordingly, e.g.\ defining appropriate {\LaTeX}
  macros for document preparation.  There are also a few predefined
  control symbols, such as \verb,\,\verb,<^sub>, and
  \verb,\,\verb,<^sup>, for sub- and superscript of the subsequent
  (printable) symbol, respectively.

  \medskip The following version of our @{text xor} definition uses a
  standard Isabelle symbol to achieve typographically pleasing output.
*}

(*<*)
hide const xor
ML_setup {* Context.>> (Theory.add_path "1") *}
(*>*)
constdefs
  xor :: "bool \<Rightarrow> bool \<Rightarrow> bool"    (infixl "\<oplus>" 60)
  "A \<oplus> B \<equiv> (A \<and> \<not> B) \<or> (\<not> A \<and> B)"
(*<*)
local
(*>*)

text {*
  The X-Symbol package within Proof~General provides several input
  methods to enter @{text \<oplus>} in the text.  If all else fails one may
  just type \verb,\,\verb,<oplus>, by hand; the display is adapted
  immediately after continuing further input.

  \medskip A slightly more refined scheme is to provide alternative
  syntax via the \emph{print mode}\index{print mode} concept of
  Isabelle (see also \cite{isabelle-ref}).  By convention, the mode
  ``$xsymbols$'' is enabled whenever X-Symbol is active.  Consider the
  following hybrid declaration of @{text xor}.
*}

(*<*)
hide const xor
ML_setup {* Context.>> (Theory.add_path "2") *}
(*>*)
constdefs
  xor :: "bool \<Rightarrow> bool \<Rightarrow> bool"    (infixl "[+]\<ignore>" 60)
  "A [+]\<ignore> B \<equiv> (A \<and> \<not> B) \<or> (\<not> A \<and> B)"

syntax (xsymbols)
  xor :: "bool \<Rightarrow> bool \<Rightarrow> bool"    (infixl "\<oplus>\<ignore>" 60)
(*<*)
local
(*>*)

text {*
  Here the \commdx{syntax} command acts like \isakeyword{consts}, but
  without declaring a logical constant, and with an optional print
  mode specification.  Note that the type declaration given here
  merely serves for syntactic purposes, and is not checked for
  consistency with the real constant.

  \medskip Now we may write either @{text "[+]"} or @{text "\<oplus>"} in
  input, while output uses the nicer syntax of $xsymbols$, provided
  that print mode is presently active.  This scheme is particularly
  useful for interactive development, with the user typing plain ASCII
  text, but gaining improved visual feedback from the system (say in
  current goal output).

  \begin{warn}
  Using alternative syntax declarations easily results in varying
  versions of input sources.  Isabelle provides no systematic way to
  convert alternative expressions back and forth.  Print modes only
  affect situations where formal entities are pretty printed by the
  Isabelle process (e.g.\ output of terms and types), but not the
  original theory text.
  \end{warn}

  \medskip The following variant makes the alternative @{text \<oplus>}
  notation only available for output.  Thus we may enforce input
  sources to refer to plain ASCII only.
*}

syntax (xsymbols output)
  xor :: "bool \<Rightarrow> bool \<Rightarrow> bool"    (infixl "\<oplus>\<ignore>" 60)


subsection {* Prefixes *}

text {*
  Prefix syntax annotations\index{prefix annotation|bold} are just a
  very degenerate case of the general mixfix form \cite{isabelle-ref},
  without any template arguments or priorities: just some piece of
  literal syntax.

  The following example illustrates this idea by associating
  common symbols with the constructors of a currency datatype.
*}

datatype currency =
    Euro nat    ("\<euro>")
  | Pounds nat  ("\<pounds>")
  | Yen nat     ("\<yen>")
  | Dollar nat  ("$")

text {*
  Here the degenerate mixfix annotations in the rightmost column
  happen to consist of a single Isabelle symbol each:
  \verb,\,\verb,<euro>,, \verb,\,\verb,<pounds>,,
  \verb,\,\verb,<yen>,, and \verb,$,.

  Recall that a constructor like @{text Euro} actually is a function
  @{typ "nat \<Rightarrow> currency"}.  An expression like @{text "Euro 10"} will
  be printed as @{term "\<euro> 10"}.  Only the head of the application is
  subject to our trivial concrete syntax; this form is sufficient to
  achieve fair conformance to EU~Commission standards of currency
  notation.

  \medskip Certainly, the same idea of prefix syntax also works for
  \isakeyword{consts}, \isakeyword{constdefs} etc.  For example, we
  might introduce a (slightly unrealistic) function to calculate an
  abstract currency value, by cases on the datatype constructors and
  fixed exchange rates.
*}

consts
  currency :: "currency \<Rightarrow> nat"    ("\<currency>")

text {*
  \noindent The funny symbol encountered here is that of
  \verb,\,\verb,<currency>,.
*}
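
text {*
  \medskip One way to give the function declared above an actual
  meaning is sketched below, by primitive recursion over the datatype
  constructors.  This is an illustration only: the exchange rates are
  invented for the sake of the example and carry no significance, and
  it is assumed that plain numerals may be used for type @{typ nat}.
*}

(* hypothetical exchange rates, one equation per constructor *)
primrec
  "\<currency> (\<euro> n) = 3 * n"
  "\<currency> (\<pounds> n) = 4 * n"
  "\<currency> (\<yen> n) = n"
  "\<currency> ($ n) = 2 * n"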


subsection {* Syntax translations \label{sec:def-translations} *}

text{*
  FIXME

\index{syntax translations|(}%
\index{translations@\isacommand {translations} (command)|(}
Isabelle offers an additional definitional facility,
\textbf{syntax translations}.
They resemble macros: upon parsing, the defined concept is immediately
replaced by its definition.  This effect is reversed upon printing.  For example,
the symbol @{text"\<noteq>"} is defined via a syntax translation:
*}

translations "x \<noteq> y" \<rightleftharpoons> "\<not>(x = y)"

text{*\index{$IsaEqTrans@\isasymrightleftharpoons}
\noindent
Internally, @{text"\<noteq>"} never appears.

In addition to @{text"\<rightleftharpoons>"} there are
@{text"\<rightharpoonup>"}\index{$IsaEqTrans1@\isasymrightharpoonup}
and @{text"\<leftharpoondown>"}\index{$IsaEqTrans2@\isasymleftharpoondown}
for uni-directional translations, which only affect
parsing or printing.  This tutorial will not cover the details of
translations.  We have mentioned the concept merely because it
crops up occasionally; a number of HOL's built-in constructs are defined
via translations.  Translations are preferable to definitions when the new
concept is a trivial variation on an existing one.  For example, we
don't need to derive new theorems about @{text"\<noteq>"}, since existing theorems
about @{text"="} still apply.%
\index{syntax translations|)}%
\index{translations@\isacommand {translations} (command)|)}
*}
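
text {*
  \medskip As a further illustration, the following sketch introduces
  infix notation for membership in a fixed relation by means of a
  bi-directional translation.  All names are hypothetical: the
  constant @{text sim}, the purely syntactic constant @{text "_sim"},
  and the notation @{text "[~]"} are assumptions made for this example
  only.  Since the notation is expanded upon parsing, both sides of
  the final statement should denote the same internal term.
*}

(* hypothetical relation, declared without a definition *)
consts
  sim :: "('a \<times> 'a) set"

(* purely syntactic constant carrying the infix notation *)
syntax
  "_sim" :: "'a \<Rightarrow> 'a \<Rightarrow> bool"    (infix "[~]" 50)

translations
  "x [~] y" \<rightleftharpoons> "(x, y) \<in> sim"

lemma "(x [~] y) = ((x, y) \<in> sim)"
  by (rule refl)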


section {* Document preparation *}

subsection {* Batch-mode sessions *}

subsection {* {\LaTeX} macros *}

subsubsection {* Structure markup *}

subsubsection {* Symbols and characters *}

text {*
  FIXME

*}

(*<*)
end
(*>*)