doc-src/IsarRef/Thy/Outer_Syntax.thy
author wenzelm
Thu, 13 Nov 2008 21:34:23 +0100
changeset 28752 754f10154d73
parent 28748 69268a097405
child 28753 b5926a48c943
permissions -rw-r--r--
tuned;
wenzelm@27037
     1
(* $Id$ *)
wenzelm@27037
     2
wenzelm@27037
     3
theory Outer_Syntax
wenzelm@27050
     4
imports Main
wenzelm@27037
     5
begin
wenzelm@27037
     6
wenzelm@27040
     7
chapter {* Outer syntax *}
wenzelm@27037
     8
wenzelm@27037
     9
text {*
wenzelm@27037
    10
  The rather generic framework of Isabelle/Isar syntax emerges from
wenzelm@27037
    11
  three main syntactic categories: \emph{commands} of the top-level
wenzelm@27037
    12
  Isar engine (covering theory and proof elements), \emph{methods} for
wenzelm@27037
    13
  general goal refinements (analogous to traditional ``tactics''), and
wenzelm@27037
    14
  \emph{attributes} for operations on facts (within a certain
wenzelm@27037
    15
  context).  Subsequently we give a reference of basic syntactic
wenzelm@27037
    16
  entities underlying Isabelle/Isar syntax in a bottom-up manner.
wenzelm@27037
    17
  Concrete theory and proof language elements will be introduced later
wenzelm@27037
    18
  on.
wenzelm@27037
    19
wenzelm@27037
    20
  \medskip In order to get started with writing well-formed
wenzelm@27037
    21
  Isabelle/Isar documents, the most important aspect to be noted is
wenzelm@27037
    22
  the difference of \emph{inner} versus \emph{outer} syntax.  Inner
wenzelm@27037
    23
  syntax is that of Isabelle types and terms of the logic, while outer
wenzelm@27037
    24
  syntax is that of Isabelle/Isar theory sources (specifications and
wenzelm@27037
    25
  proofs).  As a general rule, inner syntax entities may occur only as
wenzelm@27037
    26
  \emph{atomic entities} within outer syntax.  For example, the string
wenzelm@27037
    27
  @{verbatim "\"x + y\""} and identifier @{verbatim z} are legal term
wenzelm@27037
    28
  specifications within a theory, while @{verbatim "x + y"} without
wenzelm@27037
    29
  quotes is not.
wenzelm@27037
    30
wenzelm@27037
    31
  Printed theory documents usually omit quotes to gain readability
wenzelm@27037
    32
  (this is a matter of {\LaTeX} macro setup, say via @{verbatim
wenzelm@27037
    33
  "\\isabellestyle"}, see also \cite{isabelle-sys}).  Experienced
wenzelm@27037
    34
  users of Isabelle/Isar may easily reconstruct the lost technical
wenzelm@27037
    35
  information, while mere readers need not care about quotes at all.
wenzelm@27037
    36
wenzelm@27037
    37
  \medskip Isabelle/Isar input may contain any number of input
wenzelm@27037
    38
  termination characters ``@{verbatim ";"}'' (semicolon) to separate
wenzelm@27037
    39
  commands explicitly.  This is particularly useful in interactive
wenzelm@27037
    40
  shell sessions to make clear where the current command is intended
wenzelm@27037
    41
  to end.  Otherwise, the interpreter loop will continue to issue a
wenzelm@27037
    42
  secondary prompt ``@{verbatim "#"}'' until an end-of-command is
wenzelm@27037
    43
  clearly recognized from the input syntax, e.g.\ encounter of the
wenzelm@27037
    44
  next command keyword.
wenzelm@27037
    45
wenzelm@27037
    46
  More advanced interfaces such as Proof~General \cite{proofgeneral}
wenzelm@27037
    47
  do not require explicit semicolons, the amount of input text is
wenzelm@27037
    48
  determined automatically by inspecting the present content of the
wenzelm@27037
    49
  Emacs text buffer.  In the printed presentation of Isabelle/Isar
wenzelm@27037
    50
  documents semicolons are omitted altogether for readability.
wenzelm@27037
    51
wenzelm@27037
    52
  \begin{warn}
wenzelm@27037
    53
    Proof~General requires certain syntax classification tables in
wenzelm@27037
    54
    order to achieve properly synchronized interaction with the
wenzelm@27037
    55
    Isabelle/Isar process.  These tables need to be consistent with
wenzelm@27037
    56
    the Isabelle version and particular logic image to be used in a
wenzelm@27037
    57
    running session (common object-logics may well change the outer
wenzelm@27037
    58
    syntax).  The standard setup should work correctly with any of the
wenzelm@27037
    59
    ``official'' logic images derived from Isabelle/HOL (including
wenzelm@27037
    60
    HOLCF etc.).  Users of alternative logics may need to tell
wenzelm@27037
    61
    Proof~General explicitly, e.g.\ by giving an option @{verbatim "-k ZF"}
wenzelm@27037
    62
    (in conjunction with @{verbatim "-l ZF"}, to specify the default
wenzelm@27037
    63
    logic image).  Note that option @{verbatim "-L"} does both
wenzelm@27037
    64
    of this at the same time.
wenzelm@27037
    65
  \end{warn}
wenzelm@27037
    66
*}
wenzelm@27037
    67
wenzelm@27037
    68
wenzelm@27037
    69
section {* Lexical matters \label{sec:lex-syntax} *}
wenzelm@27037
    70
wenzelm@27037
    71
text {*
wenzelm@27037
    72
  The Isabelle/Isar outer syntax provides token classes as presented
wenzelm@27037
    73
  below; most of these coincide with the inner lexical syntax as
wenzelm@27037
    74
  presented in \cite{isabelle-ref}.
wenzelm@27037
    75
wenzelm@27037
    76
  \begin{matharray}{rcl}
wenzelm@27037
    77
    @{syntax_def ident} & = & letter\,quasiletter^* \\
wenzelm@27037
    78
    @{syntax_def longident} & = & ident (\verb,.,ident)^+ \\
wenzelm@27037
    79
    @{syntax_def symident} & = & sym^+ ~|~ \verb,\,\verb,<,ident\verb,>, \\
wenzelm@27037
    80
    @{syntax_def nat} & = & digit^+ \\
wenzelm@27037
    81
    @{syntax_def var} & = & ident ~|~ \verb,?,ident ~|~ \verb,?,ident\verb,.,nat \\
wenzelm@27037
    82
    @{syntax_def typefree} & = & \verb,',ident \\
wenzelm@27037
    83
    @{syntax_def typevar} & = & typefree ~|~ \verb,?,typefree ~|~ \verb,?,typefree\verb,.,nat \\
wenzelm@27037
    84
    @{syntax_def string} & = & \verb,", ~\dots~ \verb,", \\
wenzelm@27037
    85
    @{syntax_def altstring} & = & \backquote ~\dots~ \backquote \\
wenzelm@27037
    86
    @{syntax_def verbatim} & = & \verb,{*, ~\dots~ \verb,*,\verb,}, \\[1ex]
wenzelm@27037
    87
wenzelm@27037
    88
    letter & = & latin ~|~ \verb,\,\verb,<,latin\verb,>, ~|~ \verb,\,\verb,<,latin\,latin\verb,>, ~|~ greek ~|~ \\
wenzelm@27037
    89
           &   & \verb,\<^isub>, ~|~ \verb,\<^isup>, \\
wenzelm@27037
    90
    quasiletter & = & letter ~|~ digit ~|~ \verb,_, ~|~ \verb,', \\
wenzelm@27037
    91
    latin & = & \verb,a, ~|~ \dots ~|~ \verb,z, ~|~ \verb,A, ~|~ \dots ~|~ \verb,Z, \\
wenzelm@27037
    92
    digit & = & \verb,0, ~|~ \dots ~|~ \verb,9, \\
wenzelm@27037
    93
    sym & = & \verb,!, ~|~ \verb,#, ~|~ \verb,$, ~|~ \verb,%, ~|~ \verb,&, ~|~
wenzelm@27037
    94
     \verb,*, ~|~ \verb,+, ~|~ \verb,-, ~|~ \verb,/, ~|~ \\
wenzelm@27037
    95
    & & \verb,<, ~|~ \verb,=, ~|~ \verb,>, ~|~ \verb,?, ~|~ \texttt{\at} ~|~
wenzelm@27037
    96
    \verb,^, ~|~ \verb,_, ~|~ \verb,|, ~|~ \verb,~, \\
wenzelm@27037
    97
    greek & = & \verb,\<alpha>, ~|~ \verb,\<beta>, ~|~ \verb,\<gamma>, ~|~ \verb,\<delta>, ~| \\
wenzelm@27037
    98
          &   & \verb,\<epsilon>, ~|~ \verb,\<zeta>, ~|~ \verb,\<eta>, ~|~ \verb,\<theta>, ~| \\
wenzelm@27037
    99
          &   & \verb,\<iota>, ~|~ \verb,\<kappa>, ~|~ \verb,\<mu>, ~|~ \verb,\<nu>, ~| \\
wenzelm@27037
   100
          &   & \verb,\<xi>, ~|~ \verb,\<pi>, ~|~ \verb,\<rho>, ~|~ \verb,\<sigma>, ~|~ \verb,\<tau>, ~| \\
wenzelm@27037
   101
          &   & \verb,\<upsilon>, ~|~ \verb,\<phi>, ~|~ \verb,\<chi>, ~|~ \verb,\<psi>, ~| \\
wenzelm@27037
   102
          &   & \verb,\<omega>, ~|~ \verb,\<Gamma>, ~|~ \verb,\<Delta>, ~|~ \verb,\<Theta>, ~| \\
wenzelm@27037
   103
          &   & \verb,\<Lambda>, ~|~ \verb,\<Xi>, ~|~ \verb,\<Pi>, ~|~ \verb,\<Sigma>, ~| \\
wenzelm@27037
   104
          &   & \verb,\<Upsilon>, ~|~ \verb,\<Phi>, ~|~ \verb,\<Psi>, ~|~ \verb,\<Omega>, \\
wenzelm@27037
   105
  \end{matharray}
wenzelm@27037
   106
wenzelm@27037
   107
  The syntax of @{syntax string} admits any characters, including
wenzelm@27037
   108
  newlines; ``@{verbatim "\""}'' (double-quote) and ``@{verbatim
wenzelm@27037
   109
  "\\"}'' (backslash) need to be escaped by a backslash; arbitrary
wenzelm@27037
   110
  character codes may be specified as ``@{verbatim "\\"}@{text ddd}'',
wenzelm@27037
   111
  with three decimal digits.  Alternative strings according to
wenzelm@27037
   112
  @{syntax altstring} are analogous, using single back-quotes instead.
wenzelm@27037
   113
  The body of @{syntax verbatim} may consist of any text not
wenzelm@27037
   114
  containing ``@{verbatim "*"}@{verbatim "}"}''; this allows
wenzelm@27037
   115
  convenient inclusion of quotes without further escapes.  The greek
wenzelm@27037
   116
  letters do \emph{not} include @{verbatim "\<lambda>"}, which is already used
wenzelm@27037
   117
  differently in the meta-logic.
wenzelm@27037
   118
wenzelm@27037
   119
  Common mathematical symbols such as @{text \<forall>} are represented in
wenzelm@27037
   120
  Isabelle as @{verbatim \<forall>}.  There are infinitely many Isabelle
wenzelm@27037
   121
  symbols like this, although proper presentation is left to front-end
wenzelm@27037
   122
  tools such as {\LaTeX} or Proof~General with the X-Symbol package.
wenzelm@27037
   123
  A list of standard Isabelle symbols that work well with these tools
wenzelm@27037
   124
  is given in \cite[appendix~A]{isabelle-sys}.
wenzelm@27037
   125
  
wenzelm@27037
   126
  Source comments take the form @{verbatim "(*"}~@{text
wenzelm@27037
   127
  "\<dots>"}~@{verbatim "*)"} and may be nested, although user-interface
wenzelm@27037
   128
  tools might prevent this.  Note that this form indicates source
wenzelm@27037
   129
  comments only, which are stripped after lexical analysis of the
wenzelm@28748
   130
  input.  The Isar syntax also provides proper \emph{document
wenzelm@28748
   131
  comments} that are considered as part of the text (see
wenzelm@28748
   132
  \secref{sec:comments}).
wenzelm@27037
   133
*}
wenzelm@27037
   134
wenzelm@27037
   135
wenzelm@27037
   136
section {* Common syntax entities *}
wenzelm@27037
   137
wenzelm@27037
   138
text {*
wenzelm@27037
   139
  We now introduce several basic syntactic entities, such as names,
wenzelm@27037
   140
  terms, and theorem specifications, which are factored out of the
wenzelm@27037
   141
  actual Isar language elements to be described later.
wenzelm@27037
   142
*}
wenzelm@27037
   143
wenzelm@27037
   144
wenzelm@27037
   145
subsection {* Names *}
wenzelm@27037
   146
wenzelm@27037
   147
text {*
wenzelm@27037
   148
  Entity \railqtok{name} usually refers to any name of types,
wenzelm@27037
   149
  constants, theorems etc.\ that are to be \emph{declared} or
wenzelm@27037
   150
  \emph{defined} (so qualified identifiers are excluded here).  Quoted
wenzelm@27037
   151
  strings provide an escape for non-identifier names or those ruled
wenzelm@27037
   152
  out by outer syntax keywords (e.g.\ quoted @{verbatim "\"let\""}).
wenzelm@27037
   153
  Already existing objects are usually referenced by
wenzelm@27037
   154
  \railqtok{nameref}.
wenzelm@27037
   155
wenzelm@27037
   156
  \indexoutertoken{name}\indexoutertoken{parname}\indexoutertoken{nameref}
wenzelm@27037
   157
  \indexoutertoken{int}
wenzelm@27037
   158
  \begin{rail}
wenzelm@27037
   159
    name: ident | symident | string | nat
wenzelm@27037
   160
    ;
wenzelm@27037
   161
    parname: '(' name ')'
wenzelm@27037
   162
    ;
wenzelm@27037
   163
    nameref: name | longident
wenzelm@27037
   164
    ;
wenzelm@27037
   165
    int: nat | '-' nat
wenzelm@27037
   166
    ;
wenzelm@27037
   167
  \end{rail}
wenzelm@27037
   168
*}
wenzelm@27037
   169
wenzelm@27037
   170
wenzelm@27037
   171
subsection {* Comments \label{sec:comments} *}
wenzelm@27037
   172
wenzelm@27037
   173
text {*
wenzelm@27037
   174
  Large chunks of plain \railqtok{text} are usually given
wenzelm@27037
   175
  \railtok{verbatim}, i.e.\ enclosed in @{verbatim "{"}@{verbatim
wenzelm@27037
   176
  "*"}~@{text "\<dots>"}~@{verbatim "*"}@{verbatim "}"}.  For convenience,
wenzelm@27037
   177
  any of the smaller text units conforming to \railqtok{nameref} are
wenzelm@27037
   178
  admitted as well.  A marginal \railnonterm{comment} is of the form
wenzelm@27037
   179
  @{verbatim "--"} \railqtok{text}.  Any number of these may occur
wenzelm@27037
   180
  within Isabelle/Isar commands.
wenzelm@27037
   181
wenzelm@27037
   182
  \indexoutertoken{text}\indexouternonterm{comment}
wenzelm@27037
   183
  \begin{rail}
wenzelm@27037
   184
    text: verbatim | nameref
wenzelm@27037
   185
    ;
wenzelm@27037
   186
    comment: '--' text
wenzelm@27037
   187
    ;
wenzelm@27037
   188
  \end{rail}
wenzelm@27037
   189
*}
wenzelm@27037
   190
wenzelm@27037
   191
wenzelm@27037
   192
subsection {* Type classes, sorts and arities *}
wenzelm@27037
   193
wenzelm@27037
   194
text {*
wenzelm@27037
   195
  Classes are specified by plain names.  Sorts have a very simple
wenzelm@27037
   196
  inner syntax, which is either a single class name @{text c} or a
wenzelm@27037
   197
  list @{text "{c\<^sub>1, \<dots>, c\<^sub>n}"} referring to the
wenzelm@27037
   198
  intersection of these classes.  The syntax of type arities is given
wenzelm@27037
   199
  directly at the outer level.
wenzelm@27037
   200
wenzelm@27037
   201
  \indexouternonterm{sort}\indexouternonterm{arity}
wenzelm@27037
   202
  \indexouternonterm{classdecl}
wenzelm@27037
   203
  \begin{rail}
wenzelm@27037
   204
    classdecl: name (('<' | subseteq) (nameref + ','))?
wenzelm@27037
   205
    ;
wenzelm@27037
   206
    sort: nameref
wenzelm@27037
   207
    ;
wenzelm@27037
   208
    arity: ('(' (sort + ',') ')')? sort
wenzelm@27037
   209
    ;
wenzelm@27037
   210
  \end{rail}
wenzelm@27037
   211
*}
wenzelm@27037
   212
wenzelm@27037
   213
wenzelm@27037
   214
subsection {* Types and terms \label{sec:types-terms} *}
wenzelm@27037
   215
wenzelm@27037
   216
text {*
wenzelm@27037
   217
  The actual inner Isabelle syntax, that of types and terms of the
wenzelm@27037
   218
  logic, is far too sophisticated in order to be modelled explicitly
wenzelm@27037
   219
  at the outer theory level.  Basically, any such entity has to be
wenzelm@27037
   220
  quoted to turn it into a single token (the parsing and type-checking
wenzelm@27037
   221
  is performed internally later).  For convenience, a slightly more
wenzelm@27037
   222
  liberal convention is adopted: quotes may be omitted for any type or
wenzelm@27037
   223
  term that is already atomic at the outer level.  For example, one
wenzelm@27037
   224
  may just write @{verbatim x} instead of quoted @{verbatim "\"x\""}.
wenzelm@27037
   225
  Note that symbolic identifiers (e.g.\ @{verbatim "++"} or @{text
wenzelm@27037
   226
  "\<forall>"} are available as well, provided these have not been superseded
wenzelm@27037
   227
  by commands or other keywords already (such as @{verbatim "="} or
wenzelm@27037
   228
  @{verbatim "+"}).
wenzelm@27037
   229
wenzelm@27037
   230
  \indexoutertoken{type}\indexoutertoken{term}\indexoutertoken{prop}
wenzelm@27037
   231
  \begin{rail}
wenzelm@27037
   232
    type: nameref | typefree | typevar
wenzelm@27037
   233
    ;
wenzelm@27037
   234
    term: nameref | var
wenzelm@27037
   235
    ;
wenzelm@27037
   236
    prop: term
wenzelm@27037
   237
    ;
wenzelm@27037
   238
  \end{rail}
wenzelm@27037
   239
wenzelm@27037
   240
  Positional instantiations are indicated by giving a sequence of
wenzelm@27037
   241
  terms, or the placeholder ``@{text _}'' (underscore), which means to
wenzelm@27037
   242
  skip a position.
wenzelm@27037
   243
wenzelm@27037
   244
  \indexoutertoken{inst}\indexoutertoken{insts}
wenzelm@27037
   245
  \begin{rail}
wenzelm@27037
   246
    inst: underscore | term
wenzelm@27037
   247
    ;
wenzelm@27037
   248
    insts: (inst *)
wenzelm@27037
   249
    ;
wenzelm@27037
   250
  \end{rail}
wenzelm@27037
   251
wenzelm@27037
   252
  Type declarations and definitions usually refer to
wenzelm@27037
   253
  \railnonterm{typespec} on the left-hand side.  This models basic
wenzelm@27037
   254
  type constructor application at the outer syntax level.  Note that
wenzelm@27037
   255
  only plain postfix notation is available here, but no infixes.
wenzelm@27037
   256
wenzelm@27037
   257
  \indexouternonterm{typespec}
wenzelm@27037
   258
  \begin{rail}
wenzelm@27037
   259
    typespec: (() | typefree | '(' ( typefree + ',' ) ')') name
wenzelm@27037
   260
    ;
wenzelm@27037
   261
  \end{rail}
wenzelm@27037
   262
*}
wenzelm@27037
   263
wenzelm@27037
   264
wenzelm@28752
   265
subsection {* Term patterns and declarations \label{sec:term-decls} *}
wenzelm@28752
   266
wenzelm@28752
   267
text {*
wenzelm@28752
   268
  Wherever explicit propositions (or term fragments) occur in a proof
wenzelm@28752
   269
  text, casual binding of schematic term variables may be given
wenzelm@28752
   270
  specified via patterns of the form ``@{text "(\<IS> p\<^sub>1 \<dots>
wenzelm@28752
   271
  p\<^sub>n)"}''.  This works both for \railqtok{term} and \railqtok{prop}.
wenzelm@28752
   272
wenzelm@28752
   273
  \indexouternonterm{termpat}\indexouternonterm{proppat}
wenzelm@28752
   274
  \begin{rail}
wenzelm@28752
   275
    termpat: '(' ('is' term +) ')'
wenzelm@28752
   276
    ;
wenzelm@28752
   277
    proppat: '(' ('is' prop +) ')'
wenzelm@28752
   278
    ;
wenzelm@28752
   279
  \end{rail}
wenzelm@28752
   280
wenzelm@28752
   281
  \medskip Declarations of local variables @{text "x :: \<tau>"} and
wenzelm@28752
   282
  logical propositions @{text "a : \<phi>"} represent different views on
wenzelm@28752
   283
  the same principle of introducing a local scope.  In practice, one
wenzelm@28752
   284
  may usually omit the typing of \railnonterm{vars} (due to
wenzelm@28752
   285
  type-inference), and the naming of propositions (due to implicit
wenzelm@28752
   286
  references of current facts).  In any case, Isar proof elements
wenzelm@28752
   287
  usually admit to introduce multiple such items simultaneously.
wenzelm@28752
   288
wenzelm@28752
   289
  \indexouternonterm{vars}\indexouternonterm{props}
wenzelm@28752
   290
  \begin{rail}
wenzelm@28752
   291
    vars: (name+) ('::' type)?
wenzelm@28752
   292
    ;
wenzelm@28752
   293
    props: thmdecl? (prop proppat? +)
wenzelm@28752
   294
    ;
wenzelm@28752
   295
  \end{rail}
wenzelm@28752
   296
wenzelm@28752
   297
  The treatment of multiple declarations corresponds to the
wenzelm@28752
   298
  complementary focus of \railnonterm{vars} versus
wenzelm@28752
   299
  \railnonterm{props}.  In ``@{text "x\<^sub>1 \<dots> x\<^sub>n :: \<tau>"}''
wenzelm@28752
   300
  the typing refers to all variables, while in @{text "a: \<phi>\<^sub>1 \<dots>
wenzelm@28752
   301
  \<phi>\<^sub>n"} the naming refers to all propositions collectively.
wenzelm@28752
   302
  Isar language elements that refer to \railnonterm{vars} or
wenzelm@28752
   303
  \railnonterm{props} typically admit separate typings or namings via
wenzelm@28752
   304
  another level of iteration, with explicit @{keyword_ref "and"}
wenzelm@28752
   305
  separators; e.g.\ see @{command "fix"} and @{command "assume"} in
wenzelm@28752
   306
  \secref{sec:proof-context}.
wenzelm@28752
   307
*}
wenzelm@28752
   308
wenzelm@28752
   309
wenzelm@27037
   310
subsection {* Mixfix annotations *}
wenzelm@27037
   311
wenzelm@27037
   312
text {*
wenzelm@27037
   313
  Mixfix annotations specify concrete \emph{inner} syntax of Isabelle
wenzelm@27037
   314
  types and terms.  Some commands such as @{command "types"} (see
wenzelm@27037
   315
  \secref{sec:types-pure}) admit infixes only, while @{command
wenzelm@27037
   316
  "consts"} (see \secref{sec:consts}) and @{command "syntax"} (see
wenzelm@27037
   317
  \secref{sec:syn-trans}) support the full range of general mixfixes
wenzelm@27037
   318
  and binders.
wenzelm@27037
   319
wenzelm@27037
   320
  \indexouternonterm{infix}\indexouternonterm{mixfix}\indexouternonterm{structmixfix}
wenzelm@27037
   321
  \begin{rail}
wenzelm@28752
   322
    infix: '(' ('infix' | 'infixl' | 'infixr') string nat ')'
wenzelm@27037
   323
    ;
wenzelm@27037
   324
    mixfix: infix | '(' string prios? nat? ')' | '(' 'binder' string prios? nat ')'
wenzelm@27037
   325
    ;
wenzelm@27037
   326
    structmixfix: mixfix | '(' 'structure' ')'
wenzelm@27037
   327
    ;
wenzelm@27037
   328
wenzelm@27037
   329
    prios: '[' (nat + ',') ']'
wenzelm@27037
   330
    ;
wenzelm@27037
   331
  \end{rail}
wenzelm@27037
   332
wenzelm@27037
   333
  Here the \railtok{string} specifications refer to the actual mixfix
wenzelm@28752
   334
  template, which may include literal text, spacing, blocks, and
wenzelm@28752
   335
  arguments (denoted by ``@{text _}''); the special symbol
wenzelm@28752
   336
  ``@{verbatim "\<index>"}'' (printed as ``@{text "\<index>"}'') represents an index
wenzelm@28752
   337
  argument that specifies an implicit structure reference (see also
wenzelm@28752
   338
  \secref{sec:locale}).  Infix and binder declarations provide common
wenzelm@28752
   339
  abbreviations for particular mixfix declarations.  So in practice,
wenzelm@28752
   340
  mixfix templates mostly degenerate to literal text for concrete
wenzelm@28752
   341
  syntax, such as ``@{verbatim "++"}'' for an infix symbol.
wenzelm@28752
   342
wenzelm@28752
   343
  \medskip In full generality, mixfix declarations work as follows.
wenzelm@28752
   344
  Suppose a constant @{text "c :: \<tau>\<^sub>1 \<Rightarrow> \<dots> \<tau>\<^sub>n \<Rightarrow> \<tau>"} is
wenzelm@28752
   345
  annotated by @{text "(mixfix [p\<^sub>1, \<dots>, p\<^sub>n] p)"}, where @{text
wenzelm@28752
   346
  "mixfix"} is a string @{text "d\<^sub>0 _ d\<^sub>1 _ \<dots> _ d\<^sub>n"} consisting of
wenzelm@28752
   347
  delimiters that surround argument positions as indicated by
wenzelm@28752
   348
  underscores.
wenzelm@28752
   349
wenzelm@28752
   350
  Altogether this determines a production for a context-free priority
wenzelm@28752
   351
  grammar, where for each argument @{text "i"} the syntactic category
wenzelm@28752
   352
  is determined by @{text "\<tau>\<^sub>i"} (with priority @{text "p\<^sub>i"}), and
wenzelm@28752
   353
  the result category is determined from @{text "\<tau>"} (with
wenzelm@28752
   354
  priority @{text "p"}).  Priority specifications are optional, with
wenzelm@28752
   355
  default 0 for arguments and 1000 for the result.
wenzelm@28752
   356
wenzelm@28752
   357
  Since @{text "\<tau>"} may be again a function type, the constant
wenzelm@28752
   358
  type scheme may have more argument positions than the mixfix
wenzelm@28752
   359
  pattern.  Printing a nested application @{text "c t\<^sub>1 \<dots> t\<^sub>m"} for
wenzelm@28752
   360
  @{text "m > n"} works by attaching concrete notation only to the
wenzelm@28752
   361
  innermost part, essentially by printing @{text "(c t\<^sub>1 \<dots> t\<^sub>n) \<dots> t\<^sub>m"}
wenzelm@28752
   362
  instead.  If a term has fewer arguments than specified in the mixfix
wenzelm@28752
   363
  template, the concrete syntax is ignored.
wenzelm@28752
   364
wenzelm@28752
   365
  \medskip A mixfix template may also contain additional directives
wenzelm@28752
   366
  for pretty printing, notably spaces, blocks, and breaks.  The
wenzelm@28752
   367
  general template format is a sequence over any of the following
wenzelm@28752
   368
  entities.
wenzelm@28752
   369
wenzelm@28752
   370
  \begin{itemize}
wenzelm@28752
   371
wenzelm@28752
   372
  \item @{text "\<^bold>d"} is a delimiter, namely a non-empty
wenzelm@28752
   373
  sequence of characters other than the special characters @{text "'"}
wenzelm@28752
   374
  (single quote), @{text "_"} (underscore), @{text "\<index>"} (index
wenzelm@28752
   375
  symbol), @{text "/"} (slash), @{text "("} and @{text ")"}
wenzelm@28752
   376
  (parentheses).
wenzelm@28752
   377
wenzelm@28752
   378
  A single quote escapes the special meaning of these meta-characters,
wenzelm@28752
   379
  producing a literal version of the following character, unless that
wenzelm@28752
   380
  is a blank.  A single quote followed by a blank separates
wenzelm@28752
   381
  delimiters, without affecting printing, but input tokens may have
wenzelm@28752
   382
  additional white space here.
wenzelm@28752
   383
wenzelm@28752
   384
  \item @{text "_"} is an argument position, which stands for a
wenzelm@28752
   385
  certain syntactic category in the underlying grammar.
wenzelm@28752
   386
wenzelm@28752
   387
  \item @{text "\<index>"} is an indexed argument position; this is
wenzelm@28752
   388
  the place where implicit structure arguments can be attached.
wenzelm@28752
   389
wenzelm@28752
   390
  \item @{text "\<^bold>s"} is a non-empty sequence of spaces for
wenzelm@28752
   391
  printing.  This and the following specifications do not affect
wenzelm@28752
   392
  parsing at all.
wenzelm@28752
   393
wenzelm@28752
   394
  \item @{text "(\<^bold>n"} opens a pretty printing block.  The
wenzelm@28752
   395
  optional number specifies how much indentation to add when a line
wenzelm@28752
   396
  break occurs within the block.  If the parenthesis is not followed
wenzelm@28752
   397
  by digits, the indentation defaults to 0.  A block specified via
wenzelm@28752
   398
  @{text "(00"} is unbreakable.
wenzelm@28752
   399
wenzelm@28752
   400
  \item @{text ")"} closes a pretty printing block.
wenzelm@28752
   401
wenzelm@28752
   402
  \item @{text "//"} forces a line break.
wenzelm@28752
   403
wenzelm@28752
   404
  \item @{text "/\<^bold>s"} allows a line break.  Here @{text
wenzelm@28752
   405
  "\<^bold>s"} stands for the string of spaces (zero or more) right
wenzelm@28752
   406
  after the slash.  These spaces are printed if the break is
wenzelm@28752
   407
  \emph{not} taken.
wenzelm@28752
   408
wenzelm@28752
   409
  \end{itemize}
wenzelm@28752
   410
wenzelm@28752
   411
  For example, the template @{text "(_ +/ _)"} specifies an infix
wenzelm@28752
   412
  operator.  There are two argument positions; the delimiter @{text
wenzelm@28752
   413
  "+"} is preceded by a space and followed by a space or line break;
wenzelm@28752
   414
  the entire phrase is a pretty printing block.
wenzelm@28752
   415
wenzelm@28752
   416
  The general idea of pretty printing with blocks and breaks is also
wenzelm@28752
   417
  described in \cite{paulson-ml2}.
wenzelm@27037
   418
*}
wenzelm@27037
   419
wenzelm@27037
   420
wenzelm@27037
   421
subsection {* Proof methods \label{sec:syn-meth} *}
wenzelm@27037
   422
wenzelm@27037
   423
text {*
wenzelm@27037
   424
  Proof methods are either basic ones, or expressions composed of
wenzelm@27037
   425
  methods via ``@{verbatim ","}'' (sequential composition),
wenzelm@27037
   426
  ``@{verbatim "|"}'' (alternative choices), ``@{verbatim "?"}'' 
wenzelm@27037
   427
  (try), ``@{verbatim "+"}'' (repeat at least once), ``@{verbatim
wenzelm@27037
   428
  "["}@{text n}@{verbatim "]"}'' (restriction to first @{text n}
wenzelm@27037
   429
  sub-goals, with default @{text "n = 1"}).  In practice, proof
wenzelm@27037
   430
  methods are usually just a comma separated list of
wenzelm@27037
   431
  \railqtok{nameref}~\railnonterm{args} specifications.  Note that
wenzelm@27037
   432
  parentheses may be dropped for single method specifications (with no
wenzelm@27037
   433
  arguments).
wenzelm@27037
   434
wenzelm@27037
   435
  \indexouternonterm{method}
wenzelm@27037
   436
  \begin{rail}
wenzelm@27037
   437
    method: (nameref | '(' methods ')') (() | '?' | '+' | '[' nat? ']')
wenzelm@27037
   438
    ;
wenzelm@27037
   439
    methods: (nameref args | method) + (',' | '|')
wenzelm@27037
   440
    ;
wenzelm@27037
   441
  \end{rail}
wenzelm@27037
   442
wenzelm@27037
   443
  Proper Isar proof methods do \emph{not} admit arbitrary goal
wenzelm@27037
   444
  addressing, but refer either to the first sub-goal or all sub-goals
wenzelm@27037
   445
  uniformly.  The goal restriction operator ``@{text "[n]"}''
wenzelm@27037
   446
  evaluates a method expression within a sandbox consisting of the
wenzelm@27037
   447
  first @{text n} sub-goals (which need to exist).  For example, the
wenzelm@27037
   448
  method ``@{text "simp_all[3]"}'' simplifies the first three
wenzelm@27037
   449
  sub-goals, while ``@{text "(rule foo, simp_all)[]"}'' simplifies all
wenzelm@27037
   450
  new goals that emerge from applying rule @{text "foo"} to the
wenzelm@27037
   451
  originally first one.
wenzelm@27037
   452
wenzelm@27037
   453
  Improper methods, notably tactic emulations, offer a separate
wenzelm@27037
   454
  low-level goal addressing scheme as explicit argument to the
wenzelm@27037
   455
  individual tactic being involved.  Here ``@{text "[!]"}'' refers to
wenzelm@27037
   456
  all goals, and ``@{text "[n-]"}'' to all goals starting from @{text
wenzelm@27037
   457
  "n"}.
wenzelm@27037
   458
wenzelm@27037
   459
  \indexouternonterm{goalspec}
wenzelm@27037
   460
  \begin{rail}
wenzelm@27037
   461
    goalspec: '[' (nat '-' nat | nat '-' | nat | '!' ) ']'
wenzelm@27037
   462
    ;
wenzelm@27037
   463
  \end{rail}
wenzelm@27037
   464
*}
wenzelm@27037
   465
wenzelm@27037
   466
wenzelm@27037
   467
subsection {* Attributes and theorems \label{sec:syn-att} *}
wenzelm@27037
   468
wenzelm@27037
   469
text {*
wenzelm@27037
   470
  Attributes (and proof methods, see \secref{sec:syn-meth}) have their
wenzelm@27037
   471
  own ``semi-inner'' syntax, in the sense that input conforming to
wenzelm@27037
   472
  \railnonterm{args} below is parsed by the attribute a second time.
wenzelm@27037
   473
  The attribute argument specifications may be any sequence of atomic
wenzelm@27037
   474
  entities (identifiers, strings etc.), or properly bracketed argument
wenzelm@27037
   475
  lists.  Below \railqtok{atom} refers to any atomic entity, including
wenzelm@27037
   476
  any \railtok{keyword} conforming to \railtok{symident}.
wenzelm@27037
   477
wenzelm@27037
   478
  \indexoutertoken{atom}\indexouternonterm{args}\indexouternonterm{attributes}
wenzelm@27037
   479
  \begin{rail}
wenzelm@27037
   480
    atom: nameref | typefree | typevar | var | nat | keyword
wenzelm@27037
   481
    ;
wenzelm@27037
   482
    arg: atom | '(' args ')' | '[' args ']'
wenzelm@27037
   483
    ;
wenzelm@27037
   484
    args: arg *
wenzelm@27037
   485
    ;
wenzelm@27037
   486
    attributes: '[' (nameref args * ',') ']'
wenzelm@27037
   487
    ;
wenzelm@27037
   488
  \end{rail}
wenzelm@27037
   489
wenzelm@27037
   490
  Theorem specifications come in several flavors:
wenzelm@27037
   491
  \railnonterm{axmdecl} and \railnonterm{thmdecl} usually refer to
wenzelm@27037
   492
  axioms, assumptions or results of goal statements, while
wenzelm@27037
   493
  \railnonterm{thmdef} collects lists of existing theorems.  Existing
wenzelm@27037
   494
  theorems are given by \railnonterm{thmref} and
wenzelm@27037
   495
  \railnonterm{thmrefs}, the former requires an actual singleton
wenzelm@27037
   496
  result.
wenzelm@27037
   497
wenzelm@27037
   498
  There are three forms of theorem references:
wenzelm@27037
   499
  \begin{enumerate}
wenzelm@27037
   500
  
wenzelm@27037
   501
  \item named facts @{text "a"},
wenzelm@27037
   502
wenzelm@27037
   503
  \item selections from named facts @{text "a(i)"} or @{text "a(j - k)"},
wenzelm@27037
   504
wenzelm@27037
   505
  \item literal fact propositions using @{syntax_ref altstring} syntax
wenzelm@27037
   506
  @{verbatim "`"}@{text "\<phi>"}@{verbatim "`"} (see also method
wenzelm@27037
   507
  @{method_ref fact} in \secref{sec:pure-meth-att}).
wenzelm@27037
   508
wenzelm@27037
   509
  \end{enumerate}
wenzelm@27037
   510
wenzelm@27037
   511
  Any kind of theorem specification may include lists of attributes
wenzelm@27037
   512
  both on the left and right hand sides; attributes are applied to any
wenzelm@27037
   513
  immediately preceding fact.  If names are omitted, the theorems are
wenzelm@27037
   514
  not stored within the theorem database of the theory or proof
wenzelm@27037
   515
  context, but any given attributes are applied nonetheless.
wenzelm@27037
   516
wenzelm@27037
   517
  An extra pair of brackets around attributes (like ``@{text
wenzelm@27037
   518
  "[[simproc a]]"}'') abbreviates a theorem reference involving an
wenzelm@27037
   519
  internal dummy fact, which will be ignored later on.  So only the
wenzelm@27037
   520
  effect of the attribute on the background context will persist.
wenzelm@27037
   521
  This form of in-place declarations is particularly useful with
wenzelm@27037
   522
  commands like @{command "declare"} and @{command "using"}.
wenzelm@27037
   523
wenzelm@27037
   524
  \indexouternonterm{axmdecl}\indexouternonterm{thmdecl}
wenzelm@27037
   525
  \indexouternonterm{thmdef}\indexouternonterm{thmref}
wenzelm@27037
   526
  \indexouternonterm{thmrefs}\indexouternonterm{selection}
wenzelm@27037
   527
  \begin{rail}
wenzelm@27037
   528
    axmdecl: name attributes? ':'
wenzelm@27037
   529
    ;
wenzelm@27037
   530
    thmdecl: thmbind ':'
wenzelm@27037
   531
    ;
wenzelm@27037
   532
    thmdef: thmbind '='
wenzelm@27037
   533
    ;
wenzelm@27037
   534
    thmref: (nameref selection? | altstring) attributes? | '[' attributes ']'
wenzelm@27037
   535
    ;
wenzelm@27037
   536
    thmrefs: thmref +
wenzelm@27037
   537
    ;
wenzelm@27037
   538
wenzelm@27037
   539
    thmbind: name attributes | name | attributes
wenzelm@27037
   540
    ;
wenzelm@27037
   541
    selection: '(' ((nat | nat '-' nat?) + ',') ')'
wenzelm@27037
   542
    ;
wenzelm@27037
   543
  \end{rail}
wenzelm@27037
   544
*}
wenzelm@27037
   545
wenzelm@27037
   546
end