(* $Id$ *)

theory Outer_Syntax
imports Main
begin

chapter {* Outer syntax *}

text {*
  The rather generic framework of Isabelle/Isar syntax emerges from
  three main syntactic categories: \emph{commands} of the top-level
  Isar engine (covering theory and proof elements), \emph{methods} for
  general goal refinements (analogous to traditional ``tactics''), and
  \emph{attributes} for operations on facts (within a certain
  context).  Subsequently we give a reference of basic syntactic
  entities underlying Isabelle/Isar syntax in a bottom-up manner.
  Concrete theory and proof language elements will be introduced later
  on.

  \medskip In order to get started with writing well-formed
  Isabelle/Isar documents, the most important aspect to be noted is
  the difference between \emph{inner} and \emph{outer} syntax.  Inner
  syntax is that of Isabelle types and terms of the logic, while outer
  syntax is that of Isabelle/Isar theory sources (specifications and
  proofs).  As a general rule, inner syntax entities may occur only as
  \emph{atomic entities} within outer syntax.  For example, the string
  @{verbatim "\"x + y\""} and identifier @{verbatim z} are legal term
  specifications within a theory, while @{verbatim "x + y"} without
  quotes is not.

  Printed theory documents usually omit quotes to gain readability
  (this is a matter of {\LaTeX} macro setup, say via @{verbatim
  "\\isabellestyle"}, see also \cite{isabelle-sys}).  Experienced
  users of Isabelle/Isar may easily reconstruct the lost technical
  information, while mere readers need not care about quotes at all.

  \medskip Isabelle/Isar input may contain any number of input
  termination characters ``@{verbatim ";"}'' (semicolon) to separate
  commands explicitly.  This is particularly useful in interactive
  shell sessions to make clear where the current command is intended
  to end.  Otherwise, the interpreter loop will continue to issue a
  secondary prompt ``@{verbatim "#"}'' until an end-of-command is
  clearly recognized from the input syntax, e.g.\ on encountering the
  next command keyword.

  More advanced interfaces such as Proof~General \cite{proofgeneral}
  do not require explicit semicolons: the amount of input text is
  determined automatically by inspecting the present content of the
  Emacs text buffer.  In the printed presentation of Isabelle/Isar
  documents semicolons are omitted altogether for readability.

  \begin{warn}
    Proof~General requires certain syntax classification tables in
    order to achieve properly synchronized interaction with the
    Isabelle/Isar process.  These tables need to be consistent with
    the Isabelle version and particular logic image to be used in a
    running session (common object-logics may well change the outer
    syntax).  The standard setup should work correctly with any of the
    ``official'' logic images derived from Isabelle/HOL (including
    HOLCF etc.).  Users of alternative logics may need to tell
    Proof~General explicitly, e.g.\ by giving an option @{verbatim "-k ZF"}
    (in conjunction with @{verbatim "-l ZF"}, to specify the default
    logic image).  Note that option @{verbatim "-L"} does both
    of these at the same time.
  \end{warn}
*}
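
text {*
  \medskip The distinction between inner and outer syntax can be
  illustrated by a minimal sketch, assuming an Isabelle/HOL session
  and the diagnostic command @{command "term"}; the variable names
  are chosen for illustration only.  The first command passes an
  atomic identifier, the second one a quoted string.  The optional
  semicolons merely mark the end of each command.
*}

term x;
term "x + y";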

section {* Lexical matters \label{sec:outer-lex} *}

text {* The outer lexical syntax consists of three main categories of
  syntax tokens:

  \begin{enumerate}

  \item \emph{major keywords} --- the command names that are available
  in the present logic session;

  \item \emph{minor keywords} --- additional literal tokens required
  by the syntax of commands;

  \item \emph{named tokens} --- various categories of identifiers etc.

  \end{enumerate}

  Major keywords and minor keywords are guaranteed to be disjoint.
  This helps user-interfaces to determine the overall structure of a
  theory text, without knowing the full details of command syntax.
  Internally, there is some additional information about the kind of
  major keywords, which approximates the command type (theory command,
  proof command etc.).

  Keywords override named tokens.  For example, the presence of a
  command called @{verbatim term} inhibits the identifier @{verbatim
  term}, but the string @{verbatim "\"term\""} can be used instead.
  By convention, the outer syntax always allows quoted strings in
  addition to identifiers, wherever a named entity is expected.

  When tokenizing a given input sequence, the lexer repeatedly takes
  the longest prefix of the input that forms a valid token.  Spaces,
  tabs, newlines and formfeeds between tokens serve as explicit
  separators.

  \medskip The categories for named tokens are defined once and for
  all as follows.

  \begin{center}
  \begin{supertabular}{rcl}
    @{syntax_def ident} & = & @{text "letter quasiletter\<^sup>*"} \\
    @{syntax_def longident} & = & @{text "ident("}@{verbatim "."}@{text "ident)\<^sup>+"} \\
    @{syntax_def symident} & = & @{text "sym\<^sup>+  |  "}@{verbatim "\\"}@{verbatim "<"}@{text ident}@{verbatim ">"} \\
    @{syntax_def nat} & = & @{text "digit\<^sup>+"} \\
    @{syntax_def var} & = & @{verbatim "?"}@{text "ident  |  "}@{verbatim "?"}@{text ident}@{verbatim "."}@{text nat} \\
    @{syntax_def typefree} & = & @{verbatim "'"}@{text ident} \\
    @{syntax_def typevar} & = & @{verbatim "?"}@{text "typefree  |  "}@{verbatim "?"}@{text typefree}@{verbatim "."}@{text nat} \\
    @{syntax_def string} & = & @{verbatim "\""} @{text "\<dots>"} @{verbatim "\""} \\
    @{syntax_def altstring} & = & @{verbatim "`"} @{text "\<dots>"} @{verbatim "`"} \\
    @{syntax_def verbatim} & = & @{verbatim "{*"} @{text "\<dots>"} @{verbatim "*"}@{verbatim "}"} \\[1ex]

    @{text letter} & = & @{text "latin  |  "}@{verbatim "\\"}@{verbatim "<"}@{text latin}@{verbatim ">"}@{text "  |  "}@{verbatim "\\"}@{verbatim "<"}@{text "latin latin"}@{verbatim ">"}@{text "  |  greek  |"} \\
          &   & @{verbatim "\<^isub>"}@{text "  |  "}@{verbatim "\<^isup>"} \\
    @{text quasiletter} & = & @{text "letter  |  digit  |  "}@{verbatim "_"}@{text "  |  "}@{verbatim "'"} \\
    @{text latin} & = & @{verbatim a}@{text "  | \<dots> |  "}@{verbatim z}@{text "  |  "}@{verbatim A}@{text "  |  \<dots> |  "}@{verbatim Z} \\
    @{text digit} & = & @{verbatim "0"}@{text "  |  \<dots> |  "}@{verbatim "9"} \\
    @{text sym} & = & @{verbatim "!"}@{text "  |  "}@{verbatim "#"}@{text "  |  "}@{verbatim "$"}@{text "  |  "}@{verbatim "%"}@{text "  |  "}@{verbatim "&"}@{text "  |  "}@{verbatim "*"}@{text "  |  "}@{verbatim "+"}@{text "  |  "}@{verbatim "-"}@{text "  |  "}@{verbatim "/"}@{text "  |"} \\
    & & @{verbatim "<"}@{text "  |  "}@{verbatim "="}@{text "  |  "}@{verbatim ">"}@{text "  |  "}@{verbatim "?"}@{text "  |  "}@{verbatim "@"}@{text "  |  "}@{verbatim "^"}@{text "  |  "}@{verbatim "_"}@{text "  |  "}@{verbatim "|"}@{text "  |  "}@{verbatim "~"} \\
    @{text greek} & = & @{verbatim "\<alpha>"}@{text "  |  "}@{verbatim "\<beta>"}@{text "  |  "}@{verbatim "\<gamma>"}@{text "  |  "}@{verbatim "\<delta>"}@{text "  |"} \\
          &   & @{verbatim "\<epsilon>"}@{text "  |  "}@{verbatim "\<zeta>"}@{text "  |  "}@{verbatim "\<eta>"}@{text "  |  "}@{verbatim "\<theta>"}@{text "  |"} \\
          &   & @{verbatim "\<iota>"}@{text "  |  "}@{verbatim "\<kappa>"}@{text "  |  "}@{verbatim "\<mu>"}@{text "  |  "}@{verbatim "\<nu>"}@{text "  |"} \\
          &   & @{verbatim "\<xi>"}@{text "  |  "}@{verbatim "\<pi>"}@{text "  |  "}@{verbatim "\<rho>"}@{text "  |  "}@{verbatim "\<sigma>"}@{text "  |  "}@{verbatim "\<tau>"}@{text "  |"} \\
          &   & @{verbatim "\<upsilon>"}@{text "  |  "}@{verbatim "\<phi>"}@{text "  |  "}@{verbatim "\<chi>"}@{text "  |  "}@{verbatim "\<psi>"}@{text "  |"} \\
          &   & @{verbatim "\<omega>"}@{text "  |  "}@{verbatim "\<Gamma>"}@{text "  |  "}@{verbatim "\<Delta>"}@{text "  |  "}@{verbatim "\<Theta>"}@{text "  |"} \\
          &   & @{verbatim "\<Lambda>"}@{text "  |  "}@{verbatim "\<Xi>"}@{text "  |  "}@{verbatim "\<Pi>"}@{text "  |  "}@{verbatim "\<Sigma>"}@{text "  |"} \\
          &   & @{verbatim "\<Upsilon>"}@{text "  |  "}@{verbatim "\<Phi>"}@{text "  |  "}@{verbatim "\<Psi>"}@{text "  |  "}@{verbatim "\<Omega>"} \\
  \end{supertabular}
  \end{center}

  The syntax of @{syntax string} admits any characters, including
  newlines; ``@{verbatim "\""}'' (double-quote) and ``@{verbatim
  "\\"}'' (backslash) need to be escaped by a backslash; arbitrary
  character codes may be specified as ``@{verbatim "\\"}@{text ddd}'',
  with three decimal digits.  Alternative strings according to
  @{syntax altstring} are analogous, using single back-quotes instead.
  The body of @{syntax verbatim} may consist of any text not
  containing ``@{verbatim "*"}@{verbatim "}"}''; this allows
  convenient inclusion of quotes without further escapes.  The greek
  letters do \emph{not} include @{verbatim "\<lambda>"}, which is already used
  differently in the meta-logic.

  Common mathematical symbols such as @{text \<forall>} are represented in
  Isabelle as @{verbatim \<forall>}.  There are infinitely many Isabelle
  symbols like this, although proper presentation is left to front-end
  tools such as {\LaTeX} or Proof~General with the X-Symbol package.
  A list of standard Isabelle symbols that work well with these tools
  is given in \cite[appendix~A]{isabelle-sys}.

  Source comments take the form @{verbatim "(*"}~@{text
  "\<dots>"}~@{verbatim "*)"} and may be nested, although the user-interface
  might prevent this.  Note that this form indicates source comments
  only, which are stripped after lexical analysis of the input.  The
  Isar syntax also provides proper \emph{document comments} that are
  considered as part of the text (see \secref{sec:comments}).
*}
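
text {*
  \medskip The following sketch illustrates some of the lexical
  conventions above, assuming an Isabelle/HOL session.  Since
  @{verbatim term} is a command keyword, a free variable of that name
  has to be quoted; source comments may be nested; mathematical
  symbols occur within quoted inner syntax.  The variable and
  predicate names are chosen for illustration only.
*}

term "term"

(* a source comment (* with a nested comment *) that ends here *)

term "\<forall>x. P x"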

section {* Common syntax entities *}

text {*
  We now introduce several basic syntactic entities, such as names,
  terms, and theorem specifications, which are factored out of the
  actual Isar language elements to be described later.
*}

subsection {* Names *}

text {*
  Entity \railqtok{name} usually refers to any name of types,
  constants, theorems etc.\ that are to be \emph{declared} or
  \emph{defined} (so qualified identifiers are excluded here).  Quoted
  strings provide an escape for non-identifier names or those ruled
  out by outer syntax keywords (e.g.\ quoted @{verbatim "\"let\""}).
  Already existing objects are usually referenced by
  \railqtok{nameref}.

  \indexoutertoken{name}\indexoutertoken{parname}\indexoutertoken{nameref}
  \indexoutertoken{int}
  \begin{rail}
    name: ident | symident | string | nat
    ;
    parname: '(' name ')'
    ;
    nameref: name | longident
    ;
    int: nat | '-' nat
    ;
  \end{rail}
*}
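
text {*
  \medskip As a small illustration, assuming Isabelle/HOL, the
  following diagnostic commands refer to the existing theorem
  @{text refl} via \railqtok{nameref}, first as a plain identifier
  and then as a qualified \railtok{longident}.
*}

thm refl
thm HOL.refl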

subsection {* Comments \label{sec:comments} *}

text {*
  Large chunks of plain \railqtok{text} are usually given
  \railtok{verbatim}, i.e.\ enclosed in @{verbatim "{"}@{verbatim
  "*"}~@{text "\<dots>"}~@{verbatim "*"}@{verbatim "}"}.  For convenience,
  any of the smaller text units conforming to \railqtok{nameref} are
  admitted as well.  A marginal \railnonterm{comment} is of the form
  @{verbatim "--"} \railqtok{text}.  Any number of these may occur
  within Isabelle/Isar commands.

  \indexoutertoken{text}\indexouternonterm{comment}
  \begin{rail}
    text: verbatim | nameref
    ;
    comment: '--' text
    ;
  \end{rail}
*}
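
text {*
  \medskip For illustration, the sketch below (assuming Isabelle/HOL)
  gives a marginal comment first as a quoted string and then as
  verbatim text; the lemma itself is merely a placeholder.
*}

lemma "A --> A"  -- "a marginal comment given as string"
  by (rule impI)  -- {* a marginal comment given as verbatim text *}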

subsection {* Type classes, sorts and arities *}

text {*
  Classes are specified by plain names.  Sorts have a very simple
  inner syntax, which is either a single class name @{text c} or a
  list @{text "{c\<^sub>1, \<dots>, c\<^sub>n}"} referring to the
  intersection of these classes.  The syntax of type arities is given
  directly at the outer level.

  \indexouternonterm{sort}\indexouternonterm{arity}
  \indexouternonterm{classdecl}
  \begin{rail}
    classdecl: name (('<' | subseteq) (nameref + ','))?
    ;
    sort: nameref
    ;
    arity: ('(' (sort + ',') ')')? sort
    ;
  \end{rail}
*}
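
text {*
  \medskip A minimal sketch, assuming the type classes @{text order},
  @{text ord} and @{text plus} of Isabelle/HOL: sorts occur within
  quoted inner syntax, either as a single class or as an intersection
  of classes.  The variable names are chosen for illustration only.
*}

term "x :: 'a::order"
term "y :: 'b::{ord, plus}"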

subsection {* Types and terms \label{sec:types-terms} *}

text {*
  The actual inner Isabelle syntax, that of types and terms of the
  logic, is far too sophisticated to be modelled explicitly
  at the outer theory level.  Basically, any such entity has to be
  quoted to turn it into a single token (the parsing and type-checking
  is performed internally later).  For convenience, a slightly more
  liberal convention is adopted: quotes may be omitted for any type or
  term that is already atomic at the outer level.  For example, one
  may just write @{verbatim x} instead of quoted @{verbatim "\"x\""}.
  Note that symbolic identifiers (e.g.\ @{verbatim "++"} or @{text
  "\<forall>"}) are available as well, provided these have not been superseded
  by commands or other keywords already (such as @{verbatim "="} or
  @{verbatim "+"}).

  \indexoutertoken{type}\indexoutertoken{term}\indexoutertoken{prop}
  \begin{rail}
    type: nameref | typefree | typevar
    ;
    term: nameref | var
    ;
    prop: term
    ;
  \end{rail}

  Positional instantiations are indicated by giving a sequence of
  terms, or the placeholder ``@{text _}'' (underscore), which means to
  skip a position.

  \indexoutertoken{inst}\indexoutertoken{insts}
  \begin{rail}
    inst: underscore | term
    ;
    insts: (inst *)
    ;
  \end{rail}

  Type declarations and definitions usually refer to
  \railnonterm{typespec} on the left-hand side.  This models basic
  type constructor application at the outer syntax level.  Note that
  only plain postfix notation is available here, but no infixes.

  \indexouternonterm{typespec}
  \begin{rail}
    typespec: (() | typefree | '(' ( typefree + ',' ) ')') name
    ;
  \end{rail}
*}
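
text {*
  \medskip The following sketch, assuming Isabelle/HOL, illustrates
  \railnonterm{typespec} and positional instantiations.  The type
  names are hypothetical and serve no other purpose.  The attribute
  @{text "of"} is used here as an example of an element that takes
  \railqtok{insts} as argument; the underscore skips the first
  position.
*}

typedecl example_unit
typedecl 'a example_box
typedecl ('a, 'b) example_pair

thm conjI [of A B]
thm conjI [of _ B]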

subsection {* Term patterns and declarations \label{sec:term-decls} *}

text {*
  Wherever explicit propositions (or term fragments) occur in a proof
  text, casual binding of schematic term variables may be
  specified via patterns of the form ``@{text "(\<IS> p\<^sub>1 \<dots>
  p\<^sub>n)"}''.  This works both for \railqtok{term} and \railqtok{prop}.

  \indexouternonterm{termpat}\indexouternonterm{proppat}
  \begin{rail}
    termpat: '(' ('is' term +) ')'
    ;
    proppat: '(' ('is' prop +) ')'
    ;
  \end{rail}

  \medskip Declarations of local variables @{text "x :: \<tau>"} and
  logical propositions @{text "a : \<phi>"} represent different views on
  the same principle of introducing a local scope.  In practice, one
  may usually omit the typing of \railnonterm{vars} (due to
  type-inference), and the naming of propositions (due to implicit
  references of current facts).  In any case, Isar proof elements
  usually allow multiple such items to be introduced simultaneously.

  \indexouternonterm{vars}\indexouternonterm{props}
  \begin{rail}
    vars: (name+) ('::' type)?
    ;
    props: thmdecl? (prop proppat? +)
    ;
  \end{rail}

  The treatment of multiple declarations corresponds to the
  complementary focus of \railnonterm{vars} versus
  \railnonterm{props}.  In ``@{text "x\<^sub>1 \<dots> x\<^sub>n :: \<tau>"}''
  the typing refers to all variables, while in @{text "a: \<phi>\<^sub>1 \<dots>
  \<phi>\<^sub>n"} the naming refers to all propositions collectively.
  Isar language elements that refer to \railnonterm{vars} or
  \railnonterm{props} typically admit separate typings or namings via
  another level of iteration, with explicit @{keyword_ref "and"}
  separators; e.g.\ see @{command "fix"} and @{command "assume"} in
  \secref{sec:proof-context}.
*}
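
text {*
  \medskip As a sketch under the assumption of Isabelle/HOL, the
  statement below declares several variables with a common type,
  names two assumptions separately via @{keyword "and"}, and binds
  the schematic variables @{text "?lhs"} and @{text "?rhs"} by a
  pattern on the conclusion.  All names are chosen for illustration
  only.
*}

lemma
  fixes x y z :: nat
  assumes xy: "x = y" and yz: "y = z"
  shows "x = z" (is "?lhs = ?rhs")
  using xy yz by simp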

subsection {* Attributes and theorems \label{sec:syn-att} *}

text {* Attributes have their own ``semi-inner'' syntax, in the sense
  that input conforming to \railnonterm{args} below is parsed by the
  attribute a second time.  The attribute argument specifications may
  be any sequence of atomic entities (identifiers, strings etc.), or
  properly bracketed argument lists.  Below \railqtok{atom} refers to
  any atomic entity, including any \railtok{keyword} conforming to
  \railtok{symident}.

  \indexoutertoken{atom}\indexouternonterm{args}\indexouternonterm{attributes}
  \begin{rail}
    atom: nameref | typefree | typevar | var | nat | keyword
    ;
    arg: atom | '(' args ')' | '[' args ']'
    ;
    args: arg *
    ;
    attributes: '[' (nameref args * ',') ']'
    ;
  \end{rail}

  Theorem specifications come in several flavors:
  \railnonterm{axmdecl} and \railnonterm{thmdecl} usually refer to
  axioms, assumptions or results of goal statements, while
  \railnonterm{thmdef} collects lists of existing theorems.  Existing
  theorems are given by \railnonterm{thmref} and
  \railnonterm{thmrefs}; the former requires an actual singleton
  result.

  There are three forms of theorem references:
  \begin{enumerate}

  \item named facts @{text "a"},

  \item selections from named facts @{text "a(i)"} or @{text "a(j - k)"},

  \item literal fact propositions using @{syntax_ref altstring} syntax
  @{verbatim "`"}@{text "\<phi>"}@{verbatim "`"} (see also method
  @{method_ref fact}).

  \end{enumerate}

  Any kind of theorem specification may include lists of attributes
  both on the left and right hand sides; attributes are applied to any
  immediately preceding fact.  If names are omitted, the theorems are
  not stored within the theorem database of the theory or proof
  context, but any given attributes are applied nonetheless.

  An extra pair of brackets around attributes (like ``@{text
  "[[simproc a]]"}'') abbreviates a theorem reference involving an
  internal dummy fact, which will be ignored later on.  So only the
  effect of the attribute on the background context will persist.
  This form of in-place declarations is particularly useful with
  commands like @{command "declare"} and @{command "using"}.

  \indexouternonterm{axmdecl}\indexouternonterm{thmdecl}
  \indexouternonterm{thmdef}\indexouternonterm{thmref}
  \indexouternonterm{thmrefs}\indexouternonterm{selection}
  \begin{rail}
    axmdecl: name attributes? ':'
    ;
    thmdecl: thmbind ':'
    ;
    thmdef: thmbind '='
    ;
    thmref: (nameref selection? | altstring) attributes? | '[' attributes ']'
    ;
    thmrefs: thmref +
    ;

    thmbind: name attributes | name | attributes
    ;
    selection: '(' ((nat | nat '-' nat?) + ',') ')'
    ;
  \end{rail}
*}
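
text {*
  \medskip The following sketch, assuming Isabelle/HOL, illustrates
  theorem specifications and references; the fact name
  @{text example_rules} is hypothetical.  It covers a named list of
  facts, selections from it, attributes applied to a theorem
  reference, and a literal fact proposition in back-quotes.
*}

lemmas example_rules = conjI disjI1 disjI2

thm example_rules
thm example_rules(1)
thm example_rules(2-3)
thm example_rules(2-)
thm conjunct1 [OF conjI]

lemma "A ==> A"
proof -
  assume A
  from `A` show A .
qed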

end

author	wenzelm
	Thu, 13 Nov 2008 22:00:12 +0100
changeset 28776	e4090e51b8b9