     1 \documentclass[a4paper,12pt]{article}

     2 \usepackage[T1]{fontenc}

     3 \usepackage{amsmath}

     4 \usepackage{amssymb}

     5 \usepackage[english,french]{babel}

     6 \usepackage{color}

     7 \usepackage{footmisc}

     8 \usepackage{graphicx}

     9 %\usepackage{mathpazo}

    10 \usepackage{multicol}

    11 \usepackage{stmaryrd}

    12 %\usepackage[scaled=.85]{beramono}

    13 \usepackage{../../lib/texinputs/isabelle,../iman,../pdfsetup}

    15 %\oddsidemargin=4.6mm

    16 %\evensidemargin=4.6mm

    17 %\textwidth=150mm

    18 %\topmargin=4.6mm

    19 %\headheight=0mm

    20 %\headsep=0mm

    21 %\textheight=234mm

    23 \def\Colon{\mathord{:\mkern-1.5mu:}}

    24 %\def\lbrakk{\mathopen{\lbrack\mkern-3.25mu\lbrack}}

    25 %\def\rbrakk{\mathclose{\rbrack\mkern-3.255mu\rbrack}}

    26 \def\lparr{\mathopen{(\mkern-4mu\mid}}

    27 \def\rparr{\mathclose{\mid\mkern-4mu)}}

    29 \def\unk{{?}}

    30 \def\undef{(\lambda x.\; \unk)}

    31 %\def\unr{\textit{others}}

    32 \def\unr{\ldots}

    33 \def\Abs#1{\hbox{\rm{\flqq}}{\,#1\,}\hbox{\rm{\frqq}}}

    34 \def\Q{{\smash{\lower.2ex\hbox{$\scriptstyle?$}}}}

    36 \urlstyle{tt}

    38 \begin{document}

    40 \selectlanguage{english}

    42 \title{\includegraphics[scale=0.5]{isabelle_sledgehammer} \\[4ex]

    43 Hammering Away \\[\smallskipamount]

    44 \Large A User's Guide to Sledgehammer for Isabelle/HOL}

    45 \author{\hbox{} \\

    46 Jasmin Christian Blanchette \\

    47 {\normalsize Institut f\"ur Informatik, Technische Universit\"at M\"unchen} \\[4\smallskipamount]

    48 {\normalsize with contributions from} \\[4\smallskipamount]

    49 Lawrence C. Paulson \\

    50 {\normalsize Computer Laboratory, University of Cambridge} \\

    51 \hbox{}}

    53 \maketitle

    55 \tableofcontents

    57 \setlength{\parskip}{.7em plus .2em minus .1em}

    58 \setlength{\parindent}{0pt}

    59 \setlength{\abovedisplayskip}{\parskip}

    60 \setlength{\abovedisplayshortskip}{.9\parskip}

    61 \setlength{\belowdisplayskip}{\parskip}

    62 \setlength{\belowdisplayshortskip}{.9\parskip}

    64 % General-purpose enum environment with correct spacing

    65 \newenvironment{enum}%

    66     {\begin{list}{}{%

    67         \setlength{\topsep}{.1\parskip}%

    68         \setlength{\partopsep}{.1\parskip}%

    69         \setlength{\itemsep}{\parskip}%

    70         \advance\itemsep by-\parsep}}

    71     {\end{list}}

    73 \def\pre{\begingroup\vskip0pt plus1ex\advance\leftskip by\leftmargin

    74 \advance\rightskip by\leftmargin}

    75 \def\post{\vskip0pt plus1ex\endgroup}

    77 \def\prew{\pre\advance\rightskip by-\leftmargin}

    78 \def\postw{\post}

    80 \section{Introduction}

    81 \label{introduction}

    83 Sledgehammer is a tool that applies automatic theorem provers (ATPs)

    84 and satisfiability-modulo-theories (SMT) solvers to the current goal. The

    85 supported ATPs are E \cite{schulz-2002}, LEO-II \cite{leo2}, Satallax

    86 \cite{satallax}, SInE-E \cite{sine}, SNARK \cite{snark}, SPASS

    87 \cite{weidenbach-et-al-2009}, ToFoF-E \cite{tofof}, Vampire

    88 \cite{riazanov-voronkov-2002}, and Waldmeister \cite{waldmeister}. The ATPs are

    89 run either locally or remotely via the System\-On\-TPTP web service

    90 \cite{sutcliffe-2000}. In addition to the ATPs, the SMT solver Z3 \cite{z3} is

    91 used by default, and you can tell Sledgehammer to try CVC3 \cite{cvc3} and Yices

    92 \cite{yices} as well; these are run either locally or on a server at the TU

    93 M\"unchen.

    95 The problem passed to the automatic provers consists of your current goal

    96 together with a heuristic selection of hundreds of facts (theorems) from the

    97 current theory context, filtered by relevance. Because jobs are run in the

    98 background, you can continue to work on your proof by other means. Provers can

    99 be run in parallel. Any reply (which may arrive half a minute later) will appear

   100 in the Proof General response buffer.

   102 The result of a successful proof search is some source text that usually (but

   103 not always) reconstructs the proof within Isabelle. For ATPs, the reconstructed

   104 proof relies on the general-purpose Metis prover \cite{metis}, which is fully

   105 integrated into Isabelle/HOL, with explicit inferences going through the kernel.

   106 Thus its results are correct by construction.

   108 In this manual, we will explicitly invoke the \textbf{sledgehammer} command.

   109 Sledgehammer also provides an automatic mode that can be enabled via the

   110 ``Auto Sledgehammer'' option from the ``Isabelle'' menu in Proof General. In

   111 this mode, Sledgehammer is run on every newly entered theorem. The time limit

   112 for Auto Sledgehammer and other automatic tools can be set using the ``Auto

   113 Tools Time Limit'' option.

   115 \newbox\boxA

   116 \setbox\boxA=\hbox{\texttt{nospam}}

   118 \newcommand\authoremail{\texttt{blan{\color{white}nospam}\kern-\wd\boxA{}chette@\allowbreak

   119 in.\allowbreak tum.\allowbreak de}}

   121 To run Sledgehammer, you must make sure that the theory \textit{Sledgehammer} is

   122 imported---this is rarely a problem in practice since it is part of

   123 \textit{Main}. Examples of Sledgehammer use can be found in Isabelle's

   124 \texttt{src/HOL/Metis\_Examples} directory.

   125 Comments and bug reports concerning Sledgehammer or this manual should be

   126 directed to the author at \authoremail.

   128 \vskip2.5\smallskipamount

   130 %\textbf{Acknowledgment.} The author would like to thank Mark Summerfield for

   131 %suggesting several textual improvements.

   133 \section{Installation}

   134 \label{installation}

   136 Sledgehammer is part of Isabelle, so you don't need to install it. However, it

   137 relies on third-party automatic theorem provers (ATPs) and SMT solvers.

   139 \subsection{Installing ATPs}

   141 Currently, E, SPASS, and Vampire can be run locally; in addition, E, Vampire,

   142 LEO-II, Satallax, SInE-E, SNARK, ToFoF-E, and Waldmeister are available remotely

   143 via System\-On\-TPTP \cite{sutcliffe-2000}. If you want better performance, you

   144 should at least install E and SPASS locally.

   146 There are three main ways to install ATPs on your machine:

   148 \begin{enum}

   149 \item[$\bullet$] If you installed an official Isabelle package with everything

   150 inside, it should already include properly set up executables for E and SPASS,

   151 ready to use.%

   152 \footnote{Vampire's license prevents us from doing the same for this otherwise

   153 wonderful tool.}

   155 \item[$\bullet$] Alternatively, you can download the Isabelle-aware E and SPASS

   156 binary packages from Isabelle's download page. Extract the archives, then add a

   157 line to your \texttt{\$ISABELLE\_HOME\_USER/etc/components}%

   158 \footnote{The variable \texttt{\$ISABELLE\_HOME\_USER} is set by Isabelle at

   159 startup. Its value can be retrieved by invoking \texttt{isabelle}

   160 \texttt{getenv} \texttt{ISABELLE\_HOME\_USER} on the command line.}

   161 file with the absolute

   162 path to E or SPASS. For example, if the \texttt{components} file does not exist yet

   163 and you extracted SPASS to \texttt{/usr/local/spass-3.7}, create the

   164 \texttt{components} file with the single line

   166 \prew

   167 \texttt{/usr/local/spass-3.7}

   168 \postw

   170 in it.

   172 \item[$\bullet$] If you prefer to build E or SPASS yourself, or obtained a

   173 Vampire executable from somewhere (e.g., \url{http://www.vprover.org/}),

   174 set the environment variable \texttt{E\_HOME}, \texttt{SPASS\_HOME}, or

   175 \texttt{VAMPIRE\_HOME} to the directory that contains the \texttt{eproof},

   176 \texttt{SPASS}, or \texttt{vampire} executable. Sledgehammer has been tested

   177 with E 1.0 and 1.2, SPASS 3.5 and 3.7, and Vampire 0.6 and 1.0%

   178 \footnote{Following the rewrite of Vampire, the counter for version numbers was

   179 reset to 0; hence the (new) Vampire versions 0.6 and 1.0 are more recent than,

   180 say, Vampire 11.5.}%

   181 . Since the ATPs' output formats are neither documented nor stable, other

   182 versions of the ATPs might or might not work well with Sledgehammer. Ideally,

   183 also set \texttt{E\_VERSION}, \texttt{SPASS\_VERSION}, or

   184 \texttt{VAMPIRE\_VERSION} to the ATP's version number (e.g., ``1.2'').

   185 \end{enum}
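
As an illustration of the third approach, here is a minimal sketch of the lines
that could be added to \texttt{\$ISABELLE\_HOME\_USER/etc/settings} (or set in
the environment in which Isabelle is launched). The directories shown are
hypothetical and must be replaced by the locations of your own executables;
the version numbers are taken from the tested versions listed above:

\prew
\texttt{E\_HOME=/usr/local/eprover} \\
\texttt{E\_VERSION=1.2} \\
\texttt{SPASS\_HOME=/usr/local/spass-3.7/bin} \\
\texttt{SPASS\_VERSION=3.7}
\postw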

   187 To check whether E and SPASS are successfully installed, follow the example in

   188 \S\ref{first-steps}. If the remote versions of E and SPASS are used (identified

   189 by the prefix ``\emph{remote\_}''), or if the local versions fail to solve the

   190 easy goal presented there, this is a sign that something is wrong with your

   191 installation.

   193 Remote ATP invocation via the SystemOnTPTP web service requires Perl with the

   194 World Wide Web Library (\texttt{libwww-perl}) installed. If you must use a proxy

   195 server to access the Internet, set the \texttt{http\_proxy} environment variable

   196 to the proxy, either in the environment in which Isabelle is launched or in your

   197 \texttt{\$ISABELLE\_HOME\_USER/etc/settings} file. Here are a few examples:

   199 \prew

   200 \texttt{http\_proxy=http://proxy.example.org} \\

   201 \texttt{http\_proxy=http://proxy.example.org:8080} \\

   202 \texttt{http\_proxy=http://joeblow:pAsSwRd@proxy.example.org}

   203 \postw

   205 \subsection{Installing SMT Solvers}

   207 CVC3, Yices, and Z3 can be run locally or (for CVC3 and Z3) remotely on a TU

   208 M\"unchen server. If you want better performance and the ability to replay

   209 proofs that rely on the \emph{smt} proof method, you should at least install Z3

   210 locally.

   212 There are two main ways of installing SMT solvers locally.

   214 \begin{enum}

   215 \item[$\bullet$] If you installed an official Isabelle package with everything

   216 inside, it should already include properly set up executables for CVC3 and Z3,

   217 ready to use.%

   218 \footnote{Yices's license prevents us from doing the same for this otherwise

   219 wonderful tool.}

   220 For Z3, you additionally need to set the environment variable

   221 \texttt{Z3\_NON\_COMMERCIAL} to ``yes'' to confirm that you are a noncommercial

   222 user.

   224 \item[$\bullet$] Otherwise, follow the instructions documented in the \emph{SMT}

   225 theory (\texttt{\$ISABELLE\_HOME/src/HOL/SMT.thy}).

   226 \end{enum}
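
For example, assuming you qualify as a noncommercial user, the Z3 confirmation
mentioned above could be recorded as a single line in your
\texttt{\$ISABELLE\_HOME\_USER/etc/settings} file (setting the variable in the
environment in which Isabelle is launched should work equally well):

\prew
\texttt{Z3\_NON\_COMMERCIAL=yes}
\postw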

   228 \section{First Steps}

   229 \label{first-steps}

   231 To illustrate Sledgehammer in context, let us start a theory file and

   232 attempt to prove a simple lemma:

   234 \prew

   235 \textbf{theory}~\textit{Scratch} \\

   236 \textbf{imports}~\textit{Main} \\

   237 \textbf{begin} \\[2\smallskipamount]

   238 %

   239 \textbf{lemma} ``$[a] = [b] \,\Longrightarrow\, a = b$'' \\

   240 \textbf{sledgehammer}

   241 \postw

   243 Instead of issuing the \textbf{sledgehammer} command, you can also find

   244 Sledgehammer in the ``Commands'' submenu of the ``Isabelle'' menu in Proof

   245 General or press the Emacs key sequence C-c C-a C-s.

   246 Either way, Sledgehammer produces the following output after a few seconds:

   248 \prew

   249 \slshape

   250 Sledgehammer: ``\textit{e}'' on goal: \\

   251 $[a] = [b] \,\Longrightarrow\, a = b$ \\

   252 Try this: \textbf{by} (\textit{metis last\_ConsL}). \\

   253 To minimize: \textbf{sledgehammer} \textit{min} [\textit{e}] (\textit{last\_ConsL}). \\[3\smallskipamount]

   254 %

   255 Sledgehammer: ``\textit{vampire}'' on goal: \\

   256 $[a] = [b] \,\Longrightarrow\, a = b$ \\

   257 Try this: \textbf{by} (\textit{metis hd.simps}). \\

   258 To minimize: \textbf{sledgehammer} \textit{min} [\textit{vampire}] (\textit{hd.simps}). \\[3\smallskipamount]

   259 %

   260 Sledgehammer: ``\textit{spass}'' on goal: \\

   261 $[a] = [b] \,\Longrightarrow\, a = b$ \\

   262 Try this: \textbf{by} (\textit{metis list.inject}). \\

   263 To minimize: \textbf{sledgehammer} \textit{min} [\textit{spass}]~(\textit{list.inject}). \\[3\smallskipamount]

   264 %

   265 Sledgehammer: ``\textit{remote\_waldmeister}'' on goal: \\

   266 $[a] = [b] \,\Longrightarrow\, a = b$ \\

   267 Try this: \textbf{by} (\textit{metis hd.simps insert\_Nil}). \\

   268 To minimize: \textbf{sledgehammer} \textit{min} [\textit{remote\_waldmeister}] \\

   269 \phantom{To minimize: \textbf{sledgehammer}~}(\textit{hd.simps insert\_Nil}). \\[3\smallskipamount]

   270 %

   271 Sledgehammer: ``\textit{remote\_sine\_e}'' on goal: \\

   272 $[a] = [b] \,\Longrightarrow\, a = b$ \\

   273 Try this: \textbf{by} (\textit{metis hd.simps}). \\

   274 To minimize: \textbf{sledgehammer} \textit{min} [\textit{remote\_sine\_e}]~(\textit{hd.simps}). \\[3\smallskipamount]

   275 %

   276 Sledgehammer: ``\textit{remote\_z3}'' on goal: \\

   277 $[a] = [b] \,\Longrightarrow\, a = b$ \\

   278 Try this: \textbf{by} (\textit{metis hd.simps}). \\

   279 To minimize: \textbf{sledgehammer} \textit{min} [\textit{remote\_z3}]~(\textit{hd.simps}).

   280 \postw

   282 Sledgehammer ran E, SInE-E, SPASS, Vampire, Waldmeister, and Z3 in parallel.

   283 Depending on which provers are installed and how many processor cores are

   284 available, some of the provers might be missing or present with a

   285 \textit{remote\_} prefix. Waldmeister is run only for unit equational problems,

   286 where the goal's conclusion is a (universally quantified) equation.

   288 For each successful prover, Sledgehammer gives a one-liner proof that uses the

   289 \textit{metis} or \textit{smt} method. You can click the proof to insert it into

   290 the theory text. You can click the ``\textbf{sledgehammer} \textit{min}''

   291 command if you want to look for a shorter (and probably faster) proof. But here

   292 the proof found by E looks perfect, so click it to finish the proof.

   294 You can ask Sledgehammer for an Isar text proof by passing the

   295 \textit{isar\_proof} option (\S\ref{output-format}):

   297 \prew

   298 \textbf{sledgehammer} [\textit{isar\_proof}]

   299 \postw

   301 When Isar proof construction is successful, it can yield proofs that are more

   302 readable and also faster than the \textit{metis} one-liners. This feature is

   303 experimental and is only available for ATPs.

   305 \section{Hints}

   306 \label{hints}

   308 This section presents a few hints that should help you get the most out of

   309 Sledgehammer and Metis. Frequently (and infrequently) asked questions are

   310 answered in \S\ref{frequently-asked-questions}.

   312 \newcommand\point[1]{\medskip\par{\sl\bfseries#1}\par\nopagebreak}

   314 \point{Presimplify the goal}

   316 For best results, first simplify your problem by calling \textit{auto} or at

   317 least \textit{safe} followed by \textit{simp\_all}. The SMT solvers provide

   318 arithmetic decision procedures, but the ATPs typically do not (or if they do,

   319 Sledgehammer does not use them yet). Apart from Waldmeister, they are not

   320 especially good at heavy rewriting, but because they regard equations as

   321 undirected, they often prove theorems that require the reverse orientation of a

   322 \textit{simp} rule. Higher-order problems can be tackled, but the success rate

   323 is better for first-order problems. Hence, you may get better results if you

   324 first simplify the problem to remove higher-order features.

   326 \point{Make sure at least E, SPASS, Vampire, and Z3 are installed}

   328 Locally installed provers are faster and more reliable than those running on

   329 servers. See \S\ref{installation} for details on how to install them.

   331 \point{Familiarize yourself with the most important options}

   333 Sledgehammer's options are fully documented in \S\ref{command-syntax}. Many of

   334 the options are very specialized, but serious users of the tool should at least

   335 familiarize themselves with the following options:

   337 \begin{enum}

   338 \item[$\bullet$] \textbf{\textit{provers}} (\S\ref{mode-of-operation}) specifies

   339 the automatic provers (ATPs and SMT solvers) that should be run whenever

   340 Sledgehammer is invoked (e.g., ``\textit{provers}~= \textit{e spass

   341 remote\_vampire}''). For convenience, you can omit ``\textit{provers}~=''

   342 and simply write the prover names as a space-separated list (e.g., ``\textit{e

   343 spass remote\_vampire}'').

   345 \item[$\bullet$] \textbf{\textit{timeout}} (\S\ref{mode-of-operation}) controls

   346 the provers' time limit. It is set to 30 seconds by default, but since Sledgehammer runs

   347 asynchronously you should not hesitate to raise this limit to 60 or 120 seconds

   348 if you are the kind of user who can think clearly while ATPs are active.

   350 \item[$\bullet$] \textbf{\textit{full\_types}} (\S\ref{problem-encoding})

   351 specifies whether type-sound encodings should be used. By default, Sledgehammer

   352 employs a mixture of type-sound and type-unsound encodings, occasionally

   353 yielding unsound ATP proofs. In contrast, SMT solver proofs should always be

   354 sound.

   356 \item[$\bullet$] \textbf{\textit{max\_relevant}} (\S\ref{relevance-filter})

   357 specifies the maximum number of facts that should be passed to the provers. By

   358 default, the value is prover-dependent but varies between about 150 and 1000. If

   359 the provers time out, you can try lowering this value to, say, 100 or 50 and see

   360 if that helps.

   362 \item[$\bullet$] \textbf{\textit{isar\_proof}} (\S\ref{output-format}) specifies

   363 that Isar proofs should be generated, instead of one-liner Metis proofs. The

   364 length of the Isar proofs can be controlled by setting

   365 \textit{isar\_shrink\_factor} (\S\ref{output-format}).

   366 \end{enum}

   368 Options can be set globally using \textbf{sledgehammer\_params}

   369 (\S\ref{command-syntax}). The command also prints the list of all available

   370 options with their current value. Fact selection can be influenced by specifying

   371 ``$(\textit{add}{:}~\textit{my\_facts})$'' after the \textbf{sledgehammer} call

   372 to ensure that certain facts are included, or simply ``$(\textit{my\_facts})$''

   373 to force Sledgehammer to run only with $\textit{my\_facts}$.
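
As a sketch of how these options combine, the following invocation restricts
the run to two provers, doubles the default time limit, and lowers the number
of facts. The particular values are merely illustrative:

\prew
\textbf{sledgehammer} [\textit{provers} = \textit{e spass}, \,\textit{timeout} = 60$\,s$, \,\textit{max\_relevant} = 100]
\postw

The same values can be made the defaults for subsequent invocations:

\prew
\textbf{sledgehammer\_params} [\textit{timeout} = 60$\,s$, \,\textit{max\_relevant} = 100]
\postw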

   375 \section{Frequently Asked Questions}

   376 \label{frequently-asked-questions}

   378 This section answers frequently (and infrequently) asked questions about

   379 Sledgehammer. It is a good idea to skim over it now even if you don't have any

   380 questions at this stage. And if you have any further questions not listed here,

   381 send them to the author at \authoremail.

   383 \point{Why does Metis fail to reconstruct the proof?}

   385 There are many reasons. If Metis runs seemingly forever, that is a sign that the

   386 proof is too difficult for it. Metis is complete, so it should eventually find

   387 it, but that's little consolation. There are several possible solutions:

   389 \begin{enum}

   390 \item[$\bullet$] Try the \textit{isar\_proof} option (\S\ref{output-format}) to

   391 obtain a step-by-step Isar proof where each step is justified by Metis. Since

   392 the steps are fairly small, Metis is more likely to be able to replay them.

   394 \item[$\bullet$] Try the \textit{smt} proof method instead of \textit{metis}. It

   395 is usually stronger, but you need to have Z3 available to replay the proofs,

   396 trust the SMT solver, or use certificates. See the documentation in the

   397 \emph{SMT} theory (\texttt{\$ISABELLE\_HOME/src/HOL/SMT.thy}) for details.

   399 \item[$\bullet$] Try the \textit{blast} or \textit{auto} proof methods, passing

   400 the necessary facts via \textbf{unfolding}, \textbf{using}, \textit{intro}{:},

   401 \textit{elim}{:}, \textit{dest}{:}, or \textit{simp}{:}, as appropriate.

   402 \end{enum}
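
To illustrate the syntax of the last two alternatives, here is a sketch that
reuses the fact \textit{list.inject} from \S\ref{first-steps}. The fact is only
a placeholder, and whether a given method succeeds depends on the goal at hand:

\prew
\textbf{by} (\textit{smt list.inject}) \\
\textbf{using} \textit{list.inject} \textbf{by} \textit{blast} \\
\textbf{by} (\textit{auto} \textit{simp}: \textit{list.inject})
\postw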

   404 In some rare cases, Metis fails fairly quickly. This usually indicates that

   405 Sledgehammer found a type-incorrect proof. Sledgehammer erases some type

   406 information to speed up the search. Try Sledgehammer again with full type

   407 information: \textit{full\_types} (\S\ref{problem-encoding}), or choose a

   408 specific type encoding with \textit{type\_sys} (\S\ref{problem-encoding}). Older

   409 versions of Sledgehammer were frequent victims of this problem. Now this should

   410 very seldom be an issue, but if you notice many unsound proofs, contact the

   411 author at \authoremail.

   413 \point{How can I tell whether a generated proof is sound?}

   415 First, if \emph{metis} (or \emph{metisFT}) can reconstruct it, the proof is

   416 sound (modulo soundness of Isabelle's inference kernel). If it fails or runs

   417 seemingly forever, you can try

   419 \prew

   420 \textbf{apply}~\textbf{--} \\

   421 \textbf{sledgehammer} [\textit{type\_sys} = \textit{poly\_tags}] (\textit{metis\_facts})

   422 \postw

   424 where \textit{metis\_facts} is the list of facts appearing in the suggested

   425 Metis call. The automatic provers should be able to re-find the proof very

   426 quickly if it is sound, and the \textit{type\_sys} $=$ \textit{poly\_tags}

   427 option (\S\ref{problem-encoding}) ensures that no unsound proofs are found.

   429 The \textit{full\_types} option (\S\ref{problem-encoding}) can also be used

   430 here, but it is unsound in extremely rare degenerate cases such as the

   431 following:

   433 \prew

   434 \textbf{lemma} ``$\forall x\> y\Colon{'}\!a.\ x = y \,\Longrightarrow \exists f\> g\Colon\mathit{nat} \Rightarrow {'}\!a.\ f \not= g$'' \\

   435 \textbf{sledgehammer} [\textit{full\_types}] (\textit{nat.distinct\/}(1))

   436 \postw

   438 \point{Which facts are passed to the automatic provers?}

   440 The relevance filter assigns a score to every available fact (lemma, theorem,

   441 definition, or axiom)\ based upon how many constants that fact shares with the

   442 conjecture. This process iterates to include facts relevant to those just

   443 accepted, but with a decay factor to ensure termination. The constants are

   444 weighted to give unusual ones greater significance. The relevance filter copes

   445 best when the conjecture contains some unusual constants; if all the constants

   446 are common, it is unable to discriminate among the hundreds of facts that are

   447 picked up. The relevance filter is also memoryless: It has no information about

   448 how many times a particular fact has been used in a proof, and it cannot learn.

   450 The number of facts included in a problem varies from prover to prover, since

   451 some provers get overwhelmed more easily than others. You can show the number of

   452 facts given using the \textit{verbose} option (\S\ref{output-format}) and the

   453 actual facts using \textit{debug} (\S\ref{output-format}).

   455 Sledgehammer is good at finding short proofs combining a handful of existing

   456 lemmas. If you are looking for longer proofs, you must typically restrict the

   457 number of facts, by setting the \textit{max\_relevant} option

   458 (\S\ref{relevance-filter}) to, say, 50 or 100.

   460 You can also influence which facts are actually selected in a number of ways. If

   461 you simply want to ensure that a fact is included, you can specify it using the

   462 ``$(\textit{add}{:}~\textit{my\_facts})$'' syntax. For example:

   463 %

   464 \prew

   465 \textbf{sledgehammer} (\textit{add}: \textit{hd.simps} \textit{tl.simps})

   466 \postw

   467 %

   468 The specified facts then replace the least relevant facts that would otherwise be

   469 included; the other selected facts remain the same.

   470 If you want to direct the selection in a particular direction, you can specify

   471 the facts via \textbf{using}:

   472 %

   473 \prew

   474 \textbf{using} \textit{hd.simps} \textit{tl.simps} \\

   475 \textbf{sledgehammer}

   476 \postw

   477 %

   478 The facts are then more likely to be selected than otherwise, and if they are

   479 selected at iteration $j$ they also influence which facts are selected at

   480 iterations $j + 1$, $j + 2$, etc. To give them even more weight, try

   481 %

   482 \prew

   483 \textbf{using} \textit{hd.simps} \textit{tl.simps} \\

   484 \textbf{apply}~\textbf{--} \\

   485 \textbf{sledgehammer}

   486 \postw

   488 \point{Why are the generated Isar proofs so ugly/detailed/broken?}

   490 The current implementation is experimental and explodes exponentially in the

   491 worst case. Work on a new implementation has begun. There is a large body of

   492 research into transforming resolution proofs into natural deduction proofs (such

   493 as Isar proofs), which we hope to leverage. In the meantime, a workaround is to

   494 set the \textit{isar\_shrink\_factor} option (\S\ref{output-format}) to a larger

   495 value or to try several provers and keep the nicest-looking proof.

   497 \point{Should I minimize the number of lemmas?}

   499 In general, minimization is a good idea, because proofs involving fewer lemmas

   500 tend to be shorter as well, and hence easier for Metis to re-find. But the

   501 opposite is sometimes the case.

   503 \point{Why does the minimizer sometimes start on its own?}

   505 There are two scenarios in which this can happen. First, some provers (notably

   506 CVC3, Satallax, and Yices) do not provide proofs or sometimes provide incomplete

   507 proofs. The minimizer is then invoked to find out which facts are actually

   508 needed from the (large) set of facts that was initially given to the prover.

   509 Second, if a prover returns a proof with lots of facts, the minimizer is invoked

   510 automatically since Metis would be unlikely to re-find the proof.

   512 \point{What is metisFT?}

   514 The \textit{metisFT} proof method is the fully-typed version of Metis. It is

   515 much slower than \textit{metis}, but the proof search is fully typed, and it

   516 also includes more powerful rules such as the axiom ``$x = \mathit{True}

   517 \mathrel{\lor} x = \mathit{False}$'' for reasoning in higher-order places (e.g.,

   518 in set comprehensions). The method kicks in automatically as a fallback when

   519 \textit{metis} fails, and it is sometimes generated by Sledgehammer instead of

   520 \textit{metis} if the proof obviously requires type information.

   522 If you see the warning

   524 \prew

   525 \slshape

   526 Metis: Falling back on ``\textit{metisFT\/}''.

   527 \postw

   529 in a successful Metis proof, you can advantageously replace the \textit{metis}

   530 call with \textit{metisFT}.
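
For instance, supposing that the one-liner \textbf{by} (\textit{metis
list.inject}) from \S\ref{first-steps} had triggered this warning (it does not;
the fact is chosen purely for illustration), the replacement would read

\prew
\textbf{by} (\textit{metisFT list.inject})
\postw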

   532 \point{A strange error occurred---what should I do?}

   534 Sledgehammer tries to give informative error messages. Please report any strange

   535 error to the author at \authoremail. This applies doubly if you get the message

   537 \prew

   538 \slshape

   539 The prover found a type-unsound proof involving ``\textit{foo}'',

   540 ``\textit{bar}'', and ``\textit{baz}'' even though a supposedly type-sound

   541 encoding was used (or, less likely, your axioms are inconsistent). You might

   542 want to report this to the Isabelle developers.

   543 \postw

   545 \point{Auto can solve it---why not Sledgehammer?}

   547 Problems can be easy for \textit{auto} and difficult for automatic provers, but

   548 the reverse is also true, so don't be discouraged if your first attempts fail.

   549 Because the system refers to all theorems known to Isabelle, it is particularly

   550 suitable when your goal has a short proof from lemmas that you don't know about.

   552 \point{Why are there so many options?}

   554 Sledgehammer's philosophy is that it should work out of the box, without user guidance.

   555 Many of the options are meant to be used mostly by the Sledgehammer developers

   556 for experimentation purposes. Of course, feel free to experiment with them if

   557 you are so inclined.

   559 \section{Command Syntax}

   560 \label{command-syntax}

   562 Sledgehammer can be invoked at any point when there is an open goal by entering

   563 the \textbf{sledgehammer} command in the theory file. Its general syntax is as

   564 follows:

   566 \prew

   567 \textbf{sledgehammer} \textit{subcommand\/$^?$ options\/$^?$ facts\_override\/$^?$ num\/$^?$}

   568 \postw

   570 For convenience, Sledgehammer is also available in the ``Commands'' submenu of

   571 the ``Isabelle'' menu in Proof General or by pressing the Emacs key sequence C-c

   572 C-a C-s. This is equivalent to entering the \textbf{sledgehammer} command with

   573 no arguments in the theory text.

   575 In the general syntax, the \textit{subcommand} may be any of the following:

   577 \begin{enum}

   578 \item[$\bullet$] \textbf{\textit{run} (the default):} Runs Sledgehammer on

   579 subgoal number \textit{num} (1 by default), with the given options and facts.

   581 \item[$\bullet$] \textbf{\textit{min}:} Attempts to minimize the provided facts

   582 (specified in the \textit{facts\_override} argument) to obtain a simpler proof

   583 involving fewer facts. The options and goal number are as for \textit{run}.

   585 \item[$\bullet$] \textbf{\textit{messages}:} Redisplays recent messages issued

   586 by Sledgehammer. This allows you to examine results that might have been lost

   587 due to Sledgehammer's asynchronous nature. The \textit{num} argument specifies a

   588 limit on the number of messages to display (5 by default).

   590 \item[$\bullet$] \textbf{\textit{supported\_provers}:} Prints the list of

   591 automatic provers supported by Sledgehammer. See \S\ref{installation} and

   592 \S\ref{mode-of-operation} for more information on how to install automatic

   593 provers.

   595 \item[$\bullet$] \textbf{\textit{running\_provers}:} Prints information about

   596 currently running automatic provers, including elapsed runtime and remaining

   597 time until timeout.

   599 \item[$\bullet$] \textbf{\textit{kill\_provers}:} Terminates all running

   600 automatic provers.

   602 \item[$\bullet$] \textbf{\textit{refresh\_tptp}:} Refreshes the list of remote

   603 ATPs available at System\-On\-TPTP \cite{sutcliffe-2000}.

   604 \end{enum}
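
The following invocations sketch how the subcommands fit into the general
syntax. The first is the minimization command suggested in
\S\ref{first-steps}; the message limit of 10 in the second is an arbitrary
illustrative value:

\prew
\textbf{sledgehammer} \textit{min} [\textit{e}] (\textit{last\_ConsL}) \\
\textbf{sledgehammer} \textit{messages} 10 \\
\textbf{sledgehammer} \textit{running\_provers} \\
\textbf{sledgehammer} \textit{kill\_provers}
\postw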

   606 Sledgehammer's behavior can be influenced by various \textit{options}, which can

   607 be specified in brackets after the \textbf{sledgehammer} command. The

   608 \textit{options} are a list of key--value pairs of the form ``[$k_1 = v_1,

   609 \ldots, k_n = v_n$]''. For Boolean options, ``= \textit{true}'' is optional. For

   610 example:

   612 \prew

   613 \textbf{sledgehammer} [\textit{isar\_proof}, \,\textit{timeout} = 120$\,s$]

   614 \postw

   616 Default values can be set using \textbf{sledgehammer\_\allowbreak params}:

   618 \prew

   619 \textbf{sledgehammer\_params} \textit{options}

   620 \postw

   622 The supported options are described in \S\ref{option-reference}.

   624 The \textit{facts\_override} argument lets you alter the set of facts that go

   625 through the relevance filter. It may be of the form ``(\textit{facts})'', where

   626 \textit{facts} is a space-separated list of Isabelle facts (theorems, local

   627 assumptions, etc.), in which case the relevance filter is bypassed and the given

   628 facts are used. It may also be of the form ``(\textit{add}:\ \textit{facts}$_1$)'',

   629 ``(\textit{del}:\ \textit{facts}$_2$)'', or ``(\textit{add}:\ \textit{facts}$_1$\

   630 \textit{del}:\ \textit{facts}$_2$)'', where the relevance filter is instructed to

   631 proceed as usual except that it should consider \textit{facts}$_1$

highly relevant and \textit{facts}$_2$ fully irrelevant.
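
As an illustration, assuming \textit{foo\_def} and \textit{bar\_simps} are facts
of your theory (both names are hypothetical), the following invocation asks the
relevance filter to treat the first as highly relevant and the second as
irrelevant:

\prew
\textbf{sledgehammer} (\textit{add}:\ \textit{foo\_def}\ \textit{del}:\ \textit{bar\_simps})
\postw

To bypass the relevance filter altogether and pass only these two facts to the
provers, you would instead write:

\prew
\textbf{sledgehammer} (\textit{foo\_def} \textit{bar\_simps})
\postw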

   634 You can instruct Sledgehammer to run automatically on newly entered theorems by

   635 enabling the ``Auto Sledgehammer'' option from the ``Isabelle'' menu in Proof

   636 General. For automatic runs, only the first prover set using \textit{provers}

   637 (\S\ref{mode-of-operation}) is considered, fewer facts are passed to the prover,

   638 \textit{slicing} (\S\ref{mode-of-operation}) is disabled, \textit{timeout}

   639 (\S\ref{mode-of-operation}) is superseded by the ``Auto Tools Time Limit'' in

   640 Proof General's ``Isabelle'' menu, \textit{full\_types}

   641 (\S\ref{problem-encoding}) is enabled, and \textit{verbose}

   642 (\S\ref{output-format}) and \textit{debug} (\S\ref{output-format}) are disabled.

   643 Sledgehammer's output is also more concise.

   645 \section{Option Reference}

   646 \label{option-reference}

   648 \def\defl{\{}

   649 \def\defr{\}}

   651 \def\flushitem#1{\item[]\noindent\kern-\leftmargin \textbf{#1}}

   652 \def\qty#1{$\left<\textit{#1}\right>$}

   653 \def\qtybf#1{$\mathbf{\left<\textbf{\textit{#1}}\right>}$}

   654 \def\optrue#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool}$\bigr]$\enskip \defl\textit{true}\defr\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}

   655 \def\opfalse#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool}$\bigr]$\enskip \defl\textit{false}\defr\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}

   656 \def\opsmart#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{smart\_bool}$\bigr]$\enskip \defl\textit{smart}\defr\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}

   657 \def\opsmartx#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{smart\_bool}$\bigr]$\enskip \defl\textit{smart}\defr\hfill\\\hbox{}\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}

   658 \def\opnodefault#1#2{\flushitem{\textit{#1} = \qtybf{#2}} \nopagebreak\\[\parskip]}

   659 \def\opnodefaultbrk#1#2{\flushitem{$\bigl[$\textit{#1} =$\bigr]$ \qtybf{#2}} \nopagebreak\\[\parskip]}

   660 \def\opdefault#1#2#3{\flushitem{\textit{#1} = \qtybf{#2}\enskip \defl\textit{#3}\defr} \nopagebreak\\[\parskip]}

   661 \def\oparg#1#2#3{\flushitem{\textit{#1} \qtybf{#2} = \qtybf{#3}} \nopagebreak\\[\parskip]}

   662 \def\opargbool#1#2#3{\flushitem{\textit{#1} \qtybf{#2} $\bigl[$= \qtybf{bool}$\bigr]$\hfill (neg.: \textit{#3})}\nopagebreak\\[\parskip]}

   663 \def\opargboolorsmart#1#2#3{\flushitem{\textit{#1} \qtybf{#2} $\bigl[$= \qtybf{smart\_bool}$\bigr]$\hfill (neg.: \textit{#3})}\nopagebreak\\[\parskip]}

   665 Sledgehammer's options are categorized as follows:\ mode of operation

   666 (\S\ref{mode-of-operation}), problem encoding (\S\ref{problem-encoding}),

   667 relevance filter (\S\ref{relevance-filter}), output format

   668 (\S\ref{output-format}), and authentication (\S\ref{authentication}).

   670 The descriptions below refer to the following syntactic quantities:

   672 \begin{enum}

   673 \item[$\bullet$] \qtybf{string}: A string.

   674 \item[$\bullet$] \qtybf{bool\/}: \textit{true} or \textit{false}.

   675 \item[$\bullet$] \qtybf{smart\_bool\/}: \textit{true}, \textit{false}, or

   676 \textit{smart}.

   677 \item[$\bullet$] \qtybf{int\/}: An integer.

   678 %\item[$\bullet$] \qtybf{float\/}: A floating-point number (e.g., 2.5).

   679 \item[$\bullet$] \qtybf{float\_pair\/}: A pair of floating-point numbers

   680 (e.g., 0.6 0.95).

   681 \item[$\bullet$] \qtybf{smart\_int\/}: An integer or \textit{smart}.

   682 \item[$\bullet$] \qtybf{float\_or\_none\/}: An integer (e.g., 60) or

   683 floating-point number (e.g., 0.5) expressing a number of seconds, or the keyword

   684 \textit{none} ($\infty$ seconds).

   685 \end{enum}

   687 Default values are indicated in braces. Boolean options have a negated

   688 counterpart (e.g., \textit{blocking} vs.\ \textit{non\_blocking}). When setting

   689 Boolean options, ``= \textit{true}'' may be omitted.

   691 \subsection{Mode of Operation}

   692 \label{mode-of-operation}

   694 \begin{enum}

   695 \opnodefaultbrk{provers}{string}

   696 Specifies the automatic provers to use as a space-separated list (e.g.,

   697 ``\textit{e}~\textit{spass}~\textit{remote\_vampire}''). The following local

   698 provers are supported:

   700 \begin{enum}

   701 \item[$\bullet$] \textbf{\textit{cvc3}:} CVC3 is an SMT solver developed by

   702 Clark Barrett, Cesare Tinelli, and their colleagues \cite{cvc3}. To use CVC3,

   703 set the environment variable \texttt{CVC3\_SOLVER} to the complete path of the

   704 executable, including the file name. Sledgehammer has been tested with version

   705 2.2.

   707 \item[$\bullet$] \textbf{\textit{e}:} E is a first-order resolution prover

   708 developed by Stephan Schulz \cite{schulz-2002}. To use E, set the environment

   709 variable \texttt{E\_HOME} to the directory that contains the \texttt{eproof}

   710 executable, or install the prebuilt E package from Isabelle's download page. See

   711 \S\ref{installation} for details.

   713 \item[$\bullet$] \textbf{\textit{spass}:} SPASS is a first-order resolution

   714 prover developed by Christoph Weidenbach et al.\ \cite{weidenbach-et-al-2009}.

   715 To use SPASS, set the environment variable \texttt{SPASS\_HOME} to the directory

   716 that contains the \texttt{SPASS} executable, or install the prebuilt SPASS

   717 package from Isabelle's download page. Sledgehammer requires version 3.5 or

   718 above. See \S\ref{installation} for details.

   720 \item[$\bullet$] \textbf{\textit{yices}:} Yices is an SMT solver developed at

   721 SRI \cite{yices}. To use Yices, set the environment variable

   722 \texttt{YICES\_SOLVER} to the complete path of the executable, including the

   723 file name. Sledgehammer has been tested with version 1.0.

   725 \item[$\bullet$] \textbf{\textit{vampire}:} Vampire is a first-order resolution

   726 prover developed by Andrei Voronkov and his colleagues

   727 \cite{riazanov-voronkov-2002}. To use Vampire, set the environment variable

   728 \texttt{VAMPIRE\_HOME} to the directory that contains the \texttt{vampire}

   729 executable. Sledgehammer has been tested with versions 11, 0.6, and 1.0.

   731 \item[$\bullet$] \textbf{\textit{z3}:} Z3 is an SMT solver developed at

   732 Microsoft Research \cite{z3}. To use Z3, set the environment variable

   733 \texttt{Z3\_SOLVER} to the complete path of the executable, including the file

   734 name, and set \texttt{Z3\_NON\_COMMERCIAL=yes} to confirm that you are a

   735 noncommercial user. Sledgehammer has been tested with versions 2.7 to 2.18.

   737 \item[$\bullet$] \textbf{\textit{z3\_atp}:} This version of Z3 pretends to be an

   738 ATP, exploiting Z3's undocumented support for the TPTP format. It is included

   739 for experimental purposes. It requires version 2.18 or above.

   740 \end{enum}

   742 In addition, the following remote provers are supported:

   744 \begin{enum}

   745 \item[$\bullet$] \textbf{\textit{remote\_cvc3}:} The remote version of CVC3 runs

   746 on servers at the TU M\"unchen (or wherever \texttt{REMOTE\_SMT\_URL} is set to

   747 point).

   749 \item[$\bullet$] \textbf{\textit{remote\_e}:} The remote version of E runs

   750 on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}.

   752 \item[$\bullet$] \textbf{\textit{remote\_leo2}:} LEO-II is an automatic

higher-order prover developed by Christoph Benzm\"uller et al.\ \cite{leo2}. The

   754 remote version of LEO-II runs on Geoff Sutcliffe's Miami servers. In the current

   755 setup, the problems given to LEO-II are only mildly higher-order.

   757 \item[$\bullet$] \textbf{\textit{remote\_satallax}:} Satallax is an automatic

higher-order prover developed by Chad Brown et al.\ \cite{satallax}. The remote

   759 version of Satallax runs on Geoff Sutcliffe's Miami servers. In the current

   760 setup, the problems given to Satallax are only mildly higher-order.

   762 \item[$\bullet$] \textbf{\textit{remote\_sine\_e}:} SInE-E is a metaprover

   763 developed by Kry\v stof Hoder \cite{sine} based on E. The remote version of

SInE-E runs on Geoff Sutcliffe's Miami servers.

   766 \item[$\bullet$] \textbf{\textit{remote\_snark}:} SNARK is a first-order

   767 resolution prover developed by Stickel et al.\ \cite{snark}. The remote version

   768 of SNARK runs on Geoff Sutcliffe's Miami servers.

   770 \item[$\bullet$] \textbf{\textit{remote\_tofof\_e}:} ToFoF-E is a metaprover

   771 developed by Geoff Sutcliffe \cite{tofof} based on E running on his Miami

   772 servers. This ATP supports a fragment of the TPTP many-typed first-order format

   773 (TFF). It is supported primarily for experimenting with the

   774 \textit{type\_sys} $=$ \textit{simple} option (\S\ref{problem-encoding}).

   776 \item[$\bullet$] \textbf{\textit{remote\_vampire}:} The remote version of

   777 Vampire runs on Geoff Sutcliffe's Miami servers. Version 9 is used.

   779 \item[$\bullet$] \textbf{\textit{remote\_waldmeister}:} Waldmeister is a unit

   780 equality prover developed by Hillenbrand et al.\ \cite{waldmeister}. It can be

   781 used to prove universally quantified equations using unconditional equations.

   782 The remote version of Waldmeister runs on Geoff Sutcliffe's Miami servers.

   784 \item[$\bullet$] \textbf{\textit{remote\_z3}:} The remote version of Z3 runs on

   785 servers at the TU M\"unchen (or wherever \texttt{REMOTE\_SMT\_URL} is set to

   786 point).

   788 \item[$\bullet$] \textbf{\textit{remote\_z3\_atp}:} The remote version of ``Z3

   789 as an ATP'' runs on Geoff Sutcliffe's Miami servers.

   790 \end{enum}

   792 By default, Sledgehammer will run E, SPASS, Vampire, SInE-E, and Z3 (or whatever

   793 the SMT module's \textit{smt\_solver} configuration option is set to) in

   794 parallel---either locally or remotely, depending on the number of processor

   795 cores available. For historical reasons, the default value of this option can be

   796 overridden using the option ``Sledgehammer: Provers'' from the ``Isabelle'' menu

   797 in Proof General.

   799 It is a good idea to run several provers in parallel, although it could slow

   800 down your machine. Running E, SPASS, and Vampire for 5~seconds yields a similar

   801 success rate to running the most effective of these for 120~seconds

   802 \cite{boehme-nipkow-2010}.

   804 \opnodefault{prover}{string}

   805 Alias for \textit{provers}.

   807 %\opnodefault{atps}{string}

   808 %Legacy alias for \textit{provers}.

   810 %\opnodefault{atp}{string}

   811 %Legacy alias for \textit{provers}.

   813 \opdefault{timeout}{float\_or\_none}{\upshape 30}

   814 Specifies the maximum number of seconds that the automatic provers should spend

   815 searching for a proof. This excludes problem preparation and is a soft limit.

   816 For historical reasons, the default value of this option can be overridden using

   817 the option ``Sledgehammer: Time Limit'' from the ``Isabelle'' menu in Proof

   818 General.

   820 \opfalse{blocking}{non\_blocking}

   821 Specifies whether the \textbf{sledgehammer} command should operate

   822 synchronously. The asynchronous (non-blocking) mode lets the user start proving

   823 the putative theorem manually while Sledgehammer looks for a proof, but it can

   824 also be more confusing. Irrespective of the value of this option, Sledgehammer

   825 is always run synchronously for the new jEdit-based user interface or if

   826 \textit{debug} (\S\ref{output-format}) is enabled.

   828 \optrue{slicing}{no\_slicing}

   829 Specifies whether the time allocated to a prover should be sliced into several

   830 segments, each of which has its own set of possibly prover-dependent options.

   831 For SPASS and Vampire, the first slice tries the fast but incomplete

   832 set-of-support (SOS) strategy, whereas the second slice runs without it. For E,

   833 up to three slices are tried, with different weighted search strategies and

numbers of facts. For SMT solvers, several slices are tried with the same options

   835 each time but fewer and fewer facts. According to benchmarks with a timeout of

   836 30 seconds, slicing is a valuable optimization, and you should probably leave it

   837 enabled unless you are conducting experiments. This option is implicitly

   838 disabled for (short) automatic runs.

   840 \nopagebreak

   841 {\small See also \textit{verbose} (\S\ref{output-format}).}

   843 \opfalse{overlord}{no\_overlord}

   844 Specifies whether Sledgehammer should put its temporary files in

   845 \texttt{\$ISA\-BELLE\_\allowbreak HOME\_\allowbreak USER}, which is useful for

   846 debugging Sledgehammer but also unsafe if several instances of the tool are run

   847 simultaneously. The files are identified by the prefix \texttt{prob\_}; you may

   848 safely remove them after Sledgehammer has run.

   850 \nopagebreak

   851 {\small See also \textit{debug} (\S\ref{output-format}).}

   852 \end{enum}
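
As a sketch of how the options above combine, the following invocation (with
illustrative values) restricts Sledgehammer to E and SPASS, allows them at most
60 seconds, and waits for the result instead of running in the background:

\prew
\textbf{sledgehammer} [\textit{provers} = \textit{e}~\textit{spass}, \,\textit{timeout} = 60, \,\textit{blocking}]
\postw

Since Boolean options have a negated counterpart, slicing could similarly be
turned off by adding \textit{no\_slicing} to the list.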

   854 \subsection{Problem Encoding}

   855 \label{problem-encoding}

   857 \begin{enum}

   858 \opfalse{explicit\_apply}{implicit\_apply}

   859 Specifies whether function application should be encoded as an explicit

   860 ``apply'' operator in ATP problems. If the option is set to \textit{false}, each

   861 function will be directly applied to as many arguments as possible. Enabling

   862 this option can sometimes help discover higher-order proofs that otherwise would

   863 not be found.

   865 \opfalse{full\_types}{partial\_types}

   866 Specifies whether full type information is encoded in ATP problems. Enabling

   867 this option prevents the discovery of type-incorrect proofs, but it can slow

   868 down the ATP slightly. This option is implicitly enabled for automatic runs. For

   869 historical reasons, the default value of this option can be overridden using the

   870 option ``Sledgehammer: Full Types'' from the ``Isabelle'' menu in Proof General.

   872 \opdefault{type\_sys}{string}{smart}

   873 Specifies the type system to use in ATP problems. Some of the type systems are

   874 unsound, meaning that they can give rise to spurious proofs (unreconstructible

   875 using Metis). The supported type systems are listed below, with an indication of

   876 their soundness in parentheses:

   878 \begin{enum}

   879 \item[$\bullet$] \textbf{\textit{erased} (very unsound):} No type information is

   880 supplied to the ATP. Types are simply erased.

   882 \item[$\bullet$] \textbf{\textit{poly\_preds} (sound):} Types are encoded using

   883 a predicate \textit{has\_\allowbreak type\/}$(\tau, t)$ that restricts the range

   884 of bound variables. Constants are annotated with their types, supplied as extra

   885 arguments, to resolve overloading.

   887 \item[$\bullet$] \textbf{\textit{poly\_tags} (sound):} Each term and subterm is

   888 tagged with its type using a function $\mathit{type\_info\/}(\tau, t)$. This

   889 coincides with the encoding used by the \textit{metisFT} command.

   891 \item[$\bullet$] \textbf{\textit{poly\_args} (unsound):}

As with \textit{poly\_preds}, constants are annotated with their types to

   893 resolve overloading, but otherwise no type information is encoded. This

   894 coincides with the encoding used by the \textit{metis} command (before it falls

   895 back on \textit{metisFT}).

   897 \item[$\bullet$]

   898 \textbf{%

   899 \textit{mono\_preds}, \textit{mono\_tags} (sound);

   900 \textit{mono\_args} (unsound):} \\

   901 Similar to \textit{poly\_preds}, \textit{poly\_tags}, and \textit{poly\_args},

   902 respectively, but the problem is additionally monomorphized, meaning that type

   903 variables are instantiated with heuristically chosen ground types.

   904 Monomorphization can simplify reasoning but also leads to larger fact bases,

   905 which can slow down the ATPs.

   907 \item[$\bullet$]

   908 \textbf{%

   909 \textit{mangled\_preds},

   910 \textit{mangled\_tags} (sound); \\

   911 \textit{mangled\_args} (unsound):} \\

   912 Similar to

   913 \textit{mono\_preds}, \textit{mono\_tags}, and \textit{mono\_args},

respectively, but types are mangled in constant names instead of being supplied

   915 as ground term arguments. The binary predicate $\mathit{has\_type\/}(\tau, t)$

   916 becomes a unary predicate $\mathit{has\_type\_}\tau(t)$, and the binary function

   917 $\mathit{type\_info\/}(\tau, t)$ becomes a unary function

   918 $\mathit{type\_info\_}\tau(t)$.

   920 \item[$\bullet$] \textbf{\textit{simple} (sound):} Use the prover's support for

   921 simple types if available; otherwise, fall back on \textit{mangled\_preds}. The

   922 problem is monomorphized.

   924 \item[$\bullet$]

   925 \textbf{%

   926 \textit{poly\_preds}?, \textit{poly\_tags}?, \textit{mono\_preds}?, \textit{mono\_tags}?, \\

   927 \textit{mangled\_preds}?, \textit{mangled\_tags}?, \textit{simple}? (quasi-sound):} \\

   928 The type systems \textit{poly\_preds}, \textit{poly\_tags},

   929 \textit{mono\_preds}, \textit{mono\_tags}, \textit{mangled\_preds},

   930 \textit{mangled\_tags}, and \textit{simple} are fully typed and sound. For each

   931 of these, Sledgehammer also provides a lighter, virtually sound variant

   932 identified by a question mark (`{?}')\ that detects and erases monotonic types,

   933 notably infinite types. (For \textit{simple}, the types are not actually erased

   934 but rather replaced by a shared uniform type of individuals.)

   936 \item[$\bullet$]

   937 \textbf{%

   938 \textit{poly\_preds}!, \textit{poly\_tags}!, \textit{mono\_preds}!, \textit{mono\_tags}!, \\

   939 \textit{mangled\_preds}!, \textit{mangled\_tags}!, \textit{simple}! \\

   940 (mildly unsound):} \\

   941 The type systems \textit{poly\_preds}, \textit{poly\_tags},

   942 \textit{mono\_preds}, \textit{mono\_tags}, \textit{mangled\_preds},

   943 \textit{mangled\_tags}, and \textit{simple} also admit a mildly unsound (but

   944 very efficient) variant identified by an exclamation mark (`{!}') that detects

and erases all types except those that are clearly finite (e.g.,

   946 \textit{bool}). (For \textit{simple}, the types are not actually erased but

   947 rather replaced by a shared uniform type of individuals.)

   949 \item[$\bullet$] \textbf{\textit{smart}:} If \textit{full\_types} is enabled,

   950 uses a sound or virtually sound encoding; otherwise, uses any encoding. The actual

   951 encoding used depends on the ATP and should be the most efficient for that ATP.

   952 \end{enum}

   954 In addition, all the \textit{preds} and \textit{tags} type systems are available

   955 in two variants, a lightweight and a heavyweight variant. The lightweight

   956 variants are generally more efficient and are the default; the heavyweight

   957 variants are identified by a \textit{\_heavy} suffix (e.g.,

   958 \textit{mangled\_preds\_heavy}{?}).

   960 For SMT solvers and ToFoF-E, the type system is always \textit{simple},

   961 irrespective of the value of this option.

   963 \nopagebreak

   964 {\small See also \textit{max\_new\_mono\_instances} (\S\ref{relevance-filter})

   965 and \textit{max\_mono\_iters} (\S\ref{relevance-filter}).}

   966 \end{enum}
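
To experiment with a particular encoding rather than the default
\textit{smart}, the type system can be named explicitly. The following sketch,
in which the choice of \textit{poly\_tags} is merely illustrative, requests the
sound tag-based polymorphic encoding:

\prew
\textbf{sledgehammer} [\textit{type\_sys} = \textit{poly\_tags}]
\postw

For SMT solvers and ToFoF-E, this setting has no effect, since they always use
\textit{simple}.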

   968 \subsection{Relevance Filter}

   969 \label{relevance-filter}

   971 \begin{enum}

   972 \opdefault{relevance\_thresholds}{float\_pair}{\upshape 0.45~0.85}

   973 Specifies the thresholds above which facts are considered relevant by the

   974 relevance filter. The first threshold is used for the first iteration of the

   975 relevance filter and the second threshold is used for the last iteration (if it

   976 is reached). The effective threshold is quadratically interpolated for the other

   977 iterations. Each threshold ranges from 0 to 1, where 0 means that all theorems

are relevant and 1 means that only theorems that refer to previously seen
constants are relevant.

\opdefault{max\_relevant}{smart\_int}{smart}

   981 Specifies the maximum number of facts that may be returned by the relevance

   982 filter. If the option is set to \textit{smart}, it is set to a value that was

   983 empirically found to be appropriate for the prover. A typical value would be

   984 300.

   986 \opdefault{max\_new\_mono\_instances}{int}{\upshape 400}

   987 Specifies the maximum number of monomorphic instances to generate beyond

   988 \textit{max\_relevant}. The higher this limit is, the more monomorphic instances

   989 are potentially generated. Whether monomorphization takes place depends on the

   990 type system used.

   992 \nopagebreak

   993 {\small See also \textit{type\_sys} (\S\ref{problem-encoding}).}

   995 \opdefault{max\_mono\_iters}{int}{\upshape 3}

   996 Specifies the maximum number of iterations for the monomorphization fixpoint

   997 construction. The higher this limit is, the more monomorphic instances are

   998 potentially generated. Whether monomorphization takes place depends on the

   999 type system used.

  1001 \nopagebreak

  1002 {\small See also \textit{type\_sys} (\S\ref{problem-encoding}).}

  1003 \end{enum}
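
As an illustrative sketch, the relevance filter could be made stricter and the
number of selected facts capped as follows. The values are taken from the
examples given above and are not recommendations:

\prew
\textbf{sledgehammer} [\textit{relevance\_thresholds} = 0.6 0.95, \,\textit{max\_relevant} = 300]
\postw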

  1005 \subsection{Output Format}

  1006 \label{output-format}

  1008 \begin{enum}

  1010 \opfalse{verbose}{quiet}

  1011 Specifies whether the \textbf{sledgehammer} command should explain what it does.

  1012 This option is implicitly disabled for automatic runs.

  1014 \opfalse{debug}{no\_debug}

  1015 Specifies whether Sledgehammer should display additional debugging information

  1016 beyond what \textit{verbose} already displays. Enabling \textit{debug} also

  1017 enables \textit{verbose} and \textit{blocking} (\S\ref{mode-of-operation})

  1018 behind the scenes. The \textit{debug} option is implicitly disabled for

  1019 automatic runs.

  1021 \nopagebreak

  1022 {\small See also \textit{overlord} (\S\ref{mode-of-operation}).}

  1024 \opfalse{isar\_proof}{no\_isar\_proof}

  1025 Specifies whether Isar proofs should be output in addition to one-liner

  1026 \textit{metis} proofs. Isar proof construction is still experimental and often

fails; however, Isar proofs are usually faster and sometimes more robust than

  1028 \textit{metis} proofs.

  1030 \opdefault{isar\_shrink\_factor}{int}{\upshape 1}

  1031 Specifies the granularity of the Isar proof. A value of $n$ indicates that each

  1032 Isar proof step should correspond to a group of up to $n$ consecutive proof

  1033 steps in the ATP proof.

  1035 \end{enum}
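
For example, to ask for an explanation of what Sledgehammer is doing and for an
Isar proof whose steps each cover up to two ATP proof steps, one might write
(the shrink factor of 2 is an arbitrary illustrative choice):

\prew
\textbf{sledgehammer} [\textit{verbose}, \,\textit{isar\_proof}, \,\textit{isar\_shrink\_factor} = 2]
\postw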

  1037 \subsection{Authentication}

  1038 \label{authentication}

  1040 \begin{enum}

  1041 \opnodefault{expect}{string}

  1042 Specifies the expected outcome, which must be one of the following:

  1044 \begin{enum}

  1045 \item[$\bullet$] \textbf{\textit{some}:} Sledgehammer found a (potentially

  1046 unsound) proof.

  1047 \item[$\bullet$] \textbf{\textit{none}:} Sledgehammer found no proof.

  1048 \item[$\bullet$] \textbf{\textit{timeout}:} Sledgehammer timed out.

  1049 \item[$\bullet$] \textbf{\textit{unknown}:} Sledgehammer encountered some

  1050 problem.

  1051 \end{enum}

  1053 Sledgehammer emits an error (if \textit{blocking} is enabled) or a warning

  1054 (otherwise) if the actual outcome differs from the expected outcome. This option

  1055 is useful for regression testing.

  1057 \nopagebreak

  1058 {\small See also \textit{blocking} (\S\ref{mode-of-operation}).}

  1059 \end{enum}
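
In a regression test, a goal that Sledgehammer is supposed to prove could be
checked along the following lines. This is only a sketch; \textit{blocking} is
included so that a mismatch is reported as an error rather than a warning:

\prew
\textbf{sledgehammer} [\textit{blocking}, \,\textit{expect} = \textit{some}]
\postw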

  1061 \let\em=\sl

  1062 \bibliography{../manual}{}

  1063 \bibliographystyle{abbrv}

  1065 \end{document}

author	blanchet
	Fri, 27 May 2011 10:30:08 +0200
changeset 43872	e437d47f419f