1 \documentclass[a4paper,12pt]{article}
2 \usepackage[T1]{fontenc}
5 \usepackage[english,french]{babel}
12 %\usepackage[scaled=.85]{beramono}
13 \usepackage{../../lib/texinputs/isabelle,../iman,../pdfsetup}
15 \newcommand\download{\url{http://www21.in.tum.de/~blanchet/\#software}}
17 \def\qty#1{\ensuremath{\left<\mathit{#1\/}\right>}}
18 \def\qtybf#1{$\mathbf{\left<\textbf{\textit{#1\/}}\right>}$}
20 \newcommand\const[1]{\textsf{#1}}
23 %\evensidemargin=4.6mm
30 \def\Colon{\mathord{:\mkern-1.5mu:}}
31 %\def\lbrakk{\mathopen{\lbrack\mkern-3.25mu\lbrack}}
32 %\def\rbrakk{\mathclose{\rbrack\mkern-3.255mu\rbrack}}
33 \def\lparr{\mathopen{(\mkern-4mu\mid}}
34 \def\rparr{\mathclose{\mid\mkern-4mu)}}
37 \def\undef{(\lambda x.\; \unk)}
38 %\def\unr{\textit{others}}
40 \def\Abs#1{\hbox{\rm{\flqq}}{\,#1\,}\hbox{\rm{\frqq}}}
41 \def\Q{{\smash{\lower.2ex\hbox{$\scriptstyle?$}}}}
48 %\renewcommand\labelitemi{$\bullet$}
49 \renewcommand\labelitemi{\raise.065ex\hbox{\small\textbullet}}
51 \selectlanguage{english}
53 \title{\includegraphics[scale=0.5]{isabelle_sledgehammer} \\[4ex]
54 Hammering Away \\[\smallskipamount]
55 \Large A User's Guide to Sledgehammer for Isabelle/HOL}
57 Jasmin Christian Blanchette \\
58 {\normalsize Institut f\"ur Informatik, Technische Universit\"at M\"unchen} \\[4\smallskipamount]
59 {\normalsize with contributions from} \\[4\smallskipamount]
60 Lawrence C. Paulson \\
61 {\normalsize Computer Laboratory, University of Cambridge} \\
68 \setlength{\parskip}{.7em plus .2em minus .1em}
69 \setlength{\parindent}{0pt}
70 \setlength{\abovedisplayskip}{\parskip}
71 \setlength{\abovedisplayshortskip}{.9\parskip}
72 \setlength{\belowdisplayskip}{\parskip}
73 \setlength{\belowdisplayshortskip}{.9\parskip}
75 % General-purpose enum environment with correct spacing
76 \newenvironment{enum}%
78 \setlength{\topsep}{.1\parskip}%
79 \setlength{\partopsep}{.1\parskip}%
80 \setlength{\itemsep}{\parskip}%
81 \advance\itemsep by-\parsep}}
84 \def\pre{\begingroup\vskip0pt plus1ex\advance\leftskip by\leftmargin
85 \advance\rightskip by\leftmargin}
86 \def\post{\vskip0pt plus1ex\endgroup}
88 \def\prew{\pre\advance\rightskip by-\leftmargin}
91 \section{Introduction}
94 Sledgehammer is a tool that applies automatic theorem provers (ATPs)
95 and satisfiability-modulo-theories (SMT) solvers on the current goal. The
96 supported ATPs are E \cite{schulz-2002}, E-SInE \cite{sine}, E-ToFoF
97 \cite{tofof}, LEO-II \cite{leo2}, Satallax \cite{satallax}, SNARK \cite{snark},
98 SPASS \cite{weidenbach-et-al-2009}, Vampire \cite{riazanov-voronkov-2002}, and
99 Waldmeister \cite{waldmeister}. The ATPs are run either locally or remotely via
100 the System\-On\-TPTP web service \cite{sutcliffe-2000}. In addition to the ATPs,
101 the SMT solvers Z3 \cite{z3} is used by default, and you can tell Sledgehammer
102 to try CVC3 \cite{cvc3} and Yices \cite{yices} as well; these are run either
103 locally or on a server at the TU M\"unchen.
105 The problem passed to the automatic provers consists of your current goal
106 together with a heuristic selection of hundreds of facts (theorems) from the
107 current theory context, filtered by relevance. Because jobs are run in the
108 background, you can continue to work on your proof by other means. Provers can
109 be run in parallel. Any reply (which may arrive half a minute later) will appear
110 in the Proof General response buffer.
112 The result of a successful proof search is some source text that usually (but
113 not always) reconstructs the proof within Isabelle. For ATPs, the reconstructed
114 proof relies on the general-purpose \textit{metis} proof method, which
115 integrates the Metis ATP in Isabelle/HOL with explicit inferences going through
116 the kernel. Thus its results are correct by construction.
118 In this manual, we will explicitly invoke the \textbf{sledgehammer} command.
119 Sledgehammer also provides an automatic mode that can be enabled via the ``Auto
120 Sledgehammer'' option in Proof General's ``Isabelle'' menu. In this mode,
121 Sledgehammer is run on every newly entered theorem. The time limit for Auto
122 Sledgehammer and other automatic tools can be set using the ``Auto Tools Time
126 \setbox\boxA=\hbox{\texttt{NOSPAM}}
128 \newcommand\authoremail{\texttt{blan{\color{white}NOSPAM}\kern-\wd\boxA{}chette@\allowbreak
129 in.\allowbreak tum.\allowbreak de}}
131 To run Sledgehammer, you must make sure that the theory \textit{Sledgehammer} is
132 imported---this is rarely a problem in practice since it is part of
133 \textit{Main}. Examples of Sledgehammer use can be found in Isabelle's
134 \texttt{src/HOL/Metis\_Examples} directory.
135 Comments and bug reports concerning Sledgehammer or this manual should be
136 directed to the author at \authoremail.
138 \vskip2.5\smallskipamount
140 %\textbf{Acknowledgment.} The author would like to thank Mark Summerfield for
141 %suggesting several textual improvements.
143 \section{Installation}
146 Sledgehammer is part of Isabelle, so you don't need to install it. However, it
147 relies on third-party automatic provers (ATPs and SMT solvers).
149 Among the ATPs, E, LEO-II, Satallax, SPASS, and Vampire can be run locally; in
150 addition, E, E-SInE, E-ToFoF, iProver, iProver-Eq, LEO-II, Satallax, SNARK,
151 Waldmeister, and Vampire are available remotely via System\-On\-TPTP
152 \cite{sutcliffe-2000}. If you want better performance, you should at least
153 install E and SPASS locally.
155 Among the SMT solvers, CVC3, Yices, and Z3 can be run locally, and CVC3 and Z3
156 can be run remotely on a TU M\"unchen server. If you want better performance and
157 get the ability to replay proofs that rely on the \emph{smt} proof method
158 without an Internet connection, you should at least install Z3 locally.
160 There are three main ways to install automatic provers on your machine:
164 \item[\labelitemi] If you installed an official Isabelle package, it should
165 already include properly setup executables for CVC3, E, SPASS, and Z3, ready to use.%
166 \footnote{Vampire's and Yices's licenses prevent us from doing the same for
167 these otherwise remarkable tools.}
168 For Z3, you must additionally set the variable
169 \texttt{Z3\_NON\_COMMERCIAL} to ``yes'' to confirm that you are a
170 noncommercial user, either in the environment in which Isabelle is
172 \texttt{\char`\~/\$ISABELLE\_HOME\_USER/etc/settings} file.
174 \item[\labelitemi] Alternatively, you can download the Isabelle-aware CVC3, E,
175 SPASS, and Z3 binary packages from \download. Extract the archives, then add a
176 line to your \texttt{\$ISABELLE\_HOME\_USER\slash etc\slash components}%
177 \footnote{The variable \texttt{\$ISABELLE\_HOME\_USER} is set by Isabelle at
178 startup. Its value can be retrieved by executing \texttt{isabelle}
179 \texttt{getenv} \texttt{ISABELLE\_HOME\_USER} on the command line.}
180 file with the absolute path to CVC3, E, SPASS, or Z3. For example, if the
181 \texttt{components} file does not exist yet and you extracted SPASS to
182 \texttt{/usr/local/spass-3.7}, create it with the single line
185 \texttt{/usr/local/spass-3.7}
188 (including an invisible newline character) in it.
190 \item[\labelitemi] If you prefer to build E, LEO-II, Satallax, or SPASS
191 manually, or found a Vampire executable somewhere (e.g.,
192 \url{http://www.vprover.org/}), set the environment variable \texttt{E\_HOME},
193 \texttt{LEO2\_HOME}, \texttt{SATALLAX\_HOME}, \texttt{SPASS\_HOME}, or
194 \texttt{VAMPIRE\_HOME} to the directory that contains the \texttt{eproof},
195 \texttt{leo}, \texttt{satallax}, \texttt{SPASS}, or \texttt{vampire} executable.
196 Sledgehammer has been tested with E 1.0 to 1.4, LEO-II 1.2.9, Satallax 2.2,
197 SPASS 3.5 and 3.7, and Vampire 0.6, 1.0, and 1.8%
198 \footnote{Following the rewrite of Vampire, the counter for version numbers was
199 reset to 0; hence the (new) Vampire versions 0.6, 1.0, and 1.8 are more recent
201 . Since the ATPs' output formats are neither documented nor stable, other
202 versions of the ATPs might not work well with Sledgehammer. Ideally,
203 also set \texttt{E\_VERSION}, \texttt{LEO2\_VERSION},
204 \texttt{SATALLAX\_VERSION}, \texttt{SPASS\_VERSION}, or
205 \texttt{VAMPIRE\_VERSION} to the prover's version number (e.g., ``1.4'').
207 Similarly, if you want to build CVC3, or found a
208 Yices or Z3 executable somewhere (e.g.,
209 \url{http://yices.csl.sri.com/download.shtml} or
210 \url{http://research.microsoft.com/en-us/um/redmond/projects/z3/download.html}),
211 set the environment variable \texttt{CVC3\_\allowbreak SOLVER},
212 \texttt{YICES\_SOLVER}, or \texttt{Z3\_SOLVER} to the complete path of
213 the executable, \emph{including the file name}. Sledgehammer has been tested
214 with CVC3 2.2, Yices 1.0.28, and Z3 3.0 to 3.2.
215 Since the SMT solvers' output formats are somewhat unstable, other
216 versions of the solvers might not work well with Sledgehammer. Ideally,
217 also set \texttt{CVC3\_VERSION}, \texttt{YICES\_VERSION}, or
218 \texttt{Z3\_VERSION} to the solver's version number (e.g., ``3.2'').
222 To check whether E, SPASS, Vampire, and/or Z3 are successfully installed, try
223 out the example in \S\ref{first-steps}. If the remote versions of any of these
224 provers is used (identified by the prefix ``\emph{remote\_\/}''), or if the
225 local versions fail to solve the easy goal presented there, something must be
226 wrong with the installation.
228 Remote prover invocation requires Perl with the World Wide Web Library
229 (\texttt{libwww-perl}) installed. If you must use a proxy server to access the
230 Internet, set the \texttt{http\_proxy} environment variable to the proxy, either
231 in the environment in which Isabelle is launched or in your
232 \texttt{\char`\~/\$ISABELLE\_HOME\_USER/etc/settings} file. Here are a few
236 \texttt{http\_proxy=http://proxy.example.org} \\
237 \texttt{http\_proxy=http://proxy.example.org:8080} \\
238 \texttt{http\_proxy=http://joeblow:pAsSwRd@proxy.example.org}
241 \section{First Steps}
244 To illustrate Sledgehammer in context, let us start a theory file and
245 attempt to prove a simple lemma:
248 \textbf{theory}~\textit{Scratch} \\
249 \textbf{imports}~\textit{Main} \\
250 \textbf{begin} \\[2\smallskipamount]
252 \textbf{lemma} ``$[a] = [b] \,\Longrightarrow\, a = b$'' \\
253 \textbf{sledgehammer}
256 Instead of issuing the \textbf{sledgehammer} command, you can also find
257 Sledgehammer in the ``Commands'' submenu of the ``Isabelle'' menu in Proof
258 General or press the Emacs key sequence C-c C-a C-s.
259 Either way, Sledgehammer produces the following output after a few seconds:
263 Sledgehammer: ``\textit{e\/}'' on goal \\
264 $[a] = [b] \,\Longrightarrow\, a = b$ \\
265 Try this: \textbf{by} (\textit{metis last\_ConsL}) (64 ms). \\[3\smallskipamount]
267 Sledgehammer: ``\textit{z3\/}'' on goal \\
268 $[a] = [b] \,\Longrightarrow\, a = b$ \\
269 Try this: \textbf{by} (\textit{metis list.inject}) (20 ms). \\[3\smallskipamount]
271 Sledgehammer: ``\textit{vampire\/}'' on goal \\
272 $[a] = [b] \,\Longrightarrow\, a = b$ \\
273 Try this: \textbf{by} (\textit{metis hd.simps}) (14 ms). \\[3\smallskipamount]
275 Sledgehammer: ``\textit{spass\/}'' on goal \\
276 $[a] = [b] \,\Longrightarrow\, a = b$ \\
277 Try this: \textbf{by} (\textit{metis list.inject}) (17 ms). \\[3\smallskipamount]
279 Sledgehammer: ``\textit{remote\_waldmeister\/}'' on goal \\
280 $[a] = [b] \,\Longrightarrow\, a = b$ \\
281 Try this: \textbf{by} (\textit{metis hd.simps}) (15 ms). \\[3\smallskipamount]
283 Sledgehammer: ``\textit{remote\_e\_sine\/}'' on goal \\
284 $[a] = [b] \,\Longrightarrow\, a = b$ \\
285 Try this: \textbf{by} (\textit{metis hd.simps}) (18 ms).
288 Sledgehammer ran E, E-SInE, SPASS, Vampire, Waldmeister, and Z3 in parallel.
289 Depending on which provers are installed and how many processor cores are
290 available, some of the provers might be missing or present with a
291 \textit{remote\_} prefix. Waldmeister is run only for unit equational problems,
292 where the goal's conclusion is a (universally quantified) equation.
294 For each successful prover, Sledgehammer gives a one-liner proof that uses
295 the \textit{metis} or \textit{smt} proof method. Approximate timings are shown
296 in parentheses, indicating how fast the call is. You can click the proof to
297 insert it into the theory text.
299 In addition, you can ask Sledgehammer for an Isar text proof by passing the
300 \textit{isar\_proof} option (\S\ref{output-format}):
303 \textbf{sledgehammer} [\textit{isar\_proof}]
306 When Isar proof construction is successful, it can yield proofs that are more
307 readable and also faster than the \textit{metis} or \textit{smt} one-liners.
308 This feature is experimental and is only available for ATPs.
313 This section presents a few hints that should help you get the most out of
314 Sledgehammer. Frequently (and infrequently) asked questions are answered in
315 \S\ref{frequently-asked-questions}.
317 %\newcommand\point[1]{\medskip\par{\sl\bfseries#1}\par\nopagebreak}
318 \newcommand\point[1]{\subsection{\emph{#1}}}
320 \point{Presimplify the goal}
322 For best results, first simplify your problem by calling \textit{auto} or at
323 least \textit{safe} followed by \textit{simp\_all}. The SMT solvers provide
324 arithmetic decision procedures, but the ATPs typically do not (or if they do,
325 Sledgehammer does not use it yet). Apart from Waldmeister, they are not
326 especially good at heavy rewriting, but because they regard equations as
327 undirected, they often prove theorems that require the reverse orientation of a
328 \textit{simp} rule. Higher-order problems can be tackled, but the success rate
329 is better for first-order problems. Hence, you may get better results if you
330 first simplify the problem to remove higher-order features.
332 \point{Make sure E, SPASS, Vampire, and Z3 are locally installed}
334 Locally installed provers are faster and more reliable than those running on
335 servers. See \S\ref{installation} for details on how to install them.
337 \point{Familiarize yourself with the most important options}
339 Sledgehammer's options are fully documented in \S\ref{command-syntax}. Many of
340 the options are very specialized, but serious users of the tool should at least
341 familiarize themselves with the following options:
344 \item[\labelitemi] \textbf{\textit{provers}} (\S\ref{mode-of-operation}) specifies
345 the automatic provers (ATPs and SMT solvers) that should be run whenever
346 Sledgehammer is invoked (e.g., ``\textit{provers}~= \textit{e spass
347 remote\_vampire\/}''). For convenience, you can omit ``\textit{provers}~=''
348 and simply write the prover names as a space-separated list (e.g., ``\textit{e
349 spass remote\_vampire\/}'').
351 \item[\labelitemi] \textbf{\textit{max\_relevant}} (\S\ref{relevance-filter})
352 specifies the maximum number of facts that should be passed to the provers. By
353 default, the value is prover-dependent but varies between about 150 and 1000. If
354 the provers time out, you can try lowering this value to, say, 100 or 50 and see
357 \item[\labelitemi] \textbf{\textit{isar\_proof}} (\S\ref{output-format}) specifies
358 that Isar proofs should be generated, instead of one-liner \textit{metis} or
359 \textit{smt} proofs. The length of the Isar proofs can be controlled by setting
360 \textit{isar\_shrink\_factor} (\S\ref{output-format}).
362 \item[\labelitemi] \textbf{\textit{timeout}} (\S\ref{timeouts}) controls the
363 provers' time limit. It is set to 30 seconds, but since Sledgehammer runs
364 asynchronously you should not hesitate to raise this limit to 60 or 120 seconds
365 if you are the kind of user who can think clearly while ATPs are active.
368 Options can be set globally using \textbf{sledgehammer\_params}
369 (\S\ref{command-syntax}). The command also prints the list of all available
370 options with their current value. Fact selection can be influenced by specifying
371 ``$(\textit{add}{:}~\textit{my\_facts})$'' after the \textbf{sledgehammer} call
372 to ensure that certain facts are included, or simply ``$(\textit{my\_facts})$''
373 to force Sledgehammer to run only with $\textit{my\_facts}$.
375 \section{Frequently Asked Questions}
376 \label{frequently-asked-questions}
378 This sections answers frequently (and infrequently) asked questions about
379 Sledgehammer. It is a good idea to skim over it now even if you don't have any
380 questions at this stage. And if you have any further questions not listed here,
381 send them to the author at \authoremail.
383 \point{Which facts are passed to the automatic provers?}
385 The relevance filter assigns a score to every available fact (lemma, theorem,
386 definition, or axiom) based upon how many constants that fact shares with the
387 conjecture. This process iterates to include facts relevant to those just
388 accepted, but with a decay factor to ensure termination. The constants are
389 weighted to give unusual ones greater significance. The relevance filter copes
390 best when the conjecture contains some unusual constants; if all the constants
391 are common, it is unable to discriminate among the hundreds of facts that are
392 picked up. The relevance filter is also memoryless: It has no information about
393 how many times a particular fact has been used in a proof, and it cannot learn.
395 The number of facts included in a problem varies from prover to prover, since
396 some provers get overwhelmed more easily than others. You can show the number of
397 facts given using the \textit{verbose} option (\S\ref{output-format}) and the
398 actual facts using \textit{debug} (\S\ref{output-format}).
400 Sledgehammer is good at finding short proofs combining a handful of existing
401 lemmas. If you are looking for longer proofs, you must typically restrict the
402 number of facts, by setting the \textit{max\_relevant} option
403 (\S\ref{relevance-filter}) to, say, 25 or 50.
405 You can also influence which facts are actually selected in a number of ways. If
406 you simply want to ensure that a fact is included, you can specify it using the
407 ``$(\textit{add}{:}~\textit{my\_facts})$'' syntax. For example:
410 \textbf{sledgehammer} (\textit{add}: \textit{hd.simps} \textit{tl.simps})
413 The specified facts then replace the least relevant facts that would otherwise be
414 included; the other selected facts remain the same.
415 If you want to direct the selection in a particular direction, you can specify
416 the facts via \textbf{using}:
419 \textbf{using} \textit{hd.simps} \textit{tl.simps} \\
420 \textbf{sledgehammer}
423 The facts are then more likely to be selected than otherwise, and if they are
424 selected at iteration $j$ they also influence which facts are selected at
425 iterations $j + 1$, $j + 2$, etc. To give them even more weight, try
428 \textbf{using} \textit{hd.simps} \textit{tl.simps} \\
429 \textbf{apply}~\textbf{--} \\
430 \textbf{sledgehammer}
433 \point{Why does Metis fail to reconstruct the proof?}
435 There are many reasons. If Metis runs seemingly forever, that is a sign that the
436 proof is too difficult for it. Metis's search is complete, so it should
437 eventually find it, but that's little consolation. There are several possible
441 \item[\labelitemi] Try the \textit{isar\_proof} option (\S\ref{output-format}) to
442 obtain a step-by-step Isar proof where each step is justified by \textit{metis}.
443 Since the steps are fairly small, \textit{metis} is more likely to be able to
446 \item[\labelitemi] Try the \textit{smt} proof method instead of \textit{metis}. It
447 is usually stronger, but you need to either have Z3 available to replay the
448 proofs, trust the SMT solver, or use certificates. See the documentation in the
449 \emph{SMT} theory (\texttt{\$ISABELLE\_HOME/src/HOL/SMT.thy}) for details.
451 \item[\labelitemi] Try the \textit{blast} or \textit{auto} proof methods, passing
452 the necessary facts via \textbf{unfolding}, \textbf{using}, \textit{intro}{:},
453 \textit{elim}{:}, \textit{dest}{:}, or \textit{simp}{:}, as appropriate.
456 In some rare cases, \textit{metis} fails fairly quickly, and you get the error
461 One-line proof reconstruction failed.
464 This message indicates that Sledgehammer determined that the goal is provable,
465 but the proof is, for technical reasons, beyond \textit{metis}'s power. You can
466 then try again with the \textit{strict} option (\S\ref{problem-encoding}).
468 If the goal is actually unprovable are you did not specify an unsound encoding
469 using \textit{type\_enc} (\S\ref{problem-encoding}), this is a bug, and you are
470 strongly encouraged to report this to the author at \authoremail.
472 \point{Why are the generated Isar proofs so ugly/broken?}
474 The current implementation of the Isar proof feature,
475 enabled by the \textit{isar\_proof} option (\S\ref{output-format}),
476 is highly experimental. Work on a new implementation has begun. There is a large body of
477 research into transforming resolution proofs into natural deduction proofs (such
478 as Isar proofs), which we hope to leverage. In the meantime, a workaround is to
479 set the \textit{isar\_shrink\_factor} option (\S\ref{output-format}) to a larger
480 value or to try several provers and keep the nicest-looking proof.
482 \point{How can I tell whether a suggested proof is sound?}
484 Earlier versions of Sledgehammer often suggested unsound proofs---either proofs
485 of nontheorems or simply proofs that rely on type-unsound inferences. This
486 is a thing of the pass, unless you explicitly specify an unsound encoding
487 using \textit{type\_enc} (\S\ref{problem-encoding}).
489 Officially, the only form of ``unsoundness'' that lurks in the sound
490 encodings is related to missing characteristic theorems of datatypes. For
494 \textbf{lemma}~``$\exists \mathit{xs}.\; \mathit{xs} \neq []$'' \\
495 \textbf{sledgehammer} ()
498 suggests an argumentless \textit{metis} call that fails. However, the conjecture
499 does actually hold, and the \textit{metis} call can be repaired by adding
500 \textit{list.distinct}.
502 We hope to address this problem in a future version of Isabelle. In the
503 meantime, you can avoid it by passing the \textit{strict} option
504 (\S\ref{problem-encoding}).
506 \point{What are the \textit{full\_types}, \textit{no\_types}, and
507 \textit{mono\_tags} arguments to Metis?}
509 The \textit{metis}~(\textit{full\_types}) proof method
510 and its cousin \textit{metis}~(\textit{mono\_tags}) are fully-typed
511 version of Metis. It is somewhat slower than \textit{metis}, but the proof
512 search is fully typed, and it also includes more powerful rules such as the
513 axiom ``$x = \const{True} \mathrel{\lor} x = \const{False}$'' for reasoning in
514 higher-order places (e.g., in set comprehensions). The method kicks in
515 automatically as a fallback when \textit{metis} fails, and it is sometimes
516 generated by Sledgehammer instead of \textit{metis} if the proof obviously
517 requires type information or if \textit{metis} failed when Sledgehammer
518 preplayed the proof. (By default, Sledgehammer tries to run \textit{metis} with
519 various options for up to 3 seconds each time to ensure that the generated
520 one-line proofs actually work and to display timing information. This can be
521 configured using the \textit{preplay\_timeout} option (\S\ref{timeouts}).)
523 At the other end of the soundness spectrum, \textit{metis} (\textit{no\_types})
524 uses no type information at all during the proof search, which is more efficient
525 but often fails. Calls to \textit{metis} (\textit{no\_types}) are occasionally
526 generated by Sledgehammer.
528 See the \textit{type\_enc} option (\S\ref{problem-encoding}) for details.
530 Incidentally, if you ever see warnings such as
534 Metis: Falling back on ``\textit{metis} (\textit{full\_types})''.
537 for a successful \textit{metis} proof, you can advantageously pass the
538 \textit{full\_types} option to \textit{metis} directly.
540 \point{And what are the \textit{lifting} and \textit{hide\_lams} arguments
543 Orthogonally to the encoding of types, it is important to choose an appropriate
544 translation of $\lambda$-abstractions. Metis supports three translation schemes,
545 in decreasing order of power: Curry combinators (the default),
546 $\lambda$-lifting, and a ``hiding'' scheme that disables all reasoning under
547 $\lambda$-abstractions. The more powerful schemes also give the automatic
548 provers more rope to hang themselves. See the \textit{lam\_trans} option (\S\ref{problem-encoding}) for details.
550 \point{Are generated proofs minimal?}
552 Automatic provers frequently use many more facts than are necessary.
553 Sledgehammer inclues a minimization tool that takes a set of facts returned by a
554 given prover and repeatedly calls the same prover, \textit{metis}, or
555 \textit{smt} with subsets of those axioms in order to find a minimal set.
556 Reducing the number of axioms typically improves Metis's speed and success rate,
557 while also removing superfluous clutter from the proof scripts.
559 In earlier versions of Sledgehammer, generated proofs were systematically
560 accompanied by a suggestion to invoke the minimization tool. This step is now
561 performed implicitly if it can be done in a reasonable amount of time (something
562 that can be guessed from the number of facts in the original proof and the time
563 it took to find or preplay it).
565 In addition, some provers (e.g., Yices) do not provide proofs or sometimes
566 produce incomplete proofs. The minimizer is then invoked to find out which facts
567 are actually needed from the (large) set of facts that was initinally given to
568 the prover. Finally, if a prover returns a proof with lots of facts, the
569 minimizer is invoked automatically since Metis would be unlikely to re-find the
572 Automatic minimization can be forced or disabled using the \textit{minimize}
573 option (\S\ref{mode-of-operation}).
575 \point{A strange error occurred---what should I do?}
577 Sledgehammer tries to give informative error messages. Please report any strange
578 error to the author at \authoremail. This applies double if you get the message
582 The prover found a type-unsound proof involving ``\textit{foo\/}'',
583 ``\textit{bar\/}'', and ``\textit{baz\/}'' even though a supposedly type-sound
584 encoding was used (or, less likely, your axioms are inconsistent). You might
585 want to report this to the Isabelle developers.
588 \point{Auto can solve it---why not Sledgehammer?}
590 Problems can be easy for \textit{auto} and difficult for automatic provers, but
591 the reverse is also true, so don't be discouraged if your first attempts fail.
592 Because the system refers to all theorems known to Isabelle, it is particularly
593 suitable when your goal has a short proof from lemmas that you don't know about.
595 \point{Why are there so many options?}
597 Sledgehammer's philosophy should work out of the box, without user guidance.
598 Many of the options are meant to be used mostly by the Sledgehammer developers
599 for experimentation purposes. Of course, feel free to experiment with them if
602 \section{Command Syntax}
603 \label{command-syntax}
605 \subsection{Sledgehammer}
607 Sledgehammer can be invoked at any point when there is an open goal by entering
608 the \textbf{sledgehammer} command in the theory file. Its general syntax is as
612 \textbf{sledgehammer} \qty{subcommand}$^?$ \qty{options}$^?$ \qty{facts\_override}$^?$ \qty{num}$^?$
615 For convenience, Sledgehammer is also available in the ``Commands'' submenu of
616 the ``Isabelle'' menu in Proof General or by pressing the Emacs key sequence C-c
617 C-a C-s. This is equivalent to entering the \textbf{sledgehammer} command with
618 no arguments in the theory text.
620 In the general syntax, the \qty{subcommand} may be any of the following:
623 \item[\labelitemi] \textbf{\textit{run} (the default):} Runs Sledgehammer on
624 subgoal number \qty{num} (1 by default), with the given options and facts.
626 \item[\labelitemi] \textbf{\textit{min}:} Attempts to minimize the facts
627 specified in the \qty{facts\_override} argument to obtain a simpler proof
628 involving fewer facts. The options and goal number are as for \textit{run}.
630 \item[\labelitemi] \textbf{\textit{messages}:} Redisplays recent messages issued
631 by Sledgehammer. This allows you to examine results that might have been lost
632 due to Sledgehammer's asynchronous nature. The \qty{num} argument specifies a
633 limit on the number of messages to display (5 by default).
635 \item[\labelitemi] \textbf{\textit{supported\_provers}:} Prints the list of
636 automatic provers supported by Sledgehammer. See \S\ref{installation} and
637 \S\ref{mode-of-operation} for more information on how to install automatic
640 \item[\labelitemi] \textbf{\textit{running\_provers}:} Prints information about
641 currently running automatic provers, including elapsed runtime and remaining
644 \item[\labelitemi] \textbf{\textit{kill\_provers}:} Terminates all running
647 \item[\labelitemi] \textbf{\textit{refresh\_tptp}:} Refreshes the list of remote
648 ATPs available at System\-On\-TPTP \cite{sutcliffe-2000}.
651 Sledgehammer's behavior can be influenced by various \qty{options}, which can be
652 specified in brackets after the \textbf{sledgehammer} command. The
653 \qty{options} are a list of key--value pairs of the form ``[$k_1 = v_1,
654 \ldots, k_n = v_n$]''. For Boolean options, ``= \textit{true\/}'' is optional. For
658 \textbf{sledgehammer} [\textit{isar\_proof}, \,\textit{timeout} = 120]
661 Default values can be set using \textbf{sledgehammer\_\allowbreak params}:
664 \textbf{sledgehammer\_params} \qty{options}
667 The supported options are described in \S\ref{option-reference}.
669 The \qty{facts\_override} argument lets you alter the set of facts that go
670 through the relevance filter. It may be of the form ``(\qty{facts})'', where
671 \qty{facts} is a space-separated list of Isabelle facts (theorems, local
672 assumptions, etc.), in which case the relevance filter is bypassed and the given
673 facts are used. It may also be of the form ``(\textit{add}:\ \qty{facts\/_{\mathrm{1}}})'',
674 ``(\textit{del}:\ \qty{facts\/_{\mathrm{2}}})'', or ``(\textit{add}:\ \qty{facts\/_{\mathrm{1}}}\
675 \textit{del}:\ \qty{facts\/_{\mathrm{2}}})'', where the relevance filter is instructed to
676 proceed as usual except that it should consider \qty{facts\/_{\mathrm{1}}}
677 highly-relevant and \qty{facts\/_{\mathrm{2}}} fully irrelevant.
679 You can instruct Sledgehammer to run automatically on newly entered theorems by
680 enabling the ``Auto Sledgehammer'' option in Proof General's ``Isabelle'' menu.
681 For automatic runs, only the first prover set using \textit{provers}
682 (\S\ref{mode-of-operation}) is considered, fewer facts are passed to the prover,
683 \textit{slice} (\S\ref{mode-of-operation}) is disabled, \textit{strict}
684 (\S\ref{problem-encoding}) is enabled, \textit{verbose} (\S\ref{output-format})
685 and \textit{debug} (\S\ref{output-format}) are disabled, and \textit{timeout}
686 (\S\ref{timeouts}) is superseded by the ``Auto Tools Time Limit'' in Proof
687 General's ``Isabelle'' menu. Sledgehammer's output is also more concise.
691 The \textit{metis} proof method has the syntax
694 \textbf{\textit{metis}}~(\qty{options})${}^?$~\qty{facts}${}^?$
697 where \qty{facts} is a list of arbitrary facts and \qty{options} is a
698 comma-separated list consisting of at most one $\lambda$ translation scheme
699 specification with the same semantics as Sledgehammer's \textit{lam\_trans}
700 option (\S\ref{problem-encoding}) and at most one type encoding specification
701 with the same semantics as Sledgehammer's \textit{type\_enc} option
702 (\S\ref{problem-encoding}).
704 The supported $\lambda$ translation schemes are \textit{hide\_lams},
705 \textit{lifting}, and \textit{combs} (the default).
707 All the untyped type encodings listed in \S\ref{problem-encoding} are supported.
708 For convenience, the following aliases are provided:
710 \item[\labelitemi] \textbf{\textit{full\_types}:} Synonym for \textit{poly\_guards\_query}.
711 \item[\labelitemi] \textbf{\textit{partial\_types}:} Synonym for \textit{poly\_args}.
712 \item[\labelitemi] \textbf{\textit{no\_types}:} Synonym for \textit{erased}.
715 \section{Option Reference}
716 \label{option-reference}
721 \def\flushitem#1{\item[]\noindent\kern-\leftmargin \textbf{#1}}
722 \def\optrue#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool}$\bigr]$\enskip \defl\textit{true}\defr\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
723 \def\opfalse#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool}$\bigr]$\enskip \defl\textit{false}\defr\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
724 \def\opsmart#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{smart\_bool}$\bigr]$\enskip \defl\textit{smart}\defr\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
725 \def\opsmartx#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{smart\_bool}$\bigr]$\enskip \defl\textit{smart}\defr\\\hbox{}\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
726 \def\opnodefault#1#2{\flushitem{\textit{#1} = \qtybf{#2}} \nopagebreak\\[\parskip]}
727 \def\opnodefaultbrk#1#2{\flushitem{$\bigl[$\textit{#1} =$\bigr]$ \qtybf{#2}} \nopagebreak\\[\parskip]}
728 \def\opdefault#1#2#3{\flushitem{\textit{#1} = \qtybf{#2}\enskip \defl\textit{#3}\defr} \nopagebreak\\[\parskip]}
729 \def\oparg#1#2#3{\flushitem{\textit{#1} \qtybf{#2} = \qtybf{#3}} \nopagebreak\\[\parskip]}
730 \def\opargbool#1#2#3{\flushitem{\textit{#1} \qtybf{#2} $\bigl[$= \qtybf{bool}$\bigr]$\hfill (neg.: \textit{#3})}\nopagebreak\\[\parskip]}
731 \def\opargboolorsmart#1#2#3{\flushitem{\textit{#1} \qtybf{#2} $\bigl[$= \qtybf{smart\_bool}$\bigr]$\hfill (neg.: \textit{#3})}\nopagebreak\\[\parskip]}
733 Sledgehammer's options are categorized as follows:\ mode of operation
734 (\S\ref{mode-of-operation}), problem encoding (\S\ref{problem-encoding}),
735 relevance filter (\S\ref{relevance-filter}), output format
736 (\S\ref{output-format}), authentication (\S\ref{authentication}), and timeouts
739 The descriptions below refer to the following syntactic quantities:
742 \item[\labelitemi] \qtybf{string}: A string.
743 \item[\labelitemi] \qtybf{bool\/}: \textit{true} or \textit{false}.
744 \item[\labelitemi] \qtybf{smart\_bool\/}: \textit{true}, \textit{false}, or
746 \item[\labelitemi] \qtybf{int\/}: An integer.
747 %\item[\labelitemi] \qtybf{float\/}: A floating-point number (e.g., 2.5).
748 \item[\labelitemi] \qtybf{float\_pair\/}: A pair of floating-point numbers
750 \item[\labelitemi] \qtybf{smart\_int\/}: An integer or \textit{smart}.
751 \item[\labelitemi] \qtybf{float\_or\_none\/}: A floating-point number (e.g., 60 or
752 0.5) expressing a number of seconds, or the keyword \textit{none} ($\infty$
756 Default values are indicated in curly brackets (\textrm{\{\}}). Boolean options
757 have a negated counterpart (e.g., \textit{blocking} vs.\
758 \textit{non\_blocking}). When setting them, ``= \textit{true\/}'' may be omitted.
760 \subsection{Mode of Operation}
761 \label{mode-of-operation}
764 \opnodefaultbrk{provers}{string}
765 Specifies the automatic provers to use as a space-separated list (e.g.,
766 ``\textit{e}~\textit{spass}~\textit{remote\_vampire\/}'').
767 Provers can be run locally or remotely; see \S\ref{installation} for
768 installation instructions.
770 The following local provers are supported:
773 \item[\labelitemi] \textbf{\textit{cvc3}:} CVC3 is an SMT solver developed by
774 Clark Barrett, Cesare Tinelli, and their colleagues \cite{cvc3}. To use CVC3,
775 set the environment variable \texttt{CVC3\_SOLVER} to the complete path of the
776 executable, including the file name, or install the prebuilt CVC3 package from
777 \download. Sledgehammer has been tested with version 2.2.
779 \item[\labelitemi] \textbf{\textit{e}:} E is a first-order resolution prover
780 developed by Stephan Schulz \cite{schulz-2002}. To use E, set the environment
781 variable \texttt{E\_HOME} to the directory that contains the \texttt{eproof}
782 executable, or install the prebuilt E package from \download. Sledgehammer has
783 been tested with versions 1.0 to 1.4.
785 \item[\labelitemi] \textbf{\textit{leo2}:} LEO-II is an automatic
786 higher-order prover developed by Christoph Benzm\"uller et al.\ \cite{leo2},
787 with support for the TPTP typed higher-order syntax (THF0). To use LEO-II, set
788 the environment variable \texttt{LEO2\_HOME} to the directory that contains the
789 \texttt{leo} executable. Sledgehammer requires version 1.2.9 or above.
791 \item[\labelitemi] \textbf{\textit{metis}:} Although it is much less powerful than
792 the external provers, Metis itself can be used for proof search.
794 \item[\labelitemi] \textbf{\textit{satallax}:} Satallax is an automatic
795 higher-order prover developed by Chad Brown et al.\ \cite{satallax}, with
796 support for the TPTP typed higher-order syntax (THF0). To use Satallax, set the
797 environment variable \texttt{SATALLAX\_HOME} to the directory that contains the
798 \texttt{satallax} executable. Sledgehammer requires version 2.2 or above.
800 \item[\labelitemi] \textbf{\textit{smt}:} The \textit{smt} proof method with the
801 current settings (usually:\ Z3 with proof reconstruction).
803 \item[\labelitemi] \textbf{\textit{spass}:} SPASS is a first-order resolution
804 prover developed by Christoph Weidenbach et al.\ \cite{weidenbach-et-al-2009}.
805 To use SPASS, set the environment variable \texttt{SPASS\_HOME} to the directory
806 that contains the \texttt{SPASS} executable, or install the prebuilt SPASS
807 package from \download. Sledgehammer requires version 3.5 or above.
809 \item[\labelitemi] \textbf{\textit{vampire}:} Vampire is a first-order resolution
810 prover developed by Andrei Voronkov and his colleagues
811 \cite{riazanov-voronkov-2002}. To use Vampire, set the environment variable
812 \texttt{VAMPIRE\_HOME} to the directory that contains the \texttt{vampire}
813 executable and \texttt{VAMPIRE\_VERSION} to the version number (e.g., ``1.8'').
814 Sledgehammer has been tested with versions 0.6, 1.0, and 1.8.
815 %Vampire 1.8 supports the TPTP typed first-order format (TFF0).
817 \item[\labelitemi] \textbf{\textit{yices}:} Yices is an SMT solver developed at
818 SRI \cite{yices}. To use Yices, set the environment variable
819 \texttt{YICES\_SOLVER} to the complete path of the executable, including the
820 file name. Sledgehammer has been tested with version 1.0.28.
822 \item[\labelitemi] \textbf{\textit{z3}:} Z3 is an SMT solver developed at
823 Microsoft Research \cite{z3}. To use Z3, set the environment variable
824 \texttt{Z3\_SOLVER} to the complete path of the executable, including the file
825 name, and set \texttt{Z3\_NON\_COMMERCIAL} to ``yes'' to confirm that you are a
826 noncommercial user. Sledgehammer has been tested with versions 3.0 to 3.2.
828 \item[\labelitemi] \textbf{\textit{z3\_tptp}:} This version of Z3 pretends to be
829 an ATP, exploiting Z3's support for the TPTP untyped and typed first-order
830 formats (FOF and TFF0). It is included for experimental purposes. It
831 requires version 3.0 or above. To use it, set the environment variable
832 \texttt{Z3\_HOME} to the directory that contains the \texttt{z3}
836 The following remote provers are supported:
839 \item[\labelitemi] \textbf{\textit{remote\_cvc3}:} The remote version of CVC3 runs
840 on servers at the TU M\"unchen (or wherever \texttt{REMOTE\_SMT\_URL} is set to
843 \item[\labelitemi] \textbf{\textit{remote\_e}:} The remote version of E runs
844 on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}.
846 \item[\labelitemi] \textbf{\textit{remote\_e\_sine}:} E-SInE is a metaprover
847 developed by Kry\v stof Hoder \cite{sine} based on E. The remote version of
848 SInE runs on Geoff Sutcliffe's Miami servers.
850 \item[\labelitemi] \textbf{\textit{remote\_e\_tofof}:} E-ToFoF is a metaprover
851 developed by Geoff Sutcliffe \cite{tofof} based on E running on his Miami
852 servers. This ATP supports the TPTP typed first-order format (TFF0). The
853 remote version of E-ToFoF runs on Geoff Sutcliffe's Miami servers.
855 \item[\labelitemi] \textbf{\textit{remote\_iprover}:} iProver is a pure
856 instantiation-based prover developed by Konstantin Korovin \cite{korovin-2009}. The
857 remote version of iProver runs on Geoff Sutcliffe's Miami servers
858 \cite{sutcliffe-2000}.
860 \item[\labelitemi] \textbf{\textit{remote\_iprover\_eq}:} iProver-Eq is an
861 instantiation-based prover with native support for equality developed by
862 Konstantin Korovin and Christoph Sticksel \cite{korovin-sticksel-2010}. The
863 remote version of iProver-Eq runs on Geoff Sutcliffe's Miami servers
864 \cite{sutcliffe-2000}.
866 The remote version of LEO-II
867 runs on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}.
869 \item[\labelitemi] \textbf{\textit{remote\_leo2}:} The remote version of LEO-II
870 runs on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}.
872 \item[\labelitemi] \textbf{\textit{remote\_satallax}:} The remote version of
873 Satallax runs on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}.
875 \item[\labelitemi] \textbf{\textit{remote\_snark}:} SNARK is a first-order
876 resolution prover developed by Stickel et al.\ \cite{snark}. It supports the
877 TPTP typed first-order format (TFF0). The remote version of SNARK runs on
878 Geoff Sutcliffe's Miami servers.
880 \item[\labelitemi] \textbf{\textit{remote\_vampire}:} The remote version of
881 Vampire runs on Geoff Sutcliffe's Miami servers. Version 1.8 is used.
883 \item[\labelitemi] \textbf{\textit{remote\_waldmeister}:} Waldmeister is a unit
884 equality prover developed by Hillenbrand et al.\ \cite{waldmeister}. It can be
885 used to prove universally quantified equations using unconditional equations,
886 corresponding to the TPTP CNF UEQ division. The remote version of Waldmeister
887 runs on Geoff Sutcliffe's Miami servers.
889 \item[\labelitemi] \textbf{\textit{remote\_z3}:} The remote version of Z3 runs on
890 servers at the TU M\"unchen (or wherever \texttt{REMOTE\_SMT\_URL} is set to
893 \item[\labelitemi] \textbf{\textit{remote\_z3\_tptp}:} The remote version of ``Z3
894 with TPTP syntax'' runs on Geoff Sutcliffe's Miami servers.
897 By default, Sledgehammer runs E, E-SInE, SPASS, Vampire, Z3 (or whatever
898 the SMT module's \textit{smt\_solver} configuration option is set to), and (if
899 appropriate) Waldmeister in parallel---either locally or remotely, depending on
900 the number of processor cores available. For historical reasons, the default
901 value of this option can be overridden using the option ``Sledgehammer:
902 Provers'' in Proof General's ``Isabelle'' menu.
904 It is generally a good idea to run several provers in parallel. Running E,
905 SPASS, and Vampire for 5~seconds yields a similar success rate to running the
906 most effective of these for 120~seconds \cite{boehme-nipkow-2010}.
908 For the \textit{min} subcommand, the default prover is \textit{metis}. If
909 several provers are set, the first one is used.
911 \opnodefault{prover}{string}
912 Alias for \textit{provers}.
914 %\opnodefault{atps}{string}
915 %Legacy alias for \textit{provers}.
917 %\opnodefault{atp}{string}
918 %Legacy alias for \textit{provers}.
920 \opfalse{blocking}{non\_blocking}
921 Specifies whether the \textbf{sledgehammer} command should operate
922 synchronously. The asynchronous (non-blocking) mode lets the user start proving
923 the putative theorem manually while Sledgehammer looks for a proof, but it can
924 also be more confusing. Irrespective of the value of this option, Sledgehammer
925 is always run synchronously for the new jEdit-based user interface or if
926 \textit{debug} (\S\ref{output-format}) is enabled.
928 \optrue{slice}{dont\_slice}
929 Specifies whether the time allocated to a prover should be sliced into several
930 segments, each of which has its own set of possibly prover-dependent options.
931 For SPASS and Vampire, the first slice tries the fast but incomplete
932 set-of-support (SOS) strategy, whereas the second slice runs without it. For E,
933 up to three slices are tried, with different weighted search strategies and
934 number of facts. For SMT solvers, several slices are tried with the same options
935 each time but fewer and fewer facts. According to benchmarks with a timeout of
936 30 seconds, slicing is a valuable optimization, and you should probably leave it
937 enabled unless you are conducting experiments. This option is implicitly
938 disabled for (short) automatic runs.
941 {\small See also \textit{verbose} (\S\ref{output-format}).}
943 \opsmart{minimize}{dont\_minimize}
944 Specifies whether the minimization tool should be invoked automatically after
945 proof search. By default, automatic minimization takes place only if
946 it can be done in a reasonable amount of time (as determined by
947 the number of facts in the original proof and the time it took to find or
948 preplay it) or the proof involves an unreasonably large number of facts.
951 {\small See also \textit{preplay\_timeout} (\S\ref{timeouts}).}
953 \opfalse{overlord}{no\_overlord}
954 Specifies whether Sledgehammer should put its temporary files in
955 \texttt{\$ISA\-BELLE\_\allowbreak HOME\_\allowbreak USER}, which is useful for
956 debugging Sledgehammer but also unsafe if several instances of the tool are run
957 simultaneously. The files are identified by the prefix \texttt{prob\_}; you may
958 safely remove them after Sledgehammer has run.
961 {\small See also \textit{debug} (\S\ref{output-format}).}
964 \subsection{Problem Encoding}
965 \label{problem-encoding}
967 \newcommand\comb[1]{\const{#1}}
970 \opdefault{lam\_trans}{string}{smart}
971 Specifies the $\lambda$ translation scheme to use in ATP problems. The supported
972 translation schemes are listed below:
975 \item[\labelitemi] \textbf{\textit{hide\_lams}:} Hide the $\lambda$-abstractions
976 by replacing them by unspecified fresh constants, effectively disabling all
977 reasoning under $\lambda$-abstractions.
979 \item[\labelitemi] \textbf{\textit{lifting}:} Introduce a new
980 supercombinator \const{c} for each cluster of $n$~$\lambda$-abstractions,
981 defined using an equation $\const{c}~x_1~\ldots~x_n = t$ ($\lambda$-lifting).
983 \item[\labelitemi] \textbf{\textit{combs}:} Rewrite lambdas to the Curry
984 combinators (\comb{I}, \comb{K}, \comb{S}, \comb{B}, \comb{C}). Combinators
985 enable the ATPs to synthesize $\lambda$-terms but tend to yield bulkier formulas
986 than $\lambda$-lifting: The translation is quadratic in the worst case, and the
987 equational definitions of the combinators are very prolific in the context of
990 \item[\labelitemi] \textbf{\textit{combs\_and\_lifting}:} Introduce a new
991 supercombinator \const{c} for each cluster of $\lambda$-abstractions and characterize it both using a
992 lifted equation $\const{c}~x_1~\ldots~x_n = t$ and via Curry combinators.
994 \item[\labelitemi] \textbf{\textit{combs\_or\_lifting}:} For each cluster of
995 $\lambda$-abstractions, heuristically choose between $\lambda$-lifting and Curry
998 \item[\labelitemi] \textbf{\textit{keep\_lams}:}
999 Keep the $\lambda$-abstractions in the generated problems. This is available
1000 only with provers that support the THF0 syntax.
1002 \item[\labelitemi] \textbf{\textit{smart}:} The actual translation scheme used
1003 depends on the ATP and should be the most efficient scheme for that ATP.
1006 For SMT solvers, the $\lambda$ translation scheme is always \textit{lifting},
1007 irrespective of the value of this option.
1009 \opsmartx{uncurried\_aliases}{no\_uncurried\_aliases}
1010 Specifies the $\lambda$ translation scheme to use in ATP problems. The supported
1011 translation schemes are listed below:
1013 \opdefault{type\_enc}{string}{smart}
1014 Specifies the type encoding to use in ATP problems. Some of the type encodings
1015 are unsound, meaning that they can give rise to spurious proofs
1016 (unreconstructible using \textit{metis}). The supported type encodings are
1017 listed below, with an indication of their soundness in parentheses.
1018 An asterisk (*) means that the encoding is slightly incomplete for
1019 reconstruction with \textit{metis}, unless the \emph{strict} option (described
1023 \item[\labelitemi] \textbf{\textit{erased} (very unsound):} No type information is
1024 supplied to the ATP, not even to resolve overloading. Types are simply erased.
1026 \item[\labelitemi] \textbf{\textit{poly\_guards} (sound):} Types are encoded using
1027 a predicate \const{g}$(\tau, t)$ that guards bound
1028 variables. Constants are annotated with their types, supplied as additional
1029 arguments, to resolve overloading.
1031 \item[\labelitemi] \textbf{\textit{poly\_tags} (sound):} Each term and subterm is
1032 tagged with its type using a function $\const{t\/}(\tau, t)$.
1034 \item[\labelitemi] \textbf{\textit{poly\_args} (unsound):}
1035 Like for \textit{poly\_guards} constants are annotated with their types to
1036 resolve overloading, but otherwise no type information is encoded. This
1037 coincides with the default encoding used by the \textit{metis} command.
1041 \textit{raw\_mono\_guards}, \textit{raw\_mono\_tags} (sound); \\
1042 \textit{raw\_mono\_args} (unsound):} \\
1043 Similar to \textit{poly\_guards}, \textit{poly\_tags}, and \textit{poly\_args},
1044 respectively, but the problem is additionally monomorphized, meaning that type
1045 variables are instantiated with heuristically chosen ground types.
1046 Monomorphization can simplify reasoning but also leads to larger fact bases,
1047 which can slow down the ATPs.
1051 \textit{mono\_guards}, \textit{mono\_tags} (sound);
1052 \textit{mono\_args} (unsound):} \\
1054 \textit{raw\_mono\_guards}, \textit{raw\_mono\_tags}, and
1055 \textit{raw\_mono\_args}, respectively but types are mangled in constant names
1056 instead of being supplied as ground term arguments. The binary predicate
1057 $\const{g}(\tau, t)$ becomes a unary predicate
1058 $\const{g\_}\tau(t)$, and the binary function
1059 $\const{t}(\tau, t)$ becomes a unary function
1060 $\const{t\_}\tau(t)$.
1062 \item[\labelitemi] \textbf{\textit{mono\_simple} (sound):} Exploits simple
1063 first-order types if the prover supports the TFF0 or THF0 syntax; otherwise,
1064 falls back on \textit{mono\_guards}. The problem is monomorphized.
1066 \item[\labelitemi] \textbf{\textit{mono\_simple\_higher} (sound):} Exploits simple
1067 higher-order types if the prover supports the THF0 syntax; otherwise, falls back
1068 on \textit{mono\_simple} or \textit{mono\_guards}. The problem is monomorphized.
1072 \textit{poly\_guards}?, \textit{poly\_tags}?, \textit{raw\_mono\_guards}?, \\
1073 \textit{raw\_mono\_tags}?, \textit{mono\_guards}?, \textit{mono\_tags}?, \\
1074 \textit{mono\_simple}? (sound*):} \\
1075 The type encodings \textit{poly\_guards}, \textit{poly\_tags},
1076 \textit{raw\_mono\_guards}, \textit{raw\_mono\_tags}, \textit{mono\_guards},
1077 \textit{mono\_tags}, and \textit{mono\_simple} are fully
1078 typed and sound. For each of these, Sledgehammer also provides a lighter,
1079 virtually sound variant identified by a question mark (`\hbox{?}')\ that detects
1080 and erases monotonic types, notably infinite types. (For \textit{mono\_simple},
1081 the types are not actually erased but rather replaced by a shared uniform type
1082 of individuals.) As argument to the \textit{metis} proof method, the question
1083 mark is replaced by a \hbox{``\textit{\_query\/}''} suffix.
1087 \textit{poly\_guards}??, \textit{poly\_tags}??, \textit{raw\_mono\_guards}??, \\
1088 \textit{raw\_mono\_tags}??, \textit{mono\_guards}??, \textit{mono\_tags}?? \\
1090 Even lighter versions of the `\hbox{?}' encodings. As argument to the
1091 \textit{metis} proof method, the `\hbox{??}' suffix is replaced by
1092 \hbox{``\textit{\_query\_query\/}''}.
1096 \textit{poly\_guards}@?, \textit{raw\_mono\_guards}@? (sound*):} \\
1097 Alternative versions of the `\hbox{??}' encodings. As argument to the
1098 \textit{metis} proof method, the `\hbox{@?}' suffix is replaced by
1099 \hbox{``\textit{\_at\_query\/}''}.
1103 \textit{poly\_guards}!, \textit{poly\_tags}!, \textit{raw\_mono\_guards}!, \\
1104 \textit{raw\_mono\_tags}!, \textit{mono\_guards}!, \textit{mono\_tags}!, \\
1105 \textit{mono\_simple}!, \textit{mono\_simple\_higher}! (mildly unsound):} \\
1106 The type encodings \textit{poly\_guards}, \textit{poly\_tags},
1107 \textit{raw\_mono\_guards}, \textit{raw\_mono\_tags}, \textit{mono\_guards},
1108 \textit{mono\_tags}, \textit{mono\_simple}, and \textit{mono\_simple\_higher}
1109 also admit a mildly unsound (but very efficient) variant identified by an
1110 exclamation mark (`\hbox{!}') that detects and erases erases all types except
1111 those that are clearly finite (e.g., \textit{bool}). (For \textit{mono\_simple}
1112 and \textit{mono\_simple\_higher}, the types are not actually erased but rather
1113 replaced by a shared uniform type of individuals.) As argument to the
1114 \textit{metis} proof method, the exclamation mark is replaced by the suffix
1115 \hbox{``\textit{\_bang\/}''}.
1119 \textit{poly\_guards}!!, \textit{poly\_tags}!!, \textit{raw\_mono\_guards}!!, \\
1120 \textit{raw\_mono\_tags}!!, \textit{mono\_guards}!!, \textit{mono\_tags}!! \\
1121 (mildly unsound):} \\
1122 Even lighter versions of the `\hbox{!}' encodings. As argument to the
1123 \textit{metis} proof method, the `\hbox{!!}' suffix is replaced by
1124 \hbox{``\textit{\_bang\_bang\/}''}.
1128 \textit{poly\_guards}@!, \textit{raw\_mono\_guards}@! (mildly unsound):} \\
1129 Alternative versions of the `\hbox{!!}' encodings. As argument to the
1130 \textit{metis} proof method, the `\hbox{@!}' suffix is replaced by
1131 \hbox{``\textit{\_at\_bang\/}''}.
1133 \item[\labelitemi] \textbf{\textit{smart}:} The actual encoding used depends on
1134 the ATP and should be the most efficient virtually sound encoding for that ATP.
1137 For SMT solvers, the type encoding is always \textit{mono\_simple}, irrespective
1138 of the value of this option.
1141 {\small See also \textit{max\_new\_mono\_instances} (\S\ref{relevance-filter})
1142 and \textit{max\_mono\_iters} (\S\ref{relevance-filter}).}
1144 \opfalse{strict}{non\_strict}
1145 Specifies whether Sledgehammer should run in its strict mode. In that mode,
1146 sound type encodings marked with an asterisk (*) above are made complete
1147 for reconstruction with \textit{metis}, at the cost of some clutter in the
1148 generated problems. This option has no effect if \textit{type\_enc} is
1149 deliberately set to an unsound encoding.
1152 \subsection{Relevance Filter}
1153 \label{relevance-filter}
1156 \opdefault{relevance\_thresholds}{float\_pair}{\upshape 0.45~0.85}
1157 Specifies the thresholds above which facts are considered relevant by the
1158 relevance filter. The first threshold is used for the first iteration of the
1159 relevance filter and the second threshold is used for the last iteration (if it
1160 is reached). The effective threshold is quadratically interpolated for the other
1161 iterations. Each threshold ranges from 0 to 1, where 0 means that all theorems
1162 are relevant and 1 only theorems that refer to previously seen constants.
1164 \opdefault{max\_relevant}{smart\_int}{smart}
1165 Specifies the maximum number of facts that may be returned by the relevance
1166 filter. If the option is set to \textit{smart}, it is set to a value that was
1167 empirically found to be appropriate for the prover. A typical value would be
1170 \opdefault{max\_new\_mono\_instances}{int}{\upshape 200}
1171 Specifies the maximum number of monomorphic instances to generate beyond
1172 \textit{max\_relevant}. The higher this limit is, the more monomorphic instances
1173 are potentially generated. Whether monomorphization takes place depends on the
1177 {\small See also \textit{type\_enc} (\S\ref{problem-encoding}).}
1179 \opdefault{max\_mono\_iters}{int}{\upshape 3}
1180 Specifies the maximum number of iterations for the monomorphization fixpoint
1181 construction. The higher this limit is, the more monomorphic instances are
1182 potentially generated. Whether monomorphization takes place depends on the
1186 {\small See also \textit{type\_enc} (\S\ref{problem-encoding}).}
1189 \subsection{Output Format}
1190 \label{output-format}
1194 \opfalse{verbose}{quiet}
1195 Specifies whether the \textbf{sledgehammer} command should explain what it does.
1196 This option is implicitly disabled for automatic runs.
1198 \opfalse{debug}{no\_debug}
1199 Specifies whether Sledgehammer should display additional debugging information
1200 beyond what \textit{verbose} already displays. Enabling \textit{debug} also
1201 enables \textit{verbose} and \textit{blocking} (\S\ref{mode-of-operation})
1202 behind the scenes. The \textit{debug} option is implicitly disabled for
1206 {\small See also \textit{overlord} (\S\ref{mode-of-operation}).}
1208 \opfalse{isar\_proof}{no\_isar\_proof}
1209 Specifies whether Isar proofs should be output in addition to one-liner
1210 \textit{metis} proofs. Isar proof construction is still experimental and often
1211 fails; however, they are usually faster and sometimes more robust than
1212 \textit{metis} proofs.
1214 \opdefault{isar\_shrink\_factor}{int}{\upshape 1}
1215 Specifies the granularity of the Isar proof. A value of $n$ indicates that each
1216 Isar proof step should correspond to a group of up to $n$ consecutive proof
1217 steps in the ATP proof.
1220 \subsection{Authentication}
1221 \label{authentication}
1224 \opnodefault{expect}{string}
1225 Specifies the expected outcome, which must be one of the following:
1228 \item[\labelitemi] \textbf{\textit{some}:} Sledgehammer found a proof.
1229 \item[\labelitemi] \textbf{\textit{none}:} Sledgehammer found no proof.
1230 \item[\labelitemi] \textbf{\textit{timeout}:} Sledgehammer timed out.
1231 \item[\labelitemi] \textbf{\textit{unknown}:} Sledgehammer encountered some
1235 Sledgehammer emits an error (if \textit{blocking} is enabled) or a warning
1236 (otherwise) if the actual outcome differs from the expected outcome. This option
1237 is useful for regression testing.
1240 {\small See also \textit{blocking} (\S\ref{mode-of-operation}) and
1241 \textit{timeout} (\S\ref{timeouts}).}
1244 \subsection{Timeouts}
1248 \opdefault{timeout}{float\_or\_none}{\upshape 30}
1249 Specifies the maximum number of seconds that the automatic provers should spend
1250 searching for a proof. This excludes problem preparation and is a soft limit.
1251 For historical reasons, the default value of this option can be overridden using
1252 the option ``Sledgehammer: Time Limit'' in Proof General's ``Isabelle'' menu.
1254 \opdefault{preplay\_timeout}{float\_or\_none}{\upshape 3}
1255 Specifies the maximum number of seconds that \textit{metis} or \textit{smt}
1256 should spend trying to ``preplay'' the found proof. If this option is set to 0,
1257 no preplaying takes place, and no timing information is displayed next to the
1258 suggested \textit{metis} calls.
1261 {\small See also \textit{minimize} (\S\ref{mode-of-operation}).}
1265 \bibliography{../manual}{}
1266 \bibliographystyle{abbrv}