1 \documentclass[a4paper,12pt]{article}
2 \usepackage[T1]{fontenc}
5 \usepackage[english]{babel}
12 %\usepackage[scaled=.85]{beramono}
13 \usepackage{isabelle,iman,pdfsetup}
15 \newcommand\download{\url{http://isabelle.in.tum.de/components/}}
17 \def\qty#1{\ensuremath{\left<\mathit{#1\/}\right>}}
18 \def\qtybf#1{$\mathbf{\left<\textbf{\textit{#1\/}}\right>}$}
20 \newcommand\const[1]{\textsf{#1}}
23 %\evensidemargin=4.6mm
30 \def\Colon{\mathord{:\mkern-1.5mu:}}
31 %\def\lbrakk{\mathopen{\lbrack\mkern-3.25mu\lbrack}}
32 %\def\rbrakk{\mathclose{\rbrack\mkern-3.255mu\rbrack}}
33 \def\lparr{\mathopen{(\mkern-4mu\mid}}
34 \def\rparr{\mathclose{\mid\mkern-4mu)}}
37 \def\undef{(\lambda x.\; \unk)}
38 %\def\unr{\textit{others}}
40 \def\Abs#1{\hbox{\rm{\flqq}}{\,#1\,}\hbox{\rm{\frqq}}}
41 \def\Q{{\smash{\lower.2ex\hbox{$\scriptstyle?$}}}}
48 %\renewcommand\labelitemi{$\bullet$}
49 \renewcommand\labelitemi{\raise.065ex\hbox{\small\textbullet}}
51 \title{\includegraphics[scale=0.5]{isabelle_sledgehammer} \\[4ex]
52 Hammering Away \\[\smallskipamount]
53 \Large A User's Guide to Sledgehammer for Isabelle/HOL}
55 Jasmin Christian Blanchette \\
56 {\normalsize Institut f\"ur Informatik, Technische Universit\"at M\"unchen} \\[4\smallskipamount]
57 {\normalsize with contributions from} \\[4\smallskipamount]
58 Lawrence C. Paulson \\
59 {\normalsize Computer Laboratory, University of Cambridge} \\
66 \setlength{\parskip}{.7em plus .2em minus .1em}
67 \setlength{\parindent}{0pt}
68 \setlength{\abovedisplayskip}{\parskip}
69 \setlength{\abovedisplayshortskip}{.9\parskip}
70 \setlength{\belowdisplayskip}{\parskip}
71 \setlength{\belowdisplayshortskip}{.9\parskip}
73 % general-purpose enum environment with correct spacing
74 \newenvironment{enum}%
76 \setlength{\topsep}{.1\parskip}%
77 \setlength{\partopsep}{.1\parskip}%
78 \setlength{\itemsep}{\parskip}%
79 \advance\itemsep by-\parsep}}
82 \def\pre{\begingroup\vskip0pt plus1ex\advance\leftskip by\leftmargin
83 \advance\rightskip by\leftmargin}
84 \def\post{\vskip0pt plus1ex\endgroup}
86 \def\prew{\pre\advance\rightskip by-\leftmargin}
89 \section{Introduction}
92 Sledgehammer is a tool that applies automatic theorem provers (ATPs)
93 and satisfiability-modulo-theories (SMT) solvers on the current goal.%
94 \footnote{The distinction between ATPs and SMT solvers is convenient but mostly
95 historical. The two communities are converging, with more and more ATPs
96 supporting typical SMT features such as arithmetic and sorts, and a few SMT
97 solvers parsing ATP syntaxes. There is also a strong technological connection
98 between instantiation-based ATPs (such as iProver and iProver-Eq) and SMT
101 The supported ATPs are agsyHOL \cite{agsyHOL}, Alt-Ergo \cite{alt-ergo}, E
102 \cite{schulz-2002}, E-SInE \cite{sine}, E-ToFoF \cite{tofof}, iProver
103 \cite{korovin-2009}, iProver-Eq \cite{korovin-sticksel-2010}, LEO-II
104 \cite{leo2}, Satallax \cite{satallax}, SNARK \cite{snark}, SPASS
105 \cite{weidenbach-et-al-2009}, Vampire \cite{riazanov-voronkov-2002}, and
106 Waldmeister \cite{waldmeister}. The ATPs are run either locally or remotely via
107 the System\-On\-TPTP web service \cite{sutcliffe-2000}. In addition to the ATPs,
108 a selection of the SMT solvers CVC3 \cite{cvc3}, Yices \cite{yices}, and Z3
109 \cite{z3} are run by default; these are run either locally or (for CVC3 and Z3)
110 on a server at the TU M\"unchen.
112 The problem passed to the automatic provers consists of your current goal
113 together with a heuristic selection of hundreds of facts (theorems) from the
114 current theory context, filtered by relevance.
116 The result of a successful proof search is some source text that usually (but
117 not always) reconstructs the proof within Isabelle. For ATPs, the reconstructed
118 proof relies on the general-purpose \textit{metis} proof method, which
119 integrates the Metis ATP in Isabelle/HOL with explicit inferences going through
120 the kernel. Thus its results are correct by construction.
122 For Isabelle/jEdit users, Sledgehammer provides an automatic mode that can be
123 enabled via the ``Auto Sledgehammer'' option under ``Plugins > Plugin Options >
124 Isabelle > General.'' In this mode, Sledgehammer is run on every newly entered
128 \setbox\boxA=\hbox{\texttt{NOSPAM}}
130 \newcommand\authoremail{\texttt{blan{\color{white}NOSPAM}\kern-\wd\boxA{}chette@\allowbreak
131 in.\allowbreak tum.\allowbreak de}}
133 To run Sledgehammer, you must make sure that the theory \textit{Sledgehammer} is
134 imported---this is rarely a problem in practice since it is part of
135 \textit{Main}. Examples of Sledgehammer use can be found in Isabelle's
136 \texttt{src/HOL/Metis\_Examples} directory.
137 Comments and bug reports concerning Sledgehammer or this manual should be
138 directed to the author at \authoremail.
140 %\vskip2.5\smallskipamount
142 %\textbf{Acknowledgment.} The author would like to thank Mark Summerfield for
143 %suggesting several textual improvements.
145 \section{Installation}
148 Sledgehammer is part of Isabelle, so you do not need to install it. However, it
149 relies on third-party automatic provers (ATPs and SMT solvers).
151 Among the ATPs, agsyHOL, Alt-Ergo, E, LEO-II, Satallax, SPASS, and Vampire can
152 be run locally; in addition, agsyHOL, E, E-SInE, E-ToFoF, iProver, iProver-Eq,
153 LEO-II, Satallax, SNARK, Vampire, and Waldmeister are available remotely via
154 System\-On\-TPTP \cite{sutcliffe-2000}. If you want better performance, you
155 should at least install E and SPASS locally.
157 The SMT solvers CVC3, Yices, and Z3 can be run locally, and CVC3 and
158 Z3 can be run remotely on a TU M\"unchen server. If you want better performance
159 and get the ability to replay proofs that rely on the \emph{smt} proof method
160 without an Internet connection, you should at least have Z3 locally installed.
162 There are three main ways to install automatic provers on your machine:
166 \item[\labelitemi] If you installed an official Isabelle package, it should
167 already include properly setup executables for CVC3, E, SPASS, and Z3, ready to use.%
168 \footnote{Vampire's and Yices's licenses prevent us from doing the same for
169 these otherwise remarkable tools.}
170 For Z3, you must additionally set the variable
171 \texttt{Z3\_NON\_COMMERCIAL} to ``yes'' to confirm that you are a
172 noncommercial user, either in the environment in which Isabelle is
174 \texttt{\$ISABELLE\_HOME\_USER/etc/settings} file.
176 \item[\labelitemi] Alternatively, you can download the Isabelle-aware CVC3, E,
177 SPASS, and Z3 binary packages from \download. Extract the archives, then add a
178 line to your \texttt{\$ISABELLE\_HOME\_USER\slash etc\slash components}%
179 \footnote{The variable \texttt{\$ISABELLE\_HOME\_USER} is set by Isabelle at
180 startup. Its value can be retrieved by executing \texttt{isabelle}
181 \texttt{getenv} \texttt{ISABELLE\_HOME\_USER} on the command line.}
182 file with the absolute path to CVC3, E, SPASS, or Z3. For example, if the
183 \texttt{components} file does not exist yet and you extracted SPASS to
184 \texttt{/usr/local/spass-3.8ds}, create it with the single line
187 \texttt{/usr/local/spass-3.8ds}
192 \item[\labelitemi] If you prefer to build agsyHOL, Alt-Ergo, E, LEO-II,
193 Satallax, or SPASS manually, or found a Vampire executable somewhere (e.g.,
194 \url{http://www.vprover.org/}), set the environment variable
195 \texttt{AGSYHOL\_HOME}, \texttt{E\_HOME}, \texttt{LEO2\_HOME},
196 \texttt{SATALLAX\_HOME}, \texttt{SPASS\_HOME}, or
197 \texttt{VAMPIRE\_HOME} to the directory that contains the \texttt{agsyHOL},
198 \texttt{eprover} (and/or \texttt{eproof} or \texttt{eproof\_ram}),
199 \texttt{leo}, \texttt{satallax}, \texttt{SPASS}, or \texttt{vampire} executable;
200 for Alt-Ergo, set the
201 environment variable \texttt{WHY3\_HOME} to the directory that contains the
202 \texttt{why3} executable.
203 Sledgehammer has been tested with agsyHOL 1.0, Alt-Ergo 0.95.1, E 1.0 to 1.8,
204 LEO-II 1.3.4, Satallax 2.2 to 2.7, SPASS 3.8ds, and Vampire 0.6 to 3.0.%
205 \footnote{Following the rewrite of Vampire, the counter for version numbers was
206 reset to 0; hence the (new) Vampire versions 0.6, 1.0, 1.8, 2.6, and 3.0 are more
207 recent than 9.0 or 11.5.}%
208 Since the ATPs' output formats are neither documented nor stable, other
209 versions might not work well with Sledgehammer. Ideally,
210 you should also set \texttt{E\_VERSION}, \texttt{LEO2\_VERSION},
211 \texttt{SATALLAX\_VERSION}, \texttt{SPASS\_VERSION}, or
212 \texttt{VAMPIRE\_VERSION} to the prover's version number (e.g., ``1.8'').
214 Similarly, if you want to build CVC3, or found a
215 Yices or Z3 executable somewhere (e.g.,
216 \url{http://yices.csl.sri.com/download.shtml} or
217 \url{http://research.microsoft.com/en-us/um/redmond/projects/z3/download.html}),
218 set the environment variable \texttt{CVC3\_\allowbreak SOLVER},
219 \texttt{YICES\_SOLVER}, or \texttt{Z3\_SOLVER} to the complete path of
220 the executable, \emph{including the file name}.
221 Sledgehammer has been tested with CVC3 2.2 and 2.4.1,
222 Yices 1.0.28 and 1.0.33, and Z3 3.0 to 4.0. Since the SMT solvers' output
223 formats are somewhat unstable, other versions of the solvers might not work well
224 with Sledgehammer. Ideally, also set \texttt{CVC3\_VERSION},
225 \texttt{YICES\_VERSION}, or \texttt{Z3\_VERSION} to the solver's version number
230 To check whether E, SPASS, Vampire, and/or Z3 are successfully installed, try
231 out the example in \S\ref{first-steps}. If the remote versions of any of these
232 provers is used (identified by the prefix ``\emph{remote\_\/}''), or if the
233 local versions fail to solve the easy goal presented there, something must be
234 wrong with the installation.
236 Remote prover invocation requires Perl with the World Wide Web Library
237 (\texttt{libwww-perl}) installed. If you must use a proxy server to access the
238 Internet, set the \texttt{http\_proxy} environment variable to the proxy, either
239 in the environment in which Isabelle is launched or in your
240 \texttt{\$ISABELLE\_HOME\_USER/etc/settings} file. Here are a few
244 \texttt{http\_proxy=http://proxy.example.org} \\
245 \texttt{http\_proxy=http://proxy.example.org:8080} \\
246 \texttt{http\_proxy=http://joeblow:pAsSwRd@proxy.example.org}
249 \section{First Steps}
252 To illustrate Sledgehammer in context, let us start a theory file and
253 attempt to prove a simple lemma:
256 \textbf{theory}~\textit{Scratch} \\
257 \textbf{imports}~\textit{Main} \\
258 \textbf{begin} \\[2\smallskipamount]
260 \textbf{lemma} ``$[a] = [b] \,\Longrightarrow\, a = b$'' \\
261 \textbf{sledgehammer}
264 Instead of issuing the \textbf{sledgehammer} command, you can also use the
265 Sledgehammer panel in Isabelle/jEdit. Sledgehammer produces the following output
270 Sledgehammer: ``\textit{e\/}'' on goal \\
271 $[a] = [b] \,\Longrightarrow\, a = b$ \\
272 Try this: \textbf{by} (\textit{metis last\_ConsL}) (64 ms). \\[3\smallskipamount]
274 Sledgehammer: ``\textit{z3\/}'' on goal \\
275 $[a] = [b] \,\Longrightarrow\, a = b$ \\
276 Try this: \textbf{by} (\textit{metis list.inject}) (20 ms). \\[3\smallskipamount]
278 Sledgehammer: ``\textit{vampire\/}'' on goal \\
279 $[a] = [b] \,\Longrightarrow\, a = b$ \\
280 Try this: \textbf{by} (\textit{metis hd.simps}) (14 ms). \\[3\smallskipamount]
282 Sledgehammer: ``\textit{spass\/}'' on goal \\
283 $[a] = [b] \,\Longrightarrow\, a = b$ \\
284 Try this: \textbf{by} (\textit{metis list.inject}) (17 ms). \\[3\smallskipamount]
286 Sledgehammer: ``\textit{remote\_e\_sine\/}'' on goal \\
287 $[a] = [b] \,\Longrightarrow\, a = b$ \\
288 Try this: \textbf{by} (\textit{metis hd.simps}) (18 ms).
291 Sledgehammer ran E, E-SInE, SPASS, Vampire, and Z3 in parallel. Depending on
292 which provers are installed and how many processor cores are available, some of
293 the provers might be missing or present with a \textit{remote\_} prefix.
294 Waldmeister is run only for unit equational problems, where the goal's
295 conclusion is a (universally quantified) equation.
297 For each successful prover, Sledgehammer gives a one-liner \textit{metis} or
298 \textit{smt} method call. Rough timings are shown in parentheses, indicating how
299 fast the call is. You can click the proof to insert it into the theory text.
301 In addition, you can ask Sledgehammer for an Isar text proof by enabling the
302 \textit{isar\_proofs} option (\S\ref{output-format}):
305 \textbf{sledgehammer} [\textit{isar\_proofs}]
308 When Isar proof construction is successful, it can yield proofs that are more
309 readable and also faster than the \textit{metis} or \textit{smt} one-liners.
310 This feature is experimental and is only available for ATPs.
315 This section presents a few hints that should help you get the most out of
316 Sledgehammer. Frequently asked questions are answered in
317 \S\ref{frequently-asked-questions}.
319 %\newcommand\point[1]{\medskip\par{\sl\bfseries#1}\par\nopagebreak}
320 \newcommand\point[1]{\subsection{\emph{#1}}}
322 \point{Presimplify the goal}
324 For best results, first simplify your problem by calling \textit{auto} or at
325 least \textit{safe} followed by \textit{simp\_all}. The SMT solvers provide
326 arithmetic decision procedures, but the ATPs typically do not (or if they do,
327 Sledgehammer does not use it yet). Apart from Waldmeister, they are not
328 particularly good at heavy rewriting, but because they regard equations as
329 undirected, they often prove theorems that require the reverse orientation of a
330 \textit{simp} rule. Higher-order problems can be tackled, but the success rate
331 is better for first-order problems. Hence, you may get better results if you
332 first simplify the problem to remove higher-order features.
334 \point{Make sure E, SPASS, Vampire, and Z3 are locally installed}
336 Locally installed provers are faster and more reliable than those running on
337 servers. See \S\ref{installation} for details on how to install them.
339 \point{Familiarize yourself with the main options}
341 Sledgehammer's options are fully documented in \S\ref{command-syntax}. Many of
342 the options are very specialized, but serious users of the tool should at least
343 familiarize themselves with the following options:
346 \item[\labelitemi] \textbf{\textit{provers}} (\S\ref{mode-of-operation}) specifies
347 the automatic provers (ATPs and SMT solvers) that should be run whenever
348 Sledgehammer is invoked (e.g., ``\textit{provers}~= \textit{e spass
349 remote\_vampire\/}''). For convenience, you can omit ``\textit{provers}~=''
350 and simply write the prover names as a space-separated list (e.g., ``\textit{e
351 spass remote\_vampire\/}'').
353 \item[\labelitemi] \textbf{\textit{max\_facts}} (\S\ref{relevance-filter})
354 specifies the maximum number of facts that should be passed to the provers. By
355 default, the value is prover-dependent but varies between about 50 and 1000. If
356 the provers time out, you can try lowering this value to, say, 25 or 50 and see
359 \item[\labelitemi] \textbf{\textit{isar\_proofs}} (\S\ref{output-format}) specifies
360 that Isar proofs should be generated, in addition to one-liner \textit{metis} or
361 \textit{smt} proofs. The length of the Isar proofs can be controlled by setting
362 \textit{isar\_compress} (\S\ref{output-format}).
364 \item[\labelitemi] \textbf{\textit{timeout}} (\S\ref{timeouts}) controls the
365 provers' time limit. It is set to 30 seconds, but since Sledgehammer runs
366 asynchronously you should not hesitate to raise this limit to 60 or 120 seconds
367 if you are the kind of user who can think clearly while ATPs are active.
370 Options can be set globally using \textbf{sledgehammer\_params}
371 (\S\ref{command-syntax}). The command also prints the list of all available
372 options with their current value. Fact selection can be influenced by specifying
373 ``$(\textit{add}{:}~\textit{my\_facts})$'' after the \textbf{sledgehammer} call
374 to ensure that certain facts are included, or simply ``$(\textit{my\_facts})$''
375 to force Sledgehammer to run only with $\textit{my\_facts}$.
377 \section{Frequently Asked Questions}
378 \label{frequently-asked-questions}
380 This sections answers frequently (and infrequently) asked questions about
381 Sledgehammer. It is a good idea to skim over it now even if you do not have any
382 questions at this stage. And if you have any further questions not listed here,
383 send them to the author at \authoremail.
385 \point{Which facts are passed to the automatic provers?}
387 Sledgehammer heuristically selects a few hundred relevant lemmas from the
388 currently loaded libraries. The component that performs this selection is
389 called \emph{relevance filter}.
393 The traditional relevance filter, called \emph{MePo}
394 (\underline{Me}ng--\underline{Pau}lson), assigns a score to every available fact
395 (lemma, theorem, definition, or axiom) based upon how many constants that fact
396 shares with the conjecture. This process iterates to include facts relevant to
397 those just accepted. The constants are weighted to give unusual ones greater
398 significance. MePo copes best when the conjecture contains some unusual
399 constants; if all the constants are common, it is unable to discriminate among
400 the hundreds of facts that are picked up. The filter is also memoryless: It has
401 no information about how many times a particular fact has been used in a proof,
405 An experimental alternative to MePo is \emph{MaSh}
406 (\underline{Ma}chine Learner for \underline{S}ledge\underline{h}ammer). It
407 relies on an external Python tool that applies machine learning to
408 the problem of finding relevant facts.
410 \item[\labelitemi] The \emph{MeSh} filter combines MePo and MaSh.
413 The default is either MePo or MeSh, depending on whether the environment
414 variable \texttt{MASH} is set and what class of provers the target prover
415 belongs to (\S\ref{relevance-filter}).
417 The number of facts included in a problem varies from prover to prover, since
418 some provers get overwhelmed more easily than others. You can show the number of
419 facts given using the \textit{verbose} option (\S\ref{output-format}) and the
420 actual facts using \textit{debug} (\S\ref{output-format}).
422 Sledgehammer is good at finding short proofs combining a handful of existing
423 lemmas. If you are looking for longer proofs, you must typically restrict the
424 number of facts, by setting the \textit{max\_facts} option
425 (\S\ref{relevance-filter}) to, say, 25 or 50.
427 You can also influence which facts are actually selected in a number of ways. If
428 you simply want to ensure that a fact is included, you can specify it using the
429 ``$(\textit{add}{:}~\textit{my\_facts})$'' syntax. For example:
432 \textbf{sledgehammer} (\textit{add}: \textit{hd.simps} \textit{tl.simps})
435 The specified facts then replace the least relevant facts that would otherwise be
436 included; the other selected facts remain the same.
437 If you want to direct the selection in a particular direction, you can specify
438 the facts via \textbf{using}:
441 \textbf{using} \textit{hd.simps} \textit{tl.simps} \\
442 \textbf{sledgehammer}
445 The facts are then more likely to be selected than otherwise, and if they are
446 selected at iteration $j$ they also influence which facts are selected at
447 iterations $j + 1$, $j + 2$, etc. To give them even more weight, try
450 \textbf{using} \textit{hd.simps} \textit{tl.simps} \\
451 \textbf{apply}~\textbf{--} \\
452 \textbf{sledgehammer}
455 \point{Why does Metis fail to reconstruct the proof?}
457 There are many reasons. If Metis runs seemingly forever, that is a sign that the
458 proof is too difficult for it. Metis's search is complete, so it should
459 eventually find it, but that's little consolation. There are several possible
463 \item[\labelitemi] Try the \textit{isar\_proofs} option (\S\ref{output-format}) to
464 obtain a step-by-step Isar proof where each step is justified by \textit{metis}.
465 Since the steps are fairly small, \textit{metis} is more likely to be able to
468 \item[\labelitemi] Try the \textit{smt} proof method instead of \textit{metis}.
469 It is usually stronger, but you need to either have Z3 available to replay the
470 proofs, trust the SMT solver, or use certificates. See the documentation in the
471 \emph{SMT} theory (\texttt{\$ISABELLE\_HOME/src/HOL/SMT.thy}) for details.
473 \item[\labelitemi] Try the \textit{blast} or \textit{auto} proof methods, passing
474 the necessary facts via \textbf{unfolding}, \textbf{using}, \textit{intro}{:},
475 \textit{elim}{:}, \textit{dest}{:}, or \textit{simp}{:}, as appropriate.
478 In some rare cases, \textit{metis} fails fairly quickly, and you get the error
483 One-line proof reconstruction failed.
486 This message indicates that Sledgehammer determined that the goal is provable,
487 but the proof is, for technical reasons, beyond \textit{metis}'s power. You can
488 then try again with the \textit{strict} option (\S\ref{problem-encoding}).
490 If the goal is actually unprovable and you did not specify an unsound encoding
491 using \textit{type\_enc} (\S\ref{problem-encoding}), this is a bug, and you are
492 strongly encouraged to report this to the author at \authoremail.
494 \point{How can I tell whether a suggested proof is sound?}
496 Earlier versions of Sledgehammer often suggested unsound proofs---either proofs
497 of nontheorems or simply proofs that rely on type-unsound inferences. This
498 is a thing of the past, unless you explicitly specify an unsound encoding
499 using \textit{type\_enc} (\S\ref{problem-encoding}).
501 Officially, the only form of ``unsoundness'' that lurks in the sound
502 encodings is related to missing characteristic theorems of datatypes. For
506 \textbf{lemma}~``$\exists \mathit{xs}.\; \mathit{xs} \neq []$'' \\
507 \textbf{sledgehammer} ()
510 suggests an argumentless \textit{metis} call that fails. However, the conjecture
511 does actually hold, and the \textit{metis} call can be repaired by adding
512 \textit{list.distinct}.
514 We hope to address this problem in a future version of Isabelle. In the
515 meantime, you can avoid it by passing the \textit{strict} option
516 (\S\ref{problem-encoding}).
518 \point{What are the \textit{full\_types}, \textit{no\_types}, and
519 \textit{mono\_tags} arguments to Metis?}
521 The \textit{metis}~(\textit{full\_types}) proof method
522 and its cousin \textit{metis}~(\textit{mono\_tags}) are fully-typed
523 versions of Metis. It is somewhat slower than \textit{metis}, but the proof
524 search is fully typed, and it also includes more powerful rules such as the
525 axiom ``$x = \const{True} \mathrel{\lor} x = \const{False}$'' for reasoning in
526 higher-order places (e.g., in set comprehensions). The method kicks in
527 automatically as a fallback when \textit{metis} fails, and it is sometimes
528 generated by Sledgehammer instead of \textit{metis} if the proof obviously
529 requires type information or if \textit{metis} failed when Sledgehammer
530 preplayed the proof. (By default, Sledgehammer tries to run \textit{metis} with
531 various options for up to 3 seconds each time to ensure that the generated
532 one-line proofs actually work and to display timing information. This can be
533 configured using the \textit{preplay\_timeout} and \textit{dont\_preplay}
534 options (\S\ref{timeouts}).)
536 At the other end of the soundness spectrum, \textit{metis} (\textit{no\_types})
537 uses no type information at all during the proof search, which is more efficient
538 but often fails. Calls to \textit{metis} (\textit{no\_types}) are occasionally
539 generated by Sledgehammer.
541 See the \textit{type\_enc} option (\S\ref{problem-encoding}) for details.
543 Incidentally, if you ever see warnings such as
547 Metis: Falling back on ``\textit{metis} (\textit{full\_types})''.
550 for a successful \textit{metis} proof, you can advantageously pass the
551 \textit{full\_types} option to \textit{metis} directly.
553 \point{And what are the \textit{lifting} and \textit{hide\_lams} arguments
556 Orthogonally to the encoding of types, it is important to choose an appropriate
557 translation of $\lambda$-abstractions. Metis supports three translation schemes,
558 in decreasing order of power: Curry combinators (the default),
559 $\lambda$-lifting, and a ``hiding'' scheme that disables all reasoning under
560 $\lambda$-abstractions. The more powerful schemes also give the automatic
561 provers more rope to hang themselves. See the \textit{lam\_trans} option (\S\ref{problem-encoding}) for details.
563 \point{Are generated proofs minimal?}
565 Automatic provers frequently use many more facts than are necessary.
566 Sledgehammer inclues a minimization tool that takes a set of facts returned by a
567 given prover and repeatedly calls the same prover, \textit{metis}, or
568 \textit{smt} with subsets of those axioms in order to find a minimal set.
569 Reducing the number of axioms typically improves Metis's speed and success rate,
570 while also removing superfluous clutter from the proof scripts.
572 In earlier versions of Sledgehammer, generated proofs were systematically
573 accompanied by a suggestion to invoke the minimization tool. This step is now
574 performed implicitly if it can be done in a reasonable amount of time (something
575 that can be guessed from the number of facts in the original proof and the time
576 it took to find or preplay it).
578 In addition, some provers (e.g., Yices) do not provide proofs or sometimes
579 produce incomplete proofs. The minimizer is then invoked to find out which facts
580 are actually needed from the (large) set of facts that was initially given to
581 the prover. Finally, if a prover returns a proof with lots of facts, the
582 minimizer is invoked automatically since Metis would be unlikely to re-find the
585 Automatic minimization can be forced or disabled using the \textit{minimize}
586 option (\S\ref{mode-of-operation}).
588 \point{A strange error occurred---what should I do?}
590 Sledgehammer tries to give informative error messages. Please report any strange
591 error to the author at \authoremail. This applies doubly if you get the message
595 The prover derived ``\textit{False}'' using ``\textit{foo\/}'',
596 ``\textit{bar\/}'', and ``\textit{baz\/}''.
597 This could be due to inconsistent axioms (including ``\textbf{sorry}''s) or to
598 a bug in Sledgehammer. If the problem persists, please contact the
602 \point{Auto can solve it---why not Sledgehammer?}
604 Problems can be easy for \textit{auto} and difficult for automatic provers, but
605 the reverse is also true, so do not be discouraged if your first attempts fail.
606 Because the system refers to all theorems known to Isabelle, it is particularly
607 suitable when your goal has a short proof from lemmas that you do not know
610 \point{Why are there so many options?}
612 Sledgehammer's philosophy should work out of the box, without user guidance.
613 Many of the options are meant to be used mostly by the Sledgehammer developers
614 for experiments. Of course, feel free to try them out if you are so inclined.
616 \section{Command Syntax}
617 \label{command-syntax}
619 \subsection{Sledgehammer}
621 Sledgehammer can be invoked at any point when there is an open goal by entering
622 the \textbf{sledgehammer} command in the theory file. Its general syntax is as
626 \textbf{sledgehammer} \qty{subcommand}$^?$ \qty{options}$^?$ \qty{facts\_override}$^?$ \qty{num}$^?$
629 In the general syntax, the \qty{subcommand} may be any of the following:
632 \item[\labelitemi] \textbf{\textit{run} (the default):} Runs Sledgehammer on
633 subgoal number \qty{num} (1 by default), with the given options and facts.
635 \item[\labelitemi] \textbf{\textit{min}:} Attempts to minimize the facts
636 specified in the \qty{facts\_override} argument to obtain a simpler proof
637 involving fewer facts. The options and goal number are as for \textit{run}.
639 \item[\labelitemi] \textbf{\textit{messages}:} Redisplays recent messages issued
640 by Sledgehammer. This allows you to examine results that might have been lost
641 due to Sledgehammer's asynchronous nature. The \qty{num} argument specifies a
642 limit on the number of messages to display (10 by default).
644 \item[\labelitemi] \textbf{\textit{supported\_provers}:} Prints the list of
645 automatic provers supported by Sledgehammer. See \S\ref{installation} and
646 \S\ref{mode-of-operation} for more information on how to install automatic
649 \item[\labelitemi] \textbf{\textit{running\_provers}:} Prints information about
650 currently running automatic provers, including elapsed runtime and remaining
653 \item[\labelitemi] \textbf{\textit{kill\_all}:} Terminates all running
654 threads (automatic provers and machine learners).
656 \item[\labelitemi] \textbf{\textit{refresh\_tptp}:} Refreshes the list of remote
657 ATPs available at System\-On\-TPTP \cite{sutcliffe-2000}.
660 In addition, the following subcommands provide finer control over machine
664 \item[\labelitemi] \textbf{\textit{unlearn}:} Resets MaSh, erasing any
667 \item[\labelitemi] \textbf{\textit{learn\_isar}:} Invokes MaSh on the current
668 theory to process all the available facts, learning from their Isabelle/Isar
669 proofs. This happens automatically at Sledgehammer invocations if the
670 \textit{learn} option (\S\ref{relevance-filter}) is enabled.
672 \item[\labelitemi] \textbf{\textit{learn\_prover}:} Invokes MaSh on the current
673 theory to process all the available facts, learning from proofs generated by
674 automatic provers. The prover to use and its timeout can be set using the
675 \textit{prover} (\S\ref{mode-of-operation}) and \textit{timeout}
676 (\S\ref{timeouts}) options. It is recommended to perform learning using an
677 efficient first-order ATP (such as E, SPASS, and Vampire) as opposed to a
678 higher-order ATP or an SMT solver.
680 \item[\labelitemi] \textbf{\textit{relearn\_isar}:} Same as \textit{unlearn}
681 followed by \textit{learn\_isar}.
683 \item[\labelitemi] \textbf{\textit{relearn\_prover}:} Same as \textit{unlearn}
684 followed by \textit{learn\_prover}.
686 \item[\labelitemi] \textbf{\textit{running\_learners}:} Prints information about
687 currently running machine learners, including elapsed runtime and remaining
691 Sledgehammer's behavior can be influenced by various \qty{options}, which can be
692 specified in brackets after the \textbf{sledgehammer} command. The
693 \qty{options} are a list of key--value pairs of the form ``[$k_1 = v_1,
694 \ldots, k_n = v_n$]''. For Boolean options, ``= \textit{true\/}'' is optional.
698 \textbf{sledgehammer} [\textit{isar\_proofs}, \,\textit{timeout} = 120]
701 Default values can be set using \textbf{sledgehammer\_\allowbreak params}:
704 \textbf{sledgehammer\_params} \qty{options}
707 The supported options are described in \S\ref{option-reference}.
709 The \qty{facts\_override} argument lets you alter the set of facts that go
710 through the relevance filter. It may be of the form ``(\qty{facts})'', where
711 \qty{facts} is a space-separated list of Isabelle facts (theorems, local
712 assumptions, etc.), in which case the relevance filter is bypassed and the given
713 facts are used. It may also be of the form ``(\textit{add}:\ \qty{facts\/_{\mathrm{1}}})'',
714 ``(\textit{del}:\ \qty{facts\/_{\mathrm{2}}})'', or ``(\textit{add}:\ \qty{facts\/_{\mathrm{1}}}\
715 \textit{del}:\ \qty{facts\/_{\mathrm{2}}})'', where the relevance filter is instructed to
716 proceed as usual except that it should consider \qty{facts\/_{\mathrm{1}}}
717 highly-relevant and \qty{facts\/_{\mathrm{2}}} fully irrelevant.
719 If you use Isabelle/jEdit, Sledgehammer also provides an automatic mode that can
720 be enabled via the ``Auto Sledgehammer'' option under ``Plugins > Plugin Options
721 > Isabelle > General.'' For automatic runs, only the first prover set using
722 \textit{provers} (\S\ref{mode-of-operation}) is considered, fewer facts are
723 passed to the prover, \textit{slice} (\S\ref{mode-of-operation}) is disabled,
724 \textit{strict} (\S\ref{problem-encoding}) is enabled, \textit{verbose}
725 (\S\ref{output-format}) and \textit{debug} (\S\ref{output-format}) are disabled,
726 and \textit{timeout} (\S\ref{timeouts}) is superseded by the ``Auto Time Limit''
727 option in jEdit. Sledgehammer's output is also more concise.
731 The \textit{metis} proof method has the syntax
734 \textbf{\textit{metis}}~(\qty{options})${}^?$~\qty{facts}${}^?$
737 where \qty{facts} is a list of arbitrary facts and \qty{options} is a
738 comma-separated list consisting of at most one $\lambda$ translation scheme
739 specification with the same semantics as Sledgehammer's \textit{lam\_trans}
740 option (\S\ref{problem-encoding}) and at most one type encoding specification
741 with the same semantics as Sledgehammer's \textit{type\_enc} option
742 (\S\ref{problem-encoding}).
744 The supported $\lambda$ translation schemes are \textit{hide\_lams},
745 \textit{lifting}, and \textit{combs} (the default).
747 All the untyped type encodings listed in \S\ref{problem-encoding} are supported.
748 For convenience, the following aliases are provided:
750 \item[\labelitemi] \textbf{\textit{full\_types}:} Alias for \textit{poly\_guards\_query}.
751 \item[\labelitemi] \textbf{\textit{partial\_types}:} Alias for \textit{poly\_args}.
752 \item[\labelitemi] \textbf{\textit{no\_types}:} Alias for \textit{erased}.
755 \section{Option Reference}
756 \label{option-reference}
761 \def\flushitem#1{\item[]\noindent\kern-\leftmargin \textbf{#1}}
762 \def\optrueonly#1{\flushitem{\textit{#1} $\bigl[$= \textit{true}$\bigr]$\enskip}\nopagebreak\\[\parskip]}
763 \def\optrue#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool}$\bigr]$\enskip \defl\textit{true}\defr\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
764 \def\opfalse#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool}$\bigr]$\enskip \defl\textit{false}\defr\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
765 \def\opsmart#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{smart\_bool}$\bigr]$\enskip \defl\textit{smart}\defr\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
766 \def\opsmartx#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{smart\_bool}$\bigr]$\enskip \defl\textit{smart}\defr\\\hbox{}\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}
767 \def\opnodefault#1#2{\flushitem{\textit{#1} = \qtybf{#2}} \nopagebreak\\[\parskip]}
768 \def\opnodefaultbrk#1#2{\flushitem{$\bigl[$\textit{#1} =$\bigr]$ \qtybf{#2}} \nopagebreak\\[\parskip]}
769 \def\opdefault#1#2#3{\flushitem{\textit{#1} = \qtybf{#2}\enskip \defl\textit{#3}\defr} \nopagebreak\\[\parskip]}
770 \def\oparg#1#2#3{\flushitem{\textit{#1} \qtybf{#2} = \qtybf{#3}} \nopagebreak\\[\parskip]}
771 \def\opargbool#1#2#3{\flushitem{\textit{#1} \qtybf{#2} $\bigl[$= \qtybf{bool}$\bigr]$\hfill (neg.: \textit{#3})}\nopagebreak\\[\parskip]}
772 \def\opargboolorsmart#1#2#3{\flushitem{\textit{#1} \qtybf{#2} $\bigl[$= \qtybf{smart\_bool}$\bigr]$\hfill (neg.: \textit{#3})}\nopagebreak\\[\parskip]}
774 Sledgehammer's options are categorized as follows:\ mode of operation
775 (\S\ref{mode-of-operation}), problem encoding (\S\ref{problem-encoding}),
776 relevance filter (\S\ref{relevance-filter}), output format
777 (\S\ref{output-format}), authentication (\S\ref{authentication}), and timeouts
780 The descriptions below refer to the following syntactic quantities:
783 \item[\labelitemi] \qtybf{string}: A string.
784 \item[\labelitemi] \qtybf{bool\/}: \textit{true} or \textit{false}.
785 \item[\labelitemi] \qtybf{smart\_bool\/}: \textit{true}, \textit{false}, or
787 \item[\labelitemi] \qtybf{int\/}: An integer.
788 %\item[\labelitemi] \qtybf{float\/}: A floating-point number (e.g., 2.5).
789 \item[\labelitemi] \qtybf{float\_pair\/}: A pair of floating-point numbers
791 \item[\labelitemi] \qtybf{smart\_int\/}: An integer or \textit{smart}.
792 \item[\labelitemi] \qtybf{float\_or\_none\/}: A floating-point number (e.g., 60 or
793 0.5) expressing a number of seconds, or the keyword \textit{none} ($\infty$
797 Default values are indicated in curly brackets (\textrm{\{\}}). Boolean options
798 have a negative counterpart (e.g., \textit{blocking} vs.\
799 \textit{non\_blocking}). When setting Boolean options or their negative
800 counterparts, ``= \textit{true\/}'' may be omitted.
802 \subsection{Mode of Operation}
803 \label{mode-of-operation}
806 \opnodefaultbrk{provers}{string}
807 Specifies the automatic provers to use as a space-separated list (e.g.,
808 ``\textit{e}~\textit{spass}~\textit{remote\_vampire\/}'').
809 Provers can be run locally or remotely; see \S\ref{installation} for
810 installation instructions.
812 The following local provers are supported:
816 \item[\labelitemi] \textbf{\textit{agsyHOL}:} agsyHOL is an automatic
817 higher-order prover developed by Fredrik Lindblad \cite{agsyHOL},
818 with support for the TPTP typed higher-order syntax (THF0). To use agsyHOL, set
819 the environment variable \texttt{AGSYHOL\_HOME} to the directory that contains
820 the \texttt{agsyHOL} executable. Sledgehammer has been tested with version 1.0.
822 \item[\labelitemi] \textbf{\textit{alt\_ergo}:} Alt-Ergo is a polymorphic
823 ATP developed by Bobot et al.\ \cite{alt-ergo}.
824 It supports the TPTP polymorphic typed first-order format (TFF1) via Why3
825 \cite{why3}. To use Alt-Ergo, set the environment variable \texttt{WHY3\_HOME}
826 to the directory that contains the \texttt{why3} executable. Sledgehammer has
827 been tested with Alt-Ergo 0.95.1 and an unidentified development version of
830 \item[\labelitemi] \textbf{\textit{cvc3}:} CVC3 is an SMT solver developed by
831 Clark Barrett, Cesare Tinelli, and their colleagues \cite{cvc3}. To use CVC3,
832 set the environment variable \texttt{CVC3\_SOLVER} to the complete path of the
833 executable, including the file name, or install the prebuilt CVC3 package from
834 \download. Sledgehammer has been tested with version 2.2 and 2.4.1.
836 \item[\labelitemi] \textbf{\textit{e}:} E is a first-order resolution prover
837 developed by Stephan Schulz \cite{schulz-2002}. To use E, set the environment
838 variable \texttt{E\_HOME} to the directory that contains the \texttt{eproof}
839 executable and \texttt{E\_VERSION} to the version number (e.g., ``1.8''), or
840 install the prebuilt E package from \download. Sledgehammer has been tested with
843 \item[\labelitemi] \textbf{\textit{e\_males}:} E-MaLeS is a metaprover developed
844 by Daniel K\"uhlwein that implements strategy scheduling on top of E. To use
845 E-MaLeS, set the environment variable \texttt{E\_MALES\_HOME} to the directory
846 that contains the \texttt{emales.py} script. Sledgehammer has been tested with
849 \item[\labelitemi] \textbf{\textit{e\_par}:} E-Par is a metaprover developed
850 by Josef Urban that implements strategy scheduling on top of E. To use
851 E-Par, set the environment variable \texttt{E\_HOME} to the directory
852 that contains the \texttt{runepar.pl} script and the \texttt{eprover} and
853 \texttt{epclextract} executables, or use the prebuilt E package from \download.
855 \item[\labelitemi] \textbf{\textit{iprover}:} iProver is a pure
856 instantiation-based prover developed by Konstantin Korovin \cite{korovin-2009}.
857 To use iProver, set the environment variable \texttt{IPROVER\_HOME} to the
858 directory that contains the \texttt{iprover} and \texttt{vclausify\_rel}
859 executables. Sledgehammer has been tested with version 0.99.
861 \item[\labelitemi] \textbf{\textit{iprover\_eq}:} iProver-Eq is an
862 instantiation-based prover with native support for equality developed by
863 Konstantin Korovin and Christoph Sticksel \cite{korovin-sticksel-2010}. To use
864 iProver-Eq, set the environment variable \texttt{IPROVER\_EQ\_HOME} to the
865 directory that contains the \texttt{iprover-eq} and \texttt{vclausify\_rel}
866 executables. Sledgehammer has been tested with version 0.8.
868 \item[\labelitemi] \textbf{\textit{leo2}:} LEO-II is an automatic
869 higher-order prover developed by Christoph Benzm\"uller et al.\ \cite{leo2},
870 with support for the TPTP typed higher-order syntax (THF0). To use LEO-II, set
871 the environment variable \texttt{LEO2\_HOME} to the directory that contains the
872 \texttt{leo} executable. Sledgehammer requires version 1.3.4 or above.
874 \item[\labelitemi] \textbf{\textit{metis}:} Although it is less powerful than
875 the external provers, Metis itself can be used for proof search.
877 \item[\labelitemi] \textbf{\textit{satallax}:} Satallax is an automatic
878 higher-order prover developed by Chad Brown et al.\ \cite{satallax}, with
879 support for the TPTP typed higher-order syntax (THF0). To use Satallax, set the
880 environment variable \texttt{SATALLAX\_HOME} to the directory that contains the
881 \texttt{satallax} executable. Sledgehammer requires version 2.2 or above.
883 \item[\labelitemi] \textbf{\textit{smt}:} The \textit{smt} proof method with the
884 current settings (usually:\ Z3 with proof reconstruction) can be used for proof
887 \item[\labelitemi] \textbf{\textit{spass}:} SPASS is a first-order resolution
888 prover developed by Christoph Weidenbach et al.\ \cite{weidenbach-et-al-2009}.
889 To use SPASS, set the environment variable \texttt{SPASS\_HOME} to the directory
890 that contains the \texttt{SPASS} executable and \texttt{SPASS\_VERSION} to the
891 version number (e.g., ``3.8ds''), or install the prebuilt SPASS package from
892 \download. Sledgehammer requires version 3.8ds or above.
894 \item[\labelitemi] \textbf{\textit{vampire}:} Vampire is a first-order
895 resolution prover developed by Andrei Voronkov and his colleagues
896 \cite{riazanov-voronkov-2002}. To use Vampire, set the environment variable
897 \texttt{VAMPIRE\_HOME} to the directory that contains the \texttt{vampire}
898 executable and \texttt{VAMPIRE\_VERSION} to the version number (e.g.,
899 ``2.6''). Sledgehammer has been tested with versions 0.6 to 3.0.
900 Versions strictly above 1.8 support the TPTP typed first-order format (TFF0).
902 \item[\labelitemi] \textbf{\textit{yices}:} Yices is an SMT solver developed at
903 SRI \cite{yices}. To use Yices, set the environment variable
904 \texttt{YICES\_SOLVER} to the complete path of the executable, including the
905 file name. Sledgehammer has been tested with version 1.0.28.
907 \item[\labelitemi] \textbf{\textit{z3}:} Z3 is an SMT solver developed at
908 Microsoft Research \cite{z3}. To use Z3, set the environment variable
909 \texttt{Z3\_SOLVER} to the complete path of the executable, including the file
910 name, and set \texttt{Z3\_NON\_COMMERCIAL} to ``yes'' to confirm that you are a
911 noncommercial user. Sledgehammer has been tested with versions 3.0, 3.1, 3.2,
916 The following remote provers are supported:
919 \item[\labelitemi] \textbf{\textit{remote\_agsyhol}:} The remote version of
920 agsyHOL runs on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}.
922 \item[\labelitemi] \textbf{\textit{remote\_cvc3}:} The remote version of CVC3 runs
923 on servers at the TU M\"unchen (or wherever \texttt{REMOTE\_SMT\_URL} is set to
926 \item[\labelitemi] \textbf{\textit{remote\_e}:} The remote version of E runs
927 on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}.
929 \item[\labelitemi] \textbf{\textit{remote\_e\_sine}:} E-SInE is a metaprover
930 developed by Kry\v stof Hoder \cite{sine} based on E. It runs on Geoff
931 Sutcliffe's Miami servers.
933 \item[\labelitemi] \textbf{\textit{remote\_e\_tofof}:} E-ToFoF is a metaprover
934 developed by Geoff Sutcliffe \cite{tofof} based on E running on his Miami
935 servers. This ATP supports the TPTP typed first-order format (TFF0). The
936 remote version of E-ToFoF runs on Geoff Sutcliffe's Miami servers.
938 \item[\labelitemi] \textbf{\textit{remote\_iprover}:} The
939 remote version of iProver runs on Geoff Sutcliffe's Miami servers
940 \cite{sutcliffe-2000}.
942 \item[\labelitemi] \textbf{\textit{remote\_iprover\_eq}:} The
943 remote version of iProver-Eq runs on Geoff Sutcliffe's Miami servers
944 \cite{sutcliffe-2000}.
946 \item[\labelitemi] \textbf{\textit{remote\_leo2}:} The remote version of LEO-II
947 runs on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}.
949 \item[\labelitemi] \textbf{\textit{remote\_satallax}:} The remote version of
950 Satallax runs on Geoff Sutcliffe's Miami servers \cite{sutcliffe-2000}.
952 \item[\labelitemi] \textbf{\textit{remote\_snark}:} SNARK is a first-order
953 resolution prover developed by Stickel et al.\ \cite{snark}. It supports the
954 TPTP typed first-order format (TFF0). The remote version of SNARK runs on
955 Geoff Sutcliffe's Miami servers.
957 \item[\labelitemi] \textbf{\textit{remote\_vampire}:} The remote version of
958 Vampire runs on Geoff Sutcliffe's Miami servers.
960 \item[\labelitemi] \textbf{\textit{remote\_waldmeister}:} Waldmeister is a unit
961 equality prover developed by Hillenbrand et al.\ \cite{waldmeister}. It can be
962 used to prove universally quantified equations using unconditional equations,
963 corresponding to the TPTP CNF UEQ division. The remote version of Waldmeister
964 runs on Geoff Sutcliffe's Miami servers.
966 \item[\labelitemi] \textbf{\textit{remote\_z3}:} The remote version of Z3 runs on
967 servers at the TU M\"unchen (or wherever \texttt{REMOTE\_SMT\_URL} is set to
971 By default, Sledgehammer runs a selection of CVC3, E, E-SInE, SPASS, Vampire,
972 Yices, and Z3 in parallel---either locally or remotely, depending on the number
973 of processor cores available.
975 It is generally a good idea to run several provers in parallel. Running E,
976 SPASS, and Vampire for 5~seconds yields a similar success rate to running the
977 most effective of these for 120~seconds \cite{boehme-nipkow-2010}.
979 For the \textit{min} subcommand, the default prover is \textit{metis}. If
980 several provers are set, the first one is used.
982 \opnodefault{prover}{string}
983 Alias for \textit{provers}.
985 \opfalse{blocking}{non\_blocking}
986 Specifies whether the \textbf{sledgehammer} command should operate
987 synchronously. The asynchronous (non-blocking) mode lets the user start proving
988 the putative theorem manually while Sledgehammer looks for a proof, but it can
989 also be more confusing. Irrespective of the value of this option, Sledgehammer
990 is always run synchronously if \textit{debug} (\S\ref{output-format}) is
993 \optrue{slice}{dont\_slice}
994 Specifies whether the time allocated to a prover should be sliced into several
995 segments, each of which has its own set of possibly prover-dependent options.
996 For SPASS and Vampire, the first slice tries the fast but incomplete
997 set-of-support (SOS) strategy, whereas the second slice runs without it. For E,
998 up to three slices are tried, with different weighted search strategies and
999 number of facts. For SMT solvers, several slices are tried with the same options
1000 each time but fewer and fewer facts. According to benchmarks with a timeout of
1001 30 seconds, slicing is a valuable optimization, and you should probably leave it
1002 enabled unless you are conducting experiments. This option is implicitly
1003 disabled for (short) automatic runs.
1006 {\small See also \textit{verbose} (\S\ref{output-format}).}
1008 \opsmart{minimize}{dont\_minimize}
1009 Specifies whether the minimization tool should be invoked automatically after
1010 proof search. By default, automatic minimization takes place only if
1011 it can be done in a reasonable amount of time (as determined by
1012 the number of facts in the original proof and the time it took to find or
1013 preplay it) or the proof involves an unreasonably large number of facts.
1016 {\small See also \textit{preplay\_timeout} (\S\ref{timeouts})
1017 and \textit{dont\_preplay} (\S\ref{timeouts}).}
1019 \opfalse{spy}{dont\_spy}
1020 Specifies whether Sledgehammer should record statistics in
1021 \texttt{\$ISA\-BELLE\_\allowbreak HOME\_\allowbreak USER/\allowbreak spy\_\allowbreak sledgehammer}.
1022 These statistics can be useful to the developers of Sledgehammer. If you are willing to have your
1023 interactions recorded in the name of science, please enable this feature and send the statistics
1024 file every now and then to the author of this manual (\authoremail).
1025 To change the default value of this option globally, set the environment variable
1026 \texttt{SLEDGEHAMMER\_SPY} to \texttt{yes}.
1029 {\small See also \textit{debug} (\S\ref{output-format}).}
1031 \opfalse{overlord}{no\_overlord}
1032 Specifies whether Sledgehammer should put its temporary files in
1033 \texttt{\$ISA\-BELLE\_\allowbreak HOME\_\allowbreak USER}, which is useful for
1034 debugging Sledgehammer but also unsafe if several instances of the tool are run
1035 simultaneously. The files are identified by the prefixes \texttt{prob\_} and
1036 \texttt{mash\_}; you may safely remove them after Sledgehammer has run.
1039 {\small See also \textit{debug} (\S\ref{output-format}).}
1042 \subsection{Relevance Filter}
1043 \label{relevance-filter}
1046 \opdefault{fact\_filter}{string}{smart}
1047 Specifies the relevance filter to use. The following filters are available:
1050 \item[\labelitemi] \textbf{\textit{mepo}:}
1051 The traditional memoryless MePo relevance filter.
1053 \item[\labelitemi] \textbf{\textit{mash}:}
1054 The experimental MaSh machine learner. MaSh relies on the external Python
1055 program \texttt{mash.py}, which is part of Isabelle. To enable MaSh, set the
1056 environment variable \texttt{MASH} to \texttt{yes}. Persistent data is stored in
1057 the directory \texttt{\$ISABELLE\_HOME\_USER/mash}.
1059 \item[\labelitemi] \textbf{\textit{mesh}:} The MeSh filter, which combines the
1060 rankings from MePo and MaSh.
1062 \item[\labelitemi] \textbf{\textit{smart}:} A mixture of MePo, MaSh, and MeSh if
1063 MaSh is enabled; otherwise, MePo.
1066 \opdefault{max\_facts}{smart\_int}{smart}
1067 Specifies the maximum number of facts that may be returned by the relevance
1068 filter. If the option is set to \textit{smart}, it is set to a value that was
1069 empirically found to be appropriate for the prover. Typical values range between
1072 For the MaSh-related commands \textit{learn\_isar}, \textit{learn\_prover},
1073 \textit{relearn\_isar}, and \textit{relearn\_prover}, this option specifies the
1074 maximum number of facts from the background library that should be learned
1075 ($\infty$ by default).
1077 \opdefault{fact\_thresholds}{float\_pair}{\upshape 0.45~0.85}
1078 Specifies the thresholds above which facts are considered relevant by the
1079 relevance filter. The first threshold is used for the first iteration of the
1080 relevance filter and the second threshold is used for the last iteration (if it
1081 is reached). The effective threshold is quadratically interpolated for the other
1082 iterations. Each threshold ranges from 0 to 1, where 0 means that all theorems
1083 are relevant and 1 only theorems that refer to previously seen constants.
1085 \optrue{learn}{dont\_learn}
1086 Specifies whether MaSh should be run automatically by Sledgehammer to learn the
1087 available theories (and hence provide more accurate results). Learning takes
1088 place only if MaSh is enabled.
1090 \opdefault{max\_new\_mono\_instances}{int}{smart}
1091 Specifies the maximum number of monomorphic instances to generate beyond
1092 \textit{max\_facts}. The higher this limit is, the more monomorphic instances
1093 are potentially generated. Whether monomorphization takes place depends on the
1094 type encoding used. If the option is set to \textit{smart}, it is set to a value
1095 that was empirically found to be appropriate for the prover. For most provers,
1099 {\small See also \textit{type\_enc} (\S\ref{problem-encoding}).}
1101 \opdefault{max\_mono\_iters}{int}{smart}
1102 Specifies the maximum number of iterations for the monomorphization fixpoint
1103 construction. The higher this limit is, the more monomorphic instances are
1104 potentially generated. Whether monomorphization takes place depends on the
1105 type encoding used. If the option is set to \textit{smart}, it is set to a value
1106 that was empirically found to be appropriate for the prover. For most provers,
1110 {\small See also \textit{type\_enc} (\S\ref{problem-encoding}).}
1113 \subsection{Problem Encoding}
1114 \label{problem-encoding}
1116 \newcommand\comb[1]{\const{#1}}
1119 \opdefault{lam\_trans}{string}{smart}
1120 Specifies the $\lambda$ translation scheme to use in ATP problems. The supported
1121 translation schemes are listed below:
1124 \item[\labelitemi] \textbf{\textit{hide\_lams}:} Hide the $\lambda$-abstractions
1125 by replacing them by unspecified fresh constants, effectively disabling all
1126 reasoning under $\lambda$-abstractions.
1128 \item[\labelitemi] \textbf{\textit{lifting}:} Introduce a new
1129 supercombinator \const{c} for each cluster of $n$~$\lambda$-abstractions,
1130 defined using an equation $\const{c}~x_1~\ldots~x_n = t$ ($\lambda$-lifting).
1132 \item[\labelitemi] \textbf{\textit{combs}:} Rewrite lambdas to the Curry
1133 combinators (\comb{I}, \comb{K}, \comb{S}, \comb{B}, \comb{C}). Combinators
1134 enable the ATPs to synthesize $\lambda$-terms but tend to yield bulkier formulas
1135 than $\lambda$-lifting: The translation is quadratic in the worst case, and the
1136 equational definitions of the combinators are very prolific in the context of
1139 \item[\labelitemi] \textbf{\textit{combs\_and\_lifting}:} Introduce a new
1140 supercombinator \const{c} for each cluster of $\lambda$-abstractions and characterize it both using a
1141 lifted equation $\const{c}~x_1~\ldots~x_n = t$ and via Curry combinators.
1143 \item[\labelitemi] \textbf{\textit{combs\_or\_lifting}:} For each cluster of
1144 $\lambda$-abstractions, heuristically choose between $\lambda$-lifting and Curry
1147 \item[\labelitemi] \textbf{\textit{keep\_lams}:}
1148 Keep the $\lambda$-abstractions in the generated problems. This is available
1149 only with provers that support the THF0 syntax.
1151 \item[\labelitemi] \textbf{\textit{smart}:} The actual translation scheme used
1152 depends on the ATP and should be the most efficient scheme for that ATP.
1155 For SMT solvers, the $\lambda$ translation scheme is always \textit{lifting},
1156 irrespective of the value of this option.
1158 \opsmartx{uncurried\_aliases}{no\_uncurried\_aliases}
1159 Specifies whether fresh function symbols should be generated as aliases for
1160 applications of curried functions in ATP problems.
1162 \opdefault{type\_enc}{string}{smart}
1163 Specifies the type encoding to use in ATP problems. Some of the type encodings
1164 are unsound, meaning that they can give rise to spurious proofs
1165 (unreconstructible using \textit{metis}). The type encodings are
1166 listed below, with an indication of their soundness in parentheses.
1167 An asterisk (*) indicates that the encoding is slightly incomplete for
1168 reconstruction with \textit{metis}, unless the \emph{strict} option (described
1172 \item[\labelitemi] \textbf{\textit{erased} (unsound):} No type information is
1173 supplied to the ATP, not even to resolve overloading. Types are simply erased.
1175 \item[\labelitemi] \textbf{\textit{poly\_guards} (sound):} Types are encoded using
1176 a predicate \const{g}$(\tau, t)$ that guards bound
1177 variables. Constants are annotated with their types, supplied as extra
1178 arguments, to resolve overloading.
1180 \item[\labelitemi] \textbf{\textit{poly\_tags} (sound):} Each term and subterm is
1181 tagged with its type using a function $\const{t\/}(\tau, t)$.
1183 \item[\labelitemi] \textbf{\textit{poly\_args} (unsound):}
1184 Like for \textit{poly\_guards} constants are annotated with their types to
1185 resolve overloading, but otherwise no type information is encoded. This
1186 is the default encoding used by the \textit{metis} command.
1190 \textit{raw\_mono\_guards}, \textit{raw\_mono\_tags} (sound); \\
1191 \textit{raw\_mono\_args} (unsound):} \\
1192 Similar to \textit{poly\_guards}, \textit{poly\_tags}, and \textit{poly\_args},
1193 respectively, but the problem is additionally monomorphized, meaning that type
1194 variables are instantiated with heuristically chosen ground types.
1195 Monomorphization can simplify reasoning but also leads to larger fact bases,
1196 which can slow down the ATPs.
1200 \textit{mono\_guards}, \textit{mono\_tags} (sound);
1201 \textit{mono\_args} (unsound):} \\
1203 \textit{raw\_mono\_guards}, \textit{raw\_mono\_tags}, and
1204 \textit{raw\_mono\_args}, respectively but types are mangled in constant names
1205 instead of being supplied as ground term arguments. The binary predicate
1206 $\const{g}(\tau, t)$ becomes a unary predicate
1207 $\const{g\_}\tau(t)$, and the binary function
1208 $\const{t}(\tau, t)$ becomes a unary function
1209 $\const{t\_}\tau(t)$.
1211 \item[\labelitemi] \textbf{\textit{mono\_native} (sound):} Exploits native
1212 first-order types if the prover supports the TFF0, TFF1, or THF0 syntax;
1213 otherwise, falls back on \textit{mono\_guards}. The problem is monomorphized.
1215 \item[\labelitemi] \textbf{\textit{mono\_native\_higher} (sound):} Exploits
1216 native higher-order types if the prover supports the THF0 syntax; otherwise,
1217 falls back on \textit{mono\_native} or \textit{mono\_guards}. The problem is
1220 \item[\labelitemi] \textbf{\textit{poly\_native} (sound):} Exploits native
1221 first-order polymorphic types if the prover supports the TFF1 syntax; otherwise,
1222 falls back on \textit{mono\_native}.
1226 \textit{poly\_guards}?, \textit{poly\_tags}?, \textit{raw\_mono\_guards}?, \\
1227 \textit{raw\_mono\_tags}?, \textit{mono\_guards}?, \textit{mono\_tags}?, \\
1228 \textit{mono\_native}? (sound*):} \\
1229 The type encodings \textit{poly\_guards}, \textit{poly\_tags},
1230 \textit{raw\_mono\_guards}, \textit{raw\_mono\_tags}, \textit{mono\_guards},
1231 \textit{mono\_tags}, and \textit{mono\_native} are fully typed and sound. For
1232 each of these, Sledgehammer also provides a lighter variant identified by a
1233 question mark (`\hbox{?}')\ that detects and erases monotonic types, notably
1234 infinite types. (For \textit{mono\_native}, the types are not actually erased
1235 but rather replaced by a shared uniform type of individuals.) As argument to the
1236 \textit{metis} proof method, the question mark is replaced by a
1237 \hbox{``\textit{\_query\/}''} suffix.
1241 \textit{poly\_guards}??, \textit{poly\_tags}??, \textit{raw\_mono\_guards}??, \\
1242 \textit{raw\_mono\_tags}??, \textit{mono\_guards}??, \textit{mono\_tags}?? \\
1244 Even lighter versions of the `\hbox{?}' encodings. As argument to the
1245 \textit{metis} proof method, the `\hbox{??}' suffix is replaced by
1246 \hbox{``\textit{\_query\_query\/}''}.
1250 \textit{poly\_guards}@, \textit{poly\_tags}@, \textit{raw\_mono\_guards}@, \\
1251 \textit{raw\_mono\_tags}@ (sound*):} \\
1252 Alternative versions of the `\hbox{??}' encodings. As argument to the
1253 \textit{metis} proof method, the `\hbox{@}' suffix is replaced by
1254 \hbox{``\textit{\_at\/}''}.
1256 \item[\labelitemi] \textbf{\textit{poly\_args}?, \textit{raw\_mono\_args}? (unsound):} \\
1257 Lighter versions of \textit{poly\_args} and \textit{raw\_mono\_args}.
1259 \item[\labelitemi] \textbf{\textit{smart}:} The actual encoding used depends on
1260 the ATP and should be the most efficient sound encoding for that ATP.
1263 For SMT solvers, the type encoding is always \textit{mono\_native}, irrespective
1264 of the value of this option.
1267 {\small See also \textit{max\_new\_mono\_instances} (\S\ref{relevance-filter})
1268 and \textit{max\_mono\_iters} (\S\ref{relevance-filter}).}
1270 \opfalse{strict}{non\_strict}
1271 Specifies whether Sledgehammer should run in its strict mode. In that mode,
1272 sound type encodings marked with an asterisk (*) above are made complete
1273 for reconstruction with \textit{metis}, at the cost of some clutter in the
1274 generated problems. This option has no effect if \textit{type\_enc} is
1275 deliberately set to an unsound encoding.
1278 \subsection{Output Format}
1279 \label{output-format}
1283 \opfalse{verbose}{quiet}
1284 Specifies whether the \textbf{sledgehammer} command should explain what it does.
1285 This option is implicitly disabled for automatic runs.
1287 \opfalse{debug}{no\_debug}
1288 Specifies whether Sledgehammer should display additional debugging information
1289 beyond what \textit{verbose} already displays. Enabling \textit{debug} also
1290 enables \textit{verbose} and \textit{blocking} (\S\ref{mode-of-operation})
1291 behind the scenes. The \textit{debug} option is implicitly disabled for
1295 {\small See also \textit{spy} (\S\ref{mode-of-operation}) and
1296 \textit{overlord} (\S\ref{mode-of-operation}).}
1298 \opsmart{isar\_proofs}{no\_isar\_proofs}
1299 Specifies whether Isar proofs should be output in addition to one-liner
1300 \textit{metis} proofs. The construction of Isar proof is still experimental and
1301 may sometimes fail; however, when they succeed they are usually faster and more
1302 intelligible than \textit{metis} proofs. If the option is set to \textit{smart}
1303 (the default), Isar proofs are only generated when no working one-liner
1304 \textit{metis} proof is available.
1306 \opdefault{isar\_compress}{int}{\upshape 10}
1307 Specifies the granularity of the generated Isar proofs if \textit{isar\_proofs}
1308 is explicitly enabled. A value of $n$ indicates that each Isar proof step should
1309 correspond to a group of up to $n$ consecutive proof steps in the ATP proof.
1311 \optrueonly{dont\_compress\_isar}
1312 Alias for ``\textit{isar\_compress} = 0''.
1314 \optrue{isar\_try0}{dont\_try0\_isar}
1315 Specifies whether standard proof methods such as \textit{auto} and
1316 \textit{blast} should be tried as alternatives to \textit{metis} and
1317 \textit{smt} in Isar proofs. The collection of methods is roughly the same as
1318 for the \textbf{try0} command.
1321 \subsection{Authentication}
1322 \label{authentication}
1325 \opnodefault{expect}{string}
1326 Specifies the expected outcome, which must be one of the following:
1329 \item[\labelitemi] \textbf{\textit{some}:} Sledgehammer found a proof.
1330 \item[\labelitemi] \textbf{\textit{none}:} Sledgehammer found no proof.
1331 \item[\labelitemi] \textbf{\textit{timeout}:} Sledgehammer timed out.
1332 \item[\labelitemi] \textbf{\textit{unknown}:} Sledgehammer encountered some
1336 Sledgehammer emits an error (if \textit{blocking} is enabled) or a warning
1337 (otherwise) if the actual outcome differs from the expected outcome. This option
1338 is useful for regression testing.
1341 {\small See also \textit{blocking} (\S\ref{mode-of-operation}) and
1342 \textit{timeout} (\S\ref{timeouts}).}
1345 \subsection{Timeouts}
1349 \opdefault{timeout}{float\_or\_none}{\upshape 30}
1350 Specifies the maximum number of seconds that the automatic provers should spend
1351 searching for a proof. This excludes problem preparation and is a soft limit.
1352 For automatic runs, the ``Auto Time Limit'' option under ``Plugins > Plugin
1353 Options > Isabelle > General'' is used instead.
1355 \opdefault{preplay\_timeout}{float\_or\_none}{\upshape 3}
1356 Specifies the maximum number of seconds that \textit{metis} or \textit{smt}
1357 should spend trying to ``preplay'' the found proof. If this option is set to 0,
1358 no preplaying takes place, and no timing information is displayed next to the
1359 suggested \textit{metis} calls.
1362 {\small See also \textit{minimize} (\S\ref{mode-of-operation}).}
1364 \optrueonly{dont\_preplay}
1365 Alias for ``\textit{preplay\_timeout} = 0''.
1370 \bibliography{manual}{}
1371 \bibliographystyle{abbrv}