doc-src/Sledgehammer/sledgehammer.tex
changeset 49402 302cf211fb3f
parent 49309 2b0c5553dc46
child 49403 fd7958ebee96
equal deleted inserted replaced
49401:b903ea11b3bc 49402:302cf211fb3f
   150 %suggesting several textual improvements.
   150 %suggesting several textual improvements.
   151 
   151 
   152 \section{Installation}
   152 \section{Installation}
   153 \label{installation}
   153 \label{installation}
   154 
   154 
   155 Sledgehammer is part of Isabelle, so you don't need to install it. However, it
   155 Sledgehammer is part of Isabelle, so you do not need to install it. However, it
   156 relies on third-party automatic provers (ATPs and SMT solvers).
   156 relies on third-party automatic provers (ATPs and SMT solvers).
   157 
   157 
   158 Among the ATPs, E, LEO-II, Satallax, SPASS, and Vampire can be run locally; in
   158 Among the ATPs, E, LEO-II, Satallax, SPASS, and Vampire can be run locally; in
   159 addition, E, E-SInE, E-ToFoF, iProver, iProver-Eq, LEO-II, Satallax, SNARK,
   159 addition, E, E-SInE, E-ToFoF, iProver, iProver-Eq, LEO-II, Satallax, SNARK,
   160 Vampire, and Waldmeister are available remotely via System\-On\-TPTP
   160 Vampire, and Waldmeister are available remotely via System\-On\-TPTP
   299 Depending on which provers are installed and how many processor cores are
   299 Depending on which provers are installed and how many processor cores are
   300 available, some of the provers might be missing or present with a
   300 available, some of the provers might be missing or present with a
   301 \textit{remote\_} prefix. Waldmeister is run only for unit equational problems,
   301 \textit{remote\_} prefix. Waldmeister is run only for unit equational problems,
   302 where the goal's conclusion is a (universally quantified) equation.
   302 where the goal's conclusion is a (universally quantified) equation.
   303 
   303 
   304 For each successful prover, Sledgehammer gives a one-liner proof that uses
   304 For each successful prover, Sledgehammer gives a one-liner \textit{metis} or
   305 the \textit{metis} or \textit{smt} proof method. Approximate timings are shown
   305 \textit{smt} method call. Rough timings are shown in parentheses, indicating how
   306 in parentheses, indicating how fast the call is. You can click the proof to
   306 fast the call is. You can click the proof to insert it into the theory text.
   307 insert it into the theory text.
       
   308 
   307 
   309 In addition, you can ask Sledgehammer for an Isar text proof by passing the
   308 In addition, you can ask Sledgehammer for an Isar text proof by passing the
   310 \textit{isar\_proof} option (\S\ref{output-format}):
   309 \textit{isar\_proof} option (\S\ref{output-format}):
   311 
   310 
   312 \prew
   311 \prew
   384 
   383 
   385 \section{Frequently Asked Questions}
   384 \section{Frequently Asked Questions}
   386 \label{frequently-asked-questions}
   385 \label{frequently-asked-questions}
   387 
   386 
   388 This sections answers frequently (and infrequently) asked questions about
   387 This sections answers frequently (and infrequently) asked questions about
   389 Sledgehammer. It is a good idea to skim over it now even if you don't have any
   388 Sledgehammer. It is a good idea to skim over it now even if you do not have any
   390 questions at this stage. And if you have any further questions not listed here,
   389 questions at this stage. And if you have any further questions not listed here,
   391 send them to the author at \authoremail.
   390 send them to the author at \authoremail.
   392 
   391 
   393 \point{Which facts are passed to the automatic provers?}
   392 \point{Which facts are passed to the automatic provers?}
   394 
   393 
   395 The relevance filter assigns a score to every available fact (lemma, theorem,
   394 Sledgehammer heuristically selects a few hundred relevant lemmas from the
   396 definition, or axiom) based upon how many constants that fact shares with the
   395 currently loaded libraries. The component that performs this selection is
   397 conjecture. This process iterates to include facts relevant to those just
   396 called \emph{relevance filter}.
   398 accepted, but with a decay factor to ensure termination. The constants are
   397 
   399 weighted to give unusual ones greater significance. The relevance filter copes
   398 \begin{enum}
   400 best when the conjecture contains some unusual constants; if all the constants
   399 \item[\labelitemi]
   401 are common, it is unable to discriminate among the hundreds of facts that are
   400 The traditional relevance filter, called \emph{MePo}, assigns a score to every
   402 picked up. The relevance filter is also memoryless: It has no information about
   401 available fact (lemma, theorem, definition, or axiom) based upon how many
   403 how many times a particular fact has been used in a proof, and it cannot learn.
   402 constants that fact shares with the conjecture. This process iterates to include
       
   403 facts relevant to those just accepted. The constants are weighted to give
       
   404 unusual ones greater significance. MePo copes best when the conjecture contains
       
   405 some unusual constants; if all the constants are common, it is unable to
       
   406 discriminate among the hundreds of facts that are picked up. The filter is also
       
   407 memoryless: It has no information about how many times a particular fact has
       
   408 been used in a proof, and it cannot learn.
       
   409 
       
   410 \item[\labelitemi]
       
   411 An experimental, memoryful alternative to MePo is \emph{MaSh}
       
   412 (\underline{Ma}chine Learner for \underline{S}ledge\underline{h}ammer). It
       
   413 relies on an external tool called \texttt{mash} that applies machine learning to
       
   414 the problem of finding relevant facts.
       
   415 
       
   416 \item[\labelitemi] The \emph{Mesh} filter combines MePo and MaSh.
       
   417 \end{enum}
       
   418 
       
   419 The default is either MePo or Mesh, depending on whether \texttt{mash} is
       
   420 installed and what class of provers the target prover belongs to
       
   421 (\S\ref{relevance-filter}).
   404 
   422 
   405 The number of facts included in a problem varies from prover to prover, since
   423 The number of facts included in a problem varies from prover to prover, since
   406 some provers get overwhelmed more easily than others. You can show the number of
   424 some provers get overwhelmed more easily than others. You can show the number of
   407 facts given using the \textit{verbose} option (\S\ref{output-format}) and the
   425 facts given using the \textit{verbose} option (\S\ref{output-format}) and the
   408 actual facts using \textit{debug} (\S\ref{output-format}).
   426 actual facts using \textit{debug} (\S\ref{output-format}).
   597 \postw
   615 \postw
   598 
   616 
   599 \point{Auto can solve it---why not Sledgehammer?}
   617 \point{Auto can solve it---why not Sledgehammer?}
   600 
   618 
   601 Problems can be easy for \textit{auto} and difficult for automatic provers, but
   619 Problems can be easy for \textit{auto} and difficult for automatic provers, but
   602 the reverse is also true, so don't be discouraged if your first attempts fail.
   620 the reverse is also true, so do not be discouraged if your first attempts fail.
   603 Because the system refers to all theorems known to Isabelle, it is particularly
   621 Because the system refers to all theorems known to Isabelle, it is particularly
   604 suitable when your goal has a short proof from lemmas that you don't know about.
   622 suitable when your goal has a short proof from lemmas that you do not know
       
   623 about.
   605 
   624 
   606 \point{Why are there so many options?}
   625 \point{Why are there so many options?}
   607 
   626 
   608 Sledgehammer's philosophy should work out of the box, without user guidance.
   627 Sledgehammer's philosophy should work out of the box, without user guidance.
   609 Many of the options are meant to be used mostly by the Sledgehammer developers
   628 Many of the options are meant to be used mostly by the Sledgehammer developers
   652 currently running automatic provers, including elapsed runtime and remaining
   671 currently running automatic provers, including elapsed runtime and remaining
   653 time until timeout.
   672 time until timeout.
   654 
   673 
   655 \item[\labelitemi] \textbf{\textit{kill\_provers}:} Terminates all running
   674 \item[\labelitemi] \textbf{\textit{kill\_provers}:} Terminates all running
   656 automatic provers.
   675 automatic provers.
       
   676 
       
   677 \item[\labelitemi] \textbf{\textit{unlearn}:} Resets the MaSh machine learner,
       
   678 erasing any persistent state.
       
   679 
       
   680 \item[\labelitemi] \textbf{\textit{learn}:} Invokes the MaSh machine learner on
       
   681 the current theory to process all the available facts. This happens
       
   682 automatically at Sledgehammer invocations if the \textit{learn} option
       
   683 (\S\ref{relevance-filter}) is enabled.
       
   684 
       
   685 \item[\labelitemi] \textbf{\textit{relearn}:} Same as \textit{unlearn} followed
       
   686 by \textit{learn}.
       
   687 
       
   688 \item[\labelitemi] \textbf{\textit{running\_learners}:} Prints information about
       
   689 currently running machine learners, including elapsed runtime and remaining
       
   690 time until timeout.
       
   691 
       
   692 \item[\labelitemi] \textbf{\textit{kill\_learners}:} Terminates all running
       
   693 machine learners.
   657 
   694 
   658 \item[\labelitemi] \textbf{\textit{refresh\_tptp}:} Refreshes the list of remote
   695 \item[\labelitemi] \textbf{\textit{refresh\_tptp}:} Refreshes the list of remote
   659 ATPs available at System\-On\-TPTP \cite{sutcliffe-2000}.
   696 ATPs available at System\-On\-TPTP \cite{sutcliffe-2000}.
   660 \end{enum}
   697 \end{enum}
   661 
   698 
   974 simultaneously. The files are identified by the prefix \texttt{prob\_}; you may
  1011 simultaneously. The files are identified by the prefix \texttt{prob\_}; you may
   975 safely remove them after Sledgehammer has run.
  1012 safely remove them after Sledgehammer has run.
   976 
  1013 
   977 \nopagebreak
  1014 \nopagebreak
   978 {\small See also \textit{debug} (\S\ref{output-format}).}
  1015 {\small See also \textit{debug} (\S\ref{output-format}).}
       
  1016 \end{enum}
       
  1017 
       
  1018 \subsection{Relevance Filter}
       
  1019 \label{relevance-filter}
       
  1020 
       
  1021 \begin{enum}
       
  1022 \opdefault{max\_facts}{smart\_int}{smart}
       
  1023 Specifies the maximum number of facts that may be returned by the relevance
       
  1024 filter. If the option is set to \textit{smart}, it is set to a value that was
       
  1025 empirically found to be appropriate for the prover. Typical values range between
       
  1026 50 and 1000.
       
  1027 
       
  1028 \opdefault{fact\_thresholds}{float\_pair}{\upshape 0.45~0.85}
       
  1029 Specifies the thresholds above which facts are considered relevant by the
       
  1030 relevance filter. The first threshold is used for the first iteration of the
       
  1031 relevance filter and the second threshold is used for the last iteration (if it
       
  1032 is reached). The effective threshold is quadratically interpolated for the other
       
  1033 iterations. Each threshold ranges from 0 to 1, where 0 means that all theorems
       
  1034 are relevant and 1 only theorems that refer to previously seen constants.
       
  1035 
       
  1036 \opdefault{max\_new\_mono\_instances}{int}{smart}
       
  1037 Specifies the maximum number of monomorphic instances to generate beyond
       
  1038 \textit{max\_facts}. The higher this limit is, the more monomorphic instances
       
  1039 are potentially generated. Whether monomorphization takes place depends on the
       
  1040 type encoding used. If the option is set to \textit{smart}, it is set to a value
       
  1041 that was empirically found to be appropriate for the prover. For most provers,
       
  1042 this value is 200.
       
  1043 
       
  1044 \nopagebreak
       
  1045 {\small See also \textit{type\_enc} (\S\ref{problem-encoding}).}
       
  1046 
       
  1047 \opdefault{max\_mono\_iters}{int}{smart}
       
  1048 Specifies the maximum number of iterations for the monomorphization fixpoint
       
  1049 construction. The higher this limit is, the more monomorphic instances are
       
  1050 potentially generated. Whether monomorphization takes place depends on the
       
  1051 type encoding used. If the option is set to \textit{smart}, it is set to a value
       
  1052 that was empirically found to be appropriate for the prover. For most provers,
       
  1053 this value is 3.
       
  1054 
       
  1055 \nopagebreak
       
  1056 {\small See also \textit{type\_enc} (\S\ref{problem-encoding}).}
   979 \end{enum}
  1057 \end{enum}
   980 
  1058 
   981 \subsection{Problem Encoding}
  1059 \subsection{Problem Encoding}
   982 \label{problem-encoding}
  1060 \label{problem-encoding}
   983 
  1061 
  1141 for reconstruction with \textit{metis}, at the cost of some clutter in the
  1219 for reconstruction with \textit{metis}, at the cost of some clutter in the
  1142 generated problems. This option has no effect if \textit{type\_enc} is
  1220 generated problems. This option has no effect if \textit{type\_enc} is
  1143 deliberately set to an unsound encoding.
  1221 deliberately set to an unsound encoding.
  1144 \end{enum}
  1222 \end{enum}
  1145 
  1223 
  1146 \subsection{Relevance Filter}
       
  1147 \label{relevance-filter}
       
  1148 
       
  1149 \begin{enum}
       
  1150 \opdefault{max\_facts}{smart\_int}{smart}
       
  1151 Specifies the maximum number of facts that may be returned by the relevance
       
  1152 filter. If the option is set to \textit{smart}, it is set to a value that was
       
  1153 empirically found to be appropriate for the prover. Typical values range between
       
  1154 50 and 1000.
       
  1155 
       
  1156 \opdefault{fact\_thresholds}{float\_pair}{\upshape 0.45~0.85}
       
  1157 Specifies the thresholds above which facts are considered relevant by the
       
  1158 relevance filter. The first threshold is used for the first iteration of the
       
  1159 relevance filter and the second threshold is used for the last iteration (if it
       
  1160 is reached). The effective threshold is quadratically interpolated for the other
       
  1161 iterations. Each threshold ranges from 0 to 1, where 0 means that all theorems
       
  1162 are relevant and 1 only theorems that refer to previously seen constants.
       
  1163 
       
  1164 \opdefault{max\_new\_mono\_instances}{int}{smart}
       
  1165 Specifies the maximum number of monomorphic instances to generate beyond
       
  1166 \textit{max\_facts}. The higher this limit is, the more monomorphic instances
       
  1167 are potentially generated. Whether monomorphization takes place depends on the
       
  1168 type encoding used. If the option is set to \textit{smart}, it is set to a value
       
  1169 that was empirically found to be appropriate for the prover. For most provers,
       
  1170 this value is 200.
       
  1171 
       
  1172 \nopagebreak
       
  1173 {\small See also \textit{type\_enc} (\S\ref{problem-encoding}).}
       
  1174 
       
  1175 \opdefault{max\_mono\_iters}{int}{smart}
       
  1176 Specifies the maximum number of iterations for the monomorphization fixpoint
       
  1177 construction. The higher this limit is, the more monomorphic instances are
       
  1178 potentially generated. Whether monomorphization takes place depends on the
       
  1179 type encoding used. If the option is set to \textit{smart}, it is set to a value
       
  1180 that was empirically found to be appropriate for the prover. For most provers,
       
  1181 this value is 3.
       
  1182 
       
  1183 \nopagebreak
       
  1184 {\small See also \textit{type\_enc} (\S\ref{problem-encoding}).}
       
  1185 \end{enum}
       
  1186 
       
  1187 \subsection{Output Format}
  1224 \subsection{Output Format}
  1188 \label{output-format}
  1225 \label{output-format}
  1189 
  1226 
  1190 \begin{enum}
  1227 \begin{enum}
  1191 
  1228