wneuper/isa: doc-src/Nitpick/nitpick.tex@ab01b72715ef

     1 \documentclass[a4paper,12pt]{article}

     2 \usepackage[T1]{fontenc}

     3 \usepackage{amsmath}

     4 \usepackage{amssymb}

     5 \usepackage[french,english]{babel}

     6 \usepackage{color}

     7 \usepackage{graphicx}

     8 %\usepackage{mathpazo}

     9 \usepackage{multicol}

    10 \usepackage{stmaryrd}

    11 %\usepackage[scaled=.85]{beramono}

    12 \usepackage{../iman,../pdfsetup}

    14 %\oddsidemargin=4.6mm

    15 %\evensidemargin=4.6mm

    16 %\textwidth=150mm

    17 %\topmargin=4.6mm

    18 %\headheight=0mm

    19 %\headsep=0mm

    20 %\textheight=234mm

    22 \def\Colon{\mathord{:\mkern-1.5mu:}}

    23 %\def\lbrakk{\mathopen{\lbrack\mkern-3.25mu\lbrack}}

    24 %\def\rbrakk{\mathclose{\rbrack\mkern-3.255mu\rbrack}}

    25 \def\lparr{\mathopen{(\mkern-4mu\mid}}

    26 \def\rparr{\mathclose{\mid\mkern-4mu)}}

    28 \def\undef{\textit{undefined}}

    29 \def\unk{{?}}

    30 %\def\unr{\textit{others}}

    31 \def\unr{\ldots}

    32 \def\Abs#1{\hbox{\rm{\flqq}}{\,#1\,}\hbox{\rm{\frqq}}}

    33 \def\Q{{\smash{\lower.2ex\hbox{$\scriptstyle?$}}}}

    35 \hyphenation{Mini-Sat size-change First-Steps grand-parent nit-pick

    36 counter-example counter-examples data-type data-types co-data-type

    37 co-data-types in-duc-tive co-in-duc-tive}

    39 \urlstyle{tt}

    41 \begin{document}

    43 \title{\includegraphics[scale=0.5]{isabelle_nitpick} \\[4ex]

    44 Picking Nits \\[\smallskipamount]

    45 \Large A User's Guide to Nitpick for Isabelle/HOL 2010}

    46 \author{\hbox{} \\

    47 Jasmin Christian Blanchette \\

    48 {\normalsize Fakult\"at f\"ur Informatik, Technische Universit\"at M\"unchen} \\

    49 \hbox{}}

    51 \maketitle

    53 \tableofcontents

    55 \setlength{\parskip}{.7em plus .2em minus .1em}

    56 \setlength{\parindent}{0pt}

    57 \setlength{\abovedisplayskip}{\parskip}

    58 \setlength{\abovedisplayshortskip}{.9\parskip}

    59 \setlength{\belowdisplayskip}{\parskip}

    60 \setlength{\belowdisplayshortskip}{.9\parskip}

    62 % General-purpose enum environment with correct spacing

    63 \newenvironment{enum}%

    64     {\begin{list}{}{%

    65         \setlength{\topsep}{.1\parskip}%

    66         \setlength{\partopsep}{.1\parskip}%

    67         \setlength{\itemsep}{\parskip}%

    68         \advance\itemsep by-\parsep}}

    69     {\end{list}}

    71 \def\pre{\begingroup\vskip0pt plus1ex\advance\leftskip by\leftmargin

    72 \advance\rightskip by\leftmargin}

    73 \def\post{\vskip0pt plus1ex\endgroup}

    75 \def\prew{\pre\advance\rightskip by-\leftmargin}

    76 \def\postw{\post}

    78 \section{Introduction}

    79 \label{introduction}

    81 Nitpick \cite{blanchette-nipkow-2009} is a counterexample generator for

    82 Isabelle/HOL \cite{isa-tutorial} that is designed to handle formulas

    83 combining (co)in\-duc\-tive datatypes, (co)in\-duc\-tively defined predicates, and

    84 quantifiers. It builds on Kodkod \cite{torlak-jackson-2007}, a highly optimized

    85 first-order relational model finder developed by the Software Design Group at

    86 MIT. It is conceptually similar to Refute \cite{weber-2008}, from which it

    87 borrows many ideas and code fragments, but it benefits from Kodkod's

    88 optimizations and a new encoding scheme. The name Nitpick is shamelessly

    89 appropriated from a now retired Alloy precursor.

    91 Nitpick is easy to use---you simply enter \textbf{nitpick} after a putative

    92 theorem and wait a few seconds. Nonetheless, there are situations where knowing

    93 how it works under the hood and how it reacts to various options helps

    94 increase the test coverage. This manual also explains how to install the tool on

    95 your workstation. Should the motivation fail you, think of the many hours of

    96 hard work Nitpick will save you. Proving non-theorems is \textsl{hard work}.

    98 Another common use of Nitpick is to find out whether the axioms of a locale are

    99 satisfiable, while the locale is being developed. To check this, it suffices to

   100 write

   102 \prew

   103 \textbf{lemma}~``$\textit{False}$'' \\

   104 \textbf{nitpick}~[\textit{show\_all}]

   105 \postw

   107 after the locale's \textbf{begin} keyword. To falsify \textit{False}, Nitpick

   108 must find a model for the axioms. If it finds no model, we have an indication

   109 that the axioms might be unsatisfiable.

   111 Nitpick requires the Kodkodi package for Isabelle as well as a Java 1.5 virtual

   112 machine called \texttt{java}. The examples presented in this manual can be found

   113 in Isabelle's \texttt{src/HOL/Nitpick\_Examples/Manual\_Nits.thy} theory.

   115 Throughout this manual, we will explicitly invoke the \textbf{nitpick} command.

   116 Nitpick also provides an automatic mode that can be enabled using the

   117 ``Auto Nitpick'' option from the ``Isabelle'' menu in Proof General. In this

   118 mode, Nitpick is run on every newly entered theorem, much like Auto Quickcheck.

   119 The collective time limit for Auto Nitpick and Auto Quickcheck can be set using

   120 the ``Auto Counterexample Time Limit'' option.

   122 \newbox\boxA

   123 \setbox\boxA=\hbox{\texttt{nospam}}

   125 The known bugs and limitations at the time of writing are listed in

   126 \S\ref{known-bugs-and-limitations}. Comments and bug reports concerning Nitpick

   127 or this manual should be directed to

   128 \texttt{blan{\color{white}nospam}\kern-\wd\boxA{}chette@\allowbreak

   129 in.\allowbreak tum.\allowbreak de}.

   131 \vskip2.5\smallskipamount

   133 \textbf{Acknowledgment.} The author would like to thank Mark Summerfield for

   134 suggesting several textual improvements.

   135 % and Perry James for reporting a typo.

   137 \section{First Steps}

   138 \label{first-steps}

   140 This section introduces Nitpick by presenting small examples. If possible, you

   141 should try out the examples on your workstation. Your theory file should start

   142 the standard way:

   144 \prew

   145 \textbf{theory}~\textit{Scratch} \\

   146 \textbf{imports}~\textit{Main} \\

   147 \textbf{begin}

   148 \postw

   150 The results presented here were obtained using the JNI version of MiniSat and

   151 with multithreading disabled to reduce nondeterminism. This was done by adding

   152 the line

   154 \prew

   155 \textbf{nitpick\_params} [\textit{sat\_solver}~= \textit{MiniSatJNI}, \,\textit{max\_threads}~= 1]

   156 \postw

   158 after the \textbf{begin} keyword. The JNI version of MiniSat is bundled with

   159 Kodkodi and is precompiled for the major platforms. Other SAT solvers can also

   160 be installed, as explained in \S\ref{optimizations}. If you have already

   161 configured SAT solvers in Isabelle (e.g., for Refute), these will also be

   162 available to Nitpick.

   164 \subsection{Propositional Logic}

   165 \label{propositional-logic}

   167 Let's start with a trivial example from propositional logic:

   169 \prew

   170 \textbf{lemma}~``$P \longleftrightarrow Q$'' \\

   171 \textbf{nitpick}

   172 \postw

   174 You should get the following output:

   176 \prew

   177 \slshape

   178 Nitpick found a counterexample: \\[2\smallskipamount]

   179 \hbox{}\qquad Free variables: \nopagebreak \\

   180 \hbox{}\qquad\qquad $P = \textit{True}$ \\

   181 \hbox{}\qquad\qquad $Q = \textit{False}$

   182 \postw

   184 Nitpick can also be invoked on individual subgoals, as in the example below:

   186 \prew

   187 \textbf{apply}~\textit{auto} \\[2\smallskipamount]

   188 {\slshape goal (2 subgoals): \\

   189 \ 1. $P\,\Longrightarrow\, Q$ \\

   190 \ 2. $Q\,\Longrightarrow\, P$} \\[2\smallskipamount]

   191 \textbf{nitpick}~1 \\[2\smallskipamount]

   192 {\slshape Nitpick found a counterexample: \\[2\smallskipamount]

   193 \hbox{}\qquad Free variables: \nopagebreak \\

   194 \hbox{}\qquad\qquad $P = \textit{True}$ \\

   195 \hbox{}\qquad\qquad $Q = \textit{False}$} \\[2\smallskipamount]

   196 \textbf{nitpick}~2 \\[2\smallskipamount]

   197 {\slshape Nitpick found a counterexample: \\[2\smallskipamount]

   198 \hbox{}\qquad Free variables: \nopagebreak \\

   199 \hbox{}\qquad\qquad $P = \textit{False}$ \\

   200 \hbox{}\qquad\qquad $Q = \textit{True}$} \\[2\smallskipamount]

   201 \textbf{oops}

   202 \postw

   204 \subsection{Type Variables}

   205 \label{type-variables}

   207 If you are left unimpressed by the previous example, don't worry. The next

   208 one is more mind- and computer-boggling:

   210 \prew

   211 \textbf{lemma} ``$P~x\,\Longrightarrow\, P~(\textrm{THE}~y.\;P~y)$''

   212 \postw

   213 \pagebreak[2] %% TYPESETTING

   215 The putative lemma involves the definite description operator, {THE}, presented

   216 in section 5.10.1 of the Isabelle tutorial \cite{isa-tutorial}. The

   217 operator is defined by the axiom $(\textrm{THE}~x.\; x = a) = a$. The putative

   218 lemma is merely asserting the indefinite description operator axiom with {THE}

   219 substituted for {SOME}.

   221 The free variable $x$ and the bound variable $y$ have type $'a$. For formulas

   222 containing type variables, Nitpick enumerates the possible domains for each type

   223 variable, up to a given cardinality (8 by default), looking for a finite

   224 countermodel:

   226 \prew

   227 \textbf{nitpick} [\textit{verbose}] \\[2\smallskipamount]

   228 \slshape

   229 Trying 8 scopes: \nopagebreak \\

   230 \hbox{}\qquad \textit{card}~$'a$~= 1; \\

   231 \hbox{}\qquad \textit{card}~$'a$~= 2; \\

   232 \hbox{}\qquad $\qquad\vdots$ \\[.5\smallskipamount]

   233 \hbox{}\qquad \textit{card}~$'a$~= 8. \\[2\smallskipamount]

   234 Nitpick found a counterexample for \textit{card} $'a$~= 3: \\[2\smallskipamount]

   235 \hbox{}\qquad Free variables: \nopagebreak \\

   236 \hbox{}\qquad\qquad $P = \{a_2,\, a_3\}$ \\

   237 \hbox{}\qquad\qquad $x = a_3$ \\[2\smallskipamount]

   238 Total time: 580 ms.

   239 \postw

   241 Nitpick found a counterexample in which $'a$ has cardinality 3. (For

   242 cardinalities 1 and 2, the formula holds.) In the counterexample, the three

   243 values of type $'a$ are written $a_1$, $a_2$, and $a_3$.

   245 The message ``Trying $n$ scopes: {\ldots}''\ is shown only if the option

   246 \textit{verbose} is enabled. You can specify \textit{verbose} each time you

   247 invoke \textbf{nitpick}, or you can set it globally using the command

   249 \prew

   250 \textbf{nitpick\_params} [\textit{verbose}]

   251 \postw

   253 This command also displays the current default values for all of the options

   254 supported by Nitpick. The options are listed in \S\ref{option-reference}.

   256 \subsection{Constants}

   257 \label{constants}

   259 By just looking at Nitpick's output, it might not be clear why the

   260 counterexample in \S\ref{type-variables} is genuine. Let's invoke Nitpick again,

   261 this time telling it to show the values of the constants that occur in the

   262 formula:

   264 \prew

   265 \textbf{lemma}~``$P~x\,\Longrightarrow\, P~(\textrm{THE}~y.\;P~y)$'' \\

   266 \textbf{nitpick}~[\textit{show\_consts}] \\[2\smallskipamount]

   267 \slshape

   268 Nitpick found a counterexample for \textit{card} $'a$~= 3: \\[2\smallskipamount]

   269 \hbox{}\qquad Free variables: \nopagebreak \\

   270 \hbox{}\qquad\qquad $P = \{a_2,\, a_3\}$ \\

   271 \hbox{}\qquad\qquad $x = a_3$ \\

   272 \hbox{}\qquad Constant: \nopagebreak \\

   273 \hbox{}\qquad\qquad $\textit{The}~\textsl{fallback} = a_1$

   274 \postw

   276 We can see more clearly now. Since the predicate $P$ isn't true for a unique

   277 value, $\textrm{THE}~y.\;P~y$ can denote any value of type $'a$, even

   278 $a_1$. Since $P~a_1$ is false, the entire formula is falsified.

   280 As an optimization, Nitpick's preprocessor introduced the special constant

   281 ``\textit{The} fallback'' corresponding to $\textrm{THE}~y.\;P~y$ (i.e.,

   282 $\mathit{The}~(\lambda y.\;P~y)$) when there doesn't exist a unique $y$

   283 satisfying $P~y$. We disable this optimization by passing the

   284 \textit{full\_descrs} option:

   286 \prew

   287 \textbf{nitpick}~[\textit{full\_descrs},\, \textit{show\_consts}] \\[2\smallskipamount]

   288 \slshape

   289 Nitpick found a counterexample for \textit{card} $'a$~= 3: \\[2\smallskipamount]

   290 \hbox{}\qquad Free variables: \nopagebreak \\

   291 \hbox{}\qquad\qquad $P = \{a_2,\, a_3\}$ \\

   292 \hbox{}\qquad\qquad $x = a_3$ \\

   293 \hbox{}\qquad Constant: \nopagebreak \\

   294 \hbox{}\qquad\qquad $\hbox{\slshape THE}~y.\;P~y = a_1$

   295 \postw

   297 As the result of another optimization, Nitpick directly assigned a value to the

   298 subterm $\textrm{THE}~y.\;P~y$, rather than to the \textit{The} constant. If we

   299 disable this second optimization by using the command

   301 \prew

   302 \textbf{nitpick}~[\textit{dont\_specialize},\, \textit{full\_descrs},\,

   303 \textit{show\_consts}]

   304 \postw

   306 we finally get \textit{The}:

   308 \prew

   309 \slshape Constant: \nopagebreak \\

   310 \hbox{}\qquad $\mathit{The} = \undef{}

   311     (\!\begin{aligned}[t]%

   312     & \{\} := a_3,\> \{a_3\} := a_3,\> \{a_2\} := a_2, \\[-2pt] %% TYPESETTING

   313     & \{a_2, a_3\} := a_1,\> \{a_1\} := a_1,\> \{a_1, a_3\} := a_3, \\[-2pt]

   314     & \{a_1, a_2\} := a_3,\> \{a_1, a_2, a_3\} := a_3)\end{aligned}$

   315 \postw

   317 Notice that $\textit{The}~(\lambda y.\;P~y) = \textit{The}~\{a_2, a_3\} = a_1$,

   318 just like before.\footnote{The \undef{} symbol's presence is explained as

   319 follows: In higher-order logic, any function can be built from the undefined

   320 function using repeated applications of the function update operator $f(x :=

   321 y)$, just like any list can be built from the empty list using $x \mathbin{\#}

   322 xs$.}

   324 Our misadventures with THE suggest adding `$\exists!x{.}$' (``there exists a

   325 unique $x$ such that'') at the front of our putative lemma's assumption:

   327 \prew

   328 \textbf{lemma}~``$\exists {!}x.\; P~x\,\Longrightarrow\, P~(\textrm{THE}~y.\;P~y)$''

   329 \postw

   331 The fix appears to work:

   333 \prew

   334 \textbf{nitpick} \\[2\smallskipamount]

   335 \slshape Nitpick found no counterexample.

   336 \postw

   338 We can further increase our confidence in the formula by exhausting all

   339 cardinalities up to 50:

   341 \prew

   342 \textbf{nitpick} [\textit{card} $'a$~= 1--50]\footnote{The symbol `--'

   343 can be entered as \texttt{-} (hyphen) or

   344 \texttt{\char`\\\char`\<midarrow\char`\>}.} \\[2\smallskipamount]

   345 \slshape Nitpick found no counterexample.

   346 \postw

   348 Let's see if Sledgehammer \cite{sledgehammer-2009} can find a proof:

   350 \prew

   351 \textbf{sledgehammer} \\[2\smallskipamount]

   352 {\slshape Sledgehammer: external prover ``$e$'' for subgoal 1: \\

   353 $\exists{!}x.\; P~x\,\Longrightarrow\, P~(\hbox{\slshape THE}~y.\; P~y)$ \\

   354 Try this command: \textrm{apply}~(\textit{metis~the\_equality})} \\[2\smallskipamount]

   355 \textbf{apply}~(\textit{metis~the\_equality\/}) \nopagebreak \\[2\smallskipamount]

   356 {\slshape No subgoals!}% \\[2\smallskipamount]

   357 %\textbf{done}

   358 \postw

   360 This must be our lucky day.

   362 \subsection{Skolemization}

   363 \label{skolemization}

   365 Are all invertible functions onto? Let's find out:

   367 \prew

   368 \textbf{lemma} ``$\exists g.\; \forall x.~g~(f~x) = x

   369  \,\Longrightarrow\, \forall y.\; \exists x.~y = f~x$'' \\

   370 \textbf{nitpick} \\[2\smallskipamount]

   371 \slshape

   372 Nitpick found a counterexample for \textit{card} $'a$~= 2 and \textit{card} $'b$~=~1: \\[2\smallskipamount]

   373 \hbox{}\qquad Free variable: \nopagebreak \\

   374 \hbox{}\qquad\qquad $f = \undef{}(b_1 := a_1)$ \\

   375 \hbox{}\qquad Skolem constants: \nopagebreak \\

   376 \hbox{}\qquad\qquad $g = \undef{}(a_1 := b_1,\> a_2 := b_1)$ \\

   377 \hbox{}\qquad\qquad $y = a_2$

   378 \postw

   380 Although $f$ is the only free variable occurring in the formula, Nitpick also

   381 displays values for the bound variables $g$ and $y$. These values are available

   382 to Nitpick because it performs skolemization as a preprocessing step.

   384 In the previous example, skolemization only affected the outermost quantifiers.

   385 This is not always the case, as illustrated below:

   387 \prew

   388 \textbf{lemma} ``$\exists x.\; \forall f.\; f~x = x$'' \\

   389 \textbf{nitpick} \\[2\smallskipamount]

   390 \slshape

   391 Nitpick found a counterexample for \textit{card} $'a$~= 2: \\[2\smallskipamount]

   392 \hbox{}\qquad Skolem constant: \nopagebreak \\

   393 \hbox{}\qquad\qquad $\lambda x.\; f =

   394     \undef{}(\!\begin{aligned}[t]

   395     & a_1 := \undef{}(a_1 := a_2,\> a_2 := a_1), \\[-2pt]

   396     & a_2 := \undef{}(a_1 := a_1,\> a_2 := a_1))\end{aligned}$

   397 \postw

   399 The variable $f$ is bound within the scope of $x$; therefore, $f$ depends on

   400 $x$, as suggested by the notation $\lambda x.\,f$. If $x = a_1$, then $f$ is the

   401 function that maps $a_1$ to $a_2$ and vice versa; otherwise, $x = a_2$ and $f$

   402 maps both $a_1$ and $a_2$ to $a_1$. In both cases, $f~x \not= x$.

   404 The source of the Skolem constants is sometimes more obscure:

   406 \prew

   407 \textbf{lemma} ``$\mathit{refl}~r\,\Longrightarrow\, \mathit{sym}~r$'' \\

   408 \textbf{nitpick} \\[2\smallskipamount]

   409 \slshape

   410 Nitpick found a counterexample for \textit{card} $'a$~= 2: \\[2\smallskipamount]

   411 \hbox{}\qquad Free variable: \nopagebreak \\

   412 \hbox{}\qquad\qquad $r = \{(a_1, a_1),\, (a_2, a_1),\, (a_2, a_2)\}$ \\

   413 \hbox{}\qquad Skolem constants: \nopagebreak \\

   414 \hbox{}\qquad\qquad $\mathit{sym}.x = a_2$ \\

   415 \hbox{}\qquad\qquad $\mathit{sym}.y = a_1$

   416 \postw

   418 What happened here is that Nitpick expanded the \textit{sym} constant to its

   419 definition:

   421 \prew

   422 $\mathit{sym}~r \,\equiv\,

   423  \forall x\> y.\,\> (x, y) \in r \longrightarrow (y, x) \in r.$

   424 \postw

   426 As their names suggest, the Skolem constants $\mathit{sym}.x$ and

   427 $\mathit{sym}.y$ are simply the bound variables $x$ and $y$

   428 from \textit{sym}'s definition.

   430 Although skolemization is a useful optimization, you can disable it by invoking

   431 Nitpick with \textit{dont\_skolemize}. See \S\ref{optimizations} for details.

   433 \subsection{Natural Numbers and Integers}

   434 \label{natural-numbers-and-integers}

   436 Because of the axiom of infinity, the type \textit{nat} does not admit any

   437 finite models. To deal with this, Nitpick considers prefixes $\{0,\, 1,\,

   438 \ldots,\, K - 1\}$ of \textit{nat} (where $K = \textit{card}~\textit{nat}$) and

   439 maps all other numbers to the undefined value ($\unk$). The type \textit{int} is

   440 handled in a similar way: If $K = \textit{card}~\textit{int}$, the subset of

   441 \textit{int} known to Nitpick is $\{-\lceil K/2 \rceil + 1,\, \ldots,\, +\lfloor

   442 K/2 \rfloor\}$. Undefined values lead to a three-valued logic.

   444 Here is an example involving \textit{int}:

   446 \prew

   447 \textbf{lemma} ``$\lbrakk i \le j;\> n \le (m{\Colon}\mathit{int})\rbrakk \,\Longrightarrow\, i * n + j * m \le i * m + j * n$'' \\

   448 \textbf{nitpick} \\[2\smallskipamount]

   449 \slshape Nitpick found a counterexample: \\[2\smallskipamount]

   450 \hbox{}\qquad Free variables: \nopagebreak \\

   451 \hbox{}\qquad\qquad $i = 0$ \\

   452 \hbox{}\qquad\qquad $j = 1$ \\

   453 \hbox{}\qquad\qquad $m = 1$ \\

   454 \hbox{}\qquad\qquad $n = 0$

   455 \postw

   457 With infinite types, we don't always have the luxury of a genuine counterexample

   458 and must often content ourselves with a potential one. The tedious task of

   459 finding out whether the potential counterexample is in fact genuine can be

   460 outsourced to \textit{auto} by passing the option \textit{check\_potential}. For

   461 example:

   463 \prew

   464 \textbf{lemma} ``$\forall n.\; \textit{Suc}~n \mathbin{\not=} n \,\Longrightarrow\, P$'' \\

   465 \textbf{nitpick} [\textit{card~nat}~= 100,\, \textit{check\_potential}] \\[2\smallskipamount]

   466 \slshape Nitpick found a potential counterexample: \\[2\smallskipamount]

   467 \hbox{}\qquad Free variable: \nopagebreak \\

   468 \hbox{}\qquad\qquad $P = \textit{False}$ \\[2\smallskipamount]

   469 Confirmation by ``\textit{auto}'': The above counterexample is genuine.

   470 \postw

   472 You might wonder why the counterexample is first reported as potential. The root

   473 of the problem is that the bound variable in $\forall n.\; \textit{Suc}~n

   474 \mathbin{\not=} n$ ranges over an infinite type. If Nitpick finds an $n$ such

   475 that $\textit{Suc}~n \mathbin{=} n$, it evaluates the assumption to

   476 \textit{False}; but otherwise, it does not know anything about values of $n \ge

   477 \textit{card~nat}$ and must therefore evaluate the assumption to $\unk$, not

   478 \textit{True}. Since the assumption can never be satisfied, the putative lemma

   479 can never be falsified.

   481 Incidentally, if you distrust the so-called genuine counterexamples, you can

   482 enable \textit{check\_\allowbreak genuine} to verify them as well. However, be

   483 aware that \textit{auto} will often fail to prove that the counterexample is

   484 genuine or spurious.

   486 Some conjectures involving elementary number theory make Nitpick look like a

   487 giant with feet of clay:

   489 \prew

   490 \textbf{lemma} ``$P~\textit{Suc}$'' \\

   491 \textbf{nitpick} [\textit{card} = 1--6] \\[2\smallskipamount]

   492 \slshape

   493 Nitpick found no counterexample.

   494 \postw

   496 For any cardinality $k$, \textit{Suc} is the partial function $\{0 \mapsto 1,\,

   497 1 \mapsto 2,\, \ldots,\, k - 1 \mapsto \unk\}$, which evaluates to $\unk$ when

   498 it is passed as argument to $P$. As a result, $P~\textit{Suc}$ is always $\unk$.

   499 The next example is similar:

   501 \prew

   502 \textbf{lemma} ``$P~(\textit{op}~{+}\Colon

   503 \textit{nat}\mathbin{\Rightarrow}\textit{nat}\mathbin{\Rightarrow}\textit{nat})$'' \\

   504 \textbf{nitpick} [\textit{card nat} = 1] \\[2\smallskipamount]

   505 {\slshape Nitpick found a counterexample:} \\[2\smallskipamount]

   506 \hbox{}\qquad Free variable: \nopagebreak \\

   507 \hbox{}\qquad\qquad $P = \{\}$ \\[2\smallskipamount]

   508 \textbf{nitpick} [\textit{card nat} = 2] \\[2\smallskipamount]

   509 {\slshape Nitpick found no counterexample.}

   510 \postw

   512 The problem here is that \textit{op}~+ is total when \textit{nat} is taken to be

   513 $\{0\}$ but becomes partial as soon as we add $1$, because $1 + 1 \notin \{0,

   514 1\}$.

   516 Because numbers are infinite and are approximated using a three-valued logic,

   517 there is usually no need to systematically enumerate domain sizes. If Nitpick

   518 cannot find a genuine counterexample for \textit{card~nat}~= $k$, it is very

   519 unlikely that one could be found for smaller domains. (The $P~(\textit{op}~{+})$

   520 example above is an exception to this principle.) Nitpick nonetheless enumerates

   521 all cardinalities from 1 to 8 for \textit{nat}, mainly because smaller

   522 cardinalities are fast to handle and give rise to simpler counterexamples. This

   523 is explained in more detail in \S\ref{scope-monotonicity}.

   525 \subsection{Inductive Datatypes}

   526 \label{inductive-datatypes}

   528 Like natural numbers and integers, inductive datatypes with recursive

   529 constructors admit no finite models and must be approximated by a subterm-closed

   530 subset. For example, using a cardinality of 10 for ${'}a~\textit{list}$,

   531 Nitpick looks for all counterexamples that can be built using at most 10

   532 different lists.

   534 Let's see with an example involving \textit{hd} (which returns the first element

   535 of a list) and $@$ (which concatenates two lists):

   537 \prew

   538 \textbf{lemma} ``$\textit{hd}~(\textit{xs} \mathbin{@} [y, y]) = \textit{hd}~\textit{xs}$'' \\

   539 \textbf{nitpick} \\[2\smallskipamount]

   540 \slshape Nitpick found a counterexample for \textit{card} $'a$~= 3: \\[2\smallskipamount]

   541 \hbox{}\qquad Free variables: \nopagebreak \\

   542 \hbox{}\qquad\qquad $\textit{xs} = []$ \\

   543 \hbox{}\qquad\qquad $\textit{y} = a_3$

   544 \postw

   546 To see why the counterexample is genuine, we enable \textit{show\_consts}

   547 and \textit{show\_\allowbreak datatypes}:

   549 \prew

   550 {\slshape Datatype:} \\

   551 \hbox{}\qquad $'a$~\textit{list}~= $\{[],\, [a_3, a_3],\, [a_3],\, \unr\}$ \\

   552 {\slshape Constants:} \\

   553 \hbox{}\qquad $\lambda x_1.\; x_1 \mathbin{@} [y, y] = \undef([] := [a_3, a_3],\> [a_3, a_3] := \unk,\> [a_3] := \unk)$ \\

   554 \hbox{}\qquad $\textit{hd} = \undef([] := a_2,\> [a_3, a_3] := a_3,\> [a_3] := a_3)$

   555 \postw

   557 Since $\mathit{hd}~[]$ is undefined in the logic, it may be given any value,

   558 including $a_2$.

   560 The second constant, $\lambda x_1.\; x_1 \mathbin{@} [y, y]$, is simply the

   561 append operator whose second argument is fixed to be $[y, y]$. Appending $[a_3,

   562 a_3]$ to $[a_3]$ would normally give $[a_3, a_3, a_3]$, but this value is not

   563 representable in the subset of $'a$~\textit{list} considered by Nitpick, which

   564 is shown under the ``Datatype'' heading; hence the result is $\unk$. Similarly,

   565 appending $[a_3, a_3]$ to itself gives $\unk$.

   567 Given \textit{card}~$'a = 3$ and \textit{card}~$'a~\textit{list} = 3$, Nitpick

   568 considers the following subsets:

   570 \kern-.5\smallskipamount %% TYPESETTING

   572 \prew

   573 \begin{multicols}{3}

   574 $\{[],\, [a_1],\, [a_2]\}$; \\

   575 $\{[],\, [a_1],\, [a_3]\}$; \\

   576 $\{[],\, [a_2],\, [a_3]\}$; \\

   577 $\{[],\, [a_1],\, [a_1, a_1]\}$; \\

   578 $\{[],\, [a_1],\, [a_2, a_1]\}$; \\

   579 $\{[],\, [a_1],\, [a_3, a_1]\}$; \\

   580 $\{[],\, [a_2],\, [a_1, a_2]\}$; \\

   581 $\{[],\, [a_2],\, [a_2, a_2]\}$; \\

   582 $\{[],\, [a_2],\, [a_3, a_2]\}$; \\

   583 $\{[],\, [a_3],\, [a_1, a_3]\}$; \\

   584 $\{[],\, [a_3],\, [a_2, a_3]\}$; \\

   585 $\{[],\, [a_3],\, [a_3, a_3]\}$.

   586 \end{multicols}

   587 \postw

   589 \kern-2\smallskipamount %% TYPESETTING

   591 All subterm-closed subsets of $'a~\textit{list}$ consisting of three values

   592 are listed and only those. As an example of a non-subterm-closed subset,

   593 consider $\mathcal{S} = \{[],\, [a_1],\,\allowbreak [a_1, a_3]\}$, and observe

   594 that $[a_1, a_3]$ (i.e., $a_1 \mathbin{\#} [a_3]$) has $[a_3] \notin

   595 \mathcal{S}$ as a subterm.

   597 Here's another m\"ochtegern-lemma that Nitpick can refute without a blink:

   599 \prew

   600 \textbf{lemma} ``$\lbrakk \textit{length}~\textit{xs} = 1;\> \textit{length}~\textit{ys} = 1

   601 \rbrakk \,\Longrightarrow\, \textit{xs} = \textit{ys}$''

   602 \\

   603 \textbf{nitpick} [\textit{show\_datatypes}] \\[2\smallskipamount]

   604 \slshape Nitpick found a counterexample for \textit{card} $'a$~= 3: \\[2\smallskipamount]

   605 \hbox{}\qquad Free variables: \nopagebreak \\

   606 \hbox{}\qquad\qquad $\textit{xs} = [a_2]$ \\

   607 \hbox{}\qquad\qquad $\textit{ys} = [a_3]$ \\

   608 \hbox{}\qquad Datatypes: \\

   609 \hbox{}\qquad\qquad $\textit{nat} = \{0,\, 1,\, 2,\, \unr\}$ \\

   610 \hbox{}\qquad\qquad $'a$~\textit{list} = $\{[],\, [a_3],\, [a_2],\, \unr\}$

   611 \postw

   613 Because datatypes are approximated using a three-valued logic, there is usually

   614 no need to systematically enumerate cardinalities: If Nitpick cannot find a

   615 genuine counterexample for \textit{card}~$'a~\textit{list}$~= 10, it is very

   616 unlikely that one could be found for smaller cardinalities.

   618 \subsection{Typedefs, Records, Rationals, and Reals}

   619 \label{typedefs-records-rationals-and-reals}

   621 Nitpick generally treats types declared using \textbf{typedef} as datatypes

   622 whose single constructor is the corresponding \textit{Abs\_\kern.1ex} function.

   623 For example:

   625 \prew

   626 \textbf{typedef}~\textit{three} = ``$\{0\Colon\textit{nat},\, 1,\, 2\}$'' \\

   627 \textbf{by}~\textit{blast} \\[2\smallskipamount]

   628 \textbf{definition}~$A \mathbin{\Colon} \textit{three}$ \textbf{where} ``\kern-.1em$A \,\equiv\, \textit{Abs\_\allowbreak three}~0$'' \\

   629 \textbf{definition}~$B \mathbin{\Colon} \textit{three}$ \textbf{where} ``$B \,\equiv\, \textit{Abs\_three}~1$'' \\

   630 \textbf{definition}~$C \mathbin{\Colon} \textit{three}$ \textbf{where} ``$C \,\equiv\, \textit{Abs\_three}~2$'' \\[2\smallskipamount]

   631 \textbf{lemma} ``$\lbrakk P~A;\> P~B\rbrakk \,\Longrightarrow\, P~x$'' \\

   632 \textbf{nitpick} [\textit{show\_datatypes}] \\[2\smallskipamount]

   633 \slshape Nitpick found a counterexample: \\[2\smallskipamount]

   634 \hbox{}\qquad Free variables: \nopagebreak \\

   635 \hbox{}\qquad\qquad $P = \{\Abs{1},\, \Abs{0}\}$ \\

   636 \hbox{}\qquad\qquad $x = \Abs{2}$ \\

   637 \hbox{}\qquad Datatypes: \\

   638 \hbox{}\qquad\qquad $\textit{nat} = \{0,\, 1,\, 2,\, \unr\}$ \\

   639 \hbox{}\qquad\qquad $\textit{three} = \{\Abs{2},\, \Abs{1},\, \Abs{0},\, \unr\}$

   640 \postw

   642 %% MARK

   643 In the output above, $\Abs{n}$ abbreviates $\textit{Abs\_three}~n$.

   645 %% MARK

   646 Records, which are implemented as \textbf{typedef}s behind the scenes, are

   647 handled in much the same way:

   649 \prew

   650 \textbf{record} \textit{point} = \\

   651 \hbox{}\quad $\textit{Xcoord} \mathbin{\Colon} \textit{int}$ \\

   652 \hbox{}\quad $\textit{Ycoord} \mathbin{\Colon} \textit{int}$ \\[2\smallskipamount]

   653 \textbf{lemma} ``$\textit{Xcoord}~(p\Colon\textit{point}) = \textit{Xcoord}~(q\Colon\textit{point})$'' \\

   654 \textbf{nitpick} [\textit{show\_datatypes}] \\[2\smallskipamount]

   655 \slshape Nitpick found a counterexample: \\[2\smallskipamount]

   656 \hbox{}\qquad Free variables: \nopagebreak \\

   657 \hbox{}\qquad\qquad $p = \lparr\textit{Xcoord} = 0,\> \textit{Ycoord} = 0\rparr$ \\

   658 \hbox{}\qquad\qquad $q = \lparr\textit{Xcoord} = 1,\> \textit{Ycoord} = 1\rparr$ \\

   659 \hbox{}\qquad Datatypes: \\

   660 \hbox{}\qquad\qquad $\textit{int} = \{0,\, 1,\, \unr\}$ \\

   661 \hbox{}\qquad\qquad $\textit{point} = \{\lparr\textit{Xcoord} = 1,\>

   662 \textit{Ycoord} = 1\rparr,\> \lparr\textit{Xcoord} = 0,\> \textit{Ycoord} = 0\rparr,\, \unr\}$\kern-1pt %% QUIET

   663 \postw

   665 Finally, Nitpick provides rudimentary support for rationals and reals using a

   666 similar approach:

   668 \prew

   669 \textbf{lemma} ``$4 * x + 3 * (y\Colon\textit{real}) \not= 1/2$'' \\

   670 \textbf{nitpick} [\textit{show\_datatypes}] \\[2\smallskipamount]

   671 \slshape Nitpick found a counterexample: \\[2\smallskipamount]

   672 \hbox{}\qquad Free variables: \nopagebreak \\

   673 \hbox{}\qquad\qquad $x = 1/2$ \\

   674 \hbox{}\qquad\qquad $y = -1/2$ \\

   675 \hbox{}\qquad Datatypes: \\

   676 \hbox{}\qquad\qquad $\textit{nat} = \{0,\, 1,\, 2,\, 3,\, 4,\, 5,\, 6,\, 7,\, \unr\}$ \\

   677 \hbox{}\qquad\qquad $\textit{int} = \{0,\, 1,\, 2,\, 3,\, 4,\, -3,\, -2,\, -1,\, \unr\}$ \\

   678 \hbox{}\qquad\qquad $\textit{real} = \{1,\, 0,\, 4,\, -3/2,\, 3,\, 2,\, 1/2,\, -1/2,\, \unr\}$

   679 \postw

   681 \subsection{Inductive and Coinductive Predicates}

   682 \label{inductive-and-coinductive-predicates}

   684 Inductively defined predicates (and sets) are particularly problematic for

   685 counterexample generators. They can make Quickcheck~\cite{berghofer-nipkow-2004}

   686 loop forever and Refute~\cite{weber-2008} run out of resources. The crux of

   687 the problem is that they are defined using a least fixed point construction.

   689 Nitpick's philosophy is that not all inductive predicates are equal. Consider

   690 the \textit{even} predicate below:

   692 \prew

   693 \textbf{inductive}~\textit{even}~\textbf{where} \\

   694 ``\textit{even}~0'' $\,\mid$ \\

   695 ``\textit{even}~$n\,\Longrightarrow\, \textit{even}~(\textit{Suc}~(\textit{Suc}~n))$''

   696 \postw

   698 This predicate enjoys the desirable property of being well-founded, which means

   699 that the introduction rules don't give rise to infinite chains of the form

   701 \prew

   702 $\cdots\,\Longrightarrow\, \textit{even}~k''

   703        \,\Longrightarrow\, \textit{even}~k'

   704        \,\Longrightarrow\, \textit{even}~k.$

   705 \postw

   707 For \textit{even}, this is obvious: Any chain ending at $k$ will be of length

   708 $k/2 + 1$:

   710 \prew

   711 $\textit{even}~0\,\Longrightarrow\, \textit{even}~2\,\Longrightarrow\, \cdots

   712        \,\Longrightarrow\, \textit{even}~(k - 2)

   713        \,\Longrightarrow\, \textit{even}~k.$

   714 \postw

   716 Wellfoundedness is desirable because it enables Nitpick to use a very efficient

   717 fixed point computation.%

   718 \footnote{If an inductive predicate is

   719 well-founded, then it has exactly one fixed point, which is simultaneously the

   720 least and the greatest fixed point. In these circumstances, the computation of

   721 the least fixed point amounts to the computation of an arbitrary fixed point,

   722 which can be performed using a straightforward recursive equation.}

   723 Moreover, Nitpick can prove wellfoundedness of most well-founded predicates,

   724 just as Isabelle's \textbf{function} package usually discharges termination

   725 proof obligations automatically.

   727 Let's try an example:

   729 \prew

   730 \textbf{lemma} ``$\exists n.\; \textit{even}~n \mathrel{\land} \textit{even}~(\textit{Suc}~n)$'' \\

   731 \textbf{nitpick}~[\textit{card nat}~= 100,\, \textit{verbose}] \\[2\smallskipamount]

   732 \slshape The inductive predicate ``\textit{even}'' was proved well-founded.

   733 Nitpick can compute it efficiently. \\[2\smallskipamount]

   734 Trying 1 scope: \\

   735 \hbox{}\qquad \textit{card nat}~= 100. \\[2\smallskipamount]

   736 Nitpick found a potential counterexample for \textit{card nat}~= 100: \\[2\smallskipamount]

   737 \hbox{}\qquad Empty assignment \\[2\smallskipamount]

   738 Nitpick could not find a better counterexample. \\[2\smallskipamount]

   739 Total time: 2274 ms.

   740 \postw

   742 No genuine counterexample is possible because Nitpick cannot rule out the

   743 existence of a natural number $n \ge 100$ such that both $\textit{even}~n$ and

   744 $\textit{even}~(\textit{Suc}~n)$ are true. To help Nitpick, we can bound the

   745 existential quantifier:

   747 \prew

   748 \textbf{lemma} ``$\exists n \mathbin{\le} 99.\; \textit{even}~n \mathrel{\land} \textit{even}~(\textit{Suc}~n)$'' \\

   749 \textbf{nitpick}~[\textit{card nat}~= 100] \\[2\smallskipamount]

   750 \slshape Nitpick found a counterexample: \\[2\smallskipamount]

   751 \hbox{}\qquad Empty assignment

   752 \postw

   754 So far we were blessed by the wellfoundedness of \textit{even}. What happens if

   755 we use the following definition instead?

   757 \prew

   758 \textbf{inductive} $\textit{even}'$ \textbf{where} \\

   759 ``$\textit{even}'~(0{\Colon}\textit{nat})$'' $\,\mid$ \\

   760 ``$\textit{even}'~2$'' $\,\mid$ \\

   761 ``$\lbrakk\textit{even}'~m;\> \textit{even}'~n\rbrakk \,\Longrightarrow\, \textit{even}'~(m + n)$''

   762 \postw

   764 This definition is not well-founded: From $\textit{even}'~0$ and

   765 $\textit{even}'~0$, we can derive that $\textit{even}'~0$. Nonetheless, the

   766 predicates $\textit{even}$ and $\textit{even}'$ are equivalent.

   768 Let's check a property involving $\textit{even}'$. To make up for the

   769 foreseeable computational hurdles entailed by non-wellfoundedness, we decrease

   770 \textit{nat}'s cardinality to a mere 10:

   772 \prew

   773 \textbf{lemma}~``$\exists n \in \{0, 2, 4, 6, 8\}.\;

   774 \lnot\;\textit{even}'~n$'' \\

   775 \textbf{nitpick}~[\textit{card nat}~= 10,\, \textit{verbose},\, \textit{show\_consts}] \\[2\smallskipamount]

   776 \slshape

   777 The inductive predicate ``$\textit{even}'\!$'' could not be proved well-founded.

   778 Nitpick might need to unroll it. \\[2\smallskipamount]

   779 Trying 6 scopes: \\

   780 \hbox{}\qquad \textit{card nat}~= 10 and \textit{iter} $\textit{even}'$~= 0; \\

   781 \hbox{}\qquad \textit{card nat}~= 10 and \textit{iter} $\textit{even}'$~= 1; \\

   782 \hbox{}\qquad \textit{card nat}~= 10 and \textit{iter} $\textit{even}'$~= 2; \\

   783 \hbox{}\qquad \textit{card nat}~= 10 and \textit{iter} $\textit{even}'$~= 4; \\

   784 \hbox{}\qquad \textit{card nat}~= 10 and \textit{iter} $\textit{even}'$~= 8; \\

   785 \hbox{}\qquad \textit{card nat}~= 10 and \textit{iter} $\textit{even}'$~= 9. \\[2\smallskipamount]

   786 Nitpick found a counterexample for \textit{card nat}~= 10 and \textit{iter} $\textit{even}'$~= 2: \\[2\smallskipamount]

   787 \hbox{}\qquad Constant: \nopagebreak \\

   788 \hbox{}\qquad\qquad $\lambda i.\; \textit{even}'$ = $\undef(\!\begin{aligned}[t]

   789 & 2 := \{0, 2, 4, 6, 8, 1^\Q, 3^\Q, 5^\Q, 7^\Q, 9^\Q\}, \\[-2pt]

   790 & 1 := \{0, 2, 4, 1^\Q, 3^\Q, 5^\Q, 6^\Q, 7^\Q, 8^\Q, 9^\Q\}, \\[-2pt]

   791 & 0 := \{0, 2, 1^\Q, 3^\Q, 4^\Q, 5^\Q, 6^\Q, 7^\Q, 8^\Q, 9^\Q\})\end{aligned}$ \\[2\smallskipamount]

   792 Total time: 1140 ms.

   793 \postw

   795 Nitpick's output is very instructive. First, it tells us that the predicate is

   796 unrolled, meaning that it is computed iteratively from the empty set. Then it

   797 lists six scopes specifying different bounds on the numbers of iterations:\ 0,

   798 1, 2, 4, 8, and~9.

   800 The output also shows how each iteration contributes to $\textit{even}'$. The

   801 notation $\lambda i.\; \textit{even}'$ indicates that the value of the

   802 predicate depends on an iteration counter. Iteration 0 provides the basis

   803 elements, $0$ and $2$. Iteration 1 contributes $4$ ($= 2 + 2$). Iteration 2

   804 throws $6$ ($= 2 + 4 = 4 + 2$) and $8$ ($= 4 + 4$) into the mix. Further

   805 iterations would not contribute any new elements.

   807 Some values are marked with superscripted question

   808 marks~(`\lower.2ex\hbox{$^\Q$}'). These are the elements for which the

   809 predicate evaluates to $\unk$. Thus, $\textit{even}'$ evaluates to either

   810 \textit{True} or $\unk$, never \textit{False}.

   812 When unrolling a predicate, Nitpick tries 0, 1, 2, 4, 8, 12, 16, and 24

   813 iterations. However, these numbers are bounded by the cardinality of the

   814 predicate's domain. With \textit{card~nat}~= 10, no more than 9 iterations are

   815 ever needed to compute the value of a \textit{nat} predicate. You can specify

   816 the number of iterations using the \textit{iter} option, as explained in

   817 \S\ref{scope-of-search}.

   819 In the next formula, $\textit{even}'$ occurs both positively and negatively:

   821 \prew

   822 \textbf{lemma} ``$\textit{even}'~(n - 2) \,\Longrightarrow\, \textit{even}'~n$'' \\

   823 \textbf{nitpick} [\textit{card nat} = 10,\, \textit{show\_consts}] \\[2\smallskipamount]

   824 \slshape Nitpick found a counterexample: \\[2\smallskipamount]

   825 \hbox{}\qquad Free variable: \nopagebreak \\

   826 \hbox{}\qquad\qquad $n = 1$ \\

   827 \hbox{}\qquad Constants: \nopagebreak \\

   828 \hbox{}\qquad\qquad $\lambda i.\; \textit{even}'$ = $\undef(\!\begin{aligned}[t]

   829 & 0 := \{0, 2, 1^\Q, 3^\Q, 4^\Q, 5^\Q, 6^\Q, 7^\Q, 8^\Q, 9^\Q\})\end{aligned}$  \\

   830 \hbox{}\qquad\qquad $\textit{even}' \subseteq \{0, 2, 4, 6, 8, \unr\}$

   831 \postw

   833 Notice the special constraint $\textit{even}' \subseteq \{0,\, 2,\, 4,\, 6,\,

   834 8,\, \unr\}$ in the output, whose right-hand side represents an arbitrary

   835 fixed point (not necessarily the least one). It is used to falsify

   836 $\textit{even}'~n$. In contrast, the unrolled predicate is used to satisfy

   837 $\textit{even}'~(n - 2)$.

   839 Coinductive predicates are handled dually. For example:

   841 \prew

   842 \textbf{coinductive} \textit{nats} \textbf{where} \\

   843 ``$\textit{nats}~(x\Colon\textit{nat}) \,\Longrightarrow\, \textit{nats}~x$'' \\[2\smallskipamount]

   844 \textbf{lemma} ``$\textit{nats} = \{0, 1, 2, 3, 4\}$'' \\

   845 \textbf{nitpick}~[\textit{card nat} = 10,\, \textit{show\_consts}] \\[2\smallskipamount]

   846 \slshape Nitpick found a counterexample:

   847 \\[2\smallskipamount]

   848 \hbox{}\qquad Constants: \nopagebreak \\

   849 \hbox{}\qquad\qquad $\lambda i.\; \textit{nats} = \undef(0 := \{\!\begin{aligned}[t]

   850 & 0^\Q, 1^\Q, 2^\Q, 3^\Q, 4^\Q, 5^\Q, 6^\Q, 7^\Q, 8^\Q, 9^\Q, \\[-2pt]

   851 & \unr\})\end{aligned}$ \\

   852 \hbox{}\qquad\qquad $nats \supseteq \{9, 5^\Q, 6^\Q, 7^\Q, 8^\Q, \unr\}$

   853 \postw

   855 As a special case, Nitpick uses Kodkod's transitive closure operator to encode

   856 negative occurrences of non-well-founded ``linear inductive predicates,'' i.e.,

   857 inductive predicates for which each the predicate occurs in at most one

   858 assumption of each introduction rule. For example:

   860 \prew

   861 \textbf{inductive} \textit{odd} \textbf{where} \\

   862 ``$\textit{odd}~1$'' $\,\mid$ \\

   863 ``$\lbrakk \textit{odd}~m;\>\, \textit{even}~n\rbrakk \,\Longrightarrow\, \textit{odd}~(m + n)$'' \\[2\smallskipamount]

   864 \textbf{lemma}~``$\textit{odd}~n \,\Longrightarrow\, \textit{odd}~(n - 2)$'' \\

   865 \textbf{nitpick}~[\textit{card nat} = 10,\, \textit{show\_consts}] \\[2\smallskipamount]

   866 \slshape Nitpick found a counterexample:

   867 \\[2\smallskipamount]

   868 \hbox{}\qquad Free variable: \nopagebreak \\

   869 \hbox{}\qquad\qquad $n = 1$ \\

   870 \hbox{}\qquad Constants: \nopagebreak \\

   871 \hbox{}\qquad\qquad $\textit{even} = \{0, 2, 4, 6, 8, \unr\}$ \\

   872 \hbox{}\qquad\qquad $\textit{odd}_{\textsl{base}} = \{1, \unr\}$ \\

   873 \hbox{}\qquad\qquad $\textit{odd}_{\textsl{step}} = \!

   874 \!\begin{aligned}[t]

   875   & \{(0, 0), (0, 2), (0, 4), (0, 6), (0, 8), (1, 1), (1, 3), (1, 5), \\[-2pt]

   876   & \phantom{\{} (1, 7), (1, 9), (2, 2), (2, 4), (2, 6), (2, 8), (3, 3),

   877        (3, 5), \\[-2pt]

   878   & \phantom{\{} (3, 7), (3, 9), (4, 4), (4, 6), (4, 8), (5, 5), (5, 7), (5, 9), \\[-2pt]

   879   & \phantom{\{} (6, 6), (6, 8), (7, 7), (7, 9), (8, 8), (9, 9), \unr\}\end{aligned}$ \\

   880 \hbox{}\qquad\qquad $\textit{odd} \subseteq \{1, 3, 5, 7, 9, 8^\Q, \unr\}$

   881 \postw

   883 \noindent

   884 In the output, $\textit{odd}_{\textrm{base}}$ represents the base elements and

   885 $\textit{odd}_{\textrm{step}}$ is a transition relation that computes new

   886 elements from known ones. The set $\textit{odd}$ consists of all the values

   887 reachable through the reflexive transitive closure of

   888 $\textit{odd}_{\textrm{step}}$ starting with any element from

   889 $\textit{odd}_{\textrm{base}}$, namely 1, 3, 5, 7, and 9. Using Kodkod's

   890 transitive closure to encode linear predicates is normally either more thorough

   891 or more efficient than unrolling (depending on the value of \textit{iter}), but

   892 for those cases where it isn't you can disable it by passing the

   893 \textit{dont\_star\_linear\_preds} option.

   895 \subsection{Coinductive Datatypes}

   896 \label{coinductive-datatypes}

   898 While Isabelle regrettably lacks a high-level mechanism for defining coinductive

   899 datatypes, the \textit{Coinductive\_List} theory provides a coinductive ``lazy

   900 list'' datatype, $'a~\textit{llist}$, defined the hard way. Nitpick supports

   901 these lazy lists seamlessly and provides a hook, described in

   902 \S\ref{registration-of-coinductive-datatypes}, to register custom coinductive

   903 datatypes.

   905 (Co)intuitively, a coinductive datatype is similar to an inductive datatype but

   906 allows infinite objects. Thus, the infinite lists $\textit{ps}$ $=$ $[a, a, a,

   907 \ldots]$, $\textit{qs}$ $=$ $[a, b, a, b, \ldots]$, and $\textit{rs}$ $=$ $[0,

   908 1, 2, 3, \ldots]$ can be defined as lazy lists using the

   909 $\textit{LNil}\mathbin{\Colon}{'}a~\textit{llist}$ and

   910 $\textit{LCons}\mathbin{\Colon}{'}a \mathbin{\Rightarrow} {'}a~\textit{llist}

   911 \mathbin{\Rightarrow} {'}a~\textit{llist}$ constructors.

   913 Although it is otherwise no friend of infinity, Nitpick can find counterexamples

   914 involving cyclic lists such as \textit{ps} and \textit{qs} above as well as

   915 finite lists:

   917 \prew

   918 \textbf{lemma} ``$\textit{xs} \not= \textit{LCons}~a~\textit{xs}$'' \\

   919 \textbf{nitpick} \\[2\smallskipamount]

   920 \slshape Nitpick found a counterexample for {\itshape card}~$'a$ = 1: \\[2\smallskipamount]

   921 \hbox{}\qquad Free variables: \nopagebreak \\

   922 \hbox{}\qquad\qquad $\textit{a} = a_1$ \\

   923 \hbox{}\qquad\qquad $\textit{xs} = \textsl{THE}~\omega.\; \omega = \textit{LCons}~a_1~\omega$

   924 \postw

   926 The notation $\textrm{THE}~\omega.\; \omega = t(\omega)$ stands

   927 for the infinite term $t(t(t(\ldots)))$. Hence, \textit{xs} is simply the

   928 infinite list $[a_1, a_1, a_1, \ldots]$.

   930 The next example is more interesting:

   932 \prew

   933 \textbf{lemma}~``$\lbrakk\textit{xs} = \textit{LCons}~a~\textit{xs};\>\,

   934 \textit{ys} = \textit{iterates}~(\lambda b.\> a)~b\rbrakk \,\Longrightarrow\, \textit{xs} = \textit{ys}$'' \\

   935 \textbf{nitpick} [\textit{verbose}] \\[2\smallskipamount]

   936 \slshape The type ``\kern1pt$'a$'' passed the monotonicity test. Nitpick might be able to skip

   937 some scopes. \\[2\smallskipamount]

   938 Trying 8 scopes: \\

   939 \hbox{}\qquad \textit{card} $'a$~= 1, \textit{card} ``\kern1pt$'a~\textit{list}$''~= 1,

   940 and \textit{bisim\_depth}~= 0. \\

   941 \hbox{}\qquad $\qquad\vdots$ \\[.5\smallskipamount]

   942 \hbox{}\qquad \textit{card} $'a$~= 8, \textit{card} ``\kern1pt$'a~\textit{list}$''~= 8,

   943 and \textit{bisim\_depth}~= 7. \\[2\smallskipamount]

   944 Nitpick found a counterexample for {\itshape card}~$'a$ = 2,

   945 \textit{card}~``\kern1pt$'a~\textit{list}$''~= 2, and \textit{bisim\_\allowbreak

   946 depth}~= 1:

   947 \\[2\smallskipamount]

   948 \hbox{}\qquad Free variables: \nopagebreak \\

   949 \hbox{}\qquad\qquad $\textit{a} = a_2$ \\

   950 \hbox{}\qquad\qquad $\textit{b} = a_1$ \\

   951 \hbox{}\qquad\qquad $\textit{xs} = \textsl{THE}~\omega.\; \omega = \textit{LCons}~a_2~\omega$ \\

   952 \hbox{}\qquad\qquad $\textit{ys} = \textit{LCons}~a_1~(\textsl{THE}~\omega.\; \omega = \textit{LCons}~a_2~\omega)$ \\[2\smallskipamount]

   953 Total time: 726 ms.

   954 \postw

   956 The lazy list $\textit{xs}$ is simply $[a_2, a_2, a_2, \ldots]$, whereas

   957 $\textit{ys}$ is $[a_1, a_2, a_2, a_2, \ldots]$, i.e., a lasso-shaped list with

   958 $[a_1]$ as its stem and $[a_2]$ as its cycle. In general, the list segment

   959 within the scope of the {THE} binder corresponds to the lasso's cycle, whereas

   960 the segment leading to the binder is the stem.

   962 A salient property of coinductive datatypes is that two objects are considered

   963 equal if and only if they lead to the same observations. For example, the lazy

   964 lists $\textrm{THE}~\omega.\; \omega =

   965 \textit{LCons}~a~(\textit{LCons}~b~\omega)$ and

   966 $\textit{LCons}~a~(\textrm{THE}~\omega.\; \omega =

   967 \textit{LCons}~b~(\textit{LCons}~a~\omega))$ are identical, because both lead

   968 to the sequence of observations $a$, $b$, $a$, $b$, \hbox{\ldots} (or,

   969 equivalently, both encode the infinite list $[a, b, a, b, \ldots]$). This

   970 concept of equality for coinductive datatypes is called bisimulation and is

   971 defined coinductively.

   973 Internally, Nitpick encodes the coinductive bisimilarity predicate as part of

   974 the Kodkod problem to ensure that distinct objects lead to different

   975 observations. This precaution is somewhat expensive and often unnecessary, so it

   976 can be disabled by setting the \textit{bisim\_depth} option to $-1$. The

   977 bisimilarity check is then performed \textsl{after} the counterexample has been

   978 found to ensure correctness. If this after-the-fact check fails, the

   979 counterexample is tagged as ``likely genuine'' and Nitpick recommends to try

   980 again with \textit{bisim\_depth} set to a nonnegative integer. Disabling the

   981 check for the previous example saves approximately 150~milli\-seconds; the speed

   982 gains can be more significant for larger scopes.

   984 The next formula illustrates the need for bisimilarity (either as a Kodkod

   985 predicate or as an after-the-fact check) to prevent spurious counterexamples:

   987 \prew

   988 \textbf{lemma} ``$\lbrakk xs = \textit{LCons}~a~\textit{xs};\>\, \textit{ys} = \textit{LCons}~a~\textit{ys}\rbrakk

   989 \,\Longrightarrow\, \textit{xs} = \textit{ys}$'' \\

   990 \textbf{nitpick} [\textit{bisim\_depth} = $-1$,\, \textit{show\_datatypes}] \\[2\smallskipamount]

   991 \slshape Nitpick found a likely genuine counterexample for $\textit{card}~'a$ = 2: \\[2\smallskipamount]

   992 \hbox{}\qquad Free variables: \nopagebreak \\

   993 \hbox{}\qquad\qquad $a = a_2$ \\

   994 \hbox{}\qquad\qquad $\textit{xs} = \textsl{THE}~\omega.\; \omega =

   995 \textit{LCons}~a_2~\omega$ \\

   996 \hbox{}\qquad\qquad $\textit{ys} = \textsl{THE}~\omega.\; \omega = \textit{LCons}~a_2~\omega$ \\

   997 \hbox{}\qquad Codatatype:\strut \nopagebreak \\

   998 \hbox{}\qquad\qquad $'a~\textit{llist} =

   999 \{\!\begin{aligned}[t]

  1000   & \textsl{THE}~\omega.\; \omega = \textit{LCons}~a_2~\omega, \\[-2pt]

  1001   & \textsl{THE}~\omega.\; \omega = \textit{LCons}~a_2~\omega,\> \unr\}\end{aligned}$

  1002 \\[2\smallskipamount]

  1003 Try again with ``\textit{bisim\_depth}'' set to a nonnegative value to confirm

  1004 that the counterexample is genuine. \\[2\smallskipamount]

  1005 {\upshape\textbf{nitpick}} \\[2\smallskipamount]

  1006 \slshape Nitpick found no counterexample.

  1007 \postw

  1009 In the first \textbf{nitpick} invocation, the after-the-fact check discovered

  1010 that the two known elements of type $'a~\textit{llist}$ are bisimilar.

  1012 A compromise between leaving out the bisimilarity predicate from the Kodkod

  1013 problem and performing the after-the-fact check is to specify a lower

  1014 nonnegative \textit{bisim\_depth} value than the default one provided by

  1015 Nitpick. In general, a value of $K$ means that Nitpick will require all lists to

  1016 be distinguished from each other by their prefixes of length $K$. Be aware that

  1017 setting $K$ to a too low value can overconstrain Nitpick, preventing it from

  1018 finding any counterexamples.

  1020 \subsection{Boxing}

  1021 \label{boxing}

  1023 Nitpick normally maps function and product types directly to the corresponding

  1024 Kodkod concepts. As a consequence, if $'a$ has cardinality 3 and $'b$ has

  1025 cardinality 4, then $'a \times {'}b$ has cardinality 12 ($= 4 \times 3$) and $'a

  1026 \Rightarrow {'}b$ has cardinality 64 ($= 4^3$). In some circumstances, it pays

  1027 off to treat these types in the same way as plain datatypes, by approximating

  1028 them by a subset of a given cardinality. This technique is called ``boxing'' and

  1029 is particularly useful for functions passed as arguments to other functions, for

  1030 high-arity functions, and for large tuples. Under the hood, boxing involves

  1031 wrapping occurrences of the types $'a \times {'}b$ and $'a \Rightarrow {'}b$ in

  1032 isomorphic datatypes, as can be seen by enabling the \textit{debug} option.

  1034 To illustrate boxing, we consider a formalization of $\lambda$-terms represented

  1035 using de Bruijn's notation:

  1037 \prew

  1038 \textbf{datatype} \textit{tm} = \textit{Var}~\textit{nat}~$\mid$~\textit{Lam}~\textit{tm} $\mid$ \textit{App~tm~tm}

  1039 \postw

  1041 The $\textit{lift}~t~k$ function increments all variables with indices greater

  1042 than or equal to $k$ by one:

  1044 \prew

  1045 \textbf{primrec} \textit{lift} \textbf{where} \\

  1046 ``$\textit{lift}~(\textit{Var}~j)~k = \textit{Var}~(\textrm{if}~j < k~\textrm{then}~j~\textrm{else}~j + 1)$'' $\mid$ \\

  1047 ``$\textit{lift}~(\textit{Lam}~t)~k = \textit{Lam}~(\textit{lift}~t~(k + 1))$'' $\mid$ \\

  1048 ``$\textit{lift}~(\textit{App}~t~u)~k = \textit{App}~(\textit{lift}~t~k)~(\textit{lift}~u~k)$''

  1049 \postw

  1051 The $\textit{loose}~t~k$ predicate returns \textit{True} if and only if

  1052 term $t$ has a loose variable with index $k$ or more:

  1054 \prew

  1055 \textbf{primrec}~\textit{loose} \textbf{where} \\

  1056 ``$\textit{loose}~(\textit{Var}~j)~k = (j \ge k)$'' $\mid$ \\

  1057 ``$\textit{loose}~(\textit{Lam}~t)~k = \textit{loose}~t~(\textit{Suc}~k)$'' $\mid$ \\

  1058 ``$\textit{loose}~(\textit{App}~t~u)~k = (\textit{loose}~t~k \mathrel{\lor} \textit{loose}~u~k)$''

  1059 \postw

  1061 Next, the $\textit{subst}~\sigma~t$ function applies the substitution $\sigma$

  1062 on $t$:

  1064 \prew

  1065 \textbf{primrec}~\textit{subst} \textbf{where} \\

  1066 ``$\textit{subst}~\sigma~(\textit{Var}~j) = \sigma~j$'' $\mid$ \\

  1067 ``$\textit{subst}~\sigma~(\textit{Lam}~t) = {}$\phantom{''} \\

  1068 \phantom{``}$\textit{Lam}~(\textit{subst}~(\lambda n.\> \textrm{case}~n~\textrm{of}~0 \Rightarrow \textit{Var}~0 \mid \textit{Suc}~m \Rightarrow \textit{lift}~(\sigma~m)~1)~t)$'' $\mid$ \\

  1069 ``$\textit{subst}~\sigma~(\textit{App}~t~u) = \textit{App}~(\textit{subst}~\sigma~t)~(\textit{subst}~\sigma~u)$''

  1070 \postw

  1072 A substitution is a function that maps variable indices to terms. Observe that

  1073 $\sigma$ is a function passed as argument and that Nitpick can't optimize it

  1074 away, because the recursive call for the \textit{Lam} case involves an altered

  1075 version. Also notice the \textit{lift} call, which increments the variable

  1076 indices when moving under a \textit{Lam}.

  1078 A reasonable property to expect of substitution is that it should leave closed

  1079 terms unchanged. Alas, even this simple property does not hold:

  1081 \pre

  1082 \textbf{lemma}~``$\lnot\,\textit{loose}~t~0 \,\Longrightarrow\, \textit{subst}~\sigma~t = t$'' \\

  1083 \textbf{nitpick} [\textit{verbose}] \\[2\smallskipamount]

  1084 \slshape

  1085 Trying 8 scopes: \nopagebreak \\

  1086 \hbox{}\qquad \textit{card~nat}~= 1, \textit{card tm}~= 1, and \textit{card} ``$\textit{nat} \Rightarrow \textit{tm}$'' = 1; \\

  1087 \hbox{}\qquad \textit{card~nat}~= 2, \textit{card tm}~= 2, and \textit{card} ``$\textit{nat} \Rightarrow \textit{tm}$'' = 2; \\

  1088 \hbox{}\qquad $\qquad\vdots$ \\[.5\smallskipamount]

  1089 \hbox{}\qquad \textit{card~nat}~= 8, \textit{card tm}~= 8, and \textit{card} ``$\textit{nat} \Rightarrow \textit{tm}$'' = 8. \\[2\smallskipamount]

  1090 Nitpick found a counterexample for \textit{card~nat}~= 6, \textit{card~tm}~= 6,

  1091 and \textit{card}~``$\textit{nat} \Rightarrow \textit{tm}$''~= 6: \\[2\smallskipamount]

  1092 \hbox{}\qquad Free variables: \nopagebreak \\

  1093 \hbox{}\qquad\qquad $\sigma = \undef(\!\begin{aligned}[t]

  1094 & 0 := \textit{Var}~0,\>

  1095   1 := \textit{Var}~0,\>

  1096   2 := \textit{Var}~0, \\[-2pt]

  1097 & 3 := \textit{Var}~0,\>

  1098   4 := \textit{Var}~0,\>

  1099   5 := \textit{Var}~0)\end{aligned}$ \\

  1100 \hbox{}\qquad\qquad $t = \textit{Lam}~(\textit{Lam}~(\textit{Var}~1))$ \\[2\smallskipamount]

  1101 Total time: $4679$ ms.

  1102 \postw

  1104 Using \textit{eval}, we find out that $\textit{subst}~\sigma~t =

  1105 \textit{Lam}~(\textit{Lam}~(\textit{Var}~0))$. Using the traditional

  1106 $\lambda$-term notation, $t$~is

  1107 $\lambda x\, y.\> x$ whereas $\textit{subst}~\sigma~t$ is $\lambda x\, y.\> y$.

  1108 The bug is in \textit{subst}: The $\textit{lift}~(\sigma~m)~1$ call should be

  1109 replaced with $\textit{lift}~(\sigma~m)~0$.

  1111 An interesting aspect of Nitpick's verbose output is that it assigned inceasing

  1112 cardinalities from 1 to 8 to the type $\textit{nat} \Rightarrow \textit{tm}$.

  1113 For the formula of interest, knowing 6 values of that type was enough to find

  1114 the counterexample. Without boxing, $46\,656$ ($= 6^6$) values must be

  1115 considered, a hopeless undertaking:

  1117 \prew

  1118 \textbf{nitpick} [\textit{dont\_box}] \\[2\smallskipamount]

  1119 {\slshape Nitpick ran out of time after checking 4 of 8 scopes.}

  1120 \postw

  1122 {\looseness=-1

  1123 Boxing can be enabled or disabled globally or on a per-type basis using the

  1124 \textit{box} option. Moreover, setting the cardinality of a function or

  1125 product type implicitly enables boxing for that type. Nitpick usually performs

  1126 reasonable choices about which types should be boxed, but option tweaking

  1127 sometimes helps.

  1129 }

  1131 \subsection{Scope Monotonicity}

  1132 \label{scope-monotonicity}

  1134 The \textit{card} option (together with \textit{iter}, \textit{bisim\_depth},

  1135 and \textit{max}) controls which scopes are actually tested. In general, to

  1136 exhaust all models below a certain cardinality bound, the number of scopes that

  1137 Nitpick must consider increases exponentially with the number of type variables

  1138 (and \textbf{typedecl}'d types) occurring in the formula. Given the default

  1139 cardinality specification of 1--8, no fewer than $8^4 = 4096$ scopes must be

  1140 considered for a formula involving $'a$, $'b$, $'c$, and $'d$.

  1142 Fortunately, many formulas exhibit a property called \textsl{scope

  1143 monotonicity}, meaning that if the formula is falsifiable for a given scope,

  1144 it is also falsifiable for all larger scopes \cite[p.~165]{jackson-2006}.

  1146 Consider the formula

  1148 \prew

  1149 \textbf{lemma}~``$\textit{length~xs} = \textit{length~ys} \,\Longrightarrow\, \textit{rev}~(\textit{zip~xs~ys}) = \textit{zip~xs}~(\textit{rev~ys})$''

  1150 \postw

  1152 where \textit{xs} is of type $'a~\textit{list}$ and \textit{ys} is of type

  1153 $'b~\textit{list}$. A priori, Nitpick would need to consider 512 scopes to

  1154 exhaust the specification \textit{card}~= 1--8. However, our intuition tells us

  1155 that any counterexample found with a small scope would still be a counterexample

  1156 in a larger scope---by simply ignoring the fresh $'a$ and $'b$ values provided

  1157 by the larger scope. Nitpick comes to the same conclusion after a careful

  1158 inspection of the formula and the relevant definitions:

  1160 \prew

  1161 \textbf{nitpick}~[\textit{verbose}] \\[2\smallskipamount]

  1162 \slshape

  1163 The types ``\kern1pt$'a$'' and ``\kern1pt$'b$'' passed the monotonicity test.

  1164 Nitpick might be able to skip some scopes.

  1165  \\[2\smallskipamount]

  1166 Trying 8 scopes: \\

  1167 \hbox{}\qquad \textit{card} $'a$~= 1, \textit{card} $'b$~= 1,

  1168 \textit{card} \textit{nat}~= 1, \textit{card} ``$('a \times {'}b)$

  1169 \textit{list}''~= 1, \\

  1170 \hbox{}\qquad\quad \textit{card} ``\kern1pt$'a$ \textit{list}''~= 1, and

  1171 \textit{card} ``\kern1pt$'b$ \textit{list}''~= 1. \\

  1172 \hbox{}\qquad \textit{card} $'a$~= 2, \textit{card} $'b$~= 2,

  1173 \textit{card} \textit{nat}~= 2, \textit{card} ``$('a \times {'}b)$

  1174 \textit{list}''~= 2, \\

  1175 \hbox{}\qquad\quad \textit{card} ``\kern1pt$'a$ \textit{list}''~= 2, and

  1176 \textit{card} ``\kern1pt$'b$ \textit{list}''~= 2. \\

  1177 \hbox{}\qquad $\qquad\vdots$ \\[.5\smallskipamount]

  1178 \hbox{}\qquad \textit{card} $'a$~= 8, \textit{card} $'b$~= 8,

  1179 \textit{card} \textit{nat}~= 8, \textit{card} ``$('a \times {'}b)$

  1180 \textit{list}''~= 8, \\

  1181 \hbox{}\qquad\quad \textit{card} ``\kern1pt$'a$ \textit{list}''~= 8, and

  1182 \textit{card} ``\kern1pt$'b$ \textit{list}''~= 8.

  1183 \\[2\smallskipamount]

  1184 Nitpick found a counterexample for

  1185 \textit{card} $'a$~= 5, \textit{card} $'b$~= 5,

  1186 \textit{card} \textit{nat}~= 5, \textit{card} ``$('a \times {'}b)$

  1187 \textit{list}''~= 5, \textit{card} ``\kern1pt$'a$ \textit{list}''~= 5, and

  1188 \textit{card} ``\kern1pt$'b$ \textit{list}''~= 5:

  1189 \\[2\smallskipamount]

  1190 \hbox{}\qquad Free variables: \nopagebreak \\

  1191 \hbox{}\qquad\qquad $\textit{xs} = [a_4, a_5]$ \\

  1192 \hbox{}\qquad\qquad $\textit{ys} = [b_3, b_3]$ \\[2\smallskipamount]

  1193 Total time: 1636 ms.

  1194 \postw

  1196 In theory, it should be sufficient to test a single scope:

  1198 \prew

  1199 \textbf{nitpick}~[\textit{card}~= 8]

  1200 \postw

  1202 However, this is often less efficient in practice and may lead to overly complex

  1203 counterexamples.

  1205 If the monotonicity check fails but we believe that the formula is monotonic (or

  1206 we don't mind missing some counterexamples), we can pass the

  1207 \textit{mono} option. To convince yourself that this option is risky,

  1208 simply consider this example from \S\ref{skolemization}:

  1210 \prew

  1211 \textbf{lemma} ``$\exists g.\; \forall x\Colon 'b.~g~(f~x) = x

  1212  \,\Longrightarrow\, \forall y\Colon {'}a.\; \exists x.~y = f~x$'' \\

  1213 \textbf{nitpick} [\textit{mono}] \\[2\smallskipamount]

  1214 {\slshape Nitpick found no counterexample.} \\[2\smallskipamount]

  1215 \textbf{nitpick} \\[2\smallskipamount]

  1216 \slshape

  1217 Nitpick found a counterexample for \textit{card} $'a$~= 2 and \textit{card} $'b$~=~1: \\

  1218 \hbox{}\qquad $\vdots$

  1219 \postw

  1221 (It turns out the formula holds if and only if $\textit{card}~'a \le

  1222 \textit{card}~'b$.) Although this is rarely advisable, the automatic

  1223 monotonicity checks can be disabled by passing \textit{non\_mono}

  1224 (\S\ref{optimizations}).

  1226 As insinuated in \S\ref{natural-numbers-and-integers} and

  1227 \S\ref{inductive-datatypes}, \textit{nat}, \textit{int}, and inductive datatypes

  1228 are normally monotonic and treated as such. The same is true for record types,

  1229 \textit{rat}, \textit{real}, and some \textbf{typedef}'d types. Thus, given the

  1230 cardinality specification 1--8, a formula involving \textit{nat}, \textit{int},

  1231 \textit{int~list}, \textit{rat}, and \textit{rat~list} will lead Nitpick to

  1232 consider only 8~scopes instead of $32\,768$.

  1234 \section{Case Studies}

  1235 \label{case-studies}

  1237 As a didactic device, the previous section focused mostly on toy formulas whose

  1238 validity can easily be assessed just by looking at the formula. We will now

  1239 review two somewhat more realistic case studies that are within Nitpick's

  1240 reach:\ a context-free grammar modeled by mutually inductive sets and a

  1241 functional implementation of AA trees. The results presented in this

  1242 section were produced with the following settings:

  1244 \prew

  1245 \textbf{nitpick\_params} [\textit{max\_potential}~= 0,\, \textit{max\_threads} = 2]

  1246 \postw

  1248 \subsection{A Context-Free Grammar}

  1249 \label{a-context-free-grammar}

  1251 Our first case study is taken from section 7.4 in the Isabelle tutorial

  1252 \cite{isa-tutorial}. The following grammar, originally due to Hopcroft and

  1253 Ullman, produces all strings with an equal number of $a$'s and $b$'s:

  1255 \prew

  1256 \begin{tabular}{@{}r@{$\;\,$}c@{$\;\,$}l@{}}

  1257 $S$ & $::=$ & $\epsilon \mid bA \mid aB$ \\

  1258 $A$ & $::=$ & $aS \mid bAA$ \\

  1259 $B$ & $::=$ & $bS \mid aBB$

  1260 \end{tabular}

  1261 \postw

  1263 The intuition behind the grammar is that $A$ generates all string with one more

  1264 $a$ than $b$'s and $B$ generates all strings with one more $b$ than $a$'s.

  1266 The alphabet consists exclusively of $a$'s and $b$'s:

  1268 \prew

  1269 \textbf{datatype} \textit{alphabet}~= $a$ $\mid$ $b$

  1270 \postw

  1272 Strings over the alphabet are represented by \textit{alphabet list}s.

  1273 Nonterminals in the grammar become sets of strings. The production rules

  1274 presented above can be expressed as a mutually inductive definition:

  1276 \prew

  1277 \textbf{inductive\_set} $S$ \textbf{and} $A$ \textbf{and} $B$ \textbf{where} \\

  1278 \textit{R1}:\kern.4em ``$[] \in S$'' $\,\mid$ \\

  1279 \textit{R2}:\kern.4em ``$w \in A\,\Longrightarrow\, b \mathbin{\#} w \in S$'' $\,\mid$ \\

  1280 \textit{R3}:\kern.4em ``$w \in B\,\Longrightarrow\, a \mathbin{\#} w \in S$'' $\,\mid$ \\

  1281 \textit{R4}:\kern.4em ``$w \in S\,\Longrightarrow\, a \mathbin{\#} w \in A$'' $\,\mid$ \\

  1282 \textit{R5}:\kern.4em ``$w \in S\,\Longrightarrow\, b \mathbin{\#} w \in S$'' $\,\mid$ \\

  1283 \textit{R6}:\kern.4em ``$\lbrakk v \in B;\> v \in B\rbrakk \,\Longrightarrow\, a \mathbin{\#} v \mathbin{@} w \in B$''

  1284 \postw

  1286 The conversion of the grammar into the inductive definition was done manually by

  1287 Joe Blow, an underpaid undergraduate student. As a result, some errors might

  1288 have sneaked in.

  1290 Debugging faulty specifications is at the heart of Nitpick's \textsl{raison

  1291 d'\^etre}. A good approach is to state desirable properties of the specification

  1292 (here, that $S$ is exactly the set of strings over $\{a, b\}$ with as many $a$'s

  1293 as $b$'s) and check them with Nitpick. If the properties are correctly stated,

  1294 counterexamples will point to bugs in the specification. For our grammar

  1295 example, we will proceed in two steps, separating the soundness and the

  1296 completeness of the set $S$. First, soundness:

  1298 \prew

  1299 \textbf{theorem}~\textit{S\_sound}: \\

  1300 ``$w \in S \longrightarrow \textit{length}~[x\mathbin{\leftarrow} w.\; x = a] =

  1301   \textit{length}~[x\mathbin{\leftarrow} w.\; x = b]$'' \\

  1302 \textbf{nitpick} \\[2\smallskipamount]

  1303 \slshape Nitpick found a counterexample: \\[2\smallskipamount]

  1304 \hbox{}\qquad Free variable: \nopagebreak \\

  1305 \hbox{}\qquad\qquad $w = [b]$

  1306 \postw

  1308 It would seem that $[b] \in S$. How could this be? An inspection of the

  1309 introduction rules reveals that the only rule with a right-hand side of the form

  1310 $b \mathbin{\#} {\ldots} \in S$ that could have introduced $[b]$ into $S$ is

  1311 \textit{R5}:

  1313 \prew

  1314 ``$w \in S\,\Longrightarrow\, b \mathbin{\#} w \in S$''

  1315 \postw

  1317 On closer inspection, we can see that this rule is wrong. To match the

  1318 production $B ::= bS$, the second $S$ should be a $B$. We fix the typo and try

  1319 again:

  1321 \prew

  1322 \textbf{nitpick} \\[2\smallskipamount]

  1323 \slshape Nitpick found a counterexample: \\[2\smallskipamount]

  1324 \hbox{}\qquad Free variable: \nopagebreak \\

  1325 \hbox{}\qquad\qquad $w = [a, a, b]$

  1326 \postw

  1328 Some detective work is necessary to find out what went wrong here. To get $[a,

  1329 a, b] \in S$, we need $[a, b] \in B$ by \textit{R3}, which in turn can only come

  1330 from \textit{R6}:

  1332 \prew

  1333 ``$\lbrakk v \in B;\> v \in B\rbrakk \,\Longrightarrow\, a \mathbin{\#} v \mathbin{@} w \in B$''

  1334 \postw

  1336 Now, this formula must be wrong: The same assumption occurs twice, and the

  1337 variable $w$ is unconstrained. Clearly, one of the two occurrences of $v$ in

  1338 the assumptions should have been a $w$.

  1340 With the correction made, we don't get any counterexample from Nitpick. Let's

  1341 move on and check completeness:

  1343 \prew

  1344 \textbf{theorem}~\textit{S\_complete}: \\

  1345 ``$\textit{length}~[x\mathbin{\leftarrow} w.\; x = a] =

  1346    \textit{length}~[x\mathbin{\leftarrow} w.\; x = b]

  1347   \longrightarrow w \in S$'' \\

  1348 \textbf{nitpick} \\[2\smallskipamount]

  1349 \slshape Nitpick found a counterexample: \\[2\smallskipamount]

  1350 \hbox{}\qquad Free variable: \nopagebreak \\

  1351 \hbox{}\qquad\qquad $w = [b, b, a, a]$

  1352 \postw

  1354 Apparently, $[b, b, a, a] \notin S$, even though it has the same numbers of

  1355 $a$'s and $b$'s. But since our inductive definition passed the soundness check,

  1356 the introduction rules we have are probably correct. Perhaps we simply lack an

  1357 introduction rule. Comparing the grammar with the inductive definition, our

  1358 suspicion is confirmed: Joe Blow simply forgot the production $A ::= bAA$,

  1359 without which the grammar cannot generate two or more $b$'s in a row. So we add

  1360 the rule

  1362 \prew

  1363 ``$\lbrakk v \in A;\> w \in A\rbrakk \,\Longrightarrow\, b \mathbin{\#} v \mathbin{@} w \in A$''

  1364 \postw

  1366 With this last change, we don't get any counterexamples from Nitpick for either

  1367 soundness or completeness. We can even generalize our result to cover $A$ and

  1368 $B$ as well:

  1370 \prew

  1371 \textbf{theorem} \textit{S\_A\_B\_sound\_and\_complete}: \\

  1372 ``$w \in S \longleftrightarrow \textit{length}~[x \mathbin{\leftarrow} w.\; x = a] = \textit{length}~[x \mathbin{\leftarrow} w.\; x = b]$'' \\

  1373 ``$w \in A \longleftrightarrow \textit{length}~[x \mathbin{\leftarrow} w.\; x = a] = \textit{length}~[x \mathbin{\leftarrow} w.\; x = b] + 1$'' \\

  1374 ``$w \in B \longleftrightarrow \textit{length}~[x \mathbin{\leftarrow} w.\; x = b] = \textit{length}~[x \mathbin{\leftarrow} w.\; x = a] + 1$'' \\

  1375 \textbf{nitpick} \\[2\smallskipamount]

  1376 \slshape Nitpick found no counterexample.

  1377 \postw

  1379 \subsection{AA Trees}

  1380 \label{aa-trees}

  1382 AA trees are a kind of balanced trees discovered by Arne Andersson that provide

  1383 similar performance to red-black trees, but with a simpler implementation

  1384 \cite{andersson-1993}. They can be used to store sets of elements equipped with

  1385 a total order $<$. We start by defining the datatype and some basic extractor

  1386 functions:

  1388 \prew

  1389 \textbf{datatype} $'a$~\textit{tree} = $\Lambda$ $\mid$ $N$ ``\kern1pt$'a\Colon \textit{linorder}$'' \textit{nat} ``\kern1pt$'a$ \textit{tree}'' ``\kern1pt$'a$ \textit{tree}''  \\[2\smallskipamount]

  1390 \textbf{primrec} \textit{data} \textbf{where} \\

  1391 ``$\textit{data}~\Lambda = \undef$'' $\,\mid$ \\

  1392 ``$\textit{data}~(N~x~\_~\_~\_) = x$'' \\[2\smallskipamount]

  1393 \textbf{primrec} \textit{dataset} \textbf{where} \\

  1394 ``$\textit{dataset}~\Lambda = \{\}$'' $\,\mid$ \\

  1395 ``$\textit{dataset}~(N~x~\_~t~u) = \{x\} \cup \textit{dataset}~t \mathrel{\cup} \textit{dataset}~u$'' \\[2\smallskipamount]

  1396 \textbf{primrec} \textit{level} \textbf{where} \\

  1397 ``$\textit{level}~\Lambda = 0$'' $\,\mid$ \\

  1398 ``$\textit{level}~(N~\_~k~\_~\_) = k$'' \\[2\smallskipamount]

  1399 \textbf{primrec} \textit{left} \textbf{where} \\

  1400 ``$\textit{left}~\Lambda = \Lambda$'' $\,\mid$ \\

  1401 ``$\textit{left}~(N~\_~\_~t~\_) = t$'' \\[2\smallskipamount]

  1402 \textbf{primrec} \textit{right} \textbf{where} \\

  1403 ``$\textit{right}~\Lambda = \Lambda$'' $\,\mid$ \\

  1404 ``$\textit{right}~(N~\_~\_~\_~u) = u$''

  1405 \postw

  1407 The wellformedness criterion for AA trees is fairly complex. Wikipedia states it

  1408 as follows \cite{wikipedia-2009-aa-trees}:

  1410 \kern.2\parskip %% TYPESETTING

  1412 \pre

  1413 Each node has a level field, and the following invariants must remain true for

  1414 the tree to be valid:

  1416 \raggedright

  1418 \kern-.4\parskip %% TYPESETTING

  1420 \begin{enum}

  1421 \item[]

  1422 \begin{enum}

  1423 \item[1.] The level of a leaf node is one.

  1424 \item[2.] The level of a left child is strictly less than that of its parent.

  1425 \item[3.] The level of a right child is less than or equal to that of its parent.

  1426 \item[4.] The level of a right grandchild is strictly less than that of its grandparent.

  1427 \item[5.] Every node of level greater than one must have two children.

  1428 \end{enum}

  1429 \end{enum}

  1430 \post

  1432 \kern.4\parskip %% TYPESETTING

  1434 The \textit{wf} predicate formalizes this description:

  1436 \prew

  1437 \textbf{primrec} \textit{wf} \textbf{where} \\

  1438 ``$\textit{wf}~\Lambda = \textit{True}$'' $\,\mid$ \\

  1439 ``$\textit{wf}~(N~\_~k~t~u) =$ \\

  1440 \phantom{``}$(\textrm{if}~t = \Lambda~\textrm{then}$ \\

  1441 \phantom{``$(\quad$}$k = 1 \mathrel{\land} (u = \Lambda \mathrel{\lor} (\textit{level}~u = 1 \mathrel{\land} \textit{left}~u = \Lambda \mathrel{\land} \textit{right}~u = \Lambda))$ \\

  1442 \phantom{``$($}$\textrm{else}$ \\

  1443 \hbox{}\phantom{``$(\quad$}$\textit{wf}~t \mathrel{\land} \textit{wf}~u

  1444 \mathrel{\land} u \not= \Lambda \mathrel{\land} \textit{level}~t < k

  1445 \mathrel{\land} \textit{level}~u \le k$ \\

  1446 \hbox{}\phantom{``$(\quad$}${\land}\; \textit{level}~(\textit{right}~u) < k)$''

  1447 \postw

  1449 Rebalancing the tree upon insertion and removal of elements is performed by two

  1450 auxiliary functions called \textit{skew} and \textit{split}, defined below:

  1452 \prew

  1453 \textbf{primrec} \textit{skew} \textbf{where} \\

  1454 ``$\textit{skew}~\Lambda = \Lambda$'' $\,\mid$ \\

  1455 ``$\textit{skew}~(N~x~k~t~u) = {}$ \\

  1456 \phantom{``}$(\textrm{if}~t \not= \Lambda \mathrel{\land} k =

  1457 \textit{level}~t~\textrm{then}$ \\

  1458 \phantom{``(\quad}$N~(\textit{data}~t)~k~(\textit{left}~t)~(N~x~k~

  1459 (\textit{right}~t)~u)$ \\

  1460 \phantom{``(}$\textrm{else}$ \\

  1461 \phantom{``(\quad}$N~x~k~t~u)$''

  1462 \postw

  1464 \prew

  1465 \textbf{primrec} \textit{split} \textbf{where} \\

  1466 ``$\textit{split}~\Lambda = \Lambda$'' $\,\mid$ \\

  1467 ``$\textit{split}~(N~x~k~t~u) = {}$ \\

  1468 \phantom{``}$(\textrm{if}~u \not= \Lambda \mathrel{\land} k =

  1469 \textit{level}~(\textit{right}~u)~\textrm{then}$ \\

  1470 \phantom{``(\quad}$N~(\textit{data}~u)~(\textit{Suc}~k)~

  1471 (N~x~k~t~(\textit{left}~u))~(\textit{right}~u)$ \\

  1472 \phantom{``(}$\textrm{else}$ \\

  1473 \phantom{``(\quad}$N~x~k~t~u)$''

  1474 \postw

  1476 Performing a \textit{skew} or a \textit{split} should have no impact on the set

  1477 of elements stored in the tree:

  1479 \prew

  1480 \textbf{theorem}~\textit{dataset\_skew\_split}:\\

  1481 ``$\textit{dataset}~(\textit{skew}~t) = \textit{dataset}~t$'' \\

  1482 ``$\textit{dataset}~(\textit{split}~t) = \textit{dataset}~t$'' \\

  1483 \textbf{nitpick} \\[2\smallskipamount]

  1484 {\slshape Nitpick ran out of time after checking 7 of 8 scopes.}

  1485 \postw

  1487 Furthermore, applying \textit{skew} or \textit{split} to a well-formed tree

  1488 should not alter the tree:

  1490 \prew

  1491 \textbf{theorem}~\textit{wf\_skew\_split}:\\

  1492 ``$\textit{wf}~t\,\Longrightarrow\, \textit{skew}~t = t$'' \\

  1493 ``$\textit{wf}~t\,\Longrightarrow\, \textit{split}~t = t$'' \\

  1494 \textbf{nitpick} \\[2\smallskipamount]

  1495 {\slshape Nitpick found no counterexample.}

  1496 \postw

  1498 Insertion is implemented recursively. It preserves the sort order:

  1500 \prew

  1501 \textbf{primrec}~\textit{insort} \textbf{where} \\

  1502 ``$\textit{insort}~\Lambda~x = N~x~1~\Lambda~\Lambda$'' $\,\mid$ \\

  1503 ``$\textit{insort}~(N~y~k~t~u)~x =$ \\

  1504 \phantom{``}$({*}~(\textit{split} \circ \textit{skew})~{*})~(N~y~k~(\textrm{if}~x < y~\textrm{then}~\textit{insort}~t~x~\textrm{else}~t)$ \\

  1505 \phantom{``$({*}~(\textit{split} \circ \textit{skew})~{*})~(N~y~k~$}$(\textrm{if}~x > y~\textrm{then}~\textit{insort}~u~x~\textrm{else}~u))$''

  1506 \postw

  1508 Notice that we deliberately commented out the application of \textit{skew} and

  1509 \textit{split}. Let's see if this causes any problems:

  1511 \prew

  1512 \textbf{theorem}~\textit{wf\_insort}:\kern.4em ``$\textit{wf}~t\,\Longrightarrow\, \textit{wf}~(\textit{insort}~t~x)$'' \\

  1513 \textbf{nitpick} \\[2\smallskipamount]

  1514 \slshape Nitpick found a counterexample for \textit{card} $'a$ = 4: \\[2\smallskipamount]

  1515 \hbox{}\qquad Free variables: \nopagebreak \\

  1516 \hbox{}\qquad\qquad $t = N~a_3~1~\Lambda~\Lambda$ \\

  1517 \hbox{}\qquad\qquad $x = a_4$ \\[2\smallskipamount]

  1518 Hint: Maybe you forgot a type constraint?

  1519 \postw

  1521 It's hard to see why this is a counterexample. The hint is of no help here. To

  1522 improve readability, we will restrict the theorem to \textit{nat}, so that we

  1523 don't need to look up the value of the $\textit{op}~{<}$ constant to find out

  1524 which element is smaller than the other. In addition, we will tell Nitpick to

  1525 display the value of $\textit{insort}~t~x$ using the \textit{eval} option. This

  1526 gives

  1528 \prew

  1529 \textbf{theorem} \textit{wf\_insort\_nat}:\kern.4em ``$\textit{wf}~t\,\Longrightarrow\, \textit{wf}~(\textit{insort}~t~(x\Colon\textit{nat}))$'' \\

  1530 \textbf{nitpick} [\textit{eval} = ``$\textit{insort}~t~x$''] \\[2\smallskipamount]

  1531 \slshape Nitpick found a counterexample: \\[2\smallskipamount]

  1532 \hbox{}\qquad Free variables: \nopagebreak \\

  1533 \hbox{}\qquad\qquad $t = N~1~1~\Lambda~\Lambda$ \\

  1534 \hbox{}\qquad\qquad $x = 0$ \\

  1535 \hbox{}\qquad Evaluated term: \\

  1536 \hbox{}\qquad\qquad $\textit{insort}~t~x = N~1~1~(N~0~1~\Lambda~\Lambda)~\Lambda$

  1537 \postw

  1539 Nitpick's output reveals that the element $0$ was added as a left child of $1$,

  1540 where both have a level of 1. This violates the second AA tree invariant, which

  1541 states that a left child's level must be less than its parent's. This shouldn't

  1542 come as a surprise, considering that we commented out the tree rebalancing code.

  1543 Reintroducing the code seems to solve the problem:

  1545 \prew

  1546 \textbf{theorem}~\textit{wf\_insort}:\kern.4em ``$\textit{wf}~t\,\Longrightarrow\, \textit{wf}~(\textit{insort}~t~x)$'' \\

  1547 \textbf{nitpick} \\[2\smallskipamount]

  1548 {\slshape Nitpick ran out of time after checking 6 of 8 scopes.}

  1549 \postw

  1551 Insertion should transform the set of elements represented by the tree in the

  1552 obvious way:

  1554 \prew

  1555 \textbf{theorem} \textit{dataset\_insort}:\kern.4em

  1556 ``$\textit{dataset}~(\textit{insort}~t~x) = \{x\} \cup \textit{dataset}~t$'' \\

  1557 \textbf{nitpick} \\[2\smallskipamount]

  1558 {\slshape Nitpick ran out of time after checking 5 of 8 scopes.}

  1559 \postw

  1561 We could continue like this and sketch a complete theory of AA trees without

  1562 performing a single proof. Once the definitions and main theorems are in place

  1563 and have been thoroughly tested using Nitpick, we could start working on the

  1564 proofs. Developing theories this way usually saves time, because faulty theorems

  1565 and definitions are discovered much earlier in the process.

  1567 \section{Option Reference}

  1568 \label{option-reference}

  1570 \def\flushitem#1{\item[]\noindent\kern-\leftmargin \textbf{#1}}

  1571 \def\qty#1{$\left<\textit{#1}\right>$}

  1572 \def\qtybf#1{$\mathbf{\left<\textbf{\textit{#1}}\right>}$}

  1573 \def\optrue#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool}$\bigr]$\quad [\textit{true}]\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}

  1574 \def\opfalse#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool}$\bigr]$\quad [\textit{false}]\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}

  1575 \def\opsmart#1#2{\flushitem{\textit{#1} $\bigl[$= \qtybf{bool\_or\_smart}$\bigr]$\quad [\textit{smart}]\hfill (neg.: \textit{#2})}\nopagebreak\\[\parskip]}

  1576 \def\ops#1#2{\flushitem{\textit{#1} = \qtybf{#2}} \nopagebreak\\[\parskip]}

  1577 \def\opt#1#2#3{\flushitem{\textit{#1} = \qtybf{#2}\quad [\textit{#3}]} \nopagebreak\\[\parskip]}

  1578 \def\opu#1#2#3{\flushitem{\textit{#1} \qtybf{#2} = \qtybf{#3}} \nopagebreak\\[\parskip]}

  1579 \def\opusmart#1#2#3{\flushitem{\textit{#1} \qtybf{#2} $\bigl[$= \qtybf{bool\_or\_smart}$\bigr]$\hfill (neg.: \textit{#3})}\nopagebreak\\[\parskip]}

  1581 Nitpick's behavior can be influenced by various options, which can be specified

  1582 in brackets after the \textbf{nitpick} command. Default values can be set

  1583 using \textbf{nitpick\_\allowbreak params}. For example:

  1585 \prew

  1586 \textbf{nitpick\_params} [\textit{verbose}, \,\textit{timeout} = 60$\,s$]

  1587 \postw

  1589 The options are categorized as follows:\ mode of operation

  1590 (\S\ref{mode-of-operation}), scope of search (\S\ref{scope-of-search}), output

  1591 format (\S\ref{output-format}), automatic counterexample checks

  1592 (\S\ref{authentication}), optimizations

  1593 (\S\ref{optimizations}), and timeouts (\S\ref{timeouts}).

  1595 You can instruct Nitpick to run automatically on newly entered theorems by

  1596 enabling the ``Auto Nitpick'' option from the ``Isabelle'' menu in Proof

  1597 General. For automatic runs, \textit{user\_axioms} (\S\ref{mode-of-operation})

  1598 and \textit{assms} (\S\ref{mode-of-operation}) are implicitly enabled,

  1599 \textit{blocking} (\S\ref{mode-of-operation}), \textit{verbose}

  1600 (\S\ref{output-format}), and \textit{debug} (\S\ref{output-format}) are

  1601 disabled, \textit{max\_potential} (\S\ref{output-format}) is taken to be 0, and

  1602 \textit{timeout} (\S\ref{timeouts}) is superseded by the ``Auto Counterexample

  1603 Time Limit'' in Proof General's ``Isabelle'' menu. Nitpick's output is also more

  1604 concise.

  1606 The number of options can be overwhelming at first glance. Do not let that worry

  1607 you: Nitpick's defaults have been chosen so that it almost always does the right

  1608 thing, and the most important options have been covered in context in

  1609 \S\ref{first-steps}.

  1611 The descriptions below refer to the following syntactic quantities:

  1613 \begin{enum}

  1614 \item[$\bullet$] \qtybf{string}: A string.

  1615 \item[$\bullet$] \qtybf{bool}: \textit{true} or \textit{false}.

  1616 \item[$\bullet$] \qtybf{bool\_or\_smart}: \textit{true}, \textit{false}, or \textit{smart}.

  1617 \item[$\bullet$] \qtybf{int}: An integer. Negative integers are prefixed with a hyphen.

  1618 \item[$\bullet$] \qtybf{int\_or\_smart}: An integer or \textit{smart}.

  1619 \item[$\bullet$] \qtybf{int\_range}: An integer (e.g., 3) or a range

  1620 of nonnegative integers (e.g., $1$--$4$). The range symbol `--' can be entered as \texttt{-} (hyphen) or \texttt{\char`\\\char`\<midarrow\char`\>}.

  1622 \item[$\bullet$] \qtybf{int\_seq}: A comma-separated sequence of ranges of integers (e.g.,~1{,}3{,}\allowbreak6--8).

  1623 \item[$\bullet$] \qtybf{time}: An integer followed by $\textit{min}$ (minutes), $s$ (seconds), or \textit{ms}

  1624 (milliseconds), or the keyword \textit{none} ($\infty$ years).

  1625 \item[$\bullet$] \qtybf{const}: The name of a HOL constant.

  1626 \item[$\bullet$] \qtybf{term}: A HOL term (e.g., ``$f~x$'').

  1627 \item[$\bullet$] \qtybf{term\_list}: A space-separated list of HOL terms (e.g.,

  1628 ``$f~x$''~``$g~y$'').

  1629 \item[$\bullet$] \qtybf{type}: A HOL type.

  1630 \end{enum}

  1632 Default values are indicated in square brackets. Boolean options have a negated

  1633 counterpart (e.g., \textit{blocking} vs.\ \textit{no\_blocking}). When setting

  1634 Boolean options, ``= \textit{true}'' may be omitted.

  1636 \subsection{Mode of Operation}

  1637 \label{mode-of-operation}

  1639 \begin{enum}

  1640 \optrue{blocking}{non\_blocking}

  1641 Specifies whether the \textbf{nitpick} command should operate synchronously.

  1642 The asynchronous (non-blocking) mode lets the user start proving the putative

  1643 theorem while Nitpick looks for a counterexample, but it can also be more

  1644 confusing. For technical reasons, automatic runs currently always block.

  1646 \optrue{falsify}{satisfy}

  1647 Specifies whether Nitpick should look for falsifying examples (countermodels) or

  1648 satisfying examples (models). This manual assumes throughout that

  1649 \textit{falsify} is enabled.

  1651 \opsmart{user\_axioms}{no\_user\_axioms}

  1652 Specifies whether the user-defined axioms (specified using

  1653 \textbf{axiomatization} and \textbf{axioms}) should be considered. If the option

  1654 is set to \textit{smart}, Nitpick performs an ad hoc axiom selection based on

  1655 the constants that occur in the formula to falsify. The option is implicitly set

  1656 to \textit{true} for automatic runs.

  1658 \textbf{Warning:} If the option is set to \textit{true}, Nitpick might

  1659 nonetheless ignore some polymorphic axioms. Counterexamples generated under

  1660 these conditions are tagged as ``likely genuine.'' The \textit{debug}

  1661 (\S\ref{output-format}) option can be used to find out which axioms were

  1662 considered.

  1664 \nopagebreak

  1665 {\small See also \textit{assms} (\S\ref{mode-of-operation}) and \textit{debug}

  1666 (\S\ref{output-format}).}

  1668 \optrue{assms}{no\_assms}

  1669 Specifies whether the relevant assumptions in structured proof should be

  1670 considered. The option is implicitly enabled for automatic runs.

  1672 \nopagebreak

  1673 {\small See also \textit{user\_axioms} (\S\ref{mode-of-operation}).}

  1675 \opfalse{overlord}{no\_overlord}

  1676 Specifies whether Nitpick should put its temporary files in

  1677 \texttt{\$ISABELLE\_\allowbreak HOME\_\allowbreak USER}, which is useful for

  1678 debugging Nitpick but also unsafe if several instances of the tool are run

  1679 simultaneously.

  1681 \nopagebreak

  1682 {\small See also \textit{debug} (\S\ref{output-format}).}

  1683 \end{enum}

  1685 \subsection{Scope of Search}

  1686 \label{scope-of-search}

  1688 \begin{enum}

  1689 \opu{card}{type}{int\_seq}

  1690 Specifies the sequence of cardinalities to use for a given type. For

  1691 \textit{nat} and \textit{int}, the cardinality fully specifies the subset used

  1692 to approximate the type. For example:

  1693 %

  1694 $$\hbox{\begin{tabular}{@{}rll@{}}%

  1695 \textit{card nat} = 4 & induces & $\{0,\, 1,\, 2,\, 3\}$ \\

  1696 \textit{card int} = 4 & induces & $\{-1,\, 0,\, +1,\, +2\}$ \\

  1697 \textit{card int} = 5 & induces & $\{-2,\, -1,\, 0,\, +1,\, +2\}.$%

  1698 \end{tabular}}$$

  1699 %

  1700 In general:

  1701 %

  1702 $$\hbox{\begin{tabular}{@{}rll@{}}%

  1703 \textit{card nat} = $K$ & induces & $\{0,\, \ldots,\, K - 1\}$ \\

  1704 \textit{card int} = $K$ & induces & $\{-\lceil K/2 \rceil + 1,\, \ldots,\, +\lfloor K/2 \rfloor\}.$%

  1705 \end{tabular}}$$

  1706 %

  1707 For free types, and often also for \textbf{typedecl}'d types, it usually makes

  1708 sense to specify cardinalities as a range of the form \textit{$1$--$n$}.

  1709 Although function and product types are normally mapped directly to the

  1710 corresponding Kodkod concepts, setting

  1711 the cardinality of such types is also allowed and implicitly enables ``boxing''

  1712 for them, as explained in the description of the \textit{box}~\qty{type}

  1713 and \textit{box} (\S\ref{scope-of-search}) options.

  1715 \nopagebreak

  1716 {\small See also \textit{mono} (\S\ref{scope-of-search}).}

  1718 \opt{card}{int\_seq}{$\mathbf{1}$--$\mathbf{8}$}

  1719 Specifies the default sequence of cardinalities to use. This can be overridden

  1720 on a per-type basis using the \textit{card}~\qty{type} option described above.

  1722 \opu{max}{const}{int\_seq}

  1723 Specifies the sequence of maximum multiplicities to use for a given

  1724 (co)in\-duc\-tive datatype constructor. A constructor's multiplicity is the

  1725 number of distinct values that it can construct. Nonsensical values (e.g.,

  1726 \textit{max}~[]~$=$~2) are silently repaired. This option is only available for

  1727 datatypes equipped with several constructors.

  1729 \ops{max}{int\_seq}

  1730 Specifies the default sequence of maximum multiplicities to use for

  1731 (co)in\-duc\-tive datatype constructors. This can be overridden on a per-constructor

  1732 basis using the \textit{max}~\qty{const} option described above.

  1734 \opusmart{wf}{const}{non\_wf}

  1735 Specifies whether the specified (co)in\-duc\-tively defined predicate is

  1736 well-founded. The option can take the following values:

  1738 \begin{enum}

  1739 \item[$\bullet$] \textbf{\textit{true}}: Tentatively treat the (co)in\-duc\-tive

  1740 predicate as if it were well-founded. Since this is generally not sound when the

  1741 predicate is not well-founded, the counterexamples are tagged as ``likely

  1742 genuine.''

  1744 \item[$\bullet$] \textbf{\textit{false}}: Treat the (co)in\-duc\-tive predicate

  1745 as if it were not well-founded. The predicate is then unrolled as prescribed by

  1746 the \textit{star\_linear\_preds}, \textit{iter}~\qty{const}, and \textit{iter}

  1747 options.

  1749 \item[$\bullet$] \textbf{\textit{smart}}: Try to prove that the inductive

  1750 predicate is well-founded using Isabelle's \textit{lexicographic\_order} and

  1751 \textit{sizechange} tactics. If this succeeds (or the predicate occurs with an

  1752 appropriate polarity in the formula to falsify), use an efficient fixed point

  1753 equation as specification of the predicate; otherwise, unroll the predicates

  1754 according to the \textit{iter}~\qty{const} and \textit{iter} options.

  1755 \end{enum}

  1757 \nopagebreak

  1758 {\small See also \textit{iter} (\S\ref{scope-of-search}),

  1759 \textit{star\_linear\_preds} (\S\ref{optimizations}), and \textit{tac\_timeout}

  1760 (\S\ref{timeouts}).}

  1762 \opsmart{wf}{non\_wf}

  1763 Specifies the default wellfoundedness setting to use. This can be overridden on

  1764 a per-predicate basis using the \textit{wf}~\qty{const} option above.

  1766 \opu{iter}{const}{int\_seq}

  1767 Specifies the sequence of iteration counts to use when unrolling a given

  1768 (co)in\-duc\-tive predicate. By default, unrolling is applied for inductive

  1769 predicates that occur negatively and coinductive predicates that occur

  1770 positively in the formula to falsify and that cannot be proved to be

  1771 well-founded, but this behavior is influenced by the \textit{wf} option. The

  1772 iteration counts are automatically bounded by the cardinality of the predicate's

  1773 domain.

  1775 {\small See also \textit{wf} (\S\ref{scope-of-search}) and

  1776 \textit{star\_linear\_preds} (\S\ref{optimizations}).}

  1778 \opt{iter}{int\_seq}{$\mathbf{1{,}2{,}4{,}8{,}12{,}16{,}24{,}32}$}

  1779 Specifies the sequence of iteration counts to use when unrolling (co)in\-duc\-tive

  1780 predicates. This can be overridden on a per-predicate basis using the

  1781 \textit{iter} \qty{const} option above.

  1783 \opt{bisim\_depth}{int\_seq}{$\mathbf{7}$}

  1784 Specifies the sequence of iteration counts to use when unrolling the

  1785 bisimilarity predicate generated by Nitpick for coinductive datatypes. A value

  1786 of $-1$ means that no predicate is generated, in which case Nitpick performs an

  1787 after-the-fact check to see if the known coinductive datatype values are

  1788 bidissimilar. If two values are found to be bisimilar, the counterexample is

  1789 tagged as ``likely genuine.'' The iteration counts are automatically bounded by

  1790 the sum of the cardinalities of the coinductive datatypes occurring in the

  1791 formula to falsify.

  1793 \opusmart{box}{type}{dont\_box}

  1794 Specifies whether Nitpick should attempt to wrap (``box'') a given function or

  1795 product type in an isomorphic datatype internally. Boxing is an effective mean

  1796 to reduce the search space and speed up Nitpick, because the isomorphic datatype

  1797 is approximated by a subset of the possible function or pair values;

  1798 like other drastic optimizations, it can also prevent the discovery of

  1799 counterexamples. The option can take the following values:

  1801 \begin{enum}

  1802 \item[$\bullet$] \textbf{\textit{true}}: Box the specified type whenever

  1803 practicable.

  1804 \item[$\bullet$] \textbf{\textit{false}}: Never box the type.

  1805 \item[$\bullet$] \textbf{\textit{smart}}: Box the type only in contexts where it

  1806 is likely to help. For example, $n$-tuples where $n > 2$ and arguments to

  1807 higher-order functions are good candidates for boxing.

  1808 \end{enum}

  1810 Setting the \textit{card}~\qty{type} option for a function or product type

  1811 implicitly enables boxing for that type.

  1813 \nopagebreak

  1814 {\small See also \textit{verbose} (\S\ref{output-format})

  1815 and \textit{debug} (\S\ref{output-format}).}

  1817 \opsmart{box}{dont\_box}

  1818 Specifies the default boxing setting to use. This can be overridden on a

  1819 per-type basis using the \textit{box}~\qty{type} option described above.

  1821 \opusmart{mono}{type}{non\_mono}

  1822 Specifies whether the specified type should be considered monotonic when

  1823 enumerating scopes. If the option is set to \textit{smart}, Nitpick performs a

  1824 monotonicity check on the type. Setting this option to \textit{true} can reduce

  1825 the number of scopes tried, but it also diminishes the theoretical chance of

  1826 finding a counterexample, as demonstrated in \S\ref{scope-monotonicity}.

  1828 \nopagebreak

  1829 {\small See also \textit{card} (\S\ref{scope-of-search}),

  1830 \textit{merge\_type\_vars} (\S\ref{scope-of-search}), and \textit{verbose}

  1831 (\S\ref{output-format}).}

  1833 \opsmart{mono}{non\_box}

  1834 Specifies the default monotonicity setting to use. This can be overridden on a

  1835 per-type basis using the \textit{mono}~\qty{type} option described above.

  1837 \opfalse{merge\_type\_vars}{dont\_merge\_type\_vars}

  1838 Specifies whether type variables with the same sort constraints should be

  1839 merged. Setting this option to \textit{true} can reduce the number of scopes

  1840 tried and the size of the generated Kodkod formulas, but it also diminishes the

  1841 theoretical chance of finding a counterexample.

  1843 {\small See also \textit{mono} (\S\ref{scope-of-search}).}

  1844 \end{enum}

  1846 \subsection{Output Format}

  1847 \label{output-format}

  1849 \begin{enum}

  1850 \opfalse{verbose}{quiet}

  1851 Specifies whether the \textbf{nitpick} command should explain what it does. This

  1852 option is useful to determine which scopes are tried or which SAT solver is

  1853 used. This option is implicitly disabled for automatic runs.

  1855 \opfalse{debug}{no\_debug}

  1856 Specifies whether Nitpick should display additional debugging information beyond

  1857 what \textit{verbose} already displays. Enabling \textit{debug} also enables

  1858 \textit{verbose} and \textit{show\_all} behind the scenes. The \textit{debug}

  1859 option is implicitly disabled for automatic runs.

  1861 \nopagebreak

  1862 {\small See also \textit{overlord} (\S\ref{mode-of-operation}) and

  1863 \textit{batch\_size} (\S\ref{optimizations}).}

  1865 \optrue{show\_skolems}{hide\_skolem}

  1866 Specifies whether the values of Skolem constants should be displayed as part of

  1867 counterexamples. Skolem constants correspond to bound variables in the original

  1868 formula and usually help us to understand why the counterexample falsifies the

  1869 formula.

  1871 \nopagebreak

  1872 {\small See also \textit{skolemize} (\S\ref{optimizations}).}

  1874 \opfalse{show\_datatypes}{hide\_datatypes}

  1875 Specifies whether the subsets used to approximate (co)in\-duc\-tive datatypes should

  1876 be displayed as part of counterexamples. Such subsets are sometimes helpful when

  1877 investigating whether a potential counterexample is genuine or spurious, but

  1878 their potential for clutter is real.

  1880 \opfalse{show\_consts}{hide\_consts}

  1881 Specifies whether the values of constants occurring in the formula (including

  1882 its axioms) should be displayed along with any counterexample. These values are

  1883 sometimes helpful when investigating why a counterexample is

  1884 genuine, but they can clutter the output.

  1886 \opfalse{show\_all}{dont\_show\_all}

  1887 Enabling this option effectively enables \textit{show\_skolems},

  1888 \textit{show\_datatypes}, and \textit{show\_consts}.

  1890 \opt{max\_potential}{int}{$\mathbf{1}$}

  1891 Specifies the maximum number of potential counterexamples to display. Setting

  1892 this option to 0 speeds up the search for a genuine counterexample. This option

  1893 is implicitly set to 0 for automatic runs. If you set this option to a value

  1894 greater than 1, you will need an incremental SAT solver: For efficiency, it is

  1895 recommended to install the JNI version of MiniSat and set \textit{sat\_solver} =

  1896 \textit{MiniSatJNI}. Also be aware that many of the counterexamples may look

  1897 identical, unless the \textit{show\_all} (\S\ref{output-format}) option is

  1898 enabled.

  1900 \nopagebreak

  1901 {\small See also \textit{check\_potential} (\S\ref{authentication}) and

  1902 \textit{sat\_solver} (\S\ref{optimizations}).}

  1904 \opt{max\_genuine}{int}{$\mathbf{1}$}

  1905 Specifies the maximum number of genuine counterexamples to display. If you set

  1906 this option to a value greater than 1, you will need an incremental SAT solver:

  1907 For efficiency, it is recommended to install the JNI version of MiniSat and set

  1908 \textit{sat\_solver} = \textit{MiniSatJNI}. Also be aware that many of the

  1909 counterexamples may look identical, unless the \textit{show\_all}

  1910 (\S\ref{output-format}) option is enabled.

  1912 \nopagebreak

  1913 {\small See also \textit{check\_genuine} (\S\ref{authentication}) and

  1914 \textit{sat\_solver} (\S\ref{optimizations}).}

  1916 \ops{eval}{term\_list}

  1917 Specifies the list of terms whose values should be displayed along with

  1918 counterexamples. This option suffers from an ``observer effect'': Nitpick might

  1919 find different counterexamples for different values of this option.

  1921 \opu{format}{term}{int\_seq}

  1922 Specifies how to uncurry the value displayed for a variable or constant.

  1923 Uncurrying sometimes increases the readability of the output for high-arity

  1924 functions. For example, given the variable $y \mathbin{\Colon} {'a}\Rightarrow

  1925 {'b}\Rightarrow {'c}\Rightarrow {'d}\Rightarrow {'e}\Rightarrow {'f}\Rightarrow

  1926 {'g}$, setting \textit{format}~$y$ = 3 tells Nitpick to group the last three

  1927 arguments, as if the type had been ${'a}\Rightarrow {'b}\Rightarrow

  1928 {'c}\Rightarrow {'d}\times {'e}\times {'f}\Rightarrow {'g}$. In general, a list

  1929 of values $n_1,\ldots,n_k$ tells Nitpick to show the last $n_k$ arguments as an

  1930 $n_k$-tuple, the previous $n_{k-1}$ arguments as an $n_{k-1}$-tuple, and so on;

  1931 arguments that are not accounted for are left alone, as if the specification had

  1932 been $1,\ldots,1,n_1,\ldots,n_k$.

  1934 \nopagebreak

  1935 {\small See also \textit{uncurry} (\S\ref{optimizations}).}

  1937 \opt{format}{int\_seq}{$\mathbf{1}$}

  1938 Specifies the default format to use. Irrespective of the default format, the

  1939 extra arguments to a Skolem constant corresponding to the outer bound variables

  1940 are kept separated from the remaining arguments, the \textbf{for} arguments of

  1941 an inductive definitions are kept separated from the remaining arguments, and

  1942 the iteration counter of an unrolled inductive definition is shown alone. The

  1943 default format can be overridden on a per-variable or per-constant basis using

  1944 the \textit{format}~\qty{term} option described above.

  1945 \end{enum}

  1947 %% MARK: Authentication

  1948 \subsection{Authentication}

  1949 \label{authentication}

  1951 \begin{enum}

  1952 \opfalse{check\_potential}{trust\_potential}

  1953 Specifies whether potential counterexamples should be given to Isabelle's

  1954 \textit{auto} tactic to assess their validity. If a potential counterexample is

  1955 shown to be genuine, Nitpick displays a message to this effect and terminates.

  1957 \nopagebreak

  1958 {\small See also \textit{max\_potential} (\S\ref{output-format}).}

  1960 \opfalse{check\_genuine}{trust\_genuine}

  1961 Specifies whether genuine and likely genuine counterexamples should be given to

  1962 Isabelle's \textit{auto} tactic to assess their validity. If a ``genuine''

  1963 counterexample is shown to be spurious, the user is kindly asked to send a bug

  1964 report to the author at

  1965 \texttt{blan{\color{white}nospam}\kern-\wd\boxA{}chette@in.tum.de}.

  1967 \nopagebreak

  1968 {\small See also \textit{max\_genuine} (\S\ref{output-format}).}

  1970 \ops{expect}{string}

  1971 Specifies the expected outcome, which must be one of the following:

  1973 \begin{enum}

  1974 \item[$\bullet$] \textbf{\textit{genuine}}: Nitpick found a genuine counterexample.

  1975 \item[$\bullet$] \textbf{\textit{likely\_genuine}}: Nitpick found a ``likely

  1976 genuine'' counterexample (i.e., a counterexample that is genuine unless

  1977 it contradicts a missing axiom or a dangerous option was used inappropriately).

  1978 \item[$\bullet$] \textbf{\textit{potential}}: Nitpick found a potential counterexample.

  1979 \item[$\bullet$] \textbf{\textit{none}}: Nitpick found no counterexample.

  1980 \item[$\bullet$] \textbf{\textit{unknown}}: Nitpick encountered some problem (e.g.,

  1981 Kodkod ran out of memory).

  1982 \end{enum}

  1984 Nitpick emits an error if the actual outcome differs from the expected outcome.

  1985 This option is useful for regression testing.

  1986 \end{enum}

  1988 \subsection{Optimizations}

  1989 \label{optimizations}

  1991 \def\cpp{C\nobreak\raisebox{.1ex}{+}\nobreak\raisebox{.1ex}{+}}

  1993 \sloppy

  1995 \begin{enum}

  1996 \opt{sat\_solver}{string}{smart}

  1997 Specifies which SAT solver to use. SAT solvers implemented in C or \cpp{} tend

  1998 to be faster than their Java counterparts, but they can be more difficult to

  1999 install. Also, if you set the \textit{max\_potential} (\S\ref{output-format}) or

  2000 \textit{max\_genuine} (\S\ref{output-format}) option to a value greater than 1,

  2001 you will need an incremental SAT solver, such as \textit{MiniSatJNI}

  2002 (recommended) or \textit{SAT4J}.

  2004 The supported solvers are listed below:

  2006 \begin{enum}

  2008 \item[$\bullet$] \textbf{\textit{MiniSat}}: MiniSat is an efficient solver

  2009 written in \cpp{}. To use MiniSat, set the environment variable

  2010 \texttt{MINISAT\_HOME} to the directory that contains the \texttt{minisat}

  2011 executable. The \cpp{} sources and executables for MiniSat are available at

  2012 \url{http://minisat.se/MiniSat.html}. Nitpick has been tested with versions 1.14

  2013 and 2.0 beta (2007-07-21).

  2015 \item[$\bullet$] \textbf{\textit{MiniSatJNI}}: The JNI (Java Native Interface)

  2016 version of MiniSat is bundled in \texttt{nativesolver.\allowbreak tgz}, which

  2017 you will find on Kodkod's web site \cite{kodkod-2009}. Unlike the standard

  2018 version of MiniSat, the JNI version can be used incrementally.

  2020 \item[$\bullet$] \textbf{\textit{PicoSAT}}: PicoSAT is an efficient solver

  2021 written in C. It is bundled with Kodkodi and requires no further installation or

  2022 configuration steps. Alternatively, you can install a standard version of

  2023 PicoSAT and set the environment variable \texttt{PICOSAT\_HOME} to the directory

  2024 that contains the \texttt{picosat} executable. The C sources for PicoSAT are

  2025 available at \url{http://fmv.jku.at/picosat/} and are also bundled with Kodkodi.

  2026 Nitpick has been tested with version 913.

  2028 \item[$\bullet$] \textbf{\textit{zChaff}}: zChaff is an efficient solver written

  2029 in \cpp{}. To use zChaff, set the environment variable \texttt{ZCHAFF\_HOME} to

  2030 the directory that contains the \texttt{zchaff} executable. The \cpp{} sources

  2031 and executables for zChaff are available at

  2032 \url{http://www.princeton.edu/~chaff/zchaff.html}. Nitpick has been tested with

  2033 versions 2004-05-13, 2004-11-15, and 2007-03-12.

  2035 \item[$\bullet$] \textbf{\textit{zChaffJNI}}: The JNI version of zChaff is

  2036 bundled in \texttt{native\-solver.\allowbreak tgz}, which you will find on

  2037 Kodkod's web site \cite{kodkod-2009}.

  2039 \item[$\bullet$] \textbf{\textit{RSat}}: RSat is an efficient solver written in

  2040 \cpp{}. To use RSat, set the environment variable \texttt{RSAT\_HOME} to the

  2041 directory that contains the \texttt{rsat} executable. The \cpp{} sources for

  2042 RSat are available at \url{http://reasoning.cs.ucla.edu/rsat/}. Nitpick has been

  2043 tested with version 2.01.

  2045 \item[$\bullet$] \textbf{\textit{BerkMin}}: BerkMin561 is an efficient solver

  2046 written in C. To use BerkMin, set the environment variable

  2047 \texttt{BERKMIN\_HOME} to the directory that contains the \texttt{BerkMin561}

  2048 executable. The BerkMin executables are available at

  2049 \url{http://eigold.tripod.com/BerkMin.html}.

  2051 \item[$\bullet$] \textbf{\textit{BerkMinAlloy}}: Variant of BerkMin that is

  2052 included with Alloy 4 and calls itself ``sat56'' in its banner text. To use this

  2053 version of BerkMin, set the environment variable

  2054 \texttt{BERKMINALLOY\_HOME} to the directory that contains the \texttt{berkmin}

  2055 executable.

  2057 \item[$\bullet$] \textbf{\textit{Jerusat}}: Jerusat 1.3 is an efficient solver

  2058 written in C. To use Jerusat, set the environment variable

  2059 \texttt{JERUSAT\_HOME} to the directory that contains the \texttt{Jerusat1.3}

  2060 executable. The C sources for Jerusat are available at

  2061 \url{http://www.cs.tau.ac.il/~ale1/Jerusat1.3.tgz}.

  2063 \item[$\bullet$] \textbf{\textit{SAT4J}}: SAT4J is a reasonably efficient solver

  2064 written in Java that can be used incrementally. It is bundled with Kodkodi and

  2065 requires no further installation or configuration steps. Do not attempt to

  2066 install the official SAT4J packages, because their API is incompatible with

  2067 Kodkod.

  2069 \item[$\bullet$] \textbf{\textit{SAT4JLight}}: Variant of SAT4J that is

  2070 optimized for small problems. It can also be used incrementally.

  2072 \item[$\bullet$] \textbf{\textit{HaifaSat}}: HaifaSat 1.0 beta is an

  2073 experimental solver written in \cpp. To use HaifaSat, set the environment

  2074 variable \texttt{HAIFASAT\_\allowbreak HOME} to the directory that contains the

  2075 \texttt{HaifaSat} executable. The \cpp{} sources for HaifaSat are available at

  2076 \url{http://cs.technion.ac.il/~gershman/HaifaSat.htm}.

  2078 \item[$\bullet$] \textbf{\textit{smart}}: If \textit{sat\_solver} is set to

  2079 \textit{smart}, Nitpick selects the first solver among MiniSatJNI, MiniSat,

  2080 PicoSAT, zChaffJNI, zChaff, RSat, BerkMin, BerkMinAlloy, and Jerusat that is

  2081 recognized by Isabelle. If none is found, it falls back on SAT4J, which should

  2082 always be available. If \textit{verbose} is enabled, Nitpick displays which SAT

  2083 solver was chosen.

  2084 \end{enum}

  2085 \fussy

  2087 \opt{batch\_size}{int\_or\_smart}{smart}

  2088 Specifies the maximum number of Kodkod problems that should be lumped together

  2089 when invoking Kodkodi. Each problem corresponds to one scope. Lumping problems

  2090 together ensures that Kodkodi is launched less often, but it makes the verbose

  2091 output less readable and is sometimes detrimental to performance. If

  2092 \textit{batch\_size} is set to \textit{smart}, the actual value used is 1 if

  2093 \textit{debug} (\S\ref{output-format}) is set and 64 otherwise.

  2095 \optrue{destroy\_constrs}{dont\_destroy\_constrs}

  2096 Specifies whether formulas involving (co)in\-duc\-tive datatype constructors should

  2097 be rewritten to use (automatically generated) discriminators and destructors.

  2098 This optimization can drastically reduce the size of the Boolean formulas given

  2099 to the SAT solver.

  2101 \nopagebreak

  2102 {\small See also \textit{debug} (\S\ref{output-format}).}

  2104 \optrue{specialize}{dont\_specialize}

  2105 Specifies whether functions invoked with static arguments should be specialized.

  2106 This optimization can drastically reduce the search space, especially for

  2107 higher-order functions.

  2109 \nopagebreak

  2110 {\small See also \textit{debug} (\S\ref{output-format}) and

  2111 \textit{show\_consts} (\S\ref{output-format}).}

  2113 \optrue{skolemize}{dont\_skolemize}

  2114 Specifies whether the formula should be skolemized. For performance reasons,

  2115 (positive) $\forall$-quanti\-fiers that occur in the scope of a higher-order

  2116 (positive) $\exists$-quanti\-fier are left unchanged.

  2118 \nopagebreak

  2119 {\small See also \textit{debug} (\S\ref{output-format}) and

  2120 \textit{show\_skolems} (\S\ref{output-format}).}

  2122 \optrue{star\_linear\_preds}{dont\_star\_linear\_preds}

  2123 Specifies whether Nitpick should use Kodkod's transitive closure operator to

  2124 encode non-well-founded ``linear inductive predicates,'' i.e., inductive

  2125 predicates for which each the predicate occurs in at most one assumption of each

  2126 introduction rule. Using the reflexive transitive closure is in principle

  2127 equivalent to setting \textit{iter} to the cardinality of the predicate's

  2128 domain, but it is usually more efficient.

  2130 {\small See also \textit{wf} (\S\ref{scope-of-search}), \textit{debug}

  2131 (\S\ref{output-format}), and \textit{iter} (\S\ref{scope-of-search}).}

  2133 \optrue{uncurry}{dont\_uncurry}

  2134 Specifies whether Nitpick should uncurry functions. Uncurrying has on its own no

  2135 tangible effect on efficiency, but it creates opportunities for the boxing

  2136 optimization.

  2138 \nopagebreak

  2139 {\small See also \textit{box} (\S\ref{scope-of-search}), \textit{debug}

  2140 (\S\ref{output-format}), and \textit{format} (\S\ref{output-format}).}

  2142 \optrue{fast\_descrs}{full\_descrs}

  2143 Specifies whether Nitpick should optimize the definite and indefinite

  2144 description operators (THE and SOME). The optimized versions usually help

  2145 Nitpick generate more counterexamples or at least find them faster, but only the

  2146 unoptimized versions are complete when all types occurring in the formula are

  2147 finite.

  2149 {\small See also \textit{debug} (\S\ref{output-format}).}

  2151 \optrue{peephole\_optim}{no\_peephole\_optim}

  2152 Specifies whether Nitpick should simplify the generated Kodkod formulas using a

  2153 peephole optimizer. These optimizations can make a significant difference.

  2154 Unless you are tracking down a bug in Nitpick or distrust the peephole

  2155 optimizer, you should leave this option enabled.

  2157 \opt{sym\_break}{int}{20}

  2158 Specifies an upper bound on the number of relations for which Kodkod generates

  2159 symmetry breaking predicates. According to the Kodkod documentation

  2160 \cite{kodkod-2009-options}, ``in general, the higher this value, the more

  2161 symmetries will be broken, and the faster the formula will be solved. But,

  2162 setting the value too high may have the opposite effect and slow down the

  2163 solving.''

  2165 \opt{sharing\_depth}{int}{3}

  2166 Specifies the depth to which Kodkod should check circuits for equivalence during

  2167 the translation to SAT. The default of 3 is the same as in Alloy. The minimum

  2168 allowed depth is 1. Increasing the sharing may result in a smaller SAT problem,

  2169 but can also slow down Kodkod.

  2171 \opfalse{flatten\_props}{dont\_flatten\_props}

  2172 Specifies whether Kodkod should try to eliminate intermediate Boolean variables.

  2173 Although this might sound like a good idea, in practice it can drastically slow

  2174 down Kodkod.

  2176 \opt{max\_threads}{int}{0}

  2177 Specifies the maximum number of threads to use in Kodkod. If this option is set

  2178 to 0, Kodkod will compute an appropriate value based on the number of processor

  2179 cores available.

  2181 \nopagebreak

  2182 {\small See also \textit{batch\_size} (\S\ref{optimizations}) and

  2183 \textit{timeout} (\S\ref{timeouts}).}

  2184 \end{enum}

  2186 \subsection{Timeouts}

  2187 \label{timeouts}

  2189 \begin{enum}

  2190 \opt{timeout}{time}{$\mathbf{30}$ s}

  2191 Specifies the maximum amount of time that the \textbf{nitpick} command should

  2192 spend looking for a counterexample. Nitpick tries to honor this constraint as

  2193 well as it can but offers no guarantees. For automatic runs,

  2194 \textit{timeout} is ignored; instead, Auto Quickcheck and Auto Nitpick share

  2195 a time slot whose length is specified by the ``Auto Counterexample Time

  2196 Limit'' option in Proof General.

  2198 \nopagebreak

  2199 {\small See also \textit{max\_threads} (\S\ref{optimizations}).}

  2201 \opt{tac\_timeout}{time}{$\mathbf{500}$\,ms}

  2202 Specifies the maximum amount of time that the \textit{auto} tactic should use

  2203 when checking a counterexample, and similarly that \textit{lexicographic\_order}

  2204 and \textit{sizechange} should use when checking whether a (co)in\-duc\-tive

  2205 predicate is well-founded. Nitpick tries to honor this constraint as well as it

  2206 can but offers no guarantees.

  2208 \nopagebreak

  2209 {\small See also \textit{wf} (\S\ref{scope-of-search}),

  2210 \textit{check\_potential} (\S\ref{authentication}),

  2211 and \textit{check\_genuine} (\S\ref{authentication}).}

  2212 \end{enum}

  2214 \section{Attribute Reference}

  2215 \label{attribute-reference}

  2217 Nitpick needs to consider the definitions of all constants occurring in a

  2218 formula in order to falsify it. For constants introduced using the

  2219 \textbf{definition} command, the definition is simply the associated

  2220 \textit{\_def} axiom. In contrast, instead of using the internal representation

  2221 of functions synthesized by Isabelle's \textbf{primrec}, \textbf{function}, and

  2222 \textbf{nominal\_primrec} packages, Nitpick relies on the more natural

  2223 equational specification entered by the user.

  2225 Behind the scenes, Isabelle's built-in packages and theories rely on the

  2226 following attributes to affect Nitpick's behavior:

  2228 \begin{itemize}

  2229 \flushitem{\textit{nitpick\_def}}

  2231 \nopagebreak

  2232 This attribute specifies an alternative definition of a constant. The

  2233 alternative definition should be logically equivalent to the constant's actual

  2234 axiomatic definition and should be of the form

  2236 \qquad $c~{?}x_1~\ldots~{?}x_n \,\equiv\, t$,

  2238 where ${?}x_1, \ldots, {?}x_n$ are distinct variables and $c$ does not occur in

  2239 $t$.

  2241 \flushitem{\textit{nitpick\_simp}}

  2243 \nopagebreak

  2244 This attribute specifies the equations that constitute the specification of a

  2245 constant. For functions defined using the \textbf{primrec}, \textbf{function},

  2246 and \textbf{nominal\_\allowbreak primrec} packages, this corresponds to the

  2247 \textit{simps} rules. The equations must be of the form

  2249 \qquad $c~t_1~\ldots\ t_n \,=\, u.$

  2251 \flushitem{\textit{nitpick\_psimp}}

  2253 \nopagebreak

  2254 This attribute specifies the equations that constitute the partial specification

  2255 of a constant. For functions defined using the \textbf{function} package, this

  2256 corresponds to the \textit{psimps} rules. The conditional equations must be of

  2257 the form

  2259 \qquad $\lbrakk P_1;\> \ldots;\> P_m\rbrakk \,\Longrightarrow\, c\ t_1\ \ldots\ t_n \,=\, u$.

  2261 \flushitem{\textit{nitpick\_intro}}

  2263 \nopagebreak

  2264 This attribute specifies the introduction rules of a (co)in\-duc\-tive predicate.

  2265 For predicates defined using the \textbf{inductive} or \textbf{coinductive}

  2266 command, this corresponds to the \textit{intros} rules. The introduction rules

  2267 must be of the form

  2269 \qquad $\lbrakk P_1;\> \ldots;\> P_m;\> M~(c\ t_{11}\ \ldots\ t_{1n});\>

  2270 \ldots;\> M~(c\ t_{k1}\ \ldots\ t_{kn})\rbrakk \,\Longrightarrow\, c\ u_1\

  2271 \ldots\ u_n$,

  2273 where the $P_i$'s are side conditions that do not involve $c$ and $M$ is an

  2274 optional monotonic operator. The order of the assumptions is irrelevant.

  2276 \end{itemize}

  2278 When faced with a constant, Nitpick proceeds as follows:

  2280 \begin{enum}

  2281 \item[1.] If the \textit{nitpick\_simp} set associated with the constant

  2282 is not empty, Nitpick uses these rules as the specification of the constant.

  2284 \item[2.] Otherwise, if the \textit{nitpick\_psimp} set associated with

  2285 the constant is not empty, it uses these rules as the specification of the

  2286 constant.

  2288 \item[3.] Otherwise, it looks up the definition of the constant:

  2290 \begin{enum}

  2291 \item[1.] If the \textit{nitpick\_def} set associated with the constant

  2292 is not empty, it uses the latest rule added to the set as the definition of the

  2293 constant; otherwise it uses the actual definition axiom.

  2294 \item[2.] If the definition is of the form

  2296 \qquad $c~{?}x_1~\ldots~{?}x_m \,\equiv\, \lambda y_1~\ldots~y_n.\; \textit{lfp}~(\lambda f.\; t)$,

  2298 then Nitpick assumes that the definition was made using an inductive package and

  2299 based on the introduction rules marked with \textit{nitpick\_\allowbreak

  2300 ind\_\allowbreak intros} tries to determine whether the definition is

  2301 well-founded.

  2302 \end{enum}

  2303 \end{enum}

  2305 As an illustration, consider the inductive definition

  2307 \prew

  2308 \textbf{inductive}~\textit{odd}~\textbf{where} \\

  2309 ``\textit{odd}~1'' $\,\mid$ \\

  2310 ``\textit{odd}~$n\,\Longrightarrow\, \textit{odd}~(\textit{Suc}~(\textit{Suc}~n))$''

  2311 \postw

  2313 Isabelle automatically attaches the \textit{nitpick\_intro} attribute to

  2314 the above rules. Nitpick then uses the \textit{lfp}-based definition in

  2315 conjunction with these rules. To override this, we can specify an alternative

  2316 definition as follows:

  2318 \prew

  2319 \textbf{lemma} $\mathit{odd\_def}'$ [\textit{nitpick\_def}]: ``$\textit{odd}~n \,\equiv\, n~\textrm{mod}~2 = 1$''

  2320 \postw

  2322 Nitpick then expands all occurrences of $\mathit{odd}~n$ to $n~\textrm{mod}~2

  2323 = 1$. Alternatively, we can specify an equational specification of the constant:

  2325 \prew

  2326 \textbf{lemma} $\mathit{odd\_simp}'$ [\textit{nitpick\_simp}]: ``$\textit{odd}~n = (n~\textrm{mod}~2 = 1)$''

  2327 \postw

  2329 Such tweaks should be done with great care, because Nitpick will assume that the

  2330 constant is completely defined by its equational specification. For example, if

  2331 you make ``$\textit{odd}~(2 * k + 1)$'' a \textit{nitpick\_simp} rule and neglect to provide rules to handle the $2 * k$ case, Nitpick will define

  2332 $\textit{odd}~n$ arbitrarily for even values of $n$. The \textit{debug}

  2333 (\S\ref{output-format}) option is extremely useful to understand what is going

  2334 on when experimenting with \textit{nitpick\_} attributes.

  2336 \section{Standard ML Interface}

  2337 \label{standard-ml-interface}

  2339 Nitpick provides a rich Standard ML interface used mainly for internal purposes

  2340 and debugging. Among the most interesting functions exported by Nitpick are

  2341 those that let you invoke the tool programmatically and those that let you

  2342 register and unregister custom coinductive datatypes.

  2344 \subsection{Invocation of Nitpick}

  2345 \label{invocation-of-nitpick}

  2347 The \textit{Nitpick} structure offers the following functions for invoking your

  2348 favorite counterexample generator:

  2350 \prew

  2351 $\textbf{val}\,~\textit{pick\_nits\_in\_term} : \\

  2352 \hbox{}\quad\textit{Proof.state} \rightarrow \textit{params} \rightarrow \textit{bool} \rightarrow \textit{term~list} \rightarrow \textit{term} \\

  2353 \hbox{}\quad{\rightarrow}\; \textit{string} * \textit{Proof.state}$ \\

  2354 $\textbf{val}\,~\textit{pick\_nits\_in\_subgoal} : \\

  2355 \hbox{}\quad\textit{Proof.state} \rightarrow \textit{params} \rightarrow \textit{bool} \rightarrow \textit{int} \rightarrow \textit{string} * \textit{Proof.state}$

  2356 \postw

  2358 The return value is a new proof state paired with an outcome string

  2359 (``genuine'', ``likely\_genuine'', ``potential'', ``none'', or ``unknown''). The

  2360 \textit{params} type is a large record that lets you set Nitpick's options. The

  2361 current default options can be retrieved by calling the following function

  2362 defined in the \textit{Nitpick\_Isar} structure:

  2364 \prew

  2365 $\textbf{val}\,~\textit{default\_params} :\,

  2366 \textit{theory} \rightarrow (\textit{string} * \textit{string})~\textit{list} \rightarrow \textit{params}$

  2367 \postw

  2369 The second argument lets you override option values before they are parsed and

  2370 put into a \textit{params} record. Here is an example:

  2372 \prew

  2373 $\textbf{val}\,~\textit{params} = \textit{Nitpick\_Isar.default\_params}~\textit{thy}~[(\textrm{``}\textrm{timeout}\textrm{''},\, \textrm{``}\textrm{none}\textrm{''})]$ \\

  2374 $\textbf{val}\,~(\textit{outcome},\, \textit{state}') = \textit{Nitpick.pick\_nits\_in\_subgoal}~\begin{aligned}[t]

  2375 & \textit{state}~\textit{params}~\textit{false} \\[-2pt]

  2376 & \textit{subgoal}\end{aligned}$

  2377 \postw

  2379 \let\antiq=\textrm

  2381 \subsection{Registration of Coinductive Datatypes}

  2382 \label{registration-of-coinductive-datatypes}

  2384 If you have defined a custom coinductive datatype, you can tell Nitpick about

  2385 it, so that it can use an efficient Kodkod axiomatization similar to the one it

  2386 uses for lazy lists. The interface for registering and unregistering coinductive

  2387 datatypes consists of the following pair of functions defined in the

  2388 \textit{Nitpick} structure:

  2390 \prew

  2391 $\textbf{val}\,~\textit{register\_codatatype} :\,

  2392 \textit{typ} \rightarrow \textit{string} \rightarrow \textit{styp~list} \rightarrow \textit{theory} \rightarrow \textit{theory}$ \\

  2393 $\textbf{val}\,~\textit{unregister\_codatatype} :\,

  2394 \textit{typ} \rightarrow \textit{theory} \rightarrow \textit{theory}$

  2395 \postw

  2397 The type $'a~\textit{llist}$ of lazy lists is already registered; had it

  2398 not been, you could have told Nitpick about it by adding the following line

  2399 to your theory file:

  2401 \prew

  2402 $\textbf{setup}~\,\{{*}\,~\!\begin{aligned}[t]

  2403 & \textit{Nitpick.register\_codatatype} \\[-2pt]

  2404 & \qquad @\{\antiq{typ}~``\kern1pt'a~\textit{llist}\textrm{''}\}~@\{\antiq{const\_name}~ \textit{llist\_case}\} \\[-2pt] %% TYPESETTING

  2405 & \qquad (\textit{map}~\textit{dest\_Const}~[@\{\antiq{term}~\textit{LNil}\},\, @\{\antiq{term}~\textit{LCons}\}])\,\ {*}\}\end{aligned}$

  2406 \postw

  2408 The \textit{register\_codatatype} function takes a coinductive type, its case

  2409 function, and the list of its constructors. The case function must take its

  2410 arguments in the order that the constructors are listed. If no case function

  2411 with the correct signature is available, simply pass the empty string.

  2413 On the other hand, if your goal is to cripple Nitpick, add the following line to

  2414 your theory file and try to check a few conjectures about lazy lists:

  2416 \prew

  2417 $\textbf{setup}~\,\{{*}\,~\textit{Nitpick.unregister\_codatatype}~@\{\antiq{typ}~``

  2418 \kern1pt'a~\textit{list}\textrm{''}\}\ \,{*}\}$

  2419 \postw

  2421 \section{Known Bugs and Limitations}

  2422 \label{known-bugs-and-limitations}

  2424 Here are the known bugs and limitations in Nitpick at the time of writing:

  2426 \begin{enum}

  2427 \item[$\bullet$] Underspecified functions defined using the \textbf{primrec},

  2428 \textbf{function}, or \textbf{nominal\_\allowbreak primrec} packages can lead

  2429 Nitpick to generate spurious counterexamples for theorems that refer to values

  2430 for which the function is not defined. For example:

  2432 \prew

  2433 \textbf{primrec} \textit{prec} \textbf{where} \\

  2434 ``$\textit{prec}~(\textit{Suc}~n) = n$'' \\[2\smallskipamount]

  2435 \textbf{lemma} ``$\textit{prec}~0 = \undef$'' \\

  2436 \textbf{nitpick} \\[2\smallskipamount]

  2437 \quad{\slshape Nitpick found a counterexample for \textit{card nat}~= 2:

  2438 \nopagebreak

  2439 \\[2\smallskipamount]

  2440 \hbox{}\qquad Empty assignment} \nopagebreak\\[2\smallskipamount]

  2441 \textbf{by}~(\textit{auto simp}: \textit{prec\_def})

  2442 \postw

  2444 Such theorems are considered bad style because they rely on the internal

  2445 representation of functions synthesized by Isabelle, which is an implementation

  2446 detail.

  2448 \item[$\bullet$] Nitpick maintains a global cache of wellfoundedness conditions,

  2449 which can become invalid if you change the definition of an inductive predicate

  2450 that is registered in the cache. To clear the cache,

  2451 run Nitpick with the \textit{tac\_timeout} option set to a new value (e.g.,

  2452 501$\,\textit{ms}$).

  2454 \item[$\bullet$] Nitpick produces spurious counterexamples when invoked after a

  2455 \textbf{guess} command in a structured proof.

  2457 \item[$\bullet$] The \textit{nitpick\_} attributes and the

  2458 \textit{Nitpick.register\_} functions can cause havoc if used improperly.

  2460 \item[$\bullet$] Local definitions are not supported and result in an error.

  2462 \item[$\bullet$] All constants and types whose names start with

  2463 \textit{Nitpick}{.} are reserved for internal use.

  2464 \end{enum}

  2466 \let\em=\sl

  2467 \bibliography{../manual}{}

  2468 \bibliographystyle{abbrv}

  2470 \end{document}

author	blanchet
	Wed, 28 Oct 2009 17:43:43 +0100
changeset 33552	ab01b72715ef
parent 33550	63925777ccf9
child 33555	75ce0f60617a
permissions	-rw-r--r--