wneuper/isa: doc-isac/mlehnfeld/master/thesis/isabelle

     1 \chapter{\isabelle{} and \isac{}}

     2 \label{cha:isabelle_isac}

     4 The case study described in this chapter was performed as an

     5 invasive development on \isac{}, a prototype of an educational tool for

     6 applied mathematics which is built upon the theorem prover \isabelle{}

     7 in order to ensure the logical correctness of calculations. The development was invasive in that it addressed basic mechanisms of \isabelle{}

     8 which are reused by \isac{}. The motivation for this development was to improve

     9 \isac{}'s integration with \isabelle{} as well as \isac{}'s efficiency and

    10 responsiveness in a multi-user setting. Both improvements are related to

    11 the introduction of parallelism in \isabelle{} during the last years.

    13 Section \ref{sec:isabelle} introduces \isabelle{}, its basic architecture

    14 and the main components it is comprised of. It furthermore discusses \textit{Standard ML} and

    15 the \textit{Poly/ML} compiler it uses (section \ref{ssec:polyml}).

    16 Section \ref{sec:isadevdoc} is about \isabelle{}'s development and documentation

    17 workflow which has mostly been adopted for the work on \isac{}. An overview

    18 of the basics of why parallelism was added to \isabelle{}'s computation model

    19 in section \ref{ssec:isab-parallel} wraps up the main part about \isabelle{}. In order to make the gain from \isac{}'s improvements comprehensible,

    20 section \ref{ssec:isac-design} gives some background information about

    21 \isac's principal design and architecture.

    22 Section \ref{ssec:isac-archi} shows \isac's

    23 architecture, which had to be adhered to by the study's implementation

    24 work.

    25 Since the case study concerns deeply invasive development into \isac,

    26 section \ref{ssec:future-isab} is dedicated to the question,

    27 how \isac's prototype development relates to

    28 \isabelle{}'s ongoing development, which is invasive, too.

    29 The detailed descriptions of the study's implementation

    30 work are section \ref{ssec:parall-thy} on the integration

    31 of \isac's knowledge definitions into \isabelle{}'s parallel theory evaluation

    32 and section \ref{ssec:session-manag} on the introduction of concurrency

    33 to \isac's user session management.

    36 \section{\isabelle{}}

    37 \label{sec:isabelle}

    38 \isabelle{} is one of the leading computer theorem provers as introduced in section \ref{sec:ctp}. It is used in many academic courses, courses on semantics of programming languages, on program verification, on introduction to mathematical proof, etc. But the number of professional users can hardly be estimated. Continuous reading of the mailing list for technical support \cite{pipermail2014clisa} suggests more than a hundred persons worldwide who already did a notable development within a PhD, some academic postdoctoral work or as engineers in industry. \isabelle{} represents more than hundred man years of development, dedicated within the last twenty five years and still ongoing. \isabelle{} was started by Larry Paulson as a ``logical framework'', an answer to the issue of {\em The Next Seven Hundred Theorem Provers} mentioned on page \pageref{par:next-700}.

    40 Today, development is ongoing at three universities: At the {\em Computer Laboratory of the University of Cambridge}, Larry Paulson develops specific automated provers, formalizes mathematics and security protocols. {\em The Chair for Logic and Verification} at the {\em Technical University of Munich} spawned from the institute of Manfred Broy and is led by Tobias Nipkow. Nipkow is responsible for the core system as well as for general theory development. About ten postdocs and PhDs are employed for these purposes \cite{tum2013team}. The parallelization efforts and development of the \textit{Isabelle/jEdit} front-end (section \ref{ssec:isa-jedit}) are undertaken at {\em Université Paris-Sud} in the {\em Laboratoire de Recherche en Informatique}, supervised by Makarius Wenzel. Confirmed by a wide range of expectations in TP technology as mentioned in section \ref{sec:ctp}, \isabelle{} strives for usability at the workplace of computer scientists, of practitioners in various engineering disciplines and of mathematicians. E.g. at \textit{RISC Linz} there is an initiative for verified computer algebra using \isabelle{}. Usability issues were the driving force to introduce parallelism to \isabelle{}.

    42 \isabelle{} integrates various tools and languages. This section describes the technologies involved in the system, its most important components and how they work together. Also, it discusses the repository, development and documentation workflow and the introduction of parallelism. The distribution for all major platforms can be downloaded from the \isabelle{} website \cite{tum2014isainst}. Whenever files are referenced, ``{\tt \textasciitilde\textasciitilde}'' stands for the distibution's root directory. These references address the \isabelle{} release {\em Isabelle2013-2}.

    43 % "Isabelle as document-oriented proof assistant" paper

    44 % Isabelle was not designed; it evolved. Not everyone likes this idea.

    45 %Specification experts rightly abhor trial-and-error programming. They

    46 %suggest that no one should write a program without first writing a complete formal specification. But university departments are not software

    47 % houses. Programs like Isabelle are not products: when they have served

    48 % their purpose, they are discarded.

    49 % Lawrence C. Paulson, "Isabelle: The Next 700 Theorem Provers"

    51 % intelligent_computer_math_isabelle.pdf p. 484-487

    52 % itp-smp.pdf (survey paper)

    53 % parallel-ml.pdf

    54 % itp-pide.pdf

    55 % isabelle_proof_method_lang.pdf

    56 % polyml_concurrent_sml.pdf

    58 \subsection{\textit{Isabelle/ML}}

    59 \label{sec:bigisaml}

    60 This section presents \textit{Isabelle/ML}, which allows programmers to embed \textit{Standard ML} code into the \isabelle{} environment. The used compiler is called \textit{Poly/ML}. It is fast and efficient, implements the full \textit{ML} standard \cite{milner1997definition} and provides additional features like a foreign language interface and support for POSIX threads based concurrency. \textit{Isabelle/ML} furthermore includes a number of libraries built on top of \textit{Poly/ML} with extra language constructs for concurrency. Before we take a deeper look into these we need to discuss \textit{Standard ML} and the \textit{Poly/ML} compiler.

    62 \subsubsection{\textit{Standard ML}}

    63 \label{sec:isasml}

    64 \textit{ML} stands for {\em meta language}. The language was briefly introduced in section \ref{sec:funhist}. It is a strongly typed, functional programming language, which is not pure as it allows side effects and updatable references. Many people contributed ideas to the language. In order maintain a common ground, a standard was established \cite{milner1997definition}. All the code samples in this document are written in \textit{Standard ML} syntax. Since \textit{Standard ML} is only a language definition which is not dictated by a reference implementation, there are multiple compilers available, most notably {\em Standard ML of New Jersey} ({\em SML/NJ}) \cite{smlnj2013standard}, {\em MLton} \cite{weeks2006whole} and {\em Moscow ML} \cite{romanenko2014moscow}. They differ in their priorities such as optimization, standard library size or platform support. \textit{ML} was originally designed as a command language for the interactive theorem prover \textit{LCF} (section \ref{sssec:lcf}) \cite{milner1978theory}. Because of its history in this area, it was specifically suitable for the development of \isabelle{}. Inference rules can easily be implemented as functions that accept a theorem and return a new one. Complete theorems can be constructed by applying a sequence of rules to other known theorems.

    66 \paragraph{The Hind\-ley-Mil\-ner Type System.}\label{sec:hindleymilner}

    67 \textit{Standard ML} makes use of {\em Hind\-ley-Mil\-ner type inference} \cite{milner1978theory} based on a type system for the lambda calculus which utilizes parametric polymorphism, i.e. types can be declared using type variables (e.g. \textit{Synchronized} signature on page \pageref{alg:guarded_access_sig}) and the decision on the instantiated types is left to the inference algorithm. This allows for the generic definition of functions while still ensuring static type safety. The type inference method is complete and is able to always deduce the most general type of an expression without requiring type annotations. If it cannot find a suitable type for an expression, its declaration must contain errors. Because type checking in \textit{ML} is secure, deduction of theorems is always sound. Wrong applications of rules lead to exceptions.

    69 \bigskip

    71 Another important feature of \textit{Standard ML} is its design for robust, large scale, modular software. Signatures are part of the type system and describe the interfaces between different modules.

    73 \subsubsection{\textit{Poly/ML}}

    74 \label{ssec:polyml}

    75 In the course of his PhD thesis, David Matthews from the \textit{University of Cambridge} developed and implemented a language called {\em Poly}, which had many ideas in common with \textit{Standard ML}. The compiler was reimplemented in \textit{Poly} itself. Later, Matthews rewrote parts of the compiler to compile \textit{Standard ML} code. The result was {\em Poly/ML} \cite{matthews1995papers} which is now available under an open source license. It implements the full \textit{Standard ML} specification and provides a few extensions. The run-time system was written in \textit{C}. \textit{Poly/ML}'s most notable feature are its concurrency mechanisms. While the \textit{Standard ML} specification \cite{milner1997definition} is limited to sequential programming, \textit{Poly/ML} provides concurrency primitives based on POSIX threads. Its native support for system threads and parallel garbage collection make it unique amongst \textit{Standard ML} compilers. Another related project is {\em Concurrent ML}, an extension of \textit{SML/NJ} \cite{reppy1993concurrent}. The primitives in \textit{Poly/ML} include threads, mutexes and condition variables. An example for forking a thread was shown in the communicating sequential process example on page \pageref{alg:csp}. For details on the operations and their implementation, please refer to \cite{matthews2010efficient}.

    76 % pretty much extendable: garbage collection, memory management, libraries, more details of concurrency, foreign language interface, esp. C interface

    78 \subsubsection{\textit{Isabelle/ML}}

    79 \label{sec:isabelleml}

    80 \textit{Isabelle/ML} is a way of programming in \textit{Standard ML} which is encouraged and enhanced within the \isabelle{} infrastructure. It adds many library modules to plain \textit{Poly/ML}.

    82 \paragraph{Antiquotations.}\label{sec:antiquot}

    83 A very powerful mechanism for easy access to \isabelle{} system values and logical entities are so called {\em antiquotations} \cite{wenzel2013impl}. Depending on their particular purpose they are resolved at compile, link or run time. They can refer to various datatypes such as {\em theorems} (derived propositions), {\em theories} (containers for global declarations), {\em contexts} (deductive environment states) and many other concepts within the \isabelle{} framework. E.g. {\tt @\{context\}} would refer to the context at compile time at the location where it is used, i.e. a \textit{Standard ML} value of type {\tt context}.

    85 \paragraph{Concurrency.}

    86 \textit{Poly/ML} comes with very general mechanisms for concurrency. Its approach is based on shared resources and synchronized access. While this concurrency model is well established, it is easy for the programmer to overlook some detail in the coordination of threads and deadlocks can occur. In order to ensure correctness by design, \textit{Isabelle/ML} provides higher-order combinators and thereby hides synchronization details.

    88 \subparagraph{Guarded Access.}\label{sec:guarded-access}

    89 Guarded access primitives operate on an abstract type called {\tt var}. They are available from the module {\em Synchronized}\footnote{\label{ftn:syncml}{\tt \textasciitilde\textasciitilde/src/Pure/Concurrent/synchronized.ML}} and the most notable elements of its interface are declared as

    90 \imlcode{\label{alg:guarded_access_sig}

    91 ~~\=ty\=pe\=~'\=a \=var\\

    92 \>val var: string -> 'a -> 'a var\\

    93 \>val value: 'a var -> 'a\\

    94 \>val guarded\_access: 'a var -> ('a -> ('b * 'a) option) -> 'b\\

    95 \>val change: 'a var -> ('a -> 'a) -> unit \textrm{.}}

    96 \noindent

    97 This type is merely a wrapper for an ML reference. Please note the type parameters {\tt 'a} and {\tt 'b}. The constructor creates a {\tt var} from such a reference. {\tt value} is the inverse operation and performs a read access. Machine code produced by the \textit{Poly/ML} compiler generally treats memory cells as volatile, i.e. calls to {\tt value} need not be synchronized because read operations always return {\em some} value. The {\tt guarded\_access} combinator is where the magic happens. Besides the {\tt var} element $x$ with value $v$ that the write access will be performed on, it expects a function $f$ that maps $v$ to an option element of the touple type {\em 'a * 'b}. The type {\em option} is \textit{Standard ML}'s equivalent of the {\tt Maybe} monad in \textit{Haskell}: It can either be {\tt NONE} or contain a value, i.e. in our case {\tt SOME ($y$, $v2$)}. $f$ is executed in a critical section of $x$ on its value $v$. As long as it returns {\tt NONE}, the current execution context waits for another thread to change $v$. If $f$ returns {\tt SOME ($y$, $v2$)}, the value of $x$ is set to $v2$ and the call to {\tt guarded\_access} returns with result $y$.

    98 The original definition of {\tt change}\footnote{{\tt \textasciitilde\textasciitilde/src/Pure/Concurrent/synchronized.ML}}, i.e. an unconditional, synchronized update to a {\tt var} element is given by

    99 \imlcode{~~\=fun change\_result var f = guarded\_access var (SOME o f);\\

   100 \>fun change var f = change\_result var (fn x => ((), f x)); \textrm{.}}

   101 \noindent

   102 It is easily possible to implement higher-order mechanisms using these primitives. See \cite{matthews2010efficient} for an example of a message queue.

   103 % EXTEND: exception handling, performance

   105 \subparagraph{Futures.}\label{sec:isafut}

   106 While the last approach allows for process synchronization on a high level, most of the complexity persists and as software gets larger it still gets increasingly hard to coordinate a high number of concurrent computations efficiently. For this reason, \textit{Isabelle/ML} provides {\em futures} (section \ref{sec:futurespro}) and is thereby able to completely hide threads and their synchronization from the programmer.

   107 The goal was to provide a mechanism for parallelism which is both simple to use and performant with a special emphasis on operations typically carried out by a proof engine. {\em Future values} fulfill these requirements. They are value-oriented, i.e. the focus hereby is on parallelism rather than concurrency. Error handling happens synchronously by means of exceptions and asynchronously with interrupts.

   108 The {\em Future} module\footnote{\tt \textasciitilde\textasciitilde/src/Pure/Concurrent/future.ML} has a comprehensive signature. The following declarations are only an excerpt:

   109 \imlcode{~~\=type 'a future\\

   110 \>val fork: (unit -> 'a) -> 'a future\\

   111 \>val cancel: 'a future -> unit\\

   112 \>val is\_finished: 'a future -> bool\\

   113 \>val join: 'a future -> 'a\\

   114 \>val map: ('a -> 'b) -> 'a future -> 'b future\\

   115 \>val value: 'a -> 'a future}

   116 \noindent

117 Notice that {\tt 'a} is a placeholder for an arbitrary type which a future can represent. Calls to {\tt Future.join} synchronize the evaluation and wait for the future's result if it is not already available. In case an exception is raised during the evaluation, its propagation will wait until the future is joined. {\tt Future.cancel} stops a future's evaluation if it has not finished, otherwise the function has no effect. The evaluation of futures is managed by means of tasks and worker threads. The number of workers should be related to the number of available processing cores. The exact number, however, is adjusted dynamically because the future scheduler always keeps a limited number of inactive threads in memory, such that they can immediately take over if another thread is temporarily stalled. In practice, the additional functions {\tt Future.map} and {\tt Future.value} have turned out to be useful in reducing the number of tasks and as a consequence the scheduling overhead. {\tt Future.map} can append another operation to an existing future, which will be executed on the originally scheduled function's result. If the original future's evaluation has not started yet, the appended operation will be computed within the same task. {\tt Future.value} produces a dummy future holding the function's argument as a result. Additionally, futures can be assigned priorities in order to influence execution order which by default is based on creation time. Also, futures can be organized in groups and even hierarchies of groups. This allows futures that depend on each other to be canceled whenever one of them raises an exception.

   119 A simple {\em future} example has already been shown on page \pageref{alg:futures}. Internally, the {\em Futures} module is based on the guarded access mechanisms introduced above. Based on futures, \textit{Isabelle/ML} provides more specific parallel combinators such as the parallel list combinators {\tt map} and {\tt find}. The latter makes use of future groups and throws an exception as soon as a match is found, such that other search branches are not evaluated. This example shows how exceptions can be used as a basic means for functional interaction of futures without reintroducing the complexities usually associated with inter-process communication. See \cite{matthews2010efficient} for further details and a discussion of the performance of futures.

   121 \subsection{\textit{Isabelle/Isar}}

   122 \label{sec:isaisar}

   123 While in older versions of \isabelle{}, theorem proving tasks had to be solved using raw \textit{ML} tactic scripts, theories and proofs can now be developed interactively in an interpreted, structured, formal proof language called {\em Isar} (Intelligible semi-automated reasoning) \cite{wenzel2013isar,wenzel1999isar}. \textit{Isar} was designed to address an issue common to state-of-the-art theorem provers: Despite successfully formalizing a good amount of existing mathematical knowledge, logics and computer science theory, it was still hard to reach potential target groups unfamiliar with the involved computer programming requirements. The primary users of theorem provers should be mathematicians and engineers who utilize them to formulate new proofs and for verification purposes. \textit{Isar} enables these users (``authors'' in fig. \ref{fig:isar-use}) to write proofs and theories in a way that can be understood by machines while being much closer to human language than \textit{ML} code or some logical calculus. As much as possible, it hides operational details thanks to its notion of {\em obvious inferences}. With a little extra care it can therefore also be used for presentation to an audience. The short, well-known proof that the square root of 2 is an irrational number is presented in appendix \ref{app:sqrt2-isar}.

   125 \begin{figure}

   126 \centering

   127 \def\svgwidth{18.5em}

   128 \resizebox{.5\textwidth}{!}{\input{images/isar_use.pdf_tex}}

   129 \caption{\textit{Isar} proof notions \cite{wenzel1999isar}.}

   130 \label{fig:isar-use}

   131 \end{figure}

   133 The {\em Isar virtual machine} interpreter performs proof checking and incorporates an operational semantics for \textit{Isar}. It is provided by the \textit{Isabelle/Isar} framework, which embeds Isar into the \isabelle{} environment. It supplies the \textit{Isar/VM} interpreter with all available proofs and definitions which are implemented in theory documents. These theories are structured in an inheritance hierarchy that forms a directed, acyclic graph. \textit{Isar} proofs operate on the basis of certain object logics. The most established logic is \textit{Isabelle/HOL}, i.e. higher-order logic (simply typed set theory), but a wide range of logics is available through the \textit{Isabelle/Pure} meta logic. Another important element are {\em proof methods}, which are function mappings between proof goals and according proof rules. \textit{Isar}'s lack of mechanisms for the formulation of automated proof procedures turned out to cause an unacceptable amount of duplicate code. Very recent work on a new language called {\em Eisbach}, which is based on \textit{Isar}, has tried to address these issues \cite{matichuk2014isabelle}.

   135 \subsection{\textit{Isabelle/Scala}}

   136 \label{ssec:isa-scala}

   137 The original user interface for the \isabelle{} theorem prover was a project called {\em Proof General}. It is a proof assistant front-end based on \textit{Emacs}, also available for other provers like {\em Coq} \cite{chlipala2011certified}. Its interaction model offers some improvements over a basic TTY model. The user enters a prover command and immediately receives a result, such as information on proof state or goals, before they can issue the next command. But interaction remains sequential and synchronous. There are two main weaknesses to \textit{Proof General} and similar approaches. The underlying editor framework is outdated: \textit{Emacs} offers a very powerful environment, but its graphical user interface can be considered dated. Also \textit{Emacs}' \textit{Lisp} engine does not support multithreading and does therefore not allow the development environment to make full use of the available CPU power. The other main shortcoming is the poor interaction model that is limited to a synchronous input / output sequence. Instead of simple text editing tools, a fully-featured IDE was desirable \cite{wenzel2012asynchronous}. In addition, the IDE should be available on many platforms and as the main development focus of \isabelle{} developers is theorem provers, they wanted to be able to draw on existing IDE frameworks, rather than developing a new one from scratch. They chose to make use of the \textit{Java Virtual Machine} and enable tools written for the \textit{JVM} to communicate with the \isabelle{} prover via simple byte streams. The resulting asynchronous document model utilized for the communication between \textit{ML} and \textit{JVM} allows GUIs, text editors like \textit{jEdit}, IDE frameworks such as \textit{Eclipse} as well as web services to make use of the \isabelle{} prover. The outcome of the efforts to design this {\bf p}rover {\bf IDE} framework is called {\em Isabelle/PIDE}. In order to overcome the differing programming models and datastructures of \textit{Standard ML} and the \textit{JVM}, \textit{Scala} was chosen as the \textit{JVM} communication endpoint because, due to its flexibility and functional features, it allows programmers to imitate a programming style similar to \textit{Isabelle/ML}. \textit{Scala}'s actors (section \ref{sec:actormodel}) were used heavily for asynchronous communication and concurrency until recently. Since the original Actor library was replaced by {\em Akka}, \isabelle{} developers have decided to imitate the \textit{ML} variants of parallelism mechanisms in \textit{Scala} without actors \cite{wenzel2014actors}. All datastructures that are part of the protocol have been implemented in both languages side by side. E.g. the datatype definition of file system path elements is

   138 \imlcode{~~\=da\=tatype elem =\\

   139 \>\>Root of string |\\

   140 \>\>Basic of string |\\

   141 \>\>Variable of string |\\

   142 \>\>Parent;}

   143 \noindent

   144 in \textit{Standard ML}\footnote{\tt \textasciitilde\textasciitilde/src/Pure/General/path.ML}. Here, a value of type \texttt{elem} can be either of the four subtypes, three of which are \texttt{string}s. The equivalent datastructure in \textit{Scala}\footnote{\tt \textasciitilde\textasciitilde/src/Pure/General/path.scala} is given as four concrete classes inheriting from the abstract class \texttt{Elem}:

   145 \imlcode{~~\=sealed abstract class Elem\\

   146 \>private case class Root(val name: String) extends Elem\\

   147 \>private case class Basic(val name: String) extends Elem\\

   148 \>private case class Variable(val name: String) extends Elem\\

   149 \>private case object Parent extends Elem}

   150 \noindent

   151 The internal protocol utilizes \textit{YXML}, a transfer format based on XML \cite{wenzel2011isabelle}. Instead of keeping a public protocol specification, the details are being kept private. This allows the developers to make substantial changes to protocol and document model to improve robustness and performance without further ado. Instead, both \textit{Isabelle/ML} and \textit{Isabelle/Scala} offer public, static APIs for their communication which are maintained and kept simple and stable.

   153 \begin{figure}

   154 \centering

   155 \def\svgwidth{17.5em}

   156 \resizebox{.6\textwidth}{!}{\bf\input{images/pide_comm.pdf_tex}}

   157 \caption{The \textit{Isabelle/PIDE} communication model \cite{wenzel2014uipide}.}

   158 \label{fig:pide-comm}

   159 \end{figure}

   161 Fig. \ref{fig:pide-comm} outlines the basic communication model between the \textit{JVM} and the prover. Whenever the user edits a file in an editor, a description of this update is sent to the prover which then produces markup to annotate the new contents with proof data and other semantic information. Then the changes on the editor side are reverted and replaced by the new, annotated content. As a consequence, the interaction mechanism adheres to the functional paradigm's philosophy by using strictly immutable document snapshots of versioned file histories.

   162 % scala actors -> email!

   163 %pide_ocaml.pdf

   164 %isabelle_system_manual.pdf

   165 %async_isabelle_scala_jedit.pdf

   166 % itp-pide.pdf

   167 % Walther mail 24.04. "Fwd: [isabelle-dev] scala-2.11.0" (scala actors)

   168 % isabelle-doc.pdf (YXML)

   170 \subsection{\textit{Isabelle/jEdit}}

   171 \label{ssec:isa-jedit}

   172 Recent \isabelle{} distributions include a complete prover IDE prototype implementation based on the text editor framework {\em jEdit} \cite{wenzel2012isabelle}. \textit{Isabelle/jEdit} features a completely new interaction model that performs continuous, parallel proof checking. It uses GUI functionality such as highlighting, coloring, tooltips, popup windows and hyperlinks to annotate user code written in \textit{Isar} and \textit{ML} incrementally with semantic information from the prover (see fig. \ref{fig:jedit_feedback}). Prover and editor front-end are executed independently and never block each other thanks to the asynchronous protocol. Additionally, \textit{Isabelle/jEdit} includes several plugins and fonts to facilitate entering and rendering of mathematical symbols.

   174 \begin{figure}

   175 \centering

   176 \includegraphics[width=94mm]{jedit_feedback} %width=.95\textwidth

   177 \caption{GUI feedback in \textit{Isabelle/jEdit}.}

   178 \label{fig:jedit_feedback}

   179 \end{figure}

   181 The IDE also manages loaded theory files and the computation of their dependency graph. Changes to theories are automatically propagated appropriately and the user receives immediate feedback on the processing of theories (fig. \ref{fig:jedit_thy}).

   182 \begin{figure}

   183 \centering

   184 \includegraphics[width=80mm]{theories_processing} %width=.95\textwidth 70mm

   185 \caption{Feedback on theory processing \textit{Isabelle/jEdit}.}

   186 \label{fig:jedit_thy}

   187 \end{figure}

   188 %async_isabelle_scala_jedit.pdf

   189 %jedit.pdf

   191 \subsection{The Prover Framework}

   192 \label{sec:isaatp}

   193 % isabelle/pure vs. isabelle/hol and other logics

   194 % proof general

   195 % pure framework

   196 % graphics Isar-slides-TPHOLs99.ps.gz

   197 %hol_springer.pdf:361

   198 %isahol_proofdisproof.pdf

   199 %itp-pide.pdf:5

   200 %isabelle_tutorial.pdf

   201 %isar_sml_antiquot.pdf:6

   202 %Isar-slides-TPHOLs99.ps.gz:2

   203 %isa_pide_educational.pdf:146

   205 % unlimitide undo, document model immutable: previous states are preserved and full history is saved. unreachable states are garbage collected

   206 % cross-link step 2 under "integration with isabelle's par..." annotations

   207 The paragraph titled {\em The Next Seven Hundred Theorem Provers} in section \ref{par:next-700} (page \pageref{par:next-700}) explained that \isabelle{} was initiated and designed as a logical framework for automated theorem provers. Its design is described in \cite{wenzel2009parallel}. The metaprover / collection of external automated provers supporting interactive proofs is called {\em Sledgehammer}. The standard provers are called {\em E}, {\em SPASS}, {\em Vampire} and {\em Z3}. Fig. \ref{fig:sledgehammer} (page \pageref{fig:sledgehammer}) depicts the view on \textit{Sledgehammer} in \textit{Isabelle/jEdit}.

   208 \begin{figure}

   209 \centering

   210 \includegraphics[width=.95\textwidth]{sledgehammer}

   211 \caption{instance of the \textit{Sledgehammer} panel.}

   212 \label{fig:sledgehammer}

   213 \end{figure}

   214 In those cases where theorems are invalid and therefore cannot be proved, two disproof tools called {\em Quickcheck} and {\em Nitpick} can be utilized for producing counterexamples. See \cite{blanchette2011automatic} for a more detailed overview on the integration of these external tools.

   215 % proof planner Isa?

   217 %\begin{figure}

   218 %\centering

   219 %\includegraphics[width=.65\textwidth]{lcf_concepts.png}

   220 %\caption{Main concepts of the LCF approach \cite{brandt2007theorem}}

   221 %\label{fig:lcf-concepts}

   222 %\end{figure}

   224 %\begin{figure}

   225 %\centering

   226 %\includegraphics[width=.65\textwidth]{isaisar_concepts.png}

   227 %\caption{Main concepts of Isabelle/Isar \cite{brandt2007theorem}}

   228 %\label{fig:isaisar-concepts}

   229 %\end{figure}

   232 \subsection{Development and Documentation}

   233 \label{sec:isadevdoc}

   234 Apart from the technologies used in \isabelle{}, also \isabelle{}'s development process and system documentation had an impact on this thesis. The adaption to \isabelle{}'s specifics was quite challenging. \isabelle{}'s development process is much faster and much more radical as compared to commercial software development due to the specific kind of development. This development is led by three persons (see introduction to section \ref{sec:isabelle}, two of them supervising and also writing code themselves) and executed by a small team of about ten postdocs and PhDs, each working under contracts for three years at least. The members grow into the team by doing projects and master theses before entering the team. According to my supervisor, this particular kind of development leads to the following way of system documentation which is given by:

   235 \begin{enumerate}

   236 \item{\bf The code} itself, without any further comments except in few specific cases. This is common to functional programming and very different from what one is used to from e.g. \textit{Javadoc}. The code can be inspected in the repository, see pt. \ref{enum:isa-repos} below.

   237 \item\label{enum:isa-man}{\bf Reference manuals and tutorials}, which can be found online \cite{tum2013isadoc} and are easily accessible from \textit{Isabelle/jEdit} (section \ref{ssec:isa-jedit}). Both, manuals and tutorials are automatically held in sync with the code by specifically developed mechanisms such as antiquotations (section \ref{sec:bigisaml}, page \pageref{sec:antiquot}), e.g. {\tt @\{type \dots\}}, {\tt @\{const \dots\}}, etc. These entries in the documentation raise exceptions in case some type or constant has changed in the code.

   238 \item\label{enum:isa-repos}{\bf The repository}, publicly readable \cite{tum2014repo} is the third pillar for documentation. Commits must follow a {\em minimal changeset} policy which allows for quick queries and documents particular interdependencies across several files. This kind of documentation of complicated connections is very efficient and always up to date even with frequent changes.

   239 \end{enumerate}

   240 \noindent

   241 This documentation model has evolved during more than two decades and still allows for ongoing rapid development with invasive changes, last but not least parallelization. However, relevant documentation for this thesis is marginal. Parallelization, as already mentioned, has been done by Makarius Wenzel and was preceded by a feasibility study conducted by a master student \cite{haller2006object}. Wenzel's respective papers \cite{matthews2010efficient,wenzel1999isar,wenzel2012asynchronous, wenzel2011isabelle,wenzel2012isabelle} concern fundamental considerations. The most relevant reference manual \cite{wenzel2013impl} contains less than ten pages on parallelization. All the rest had to be found in the code (several hundred thousand LOCs\footnote{{\em lines of code}}). The gap between papers and code was hard to bridge.

   243 \subsection{Reasons for Parallelism in \isabelle{}}

   244 \label{ssec:isab-parallel}

   245 \isabelle{} is the first theorem prover which is able to exploit multi-core hardware. The reasons for introducing parallelism in \isabelle{} are directed towards the future and complement considerations discussed for functional programming. There are three motivations for parallelization mentioned for \isabelle{}. They have all been addressed at once: parallel invocation of multiple automated provers, a responsive prover IDE and efficient theory evaluation envisage better usability in engineering practice. All three development issues are successfully being tackled, an end cannot be foreseen yet.

   247 \subsubsection{Run Automated Provers in Parallel}

   248 \isabelle{}'s collection of automated provers was mentioned in section \ref{sec:isaatp}. It still has a mechanism to connect with remote prover instances and these already featured parallel access to external resources. Since personal computers and laptops have become powerful enough to run \isabelle{}, this hardware has also become powerful enough to run automated provers. But running them in parallel on a local machine required a redesign.

   250 \subsubsection{Provide a Tool Usable in Engineering Practice}

   251 The paragraph about {\em Continuous Penetration of Computer Science} by formal methods in section \ref{par:invade-cs} (page \pageref{par:invade-cs}) mentioned the increasing demand for theorem provers in the practice of software engineering and related mathematics. In particular, software engineers are used to comprehensive support by integrated development environments (IDEs): interlinked code, various views on relations between code elements, etc. Since TPs have been used by experts only until the presence, specific user interfaces were sufficient. For \isabelle{} this was {\em Proof General} (see section \ref{ssec:isa-jedit}), which is based on \textit{Emacs}. Those who still use this editor which was dominant during the last decades, know how different \textit{Emacs} is: not reasonable for engineering practice of today. So the issue was to meet the usability requirements of contemporary engineering practice. \isabelle{}'s attempt to meet such requirements is the {\em PIDE} project (section \ref{ssec:isa-scala}). \textit{Isabelle/jEdit} (section \ref{ssec:isa-jedit}) is the reference implementation, which is accompanied by development of an \textit{Eclipse} plugin \cite{andriusvelykis2013isaclipse} and a browser-based front-end under construction.

   253 \subsubsection{Accelerate theory evaluation}

   254 My supervisor's first experience with \isabelle{} was about fifteen years ago on a high-end SPARK workstation. The initial evaluation of \textit{Isabelle/HOL} took more than twenty minutes; not to speak of the considerable amount of time an experienced system administrator needed for installation. Since this time \textit{Isabelle/HOL} has become about three times larger, but the initial evaluation of theories takes about two minutes for recent \isabelle{} releases. \textit{Isabelle/jEdit} represents theory evaluation as shown in fig. \ref{fig:jedit_thy} on page \pageref{fig:jedit_thy}. Parallel branches in the dependency graph are evaluated in parallel. Without this kind of speedup, large theory developments involving several dozens of theories would not be manageable. The most recent \isabelle{} releases do not package binaries of evaluated theories any more. Instead, they pack further components into the bundle. The bundle of the current release is more than 400MB for download. Also, all other bandwidth aspects of \isabelle{} are at the limits of present mainstream hardware. \isabelle{} does not run below 4GB main memory and requires a dual-core processor with more than 2GHz per core.

   257 \section{\isac{}}

   258 \label{sec:isac}

   259 \isac{} is a prototype in both, in the mathematics-engine based on \isabelle{} and written in \textit{Standard ML} as well as in the front-end involving rule-based dialogs and written in \textit{Java}. The mathematics-engine heavily reuses technology from \isabelle{} and thus comprises a comparably small portion of code: 25,000 LOCs\footnote{{\em lines of code}} plus 15,000 LOCs of knowledge in theories compiled to \textit{Standard ML} for fast access and 40,000 LOCs for tests. This code features functionality which is radically new for educational mathematics systems. The gain from TP technology is explained in section \ref{ssec:isac-design}. The front-end exploits the mathematics-engine's TP-based functionality for dialogs on a new level of flexibility and user guidance. The gain from rule-based dialog guidance is addressed in section \ref{ssec:isac-archi}. Since \isabelle{} is still under rapid development, the principal question arises, whether \isac's prototype development can persistently adjust to \isabelle{}, an issue addressed in section \ref{ssec:future-isab}. Otherwise the success achieved with the case study on parallelization would be worthless in the long term. The description of the case study is comprised of two parts: the integration with \isabelle{}'s parallel theory evaluation is described in section \ref{ssec:parall-thy} and section \ref{ssec:session-manag} discusses the parallelization of \isac{}'s user session management.

   261 \subsection{Principal Design Decisions}

   262 \label{ssec:isac-design}

   263 \isac{} is still the only educational mathematics system worldwide which adheres to concepts of TP and which is built reusing TP technology, except for a system under construction by the Scandinavian \textit{E-Math} project \cite{emath2013emath}. All the other math tutoring systems are based on computer algebra. Thus, the distinctive design decision to rely on TP deserves an explanation.

   265 \paragraph{TP concepts ensure logical correctness} in stepwise problem solving. When learning independently from a human tutor or supervisor, a student wants to know whether the calculations are done correctly or not. An educational software system supporting this expectation cannot be based on computer algebra systems. These are in no way related to logics (which is an unresolved issue for computer algebra, for instance at \textit{RISC Linz}), so true / false or correct / not correct requires underpinning with logical concepts as is the case e.g. in the {\em Theorema} project at \textit{RISC Linz}. The system under construction in \textit{E-Math} supports so called {\em structured derivations} which are close to calculations written by hand (see fig. \ref{fig:qed-editor}).

   267 \begin{figure}

   268 \centering

   269 \includegraphics[width=.95\textwidth]{QED_Editor}

   270 \caption{Structured derivation editor (source: \cite{back2010structured}).}

   271 \label{fig:qed-editor}

   272 \end{figure}

   274 \noindent

   275 This kind of calculation is proved logically equivalent \cite{back2010structured} to natural deduction, which forms the logical foundation of \isabelle{} and other provers as mentioned in section \ref{sssec:holfam}.

   277 With the decision for \isabelle{}, \isac{} intentionally dissociates itself from the kind of artificial intelligence that tries to imitate human behavior. \isac{} does not intend to mimic a human tutor by modeling the system with respect to human thought, human comprehension or human learning. Rather, \isac{} is designed as a model of mathematics, following the structure of mathematics from the very bottom, from formal logics as implemented in \isabelle{}. However, AI technologies can take profit from the strength, the generality and the flexibility of the mathematics-engine resulting from this design. The principal strengths of \isac's math engine are described in the following paragraphs.

   279 \paragraph{TP technology features new functionality} of \isac's mathematics-engine. The following features arise naturally from TP:

   280 \begin{description}

   281 \sloppy

   282 \item[Check students' input automatically, flexibly and reliably.] Given a problem of applied mathematics by a formal specification (input, output, precondition and postcondition; generally hidden from the student), any formula (or a theorem for application) input by the student establishes a proof situation. Automated provers then derive, if possible, the input from the logical context. This is what computer theorem provers are built for. They represent the technology for accomplishing this task as generally and reliably as possible at the present state of the art in computer mathematics.

   283 \item[Give explanations on request by students.] Provers designed on the basis of the \textit{LCF principle} (section \ref{sssec:lcf}) such as \isabelle{}, derive all knowledge from ``first principles''. For each definition and for each theorem the respective prerequisites can be traced down to the most fundamental axioms. All this knowledge is represented in human readable form \cite{tum2013isahol}. Thus the situation in systems like \isabelle{} is quite different from computer algebra systems. While documentation for the latter requires additional efforts for creation, all documentation is already there in \isabelle{}. Generation of explanations calls for filtering out details, which could distract or overwhelm students. Another nice feature of \textit{Isabelle/Isar}, although not relevant for applied mathematics, are human readable proofs close to mathematical traditions (see section \ref{sec:isaisar} and appendix \ref{app:sqrt2-isar}).

   284 \item[Propose a next step if students get stuck.] This feature seems indispensable for independent learning. However, this exceeds the traditional discipline of TP. E.g., {\it lemma} $\int{1+x+x^2\;{\it dx}} = x + \frac{x^2}{2} + \frac{x^3}{3} + c$ can be proved, but given $\int{1+x+x^2\;{\it dx}}$ there is no direct way to calculate the result $x + \frac{x^2}{2} + \frac{x^3}{3} + c$ in TP. Since indispensable, \isac{} provides this feature by a novel combination of deduction and computation, called {\em Lucas-Interpretation} \cite{neuper2012automated}. \textit{Lucas-Interpretation} works on programs which describe how to solve a problem of applied mathematics in a quite traditional way. In addition to maintaining an environment as usual for interpretation, \textit{Lucas-Interpretation} maintains a logical context, such that automated prov\-ers are provided with all current knowledge required to check user input. \cite{daroczy2013error} gives a brief introduction to \textit{Lucas-Interpretation}.

   285 \end{description}

   286 \noindent

   287 These three features together establish a powerful interface to the mathe\-matics-engine. With these given, interaction can be analogous to learning how to play chess in interaction with a chess program. The player executes a move (the student inputs a formula) and the program checks the correctness of the move (\isac{} checks the input). The program performs a move (\isac{} proposes a next step) and the player accepts or changes it. If the player is in danger of losing the game (the student gets stuck within the steps towards a problem solution), they can go back a few moves and try alternative moves. Or the player can even switch roles with the system and watch how the system copes with the unfortunate configuration. All these choices are available to a student analogously when learning mathematics in interaction with \isac.

   289 The comparison of interaction in \isac{} with interaction in playing chess makes clear, that learning with a mechanical system does not necessarily lead to drill and practice in operational mechanization. The degree of freedom in interaction allows for fostering of the evaluation of alternatives and thus learning by trial and error, comparing strategies, etc.

   291 \subsection{Architecture Separating Mathematics and Dialogs}

   292 \label{ssec:isac-archi}

   293 The strengths of \isac's mathematics-engine, as described above, provide powerful features for human-machine interaction and thus call for concepts and technology from AI. In order to cope with increasing complexity, a separation of concerns between AI and mathematics is required.

   295 \subsubsection{Separation of Concerns between Dialogs and Mathematics}

   296 This separation is reflected in \isac's architecture shown in fig. \ref{fig:isac-archi}.

   297 \begin{figure}

   298 \centering

   299 \def\svgwidth{32.5em}

   300 \resizebox{\textwidth}{!}{\input{images/isac_arch.pdf_tex}}

   301 \caption{\isac{}'s architecture as seen by users.}

   302 \label{fig:isac-archi}

   303 \end{figure}

   304 The front-end on the left within a \textit{Java} environment comprises {\em GUI 1..n}, an arbitrary number of remote devices for input and output, the \isac{} {\em server} with the {\em dialog} component. The {\em dialog} component comprises a rule-based system as described in \cite{kienleitner2012towards}, not shown in the figure. {\em Dialog authors} are not concerned with mathematics and are free to focus on the complexity of dialog guidance, see \cite{daroczy2013error} and an example from the paper in fig. \ref{fig:interaction}.

   305 \begin{figure}

   306 \centering

   307 \begin{tikzpicture}[

   308   font=\footnotesize,

   309   >=latex,

   310   arr/.style = {<-,shorten <=2.5em, shorten >=2em,color=color1,line width=1.5pt},

   311   block/.style = {align=center, anchor=north, inner sep=2},

   312   formula/.style = {font=\large}

   313 ]

   315   \node [rectangle,rounded corners,very thick,draw=color1,block,formula,inner sep=5] (a) {$\frac{6a+3b}{3b} =\;?$};

   317   \node [anchor=north,minimum width=14em,minimum height=18em,fill=color5] (student) at (-3.3,-1.5) {};

   318   \node [anchor=north,minimum width=14em,minimum height=18em,fill=color5] (computer) at (3.3,-1.5) {};

   320   \node [anchor=north west,color=color2,font=\bf] (slabel) at (-5.96,-1.55) {student};

   322   \node [anchor=north west,color=color2,font=\bf] (clabel) at (0.69,-1.57) {computer};

   325   \node [block,formula] (b) at (-3.3,-2.4) {$\frac{6a+3b}{3b} = 6a$}

   326     edge[arr] (a);

   328   \node [block,text width=13em] (c) at (3.3,-2)

   329 {\begin{center}

   330 ``You can't simplify like that.\\

   331 Try to make products out of both,\\

   332 nominator and denominator.\\

   333 \medskip

   334 Here is a link to look up.''

   335 \end{center}}

   336     edge[arr] (b);

   338   \node [block,below=2.2em of b] (d) {\includegraphics[width=5.3em]{next_btn}}

   339     edge[arr] (c);

   341   \node [block,formula,below=2em of c] (e) {$\frac{6a+3b}{3b} = \frac{..(..+..)}{..}$}

   342     edge[arr] (d);

   344   \node [block,below=2.2em of d] (f) {\includegraphics[width=5.3em]{next_btn}}

   345     edge[arr] (e);

   347   \node [block,formula,below=2em of e] (g) {$\frac{6a+3b}{3b} = \frac{3(2a+..)}{..}$}

   348     edge[arr] (f);

   350   \node [block,formula,below=2.2em of f] (h) {$\frac{6a+3b}{3b} = \frac{2a+b}{b}$}

   351     edge[arr] (g);

   353 \end{tikzpicture}

   354 \caption{Interaction with \isac{} \cite{daroczy2013error}.}

   355 \label{fig:interaction}

   356 \end{figure}

   357 In fig. \ref{fig:isac-archi}, the server accesses the {\em knowledge} in HTML representation, both in turn administered by a learning management system like {\em moodle}.

   358 Separated from the front-end concerned with dialogs etc. is the \textit{Standard ML} environment, the implementation language of \isabelle{} and \isac{}: the {\em math-engine} together with {\em programs} for \textit{Lucas-Interpretation} and {\em knowledge} in \textit{Standard ML} representation for fast access. The math-engine builds upon {\em Isabelle} by reusing the term structure of simply typed lambda calculus and respective parsing, pretty printing, matching as well as management of {\em knowledge} within theories. The orange color in fig. \ref{fig:isac-archi} indicates parallelization in order to exploit availability of an arbitrary number of cores ({\em core 1..n}). In 2009, \isabelle{} introduced specific mechanisms to cope with multi-core environments. The central \isabelle{} component for parallelization is the {\em Scala bridge} (section \ref{ssec:isa-scala}) supporting asynchronous user interaction \cite{wenzel2012asynchronous} and parallel proof checking \cite{wenzel2009parallel} via the editor {\em jEdit} (section \ref{ssec:isa-jedit}). \isabelle{}'s parallel mechanisms are reused in the case study. The horizontal orange bar above {\em Isabelle} indicates that parallelization of \isabelle{} affected \isac's management of {\em knowledge} in \textit{Standard ML} representation. This part of the case study is described in section \ref{ssec:parall-thy}. The vertical orange bar to the left of the {\em \isac{} math-engine} indicates the concurrent user session management accomplished by the second part of the case study, which is described in section \ref{ssec:session-manag}.

   360 \subsubsection{Modeling the Universe of Mathematics}

   361 This goes beyond \isabelle{}'s way of modeling the deductive dimension of knowledge. Due to a suggestion by Bruno Buchberger in 2001, \isac{} now implements two additional dimensions, together spanning a three-dimensional space:

   362 \begin{enumerate}

   363 \item\label{enum:thy}{\bf Deductive knowledge} is defined and managed in TP. Beginning with the basic axioms (axioms of higher-order logic in \textit{Isabelle/HOL} used by \isac), definitions are given and theorems about these definitions are proved, theory by theory. \isabelle{} provides an elegant mechanism for defining new syntax, which results in a language close to traditional mathematical notation. The subset of \isabelle{} knowledge presently in use by \isac{} can be found online \cite{isac2014knowledge}.

   364 \item\label{enum:pbl}{\bf Application-oriented knowledge} is given by formal specifications of problems in applied mathematics. The specification of all problems, which can be solved automatically by \isac{} (with the help of algorithms, see below), is available online \cite{isac2014pbl}.

   365 This collection is structured as a tree. Given a tree, automated problem refinement is possible, e.g. starting from the specification at the node of {\em univariate equations} \cite{isac2014pblequuniv} a certain equation can be matched with all child nodes until the appropriate type of equation has been identified. In case a match has been found, such a node contains a pointer to a program solving the specific type of equation.

   366 This kind of problem refinement makes transparent, what is done by computer algebra systems as black boxes.

   367 \item\label{enum:met}{\bf Algorithmic knowledge} is given by programs to undergo \textit{Lucas-Interpretation}. The collection of programs is structured as a tree. However, principal requirements on the structure are still unclear. There are ideas of how to group algorithms in \cite{buchberger1984mathematik} but the ideas are too vague for mechanization. \isac's programs can be found online \cite{isac2014met}.

   368 \end{enumerate}

   369 The structure of the knowledge collections of pt. \ref{enum:pbl} and \ref{enum:met} are different from pt. \ref{enum:thy}. E.g. linear equations in pt. \ref{enum:pbl} are the same for rational numbers and complex numbers, but the definition of rationals is far away from complex numbers in pt. \ref{enum:thy}.

   370 So the implementation of the latter two collections is orthogonal to the first, the collection structured and managed by \isabelle{}. The only way of implementation was by {\tt Unsynchronized.ref}, i.e. by global references. Write access to these references assumed a certain sequence in evaluation of the theories. This assumption broke with the introduction of parallel execution in \isabelle{}. Section \ref{ssec:parall-thy} explains how this issue was resolved.

   372 \subsection{Relation to Ongoing \isabelle{} Development}

   373 \label{ssec:future-isab}

   374 In my Bachelor's project \cite{lehnfeld2011verbindung} I already had the chance to contribute to \isac{}. From my experience with \isac{} I can give the following overview of \isac{}'s development.

   376 Although being active for more than two decades, \isabelle{} development still undergoes invasive reforms concerning localization, parallelization, etc. See the NEWS file \cite{isabelle2013news} for details. These reforms cause significant changes in module signatures and in functionality more or less with each release. Many of these changes enforce considerable updates of \isac{} because \isac{} extracts functionality from \isabelle{} at a rather deep level, apart from public interfaces.

   377 Updating \isac{} in accordance to the regular \isabelle{} releases is crucial: Between 2002 and 2010 \isac{} development focused on the front-end. In 2010, \textit{Lucas-Interpretation} \cite{neuper2012automated} was clarified and \isabelle{}'s contexts appeared indispensable. Contexts appeared in \isabelle{} after 2002, so an update was necessary. However, updating \isac's math-engine from \textit{Isabelle2002} to \textit{Isabelle2009-2} turned out to be very hard and it was done during and after my Bachelor's project.

   378 Ever since this unpleasant experience \isac{} has been updated with each official \isabelle{} release which are published every eight to twelve months. In addition to these regular updates there are major tasks concerning further adoption of \isabelle{}'s mechanisms which are open due to early design decisions of \isac.

   380 \paragraph{Adopt logical contexts from \isabelle{}.}

   381 \isac's mathematics-engine was developed between 2000 and 2002, long before \isabelle{} introduced logical contexts \cite{wenzel2013impl}. Without these contexts the user needed to explicitly fix the type of variables during input. This was very inconvenient. I implemented my Bachelor's project in time to introduce contexts to \isac{} and gain first experiences with the system.

   382 Experience with contexts also contributed to the theoretical clarification of \textit{Lucas-Interpretation} \cite{neuper2012automated}.

   384 \paragraph{Adapt to \isabelle{}'s parallel theory evaluation.}

   385 Since \isabelle{} traverses its theory dependency graph in parallel, \isac{} can no longer rely on a particular, deterministic execution order. Fixing the problems which arose from this fact and integrating the related datastructures into \isabelle{}'s theory data management model is one of the big achievements of the project work that this thesis is based on. The details of this process are outlined in section \ref{ssec:parall-thy}.

   387 \paragraph{Exploit parallelism for concurrent session management.}

   388 This task addresses the utilization of \isabelle{}'s new mechanisms for parallelism, most notably futures, for \isac{}'s user session management in order to allow for a high number of concurrent, efficiently executed calculations. Section \ref{ssec:session-manag} documents this step in more depth.

   390 \paragraph{Make \isac's programming language usable.} \isac's programming language was implemented between 2000 and 2002, long before \isabelle{}'s function package was available \cite{krauss2013defining}. \textit{Lucas-Interpretation} on programs in this language uses a variety of rulesets for rewriting, too complicated for an average programmer \cite{rocnik2012interactive}.

   391 Considerable simplification of programming can be expected by narrowing \isac's programming language towards \isabelle{}'s function package. In particular, components of \isabelle{}'s code generator \cite{haftmann2010code} promise to greatly improve computational efficiency.

   393 \paragraph{Adopt \isabelle{}'s numeral computation for \isac{}.}

   394 The initial conception of \isac{} emphasized applied mathematics, which is unrealistic without floating point numbers. There were early attempts to implement the latter together with complex numbers. Both, floating point numbers and complex numbers have in the meantime been implemented in \isabelle{}. These shall now be adopted for \isac{}.

   396 \paragraph{Improve the efficiency of \isac{}'s rewrite engine.}

   397 \isac's initial design stressed the importance of computations closely resembling work by paper and pencil. Major parts of computation happen by rewriting, so a large number of rewrite rules need to form groups which as such mimic the step width of human computations.

   398 \isabelle{}'s rewrite engine meets opposite requirements: provide a maximum of automation with large rule sets (``simpsets''). Only recently, frequent user requests motivated projects in the \isabelle{} development to provide readable traces for rewriting.

   399 However, the requirements of \isabelle{} and \isac{} are too different and \isac{} will keep its own rewrite engine. But the mechanism of {\em term nets} \cite{charniak2014artificial} could be adopted to improve \isac's efficiency.

   401 \paragraph{Adopt \textit{Isabelle/jEdit} for \isac{}.}

   402 \isac's initial design also stressed usability of the front-end for students. At this time \isabelle{}'s front-end was {\em Proof General} \cite{aspinall2000proof}. Thus there was no choice but to develop a custom GUI for \isac{}.

   403 In the meantime \textit{Isabelle/jEdit} has become an appealing prover IDE. However, the requirements are presently quite opposing. \isabelle{} connects one engineer with a multitude of cores while \isac{} connects a multitude of students with one server. \isac's centralized structure has good reasons: groups of students have one lecturer, groups of students are examined together, etc.

   404 During the next years, \isac{}'s front-end will be developed and improved independently from \isabelle{}. Narrowing towards \isabelle{} can start as soon as \textit{Isabelle/jEdit} moves towards collaborative work, implements session management, version management, etc.

   406 \subsubsection{Summary on \isac's Approach to \isabelle{}}

   407 Above, those points are listed, where \isac's math-engine is supposed to approach \isabelle{} in future development, as this thesis already has begun. Development of \isac's math-engine is separated from the development of \isac's front-end. Section \ref{ssec:isac-archi} describes \isac's {\em architecture {\em separating} mathematics and dialogs}.

   408 Development of the front-end addresses the expertise represented by our {\em University of Applied Sciences} in Hagenberg: human-centered computing, interactive media, mobile computing, secure information systems, software engineering, etc. \isac's development efforts in these disciplines can be planned independently from the above list for several years, until \textit{Isabelle/jEdit} is ready to be adopted for \isac{} in a particularly interesting development effort.

   411 \section{Practical Work on \isac{}}

   412 \label{sec:contrib}

   413 This section documents the implementation work that has been carried out as project work in preparation for and along with the writing of this thesis. During the first big step, the breakdown of \isac{}'s theory evaluation caused by an invasive update to \isabelle{}'s mechanisms was fixed (section \ref{ssec:parall-thy}). This made possible the introduction of concurrency to \isac{}'s user session management and thus enabled the parallel computation of independent calculations. This process is described in section \ref{ssec:session-manag}. Section \ref{sec:performance} discusses the effects of this process on \isac{}'s performance. Finally, \isac{}'s project status is summarized in section \ref{sec:proj-stat}.

   416 \subsection{Integration With \isabelle{}'s Parallel Theory Evaluation}

   417 \label{ssec:parall-thy}

   418 In older \isabelle{} versions, it was not possible for programmers to append arbitrary data to the datastructure which holds all the theories. {\em Isabelle2005} \cite{isabelle2013news} then introduced a concept called {\em theory data}, which allowed developers to store user-defined datastructures in a theory's context from \textit{ML} code embedded in that theory.

   420 \paragraph{Data flow.} In order to understand the following paragraphs, we need to know how \isabelle{} computes theories and what the theory data flow looks like. Section \ref{sec:isaisar} already mentioned, that the inheritance structure of \isabelle{}'s theories forms an acyclic, directed graph. Computation begins at the root theories which are determined backwards from the current working theory. The resulting data is then available in all descendant theories. Whenever a theory has more than one ancestor, the derived data is merged.

   422 \bigskip

   424 The last sections have outlined the conceptual and architectural differences between \isabelle{} and \isac{}. \isac{} needs to manage application-oriented and algorithmic knowledge which originally could not be integrated with the deductive structure of \isabelle{}'s theories. Therefore these data were stored in raw ML references. {\em Isabelle2009} then introduced parallelism. Now the computation order of the theories was not deterministic anymore and \isac{} could not ensure that its \textit{ML} references were accessed and updated in the same order as previously. This caused certain parts of the system to show faulty behavior in some cases. {\em Isabelle2009-1} then added the wrapper structure {\tt Unsynchronized.ref} to denote that these references are not synchronized with the parallel environment managed by \isabelle{}.

   425 While the temporary deactivation of faulty modules and certain work\-arounds fixed most problems, the parallelization of \isac{}'s user session management required that most relevant data be properly managed by \isabelle{}. Therefore the preparatory step for the parallelization was the integration of the unsynchronized references in question with \isabelle{} by means of the functor {\tt Theory\_Data} \cite{wenzel2013impl}.

   426 It is important to mention that the \isabelle{} implementation manual \cite{wenzel2013impl} warns to be careful about using too many datastructures on the basis of {\tt Theory\_Data} because they are newly initialized for every single theory that derives from the respective module that declared the datastructures. Thus, space is reserved which can cause a significantly increased memory footprint. Most of the overhead, however, occurs when theories are loaded dynamically. When working with a compiled \isac{} system the overhead should be reasonable.

   428 The workflow was implemented for several central elements and broken down into five isolated steps to conform with \isabelle{}'s minimal changeset development and documentation model (section \ref{sec:isadevdoc}):

   429 \begin{enumerate}

   430   \item\label{enum:isolate} {\bf Isolate access.} In this step, read accesses were centralized by wrapping references in an access function and replacing all occurrences with a call to the function:

   431 \imlcode{~~\=val foo = Unsynchronized.ref [1,2,3]; (*declaration*)\\

   432 \>fun get\_foo () = !foo; (*new*)\\

   433 \>fun add x y = x + y;\\

   434 \>fun foo\_sum () = fold add (!foo) 0; (*obsolete*)\\

   435 \>fun foo\_sum () = fold add (get\_foo ()) 0; (*new*)}

   437   \item\label{enum:addthydata} {\bf Add {\tt Theory\_Data} access functions} in parallel to the existing ones. This includes a new version of the getter function ({\tt get\_foo} in pt. \ref{enum:isolate}). But first we need to define a datastructure that \isabelle{} can manage. Thus, we need to implement the functor \texttt{Theory\_Data} for our specific datatype\footnote{see appendix \ref{app:merge-lists} for definitions of {\tt merge\_lists(')}}:

   438 \imlcode{\label{alg:thydata-functor}

   439 ~~\=st\=ructure Foo\_Data = Theory\_Data (\\

   440 \>\>type T = int list;\\

   441 \>\>val empty = [];\\

   442 \>\>val extend = I; (*identity*)\\

   443 \>\>val merge = merge\_lists;\\

   444 \>);}

   445 \noindent

   446 {\tt T} is the type of the data. {\tt empty} is the initial value for every theory that does not define this particular data slot content. {\tt extend} is used for reinitialization on import of a theory and can be understood as a unitary {\tt merge}. Finally, {\tt merge} declares, how the data slot contents of two theories are to be joined. Unlike the operations that we are replacing during this process, the two simple access functions

   447 \imlcode{~~\=val get\_foo' = Foo\_Data.get;\\

   448 \>fun put\_foo' xs = Foo\_Data.map (merge\_lists' xs);}

   449 \noindent

   450 additionally require the target theory as input parameter. Because \textit{ML} code is always embedded in \isabelle{} theories, they can operate on any available theory according to the dependency graph. Each theory stores its own version of the data slot. A new version of {\tt get\_foo},

   451 \imlcode{~~fun get\_foo () = get\_foo' @\{theory\}; \textrm{,}}

   452 which is not yet in use during this phase, could replace \texttt{get\_foo} from pt. \ref{enum:isolate}. This definition uses antiquotations (section \ref{sec:antiquot}, page \pageref{sec:antiquot}). This means that {\tt @\{theory\}} always refers to the theory where the function definition is located and is thus resolved at compile time. The alternative approach

   453 \imlcode{~~fun get\_foo () = get\_ref\_thy () |> get\_foo';}

   454 \noindent

   455 is more flexible in that {\tt get\_foo'} operates on a reference theory set by the programmer (details on {\tt get\_ref\_thy} are explained in the subsequent section) and passes it to {\tt get\_foo'} from above. The last aspect of this step is the addition of {\bf setup} blocks \cite{wenzel2013isar} where previously the raw references had been updated. These blocks must contain one \textit{ML} expression which represents a mapping between two {\tt theory}s. As a consequence, \isabelle{} updates the current theory context with the function's result in these locations, e.g.

   456 \imlcode{~~setup \{* put\_foo' [4,5,6] *\} \textit{.}}

   458   \item\label{enum:checkdiff} {\bf Check differences.} Now that both, the old unsynchronized reference and the new stored as theory data, are created and updated, we can compare their contents and make sure that the last steps were successful. Due to the nature of the datastructures and the fact that the new method caused a different ordering to some of the elements, the most convenient solution was to compute a string representation from both results, write them into documents and compare the documents using a file comparison utility. If any errors occurred, step \ref{enum:isolate} and \ref{enum:addthydata} required revision.

   460   \item\label{enum:shiftthydata} {\bf Shift to theory data.} Optimally, this step merely meant exchanging the old definition of {\tt get\_foo} (pt. \ref{enum:isolate}) for the new one (pt. \ref{enum:addthydata}).

   462   \item\label{enum:removecode} {\bf Cleanup.} Finally, we can remove all code concerned with the unsynchronized reference.

   464 \end{enumerate}

   466 Besides acquiring the necessary knowledge on how to store and access arbitrary datastructures in a theory's context, the biggest challenges included understanding and merging the different kinds of custom datastructures and keeping the solutions simple; optimally simpler than the previous code. Also, I had to adjust the theory hierarchy and add new theories in order to keep a clean structure and ensure that the availability and computation of datastructures was sound and behaved as it had previously. Some of the data dependencies had not been reflected in the dependency graph but had rather been fulfilled by a fortunate execution order.

   468 As a result of this contribution, most of the stateful components in \isac{}, which had been necessary to circumvent mismatches between \isac{}'s and \isabelle{}'s architectures, were eradicated from the system. Those which are still not synchronized are currently being replaced. However, they are not accessed by theories potentially executed in parallel and therefore do not interfere with \isac{}'s parallel user session management.

   470 % calcelems.sml:839

   471 % merge tree into another tree (not bidirectional!)

   472 %(* this function only works wrt. the way Isabelle evaluates Theories and is not a general merge

   473 %  function for trees / ptyps *)

   474 %fun merge_ptyps ([], pt) = pt

   475 %  | merge_ptyps (pt, []) = pt

   476 %  | merge_ptyps ((x' as Ptyp (k, x, ps)) :: xs, (xs' as Ptyp (k', y, ps') :: ys)) =

   477 %      if k = k'

   478 %      then Ptyp (k, y, merge_ptyps (ps, ps')) :: merge_ptyps (xs, ys)

   479 %      else x' :: merge_ptyps (xs, xs');

   481 %fun merge_tree

   483 % challenges:

   484 %   ensure that required functions are available wherever used/needed (change inheritance structure, introduce new theories)

   485 %   don't add to complexity

   486 %   datastructures different (lists, trees)

   488 %DONE:

   489 %> Offensichtlich sind die Umstände durch die unterschiedlichen Architekturen und Requirements der beiden Systeme und der ständigen Erweiterung Isabelles entstanden.

   490 %Probleme sind durch die Weiterentwicklung von Isabelle entstanden, unterschiedliche Architekturen und Requirements spielen hier keine Rolle:

   491 %ISAC's "stateful components" haben aufgrund von Isabelle's Parallelisierung "non-deterministic behaviour" bekommen. In ISAC betraf dies Unsynchronized.ref.

   492 %> Meine momentane Arbeit ist notwendig, um dann ISAC zu parallelisieren, hat aber forschungstechnisch weniger mit der Parallelisierung funktionaler Programme zu tun, sondern mit Software Engineering Prozessen speziell in Standard ML, da es mutable Data erlaubt - oder?

   493 %"mutable data", die "Unsynchronized.ref" waren notwendig, um der "deductive structure of Isabelle theories" zu entkommen und "Application-oriented knowledge" sowie "Algorithmic knowledge" hinzuzufügen.

   494 %Hier ging es also um "Architectural Design", das nur so unschön zu implementieren war.

   495 %Die bisherige Arbeit war notwendig, weil ISAC mit den "stateful components" auf dem parallelsierten Isabelle schlichtweg nicht mehr funktionierte.

   496 %> Bei einer Implementierung mittels einer rein funktionalen Sprache hätte es diese Probleme maximal in anderer Form gegegeben.

   497 %Genau.

   499 %DONE:

   500 %>> The init operation is supposed to produce a pure value from the given back-

   501 %>> ground theory and should be somehow “immediate”. Whenever a proof con-

   502 %>> text is initialized, which happens frequently, the the system invokes the init

   503 %>> operation of all theory data slots ever declared. This also means that one

   504 %>> needs to be economic about the total number of proof data declarations in

   505 %>> the system, i.e. each ML module should declare at most one, sometimes two

   506 %>> data slots for its internal use. Repeated data declarations to simulate a

   507 %>> record type should be avoided!

   508 %> (S. 46)

   509 %>

   510 %> Ich weiß nicht, ob das eh vernachlässigbar ist.

   511 %Nicht vernachl"ssigen --- sondern in 4.2.4 erkl"aren:

   512 %# ISAC erzeugt eine gewaltige Menge von Theory_Data

   513 %# Theory_Data werden von Theory zu Theory 'vererbt' --- das erzeugt Overhead, wenn Theories dynamisch %geladen werden

   514 %# ISAC braucht aber kein dynamisches Laden von Theories --- alle Theories sind in einer Session zusammengefasst, die 1-mal beim Hochfahren geladen wird.

   517 \subsection{Introduction of Concurrent User Session Management}

   518 \label{ssec:session-manag}

   519 \isabelle{}'s programming conventions prohibit user programs based on \isabelle{} from forking their own \textit{ML} threads. This is necessary because \isabelle{} has its own managed environment for concurrency which takes care of thread synchronization and shared resources. Accesses from user threads could easily break this well-tested framework. After careful analysis of the \isabelle{} documentation and the code in some concurrency related \textit{Isabelle/ML} modules, our suspicion that parallelizing \isac{}'s user session management required breaking this convention was rendered unfounded. As it turned out, \isabelle{}'s implementation of {\em future}s (section \ref{sec:isafut}) provided everything we needed. As a result, the actual implementation effort was marginal. Since \isac{}'s does not maintain any problematic stateful resources anymore, there was no further effort required for synchronizing shared resources beyond those managed by \isabelle{}, except for the {\tt states} element, which maintains a state for each active calculation. Instead of an {\tt Unsynchronized.ref}, it was replaced by a {\tt Synchronized.var} (see section \ref{sec:guarded-access}, page \pageref{sec:guarded-access}).

   521 The function {\tt appendFormula}, which derives the input formula from a calculation's state, had to be transformed in a way, such that it can be passed to {\tt Future.fork} which then takes care of the parallel execution.

   522 \imlcode{~~fun ap\=pendFormula params =\\

   523 \>Future.fork (fn () => appendFormula' params);}

   524 \noindent

   525 is a simplified version of the key line in this transition, with {\tt appendFormula'} being the original version of the function. The same transformation was applied to {\tt autoCalculate}, which can compute calculations step by step as well as automatically determine the final result. Many of the computations within {\tt autoCalculate} require access to the current proof context datastructure. During the previous part of the project (section \ref{ssec:parall-thy}), this was done using the function {\tt the\_generic\_context} from \isabelle{}'s {\em ML\_Context} module. When I called more than one instance of {\tt autoCalculate} simultaneously, it turned out that functions executed within a future cannot acquire access to the current proof context through this function. In practice, the context data of the theory {\em Isac} is sufficient for all calculations and could be completed and made available at compile time. However, the \isac{} test suite extends the context data, which cannot be modified at run time. In order to overcome this, there is now a {\tt Synchronized.var} at an appropriate location within the dependency graph, which determines the reference theory for all operations accessing the newly introduced {\tt Theory\_Data} elements. The test suite uses a new function {\tt set\_ref\_thy} to override the standard value which is the theory {\em Isac}.

   527 Now that all calls to {\tt the\_generic\_context} have been eliminated, invocations of {\tt autoCalculate} and {\tt appendFormula} return immediately with a future value and thus the respective operations are executed in the background.

   530 \subsection{Performance}

   531 \label{sec:performance}

   532 The main objective of my project was to improve the speed with which calculations are performed by the system when multiple user front-ends invoke the \isac{} mathematics-engine simultaneously.

   534 The change in memory usage is not our foremost concern and it can only increase linearly. The number of concurrent calculations depends on \isabelle{}'s thread pool size. Pending invocations of the mathematics-engine are problem descriptions whose memory footprint is negligible, considering a reasonable number of clients working on the same \isac{} server. One concern is the keeping of state data on a per user basis which could accumulate and pose a challenge to the server's working memory under heavy load.

   536 Temporal gains were measured on a 64-bit {\em Arch Linux} system running on a {\em Lenovo ThinkPad X230} with 4GB RAM and an {\em Intel\textregistered{} Core\texttrademark{} i5-3320M CPU} with two physical 2.60GHz cores and one additional logical core respectively (see {\em simultaneous multithreading} in section \ref{sec:multicoreproc}). The performance test file can be found on the attached DVD (see appendix \ref{app:dvd-proj}). It uses the most CPU-intensive problem instance currently available, whose calculation takes about 1.58 seconds on average. Various numbers of parallel invocations were issued and the time until all of them were finished was measured. Per test there were 20 repetitions and an average taken. All the samples are shown in fig. \ref{fig:autocalc}.

   538 \def\thinkpaddata{data/thinkpad.dat}

   539 \def\thinkpadavg{data/thinkpadavg.dat}

   540 \def\thinkpaddataseq{data/thinkpad_seq.dat}

   541 \def\thinkpadavgseq{data/thinkpadavg_seq.dat}

   543 \pgfplotsset{

   544     colormap={spectral}{[5pt]

   545         color(0pt)=(color1);

   546         color(5pt)=(color2);

   547     }

   548 }

   550 \begin{figure}

   551 \centering

   552 \begin{tikzpicture}

   553   \begin{axis}[

   554     xlabel=number of calculations $n$,

   555     ylabel=time $t$ in ms,

   556     ylabel style={at={(-0.23,0.5)}},

   557     xmin=0,

   558     xmax=15.5,

   559     xticklabel style={/pgf/number format/.cd,1000 sep={,}},

   560     xticklabel=\pgfmathparse{\tick}\pgfmathprintnumber{\pgfmathresult},

   561     ymin=0,

   562     ymax=24000,

   563     scaled y ticks = false,

   564     yticklabel style={/pgf/number format/.cd,fixed,1000 sep={,}},

   565     yticklabel=\pgfmathparse{\tick}\pgfmathprintnumber{\pgfmathresult},

   566     grid=major,

   567     legend entries={seq. samples, seq. mean, par. samples, par. mean},

   568     legend style={draw=none,at={(0.49,0.92)}},

   569     axis lines=left

   570   ]

   571     \addplot[scatter, mark=o, only marks, color=color1, point meta=explicit symbolic] table[y=t, x=n,meta index=0] {\thinkpaddataseq};

   572     \addplot[color=color1] table[y=t, x=n, meta index=0]{\thinkpadavgseq};

   573     \addplot[scatter, mark=o, only marks, color=color2, point meta=explicit symbolic] table[y=t, x=n,meta index=1] {\thinkpaddata};

   574     \addplot[color=color2] table[y=t, x=n, meta index=1]{\thinkpadavg};

   575   \end{axis}

   576 \end{tikzpicture}

   578 \caption{{\tt autoCalculate} performance comparison.}

   579 \label{fig:autocalc}

   580 \end{figure}

   582 \subsubsection{Discussion}

   583 From these results it is evident, that two parallel calculations can almost fully exploit both physical processing cores. The slight increase of processing time can be explained by a parallelization overhead caused by \isabelle{}'s future management. Another factor are the synchronized accesses to the {\tt states} datastructure. Since simultaneous multithreading cannot completely simulate additional cores, the increase in runtime between two and three parallel calculations is then slightly bigger. The same happens between six and seven parallel {\tt autoCalculate} invocations. As the number of logical cores of the used machine is four, multiples of four show a significant increase in runtime because once all cores are occupied, additional calculations need to wait. Hence, the execution time accurately doubles with the number of calculations according to the number of cores. The measured runtime of the new {\tt autoCalculate} version was only $38.7\%$ of the previous, sequential implementation with eight parallel calculations, $39.8\%$ with ten and $39.7\%$ with twelve. This is a very satisfying outcome and we trust that it will help to significantly improve the end user experience and enable \isac{} servers to deal with a high number of clients simultaneously.

   585 \subsection{Project Status}

   586 \label{sec:proj-stat}

   587 The user session management is now able to utilize several processing cores for independent computations. However, communication between the \textit{ML} and \textit{Java} layers occurs on the basis of plain standard I/O streams, i.e. multiple front-ends using the same math-engine instance need to write to the same input stream on the \textit{ML} side and thus communication is unsynchronized. This means that although the math-engine is now able to deal with calculations for multiple users concurrently in an efficient manner, the communication model does not yet allow for thread-safe communication. Changing this may be subject for a future project. Another area requiring further investigation is the memory footprint of accumulated state data for calculations and how to deal with a very high number of concurrent user sessions in this respect.

   589 % problems with streams, communication between ml and java, no return value of appendFormula

author	Walther Neuper <neuper@ist.tugraz.at>
	Wed, 23 Jul 2014 14:32:19 +0200
changeset 55476	8e3f73e1e3a3
parent 55466	55c2d2ee3f92
child 60710	21ae85b023bb
permissions	-rw-r--r--