doc-isac/mlehnfeld/master/thesis/funproglangs_mcsystems.tex
changeset 55476 8e3f73e1e3a3
parent 55466 55c2d2ee3f92
child 60710 21ae85b023bb
     1.1 --- a/doc-isac/mlehnfeld/master/thesis/funproglangs_mcsystems.tex	Wed Jul 23 14:09:51 2014 +0200
     1.2 +++ b/doc-isac/mlehnfeld/master/thesis/funproglangs_mcsystems.tex	Wed Jul 23 14:32:19 2014 +0200
     1.3 @@ -2,7 +2,7 @@
     1.4  \label{cha:funproglangs_mcsystems}
     1.5  This section presents several concepts for parallelism and concurrency which have been suggested for and implemented in functional programming languages. The central idea here is the philosophy of the declarative paradigm: The developer should not have to worry about {\em how} the parallel execution needs to be organized but at most make informed decisions on {\em what} has to be parallelized. Ideally, process management details such as process creation, resource assignment and synchronization should be taken care of by the compiler and run-time system. In practice, various language properties, such as impurity and monadic I/O, may allow for language constructs from imperative programming or even require forms of explicit synchronization.
     1.6  
     1.7 -Many of the mechanisms presented in the following sections have been implemented in and modified for certain programming languages. It seemed appropriate to place details on these realization details and changes in the respective sections. Table \ref{tab:lang_par_overview} gives a brief overview. Note that it only contains those michanisms and sections that mention concrete implementations.
     1.8 +Many of the mechanisms presented in the following sections have been implemented in and modified for certain programming languages. It seemed appropriate to place these realization details and changes in the respective sections. Table \ref{tab:lang_par_overview} gives a brief overview. Note that it only contains those mechanisms and sections that mention concrete implementations.
     1.9  \begin{table}[ht]
    1.10  \centering
    1.11  \caption{Overview of parallelism mechanisms and referenced programming languages.}
    1.12 @@ -61,7 +61,7 @@
    1.13  \label{sec:annot}
    1.14  Another straightforward solution is the use of annotations. This is probably the simplest way of expressing parallelism because it does not affect the semantics of a program but only its runtime behavior. By ignoring the annotations, the program can still be compiled sequentially. This method has been implemented in the programming language \textit{Clean} \cite{nocker1991concurrent}, which also allows indications of the target processing unit for the evaluation. Since eager languages evaluate a function's arguments before invoking the function itself, this mechanism, in its simple form, only works for lazy languages. A potential implementation of the parallel {\tt fib} function\footnote{see comments in appendix \ref{sec:fib-imp}} in a lazy version of \textit{Standard ML} could look somewhat like the definition
    1.15  \imlcode{\label{alg:parannot}
    1.16 -~~\=fu\=n fib 0 = 1\\
    1.17 +~~\=fu\=n fib 0 = 0\\
    1.18  \>\>| fib 1 = 1\\
    1.19  \>\>| fib x = fib (x - 1) + par (fib (x - 2)); \textrm{.}}
    1.20  \noindent
    1.21 @@ -71,7 +71,7 @@
    1.22  \label{sec:futurespro}
    1.23  The concepts of {\em promises} and {\em futures} were first suggested in the mid-1970s \cite{baker1977incremental}. The idea is similar to annotations (see previous section). One of the first languages that adopted futures was called {\em Multilisp} \cite{halstead1985multilisp}. As the {\em Isabelle} theorem prover provides its own implementation of futures \cite{matthews2010efficient}, which were introduced into \isac{} during the parallelization process (section \ref{ssec:session-manag}), they are of special interest for this thesis. Futures are objects representing the result of an expression which may not be known initially because its computation is managed by the run-time system and may occur in parallel. {\em Implicit} futures are usually an integral part of a language and treated like ordinary references. If results are not available the first time they are requested, program execution blocks until the future value has been obtained. In contrast, {\em explicit} futures require programmers to explicitly force their computation before they can access their results. {\em Isabelle}'s solution follows the latter approach and will be discussed in more detail in section \ref{sec:isafut}. The definition
    1.24  \imlcode{\label{alg:futures}
    1.25 -~~\=fu\=n \=fi\=b \=0 = 1\\
    1.26 +~~\=fu\=n \=fi\=b \=0 = 0\\
    1.27  \>\>| fib 1 = 1\\
    1.28  \>\>| fib x = let\\
    1.29  \>\>\>\>\>fun fib' () = fib (x - 2)\\
    1.30 @@ -90,7 +90,7 @@
    1.31  
    1.32  \subsection{Algorithmic Skeletons}
    1.33  \label{sec:algoskel}
    1.34 -The idea of algorthmic skeletons is simple and not limited to parallelism. Most parallel algorithms show common patterns. These patterns can be abstracted into higher-order functions. A well known example is {\em divide-and-conquer}:
    1.35 +The idea of algorithmic skeletons is simple and not limited to parallelism. Most parallel algorithms exhibit common patterns, which can be abstracted into higher-order functions. A well-known example is {\em divide-and-conquer}:
    1.36  \imlcode{~~\=fu\=n \=da\=c \=is\=\_trivial solve divide conquer x =\\
    1.37  \>\>\>if is\_trivial x then\\
    1.38  \>\>\>\>solve x\\
    1.39 @@ -99,7 +99,7 @@
    1.40  \>\>\>\>\>|> map (dac is\_trivial solve divide conquer)\\
    1.41  \>\>\>\>\>|> conquer; \textrm{,}}
    1.42  \noindent
    1.43 -where \texttt{x |> f} is syntactic sugar for \texttt{f x} and \texttt{is\_trivial} is a function with one argument and a \texttt{bool} return value. In order to parallelize the algorithm we only need to use a parallel version of {\tt map}. Other common patterns include {\em branch-and-bound} and {\em dynamic programming}. While this method is also available for imperative languages such as \textit{Java}, the fact that functional languages treat higher-order functions as first-class citizens makes them particularly suitable for the use of skeletons. Algorithmic skeletons hide the orchestration between sequential portions of code from the programmer. Sometimes the compiler can provide low-level support for the skeletons. Furthermore, multiple primitive skeletons can be composed to produce more powerful patterns. Unlike other high-level parallelism mechanisms, they are apt to cost modelling \cite{hammond2000research}.
    1.44 +where \texttt{x |> f} is syntactic sugar for \texttt{f x} and \texttt{is\_trivial} is a function with one argument and a \texttt{bool} return value. In order to parallelize the algorithm, we only need to use a parallel version of {\tt map}. Other common patterns include {\em branch-and-bound} and {\em dynamic programming}. While this method is also available for imperative languages such as \textit{Java}, the fact that functional languages treat higher-order functions as first-class citizens makes them particularly suitable for the use of skeletons. Algorithmic skeletons hide the orchestration between sequential portions of code from the programmer. Sometimes the compiler can provide low-level support for the skeletons. Furthermore, multiple primitive skeletons can be composed to produce more powerful patterns. Unlike other high-level parallelism mechanisms, they are amenable to cost modeling \cite{hammond2000research}.
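         +
         +As a brief illustration (not taken from the cited literature), the skeleton can be given a parallel variant {\tt par\_dac} by replacing {\tt map} with an assumed parallel map, here called {\tt par\_map} (e.g. one built on futures, see section \ref{sec:futurespro}); {\tt par\_dac}, {\tt par\_map} and the instantiation {\tt sum}, which sums an integer list by divide-and-conquer, are hypothetical and only serve as a sketch:
         +\imlcode{~~\=fu\=n \=pa\=r\=\_d\=ac is\_trivial solve divide conquer x =\\
         +\>\>\>(* par\_map: hypothetical parallel map, e.g. based on futures *)\\
         +\>\>\>if is\_trivial x then\\
         +\>\>\>\>solve x\\
         +\>\>\>else\\
         +\>\>\>\>x |> divide\\
         +\>\>\>\>\>|> par\_map (par\_dac is\_trivial solve divide conquer)\\
         +\>\>\>\>\>|> conquer;\\
         +\>val sum = par\_dac\\
         +\>\>\>(fn xs => length xs <= 1)\\
         +\>\>\>(fn [x] => x | \_ => 0)\\
         +\>\>\>(fn xs => [List.take (xs, length xs div 2),\\
         +\>\>\>\>\>\>List.drop (xs, length xs div 2)])\\
         +\>\>\>(foldl (op +) 0); \textrm{.}}
         +\noindent
         +Since the parallelism is confined to the skeleton itself, every instantiation such as {\tt sum} is parallelized without any further change to the instantiating code.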
    1.45  
    1.46  
    1.47  \section{Data Parallelism}
    1.48 @@ -117,7 +117,7 @@
    1.49  
    1.50  \subsection{Software Transactional Memory}
    1.51  \label{sec:stm}
    1.52 -While hardware support for transactions had been introduced before and they were a fundamental aspect of fault tolerance in the context of database systems, software-only transactional memory was first suggested by Shavit and Touitou in 1997 \cite{shavit1997software}. Unlike {\em lock-based synchronization} STM generally is a non-blocking strategy. Transactions are sequences of read and write operations on a shared memory. They are tagged atomic blocks, processed optimistically, regardless of other threads potentially performing other alterations to the memory simultaneously. Each access to the memory is logged and once a transaction is finished, a final validation step verifies that no other thread modified any of the elements which were read during the process, i.e. the transaction's view of the memory was consistent. If there was no conflicting write access, the transaction is commited to the memory. Otherwise the whole process is repeated until no more conflicts arise. In order to be correct, transactions must be linearizable, i.e. it must be possible to execute them in a certain sequence without temporal overlaps, such that the resulting state of the shared memory is the same as if the transactions had been carried out concurrently \cite{herlihy1990linearizability}. Because of their non-blocking nature, STMs do not cause deadlocks or priority inversion. They are inherently fault tolerant to a certain extent, because transactions can be undone in the event of a timeout or an exception. Also, they are free from the tradeoffs between lock granularity and concurrency.
    1.53 +While hardware support for transactions had been introduced before and they were a fundamental aspect of fault tolerance in the context of database systems, software-only transactional memory was first suggested by Shavit and Touitou in 1997 \cite{shavit1997software}. Unlike {\em lock-based synchronization}, STM is generally a non-blocking strategy. Transactions are sequences of read and write operations on a shared memory. They are marked as atomic blocks and processed optimistically, regardless of other threads potentially performing alterations to the memory simultaneously. Each access to the memory is logged and, once a transaction is finished, a final validation step verifies that no other thread modified any of the elements which were read during the process, i.e. the transaction's view of the memory was consistent. If there was no conflicting write access, the transaction is committed to the memory. Otherwise, the whole process is repeated until no more conflicts arise. In order to be correct, transactions must be linearizable, i.e. it must be possible to execute them in a certain sequence without temporal overlaps, such that the resulting state of the shared memory is the same as if the transactions had been carried out concurrently \cite{herlihy1990linearizability}. Because of their non-blocking nature, STMs do not cause deadlocks or priority inversion. They are inherently fault tolerant to a certain extent, because transactions can be undone in the event of a timeout or an exception. Also, they are free from the tradeoffs between lock granularity and concurrency.
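         +
         +To make the programming model concrete, the following sketch uses a hypothetical transactional interface in the style of the \textit{Standard ML} examples above; {\tt tvar}, {\tt read}, {\tt write} and {\tt atomic} are assumptions for illustration and not taken from an actual library. The function passed to {\tt atomic} is executed as a transaction:
         +\imlcode{~~\=va\=l \=fr\=om = tvar 100  (* transactional variables *)\\
         +\>\>val to = tvar 0\\
         +\>\>fun transfer amount =\\
         +\>\>\>atomic (fn () =>\\
         +\>\>\>\>(write from (read from - amount);\\
         +\>\>\>\>~write to (read to + amount))); \textrm{.}}
         +\noindent
         +If the final validation detects that another thread has modified {\tt from} or {\tt to} in the meantime, the run-time system discards the transaction's log and simply re-executes {\tt transfer}; the programmer never handles locks explicitly.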
    1.54  
    1.55  Recent work on {\em commitment ordering} has helped reduce the number of transaction conflicts and thereby improve the performance of STM implementations \cite{ramadan2009committing,zhang2010software}. Also, there are various STM solutions that do in fact use locking mechanisms during different phases of a transaction. They can potentially reduce the number of rollbacks and consequently also make process coordination faster \cite{ennals2006software}. The significant difference between these approaches and conventional, mutex-based thread synchronization mechanisms is the fact that with STM, any locking is invisible to the programmer and taken care of by the run-time system. The downside of STM is the performance overhead caused by transaction management, which typically impairs the performance of programs with a low number of threads, such that they run slower than a sequential implementation. STMs, however, scale particularly well with respect to higher thread and processor numbers \cite{saha2006mcrt}.
    1.56  
    1.57 @@ -131,7 +131,7 @@
    1.58  
    1.59  \subsection{The Actor Model}
    1.60  \label{sec:actormodel}
    1.61 -In 1973, Hewitt, Bishop and Steiger suggested a modular actor formalism \cite{hewitt1973universal} which laid out the basis of what is now known as the {\em Actor model}. Several publications of the following decade formed a comprehensive Actor model theory. The two central concepts of the Actor model are {\em actors} and {\em messages} \cite{agha1985actors}. Actors are entities that can send and receive messages. Their behavior is defined by the {\bf actions} they perform on receipt of a message (e.g. create another actor) and a finite number of other actors they know about, called {\bf acquaintances}. The acquantance relation is not necessarily bidirectional, i.e. an acquantance $B$ of actor $A$ must not necessarily know actor $A$. Each actor has an immutable, unique name and a local state. Communication between actors is asynchronous. Incoming messages are processed one at a time and as a response to each message an atomic sequence of actions is performed. Since actors and messages are autonomous, encapsulated entities they are easy to reason about and can be composed into more complex systems \cite{agha1997foundation}. The order of actions affects an actor's external behavior. But because messaging occurs asynchronously, the processing order of messages is undefined. This leads to nondeterminism in computations based on the actor system. There are certain abstractions available which allow for constraints on message ordering and can therefore eliminate part of the nondeterminism \cite{karmani2009actor}. They can be summarized in four important properties:
    1.62 +In 1973, Hewitt, Bishop and Steiger suggested a modular actor formalism \cite{hewitt1973universal}, which laid the foundation for what is now known as the {\em Actor model}. Several publications of the following decade formed a comprehensive Actor model theory. The two central concepts of the Actor model are {\em actors} and {\em messages} \cite{agha1985actors}. Actors are entities that can send and receive messages. Their behavior is defined by the {\bf actions} they perform on receipt of a message (e.g. creating another actor) and a finite number of other actors they know about, called {\bf acquaintances}. The acquaintance relation is not necessarily bidirectional, i.e. an acquaintance $B$ of actor $A$ does not necessarily know actor $A$. Each actor has an immutable, unique name and a local state. Communication between actors is asynchronous. Incoming messages are processed one at a time and, as a response to each message, an atomic sequence of actions is performed. Since actors and messages are autonomous, encapsulated entities, they are easy to reason about and can be composed into more complex systems \cite{agha1997foundation}. The order of actions affects an actor's external behavior, but because messaging occurs asynchronously, the processing order of messages is undefined. This leads to non-determinism in computations based on the actor system. There are certain abstractions available which allow for constraints on message ordering and can therefore eliminate part of the non-determinism \cite{karmani2009actor}. The essential characteristics of the Actor model can be summarized in four important properties:
    1.63  \begin{enumerate}
    1.64  \item {\bf Encapsulation.} Unlike shared memory thread models, actors are highly encapsulated. Therefore, they provide no locking or synchronization mechanisms for shared resources. Their internal states are not directly accessible by other actors. The only way for them to exchange data or influence each other is by means of messages. This limitation, however, is violated by some implementations in languages such as \textit{Scala} \cite{thurauakka}. In programming languages which permit mutable data, messages should, in theory, not transfer references to locations in a shared address space. Otherwise the receiving party could modify memory locations that are contained in another actor's state representation. However, \textit{Java}'s actor library {\em Akka} allows actors to do just that and it is the programmer's responsibility to maintain encapsulation.
    1.65  \item {\bf Fairness.} Another important property is fairness with respect to actor sched\-ul\-ing: a message will be delivered to its target actor eventually, i.e. unless an actor dies or displays faulty behavior, it will be scheduled in a fair manner.
    1.66 @@ -145,7 +145,7 @@
    1.67  
    1.68  \subsubsection{\textit{Scala}}
    1.69  \label{sec:actors_scala}
    1.70 -\textit{Scala} is a programming language for the \textit{Java Virtual Machine} that combines concepts of object-oriented and functional programming \cite{odersky2004overview}. Its distribution comes with a toolkit for actors and message passing functionality called {\em Akka}. \textit{Scala}'s standard actors library up to version 2.9.2 \cite{scala2014migration} has now been depricated and replaced by \textit{Akka}. Its implementation is based on \textit{Erlang} like, fault tolerant actors and its execution model follows an {\em event-based} approach, thus introducing {\em inversion of control}. Compared to \textit{Scala}'s orginal actors, \textit{Akka} actors are more performant, fault tolerant and offer automatic load balancing \cite{tasharofi2013scala}. They draw on the language's functional programming features such as partial functions and, like \textit{Erlang}, pattern matching which are convenient for message passing. Actors potentially provide safer concurrency than traditional shared memory mechanisms because by design, no race conditions can occur. Also, actors are more lightweight than threads.
    1.71 +\textit{Scala} is a programming language for the \textit{Java Virtual Machine} that combines concepts of object-oriented and functional programming \cite{odersky2004overview}. Its distribution comes with a toolkit for actors and message-passing functionality called {\em Akka}. \textit{Scala}'s standard actors library up to version 2.9.2 \cite{scala2014migration} has now been deprecated and replaced by \textit{Akka}. Its implementation is based on \textit{Erlang}-like, fault-tolerant actors and its execution model follows an {\em event-based} approach, thus introducing {\em inversion of control}. Compared to \textit{Scala}'s original actors, \textit{Akka} actors are more performant and fault-tolerant, and they offer automatic load balancing \cite{tasharofi2013scala}. They draw on the language's functional programming features such as partial functions and, like \textit{Erlang}, pattern matching, which are convenient for message passing. Actors potentially provide safer concurrency than traditional shared-memory mechanisms because, by design, actors do not share mutable state, so data races cannot occur. Also, actors are more lightweight than threads.
    1.72  % why no race conditions!?!?!?
    1.73  
    1.74  \subsubsection{Futures}