\chapter{Fundamentals}
\label{cha:fundamentals}
In this chapter we give an overview of fundamental concepts relevant for the practical project work carried out along with, and as a basis for, this thesis. We also consider ideas and technologies which are related to the topic of the thesis and could be utilized for similar practical applications. Section \ref{sec:funprog} explores the history and common features of functional programming. Computer theorem proving is introduced in section \ref{sec:ctp}, followed by a theoretical background on parallelism in section \ref{sec:parallelism}. Sections \ref{sec:concurresp} and \ref{sec:refacfunprogs} explain concurrency in relation to latency and responsiveness, and the refactoring of programs developed in functional programming languages, respectively.
%EXTEND: par_func_intro.pdf
\section{Functional Programming}
\label{sec:funprog}
Functional programming is a declarative way of programming, i.e. source code does not describe routines but rather the desired result of a computation. Computation is the evaluation of expressions; even whole programs are in fact functions, hence the name {\em functional programming}. Pure functional programming languages do not permit mutable data, which means that there are no variables, only constant values. Functional programs are free of side effects, i.e. given the same input parameters they always produce the same results. Unlike subroutines in imperative languages, the term {\em function} here refers to a function in the mathematical sense. Therefore the implementation of programs tends to be closer to their specification than in other paradigms. Although functional programming languages are rarely utilized for the development of commercial software \cite{hammond2000research}, they have had a significant impact on the theory and pragmatics of other paradigms and programming languages in general \cite{hudak1989conception}.
\subsection{History}
\label{sec:funhist}
The history of functional programming dates back to a time before computers existed, when Alonzo Church published his work on the lambda calculus in 1932 \cite{church1932set}. This formal system was the first methodical approach towards describing a computational perspective on mathematical functions. One of its most notable properties is function self-application, which gives the lambda calculus its power whilst maintaining its consistency as a mathematical system. One of the first languages which contained elements of what is now known as functional programming was \textit{Lisp}. Its initial language design, established in 1958, was only slightly influenced by the lambda calculus, but later dialects have encouraged a more pure, side effect free programming style \cite{mccarthy1978history}. \textit{Lisp} was originally designed as a language for {\bf lis}t {\bf p}rocessing to be utilized in artificial intelligence research. A notable feature of \textit{Lisp} is its support for higher-order functions, which can be found in most contemporary functional languages and which enable list iterations without the need for explicit loops.

In 1978 John Backus, who had been heavily involved in the design of \textit{Fortran} and \textit{ALGOL}, gave a Turing Award lecture which emphasized certain benefits of functional style languages over imperative programming. The paper which resulted from the lecture \cite{backus1978can} strongly influenced and fueled research in the area of functional programming. Also in the mid 1970s, Robin Milner and others at the \textit{University of Edinburgh} designed the {\em metalanguage} \textit{ML} as a command language for the theorem prover {\em LCF} \cite{milner1978theory} (see section \ref{sec:ctp}). They continued the development of \textit{ML} and turned it into a stand-alone programming language. Even though it has impure features such as mutable references and an I/O system which is not side effect free, \textit{ML} is known as a functional programming language because it encourages a functional programming style. \textit{ML}'s type system employs the {\em Hindley–Milner type inference algorithm} (see section \ref{sec:hindleymilner}), which reduces the number of required type annotations because it can infer the types of expressions in most cases. Later, a standard for the language was established \cite{milner1997definition}, which resulted in the programming language \textit{Standard ML}.

As about a dozen non-strict languages (i.e. languages employing lazy evaluation) emerged in the 1980s, it was decided at a conference in 1987 that the parallel efforts should be merged into one common language as a basis for an exchange of ideas. This decision resulted in the formation of a committee which created {\em Haskell} \cite{jones2003haskell}. Research around this purely functional, general-purpose programming language has produced many innovations in the area of functional programming during the last two decades.
\subsection{Common Features}
There are certain distinguishing features which can be found in most contemporary functional programming languages. This section outlines the most prevalent ones (e.g., see \cite{hudak1989conception, haskell2013functional}).
\subsubsection{Referential Transparency}
The heavy use of immutable data has already been mentioned in section \ref{sec:funprog}. A consequence of pure functions is {\em referential transparency}: an expression always yields the same result whenever it is evaluated with the same input parameters, and it may therefore be replaced by its value without changing the behavior of the program. Later in this section we will see how pure languages enable e.g. input, output and exception handling whilst preserving purity.
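For instance, for a pure function such as the one in the following minimal sketch (an illustration, not taken from the thesis project), the expression \texttt{square 3 + square 3} can always be replaced by \texttt{2 * square 3} without changing the meaning of the program:
\imlcode{~~fun square x = x * x\\
~~(* square 3 + square 3 and 2 * square 3 both evaluate to 18 *)}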
\subsubsection{Recursion}
Recursion is not a technique which is specific to functional programming. However, since loops require mutable values (counter variables), recursion is used far more extensively than in imperative languages and is often the only way to express iteration. As recursion can easily cause an overflow of the call stack, most implementations of functional languages employ {\em tail call optimization}. This mechanism discards a function call's environment before the tail call is performed, so the recursive call reuses the current stack frame instead of allocating a new one, which saves memory. This only works for true tail recursion, i.e. when the result of the recursive call is directly returned as the result of the function itself.
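The following minimal \textit{Standard ML} sketch (an illustration, not taken from the thesis project) contrasts a naive recursive list sum, which builds one stack frame per element, with a tail-recursive variant that carries an accumulator and therefore runs in constant stack space:
\imlcode{~~(* naive recursion: the addition happens after the recursive call returns *)\\
~~fun sum [] = 0\\
~~~~| sum (x :: xs) = x + sum xs\\
~~(* tail recursion: the recursive call is the last step, so the frame can be reused *)\\
~~fun sum' acc [] = acc\\
~~~~| sum' acc (x :: xs) = sum' (acc + x) xs}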
\subsubsection{Higher-Order Functions}
Higher-order functions (HOFs) are functions which expect other functions as their input parameters or have functions as their return values. Examples of frequently used HOFs are the {\em map} and {\em fold} operations, which apply other functions to lists of values as in the \textit{Standard ML} expression
\imlcode{~~val ls = map (fn x => x + 2) [1, 2, 5]}
\noindent
which would result in the list \texttt{[3, 4, 7]}. HOFs generally require languages to treat functions as first-class citizens, i.e. as ordinary values. Although HOFs are a central aspect of functional programming, imperative languages may also allow functions to be passed as arguments to other functions; \textit{C}, for example, supports function pointers which can be used like first-class citizens. As we will see later, e.g. in section \ref{sec:futurespro}, HOFs can be beneficial for the expression of concurrency.
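The {\em fold} operations mentioned above reduce a list to a single value by repeatedly applying a two-argument function; a minimal \textit{Standard ML} sketch (an illustration, not taken from the thesis project) is
\imlcode{~~(* foldl combines the elements from left to right, starting with 0 *)\\
~~val sum = foldl (fn (x, acc) => x + acc) 0 [1, 2, 5]\\
~~(* sum evaluates to 8 *)}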
\subsubsection{Purity and Effects}
\label{sec:pureeffects}
Many functional programming languages such as \textit{Standard ML} encourage a purely functional coding style. However, certain functionality such as mutable state and I/O requires developers to write impure code. Purely functional programming languages like \textit{Haskell} and \textit{Clean} do not permit any side effects and therefore use other mechanisms to accomplish the desired effects. The \textit{Haskell} developers decided to implement a language construct called {\em monad} \cite{jones2003haskell}. Monads can be described as representations of possibly impure computations. In order to preserve purity, functions which perform e.g. I/O operations have to accept monads as arguments or return them. Similarly, \textit{Clean} uses {\em uniqueness types} \cite{achten1995ins}, which ensure that a function's type makes transparent which mutable aspects of the environment it manipulates. This is accomplished by allowing such a function exactly one access to the denoted mutable data, which it receives as an argument and whose manipulated version is contained in its return value.
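In \textit{Standard ML}, by contrast, such effects can be programmed directly; the following minimal sketch (an illustration, not taken from the thesis project) uses a mutable reference, which makes the function \texttt{next} impure because successive calls with the same argument yield different results:
\imlcode{~~val counter = ref 0\\
~~fun next () = (counter := !counter + 1; !counter)\\
~~(* next () evaluates to 1, then 2, and so on; not referentially transparent *)}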
\subsubsection{Partial Function Evaluation}
Partial function evaluation facilitates applying fewer arguments to a function than it requires, which results in another function that only expects the remaining arguments. E.g. the function \texttt{fun add x y = x + y} accepts two arguments. Its partial evaluation \texttt{val add2 = add 2} results in a function which only expects one argument and adds 2 to it.
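Written out as a minimal \textit{Standard ML} sketch (an illustration, not taken from the thesis project), this reads
\imlcode{~~fun add x y = x + y   (* add : int -> int -> int *)\\
~~val add2 = add 2       (* add2 : int -> int *)\\
~~val seven = add2 5     (* evaluates to 7 *)}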
\subsubsection{Pattern Matching}
\label{sec:patmatch}
Pattern matching allows for data to be processed on the basis of its structure. E.g. in \textit{Standard ML}, functions are usually expressed with exhaustive pattern definitions. The classic definition of the Fibonacci numbers\footnote{see comments in appendix \ref{sec:fib-imp}} would be
\imlcode{\label{alg:patmatch}
~~\=fu\=n fib 0 = 0\\
\>\>| fib 1 = 1\\
\>\>| fib x = fib (x - 1) + fib (x - 2); \textrm{.}}
\subsubsection{Lazy Evaluation}
Due to referential transparency, the order in which expressions are evaluated does not affect the results. Therefore, some languages like \textit{Haskell} defer the computation of expressions until their values are actually needed. This avoids unnecessary calculations and makes infinite data structures possible. On the other hand, lazy evaluation can potentially cause space leaks \cite{launchbury1993natural} and it makes formal reasoning about code harder \cite{lippmeier2009type}. The opposite approach is called {\em eager} or {\em strict} evaluation and is the standard for most imperative languages as well as e.g. \textit{Standard ML}.
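Strict languages can still defer computations explicitly; the following minimal \textit{Standard ML} sketch (an illustration, not taken from the thesis project) wraps an expression in a parameterless function (a thunk) and only evaluates it when it is forced:
\imlcode{~~datatype 'a lazy = Thunk of unit -> 'a\\
~~fun force (Thunk f) = f ()\\
~~val answer = Thunk (fn () => 6 * 7)   (* nothing is computed yet *)\\
~~val v = force answer                  (* evaluates to 42 *)}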
\section{Computer Theorem Proving}
\label{sec:ctp}
Computer Theorem Proving is related to functional programming in several fundamental ways. My practical work for this thesis concerned TP ({\bf T}heorem {\bf P}roving) on a technical level below the logical level of TP, namely on the level of parallel execution on multi-core hardware. However, it is the logical level of TP which provides the fundamental connections to functional programming.
\subsection{History}
\label{ssec:ctp-hist}
In order to understand TP, we start with its history from the very beginning.
\paragraph{AI and \textit{Lisp}.}
Mechanical or ``formal'' reasoning was an original interest of Artificial Intelligence (AI). The first important programming languages in AI were \textit{Prolog} (logic programming) and \textit{Lisp}. The latter was briefly introduced as a functional language in section \ref{sec:funhist}.
\paragraph{\textit{Automath}.}
A first notable success was the mechanization of Edmund Landau's {\em Foundations of Analysis} \cite{van1979checking} in {\em Automath} in 1969. {\em Automath} is outdated, but some of its achievements are still used in TP, e.g. de Bruijn indices, the lambda calculus and the Curry-Howard correspondence. This early success in TP was not acknowledged by mainstream mathematics. Rather, TP was mixed up with predictions of early AI, e.g. Herbert A. Simon's claim that ``{\em [m]achines will be capable, within twenty years, of doing any work a man can do.}'' \cite{simon1965shape}; predictions that soon turned out to be unrealistic. Nevertheless, the mechanization of various logics (sequent calculus, temporal logics, communicating sequential processes (see section \ref{sec:csp}), etc.) was continued and led to many different, unrelated software tools.
\paragraph{\textit{LCF}.}
\label{sssec:lcf}
\textit{LCF} (Logic for Computable Functions) \cite{milner1978theory} denotes an interactive theorem prover developed at the universities of Edinburgh and Stanford in the early 1970s. The \textit{LCF} project is relevant for this thesis because it introduced the \textit{ML} programming language, which is explored in section \ref{sec:isasml} and was used for the practical work within the thesis. The purpose of \textit{LCF} was to allow users to write theorem proving tactics and to encapsulate theorems as abstract da\-ta\-types, such that only the admitted inference rules can establish a theorem. This approach is still called the {\em LCF principle}. \textit{LCF} became the ancestor of the {\em HOL family} of provers introduced below.
\paragraph{\textit{The Next Seven Hundred Theorem Provers}.}
\label{par:next-700}
As mentioned above, TP, although an academic niche, developed many different and unrelated software tools in the 1970s and 1980s. So in the 1980s the situation in TP was similar to the situation of programming languages in the 1960s. In 1965 Peter Landin had published a talk and a paper titled {\em The next 700 programming languages} \cite{landin1966next}. This paper influenced the convergence of programming languages and led to focused efforts on a few leading products. When Larry Paulson started the development of a generic logical framework with the purpose of allowing various logics to be implemented within this framework, he reused the analogy and published a paper titled {\em The Next Seven Hundred Theorem Provers} \cite{paulson1988isabelle}. This paper addressed the general issues involved in the design of a TP and the many technicalities to overcome until a system becomes usable, and it warned against producing provers for one-time usage: ``{\em Programs like {\em Isabelle} are not products: when they have served their purpose, they are discarded.}'' \cite{paulson1988isabelle}. In fact, the resulting logical framework, \isabelle{}, was able to attract efforts to implement new logics \cite{tum2013isadoc} within \isabelle{} and thus focused efforts, avoiding the development of further TPs from scratch.
\paragraph{The HOL family of provers.}
\label{sssec:holfam}
This family of provers is a descendant of \textit{LCF} introduced above and models higher-order logic, a specific logic which does not have sets (as preferred by mathematics) at its axiomatic basis, but functions. Thus this family shows great affinity to functional programming. The ancestor of this family is {\em HOL} \cite{gordon1993introduction}. \isabelle{} \cite{nipkow2002isabelle} abstracts from higher-order logic to natural deduction and thus allows for implementations of various logics within one and the same system, as mentioned above. {\em HOL Light} \cite{harrison2014hol} uses a much simpler logical core and has little legacy code, giving the system a simple and uncluttered feel for programmers. \isabelle{} is also a good example demonstrating that the traditional distinction between automated theorem proving (ATP) and interactive theorem proving (ITP) is obsolete today: \isabelle{} and other interactive proof assistants include various automated provers in order to assist the interactive construction of proofs. \isabelle{}'s collection of automated theorem provers is called {\em Sledgehammer} (see section \ref{sec:isaatp}).
\paragraph{Breakthrough in mathematics: the four color theorem.}
The four color theorem was stated more than a hundred years ago, but it turned out to be hard to prove. In 1976 the theorem was tackled by Appel and Haken with the use of computer software \cite{appel1977solution}. However, they could not prove their program correct and thus their work was not acknowledged as a mathematical proof. In 2005, Georges Gonthier \cite{gonthier2008formal} used TP for proving the theorem, and since that time TP has been accepted as an indispensable tool for large and complicated proofs. Gonthier used the TP {\em Coq} \cite{chlipala2011certified}, which works within the theory of the calculus of inductive constructions, a derivative of the calculus of constructions. Another famous theorem is the Kepler conjecture, which took even longer to be proved. In 1998, Thomas Hales presented a proof consisting of 250 pages of notes and 3~GB of computer programs, data and results. In 2003, after four years of work, a prominent team of referees reported that they were ``99\% certain'' of the correctness of the proof \cite{hales2011historical}. Since a mathematical proof is either correct or not, Hales started the {\em Flyspeck} project \cite{google2012flyspeck} in order to employ TP for a complete and trustable proof. The mentioned webpage hints at the important role of \isabelle{} in the project.
\paragraph{Continuous penetration of computer science.}
\label{par:invade-cs}
In computer science there are no spectacular breakthroughs like in mathematics, but there has been much work on formalizing foundations from the very beginning. In the early days of computer science, Austria had a world-renowned group gathered by Heinz Zemanek \cite{austriaforum2014zemanek}. At \textit{Vienna University of Technology}, Zemanek had built one of the first computers based solely on transistors, the {\em Mailüfterl}, in 1955. IBM, the dominating company at that time, hired Zemanek to implement IBM's number one programming language, \textit{PL1}. \textit{PL1} was a monstrosity born of an IBM management decision to join \textit{Fortran} with \textit{COBOL} using insights from \textit{ALGOL}. The overwhelming difficulties in implementing \textit{PL1} were solved by what later became known as the {\em Vienna Development Method} (VDM) \cite{fitzgerald2008vienna}, an ancestor of what is now called {\em formal methods}.

Formal methods, as the name indicates, formalize a domain with mathematical notions. In parallel, computer science clarified the logical foundations of programming (Hoare, Dijkstra and others), such that programs could be proved correct with respect to a formal specification, i.e. proofs rely on specifications created by formal methods. Ever since the {\em Pentium bugs} \cite{sharangpani1994statistical,intel1997intel} swallowed several million dollars, Intel has invested in formal methods. For instance, Intel supports the TP {\em HOL Light} mentioned above. As there are many safety critical applications in medicine, air traffic control, etc. and technical systems become ever more complex, the demand for formal methods is increasing. This includes TP.
\subsection{Relevance}

\paragraph{The executable fragment of HOL and code generation.}
As already mentioned, higher-order logic is closely related to functional programming. This relation includes an executable fragment, i.e. equational theorems within HOL, which can model programs and which can be evaluated within the logic. This way of embedding programs in the logical environment provides the best prerequisites for proving properties of programs and for software verification. \isabelle{} recently implemented a {\em function package} for this purpose. Execution, or more appropriately, evaluation within a prover is very inefficient in comparison with state-of-the-art production software. Therefore engineering software cannot be built within a prover. However, it turned out that algorithms described within the executable fragment of HOL are a concise source for the automatic generation of efficient code in a surprisingly straightforward manner.
\paragraph{Functional programming.}
Software verification tools are presently more advanced for functional programming languages than for other paradigms, and the theoretical background presented in this section indicates that verification tools will continue to develop better in the paradigm of functional programming than in others. Also, efficiency considerations are expected to change in favor of the functional paradigm: programs written on an abstract level in a TP can be transformed into efficient code by automated code generation, and this transformation can potentially handle parallelization much more conveniently than programs developed within other paradigms \cite{haftmann2010code}.
\section{Parallelism}
\label{sec:parallelism}
Concurrency in software has been around for a long time; without it, multitasking would be impossible. It means that processes and threads may be started, executed and completed in overlapping time periods, which does not necessarily imply any parallel, i.e. simultaneous, execution. Parallel software, in contrast, is structured in a way which allows for literally simultaneous execution, e.g. on multi-core processors. Now that such processors are heavily used in PCs, laptops and smart phones, efficient software must be able to exploit parallelism. Approaches to parallelism can roughly be grouped into three levels: {\em instruction-level parallelism} (section \ref{sec:multicoreproc}), {\em task parallelism} (section \ref{sec:task_par}) and {\em data parallelism} (section \ref{sec:data_parallelism}). The following sections establish a theoretical background of parallelism.
\subsection{Gain, Cost and Limits}
Formulated in 1967, Amdahl's Law \cite{amdahl1967validity} describes the relation between the inherently sequential fraction of a program and the maximum speedup $S_A$ achievable through parallel execution (see fig. \ref{fig:amdahls}), given by
\begin{equation}
S_A(n) = \frac{1}{1 - f + \frac{f}{n}} ,
\label{eq:amdahls}
\end{equation}
where $f$ is the fraction of the program which can be parallelized and $n$ is the number of processing units. The upper limit of the speedup for a theoretically infinite number of processors is therefore $$\hat{S}_A = \lim_{n \rightarrow \infty} \frac{1}{1 - f + \frac{f}{n}} = \frac{1}{1 - f} .$$
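For example, with a parallelizable fraction of $f = 0.95$ on $n = 8$ processing units, equation \ref{eq:amdahls} yields $$S_A(8) = \frac{1}{0.05 + \frac{0.95}{8}} \approx 5.9 ,$$ and even infinitely many processors could not push the speedup beyond $\hat{S}_A = \frac{1}{0.05} = 20$.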
\begin{figure}
\centering
\begin{tikzpicture}[domain=1:10192,samples=2500]
  \begin{semilogxaxis}[
    xlabel=number of processors $n$,
    ylabel=speedup $S_A(n)$,
    log basis x=2,
    xmin=1,
    xmax=10192,
    xticklabel style={/pgf/number format/.cd,1000 sep={}},
    xticklabel=\pgfmathparse{2^\tick}\pgfmathprintnumber{\pgfmathresult},
    ymin=0,
    ymax=21,
    grid=major,
    legend entries={\hspace{-.7cm}{parallel portion $f$},$50\%$,$75\%$,$90\%$,$95\%$},
    legend style={draw=none,at={(0.95,0.85)}},
    title={Amdahl's Law},
    axis lines=left,
    scale=1.59
  ]
    \addlegendimage{empty legend};
    \addplot[color1,very thick] plot (\x,{1/(0.5 + (0.5 / \x))}) node[right]{};
    \addplot[color2,very thick] plot (\x,{1/(0.25 + (0.75 / \x))}) node[right]{};
    \addplot[color3,very thick] plot (\x,{1/(0.1 + (0.9 / \x))}) node[right]{};
    \addplot[color4,very thick] plot (\x,{1/(0.05 + (0.95 / \x))}) node[right]{};
  \end{semilogxaxis}
\end{tikzpicture}
\caption{Amdahl's Law and potential speedups.}
\label{fig:amdahls}
\end{figure}
Amdahl's Law assumes that the size of the input dataset is fixed. However, John L. Gustafson observed that, in general, the fraction of a computation which can be performed in parallel grows with the problem size, i.e. the bigger the input dataset, the higher the percentage of calculations which can be carried out simultaneously \cite{gustafson1988reevaluating}. Gustafson's Law therefore approaches the problem from a different perspective. It assumes that in practice an algorithm's problem size is scaled to the available temporal and computational resources. The resulting {\em scaled speedup} $S_G$ describes the ratio between total and parallel execution time of an algorithm; it increases linearly with the number of processors (see fig. \ref{fig:gustafsons}) and is given by
\begin{equation}
S_G(n) = 1 - f \cdot (1 - n) .
\label{eq:gustafsons}
\end{equation}
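With the same parallel portion $f = 0.95$ and $n = 8$ processors, equation \ref{eq:gustafsons} gives $$S_G(8) = 1 - 0.95 \cdot (1 - 8) = 7.65 ,$$ i.e. the scaled speedup grows linearly with the number of processing units instead of approaching a fixed bound.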
\begin{figure}
\centering
\begin{tikzpicture}[domain=0:512,samples=2500]
  \begin{axis}[
    xlabel=number of processors $n$,
    ylabel=speedup $S_G(n)$,
    xmin=1,
    xmax=512,
    xticklabel style={/pgf/number format/.cd,1000 sep={\,}},
    xticklabel=\pgfmathparse{\tick}\pgfmathprintnumber{\pgfmathresult},
    ymin=0,
    ymax=514,
    grid=major,
    legend entries={\hspace{-.7cm}{parallel portion $f$},$50\%$,$75\%$,$90\%$,$95\%$},
    legend style={draw=none,at={(0.42,0.95)}},
    title={Gustafson's Law},
    axis lines=left,
    scale=1.59
  ]
    \addlegendimage{empty legend};
    \addplot[color1,very thick] plot (\x,{1 - 0.5 * (1 - \x)}) node[right]{};
    \addplot[color2,very thick] plot (\x,{1 - 0.75 * (1 - \x)}) node[right]{};
    \addplot[color3,very thick] plot (\x,{1 - 0.9 * (1 - \x)}) node[right]{};
    \addplot[color4,very thick] plot (\x,{1 - 0.95 * (1 - \x)}) node[right]{};
  \end{axis}
\end{tikzpicture}
\caption{Gustafson's Law and potential speedups.}
\label{fig:gustafsons}
\end{figure}
The two theoretical speedups do not take into account dependencies or communication between parallel tasks. These factors can cause a performance overhead which may even outweigh the gains if parallelization is taken too far. A certain class of parallel problems, called {\em embarrassingly parallel problems} \cite{moler1986matrix}, does not require any communication.
\subsection{Multi-Core Processors}
\label{sec:multicoreproc}
While the computing industry long continued to fulfill Moore's predictions \cite{moore1965cramming} by fitting more and more transistors on smaller chips and thereby producing persistently faster uniprocessors, it has now become physically almost impossible to make transistors any smaller \cite{geer2005chip}. Heat dissipation resulting from the transistor density on the fastest chips requires extensive cooling and causes high power consumption. Therefore hardware manufacturers have turned to building chips with multiple processing cores which may not be as fast as recent uniprocessors, but produce less heat and are more energy efficient. Architectures differ in the number of cores and their memory models. Several or all cores may share one cache or may each have their own cache. Separate caches avoid the challenges of coordinating cache accesses, but sometimes a centralized cache can be faster. Unlike multiple chips, several cores on the same chip can share memory management and memory elements, and signaling between them can be faster because they are placed on the same die.

Multi-core processors, however, cannot solve all performance issues, even if software were able to fully exploit their power. Other factors like memory bandwidth can turn out to be performance bottlenecks \cite{moore2008multicore}. Also, these processors are not the only way in which hardware may support parallelism:
\begin{description}
\item[Bit-level parallelism] takes advantage of increased processor word sizes in order to reduce the number of instructions that need to be executed to complete operations on variables which are longer than one word \cite{culler1999parallel}.
\item[Instruction-level parallelism] tries to discover data-independent instructions, which can often be reordered and computed during the same processing cycle. Also, instruction pipelines with multiple stages allow different instructions to be in different stages at the same time and thereby overlap their processing \cite{hennessy2012computer}.
\item[Simultaneous multithreading] enables multiple threads to issue instructions on a single processing core which may be processed during the same cycle \cite{tullsen1995simultaneous}.
\end{description}
\subsection{Operating Systems}
Operating systems are heavily involved in the resource management of their underlying architecture, both for their own operations and for user applications. Besides support for simultaneous multithreading (see previous section), general-purpose OSes have to deal with diverse hardware approaches, e.g. multiprocessors vs. multi-core processors, shared vs. separate cache models. Scheduling algorithms and memory management are facing new challenges because they must balance loads between cores while avoiding cache misses and reducing the number of context switches. In this thesis we will assume symmetric multiprocessing (SMP), i.e. systems whose processing cores are identical and access the same main memory. Scheduling algorithms in OSes for more heterogeneous systems also need to take into account differing performance characteristics and possibly even instruction sets of several processors.
\subsection{Functional Programming}
\label{sec:par_funprog}
Imperative programming requires the programmer to explicitly state the steps that the system should carry out. The idea of declarative, and in particular functional, paradigms is to describe the desired result of a computation instead. On the spectrum between an algorithm's formal specification and the resulting machine code, functional code is generally said to be particularly close to the formal specification \cite{henderson1986functional}. Therefore this high-level programming paradigm puts higher demands for optimization on compilers and run-time systems. This also includes the management and exploitation of multiple CPU cores.

As purely functional programming prohibits side effects, parallelizing such programs should in principle be trivial. However, in practice, general-purpose functional programming languages are not only used for {\em embarrassingly parallel problems}. Harris and Singh \cite{harris2007feedback} identify five issues which may arise when trying to automatically parallelize functional programs:
\begin{enumerate}
  \item Inherent parallelism may be limited, i.e. call hierarchies may enforce a certain order of execution.
  \item Parallel executions need to be managed by the run-time system. This introduces an overhead which can even cause a slowdown if the parallelizable work is too fine-grained.
  \item Sometimes, purely functional source code may contain side effects once it is compiled (e.g. for memoization).
  \item Purely functional general-purpose languages generally do provide means of encapsulating side effects such as I/O or updates to mutable data; such encapsulated effects restrict automatic parallelization.
  \item In languages supporting lazy evaluation it is not known whether expressions actually need to be computed until their results are queried.
\end{enumerate}
For more details see chapter \ref{cha:funproglangs_mcsystems} which is entirely dedicated to parallelism in functional programming languages.
\section{Concurrency, Latency and Responsiveness}
\label{sec:concurresp}
Concurrency can improve system performance by hiding latencies. While one thread performs blocking operations such as disk I/O, another thread's execution can continue to perform other calculations independently. In cases where latencies are introduced only because a computationally complex operation claims the processor for a long period of time, multithreading cannot improve performance. But it can improve responsiveness by updating the user interface, giving feedback about pending operations' status and reacting to user input. Responsiveness does not necessarily require high performance, but it is a major factor of the user experience that an application is able to deliver. Concurrency can facilitate increased responsiveness. However, reasoning intuitively as well as formally about concurrent software and developing it is said to be difficult, mostly due to nondeterministic execution orders and thread synchronization issues \cite{sutter2005software}.
\section{Refactoring Functional Programs}
\label{sec:refacfunprogs}
Refactoring is a process which changes the design and structure of existing software without changing its external behavior. It can result in simpler programs, remove duplicate code or prepare an extension of the program's functionality. Because refactoring is a potentially tedious and error-prone process \cite{thompson2005refactoring}, there is a variety of tools for object-oriented programming languages which automatically check the preconditions of refactoring steps and apply them. The number of such tools for functional programming languages, however, is limited. One of the few existing projects targets \textit{Haskell} code and is called {\em HaRe} (\textit{\textbf{Ha}skell} \textbf{Re}factorer). It follows five main strategies:
\begin{enumerate}
  \item {\bf Generalization.} Extract some of a function's behavior into an argument, possibly a HOF, and thereby make the function more general and reusable (see the sketch following this list).
  \item {\bf Commonality.} Similar or identical parts of a program are identified and extracted into a single function, which is then invoked at the respective locations with appropriate parameters.
  \item {\bf Data Abstraction.} Replacing algebraic da\-ta\-types by abstract da\-ta\-types often allows programmers to modify concrete implementations without touching client code.
  \item {\bf Overloading.} The use of {\em type classes} and {\em instances} allows for overloading of names. This may improve readability and facilitate code reuse.
  \item {\bf Monadification.} Packing code into monads also makes it possible to separate parts of a system from the client code, i.e. developers can modify the monads without changing client code.
\end{enumerate}
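The generalization strategy can be illustrated by the following minimal \textit{Standard ML} sketch (an illustration, not taken from {\em HaRe} or the thesis project), in which two near-identical functions are replaced by one function that receives the varying behavior as a HOF argument:
\imlcode{~~(* before: two near-identical recursive functions *)\\
~~fun sumList [] = 0\\
~~~~| sumList (x :: xs) = x + sumList xs\\
~~fun prodList [] = 1\\
~~~~| prodList (x :: xs) = x * prodList xs\\
~~(* after generalization: the operation and its neutral element become arguments *)\\
~~fun reduce f e [] = e\\
~~~~| reduce f e (x :: xs) = f (x, reduce f e xs)\\
~~val sumList' = reduce (fn (x, y) => x + y) 0\\
~~val prodList' = reduce (fn (x, y) => x * y) 1}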
{\em HaRe} furthermore implements structural refactorings such as the deletion of unused definitions and arguments, the renaming of several kinds of program items and modifications of the scopes of definitions. Later versions introduced module-aware refactorings including import/export function organization and moving definitions between modules. For more details and refactoring techniques implemented in {\em HaRe} please refer to \cite{thompson2005refactoring}.

{\em Type classes} and {\em monadification} are specific to \textit{Haskell}. In \textit{Standard ML}, a comparable degree of generality can often be achieved simply by omitting type annotations: thanks to the Hindley–Milner type inference algorithm, such functions are assigned their most general, polymorphic type. Note that \textit{Standard ML} nevertheless remains a strongly typed language \cite{milner1997definition}.
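As a minimal sketch (an illustration, not taken from the thesis project), the following function is written without any type annotations and is inferred to be polymorphic, so it works uniformly on lists of any element type:
\imlcode{~~fun length' [] = 0\\
~~~~| length' (x :: xs) = 1 + length' xs\\
~~(* inferred type: 'a list -> int *)\\
~~val a = length' [1, 2, 3]       (* evaluates to 3 *)\\
~~val b = length' [true, false]   (* evaluates to 2 *)}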