\chapter{Fundamentals}
\label{cha:fundamentals}
In this chapter we give an overview of fundamental concepts relevant for the practical project work carried out along with, and as a basis for, this thesis. We also consider ideas and technologies which are related to the topic of the thesis and could be utilized for similar practical applications. Section \ref{sec:funprog} explores the history and common features of functional programming. Computer theorem proving is introduced in section \ref{sec:ctp}, followed by a theoretical background on parallelism in section \ref{sec:parallelism}. Sections \ref{sec:concurresp} and \ref{sec:refacfunprogs} explain concurrency in relation to latency and responsiveness, and the refactoring of programs developed in functional programming languages, respectively.
%EXTEND: par_func_intro.pdf
\section{Functional Programming}
\label{sec:funprog}
Functional programming is a declarative way of programming, i.e. source code does not describe routines but rather the desired result of a computation. Computation is the evaluation of expressions; even whole programs are in fact functions, hence the name {\em functional programming}. Pure functional programming languages do not permit mutable data, which means that there are no variables, only constant values. Functional programs are free of side effects, i.e. given the same input parameters they always produce the same results. Unlike subroutines in imperative languages, the term {\em function} here refers to a function in the mathematical sense. Therefore the implementation of programs tends to be closer to their specification than in other paradigms. Although functional programming languages are rarely utilized for the development of commercial software \cite{hammond2000research}, they have had a significant impact on the theory and pragmatics of other paradigms and programming languages in general \cite{hudak1989conception}.
\subsection{History}
\label{sec:funhist}
The history of functional programming dates back to a time before computers existed, when Alonzo Church published his work on the lambda calculus in 1932 \cite{church1932set}. This formal system was the first methodical approach towards describing a computational perspective on mathematical functions. One of its most notable properties is function self-application, which gives the lambda calculus its power whilst maintaining its consistency as a mathematical system. One of the first languages which contained elements of what is now known as functional programming was \textit{Lisp}. Its initial language design, established in 1958, was only slightly influenced by the lambda calculus, but later dialects have encouraged a more pure, side effect free programming style \cite{mccarthy1978history}. \textit{Lisp} was originally designed as a language for {\bf lis}t {\bf p}rocessing to be utilized in artificial intelligence research. A notable feature of \textit{Lisp} is its support for higher-order functions, which can be found in most contemporary functional languages and which enable list iterations without the need for explicit loops.

In 1978 John Backus, who had been heavily involved in the design of \textit{Fortran} and \textit{ALGOL}, gave a Turing Award lecture which emphasized certain benefits of functional style languages over imperative programming. The paper which resulted from the lecture \cite{backus1978can} strongly influenced and fueled research in the area of functional programming. Also in the mid 1970s, Robin Milner and others at the \textit{University of Edinburgh} designed the {\em metalanguage} \textit{ML} as a command language for the theorem prover {\em LCF} \cite{milner1978theory} (see section \ref{sec:ctp}). They continued the development of \textit{ML} and turned it into a stand-alone programming language. Even though it has impure features such as mutable references and an I/O system which is not side effect free, \textit{ML} is known as a functional programming language because it encourages a functional programming style. \textit{ML}'s type system employs the {\em Hindley–Milner type inference algorithm} (see section \ref{sec:hindleymilner}), which reduces the number of required type annotations because it can infer the types of expressions in most cases. Later, a standard for the language was established \cite{milner1997definition}, which resulted in the programming language \textit{Standard ML}.

As about a dozen non-strict languages (i.e. languages employing lazy evaluation) emerged in the 1980s, it was decided at a conference in 1987 that the parallel efforts should be merged into one common language as a basis for an exchange of ideas. This decision resulted in the formation of a committee which created {\em Haskell} \cite{jones2003haskell}. Research around this purely functional, general-purpose programming language has produced many innovations in the area of functional programming during the last two decades.
\subsection{Common Features}
There are certain distinguishing features which can be found in most contemporary functional programming languages. This section outlines the most prevalent ones (e.g., see \cite{hudak1989conception, haskell2013functional}).
\subsubsection{Referential Transparency}
The heavy use of immutable data has already been mentioned in section \ref{sec:funprog}. A consequence of pure functions is {\em referential transparency}: an expression always yields the same result whenever it is evaluated with the same input parameters, and it may therefore be replaced by its value without changing the behavior of the program. Later in this section we will see how pure languages enable e.g. input, output and exception handling whilst preserving purity.
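For instance, for a pure function such as the one in the following minimal sketch (an illustration, not taken from the thesis project), the expression \texttt{square 3 + square 3} can always be replaced by \texttt{2 * square 3} without changing the meaning of the program:
\imlcode{~~fun square x = x * x\\
~~(* square 3 + square 3 and 2 * square 3 both evaluate to 18 *)}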
\subsubsection{Recursion}
Recursion is not a technique which is specific to functional programming. However, since loops require mutable values (counter variables), recursion is used far more extensively than in imperative languages and is often the only way to express iteration. As recursion can easily cause an overflow of the call stack, most implementations of functional languages employ {\em tail call optimization}. This mechanism discards a function call's environment before the tail call is performed, so the recursive call reuses the current stack frame instead of allocating a new one, which saves memory. This only works for true tail recursion, i.e. when the result of the recursive call is directly returned as the result of the function itself.
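The following minimal \textit{Standard ML} sketch (an illustration, not taken from the thesis project) contrasts a naive recursive list sum, which builds one stack frame per element, with a tail-recursive variant that carries an accumulator and therefore runs in constant stack space:
\imlcode{~~(* naive recursion: the addition happens after the recursive call returns *)\\
~~fun sum [] = 0\\
~~~~| sum (x :: xs) = x + sum xs\\
~~(* tail recursion: the recursive call is the last step, so the frame can be reused *)\\
~~fun sum' acc [] = acc\\
~~~~| sum' acc (x :: xs) = sum' (acc + x) xs}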
\subsubsection{Higher-Order Functions}
Higher-order functions (HOFs) are functions which expect other functions as their input parameters or have functions as their return values. Examples of frequently used HOFs are the {\em map} and {\em fold} operations, which apply other functions to lists of values as in the \textit{Standard ML} expression
\imlcode{~~val ls = map (fn x => x + 2) [1, 2, 5]}
\noindent
which would result in the list \texttt{[3, 4, 7]}. HOFs generally require languages to treat functions as first-class citizens, i.e. as ordinary values. Although HOFs are a central aspect of functional programming, imperative languages may also allow functions to be passed as arguments to other functions; \textit{C}, for example, supports function pointers which can be used like first-class citizens. As we will see later, e.g. in section \ref{sec:futurespro}, HOFs can be beneficial for the expression of concurrency.
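The {\em fold} operations mentioned above reduce a list to a single value by repeatedly applying a two-argument function; a minimal \textit{Standard ML} sketch (an illustration, not taken from the thesis project) is
\imlcode{~~(* foldl combines the elements from left to right, starting with 0 *)\\
~~val sum = foldl (fn (x, acc) => x + acc) 0 [1, 2, 5]\\
~~(* sum evaluates to 8 *)}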
\subsubsection{Purity and Effects}
\label{sec:pureeffects}
Many functional programming languages such as \textit{Standard ML} encourage a purely functional coding style. However, certain functionality such as mutable state and I/O requires developers to write impure code. Purely functional programming languages like \textit{Haskell} and \textit{Clean} do not permit any side effects and therefore use other mechanisms to accomplish the desired effects. The \textit{Haskell} developers decided to implement a language construct called {\em monad} \cite{jones2003haskell}. Monads can be described as representations of possibly impure computations. In order to preserve purity, functions which perform e.g. I/O operations have to accept monads as arguments or return them. Similarly, \textit{Clean} uses {\em uniqueness types} \cite{achten1995ins}, which ensure that a function's type makes transparent which mutable aspects of the environment it manipulates. This is accomplished by allowing such a function exactly one access to the denoted mutable data, which it receives as an argument and whose manipulated version is contained in its return value.
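In \textit{Standard ML}, by contrast, such effects can be programmed directly; the following minimal sketch (an illustration, not taken from the thesis project) uses a mutable reference, which makes the function \texttt{next} impure because successive calls with the same argument yield different results:
\imlcode{~~val counter = ref 0\\
~~fun next () = (counter := !counter + 1; !counter)\\
~~(* next () evaluates to 1, then 2, and so on; not referentially transparent *)}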
\subsubsection{Partial Function Evaluation}
Partial function evaluation facilitates applying fewer arguments to a function than it requires, which results in another function that only expects the remaining arguments. E.g. the function \texttt{fun add x y = x + y} accepts two arguments. Its partial evaluation \texttt{val add2 = add 2} results in a function which only expects one argument and adds 2 to it.
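Written out as a minimal \textit{Standard ML} sketch (an illustration, not taken from the thesis project), this reads
\imlcode{~~fun add x y = x + y   (* add : int -> int -> int *)\\
~~val add2 = add 2       (* add2 : int -> int *)\\
~~val seven = add2 5     (* evaluates to 7 *)}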
\subsubsection{Pattern Matching}
\label{sec:patmatch}
Pattern matching allows for data to be processed on the basis of its structure. E.g. in \textit{Standard ML}, functions are usually expressed with exhaustive pattern definitions. The classic definition of the Fibonacci numbers\footnote{see comments in appendix \ref{sec:fib-imp}} would be
\imlcode{\label{alg:patmatch}
~~\=fu\=n fib 0 = 0\\
\>\>| fib 1 = 1\\
\>\>| fib x = fib (x - 1) + fib (x - 2); \textrm{.}}
\subsubsection{Lazy Evaluation}
Due to referential transparency, the order in which expressions are evaluated does not affect the results. Therefore, some languages like \textit{Haskell} defer the computation of expressions until their values are actually needed. This avoids unnecessary calculations and makes infinite data structures possible. On the other hand, lazy evaluation can potentially cause space leaks \cite{launchbury1993natural} and it makes formal reasoning about code harder \cite{lippmeier2009type}. The opposite approach is called {\em eager} or {\em strict} evaluation and is the standard for most imperative languages as well as e.g. \textit{Standard ML}.
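Strict languages can still defer computations explicitly; the following minimal \textit{Standard ML} sketch (an illustration, not taken from the thesis project) wraps an expression in a parameterless function (a thunk) and only evaluates it when it is forced:
\imlcode{~~datatype 'a lazy = Thunk of unit -> 'a\\
~~fun force (Thunk f) = f ()\\
~~val answer = Thunk (fn () => 6 * 7)   (* nothing is computed yet *)\\
~~val v = force answer                  (* evaluates to 42 *)}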
\section{Computer Theorem Proving}
\label{sec:ctp}
Computer Theorem Proving is related to functional programming in several fundamental ways. My practical work for this thesis concerned TP ({\bf T}heorem {\bf P}roving) on a technical level below the logical level of TP, namely on the level of parallel execution on multi-core hardware. However, it is the logical level of TP which provides the fundamental connections to functional programming.
\subsection{History}
\label{ssec:ctp-hist}
In order to understand TP, we start with its history from the very beginning.
\paragraph{AI and \textit{Lisp}.}
Mechanical or ``formal'' reasoning was an original interest of Artificial Intelligence (AI). The first important programming languages in AI were \textit{Prolog} (logic programming) and \textit{Lisp}. The latter was briefly introduced as a functional language in section \ref{sec:funhist}.
\paragraph{\textit{Automath}.}
A first notable success was the mechanization of Edmund Landau's {\em Foundations of Analysis} \cite{van1979checking} in {\em Automath} in 1969. {\em Automath} is outdated, but some of its achievements are still used in TP, e.g. de Bruijn indices, the lambda calculus and the Curry-Howard correspondence. This early success in TP was not acknowledged by mainstream mathematics. Rather, TP was mixed up with predictions of early AI, e.g. Herbert A. Simon's claim that ``{\em [m]achines will be capable, within twenty years, of doing any work a man can do.}'' \cite{simon1965shape}; predictions that soon turned out to be unrealistic. Nevertheless, the mechanization of various logics (sequent calculus, temporal logics, communicating sequential processes (see section \ref{sec:csp}), etc.) was continued and led to many different, unrelated software tools.
\paragraph{\textit{LCF}.}
\label{sssec:lcf}
\textit{LCF} (Logic for Computable Functions) \cite{milner1978theory} denotes an interactive theorem prover developed at the universities of Edinburgh and Stanford in the early 1970s. The \textit{LCF} project is relevant for this thesis because it introduced the \textit{ML} programming language, which is explored in section \ref{sec:isasml} and was used for the practical work within the thesis. The purpose of \textit{LCF} was to allow users to write theorem proving tactics and to encapsulate theorems as abstract da\-ta\-types, such that only the admitted inference rules can establish a theorem. This approach is still called the {\em LCF principle}. \textit{LCF} became the ancestor of the {\em HOL family} of provers introduced below.
\paragraph{\textit{The Next Seven Hundred Theorem Provers}.}
\label{par:next-700}
As mentioned above, TP, although an academic niche, developed many different and unrelated software tools in the 1970s and 1980s. So in the 1980s the situation in TP was similar to the situation of programming languages in the 1960s. In 1965 Peter Landin had published a talk and a paper titled {\em The next 700 programming languages} \cite{landin1966next}. This paper influenced the convergence of programming languages and led to focused efforts on a few leading products. When Larry Paulson started the development of a generic logical framework with the purpose of allowing various logics to be implemented within this framework, he reused the analogy and published a paper titled {\em The Next Seven Hundred Theorem Provers} \cite{paulson1988isabelle}. This paper addressed the general issues involved in the design of a TP and the many technicalities to overcome until a system becomes usable, and it warned against producing provers for one-time usage: ``{\em Programs like {\em Isabelle} are not products: when they have served their purpose, they are discarded.}'' \cite{paulson1988isabelle}. In fact, the resulting logical framework, \isabelle{}, was able to attract efforts to implement new logics \cite{tum2013isadoc} within \isabelle{} and thus focused efforts, avoiding the development of further TPs from scratch.
\paragraph{The HOL family of provers.}
\label{sssec:holfam}
This family of provers is a descendant of \textit{LCF} introduced above and models higher-order logic, a specific logic which does not have sets (as preferred by mathematics) at its axiomatic basis, but functions. Thus this family shows great affinity to functional programming. The ancestor of this family is {\em HOL} \cite{gordon1993introduction}. \isabelle{} \cite{nipkow2002isabelle} abstracts from higher-order logic to natural deduction and thus allows for implementations of various logics within one and the same system, as mentioned above. {\em HOL Light} \cite{harrison2014hol} uses a much simpler logical core and has little legacy code, giving the system a simple and uncluttered feel for programmers. \isabelle{} is also a good example demonstrating that the traditional distinction between automated theorem proving (ATP) and interactive theorem proving (ITP) is obsolete today: \isabelle{} and other interactive proof assistants include various automated provers in order to assist the interactive construction of proofs. \isabelle{}'s collection of automated theorem provers is called {\em Sledgehammer} (see section \ref{sec:isaatp}).
\paragraph{Breakthrough in mathematics: the four color theorem.}
The four color theorem was stated more than a hundred years ago, but it turned out to be hard to prove. In 1976 the theorem was tackled by Appel and Haken with the use of computer software \cite{appel1977solution}. However, they could not prove their program correct and thus their work was not acknowledged as a mathematical proof. In 2005, Georges Gonthier \cite{gonthier2008formal} used TP for proving the theorem, and since that time TP has been accepted as an indispensable tool for large and complicated proofs. Gonthier used the TP {\em Coq} \cite{chlipala2011certified}, which works within the theory of the calculus of inductive constructions, a derivative of the calculus of constructions. Another famous theorem is the Kepler conjecture, which took even longer to be proved. In 1998, Thomas Hales presented a proof consisting of 250 pages of notes and 3~GB of computer programs, data and results. In 2003, after four years of work, a prominent team of referees reported that they were ``99\% certain'' of the correctness of the proof \cite{hales2011historical}. Since a mathematical proof is either correct or not, Hales started the {\em Flyspeck} project \cite{google2012flyspeck} in order to employ TP for a complete and trustable proof. The mentioned webpage hints at the important role of \isabelle{} in the project.
\paragraph{Continuous penetration of computer science.}
\label{par:invade-cs}
In computer science there are no spectacular breakthroughs like in mathematics, but there has been much work on formalizing foundations from the very beginning. In the early days of computer science, Austria had a world-renowned group gathered by Heinz Zemanek \cite{austriaforum2014zemanek}. At \textit{Vienna University of Technology}, Zemanek had built one of the first computers based solely on transistors, the {\em Mailüfterl}, in 1955. IBM, the dominating company at that time, hired Zemanek to implement IBM's number one programming language, \textit{PL1}. \textit{PL1} was a monstrosity born of an IBM management decision to join \textit{Fortran} with \textit{COBOL} using insights from \textit{ALGOL}. The overwhelming difficulties in implementing \textit{PL1} were solved by what later became known as the {\em Vienna Development Method} (VDM) \cite{fitzgerald2008vienna}, an ancestor of what is now called {\em formal methods}.

Formal methods, as the name indicates, formalize a domain with mathematical notions. In parallel, computer science clarified the logical foundations of programming (Hoare, Dijkstra and others), such that programs could be proved correct with respect to a formal specification, i.e. proofs rely on specifications created by formal methods. Ever since the {\em Pentium bugs} \cite{sharangpani1994statistical,intel1997intel} swallowed several million dollars, Intel has invested in formal methods. For instance, Intel supports the TP {\em HOL Light} mentioned above. As there are many safety critical applications in medicine, air traffic control, etc. and technical systems become ever more complex, the demand for formal methods is increasing. This includes TP.
\subsection{Relevance}

\paragraph{The executable fragment of HOL and code generation.}
As already mentioned, higher-order logic is closely related to functional programming. This relation includes an executable fragment, i.e. equational theorems within HOL, which can model programs and which can be evaluated within the logic. This way of embedding programs in the logical environment provides the best prerequisites for proving properties of programs and for software verification. \isabelle{} recently implemented a {\em function package} for this purpose. Execution, or more appropriately, evaluation within a prover is very inefficient in comparison with state-of-the-art production software. Therefore engineering software cannot be built within a prover. However, it turned out that algorithms described within the executable fragment of HOL are a concise source for the automatic generation of efficient code in a surprisingly straightforward manner.
\paragraph{Functional programming.}
Software verification tools are presently more advanced for functional programming languages than for other paradigms, and the theoretical background presented in this section indicates that verification tools will continue to develop better in the paradigm of functional programming than in others. Also, efficiency considerations are expected to change in favor of the functional paradigm: programs written on an abstract level in a TP can be transformed into efficient code by automated code generation, and this transformation can potentially handle parallelization much more conveniently than programs developed within other paradigms \cite{haftmann2010code}.
\section{Parallelism}
\label{sec:parallelism}
Concurrency in software has been around for a long time; without it, multitasking would be impossible. It means that processes and threads may be started, executed and completed in overlapping time periods, which does not necessarily imply any parallel, i.e. simultaneous, execution. Parallel software, in contrast, is structured in a way which allows for literally simultaneous execution, e.g. on multi-core processors. Now that such processors are heavily used in PCs, laptops and smart phones, efficient software must be able to exploit parallelism. Approaches to parallelism can roughly be grouped into three levels: {\em instruction-level parallelism} (section \ref{sec:multicoreproc}), {\em task parallelism} (section \ref{sec:task_par}) and {\em data parallelism} (section \ref{sec:data_parallelism}). The following sections establish a theoretical background of parallelism.
\subsection{Gain, Cost and Limits}
Formulated in 1967, Amdahl's Law \cite{amdahl1967validity} describes the relation between the inherently sequential fraction of a program and the maximum speedup $S_A$ achievable through parallel execution (see fig. \ref{fig:amdahls}), given by
\begin{equation}
S_A(n) = \frac{1}{1 - f + \frac{f}{n}} ,
\label{eq:amdahls}
\end{equation}
where $f$ is the fraction of the program which can be parallelized and $n$ is the number of processing units. The upper limit of the speedup for a theoretically infinite number of processors is therefore $$\hat{S}_A = \lim_{n \rightarrow \infty} \frac{1}{1 - f + \frac{f}{n}} = \frac{1}{1 - f} .$$
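For example, with a parallelizable fraction of $f = 0.95$ on $n = 8$ processing units, equation \ref{eq:amdahls} yields $$S_A(8) = \frac{1}{0.05 + \frac{0.95}{8}} \approx 5.9 ,$$ and even infinitely many processors could not push the speedup beyond $\hat{S}_A = \frac{1}{0.05} = 20$.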
\begin{figure}
\centering
\begin{tikzpicture}[domain=1:10192,samples=2500]
  \begin{semilogxaxis}[
    xlabel=number of processors $n$,
    ylabel=speedup $S_A(n)$,
    log basis x=2,
    xmin=1,
    xmax=10192,
    xticklabel style={/pgf/number format/.cd,1000 sep={}},
    xticklabel=\pgfmathparse{2^\tick}\pgfmathprintnumber{\pgfmathresult},
    ymin=0,
    ymax=21,
    grid=major,
    legend entries={\hspace{-.7cm}{parallel portion $f$},$50\%$,$75\%$,$90\%$,$95\%$},
    legend style={draw=none,at={(0.95,0.85)}},
    title={Amdahl's Law},
    axis lines=left,
    scale=1.59
  ]
    \addlegendimage{empty legend};
    \addplot[color1,very thick] plot (\x,{1/(0.5 + (0.5 / \x))}) node[right]{};
    \addplot[color2,very thick] plot (\x,{1/(0.25 + (0.75 / \x))}) node[right]{};
    \addplot[color3,very thick] plot (\x,{1/(0.1 + (0.9 / \x))}) node[right]{};
    \addplot[color4,very thick] plot (\x,{1/(0.05 + (0.95 / \x))}) node[right]{};
  \end{semilogxaxis}
\end{tikzpicture}
\caption{Amdahl's Law and potential speedups.}
\label{fig:amdahls}
\end{figure}
Amdahl's Law assumes that the size of the input dataset is fixed. However, John L. Gustafson observed that, in general, the fraction of a computation which can be performed in parallel grows with the problem size, i.e. the bigger the input dataset, the higher the percentage of calculations which can be carried out simultaneously \cite{gustafson1988reevaluating}. Gustafson's Law therefore approaches the problem from a different perspective. It assumes that in practice an algorithm's problem size is scaled to the available temporal and computational resources. The resulting {\em scaled speedup} $S_G$ describes the ratio between total and parallel execution time of an algorithm; it increases linearly with the number of processors (see fig. \ref{fig:gustafsons}) and is given by
\begin{equation}
S_G(n) = 1 - f \cdot (1 - n) .
\label{eq:gustafsons}
\end{equation}
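With the same parallel portion $f = 0.95$ and $n = 8$ processors, equation \ref{eq:gustafsons} gives $$S_G(8) = 1 - 0.95 \cdot (1 - 8) = 7.65 ,$$ i.e. the scaled speedup grows linearly with the number of processing units instead of approaching a fixed bound.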
\begin{figure}
\centering
\begin{tikzpicture}[domain=0:512,samples=2500]
  \begin{axis}[
    xlabel=number of processors $n$,
    ylabel=speedup $S_G(n)$,
    xmin=1,
    xmax=512,
    xticklabel style={/pgf/number format/.cd,1000 sep={\,}},
    xticklabel=\pgfmathparse{\tick}\pgfmathprintnumber{\pgfmathresult},
    ymin=0,
    ymax=514,
    grid=major,
    legend entries={\hspace{-.7cm}{parallel portion $f$},$50\%$,$75\%$,$90\%$,$95\%$},
    legend style={draw=none,at={(0.42,0.95)}},
    title={Gustafson's Law},
    axis lines=left,
    scale=1.59
  ]
    \addlegendimage{empty legend};
    \addplot[color1,very thick] plot (\x,{1 - 0.5 * (1 - \x)}) node[right]{};
    \addplot[color2,very thick] plot (\x,{1 - 0.75 * (1 - \x)}) node[right]{};
    \addplot[color3,very thick] plot (\x,{1 - 0.9 * (1 - \x)}) node[right]{};
    \addplot[color4,very thick] plot (\x,{1 - 0.95 * (1 - \x)}) node[right]{};
  \end{axis}
\end{tikzpicture}
\caption{Gustafson's Law and potential speedups.}
\label{fig:gustafsons}
\end{figure}
The two theoretical speedups do not take into account dependencies or communication between parallel tasks. These factors can cause a performance overhead which may even outweigh the gains if parallelization is taken too far. A certain class of parallel problems, called {\em embarrassingly parallel problems} \cite{moler1986matrix}, does not require any communication.
\subsection{Multi-Core Processors}
\label{sec:multicoreproc}
While the computing industry long continued to fulfill Moore's predictions \cite{moore1965cramming} by fitting more and more transistors on smaller chips and thereby producing persistently faster uniprocessors, it has now become physically almost impossible to make transistors any smaller \cite{geer2005chip}. Heat dissipation resulting from the transistor density on the fastest chips requires extensive cooling and causes high power consumption. Therefore hardware manufacturers have turned to building chips with multiple processing cores which may not be as fast as recent uniprocessors, but produce less heat and are more energy efficient. Architectures differ in the number of cores and their memory models. Several or all cores may share one cache or may each have their own cache. Separate caches avoid the challenges of coordinating cache accesses, but sometimes a centralized cache can be faster. Unlike multiple chips, several cores on the same chip can share memory management and memory elements, and signaling between them can be faster because they are placed on the same die.

Multi-core processors, however, cannot solve all performance issues, even if software were able to fully exploit their power. Other factors like memory bandwidth can turn out to be performance bottlenecks \cite{moore2008multicore}. Also, these processors are not the only way in which hardware may support parallelism:
\begin{description}
\item[Bit-level parallelism] takes advantage of increased processor word sizes in order to reduce the number of instructions that need to be executed to complete operations on variables which are longer than one word \cite{culler1999parallel}.
\item[Instruction-level parallelism] tries to discover data-independent instructions, which can often be reordered and computed during the same processing cycle. Also, instruction pipelines with multiple stages allow different instructions to be in different stages at the same time and thereby overlap their processing \cite{hennessy2012computer}.
\item[Simultaneous multithreading] enables multiple threads to issue instructions on a single processing core which may be processed during the same cycle \cite{tullsen1995simultaneous}.
\end{description}
\subsection{Operating Systems}
Operating systems are heavily involved in the resource management of their underlying architecture, both for their own operations and for user applications. Besides support for simultaneous multithreading (see previous section), general-purpose OSes have to deal with diverse hardware approaches, e.g. multiprocessors vs. multi-core processors, shared vs. separate cache models. Scheduling algorithms and memory management are facing new challenges because they must balance loads between cores while avoiding cache misses and reducing the number of context switches. In this thesis we will assume symmetric multiprocessing (SMP), i.e. systems whose processing cores are identical and access the same main memory. Scheduling algorithms in OSes for more heterogeneous systems also need to take into account differing performance characteristics and possibly even instruction sets of several processors.
\subsection{Functional Programming}
\label{sec:par_funprog}
Imperative programming requires the programmer to explicitly state the steps that the system should carry out. The idea of declarative, and in particular functional, paradigms is to describe the desired result of a computation instead. On the spectrum between an algorithm's formal specification and the resulting machine code, functional code is generally said to be particularly close to the formal specification \cite{henderson1986functional}. Therefore this high-level programming paradigm puts higher demands for optimization on compilers and run-time systems. This also includes the management and exploitation of multiple CPU cores.

As purely functional programming prohibits side effects, parallelizing such programs should in principle be trivial. However, in practice, general-purpose functional programming languages are not only used for {\em embarrassingly parallel problems}. Harris and Singh \cite{harris2007feedback} identify five issues which may arise when trying to automatically parallelize functional programs:
\begin{enumerate}
  \item Inherent parallelism may be limited, i.e. call hierarchies may enforce a certain order of execution.
  \item Parallel executions need to be managed by the run-time system. This introduces an overhead which can even cause a slowdown if the parallelizable work is too fine-grained.
  \item Sometimes, purely functional source code may contain side effects once it is compiled (e.g. for memoization).
  \item Purely functional general-purpose languages generally do provide means of encapsulating side effects such as I/O or updates to mutable data; such encapsulated effects restrict automatic parallelization.
  \item In languages supporting lazy evaluation it is not known whether expressions actually need to be computed until their results are queried.
\end{enumerate}
For more details see chapter \ref{cha:funproglangs_mcsystems} which is entirely dedicated to parallelism in functional programming languages.
\section{Concurrency, Latency and Responsiveness}
\label{sec:concurresp}
Concurrency can improve system performance by hiding latencies. While one thread performs blocking operations such as disk I/O, another thread's execution can continue to perform other calculations independently. In cases where latencies are introduced only because a computationally complex operation claims the processor for a long period of time, multithreading cannot improve performance. But it can improve responsiveness by updating the user interface, giving feedback about pending operations' status and reacting to user input. Responsiveness does not necessarily require high performance, but it is a major factor of the user experience that an application is able to deliver. Concurrency can facilitate increased responsiveness. However, reasoning intuitively as well as formally about concurrent software and developing it is said to be difficult, mostly due to nondeterministic execution orders and thread synchronization issues \cite{sutter2005software}.
\section{Refactoring Functional Programs}
\label{sec:refacfunprogs}
Refactoring is a process which changes the design and structure of existing software without changing its external behavior. It can result in simpler programs, remove duplicate code or prepare an extension of the program's functionality. Because refactoring is a potentially tedious and error-prone process \cite{thompson2005refactoring}, there is a variety of tools for object-oriented programming languages which automatically check the preconditions of refactoring steps and apply them. The number of such tools for functional programming languages, however, is limited. One of the few existing projects targets \textit{Haskell} code and is called {\em HaRe} (\textit{\textbf{Ha}skell} \textbf{Re}factorer). It follows five main strategies:
\begin{enumerate}
  \item {\bf Generalization.} Extract some of a function's behavior into an argument, possibly a HOF, and thereby make the function more general and reusable (see the sketch following this list).
  \item {\bf Commonality.} Similar or identical parts of a program are identified and extracted into a single function, which is then invoked at the respective locations with appropriate parameters.
  \item {\bf Data Abstraction.} Replacing algebraic da\-ta\-types by abstract da\-ta\-types often allows programmers to modify concrete implementations without touching client code.
  \item {\bf Overloading.} The use of {\em type classes} and {\em instances} allows for overloading of names. This may improve readability and facilitate code reuse.
  \item {\bf Monadification.} Packing code into monads also makes it possible to separate parts of a system from the client code, i.e. developers can modify the monads without changing client code.
\end{enumerate}
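The generalization strategy can be illustrated by the following minimal \textit{Standard ML} sketch (an illustration, not taken from {\em HaRe} or the thesis project), in which two near-identical functions are replaced by one function that receives the varying behavior as a HOF argument:
\imlcode{~~(* before: two near-identical recursive functions *)\\
~~fun sumList [] = 0\\
~~~~| sumList (x :: xs) = x + sumList xs\\
~~fun prodList [] = 1\\
~~~~| prodList (x :: xs) = x * prodList xs\\
~~(* after generalization: the operation and its neutral element become arguments *)\\
~~fun reduce f e [] = e\\
~~~~| reduce f e (x :: xs) = f (x, reduce f e xs)\\
~~val sumList' = reduce (fn (x, y) => x + y) 0\\
~~val prodList' = reduce (fn (x, y) => x * y) 1}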
{\em HaRe} furthermore implements structural refactorings such as the deletion of unused definitions and arguments, the renaming of several kinds of program items and modifications of the scopes of definitions. Later versions introduced module-aware refactorings including import/export function organization and moving definitions between modules. For more details and refactoring techniques implemented in {\em HaRe} please refer to \cite{thompson2005refactoring}.

{\em Type classes} and {\em monadification} are specific to \textit{Haskell}. In \textit{Standard ML}, a comparable degree of generality can often be achieved simply by omitting type annotations: thanks to the Hindley–Milner type inference algorithm, such functions are assigned their most general, polymorphic type. Note that \textit{Standard ML} nevertheless remains a strongly typed language \cite{milner1997definition}.
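As a minimal sketch (an illustration, not taken from the thesis project), the following function is written without any type annotations and is inferred to be polymorphic, so it works uniformly on lists of any element type:
\imlcode{~~fun length' [] = 0\\
~~~~| length' (x :: xs) = 1 + length' xs\\
~~(* inferred type: 'a list -> int *)\\
~~val a = length' [1, 2, 3]       (* evaluates to 3 *)\\
~~val b = length' [true, false]   (* evaluates to 2 *)}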