Paraffin is a set of Ada 2012 generics that may be used to add
parallelism to iterative loops and recursive code.
Paraffin includes generics for both Ravenscar and non-Ravenscar use.
The Ravenscar version utilizes static task pools with dispatching
domains intended for real-time programming.
Paraffin also includes Paraffinalia, which is a suit of useful parallel
utilities that utilize the Paraffin generics. These include generics for;
1) generic to integrating a function in parallel
2) generic to apply quicksort algorithm in parallel to an array
3) generic to apply fast fourier transform to an array of data.
4) generic Red-Black tree container that performs some operations
in parallel.
5) function to solve matrices using Gauss-Jordan Elimination
6) generic to perform prefix sum calculations
7) generic to perform sequence alignment using the Smith-Waterman
algorithm to find similar regions between two strings for
problems such as comparing genetic nucleotide or protein sequences,
or checking for plagiarism between two text sourcs.
The current posted version of the code is intended to be compiled
only with Ada 2012 compilers, though archived versions can also be
downloaded that support Ada 2005 compilers. The interfaces between
the Ada 2012 and the Ada 2005 versions are now quite different,
but at some point, the current interfaces may get back ported to
the Ada 2005 version, at some point.
Paraffin is free software. See the file COPYING for copying permission.
Most of the source code is released under the GMGPL (GNAT Modified GPL)
See the individual source files for details.
Any comments on the generics would be greatly appreciated.
Please send comments to brad.moore@shaw.ca
1.0 INTRODUCTION
================
For iterative parallelism, three main flavours of parallelism generics
exist within paraffin.
Work-Sharing
Work-Seeking
Work-Stealing
For recursive parallelism, only Work-Sharing and Work-Seeking forms
exist.
For each parallelism approach, there are 3 cases;
elementary (functional) reduction
composite (procedural) reduction
no reduction
The elementary (functional) reducing generics are intended to be used
when a final result needs to be generated, and the type of the final
result is an elementary type (i.e. Integer, Boolean, Float).
The elementary generics essentially are ones that return the final
reduction result as a function return value, rather than an in out
parameter of a procedure.
It is not strictly necessary that the type be an elementary type,
however, it is likely that for composite types, better performance will
result from using the composite generics, since function returns involve
copying the result into a result variable, whereas procedure calls act
on the parameter directly without having to make a copy.
The composite (procedural) reducing generics are intended to be used
when a final result needs to be generated, and the type of the final
result is a composite type (i.e. record, tagged type).
That is not to say that elementary types cannot be used. You may find
that elementary types perform well with the composite generics as well.
The remaining class of generics do not produce a final result, and
therefore do not involve any reduction. These generics are useful when
there is only a need to iterate or recurse in parallel.
2.0 BUILD INSTRUCTIONS =========
- Irvine ICC Ada 2005 compiler
An archive of the Ada 2005 version of paraffin is included in the
source. This version of the source should be used currently with
the Irvine ICC Ada compiler, until they have Ada 2012 support.
- For the Irvine ICC Ada 2005 compiler on Windows, execute the
following script;
icm new
icm scan -subdir "*.ad{s,b}"
icm scan parallel/config/windows/parallel-config.ads
icm make test_paraffin -compile_flags=\"-predef=(f32,lf64)\"
icm make test_integrate -compile_flags=\"-predef=(f32,lf64)\"
icm make test_quicksort -compile_flags=\"-predef=(f32,lf64)\"
icm make test_fft -compile_flags=\"-predef=(f32,lf64)\"
You can add other compile flags as well, such as
-compile_flags=\"-predef=(f32,lf64) -opt -debug -nochecks\"
to turn on optimization, debug, and disable checks.
To compile for Irvine ICC on Linux, execute the following script
instead;
icm new
icm scan -subdir "*.ad{s,b}"
icm scan parallel/config/linux/parallel-config.ads
icm make test_paraffin -compile_flags='"-predef=(f32,lf64)"' \
-link_flags='"-linker_postflag=(-lpthread)"'
icm make test_integrate -compile_flags='"-predef=(f32,lf64)"' \
-link_flags='"-linker_postflag=(-lpthread)"'
icm make test_quicksort -compile_flags='"-predef=(f32,lf64)"' \
-link_flags='"-linker_postflag=(-lpthread)"'
icm make test_fft -compile_flags='"-predef=(f32,lf64)"' \
-link_flags='"-linker_postflag=(-lpthread)"'
You can add other compile flags as well, such as;
-compile_flags='"-predef=(f32,lf64) -opt -debug -nochecks"'
to turn on optimization, debug, and disable checks.
- For GNAT Pro, GNAT GPL or GNAT AUX, load the appropriate .gpr file
into the GPS ide, and build the executable from within the ide, or
alternatively use gnatmake to perform the equivalent actions described
in the .gpr file.
3.0 TESTED PLATFORMS =========
Paraffin has been ported to the following compilers and platforms.
GNAT GPL 2013 (Windows, Linux)
Irvine Ada 2005 (Windows, Linux)
GNAT AUX FSF 4.6.1 (Ada 2005 only) (Android)
Paraffin is intended to be portable to any platform that supports
Ada 2005 or Ada 2012 compilation, and in theory, any Ada 2005 or Ada
2012 compiler should be able to compile the code since there are no
dependencies on the GNAT run-time.
It should also be possible to compile the generics for any target
system, since the generics do not rely on any OS-specific support.
4.0 DOWNLOADING ==============
The latest stable release and older releases may be downloaded from;
https://sourceforge.net/projects/paraffin/files/
For those who want the current development versions of the source they
can download using git (http://git-scm.com/) by issuing the following
commands;
mkdir sandbox
cd sandbox
git clone git://git.code.sf.net/p/paraffin/code paraffin-code
The current development version typically will correspond to the latest
stable release, but may at times be unstable when new features are being
worked on.
5.0 PARALLELISM STRATEGIES
==========================
Work-Sharing is a simpler divide and conquer strategy where an attempt
is made to divide the work evenly among the worker at the start of the
parallelism.
Work-Seeking is an approach where workers can seek more work when they
complete their tasks. The hand-off of work from the seeker and the donor
is cooperative. A seeking-work flag is monitored by the client. If the
flag is set, it indicates other workers are seeking work, and the busy
worker donates work to the seeker.
Work-Stealing is an approach where workers can steal work from other
workers when they complete their tasks. The strategy is to have the idle
workers steal work from busy workers with minimum interference. The idle
worker searches randomly to find a busy worker. When a candidate is
found, the stealer modifies the candidates loop bounds to force the
candidate to exit early. The stealer takes work from the candidate, then
restarts the candidate to continue from where it left off.
Work-Sharing generally makes sense when the amount of work can be
divided evenly between the workers.
Work-Seeking and Work-Stealing are usually good choices then the amount
of work can not be divided evenly. In many cases, these forms can also
outperform Work-Sharing, since one can never expect that the workers
for evenly distributed loads will finish at precisely the same time,
especially considering the load introduced by other processes or threads
on the system.
Work-Stealing and Work-Seeking are generally pretty comparable.
It seems that for most of the tests, there will be a slight edge to
Work-Seeking, though this is not always the case.
6.0 LOOP CONSTRUCTS
=====================
For iterative parallelism, work sharing, work seeking, and work
stealing may be applied to for loops and while loops.
7.0 RECURSIVE CONSTRUCTS
========================
For recursive parallelism, work stealing forms do not yet exist.
However, in addition to regular work-seeking and
work-sharing, there also exists stack-limited and stack-safe work
seeking forms. These generics are used to avoid stack overflow.
If recursion gets too deep on the stack, the generics save the
subordinate work and come back later to process the missing recursion
after the stack has been cleared to a fresh start.
8.0 EXAMPLES
============
A) Iterative Work Sharing example: Computing a sum of integers
--------------------------------------------------------------
-- Sum : Integer := 0;
-- for I in 1 .. N loop
-- Sum := Sum + I;
-- end loop
-- Note the instantiations can be library level instantiations
-- separate from the use of the instantiation, and reused for
-- multiple for loops.
with Parallel.Functional_Reducing_Loops.Work_Sharing;
package Integer_Addition_Loops is new
Parallel.Functional_Reducing_Loops
(Result_Type => Integer,
Reducer => "+",
Identity_Value => 0,
Iteration_Index_Type => Positive);
package Work_Sharing_Integer_Addition_Loops is new
Integer_Addition_Loops.Work_Sharing;
-------------------------------------------------------
Sum : Integer := 0;
Manager : Work_Sharing_Integer_Addition_Loops.Work_Sharing_Manager
:= Work_Sharing_Integer_Addition_Loops.Create;
procedure Iteration
(Start, Finish : Positive;
Result : in out Integer) is
begin
for I in Start .. Finish loop
Result := Result + I;
end loop;
end Iteration;
begin
Manager.Execute_Parallel_Loop
(From => 1,
To => N,
Process => Iteration'Access,
Result => Sum);
end;
B) Iterative Work Seeking Example: Computing a sum of integers
--------------------------------------------------------------
-- Sum : Integer := 0;
-- for I in 1 .. N loop
-- Sum := Sum + I;
-- end loop
-- Note the instantiations can be library level instantiations
-- separate from the use of the instantiation, and reused for
-- multiple for loops.
with Parallel.Functional_Reducing_Loops.Work_Seeking;
package Integer_Addition_Loops is new
Parallel.Functional_Reducing_Loops
(Result_Type => Integer,
Reducer => "+",
Identity_Value => 0,
Iteration_Index_Type => Positive);
package Work_Seeking_Integer_Addition_Loops is new
Integer_Addition_Loops.Work_Seeking;
-------------------------------------------------------
Sum : Integer := 0;
Manager : Work_Seeking_Integer_Addition_Loops.Work_Seeking_Manager
:= Work_Seeking_Integer_Addition_Loops.Create;
procedure Iteration
(Start, Finish : Positive;
Result : in out Integer) is
begin
for I in Start .. Finish loop
Result := Result + I;
end loop;
end Iteration;
begin
Manager.Execute_Parallel_Loop
(From => 1,
To => N,
Process => Iteration'Access,
Result => Sum);
end;
C) Iterative Work Stealing Example: Computing a sum of integers
---------------------------------------------------------------
-- Sum : Integer := 0;
-- for I in 1 .. N loop
-- Sum := Sum + I;
-- end loop
-- Note the instantiations can be library level instantiations
-- separate from the use of the instantiation, and reused for
-- multiple for loops.
with Parallel.Functional_Reducing_Loops.Work_Stealing;
package Integer_Addition_Loops is new
Parallel.Functional_Reducing_Loops
(Result_Type => Integer,
Reducer => "+",
Identity_Value => 0,
Iteration_Index_Type => Positive);
package Work_Stealing_Integer_Addition_Loops is new
Integer_Addition_Loops.Work_Stealing;
-------------------------------------------------------
Sum : Integer := 0;
Manager : Work_Stealing_Integer_Addition_Loops.Work_Stealing_Manager
:= Work_Stealing_Integer_Addition_Loops.Create;
procedure Iteration
(Start, Finish : Positive;
Result : in out Integer) is
begin
for I in Start .. Finish loop
Result := Result + I;
end loop;
end Iteration;
begin
Manager.Execute_Parallel_Loop
(From => 1,
To => N,
Process => Iteration'Access,
Result => Sum);
end;
D) Recursive Work Sharing Fibonacci
-----------------------------------
-- function Fibonacci (Number : Natural) return Natural is
-- begin
-- if Number < 2 then
-- return Number;
-- else
-- return Fibonacci (Number - 2) + Fibonacci (Number - 1);
-- end if;
-- end Fibonacci;
with Parallel.Functional_Reducing_Recursion.Work_Sharing;
-- Note the instantiations can be library level instantiations
-- separate from the use of the instantiation, and reused for
-- multiple parallelism opportunities.
package Functional_Integer_Recursion is new
Parallel.Functional_Reducing_Recursion
(Result_Type => Integer,
Reducer => "+",
Identity_Value => 0,
Work_Type => Natural);
package Work_Sharing_Integer_Addition_Recursion is new
Functional_Integer_Recursion.Work_Sharing;
function Fibonacci (Value : Integer) return Integer
is
use Work_Sharing_Integer_Addition_Recursion;
Manager : Work_Sharing_Manager := Create;
function Sequential_Fibonacci (Number : Integer) return Integer is
begin
if Number < 2 then
return Number;
else
return
Sequential_Fibonacci (Number - 2) +
Sequential_Fibonacci (Number - 1);
end if;
end Sequential_Fibonacci;
function Parallel_Fibonacci (Number : Natural) return Integer is
begin
if Manager.Sequential then
return Sequential_Fibonacci (Number);
end if;
if Number < 2 then
return Number;
else
return
Manager.Dispatcher.Recurse
(Item => Number - 2,
Split => 1,
Of_Splits => 2) +
Manager.Dispatcher.Recurse
(Item => Number - 1,
Split => 2,
Of_Splits => 2);
end if;
end Parallel_Fibonacci;
begin -- Fibonacci
return Manager.Execute_Parallel_Subprogram
(Item => Value,
Process => Parallel_Fibonacci'Access);
end Fibonacci;
E) Recursive Work Seeking Fibonacci
-----------------------------------
-- function Fibonacci (Number : Natural) return Natural is
-- begin
-- if Number < 2 then
-- return Number;
-- else
-- return Fibonacci (Number - 2) + Fibonacci (Number - 1);
-- end if;
-- end Fibonacci;
with Parallel.Functional_Reducing_Recursion.Work_Seeking;
-- Note the instantiations can be library level instantiations
-- separate from the use of the instantiation, and reused for
-- multiple parallelism opportunities.
package Functional_Integer_Recursion is new
Parallel.Functional_Reducing_Recursion
(Result_Type => Integer,
Reducer => "+",
Identity_Value => 0,
Work_Type => Natural);
package Work_Seeking_Integer_Addition_Recursion is new
Functional_Integer_Recursion.Work_Seeking;
use Work_Seeking_Integer_Addition_Recursion;
function Fibonacci (Value : Integer) return Integer
is
Sequential_Cutoff : constant Integer := 22;
Others_Workers : aliased Parallel.Work_Seeking_State;
Manager : Work_Seeking_Manager
:= Create (Others_Seeking_Work => Others_Workers);
function Parallel_Fibonacci (Number : Natural) return Integer is
begin
if Number < 2 then
return Number;
elsif Others_Workers.Seeking_Work and then
Number > Sequential_Cutoff then
return
Manager.Dispatcher.Recurse (Item => Number - 2) +
Parallel_Fibonacci (Number - 1);
else
return
Parallel_Fibonacci (Number - 2) +
Parallel_Fibonacci (Number - 1);
end if;
end Parallel_Fibonacci;
begin -- Work_Seeking_Fibonacci
return Manager.Execute_Parallel_Subprogram
(Item => Value,
Process => Parallel_Fibonacci'Access);
end Work_Seeking_Fibonacci;
9.0 TEST EXECUTABLES
====================
Paraffin provides the following test executables
1) test_parallel_loops - performs a battery of loop tests
2) test_parallel_recursion - performs a battery of recursive tests
3) test_integrate - executes a test that recursively determines
the area under the curve in parallel for the
square root function between a user specified
range.
4) test_quicksort - executes a test of a parallel quicksort.
This is a conquer and divide strategy, as
opposed to a divide and conquer strategy.
5) test_fft - executes a test of a parallel fast fourier
transform (and inverse). The implementation is
a recursive approach based on the Cooley-Tukey
algorithm, and the Danielson-Lanczos lemma.
This test is unique in that it features both
the recursive generics and the iterative
generics, plus it uses Paraffin's barrier
utility to interleave sequential and parallel
code.
6) test_matrix - executes a function to solve a matrix of
float numbers using Gauss-Jordan Elimination.
This test is also utilizes Paraffin's barrier
utility to interleave sequential and parallel
code. This test utilizes Annex G, Numerics
capabilities, and has only been built for Linux,
as it requires extra libraries to be installed
on the target. (BLAS, LAPACK)
7) test_smith_waterman - executes a function that compares two text
strings, or two text files, using the smith-
waterman dynamic algorithm in parallel.
Various parameters of the generics can be controlled by environment
variables defined in the package Parallel_Test_Harness in the test
folder.
In particular;
DEFAULT_WORKER_COUNT (Number) defines the number of workers to use.
WORKER_STORAGE_SIZE size of the stack for workers
MAX_STACK_DEPTH For the stack safe work-seeking recursion, defines the
percentage or amount of stack space allowed to be
occupied by the worker before deferring the subordinate
work to later so, in order to prevent stack overflow
from occurring
Brad Moore