PTask library
=============
Version 0.81


Idea
----

PTask (for Parallel Task) is a library that aims to simplify the
implementation of parallelized algorithms. The base concept is to define
descriptions of your algorithms with their parallel and serial sections and
then execute these descriptions on a fixed number of worker threads.

The main goals are:

- a fixed number of threads if possible, usually the same as the number of
  CPUs
- no waits/joins unless unavoidable
- exception handling
- multiple independent algorithm descriptions (from different threads) can be
  executed in parallel transparently



The current PTask implementation uses internal bookkeeping to achieve these
goals. Waits or joins (which in turn mean more threads) are not required
except when tasks inside one description execute their own independent
algorithm descriptions.


Project Status
--------------

The PTask library is open source and is released under the LGPL license. You
can also find it on sourceforge.net. The status of this project is pre-alpha.
A couple of unit tests and some small test apps run, but no real-life
application uses this framework yet. Your feedback is highly appreciated!
Please do not hesitate to mail me about bugs or implementation faults. I will
provide more in-depth documentation about the current implementation soon.


How to Use
----------

An algorithm description is defined by nesting ptask queues and ptasks
according to the workflow of your algorithm. PTask queues are themselves
ptasks, so in fact you are just nesting ptasks. For example, computing
statistics on a data array, then finding a proper threshold, then applying a
filter to each data item using this threshold can be defined in the PTask
framework as a serial queue consisting of one parallel queue containing a
number of ptasks gathering statistics, followed by the threshold ptask,
followed by a parallel queue containing a number of ptasks applying the
filter.

This is how it would look in code:

//...
SerialPTaskQueue sq = new SerialPTaskQueue();
ParallelPTaskQueue pq1 = new ParallelPTaskQueue();

pq1.addPTask(new StatPTask(0, 99999));
pq1.addPTask(new StatPTask(100000, 199999));

sq.addPTask(pq1);
sq.addPTask(new ThresholdPTask());

ParallelPTaskQueue pq2 = new ParallelPTaskQueue();

pq2.addPTask(new FilterPTask(0, 99999));
pq2.addPTask(new FilterPTask(100000, 199999));

sq.addPTask(pq2);

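try {
    sq.process(myData);    // waits until all the tasks are executed
} catch (ExecutionException e) {
    … //process exceptions from your code here
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();    // restore the interrupt flag
}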
//...

The PTask interface defines just one method that you must implement to get work
done:

void process(Object data) throws InterruptedException, ExecutionException;
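
As a rough illustration of implementing that method, here is a minimal sketch.
The PTask interface below is only a local stand-in that copies the signature
quoted above (the real one ships in jocptask.jar), and RangeSumPTask, its
inclusive from/to range, and getSum() are hypothetical names made up for this
example.

```java
import java.util.concurrent.ExecutionException;

// Local stand-in mirroring the single method quoted above. The real interface
// lives in jocptask.jar; this copy only keeps the sketch self-contained.
interface PTask {
    void process(Object data) throws InterruptedException, ExecutionException;
}

// Hypothetical task: sums one index range of an int[] passed as the data object.
class RangeSumPTask implements PTask {
    private final int from;   // first index, inclusive (assumed convention)
    private final int to;     // last index, inclusive (assumed convention)
    private long sum;         // result of the most recent process() call

    RangeSumPTask(int from, int to) {
        this.from = from;
        this.to = to;
    }

    @Override
    public void process(Object data) throws InterruptedException, ExecutionException {
        if (!(data instanceof int[])) {
            // Wrap the failure so the caller sees it as an ExecutionException.
            throw new ExecutionException(new IllegalArgumentException("expected int[]"));
        }
        int[] values = (int[]) data;
        long local = 0;
        for (int i = from; i <= to; i++) {
            local += values[i];
        }
        sum = local;
    }

    long getSum() {
        return sum;
    }
}
```

A task written this way only touches its own range, so several instances can
work on the same array without coordinating with each other.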

The number of tasks to use depends on the number of CPUs in your system. There
is also a DynamicParallelPTaskQueue, which is configured with a factory to
generate the appropriate number of parallel tasks for a given number of
parallel threads (CPUs) on demand. In my first tests it ran considerably
faster than just creating threads and joining them for the parallel portions,
and still somewhat faster than when the threads are pre-created using a
ThreadPoolExecutor.
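
If you build the parallel tasks by hand, one way to match the task count to the
CPU count is to split the index range up front. The following is only a sketch
of that idea; RangePartitioner and split() are hypothetical helpers and are not
part of PTask, and this is not how DynamicParallelPTaskQueue or its factory
work internally.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the PTask library.
class RangePartitioner {
    // Splits the indices 0 .. length-1 into at most `parts` contiguous chunks.
    // Each int[] holds {firstIndex, lastIndex}, both inclusive.
    static List<int[]> split(int length, int parts) {
        List<int[]> ranges = new ArrayList<>();
        if (length <= 0 || parts <= 0) {
            return ranges;
        }
        int chunks = Math.min(parts, length);
        int base = length / chunks;
        int extra = length % chunks;   // the first `extra` chunks get one more item
        int start = 0;
        for (int i = 0; i < chunks; i++) {
            int size = base + (i < extra ? 1 : 0);
            ranges.add(new int[] {start, start + size - 1});
            start += size;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // One chunk per CPU reported by the JVM.
        int cpus = Runtime.getRuntime().availableProcessors();
        for (int[] r : split(200000, cpus)) {
            System.out.println(r[0] + " .. " + r[1]);
        }
    }
}
```

With two CPUs and 200000 items this yields the same two ranges used in the
example above (0 to 99999 and 100000 to 199999).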

Input data and results are managed using just an Object reference passed to the
process methods of the tasks. I used a synchronized Map in my tests and some
synchronized custom objects inside the map. What you use is entirely up to you;
the framework classes just pass the data object along without using it.
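
A minimal sketch of the synchronized-Map approach might look like this. The
class SharedDataDemo, the key names, and the choice of the mean as threshold
are assumptions made for illustration only; the framework does not prescribe
any of them.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical example, not part of the PTask library.
class SharedDataDemo {
    // Assumed keys; any layout works because the framework never looks inside.
    static final String INPUT = "input";
    static final String THRESHOLD = "threshold";

    // Builds the shared data object that would be handed to process(Object).
    static Object createData(double[] input) {
        Map<String, Object> map =
                Collections.synchronizedMap(new HashMap<String, Object>());
        map.put(INPUT, input);
        return map;
    }

    // What a threshold task might do inside process(Object data): cast the
    // reference back to the map, read the input, and store a result that
    // later tasks can pick up. The mean is used here purely as a placeholder.
    @SuppressWarnings("unchecked")
    static void computeThreshold(Object data) {
        Map<String, Object> map = (Map<String, Object>) data;
        double[] input = (double[]) map.get(INPUT);
        double sum = 0;
        for (double v : input) {
            sum += v;
        }
        map.put(THRESHOLD, input.length == 0 ? 0.0 : sum / input.length);
    }
}
```

Each task has to agree on the keys and value types, since the Object reference
carries no type information of its own.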

IMPORTANT:
- Accessing synchronized objects is slow! If you use synchronized objects for
  communication between your tasks, make sure to call them sparingly, as
  otherwise your parallelized code might run slower than the serial version.
  If you compute statistics, for example, compute local stats first and merge
  them into the global one once at the end of each ptask. You can find samples
  for this in the provided sample code.
- Do not block your tasks with waits or joins, as the framework currently does
  not detect this and may stall.
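
The "local first, merge once" advice can be sketched as follows. GlobalStats
and LocalFirstDemo are hypothetical names and are not taken from the provided
sample code. The plain threads and joins in run() are only a stand-in so that
the sketch runs without the library; inside PTask the body of accumulateRange
would sit in a ptask's process method and no join would be needed.

```java
// Hypothetical shared result object with coarse-grained synchronization.
class GlobalStats {
    private long count;
    private double sum;

    // Called once per task instead of once per data item.
    synchronized void merge(long otherCount, double otherSum) {
        count += otherCount;
        sum += otherSum;
    }

    synchronized long getCount() {
        return count;
    }

    synchronized double getSum() {
        return sum;
    }
}

class LocalFirstDemo {
    // Accumulates into local variables, then merges with one synchronized call.
    static void accumulateRange(double[] values, int from, int to, GlobalStats global) {
        long localCount = 0;
        double localSum = 0;
        for (int i = from; i <= to; i++) {   // no locking inside the loop
            localCount++;
            localSum += values[i];
        }
        global.merge(localCount, localSum);  // the only synchronized access
    }

    // Stand-in driver using two plain threads, one per half of the array.
    static GlobalStats run(double[] values) throws InterruptedException {
        GlobalStats global = new GlobalStats();
        int mid = values.length / 2;
        Thread a = new Thread(() -> accumulateRange(values, 0, mid - 1, global));
        Thread b = new Thread(() -> accumulateRange(values, mid, values.length - 1, global));
        a.start();
        b.start();
        a.join();
        b.join();
        return global;
    }
}
```

The number of synchronized calls then grows with the number of tasks rather
than with the number of data items.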


PTask Home
----------

You can find the latest update at
    http://www.jocware.de / http://www.jocware.com
and at
    https://sourceforge.net/projects/ptask.

Please mail support@jocware.de with questions, bug reports, and suggestions!


Have fun with PTasks!

...Jochen Riekhof



Release Notes
-------------

Version 0.81

Fixed bug in TicketPTaskManager.executeParallel that occasionally caused infinite wait.


Version 0.8

Initial public release
Source: README.txt, updated 2009-12-16