Re: [Algorithms] General purpose task parallel threading approach
From: Jarkko L. <al...@gm...> - 2009-04-11 16:11:28
The word "priority" is commonly used in task scheduling (e.g. in the GDC talk I pointed out earlier) to define preferred order of task execution. I'm sure I don't have to explain basic scheduling terminology to you, right? Anyway, the system I described is generic and in fact less restrictive than what you have been talking about: You are not forced to statically define dependencies in the code flow (you can do it though), thus you can gain better throughput. But seems like you have firmly set your mind on specific way of doing multi-threading and adamantly believe it results in optimal throughput regardless what people tell you, so good luck with that. Cheers, Jarkko _____ From: Sebastian Sylvan [mailto:seb...@gm...] Sent: Saturday, April 11, 2009 2:55 PM To: Game Development Algorithms Subject: Re: [Algorithms] General purpose task parallel threading approach On Sat, Apr 11, 2009 at 10:20 AM, Jarkko Lempiainen <al...@gm...> wrote: I was talking about generic task system before. Later on you're saying that it's not a generic task system. By "generic" I mean "can handle all current kinds of control flow", which yours can't, as far as I understand you. Dependencies impact to the task priorities Yes I see what you mean now, but it's very confusing to use the word "priority" here. I was assuming you were talking about a general task system (which *requires* an on-line, as opposed to an ahead-of-time, scheduler - meaning that the word "priority" could only mean a dynamic priority that shuffles things around while tasks are already running). , because you want the longest chain of work (critical path) to start as early as possible in the frame. I think it's worth mentioning yet again that work stealing, an on-line scheduler which doesn't have your limitations, still only depends on the critical path with a constant factor (which, empirically, turns out to be very close to 1 in practice). In contrast, your system *doesn't* have a running time bound that's depends on the critical path with a constant factor as far as I can tell (this is back-of-a-napkin, feel free to do a rigorous analysis)! So I'd be careful claiming that your system is inherently more efficient if asymptotically it's actually worse. It may be better in practice (the constant involved in works stealing may be large), but you'd definitely need empirical data to show this... in games we are in fortunate position that pretty much the same call graph is ran from one frame to the next, so it's possible to exploit frame-to-frame coherency to prioritize tasks and have pretty good idea what's going to happen ahead of time. It's not perfect of course (i.e. when a task is run for the first time there is no priority for it), but it's close and definitely better than not prioritizing at all. Where's your evidence for this? If you compare your *restricted* system running with and without priorities then I'm not surprised, but you have to compare it to a system that isn't as restricted, and actually won't consider something a dependency until it knows for a fact that it is, since this has the potential to drastically affect throughput. The way I see it *loads* of functions start off by checking some internal value or parameter and basically touching different data based on the result. And these functions call *other* functions which do the same thing. 
If your task system has to pessimistically consider *any* data it could possibly touch as a dependency, rather than only the *actual* data it touches, then this would cause the dependency graph to explode exponentially, which could severely restrict your throughput. And like I pointed out before, this still assumes that your code is of the very restricted form that even allows pessimistic dependency tracking; some very simple and common control flow would simply be mathematically impossible to schedule ahead of time (I gave a few examples earlier).

> In my implementation I don't define dependencies beforehand like you seem to assume you would have to do, but I define them at the call site, so there isn't external maintenance to do and the usage is pretty much like calling a function (1-2 lines of code), so it's quite convenient to use. One motivation for the job queue I implemented was that it should be simple to use, so that you could more easily take advantage of opportunities for parallelism.

How does that actually work? If I spawn a task in the function foo, then surely you must execute foo before you know what task, if any, I've spawned (it may depend on dynamic data, after all)? Now imagine foo is actually called from within another task - doesn't that mean you have to start executing tasks before you know what tasks are available, meaning you need at least some form of on-line scheduler?

--
Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862
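For illustration, the kind of dynamic, data-dependent spawning described above might look roughly like this in C++ (spawn() is a hypothetical helper sketched with std::async here, not either poster's actual API):

    #include <future>
    #include <vector>

    // Hypothetical spawn() helper, approximated with std::async just to keep
    // the sketch self-contained; a real job queue would push onto worker threads.
    template <typename F>
    auto spawn(F f) { return std::async(std::launch::async, f); }

    int bar(int x) { return x * 2; }

    // Whether foo spawns a task at all depends on data only known at run time,
    // so no scheduler can see this task before foo is already executing.
    int foo(const std::vector<int>& data)
    {
        if (data.empty())
            return 0;                                   // no task spawned
        auto t = spawn([&] { return bar(data.front()); });
        return t.get();
    }

    // foo itself runs inside another task, so new tasks keep appearing while
    // earlier tasks are in flight - the scheduling decisions have to be on-line.
    int root(const std::vector<int>& data)
    {
        auto t = spawn([&] { return foo(data); });
        return t.get();
    }

The point being made in the thread: the set of tasks is only discovered as the tasks themselves run.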