Re: [Algorithms] General purpose task parallel threading approach
From: Sebastian S. <seb...@gm...> - 2009-04-11 16:49:53
On Sat, Apr 11, 2009 at 5:11 PM, Jarkko Lempiainen <al...@gm...> wrote:

> The word “priority” is commonly used in task scheduling (e.g. in the GDC
> talk I pointed out earlier) to define preferred order of task execution.
> I’m sure I don’t have to explain basic scheduling terminology to you,
> right? Anyway, the system I described is generic and in fact less
> restrictive than what you have been talking about: You are not forced to
> statically define dependencies in the code flow (you can do it though),
> thus you can gain better throughput.

The confusion came from the fact that this discussion was about general
task parallelism, not the restricted form you're talking about, and in that
context "priority" in the sense you meant doesn't really make sense (since
dependencies would be enforced outside the priority system). I'm just
trying to explain why I misunderstood you the first time around. I know
what you meant now, and we can stop arguing semantics.

I tried to be very clear about what I meant by static/dynamic. By "static"
I meant that you need to know the dependencies before you start executing
tasks. Is this incorrect? I understand that you can dynamically change the
graph of tasks on a frame-by-frame basis, but you still can't have
arbitrary dynamic control flow in the tasks themselves (because then it
would be impossible to know the dependencies ahead of time, which your
scheduler needs).

> But seems like you have firmly set your mind on specific way of doing
> multi-threading and adamantly believe it results in optimal throughput
> regardless what people tell you, so good luck with that.

We're talking about code here, not personalities. I'm trying to point out
that your system is, as a matter of mathematical fact (not "adamant
beliefs"), not a general task system, because it imposes restrictions on
the kinds of code you can run through it.
I'm not disregarding what people tell me; I'm happy to be shown, and have
repeatedly asked for, facts, evidence or even just well-reasoned
speculation. So if you think I'm wrong, please explain where I'm going
wrong rather than bailing out with an ad hominem attack. Here, I'll even
summarize my position for you so you can do it more easily:

1. Your system is an ahead-of-time scheduler, in that the dependencies of
   your task system must be known before you start executing anything. The
   graph can still change from frame to frame, but dependencies must be
   known before execution.
2. By definition, certain kinds of control flow cannot be scheduled ahead
   of time (i.e. it's mathematically impossible).
3. Therefore your system is not a general system, because it does not
   allow dynamic control flow.
4. Dynamic control flow is very common in real code.
5. Your system must assume "worst case" dependencies for each task, i.e.
   if a task has an if-else clause, that task must depend on the data used
   in both branches. Since nested branches form a tree, this gives you
   exponentially more dependencies in your graph.
6. I haven't done the maths, but I *suspect* that (5) means the asymptotic
   time complexity w.r.t. the critical path is not constant, but at least
   linear and probably exponential (it's a bit tricky to figure out
   exactly; maybe I'm just thick).
7. In order to schedule dynamic control flow you need an on-line
   scheduler, by definition.
8. Work stealing is an on-line scheduler with provable efficiency (it's
   asymptotically optimal).
9. (6) and (8) mean you can't claim your ahead-of-time scheduler is better
   than work stealing in any absolute sense. I'd be happy to see
   benchmarks of specific instances where it is, but so far every
   benchmark I've seen has had work stealing come out on top.
10. All these points taken together mean that an on-line scheduler based
    on work stealing gives me:

Where am I going wrong here?

-- 
Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862