Re: [Algorithms] General purpose task parallel threading approach
From: Jon W. <jw...@gm...> - 2009-04-05 05:32:54
Jarkko Lempiainen wrote:
> I think you really need to target both task and data level parallelism
> to make your game engine truly scalable for many-core architectures. I don't
> see why you would need to consider the cost of context switching in relation
> to the cost of a task execution though, since if you have a job queue, each
> worker thread just pulls tasks from the queue without switching context.

If you are fully FIFO in your data flow, then that is true. However, in
real code, you end up with data flow dependencies. For example, you
typically want to do simulation something like:

  extract state from previous iteration of physics frame
  run collision/intersection tests
  push state over to renderer
  read user input
  extract entity behavior based on collision and input state
  kick off new physics frame
  present the frame (vsync)

There are, unfortunately, a number of intra-frame dependencies here, such
as the entity behavior needing the output of the collision tests. (Can I
jump? That depends on whether I have my feet on the ground.) You may even
have intra-frame dependencies within the entity behavior job, where one
entity can cause more behavior for other entities (explosions, triggers,
etc).

You can set this up to be totally FIFO, and thus be able to start all the
jobs in parallel, but you will be introducing latency -- sometimes as much
as two additional frames, which for certain game types is noticeable as
sluggish controls.

Now, most of these dependencies are of the form "this task can't start
until that task has ended," which can be implemented in a multi-threaded
pool without blocking: if a single thread runs low on work, it simply
busy-waits until some prerequisite finishes (there is a sketch of this
scheme at the end of this message). Or you have a queue of background
work, such as pathfinding queries -- but sometimes that queue will be
empty anyway. Or you can use a blocking primitive of some sort (fibers or
threads) to deal with the case where the queue is temporarily stalled,
waiting on the output of something else that's executing out of the
queue. Depending on how much of this happens each frame, the overhead may
or may not matter.

This brings up another interesting point, which you also remarked on: the
workload for a game is generally quite different from the workload of a
server. Typically, you pre-load everything you want and just run in-core.
The cases where that breaks down are various large-world games (streaming
worlds, MMOs, etc.) where there really are "unplanned" loads of data, and
being able to have a task block until the load completes would sometimes
be convenient. However, this typically happens seldom enough (compared to
the other stuff) that a state machine approach, or a simple threading
approach, really isn't that punitive.

Sincerely,

jw
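[Editor's note: to make the "this task can't start until that task has
ended" scheme above concrete, here is a minimal C++ sketch. It is not code
from the original post; Task, JobQueue, and all other names are invented
for illustration. Each task carries an atomic count of unfinished
prerequisites, and a worker that finds nothing runnable simply yields and
retries until some prerequisite completes and pushes its dependents.]

    // Minimal sketch of a non-blocking dependency scheme (assumed names).
    #include <atomic>
    #include <deque>
    #include <functional>
    #include <mutex>
    #include <thread>
    #include <vector>

    struct Task {
        std::function<void()> work;
        std::atomic<int>      pending{0};   // prerequisites not yet finished
        std::vector<Task*>    dependents;   // tasks waiting on this one
    };

    class JobQueue {
    public:
        // A task with no unfinished prerequisites goes straight onto the queue;
        // otherwise it is pushed later, when its last prerequisite completes.
        void submit(Task* t) {
            if (t->pending.load() == 0) push(t);
        }

        // Declare "after" may not start until "before" has ended.
        // (Call during frame setup, before any of these tasks run.)
        static void add_dependency(Task* before, Task* after) {
            after->pending.fetch_add(1);
            before->dependents.push_back(after);
        }

        // Worker loop: pull ready tasks; if none are ready yet, yield and spin --
        // the "busy-wait until some prerequisite finishes" case from the post.
        void worker(std::atomic<bool>& running) {
            while (running.load()) {
                Task* t = pop();
                if (!t) { std::this_thread::yield(); continue; }
                t->work();
                // Completing a task may make its dependents runnable.
                for (Task* d : t->dependents)
                    if (d->pending.fetch_sub(1) == 1) push(d);
            }
        }

    private:
        void push(Task* t) {
            std::lock_guard<std::mutex> lk(m_);
            q_.push_back(t);
        }
        Task* pop() {
            std::lock_guard<std::mutex> lk(m_);
            if (q_.empty()) return nullptr;
            Task* t = q_.front();
            q_.pop_front();
            return t;
        }
        std::mutex        m_;
        std::deque<Task*> q_;
    };

Usage would look something like: build one Task per frame step from the
list above, call JobQueue::add_dependency(&collision, &entity_behavior)
for each intra-frame edge, submit the roots, and let each worker thread
run worker(). The yield loop is deliberately the busy-wait the post
describes; a production job system would likely add work stealing or a
condition variable so idle workers don't burn cycles.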