Re: [Algorithms] General purpose task parallel threading approach
From: <asy...@gm...> - 2009-04-05 17:31:52
The cost of starting a task is the cost of allocating memory for it (you can provide your own allocator, by the way). You can also provide already allocated memory. There is no additional cost.

The cost of a "switch" is not actually the cost of a context switch (which is just a few cycles, really), but the total cost of setting up a wait operation on a synchronization primitive and then resuming the task (in my benchmark it was blocking on an event).

Alexander.

2009/4/5 Sebastian Sylvan <seb...@gm...>
>
> On Sat, Apr 4, 2009 at 7:29 AM, <asy...@gm...> wrote:
>
>> Hi Jon,
>>
>> >you instead just suspend the current thread and
>> >create a new thread in the thread pool, what would the difference be?
>> The difference here is speed.
>>
>> I have an implementation which does the same but with threads (for
>> debugging purposes - you can find it in the package as well), and it is
>> significantly slower. While my implementation still touches a semaphore
>> on each task switch, it can push about a million task switches per second
>> now. I actually didn't measure performance with threads, but will try
>> doing that later.
>> My target for a task switch cost is about 100-200 cycles, which is about
>> 10-15 million task switches per second on a 3 GHz CPU.
>
> Hi, I didn't look at the source code, but am I to understand that this
> "switch" you're talking about happens every single time you start a new
> task? I.e. when a worker runs tasks A, B, and C, none of which does any
> blocking, you will pay that cost 3 times?
> If not, then what is the additional cost of starting a task, given that
> each task is now a fibre?
>
> Assuming I did understand you correctly, I'll just say that I think adding
> (even slight) overhead to the "run" method of every single task in order
> to make blocking cheaper is a big mistake. The whole motivating idea behind
> task parallelism is that you saturate the system with far more runnable
> tasks than there are hardware threads, so that performance can scale with
> the hardware and no CPU capacity is left idle. If the number of runnable
> tasks is much larger than the actual number of worker threads, most tasks
> will just run serially on the same worker that spawned them, which means
> that most synchronisation points won't block, which means that the number
> of thread switches will be small compared to the number of task executions,
> which means that optimizing for fast blocking is not as important as making
> sure that running a task is dirt cheap.
>
> --
> Sebastian Sylvan
> +44(0)7857-300802
> UIN: 44640862

--
Regards, Alexander Karnakov
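
To illustrate the point above about task start-up cost, here is a minimal C++ sketch, with hypothetical names rather than the actual API of the library under discussion: starting a task reduces to a single allocation through a caller-supplied allocator, or to no allocation at all when the caller hands over memory it already owns.

#include <cstddef>
#include <new>

// Hypothetical task record: just an entry point and a user argument.
struct Task {
    void (*run)(void* arg);
    void*  arg;
};

// Hypothetical allocator hook: the scheduler calls this instead of
// ::operator new, so the application can plug in a pool or arena allocator.
using TaskAlloc = void* (*)(std::size_t size);

// Start a task through the caller-provided allocator: one allocation,
// nothing else.
Task* start_task(void (*fn)(void*), void* arg, TaskAlloc alloc) {
    void* mem = alloc(sizeof(Task));
    return new (mem) Task{fn, arg};
}

// Start a task in memory the caller already allocated: no allocation at all.
Task* start_task_in_place(void* storage, void (*fn)(void*), void* arg) {
    return new (storage) Task{fn, arg};
}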
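
Sebastian's saturation argument can be made concrete as well. The sketch below (again with hypothetical names, and a spin-wait standing in for the real blocking path) shows a join counter at a synchronisation point: when the spawned children have already run, which is the common case once runnable tasks far outnumber workers, the wait is a single atomic load and no fibre or thread switch happens at all.

#include <atomic>

// Hypothetical join counter shared by a parent task and its children.
struct JoinCounter {
    std::atomic<int> pending{0};
};

void on_child_spawned(JoinCounter& jc) {
    jc.pending.fetch_add(1, std::memory_order_relaxed);
}

void on_child_finished(JoinCounter& jc) {
    jc.pending.fetch_sub(1, std::memory_order_acq_rel);
}

// Stand-in for the expensive path: a real scheduler would set up a wait on a
// synchronisation primitive and resume the fibre later; here we simply spin.
void block_until_zero(JoinCounter& jc) {
    while (jc.pending.load(std::memory_order_acquire) != 0) {
        // a real scheduler would run other tasks or suspend the fibre here
    }
}

// Called by the parent task at a synchronisation point.
void wait_for_children(JoinCounter& jc) {
    if (jc.pending.load(std::memory_order_acquire) == 0)
        return;               // common case: children already done, no switch
    block_until_zero(jc);     // rare case: pay the expensive switch here
}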