Re: [Algorithms] General purpose task parallel threading approach
Brought to you by:
vexxed72
|
From: <asy...@gm...> - 2009-04-04 09:22:36
|
Replace '870 million' with '870 thousands' in my previous post.
Sorry for mistake
Alexander.
2009/4/4 <asy...@gm...>
> >Have you measured it?
> I composed a quick sample - 2 threads which ping pong each other via 2
> events (1 core utilized in total), each thread did 10 million iterations of
> passing execution to another thread (20 million execution passes in total),
> and did the same test with tasks.
> Win32 threads: 85seconds ~ 236 thousands task switches per second
> asynclib: 23seconds ~ 870 million task switches per second
>
> Then I also tried hacking in scheduler optimization I'm about to start
> working on (which removes polling of the above mentioned semaphore - the
> slowest part of the current scheduler), the result is 4seconds ~ 5million
> task switches per second when you avoid using any windows api.
>
> Regards,
> Alexander.
>
> Sample I used:
>
>
> HANDLE hEvent1;
> HANDLE hEvent2;
> async::Event Event1;
> async::Event Event2;
>
> static const int nIterations = 10000000;
> DWORD __stdcall Thread1_Proc(void*)
> {
> for (int i = 0; i < nIterations; ++i)
> {
> WaitForSingleObject(hEvent1, INFINITE);
> ResetEvent(hEvent1);
> SetEvent(hEvent2);
> }
> return 0;
> }
> DWORD __stdcall Thread2_Proc(void*)
> {
> for (int i = 0; i < nIterations; ++i)
> {
> WaitForSingleObject(hEvent2, INFINITE);
> ResetEvent(hEvent2);
> SetEvent(hEvent1);
> }
> return 0;
> }
>
> void ASYNC_API Task1_Proc(void*)
> {
> for (int i = 0; i < nIterations; ++i)
> {
> async::Wait(Event1);
> Event1.Reset();
> Event2.Signale();
> }
> }
> void ASYNC_API Task2_Proc(void*)
> {
> for (int i = 0; i < nIterations; ++i)
> {
> async::Wait(Event2);
> Event2.Reset();
> Event1.Signale();
> }
> }
>
> void Profile_Threads()
> {
> hEvent1 = CreateEvent(NULL, TRUE, FALSE, NULL);
> hEvent2 = CreateEvent(NULL, TRUE, FALSE, NULL);
>
> HANDLE hThread1 = CreateThread(NULL, 4096, &Thread1_Proc, NULL, 0, NULL);
> HANDLE hThread2 = CreateThread(NULL, 4096, &Thread2_Proc, NULL, 0, NULL);
>
> int nTicks = GetTickCount();
> SetEvent(hEvent1);
> WaitForSingleObject(hThread1, INFINITE);
> WaitForSingleObject(hThread2, INFINITE);
> nTicks = GetTickCount() - nTicks;
>
> printf("thread ticks %03f\n", (float) ((double)nTicks / 1000.0));
> CloseHandle(hThread1);
> CloseHandle(hThread2);
>
>
> async::Initialize(0, 0);
> async::Task* pTask1; async::Task* pTask2;
> async::Execute(&Task1_Proc, 0, &pTask1, NULL, 0, 512, NULL);
> async::Execute(&Task2_Proc, 0, &pTask2, NULL, 0, 512, NULL);
>
> nTicks = GetTickCount();
> Event1.Signale();
> async::WaitAll(*pTask1, *pTask2);
> nTicks = GetTickCount() - nTicks;
>
> printf("task ticks %03f\n", (float) ((double)nTicks / 1000.0));
>
> pTask1->Release();
> pTask2->Release();
> async::Shutdown();
> }
>
>
> 2009/4/4 Jon Watte <jw...@gm...>
>
>> asy...@gm... wrote:
>>
>> > To fight blocking conditions I've made a task suspension mechanics,
>> > which allows a worker thread to suspend a task when it meets something
>> > it should wait for (whether it is a mutex or an on going disk read).
>> > After task is suspended, worker thread picks up another task from the
>> > task pool, and keeps on executing it. When blocking condition ends,
>> > task is resumed, and put back into a task pool.
>>
>> You are basically combining fibers (co-operative threads) with
>> pre-emptive threads. There really isn't that much cost difference
>> between a pre-emptive thread and a fiber, so you might as well just
>> spawn another pre-emptive thread, and suspend the current thread, rather
>> than switching fiber on the current thread. If you're using a
>> dynamically growing thread pool, your worst-case thread count (and thus
>> memory usage) will be no worse than your worst-case fiber count.
>>
>> >
>> > - a unified way to wait on primitives (you can wait on all sort of
>> > primitives including tasks together)
>> > - task canceling
>> > - timeout on wait operations
>>
>> We do that in our fibers, too. However, it turns out that on most OS-es,
>> you can do that for OS primitives too. On Windows, pretty much anything
>> can go into WaitForMultipleObjects() (although there is a 64-object
>> limitation). On UNIX, as long as something is a file descriptor, you can
>> wait on it using select().
>>
>> So, if instead of "suspending" the current "task" and picking another
>> task within the worker thread -- which I presume to be exactly like
>> fiber switching -- you instead just suspend the current thread and
>> create a new thread in the thread pool, what would the difference be?
>> Have you measured it?
>>
>> Sincerely,
>>
>> jw
>>
>>
>>
>> ------------------------------------------------------------------------------
>> _______________________________________________
>> GDAlgorithms-list mailing list
>> GDA...@li...
>> https://lists.sourceforge.net/lists/listinfo/gdalgorithms-list
>> Archives:
>> http://sourceforge.net/mailarchive/forum.php?forum_name=gdalgorithms-list
>>
>
>
|