|
From: Phil L. <plo...@sa...> - 2013-01-17 19:12:55
|
I'm using helgrind to look for lock order violations, and got one:

==14876== ----------------------------------------------------------------
==14876==
==14876== Thread #6: lock order "0xD13D7C8 before 0xCFA7BB4" violated
==14876==
==14876== Observed (incorrect) order is: acquisition of lock at 0xCFA7BB4
==14876==    (stack unavailable)
==14876==
==14876==  followed by a later acquisition of lock at 0xD13D7C8
==14876==    at 0x57748: pthread_mutex_lock (hg_intercepts.c:506)
==14876==    by 0x86F362E: osiSemaphore::Take(int) const (osiSemaphore.h:357)
==14876==    by 0x86F36D0: osiRWSemaphore::wTake(int) (osiSemaphoreLinux.cpp:423)
==14876==    by 0x86F377A: osiRWSemAutoP::osiRWSemAutoP(osiSemaphoreMode, osiRWSemaphore*) (osiSemaphoreLinux.cpp:503)
==14876==    by 0x862147F: pdbMirrorLink::CreateOutNotification(char const*, int, mirrorLinkNotifyIn*) (pdbMirrorLink.cpp:1264)
==14876==    by 0x8621E15: pdbMirrorLink::UpdateNotifications(bool, unsigned int) (pdbMirrorLink.cpp:1388)
==14876==    by 0x865A78C: pdbTreeCache::ActivateIndex(unsigned int) (pdbTreeCacheNode.cpp:1866)
==14876==    by 0x861737D: pdbMirrorLinkCacheHelper::_ActivateIndex(osiJQData*) (pdbMirrorLink.h:544)
==14876==    by 0x85E0D88: osiJobQueue::entry() (osiJob.cpp:119)
==14876==    by 0x86F0A88: osiThread::MyRun(void*) (osiThreadLinux.cpp:194)
==14876==    by 0x86EF814: threadMain(void*) (osiThreadLauncherLinux.cpp:50)
==14876==    by 0x5A608: mythread_wrapper (hg_intercepts.c:219)

What is the cause of "stack unavailable"? This error message doesn't give me enough information to go on, since it doesn't tell me anything about what the first incorrectly locked mutex is or about the stacks establishing the correct locking order.

Phil

-----
Phil Longstaff
Senior Software Engineer
x2904
|
From: Philippe W. <phi...@sk...> - 2013-01-18 05:59:07
|
On Thu, 2013-01-17 at 19:00 +0000, Phil Longstaff wrote:
> What is the cause of “stack unavailable”? This error message doesn’t
> give me enough information to go on, since it doesn’t tell me anything
> about what the first incorrectly locked mutex is or about the stacks
> establishing the correct locking order.

I suspect this can happen when the (invalid) locking chain is complex (e.g. sequences more complex than lockA,lockB and lockB,lockA).

I have not dug any deeper. There is a small helgrind regression test which also produces this output (helgrind/tests/tc14_laog_dinphils.vgtest).

It would be nice to investigate further ...

Philippe
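To make "more complex than lockA,lockB and lockB,lockA" concrete, here is a minimal sketch of a three-lock chain in which no single pair of acquisitions is directly reversed, yet the orders together form a cycle. The lock names and the functions take_pair() and run_lock_cycle() are made up for illustration; this is not the tc14_laog_dinphils test and not code from the original report. It runs in a single thread so it cannot actually deadlock, and whether helgrind prints "(stack unavailable)" for it is an assumption I have not verified.

```c
#include <pthread.h>

/* Three hypothetical locks; the names are illustrative only. */
static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_c = PTHREAD_MUTEX_INITIALIZER;

/* Acquire `first`, then `second`, then release both in reverse order.
 * Returns 0 on success or the pthread error code on failure. */
static int take_pair(pthread_mutex_t *first, pthread_mutex_t *second)
{
    int rc = pthread_mutex_lock(first);
    if (rc != 0)
        return rc;

    rc = pthread_mutex_lock(second);
    if (rc != 0) {
        pthread_mutex_unlock(first);
        return rc;
    }

    pthread_mutex_unlock(second);
    pthread_mutex_unlock(first);
    return 0;
}

/* Establish the orders A-before-B and B-before-C, then take C before A.
 * The last step contradicts the order implied by the first two, even
 * though the pair (A, C) was never locked together before. */
int run_lock_cycle(void)
{
    int rc;

    if ((rc = take_pair(&lock_a, &lock_b)) != 0)  /* A before B */
        return rc;
    if ((rc = take_pair(&lock_b, &lock_c)) != 0)  /* B before C */
        return rc;
    return take_pair(&lock_c, &lock_a);           /* C before A closes the cycle */
}
```

Calling run_lock_cycle() from main(), building with something like "gcc -g -pthread", and running the result under "valgrind --tool=helgrind" should give a small case to compare against the report above. The point of the sketch is that the implied order "A before C" was never observed as a single acquisition pair.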
|
From: Julian S. <js...@ac...> - 2013-01-18 08:38:55
|
On 01/18/2013 06:59 AM, Philippe Waroquiers wrote:
> On Thu, 2013-01-17 at 19:00 +0000, Phil Longstaff wrote:
>
>> What is the cause of “stack unavailable”? This error message doesn’t
>> give me enough information to go on, since it doesn’t tell me anything
>> about what the first incorrectly locked mutex is or about the stacks
>> establishing the correct locking order.
> I suspect this can happen when the (invalid) locking chain is complex
> (e.g. sequences more complex than lockA,lockB and lockB,lockA).
>
> I have not dug any deeper. There is a small helgrind regression test
> which also produces this output (helgrind/tests/tc14_laog_dinphils.vgtest).

I suspect there's a good reason for it, but I can't remember what it is, and from a quick scan of the lock-order checking code, I can't guess what it is. Philippe hacked on this stuff more recently than I did, so he might be able to guess.

In any case, at least we have a small test case in the tc14_laog_dinphils test.

J