From: Stefan K. <en...@ho...> - 2010-07-08 11:36:38
On 08.07.2010 13:23, Konstantin Serebryany wrote:
> The stack trace from gdb suggests that your program is blocked on
> pthread_cond_wait, which does not necessarily mean there is a mutex
> deadlock.
> You might be waiting for some condition which never becomes true.
>
Thanks for the help. And this was actually the case. With some
combinations of sort | uniq I found the cause.

Stefan

> --kcc
>
> On Thu, Jul 8, 2010 at 2:18 PM, Stefan Kost <en...@ho...> wrote:
>
>> On 08.07.2010 12:30, Konstantin Serebryany wrote:
>>
>>> On Thu, Jul 8, 2010 at 1:09 PM, Stefan Kost <en...@ho...> wrote:
>>>
>>>> On 08.07.2010 11:34, Konstantin Serebryany wrote:
>>>>
>>>>> --tool=helgrind
>>>>>
>>>> Nope. helgrind does not complain. Does it run cycle checks on-the-fly?
>>>>
>>> Yes, http://valgrind.org/docs/manual/hg-manual.html#hg-manual.lock-orders
>>>
>> hm, then it should detect the problem indeed.
>>
>>>> Or how would it detect that the app deadlocked.
>>>>
>>> helgrind finds cycles in lock ordering, deadlock does not have to
>>> actually happen during the execution.
>>>
>>> Does your program use pthread_mutex_ or something else?
>>> Is the program dynamically linked?
>>>
>> The application is a benchmark for gstreamer, using glib's gthread
>> (which uses pthread on linux). The program is dynamically linked. If I
>> ctrl-c the app under gdb and dump all stackframes, I have a lot of
>> stackframes like the two below:
>> #0 0x0012d422 in __kernel_vsyscall ()
>> #1 0x00325af9 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142
>> #2 0x00328e1c in _L_cond_lock_826 () from /lib/tls/i686/cmov/libpthread.so.0
>> #3 0x00328c40 in __pthread_mutex_cond_lock (mutex=0x824e6b0) at ../nptl/pthread_mutex_lock.c:61
>> #4 0x003230b3 in pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/pthread_cond_wait.S:203
>> ...
>> and
>> #0 0x0012d422 in __kernel_vsyscall ()
>> #1 0x00323015 in pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/pthread_cond_wait.S:122
>> ...
>>
>> Stefan
>>
>>> --kcc
>>>
>>>> I was thinking of
>>>> writing a LD_PRELOAD based toy, there I would ctrl-c the app and then
>>>> run the cycle checks and dump the results. I have found no evidence in
>>>> the docs that I can signal helgrind to tell that the app has now deadlocked.
>>>>
>>>> Stefan
>>>>
>>>>> On Thu, Jul 8, 2010 at 12:30 PM, Stefan Kost <en...@ho...> wrote:
>>>>>
>>>>>> hi,
>>>>>>
>>>>>> is anyone aware of a valgrind tool that can help me to debug a deadlock
>>>>>> in a highly threaded program. The program can easily create hundreds of
>>>>>> threads.
>>>>>> What I am looking for is a tool that tracks for each thread which
>>>>>> mutexes are locked (incl. the stackframe of the lock) and if it is
>>>>>> waiting on a mutex (also including the stackframe). When the app
>>>>>> deadlocks, the collected data can be represented as a directed graph
>>>>>> ("thread -> mutex" for a held lock and "mutex -> thread" for a pending
>>>>>> lock) and one could run Tarjan's strongly connected components algorithm
>>>>>> [1][2] to detect cycles. For each found cycle it could print the
>>>>>> involved threads with the backtraces.
>>>>>>
>>>>>> Stefan
>>>>>>
>>>>>> [1] http://en.wikipedia.org/wiki/Tarjan%E2%80%99s_strongly_connected_components_algorithm
>>>>>> [2] http://www.logarithmic.net/pfh/blog/01208083168
>>>>>>
>>>>>> ------------------------------------------------------------------------------
>>>>>> This SF.net email is sponsored by Sprint
>>>>>> What will you do first with EVO, the first 4G phone?
>>>>>> Visit sprint.com/first -- http://p.sf.net/sfu/sprint-com-first
>>>>>> _______________________________________________
>>>>>> Valgrind-users mailing list
>>>>>> Val...@li...
>>>>>> https://lists.sourceforge.net/lists/listinfo/valgrind-users
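
For reference, the graph idea from the original question (a "thread -> mutex" edge for a held lock, a "mutex -> thread" edge for a pending lock, then Tarjan's strongly connected components to find cycles) could look roughly like the sketch below. This is only an illustration, not how helgrind works internally. The function names and the `held`/`pending` input dictionaries are made up for the example; a real tool would have to collect that data itself, e.g. from an LD_PRELOAD wrapper as suggested in the thread.

```python
from collections import defaultdict


def build_graph(held, pending):
    """Build the directed graph described in the question.

    held:    {thread: [mutexes it currently holds]}   (hypothetical input)
    pending: {thread: mutex it is blocked on}         (hypothetical input)
    """
    graph = defaultdict(list)
    for thread, mutexes in held.items():
        for mutex in mutexes:
            graph[thread].append(mutex)   # "thread -> mutex": held lock
    for thread, mutex in pending.items():
        graph[mutex].append(thread)       # "mutex -> thread": pending lock
    return graph


def strongly_connected_components(graph):
    """Tarjan's algorithm, recursive form. Returns a list of components."""
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, ()):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        if lowlink[node] == index[node]:
            # node is the root of a component: pop its members off the stack
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            components.append(component)

    for node in list(graph):
        if node not in index:
            visit(node)
    return components


def deadlock_cycles(held, pending):
    """Components with more than one node contain a lock cycle."""
    graph = build_graph(held, pending)
    return [sorted(c) for c in strongly_connected_components(graph) if len(c) > 1]


# T1 holds M1 and waits for M2; T2 holds M2 and waits for M1; T3 is unrelated.
held = {"T1": ["M1"], "T2": ["M2"], "T3": ["M3"]}
pending = {"T1": "M2", "T2": "M1"}
print(deadlock_cycles(held, pending))   # [['M1', 'M2', 'T1', 'T2']]
```

Note that a thread parked in pthread_cond_wait on a condition that never becomes true adds no cycle to such a graph, so a check like this would stay silent for the problem that was actually found in this thread. The recursive form is fine for a few hundred threads; a much larger graph would need an iterative variant to stay within Python's recursion limit.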