|
From: Peter T. <pt...@li...> - 2011-11-24 09:58:48
|
Hi masters of the memory problems...

I have an annoying problem where I need some general hints. As you might know already, I have been using valgrind (mostly memcheck) for a long time, so the basics are OK.

My current Linux program gives one valgrind error, at random, in roughly 0.01% of the runs. In these rare cases I see valgrind barfing with

==8110== Thread 3:
==8110== Invalid read of size 8
==8110==    at 0x4B9741C: XXEqvm (in /home/pto/tester/somelib-O.so)

I know that the program has 3 threads (digging into /proc//tasks/ -> 3 dirs), and I do not have full source code for this somelib-O :/ but it seems linked to Linux system calls to clone().

I also experience an unexpectedly high memory drain in certain cases, which could relate to the same problem.

Question: Can I ... or should I ... also throw "valgrind --tool=helgrind" into the game? I have not been using this yet. Are there any other tools within the valgrind family that I should focus on?

If I run valgrind --tool=massif on the same program then I normally see very modest memory usage, but should I use any special voodoo tricks since the program is multi-threaded?

(and a huge general THANX to you valgrind developers - super super work out there)

/pto
--
Peter Toft, PhD
http://petertoft.dk
|
From: Julian S. <js...@ac...> - 2011-11-24 10:42:25
|
> Question; Can I ... or should I ... also throw "valgrind
> --tool=helgrind" into the game? I have not been using this yet.

You can use --tool=helgrind or --tool=drd to check threaded programs for errors. They provide similar functionality but using different algorithms, and using both is often worthwhile.

One critical thing for both tools is, though, that the application under test must use only POSIX pthread_ functions for thread synchronisation. If that is not the case, both tools will report huge numbers of false errors. I'd strongly advise you to read the documentation for them at http://valgrind.org/docs/manual/manual.html before spending a lot of time on this.

That said, if you do manage to get them to work well, they are pretty effective at finding threading errors which may otherwise be almost unfindable.

J
|
From: Peter T. <pt...@li...> - 2011-11-24 11:06:59
|
On Thu, 24 Nov 2011 11:34:25 +0100, Julian Seward wrote:

>> Question; Can I ... or should I ... also throw "valgrind --tool=helgrind" into the game? I have not been using this yet.
>
> You can use --tool=helgrind or --tool=drd to check threaded programs for errors. They provide similar functionality but using different algorithms, and using both is often worthwhile.
>
> One critical thing for both tools is, though, that the application under test must use only POSIX pthread_ functions for thread synchronisation. If that is not the case, both tools will report huge numbers of false errors. I'd strongly advise you to read the documentation for them at http://valgrind.org/docs/manual/manual.html [1] before spending a lot of time on this.
>
> That said, if you do manage to get them to work well, they are pretty effective at finding threading errors which may otherwise be almost unfindable.
>
> J

Thanx Julian

Since my problem happens very seldom, one follow-up question is whether I can combine --tool=memcheck and --tool=drd in one run?

/pto
--
Peter Toft, PhD
http://petertoft.dk

Links:
------
[1] http://valgrind.org/docs/manual/manual.html
|
From: Julian S. <js...@ac...> - 2011-11-24 13:33:50
|
On Thursday, November 24, 2011, Peter Toft wrote:

> Since my problem happens very seldom, one follow-up question is whether
> I can combine --tool=memcheck and --tool=drd in one run?

No, you can't.

But note: it might be that the problem occurs all the time (a race of some kind, for example) but that you only very seldom observe its effect via memcheck. So it might be that Helgrind or DRD can tell you something about it even in the 99.whatever% of cases in which you think the problem does not occur.

(this is only a guess, btw ..)

J
|
From: Peter T. <pt...@li...> - 2011-11-24 17:06:49
|
On Thu, 24 Nov 2011 14:25:57 +0100, Julian Seward wrote:

> On Thursday, November 24, 2011, Peter Toft wrote:
>> Since my problem happens very seldom, one follow-up question is whether I can combine --tool=memcheck and --tool=drd in one run?
>
> No, you can't.
>
> But note: it might be that the problem occurs all the time (a race of some kind, for example) but that you only ever observe its effect via memcheck very seldom. So it might be that Helgrind or DRD can tell you something about it even in the 99.whatever% of cases that you think the problem does not occur in.
>
> (this is only a guess, btw ..)
>
> J

He he - life is evil.

I find that Helgrind and DRD report the same, and I see the problems very, very seldom (similar to memcheck):

==4698== drd, a thread error detector
==4698== Copyright (C) 2006-2010, and GNU GPL'd, by Bart Van Assche.
==4698== Using Valgrind-3.6.0 and LibVEX; rerun with -h for copyright info
==4698== Command: ./mem2
==4698==
==4698== Destroying locked mutex: mutex 0x4fc0808, recursion count 1, owner 3.
==4698==    at 0x490B014: pthread_mutex_destroy (drd_pthread_intercepts.c:569)
...
==4698==    by 0x3F05C333A4: exit (in /lib64/libc-2.5.so)
==4698==    by 0x3F05C1D99A: (below main) (in /lib64/libc-2.5.so)
==4698== mutex 0x4fc0808 was first observed at:
==4698==    at 0x490AA8D: pthread_mutex_init (drd_pthread_intercepts.c:546)

I must admit that my insight into multi-threaded programming is too sloppy to tell how bad this is. Comments are welcome...

--
Peter Toft, PhD
http://petertoft.dk
|
From: Julian S. <js...@ac...> - 2011-11-25 10:03:23
|
> I find that Helgrind and DRD reports the same - and I see the problems
> very very seldom (similar to memcheck) ,
>
> ==4698== Destroying locked mutex: mutex 0x4fc0808, recursion count 1, owner 3.
> ==4698==    at 0x490B014: pthread_mutex_destroy (drd_pthread_intercepts.c:569)
> ...
> ==4698==    by 0x3F05C333A4: exit (in /lib64/libc-2.5.so)
> ==4698==    by 0x3F05C1D99A: (below main) (in /lib64/libc-2.5.so)
> ==4698== mutex 0x4fc0808 was first observed at:
> ==4698==    at 0x490AA8D: pthread_mutex_init (drd_pthread_intercepts.c:546)
>
> I must admit, that my insights to multi-threaded programming is too
> sloppy to tell how bad this is.

Destroying a locked mutex is potentially serious, since it could be that some other thread is waiting to acquire it, and now (post destroy, and overwrite of the memory area) can't. Hence deadlock is a possibility.

J