From: Dennis L. <pla...@tz...> - 2005-11-10 22:27:37
Hi,

in the good old days of helgrind, I was able to track down some race conditions in our application. Now that helgrind isn't working anymore, I'm facing a big problem with some heisenbugs. Core dumps are corrupted, and simply running within gdb or valgrind does not trigger the bug.

So, what's the current status of helgrind? What can we (the community) do to speed up getting helgrind to work?

As a temporary workaround (i.e. to try to trigger the bug), it would be nice if valgrind's scheduler could have some randomness (enabled only on user request, of course) in its execution of different threads, and maybe it could add some "sleeps" to further influence the timing of threads (to trigger the bugs).

greets
Dennis
From: Nicholas N. <nj...@cs...> - 2005-11-15 17:08:17
On Thu, 10 Nov 2005, Dennis Lubert wrote:
> in the good old days of helgrind, I was able to track down some race
> conditions in our application. Now that helgrind isn't working anymore,
> I'm facing a big problem with some heisenbugs. Core dumps are corrupted, and
> simply running within gdb or valgrind does not trigger the bug.
> So, what's the current status of helgrind? What can we (the community) do
> to speed up getting helgrind to work?
We want to reinstate it. A reasonable number (off the top of my head:
about 10--15?) of people complained in the survey about its absence.
The main obstacle to getting it working is that we need to support
function wrapping. We currently support function replacement, i.e. the
ability to replace a function with our own version. Function wrapping
would extend that to allow us to call the original from within our
replacement, e.g.:
    void replacement_for_foo(int x, char y)
    {
        // do pre-stuff
        foo(x, y);   // call original
        // do post-stuff
    }
This is needed for Helgrind so we can intercept and track calls to
functions like pthread_mutex_lock().
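To make the replacement/wrapping distinction concrete, here is a minimal sketch in plain C that uses an ordinary function pointer. It is only an illustration of the idea, not how Valgrind does or would implement wrapping. Every name in it (real_lock, wrapped_lock, original_lock, lock_entry and the two counters) is hypothetical, and a plain int stands in for a mutex.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical signature standing in for something like a lock call. */
typedef int (*lock_fn)(int *mutex);

/* Counters that record the wrapper's pre- and post-work. */
static int pre_calls = 0;
static int post_calls = 0;

/* The "original" function: marks the fake mutex as held. */
static int real_lock(int *mutex)
{
    *mutex = 1;
    return 0;
}

/* Saved pointer to the original, so the wrapper can still reach it.
 * With plain replacement this pointer would not exist and the
 * original would be unreachable. */
static lock_fn original_lock = real_lock;

/* The wrapper: pre-stuff, call the original, post-stuff. */
static int wrapped_lock(int *mutex)
{
    assert(original_lock != NULL);
    pre_calls++;                    /* do pre-stuff */
    int rc = original_lock(mutex);  /* call original */
    post_calls++;                   /* do post-stuff */
    return rc;
}

/* Callers go through this entry point, which now leads to the wrapper. */
static lock_fn lock_entry = wrapped_lock;
```

Calling lock_entry(&m) on an int m initialised to 0 sets m to 1, returns 0, and increments each counter once. The hard part described in this thread is doing the equivalent redirection for arbitrary library functions in a running program, which this sketch does not attempt.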
Julian and I have discussed numerous times how to implement function
wrapping in a sane way, and we've experimented with different approaches,
so far without success. It's a difficult problem. It's definitely on our
radar, but as always finding time for any particular problem is difficult
when there are 101 other problems to be fixed as well.
Nick