|
From: ERSEK L. <la...@ca...> - 2009-11-04 21:29:33
|
Hi,
[0] states:
----------
7.5. Hints and Tips for Effective Use of Helgrind
3. Avoid POSIX condition variables
[...]
a solution to this problem that does not require source-level
annotation of condition-variable wait loops is beyond the current
state of the art.
----------
I respectfully ask if the following passage could be recommended for the
Waiter as a code transformation that pessimizes, but does not invalidate
the code. If this qualifies as "source-level annotation of
condition-variable wait loops", then I apologize for the noise. (I
interpret "source-level annotation" as specially formatted comments, or
#pragma's, or Valgrind-specific API calls etc.)
----------
If your original code looks (as it should look) like
pthread_mutex_lock(mx);
while (!b) {
pthread_cond_wait(cv, mx);
}
pthread_mutex_unlock(mx);
Then consider inserting the following COMPILING_FOR_VALGRIND block:
pthread_mutex_lock(mx);
#ifdef COMPILING_FOR_VALGRIND
{
struct timespec epoch;
int ret;
epoch.tv_sec = 0;
epoch.tv_nsec = 0;
ret = pthread_cond_timedwait(cv, mx, &epoch);
assert(ETIMEDOUT == ret || 0 == ret);
}
#endif
while (!b) {
pthread_cond_wait(cv, mx);
}
pthread_mutex_unlock(mx);
----------
1. The dependency on the condition variable is now unavoidable, no matter
the initial value of "b" right after acquiring "mx"; so Valgrind can see
the dependency unconditionally.
2. No matter the value of "ret" (0 or ETIMEDOUT), pthread_cond_timedwait()
will have released and re-acquired mutex "mx"; see [1]. This is new
mutex/condvar activity, but it shouldn't hurt too much performance-wise.
3. The clock selection (see [2]) of the condition variable shouldn't
matter with an absolute time barrier set to 0. The epoch of whichever
clock is selected happens in the past, thus pthread_cond_timedwait()
should return immediately.
4. pthread_cond_timedwait() introduces a cancellation point. This is
nothing new if (!b) holds, because pthread_cond_wait() is a cancellation
point anyway. If "b" is true, however, this cancellation point is new.
Shouldn't hurt too much, since the original waiter code can't predict the
value of "b" either, so it must be prepared for cancellation anyway.
5. If "ret" is ETIMEDOUT, then we've received no signal/broadcast for
sure; back to business as usual. If "ret" is 0, we have received a
signal/broadcast (or a spurious wakeup), but we'll still check "b" first,
so the code stays valid.
As said above, I apologize if this is obvious and already covered by the
exclusion of "source-level annotation".
Thanks,
lacos
[0] http://valgrind.org/docs/manual/hg-manual.html#hg-manual.effective-use
[1] http://www.opengroup.org/onlinepubs/000095399/functions/pthread_cond_timedwait.html
[2] http://valgrind.org/docs/manual/drd-manual.html#drd-manual.pctw
|
|
From: Konstantin S. <kon...@gm...> - 2009-11-05 05:13:22
|
On Wed, Nov 4, 2009 at 11:52 PM, ERSEK Laszlo <la...@ca...> wrote: > Hi, > > [0] states: > > > ---------- > 7.5. Hints and Tips for Effective Use of Helgrind > > 3. Avoid POSIX condition variables > > [...] > > a solution to this problem that does not require source-level > annotation of condition-variable wait loops is beyond the current > state of the art. > I think this section of helgrind docs is generally out of date (Julian, please confirm). Helgrind 3.5 is a pure happens-before detector and thus handles this case correctly. Same for DRD. If you want a hybrid detector (which finds more races and is more stable, but does indeed choke on cond vars), see e.g. here: http://code.google.com/p/data-race-test/wiki/DynamicAnnotations#pthread_cond_wait_loop --kcc > ---------- > > > I respectfully ask if the following passage could be recommended for the > Waiter as a code transformation that pessimizes, but does not invalidate > the code. If this qualifies as "source-level annotation of > condition-variable wait loops", then I apologize for the noise. (I > interpret "source-level annotation" as specially formatted comments, or > #pragma's, or Valgrind-specific API calls etc.) > > > ---------- > If your original code looks (as it should look) like > > pthread_mutex_lock(mx); > while (!b) { > pthread_cond_wait(cv, mx); > } > pthread_mutex_unlock(mx); > > Then consider inserting the following COMPILING_FOR_VALGRIND block: > > pthread_mutex_lock(mx); > > #ifdef COMPILING_FOR_VALGRIND > { > struct timespec epoch; > int ret; > > epoch.tv_sec = 0; > epoch.tv_nsec = 0; > > ret = pthread_cond_timedwait(cv, mx, &epoch); > assert(ETIMEDOUT == ret || 0 == ret); > } > #endif > > while (!b) { > pthread_cond_wait(cv, mx); > } > pthread_mutex_unlock(mx); > ---------- > > 1. The dependency on the condition variable is now unavoidable, no matter > the initial value of "b" right after acquiring "mx"; so Valgrind can see > the dependency unconditionally. > > 2. No matter the value of "ret" (0 or ETIMEDOUT), pthread_cond_timedwait() > will have released and re-acquired mutex "mx"; see [1]. This is new > mutex/condvar activity, but it shouldn't hurt too much performance-wise. > > 3. The clock selection (see [2]) of the condition variable shouldn't > matter with an absolute time barrier set to 0. The epoch of whichever > clock is selected happens in the past, thus pthread_cond_timedwait() > should return immediately. > > 4. pthread_cond_timedwait() introduces a cancellation point. This is > nothing new if (!b) holds, because pthread_cond_wait() is a cancellation > point anyway. If "b" is true, however, this cancellation point is new. > Shouldn't hurt too much, since the original waiter code can't predict the > value of "b" either, so it must be prepared for cancellation anyway. > > 5. If "ret" is ETIMEDOUT, then we've received no signal/broadcast for > sure; back to business as usual. If "ret" is 0, we have received a > signal/broadcast (or a spurious wakeup), but we'll still check "b" first, > so the code stays valid. > > > As said above, I apologize if this is obvious and already covered by the > exclusion of "source-level annotation". > > Thanks, > lacos > > [0] http://valgrind.org/docs/manual/hg-manual.html#hg-manual.effective-use > [1] > http://www.opengroup.org/onlinepubs/000095399/functions/pthread_cond_timedwait.html > [2] http://valgrind.org/docs/manual/drd-manual.html#drd-manual.pctw > > > ------------------------------------------------------------------------------ > Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day > trial. Simplify your report design, integration and deployment - and focus > on > what you do best, core application coding. Discover what's new with > Crystal Reports now. http://p.sf.net/sfu/bobj-july > _______________________________________________ > Valgrind-developers mailing list > Val...@li... > https://lists.sourceforge.net/lists/listinfo/valgrind-developers > |
|
From: Julian S. <js...@ac...> - 2009-11-05 08:01:45
|
On Thursday 05 November 2009, Konstantin Serebryany wrote: > I think this section of helgrind docs is generally out of date (Julian, > please confirm). > Helgrind 3.5 is a pure happens-before detector and thus handles this case > correctly. > Same for DRD. Unfortunately I think the docs are correct, and neither Helgrind nor DRD can handle this properly. The basic problem is that there can be a h-b dependency which arises from the loop and values in memory, not from the mx or cv, so the dependency is "invisible". I believe Laszlo's comments are valid and correct, and an interesting solution. I'd have to stare at it more it be convinced it's correct, but it looks plausible. Laszlo, can you explain the background to this a bit? How did you come to be looking at this problem? Does your solution help? J |
|
From: Konstantin S. <kon...@gm...> - 2009-11-05 15:38:45
|
On Thu, Nov 5, 2009 at 11:15 AM, Julian Seward <js...@ac...> wrote: > On Thursday 05 November 2009, Konstantin Serebryany wrote: > > > I think this section of helgrind docs is generally out of date (Julian, > > please confirm). > > Helgrind 3.5 is a pure happens-before detector and thus handles this case > > correctly. > > Same for DRD. > > Unfortunately I think the docs are correct, and neither Helgrind nor DRD > can handle this properly. Hm. Why??? cond var should only be used between mutex lock/unlock and those create h-b arcs... Here is the test: http://code.google.com/p/data-race-test/source/browse/trunk/unittest/racecheck_unittest.cc#520 Helgrind, DRD and ThreadSanitizer (in pure happens-before mode) are silent here. ThreadSanitizer in the hybrid mode barks. void Waker() { GLOB = 1; MU.Lock(); COND = 1; CV.Signal(); MU.Unlock(); } void Waiter() { usleep(100000); // Make sure the signaller gets first. MU.Lock(); while(COND != 1) CV.Wait(&MU); MU.Unlock(); GLOB = 2; } > The basic problem is that there can be a h-b > dependency which arises from the loop and values in memory, not from the > mx or cv, so the dependency is "invisible". > > I believe Laszlo's comments are valid and correct, and an interesting > solution. I'd have to stare at it more it be convinced it's correct, > but it looks plausible. > > Laszlo, can you explain the background to this a bit? How did you > come to be looking at this problem? Does your solution help? > > J > |
|
From: Bart V. A. <bar...@gm...> - 2009-11-06 08:05:52
|
On Wed, Nov 4, 2009 at 9:52 PM, ERSEK Laszlo <la...@ca...> wrote: > ---------- > 7.5. Hints and Tips for Effective Use of Helgrind > > 3. Avoid POSIX condition variables > > [...] > > a solution to this problem that does not require source-level > annotation of condition-variable wait loops is beyond the current > state of the art. > ---------- IMHO item 3 in section 7.5 of the Helgrind manual was correct for previous Helgrind versions but is no longer correct for Helgrind 3.5.0. I have created a bugzilla item about this issue (see also https://bugs.kde.org/show_bug.cgi?id=213383). Bart. |
|
From: ERSEK L. <la...@ca...> - 2009-11-06 11:17:41
|
On Fri, 6 Nov 2009, Bart Van Assche wrote: > https://bugs.kde.org/show_bug.cgi?id=213383 This answers my original question/proposition; thank you very much. Cheers, lacos |