|
From: Pohl, H. <har...@ad...> - 2014-11-25 09:23:14
|
Hi,
I'm currently playing around with helgrind.
First, I'm trying to understand the annotations, because I want to use them for my atomics implementation.
Second, I'd like to understand how well helgrind handles transitive happens-before/after relationships across threads.
While doing this, I observed a strange report from helgrind: a false positive with a confusing stack trace.
I have two test cases doing similar shared-data accesses; the first is synchronized by an atomic flag, the second is unsynchronized and uses sleep. Yes, I know this can change the execution order, but in all my runs, with and without helgrind, the execution order was correct. I use sleep to provoke data races.
While helgrind correctly reports the data race for the second test case, it fails to recognize the atomic synchronization in the first, although I've annotated the synchronization.
Interesting: if I remove the second test case, the report is correct and no race is found.
I get this report with helgrind 3.10 and 3.9 on RHEL5.
Maybe I don't use the annotations correctly?
Or is this just a false positive I have to live with?
What confuses me is that the stack trace incorrectly reports a data race between the first and the second test thread, which is impossible and makes me think that pthread_join() might be incorrectly handled by helgrind.
Thanks and best regards,
Hartmut Pohl
The Code:
// g++ -g -O3 -pthread pt_thread_test.cpp
#include <stdio.h>
#include <pthread.h>
#include <assert.h>
#include <unistd.h>
#include <valgrind/helgrind.h>
template <class TYPE>
class AtomicAccess
{
public:
    AtomicAccess(TYPE value)
        : mMemoryModel(__ATOMIC_SEQ_CST),
          mValue(value)
    {
        // Keep helgrind from flagging the raw loads/stores on mValue;
        // the happens-before annotations provide the ordering instead.
        VALGRIND_HG_DISABLE_CHECKING(&mValue, sizeof(mValue));
    }
    ~AtomicAccess()
    {
        VALGRIND_HG_ENABLE_CHECKING(&mValue, sizeof(mValue));
    }
    __attribute__ ((noinline)) TYPE syncLoad()
    {
        TYPE ret = __atomic_load_n(&mValue, mMemoryModel);
        return ret;
    }
    __attribute__ ((noinline)) void syncStore(const TYPE& value)
    {
        __atomic_store_n(&mValue, value, mMemoryModel);
    }
    TYPE * getAddress()
    {
        return &mValue;
    }
private:
    // Memory model (initialized to the default memory model)
    const int mMemoryModel;
    // Variable to store the value
    TYPE mValue;
};
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
AtomicAccess<bool> atomicWait(true);
void *timedWaitToGiveMutex(void *arg)
{
    printf("%s\n", __FUNCTION__);
    pthread_mutex_lock(&mutex);
    sleep(3);
    pthread_mutex_unlock(&mutex);
    return 0;
}
void *syncWaitToGiveMutex(void *arg)
{
    printf("%s\n", __FUNCTION__);
    pthread_mutex_lock(&mutex);
    bool wait = true;
    do
    {
        wait = atomicWait.syncLoad();
        ANNOTATE_HAPPENS_AFTER(atomicWait.getAddress());
    }
    while (wait);
    pthread_mutex_unlock(&mutex);
    return 0;
}
void *workWithMutex(void *arg)
{
    printf("%s\n", __FUNCTION__);
    pthread_mutex_lock(&mutex);
    int *iptr = (int*) arg;
    (*iptr)++;
    pthread_mutex_unlock(&mutex);
    return iptr;
}
int main()
{
    int i = 10;
    pthread_t tid, tid2;
    int *ret;
    // 1st test case: Start T1, which takes the mutex and waits on the atomic.
    // Start T2, which blocks on the mutex.
    // Main writes the atomic, releasing the wait.
    // T1 unblocks and gives up the mutex; T2 takes it, reads and writes, unlocks.
    pthread_create(&tid, 0, syncWaitToGiveMutex, 0);
    sleep(1);
    pthread_create(&tid2, 0, workWithMutex, &i);
    sleep(1);
    i = 30;
    ANNOTATE_HAPPENS_BEFORE(atomicWait.getAddress());
    atomicWait.syncStore(false);
    pthread_join(tid2, (void **)&ret);
    assert(i == *ret);
    pthread_join(tid, (void **)&ret);
    // 2nd test case: Same as the 1st, but with a timer (sleep) instead of an
    // atomic operation.
    pthread_create(&tid, 0, timedWaitToGiveMutex, 0);
    sleep(1);
    pthread_create(&tid2, 0, workWithMutex, &i);
    sleep(1);
    i = 40;
    pthread_join(tid2, (void **)&ret);
    assert(i == *ret);
    pthread_join(tid, (void **)&ret);
}
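For reference, I run it under helgrind with something like:
valgrind --tool=helgrind ./a.out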
The report:
==7704== Helgrind, a thread error detector
==7704== Copyright (C) 2007-2013, and GNU GPL'd, by OpenWorks LLP et al.
==7704== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==7704== Command: a.out
==7704== Parent PID: 22355
==7704==
==7704== ---Thread-Announcement------------------------------------------
==7704==
==7704== Thread #3 was created
==7704== at 0x38458D496E: clone (in /lib64/libc-2.5.so)
==7704== by 0x3846406DCB: create_thread (createthread.c:76)
==7704== by 0x3846406DCB: pthread_create@@GLIBC_2.2.5 (pthread_create.c:544)
==7704== by 0x4A10890: pthread_create_WRK (hg_intercepts.cpp:457)
==7704== by 0x4006CB: main (helgrind_problem.cpp:101)
==7704==
==7704== ----------------------------------------------------------------
==7704==
==7704== Possible data race during read of size 8 at 0x4C225D0 by thread #3
==7704== Locks held: none
==7704== at 0x4A1402F: pthread_mutex_lock (hg_intercepts.cpp:795)
==7704== by 0x4009F7: workWithMutex(void*) (helgrind_problem.cpp:80)
==7704== by 0x4A10A26: mythread_wrapper (hg_intercepts.cpp:421)
==7704== by 0x384640677C: start_thread (pthread_create.c:301)
==7704== by 0x38458D49AC: clone (in /lib64/libc-2.5.so)
==7704== Address 0x4c225d0 is 0 bytes inside data symbol "_ZL21fn_pthread_mutex_lock"
==7704==
==7704== ---Thread-Announcement------------------------------------------
==7704==
==7704== Thread #5 was created
==7704== at 0x38458D496E: clone (in /lib64/libc-2.5.so)
==7704== by 0x3846406DCB: create_thread (createthread.c:76)
==7704== by 0x3846406DCB: pthread_create@@GLIBC_2.2.5 (pthread_create.c:544)
==7704== by 0x4A10890: pthread_create_WRK (hg_intercepts.cpp:457)
==7704== by 0x40079A: main (helgrind_problem.cpp:116)
==7704==
==7704== ----------------------------------------------------------------
==7704==
==7704== Lock at 0x6012E0 was first observed
==7704== at 0x4A0E385: pthread_mutex_lock_WRK (hg_intercepts.cpp:778)
==7704== by 0x4A1404C: pthread_mutex_lock (hg_intercepts.cpp:803)
==7704== by 0x400A40: syncWaitToGiveMutex(void*) (helgrind_problem.cpp:65)
==7704== by 0x4A10A26: mythread_wrapper (hg_intercepts.cpp:421)
==7704== by 0x384640677C: start_thread (pthread_create.c:301)
==7704== by 0x38458D49AC: clone (in /lib64/libc-2.5.so)
==7704== Address 0x6012e0 is 0 bytes inside data symbol "mutex"
==7704==
==7704== Possible data race during read of size 4 at 0xFFEFFFE40 by thread #5
==7704== Locks held: 1, at address 0x6012E0
==7704== at 0x4009F8: workWithMutex(void*) (helgrind_problem.cpp:82)
==7704== by 0x4A10A26: mythread_wrapper (hg_intercepts.cpp:421)
==7704== by 0x384640677C: start_thread (pthread_create.c:301)
==7704== by 0x38458D49AC: clone (in /lib64/libc-2.5.so)
==7704==
==7704== This conflicts with a previous write of size 4 by thread #3
==7704== Locks held: 1, at address 0x6012E0
==7704== at 0x4009F8: workWithMutex(void*) (helgrind_problem.cpp:82)
==7704== by 0x4A10A26: mythread_wrapper (hg_intercepts.cpp:421)
==7704== by 0x384640677C: start_thread (pthread_create.c:301)
==7704== by 0x38458D49AC: clone (in /lib64/libc-2.5.so)
==7704== Address 0xffefffe40 is on thread #1's stack
==7704== in frame #2, created by main (helgrind_problem.cpp:90)
==7704==
==7704== ---Thread-Announcement------------------------------------------
==7704==
==7704== Thread #1 is the program's root thread
==7704==
==7704== ----------------------------------------------------------------
==7704==
==7704== Lock at 0x6012E0 was first observed
==7704== at 0x4A0E385: pthread_mutex_lock_WRK (hg_intercepts.cpp:778)
==7704== by 0x4A1404C: pthread_mutex_lock (hg_intercepts.cpp:803)
==7704== by 0x400A40: syncWaitToGiveMutex(void*) (helgrind_problem.cpp:65)
==7704== by 0x4A10A26: mythread_wrapper (hg_intercepts.cpp:421)
==7704== by 0x384640677C: start_thread (pthread_create.c:301)
==7704== by 0x38458D49AC: clone (in /lib64/libc-2.5.so)
==7704== Address 0x6012e0 is 0 bytes inside data symbol "mutex"
==7704==
==7704== Possible data race during write of size 4 at 0xFFEFFFE40 by thread #5
==7704== Locks held: 1, at address 0x6012E0
==7704== at 0x4009F8: workWithMutex(void*) (helgrind_problem.cpp:82)
==7704== by 0x4A10A26: mythread_wrapper (hg_intercepts.cpp:421)
==7704== by 0x384640677C: start_thread (pthread_create.c:301)
==7704== by 0x38458D49AC: clone (in /lib64/libc-2.5.so)
==7704==
==7704== This conflicts with a previous read of size 4 by thread #1
==7704== Locks held: none
==7704== at 0x400750: main (helgrind_problem.cpp:109)
==7704== Address 0xffefffe40 is on thread #1's stack
==7704== in frame #2, created by main (helgrind_problem.cpp:90)
==7704==
==7704==
==7704== For counts of detected and suppressed errors, rerun with: -v
==7704== Use --history-level=approx or =none to gain increased speed, at
==7704== the cost of reduced accuracy of conflicting-access information
==7704== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 96 from 43)
|
|
From: Julian S. <js...@ac...> - 2014-11-25 12:40:45
|
On 11/25/2014 10:22 AM, Pohl, Hartmut wrote:
> What confuses me is that the stack trace incorrectly reports a data race
> between the first and the second test thread, which is impossible and
> makes me think that pthread_join() might be incorrectly handled by helgrind.
There are a bunch of stack traces in this message. Exactly which one is
the one you think is wrong?
I don't exactly understand the test program, but this ..
> bool wait = true;
> do
> {
>     wait = atomicWait.syncLoad();
>     ANNOTATE_HAPPENS_AFTER(atomicWait.getAddress());
> }
> while (wait);
.. doesn't feel right to me. You need to put the ANNOTATE_HAPPENS_AFTER
after the end of the while loop, not inside it, because merely reading
the value
> wait = atomicWait.syncLoad();
doesn't mean the thread can go forwards. It has to spin until that
value becomes |false|. So the h-b edge from the main thread "arrives
here" after the loop, not inside it.
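Something like this (untested sketch, with the annotation moved past the
loop exit):
bool wait = true;
do
{
    wait = atomicWait.syncLoad();
}
while (wait);
ANNOTATE_HAPPENS_AFTER(atomicWait.getAddress());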
J
|
|
From: Pohl, H. <har...@ad...> - 2014-11-25 13:04:52
|
Hi,
The stack trace I think is wrong is the following combination:
==7704== Possible data race during read of size 4 at 0xFFEFFFE40 by thread #5
==7704== Locks held: 1, at address 0x6012E0
==7704== at 0x4009F8: workWithMutex(void*) (helgrind_problem.cpp:82)
==7704== by 0x4A10A26: mythread_wrapper (hg_intercepts.cpp:421)
==7704== by 0x384640677C: start_thread (pthread_create.c:301)
==7704== by 0x38458D49AC: clone (in /lib64/libc-2.5.so)
==7704==
==7704== This conflicts with a previous write of size 4 by thread #3
==7704== Locks held: 1, at address 0x6012E0
==7704== at 0x4009F8: workWithMutex(void*) (helgrind_problem.cpp:82)
==7704== by 0x4A10A26: mythread_wrapper (hg_intercepts.cpp:421)
==7704== by 0x384640677C: start_thread (pthread_create.c:301)
==7704== by 0x38458D49AC: clone (in /lib64/libc-2.5.so)
As far as I understand, Thread #5 belongs to the second test case and thus cannot have a data race with thread #3 due to the pthread_join()s.
I put the annotation at the end of the while (that's what I started with). Same output.
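Just to illustrate the ordering I mean, here is a minimal sketch (not from
the test program, just my understanding of pthread_join() semantics): the
join should create a happens-before edge, so the two unsynchronized writes
below cannot race:
#include <pthread.h>

static int shared = 0;

static void *writer(void *arg)
{
    shared++;  // unsynchronized write to a global
    return 0;
}

int main()
{
    pthread_t t1, t2;
    pthread_create(&t1, 0, writer, 0);
    pthread_join(t1, 0);                // t1 has finished before the next create
    pthread_create(&t2, 0, writer, 0);  // so t2's write cannot race with t1's
    pthread_join(t2, 0);
    return 0;
}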
Best regards,
Hartmut
-----Original Message-----
From: Julian Seward [mailto:js...@ac...]
Sent: Dienstag, 25. November 2014 13:40
To: val...@li...; Pohl, Hartmut
Subject: Re: [Valgrind-users] Problem with helgrind?
On 11/25/2014 10:22 AM, Pohl, Hartmut wrote:
> What confuses me is that the stack trace incorrectly reports a data
> race between the first and the second test thread, which is impossible
> and makes me think that pthread_join() might be incorrectly handled by helgrind.
There are a bunch of stack traces in this message. Exactly which one is the one you think is wrong?
I don't exactly understand the test program, but this ..
> bool wait = true;
> do
> {
>     wait = atomicWait.syncLoad();
>     ANNOTATE_HAPPENS_AFTER(atomicWait.getAddress());
> }
> while (wait);
.. doesn't feel right to me. You need to put the ANNOTATE_HAPPENS_AFTER after the end of the while loop, not inside it, because merely reading the value
> wait = atomicWait.syncLoad();
doesn't mean the thread can go forwards. It has to spin until that value becomes |false|. So the h-b edge from the main thread "arrives here" after the loop, not inside it.
J
|
|
From: Julian S. <js...@ac...> - 2014-11-25 14:25:45
|
On 11/25/2014 02:04 PM, Pohl, Hartmut wrote:
> As far as I understand, Thread #5 belongs to the second test case and
> thus cannot have a data race with thread #3 due to the pthread_join()s.
I agree.
> I put the annotation at the end of the while (that's what I started with).
I looked at this in some detail, but I don't understand what is happening.
I did notice that if you comment out the line causing the real race
    i = 40;
then no races are reported. So at least you have a situation where, if
there is a race, then (multiple) races are reported, but if you have no
race, then no races are reported.
To investigate further I'd need to look at Helgrind's memory state
machine transitions for |i|.
J
|