|
From: Christoph B. <bar...@or...> - 2007-10-27 12:20:29
Attachments:
thread.C
|
Hi, thanks for your work on a helgrind replacement. Such a tool is a great help for us. I have tested thrcheck from revision 7044 with the attached program. Here are some problems I found: 1. The concurrent writes on line 72 are not protected by a mutex (uncommented above and below). Thrcheck does not flag this error on my i386 and x86_64 machines. 2. There is no race-condition for the writes in line 92. However the i386 version of thrcheck flags an error here. 3. The x86_64 version only shows that one thread is started. Threads 3 -- 9 are not shown. 4. The i386 version flags a possible race condition in line 41. In my opinion there is none. Greetings Christoph |
|
From: Julian S. <js...@ac...> - 2007-10-27 17:26:51
|
> thanks for your work on a helgrind replacement. Such a tool is a great help
> for us.
Thanks for the feedback and test program. What version of glibc do
your test machines have?
> 1. The concurrent writes on line 72 are not protected by a mutex
> (uncommented above and below). Thrcheck does not flag this error on my i386
> and x86_64 machines.
This appears to be scheduling-sensitive. For both i386 and x86_64 (Core 2
Duo) I get an error here maybe for every 1 run in 4:
Created counter at address: 0x62e9b20
[...]
Possible data race during write of size 4 at 0x62E9B20
at 0x401315: Loop::run() (bartothread.cpp:72)
by 0x400F7B: start (bartothread.cpp:44)
by 0x4C25DEB: mythread_wrapper (tc_intercepts.c:170)
by 0x4E2F09D: start_thread (in /lib64/libpthread-2.5.so)
by 0x58674CC: clone (in /lib64/libc-2.5.so)
Old state: shared-readonly by threads #3, #4
New state: shared-modified by threads #3, #4
Reason: this thread, #3, holds no consistent locks
Location 0x62E9B20 has never been protected by any lock
Unfortunate that the error-detection is scheduling-sensitive. Given the
use of pthread_cond_* that is difficult to avoid.
> 2. There is no race-condition for the writes in line 92. However the i386
> version of thrcheck flags an error here.
Yes .. also the x86_64 version flags an error. I'll look into it.
> 3. The x86_64 version only shows that one thread is started. Threads 3 -- 9
> are not shown.
I presume you mean you only get one message of the form
Thread #3 was created
at 0x586748E: clone (in /lib64/libc-2.5.so)
by 0x4E2F305: do_clone (in /lib64/libpthread-2.5.so)
by 0x4E2F7C5: pthread_create@@GLIBC_2.2.5 (in /lib64/libpthread-2.5.so)
by 0x4C23864: pthread_create@* (tc_intercepts.c:190)
These message are not printed when a thread starts - that would be
impractical for a program with many threads. They are printed on demand
so as to make later error messages make sense. Specifically, if it
produces a race message containing this:
Possible data race during [...]
Old state: shared-readonly by threads #3, #6
New state: shared-modified by threads #3, #6
then it will first print out a "Thread #3 was created at" message, and a
"Thread #6 was created at" message, so you can make sense of the race message.
> 4. The i386 version flags a possible race condition in line 41. In my
> opinion there is none.
I agree. You should ignore all race errors reported inside any
pthread_* function and inside ld.so. I tried to suppress them, but clearly
there is more work to do there.
J
|
|
From: Christoph B. <bar...@or...> - 2007-10-27 18:02:54
|
Am Samstag, 27. Oktober 2007 schrieb Julian Seward: > > thanks for your work on a helgrind replacement. Such a tool is a great > > help for us. > > Thanks for the feedback and test program. What version of glibc do > your test machines have? Both machines are opensuse 10.2 i386: GNU C Library stable release version 2.5 (20061011), by Roland McGrath et al. x86_64: GNU C Library stable release version 2.5 (20061011), by Roland McGrath et al. > > 1. The concurrent writes on line 72 are not protected by a mutex > > (uncommented above and below). Thrcheck does not flag this error on my > > i386 and x86_64 machines. > > This appears to be scheduling-sensitive. For both i386 and x86_64 (Core 2 > Duo) I get an error here maybe for every 1 run in 4: You are right. After running the test program several times I also see the error here. > > 3. The x86_64 version only shows that one thread is started. Threads 3 -- > > 9 are not shown. > > I presume you mean you only get one message of the form > > Thread #3 was created > at 0x586748E: clone (in /lib64/libc-2.5.so) > by 0x4E2F305: do_clone (in /lib64/libpthread-2.5.so) > by 0x4E2F7C5: pthread_create@@GLIBC_2.2.5 (in > /lib64/libpthread-2.5.so) by 0x4C23864: pthread_create@* > (tc_intercepts.c:190) > > These message are not printed when a thread starts - that would be > impractical for a program with many threads. They are printed on demand > so as to make later error messages make sense. Specifically, if it > produces a race message containing this: > > Possible data race during [...] > Old state: shared-readonly by threads #3, #6 > New state: shared-modified by threads #3, #6 > > then it will first print out a "Thread #3 was created at" message, and a > "Thread #6 was created at" message, so you can make sense of the race > message. Ok, now I understand that the number of messages varies between different runs. Is there some documentation on the algorithm that thrcheck uses? Greetings Christoph |
|
From: Julian S. <js...@ac...> - 2007-10-28 02:04:44
|
> Is there some documentation on the algorithm that thrcheck uses? See http://bugs.kde.org/show_bug.cgi?id=137396 for other user feedback and a description of how to build the documentation. J |
|
From: Julian S. <js...@ac...> - 2007-10-28 15:00:03
|
You might want to svn up to >= r7047 and try the new flags shown below. They may help make sense of the reports (or not, just cause confusion and slow the tool down :-( J Add new flags --trace-addr and --trace-level, which cause all state changes for a given address to be logged. Intended to help track down the root causes of races. At --trace-level=2, a complete stack trace is printed every time the location specified with --trace-addr changes state. At --trace-level=1 (the default) just a 1-line summary for each state change is printed. |
|
From: Julian S. <js...@ac...> - 2007-10-29 11:32:57
|
Here is a completely "fixed" version of your program (I hope).
By using semaphores, Thrcheck can reliably see all synchronisation
events. It should give the same results on all runs.
The false positives in Jump::run exist for the following reason.
1. You create NUM_THREADS Loop jobs. These access the counter
array, some with shared locks, some without.
2. You call wait_for_jobs(). This is like a barrier: all the
Loop jobs must finish before it returns.
3. You create NUM_THREADS Jump jobs. These modify the counter
array again.
Unfortunately, in (3), Thrcheck has no way to know that the Jump
jobs are logically unrelated to the Loop jobs that happened before.
So (for example) vec[0].ptr is accessed by multiple threads in the
Loop jobs. And so Thrcheck expects that it should continue to be
protected by a lock in the Jump jobs, but it is not.
This is a classic problem with thread checkers: "memory recycling".
My solution is a standard one. You must tell Thrcheck, after the
Loop jobs have finished, to "forget" what it knows about
vec[..].counter. This is done using the calls to VALGRIND_TC_CLEAN_MEMORY.
One final detail. If you write the program in a more simple way
then Thrcheck can give correct results without using VALGRIND_TC_CLEAN_MEMORY.
The more simple way is: create a new worker thread for each task, then
wait for it to exit. Thrcheck understands that when a thread joins with
its parent, then the parent "inherits" ownership of the child's memory
locations. So in this case, after all the threads for the Loop jobs
had exited, then the whole memory state would be clean, and ready for
you to start new threads for the Jump jobs.
I hope that makes some sense. Thanks again for the feedback.
J
----------------------------------------------------------------------
g++ -g -O -o thread thread.C -lpthread -I$prefix/include
#include <iostream>
#include <vector>
extern "C" {
#include <pthread.h>
#include <semaphore.h>
#include <assert.h>
#include "valgrind/thrcheck.h"
}
pthread_mutex_t odd_mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t even_mutex = PTHREAD_MUTEX_INITIALIZER;
static int const NUM_THREADS = 8;
class Job {
public:
virtual void run() = 0;
};
/* Used to tell workers to look in the work[] array. */
sem_t jobs_available;
/* Used to tell the controller that a job is done. */
sem_t jobs_completed;
/* Work array. Must be locked using this lock whenever anyone
accesses it. */
pthread_mutex_t work_mutex = PTHREAD_MUTEX_INITIALIZER;
Job* work[NUM_THREADS] = {NULL, NULL};
extern "C" void * start(void *) {
while (true) {
/* First, wait for the controller to tell me there is some
work available. */
int r = sem_wait( &jobs_available ); assert(r==0);
/* Now scan the work[] array to find out what I should do. */
pthread_mutex_lock(&work_mutex);
Job * job = NULL;
for (int i = 0; i < NUM_THREADS; ++i) {
if (work[i] != NULL) {
job = work[i];
work[i] = NULL;
break;
}
}
pthread_mutex_unlock(&work_mutex);
/* If there is no work found, I should quit. */
if (job == NULL)
return NULL;
/* else do it, then post to the controller that I am done. */
job->run();
r = sem_post(&jobs_completed); assert(r==0);
}
/*NOTREACHED*/
}
struct Data {
Data() : counter(0) {}
int counter;
};
class Loop : public Job {
public:
Loop(std::vector<Data *> & vec) : vec(vec) {}
virtual void run() {
for (std::size_t i = 0; i < vec.size(); ++i) {
if (i % 2 == 0) {
pthread_mutex_lock(&even_mutex);
vec[i]->counter++;
pthread_mutex_unlock(&even_mutex);
} else {
// pthread_mutex_lock(&odd_mutex);
vec[i]->counter++;
// pthread_mutex_unlock(&odd_mutex);
}
}
}
private:
std::vector<Data *> & vec;
};
class Jump : public Job {
public:
Jump(std::vector<Data *> & vec, std::size_t my_indices)
: vec(vec), my_indices(my_indices) {}
virtual void run() {
for (std::size_t i = 0; i < vec.size(); ++i) {
if (i % NUM_THREADS == my_indices) {
vec[i]->counter++;
}
}
}
private:
std::vector<Data *> & vec;
std::size_t my_indices;
};
void wait_for_jobs(int num) {
int i, r;
for (i = 0; i < num; i++) {
std::cout << "waitf_j " << i << std::endl;
r = sem_wait( &jobs_completed ); assert(r==0);
}
}
void start_jobs(int num) {
int i, r;
for (i = 0; i < num; i++) {
std::cout << "start_j " << i << std::endl;
r = sem_post( &jobs_available ); assert(r==0);
}
}
int main() {
int r;
int* counter_ptrs[NUM_THREADS];
r = sem_init( &jobs_available, 0/*pshared*/, 0/*initial_value*/ );
assert(r==0);
r = sem_init( &jobs_completed, 0/*pshared*/, 0/*initial_value*/ );
assert(r==0);
std::cout << "Starting main programm" << std::endl;
pthread_t tid[NUM_THREADS];
for (int i = 0; i < NUM_THREADS; ++i) {
pthread_create(&tid[i], NULL, start, NULL);
}
std::cout << "Threads created" << std::endl;
std::vector<Data *> vec;
for (std::size_t i = 0; i < NUM_THREADS; ++i) {
vec.push_back(new Data());
std::cout << "Created counter at address: "
<< &vec.back()->counter << std::endl;
counter_ptrs[i] = &vec.back()->counter;
}
std::cout << "Vector initialized" << std::endl;
Loop l(vec);
pthread_mutex_lock(&work_mutex);
for (std::size_t i = 0; i < NUM_THREADS; ++i) {
work[i] = &l;
}
pthread_mutex_unlock(&work_mutex);
std::cout << "Added loop jobs" << std::endl;
start_jobs(NUM_THREADS);
wait_for_jobs(NUM_THREADS);
/* Create new jobs and clean up the memory used by the old jobs. */
std::vector<Jump *> jump_vec;
for (std::size_t i = 0; i < NUM_THREADS; ++i) {
jump_vec.push_back(new Jump(vec, i));
VALGRIND_TC_CLEAN_MEMORY( counter_ptrs[i], sizeof(int) );
}
pthread_mutex_lock(&work_mutex);
for (std::size_t i = 0; i < NUM_THREADS; ++i) {
assert(work[i] == NULL);
work[i] = jump_vec[i];
}
pthread_mutex_unlock(&work_mutex);
std::cout << "Added jump jobs" << std::endl;
start_jobs(NUM_THREADS);
wait_for_jobs(NUM_THREADS);
/* Cause all the threads to exit, by ensuring work[] is completely
NULL (which it should already be), then doing start_jobs. All
threads will fail to find any work, and exit. */
pthread_mutex_lock(&work_mutex);
for (std::size_t i = 0; i < NUM_THREADS; ++i) {
assert(work[i] == NULL);
}
pthread_mutex_unlock(&work_mutex);
start_jobs(NUM_THREADS);
/* And collect up all the threads. */
for (std::size_t i = 0; i < NUM_THREADS; ++i) {
pthread_join(tid[i], NULL);
}
}
|