|
From: Sérgio D. J. <ser...@li...> - 2008-10-08 15:07:20
|
Hello guys,

I was reading Valgrind's website and found something which interested me. The page is: http://valgrind.org/docs/manual/hg-manual.html (which might be familiar to you, I think :-) ). Reading it, I saw that Valgrind and Helgrind do very good work with multithreaded applications. This particular bullet also caught my attention:

* Runtime support library for GNU OpenMP (part of GCC), at least GCC versions 4.2 and 4.3. With some minor effort of modifying the GNU OpenMP runtime support sources, it is possible to use Helgrind on GNU OpenMP compiled codes. Please contact the Valgrind authors for details.

And that's the reason why I'm writing this e-mail. I have a couple of questions, so I thought it'd be a good idea to write to the developers' mailing list and try to have them answered :-).

Basically, my initial question is: what should I specifically do to get Helgrind working with OpenMP? What kind of modifications should I make to the GNU OpenMP runtime support sources? I'd be happy if you could answer that :-).

Also, I've already mailed Jeremy (the Helgrind author), but unfortunately he hasn't answered yet.

Thanks in advance,
--
Sérgio Durigan Júnior
Linux on Power Toolchain - Software Engineer
Linux Technology Center - LTC
IBM Brazil
|
From: Julian S. <js...@ac...> - 2008-10-08 16:15:06
|
> What should I specifically do to get Helgrind working with OpenMP?

See the attached files. Note that they are for valgrind-3.3.1.

Note also, you really need to run this stuff on x86/amd64; on ppc32/ppc64 you will get bad results, due to not-good support for ppc atomic operations (lwarx/stwcx) in Helgrind 3.3.x. ppc32/ppc64 support will be improved in Helgrind 3.4.x.

J
|
From: Sérgio D. J. <ser...@li...> - 2008-10-08 22:44:40
|
Hi Julian,

On Wed, 2008-10-08 at 18:03 +0200, Julian Seward wrote:
> > What should I specifically do to get Helgrind working with OpenMP?
>
> See the attached files. Note that they are for valgrind-3.3.1.

Thank you. I've tried to locate those files (or at least the README) in the tar.bz2 package, without success. Are they in some specific place, or do only you developers have them? :-)

> Note also, you really need to run this stuff on x86/amd64; on
> ppc32/ppc64 you will get bad results, due to not-good support
> for ppc atomic operations (lwarx/stwcx) in Helgrind 3.3.x.
> ppc32/ppc64 support will be improved in Helgrind 3.4.x.

Right, thanks for the warning. I'll try it first on x86, then, and I'll probably have more questions to ask after my initial tests with Helgrind and OpenMP.

By the way, is there an IRC channel or something where I can get in touch with you guys? I've tried #valgrind on Freenode, but apparently it's not even registered.

Regards,
--
Sérgio Durigan Júnior
Linux on Power Toolchain - Software Engineer
Linux Technology Center - LTC
IBM Brazil
|
From: Bart V. A. <bar...@gm...> - 2008-10-09 06:23:22
|
On Wed, Oct 8, 2008 at 5:08 PM, Sérgio Durigan Júnior <ser...@li...> wrote:
> What should I specifically do to get Helgrind working with OpenMP? What
> kind of modifications in the GNU OpenMP runtime support sources should
> I make?

In the Valgrind trunk there are two thread error detection tools present, namely Helgrind and DRD. Both support OpenMP. With regard to OpenMP, DRD has the advantage that it works with unmodified GNU OpenMP runtime source code.

For instructions on how to build DRD and its documentation, see also http://www.nabble.com/-ANNOUNCEMENT--DRD,-a-Thread-Error-Detector-td18336351.html

Bart.
|
From: Sérgio D. J. <ser...@li...> - 2008-10-10 18:51:47
|
Hi Bart,

On Thu, 2008-10-09 at 08:23 +0200, Bart Van Assche wrote:
> In the Valgrind trunk there are two thread error detection tools
> present, namely Helgrind and DRD. Both support OpenMP. With regard to
> OpenMP, DRD has the advantage that it works with unmodified GNU OpenMP
> runtime source code.
>
> For instructions on how to build DRD and its documentation, see also
> http://www.nabble.com/-ANNOUNCEMENT--DRD,-a-Thread-Error-Detector-td18336351.html

Thanks a lot for this information! DRD seems a good tool for the job, indeed. But I have some questions. I've read the documentation that comes with the program (especially the OpenMP section), and tried to run some tests using DRD. These tests are very simple, and they are correct, AFAIK. However, DRD reported some errors involving the printf() function. One of these errors is pasted below:

==31740== Conflicting load by thread 2/2 at 0x041930b8 size 4
==31740==    at 0x40A1C2D: vfprintf (in /lib/libc-2.6.1.so)
==31740==    by 0x40AA9F2: printf (in /lib/libc-2.6.1.so)
==31740==    by 0x80487F5: main.omp_fn.0 (omp_bug5fix.c:39)
==31740==    by 0x402E7E7: gomp_thread_start (team.c:108)
==31740==    by 0x402651E: vg_thread_wrapper (drd_pthread_intercepts.c:189)
==31740==    by 0x405418A: start_thread (in /lib/libpthread-2.6.1.so)
==31740==    by 0x412709D: clone (in /lib/libc-2.6.1.so)
==31740== Allocation context: BSS section of /lib/libc-2.6.1.so

AFAIK, if I declare something private in OpenMP, a copy of that variable is made for every running thread. The piece of code which DRD says has a conflicting load is something like:

...
int tid;

#pragma omp parallel private(tid)
  {
    tid = omp_get_thread_num();
    printf ("Thread %d\n", tid);
    ...
  }

Please correct me if I'm wrong, but DRD should not complain about this, right? Can you shed some light on why it's generating an error message for this code?

There's a note in the DRD documentation saying:

  Note: DRD reports errors on the libgomp library included with gcc
  4.2.0 up to and including 4.3.1. This might indicate a race condition
  in the POSIX version of libgomp.

But I think that's not the case in this situation, right? I'm really confused about it :-(.

Thanks in advance,
--
Sérgio Durigan Júnior
Linux on Power Toolchain - Software Engineer
Linux Technology Center - LTC
IBM Brazil
|
From: Julian S. <js...@ac...> - 2008-10-11 11:05:15
|
> > > What should I specifically do to get Helgrind working with OpenMP?
> >
> > See the attached files. Note that they are for valgrind-3.3.1.
>
> Thank you. I've tried to locate those files (or at least the README) in
> the tar.bz2 package without success. Are they in some specific place, or
> only you developers have them? :-)

Just sitting around in my tree somewhere :-( A better solution would be to make Helgrind aware of the required GNU OpenMP primitives, so that it supports GNU OpenMP directly, like DRD.

Another thing you might want to do is try the "YARD" branch of Helgrind with those files. It has a lower false error rate and better error messages than the trunk or 3.3.1 Helgrind, in that it shows you tracebacks for both memory accesses involved in a race. It may also behave better on ppc (maybe; I am not sure about that).

  svn co svn://svn.valgrind.org/valgrind/branches/YARD yard
  cd yard
  ./autogen.sh

then configure/build as usual.

> By the way, is there an IRC channel or something where I can get in
> touch with you, guys? I've tried #valgrind on Freenode, but apparently
> it's not even registered.

Er, no. We've never had an irc channel.

J
|
From: Sérgio D. J. <ser...@li...> - 2008-10-11 22:13:31
|
Hello Julian,

On Sat, 2008-10-11 at 11:33 +0200, Julian Seward wrote:
> Another thing you might want to do is try the "YARD" branch Helgrind
> with those files. It has a lower false error rate and better error
> messages than the trunk or 3.3.1 Helgrind, in that it shows you
> tracebacks for both memory accesses involved in a race. It may also
> behave better on ppc (maybe; am not sure about that).
>
> svn co svn://svn.valgrind.org/valgrind/branches/YARD yard
> cd yard
> ./autogen.sh
>
> then configure/build as usual.

Well, I'll try that branch as well. But IMHO the main "problem" with both Helgrind and DRD is that you have to recompile GCC in order to get things working. That's why I want to understand what's currently "wrong" with GCC and OpenMP nowadays (especially regarding the sys_futex() syscall), and what can be done to get Helgrind/DRD working with the default GCC versions that are usually shipped today. I've already sent an e-mail to Bart asking for more details about this issue.

> > By the way, is there an IRC channel or something where I can get in
> > touch with you, guys? I've tried #valgrind on Freenode, but apparently
> > it's not even registered.
>
> Er, no. We've never had an irc channel.

Hmm, and with all respect, don't you think it's time to set one up? :-)

BTW, thank you very much for your answers.

Regards,
--
Sérgio Durigan Júnior
Linux on Power Toolchain - Software Engineer
Linux Technology Center - LTC
IBM Brazil
|
From: Bart V. A. <bar...@gm...> - 2008-10-11 19:40:54
|
On Fri, Oct 10, 2008 at 8:52 PM, Sérgio Durigan Júnior <ser...@li...> wrote:
> However, DRD reported some errors regarding printf() function.
> One of these errors is pasted below:
>
> ==31740== Conflicting load by thread 2/2 at 0x041930b8 size 4
> ==31740==    at 0x40A1C2D: vfprintf (in /lib/libc-2.6.1.so)
> ==31740==    by 0x40AA9F2: printf (in /lib/libc-2.6.1.so)
> ==31740==    by 0x80487F5: main.omp_fn.0 (omp_bug5fix.c:39)
> ==31740==    by 0x402E7E7: gomp_thread_start (team.c:108)
> ==31740==    by 0x402651E: vg_thread_wrapper (drd_pthread_intercepts.c:189)
> ==31740==    by 0x405418A: start_thread (in /lib/libpthread-2.6.1.so)
> ==31740==    by 0x412709D: clone (in /lib/libc-2.6.1.so)
> ==31740== Allocation context: BSS section of /lib/libc-2.6.1.so

The above race report refers to stdout. glibc uses its own locking mechanism for streams (see also _IO_flockfile(FILE*) in the glibc source tree). Some of these races were already suppressed by drd, but not all. This has been fixed (trunk, revision 8663). Thanks for reporting this.

Bart.
|
From: Sérgio D. J. <ser...@li...> - 2008-10-11 22:06:34
|
Hi Bart,

On Sat, 2008-10-11 at 21:40 +0200, Bart Van Assche wrote:
> The above race report refers to stdout. glibc uses its own locking
> mechanism for streams (see also _IO_flockfile(FILE*) in the glibc
> source tree). Some of these races were already suppressed by drd, but
> not all. This has been fixed (trunk, revision 8663). Thanks for
> reporting this.

Thanks. I've updated my local copy of the repository and am running the tests again. I'll let you know if something strange happens.

Meanwhile, I'd like to ask a question about the limitations of Valgrind when libgomp uses sys_futex() in its barrier implementation. I've tried to investigate and understand more about this subject, but unfortunately it seems a little Valgrind-specific (and I'm still new in this field). I've found the following thread discussion involving you and Julian:

http://www.mail-archive.com/val...@li.../msg02349.html

So, could you explain a little more why I have to recompile gcc using the --disable-linux-futex parameter?

Best regards,
--
Sérgio Durigan Júnior
Linux on Power Toolchain - Software Engineer
Linux Technology Center - LTC
IBM Brazil
|
From: Bart V. A. <bar...@gm...> - 2008-10-12 06:58:25
|
On Sun, Oct 12, 2008 at 12:07 AM, Sérgio Durigan Júnior <ser...@li...> wrote:
> So, could you explain a little more why do I have to recompile gcc using
> the --disable-linux-futex parameter?

Any thread checking tool needs at least the following information:
* Which memory accesses have been performed by each thread.
* Which synchronization operations have been performed by each thread.

Both Helgrind and DRD obtain information about loads and stores by instrumenting the executable code. Information about synchronization operations is gathered through "redirection": recognizing function names and replacing a call to a (library) function by a call to an instrumented function.

If you have a look at the libgomp source code, you will see that in the "linux" version, functions like gomp_mutex_lock() and gomp_mutex_unlock() have been declared inline, which makes it impossible to intercept these functions. This difficulty does not exist in the "posix" version of libgomp. That is why gcc has to be recompiled with the flag --disable-linux-futex.

Bart.
|
From: Julian S. <js...@ac...> - 2008-10-12 09:04:43
|
On Sunday 12 October 2008, Bart Van Assche wrote:
> Both Helgrind and DRD obtain information about loads and stores by
> instrumenting the executable code. And information about
> synchronization operations is gathered through "redirection":
> recognizing function names and replacing a call to a (library)
> function by a call to an instrumented function.
>
> If you have a look at the libgomp source code, you will see that in
> the "linux" version functions like gomp_mutex_lock() and
> gomp_mutex_unlock() have been declared inline. Which makes it
> impossible to intercept these functions. This difficulty does not
> exist in in the "posix" version of libgomp. That is why gcc has to be
> recompiled with the flag --disable-linux-futex.
Sérgio could legitimately ask, why do we need to intercept these
functions at all? Is it possible for DRD and Helgrind to work
without intercepting them?
Intercepting functions (basically, lock/unlock functions, barrier
functions, and other stuff like pthread_cond_{wait,signal}) is necessary
so that DRD and Helgrind can "see" the inter-thread synchronisation
events.
An alternative approach is to not intercept those functions. All such
functions (at least on Linux) appear to be implemented using a combination
of atomic instructions (lock-prefixed, or lwarx/stwcx) together with calls
to sys_futex to resolve contended cases.
That would be a better solution. The function intercepting is complex
(to implement) and fragile, especially on ppc64-linux. The problem is how
to deduce, from observing the atomic instructions and sys_futex calls, what
the resulting inter-thread synchronisations are. I have considered this
problem a bit but cannot see any solution that does not require a very large
runtime overhead, and a lot of complexity.
Here's a simplified example:
(global) volatile char c = 0;
(thread1) while (c == 0) { }; /*spinlock*/
(thread2) ...
c = 1
...
thread1 will not advance past the loop until thread2 sets c to 1. So there's
an inter-thread dependency here. (I'm not saying that the real threading
primitives, pthread_mutex_lock, etc, are implemented like this, but I do
believe that a general solution to the problem should also work for this
particular example).
So how do we know that there's an inter-thread dependency? We have to observe
that (a) thread1 is in a loop, (b) thread1 is reading a memory location which
it does not write in the loop, (c) the loop exit condition depends on the
value read from memory, (d) thread2 writes a value to that same memory
location, and (e) that the written value causes the loop to exit.
Which all sounds very complicated and difficult to me. Hence at the moment
we rely on intercepting pthread_mutex_lock, gomp_mutex_lock, etc, to see
such inter-thread dependencies.
If you can think of a solution to this ...
J
|
|
From: Sérgio D. J. <ser...@li...> - 2008-10-15 14:47:54
|
Hello Julian and Bart,

On Sun, 2008-10-12 at 10:48 +0200, Julian Seward wrote:
> If you can think of a solution to this ...

So, based on what you two said to me (by the way, thanks for all the explanation), I have two questions.

1) If we could ask the GCC guys to generate debugging information telling where each inline function is, would that help Valgrind to intercept the calls?

2) Julian said that detecting locking primitives using only instructions is too complex, maybe impossible. Well, but as far as I understood, you are assuming a "general locking primitives detector". What if we limit this problem only to the locking primitives present in libgomp? Would that be easier to do? (Of course it has a downside, because every time libgomp changed, we would have to change Valgrind too... But I think it's a valid question anyway.)

So, basically that's it. Regarding the first question, it'd be good to know how much of the debugging information available in a binary Valgrind can use.

Thanks :-)
--
Sérgio Durigan Júnior
Linux on Power Toolchain - Software Engineer
Linux Technology Center - LTC
IBM Brazil
|
From: Bart V. A. <bar...@gm...> - 2008-10-15 17:52:50
|
On Wed, Oct 15, 2008 at 3:42 PM, Sérgio Durigan Júnior <ser...@li...> wrote:
> 1) If we could ask GCC guys to generate debugging information telling
> where each inline function is, would that help Valgrind to intercept the
> calls?

A possible alternative is to make sure that none of the synchronization primitives in libgomp is declared inline, and to ask Valgrind users to install the libgomp debuginfo package.

Bart.
|
From: Sérgio D. J. <ser...@li...> - 2008-11-03 19:23:34
|
Hi guys,

I'm sorry for the absence; I was a little busy working on other things here :-). After some time thinking about our (long and useful) discussion, I have another question. It has to do with one of my proposals:

On Wed, 2008-10-15 at 11:42 -0200, Sérgio Durigan Júnior wrote:
> 2) Julian said that detecting locking primitives using only instructions
> is too complex, maybe impossible. Well, but as far as I understood, you
> are assuming a "general locking primitives detector". What if we limit
> this problem only to the locking primitives present in the libgomp?
> Would that be easier to do? (Of course it has a down side because every
> time the libgomp changed, we would have to change Valgrind too... But I
> think it's a valid question anyway)

Basically, I'd like to know: if I start to implement this idea, what are the chances that I'll have to change something in the VEX "package"? Will I certainly have to modify VEX's source, or only Valgrind's core? I don't know if you guys can answer this, but you are certainly more capable of it than I am.

Thanks in advance,
--
Sérgio Durigan Júnior
Linux on Power Toolchain - Software Engineer
Linux Technology Center - LTC
IBM Brazil
|
From: Julian S. <js...@ac...> - 2008-11-03 19:18:40
|
> > 2) Julian said that detecting locking primitives using only instructions
> > is too complex, maybe impossible. Well, but as far as I understood, you
> > are assuming a "general locking primitives detector". What if we limit
> > this problem only to the locking primitives present in the libgomp?
> > Would that be easier to do? (Of course it has a down side because every
> > time the libgomp changed, we would have to change Valgrind too... But I
> > think it's a valid question anyway)
>
> Basically, I'd like to know: if I start to implement this idea, what are
> the chances for me to have to change something in the VEX "package"?
> Will I certainly have to modify VEX's source, or only Valgrind's core?

Difficult to say without knowing what it is you propose to do. If you mean to intercept entry points (functions) in libgomp.so, that happens entirely outside VEX -- it just uses the function intercept mechanism. If you are planning anything else, then I don't know. But I have to say I can't imagine how any other scheme would work.

J
|
From: Bart V. A. <bar...@gm...> - 2008-11-03 20:03:29
|
On Mon, Nov 3, 2008 at 6:12 PM, Sérgio Durigan Júnior <ser...@li...> wrote:
> On Wed, 2008-10-15 at 11:42 -0200, Sérgio Durigan Júnior wrote:
>
>> 2) Julian said that detecting locking primitives using only instructions
>> is too complex, maybe impossible. Well, but as far as I understood, you
>> are assuming a "general locking primitives detector". What if we limit
>> this problem only to the locking primitives present in the libgomp?
>> Would that be easier to do? (Of course it has a down side because every
>> time the libgomp changed, we would have to change Valgrind too... But I
>> think it's a valid question anyway)
>
> Basically, I'd like to know: if I start to implement this idea, what are
> the chances for me to have to change something in the VEX "package"?
> Will I certainly have to modify VEX's source, or only Valgrind's core?

While the VEX library is a great basis for tools that dynamically instrument binaries, recognizing locking primitives in assembly code might need a more powerful approach. In this context it might be interesting to know that the algorithms needed for translating assembly language into D-structures are well documented (D-structures = Dijkstra's one-in/one-out structures). See e.g. F. Zhang and E. D'Hollander, "Using Hammock Graphs to Structure Programs", IEEE Transactions on Software Engineering, http://portal.acm.org/citation.cfm?id=977250.977393&coll=ACM&dl=ACM&CFID=9308972&CFTOKEN=18919655.

Bart.
|
From: Bart V. A. <bar...@gm...> - 2008-10-12 17:29:45
|
On Sun, Oct 12, 2008 at 10:48 AM, Julian Seward <js...@ac...> wrote:
> That would be a better solution. The function intercepting is complex
> (to implement) and fragile, especially on ppc64-linux. The problem is how
> to deduce, from observing the atomic instructions and sys_futex calls, what
> the resulting inter-thread synchronisations are. I have considered this
> problem a bit but cannot see any solution that does not require a very large
> runtime overhead, and a lot of complexity.

Another question is whether it is even possible to deduce the inter-thread synchronization information from the executed instructions alone. Valgrind only sees the executed instructions. IMHO knowledge of the complete algorithm is needed in order to deduce which inter-thread synchronization operation is being executed.

Bart.
|
From: Julian S. <js...@ac...> - 2008-10-12 17:43:10
|
On Sunday 12 October 2008, Bart Van Assche wrote:
> On Sun, Oct 12, 2008 at 10:48 AM, Julian Seward <js...@ac...> wrote:
> > That would be a better solution. The function intercepting is complex
> > (to implement) and fragile, especially on ppc64-linux. The problem is
> > how to deduce, from observing the atomic instructions and sys_futex
> > calls, what the resulting inter-thread synchronisations are. I have
> > considered this problem a bit but cannot see any solution that does not
> > require a very large runtime overhead, and a lot of complexity.
>
> Another question is whether it is even possible to deduce the
> inter-thread synchronization information from the executed
> instructions alone. Valgrind only sees the executed instructions. IMHO
> the complete algorithm is needed in order to deduce information about
> which inter-thread synchronization operation is being executed.

Yes, I agree. The wording in my previous message was poor. I am not claiming that this is even possible in the general case. I just don't know. I guess any attempt to solve this would need to look at all the data dependencies and how they affect the control flow. Definitely a research-level question.

J
|
From: Julian S. <js...@ac...> - 2008-10-15 16:27:11
|
> 1) If we could ask GCC guys to generate debugging information telling
> where each inline function is, would that help Valgrind to intercept the
> calls?
We need to see the entry and exit of these functions, along with the
arguments and return value. This is all very difficult because the
point of inlining is only partially to avoid a function call. More
important is that the compiler can then transform the merged caller-callee
pair arbitrarily, as it wants. This means there may not really be any
well defined boundary between the caller and callee when it's done.
That's all a bit abstract. Example. Before:
void foo ( int x ) {
if (x) { A } else { B }
}
void bar ( int x ) {
foo(x);
C;
}
The "obvious" result of inlining is
void bar ( int x ) {
if (x) { A } else { B }
C;
}
But suppose C is small enough to duplicate; or for whatever reason, placing
it directly after A and B is beneficial. Then this might be the result:
void bar ( int x ) {
if (x) { A; C; } else { B; C; }
}
Now you really have to mark _two_ different exit points from the inlined
"foo". etc, etc.
In the general case I don't think this idea is likely to work, unfortunately.
A simpler solution would be simply not to inline this stuff.
Maybe the gcc people can make a better suggestion.
> 2) Julian said that detecting locking primitives using only instructions
> is too complex, maybe impossible. Well, but as far as I understood, you
> are assuming a "general locking primitives detector". What if we limit
> this problem only to the locking primitives present in the libgomp?
> Would that be easier to do? (Of course it has a down side because every
> time the libgomp changed, we would have to change Valgrind too... But I
> think it's a valid question anyway)
A general locking primitives detector would be really useful, although
(as Bart suggests) maybe impossible. Maybe it is equivalent to solving the
halting problem. I don't know.
From my brief investigation of the libgomp primitives, they are the same as, or
similar to, those which libpthread uses. So a solution for libgomp would also
allow us to see inside libpthread, which would be good. But to be honest,
overall I simply don't understand enough about the problem at this point to
answer this question properly.
J
|
|
From: Bart V. A. <bar...@gm...> - 2008-10-15 17:58:40
|
On Wed, Oct 15, 2008 at 6:11 PM, Julian Seward <js...@ac...> wrote:
>> 2) Julian said that detecting locking primitives using only instructions
>> is too complex, maybe impossible. Well, but as far as I understood, you
>> are assuming a "general locking primitives detector". What if we limit
>> this problem only to the locking primitives present in the libgomp?
>> Would that be easier to do? (Of course it has a down side because every
>> time the libgomp changed, we would have to change Valgrind too... But I
>> think it's a valid question anyway)
[ ... ]
> From my brief investigation of the libgomp primitives, they are the same or
> similar to that which libpthread uses. So a solution to libgomp would also
> allow us to see inside libpthread, which would be good. But to be honest,
> overall I simply don't understand enough about the problem at this point to
> answer this question properly.

My opinion is that the extraction of information about locking primitives from binary executables is a really powerful technique and would be a very interesting addition to Valgrind. The big question here is whether this is possible. This is at least a challenging research topic.

In order to detect as many programming errors as possible, tools like Helgrind and DRD discern, among others, the following primitives:
* atomic modifications of variables.
* mutex lock, unlock and trylock operations.
* semaphore post, wait and trywait operations.
* condition variables.
* reader-writer locks.
* barriers.

One issue that puzzles me is the following: it is possible to implement a semaphore using one mutex and one condition variable, and it is possible to implement a mutex using one semaphore. libpthread implements semaphores, mutexes and condition variables via futexes. So how is it possible to tell, by analyzing only futex calls and control flow, whether the (library) programmer has implemented a mutex or a semaphore?

Bart.
|
From: Sérgio D. J. <ser...@li...> - 2008-10-15 18:41:45
|
Hello Bart,

On Wed, 2008-10-15 at 19:40 +0200, Bart Van Assche wrote:
> On Wed, Oct 15, 2008 at 3:42 PM, Sérgio Durigan Júnior
> <ser...@li...> wrote:
> > 1) If we could ask GCC guys to generate debugging information telling
> > where each inline function is, would that help Valgrind to intercept the
> > calls?
>
> A possible alternative is to make sure that none of the
> synchronization primitives in libgomp is declared inline, and to ask
> Valgrind users to install the libgomp debuginfo package.

IMHO this is not really an alternative. Actually, I think that the more multithreaded applications evolve, the more we'll see things like inlined primitive functions. Also, I don't think the GCC guys will ever accept a request to remove the inline :-)

Regards,
--
Sérgio Durigan Júnior
Linux on Power Toolchain - Software Engineer
Linux Technology Center - LTC
IBM Brazil
|
From: Julian S. <js...@ac...> - 2008-10-16 01:04:26
|
> > A possible alternative is to make sure that none of the
> > synchronization primitives in libgomp is declared inline, and to ask
> > Valgrind users to install the libgomp debuginfo package.
>
> IMHO this is not really an alternative. Actually I think that the more
> multithreading applications evolve, the more we'll see things like
> inlining primitive functions. Also, I don't think that GCC guys will
> ever accept a request to take off the inline :-)

I suspect you are right (unfortunately).

J
|
From: Julian S. <js...@ac...> - 2008-10-16 01:04:35
|
> My opinion is that the extraction of information about locking
> primitives from binary executables is a really powerful technique and
> would be a very interesting addition to Valgrind. The big question
> here is whether this is possible. This is at least a challenging
> research topic.

Yes. I agree.

> One issue that puzzles me is the following: it is possible to
> implement a semaphore using one mutex and one condition variable, and
> it is possible to implement a mutex using one semaphore. libpthread
> implements semaphores, mutexes and condition variables via futexes. So
> how is it possible by only analyzing futex calls and control flow
> whether the (library) programmer has implemented a mutex or a
> semaphore ?

Probably irrelevant, but... would it maybe help to model mutexes and semaphores in terms of simple message passing between threads? E.g., an unlock causes a token to be placed in a mailbox associated with the mutex. A lock request causes the requesting thread to remove the token from the mailbox associated with the mutex. If the mailbox is empty, then the thread blocks. Anyway, I wonder whether all such primitives can be modelled like this, and/or whether that is helpful for this problem. Maybe not.

J
|
From: Bart V. A. <bar...@gm...> - 2008-10-16 06:24:00
|
On Wed, Oct 15, 2008 at 11:57 PM, Julian Seward <js...@ac...> wrote:
>
>> One issue that puzzles me is the following: it is possible to
>> implement a semaphore using one mutex and one condition variable, and
>> it is possible to implement a mutex using one semaphore. libpthread
>> implements semaphores, mutexes and condition variables via futexes. So
>> how is it possible by only analyzing futex calls and control flow
>> whether the (library) programmer has implemented a mutex or a
>> semaphore ?
>
> Probably irrelevant, but .. would it maybe help to model mutexes,
> semaphores, in terms of simple message passing between threads?

If I understand the above correctly, the message passing model is an alternative way to model the happens-before relationship between threads. Since both libgomp and libpthread use the futex system call on Linux for synchronizing threads, it is probably possible to extract all happens-before relationships between threads by intercepting this system call. This information should be sufficient as input for a happens-before data-race detector.

This approach has two disadvantages however:
- Not portable to other Unixes -- futexes are Linux-specific.
- Pthreads API-level checks are no longer possible.

The most comprehensive overview of futexes I know of is the following: http://people.redhat.com/drepper/futex.pdf.

Bart.
|
From: Bart V. A. <bar...@gm...> - 2008-10-16 07:31:20
|
On Wed, Oct 15, 2008 at 11:57 PM, Julian Seward <js...@ac...> wrote:
> Probably irrelevant, but .. would it maybe help to model mutexes,
> semaphores, in terms of simple message passing between threads?

The most difficult part is not how to model inter-thread ordering, but how to extract this ordering information from a running program. E.g., some atomic operations (A) have inter-thread ordering semantics, while no inter-thread ordering semantics should be assigned to other atomic operations (B). Maybe the atomic operations that control the execution of futex system calls fall into category (A), and those that do not control the execution of futex system calls fall into category (B), but I'm not sure about this.

Bart.