|
From: Florian K. <br...@ac...> - 2011-10-27 04:16:17
|
helgrind/tests/cond_timedwait_invalid is failing on x86-linux.
There is an additional line in the backtrace. Unfiltered, the
backtrace looks like this:
==28111== Thread #1's call to pthread_cond_timedwait failed
==28111== with error code 22 (EINVAL: Invalid argument)
==28111== at 0x4029F3B: pthread_cond_timedwait_WRK (hg_intercepts.c:784)
==28111== by 0x4029F7F: pthread_cond_timedwait@* (hg_intercepts.c:797)
==28111== by 0x4125DF3: pthread_cond_timedwait@@GLIBC_2.3.2 (forward.c:152)
==28111== by 0x80485DF: main (cond_timedwait_invalid.c:22)
Note the line referring to pthread_cond_timedwait@@GLIBC_2.3.2
which should not be there because pthread_cond_timedwait is
supposed to be intercepted and wrapped.
I poked around a bit and found that this testcase is linked against -lrt
instead of -lpthread. (why?) If we link against -lpthread instead, the
...@@GLIBC_2.3.2.... line is gone. But that would only cure the symptom.
It seems that on my system the symbol pthread_cond_timedwait exists in
two places:
(1) /usr/lib/debug/lib/libc-2.12.1.so
$ nm /usr/lib/debug/lib/libc-2.12.1.so | grep pthread_cond_timedwait
000dddb0 T pthread_cond_timedwait@@GLIBC_2.3.2
0010fa50 T pthread_cond_timedwait@GLIBC_2.0
(2) /usr/lib/debug/lib/libpthread-2.12.1.so
$ nm /usr/lib/debug/lib/libpthread-2.12.1.so | grep pthread_cond_timedwait
0000a730 T pthread_cond_timedwait@@GLIBC_2.3.2
0000ae50 T pthread_cond_timedwait@GLIBC_2.0
And both of these libraries get loaded. Running with --trace-symtab=yes
--trace-redir=yes shows:
--28287-- Reading syms from /lib/libc-2.12.1.so (0x4048000)
......
raw symbol [7032]: GLO FUN : svma 0x00000dddb0, sz 74 pthread_cond_timedwait@@GLIBC_2.3.2
rec(t) [7032]: val 0x0004125db0, sz 74 pthread_cond_timedwait@@GLIBC_2.3.2
--28287-- Reading syms from /lib/libpthread-2.12.1.so (0x41a5000)
......
raw symbol [ 575]: GLO FUN : svma 0x000000a730, sz 715 pthread_cond_timedwait@@GLIBC_2.3.2
rec(t) [ 575]: val 0x00041af730, sz 715 pthread_cond_timedwait@@GLIBC_2.3.2
......
==28287== Adding active redirection:
--28287-- new: 0x041af730 (pthread_cond_timedwait@@GLIBC_2.3.2) W-> (0000.0) 0x04029f69 pthread_cond_timedwait@*
--28287-- REDIR: 0x41af730 (pthread_cond_timedwait@@GLIBC_2.3.2) redirected to 0x4029f69 (pthread_cond_timedwait@*)
Does this look like a problem in the redirection machinery?
If so, any pointers on how to debug this further would be welcome.
Florian
|
|
From: Julian S. <js...@ac...> - 2011-10-27 07:11:02
|
On Thursday, October 27, 2011, Florian Krohm wrote:
> helgrind/tests/cond_timedwait_invalid is failing on x86-linux.
> There is an additional line in the backtrace. Unfiltered, the
> backtrace looks like this:
>
> ==28111== Thread #1's call to pthread_cond_timedwait failed
> ==28111== with error code 22 (EINVAL: Invalid argument)
> ==28111== at 0x4029F3B: pthread_cond_timedwait_WRK (hg_intercepts.c:784)
> ==28111== by 0x4029F7F: pthread_cond_timedwait@* (hg_intercepts.c:797)
> ==28111== by 0x4125DF3: pthread_cond_timedwait@@GLIBC_2.3.2 (forward.c:152)
> ==28111== by 0x80485DF: main (cond_timedwait_invalid.c:22)
>
> Note the line referring to pthread_cond_timedwait@@GLIBC_2.3.2
> which should not be there because pthread_cond_timedwait is
> supposed to be intercepted and wrapped.

My guess as to what's happening is: main calls pthread_cond_timedwait,
which winds up in the libc.so version. That has page offset 0xdb0, and
the call out from it has a return point with page offset 0xDF3, just a
bit along from the start, so that seems plausible. This function in
turn calls the pthread_cond_timedwait in libpthread, which is
intercepted as required by the intercept machinery.

The intercept specifications are (soname, functionname) pairs, so
there is no possibility that the intercept machinery would accidentally
intercept pthread_cond_timedwait@* in libc.so instead of the version
in libpthread.so. And indeed ..

> (1) /usr/lib/debug/lib/libc-2.12.1.so
> $ nm /usr/lib/debug/lib/libc-2.12.1.so | grep pthread_cond_timedwait
> 000dddb0 T pthread_cond_timedwait@@GLIBC_2.3.2
> 0010fa50 T pthread_cond_timedwait@GLIBC_2.0
>
> (2) /usr/lib/debug/lib/libpthread-2.12.1.so
> $ nm /usr/lib/debug/lib/libpthread-2.12.1.so | grep pthread_cond_timedwait
> 0000a730 T pthread_cond_timedwait@@GLIBC_2.3.2
> 0000ae50 T pthread_cond_timedwait@GLIBC_2.0

> ==28287== Adding active redirection:
> --28287-- new: 0x041af730 (pthread_cond_timedwait@@GLIBC_2.3.2) W-> (0000.0) 0x04029f69 pthread_cond_timedwait@*
>
> --28287-- REDIR: 0x41af730 (pthread_cond_timedwait@@GLIBC_2.3.2) redirected to 0x4029f69 (pthread_cond_timedwait@*)

These both intercept a function with page offset 0x730, which can
only be pthread_cond_timedwait@@GLIBC_2.3.2 in libpthread-2.12.1.so.

So I don't think there's a redir bug.

Now, why it's linked against librt instead of libpthread, I have no
idea. Can you get any useful info from svn ann and then svn log of
the relevant Makefile.am? All else being equal, it sounds to me
cleaner to link against libpthread, but perhaps there was some
special reason to link against librt.

J
|
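The page-offset reasoning above can be checked mechanically. The following is a minimal sketch, assuming 4 KiB pages and a page-aligned load address; the helper names (page_offset, same_page_offset, load_bias) are made up for illustration and are not part of Valgrind.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of an address within its 4 KiB page (the low 12 bits).
   Assumption: 4 KiB pages, as on x86-linux. */
static uint32_t page_offset(uint32_t addr)
{
    return addr & 0xFFFu;
}

/* Nonzero if a runtime address could be a relocated copy of the given
   symbol address. A page-aligned load bias never changes the low 12 bits. */
static int same_page_offset(uint32_t runtime_addr, uint32_t svma)
{
    return page_offset(runtime_addr) == page_offset(svma);
}

/* Difference between a symbol's runtime address and its address in
   the file, i.e. where the object was mapped. */
static uint32_t load_bias(uint32_t runtime_addr, uint32_t svma)
{
    assert(runtime_addr >= svma);
    return runtime_addr - svma;
}
```

With the numbers from the trace, page_offset(0x041af730) is 0x730, which matches the libpthread symbol at 0x0000a730 and not the libc one at 0x000dddb0. Likewise load_bias(0x04125db0, 0x000dddb0) gives 0x04048000, the address printed on the "Reading syms from /lib/libc-2.12.1.so" line.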
|
From: Florian K. <br...@ac...> - 2011-10-27 13:59:48
|
On 10/27/2011 03:07 AM, Julian Seward wrote:
>
> So I don't think there's a redir bug.
>
Oh, good. Thanks for the details.
> Now, why it's linked against librt instead of libpthread, I have no
> idea. Can you get any useful info from svn ann and then svn log of
> the relevant Makefile.am?
Actually, the way the testcase is written it does require linking
against -lrt. Otherwise, function clock_gettime will not be found.
When I was debugging it yesterday, I had compressed the testcase and
eliminated the clock_gettime call.
The testcase came into existence in r12164 calling clock_gettime on
all platforms.
In r12213 Bart made the testcase work on Darwin with this patch:
+
+#ifdef HAVE_CLOCK_GETTIME
assert(clock_gettime(CLOCK_REALTIME, &abstime)==0);
+#else
+ abstime.tv_sec = time(NULL) + 2;
+ abstime.tv_nsec = 0;
+#endif
Reading through the man pages suggests that we should be able to use the
#else clause on Linux as well and thereby eliminate the need for -lrt.
I'm suggesting this patch:
Index: helgrind/tests/cond_timedwait_invalid.c
===================================================================
--- helgrind/tests/cond_timedwait_invalid.c (revision 12237)
+++ helgrind/tests/cond_timedwait_invalid.c (working copy)
@@ -1,4 +1,4 @@
-#include "config.h"
+
#include <time.h>
#include <pthread.h>
#include <assert.h>
@@ -10,12 +10,12 @@
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
-#ifdef HAVE_CLOCK_GETTIME
- assert(clock_gettime(CLOCK_REALTIME, &abstime)==0);
-#else
+
+
+
abstime.tv_sec = time(NULL) + 2;
abstime.tv_nsec = 0;
-#endif
+
abstime.tv_nsec += 1000000000;
assert(pthread_mutex_lock(&mutex)==0);
Index: helgrind/tests/Makefile.am
===================================================================
--- helgrind/tests/Makefile.am (revision 12237)
+++ helgrind/tests/Makefile.am (working copy)
@@ -185,8 +185,3 @@
else
annotate_hbefore_CFLAGS = $(AM_CFLAGS)
endif
-
-if VGCONF_OS_IS_LINUX
-cond_timedwait_invalid_LDADD = -lrt
-endif
-
With this patch it passes on x86 and also on that old s390x system I'm
using. The testcase was failing there as well with the same symptom.
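For reference, here is a rough sketch of what the test boils down to once clock_gettime is gone. It is not the actual test file: the function name timedwait_with_invalid_nsec is made up for illustration, and the expectation of EINVAL is taken from the Helgrind output quoted earlier in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical helper: calls pthread_cond_timedwait with an
   out-of-range tv_nsec and returns the resulting error code. */
static int timedwait_with_invalid_nsec(void)
{
    pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
    pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
    struct timespec abstime;
    int rc;

    /* time() lives in libc, so no -lrt is needed for this part. */
    abstime.tv_sec = time(NULL) + 2;
    abstime.tv_nsec = 0;
    /* Valid tv_nsec values are 0 .. 999999999; this is one past that. */
    abstime.tv_nsec += 1000000000;

    rc = pthread_mutex_lock(&mutex);
    assert(rc == 0);
    rc = pthread_cond_timedwait(&cond, &mutex, &abstime);
    pthread_mutex_unlock(&mutex);
    return rc;
}
```

Since the deadline is rejected for its tv_nsec value before any waiting happens, the precision of the starting time does not matter here, which is why the coarser time(NULL) should be enough.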
Florian
|
|
From: Philippe W. <phi...@sk...> - 2011-10-27 19:45:01
|
> Reading through the man pages suggests that we should be able to use the
> #else clause on Linux as well and thereby eliminate the need for -lrt.
> I'm suggesting this patch:

Patch looks ok to me: there is no need to call clock_gettime
to verify the behaviour of an invalid pthread_cond_timedwait.

Philippe
|
|
From: Florian K. <br...@ac...> - 2011-10-28 00:17:07
|
On 10/27/2011 03:44 PM, Philippe Waroquiers wrote:
>> Reading through the man pages suggests that we should be able to use the
>> #else clause on Linux as well and thereby eliminate the need for -lrt.
>> I'm suggesting this patch:
> Patch looks ok to me : there is no need to call clock_gettime
> to verify the behaviour of an invalid pthread_cond_timedwait.
>
OK. Thanks. Committed as r12246.

Florian
|