From: Alberto B. <alb...@te...> - 2005-04-28 05:19:47
Hi!

I'm getting a very strange message from valgrind 2.4.0, on Linux 2.6.11.2
with glibc 2.3.4 in a Pentium 4 2.8Ghz with HT, gcc 3.3.5:

==2005-04-28 01:21:07.834 12775==
==2005-04-28 01:21:07.834 12775== Thread 2:
==2005-04-28 01:21:07.834 12775== Syscall param futex(futex) points to unaddressable byte(s)
==2005-04-28 01:21:07.834 12775==    at 0x1B8E4882: (within /lib/ld-2.3.4.so)
==2005-04-28 01:21:07.834 12775==    by 0x8049BF5: filo_unlock (libfilo.c:474)
==2005-04-28 01:21:07.834 12775==    by 0x80489FA: locoth (lockloco.c:58)
==2005-04-28 01:21:07.834 12775==    by 0x1B92095A: start_thread (in /lib/libpthread-2.3.4.so)
==2005-04-28 01:21:07.834 12775==    by 0x1B9DDA69: clone (in /lib/libc-2.3.4.so)
==2005-04-28 01:21:07.834 12775==  Address 0x1D3620C0 is 24 bytes inside a block of size 48 free'd
==2005-04-28 01:21:07.834 12775==    at 0x1B902B5C: free (vg_replace_malloc.c:152)
==2005-04-28 01:21:07.834 12775==    by 0x804971E: wait_on (libfilo.c:365)
==2005-04-28 01:21:07.834 12775==    by 0x8049A73: filo_lock (libfilo.c:433)
==2005-04-28 01:21:07.834 12775==    by 0x80489CD: locoth (lockloco.c:37)
==2005-04-28 01:21:07.834 12775==    by 0x1B92095A: start_thread (in /lib/libpthread-2.3.4.so)
==2005-04-28 01:21:07.834 12775==    by 0x1B9DDA69: clone (in /lib/libc-2.3.4.so)

This happens in the following context: I've written a small library to do
userspace file locking, and I wrote a little test application that creates
three threads, each of which locks and unlocks random regions in an
infinite loop.

After a large number of iterations (about 5500000, 30650000, 3050000 and
6500000 in the four times I managed to reproduce it) I see the message
above, once. Everything seems to run just fine afterwards, and I have no
idea what can cause it.

The library uses a semaphore to wake threads up when it has contention, and
it seems that the failure is related to it. However, the code is quite
simple and it seems to work just fine. When there is contention on a
region, one thread waits for a semaphore, and when the lock owner releases
it, it posts the semaphore, which wakes the sleeping one up.

The second warning about the free also worries me: the pointer being freed
is malloc()ed a couple of lines before in the same function and it
shouldn't be altered. Maybe it's a side effect of the first one, I just
don't know.

I was wondering if anyone had an idea of what might be causing this.

If you're interested in trying to reproduce it, you can find the code that
causes this problem at http://users.auriga.wearlab.de/~alb/libfilo/

If you unpack the tarball, and run

    gcc -Wall -D_XOPEN_SOURCE=500 -O0 -D_LARGEFILE_SOURCE=1 \
        -D_LARGEFILE64_SOURCE=1 -D_LFS_LARGEFILE=1 -D_LFS64_LARGEFILE=1 \
        -D_FILE_OFFSET_BITS=64 `getconf LFS_CFLAGS 2>/dev/null` -g \
        tests/lockloco.c -o lockloco libfilo.c -lpthread

you will build "lockloco", which is the test application I use. Then, run
it without arguments and that's it.

If anyone wants further explanation about what the code does, why or how;
or needs any other data, please let me know.

Thanks a lot,
Alberto

PS: please CC me on replies as I'm not subscribed to the mailing list.
|
From: Jeremy F. <je...@go...> - 2005-04-29 18:26:13
Alberto Bertogli wrote:
>Hi!
>
>I'm getting a very strange message from valgrind 2.4.0, on Linux 2.6.11.2
>with glibc 2.3.4 in a Pentium 4 2.8Ghz with HT, gcc 3.3.5:
>
>==2005-04-28 01:21:07.834 12775==
>==2005-04-28 01:21:07.834 12775== Thread 2:
>==2005-04-28 01:21:07.834 12775== Syscall param futex(futex) points to
> unaddressable byte(s)
>==2005-04-28 01:21:07.834 12775== at 0x1B8E4882: (within
> /lib/ld-2.3.4.so)
>==2005-04-28 01:21:07.834 12775== by 0x8049BF5: filo_unlock
> (libfilo.c:474)
>==2005-04-28 01:21:07.834 12775== by 0x80489FA: locoth (lockloco.c:58)
>==2005-04-28 01:21:07.834 12775== by 0x1B92095A: start_thread (in
> /lib/libpthread-2.3.4.so)
>==2005-04-28 01:21:07.834 12775== by 0x1B9DDA69: clone (in
> /lib/libc-2.3.4.so)
>==2005-04-28 01:21:07.834 12775== Address 0x1D3620C0 is 24 bytes inside a
> block of size 48 free'd
>==2005-04-28 01:21:07.834 12775== at 0x1B902B5C: free
> (vg_replace_malloc.c:152)
>==2005-04-28 01:21:07.834 12775== by 0x804971E: wait_on (libfilo.c:365)
>==2005-04-28 01:21:07.834 12775== by 0x8049A73: filo_lock
> (libfilo.c:433)
>==2005-04-28 01:21:07.834 12775== by 0x80489CD: locoth (lockloco.c:37)
>==2005-04-28 01:21:07.834 12775== by 0x1B92095A: start_thread (in
> /lib/libpthread-2.3.4.so)
>==2005-04-28 01:21:07.834 12775== by 0x1B9DDA69: clone (in
> /lib/libc-2.3.4.so)
>[...]
>
>
>The second warning about the free also worries me: the pointer being freed
>is malloc()ed a couple of lines before in the same function and it
>shouldn't be altered. Maybe it's a side effect of the first one, I just
>don't know.
>
>
That isn't a second warning; it's the second part of the one warning.
The first part is saying that you're passing a pointer to unaddressable
memory to futex; the second part is telling you how it became
unaddressable. Basically, you're freeing some memory too early, while
another thread is still using it as a lock. I'm guessing you have a
race condition which is allowing your lock to be freed while it's still
in use (perhaps while it is still locked).
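
To illustrate the kind of race described above, here is a minimal sketch of
one way to keep a heap-allocated semaphore alive until both threads are done
with it. It is not taken from libfilo: struct waiter, waiter_new, waiter_put,
waker and handoff_once are hypothetical names, and the real wait_on() and
filo_unlock() may be organised quite differently. The idea is that if the
sleeper calls free() as soon as sem_wait() returns, the waker may still be
inside sem_post() touching the same memory; giving each side its own
reference means the block is only freed by whichever thread finishes last.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical wait record; NOT the layout libfilo actually uses. */
struct waiter {
    sem_t sem;        /* the sleeping thread blocks here */
    atomic_int refs;  /* one reference per thread that may still touch sem */
};

static struct waiter *waiter_new(void)
{
    struct waiter *w = malloc(sizeof(*w));
    if (w == NULL)
        return NULL;
    if (sem_init(&w->sem, 0, 0) != 0) {
        free(w);
        return NULL;
    }
    atomic_init(&w->refs, 2);  /* sleeper + waker */
    return w;
}

/* Drop one reference; the last thread to let go frees the block. */
static void waiter_put(struct waiter *w)
{
    int prev = atomic_fetch_sub(&w->refs, 1);
    assert(prev >= 1);
    if (prev == 1) {
        sem_destroy(&w->sem);
        free(w);
    }
}

/* Waker side: its reference is dropped only after sem_post() has returned. */
static void *waker(void *arg)
{
    struct waiter *w = arg;
    sem_post(&w->sem);
    waiter_put(w);
    return NULL;
}

/* Sleeper side: run one wait/wake handoff; returns 0 on success. */
int handoff_once(void)
{
    pthread_t t;
    struct waiter *w = waiter_new();
    if (w == NULL)
        return -1;
    if (pthread_create(&t, NULL, waker, w) != 0) {
        sem_destroy(&w->sem);
        free(w);
        return -1;
    }
    while (sem_wait(&w->sem) != 0 && errno == EINTR)
        ;  /* retry if interrupted by a signal */
    waiter_put(w);  /* a bare free(w) here would be the racy version */
    pthread_join(t, NULL);
    return 0;
}
```

Calling handoff_once() in a loop under valgrind should show whether the
pattern holds up on a given system. The reference count is only one option;
doing the post and the free while holding the same mutex would be another.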
J