|
From: Jeremy F. <je...@go...> - 2005-01-15 11:17:25
|
On Sat, 2005-01-15 at 20:59 +1100, Eyal Lebedinsky wrote: > SYSCALL[29282,2](120) special:sys_clone ( 1200011, 0x0, 0x0, 0x0, 0x1D190BF8 )sys_fork ( ) fork: process 29282 created child 29303 > --> 0 (0x0) > --> 29303 (0x7277) > SYSCALL[29282,2]( 7) mayBlock:sys_waitpid ( 29303, 0x1D1909B4, 0 ) --> ... > SYSCALL[29303,2]( 4) mayBlock:sys_write ( 2, 0x1BFC3378, 635 ) --> ... > SYSCALL[29303,2]( 4) --> 635 (0x27B) > SYSCALL[29303,2](175) special:sys_rt_sigprocmask ( 1, 0x1D1906A4, 0x0, 8 ) --> 0 (0x0) > SYSCALL[29303,2](270):sys_tgkill ( 29282, 29300, 6 )==2005-01-15 20:56:02.227 29282== This looks like an assertion failure in the child process very shortly after a fork. It kills threadgroup 29282. > ==2005-01-15 20:56:02.227 29282== Process terminating with default action of signal 6 (SIGABRT) > ==2005-01-15 20:56:02.228 29282== at 0x1BE59511: (within /lib/tls/libpthread-0.60.so) > ==2005-01-15 20:56:02.228 29282== > ==2005-01-15 20:56:02.228 29282== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 25 from 1) > ==2005-01-15 20:56:02.228 29282== malloc/free: in use at exit: 10008 bytes in 26 blocks. > ==2005-01-15 20:56:02.228 29282== malloc/free: 34 allocs, 8 frees, 14241 bytes allocated. > ==2005-01-15 20:56:02.228 29282== For counts of detected errors, rerun with: -v > ==2005-01-15 20:56:02.228 29282== searching for pointers to 26 not-freed blocks. > ==2005-01-15 20:56:02.489 29282== checked 24775060 bytes. > ==2005-01-15 20:56:02.490 29282== > ==2005-01-15 20:56:02.490 29282== 68 bytes in 1 blocks are possibly lost in loss record 4 of 16 > ==2005-01-15 20:56:02.490 29282== at 0x1B906FE5: calloc (vg_replace_malloc.c:175) > ==2005-01-15 20:56:02.490 29282== by 0x1B8F25A8: (within /lib/ld-2.3.2.so) > ==2005-01-15 20:56:02.490 29282== by 0x1B8F287B: _dl_allocate_tls (in /lib/ld-2.3.2.so) > ==2005-01-15 20:56:02.490 29282== by 0x1BE5424A: allocate_stack (in /lib/tls/libpthread-0.60.so) > ==2005-01-15 20:56:02.490 29282== by 0x1BE53C54: pthread_create@@GLIBC_2.1 (in /lib/tls/libpthread-0.60.so) > ==2005-01-15 20:56:02.490 29282== by 0x1BD1CC1F: ??? (thread.c:793) > ==2005-01-15 20:56:02.490 29282== by 0x1BD1DD18: skthcr (thread.c:1401) > ==2005-01-15 20:56:02.490 29282== by 0x1BCFA06B: skmain (main.c:1237) > ==2005-01-15 20:56:02.490 29282== by 0x804A4B9: main (ssan3sv.c:319) > ==2005-01-15 20:56:02.490 29282== > ==2005-01-15 20:56:02.490 29282== > ==2005-01-15 20:56:02.490 29282== 268 bytes in 1 blocks are definitely lost in loss record 7 of 16 > ==2005-01-15 20:56:02.490 29282== at 0x1B9065C1: malloc (vg_replace_malloc.c:130) > ==2005-01-15 20:56:02.490 29282== by 0x1BD37D95: suwmal (malloc.c:46) > ==2005-01-15 20:56:02.490 29282== by 0x1BCF7579: ??? (log.c:298) > ==2005-01-15 20:56:02.490 29282== by 0x1BCF78D0: sksetp (log.c:520) > ==2005-01-15 20:56:02.490 29282== by 0x1BD033A2: skredi (redir.c:217) > ==2005-01-15 20:56:02.490 29282== by 0x1BCF9E20: skmain (main.c:1124) > ==2005-01-15 20:56:02.490 29282== by 0x804A4B9: main (ssan3sv.c:319) > ==2005-01-15 20:56:02.490 29282== > ==2005-01-15 20:56:02.490 29282== LEAK SUMMARY: > ==2005-01-15 20:56:02.490 29282== definitely lost: 268 bytes in 1 blocks. > ==2005-01-15 20:56:02.490 29282== possibly lost: 68 bytes in 1 blocks. > ==2005-01-15 20:56:02.490 29282== still reachable: 9672 bytes in 24 blocks. > ==2005-01-15 20:56:02.490 29282== suppressed: 0 bytes in 0 blocks. > ==2005-01-15 20:56:02.490 29282== Reachable blocks (those to which a pointer was found) are not shown. > ==2005-01-15 20:56:02.490 29282== To see them, rerun with: --show-reachable=yes > --> 0 (0x0) > SYSCALL[29303,2](174) special:sys_rt_sigaction ( 6, 0x1D1905DC, 0x0, 8 ) --> 0 (0x0) > SYSCALL[29303,2]( 91):sys_munmap ( 0x1BE3B000, 4096 ) --> 0 (0x0) > SYSCALL[29303,2](270):sys_tgkill ( 29282, 29300, 6 ) --> 0 (0x0) And this looks like a bug, but I'm not sure whose. 29303, the process created by the fork(), it trying to send itself SIGABRT with tgkill, after (presumably) using sigaction to set SIGABRT to SIG_DFL. But it's using the threadgroup id of its parent along with its own pid, which should cause it to fail with ESRCH. It fails to do the kill, because there's no pending SIGABRT, but the syscall doesn't return an error code. So, it could be a libc bug (for using a bad tgkill after fork()), a Valgrind bug (for confusing the client into using a bad tgkill), and/or a kernel bug (for not making tgkill return an error code when it should). Because the SIGABRT doesn't get delivered to 29303, it carries on to the following instruction, which is hlt, and so this crash. So the real question is why there's an assertion failure? Are you positive that your processes are not crashing like this outside of Valgrind? > --2005-01-15 20:56:02.496 29303-- disInstr: unhandled instruction bytes: 0xF4 0xEB 0xDE 0xC7 > --2005-01-15 20:56:02.496 29303-- at 0x1BEB1E23: abort (in /lib/tls/libc-2.3.2.so) > ==2005-01-15 20:56:02.496 29303== > ==2005-01-15 20:56:02.496 29303== Process terminating with default action of signal 4 (SIGILL) > ==2005-01-15 20:56:02.496 29303== Illegal operand at address 0xB00896F0 > ==2005-01-15 20:56:02.496 29303== at 0x1BEB1E23: abort (in /lib/tls/libc-2.3.2.so) > ==2005-01-15 20:56:02.496 29303== > ==2005-01-15 20:56:02.496 29303== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 25 from 1) > ==2005-01-15 20:56:02.496 29303== malloc/free: in use at exit: 10008 bytes in 26 blocks. > ==2005-01-15 20:56:02.496 29303== malloc/free: 38 allocs, 12 frees, 15977 bytes allocated. > ==2005-01-15 20:56:02.496 29303== For counts of detected errors, rerun with: -v > ==2005-01-15 20:56:02.496 29303== searching for pointers to 26 not-freed blocks. > [29267] end: Sat Jan 15 20:56:02 EST 2005 > ==2005-01-15 20:56:02.665 29303== checked 24771384 bytes. > ==2005-01-15 20:56:02.665 29303== > ==2005-01-15 20:56:02.665 29303== 68 bytes in 1 blocks are possibly lost in loss record 4 of 16 > ==2005-01-15 20:56:02.665 29303== at 0x1B906FE5: calloc (vg_replace_malloc.c:175) > ==2005-01-15 20:56:02.665 29303== by 0x1B8F25A8: (within /lib/ld-2.3.2.so) > ==2005-01-15 20:56:02.665 29303== by 0x1B8F287B: _dl_allocate_tls (in /lib/ld-2.3.2.so) > ==2005-01-15 20:56:02.665 29303== by 0x1BE5424A: allocate_stack (in /lib/tls/libpthread-0.60.so) > ==2005-01-15 20:56:02.665 29303== by 0x1BE53C54: pthread_create@@GLIBC_2.1 (in /lib/tls/libpthread-0.60.so) > ==2005-01-15 20:56:02.665 29303== by 0x1BD1CC1F: ??? (thread.c:793) > ==2005-01-15 20:56:02.665 29303== by 0x1BD1DD18: skthcr (thread.c:1401) > ==2005-01-15 20:56:02.665 29303== by 0x1BCFA06B: skmain (main.c:1237) > ==2005-01-15 20:56:02.665 29303== by 0x804A4B9: main (ssan3sv.c:319) > ==2005-01-15 20:56:02.665 29303== > ==2005-01-15 20:56:02.665 29303== > ==2005-01-15 20:56:02.665 29303== 268 bytes in 1 blocks are definitely lost in loss record 7 of 16 > ==2005-01-15 20:56:02.665 29303== at 0x1B9065C1: malloc (vg_replace_malloc.c:130) > ==2005-01-15 20:56:02.665 29303== by 0x1BD37D95: suwmal (malloc.c:46) > ==2005-01-15 20:56:02.665 29303== by 0x1BCF7579: ??? (log.c:298) > ==2005-01-15 20:56:02.665 29303== by 0x1BCF78D0: sksetp (log.c:520) > ==2005-01-15 20:56:02.665 29303== by 0x1BD033A2: skredi (redir.c:217) > ==2005-01-15 20:56:02.665 29303== by 0x1BCF9E20: skmain (main.c:1124) > ==2005-01-15 20:56:02.665 29303== by 0x804A4B9: main (ssan3sv.c:319) > ==2005-01-15 20:56:02.665 29303== > ==2005-01-15 20:56:02.665 29303== LEAK SUMMARY: > ==2005-01-15 20:56:02.665 29303== definitely lost: 268 bytes in 1 blocks. > ==2005-01-15 20:56:02.665 29303== possibly lost: 68 bytes in 1 blocks. > ==2005-01-15 20:56:02.665 29303== still reachable: 9672 bytes in 24 blocks. > ==2005-01-15 20:56:02.665 29303== suppressed: 0 bytes in 0 blocks. > ==2005-01-15 20:56:02.665 29303== Reachable blocks (those to which a pointer was found) are not shown. > ==2005-01-15 20:56:02.665 29303== To see them, rerun with: --show-reachable=yes J |