|
From: <ar...@de...> - 2004-02-02 15:41:18
|
Tom Hughes <th...@cy...> writes:

> ------- You are receiving this mail because: -------
> You are the assignee for the bug, or are watching the assignee.
> You reported the bug, or are watching the reporter.
> You are on the CC list for the bug, or are watching someone who is.
>
> http://bugs.kde.org/show_bug.cgi?id=74016
>
> ------- Additional Comments From th...@cy... 2004-02-02 16:05 -------
> Subject: Re: New: corrupt stack on tkill syscall wrapper
>
> In message <200...@kt...>
> ar...@de... wrote:
>
>> Running gdb under valgrind I get this core dump:
>>
>> (gdb) bt
>> #0 0x40194c3a in vgPlain_do_syscall () from /usr/lib/valgrind/valgrind.so
>> #1 0x4017691f in vgPlain_ktkill (tid=1075674952, signo=22208) at vg_mylibc.c:207
>> Previous frame inner to this frame (corrupt stack?)
>>
>> It looks like some bit of the tkill syscall wrapper is broken.
>
> We're going to need a bit more information than that - can you compile
> valgrind with debugging so you can actually see where in do_syscall it
> is failing?
>
> You've shown us the backtrace, but what actually happened to valgrind
> at that point? Presumably it died with a signal, but which one?

Segfault.

> Those tid and signo values in the arguments to ktkill look pretty
> implausible as well, or was the stack corrupted or something?

The stack wasn't corrupted, but GDB's backtraces have never worked well
for valgrind's syscall wrapper.

> What was the code trying to do anyway? What signal was it trying to
> send and where had it got the TID to send it to from?

This is GDB. It was probably trying to send a SIGSTOP to a thread...

Wait, I apologize, it is not actually related to tkill. It looks like
valgrind called its own tkill wrapper to cause a SIGSEGV. Here's the
output from just before:

==9356==
==9356== Invalid read of size 4
==9356==    at 0x403BD643: __GI_tcsetattr (tcsetattr.c:74)
==9356==    by 0x812B05F: (within /usr/bin/gdb)
==9356==    by 0x81149A8: serial_set_tty_state (in /usr/bin/gdb)
==9356==    by 0x8122423: terminal_inferior (in /usr/bin/gdb)
==9356==    Address 0x0 is not stack'd, malloc'd or free'd
==9356==
==9356== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==9356==  Address not mapped to object at address 0x0
==9356==    at 0x403BD643: __GI_tcsetattr (tcsetattr.c:74)
==9356==    by 0x812B05F: (within /usr/bin/gdb)
==9356==    by 0x81149A8: serial_set_tty_state (in /usr/bin/gdb)
==9356==    by 0x8122423: terminal_inferior (in /usr/bin/gdb)
==9356==
==9356== ERROR SUMMARY: 20 errors from 6 contexts (suppressed: 7 from 2)
==9356== malloc/free: in use at exit: 375692 bytes in 2638 blocks.
==9356== malloc/free: 3317 allocs, 679 frees, 447506 bytes allocated.
==9356== For a detailed leak analysis, rerun with: --leak-check=yes
==9356== For counts of detected errors, rerun with: -v
zsh: segmentation fault (core dumped) valgrind gdb ./a.out

Just run a trivial binary inside the GDB in unstable to reproduce, it
segfaults immediately after "run". I don't believe there is anything
wrong with the tcsetattr call. Because GDB can't backtrace through
valgrind nowadays I can't figure out much more.

>
> Tom

-- 
Andrés Roldán <ar...@de...>
GPG Key-ID: 0xB29396EB
http://people.fluidsignal.com/~aroldan |
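The observation in the message above, that the supervising process appears to deliver SIGSEGV to itself after reporting the client's fault, can be illustrated with a small sketch. This is not Valgrind's code: it is a hypothetical Python example (the names `CHILD_SOURCE` and `run_self_signalling_child` are invented for illustration) showing that a process which resets a signal to its default action and then sends that signal to itself dies "by" that signal, so a core dump taken at that moment points at the self-kill call rather than at the original fault.

```python
import os
import signal
import subprocess
import sys

# Hypothetical child program: restore the default action for SIGSEGV,
# then send that signal to itself. The final print is never reached.
CHILD_SOURCE = """
import os, signal
signal.signal(signal.SIGSEGV, signal.SIG_DFL)
os.kill(os.getpid(), signal.SIGSEGV)
print("not reached")
"""


def run_self_signalling_child():
    """Run the child and return (return code, captured stdout)."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout


if __name__ == "__main__":
    code, out = run_self_signalling_child()
    # subprocess reports death by signal N as return code -N.
    print("child return code:", code)
```

A shell would report such a child as a segmentation fault even though no invalid memory access happened at the point of death, which is consistent with the backtrace in the original report ending inside the tkill wrapper.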
|
From: <ar...@de...> - 2004-02-04 04:05:48
|
aroldan@volatile:~$ valgrind --num-callers=10 --tool=none gdb /bin/cat
==13497== Nulgrind, a binary JIT-compiler for x86-linux.
==13497== Copyright (C) 2002-2003, and GNU GPL'd, by Nicholas Nethercote.
==13497== Using valgrind-2.1.0, a program supervision framework for x86-linux.
==13497== Copyright (C) 2000-2003, and GNU GPL'd, by Julian Seward.
==13497== Estimated CPU clock rate is 897 MHz
==13497== For more details, rerun with: -v
==13497==
GNU gdb 6.0-debian
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-linux"...(no debugging symbols found)...
(gdb) run
Starting program: /bin/cat
==13497==
==13497== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==13497==  Address not mapped to object at address 0x0
==13497==    at 0x40292643: __GI_tcsetattr (tcsetattr.c:74)
==13497==    by 0x812B15F: (within /usr/bin/gdb)
==13497==    by 0x8114AA8: serial_set_tty_state (in /usr/bin/gdb)
==13497==    by 0x8122523: terminal_inferior (in /usr/bin/gdb)
==13497==    by 0x80C3665: resume (in /usr/bin/gdb)
==13497==    by 0x812DA08: startup_inferior (in /usr/bin/gdb)
==13497==    by 0x812CF9C: (within /usr/bin/gdb)
==13497==    by 0x812D8F4: fork_inferior (in /usr/bin/gdb)
==13497==    by 0x812D00C: (within /usr/bin/gdb)
==13497==    by 0x80F5D5B: find_default_create_inferior (in /usr/bin/gdb)
==13497==
zsh: segmentation fault (core dumped) valgrind --num-callers=10 --tool=none gdb /bin/cat

So it's not related to memcheck. The pointer that gets passed at this
point is a static variable, btw - I think.

Jeremy Fitzhardinge <je...@go...> writes:

> ------- You are receiving this mail because: -------
> You are the assignee for the bug, or are watching the assignee.
> You reported the bug, or are watching the reporter.
> You are on the CC list for the bug, or are watching someone who is.
>
> http://bugs.kde.org/show_bug.cgi?id=74016
>
> ------- Additional Comments From je...@go... 2004-02-04 00:29 -------
> Subject: Re: corrupt stack on tkill syscall wrapper
>
> On Tue, 2004-02-03 at 11:38, Andrés Roldán wrote:
>> I wouldn't have filed it as a valgrind bug if I hadn't checked out the
>> possibility of a GDB bug first... when not run under valgrind,
>> termios_p is not NULL. The instruction which generated SIGSEGV
>> under valgrind completes successfully. It's the first call from
>> set_tty_state in GDB to tcsetattr, and the third call to tcsetattr in
>> the program.
>
> Well, that doesn't necessarily mean that the bug is in Valgrind, since
> the environmental changes Valgrind imposes on the client could trigger a
> SEGV in buggy code which might not have otherwise happened.
>
> Could you try this: valgrind --num-callers=10 --tool=none gdb
>
> This will see whether it's related to the things which memcheck does, or
> it's just Valgrind in general which causes the problem. The
> --num-callers will let us see where exactly it is crashing in gdb.
>
> J

-- 
Andrés Roldán <ar...@de...>
GPG Key-ID: 0xB29396EB
http://people.fluidsignal.com/~aroldan |
|
From: Jeremy F. <je...@go...> - 2004-02-04 17:57:59
|
On Tue, 2004-02-03 at 20:23, Andrés Roldán wrote:
> aroldan@volatile:~$ valgrind --num-callers=10 --tool=none gdb /bin/cat
> ==13497== Nulgrind, a binary JIT-compiler for x86-linux.
> ==13497== Copyright (C) 2002-2003, and GNU GPL'd, by Nicholas Nethercote.
> ==13497== Using valgrind-2.1.0, a program supervision framework for x86-linux.
> ==13497== Copyright (C) 2000-2003, and GNU GPL'd, by Julian Seward.
> ==13497== Estimated CPU clock rate is 897 MHz
> ==13497== For more details, rerun with: -v
> ==13497==
> GNU gdb 6.0-debian
> Copyright 2003 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "i386-linux"...(no debugging symbols found)...
> (gdb) run
> Starting program: /bin/cat
> ==13497==
> ==13497== Process terminating with default action of signal 11 (SIGSEGV): dumping core
> ==13497==  Address not mapped to object at address 0x0
> ==13497==    at 0x40292643: __GI_tcsetattr (tcsetattr.c:74)
> ==13497==    by 0x812B15F: (within /usr/bin/gdb)
> ==13497==    by 0x8114AA8: serial_set_tty_state (in /usr/bin/gdb)
> ==13497==    by 0x8122523: terminal_inferior (in /usr/bin/gdb)
> ==13497==    by 0x80C3665: resume (in /usr/bin/gdb)
> ==13497==    by 0x812DA08: startup_inferior (in /usr/bin/gdb)
> ==13497==    by 0x812CF9C: (within /usr/bin/gdb)
> ==13497==    by 0x812D8F4: fork_inferior (in /usr/bin/gdb)
> ==13497==    by 0x812D00C: (within /usr/bin/gdb)
> ==13497==    by 0x80F5D5B: find_default_create_inferior (in /usr/bin/gdb)
> ==13497==
> zsh: segmentation fault (core dumped) valgrind --num-callers=10 --tool=none gdb /bin/cat
>
> So it's not related to memcheck. The pointer that gets passed at this
> point is a static variable, btw - I think.

Hm. I just built gdb-6 from source and tried this, without seeing a
problem. Do you know what the differences are between the version you're
using and the standard version? Maybe you could attach your gdb binary
to the bug.

J |
|
From: <ar...@de...> - 2004-02-05 07:28:53
|
I wouldn't have filed it as a valgrind bug if I hadn't checked out the
possibility of a GDB bug first... when not run under valgrind,
termios_p is not NULL. The instruction which generated SIGSEGV
under valgrind completes successfully. It's the first call from
set_tty_state in GDB to tcsetattr, and the third call to tcsetattr in
the program.

The version of GDB (6.0-4) in Debian unstable should reproduce this.

Jeremy Fitzhardinge <je...@go...> writes:

> ------- You are receiving this mail because: -------
> You are the assignee for the bug, or are watching the assignee.
> You reported the bug, or are watching the reporter.
> You are on the CC list for the bug, or are watching someone who is.
>
> http://bugs.kde.org/show_bug.cgi?id=74016
> je...@go... changed:
>
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>              Status|UNCONFIRMED                 |RESOLVED
>          Resolution|                            |INVALID
>
> ------- Additional Comments From je...@go... 2004-02-03 03:36 -------
> Yep. Your target program (gdb) touched a NULL pointer in __GI_tcsetattr, and suffered a SIGSEGV as a consequence - and Valgrind told you all about it.
>
> Now, why __GI_tcsetattr got a NULL pointer is another question. It might be a problem in Valgrind's emulation of something to do with terminals/job control. Or, more likely, there's a bug in your program.
>
> If it looks like a job control/signal bug in Valgrind, please file another bug.

-- 
Andrés Roldán <ar...@de...>
GPG Key-ID: 0xB29396EB
http://people.fluidsignal.com/~aroldan |
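For readers unfamiliar with the calls named in the message above, here is a minimal sketch of the tcgetattr/tcsetattr pair that code like set_tty_state typically relies on: read the terminal attributes of a descriptor, then write them back. It does not reproduce the gdb crash and says nothing about how gdb itself stores its terminal state; the function name `reapply_tty_state` is hypothetical, and Python's `termios` module is used only because it wraps the same POSIX calls.

```python
import os
import termios


def reapply_tty_state(fd):
    """Read the terminal attributes of fd and write them straight back.

    Returns True on success, or False if fd is not a terminal or either
    terminal call fails.
    """
    try:
        attrs = termios.tcgetattr(fd)
    except termios.error:
        # Typical for pipes and regular files: no terminal state exists.
        return False
    try:
        termios.tcsetattr(fd, termios.TCSANOW, attrs)
    except termios.error:
        return False
    return True


if __name__ == "__main__":
    # Probe the three standard descriptors of the current process.
    for fd in (0, 1, 2):
        print(fd, os.isatty(fd), reapply_tty_state(fd))
```

Running such a probe both natively and under a supervising tool is one way to check whether the terminal calls themselves behave differently in the two environments, independent of the program being debugged.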
|
From: Jeremy F. <je...@go...> - 2004-02-03 23:29:32
|
On Tue, 2004-02-03 at 11:38, Andrés Roldán wrote:
> I wouldn't have filed it as a valgrind bug if I hadn't checked out the
> possibility of a GDB bug first... when not run under valgrind,
> termios_p is not NULL. The instruction which generated SIGSEGV
> under valgrind completes successfully. It's the first call from
> set_tty_state in GDB to tcsetattr, and the third call to tcsetattr in
> the program.

Well, that doesn't necessarily mean that the bug is in Valgrind, since
the environmental changes Valgrind imposes on the client could trigger a
SEGV in buggy code which might not have otherwise happened.

Could you try this: valgrind --num-callers=10 --tool=none gdb

This will see whether it's related to the things which memcheck does, or
it's just Valgrind in general which causes the problem. The
--num-callers will let us see where exactly it is crashing in gdb.

J |