|
From: Guilhem B. <gu...@my...> - 2003-09-28 16:38:02
|
Hi!
I thought this was of interest, if you sometimes can't get Valgrind to
print the complete stacktrace. Maybe it can be classified as a feature request.
Running some SQL commands in MySQL I saw (this bug is now fixed, but this
is an example) valgrind-20030725 print only:
==4259== Thread 3:
==4259== Conditional jump or move depends on uninitialised value(s)
==4259==    at 0x839563C: longlong2str (in /home/mysql_src/mysql-4.0/sql/mysqld)
==4259==
==4259== Thread 3:
==4259== Use of uninitialised value of size 4
==4259==    at 0x839564C: longlong2str (in /home/mysql_src/mysql-4.0/sql/mysqld)
<that's all>
This is an unusually short stacktrace.
Using valgrind --gdb-attach=yes does not help, as the frames you can
access in gdb are not "interesting" (they're only some obscure Valgrind
code vg_*.c and you can't go up in the stack).
The offender (not shown by the above stacktrace) is this C code
str->set(*(longlong*) entry->value);
(entry->value is a char* pointing to a string of 2 chars, so it's illegal
to read it as a longlong (which is 8 bytes); the read picks up uninitialized
memory).
So how come Valgrind does not mention this function (String::set(); String is
a MySQL-specific C++ class) in the stacktrace?
If after str->set() you add:
printf("Let's see: %lld\n", *(longlong*) entry->value);
then you get a normally long stacktrace:
==4259== Thread 3:
==4259== Conditional jump or move depends on uninitialised value(s)
==4259==    at 0x4044422E: (within /lib/i686/libc-2.3.1.so)
==4259==    by 0x404460AB: _IO_vfprintf (in /lib/i686/libc-2.3.1.so)
==4259==    by 0x4044EEB1: _IO_printf (in /lib/i686/libc-2.3.1.so)
==4259==    by 0x80DD18E: Item_func_get_user_var::val_str(String*) (item_func.cc:1985)
==4259==
==4259== Thread 3:
==4259== Use of uninitialised value of size 4
==4259==    at 0x4044423E: (within /lib/i686/libc-2.3.1.so)
==4259==    by 0x404460AB: _IO_vfprintf (in /lib/i686/libc-2.3.1.so)
==4259==    by 0x4044EEB1: _IO_printf (in /lib/i686/libc-2.3.1.so)
==4259==    by 0x80DD18E: Item_func_get_user_var::val_str(String*) (item_func.cc:1985)
==4259==
And with --gdb-attach=yes you can reach the guilty code (you can 'up').
The reason why Valgrind does not give the stacktrace for str->set() is
that the String::set(longlong) function is:
str_length=(uint32) (longlong10_to_str(num,Ptr,-10)-Ptr);
and in MySQL longlong10_to_str is not C code but assembly code
(i.e. we call a function written in assembly from the C code).
Valgrind only reports that a problem was found in the longlong2str
module, which is true, but I would think Valgrind would also know
the callers (the C function) and print the stacktrace, which would
be super-helpful (help spot the bug in 5 seconds :)
If you know a Valgrind option which would help, or have any other
comments, they are warmly welcome :)
Guilhem
--
Guilhem Bichot <gu...@my...>
MySQL AB, Full-Time Software Developer
Bordeaux, France
www.mysql.com
|
|
From: Nicholas N. <nj...@ca...> - 2003-09-29 17:11:49
|
On Sun, 28 Sep 2003, Guilhem Bichot wrote:
> I thought this was of interest, if you sometimes can't get Valgrind to
> print the complete stacktrace.
[snip]
> Using valgrind --gdb-attach=yes does not help, as the frames you can
> access in gdb are not "interesting" (they're only some obscure Valgrind
> code vg_*.c and you can't go up in the stack).
[snip]
> The reason why Valgrind does not give the stacktrace for str->set() is
> that the String::set(longlong) function is:
> str_length=(uint32) (longlong10_to_str(num,Ptr,-10)-Ptr);
> and in MySQL longlong10_to_str is not C code but assembly code
> (i.e we call a function written in assembly, from the C code).
Valgrind finds its traces by walking the stack. AIUI, this relies on your
program using %ebp as a frame pointer. I think this is normal for
compiled C code (unless you use -fomit-frame-pointer), but it probably
isn't the case in your asm code. So Valgrind can't work out where the
stack frame boundaries are. I would guess that GDB doesn't find anything
"interesting" for the same reason.
So I don't think there's an easy solution, sorry.
N
|
From: Guilhem B. <gu...@my...> - 2003-09-29 19:31:12
|
On Mon, 2003-09-29 at 19:11, Nicholas Nethercote wrote:
> On Sun, 28 Sep 2003, Guilhem Bichot wrote:
>
> > I thought this was of interest, if you sometimes can't get Valgrind to
> > print the complete stacktrace.
>
> [snip]
>
> > Using valgrind --gdb-attach=yes does not help, as the frames you can
> > access in gdb are not "interesting" (they're only some obscure Valgrind
> > code vg_*.c and you can't go up in the stack).
>
> [snip]
>
> > The reason why Valgrind does not give the stacktrace for str->set() is
> > that the String::set(longlong) function is:
> > str_length=(uint32) (longlong10_to_str(num,Ptr,-10)-Ptr);
> > and in MySQL longlong10_to_str is not C code but assembly code
> > (i.e we call a function written in assembly, from the C code).
>
> Valgrind finds its traces by walking the stack. AIUI, this relies on your
> program using %ebp as a frame pointer. I think this is normal for
> compiled C code (unless you use -fomit-frame-pointer), but it probably
> isn't the case in your asm code. So Valgrind can't work out where the
> stack frame boundaries are. I would guess that GDB doesn't find anything
> "interesting" for the same reason.
>
> So I don't think there's an easy solution, sorry.
>
> N
True, in this asm code we use %ebp for other things than as a frame pointer.
And (I just tested) when I disable the optimised asm functions and use
equivalent C functions instead, I get the beautiful stacktrace.
Thank you very much for this explanation!
Guilhem
|
From: Sebastian <sc...@nb...> - 2003-09-30 02:08:34
|
Hi,
On Mon, Sep 29, 2003 at 06:11:47PM +0100, Nicholas Nethercote wrote:
> Valgrind finds its traces by walking the stack. AIUI, this relies on your
> program using %ebp as a frame pointer. I think this is normal for
> compiled C code (unless you use -fomit-frame-pointer), but it probably
> isn't the case in your asm code. So Valgrind can't work out where the
> stack frame boundaries are. I would guess that GDB doesn't find anything
> "interesting" for the same reason.
> So I don't think there's an easy solution, sorry.
Maybe, if %ebp is not saved within a function (between the function entry
through call and the point where the error occurs), one could use the call
target address to figure out the function address and stack:
It would be possible to keep a small stack within valgrind, and save the
%esp whenever a call instruction occurs. Then one does not have to rely on
%ebp being used as a frame pointer, and optimized leaf functions
(-momit-leaf-frame-pointer) and optimized -fomit-frame-pointer code could be
backtraced.
Is that possible, or am I missing something?
> N
ciao,
Sebastian
--
sc...@nb... + http://segfault.net/~scut/
|
From: Nicholas N. <nj...@ca...> - 2003-09-30 07:50:23
|
On Tue, 30 Sep 2003, Sebastian wrote:
> Maybe, if %ebp is not saved within a function (between the occurring error
> and the function entry through call), one could use the call target address
> to figure the function address and stack:
>
> It would be possible to keep a small stack within valgrind, and save the
> %esp whenever a call instruction occurs. Then, one does not have to rely on
> %ebp being used as frame pointer, and optimized leaf functions
> (-momit-leaf-frame-pointer) and optimized -fomit-frame-pointer code could be
> backtraced.
>
> Is that possible, or am I missing something?
I imagine it would be possible, but very fiddly. Doing stack-walking is
already surprisingly tricky; Valgrind's function for doing it has six dated
comments from developers explaining certain changes they made for certain
cases. And real code never pairs up 'call' and 'ret' like you'd
hope/expect... there are always weird things like tail calls,
jmps-as-calls, pushing a target address and then calling 'ret' to jump to
it, and strange stuff like that, making this kind of thing trickier than
you'd expect.
And I don't think the problem occurs often enough for it to be worth the
extra effort and complexity... but others may disagree.
N
|
From: Dirk M. <dm...@gm...> - 2003-09-29 17:41:07
|
On Sunday 28 September 2003 15:42, Guilhem Bichot wrote:
> I thought this was of interest, if you sometimes can't get Valgrind to
> print the complete stacktrace. Maybe it can be classified as a feature
> request.
gdb does fairly complicated stuff to find the backtrace, and if even gdb
fails to print a stack trace then I believe that valgrind does not have
much of a chance either.
I believe the problem is that you didn't compile the assembler file with
debug info? Is this inline assembler or a separate file? Is it a .c file
or a .S? Do you compile with -fomit-frame-pointer, or without? If it is not
inline assembly, do you write the function prolog in assembler too? Is
objdump -d -S on the object file able to intermix source/code correctly?
I seem to remember that gdb has problems if it cannot find the function
prolog. Any small testcase to look at?
> If you know a Valgrind option which would help, or any comment you
> have, they are warmly welcome :)
valgrind provides those superuseful CHECK_DEFINED macros :)