|
From: Jeroen N. W. <jn...@xs...> - 2005-11-04 10:26:00
|
Greetings,
In blanket, when I use tl_assert(false) or VG_(tool_panic)("Text") I get a
stack trace without function name, source file and line number
information:
Blanket: the 'impossible' happened:
Cannot determine order of keys: 4000C22:4000C27 <=> 4000C20:4000C22
==6748== at 0xB00015DA: ???
==6748== by 0xB00015D9: ???
[11 similar lines removed]
Module .in_place/blanket does contain debugging information, so `readelf
-w` tells me. I cannot find an open bug report for this problem. Can
anybody shed some light on this?
Thanks.
Jeroen.
PS: Debian 3,1 (sarge), Linux 2.4.26-1-386, gcc version 3.3.5 (Debian
1:3.3.5-13), Valgrind rev. 4995, VEX rev. 1431.
|
|
From: Tom H. <to...@co...> - 2005-11-04 11:36:51
Attachments:
x
|
In message <17018.194.109.230.85.1131099944.squirrel@194.109.230.85>
Jeroen N. Witmond <jn...@xs...> wrote:
> In blanket, when I use tl_assert(false) or VG_(tool_panic)("Text") I get a
> stack trace without function name, source file and line number
> information:
>
> Blanket: the 'impossible' happened:
> Cannot determine order of keys: 4000C22:4000C27 <=> 4000C20:4000C22
> ==6748== at 0xB00015DA: ???
> ==6748== by 0xB00015D9: ???
> [11 similar lines removed]
>
> Module .in_place/blanket does contain debugging information, so `readelf
> -w` tells me. I cannot find an open bug report for this problem. Can
> anybody shed some light on this?
No idea I'm afraid - it seems to work for me on x86 systems with
a tl_assert call inserted in nl_instrument in the none tool.
There are some issues on amd64 - one I have just committed a fix
for but the other is more complicated.
The second issue is that when report_and_quit gets the initial
context it does this:
ip = (Addr)__builtin_return_address(0);
GET_REAL_SP_AND_FP(sp, fp);
The problem is that by doing that we get the return address (which
is in the calling function) but the current functions stack and frame
pointers which means that a CFI unwind fails because it looks up the
calling function's CFI data and tries to adjust the SP to get the
address of the CFA but as the SP value is bogus it doesn't find it
and the unwind fails.
As there is no reliable/easy way to get the calling functions stack
and frame pointers my best fix at the moment is to use the current
value of the PC instead of the return address. A patch for that is
attached which works on x86/amd64 but needs a small assembly tweak
for ppc32 to get the PC value.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Julian S. <js...@ac...> - 2005-11-05 14:48:49
|
> As there is no reliable/easy way to get the calling functions stack > and frame pointers my best fix at the moment is to use the current > value of the PC instead of the return address. A patch for that is > attached which works on x86/amd64 but needs a small assembly tweak > for ppc32 to get the PC value. Looks good. I committed it. (am not unhappy to wave bye-bye to __builtin_return_address). ppc was a little tricky :-) J |
|
From: Jeroen N. W. <jn...@xs...> - 2005-11-04 15:33:38
Attachments:
empty_stacktrace-symtab.c.patch
|
On Fri, 4 Nov 2005, Tom Hughes wrote: > As there is no reliable/easy way to get the calling functions stack > and frame pointers my best fix at the moment is to use the current > value of the PC instead of the return address. A patch for that is > attached which works on x86/amd64 but needs a small assembly tweak > for ppc32 to get the PC value. Found it. The problem is in 'static Bool is_self ( HChar* filename )' at line 133 in file coregrind/m_debuginfo/symtab.c. It is correct when running an installed Valgrind (as I assume you are), but fails when running Valgrind from the source tree (as I do). When I hack is_self() to always return True (see attached file empty_stacktrace-symtab.c.patch), my stack trace is correct. The presence or absence of your patch does not influence Valgrind's output with either version of is_self(). As the regression test does not produce additional errors when this hack is present, it would seem that is_self() and all references to it can be removed. Would you like a bug report for this problem? On a related issue: I noticed that there are no regression tests for Valgrind's handling of tool errors, such as VG_(tool_panic)(). To regression test the current problem, I have adapted Nulgrind to call VG_(tool_panic)() when option --test-tool-panic=yes is specified. Let me know if you want me to send you the related files. A problem with this approach is that it renders Nulgrind less usable as template for new tools. It may be better to create a new directory in which to store regression tests for Valgrind's handling of tool errors. I'm quite willing to supply the initial content for that directory. What do you think? Jeroen. |
|
From: Nicholas N. <nj...@cs...> - 2005-11-04 15:44:54
|
On Fri, 4 Nov 2005, Jeroen N. Witmond wrote: > Found it. The problem is in 'static Bool is_self ( HChar* filename )' at > line 133 in file coregrind/m_debuginfo/symtab.c. It is correct when > running an installed Valgrind (as I assume you are), but fails when > running Valgrind from the source tree (as I do). When I hack is_self() to > always return True (see attached file empty_stacktrace-symtab.c.patch), my > stack trace is correct. The presence or absence of your patch does not > influence Valgrind's output with either version of is_self(). Good catch. I've sometimes had this problem myself and never known what was going on. > As the regression test does not produce additional errors when this hack > is present, it would seem that is_self() and all references to it can be > removed. Would you like a bug report for this problem? Julian, can you comment on what is_self() is meant to do exactly? Perhaps it would be better if it checked for the .in_place case as well as the /lib/valgrind case. > On a related issue: I noticed that there are no regression tests for > Valgrind's handling of tool errors, such as VG_(tool_panic)(). To > regression test the current problem, I have adapted Nulgrind to call > VG_(tool_panic)() when option --test-tool-panic=yes is specified. Let me > know if you want me to send you the related files. > > A problem with this approach is that it renders Nulgrind less usable as > template for new tools. It may be better to create a new directory in > which to store regression tests for Valgrind's handling of tool errors. > I'm quite willing to supply the initial content for that directory. What > do you think? The coverage of the tests is far from perfect. I don't think changing Nulgrind is a good idea, it should be as minimal as possible. This option could be added to Lackey, but then it might interfere with its goal of being a good teaching tool. Perhaps we should create a new tool that is purely for testing purposes. Having it just to test VG_(tool_panic)() seems like overkill, but if you can find several other important pieces of code that cannot be tested sensibly in any of the existing tools it might be worth creating. Nick |
|
From: Julian S. <js...@ac...> - 2005-11-04 17:53:17
|
> Julian, can you comment on what is_self() is meant to do exactly? Perhaps
> it would be better if it checked for the .in_place case as well as the
> /lib/valgrind case.
It's a kludge. What happens is:
- After the client is loaded and all ready to go, a loop at
m_main.c:2330 hands all segment descriptors off to VG_(di_notify_mmap),
which allows m_debuginfo to read debug info for any mappings it likes.
This is how the initial debug info is read.
- VG_(di_notify_mmap) will read debug info from plausible-looking elf
files that are mapped for the client. But we also want to load the
debug info for V itself, and that is a non-client mapping. Hence:
ok = (seg->kind == SkFileC || (seg->kind == SkFileV && is_self(filename)))
&& various other conditions
is_self is simply a kludge to guess at what is the name of the V executable
itself.
The Right Way (tm) would be to get rid of is_self and instead do:
seg->kind == SkFileC
|| (seg->kind == SkFileV && 0==VG_(strcmp)(filename, argv[0] from main())
Then it does not matter what the V executable is called or where
it is located.
Then the only difficulty is to get main's argv[0] through to this
function. Maybe the least ugly is to pass it as a second arg to
VG_(di_notify_mmap) (or NULL from all other call sites), since
there's no obviously-correct place to put it as a global variable.
J
|
|
From: Jeroen N. W. <jn...@xs...> - 2005-11-05 09:13:14
|
> >> Julian, can you comment on what is_self() is meant to do exactly? >> Perhaps >> it would be better if it checked for the .in_place case as well as the >> /lib/valgrind case. > > It's a kludge. What happens is: > > - After the client is loaded and all ready to go, a loop at > m_main.c:2330 hands all segment descriptors off to VG_(di_notify_mmap), > which allows m_debuginfo to read debug info for any mappings it likes. > This is how the initial debug info is read. > > - VG_(di_notify_mmap) will read debug info from plausible-looking elf > files that are mapped for the client. But we also want to load the > debug info for V itself, and that is a non-client mapping. Hence: > > ok = (seg->kind == SkFileC || (seg->kind == SkFileV && > is_self(filename))) > && various other conditions > > is_self is simply a kludge to guess at what is the name of the V > executable > itself. Forgive my ignorance, but why do you need to check the segment's file name at all? Now that V is one static executable, the only file to have SkFileV should be the V executable anyway. Enabling 'show_nsegment()' in 'read_maps_callback()' in file coregrind/m_aspacemgr/aspacemgr.c seems be confirm this. Jeroen. > > The Right Way (tm) would be to get rid of is_self and instead do: > > seg->kind == SkFileC > || (seg->kind == SkFileV && 0==VG_(strcmp)(filename, argv[0] from > main()) > > Then it does not matter what the V executable is called or where > it is located. > > Then the only difficulty is to get main's argv[0] through to this > function. Maybe the least ugly is to pass it as a second arg to > VG_(di_notify_mmap) (or NULL from all other call sites), since > there's no obviously-correct place to put it as a global variable. > > J > > > ------------------------------------------------------- > SF.Net email is sponsored by: > Tame your development challenges with Apache's Geronimo App Server. > Download > it for free - -and be entered to win a 42" plasma tv or your very own > Sony(tm)PSP. Click here to play: http://sourceforge.net/geronimo.php > _______________________________________________ > Valgrind-developers mailing list > Val...@li... > https://lists.sourceforge.net/lists/listinfo/valgrind-developers > |
|
From: Julian S. <js...@ac...> - 2005-11-05 14:11:45
|
> Forgive my ignorance, but why do you need to check the segment's file name > at all? Now that V is one static executable, the only file to have SkFileV > should be the V executable anyway. That's true, but I don't want to hardwire that assumption in. Anyway, what I suggested doesn't work, because argv[0] can be a relative pathname, whereas the names in the NSegments are always absolute paths. I just committed the change Nick suggested (r5020) which should allow reading from in-place builds too. However, since I don't know how to run the system in-place (I never do so myself) I haven't test it - can you try? J |
|
From: Jeroen N. W. <jn...@xs...> - 2005-11-06 14:24:04
Attachments:
empty_stacktrace-procselfexe.patch
|
> >> Forgive my ignorance, but why do you need to check the segment's file >> name >> at all? Now that V is one static executable, the only file to have >> SkFileV >> should be the V executable anyway. > > That's true, but I don't want to hardwire that assumption in. > > Anyway, what I suggested doesn't work, because argv[0] can be a > relative pathname, whereas the names in the NSegments are always > absolute paths. > > I just committed the change Nick suggested (r5020) which should allow > reading from in-place builds too. However, since I don't know how to > run the system in-place (I never do so myself) I haven't test it - can > you try? This change does not work, because the name constructed by coregrind/launcher.c to execve the tool is a symbolic link, which is resolved to the real name before it appears in /proc/self/maps. This real name does not contain the string '.in_place/'. The attached patch is a hack that solves both the relative file name problem and the symbolic link problem, without assuming that Valgrind is a single static module. It works for me, it does not break the regression test; but it probably is not a real fix, because of violations of the modularization of Valgrind. It's beyond me to fix that. :-) Jeroen. |
|
From: Julian S. <js...@ac...> - 2005-11-08 00:50:30
|
Euh, ok. Try r5033. The idea of readlink("/proc/self/exe") is pretty
good, but there are systems on which /proc/self/exe is borked. So I
fell back to what is essentially your original plan.
J
On Sunday 06 November 2005 14:23, Jeroen N. Witmond wrote:
> >> Forgive my ignorance, but why do you need to check the segment's file
> >> name
> >> at all? Now that V is one static executable, the only file to have
> >> SkFileV
> >> should be the V executable anyway.
> >
> > That's true, but I don't want to hardwire that assumption in.
> >
> > Anyway, what I suggested doesn't work, because argv[0] can be a
> > relative pathname, whereas the names in the NSegments are always
> > absolute paths.
> >
> > I just committed the change Nick suggested (r5020) which should allow
> > reading from in-place builds too. However, since I don't know how to
> > run the system in-place (I never do so myself) I haven't test it - can
> > you try?
>
> This change does not work, because the name constructed by
> coregrind/launcher.c to execve the tool is a symbolic link, which is
> resolved to the real name before it appears in /proc/self/maps. This real
> name does not contain the string '.in_place/'.
>
> The attached patch is a hack that solves both the relative file name
> problem and the symbolic link problem, without assuming that Valgrind is a
> single static module. It works for me, it does not break the regression
> test; but it probably is not a real fix, because of violations of the
> modularization of Valgrind. It's beyond me to fix that. :-)
>
> Jeroen.
|
|
From: Jeroen N. W. <jn...@xs...> - 2005-11-08 19:43:20
|
On Tue, 8 Nov 2005 , Julian Seward wrote: > Euh, ok. Try r5033. This fix works both for running V from the source tree and for running V as installed. Thanks! Jeroen. |
|
From: Julian S. <js...@ac...> - 2005-11-08 20:09:07
|
On Tuesday 08 November 2005 19:43, Jeroen N. Witmond wrote: > On Tue, 8 Nov 2005 , Julian Seward wrote: > > Euh, ok. Try r5033. > > This fix works both for running V from the source tree and for running V > as installed. Thanks! Cool. How's the coverage tool coming along? J |