|
From: Julian S. <js...@ac...> - 2007-11-12 01:35:06
|
I've been testing Valgrind on a recent ppc distro, openSUSE 10.3 running on a 32-bit ppc box. Memcheck generates huge numbers of undefined value errors even for the simplest program (e.g. /bin/date) and I'm trying to figure out what's going on.

Many of the errors seem to relate to load instructions like this

   387d8: 80 02 8f f4 lwz r0,-28684(r2)

complaining that r2 contains an undefined value. And it's true; it is not written at all in the procedure in which this is reported. Which is odd. AFAIK r2 is not an argument register and it didn't used to have any particular meaning in the ppc32 ELF ABI.

Now I'm wondering if the 32-bit ABI has morphed into something more similar to the 64-bit ppc ELF ABI. That uses r2 to point to tables of constants, which kinda looks like what I'm seeing. Or perhaps it is being used as a pointer to some thread-local data area?

Anybody know anything about this? This is with gcc-4.2.1, glibc-2.6.1, kernel 2.6.22.9-0.4-default.

J |
|
From: Paul M. <pa...@sa...> - 2007-11-12 02:29:57
|
Julian Seward writes:

> complaining that r2 contains an undefined value. And it's true;
> it is not written at all in the procedure in which this is
> reported. Which is odd. AFAIK r2 is not an argument register
> and it didn't used to have any particular meaning in the ppc32 ELF
> ABI.

It's now used as the TLS pointer. It should get initialized for the main thread in glibc, and for other threads by the clone() system call. Maybe the CLONE_SETTLS flag isn't handled properly by the syscall wrapper for clone? It should cause r2 in the child to be initialized to the value of the 4th argument (for 32-bit processes; 64-bit processes use r13 instead).

Paul. |
|
From: John R.
|
Paul Mackerras wrote:

> Julian Seward writes:
>
>> complaining that r2 contains an undefined value. ...
>
> It's now used as the TLS pointer. It should get initialized for the
> main thread in glibc, and for other threads by the clone() system
> call. ...

Please give a citation to a reference for this change.

-- |
|
From: Paul M. <pa...@sa...> - 2007-11-12 03:31:19
|
John Reiser writes:

> > It's now used as the TLS pointer. It should get initialized for the
> > main thread in glibc, and for other threads by the clone() system
> > call. ...
>
> Please give a citation to a reference for this change.

I think it's described in the LSB psABI for 32-bit powerpc.

Paul. |
|
From: John R.
|
Paul Mackerras wrote [regarding r2 as TLS pointer for 32-bit PowerPC]:

> I think it's described in the LSB psABI for 32-bit powerpc.

Do you have a more specific citation? I can't find any designation of r2 as TLS pointer; only as "System reserved."

The page
http://refspecs.linux-foundation.org/LSB_3.1.0/LSB-Core-PPC32/LSB-Core-PPC32/normativerefs.html#STD.PPC32.ABI
points to
http://refspecs.freestandards.org/elf/elfspec_ppc.pdf
(dated Sept. 1995, twelve years ago) which says that r2 is "System reserved" but does not say anything about a TLS (thread-local storage) pointer.

The page
http://refspecs.linux-foundation.org/LSB_3.1.0/LSB-Core-PPC32/LSB-Core-PPC32/processinitialization.html
also does not mention r2.

-- |
|
From: Julian S. <js...@ac...> - 2007-11-13 03:10:24
|
On Monday 12 November 2007 03:28, Paul Mackerras wrote:

> > It's now used as the TLS pointer. [...]

Thanks for the info.

After hours of frustratingly chasing undefined values around vast nameless blocks of machine code, I discovered the large numbers of uninitialised values are a result of having /lib/ld-2.6.1.so being almost completely devoid of symbol table info.

On ppc32, ld.so has its own strlen/strcmp/strchr functions, which do strange things with carry bits that fool Memcheck. (or something that it can't handle - I can't remember).

For glibc <= 2.5 V simply supplied its own non-optimised replacements for these functions in ld.so, and used them. That worked fine, but now fails because the strlen/strcmp/strchr symbols on ld.so are no longer present. Installing the debuginfo package provides that info, so V's replacements kick in, and the problem goes away.

Result is that on ppc32-linux and ppc64-linux, for glibc 2.6 and later, Valgrind will be unusable unless glibc-2.6-debuginfo.rpm is installed.

J |
|
From: Dave N. <dc...@us...> - 2007-12-03 19:31:15
|
So you tracked these uninitialized values down to the strxxx functions defined in ld.so, and Valgrind normally intercepts these calls because Memcheck can't handle the sorts of code that is generated for these routines?

Is it possible to teach Memcheck to deal with these optimizations?

Steve Munroe, the author of those optimized strxxx functions, tells me that the kinds of optimizations done for these routines are going to start appearing in other library routines, and possibly in generated object code, so the problem is going to become more pervasive.

--
Dave Nomura
LTC Linux Power Toolchain |
|
From: Julian S. <js...@ac...> - 2007-12-04 02:36:28
|
On Monday 03 December 2007 20:29, Dave Nomura wrote:
> So you tracked these uninitialized values down to the strxxx
> functions defined in ld.so and Valgrind normally intercepts these calls
> because Memcheck can't handle the sorts of code that is generated for
> these routines?
Correct.
> Is it possible to teach Memcheck to deal with these optimizations?
>
> Steve Munroe, the author of those optimized strxxx functions, tells me
> that the kinds of optimizations done for these routines are going to
> start appearing in other library routines, and possibly in generated
> object code so the problem is going to become more pervasive.
You're in the land of difficult tradeoffs. A lot of effort has
already been applied here.
All these optimised, vectorised (effectively) string ops rely on two
techniques:
(1) using properties of carry-chain propagation in addition/subtraction
so as to find out whether any byte in a word is zero, and if so
which one
(2) reading (traditional C-style zero-terminated) strings using
aligned word reads, rather than byte reads
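
For readers unfamiliar with technique (1), here is a minimal sketch in C of the well-known carry-chain zero-byte test on a 32-bit word. It is an illustration only, not the code in glibc's ld.so; the function names has_zero_byte and first_zero_byte_lsb are made up for this example.

```c
#include <stdint.h>

/* Nonzero iff at least one of the four bytes of w is 0x00.
   Subtracting 0x01 from every byte borrows out of the lowest zero
   byte, which sets that byte's top bit; "& ~w" rejects bytes whose
   top bit was already set in w, and the 0x80808080 mask keeps one
   flag bit per byte. */
uint32_t has_zero_byte(uint32_t w)
{
    return (w - 0x01010101u) & ~w & 0x80808080u;
}

/* Index (0 = least significant byte) of the lowest zero byte in w,
   or -1 if there is none. Only the lowest flag is guaranteed exact,
   because the borrow from a zero byte can disturb the bytes above it. */
int first_zero_byte_lsb(uint32_t w)
{
    uint32_t flags = has_zero_byte(w);
    for (int i = 0; i < 4; i++) {
        if (flags & (0x80u << (8 * i)))
            return i;
    }
    return -1;
}
```

Whether byte index 0 corresponds to the first or the last character of the string depends on endianness, so a real implementation has to pick the scan direction per target.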
(1) fools Memcheck's normal handling of definedness tracking for
adds/subtracts, causing it to believe the result of the add/subtract
is completely undefined, when it isn't really. In fact Memcheck
can and sometimes does generate a more exact interpretation, which
does handle this case correctly.
The problem is deciding when to apply it. The standard analysis
costs about 3 insns in the generated code, and the exact analysis
more than 10 insns (+ more registers). Applying the expensive case
throughout would cause significant slowdowns to the 99.99% of code
fragments for which the standard handling is perfectly adequate.
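
To make the cheap-versus-exact tradeoff concrete, here is a simplified C model of the two ways of computing definedness bits for a 32-bit add. It is a sketch of the idea, not Memcheck's actual instrumentation code; the function names are hypothetical, and it assumes the convention that a 1 bit in the shadow word means "undefined".

```c
#include <stdint.h>

/* Shadow ("V") bit convention assumed here: 1 = undefined, 0 = defined. */

/* Cheap rule: merge the undefined bits of both operands, then mark
   every bit at or above the lowest undefined bit as undefined, since
   a carry out of that position could in principle reach any of them. */
uint32_t vbits_add_cheap(uint32_t va, uint32_t vb)
{
    uint32_t u = va | vb;
    return u | (0u - u);
}

/* More exact rule: compute the smallest and largest values each
   operand could take, add both pairs, and treat a result bit as
   undefined only if it is undefined in an input or differs between
   the two extreme sums. Needs the operand values a and b as well. */
uint32_t vbits_add_exact(uint32_t a, uint32_t va, uint32_t b, uint32_t vb)
{
    uint32_t a_min = a & ~va, a_max = a | va;
    uint32_t b_min = b & ~vb, b_max = b | vb;
    return (va | vb) | ((a_min + b_min) ^ (a_max + b_max));
}
```

For example, with only the low byte of one operand undefined and no possible carry out of it, the cheap rule marks all 32 result bits undefined while the exact rule marks only the low 8. The exact rule plainly needs more operations and more live values, which is the cost described above.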
(2) causes Memcheck to report invalid address errors for the partial
word loads covering the zero terminating bytes at the end of
strings. You can stop it complaining about this by giving
--partial-loads-ok=yes, but that could cause genuine errors to
be missed. Said flag is not enabled by default.
I realise that (2) is "perfectly safe" in that the word-sized loads
are naturally aligned and so cannot possibly cause any page faults
that would not otherwise occur. Nevertheless, any way you slice it,
ISO C/C++ says that reading memory outside of allocated blocks
counts as undefined behaviour (IIUC), and that's precisely what
Memcheck aims to report.
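
As an illustration of technique (2), here is a minimal sketch of a strlen that switches to aligned 4-byte reads once the pointer is aligned. It is not the glibc implementation, and wordwise_strlen is a made-up name. Note that the word load can cover bytes after the terminator; if those bytes lie outside the allocated block, that is exactly the access Memcheck reports unless --partial-loads-ok=yes is given.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

size_t wordwise_strlen(const char *s)
{
    const char *p = s;

    /* Byte-at-a-time until p is 4-byte aligned. */
    while (((uintptr_t)p & 3u) != 0) {
        if (*p == '\0')
            return (size_t)(p - s);
        p++;
    }

    /* Aligned word-at-a-time scan. Each load reads 4 bytes, so the
       final one may include bytes beyond the terminating zero. */
    for (;;) {
        uint32_t w;
        memcpy(&w, p, sizeof w);
        if ((w - 0x01010101u) & ~w & 0x80808080u)
            break;              /* this word contains a zero byte */
        p += 4;
    }

    /* Locate the zero byte inside the final word. */
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}
```

The sketch is only well defined in ISO C terms when the string lives in a buffer whose size is padded to a multiple of 4 from an aligned start, so that every word load stays inside the object; the optimised library routines make no such assumption.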
We have never claimed that Memcheck is suitable for code compiled at
-O2 and above. -O is the max recommended level. I would advocate the
following:
* do not allow gcc to inline stringops at -O, only at -O2 and above
* do not strip all symbol names off ld.so
In short there's a conflict between optimising the hell out of stringops
and having enough visibility for reliable debugging. Given the above
constraints I don't see how you can have your cake and eat it.
Note that none of the above is PPC specific -- it also applies to
x86/amd64. I'm not sure why these problems appear more acute on ppc
-- it may be some interaction between the carry chain propagation
games and the fact that ppc is big-endian.
J
|