|
From: Andrew C. <and...@fr...> - 2004-08-26 09:43:08
|
(PS. I thought I sent this the other day, but it didn't show up on the
list - excuse me if it's a dupe)

I've been trying to get valgrind working to debug a plugin (.so
dynamically loaded into host app) of mine which is written for Alias'
Maya 3D app (www.alias.com). I have the source to my plugin, but only
binaries for Maya. Maya is a huge pig of a program, that generally
thwarts any debugging attempts, but valgrind is doing excellent things
for my standalone apps so I was hoping to get it working.

When I run 'valgrind --tool=memcheck /opt/aw/maya6.0/bin/maya.bin' I get
the following output ending in a segfault:

==12627== Memcheck, a memory error detector for x86-linux.
==12627== Copyright (C) 2002-2004, and GNU GPL'd, by Julian Seward et al.
==12627== Using valgrind-2.1.2, a program supervision framework for x86-linux.
==12627== Copyright (C) 2000-2004, and GNU GPL'd, by Julian Seward et al.
==12627== For more details, rerun with: -v
==12627==
==12627== Invalid read of size 4
==12627== at 0x4F9EECE4: _IO_vfprintf_internal (in /lib/i686/libc-2.3.2.so)
==12627== by 0x4FA0FC5F: _IO_vsnprintf (in /lib/i686/libc-2.3.2.so)
==12627== by 0x2074B08E: AL_vsnprintf (in /opt/aw/maya6.0/lib/libBase.so)
==12627== by 0x20776CB5: awString::CStringImpl::doFormat(char const*, char*, unsigned) (in /opt/aw/maya6.0/lib/libBase.so)
==12627== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==12627==
==12627== Process terminating with default action of signal 11 (SIGSEGV)
==12627== Access not within mapped region at address 0x0
==12627== at 0x4F9EECE4: _IO_vfprintf_internal (in /lib/i686/libc-2.3.2.so)
==12627== by 0x4FA0FC5F: _IO_vsnprintf (in /lib/i686/libc-2.3.2.so)
==12627== by 0x2074B08E: AL_vsnprintf (in /opt/aw/maya6.0/lib/libBase.so)
==12627== by 0x20776CB5: awString::CStringImpl::doFormat(char const*, char*, unsigned) (in /opt/aw/maya6.0/lib/libBase.so)
==12627==
==12627== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 123 from 1)
==12627== malloc/free: in use at exit: 641217 bytes in 19 blocks.
==12627== malloc/free: 21 allocs, 2 frees, 641225 bytes allocated.
==12627== For a detailed leak analysis, rerun with: --leak-check=yes
==12627== For counts of detected errors, rerun with: -v
Segmentation fault

The errors it reports I'm not concerned about, just the fact it dies.
This is with valgrind v2.1.2 on Fedora Core 1. Interestingly I get the
same output with '--tool=none'.

Anyone have any ideas? There is a free download version of Maya
available from www.alias.com if anyone else was interested in trying
valgrind with it.

--
Andrew Chapman
Senior Technical Director - Framestore CFC
|
|
From: Tom H. <th...@cy...> - 2004-08-26 10:11:32
|
In message <412...@fr...>
Andrew Chapman <and...@fr...> wrote:
> ==12627== Invalid read of size 4
> ==12627== at 0x4F9EECE4: _IO_vfprintf_internal (in
> /lib/i686/libc-2.3.2.so)
> ==12627== by 0x4FA0FC5F: _IO_vsnprintf (in /lib/i686/libc-2.3.2.so)
> ==12627== by 0x2074B08E: AL_vsnprintf (in /opt/aw/maya6.0/lib/libBase.so)
> ==12627== by 0x20776CB5: awString::CStringImpl::doFormat(char
> const*, char*, unsigned) (in /opt/aw/maya6.0/lib/libBase.so)
> ==12627== Address 0x0 is not stack'd, malloc'd or (recently) free'd
Here valgrind is warning you that your program is about to read
through a null pointer.
> ==12627== Process terminating with default action of signal 11 (SIGSEGV)
> ==12627== Access not within mapped region at address 0x0
> ==12627== at 0x4F9EECE4: _IO_vfprintf_internal (in
> /lib/i686/libc-2.3.2.so)
> ==12627== by 0x4FA0FC5F: _IO_vsnprintf (in /lib/i686/libc-2.3.2.so)
> ==12627== by 0x2074B08E: AL_vsnprintf (in /opt/aw/maya6.0/lib/libBase.so)
> ==12627== by 0x20776CB5: awString::CStringImpl::doFormat(char
> const*, char*, unsigned) (in /opt/aw/maya6.0/lib/libBase.so)
Then your program does read through a null pointer and crashes.
> The errors it reports I'm not concerned about, just the fact it
> dies. This is with valgrind v2.1.2 on Fedora Core 1. Interestingly I
> get the same output with '--tool=none'.
I'm afraid valgrind won't magically fix your program, and if your
program dies with a SEGV then it will still die.
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
|
|
From: Andrew C. <and...@fr...> - 2004-08-26 10:42:41
|
Tom Hughes wrote:
>
> Here valgrind is warning you that your program is about to read
> through a null pointer.
> ...
> Then your program does read through a null pointer and crashes.
> ...
> I'm afraid valgrind won't magically fix your program, and if your
> program dies with a SEGV then it will still die.

Sorry, should have made it a bit clearer there. The reported errors are
when starting Maya, not my software, which would be dynamically linked
in at a later point.

Maya is a very expensive, widely used, (relatively) robust piece of
commercial software. It does not crash when run by itself, only when run
through valgrind.

So, my immediate thought is that valgrind is being tripped up by
something - is there anything I can do to investigate things further
myself? I understand the developers wouldn't be interested in
downloading and installing a huge package to test it out for themselves
without further info.

--
Andrew Chapman
Senior Technical Director - Framestore CFC
|
|
From: David E. <tw...@us...> - 2004-08-26 10:56:14
|
On Thu, 2004-08-26 at 12:38, Andrew Chapman wrote:
> Tom Hughes wrote:
> >
> > Here valgrind is warning you that your program is about to read
> > through a null pointer.
> > ...
> > Then your program does read through a null pointer and crashes.
> > ...
> > I'm afraid valgrind won't magically fix your program, and if your
> > program dies with a SEGV then it will still die.
>
> Sorry, should have made it a bit clearer there. The reported errors are
> when starting Maya, not my software, which would be dynamically linked
> in at a later point.
>
> Maya is a very expensive, widely used, (relatively) robust piece of
> commercial software. It does not crash when run by itself, only when run
> through valgrind.
When reading the above sentence I thought of something:
Maybe valgrind triggers some kind of copy protection?
> So, my immediate thought is that valgrind is being tripped up by
> something - is there anything I can do to investigate things further
> myself? I understand the developers wouldn't be interested in
> downloading and installing a huge package to test it out for themselves
> without further info.
--
Regards,
-\- David Eriksson -/-
SynCE - http://synce.sourceforge.net
CalcEm - http://calcem.sourceforge.net
ScummVM - http://scummvm.sourceforge.net
Desquirr - http://desquirr.sourceforge.net
SetiWrapper - http://setiwrapper.sourceforge.net
|
|
From: Nicholas N. <nj...@ca...> - 2004-08-26 11:16:09
|
On Thu, 26 Aug 2004, David Eriksson wrote:
>> Maya is a very expensive, widely used, (relatively) robust piece of
>> commercial software. It does not crash when run by itself, only when run
>> through valgrind.
>
> When reading the above sentence I thought of something:
>
> Maybe valgrind triggers some kind of copy protection?

Programs don't work exactly the same under Valgrind -- ie. memory layout
is different, thread scheduling is different. This can be enough to
expose latent bugs. Unfortunately in cases like this, it's really hard
to work out where the problem really lies.

N
|
|
From: Andrew C. <and...@fr...> - 2004-08-26 11:21:37
|
David Eriksson wrote:
>
>>Maya is a very expensive, widely used, (relatively) robust piece of
>>commercial software. It does not crash when run by itself, only when run
>>through valgrind.
>
> When reading the above sentence I thought of something:
>
> Maybe valgrind triggers some kind of copy protection?

Interesting thought - Maya is licensed via FlexLM (again a very robust,
widely used piece of software), but from the call stack it looks like
classes internal to Maya, not the licensing API.

Another thing to mention: Maya normally takes a good 5-20 seconds to
fully start up here, but that crash happens when running within valgrind
within a few seconds. So it looks like something pretty fundamental
going wrong.

If you guys are convinced that it isn't a problem with valgrind itself,
then I'll get in touch with the vendors, see if they have anything to
say about it.

Cheers.

--
Andrew Chapman
Senior Technical Director - Framestore CFC
|
|
From: Paul P. <pa...@pa...> - 2004-08-26 15:12:51
|
>>>>> On Thu, 26 Aug 2004 12:20:42 +0100, Andrew Chapman <and...@fr...> said:
> If you guys are convinced that it isn't a problem with valgrind itself,
> then I'll get in touch with the vendors, see if they have anything to
> say about it.
It is my understanding that there are several versions of Maya,
and that some of them work only with LinuxThreads, while others
only work with NPTL.
Note that valgrind *forces* its own /usr/local/lib/valgrind/libpthread.so
instead of /lib/{i686,tls}/libpthread.so.0 which would be used without it.
If maya.bin is "sensitive" enough to distinguish LinuxThreads/NPTL,
it may also be sensitive enough to not work with valgrind/libpthread.so;
but perhaps the "other" version of maya.bin will work?
Cheers,
--
Paul Pluzhnikov pa...@pa...
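
A minimal sketch, not taken from the thread, of one way to ask glibc which threading implementation it was built for (LinuxThreads or NPTL). It assumes a glibc system that provides the `_CS_GNU_LIBPTHREAD_VERSION` name for `confstr`; the helper name `thread_library_name` is hypothetical. Note that this reports what the loaded libc says about itself, so it would not by itself detect a substituted libpthread.so.

```c
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy glibc's threading-library description
 * into buf. Returns 1 if the whole string fit in buf, 0 if confstr
 * failed or the buffer was too small.
 */
int thread_library_name(char *buf, size_t len)
{
    /* confstr returns the size needed (including the terminating NUL),
       or 0 if the name is not supported. */
    size_t needed = confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, len);

    return needed != 0 && needed <= len;
}
```

On a glibc system the string is typically of the form "NPTL <version>" or "linuxthreads-<version>"; the same value is available from the shell via `getconf GNU_LIBPTHREAD_VERSION`.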
|
|
From: Andrew C. <and...@fr...> - 2004-08-26 17:16:59
|
Paul Pluzhnikov wrote:
>
> If maya.bin is "sensitive" enough to distinguish LinuxThreads/NPTL,
> it may also be sensitive enough to not work with valgrind/libpthread.so;
> but perhaps the "other" version of maya.bin will work?

I did an objdump on all the Maya binaries, then grep'ed the results for
thread looking symbols. Got plenty of '*pthread*' looking symbols,
nothing containing '*nptl*' - though I'm not sure exactly what to be
looking for there.

> It is my understanding that there are several versions of Maya,
> and that some of them work only with LinuxThreads, while others
> only work with NPTL.

Really not sure about that, AFAIK the core is the same across all linux
versions (they have 'full', 'light' and the free 'learning' versions).

--
Andrew Chapman
Senior Technical Director - Framestore CFC
|
|
From: Paul P. <pa...@pa...> - 2004-08-26 22:30:57
|
>>>>> On Thu, 26 Aug 2004 18:10:18 +0100, Andrew Chapman <and...@fr...> said:
> Paul Pluzhnikov wrote:
>> It is my understanding that there are several versions of Maya,
>> and that some of them work only with LinuxThreads, while others
>> only work with NPTL.
It turns out I got my info mixed-up (I do not use Maya myself):
there don't appear to be any versions of maya that run *only*
on NPTL.
However, I got some LD_DEBUG traces from running maya under VG,
and it seems that the problem is similar to what is happening
with the test case below:
$ cat junk.c
#include <dlfcn.h>
#include <assert.h>

int main()
{
  void *h, *p;

  h = dlopen("libc.so.6", RTLD_LAZY);
  assert(h);
  p = dlsym(h, "errno");
  assert(p);
  return 0;
}

$ gcc -g junk.c -ldl
$ ./a.out && echo ok
ok
$ valgrind --skin=none --num-callers=10 ./a.out
==32331== Nulgrind, a binary JIT-compiler for x86-linux.
==32331== Copyright (C) 2002-2004, and GNU GPL'd, by Nicholas Nethercote.
==32331== Using valgrind-2.1.2, a program supervision framework for x86-linux.
==32331== Copyright (C) 2000-2004, and GNU GPL'd, by Julian Seward et al.
==32331== For more details, rerun with: -v
==32331==
a.out: junk.c:11: main: Assertion `p' failed.
==32331==
==32331== Process terminating with default action of signal 6 (SIGABRT)
==32331== at 0x3A9BE5C1: kill (in /lib/libc-2.3.2.so)
==32331== by 0x3A97EBBE: gsignal (vg_intercept.c:93)
==32331== by 0x3A9BF89A: abort (in /lib/libc-2.3.2.so)
==32331== by 0x3A9B78A4: __GI___assert_fail (in /lib/libc-2.3.2.so)
==32331== by 0x804847A: main (junk.c:11)
==32331==
Aborted
Since the program under VG still uses the same libc.so.6, I don't
understand why 'dlsym(h, "errno")' fails, but it is probably easy
to fix.
Regards,
--
Paul Pluzhnikov pa...@pa...
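
A variant of the test case above, sketched here as an assumption about what might help narrow the failure down: instead of asserting, it prints the text from `dlerror()` so the dynamic linker's own explanation is visible. The helper name `report_lookup` is hypothetical, and the sketch assumes a glibc system where "libc.so.6" can be opened by name (link with -ldl on older glibc).

```c
#include <dlfcn.h>
#include <stdio.h>

/*
 * Hypothetical helper: open `lib`, look up `sym`, and print the
 * dynamic linker's error message if either step fails.
 * Returns 1 if the symbol resolved to a non-NULL address, else 0.
 */
int report_lookup(const char *lib, const char *sym)
{
    void *h = dlopen(lib, RTLD_LAZY);
    if (h == NULL) {
        fprintf(stderr, "dlopen(%s): %s\n", lib, dlerror());
        return 0;
    }

    dlerror();                     /* clear any stale error state */
    void *p = dlsym(h, sym);
    const char *err = dlerror();   /* non-NULL only if dlsym failed */
    int ok = (err == NULL && p != NULL);

    if (err != NULL)
        fprintf(stderr, "dlsym(%s, %s): %s\n", lib, sym, err);
    else
        printf("%s resolved to %p\n", sym, p);

    dlclose(h);
    return ok;
}
```

Calling `report_lookup("libc.so.6", "errno")` once natively and once under valgrind would show whether the two runs disagree and, if so, what message the linker gives.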
|
|
From: Andrew C. <and...@fr...> - 2004-08-26 22:41:08
|
Paul Pluzhnikov wrote:
>
> However, I got some LD_DEBUG traces from running maya under VG,
> and it seems that the problem is similar to what is happening
> with the test case below:
>
> $ cat junk.c
> #include <dlfcn.h>
> #include <assert.h>
>
> int main()
> {
> void *h, *p;
>
> h = dlopen("libc.so.6", RTLD_LAZY);
> assert(h);
> p = dlsym(h, "errno");
> assert(p);
> return 0;
> }
>
> [snip]
>
> Since the program under VG still uses the same libc.so.6, I don't
> understand why 'dlsym(h, "errno")' fails, but it is probably easy
> to fix.
Ahh, that would make sense, as Maya is just a tiny executable and a
whole bunch of SOs, both explicitly and implicitly dynamically linked.
--
Andrew Chapman
Senior Technical Director - Framestore CFC
|
|
From: Tom H. <th...@cy...> - 2004-08-26 23:02:35
|
In message <166...@bu...>
Paul Pluzhnikov <pa...@pa...> wrote:
> $ cat junk.c
> #include <dlfcn.h>
> #include <assert.h>
>
> int main()
> {
> void *h, *p;
>
> h = dlopen("libc.so.6", RTLD_LAZY);
> assert(h);
> p = dlsym(h, "errno");
> assert(p);
> return 0;
> }
>
> $ gcc -g junk.c -ldl
> $ ./a.out && echo ok
> ok
> $ valgrind --skin=none --num-callers=10 ./a.out
> ==32331== Nulgrind, a binary JIT-compiler for x86-linux.
> ==32331== Copyright (C) 2002-2004, and GNU GPL'd, by Nicholas Nethercote.
> ==32331== Using valgrind-2.1.2, a program supervision framework for x86-linux.
> ==32331== Copyright (C) 2000-2004, and GNU GPL'd, by Julian Seward et al.
> ==32331== For more details, rerun with: -v
> ==32331==
> a.out: junk.c:11: main: Assertion `p' failed.
> ==32331==
> ==32331== Process terminating with default action of signal 6 (SIGABRT)
> ==32331== at 0x3A9BE5C1: kill (in /lib/libc-2.3.2.so)
> ==32331== by 0x3A97EBBE: gsignal (vg_intercept.c:93)
> ==32331== by 0x3A9BF89A: abort (in /lib/libc-2.3.2.so)
> ==32331== by 0x3A9B78A4: __GI___assert_fail (in /lib/libc-2.3.2.so)
> ==32331== by 0x804847A: main (junk.c:11)
> ==32331==
> Aborted
Can you raise a bug for this, please?
> Since the program under VG still uses the same libc.so.6, I don't
> understand why 'dlsym(h, "errno")' fails, but it is probably easy
> to fix.
Actually I suspect it will be really hard. Anything to do with the
dynamic linker search order usually is. Especially since it seems to
vary wildly from one version of glibc to the next.
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
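
One way to observe the search-order effect mentioned above, offered as a rough sketch rather than anything from the thread: compare the address a symbol resolves to through the program's global scope (the handle from `dlopen(NULL, ...)`) with the address found by searching a specific library. If the two differ, some object earlier in the search order provides its own definition. The helper name `is_interposed` is hypothetical; link with -ldl on older glibc.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if `sym` resolves to a different
 * address through the global scope than through `lib` itself,
 * 0 if both lookups agree, and -1 if either lookup fails.
 */
int is_interposed(const char *lib, const char *sym)
{
    void *global = dlopen(NULL, RTLD_LAZY);  /* main program's global scope */
    void *own = dlopen(lib, RTLD_LAZY);      /* the named library only */
    int result = -1;

    if (global != NULL && own != NULL) {
        void *from_global = dlsym(global, sym);
        void *from_lib = dlsym(own, sym);

        if (from_global != NULL && from_lib != NULL)
            result = (from_global != from_lib);
    }

    if (own != NULL)
        dlclose(own);
    if (global != NULL)
        dlclose(global);
    return result;
}
```

Running a check such as `is_interposed("libc.so.6", "gsignal")` natively and under valgrind is one possible way to see which libc symbols are being supplied by another object in the second case; the result will depend on the glibc and valgrind versions in use.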
|