From: Simon G. <sim...@ma...> - 2009-07-12 20:38:17
So, I'm new to this, and I'm not sure if it's a mac-specific thing, or
a general issue... Be gentle :)
I'm trying to run my application under valgrind, and it's failing when
I initialise the OpenGL context... I get the debug log:
==50304== ... (snipped)
==50304== Process terminating with default action of signal 11 (SIGSEGV)
==50304== Access not within mapped region at address 0x187E7108
==50304== at 0x1466316: memcpy (mc_replace_strmem.c:482)
==50304== by 0x16050DFA: gldGetTextureLevel (in /System/Library/
Extensions/GeForce8xxxGLDriver.bundle/Contents/MacOS/
GeForce8xxxGLDriver)
==50304== by 0x160528A0: gldGetTextureLevel (in /System/Library/
Extensions/GeForce8xxxGLDriver.bundle/Contents/MacOS/
GeForce8xxxGLDriver)
==50304== by 0x160EBA76: gldGetTextureLevel (in /System/Library/
Extensions/GeForce8xxxGLDriver.bundle/Contents/MacOS/
GeForce8xxxGLDriver)
==50304== by 0x160EBDDA: gldGetTextureLevel (in /System/Library/
Extensions/GeForce8xxxGLDriver.bundle/Contents/MacOS/
GeForce8xxxGLDriver)
==50304== by 0x1606ED55: gldGetTextureLevel (in /System/Library/
Extensions/GeForce8xxxGLDriver.bundle/Contents/MacOS/
GeForce8xxxGLDriver)
==50304== by 0x15FEA6CE: gldAllocVertexBuffer (in /System/Library/
Extensions/GeForce8xxxGLDriver.bundle/Contents/MacOS/
GeForce8xxxGLDriver)
==50304== by 0x15FEA7D0: gldAllocVertexBuffer (in /System/Library/
Extensions/GeForce8xxxGLDriver.bundle/Contents/MacOS/
GeForce8xxxGLDriver)
==50304== by 0x15FC9EFB: gldCreateContext (in /System/Library/
Extensions/GeForce8xxxGLDriver.bundle/Contents/MacOS/
GeForce8xxxGLDriver)
==50304== by 0x15E1387C: gliCreateContext (in /System/Library/
Frameworks/OpenGL.framework/Versions/A/Resources/GLEngine.bundle/
GLEngine)
==50304== by 0x24F58E7: cglInitializeContext (in /System/Library/
Frameworks/OpenGL.framework/Versions/A/OpenGL)
==50304== by 0x24F51DD: CGLCreateContext (in /System/Library/
Frameworks/OpenGL.framework/Versions/A/OpenGL)
==50304== If you believe this happened as a result of a stack
==50304== overflow in your program's main thread (unlikely but
==50304== possible), you can try to increase the size of the
==50304== main thread stack using the --main-stacksize= flag.
==50304== The main thread stack size used in this run was 8388608.
--50304:0:schedule VG_(sema_down): read returned -4
==50304==
==50304== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 23 from
2)
==50304== malloc/free: in use at exit: 6,482,048 bytes in 23,552 blocks.
==50304== malloc/free: 110,105 allocs, 86,553 frees, 17,010,063 bytes
allocated.
==50304== For counts of detected errors, rerun with: -v
==50304== searching for pointers to 23,552 not-freed blocks.
==50304== checked 47,251,752 bytes.
==50304==
==50304== LEAK SUMMARY:
==50304== definitely lost: 272 bytes in 17 blocks.
==50304== indirectly lost: 6,828 bytes in 12 blocks.
==50304== possibly lost: 868,510 bytes in 2,184 blocks.
==50304== still reachable: 5,600,608 bytes in 21,197 blocks.
==50304== suppressed: 5,830 bytes in 142 blocks.
==50304== Rerun with --leak-check=full to see details of leaked memory.
I've been playing with suppressions because the above are OS-internal
issues, and eventually (after failing to stop the SIGSEGV being sent),
my 'valgrind' file looks like:
-----8<-----8<----- Snip here -----8<-----8<-----
{
GL context creation
Memcheck:Addr16
fun:memcpy
...
fun:CGLCreateContext
}
{
GL context creation
Memcheck:Addr8
fun:memcpy
...
fun:CGLCreateContext
}
{
GL context creation
Memcheck:Addr4
fun:memcpy
...
fun:CGLCreateContext
}
{
GL context creation
Memcheck:Addr2
fun:memcpy
...
fun:CGLCreateContext
}
{
GL context creation
Memcheck:Addr1
fun:memcpy
...
fun:CGLCreateContext
}
-----8<-----8<----- Snip here -----8<-----8<-----
I started off with more-specific suppressions, but even with the
above, the application is aborted every time I run. As far as I can
tell from the manual, *any* access with a chain starting with
'CGLCreateContext' and ending with 'memcpy' ought to be ignored by
memcheck... Either I'm missing something, or I've tickled a bug... I
tried making the stack larger, but that made no difference ...
Any help gratefully received :)
Simon
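The five entries in the file above differ only in the access size (`Addr16` down to `Addr1`). As a minimal sketch, assuming Python is at hand and that the entry name and the two frame patterns are exactly those shown in the post, such a file could be generated rather than typed out. `make_suppression` and `make_suppression_file` are hypothetical helper names for illustration, not part of Valgrind.

```python
# Access sizes taken from the five entries in the post above.
ACCESS_SIZES = (16, 8, 4, 2, 1)


def make_suppression(name, size, top_frame="memcpy",
                     bottom_frame="CGLCreateContext"):
    """Return one Memcheck suppression entry as text.

    The layout mirrors the entries in the post: a name line, the
    error kind, the innermost frame, a '...' wildcard for any frames
    in between, and the outermost frame to match.
    """
    lines = [
        "{",
        f"   {name}",
        f"   Memcheck:Addr{size}",
        f"   fun:{top_frame}",
        "   ...",
        f"   fun:{bottom_frame}",
        "}",
    ]
    return "\n".join(lines)


def make_suppression_file(name="GL context creation"):
    """Return the text of a file with one entry per access size."""
    entries = [make_suppression(name, size) for size in ACCESS_SIZES]
    return "\n".join(entries) + "\n"


if __name__ == "__main__":
    print(make_suppression_file(), end="")
```

This only reproduces the file; whether such entries have any effect on the crash is a separate question.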
From: tom f. <tf...@al...> - 2009-07-12 21:01:35
Simon Gornall <sim...@ma...> writes:
> I'm trying to run my application under valgrind, and it's failing
> when I initialise the OpenGL context... I get the debug log:
To clarify -- are you saying that it can initialize the context fine
when run standalone, and only fails through valgrind?
(I'm going to assume `yes'... sorry.)
> ==50304== ... (snipped)
> ==50304== Process terminating with default action of signal 11 (SIGSEGV)
> ==50304== Access not within mapped region at address 0x187E7108
> ==50304== at 0x1466316: memcpy (mc_replace_strmem.c:482)
> ==50304== by 0x16050DFA: gldGetTextureLevel (in /System/Library/
> Extensions/GeForce8xxxGLDriver.bundle/Contents/MacOS/
> GeForce8xxxGLDriver)
[snip]
> I've been playing with suppressions because the above are OS-internal
> issues, and eventually (after failing to stop the SIGSEGV being sent),
> my 'valgrind' file looks like:
>
> -----8<-----8<----- Snip here -----8<-----8<-----
> {
> GL context creation
> Memcheck:Addr16
> fun:memcpy
> ...
> fun:CGLCreateContext
> }
[snip]
> I started off with more-specific suppressions, but even with the
> above, the application is aborted every time I run.
This sentence implies you are slightly confused about the semantics of
a suppression file. These files are merely to suppress *reporting* of
errors; the existence or not of any given suppression should not change
the behavior of a program. It will certainly change what valgrind
prints out, but not what's going on under the hood.
Since you're using nvidia's driver, I'd ask you to re-run using the
command line option `--smc-check=all'. I am not as familiar with
nvidia's Mac driver, but on Linux the driver utilizes self modifying
code, which causes me to see a lot of errors similar to the one you
describe above. Note that even with SMC checking, I still see a lot of
nvidia-based errors, so don't expect this will be a 100% solution.
If that fails, I suggest linking your program against Mesa, probably
mangled Mesa for logistical reasons. That will valgrind much more
`nicely'. This:
http://visitusers.org/index.php?title=Mangled_Mesa
might help you, if you need to go that route.
Cheers,
-tom
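To make the distinction concrete: in the log quoted above, the crash is announced by the `Process terminating with default action of signal` line, while the `ERROR SUMMARY` line counts Memcheck's own error reports, which are what a suppression file filters. A minimal Python sketch of pulling the two apart follows. `summarize` is a hypothetical helper, and the patterns only cover those two lines as they appear in this thread, so they are an assumption for any other Valgrind version.

```python
import re

# Patterns copied from the two log lines quoted in this thread.
SIGNAL_RE = re.compile(
    r"Process terminating with default action of signal (\d+) \((\w+)\)"
)
SUMMARY_RE = re.compile(r"ERROR SUMMARY: (\d+) errors from (\d+) contexts")


def summarize(log_text):
    """Report the fatal signal (if any) and the reported error count.

    The two values are independent: a run can end with a fatal
    signal even though the error summary shows zero errors.
    """
    signal = SIGNAL_RE.search(log_text)
    summary = SUMMARY_RE.search(log_text)
    return {
        "fatal_signal": signal.group(2) if signal else None,
        "reported_errors": int(summary.group(1)) if summary else None,
    }


# A few lines taken verbatim from the log in the original post.
SAMPLE = """\
==50304== Process terminating with default action of signal 11 (SIGSEGV)
==50304== Access not within mapped region at address 0x187E7108
==50304== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 23 from
2)
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

On the sample, the summary reports zero errors while the run still ended with SIGSEGV, which is why no suppression entry changes the outcome.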
From: tom f. <tf...@al...> - 2009-07-12 22:12:57
Hi Simon,

let's keep discussions on-list. Mostly since I'm no valgrind expert,
and if I make incorrect statements I'd like one to jump in with a clue
bat :)

Simon Gornall <sim...@ma...> writes:
>
> On Jul 12, 2009, at 2:02 PM, tom fogal wrote:
>
> > Simon Gornall <sim...@ma...> writes:
> >> I'm trying to run my application under valgrind, and it's failing
> >> when I initialise the OpenGL context...
> > [snip]
> >> I started off with more-specific suppressions, but even with the
> >> above, the application is aborted every time I run.
> >
> > This sentence implies you are slightly confused about the
> > semantics of a suppression file. These files are merely to
> > suppress *reporting* of errors; the existence or not of any given
> > suppression should not change the behavior of a program. It will
> > certainly change what valgrind prints out, but not what's going on
> > under the hood.
>
> Ah - yes, I was hoping it was more along the lines of "yes, I know
> there's an issue there, just ignore it for now". I take it there's no
> way of doing that, then?

Sorry, "that" reads ambiguously to me. You mean no way of modifying
the behavior of a program using valgrind? No, not beyond the normal
valgrind instrumentation. A suppression is your way of saying exactly
what you've put in quotes, if I was unclear earlier.

> > Since you're using nvidia's driver, I'd ask you to re-run using the
> > command line option `--smc-check=all'.
[snip]
> Yep, it fails in the same way.

Doh, sorry that wasn't any help :(

> > If that fails, I suggest linking your program against Mesa [. . .]
>
> Thanks, I'll see if that helps - the application uses CoreImage (and
> therefore a lot of GPU shader code) quite heavily though, so I'm not
> sure if Mesa is up to it (I haven't used it in a decade or so, so I
> might be unfairly maligning Mesa :)

I do GPU-based raycasted volume rendering through Mesa quite regularly.
It works. The swrast Mesa backend supports OpenGL 2.1. It's about an
order of magnitude slower than my nvidia card.

It's also about two orders of magnitude faster than Apple's software
fallback.

-tom
From: Simon G. <sim...@ma...> - 2009-07-12 22:33:25
On Jul 12, 2009, at 3:14 PM, tom fogal wrote:

> Hi Simon,
>
> let's keep discussions on-list. Mostly since I'm no valgrind expert,
> and if I make incorrect statements I'd like one to jump in with a clue
> bat :)

Yeah, I hit 'reply' rather than 'reply-all'. I actually re-posted it
to "users" afterwards :)

> Simon Gornall <sim...@ma...> writes:
>>
>> On Jul 12, 2009, at 2:02 PM, tom fogal wrote:
>>
>>> Simon Gornall <sim...@ma...> writes:
>>>> I'm trying to run my application under valgrind, and it's failing
>>>> when I initialise the OpenGL context...
>>> [snip]
>>>> I started off with more-specific suppressions, but even with the
>>>> above, the application is aborted every time I run.
>>>
>>> This sentence implies you are slightly confused about the
>>> semantics of a suppression file. These files are merely to
>>> suppress *reporting* of errors; the existence or not of any given
>>> suppression should not change the behavior of a program. It will
>>> certainly change what valgrind prints out, but not what's going on
>>> under the hood.
>>
>> Ah - yes, I was hoping it was more along the lines of "yes, I know
>> there's an issue there, just ignore it for now". I take it there's no
>> way of doing that, then?
>
> Sorry, "that" reads ambiguously to me. You mean no way of modifying
> the behavior of a program using valgrind? No, not beyond the normal
> valgrind instrumentation. A suppression is your way of saying exactly
> what you've put in quotes, if I was unclear earlier.

Right. The problem (for me at least) is that

1) There's a piece of code not under my control that has a
   memory-access issue
2) Valgrind kills my program whenever there's a memory-access issue

What I'd like valgrind to do (in this case) is run my program, report
the problem, and not kill my program if it's in the 'suppressed' file.

Is it possible to do that? From what you say above, I got the
impression that it was just the reporting of the problem that the
suppression file affects, not whether valgrind kills my program. I'd
happily take valgrind just reporting the issue (and me subsequently
ignoring it) if I could get past the SIGSEGV :)

>>> Since you're using nvidia's driver, I'd ask you to re-run using the
>>> command line option `--smc-check=all'.
> [snip]
>> Yep, it fails in the same way.
>
> Doh, sorry that wasn't any help :(
>
>>> If that fails, I suggest linking your program against Mesa [. . .]
>>
>> Thanks, I'll see if that helps - the application uses CoreImage (and
>> therefore a lot of GPU shader code) quite heavily though, so I'm not
>> sure if Mesa is up to it (I haven't used it in a decade or so, so I
>> might be unfairly maligning Mesa :)
>
> I do GPU-based raycasted volume rendering through Mesa quite
> regularly.
> It works. The swrast Mesa backend supports OpenGL 2.1. It's about an
> order of magnitude slower than my nvidia card.
>
> It's also about two orders of magnitude faster than Apple's software
> fallback.

Cool - I'll give it a go then :)

ATB,
Simon
From: Nicholas N. <n.n...@gm...> - 2009-07-12 22:45:27
On Mon, Jul 13, 2009 at 8:33 AM, Simon Gornall<sim...@ma...> wrote:
>
> 1) There's a piece of code not under my control that has a
>    memory-access issue
> 2) Valgrind kills my program whenever there's a memory-access issue
>
> What I'd like valgrind to do (in this case) is run my program, report
> the problem, and not kill my program if it's in the 'suppressed' file
>
> Is it possible to do that? From what you say above, I got the
> impression that it was just the reporting of the problem that the
> suppression file affects, not whether valgrind kills my program.

Correct; i.e. the suppressions file won't help.

> I'd
> happily take valgrind just reporting the issue (and me subsequently
> ignoring it) if I could get past the SIGSEGV :)

http://www.valgrind.org/docs/manual/faq.html#faq.crashes is relevant
here. It looks like you are in an unlucky situation. Probably the
best thing you can do is report any Valgrind errors that occur before
the crash to nVidia, and hope that the next version of the driver
works better :(

Nick
From: Ashley P. <as...@pi...> - 2009-07-13 09:45:47
On Mon, 2009-07-13 at 08:45 +1000, Nicholas Nethercote wrote:
>
> http://www.valgrind.org/docs/manual/faq.html#faq.crashes is relevant
> here. It looks like you are in an unlucky situation. Probably the
> best thing you can do is report any Valgrind errors that occur before
> the crash to nVidia, and hope that the next version of the driver
> works better :(

Given that this is a device driver with GPU off-load code, the crash may
well have nothing to do with errors in the driver; possibly the driver
has mapped some pages into userspace that Valgrind doesn't know about.
As I recall, OpenGL didn't work under Valgrind on Linux for a couple of
years for just this reason.

Are there any "unrecognised syscall" errors before the crash?

Ultimately I suppose the cause doesn't make any difference; all you can
do is try to find a way of running the code using different libraries.
The use of a software renderer sounds like a promising solution.

Ashley,

--
Ashley Pittman, Bath, UK.
Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk
From: Dan K. <da...@ke...> - 2009-07-13 00:14:12
On Sun, Jul 12, 2009 at 1:38 PM, Simon Gornall<sim...@ma...> wrote:
> So, I'm new to this, and I'm not sure if it's a mac-specific thing, or
> a general issue... Be gentle :)
>
> I'm trying to run my application under valgrind, and it's failing when
> I initialise the OpenGL context... I get the debug log:
>
> ==50304== ... (snipped)
> ==50304== Process terminating with default action of signal 11 (SIGSEGV)

Bam. Stop right there. If it crashes, all bets are off.
Figure out why it crashes first.

You might find valgrind's --db-attach=yes option handy.
Alternatively, insert printf's until you figure out why the crash occurs.

- Dan
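The options mentioned across this thread can be collected into one command line. The following is a minimal Python sketch under stated assumptions: `valgrind_command` is a hypothetical helper, `./myapp` is a placeholder program name, and `--suppressions=` is Valgrind's option for naming a suppression file (the thread does not show how the file was passed). The flags are the ones current when this thread was written, so check that your Valgrind version accepts them.

```python
import shlex


def valgrind_command(program, *, smc_check_all=False, suppressions=None,
                     main_stacksize=None, db_attach=False, program_args=()):
    """Build an argv list for running `program` under Valgrind."""
    cmd = ["valgrind"]
    if smc_check_all:
        # Suggested in the thread for drivers using self-modifying code.
        cmd.append("--smc-check=all")
    if suppressions:
        # Only filters what is reported; it does not prevent a crash.
        cmd.append(f"--suppressions={suppressions}")
    if main_stacksize is not None:
        # Named in Valgrind's own hint in the crash log.
        cmd.append(f"--main-stacksize={main_stacksize}")
    if db_attach:
        # Suggested in the thread for inspecting the crash in a debugger.
        cmd.append("--db-attach=yes")
    cmd.append(program)
    cmd.extend(program_args)
    return cmd


if __name__ == "__main__":
    cmd = valgrind_command("./myapp", smc_check_all=True, db_attach=True)
    # Print a shell-safe rendering of the command without running it.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Returning a list rather than a string keeps the result usable with `subprocess.run` without any shell quoting concerns.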