|
From: Folkert v. H. <fol...@gm...> - 2012-04-15 17:18:02
|
Hi,

I'm trying to debug this (opengl) application I'm writing.
Now something odd happens: when I run it in gdb, it occasionally
sigsegvs which is not ok but expected, but when run under valgrind the
whole system crashes.
I think it starts swapping like hell, but far more than the usual out
of memory situation because not even the mouse cursor moves.
So what I would like to know: is it possible to let valgrind limit the
amount of memory the 'guest application' uses? Could not find this in
the man page.

Thanks.

--
www.vanheusden.com
bitcoin account: 14ExronPRN44urf4jqPMyoAN46T75MKGgP
msn address: sp...@va...
|
From: Philippe W. <phi...@sk...> - 2012-04-15 20:22:31
|
On Sun, 2012-04-15 at 19:17 +0200, Folkert van Heusden wrote:
> Hi,
>
> I'm trying to debug this (opengl) application I'm writing.
> Now something odd happens: when I run it in gdb, it occasionally
> sigsegvs which is not ok but expected, but when ran under valgrind the
> whole system crashes.
> I think it starts swapping like hell, but far more than the usual out
> of memory situation because not even the mouse cursor moves.
> So what I would like to know: is it possible to let valgrind limit the
> amount of memory the 'guest application' uses? Could not find this in
> the man page.

Which Valgrind version are you using?
3.7.0 contains a fix related to memory usage (bug 250101).
3.8.0 SVN also has some improvements related to memory usage.

If you still have a problem, ulimit -d .... will limit the total memory
used by Valgrind and the guest application.

Philippe
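For reference, `ulimit -d` sets the RLIMIT_DATA resource limit in the shell, and processes started from that shell inherit it. The same cap can be set from C before launching a program. The sketch below is only an illustration: the helper name `cap_soft_limit` is made up and is not part of Valgrind. As far as the Linux getrlimit(2) manual page describes it, RLIMIT_DATA covers mmap-based allocations only since Linux 4.7, so on older kernels RLIMIT_AS (the limit behind `ulimit -v`) may be the more effective choice.

```c
#include <sys/resource.h>

/* Hypothetical helper: lower the soft limit of `resource`
 * (e.g. RLIMIT_DATA or RLIMIT_AS) to at most max_bytes.
 * The hard limit is left untouched. Processes exec'ed afterwards
 * inherit the new limit.
 * Returns 0 on success, -1 on failure (errno is set). */
int cap_soft_limit(int resource, rlim_t max_bytes)
{
    struct rlimit lim;

    if (getrlimit(resource, &lim) != 0)
        return -1;

    /* The soft limit may never exceed the hard limit. */
    if (lim.rlim_max != RLIM_INFINITY && max_bytes > lim.rlim_max)
        max_bytes = lim.rlim_max;

    lim.rlim_cur = max_bytes;
    return setrlimit(resource, &lim);
}
```

A launcher could call `cap_soft_limit(RLIMIT_AS, ...)` and then exec valgrind. Because Valgrind and the guest application share one process, the cap applies to both together, as noted in the message above.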
|
From: Folkert v. H. <fol...@gm...> - 2012-04-15 20:53:16
|
> > I'm trying to debug this (opengl) application I'm writing.
> > Now something odd happens: when I run it in gdb, it occasionally
> > sigsegvs which is not ok but expected, but when ran under valgrind the
> > whole system crashes.
> > I think it starts swapping like hell, but far more than the usual out
> > of memory situation because not even the mouse cursor moves.
> > So what I would like to know: is it possible to let valgrind limit the
> > amount of memory the 'guest application' uses? Could not find this in
> > the man page.
>
> Which Valgrind version are you using ?
> 3.7.0 contains a fix related to memory usage (bug 250101).

I'm using 3.7.0 - the one currently in debian testing.

> If you still have a problem, ulimit -d .... will limit the total memory
> used by Valgrind and the guest application.

Tried that but that did not help.

I'm also not entirely sure if it indeed is a leak in my program as I
have a routine in it which constantly (20 times/sec) checks the memory
usage (via /proc) of my program and does an exit() when it reaches
500MB (normally it should not use more than 120MB).
If it is something in Xorg, would valgrind "see" this? I think it
would (not sure, that's why I ask) as it would be the outcome of a
call to it (via SDL/GL).

Thanks.
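A watchdog of the kind described in the message above might look roughly like the following C sketch. It is not the poster's code: the choice of /proc/self/statm, the use of the resident-set field, and all function names are assumptions made for illustration. One caveat worth keeping in mind: under Valgrind the guest and Valgrind run in the same process, so a figure read this way most likely includes Valgrind's own overhead as well.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Parse the resident-set size (second field, counted in pages) from the
 * text of /proc/<pid>/statm. Returns -1 if the text cannot be parsed. */
long parse_statm_resident_pages(const char *statm_text)
{
    long total_pages, resident_pages;

    if (sscanf(statm_text, "%ld %ld", &total_pages, &resident_pages) != 2)
        return -1;
    return resident_pages;
}

/* Resident-set size of the calling process in bytes, or -1 on error. */
long long self_resident_bytes(void)
{
    char buf[256];
    long pages;
    FILE *f = fopen("/proc/self/statm", "r");

    if (f == NULL)
        return -1;
    if (fgets(buf, sizeof buf, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);

    pages = parse_statm_resident_pages(buf);
    if (pages < 0)
        return -1;
    return (long long)pages * sysconf(_SC_PAGESIZE);
}

/* Hypothetical watchdog step, meant to be called periodically:
 * terminate the program once resident memory exceeds limit_bytes. */
void exit_if_over_limit(long long limit_bytes)
{
    long long rss = self_resident_bytes();

    if (rss > limit_bytes) {
        fprintf(stderr, "memory limit exceeded: %lld bytes resident\n", rss);
        exit(EXIT_FAILURE);
    }
}
```

With the figures from the message, the periodic call would be `exit_if_over_limit(500LL * 1024 * 1024)`.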
|
From: Philippe W. <phi...@sk...> - 2012-04-15 21:06:41
|
On Sun, 2012-04-15 at 22:52 +0200, Folkert van Heusden wrote:
> I'm also not entirely sure if it indeed is a leak in my program as I
> have a routine in it which constantly (20 times/sec) checks the memory
> usage (via /proc) of my program and does an exit() when it reaches
> 500MB (normally it should not use more than 120MB).
> If it is something in Xorg, would valgrind "see" this? I think it
> would (not sure, that's why I ask) as it would be the outcome of a
> call to it (via SDL/GL).

I guess you mean by Xorg the separate system process handling
the X display?
Valgrind will not monitor what is happening in this separate
process, unless it is launched under Valgrind.

Philippe
|
From: Folkert v. H. <fol...@gm...> - 2012-04-17 08:41:23
|
> > I'm also not entirely sure if it indeed is a leak in my program as I
> > have a routine in it which constantly (20 times/sec) checks the memory
> > usage (via /proc) of my program and does an exit() when it reaches
> > 500MB (normally it should not use more than 120MB).
> > If it is something in Xorg, would valgrind "see" this? I think it
> > would (not sure, that's why I ask) as it would be the outcome of a
> > call to it (via SDL/GL).
> I guess you mean by Xorg the separate system process handling
> the X display ?
> Valgrind will not monitor what is happening in this separate
> process, unless it is launched under Valgrind.

I think I found it.
When developing this application I decided that it would be cleaner to
use unsigned ints when applicable.
That's indeed very clean but a hell when fixing bugs: when something
goes wrong and a negative value is put in such an unsigned integer, a
very large positive value is set in reality. For example: unsigned
short x = -1; would give x=65535.
So when this happens, my program would allocate gigabytes of ram. And
since I used --malloc-fill=, valgrind would then initialize this ram
(I'm speculating here) causing big time swapping. I found this out by
disabling swap memory.

So either I'm totally wrong and something else is going wrong or it
might be nice to implement "lazy malloc fill" which initializes pages
to that value only when a pagefault occurs. Might help overcommit as
well.
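The wraparound described in the message above can be reproduced in a few lines of C, together with one possible guard against it. This is a sketch under assumptions: `wrap_to_ushort`, `checked_alloc` and the 512 MiB cap `MAX_ALLOC_BYTES` are hypothetical names and values chosen for illustration, not taken from the application being discussed.

```c
#include <stddef.h>
#include <stdlib.h>

/* Converting a negative int to unsigned short wraps modulo
 * USHRT_MAX + 1, so -1 becomes USHRT_MAX (65535 when unsigned short
 * is 16 bits wide). */
unsigned short wrap_to_ushort(int value)
{
    return (unsigned short)value;
}

/* Hypothetical upper bound for a single allocation. */
#define MAX_ALLOC_BYTES ((size_t)512 * 1024 * 1024)

/* Hypothetical guard: take the element count as a signed value so that
 * a negative count is still visible, and reject counts that are
 * non-positive or whose total size would exceed MAX_ALLOC_BYTES.
 * Returns NULL when the request is rejected or malloc fails. */
void *checked_alloc(long long count, size_t elem_size)
{
    if (count <= 0 || elem_size == 0)
        return NULL;
    if ((unsigned long long)count > MAX_ALLOC_BYTES / elem_size)
        return NULL;
    return malloc((size_t)count * elem_size);
}
```

Keeping the count signed until after the range check is the design choice here: once the value has been stored in an unsigned variable, a negative input can no longer be told apart from a legitimately huge request.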
|
From: Philippe W. <phi...@sk...> - 2012-04-17 18:01:02
|
On Tue, 2012-04-17 at 10:40 +0200, Folkert van Heusden wrote:
> I think I found it.
> When developing this application I decided that it would be cleaner to
> use unsigned ints when applicable.
> That's indeed very clean but a hell when fixing bugs: when something
> goes wrong and a negative value is put in such an unsigned integer, a
> very large positive value is set in reality. For example: unsigned
> short x = -1; would give x=65535.
> So when this happens, my program would allocate gigabytes of ram. And
> since I used --malloc-fill=, valgrind would then initialize this ram
> (I'm speculating here) causing big time swapping. I found this out by
> disabling swap memory.

To verify that this is the problem, you might use --trace-malloc=yes,
and look at the last trace printed before the system stops responding.
Note also that if that is the problem, either ulimit -d
or ulimit -m should provide protection and make the thing fail.

> So either I'm totally wrong and something else is going wrong or it
> might be nice to implement "lazy malloc fill" which initializes pages
> to that value only when a pagefault occurs. Might help overcommit as
> well.

In your case, wouldn't that only hide a (real) bug?
It would be difficult to implement such a page fault handler at Valgrind
level, from what I understand of Valgrind and linux.

Philippe
|
From: Folkert v. H. <fol...@gm...> - 2012-04-17 18:16:57
|
> > So when this happens, my program would allocate gigabytes of ram. And
> > since I used --malloc-fill=, valgrind would then initialize this ram
> > (I'm speculating here) causing big time swapping. I found this out by
> > disabling swap memory.
> To verify that this is the problem, you might use --trace-malloc=yes,
> and see the last trace before the thing does not respond anymore.
> Note that also that if that is the problem, either ulimit -d
> or ulimit -m should give a protection and make the thing fail.

The trace malloc (I have that one) never ends up on disk.
I also tried the -d and -m but neither helped. So hmmm, maybe then it is
some other problem.

> > So either I'm totally wrong and something else is going wrong or it
> > might be nice to implement "lazy malloc fill" which initializes pages
> > to that value only when a pagefault occurs. Might help overcommit as
> > well.
>
> In your case, wouldn't that only hide a (real) bug ?

Well, no: it would fill the memory more slowly, which I can then monitor
with 'top'.

> It would be difficult to implement such a page fault handler at Valgrind
> level, from what I understand from Valgrind and linux.

From what I read on wikipedia, Valgrind runs things in a virtual
machine and from my experience (wrote an MSX (z80) emulator once,
no twice) you can emulate everything, maybe a tad slow.
|
From: Philippe W. <phi...@sk...> - 2012-04-17 18:37:41
|
On Tue, 2012-04-17 at 20:16 +0200, Folkert van Heusden wrote:
> From what I read on wikipedia, Valgrind runs things in a virtual
> machine and from my experience (wrote an MSX (z80) emulator once,
> no twice) you can emulate everything, maybe a tad slow.

Valgrind provides a simulated cpu, but not a simulated OS and
simulated mmu etc etc.
In other words, Valgrind runs a "unix application process" on
top of a virtual cpu, Valgrind does not provide a virtual
machine like kvm or Xen or ...

Philippe
|
From: Folkert v. H. <fol...@gm...> - 2012-04-17 19:02:31
|
> Valgrind provides a simulated cpu, but not a simulated OS and
> simulated mmu etc etc.
> In other words, Valgrind runs a "unix application process" on
> top of a virtual cpu, Valgrind does not provide a virtual
> machine like kvm or Xen or ...

hmmm ok.
it seems it can't handle corruptions that nicely:

==21521==    at 0x6A39957: ioctl (syscall-template.S:82)
==21521==    by 0x40A8B44: ukiCreateContext (in /usr/lib/x86_64-linux-gnu/libatiuki.so.1.0)
==21521==    by 0xF808A35: ??? (in /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so)
==21521==    by 0x1016A3AF: ???
==21521==    by 0x10169467: ???
==21521==    by 0x1016956F: ???
==21521==    by 0xF862E0F: ??? (in /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so)
==21521==    by 0xF: ???
==21521==  Address 0x7feff2528 is on thread 1's stack
==21521==  Uninitialised value was created by a stack allocation
==21521==    at 0xDEE4168: ??? (in /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so)
==21521==
==21521== Conditional jump or move depends on uninitialised value(s)
==21521==    at 0xF8704A4: ??? (in /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so)
==21521==    by 0xF86FC03: ??? (in /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so)
==21521==    by 0xF86FE7C: ??? (in /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so)
==21521==    by 0xF8615E7: ??? (in /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so)
==21521==  Uninitialised value was created by a stack allocation
==21521==    at 0xF87040F: ??? (in /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so)
...
==21521== Conditional jump or move depends on uninitialised value(s)
==21521==    at 0xEA323F9: ??? (in /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so)
==21521==    by 0xEA32477: ??? (in /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so)
==21521==    by 0x1101: ???
==21521==    by 0x3800274F: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
==21521==    by 0x10CC42AF: ???
==21521==  Uninitialised value was created by a heap allocation
...
==21521== Conditional jump or move depends on uninitialised value(s)
==21521==    at 0xDFA9581: ??? (in /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so)
==21521==    by 0x4A: ???
==21521==    by 0x77F: ???
==21521==    by 0x4AF: ???
==21521==  Uninitialised value was created by a heap allocation
==21521==    at 0x402894D: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21521==    by 0xE813750: ??? (in /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so)
==21521==    by 0x7FEFFFD4F: ???
==21521==    by 0x12FEFFFD67: ???
==21521==    by 0x1FEFFFD5F: ???
==21521==    by 0x103BB88F: ???
==21521==    by 0x7FEFFFD8F: ???
==21521==    by 0x59EF7D: ??? (in /home/folkert/Projects/sysopview/trunk/sysopview)
==21521==    by 0x109B088F: ???
==21521==    by 0xEC398BA: ??? (in /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so)
==21521==    by 0x1101: ???
==21521==    by 0x3800274F: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
...
==21521== Invalid write of size 1
==21521==    at 0x402A788: memcpy (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21521==    by 0xF867F95: ??? (in /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so)
==21521==    by 0xFFFF1DFFFFFFFFFF: ???
==21521==    by 0xFFFF1EFFFFFFFFFF: ???
==21521==    by 0x1FFFFFFFF: ???
==21521==    by 0x205: ???
==21521==    by 0x200000000: ???
==21521==    by 0x2: ???
==21521==  Address 0x7f19655e45ab is not stack'd, malloc'd or (recently) free'd

etc.
|
From: Philippe W. <phi...@sk...> - 2012-04-17 19:51:07
|
On Tue, 2012-04-17 at 21:02 +0200, Folkert van Heusden wrote:
> > Valgrind provides a simulated cpu, but not a simulated OS and
> > simulated mmu etc etc.
> > In other words, Valgrind runs a "unix application process" on
> > top of a virtual cpu, Valgrind does not provide a virtual
> > machine like kvm or Xen or ...
>
> hmmm ok.
> it seems it can't handle corruptions that nicely:

Not too sure I understand. The msgs from Valgrind below
indicate (probable/possible) bugs. Apart from reporting
the error, the behaviour is (usually) not influenced too much
(compared to a native execution). malloc-fill might cause
bigger differences, in case uninitialised memory is used.

Or do you mean the stack trace is not that good/clear?
Maybe gdb with the Valgrind gdbserver will give better stack traces?

Philippe

>
> ==21521==    at 0x6A39957: ioctl (syscall-template.S:82)
> ==21521==    by 0x40A8B44: ukiCreateContext (in /usr/lib/x86_64-linux-gnu/libatiuki.so.1.0)
> ==21521==    by 0xF808A35: ??? (in /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so)
> ==21521==    by 0x1016A3AF: ???
> ==21521==    by 0x10169467: ???
> ==21521==    by 0x1016956F: ???
> ==21521==    by 0xF862E0F: ??? (in /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so)
> ==21521==    by 0xF: ???
> ==21521==  Address 0x7feff2528 is on thread 1's stack
> ==21521==  Uninitialised value was created by a stack allocation
> ==21521==    at 0xDEE4168: ??? (in /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so)
|
From: Folkert v. H. <fol...@gm...> - 2012-04-17 19:55:44
|
Hi Philippe,

>> hmmm ok.
>> it seems it can't handle corruptions that nicely:
> Not too sure I understand. The below msgs from Valgrind
> are indicating (probable/possible) bugs. Apart of reporting
> the error, the behaviour is (usually) not influenced too much
> (compared to a native execution). malloc-fill might cause
> bigger differences, in case non initialised memory is used.

The problem I see is that the stacktraces seem to be incorrect.
For example:

==21521== Invalid write of size 1
==21521==    at 0x402A788: memcpy (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21521==    by 0xF867F95: ??? (in /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so)
==21521==    by 0xFFFF1DFFFFFFFFFF: ???
==21521==    by 0xFFFF1EFFFFFFFFFF: ???
==21521==    by 0x1FFFFFFFF: ???
==21521==    by 0x205: ???
==21521==    by 0x200000000: ???
==21521==    by 0x2: ???
==21521==  Address 0x7f19655e45ab is not stack'd, malloc'd or (recently) free'd

This happened on a system with an ati card with the fglrx driver.
On my laptop with intel video chipset it does not.
Hmmm.
|
From: Philippe W. <phi...@sk...> - 2012-04-17 20:00:44
|
On Tue, 2012-04-17 at 21:55 +0200, Folkert van Heusden wrote:
> The problem I see is that the stacktraces seem to be incorrect.

The gdb unwinder might work better: try the Valgrind gdbserver
(pass --vgdb-error=0 to Valgrind, follow the instructions it prints
to attach gdb, and then 'continue' your process until the error
is encountered).

> For example:
>
> ==21521== Invalid write of size 1
> ==21521==    at 0x402A788: memcpy (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==21521==    by 0xF867F95: ??? (in /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so)
> ==21521==    by 0xFFFF1DFFFFFFFFFF: ???
> ==21521==    by 0xFFFF1EFFFFFFFFFF: ???
> ==21521==    by 0x1FFFFFFFF: ???
> ==21521==    by 0x205: ???
> ==21521==    by 0x200000000: ???
> ==21521==    by 0x2: ???
> ==21521==  Address 0x7f19655e45ab is not stack'd, malloc'd or (recently) free'd
>
> This happened on a system with an ati card with the fglrx driver.
> On my laptop with intel video chipset it does not.
> Hmmm.