From: Philippe W. <phi...@sk...> - 2022-08-31 22:14:27
On Wed, 2022-08-31 at 17:42 +0000, Bresalier, Rob (Nokia - US/Murray Hill) wrote:
> > When running memcheck on a massive monolith embedded executable
> > (237MB stripped, 1.8GiB unstripped), after I stop the executable under
> > valgrind I see the "HEAP SUMMARY" but then valgrind dies before any leak
> > reports are printed. The parent process sees that the return status of
> > memcheck is that it was SIGKILLed (status returned in waitpid call is '9').
>
> We found that removing a call to _exit(0) made it so that valgrind is no longer
> SIGKILLED.
>
> Any ideas why using _exit(0) may get rid of valgrind getting SIGKILLed?
>
> Previously exit(0) was called, without the leading underscore, but changed it to
> _exit(0) to really make sure no memory was being deallocated. This worked well on a
> different process, so we carried it over to this one, that is why we did it.
>
> Even with exit(0) (no underscore), in this process there is not much deallocation going
> on in exit handlers, so have lots of doubts that valgrind/memcheck was using too much
> memory and invoking the OOM killer.
>
> Using strace and dmesg while we had _exit(0) in use didn't show that OOM killer was
> SIGKILLing valgrind.
>
> I also tried reducing number of callers from 12 to 6 when using _exit(0), still got the
> SIGKILL.
>
> Also tried using a system that had an additional 4GByte of memory, and also got the
> SIGKILL there.
>
> So I have many doubts that Valgrind was getting SIGKILLed due to too much memory usage.
>
> Don't know why removing _exit(0) got rid of the SIGKILL. Was wondering if anyone had any
> ideas?

Normally, if it is the OOM killer that kills a process, you should find a trace of this
in the system logs.

I do not understand what you mean by reducing the number of callers from 12 to 6.
What are these callers? Are they threads of the process you are running under valgrind?

And just in case: are you using the latest version of Valgrind?
You might use "strace" on valgrind to see what is going on at the time _exit(0) is called.
You might also start valgrind with some debug trace, e.g. -d -d -d -d -v -v -v -v

Philippe