Re: [Valgrind-users] memcheck is getting SIGKILLed before leak report is output

On Wed, 2022-08-31 at 17:42 +0000, Bresalier, Rob (Nokia - US/Murray Hill) wrote:
> > When running memcheck on a massive monolith embedded executable
> > (237MB stripped, 1.8GiB unstripped), after I stop the executable under
> > valgrind I see the "HEAP SUMMARY" but then valgrind dies before any leak
> > reports are printed. The parent process sees that the return status of
> > memcheck is that it was SIGKILLed (status returned in waitpid call is '9').
> 
> We found that removing a call to _exit(0) made it so that valgrind is no longer
> SIGKILLed.
> 
> Any ideas why removing _exit(0) may get rid of valgrind getting SIGKILLed?
> 
> Previously exit(0) was called, without the leading underscore, but we changed it to
> _exit(0) to really make sure no memory was being deallocated. This worked well on a
> different process, so we carried it over to this one.
> 
> Even with exit(0) (no underscore), in this process there is not much deallocation going
> on in exit handlers, so I have serious doubts that valgrind/memcheck was using too much
> memory and invoking the OOM killer.
> 
> Using strace and dmesg while we had _exit(0) in use didn't show that the OOM killer was
> SIGKILLing valgrind.
> 
> I also tried reducing the number of callers from 12 to 6 when using _exit(0), and still got
> the SIGKILL.
> 
> I also tried using a system that had an additional 4 GByte of memory, and got the
> SIGKILL there as well.
> 
> So I have many doubts that Valgrind was getting SIGKILLed due to too much memory usage.
> 
> Don't know why removing _exit(0) got rid of the SIGKILL. Was wondering if anyone had any
> ideas?

Normally, if it is the OOM killer that kills a process, you should find a trace of this in
the system logs.
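
It may also be worth confirming in the parent that the '9' really is a
terminating signal and not an exit code. Below is a minimal sketch in C of
how the status from waitpid() can be decoded with the standard macros. The
helper names describe_wait_status() and run_child() are hypothetical, made
up for this illustration, and the child here is a trivial stand-in, not
valgrind itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: write a human-readable description of a raw
 * waitpid() status into buf. Returns the terminating signal number,
 * or 0 if the child was not terminated by a signal. */
int describe_wait_status(int status, char *buf, size_t len)
{
    if (WIFSIGNALED(status)) {
        int sig = WTERMSIG(status);
        snprintf(buf, len, "killed by signal %d", sig);
        return sig;
    }
    if (WIFEXITED(status)) {
        snprintf(buf, len, "exited with code %d", WEXITSTATUS(status));
        return 0;
    }
    snprintf(buf, len, "other status 0x%x", (unsigned)status);
    return 0;
}

/* Hypothetical stand-in for the real child: fork a process that either
 * kills itself with SIGKILL or exits with code 9, and return the raw
 * status reported by waitpid() (or -1 on error). */
int run_child(int kill_self)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (kill_self)
            raise(SIGKILL);   /* does not return */
        _exit(9);             /* plain exit code 9, no signal involved */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```

Passing the status of both variants through describe_wait_status() shows
the difference between "killed by signal 9" and "exited with code 9",
which a bare print of the status or of a shell return code can blur.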

I do not understand what you mean by reducing the number of callers from 12 to 6.
What are these callers? Are they some threads of the process you are
running under valgrind?

And just in case: are you using the latest version of Valgrind?

You might use "strace" on valgrind to see what is going on at the time _exit(0) is called.
You might also start valgrind with some debug tracing, e.g. -d -d -d -d -v -v -v -v
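
When comparing the two variants, keep in mind what differs on the
application side: exit() runs the handlers registered with atexit() and
flushes stdio buffers, while _exit() terminates the process immediately
without doing either. The following C sketch makes that difference
observable. The names atexit_handler_ran() and on_exit_handler() are
hypothetical, and this only illustrates the libc behaviour, not what
valgrind does internally.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write end of the pipe used by the child's exit handler. */
static int handler_fd = -1;

/* Hypothetical exit handler: signals the parent that it ran. */
static void on_exit_handler(void)
{
    if (handler_fd >= 0 && write(handler_fd, "x", 1) != 1) {
        /* nothing useful to do on failure while exiting */
    }
}

/* Fork a child that registers an atexit() handler and then terminates
 * with exit(0) or _exit(0). Returns 1 if the handler ran, 0 if it did
 * not, and -1 on error. */
int atexit_handler_ran(int use_underscore_exit)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        handler_fd = fds[1];
        atexit(on_exit_handler);
        if (use_underscore_exit)
            _exit(0);   /* skips atexit handlers and stdio flushing */
        exit(0);        /* runs atexit handlers, then terminates */
    }

    close(fds[1]);
    char c;
    ssize_t n = read(fds[0], &c, 1);   /* 0 means EOF: handler never wrote */
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n == 1 ? 1 : (n == 0 ? 0 : -1);
}
```

Calling atexit_handler_ran(0) and atexit_handler_ran(1) from a small
main() gives a self-contained test case that could also be run under
valgrind and strace, which may be easier to analyse than the full
executable.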

Philippe