From: Jeremy F. <je...@go...> - 2005-01-20 06:45:46
On Thu, 2005-01-20 at 13:06 +1100, Eyal Lebedinsky wrote:
> I will repeat - this was not a problem until recently. I am rather sure the stable 2.2.0
> gives good backtraces.

Oh, I believe you, but I don't think anything has changed recently which
would have affected this; at least not for calloc.

> I would like to offer another observation. I just created a simple program
> in an attempt to demonstrate the laconic report problem. Instead, it crashed
> (sig 11) on a return.
>
> After repeating it a few times, I noticed that my big test is hanging again.
> I killed it and deleted the semaphore it hold (somehow it is never released
> after a crash).
>
> The tiny test program now works (no sig 11).
>
> Is it possible that vg uses some semaphore that all instances share and it
> gets into trouble after a while? My test suit always fails after a number of
> tests finish successfully, and every program thereafter gets sig 11. Every
> single valgrind run. If I kill everything (and remove [ipcrm -s] my own
> semaphore that my tests use) then I can continue with the tests (well, at
> least for a while).

Valgrind doesn't use semaphores itself, and it should just be passing
your syscalls through to the kernel untouched. It also respects the
CLONE_SYSVSEMA flag, so that should be OK.

I'll note that both the FC2 and SUSE 9.2 2.6 kernels seem to sporadically
deliver signals without proper siginfo information. That will cause your
program to spontaneously SIGSEGV when it tries to grow the stack, which
almost every program will need to do. The kernel will stay in this state
for some indeterminate amount of time, but then will spontaneously start
working again.

You can test for this state by running none/tests/faultstatus (natively,
not under Valgrind). If it doesn't pass everything, then your kernel is
in a buggy state.

I have never seen this with stock kernel.org kernels. Is your kernel a
Debian-supplied one, or one you've built yourself?

	J
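
For anyone who doesn't have the Valgrind source tree handy, the kind of siginfo check described above can be roughly sketched as follows. This is not the faultstatus test itself, just a minimal illustration assuming Linux: it faults on a PROT_NONE page and checks that the handler sees the expected si_code and si_addr. The function name siginfo_looks_sane is made up for this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for MAP_ANONYMOUS with strict -std= modes */
#endif
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf recover;
static volatile sig_atomic_t seen_code;   /* si_code reported by the kernel */
static void *volatile seen_addr;          /* si_addr reported by the kernel */

/* Record what the kernel put in siginfo, then jump back out of the fault. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    seen_code = info->si_code;
    seen_addr = info->si_addr;
    siglongjmp(recover, 1);
}

/*
 * Hypothetical helper (not part of Valgrind).
 * Returns 1 if the fault was reported with SEGV_ACCERR and the right
 * address, 0 if the siginfo looks wrong, -1 if the setup itself failed.
 */
int siginfo_looks_sane(void)
{
    long pagesz = sysconf(_SC_PAGESIZE);
    if (pagesz <= 0)
        return -1;

    /* A mapped page with no access rights: touching it must fault. */
    char *page = mmap(NULL, (size_t)pagesz, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;

    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;              /* ask for the siginfo_t argument */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0) {
        munmap(page, (size_t)pagesz);
        return -1;
    }

    seen_code = 0;
    seen_addr = NULL;
    if (sigsetjmp(recover, 1) == 0)
        *(volatile char *)page = 1;        /* deliberate fault */

    /* Restore the previous handler and release the page. */
    sigaction(SIGSEGV, &old, NULL);
    munmap(page, (size_t)pagesz);

    return seen_code == SEGV_ACCERR && seen_addr == (void *)page;
}
```

Call it from a small main() and run the result natively, not under Valgrind. A return of 0 would suggest the kernel is handing out bad siginfo, though this covers only one fault type, so a return of 1 doesn't prove the kernel is healthy.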