From: Richard C. <ric...@pr...> - 2005-08-11 08:55:31
Thanks to all who replied.

I already had a copy of memtest86 and I have run it a few times
overnight without any memory failures. I hadn't heard of Prime95
before, but I ran it at higher priority for 5 hours without a failure.

During the Prime95 run there were many messages about the CPU
temperature being high, but as Julian Seward mentioned, the CPU
thermal management must be working.

The only other difference with this machine is that the HD is SATA,
but I now consider it a long shot that the bug is not present in our
code! :(

Again, thanks for the advice. When (if) I track down the problem I'll
let you know, especially if it's something that valgrind may be able
to catch.

Regards,

Richard

Richard Corden wrote:
>
> Hi,
>
> One of our tools fails some tests on one of our machines. The
> failures are consistent and reproducible, they are seg 11s, and in
> some cases I get a message from glibc.
>
> *** glibc detected *** free(): invalid pointer: 0x082587f4 ***
>
> These messages always have the same memory address.
>
> The interesting thing is that when I run the same test with valgrind,
> I don't get any failures, and the tools pass the test as expected.
> Initially 'valgrind' found a 'memcpy' with overlapping memory which is
> now fixed, but other than that there were no other issues when using
> --tool=memcheck.
>
> The machine is relatively new (Intel Xeon Hyper-Threading P4 3.2GHz)
> and I have had problems keeping it cool. I do get a lot of messages
> from syslogd about passing the temperature threshold.
>
> I need to find out whether it is the overheating of the CPU or
> something else that causes the failure.
>
> My question is, what could valgrind be doing that might stop the
> problem from occurring?
>
> I've been running using --tool=memcheck, should I try something else?
>
>
> Regards,
>
>
> Richard
>

--
Richard Corden
Programming Research Ltd.
ric...@pr...
+ 44 845 0048478