|
From: Nick L. <ni...@io...> - 2005-10-17 14:22:26
|
Hi all
Looking for any hints on why valgrind may be reporting hundreds of bogus
errors in some C++/C code, compared to purify working and reporting
flawlessly on the same code on the same machine.
The code was tested with both older valgrinds and the latest valgrind 3 on
a code base with openssh, libssh2, and other code, and has been giving not
only vast numbers of errors, but plenty of error reports that are
demonstrably bogus. It took a couple of hours to produce a suppressions
file that would suppress enough to be able to see any possible genuine
errors, but even then, the genuine bugs in libssh2, for example, weren't
caught as far as we could see. Only purify gave the true picture, finding
the problems in libssh2, and only a few other UMR reports related to
openssh crypto functions that are actually ok.
I know from previous experience that valgrind can be good, so does anyone
have any ideas why in this case it was so disastrously off base? Files
were compiled with -O0 as recommended, and tested both on a 2.2 kernel with
gcc 2.95.3 (with an old valgrind), and with the latest valgrind on a
2.4.20-31.9 kernel with gcc 3.2.2.
One simple failing example was:
    n = libssh2_channel_read(ch->channel_ptr(), buf, bufsize - 1);
    if (n < 0) {
        //
    }
where n was sometimes reported as being uninitialised on the conditional.
This was provably incorrect, and verified both by purify and by having a
correct return result, with all code paths inside the function
initialising the return value, and with values that were themselves always
initialised, so there was no propagation of uninitialised data. Some of
the libssh code was also rewritten to move automatics defined in
inner blocks to the outermost level, just in case overlapping autos were
in any way a problem, but as expected, this made no difference.
Sorry if I missed something such as valgrind not working with C++
binaries, but I don't believe that there's any problem there.
Any suggestions to try are welcome!
Nick
|
|
From: Nicholas N. <nj...@cs...> - 2005-10-17 15:16:32
|
On Mon, 17 Oct 2005, Nick Lindridge wrote:
> Looking for any hints on why valgrind may be reporting hundreds of bogus
> errors in some C++/C code, compared to purify working and reporting
> flawlessly on the same code on the same machine.
>
> The code was tested with both older valgrinds and the latest valgrind 3 on
> a code base with openssh, libssh2, and other code,

The only thing I can think of is that I think SSL (which ssh uses?)
deliberately uses uninitialised memory as a source of entropy for some
random number generator, or something like that. So it's possible that
these are propagating throughout the program in various ways.

If it is the case, the way to fix it would be to add client requests to
SSL (eg. VALGRIND_MAKE_READABLE) to mark the uninitialised memory as being
initialised.

Nick |
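The client-request idea above could look roughly like the sketch below. It is only an illustration under stated assumptions: `seed_from_heap_garbage` and the `MARK_DEFINED` wrapper are hypothetical names, not code from OpenSSL, openssh, or libssh2. The wrapper uses VALGRIND_MAKE_READABLE when `<valgrind/memcheck.h>` provides it, tries the newer name VALGRIND_MAKE_MEM_DEFINED first on the assumption that later Valgrind releases renamed the request, and falls back to a no-op when the header is not installed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Pull in the Memcheck client requests if the header is installed.
   When it is missing, MARK_DEFINED() becomes a no-op so this still builds. */
#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#  endif
#endif

#if defined(VALGRIND_MAKE_MEM_DEFINED)      /* assumed newer name */
#  define MARK_DEFINED(p, n) ((void)VALGRIND_MAKE_MEM_DEFINED((p), (n)))
#elif defined(VALGRIND_MAKE_READABLE)       /* name used in this thread */
#  define MARK_DEFINED(p, n) ((void)VALGRIND_MAKE_READABLE((p), (n)))
#else
#  define MARK_DEFINED(p, n) ((void)(p), (void)(n))
#endif

/* Hypothetical stand-in for an RNG seeding routine that deliberately folds
   whatever bytes happen to be in a fresh heap block into its seed. */
static unsigned seed_from_heap_garbage(size_t len)
{
    unsigned char *pool;
    unsigned seed = 0;
    size_t i;

    assert(len > 0);
    pool = malloc(len);          /* contents intentionally left unset */
    if (pool == NULL)
        return 0;

    /* Tell Memcheck to treat the pool as initialised, so the garbage does
       not taint every value later derived from the seed. */
    MARK_DEFINED(pool, len);

    for (i = 0; i < len; i++)
        seed = (seed * 31u + pool[i]) & 0xFFFFu;

    free(pool);
    return seed;
}
```

Outside Valgrind the client request does nothing, so the only behavioural change is in what Memcheck considers defined. The request has to be placed in the library that reads the uninitialised bytes, before the first read.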
|
From: Julian S. <js...@ac...> - 2005-10-17 22:38:16
|
As a general point, the fact that Purify reports no error and V does
report one does not necessarily mean that V is wrong. In reality both
systems have to approximate reality to some extent. It's not difficult to
write a program with an error which Purify misses but Valgrind doesn't.

It's arguable that V does more accurate value tracking than Purify, and
experience tends to show that, for the most part, when it complains about
something it is correct. Sure, it's possible to fool it into reporting
nonexistent problems re uses of uninitialised variables, but for real
production code (as opposed to carefully constructed test cases) this
tends to be fairly rare.

J |
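To illustrate the value-tracking point: Memcheck follows definedness through copies and returns, and complains only where an undefined value decides something, typically with "Conditional jump or move depends on uninitialised value(s)". A report on a line like `if (n < 0)` can therefore originate in a path far from the call site. The sketch below uses hypothetical names (`toy_channel_read`, `toy_caller`) and is not libssh2 code; it only shows the shape of such a bug, with a single path that forgets to store the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reader: returns a byte count, but one path forgets to
   store a result. */
static int toy_channel_read(int simulate_timeout, char *buf, size_t size)
{
    int *result = malloc(sizeof *result);   /* heap slot, starts undefined */
    int n;

    assert(buf != NULL && size > 0);
    if (result == NULL)
        return -1;

    if (!simulate_timeout) {
        const char *msg = "hello";
        size_t len = strlen(msg);
        if (len > size)
            len = size;
        memcpy(buf, msg, len);
        *result = (int)len;
    }
    /* Bug: on the timeout path *result was never written. */

    n = *result;      /* copying an undefined value is not reported */
    free(result);
    return n;         /* returning it is not reported either */
}

/* Caller with the same shape as the snippet from the original post. */
static int toy_caller(int simulate_timeout)
{
    char buf[16];
    int n = toy_channel_read(simulate_timeout, buf, sizeof buf - 1);

    if (n < 0)        /* Memcheck reports here when n is undefined */
        return -1;
    return n;
}
```

Calling `toy_caller(0)` takes the well-defined path and returns 5. Calling `toy_caller(1)` under Memcheck would be expected to produce the report at the conditional, even though the missing store is inside the callee.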