From: Nicholas N. <nj...@ca...> - 2004-10-21 12:15:30
Hi,
Julian and I are looking at writing a paper about Memcheck, in particular
describing the uninitialised value checking (V bit checking). In papers
about error detection tools, it's not easy to provide evidence that a tool
actually finds real, useful bugs.
So we are wondering if anyone has any examples from their use of
Valgrind/Memcheck that they would be willing to tell us about for possible
inclusion?
In particular, it would be great to know of examples of uninitialised
value bugs found with Memcheck, eg. found with one of the
following error messages:
"Syscall param write(buf) contains uninitialised or unaddressable byte(s)"
"Conditional jump or move depends on uninitialised value(s)"
"Use of uninitialised value of size 4"
(as opposed to errors about unaddressable memory, or memory leaks).
In particular, examples with one or more of the following characteristics
would be really helpful:
- the bug is in a large/well known/widely used piece of software
- the bug would not have been found without Memcheck
- the bug had been in the code for a long time
- the bug is really subtle (eg. where Memcheck detected
the use of a single uninitialised bit)
- you can provide a small snippet of code that demonstrates the bug.
- anything else that makes it impressive that Memcheck found it
Eg. a perfect example would be something like "In our large system with
10000 users, Memcheck found the following bug whereby an incorrect
bitfield operation left 1 bit uninitialised ... <insert 5 line code
snippet> ... this bug had been causing mysterious, random crashes when
using large datasets; the code had been in the source code for 18 months,
and had passed 5 code reviews."
That example is perhaps unrealistic, but anything roughly like that would
be extremely helpful :)
Alternatively, if you have any kind of more general information or
statistics about Memcheck finding errors (particularly uninitialised value
errors) in your software, that might also be helpful to know about. See
www.kegel.com/openoffice/valgrindingOOo.html for an example of the kind of
thing that would be really useful.
You can send examples directly to me, or to the list if you are happy to
share them (but be aware that we want to put them in a paper that will
hopefully be published, so don't for example tell me things you don't want
to become public knowledge).
We'd really appreciate any help you can give us on this. Thanks very
much.
N
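
An illustrative aside on the "single uninitialised bit" case mentioned above: the following is a toy model of bit-precise definedness tracking, written for this thread and not taken from Memcheck's source. The type `Shadowed`, the helper names, and the convention that a shadow bit of 1 means "undefined" are all assumptions made for the sketch. It shows how a wrong mask can leave exactly one bit undefined, and why a later test of only the defined bits would not need to be reported.

```c
#include <stdint.h>

/* Toy model only: a 32-bit value paired with one shadow bit per value bit.
 * Assumed convention for this sketch: shadow bit 1 = undefined, 0 = defined. */
typedef struct {
    uint32_t value;
    uint32_t vbits;
} Shadowed;

Shadowed defined_value(uint32_t v)
{
    Shadowed s = { v, 0u };            /* every bit is known */
    return s;
}

Shadowed undefined_value(void)
{
    Shadowed s = { 0u, 0xFFFFFFFFu };  /* no bit is known */
    return s;
}

/* AND: a result bit is defined if both inputs are defined there,
 * or if either input is a defined 0 (which forces the result to 0). */
Shadowed shadow_and(Shadowed a, Shadowed b)
{
    Shadowed r;
    r.value = a.value & b.value;
    r.vbits = (a.vbits | b.vbits) & (a.value | a.vbits) & (b.value | b.vbits);
    return r;
}

/* OR: a result bit is defined if both inputs are defined there,
 * or if either input is a defined 1 (which forces the result to 1). */
Shadowed shadow_or(Shadowed a, Shadowed b)
{
    Shadowed r;
    r.value = a.value | b.value;
    r.vbits = (a.vbits | b.vbits) & (~a.value | a.vbits) & (~b.value | b.vbits);
    return r;
}

/* In this model, a branch on x is flagged only if some bit of x is undefined. */
int branch_would_be_reported(Shadowed x)
{
    return x.vbits != 0u;
}
```

For example, starting from `undefined_value()`, an AND with the mask `0x80` (where `0` was intended) followed by an OR with `0x05` leaves `vbits == 0x80`: only bit 7 is undefined. Branching on the whole word is flagged in this model, while branching on the word masked with `0x7F` is not.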
From: Kevin P. <pu...@pu...> - 2004-10-31 18:32:34
Well, I would have one really good one from some years ago, only I finally
tracked it down a few weeks before I first heard of valgrind.

The symptom was that sometimes, on some machines, KDE apps started through
kdeinit would not use font antialiasing. Debugging builds rarely (if ever)
showed this behavior. Applications not launched through kdeinit rarely (if
ever) showed this behavior. Inserting printfs to look at things would make
the bug come and go. Changing load order around (by using LD_BIND_NOW) made
it go away. All the classic heisenbug symptoms, really :-)

I finally identified a function which was returning a wrong value (managed
to reduce the footprint of my debug instrumentation enough that the bug
still occurred, by copying the relevant values to a global and printing it
from somewhere else). Kept drilling down until I made it into Xrender, and
found the culprit. It was a long, painful process of elimination.

Since a paper teaching more people how much valgrind rocks seems like a
cause worth supporting, I've even dug up a few old posts on the subject :-)

http://bugs.debian.org/137145
http://www.winehq.com/hypermail/wine-devel/2002/05/0053.html
http://www.kerneltraffic.org/kde/kde20020301_34.xml

Search keywords if you need more: XRenderFindVisualFormat LD_BIND_NOW and
antialiasing

Nicholas Nethercote wrote:
> In particular, examples with one or more of the following characteristics
> would be really helpful:
>
> - the bug is in a large/well known/widely used piece of software

The bug was in libXrender in XFree86 4.0-4.1.x, specifically in
XRenderFindVisualFormat. That's fairly widely deployed...

> - the bug would not have been found without Memcheck

Can't claim this, since I did find it. But it took a few weeks of
determined digging, and frankly if I hadn't been one of the proud owners of
a machine that did it all the damn time, I'm not sure I would have had the
patience.

> - the bug had been in the code for a long time

I think the Xrender problem had been there since libXrender was introduced.
It had already been 'fixed' - though not found - in 4.2, owing to some
overall code restructuring that resulted in breaking up that function
entirely. The KDE bug report had been open for some months. It was severe
enough at the time that Qt accepted my patch to work around it by
intentionally poisoning a range of stack addresses before calling the
function. Wine appears to have adopted the same workaround, so they
probably got it from Qt (though I've never confirmed that).

> - the bug is really subtle (eg. where Memcheck detected
> the use of a single uninitialised bit)

Can't say this one - it was the whole variable. It was subtle in the code,
though, as there were two variables used for related purposes, and the one
the code was supposed to use was in fact initialized. The code was using
the wrong one, which was not set until later.

> - you can provide a small snippet of code that demonstrates the bug.
>
> - anything else that makes it impressive that Memcheck found

When I tried VG on it a few weeks later it spotted it, and printed it with
a line number. I think I would have appreciated that a week or two sooner
:-) Understandably, I've been a bit of a valgrind advocate ever since.
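
The bug pattern Kevin describes (two related variables, with the code reading the one that is not set until later) can be sketched roughly as follows. This is a hypothetical reconstruction for illustration, not the Xrender code: `VisualFormat`, `find_format`, `count`, `limit`, and the `SHOW_BUG` switch are all invented names. The default build uses the correct variable; building with `-DSHOW_BUG` selects the buggy variant, which reads an uninitialised local and therefore has undefined behaviour.

```c
#include <stddef.h>

/* Hypothetical table entry; names are for illustration only. */
typedef struct {
    int visual_id;
    int format_id;
} VisualFormat;

/* Returns the format for visual_id, or -1 if it is not in the table. */
int find_format(const VisualFormat *table, size_t table_len, int visual_id)
{
    size_t count = table_len;  /* initialised on entry */
    size_t limit;              /* related variable, only assigned further down */
    size_t bound;
    size_t i;

#ifdef SHOW_BUG
    bound = limit;             /* BUG: limit has not been set yet */
#else
    bound = count;             /* correct: use the initialised variable */
#endif

    for (i = 0; i < bound; i++) {  /* the comparison is where the bad value is used */
        if (table[i].visual_id == visual_id)
            return table[i].format_id;
    }

    limit = count;             /* too late for the loop above */
    (void)limit;               /* silence "set but not used" in the correct build */
    return -1;
}
```

In the buggy build, the loop bound is whatever happened to be on the stack, which fits the symptoms above: behaviour that changes with debug builds, added printfs, or load order. Under Memcheck one would expect a "Conditional jump or move depends on uninitialised value(s)" report at the loop comparison, though an optimising compiler may transform the code enough to change what is reported.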