From: Julian S. <js...@ac...> - 2006-09-02 02:19:14
I'm pleased to see that 'drd', Bart's data race detection tool, has
recently progressed to the point where we can start to evaluate it.
A robust, accurate data race detection tool would be an excellent
addition to our tool suite, and drd is making promising noises.
There's still a way to go, however. I've been thinking about why
drd is difficult to use and what can be done to improve it.
When it reports a race, drd says, basically:

   I found a race between two thread segments.
   Here's the first segment: thread X, starting stack S1,
   ending stack E1.
   Here's the second segment: thread Y, starting stack S2,
   ending stack E2.
   Here are the data addresses ("contended addresses") involved in
   the race:
      0xF00, 0xBAR, 0xXYZZY
If you get lucky, it may identify those addresses as global data
symbols or as lying inside malloc'd blocks, but often it doesn't.
That's good as far as it goes. For small programs, that kind
of info is often enough to make sense of the race. But playing with
drd on knode (the threaded news reader in KDE 3.5.4), I find I haven't
a clue what's going on. OK, I'm not familiar with knode (or KDE, or
Qt at all), but nevertheless I look at what drd tells me and am still
completely unable to identify which parts of the sources might
be involved in the race.
What drd tells us is the data addresses involved in the race
(is "contended addresses" a good name for them?). The problem is,
unless you are extremely lucky, the program is tiny, or you
are a total genius, it's nearly impossible to figure out where
in the code these reads/writes are being done.
The DIOTA papers mention the idea of a replay tool associated with
the race detector. In a first run, the race addresses are computed
by a drd-like tool. The program is then re-run under the control of
a "deterministic replay tool", which somehow manages to recreate the
previous execution, the purpose being to monitor all reads/writes so
it can take snapshots of the thread stacks when it detects accesses
to the race addresses. This presumably makes it simple to relate the
racing to source code locations.
It's a nice idea, but I don't think it's practical here. I don't see
how the replay tool can work for interactive applications like web
browsers. And I don't think users will appreciate the hassle.
So here's a different impractical idea :-) When drd is recording
races, don't just record the set of read/written addresses, but
something else too: for each written address in the segment, also
record the program counter of the first (or any) writer of that address.
By definition a race must involve at least one of the threads writing
the contended address(es), so this is guaranteed to produce at least
one source code location involved in the race. Now the error report
might look like this:

   [.. other stuff as before ..]
   Here are the data addresses involved in the race:
      0xF00 (first write in segment 1, abcd.c:124),
      0xBAR (first write in segment 2, snafu.cpp:678),
      0xXYZZY (first write in segment 1, spqr.c:987
               and in segment 2, badness.c:666)
Recording program counters without a stack is of limited use, but I
can't see how to record a complete stack for each first write without
huge space/time overheads. It might be possible to devise a highly
compressed representation for just the PC values, based on the idea
that each segment is only going to use a tiny subset of the 2^32/2^64
possible PC values.
From reading the DIOTA papers I got the impression they did a lot of
work to make DIOTA fast and memory efficient, but I didn't see much
about what the tool was like to use in practice. Maybe I didn't look
in the right places.
J