|
From: Konstantin S. <kon...@gm...> - 2008-01-17 14:45:23
|
Julian, all,
I'd like to return to the discussion of false positives and false negatives
in helgrind (once you have time).
Previously, I sent few examples with wrong reports (or with missing
reports).
Now I have some more examples in one file (see racecheck_unittest.cc,
attached).
The most interesting, I think, are:
test10: (false negative)
// Writer: Reader:
// 1. write(GLOB) a. sleep(long enough so that GLOB
// is most likely initialized by Writer)
// b. read(GLOB)
test11: (false positive)
// Parent: Worker1, Worker2:
// 1. Start(workers) a. read(GLOB)
// 2. MU.Lock() b. MU.Lock()
// 3. while(COND != 2) /-------- c. CV.Signal()
// CV.Wait(&MU) <-------/ d. MU.Unlock()
// 4. MU.Unlock()
// 5. write(GLOB)
test32: (false positive)
// Parent: Writer: Reader:
// 1. Start(Reader) -----------------------\ .
// \ .
// 2. Start(Writer) ---\ \ .
// \---> a. MU.Lock() \--> A. sleep(long enough)
// b. write(GLOB)
// /---- c. MU.Unlock()
// 3. Join(Writer) <---/
// B. MU.Lock()
// C. read(GLOB)
// /------------ D. MU.Unlock()
// 4. Join(Reader) <----------------/
// 5. write(GLOB)
To fix test10 (i.e. to report the actual race) I think we need to
1. distinguish between ExclusiveM and ExclusiveR states. (to be able to
detect the race) *
*2. keep the lockset even for exclusive states (to avoid false positives).
This was missing in the patch I sent you previously. *
*3. add one transition in msm__handle_read: ExclusiveM->ShM (to actually
report the race of the lockset is empty)
To fix test32 (i.e. *not* to report the non-existing race) I think we also
need 1. and 2.
To simplify understanding, here what happens in test32 (helgrind sees events
in this order):
1. TRACE: Parent :: GLOB: NoAccess --> New
2. CREATE Reader
3. CREATE Writer
4. TRACE: wr Writer :: GLOB: New --> ExclusiveW(Writer)
5. JOIN: Writer :: GLOB: does not change
6. TRACE: rd Reader :: GLOB: ExclusiveW(Writer) --> ShM(Reader,
Writer,#Lset=1)
7. JOIN: Reader:: GLOB: ShM(Reader, Writer,#Lset=1) --> ShM(Parent,
Writer,#Lset=1)
8. TRACE: wr Parent :: GLOB: ShM(Parent, Writer,#Lset=1) --> ShM(Parent,
Writer,#Lset=0) # a race is reported
So, to fix this (IMHO) we need to check happens_before during all state
transitions.* *
For this particular case, we need to check that
happens_before(last_segment_of_Writer_where_we accessed_GLOB,
current_segment_of_parent)
and make this transition:
8. TRACE: wr Parent :: GLOB: ShM(Parent, Writer,#Lset=1) --> ExclM(Parent,
#Lset=0)
So, this actually requires keeping thread segments instead of threads in
shared states.
All this will make things slower, but maybe not much. So far I haven't seen
happens_before in the helgrind's profiles.
Alternatively, we may do something tricky at the time we join the writer,
something similar to your 'cv-hack' patch.
I am not sure which is better.
I did not analyze test11 much, bit it looks similar.
Note, that the tests become slightly more complicated as we add
readers/writers.
*To wrap up my proposal: *
- There are now only two states (other than New and NoAccess): ShM and ShR.
- Each SVal has Lset.
- Each SVal has thread-segment-sets instead of thread-sets.
- Memory accessed by just one thread is still ShM/ShR, but with one element
in thread-segment-set (might be optimized to avoid WS calls).
- We check happens_before in all states.
What do you think?
--kcc
P.S. test35 is a unit test for performance issue in
shadow_mem_make_NoAccess.
test34 and test33 are tests that create millions of TSETs/LSETs to
check capacity of SVal.
P.P.S. There is one more crazy test which suggest me that even what I
propose is not enough, see test28.
// Putter1: Getter: Putter2:
// 1. MU.Lock() A. MU.Lock()
// 2. write(GLOB) B.
write(GLOB)
// 3. MU.Unlock() C.
MU.Unlock()
// 4. Q.Put() ---------\ /------- D. Q.Put()
// 5. MU.Lock() \-------> a. Q.Get() / E. MU.Lock
()
// 6. read(GLOB) b. Q.Get() <---------/ F.
read(GLOB)
// 7. MU.Unlock() c. read(GLOB) G.
MU.Unlock()
P.P.P.S Sorry for a long message :)
|