From: Alex B. <ker...@be...> - 2014-05-06 17:17:51

Hi,

I was recently using Valgrind to investigate some issues with QEMU's multi-threading behaviour. One problem was that, without locking around the code generator, multiple threads started accessing the codegen buffer and hilarity ensued.

Is it possible for the DRD/Helgrind tools to detect this sort of double-write access behaviour? Could I instrument QEMU so it marked the codegen buffer as one that should only grow upwards (modulo-patchable bits) so if anything re-wrote the buffer it could trigger an error?

--
Alex Bennée

From: Julian S. <js...@ac...> - 2014-05-07 14:30:38

> Is it possible for the DRD/Helgrind tools to detect this sort of
> double-write access behaviour?

Both of them should be able to detect a write-vs-write race, if that's what you mean.

> Could I instrument QEMU so it marked the
> codegen buffer as one that should only grow upwards (modulo-patchable
> bits) so if anything re-wrote the buffer it could trigger an error?

This is confusing. Both tools are able to detect races at a byte level granularity. If you can show that QEMU doesn't race on individual writes to its code buffer, isn't that good enough from a correctness perspective?

J

From: Alex B. <ker...@be...> - 2014-05-08 08:26:55

Julian Seward <js...@ac...> writes:

>> Is it possible for the DRD/Helgrind tools to detect this sort of
>> double-write access behaviour?
>
> Both of them should be able to detect a write-vs-write race, if
> that's what you mean.

But is that only if two threads race to write the same location at the same time?

>
>> Could I instrument QEMU so it marked the
>> codegen buffer as one that should only grow upwards (modulo-patchable
>> bits) so if anything re-wrote the buffer it could trigger an error?
>
> This is confusing. Both tools are able to detect races at a byte level
> granularity. If you can show that QEMU doesn't race on individual writes
> to its code buffer, isn't that good enough from a correctness
> perspective?

What I think has happened is:

* thread a writes from code_buf_start a series of operations
* thread b writes from code_buf_start over the top of thread a
* thread b finalises the code buffer at code_buf_end<b>
* thread a finalises the code buffer at code_buf_end<a>

The result is a corrupted code buffer where one thread's output has been stomped on by the other. I don't think they raced on the individual writes; they just overwrote the previous work on the next schedule.

>
> J

--
Alex Bennée

From: Julian S. <js...@ac...> - 2014-05-08 10:24:14

On 05/08/2014 10:26 AM, Alex Bennée wrote:
>
> Julian Seward <js...@ac...> writes:
>
>>> Is it possible for the DRD/Helgrind tools to detect this sort of
>>> double-write access behaviour?
>>
>> Both of them should be able to detect a write-vs-write race, if
>> that's what you mean.
>
> But is that only if two threads race to write the same location at the
> same time?

Well, yes. But isn't that what you care about? Or are you concerned about a race at a higher level of granularity?

> What I think has happened is:
>
> (1) thread a writes from code_buf_start a series of operations
> (2) thread b writes from code_buf_start over the top of thread a
> (3) thread b finalises the code buffer at code_buf_end<b>
> (4) thread a finalises the code buffer at code_buf_end<a>

I'm still confused. What does this sequence look like if you add in the lock acquire/release points?

This sounds like it ought to be straightforward, but you need to say what locks exist, what they protect, and when they are supposed to be acquired/released.

In the absence of further info I'd assume that there is just one lock, which protects the entire code buffer and any "finalisation info". And that is supposed to be acquired and held for the entire period in which a thread writes to and finalises the code buffer. In which case the above sequence can't happen, since (2) will block as soon as it tries to acquire the lock.

J