From: Nicholas N. <nj...@ca...> - 2003-05-26 10:05:25
On Mon, 26 May 2003, Josef Weidendorfer wrote:

> currently I'm thinking a little bit of what would be needed to allow
> applications run under Valgrind to use processors in parallel. The main goal
> would be to speed up cache simulation for multithreaded applications, more
> specifically first to let OpenMP apps (number crunching) run simultaneously.
> I'm not at all convinced if there will be any benefit/speedup at all on
> multiple processors because of a possible need for additional fine-grained
> communication among the threads.
>
> So perhaps it's simply not worth it.
> To come to this conclusion faster, I wanted to ask you for the problems you
> see in this for the Valgrind core framework.
>
> As I see it: all global data structures accessible by multiple threads either
> must be avoided or locked on access.
> * Could the instrumentation engine/translation table be separated for each
> thread? This would duplicate translation for each thread, but would avoid
> synchronisation on accessing the translation hash table.
> * V memory allocation functions have to be multithread-aware.
> * Signal handling? Is there anything special that I have overlooked?
> * What's with Valgrind's version of the pthread library? Do you think that it's
> a big task to make this reentrant-safe? Or perhaps we even could get rid of
> our own implementation?

Julian and I have discussed this, and AFAWCT the killer point is shadow
memory for Memcheck -- each time memory is written, shadow memory is written
a few instructions before. The danger is that if two threads are racing on a
memory word, you could get a thread switch in between the shadow write and
the real write, and then your shadow memory would not match your real memory.
The problem would be avoided if we could guarantee that thread switches only
occur between basic blocks.

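To make the race concrete, here is a toy simulation of the two schedules. It is only a sketch under assumptions: the `Word` class, the `shadow_write`/`real_write` split and the schedule lists are hypothetical names invented for illustration, not Valgrind code, and the shadow state is reduced to a single defined/undefined flag per word.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Toy model of one memory word plus a Memcheck-style shadow flag."""
    value: int = 0
    really_defined: bool = False  # ground truth, known only to this simulation
    shadow_defined: bool = False  # what the shadow memory claims


def shadow_write(word, defined):
    """First half of an instrumented store: update the shadow state."""
    word.shadow_defined = defined


def real_write(word, value, defined):
    """Second half of an instrumented store: update the real memory."""
    word.value = value
    word.really_defined = defined


def run(schedule):
    """Apply a list of (step, args) pairs to a fresh word and return it."""
    word = Word()
    for step, args in schedule:
        step(word, *args)
    return word


def is_consistent(word):
    """True if the shadow flag agrees with the real state of the word."""
    return word.shadow_defined == word.really_defined


# Hypothetical thread A stores a defined value; thread B stores an undefined one.
A_SHADOW = (shadow_write, (True,))
A_REAL = (real_write, (42, True))
B_SHADOW = (shadow_write, (False,))
B_REAL = (real_write, (0, False))

# Thread switch between A's shadow write and A's real write.
MID_BLOCK_SWITCH = [A_SHADOW, B_SHADOW, B_REAL, A_REAL]

# Thread switch only after a whole store has completed.
BLOCK_BOUNDARY_SWITCH = [A_SHADOW, A_REAL, B_SHADOW, B_REAL]

if __name__ == "__main__":
    for name, schedule in [("mid-block switch", MID_BLOCK_SWITCH),
                           ("block-boundary switch", BLOCK_BOUNDARY_SWITCH)]:
        print(name, "consistent:", is_consistent(run(schedule)))
```

In the first schedule the word ends up holding thread A's defined value while the shadow flag still carries thread B's "undefined" marking. In the second, each shadow/real pair runs to completion, so the two always agree.
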
Actually, now that I think about it, that shouldn't be a problem with the
current implementation since it does thread scheduling itself, and never
does thread switches in the middle of a basic block. Hmm.

As for getting rid of Valgrind's threads implementation, the best idea so far
is to intercept the clone() syscall, and have Valgrind schedule threads
itself but not do all the pthreads ops itself. This sounds plausible, but I
think the details haven't been worked out. The big advantage would be
getting rid of libpthread.so, which is a source of much complexity. The
disadvantage would be that the pthreads API error checking would disappear
(I think).

All the other stuff you mention about making Valgrind thread-safe seems like
it shouldn't be too hard to do (famous last words...)

All this is very Linux-centric, and a (very) long-term goal would be to
support other operating systems, but it's not at all clear how to do this.

Does that answer some of your questions?

N