From: Julian S. <ju...@va...> - 2005-06-20 10:27:42
> [...]
>
> Interesting.  If I read this right, valgrind is acting as an ELF loader.

Correct.  There's no way to avoid this [that we know of].

> Does it do linking stuff too?

No.  We merely load the executable and its direct dependencies, then
start up the stated ELF interpreter (ld.so) on our virtual CPU.  So we
don't have to get into the dynamic linking swamp, fortunately.

> Does the target program effectively
> have its own dynamic linker or is it shared with valgrind?
> Does it
> share instances of the libraries? - it appears that stage2 is
> dynamically linked as well.

Our design goal is that V is completely independent of any other
libraries.  We haven't quite got there yet, but it nearly is.

The primary motivation is that V has to maintain complete control over
the process's address space and signal state, and that's essentially
impossible if we defer to glibc to do low-level stuff like malloc,
free, etc.

Also, imagine the potential chaos if V and the program shared glibc.so,
and the simulated program was part way through doing malloc (on the
simulated CPU) when V decided to call malloc on the real CPU.  Even if
this didn't turn out to be a problem, the difficulty in convincing
ourselves that it's safe and always going to work is huge.  So our
policy is to make V (viz. stage2) as completely self-contained as we
can.

Since we're not quite there yet .. V does use glibc.so and ld.so, but
has its own instances of them.

> It's true that it's only one executable but that could be something
> pretty weird.  I'm kinda just thinking out loud here but what about
> the following:
>
> - valgrind starts up and gets through loading the program to be debugged.
> - valgrind stops and dumps itself w/ vmadump (bproc_dump()).
> - bpsh/mpirun migrates THAT process image instead of some fresh executable.
> - half started process w/ valgrind + other executable wakes up and
>   runs on the slave node.
>
> The nasty bit here is that valgrind would have to be linked w/ bproc.
> I did some weird stuff w/ editing freshly loaded elf binaries to add a
> preinit section that called bproc.  That basically allowed the kernel
> to take over again after dynamic linking was done but before the
> program ran.  I don't know if some similar hack could work here.  I
> don't know - just a thought.
>
> This would be pretty easy to test, I think.  If you added the
> bproc_dump call and just dumped to a plain file, you can execve that
> file directly to reload the dump.  That would allow bpsh to do its
> thing.  A real solution would probably look more like dump into a pipe
> or something.
>
> That still leaves the problem of valgrind getting at files when it
> pleases.  Would it be possible/reasonable for valgrind to pre-load
> everything it *might* need down the line?  That could be optional.

That kind of thing might be a possibility, although I have to be honest
and say I'd prefer not to have to put BProc specifics into V if I don't
have to, especially as at this time we're working hard to make V less
target-specific.

I've been pondering a more generalised solution .. tell me if this
sounds crazy.

It's a modified version of bpsh (or a replacement).  Instead of doing

   bpsh <node_specifiers> program args

do

   modified_bpsh <node_specifiers> path program args

modified_bpsh reads the entire tree rooted at path into itself (mmap
games, perhaps), migrates to the nodes, dumps the tree back into the
node-local filesystem, and execs program w/ args as usual.

Running V on a slave node is then

   modified_bpsh <node_specifiers> \
      /where/V/is/installed/on/master \                # the path
      /where/V/is/installed/on/master/bin/valgrind \   # stage1
      program \
      args

This strikes me as having several advantages:

* doesn't require slaves to use an NFS-mounted filesystem
* is useful for any kind of tool requiring a readonly filesystem
* doesn't require linking V against BProc

The intention is that the carried-around tree is small, as indeed V's
install tree is (5.7 M).
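
To make the read-tree / dump-tree / exec sequence concrete, here is a
rough sketch in Python.  It is not bpsh code and does not touch BProc:
the migration step is left out entirely, and the names pack_tree,
unpack_tree and exec_from_tree are made up for illustration.  It only
assumes the tree can be held in memory as a tar archive.

```python
import io
import os
import tarfile
import tempfile


def pack_tree(root: str) -> bytes:
    """Read the whole tree rooted at `root` into an in-memory tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        # Store paths relative to the root so the tree can be re-rooted later.
        tar.add(root, arcname=".")
    return buf.getvalue()


def unpack_tree(blob: bytes, dest: str) -> None:
    """Dump a tree produced by pack_tree() into the local directory `dest`."""
    os.makedirs(dest, exist_ok=True)
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
        tar.extractall(dest)


def exec_from_tree(blob: bytes, dest: str, program: str, args: list) -> None:
    """Hypothetical last step: unpack, then replace this process with `program`.

    `program` is a path relative to the tree root, e.g. "bin/valgrind".
    The migration to a slave node would happen before this call; it is
    not modelled here.
    """
    unpack_tree(blob, dest)
    path = os.path.join(dest, program)
    os.execv(path, [path] + list(args))


if __name__ == "__main__":
    # Round-trip a tiny throwaway tree to show the pack/unpack halves.
    src = tempfile.mkdtemp()
    os.makedirs(os.path.join(src, "bin"))
    with open(os.path.join(src, "bin", "tool"), "w") as f:
        f.write("placeholder\n")
    blob = pack_tree(src)
    dst = os.path.join(tempfile.mkdtemp(), "copy")
    unpack_tree(blob, dst)
    print(sorted(os.listdir(dst)))
```

A tar archive is used here only because it is the simplest way to carry
names, contents and permission bits in one blob; the "mmap games"
mentioned above would be a different implementation of the same step.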

If bandwidth is an issue (fair enough, if sending copies to N hundred
nodes) then it might be possible to compress the tree as it is read
using a real-time compression package and decompress on the slaves.
I'm thinking of LZO (http://www.oberhumer.com/opensource/lzo) which is
GPLd and very fast.

Comments?

J