From: Julian S. <ju...@va...> - 2005-06-17 22:38:40
> I've never tried to use it in this manner.  I did take a quick peek at
> this once a while back.  At the time it looked to me like starting a
> process with valgrind was (essentially) setting LD_PRELOAD to load the
> valgrind .so files and maybe a few other environment variables.  Is
> this still more or less what valgrind is doing?  /usr/bin/valgrind I
> found on my system here is just a binary.

Erik

The LD_PRELOAD mechanism went away some time back as it relies too
heavily on glibc and libpthread specifics.

Startup is tricky because we need to load both Valgrind and the
application to be debugged into the same address space (same process);
but the application has no idea this is happening.  How it works now is:

* User runs /usr/bin/valgrind prog args-for-prog

* /usr/bin/valgrind is not the "real" valgrind executable.  That is
  /usr/lib/valgrind/stage2.  /usr/bin/valgrind loads stage2 high in
  the address space and hands control to it.

* stage2 unmaps /usr/bin/valgrind.  It is now alone in the address
  space, and in particular there is a hole at the standard load
  address (0x8040000, or wherever).

* stage2 has its own implementation of exec() (sort of).  It uses
  this to load prog (+ dependent .so's) and start it.

* From prog's point of view it is started just as it would be
  normally.  In reality it is running on a virtual CPU provided by
  stage2.  stage2 intercepts and messes with all mmap() etc done by
  prog to ensure it doesn't screw up valgrind.

So ..

> 1. Valgrind has to have some reasonably nice mechanism to tell us what
>    exactly needs to be set.  I'm not sure exactly what that should look
>    like but I figure there's lots of possibilities

No env vars are needed for startup, I think.  There's some env var
trickery if a valgrinded process wants to start a valgrinded child
process, but we can ignore that for now.

> 3. The valgrind libraries need to be available on the slave nodes.
>    This is just a system configuration issue.  I did some work
>    experimenting with migration after linking so this requirement could
>    potentially go away.

Well, not just the .so's.  stage2 assumes it can grab any of the stuff
in PREFIX/lib/valgrind as/when it likes.  There are various .so's
forced into the address space at startup, but there are also a bunch
of text files (*.supp) which are important.

So far I managed to get it to work as follows:

* on the master node, install into /opt/valgrind

* bpcp -r /opt/valgrind to the slave nodes

* bpcp 'prog' to somewhere on the slave nodes, say /opt/prog

* on the master do

    bpsh <nodeid> /opt/valgrind/bin/valgrind /opt/prog args-for-prog

This is pretty ugly.  As I understand it, bpsh takes
/opt/valgrind/bin/valgrind from the master, migrates it to the
slave(s), starts it there, and it just happens to work because the
valgrind install trees on the master and slaves are identical.

I don't have any better ideas.  Fundamentally it seems difficult
because bpsh is only prepared to migrate one executable and that has
to be /opt/valgrind/bin/valgrind, so you have to have a different way
to get the executable-to-be-debugged to the slaves.

If the entire V install tree could be pre-installed on the slaves
that would help.  I guess one option is for all slaves to refer to
a global NFS mount.  But there would still be a problem of moving
the executable.

Ideas?

J
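
For anyone who wants to script the workaround above, here is a rough
dry-run sketch.  It only builds and prints the three master-side
command lines; it does not run anything.  The function names
(build_commands, render) are made up for illustration, and the
"node:path" destination form for bpcp is an assumption modelled on
rcp-style copying, so check it against your BProc installation before
relying on it.

```python
import posixpath
import shlex


def build_commands(node, prefix="/opt/valgrind", prog_src="prog",
                   prog_dst="/opt/prog", prog_args=()):
    """Return argv lists for the three master-side steps (dry run only).

    Hypothetical helper: nothing here is part of Valgrind or BProc.
    """
    node = str(node)
    # Assumption: like rcp/cp -r, copying a directory into its parent
    # on the far side recreates the directory under the same name.
    parent = posixpath.dirname(prefix.rstrip("/")) or "/"
    return [
        # 1. Copy the whole Valgrind install tree to the slave node, so
        #    stage2 can find its .so's and *.supp files there.
        ["bpcp", "-r", prefix, f"{node}:{parent}"],
        # 2. Copy the program to be debugged; bpsh will not migrate it.
        ["bpcp", prog_src, f"{node}:{prog_dst}"],
        # 3. From the master, start valgrind on the slave and point it
        #    at the copy of the program that already lives there.
        ["bpsh", node, f"{prefix}/bin/valgrind", prog_dst, *prog_args],
    ]


def render(commands):
    """Format argv lists as shell-quoted command lines, one per line."""
    return "\n".join(shlex.join(argv) for argv in commands)


if __name__ == "__main__":
    print(render(build_commands(4, prog_args=["args-for-prog"])))
```

The identical prefix on master and slave is deliberate: as noted
above, the scheme only works because both install trees match.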