|
From: Julian S. <js...@ac...> - 2003-04-06 17:21:00
|
I've spent the afternoon messing with R H 9. Graydon and Jeff have
done a great job telling me useful stuff for getting V to work on
RH9, and thanks to them for that.

So it sort-of works. Unfortunately OpenOffice hangs, due to an accept()
call which isn't correctly routed to V's only-this-thread-blocks
implementation thereof, and so the whole app hangs. Sigh. I think it
might be due to ld.so in 9 ignoring -z initfirst.

It's all a swamp, and V is getting more and more cluttered with stuff
to work around this kind of problem. JeremyF's vg_intercept.c thing
is the latest addition.

Seems like time for a clean start. How about this for a suggestion.
Read and barf :-)

-----------------

We have (at least) 3 fundamental problems.

1. We can't really simulate the native posix threading libraries;
   I really don't want to get into the problems of making V -- and
   more importantly, the code it produces -- thread safe. And in
   practice it turns out to be quite useful to be able to intercept
   the pthread calls and do our own checking of them anyway. This
   means we want to supply our own libpthread.so. At least a simple
   implementation of posix pthreads has turned out to be not very
   difficult to implement anyway.

2. We have to rely on the LD_PRELOAD mechanism to sneak in our own
   libpthread.so (and to gain control at all). This is fragile,
   inflexible, and requires us to play games with weak symbols etc
   to get the right bindings. Even so it sometimes fails to work
   well enough, as with RH9.

3. Using LD_PRELOAD to gain control has worked surprisingly well
   for a long time. But it's also a nuisance. It means V has to
   be entirely self-contained; we can't use any glibc fns ourselves
   and so have to supply vg_mylibc.c, for example. It also more or
   less prevents us from writing any parts of V in any other
   programming language -- for example, lots of Nick's skin stuff
   might be easier done in C++.

What I am thinking of is this:

A. Turn V into a bog-standard program, not an LD_PRELOAD thing,
   which contains its own ELF loader. This makes (3) go away, and
   it gives us complete control of (2). ELF loaders exist;
   we might be able to start from Mark Probst's loader in bintrans
   if we ask nicely :-)
   (http://www.complang.tuwien.ac.at/schani)

B. Now that we have complete control over loading, we can reliably
   substitute in our own libpthread.so.

Now, we are going to have to jump through hoops to come up with
another libpthread.so which works properly with NPTL. So
(and this is where you need your barfbags) how about also ignoring
requests to load glibc.so. Instead, we supply our own implementation
of libc, which probably also contains our integrated pthreads
implementation.

Um, I don't know if this is a really really stupid idea or not.
Probably is. Please feel free to shoot me down in flames.
(be as rude as you like :)

It would be nice (in the long run) for V to be more modular. It's
pretty clear that the virtual CPU can be made fairly modular --
basically, made into a library which translates a source bb into an
instrumented bb. Being modular helps if we want to think about making
a virtual CPU for some other architecture.

Nick's work shows that the instrumentation system(s) can be made modular.

So, finally, it would be nice to modularise, as best
we can, that part of it which constructs the environment in which
the simulated program is to run. This seems to me to be a step in
the right direction.

Comments? This is obviously a large amount of work, and I'd be
interested in hearing opinions of how to ensure the longer-term
maintainability of V.

J
|
|
From: Nicholas N. <nj...@ca...> - 2003-04-07 13:40:40
|
On Sun, 6 Apr 2003, Julian Seward wrote:

> Now, we are going to have to jump through hoops to come up with
> another libpthread.so which works properly with NPTL. So
> (and this is where you need your barfbags) how about also ignoring
> requests to load glibc.so. Instead, we supply our own implementation
> of libc, which probably also contains our integrated pthreads
> implementation.
>
> Um, I don't know if this is a really really stupid idea or not.
> Probably is. Please feel free to shoot me down in flames.
> (be as rude as you like :)

Whoa, glibc.so is pretty big; the libpthread.so implementation isn't
even complete (missing some obscure functions), replacing glibc.so will
have that problem but on a much worse scale.

Also, this would mean that the program running under Valgrind is even
further from the original program than is the case currently, where
libpthread.so and malloc() et al are replaced. This seems like a bad
idea to me, especially for eg. Cachegrind and any other skins that try
to measure aspects of performance.

This sounds dangerous to me. What happened to your idea about
intercepting clone() but nothing else, and having Valgrind schedule the
threads? Is that not possible with NPTL?

N
|
|
From: Adam G. <ar...@cy...> - 2003-04-07 14:28:18
|
At 17:28 06/04/03 +0000, Julian Seward wrote:
>I've spent the afternoon messing with R H 9. Graydon and Jeff have
>done a great job telling me useful stuff for getting V to work on
>RH9, and thanks to them for that.
>
>So it sort-of works. Unfortunately OpenOffice hangs, due to an accept()
>call which isn't correctly routed to V's only-this-thread-blocks
>implementation thereof, and so the whole app hangs. Sigh. I think it
>might be due to ld.so in 9 ignoring -z initfirst.
>
>It's all a swamp, and V is getting more and more cluttered with stuff
>to work around this kind of problem. JeremyF's vg_intercept.c thing
>is the latest addition.

just as a matter of interest (or maybe shared problems!) WINE has been
having trouble with NPTL as well, since they also want to provide their
own pthread implementation... might be worth having a look at some of
the hoops the new NPTL patches in WINE go through.

Seeya,
Adam
--
Real Programmers don't comment their code. If it was hard to write, it
should be hard to read, and even harder to modify.
These are all my own opinions.
|
|
From: Jeremy F. <je...@go...> - 2003-04-08 10:00:46
|
Quoting Julian Seward <js...@ac...>:

> I've spent the afternoon messing with R H 9. Graydon and Jeff have
> done a great job telling me useful stuff for getting V to work on
> RH9, and thanks to them for that.
>
> So it sort-of works. Unfortunately OpenOffice hangs, due to an accept()
> call which isn't correctly routed to V's only-this-thread-blocks
> implementation thereof, and so the whole app hangs. Sigh. I think it
> might be due to ld.so in 9 ignoring -z initfirst.

At least one possible problem is that if two things have initfirst -
well, they can't both be first.

> It's all a swamp, and V is getting more and more cluttered with stuff
> to work around this kind of problem. JeremyF's vg_intercept.c thing
> is the latest addition.

And it is well ugh-worthy.

> 1. We can't really simulate the native posix threading libraries;

What do you mean by "simulate"? Do you mean we can emulate the presence
of the library, but we can't run a simulation of the actual library?

What we could do is move Valgrind's thread simulation an abstraction
level down, and intercept the clone system call, and then run the "real"
thread library on that. Unfortunately that loses the higher-level
insight Valgrind gets into the threading behaviour of the target program.

> What I am thinking of is this:
>
> A. Turn V into a bog-standard program, not an LD_PRELOAD thing,
> which contains its own ELF loader. This makes (3) go away, and
> it gives us complete control of (2). ELF loaders exist;
> we might be able to start from Mark Probst's loader in bintrans
> if we ask nicely :-)
> (http://www.complang.tuwien.ac.at/schani)

ELF loaders are easy. There are some complexities though: does Valgrind
live in the same address space as the target app, or a separate one? If
it's the same one, then you get the problem of making two libc's exist
in the same address space.
If they're in different address spaces, you have the question of how to
give Valgrind efficient control of the target address space (which
ideally requires OS support in the form of an address space without its
own thread - something pretty alien to Linux's model of things). The
third option is the compromise: make them logically distinct address
spaces, but have them share the same actual address space by adding
another virtualization layer on address translation.

> B. Now that we have complete control over loading, we can reliably
> substitute in our own libpthread.so.

Whoa there - you're leaping to conclusions. A implies you have an ELF
loader which emulates execve() - ie, user-space exec. But it doesn't
necessarily mean you've replaced ld.so and its loading mechanisms (which
was already all in user-space). If you do mean replacing ld.so and its
resolution mechanism, it means you're willing to actually understand how
all that stuff works and 1) emulate it while 2) modifying the behaviour
for our purposes. However, I think once you've done 1) you can probably
get the existing dynamic linker to do what we need.

User-space exec does make it easy to replace ld.so with anything we
like, because we can choose to mis-interpret the PT_INTERP program
header.

> Now, we are going to have to jump through hoops to come up with
> another libpthread.so which works properly with NPTL. So
> (and this is where you need your barfbags) how about also ignoring
> requests to load glibc.so. Instead, we supply our own implementation
> of libc, which probably also contains our integrated pthreads
> implementation.

If you're actually proposing that we replace glibc with an
ABI-compatible re-implementation which happens to not use NPTL, then I
think that's a huge amount of work with a lot of disadvantages. We could
just package an appropriately built glibc with valgrind itself and use
that, but that may still cause more problems than it solves.
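
On the user-space exec point above, a rough sketch of what "mis-interpreting" PT_INTERP could look like. It assumes a 64-bit ELF image already read into memory and the Linux `<elf.h>` definitions; `elf_interp` and `choose_interp` are made-up names for illustration, not anything in Valgrind.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the PT_INTERP path stored in a 64-bit ELF image, or NULL if the
   image is malformed or names no interpreter (e.g. a static binary). */
const char *elf_interp(const unsigned char *image, size_t len)
{
    Elf64_Ehdr eh;
    if (len < sizeof eh)
        return NULL;
    memcpy(&eh, image, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_phentsize != sizeof(Elf64_Phdr))
        return NULL;

    for (size_t i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        uint64_t off = eh.e_phoff + (uint64_t)i * sizeof ph;
        /* Reject program headers that fall outside the image. */
        if (off < eh.e_phoff || off > len || len - off < sizeof ph)
            return NULL;
        memcpy(&ph, image + off, sizeof ph);
        if (ph.p_type != PT_INTERP)
            continue;
        if (ph.p_filesz == 0 || ph.p_offset > len ||
            ph.p_filesz > len - ph.p_offset)
            return NULL;
        const char *path = (const char *)image + ph.p_offset;
        /* The path must be NUL-terminated inside the segment. */
        if (memchr(path, '\0', ph.p_filesz) == NULL)
            return NULL;
        return path;
    }
    return NULL;
}

/* Hypothetical policy hook: if the binary asks for a dynamic linker and
   the caller supplies a replacement, load the replacement instead. */
const char *choose_interp(const unsigned char *image, size_t len,
                          const char *override)
{
    const char *requested = elf_interp(image, len);
    if (requested == NULL)
        return NULL;
    return override != NULL ? override : requested;
}
```

A user-space exec would map whatever `choose_interp` returns and jump to its entry point; everything after that (symbol resolution, relocation) is still done by the interpreter that actually got loaded.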

I think it would help me if someone could explain exactly how NPTL
differs from normal pthreads at the ABI level. Why does Valgrind care?
What changes are made to the instruction stream?

> Um, I don't know if this is a really really stupid idea or not.
> Probably is. Please feel free to shoot me down in flames.
> (be as rude as you like :)
>
> It would be nice (in the long run) for V to be more modular. It's
> pretty clear that the virtual CPU can be made fairly modular --
> basically, made into a library which translates a source bb into an
> instrumented bb. Being modular helps if we want to think about making
> a virtual CPU for some other architecture.

I definitely agree. The trouble is that V really wants to operate at
several levels of abstraction at once. At core it is a CPU/OS syscall
emulator, and at that level this proposal makes some sense. However,
because Valgrind also does emulation at the library level, it wants to
be at the next level up.

I think we can probably address this in two parts:

1. make V a full stand-alone CPU/syscall emulator, which is capable of
   running most programs (even threaded) without having to do any
   library intercepts, simply by emulating enough of the syscall layer
   for pthreads to work.

2. Some skins really want to understand what's happening at the library
   call level rather than the syscall level (helgrind, for example,
   wants to know where and what a lock is). In that case, library
   intercepts are the only option. We can either do this at the dynamic
   linker level (ie, what V does now), or perhaps at the CPU level (if
   you see a call to this address, quietly redirect it through this
   function).

This means that we can avoid all the current complexity. All we need to
do is look up the address of pthread_mutex_lock (say), and add that
address to an intercept table which the dynamic codegen can consult
whenever it is about to generate some code.
That intercept table would list a pair of "run before"/"run after"
functions which would be inserted into the generated code.

J
|
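
The intercept table described in the message above could be sketched roughly like this. All names here (`intercept_add`, `intercept_lookup`, `call_with_intercepts`, the demo hooks) are hypothetical, and the direct function call stands in for what a real code generator would emit into the translated block.

```c
#include <stddef.h>
#include <stdint.h>

/* A hook receives the address of the function being intercepted. */
typedef void (*hook_fn)(uintptr_t target);

struct intercept {
    uintptr_t addr;    /* address of the intercepted function */
    hook_fn   before;  /* run before the call, may be NULL */
    hook_fn   after;   /* run after the call, may be NULL */
};

#define MAX_INTERCEPTS 64
static struct intercept intercepts[MAX_INTERCEPTS];
static size_t n_intercepts;

/* Register a before/after pair for addr; returns 0, or -1 if full. */
int intercept_add(uintptr_t addr, hook_fn before, hook_fn after)
{
    if (n_intercepts == MAX_INTERCEPTS)
        return -1;
    intercepts[n_intercepts].addr = addr;
    intercepts[n_intercepts].before = before;
    intercepts[n_intercepts].after = after;
    n_intercepts++;
    return 0;
}

/* What the codegen would consult when it sees a call target. */
const struct intercept *intercept_lookup(uintptr_t addr)
{
    for (size_t i = 0; i < n_intercepts; i++)
        if (intercepts[i].addr == addr)
            return &intercepts[i];
    return NULL;
}

/* Stand-in for generated code: wrap the call in the registered hooks. */
long call_with_intercepts(long (*fn)(long), long arg)
{
    const struct intercept *ic = intercept_lookup((uintptr_t)fn);
    if (ic != NULL && ic->before != NULL)
        ic->before(ic->addr);
    long result = fn(arg);
    if (ic != NULL && ic->after != NULL)
        ic->after(ic->addr);
    return result;
}

/* Demo hooks and a demo target playing the role of pthread_mutex_lock. */
static int before_calls, after_calls;
static void count_before(uintptr_t target) { (void)target; before_calls++; }
static void count_after(uintptr_t target)  { (void)target; after_calls++; }
static long demo_lock(long x) { return x + 1; }
```

A linear scan is only for brevity; a table consulted on every translated call would more plausibly be hashed on the address.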
|
From: Adam G. <ar...@cy...> - 2003-04-08 10:56:15
|
At 03:00 08/04/03 -0700, Jeremy Fitzhardinge wrote:
>Quoting Julian Seward <js...@ac...>:
>
>> I've spent the afternoon messing with R H 9. Graydon and Jeff have
>> done a great job telling me useful stuff for getting V to work on
>> RH9, and thanks to them for that.
>>
>> So it sort-of works. Unfortunately OpenOffice hangs, due to an accept()
>> call which isn't correctly routed to V's only-this-thread-blocks
>> implementation thereof, and so the whole app hangs. Sigh. I think it
>> might be due to ld.so in 9 ignoring -z initfirst.
>
>At least one possible problem is that if two things have initfirst - well, they
>can't both be first.
>
>> It's all a swamp, and V is getting more and more cluttered with stuff
>> to work around this kind of problem. JeremyF's vg_intercept.c thing
>> is the latest addition.
>
>And it is well ugh-worthy.
>
>> 1. We can't really simulate the native posix threading libraries;
>
>What do you mean by "simulate"? Do you mean we can emulate the presence of the
>library, but we can't run a simulation of the actual library?
>
>What we could do is move Valgrind's thread simulation an abstraction level down,
>and intercept the clone system call, and then run the "real" thread library on
>that. Unfortunately that loses the higher-level insight Valgrind gets into the
>threading behaviour of the target program.

clone() implementation patch available on request... ;-)

this is exactly what I did for valgrind's WINE support. there are
several problems with implementing clone() using 'green' threads - the
biggest one is that external processes cannot send signals to the new
threads (it's causing major headaches trying to get patches into WINE to
cope with this).

if you want to REALLY clone a new thread, then you'll have to add all
sorts of locking to valgrind to cope with it - I had a look at this but
it appears to be a big job. currently valgrind itself is pretty much
single threaded - with real cloned threads that won't be the case.

Seeya,
Adam
--
Real Programmers don't comment their code. If it was hard to write, it
should be hard to read, and even harder to modify.
These are all my own opinions.
|
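
For readers unfamiliar with the 'green' threads approach discussed above, here is a minimal sketch of the idea: every "thread" is just a saved context inside one kernel thread, and a scheduler loop switches between them. It uses the glibc `ucontext` calls rather than anything from Valgrind or the WINE patch, and all `gt_*` names are invented for the example.

```c
#include <stddef.h>
#include <ucontext.h>

#define GT_MAX     4
#define GT_STACKSZ (64 * 1024)

typedef void (*gt_fn)(int id);

static ucontext_t gt_sched_ctx;               /* the scheduler's own context */
static ucontext_t gt_ctx[GT_MAX];             /* one saved context per thread */
static char       gt_stack[GT_MAX][GT_STACKSZ];
static gt_fn      gt_entry[GT_MAX];
static int        gt_alive[GT_MAX];
static int        gt_count;
static int        gt_current = -1;

/* Entry shim: run the thread body, then mark the thread finished.
   Returning from here resumes uc_link, i.e. the scheduler. */
static void gt_trampoline(int id)
{
    gt_entry[id](id);
    gt_alive[id] = 0;
}

/* What an intercepted clone() might do: create a context, not a task. */
int gt_spawn(gt_fn fn)
{
    if (gt_count >= GT_MAX || getcontext(&gt_ctx[gt_count]) != 0)
        return -1;
    int id = gt_count++;
    gt_ctx[id].uc_stack.ss_sp = gt_stack[id];
    gt_ctx[id].uc_stack.ss_size = GT_STACKSZ;
    gt_ctx[id].uc_link = &gt_sched_ctx;
    gt_entry[id] = fn;
    gt_alive[id] = 1;
    makecontext(&gt_ctx[id], (void (*)(void))gt_trampoline, 1, id);
    return id;
}

/* Give up the CPU; only valid when called from inside a green thread. */
void gt_yield(void)
{
    swapcontext(&gt_ctx[gt_current], &gt_sched_ctx);
}

/* Round-robin over live threads until all of them have returned. */
void gt_run(void)
{
    int live;
    do {
        live = 0;
        for (int i = 0; i < gt_count; i++) {
            if (!gt_alive[i])
                continue;
            live = 1;
            gt_current = i;
            swapcontext(&gt_sched_ctx, &gt_ctx[i]);
        }
    } while (live);
    gt_current = -1;
}

/* Demo: two threads each log their letter twice, yielding in between. */
static char gt_trace[16];
static int  gt_trace_len;

static void gt_demo_worker(int id)
{
    for (int step = 0; step < 2; step++) {
        gt_trace[gt_trace_len++] = (char)('A' + id);
        gt_yield();
    }
}

const char *gt_demo(void)
{
    gt_spawn(gt_demo_worker);
    gt_spawn(gt_demo_worker);
    gt_run();
    gt_trace[gt_trace_len] = '\0';
    return gt_trace;
}
```

Because the kernel only ever sees one task, there is no separate thread ID for another process to signal, which is the limitation mentioned in the message above.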