From: Julian S. <js...@ac...> - 2004-09-02 10:24:06
One point to bear in mind is that the conversation that followed
appeared to be silently predicated on the assumption that Linux is the
only kernel we're interested in. We have to think beyond that: I would
love to make V available for MacOSX, and I bet (eg) the OpenBSD folks
would love to get their hands on an x86-openbsd variant of Valgrind.
Indeed, once Nick's current commit set goes in, one interesting
experiment would be to try an openbsd port. Or freebsd (already exists)
or netbsd.

> ---------------------------------------------------------------------------
> Problems + solutions
> ---------------------------------------------------------------------------
> P1. It assumes 3G:1G user/kernel split.
>
> - For 4G kernels, Valgrind gets the whole extra 1GB for its own use (I
>   think). This works, but is sub-optimal.
>
> - For other layouts (eg. 2G:2G, or even 2.9G:1.1G) it just doesn't work.
>   Changing KICKSTART_BASE is a workaround, if you know that. (But 2G:2G
>   still cannot run Memcheck, see below.)
>
> S1. This can be solved easily, by using position-independent executables
>   (PIE). We can do a configure-time test for PIE, and if supported, make
>   stage2 a PIE. Then stage1 can decide where stage2 should go, by doing
>   some kind of run-time test (which would look at where the stack is, or
>   use shmat(), or something, to determine where the user/kernel division
>   lies).
>
>   This change is pretty uncontroversial, and Paul already has a patch for
>   it (which I don't think should be committed as-is, but is a good start).
>
>   For non-PIE-supporting systems, we could build 3 or 4 versions of
>   stage2, and choose the most appropriate one (I have a patch for this).
>   Or just a single fixed-location back-up stage2 might be enough.

Ok, I agree with P1/S1. This is uncontroversial. Since we can't rely
on PIE being around, the 3 or 4 versions solution sounds good to me.

> ---------------------------------------------------------------------------
> P2.
> For kernels with "overcommit" mmapping off -- which prevents a
> process from allocating more address space than the available swap space --
> you need at least 1.5GB of swap for Memcheck to run, because swap must be
> at least as large as any individual segment. (And I think users with
> ulimit -v set suffer the same problem.)
>
> S2. Avoiding this requires not using the big-bang shadow allocation
>   method, and that shadow memory instead be done incrementally. (More about
>   that below.)

Let's just forget about big-bang shadow allocation. It causes a whole
bunch of problems, we're not using it at the moment, and we don't have
a clear picture of where the cycle-level costs of shadow memory come
from anyway. For example, if shadow memory really kills us because it
jacks up the D1/L2 cache miss rates, then it's going to do so
regardless of the address translation scheme in use.

> A more radical solution: truly virtualise the address space (rather
> than just partitioning it) -- ie. Valgrind implements its own virtual
> MMU and page table. The exact details of how it would work are not yet
> clear. If even feasible, this is a long-term solution; something else
> should be done in the meantime.

I agree. Currently I do not see how to do this with a small enough
performance overhead, so forget about this for the time being.

> ---------------------------------------------------------------------------
> My suggestion
> ---------------------------------------------------------------------------
> In this order, make the following changes.
>
> 1. Use PIE where possible, solving P1.

Agree.

> 2. Switch big-bang shadow memory allocation to incremental, solving P2,
>    P4, P5.

Agree.

> 3. Make the client/Valgrind division movable, largely solving P3.

Agree.

> 4. Maybe make the --pointer-check=no change, if it seems useful.

Well, I like the fact that currently the client can't trash V. Another
thing to consider is how to achieve this portably, on non-x86s.
If the client address space is contained entirely in 0 .. N-1, and N
is a power of two, ANDing is obviously a cheap solution. If the machine
contains a scalar 'min' insn, then we can do this cheaply for any N.

J
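
For illustration only, the two clamping options in the last paragraph
might be sketched in C as below. This is not Valgrind code: Addr32,
CLIENT_LIMIT_POW2, clamp_and and clamp_min are hypothetical names, and
the 2G limit is an arbitrary choice of N. Note that the two behave
differently on out-of-range addresses: ANDing wraps them back into the
client range, whereas 'min' pins them to N-1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit client address type (models an x86 guest). */
typedef uint32_t Addr32;

/* Hypothetical limit: client addresses live in 0 .. N-1, with N = 2^31. */
#define CLIENT_LIMIT_POW2 ((Addr32)0x80000000u)

/* Case 1: N is a power of two, so a single AND with N-1 forces any
   address into 0 .. N-1 (out-of-range addresses wrap around). */
static Addr32 clamp_and(Addr32 addr)
{
    return addr & (CLIENT_LIMIT_POW2 - 1u);
}

/* Case 2: arbitrary N. This is what a scalar unsigned 'min' insn would
   compute: out-of-range addresses are pinned to the last client byte. */
static Addr32 clamp_min(Addr32 addr, Addr32 n)
{
    assert(n > 0);              /* an empty client range makes no sense */
    Addr32 last = n - 1u;
    return addr < last ? addr : last;
}
```

Either function would be applied to an address before the client's
load or store is performed; in-range addresses pass through unchanged.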