From: Nicholas N. <nj...@ca...> - 2004-11-19 13:43:12
Hi,
The thesis of this message is that "Full Virtualisation" (FV, better
described as "strict address space partitioning") is a bad idea.
(In what follows, 'V' is Valgrind, 'P' is the client program being run
under Valgrind's control.)
-----------------------------------------------------------------------------
Originally, V was a shared object, grafted onto the client via LD_PRELOAD.
V memory and P memory were all mixed up. This had various problems.
So we moved to FV... basically, V is a real executable, stage1 loads it
at a high address, stage2 loads the client at a low address, V maintains
a full segment list of all memory mappings, P cannot touch V due to the
use of x86 segment registers.
Here's some problems FV was meant to solve, and an evaluation of how well
it fared in practice.
1. P runs for some time before V gains control, which is dodgy, and
causes problems with threads.
Success! V has full control over P from startup.
2. V shares the dynamic linker and shared libraries with the client; V
must be very careful not to interfere with the client's use of these
(eg. have its own libc functions).
Failure. We haven't been able to use glibc in V really because we
can't use any function that might use malloc(), because we can't
separate V and P's use of brk() and the data segment. Having our own
libc still seems like a good idea.
3. P can clobber V's memory.
Partial success. P can't clobber V; but I'm not sure how much of a
problem this was in the first place. (Are there any cases
where Memcheck wouldn't report the clobbering first?) Also, x86
segment selectors aren't portable, and no convincing alternative is
known for other architectures.
4. No self-hosting possible.
Failure. Self-hosting is no closer to happening.
5. Intercepting library calls is difficult and fragile.
??? It still seems difficult, but I don't understand that stuff at
all. Can someone elaborate?
6. Statically linked programs are not supported.
Partial success. We can run statically linked binaries. However, we
can't intercept malloc() et al, which hobbles tools to various
degrees (eg. Cachegrind not at all, Memcheck somewhat, Addrcheck
quite a lot, Massif totally).
Six problems: one clear success, two partial successes, two failures, and a
(to me) unknown. All in all, not very convincing.
There were probably other advantages that I can't think of right now;
they would be worth discussing too.
-----------------------------------------------------------------------------
And what were the costs?
- Code size. FV added a lot of code. Especially keeping track of all the
mapped segments (and there are still several nasty bugs in there).
- Robustness. FV is generally more fragile; there are more things to
get right, and the consequences are bad if they are not right. IMHO
we get more random seg fault problems now. A lot have been cleaned up
(it was really bad at first), but they still happen.
- Also, the inflexibility of the memory layout has caused many problems:
- difficulties for non-standard (ie. 3G:1G) kernels
- can't run with a virtual memory ulimit (bad esp. for embedded
developers)
- clients and tools run out of memory earlier than they used to, and
we have problems reading debug info for large files
- big-bang shadow allocation causes problems with non-overcommitting
kernels
V has dropped from the #6 to the #72 highest-rated project at Freshmeat.net
the last year or so; I think the reason for this is that V's "it just
works" characteristic has been diminished, due to the robustness and
inflexibility problems.
-----------------------------------------------------------------------------
Here's a rough proposal that attempts to combine the best features of the
original scheme with FV.
First, the features I think are important to preserve:
- Valgrind having control from the very start
- Being able to run static binaries (even if the tools don't work fully)
Features I'm willing to give up:
- Total separation of P and V memory. The inflexibility and complexity
are not worth it. --pointercheck isn't portable, anyway.
- V's ability to use other libraries. It doesn't really work now
anyway, and just increases V's dependence on other things.
(Dependence could become a problem as we port to other archs/OSes, as
libraries may not work in exactly the same way in all cases.)
Features I'd really like:
- Self-hosting.
- Function wrapping [but that's kind of orthogonal to the rest of this
message]
How it would work:
- stage1 still starts things up. stage2 is maybe a .so again? Not
sure, but it wouldn't be put in a fixed location.
- V and P mappings are totally intermingled. We just let the kernel
mmap things wherever it wants, without trying to enforce any layout
ourselves. (This precludes big-bang shadow allocation, and thus
precludes fixed-offset shadow memory addressing being used in the
future. This does not worry me.) This makes startup much easier,
since we don't have to be so careful about where things go.
- We still need a segment mapping list of sorts, at least for exe segments
with debug info.
- Self-hosting might be possible now that Valgrind (ie. stage1) is a
normal executable again, rather than a .so as it was originally? Not
sure.
- We still use our own libraries, and in fact try to remove all our use
of glibc altogether, as relying on it feels like a bad idea. (stage1
might be able to use glibc, though? Not sure.)
- V and P would probably share the dynamic linker? So we'd have to be
careful about the symbol namespace, but we already do that most of the
time anyway so it's not a problem.
So the big wins are: reduced complexity (far less memory tracking, we
let the kernel decide where things go), increased memory layout
flexibility.
Things I'm not sure about:
- Valgrind's stack? Currently auto-extended, but it used to be a fixed
size, and that wasn't much of a problem. Could make the SEGV handler
much simpler?
- Debug info reading, and segments, and the interactions there.
- Is the segment list used in other ways?
- Library interceptions -- what's the effect?
- The current SEGV handler for the client is nice, in that a
seg-faulting client dies with an informative message from V. Also,
the "INTERNAL ERROR" msg about V's own seg faults is good. It would
be good to preserve that.
-----------------------------------------------------------------------------
I think the end result would be simpler, have less code, be more robust,
and cause fewer problems for users. Discuss.
[On a related note, what do we gain from keeping the P and V file
descriptors strictly partitioned? It's a lot of work enforcing the
partition, and it makes self-hosting harder.]
From: Jeremy F. <je...@go...> - 2004-11-19 23:45:39
On Fri, 2004-11-19 at 13:42 +0000, Nicholas Nethercote wrote:

> 2. V shares the dynamic linker and shared libraries with the client; V
>    must be very careful not to interfere with the client's use of these
>    (eg. have its own libc functions).
>
>    Failure. We haven't been able to use glibc in V really because we
>    can't use any function that might use malloc(), because we can't
>    separate V and P's use of brk() and the data segment. Having our
>    own libc still seems like a good idea.

Having our own libc is definitely not a good idea, but we haven't gone
to much effort to replace it yet. glibc makes it hard to intercept brk
directly, but it is possible to replace malloc/calloc/realloc/etc and be
reasonably sure of avoiding the use of brk (particularly if you get the
kernel to enforce it).

Having our own libc is going to be a portability liability.

But using system libraries wasn't the only reason to disentangle
ourselves from the dynamic linker. The dynamic linker itself is 1) very
GNU/glibc-specific, 2) has changed a lot over the last few years and
doesn't seem likely to stop, and so 3) depending on it in detail is
going to continue to be a maintenance and portability problem. We're
stuck with having to deal with it for the purposes of interception, but
it would be nice to be independent of it for the basic functioning of
Valgrind.

> 3. P can clobber V's memory.
>
>    Partial success. P can't clobber V; but I'm not sure how much of a
>    problem this was in the first place. (Are there any cases where
>    Memcheck wouldn't report the clobbering first?) Also, x86 segment
>    selectors aren't portable, and no convincing alternative is known
>    for other architectures.

A bounds-limit test for each memory access isn't that expensive, and in
64-bit address spaces you can make the client address space a power of 2
in size, which simplifies the test. You could also use a very large
redzone to make hits much more unlikely.
The segment test is nice because it actually is free, but explicit
testing probably isn't that expensive, particularly if the codegen can
remove redundant tests, and schedule the tests it does generate
appropriately.

memcheck and addrcheck will report on out of bounds writes, but other
tools won't. In addition, if you get an out of bounds error which
doesn't crash the process, it immediately means that all other program
and valgrind output is unreliable, so you need to be very strict about
fixing the first problem before considering the rest. I think this
makes the Valgrind output unreliable.

Well, hm. If Valgrind is sharing ld.so with the client, then they're
not really separate programs at all. If the client screws up the
dynamic linker, Valgrind could get hit and crash without being able to
report on it at all.

> 4. No self-hosting possible.
>
>    Failure. Self-hosting is no closer to happening.

True. We need more virtualization to do it properly.

> 5. Intercepting library calls is difficult and fragile.
>
>    ??? It still seems difficult, but I don't understand that stuff at
>    all. Can someone elaborate?

The machinery in there now is pretty simple. You list a set of
addresses of functions you want to intercept, and when the codegen is
told to fetch from those addresses, it actually fetches from the
intercepting function instead. The addresses can be specified either
literally or symbolically; symbol names can be qualified by a particular
library name, or be unqualified. glibc makes things complex by being
complex itself, but the core machinery is pretty simple.

> 6. Statically linked programs are not supported.
>
>    Partial success. We can run statically linked binaries. However,
>    we can't intercept malloc() et al, which hobbles tools to various
>    degrees (eg. Cachegrind not at all, Memcheck somewhat, Addrcheck
>    quite a lot, Massif totally).

We can intercept malloc/free/etc if the program hasn't been stripped.
It's just that with dynamic linking, the programs are never (can't be)
completely stripped.

> Six problems: one clear success, two partial successes, two failures,
> and a (to me) unknown. All in all, not very convincing.
>
> There were probably other advantages as a result, that I can't think
> of now, that it would be worth discussing.
>
> -----------------------------------------------------------------------
> And what were the costs?
>
> - Code size. FV added a lot of code. Especially keeping track of all
>   the mapped segments (and there are still several nasty bugs in
>   there).

You know, I'm really not sure that it did. I'll agree that the skiplist
code has been more subtly broken for longer than it should have been,
but as a generic data structure we should be able to get good use from
it. And really, the mapped segment code is there to replace the old
stuff which kept reading /proc/self/map; that was getting to be a pretty
significant bottleneck and was plain ugly (ie, the mapped segment stuff
would have been needed anyway, regardless of FV). The other large code
change is the syscall handling stuff, which is independent of FV.

I dunno. Valgrind is a lot more complex now, but it does do a lot more
stuff. I don't think we're going to return to the halcyon days of 1.0
simplicity and still manage to keep the functionality.

> - Robustness. FV is generally more fragile; there are more things to
>   get right, and the consequences are bad if they are not right. IMHO
>   we get more random seg fault problems now. A lot have been cleaned
>   up (it was really bad at first), but they still happen.

Yes, but I think that comes with the "doing more stuff". 1.0 would just
fail outright on a lot of programs. 2.x tries to run them, and
generally (but not always) succeeds. And again, I don't think this is
strictly an FV issue.

> - Also, the inflexibility of the memory layout has caused many
>   problems:
>   - difficulties for non-standard (ie. 3G:1G) kernels
>   - can't run with a virtual memory ulimit (bad esp. for embedded
>     developers)

Can you explain? What do you mean by "embedded developers"?

> - clients and tools run out of memory earlier than they used to, and
>   we have problems reading debug info for large files
> - big-bang shadow allocation causes problems with non-overcommitting
>   kernels
>
> V has dropped from the #6 to the #72 highest-rated project at
> Freshmeat.net over the last year or so; I think the reason for this is
> that V's "it just works" characteristic has been diminished, due to
> the robustness and inflexibility problems.

Um, do you have anything to support that? I think the ranking is
dropping because V is not new anymore, and people are taking it for
granted. Are we seeing an increase in bug reports disproportionate to
the number of users?

> Features I'm willing to give up:
> - Total separation of P and V memory. The inflexibility and
>   complexity are not worth it. --pointercheck isn't portable, anyway.
> - V's ability to use other libraries. It doesn't really work now
>   anyway, and just increases V's dependence on other things.
>   (Dependence could become a problem as we port to other archs/OSes,
>   as libraries may not work in exactly the same way in all cases.)

I'd still like to be able to use C++ internally.

> Features I'd really like:
> - Self-hosting.
> - Function wrapping [but that's kind of orthogonal to the rest of this
>   message]
>
> How it would work:
> - stage1 still starts things up. stage2 is maybe a .so again? Not
>   sure, but it wouldn't be put in a fixed location.
> - V and P mappings are totally intermingled. We just let the kernel
>   mmap things wherever it wants, without trying to enforce any layout
>   ourselves. (This precludes big-bang shadow allocation, and thus
>   precludes fixed-offset shadow memory addressing being used in the
>   future. This does not worry me.) This makes startup much easier,
>   since we don't have to be so careful about where things go.
> ...
I guess my OS/kernel background is really making me dislike this idea.
I agree that the partition makes things tight in a 32-bit address space,
and that shuffling things around is a good idea to make more things
work, but I really like having Valgrind and the client be as separate as
possible. And I really think that the direct-mapped shadow memory makes
the most sense for 64-bit systems, even if it doesn't for 32-bit.

> - Self-hosting might be possible now that Valgrind (ie. stage1) is a
>   normal executable again, rather than a .so as it was originally?
>   Not sure.

Don't see why. That's not the hard part of self-hosting. The tricky
part is making the system emulation match what Valgrind itself uses
(which in turn needs improvements to the VCPU's exception model).

> - We still use our own libraries, and in fact try to remove all our
>   use of glibc altogether, as relying on it feels like a bad idea.
>   (stage1 might be able to use glibc, though? Not sure.)

I think carrying around lots of private libc is just asking for more
work when porting to other systems. My view has been to do a little bit
of hard work, with the payoff that we can avoid lots of hard work by
using the system libraries where possible. I'm not keen on using lots
of 3rd-party libraries, but I would like to be able to compile Valgrind
as a mostly normal executable using C and C++ code without having to
worry too much about what libraries it pulls in. We can't be completely
blithe, and we'll always need to link a bit strangely (though -fpie
helps), but I think that's better than vg_libc.c.

> - V and P would probably share the dynamic linker? So we'd have to be
>   careful about the symbol namespace, but we already do that most of
>   the time anyway so it's not a problem.

Well, that point of contact makes V very vulnerable to the correct
behaviour of the client.
> So the big wins are: reduced complexity (far less memory tracking, we
> let the kernel decide where things go), increased memory layout
> flexibility.

The increased layout flexibility is the only obvious win to me, and I
think it costs quite a bit. And now that I have a 64-bit machine, I can
easily say that this is all a lot of engineering effort to keep obsolete
systems happy ;-).

> Things I'm not sure about:
> - Valgrind's stack? Currently auto-extended, but it used to be a
>   fixed size, and that wasn't much of a problem. Could make the SEGV
>   handler much simpler?

Well, we still need to keep Valgrind and the client stack separate.
Valgrind's stack is fixed size, but the client's has to grow. If we use
the stack the system gave us as the client stack, then we don't need to
worry about it.

> - Debug info reading, and segments, and the interactions there.

That's an independent problem. We can massively reduce the
memory/address space use needed for debug info reading whether we use FV
or not.

> - Is the segment list used in other ways?

It's also used to keep track of what areas in memory have cached code.
We don't use that at the moment, but I was anticipating it being useful
for the self-modifying-code problem.

> - Library interceptions -- what's the effect?

Unchanged, I think. Well, the old model was "we have a function called
X, so it replaces some other function called X", but that ignores the
real complexity of ld.so/glibc's symbol matching, versioned symbols,
etc. We could either do our own lookups, or get ld.so to do them -- I'd
prefer to do it ourselves, on the grounds that we have a bit more
understanding of what's happening (and aren't subject to yet another
change in ld.so's lookup rules). I'm planning on putting some actual
work into dropping our private libpthread, which will change the
interception requirements.

> - The current SEGV handler for the client is nice, in that a
>   seg-faulting client dies with an informative message from V. Also,
>   the "INTERNAL ERROR" msg about V's own seg faults is good. It would
>   be good to preserve that.

That's independent of FV, though it might be a bit simpler with FV than
without.

> I think the end result would be simpler, have less code, be more
> robust, and cause fewer problems for users. Discuss.

I think it's more complicated than that.

> [On a related note, what do we gain from keeping the P and V file
> descriptors strictly partitioned? It's a lot of work enforcing the
> partition, and it makes self-hosting harder.]

Unix has well-known historical rules about how fds are allocated -- the
kernel guarantees that the next fd allocated will be the lowest free
one. Programs can legitimately do something like:

    open("foo", O_RDONLY);  /* I know fds 0-4 are allocated,
                               so this returns 5 */
    read(5, buf, sizeof(buf));

If we have one of our fds mixed in with the client's, they could easily
be stomped on (not to mention that programs have at least as many
use-after-free style bugs with fds as with memory, so if we end up using
an fd which the program had once used, the client might decide to stomp
on it).

J
From: Julian S. <js...@ac...> - 2004-11-20 12:08:44
> Having our own libc is definitely not a good idea, but we haven't gone
> to much effort to replace it yet. glibc makes it hard to intercept brk
> directly, but it is possible to replace malloc/calloc/realloc/etc and
> be reasonably sure of avoiding the use of brk (particularly if you get
> the kernel to enforce it).
>
> Having our own libc is going to be a portability liability.

I have to disagree. Having our own mini-glibc decouples us from the
vagaries of what's supplied as libc on, eg, *BSD, Solaris, AIX, etc. If
we were going to reproduce a large fraction of glibc then it would be a
liability, but we only use a small amount. I am in favour of a "system
abstraction layer" to support V's internal activities, such as Mozilla's
NSPR and OOo's SAL. Clearly it would be smaller and simpler than either
of those, but the principle is the same.

> > 3. P can clobber V's memory.
> >
> >    Partial success. P can't clobber V; but I'm not sure how much of
> >    a problem this was in the first place. (Are there any cases where
> >    Memcheck wouldn't report the clobbering first?) Also, x86 segment
> >    selectors aren't portable, and no convincing alternative is known
> >    for other architectures.
>
> memcheck and addrcheck will report on out of bounds writes, but other
> tools won't. In addition, if you get an out of bounds error which
> doesn't crash the process, it immediately means that all other program
> and valgrind output is unreliable, so you need to be very strict about
> fixing the first problem before considering the rest. I think this
> makes the Valgrind output unreliable.

I never got the impression from user feedback that the P-clobbers-V
problem was significant. What's more, telling people to fix errors in
order is necessary even at present: even if P does not trash V, errors
often form cascades, and it is the first one that needs to be fixed
first.

> Well, hm. If Valgrind is sharing ld.so with the client, then they're
> not really separate programs at all. If the client screws up the
> dynamic linker, Valgrind could get hit and crash without being able to
> report on it at all.

The underlying issue here is that mc/ac access control is too crude.
For a while I have been thinking about 2 A bits per byte, one for
read/exec access and one for write. Then executable areas could be
marked as read-only and the above cannot happen. It would also finally
give us Robert's memory watchpoints for free.

> > 5. Intercepting library calls is difficult and fragile.
> >
> >    ??? It still seems difficult, but I don't understand that stuff
> >    at all. Can someone elaborate?
>
> The machinery in there now is pretty simple. You list a set of
> addresses of functions you want to intercept, and when the codegen is
> told to fetch from those addresses, it actually fetches from the
> intercepting function instead. The addresses can be specified either
> literally or symbolically; symbol names can be qualified by a
> particular library name, or be unqualified. glibc makes things complex
> by being complex itself, but the core machinery is pretty simple.

Yes, I agree -- it's not that complex.

> > 6. Statically linked programs are not supported.
> >
> >    Partial success. We can run statically linked binaries. However,
> >    we can't intercept malloc() et al, which hobbles tools to various
> >    degrees (eg. Cachegrind not at all, Memcheck somewhat, Addrcheck
> >    quite a lot, Massif totally).
>
> We can intercept malloc/free/etc if the program hasn't been stripped.
> It's just that with dynamic linking, the programs are never (can't be)
> completely stripped.

I never understood why we care about statically linked executables. My
view is they are a special-case anomaly which it is not worth
supporting. No developer doing day-to-day hacking is going to
continually be building statically linked executables (are they?) IMO
just detecting them and stopping with a warning is good enough.
> > - Also, the inflexibility of the memory layout has caused many
> >   problems:
> >   - difficulties for non-standard (ie. 3G:1G) kernels
> >   - can't run with a virtual memory ulimit (bad esp. for embedded
> >     developers)
>
> Can you explain? What do you mean by "embedded developers"?

I guess, developers operating in scenarios with small amounts of
virtual memory?

> > V has dropped from the #6 to the #72 highest-rated project at
> > Freshmeat.net over the last year or so; I think the reason for this
> > is that V's "it just works" characteristic has been diminished, due
> > to the robustness and inflexibility problems.
>
> Um, do you have anything to support that? I think the ranking is
> dropping because V is not new anymore, and people are taking it for
> granted.

I tend to agree. Not convinced that V is particularly much more buggy
than before.

> > - V and P mappings are totally intermingled. We just let the kernel
> >   mmap things wherever it wants, without trying to enforce any
> >   layout ourselves. (This precludes big-bang shadow allocation, and
> >   thus precludes fixed-offset shadow memory addressing being used in
> >   the future. This does not worry me.) This makes startup much
> >   easier, since we don't have to be so careful about where things
> >   go.
>
> ...
> I guess my OS/kernel background is really making me dislike this idea.

Uh ... you need to say *why* you dislike the idea. Why?

> > - Self-hosting might be possible now that Valgrind (ie. stage1) is
> >   a normal executable again, rather than a .so as it was originally?
> >   Not sure.
>
> Don't see why. That's not the hard part of self-hosting. The tricky
> part is making the system emulation match what Valgrind itself uses
> (which in turn needs improvements to the VCPU's exception model).

The new VCPU (VEX) provides precise (memory) exceptions if you need
them.

> The increased layout flexibility is the only obvious win to me, and I
> think it costs quite a bit. And now that I have a 64-bit machine, I
> can easily say that this is all a lot of engineering effort to keep
> obsolete systems happy ;-).

:-) the obsolete systems are going to be around for a *long* time yet
:-) and will probably always be the majority.

J
From: Jeremy F. <je...@go...> - 2004-11-23 23:57:52
On Sat, 2004-11-20 at 12:08 +0000, Julian Seward wrote:

> I have to disagree. Having our own mini-glibc decouples us from the
> vagaries of what's supplied as libc on, eg, *BSD, Solaris, AIX, etc.
> If we were going to reproduce a large fraction of glibc then it would
> be a liability, but we only use a small amount. I am in favour of a
> "system abstraction layer" to support V's internal activities, such as
> Mozilla's NSPR and OOo's SAL. Clearly it would be smaller and simpler
> than either of those, but the principle is the same.

Most of what we have in vg_libc is either incredibly generic stuff, like
str*(), or very system-dependent stuff, like syscall interfaces. The
trouble is that every OS has different conventions for how syscalls are
called, and how libc functions are mapped to those syscalls. That's
stuff that the system libc already has and knows how to deal with. We
would just have to replicate it for each system. Our libc needs are
pretty generic: we're not going to have problems with a libc not having
memcpy. I think using the system libcs will present less variation than
the differences in the underlying kernel interfaces which they hide.

> I never got the impression from user feedback that the P-clobbers-V
> problem was significant. What's more, telling people to fix errors in
> order is necessary even at present: even if P does not trash V, errors
> often form cascades, and it is the first one that needs to be fixed
> first.

The whole point is that Valgrind is for operating on programs which are
assumed to be broken; assumed to be behaving in unpredictable ways. A
program may work fine under memcheck but trash random addresses under
helgrind. If Valgrind can't protect itself from the client, then its
results will always be under a cloud. With pointercheck, I know that
something is either a bug in Valgrind, or a bug in the client, but it
definitely isn't the client trashing Valgrind.

> > Well, hm. If Valgrind is sharing ld.so with the client, then
> > they're not really separate programs at all. If the client screws
> > up the dynamic linker, Valgrind could get hit and crash without
> > being able to report on it at all.
>
> The underlying issue here is that mc/ac access control is too crude.
> For a while I have been thinking about 2 A bits per byte, one for
> read/exec access and one for write. Then executable areas could be
> marked as read-only and the above cannot happen.

I don't see how that would help. If there's one instance of ld.so, then
there's one set of data structures. Sometimes those structures will be
manipulated by the client, with ld.so running on the virtual CPU, and
sometimes running on the real CPU as part of Valgrind. I don't see how
you could set up a permissions system which allows those structures to
be manipulated correctly but never trashed by the client.

> It would also finally give us Robert's memory watchpoints for free.

Well, at the cost of increasing the virtual address space pressure,
which apparently is a significant problem.

> I never understood why we care about statically linked executables.
> My view is they are a special-case anomaly which it is not worth
> supporting. No developer doing day-to-day hacking is going to
> continually be building statically linked executables (are they?) IMO
> just detecting them and stopping with a warning is good enough.

They're useful for people doing embedded stuff. You can link an RTOS
with an app into a simulation environment and run it as a normal
process.

> > > - Also, the inflexibility of the memory layout has caused many
> > >   problems:
> > >   - difficulties for non-standard (ie. 3G:1G) kernels
> > >   - can't run with a virtual memory ulimit (bad esp. for embedded
> > >     developers)
> >
> > Can you explain? What do you mean by "embedded developers"?
>
> I guess, developers operating in scenarios with small amounts of
> virtual memory?
Embedded systems normally have a problem of not enough physical memory,
which is unaffected by any of these proposals. If they're running Linux
on an x86, then there's no particular reason they'd have limited virtual
space. The virtual ulimit is only useful for stopping a runaway program
from generating a thrash storm, and only under a very limited number of
cases (since it only affects the brk syscall, and not mmap).

> > > - V and P mappings are totally intermingled. We just let the
> > >   kernel mmap things wherever it wants, without trying to enforce
> > >   any layout ourselves. (This precludes big-bang shadow
> > >   allocation, and thus precludes fixed-offset shadow memory
> > >   addressing being used in the future. This does not worry me.)
> > >   This makes startup much easier, since we don't have to be so
> > >   careful about where things go.
> >
> > ...
> > I guess my OS/kernel background is really making me dislike this
> > idea.
>
> Uh ... you need to say *why* you dislike the idea. Why?

Well, from an OS perspective, having a kernel which protects itself from
buggy application code is what marks the difference between a real OS
and a piece of unreliable junk. Unprotected operating systems work with
the charming naivety that all application code is basically bug-free and
won't cause any damage.

Since in Valgrind we know the client code is buggy, probably with some
kind of memory problem, we know there's a good likelihood that the
client is going to start taking pot-shots at the core. If we're lucky,
it will do it when we're running under addr/memcheck. If not, it will
quietly corrupt things when we're using some other tool. And I know,
heisenbugs which appear under some tools but not others are an
orthogonal problem which is generally unsolvable, but FV makes checking
for the wildest memory access problems pretty cheap, and it also means
that Valgrind's memory allocation patterns have less/no effect on the
client allocation layout.
> The new VCPU (VEX) provides precise (memory) exceptions if you need
> them.

Good.

> :-) the obsolete systems are going to be around for a *long* time yet
> :-) and will probably always be the majority.

Yes, but 64-bit machines are already becoming pretty common, and will be
a more attractive machine for hosting Valgrind sessions than 32-bit
machines. In other words, in the total population of systems I agree
with you, but in the world of developers-using-Valgrind, I think 64-bit
systems will be much more common.

J
|
From: Johannes S. <Joh...@gm...> - 2004-11-20 15:49:34
|
Hi,

On Sat, 20 Nov 2004, Julian Seward wrote:

> I never understood why we care about statically linked executables.
> My view is they are a special-case anomaly which it is not worth supporting.
> No developer doing day-to-day hacking is going to continually be building
> statically linked executables (are they?) IMO just detecting them and
> stopping with a warning is good enough.

Valgrind started as just a good memory leak checker. As it evolved, it became possible to use it for reverse engineering - a sort of "call tree for assembler".

Ciao,
Dscho |
|
From: Nicholas N. <nj...@ca...> - 2004-11-20 16:03:45
|
On Sat, 20 Nov 2004, Johannes Schindelin wrote:

>> I never understood why we care about statically linked executables.
>> My view is they are a special-case anomaly which it is not worth supporting.
>> No developer doing day-to-day hacking is going to continually be building
>> statically linked executables (are they?) IMO just detecting them and
>> stopping with a warning is good enough.
>
> Valgrind started as just a good memory leak checker. As it evolved, it got
> possible to make use of it to do reverse engineering - a sort of "call
> tree for assembler".

Sorry, I don't see how this relates to statically linked binaries?

N |
|
From: Julian S. <js...@ac...> - 2004-11-20 16:25:21
|
On Saturday 20 November 2004 16:03, Nicholas Nethercote wrote:

> On Sat, 20 Nov 2004, Johannes Schindelin wrote:
> >> I never understood why we care about statically linked executables.
> >> My view is they are a special-case anomaly which it is not worth
> >> supporting. No developer doing day-to-day hacking is going to
> >> continually be building statically linked executables (are they?) IMO
> >> just detecting them and stopping with a warning is good enough.
> >
> > Valgrind started as just a good memory leak checker. As it evolved, it
> > got possible to make use of it to do reverse engineering - a sort of
> > "call tree for assembler".
>
> Sorry, I don't see how this relates to statically linked binaries?

Me neither. Besides, if I merely wanted a leak checker I would never have jumped through 10000 hoops to build a simulated CPU. It really started out as a good uninitialised-value detector.

J |
|
From: Johannes S. <Joh...@gm...> - 2004-11-20 21:06:58
|
Hi,

On Sat, 20 Nov 2004, Nicholas Nethercote wrote:

> On Sat, 20 Nov 2004, Johannes Schindelin wrote:
>
> >> I never understood why we care about statically linked executables.
> >
> > Valgrind started as just a good memory leak checker. As it evolved, it got
> > possible to make use of it to do reverse engineering - a sort of "call
> > tree for assembler".
>
> Sorry, I don't see how this relates to statically linked binaries?

Sorry, I was a bit terse. Suppose you have a statically linked binary. You don't have the source code, and you will never get it. This program does a lot of useful things, but you are interested in exactly one function. To find out where the code for this function is, in order to analyse that assembly code, you can "follow the data": you can write a program which knows at which point in memory the input is stored, marks this as interesting data, and then executes the program, marking as interesting all memory locations which contain derived data, and all conditional jumps depending on the values of marked data. In this manner, you can get a good idea of where to look closely at the assembly code.

Unfortunately, I have not yet succeeded in doing this with valgrind, but I used bochs for this some years ago. I plan to do this sometime next year with valgrind, because it is much easier (as compared to bochs) to mark the initial interesting data, and it is way faster. The only problem there would be tracking of interesting data in the registers.

And I didn't mean to say "just another memory leak checker"; I rather meant that when valgrind v1 was current, I used valgrind to prove to a software vendor that indeed, his program was faulty, and no, my data was not too large for the application. I could even point to a certain code range where memory was eaten away, when it should have been freed at each iteration of a for loop. 
So I should have said "I got addicted to valgrind when I used it just as a memory leak checker, when LD_PRELOAD methods didn't work because malloc was called directly".

Keep up the good work,
Dscho |
|
From: Nicholas N. <nj...@ca...> - 2004-11-22 16:06:35
|
On Fri, 19 Nov 2004, Jeremy Fitzhardinge wrote:

> Having our own libc is definitely not a good idea, but we haven't gone
> to much effort to replace it yet. glibc makes it hard to intercept brk
> directly, but it is possible to replace malloc/calloc/realloc/etc and be
> reasonably sure of avoiding the use of brk (particularly if you get the
> kernel to enforce it).

I don't think "reasonably sure" is good enough.

> But using system libraries wasn't the only reason to disentangle
> ourselves from the dynamic linker. The dynamic linker itself is 1) very
> GNU/glibc-specific 2) has changed a lot over the last few years, and
> doesn't seem like stopping, and so 3) depending on it in detail is going
> to continue to be a maintenance and portability problem. We're stuck
> with having to deal with it for the purposes of interception, but it
> would be nice to be independent of it for the basic functioning of
> Valgrind.

How does FV provide that independence?

> A bounds-limit test for each memory access isn't that expensive, and in
> 64-bit address spaces, you can make the client address space a power of
> 2 in size, which simplifies the test. You could also use a v. large
> redzone to make hits much more unlikely. The segment test is nice
> because it actually is free, but explicit testing probably isn't that
> expensive, particularly if the codegen can remove redundant tests, and
> schedule the tests it does generate appropriately.

I'm not very keen on features that require greatly different mechanisms on different architectures.

> Well, hm. If Valgrind is sharing ld.so with the client, then they're
> not really separate programs at all. If the client screws up the
> dynamic linker, Valgrind could get hit and crash without being able to
> report on it at all.

Julian made a good point about distinguishing between read-only and read-write memory with Memcheck/Addrcheck. Also, my proposal doesn't preclude Valgrind from keeping its own copy of ld.so, as is done now.

>> - Code size. FV added a lot of code. Especially keeping track of all the
>> mapped segments (and there are still several nasty bugs in there).
>
> You know, I'm really not sure that it did. I'll agree that the skiplist
> code has been more subtly broken for longer than it should have been,
> but as a generic data structure we should be able to get good use from
> it. And really, the mapped segment code is there to replace the old
> stuff which kept reading /proc/self/map; that was getting to be a pretty
> significant bottleneck and was plain ugly (ie, the mapped segment stuff
> would have been needed anyway, regardless of FV).

Ok, the segment list could be kept with my proposal. The implementation needs overhauling though. The big problem is that each segment is a range, but the skip-list's interface only allows for it to be (easily) treated as a key-value table, rather than a range-value table. And so various hoops have to be jumped through to account for that -- for example, SkipList_Find's non-intuitive behaviour: it returns the matching node, or the previous one if there's no match, or NULL if the key is below the first on the list. If you want to find the segment that contains an address, you have to call SkipList_Find and then look at the returned node to see if the searched-for address is within it. This is crazy.

I rewrote the skip-list the other day to provide a much cleaner interface that avoids these strange behaviours. I haven't managed to integrate it yet, however, because several of the places that use the skip-list functions do so in such a difficult-to-understand way that I was unable to determine for a number of them whether they were buggy, or doing something extremely subtle.

There's also a nasty 7-function cycle in the memory allocation stuff which Julian stumbled across; in obscure circumstances you can get an infinite loop when the program allocates memory, so Valgrind creates a new skip-list node, which can require allocating a new superblock, which requires another skip-list node, etc. (Or something like that, I can't remember the exact details now.) The segment list should not use the same allocator as the rest of Valgrind.

> The other large code change is the syscall handling stuff, which is
> independent of FV.

Sure.

> I dunno. Valgrind is a lot more complex now, but it does do a lot more
> stuff. I don't think we're going to return to the halcyon days of 1.0
> simplicity and still manage to keep the functionality.

Of course not. My proposal doesn't reduce the functionality at all, except for the strict client/Valgrind separation. What I object to is the use of techniques that are clever but fragile. Also, doing work that the kernel could do for us (ie. deciding where to put maps) is not good.

>> - Robustness. FV is generally more fragile; there are more things to
>> get right, and the consequences are bad if they are not right. IMHO
>> we get more random seg fault problems now. A lot have been cleaned up
>> (it was really bad at first), but they still happen.
>
> Yes, but I think that comes with the "doing more stuff". 1.0 would just
> fail outright on a lot of programs. 2.x tries to run them, and
> generally (but not always) succeeds. And again, I don't think this is
> strictly an FV issue.

I agree with that statement for the ProxyLWP stuff -- yes, it's more complicated, but handles more programs. As for FV... apart from statically linked binaries, what kinds of programs can we run with it that we could not run without? My argument is that FV reduced the number of programs Valgrind could run, since you need a standard-ish kernel, no virtual memory limit, and enough swap space if you have a non-overcommitting kernel.

And also it runs out of memory earlier than previously, due to the address layout inflexibilities.

> Can you explain? What do you mean by "embedded developers"? The

Like Julian said, people writing programs that have strict memory limits. A couple of people have recently asked for a --mem-limit option, because they cannot use ulimit to restrict memory sizes.

>> V has dropped from #6 to #72 highest-rated project at Freshmeat.net over
>> the last year or so; I think the reason for this is that V's "it just
>> works" characteristic has been diminished, due to the robustness and
>> inflexibility problems.
>
> Um, do you have anything to support that? I think the ranking is
> dropping because V is not new anymore, and people are taking it for
> granted. Are we seeing an increase in bug reports disproportionate to
> the number of users?

Of course I can't prove it. But I do think that Valgrind crashes with random, unexplained seg faults more than it used to. There are quite a lot of bugs in Bugzilla like that. Some of them have been dealt with since I made FV more strict about checking the results of mmap(), etc, but we still get them.

> I'd still like to be able to use C++ internally.

It could certainly be useful in places; some places where tools augment core data structures with extra info cry out for inheritance. But I'm happy to not use C++ if it causes too many problems. It's also a slippery slope if you try to restrict yourself to only a subset of the language.

> And I really think that the direct mapped shadow memory makes the most
> sense for 64-bit systems, even if it doesn't for 32-bit.

Well, it's only speculation at the moment.

> The increased layout flexibility is the only obvious win to me, and I
> think it costs quite a bit.

I consider it a big win, and the disadvantages not that big. It's the thin vs. thick model again; with FV we are duplicating the work of the kernel for memory layout. Just take a look at the syscall wrappers for mremap and brk.

Yesterday I tried removing the strict partitioning, and the built-in support for shadow memory (tools allocate shadow memory just with VG_(get_memory_from_mmap)(), like they used to). I cut about 300 lines of code with hardly any effort, and that was just a start.

> Well, we still need to keep Valgrind and the client stack separate.
> Valgrind's stack is fixed size, but the client's has to grow. If we use
> the stack the system gave us as the client stack, then we don't need to
> worry about it.

Yes.

>> I think the end result would be simpler, have less code, be more robust,
>> and cause fewer problems for users. Discuss.
>
> I think its more complicated than that.

How did I come around to this viewpoint? I've been doing a lot of thinking about memory layout in the last couple of months, mostly trying to work out how to get the flexibility back while preserving FV's strict client/Valgrind separation. The thinking has been prompted by the large number of people who have been having difficulties due to non-3G:1G kernels, the RH8 over-committing problem, people complaining about ulimits not working, running out of memory prematurely, etc. There have been a lot. I've really been trying to come up with plausible ways to address the problems within FV, and I have concluded that it can't be done. If you are willing to abandon the strict client/Valgrind separation, lots of complication and problems fall away immediately.

Also, the difficulties above that I encountered when trying to fix the skip-list problems made me quite unhappy; the code has to stay maintainable. FV is a nice idea, but practice has shown us that it ultimately is flawed. There's no shame in that, but it's not a good idea to ignore these flaws.

N |
|
From: Nicholas N. <nj...@ca...> - 2004-11-22 16:13:06
|
On Sat, 20 Nov 2004, Julian Seward wrote:

> I never understood why we care about statically linked executables.
> My view is they are a special-case anomaly which it is not worth supporting.
> No developer doing day-to-day hacking is going to continually be building
> statically linked executables (are they?)

I don't think that's true. I remember at least one person saying "thank you thank you thank you" when support for this was added. |
|
From: Ashley P. <as...@qu...> - 2004-11-22 16:35:12
|
On Mon, 2004-11-22 at 16:12 +0000, Nicholas Nethercote wrote:

> On Sat, 20 Nov 2004, Julian Seward wrote:
>
> > I never understood why we care about statically linked executables.
> > My view is they are a special-case anomaly which it is not worth supporting.
> > No developer doing day-to-day hacking is going to continually be building
> > statically linked executables (are they?)
>
> I don't think that's true. I remember at least one person saying "thank
> you thank you thank you" when support for this was added.

I too write and maintain software that doesn't really like static executables, and there are a surprising number of people who use them.

Ashley, |
|
From: Jeremy F. <je...@go...> - 2004-11-24 00:01:00
|
On Mon, 2004-11-22 at 16:06 +0000, Nicholas Nethercote wrote:
> I don't think "reasonably sure" is good enough.
At the moment, in Valgrind as it stands, it's "absolutely sure", because
we set the ulimit to prevent the kernel from paying attention to the brk
syscall.
In the larger scheme of things, memory allocators are *the* most
overridden class of functions in any libc; every malloc debugging
package needs to be able to do it. If we can't do it, then that's a
pretty severe libc bug on that platform.
BSD's libc makes it easy to intercept the brk syscall, which solves the
problem simply. glibc makes it hard, but not impossibly so. The
problem will need to be addressed in some way for each platform.
But the libc question is a digression.
> > But using system libraries wasn't the only reason to disentangle
> > ourselves from the dynamic linker.
>
> How does FV provide that independence?
The client and the Valgrind core are running completely separate
instances of ld.so. The client may not have an ld.so at all, or it may
be a completely different implementation from the core's. The point is
that the core uses ld.so like any other normal program would, and
doesn't rely on it performing special tricks or magic (or at all -
Valgrind could be statically linked if the target required it).
In the original scheme, we were relying on:
* ld.so starting Valgrind "early enough"
* the client and V running ld.so on both the virtual and real CPUs
* blind good luck that these would never happen at the same time
* the client not trashing the ld.so structures
The only reason it worked at all is that we used dlopen(, LD_BIND_NOW)
which stopped V from doing lazy incremental binding as it ran, but it's
putting a lot of faith in the dynamic linker to assume that that means
it will never run code on your behalf.
> > A bounds-limit test for each memory access isn't that expensive, and in
> > 64-bit address spaces, you can make the client address space a power of
> > 2 in size, which simplifies the test. You could also use a v. large
> > redzone to make hits much more unlikely. The segment test is nice
> > because it actually is free, but explicit testing probably isn't that
> > expensive, particularly if the codegen can remove redundant tests, and
> > schedule the tests it does generate appropriately.
>
> I'm not very keen on features that require greatly different mechanisms on
> different architectures.
Me neither. In this particular instance, it is something which is
arch-dependent anyway, and doesn't have widespread implications. For
CPU X, how do we generate a "bounds check pointer Y" operation? It's
just a codegen question. Similarly, creating a redzone is just part of
"How do we lay out a process? What goes where?".
The bigger problem is handling shadow memory access. The current scheme
can't be scaled to a larger address space simply - at the very least you
need to add an extra layer of page table. Once you have that, you need
to work out how to abstract "shadow memory access" so that every
tool doesn't need to know about how to do it, which in turn means that
your options are wider.
The nice thing about a large shadow mapping is that it does naturally
scale unchanged from 32-bit to 64-bit address spaces, because we use the
CPU's own mapping hardware to do the expensive/tricky parts. The
downside is that it uses too much virtual address space on 32-bit
machines. The "manual" pagetable scheme is better for virtual address
space use, but it flat out doesn't scale to 64-bits at all.
> Julian made a good point about distinguishing between read-only and
> read-write memory with Memcheck/Addrcheck. Also, my proposal doesn't
> preclude Valgrind from keeping its own copy of ld.so, as is done now.
Yeah, that would have to be a prerequisite for it to be a workable
scheme.
I guess you could keep them separate, but you'd still need the stage1/2
system, and a specially linked Valgrind to make sure that as V
initializes it doesn't start occupying the client's address space.
> I rewrote the skip-list the other day to provide a much cleaner
> interface that avoids these strange behaviours.
Oh, good. Can you describe it?
> several of the places that use the
> skip-list functions do so in such a difficult-to-understand way that I was
> unable to determine for a number of them if they were buggy, or doing
> something extremely subtle
Yeah, that needs cleaning up.
> There's also a nasty 7-function cycle in the
> memory allocation stuff which Julian stumbled across
Yep, that stuff is always tricky.
> Of course not. My proposal doesn't reduce the functionality at all,
> except for the strict client/Valgrind separation. What I object to is the
> use of techniques that are clever but fragile. Also, doing work that the
> kernel could do for us (ie. deciding where to put maps) is not good.
Well, map placement is pretty straightforward. It seems to me that if
we can use the CPU's pagetable hardware directly, that's a bigger
complexity/performance win, and if the cost is doing our own placement,
that's a fair tradeoff. Even aside from that, I think pointercheck is a
pretty important thing in its own right.
> Like Julian said, people writing programs that have strict memory limits.
> A couple of people have recently asked for a --mem-limit option, because
> they cannot use ulimit to restrict memory sizes.
Eh? That's different. They want to be able to restrict the client's
memory allocation. That doesn't mean that Valgrind itself is
constrained in how much memory it can use. And as I said to Julian,
embedded systems have physical memory limits, not virtual.
> Of course I can't prove it. But I do think that Valgrind crashes with
> random, unexplained seg faults more than it used to. There are quite a
> lot of bugs in Bugzilla like that. Some of them have been dealt with
> since I made FV more strict about checking the results of mmap(), etc, but
> we still get them.
My concern is that allowing the client to trash Valgrind's memory will
increase the incidence of unexplainable bugs rather than decrease them.
> FV is a nice idea, but
> practice has shown us that it ultimately is flawed. There's no shame in
> that, but it's not a good idea to ignore these flaws.
I agree we should acknowledge and try to fix the problems, but I'm
basically pessimistic. The real problem is that 32-bits is just not
enough address space for us. We need more. After all, both schemes use
the same amount of physical memory; this is just a question of how
virtual address space gets used. With, say, memcheck, a process which
approaches using 1.5Gbytes of memory (in a 3G user space) is going to
run out of memory either way.
I also think the implied process memory model created by the
intermingled scheme presents a lot of problems.
At the moment, the client gets a nice clear piece of address space which
it can do what it likes with: it can create large(-ish) mappings,
knowing that the space is clear; it can scan /proc/self/maps and see
only the mappings it created[*], knowing that it can create mappings in
the gaps. It can know where it has been placed, make assumptions about
what mappings in the address space exist, use MAP_FIXED, munmap,
mprotect without causing us any problems. If it tries to get out of its
address space, it fails exactly as if the kernel had given it a small
address space.
With the intermingled scheme, we can handle the client making lots of
little maps all over the address space, but we're fragmenting the
client's address space. Programs which want to create large mappings
will be thwarted because there just aren't any more holes left in the
address space (these could get very small). What's worse, this could
easily be non-deterministic from run to run, since the kernel could
easily choose new places for the mappings.
If a process does a mmap/munmap/mprotect, we have to make sure it
doesn't hit any Valgrind mappings, and work out what a sane response is
if it does. There's no analogy in the normal operation of the kernel
(since we're effectively creating a discontiguous address space), so we
need to invent a discontiguous process memory model and its semantics.
[ * - not implemented yet ]
So if I can summarize:
FV - pros:
* Valgrind protected from the client
* Client gets clear, flat address space
cons:
* static allocation of the address spaces limits client address
space size
* construction of address space mappings can fail on some systems
* code complexity in managing mappings, and separating address
spaces
Intermingled - pros:
* flexible address space layout handles clients with lots of small
sparse mappings
* no large mmaps or other tricky allocations
cons:
* introduces discontiguous process memory model
* code complexity in protecting valgrind mmaps
* valgrind unprotected from client
* non-deterministic process layout and address space fragmentation
J
|
|
From: Nicholas N. <nj...@ca...> - 2004-11-24 09:44:14
Attachments:
diff-skiplist
s.c
|
On Tue, 23 Nov 2004, Jeremy Fitzhardinge wrote:

>> I rewrote the skip-list the other day to provide a much cleaner
>> interface that avoids these strange behaviours.
>
> Oh, good. Can you describe it?

See the attached diff, and a unit test file, which you have to put in the coregrind/ directory of a Valgrind tree, and build with something like this:

    gcc -c -g -I../include -I../include/x86 -Ix86 -I../include/x86-linux/ -I../include/linux -Ix86-linux -I.. s.c -fprofile-arcs -ftest-coverage
    gcc s.o -o a.out -fprofile-arcs -ftest-coverage

The basic change is that the node compare function is now asymmetric -- it compares a node with a key, rather than two keys. This is necessary for doing ranges, because usually you're comparing an address against a range held by a segment. You can still use the data structure for normal non-range key/value tables too.

>> Of course not. My proposal doesn't reduce the functionality at all,
>> except for the strict client/Valgrind separation. What I object to is the
>> use of techniques that are clever but fragile. Also, doing work that the
>> kernel could do for us (ie. deciding where to put maps) is not good.
>
> Well, map placement is pretty straightforward.

I disagree. You said yourself that the skiplist/segment stuff is messy and needs cleaning up. Remember also that you wrote that code; I look at it and get confused. And it's distributed complexity, which is a bad thing.

> I agree we should acknowledge and try to fix the problems, but I'm
> basically pessimistic. The real problem is that 32-bits is just not
> enough address space for us. We need more.

No -- 32 bits is not enough address space for FV. We can't ignore 32-bit platforms just because they're more difficult than 64-bit platforms. We have to balance conflicting constraints, and I think FV goes too far in one direction. 
> At the moment, the client gets a nice clear piece of address space which
> it can do what it likes with: it can create large(-ish) mappings,
> knowing that the space is clear; it can scan /proc/self/maps and see
> only the mappings it created[*], knowing that it can create mappings in
> the gaps. It can know where it has been placed, make assumptions about
> what mappings in the address space exist, use MAP_FIXED, munmap,
> mprotect without causing us any problems. If it tries to get out of its
> address space, it fails exactly as if the kernel had given it a small
> address space.
>
> With the intermingled scheme, we can handle the client making lots of
> little maps all over the address space, but we're fragmenting the
> client's address space. Programs which want to create large mappings
> will be thwarted because there just aren't any more holes left in the
> address space (these could get very small).

Oh come on, without FV the kernel is going to basically map things upwards in a straight line from the mapbase; it's not going to put little maps around randomly and cause fragmentation. The memory layout would be more flexible, not less, because you wouldn't have the big-bang shadow area, and you wouldn't have Valgrind occupying the top 0x10000000 bytes.

> What's worse, this could easily be non-deterministic from run to run,
> since the kernel could easily choose new places for the mappings.

Why is that a problem?

N |
|
From: Jeremy F. <je...@go...> - 2004-11-24 19:25:15
|
On Wed, 2004-11-24 at 09:43 +0000, Nicholas Nethercote wrote:

> > I agree we should acknowledge and try to fix the problems, but I'm
> > basically pessimistic. The real problem is that 32-bits is just not
> > enough address space for us. We need more.
>
> No -- 32 bits is not enough address space for FV. We can't ignore 32-bit
> platforms just because they're more difficult than 64-bit platforms. We
> have to balance conflicting constraints, and I think FV goes too far in
> one direction.

Yes, I understand that, but I think the intermingled scheme has a number of serious problems as well. It needs to use address space too, there's no getting around that, and the question is whether it's better to preallocate a chunk up front, or mingle it among the application's mappings on an incremental basis. In both cases, the client can allocate the same amount of actual memory; the same number of pages are available. In the FV case, pages above a particular address are off limits. In the intermingled case, the off-limits pages are scattered among the client's mappings. So, aside from programs which actually need to use address X, where X is in Valgrind's part of the address space, what doesn't work in the FV case which does work in the intermingled case?

The pressing problem is that V+memcheck more than doubles a program's memory use, and when a program starts using about as much memory as it has address space, that becomes a problem. That's why address space is a problem for Valgrind, regardless of how it actually uses it.

> Oh come on, without FV the kernel is going to basically map things upwards
> in a straight line from the mapbase; it's not going to put little maps
> around randomly and cause fragmentation.

Yes, it is. The kernel, by design, puts mmaps at random addresses as a security measure. You can work around it, but assuming that mmap isn't going to do that is non-portable, and just doesn't work for modern kernels. Even on older kernels, once the mmap area had been gone over once, the map placement was pretty random.

Even if it did just mmap linearly up the address space, it's going to lead to bad fragmentation. After each client mapping you'd have a valgrind mapping, so that the memory layout looks like CVCVCVCVCV. If the client then removes all its mappings and wants to create one large one, then it will fail to find the memory, even though there should be enough.

> The memory layout would be more
> flexible, not less, because you wouldn't have the big-bang shadow area,
> and you wouldn't have Valgrind occupying the top 0x10000000 bytes.

Well, the flexibility is useful, but it's also a problem. It means that most stuff will work mostly, but it also means that the free address space is broken up in ways that the client (or the programmer trying to debug it) can't know about, anticipate, work around or understand. The location and size distribution of free address space will vary from run to run, possibly causing the client to work sometimes and fail at others.

> > What's worse, this could easily be non-deterministic from run to run,
> > since the kernel could easily choose new places for the mappings.
>
> Why is that a problem?

Well, because you'll end up with bug reports like: "My program ran for 2 hours before crashing with mmap failed. When I ran it again it crashed after only 10 minutes. It has been running now for 3 hours, but I don't know if it will finish (I expect it will take 10)". If the program only does small mmaps, then there probably won't be a problem, but as soon as it wants to do largeish (64Mbyte? 128Mbyte?) maps, virtual address space fragmentation will be a real, pressing issue.

J |
Even on older kernels, once the mmap area had been gone over once, the map placement was pretty random. Even if it did just mmap linearly up the address space, it's going to lead to bad fragmentation. After each client mapping you'd have a valgrind mapping, so that the memory layout looks like CVCVCVCVCV. If the client then removes all its mappings and wants to create one large one, then it will fail to find the memory, even though there should be enough. > The memory layout would be more > flexible, not less, because you wouldn't have the big-bang shadow area, > and you wouldn't have Valgrind occupying the top 0x10000000 bytes. Well, the flexibility is useful, but it's also a problem. It means that most stuff will work mostly, but it also means that the free address space is broken up in ways that the client (or the programmer trying to debug it) can't know about, anticipate, work around or understand. The location and size distribution of free address space will vary from run to run, possibly causing the client to work sometime and fail at others. > > What's worse, this could easily be non-deterministic from run to run, > > since the kernel could easily choose new places for the mappings. > > Why is that a problem? Well, because you'll end up with bug reports like: "My program ran for 2 hours before crashing with mmap failed. When I ran it again it crashed after only 10 minutes. It has been running now for 3 hours, but I don't know if it will finish (I expect it will take 10)". If the program only does small mmaps, then there probably won't be a problem, but as soon as it wants to do largeish (64Mbyte? 128Mbyte?) maps, virtual address space fragmentation will be a real, pressing issue. J |