From: Jeremy F. <je...@go...> - 2004-06-22 21:40:12
On Tue, 2004-06-22 at 13:03 +0100, Nicholas Nethercote wrote:
> Hi,
>
> I've been looking at ways to improve FV's memory layout, to address bug
> #82301.
>
> There's a comment in coregrind/ume.c that explains things:
>
>   CLIENT_BASE     +-------------------------+
>                   | client address space    |
>                   :                         :
>                   :                         :
>                   | client stack            |
>   client_end      +-------------------------+
>                   | redzone                 |
>   shadow_base     +-------------------------+
>                   |                         |
>                   : shadow memory for skins :
>                   | (may be 0 sized)        |
>   shadow_end      +-------------------------+
>                   : gap (may be 0 sized)    :
>   valgrind_base   +-------------------------+
>                   | valgrind .so files      |
>                   | and mappings            |
>   valgrind_mmap_end                         -
>                   | kickstart executable    |
>                   -                         -
>                   | valgrind heap  vvvvvvvvv|
>   valgrind_end    -                         -
>                   | valgrind stack ^^^^^^^^^|
>                   +-------------------------+
>                   : kernel                  :
>
> Basically, memory is partitioned into 3 parts:
>
> - client space
> - shadow memory space (if needed)
> - valgrind + tool space
>
> The problem is that sometimes one of these spaces gets exhausted when
> there's still plenty of room in the others, which is a shame.
>
> It's not easy to see how to make things more flexible, although I have
> some ideas.  I have some questions, most of which are primarily directed
> to Jeremy:
>
> * What's considered a normal client stack size?  Is it reasonable to
>   limit it?  If I could safely limit that to say, 8MB, that could be very
>   useful in making limits more flexible.  (Even something much bigger, eg.
>   64MB, would be ok).

Mostly it can be pretty small (~8M); I would guess that almost nothing
normally uses more.  Some programs may be heavily recursive, or use
large local arrays, and they would need more.  You could make this a
command-line option (CLO).

> * CLIENT_SIZE_MULTIPLE is 64M.  Why so big?  It means the gap between
>   shadow memory and valgrind's space is 95MB on my machine, for Memcheck.
>   I've reduced it to 4M without any apparent ill-effects, and the gap is
>   reduced to 1.5MB, which gives the client an extra 44MB of space to play
>   with.  As a result, the largest segment I can successfully mmap jumps
>   from 235MB to 280MB.  Changing this seems like the easiest way to squeeze
>   out some more megs from the address space.

There's no particular reason for this number; it was mostly to make the
addresses round for easy reading.  Reducing it to 4k should be OK.

> * The diagram above mentions the "valgrind heap".  AIUI, Valgrind doesn't
>   really have a heap as such, because all its allocations are done out of
>   maps.  Are those allocations done out of the "valgrind .so files and
>   mappings" area?  If so, then 128MB for the "kickstart executable, valgrind
>   heap, valgrind stack" area seems far more than is necessary.  Could it be
>   reduced to just big enough for the kickstart executable + valgrind's
>   stack, which together should only be a couple of MB?  (I tried changing
>   this 128MB size but it kept seg-faulting, even when I only shrunk it to
>   112MB, so I think I was not making the change correctly.)

Valgrind does have a heap - it's where VG_(malloc) goes, allocated with
VG_(brk).  valgrind_base->valgrind_map_end is used when Valgrind mmaps
files, not for memory allocation.  The mmap area only needs to be big
enough to fit one .so at a time when reading symbols.

Note that we don't use glibc's malloc internally.  Anything relying on
the "fallback to mmap" behavior is pretty non-portable anyway.

> * On my machine, normally, the heap is roughly 800MB, and the space for
>   mmap segments is 2GB.  (I compute this from the usual heap start being
>   slightly bigger than 0x8048000, and mmap segments usually starting at
>   0x40000000, and the kernel starting at 0xc0000000).  So the heap is about
>   2--2.5x smaller than the mmap segment area.  But under Valgrind, the
>   "heap" (client_base..client_mapbase distance) is set to be 3x the mmap
>   segment area (client_mapbase..client_end).  Surely the ratios should be
>   similar to normal execution?  If I change the ratio from 3x to 0.5x, I get
>   a heap size of 441MB and mmap-segment size of 882MB, and can mmap 640MB
>   segments in Memcheck (up from 235MB).  [Actually, it's complicated by
>   Memcheck's replacement malloc() not using brk() but rather mmap().  Hmm.]
>
> There's definitely room for improvement here.  Comments?

I picked all the initial numbers out of the air, so this is definitely a
tuning process I expected to happen.  I think we can do better for
"typical" programs, but I don't think a static layout can suit everyone.

The other variable is that the kernel might start higher.  FC2's user
address space goes up to 0xffff0000, so there's a lot more space to play
with.  In principle it should be easy to build Valgrind to work in this
larger space, but unfortunately it can't be done dynamically, since it
affects the linking address of the code.

	J
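
For reference, the heap-to-mmap ratio arithmetic discussed in the last question can be sketched as below. This is only an illustration, not Valgrind code: the helper name split_client_space is hypothetical, and the sketch assumes the total client region stays fixed at the 441MB + 882MB quoted for the 0.5x case, which the message does not state for the 3x case. The native figures come from the addresses quoted in the message (0x8048000, 0x40000000, 0xc0000000).

```python
MB = 1024 * 1024


def split_client_space(total_bytes, heap_to_mmap_ratio):
    """Split a client region into (heap, mmap) sizes with heap == ratio * mmap.

    Hypothetical helper for illustration; not part of Valgrind.
    """
    mmap_bytes = total_bytes / (1 + heap_to_mmap_ratio)
    heap_bytes = total_bytes - mmap_bytes
    return heap_bytes, mmap_bytes


# Native x86 Linux layout, using the addresses quoted in the message.
native_heap = 0x40000000 - 0x8048000   # heap start up to the mmap base
native_mmap = 0xC0000000 - 0x40000000  # mmap base up to the kernel
native_ratio = native_heap / native_mmap

# Assumption: the client region totals 441MB + 882MB regardless of ratio.
total = (441 + 882) * MB

for label, ratio in (("3x", 3.0), ("0.5x", 0.5), ("native", native_ratio)):
    heap, mmap_area = split_client_space(total, ratio)
    print(f"{label:>6}: heap {heap / MB:7.1f}MB, mmap area {mmap_area / MB:7.1f}MB")
```

Under that assumption, the 0.5x row reproduces the 441MB / 882MB split from the message, and the native row shows what the split would be if the client region used the same proportions as a normal process.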