From: William H. N. <wil...@ai...> - 2001-07-12 02:42:21
I've been having some problems with "memory leaks", more or less, in my Go-playing program. I spent a long time first trying to debug the problem, and then trying to debug the SBCL behavior which prevents me from using a bunch of verbose (GC :FULL T) operations scattered through my application to figure out where the memory is or isn't being unlinked. I went down a lot of false trails because the behavior of the problem was insanely sensitive to small changes (like whether the test function was named FOO or BAR..), in a way which might suggest that rampant conservative GC coincidences were part of the problem, except that I'd've expected GC coincidences to be very rare, so that didn't make any sense.

Presently I discovered that the way that the GENCGC code deals with huge arrays is *extremely* conservative: if it discovers a pointer which might refer to any of the elements of the array, it locks every page of the array (i.e. dont_move=1). I think, though I haven't determined for sure, that what's going on in my application is that various large arrays that it's using end up getting in trouble with this extreme conservatism, since this extreme conservatism makes coincidences rather common.

I suspect that the program I'm working on isn't the only application which gets in trouble. It may be that part of the reason that GENESIS is such a pig is that this conservatism makes it rather unlikely to free huge arrays (like the ones it uses to represent partial target images). And I think I might be able to construct a test case which dies from heap overflow by using a reasonably deep stack with a high proportion of random numbers while trying to manipulate large (e.g. 16Mbyte) vectors. I haven't constructed such a test case yet, but I do think this is a problem in practice, because when I added some debugging output to gencgc.c, I found that in the very first GC at boot, with 268 words on the stack, 7100 pages (29081600 bytes) are pinned due to conservative pointers.
That's a pretty significant proportion of the dynamic memory, pinned by a not-very-large stack. So I'd like to get rid of this behavior. However, I'm not completely sure whether doing so will introduce subtle bugs or problems, so I thought I'd run it past y'all for comments.

My first impulse is to redo the preserve_pointer() logic in GENCGC so that it doesn't try to preserve any object because of a pointer when it can tell that the pointer doesn't refer to the beginning of the object. In particular, when it's given a pointer into a page which holds part of a huge object, but not the beginning of the huge object, preserve_pointer() should simply ignore the pointer.

Soon I might try writing some new GC code, conditionalized out of the main build with #ifdef and whatnot, where preserve_pointer() works this way. If this is a gross error, I'll probably notice pretty quickly, but if there are deeper problems with it, I might not, so I'm soliciting comments on problems with the idea.

My guess is that this more liberal GC is safe as long as GC is always called synchronously, from single-threaded Lisp code. It's a problem if C or other foreign code is looping over elements of the array at the time that GC happens, but in current SBCL, I think that's impossible except if GC occurs in a signal handler. So:

* Is anyone going to be terribly unhappy if SBCL doesn't do GC in signal handlers, at least on the architectures where conservative GC is used?
* Is anyone thinking sufficiently seriously about adding multithreading that having the GC move away from multithread safety will be painful?
* Does anyone think I'm off base in thinking that the current preserve_pointer() logic is excessively, probably unacceptably, conservative for huge arrays?
* Can anyone think of any other issues?
--
William Harold Newman <wil...@ai...>
The way men live is so far removed from the way they should live that anyone who abandons what is for what should be pursues his own downfall rather than his preservation. - Niccolo Machiavelli
PGP key fingerprint 85 CE 1C BA 79 8D 51 8C B9 25 FB EE E0 C3 E5 7C
From: Nathan F. <fr...@em...> - 2001-07-12 04:53:50
Warning: I post this only having rudimentary knowledge of GC techniques (i.e. never implemented one) and even less knowledge of GENCGC.

> Presently I discovered that the way that the GENCGC code deals with
> huge arrays is *extremely* conservative: if it discovers a pointer
> which might refer to any of the elements of the array, it locks every
> page of the array (i.e. dont_move=1). I think, though I haven't
> determined for sure, that what's going on in my application is that
> various large arrays that it's using end up getting in trouble with
> this extreme conservatism, since this extreme conservatism makes
> coincidences rather common.

Isn't this the whole point of conservative garbage collection (and the reason that some people have problems with it)? I think there have been various techniques invented to minimize the penalty of things like this (such as your idea below).

> My first impulse is to redo the preserve_pointer() logic in GENCGC so
> that it doesn't try to preserve any object because of a pointer when
> it can tell that the pointer doesn't refer to the beginning of the
> object. In particular, when it's given a pointer into a page which
> holds part of a huge object, but not the beginning of the huge object,
> preserve_pointer() should simply ignore the pointer.

This sounds pretty reasonable.

When you made your measurements about the pages pinned at boot (which I deleted, but can't get back since I have a losing mail client), is that memory allocated by the runtime system that we're not allowed to move around, or memory we're not allowed to access in some way? I ask this because I don't recall my SBCL taking up that much memory when it starts.

> * Is anyone thinking sufficiently seriously about adding
>   multithreading that having the GC move away from multithread
>   safety will be painful?

Certainly the idea of multithread safe code appeals to the techie in me, whether or not the code is actually used in a multithreaded environment.
It would be nice someday to have MP work on a cross-platform level (maybe with pthreads or something -- or clone(), at least on Linux), at which point the multithread safe GC would be appreciated. As long as the multithread safety is maintained via #ifdefs or something, I don't think there's a problem with this.

> * Does anyone think I'm off base in thinking that the
>   current preserve_pointer() logic is excessively, probably
>   unacceptably conservative for huge arrays?

No, you're not off base (IMHO). However, (ignorance showing through) does this proposed change to reduce the pain on huge arrays have bearing on any other equally large structures (lists, structures, classes, etc.)?

> * Can anyone think of any other issues?

Nope. But that's probably because I'm not thinking very hard. :)

-Nathan
From: William H. N. <wil...@ai...> - 2001-07-12 15:13:13
On Wed, Jul 11, 2001 at 11:53:42PM -0500, Nathan Froyd wrote:
> Warning: I post this only having rudimentary knowledge of GC
> techniques (i.e. never implemented one) and even less knowledge of
> GENCGC.
>
> > Presently I discovered that the way that the GENCGC code deals with
> > huge arrays is *extremely* conservative: if it discovers a pointer
> > which might refer to any of the elements of the array, it locks every
> > page of the array (i.e. dont_move=1). I think, though I haven't
> > determined for sure, that what's going on in my application is that
> > various large arrays that it's using end up getting in trouble with
> > this extreme conservatism, since this extreme conservatism makes
> > coincidences rather common.
>
> Isn't this the whole point of conservative garbage collection (and the
        ^^^^
> reason that some people have problems with it)? I think there have
> been various techniques invented to minimize the penalty of things
> like this (such as your idea below).

I'm not sure what you mean by "this" here. If pointer arithmetic is going on, it's not safe to free an object if there are any pointers to its components. It's certainly not safe to free every single component in that case. In C/C++, references/pointers to substructure of compound objects are themselves first-class objects. Consider

    struct node {
        int count;
        struct node *left, *right;
    };

or

    int arr[100];
    ...
    for (i = sizeof(arr)/sizeof(*arr), p = arr + i; --p, --i >= 0; )
        if (*p)
            return i;

It's reasonable for a C program to have a live pointer to e.g. &node_b->left even when there are no live pointers to the beginning of node_b. Similarly, it's reasonable for a C program to have a live pointer to arr[14] even when there are no pointers left to the beginning of arr[]. In Lisp, such references aren't first-class objects; there's no way to construct them explicitly.
This actually caused me a fair amount of pain in the aforementioned Go program, until I learned to work around it by storing closures like

    (lambda (new-value) (setf (node-left node-b) new-value))

where in C++ I'd store &node_b->left. Since in Lisp the references can't be constructed, and any way that you fake them up requires using a pointer to the head of the original object, I think the current GENCGC hyperconservatism isn't needed.

Since the original message, I've thought of one more place where the current GENCGC hyperconservatism would be needed. Even in languages which don't support pointer arithmetic, optimizing compilers can transform array operations into pointer arithmetic. So (especially in FORTRAN) a compiler can take the moral equivalent of

    for (i = 0; i < 256; ++i) {
        a[i] = 0;
    }

and automatically transform it into something like

    for (i = 0, p = a; i < 256; ++i, ++p)
        *p = 0;

In order to get serious array performance in a language which doesn't support pointer arithmetic, you need to do this. Python (the compiler at the heart of SBCL) doesn't do this, but it would be good if it did. If and when it ever does, and if and when I make it do the more liberal GC that I've proposed, it will be important to make sure that a live reference to the original a[] is preserved across the entire loop, not optimized away.

> > My first impulse is to redo the preserve_pointer() logic in GENCGC so
> > that it doesn't try to preserve any object because of a pointer when
> > it can tell that the pointer doesn't refer to the beginning of the
> > object. In particular, when it's given a pointer into a page which
> > holds part of a huge object, but not the beginning of the huge object,
> > preserve_pointer() should simply ignore the pointer.
>
> This sounds pretty reasonable.
> When you made your measurements about the pages pinned at boot (which
> I deleted, but can't get back since I have a losing mail client), is
> that memory allocated by the runtime system that we're not allowed to
> move around, or memory we're not allowed to access in some way? I ask
> this because I don't recall my SBCL taking up that much memory when it
> starts.

It's memory that can't be reclaimed in this pass of GC. The GC reclaims memory by "moving" (copying; also called "scavenging", more or less) all live data out of each page, then freeing the page. When there's a live potential conservative pointer to the page, the GENCGC deals with it by freezing everything on the page (i.e. not doing the "moving"/"scavenging" step, and then not freeing the page).

> > * Is anyone thinking sufficiently seriously about adding
> >   multithreading that having the GC move away from multithread
> >   safety will be painful?
>
> Certainly the idea of multithread safe code appeals to the techie in
> me, whether or not the code is actually used in a multithreaded
> environment. It would be nice someday to have MP work on a
> cross-platform level (maybe with pthreads or something -- or clone(),
> at least on Linux), at which point the multithread safe GC would be
> appreciated. As long as the multithread safety is maintained via
> #ifdefs or something, I don't think there's a problem with this.

Actually, I now think it's probably possible to make the more liberal GC thread-safe, although it may require care in avoiding subtle gotchas like the one above where the pointer to the head of the object is optimized away.

> > * Does anyone think I'm off base in thinking that the
> >   current preserve_pointer() logic is excessively, probably
> >   unacceptably conservative for huge arrays?
>
> No, you're not off base (IMHO).
> However, (ignorance showing through)
> does this proposed change to reduce the pain on huge arrays have
> bearing on any other equally large structures (lists, structures,
> classes, etc.)?

The particular problem that I'm looking at right now is the code in preserve_pointer() which looks like

    /* Work backwards to find a page with a first_object_offset of 0.
     * The pages should be contiguous with all bytes used in the same
     * gen. Assumes the first_object_offset is negative or zero. */
    first_page = addr_page_index;
    while (page_table[first_page].first_object_offset != 0) {
        first_page--;
        ...
    }

and

    /* Now work forward until the end of this contiguous area is found,
     * marking all pages as dont_move. */
    for (i = first_page; ; i++) {
        gc_assert(page_table[i].allocated == region_allocation);

        /* Mark the page static. */
        page_table[i].dont_move = 1;
        ...
    }

This is only a big problem for objects which are big enough to span page boundaries. A page is 4096 bytes on all the X86/*nix systems that SBCL supports. It's a nonproblem for conses, which never span page boundaries. It's only a very small problem for typical STRUCTURE-OBJECTs or STANDARD-OBJECTs: I've never seen such an object with anywhere near 1000 slots. But it's potentially a serious problem for things like the GSPACE-BYTES arrays in src/compiler/genesis.lisp, which grow to be arbitrarily huge (many megabytes).

Another way to look at it: if you only have small objects (less than 4096 bytes) in dynamic storage, then each potential pointer in the stack can pin at most one page; and because multiple pointers point to the same page, and non-pointer values mistaken for pointers tend to point to areas which aren't mapped at all, the expectation value of the number of pages pinned is substantially less than the size of the stack.
But with big objects (many pages), with the preserve_pointer() algorithm above, each potential pointer in the stack can pin many pages; and in the example I referred to, a stack of 268 potential pointers pinned 7100 pages.

--
William Harold Newman <wil...@ai...>
The way men live is so far removed from the way they should live that anyone who abandons what is for what should be pursues his own downfall rather than his preservation. - Niccolo Machiavelli
PGP key fingerprint 85 CE 1C BA 79 8D 51 8C B9 25 FB EE E0 C3 E5 7C
From: William H. N. <wil...@ai...> - 2001-07-12 23:26:13
I tried to build my test case to illustrate the problem and discovered that I was wrong about the problem. I've just checked in sbcl-0.6.12.46, which has some related clarifications and comments but few substantive changes. It's rather impressive how much work DTC did to deal with this and other issues. I just wish I were better at understanding it..

On Thu, Jul 12, 2001 at 10:00:59AM -0500, William Harold Newman wrote:
> a pointer to the head of the original object, I think the current
> GENCGC hyperconservatism isn't needed.
         ^^^^^^^^^^^^^^^^^^^^^^^^
[snip]
> current GENCGC hyperconservatism would be needed. Even in languages
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[snip]
[etc.]
> > When you made your measurements about the pages pinned at boot (which
> > I deleted, but can't get back since I have a losing mail client), is
> > that memory allocated by the runtime system that we're not allowed to
> > move around, or memory we're not allowed to access in some way? I ask
> > this because I don't recall my SBCL taking up that much memory when it
> > starts.
>
> It's memory that can't be reclaimed in this pass of GC. The GC
> reclaims memory by "moving" (copying; also called "scavenging", more
> or less) all live data out of each page, then freeing the page. When
> there's a live potential conservative pointer to the page, the GENCGC
> deals with it by freezing everything on the page (i.e. not doing the
> "moving"/"scavenging" step, and then not freeing the page).

preserve_pointer() doesn't bogusly pin huge arrays nearly so often as I thought, because the code which'd pin the huge arrays is downstream from

    if (enable_pointer_filter
        && !possibly_valid_dynamic_space_pointer(addr))
        return;

and therefore is usually never executed. (Here possibly_valid_dynamic_space_pointer() is my new name for valid_dynamic_space_pointer().) This makes huge arrays much less problematic than I thought.
There may still be a parallel problem for huge type_CodeHeader objects, noted in the current code as

    /* We need to allow raw pointers into Code objects for return
     * addresses. This will also pick up pointers to functions in code
     * objects. */
    if (TypeOf(*start_addr) == type_CodeHeader) {
        /* XXX could do some further checks here */
        return 1;
    }

but (1) the pattern of usage which could tickle that problem (dynamically allocating and discarding many huge code objects) seems much less likely and reasonable than allocating and discarding many huge arrays, and (2) that problem wouldn't explain what's going wrong with my application. Thus, I'm not all that motivated to worry about it right now.

The huge pinning statistics reported by GENCGC at startup time seem to be due almost entirely to valid pointers on the stack, i.e. pointers which actually refer to the beginning of the array. In that case, the pinning doesn't seem to do any great harm, just making it a little bit harder to compact one page. I'd thought that filling the stack with random numbers would pin any array when any random number pointed into any interior element, but that's not so, because of the pointer filtering mentioned above..

--
William Harold Newman <wil...@ai...>
The way men live is so far removed from the way they should live that anyone who abandons what is for what should be pursues his own downfall rather than his preservation. - Niccolo Machiavelli
PGP key fingerprint 85 CE 1C BA 79 8D 51 8C B9 25 FB EE E0 C3 E5 7C