From: Martin C. <cra...@co...> - 2014-02-06 02:14:10
I have a mystery on my hands, with gencgc obviously not collecting the way it should. I see continuous growth and no shrinking in my toy at work, as shown by both (room) and resident memory. I blamed it on something crazy that we do, but now it doesn't look like it.

I am going through the system and shutting everything down, and I set every global variable I can find (via grep, not by walking the package) to NIL, do a full GC and call (room) again. The usage just doesn't go down. GC as such is mostly working: the majority of new objects get collected. However, there is a continuously growing base that I can't kill no matter what I set to NIL.

As people know, we have large caches of preallocated objects, but I set those caches to NIL here, and I have used the appended tool to make sure they are not referenced from anywhere else. The cached objects themselves do live in the non-collected generation that comes from the image. Anyway...

<==== Question to the other users of big systems: if you NIL the variables holding your stuff, do a full GC, then call (room), does it get down to where you expect it?

%%

Now, to elaborate a bit. There is a concrete question in the last sentence below, in case your eyes glaze over on the way :-)

Here is the typical (room) output in this state. It is basically the same at the beginning and at the end of nuking things.

Dynamic space usage is: 2,267,394,912 bytes.
Read-only space usage is: 6,000 bytes.
Static space usage is: 4,096 bytes.
Control stack usage is: 2,672 bytes.
Binding stack usage is: 816 bytes.
Control and binding stack usage is for the current thread only.
Garbage collection is currently enabled.

Breakdown for dynamic space:
  868,913,920 bytes for 6,914,139 instance objects.
  542,185,056 bytes for 5,333,371 simple-vector objects.
  392,328,672 bytes for 564,783 simple-array-unsigned-byte-64 objects.
  262,507,936 bytes for 16,406,746 cons objects.
  201,459,328 bytes for 2,619,767 other objects.
  2,267,394,912 bytes for 31,838,806 dynamic objects (space total.)
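
For anyone who wants to try the question above on their own image, the experiment could look roughly like the following sketch. It is only an illustration: `*object-cache*` is a made-up stand-in for one of the real globals, and `sb-kernel:dynamic-usage` is used here only to get a single number to compare before and after. Because the collector is conservative with respect to the stack, and the REPL history variables can hold references, the "freed" figure is a rough indication at best.

```lisp
;; Sketch only. *OBJECT-CACHE* is a hypothetical stand-in for a real
;; global that holds on to a large amount of data.
(defvar *object-cache* (make-array 1000000 :initial-element nil))

(defun dynamic-usage-after-full-gc ()
  "Run a full GC, then return the dynamic space usage in bytes."
  (sb-ext:gc :full t)
  (sb-kernel:dynamic-usage))

(let ((before (dynamic-usage-after-full-gc)))
  ;; Sever the root, then collect again and compare.
  (setf *object-cache* nil)
  (let ((after (dynamic-usage-after-full-gc)))
    (format t "before: ~:D bytes~%after:  ~:D bytes~%freed:  ~:D bytes~%"
            before after (- before after))))

;; The per-type breakdown, as in the output quoted above.
(room)
```

If the cleared data was reachable only through that variable, the usage after the second full GC would be expected to drop by roughly the size of the data.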
We have a little utility that breaks this up more:

43704715,699275440,0,0,43704542,699272672,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,173,2768,0,0,0,0,SYSTEM-AREA-POINTER
21852295,699273440,0,0,21852269,699272608,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,26,832,0,0,0,0,ALIEN
15928304,254852864,0,0,15372,245952,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,11432976,182927616,4748,75968,4270082,68321312,205126,3282016,CONS
5330041,541810560,0,0,4976,365536,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,193390,58040992,254,46928,3908877,391626288,1222544,91730816,ARRAY-169

ARRAY-169 is simple-vector. The first field is the number of objects, then the total size in bytes, then the breakdown over the different generations.

I appended another tool, which you can call with an object type and a number n. It will find the n'th instance of that type and list the references to it from the dynamic heap (we have everything in the dynamic heap). Examples:

(find-struct-reference 'system-area-pointer 1)
==> usually the standard streams

;; find who holds on to the 10 millionth cons cell
(find-struct-reference 'cons 10000000)

Now, here is where it gets interesting:

(find-struct-reference 'alien 31)

When I use this utility on objects of type alien (and as you see above, we have a lot of those), it finds objects that obviously are assorted regular Lisp objects. Specifically, it often finds Lisp objects that I think should have been held by global variables in our package, the ones I nuked while trying to sever the roots to these objects.

There are two classes of "alien" occurrences in there that I find odd:

- aliens that point to things I think should be gone from the heap, for which the tool prints "ref from invalid obj"
- the unicode database, which pops up as a root to a very large number of those aliens

Setting SB-IMPL::*UNICODE-CHARACTER-NAME-DATABASE* to NIL and doing a full GC doesn't give me back the space that I want, so I think it really isn't as simple as that. But it is odd how often it pops up.
To me it looks like there is a problem here in how the GC interprets alien objects, and that it creates false roots to regular Lisp things. The appended tool seems to know it, since it says "ref from invalid obj".

Martin
--
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Martin Cracauer <cra...@co...> http://www.cons.org/cracauer/