From: John M M. <jo...@sm...> - 2005-01-19 21:47:04
On Jan 19, 2005, at 10:41 AM, Andreas Raab wrote:

> John,
>
>>>> Note how we allocate 76 objects, do a young space GC, then have two
>>>> survivors, and finally we reach the 200K minimum GC threshold and do
>>>> a full GC followed by growing young space. However, this process is
>>>> very painful.
>>>
>>> By saying we want some slack, growHeadroom*3/2 - (self sizeOfFree:
>>> freeBlock), we avoid the above problem.
>>
>> I see. Yes, this makes sense. (BTW, I'm not sure these parameter
>> choices are best, but I guess since they aren't worse than what's
>> there they must be good enough ;-)
>
> I take this back. The longer I look at these changes, the more
> questionable they seem. With them, unless you do a manual full GC at
> some point, you keep growing and growing and growing until you just
> run out of memory. I *really* don't like this.
>
> The current machinery may be inefficient in some borderline situations,
> but it works very well in the default situations. With these tenuring
> changes we risk making the default behavior of the system one in which
> we grow endlessly (say, if you run a web server or something like
> this). For example:
>
>     queue := Array new: 20000.
>     index := 0.
>     [true] whileTrue: [
>         (index := index + 1) > queue size ifTrue: [index := 1].
>         queue at: index put: Object new].
>
> Keep this guy looping and the only question is *when* you run out of
> memory (depending on the size of the objects you stick into the
> queue), not if. Compare this to the obscure circumstances in which we
> get a (hardly noticeable) slowdown with the current behavior. So I
> think some way of bounding growth like in the above is absolutely
> required before even considering this change.

You miss the point: after N MB of memory growth, we do a full GC. The
logic to trigger the full GC either lives in Smalltalk code running as
an active memory monitor, or can be moved into the image. I'm not
growing the image endlessly.
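To make the bounded-growth argument concrete, here is a minimal sketch (in Python, not actual Squeak VM code) of the policy John describes: tenure freely during scavenges, but once total heap growth since the last full GC exceeds a fixed budget, force a full GC so the heap cannot grow without bound. All names and the 16MB budget are illustrative assumptions, the latter taken from the Croquet figures quoted later in this message.

```python
# Hypothetical sketch of "after N MB of memory growth, do a full GC".
# Not Squeak's ObjectMemory code; names and budget are illustrative.

FULL_GC_GROWTH_BUDGET = 16 * 1024 * 1024  # assumed: ~16MB, per the Croquet test


class HeapMonitor:
    """Tracks heap growth and decides when a full GC is required."""

    def __init__(self, initial_heap_bytes):
        self.heap_bytes = initial_heap_bytes
        self.bytes_at_last_full_gc = initial_heap_bytes
        self.full_gc_count = 0

    def after_scavenge(self, grown_by):
        """Called after each young-space GC that grew the heap."""
        self.heap_bytes += grown_by
        if self.heap_bytes - self.bytes_at_last_full_gc > FULL_GC_GROWTH_BUDGET:
            self.full_gc()

    def full_gc(self):
        # A real full GC would reclaim tenured garbage and possibly shrink
        # the heap; here we only reset the growth baseline.
        self.full_gc_count += 1
        self.bytes_at_last_full_gc = self.heap_bytes
```

Under this policy, Andreas's ring-buffer loop still tenures aggressively, but every ~16MB of growth triggers a full GC that reclaims the dead tenured objects, so memory cycles rather than climbing forever.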
I've attached two JPEGs of before/after memory end-boundary charts taken
while a person was working in a Croquet world, not a borderline case.
Also two JPEGs from a Seaside application (again not a borderline case),
which were generated by doing:

    wget --recursive --no-parent --delete-after --non-verbose \
         http://localhost/seaside/alltests

from 4 simultaneous threads. You'll note how the Seaside application
using the historical logic grows to 64MB; perhaps it would grow forever?
Using the modified logic we actually cycle between 24MB and 45MB.

Lastly, the boundary condition where you trigger thousands of
incremental GC events is hit by just running the macrobenchmarks.

As a reminder, here is a summary of the information I calculated last
November using the OmniBrowser/Monticello SUnits from Colin Putney.

Before any changes, what I see is (averages):

    8139 marked objects per young space GC, of which 2426 were marked
         via interpreter roots and 5713 via the remember table, over
         6703 iterations
    4522 swept objects in young space
     714 survivors

After changes where we bias towards growth (more likely to tenure on
excessive marking) and ensure young space stays largish rather than
heading towards zero, I see (again averages):

    4652 marked objects per young space GC, of which 2115 were marked
         via interpreter roots and 2526 via the remember table, over
         6678 iterations
    4238 swept objects in young space
     368 survivors

This of course translates into fewer CPU cycles needed for young space
GC work.

Jerry Bell sent me some Croquet testing data. It seems Croquet starts at
about 30MB and grows upwards to 200MB when you invoke a teapot and look
about. Jerry has to confirm what he did and whether it was repeated in
mostly the same way, but it did do 65,000 to 70,000 young space GCs, and
it appears we reduced the young space GC time by 40 seconds. This does
result in more full GC work (5 full GCs), since I tenure about 16MB
before doing a full GC, but that accounts for only an extra second of
real time...
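The averages above can be turned into percentage reductions with a quick back-of-the-envelope calculation (pure arithmetic on the quoted figures; no VM code involved):

```python
# Reductions implied by the OmniBrowser/Monticello SUnit averages above.
before = {"marked": 8139, "roots": 2426, "remembered": 5713,
          "swept": 4522, "survivors": 714}
after = {"marked": 4652, "roots": 2115, "remembered": 2526,
         "swept": 4238, "survivors": 368}


def reduction(key):
    """Percentage drop from the before-average to the after-average."""
    return 100.0 * (before[key] - after[key]) / before[key]


# Marked objects per scavenge drop by roughly 43%, remember-table marks
# by roughly 56%, and survivors copied per scavenge by roughly 48%.
for key in ("marked", "remembered", "survivors"):
    print(f"{key}: {reduction(key):.0f}% fewer")
```

So most of the saving comes from the remember table: fewer tenured-to-young references to trace, and fewer survivors to copy on each scavenge.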
Marking in the original case averages 20,808 per young GC; after the
alterations it's 11,386, making GC work faster. I'll also note that
growing to 195MB takes 49 seconds versus the original 57.

>>> statMarkCount:
>>
>> Actually this is the number of times around the marking loop; I don't
>> think it's the same as the survivor count plus roots.
>
> That's right, the number of times around the loop is essentially
> fieldCount(roots+survivors). But my point still stands that it is
> easily computed and that we really don't need to explicitly count that
> loop.

Fine, compute them.

> Cheers,
>   - Andreas
>
> _______________________________________________
> Squeak-VMdev mailing list
> Squ...@li...
> https://lists.sourceforge.net/lists/listinfo/squeak-vmdev

--
========================================================================
John M. McIntosh <jo...@sm...> 1-800-477-2659
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
========================================================================
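Andreas's point that the mark-loop count need not be counted explicitly can be sketched as follows. In a toy model where each marked object contributes one loop turn per pointer field traced, the explicit counter and the derived value fieldCount(roots + survivors) always agree (this is an illustrative simulation, not the Squeak marker):

```python
# Toy model: the mark loop turns once per pointer field of each traced
# object, so the count is derivable rather than needing its own counter.
def scavenge_mark(root_field_counts, survivor_field_counts):
    """Each argument is a list of pointer-field counts, one per object."""
    explicit = 0  # what an explicit statMarkCount-style counter would tally
    for n in root_field_counts + survivor_field_counts:
        for _ in range(n):  # one loop turn per field traced
            explicit += 1
    # Derived instead, as Andreas suggests: fieldCount(roots + survivors).
    derived = sum(root_field_counts) + sum(survivor_field_counts)
    return explicit, derived
```

Dropping the explicit counter removes an increment from the hottest loop in the scavenger while keeping the statistic available.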