Vassilis Radis wrote on Fri, Nov 30, 2012 at 08:34:08PM +0200:
> Martin, thanks for the detailed explanation.
> I admit I can't fully understand all of what you said.
That's because my explanation sucks :-) I thought I had sent my actual
findings to one of the sbcl lists earlier, but I can't find that mail now.
> That said, I would
> like to add the following, based on what I think I understood:
> What I understand is that you describe a fragmentation problem which I
> would understand as the source of my problem, but what bugs me is that:
> Room function reports no extra space usage after each call. So based on
> that, and on what you say, I must conclude that a function call that does
> not add a net positive memory usage can indeed irreversibly increase the
> memory sbcl uses as a process. How is that possible? It means that if it is
> called sufficient times, it can use up all the memory? I left the system
> idle for 5 hours and no memory decrease in OS report.
> As I understand it, fragmentation is that I allocate X space, then I free Y
> of it in such a way that although Y is free, it is not usable because it
> is in possibly small fragments.
No, let's consider the simpler C case first.
malloc() gets its space originally from the OS. But free() doesn't
give it back to the OS; it is kept and re-used for further malloc
calls, even if whole pages (or large regions) hold no actual
objects right now. In addition, memory fragmentation can nail down
whole pages with single objects, and that happens frequently.
At some point most (but not all) malloc implementations do give back
things to the OS, but giving back is very expensive and done
sparingly, and only in large chunks (which then must be completely
free of objects since C can't move objects like Lisp can).
In Lisp the main problem is that when a GC moves a generation of
objects from A to B, you can't simply give A back to the OS every time
(even if it is free of pinned objects): doing so is expensive, and for
all you know you will need more space right away, at which point you
slow down even more because now you have to do expensive work to get
that space back from the OS (minor page faults at a minimum).
In SBCL giving back is done on a full GC, although setting the alien
variable small_generation_limit (declared via
(sb-alien::define-alien-variable "small_generation_limit" sb-alien:int))
from 1 to 0 can be used to do it on every GC. But that
slows down the system (the whole system) a lot.
SBCL won't do anything while it's idle, except that loops or threads
running in the background might cons and trigger a GC. To reduce space
you have to go through the GC, which usually means consing more :-)
> But this, in my case, to be able to add up to an ever-increasing used
> space, means that X > Y . If I free all the space I allocate, no
> fragmentation can exist in the long run, no matter how the
> allocated/deallocated space is positioned or moved. Am I missing something?
> I make the above assumption because of the fact that even after 100 calls
> of my function and after 5 hours of leaving sbcl idle after the 100 calls,
> room still reports the same amount of used space as it was before the first
> call (no growth in usage at all), so whatever memory the function uses
> during its run, it isn't left hanging/referenced after it is finished. But
> "top" still hasn't reported any decrease of VIRT or RES usage. Each call just
> adds 3 MB of used space to the OS usage report but zero space to (room).
> Shouldn't there be even a slight increase in the (room) report after 100 calls
> and 5 hours of idle time, if fragmentation was the problem?
> And if, as you say, gc does not immediately give the pages back to the OS,
> what is a reasonable amount of time to give it back?
See above, you can tune giving back if you are willing to trade space
for time. You might be able to observe an improvement simply by
triggering a full GC manually.
> Also, indirect (some drakma dependency) leakage through FFI, isn't that
> possible? I assume the room function can't track those allocations. Is this
No, C memory isn't tracked by room. However, you can see it in memory
maps such as Linux's /proc/<pid>/maps, because you know which area is
the dynamic heap, and most of the other anonymous regions will be from C.
> I am new to this level of stuff, sorry if I am asking meaningless
> questions. It would be a great help if you could point them out :)
I should find out whether I ever wrote my findings down and if not do
it over the weekend.
As I said, at work we ended up seeing a huge improvement in Lisp
performance with fewer generations and a very low
generation-bytes-consed-between-gcs setting.
We wanted to pass this around because we are curious whether other
real-world applications see the same. My personal guess is that
fewer generations will work out to be an overall win for most. Not
sure about the generation-bytes-consed-between-gcs part.
> Thanks, Bill
> On Fri, Nov 30, 2012 at 7:24 PM, Martin Cracauer <cracauer@...> wrote:
> > Vassilis Radis wrote on Fri, Nov 30, 2012 at 07:11:29PM +0200:
> > >
> > > 1. Why top - reported memory keeps growing, while (room) says it does
> > not?
> > > The only thing I can think of is some leakage through ffi. I am not
> > > directly calling out with ffi but maybe some drakma dep does and forgets
> > to
> > > free its C garbage. Anyway I don't know if this could even be an
> > > explanation. Could it be something else? Any insights as to where I
> > should
> > > start searching? 2. Why isn't --dynamic-space-size honoured?
> > You are looking at physical memory versus objects you hold in software
> > view. They are always different, and C has worse problems with this
> > (because it can never move objects).
> > RSS size in this case is essentially "number of pages currently dirty
> > (touched)" and not swapped out. As the allocator allocates and the GC
> > moves things around they don't always give back pages that are no
> > longer occupied to the OS immediately. Doing so is prohibitively
> > expensive because it requires full system calls. To simplify: after
> > you move a generation worth of objects from A to B you can't outright
> > free all of A without huge performance penalty. And some objects
> > might be nailed down due to the slightly conservative GC.
> > For my toy at work I improved this a lot by reducing the number of
> > generations to 4 total (newest, not GCed + 2 others) and setting
> > sb-ext:generation-bytes-consed-between-gcs to a low value (2 MB). But
> > our toy has different allocation patterns than regular software so
> > YMMV. Also made it faster.
> > You can also set
> > (sb-alien::define-alien-variable "small_generation_limit" sb-alien:int)
> > from 1 to 0, which will trigger the "giving back" much more often, but
> > there is a huge performance penalty, and system-wide affecting other
> > processes, too.
> > As I said you will find that C has even worse problems with this
> > because it can never move objects and hence never compact parts of the
> > heap. One reason why allocation in C is so much slower than in a
> > Common Lisp like SBCL is that strategies for avoiding this problem
> > have to be executed at allocation time, and you might spend a lot of
> > CPU time and memory bandwidth on doing this for an object that gets
> > destroyed a microsecond later anyway.
> > Did I ever report on these RSS-lowering findings? I forgot. I wanted
> > to. I think it's likely that a lower number of generations benefits a
> > lot of applications, even non-crazy stuff.
> > Martin
> > > Thank you,
> > > Bill
> > >
> > ------------------------------------------------------------------------------
> > > Keep yourself connected to Go Parallel:
> > > TUNE You got it built. Now make it sing. Tune shows you how.
> > > http://goparallel.sourceforge.net
> > > _______________________________________________
> > > Sbcl-help mailing list
> > > Sbcl-help@...
> > > https://lists.sourceforge.net/lists/listinfo/sbcl-help
> > --
> > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> > Martin Cracauer <cracauer@...> http://www.cons.org/cracauer/
Martin Cracauer <cracauer@...> http://www.cons.org/cracauer/