On May 14, 2013 6:33 PM, "Craig Lanning" <clanning@isc8.com> wrote:
> On Tue, 2013-05-14 at 18:09 +0300, Nikodemus Siivola wrote:
> > Then the correct answer is ALLOCATION/WITH-FIXED-ALLOCATION;
> > everything else is more or less negotiable. Ie. the compiler needs to
> > know how to emit allocation sequences.
> Now, that was an eye opener.
> Ultimately, what I'd like to do is define a set of functions and global
> variables that can be define either in Lisp for a GC implemented in Lisp
> or in C for a GC implemented in C (and brought into Lisp via the alien
> facility).
> The company I work for has a very special use case that I think can be
> made easier by rewriting the GC.  I read the opinions about why the GC
> is in C instead of Lisp.  I understand them, agree with a few, and
> disagree with others.  I think that the changes I need to do to the GC
> will be easier to implement if I can do it in Lisp instead of C.

Are you able to share any details about the sort of changes that you have in mind?  Or about the use case, even in general terms?

> After looking at ALLOCATION/WITH-FIXED-ALLOCATION, it appears that
> switching the GC from a C implementation to a Lisp implementation would
> be very non-trivial.  Has anyone given any serious thought to what would
> need to be done to implement the GC in Lisp?

Some years ago (late 2008, maybe), I did some preliminary investigation into writing parts of the GC in Lisp, although I was mostly focusing on how to implement things like the dispatch involved in scavenging an object.

If you're planning to implement a GC in Lisp, one of the things to make sure of is that the code and data that implements the GC is available while you're running the GC, which is something that seemed very difficult in my original context, but for what you're doing simply arranging for any data tables required to be in static space or otherwise pinned and for code objects to not move would cover the worst of it.  Well, and there are the FDEFINITION objects to consider, but putting them into static space as well should work, if you can arrange that (I can think of a couple of approaches here).

Another aspect to consider is if the GC code itself should be allowed to allocate memory.  If it shouldn't, then you have to be careful about how you write the code in order to avoid allocation and you may also want to figure out how to tell the compiler that any allocation would be an error (so that you don't backslide during maintenance).

The inline allocation logic itself is actually fairly straightforward, modulo the overflow handling.  Each thread has an alloc_region (in a single-threaded system there is a global alloc_region), which contains two fields of interest to the allocation logic:  the current allocation pointer and the end of the region.  Everything else in the alloc_region is of interest to the GC only, and the general layout might well be completely different for a different GC.

You would also have to deal with the "safepoint" or "pseudo-atomic" logic, and I haven't really thought overmuch about what would be involved here, plus there's the whole issue of actually triggering a collection cycle.  And there's the matter of scavenging any interrupt contexts and the various thread stacks.

And if you're changing the heap layout too drastically, or need to arrange for certain things to be in certain heap spaces even in the cold-core, you may well end up modifying genesis (SYS:SRC;COMPILER;GENERIC;GENESIS.LISP), the program that actually creates the cold-core from a set of cross-compiled FASLs.

Anyway, I've given the matter a certain amount of thought, and I'm reasonably confident that I could write SOMETHING that would work, but it would take a while and I simply don't have a use-case that would make it worth the effort.

I hope that this brain-dump helps, and would love to be kept "in the loop" if you decide to go forward with writing a new GC for SBCL in Lisp.  Or even a new GC for SBCL in C.

> Craig

-- Alastair