On Tue, Feb 26, 2013 at 2:02 PM, Paul Khuong <pvk@pvk.ca> wrote:
Elliott Slaughter wrote:
I would note that writing an efficient GC in LLVM is not such an easy
thing to do, especially if you want it to be precise-on-the-stack. I
gave a talk about this after spending a summer trying (and mostly
failing) to do the same for Rust[1]. LLVM's existing "GC support" works
only when all pointers are permanently rooted on the stack, and prevents
LLVM from ever moving pointers into registers. There are hacks to work
around this, but they're all pretty awful.

Right, the usual via-C tricks…

SBCL's already conservative wrt the stack and registers, on x86[-64] and on threaded PPC; however, it assumes that there are no derived/interior pointers, or, at least, that they're backed by a tagged pointer that remains live (on stack, or in a register). Is it at least possible to disable optimisations that would CSE/hoist pointer arithmetic away or perform the address arithmetic in place, leaving only an untagged or, worse, an interior pointer? IIRC, this is very similar to the kind of guarantees BDW needs, so hopefully there's some support for these constraints.

At the level of the machine independent LLVM IR, you have fairly fine-grained control over optimization selection and order. You can easily avoid optimizations that are problematic and replace them with custom versions which maintain any invariants you care about. This part of LLVM also happens to be reasonably stable and well documented, so I think someone experienced in compiler development but new to LLVM could come in and get up to speed pretty quickly.

Below that, at the level of LLVM's machine-specific backends, the situation is less pleasant. This part of LLVM is less stable and less well documented; at the time I was working on this the documentation basically amounted to a blog post explaining the overall architecture, and the code itself. The machine backends are less restrictive on what they can do to registers, so you can't really know what's in a register after the code passes through all the optimizations. The only saving grace at this layer is that (I believe) the backends won't optimize across basic blocks, so things can only travel so far. If you want to be safe at this level the most straightforward way to do it is to copy LLVM's GC support and make sure you keep copies of all the roots on the stack. Other alternatives require custom modifications to the LLVM backends, which are challenging at best and untenable at worst.

I think a conservative GC should be entirely doable, the only question is how the performance will be at the end. To start out, you could just use LLVM's existing GC support, and worry about performance after you're past the initial hurdle of making it work at all.

Elliott Slaughter

"Don't worry about what anybody else is going to do. The best way to predict the future is to invent it." - Alan Kay