From: Christophe R. <cs...@ca...> - 2006-02-21 15:32:11
|
Cyrus Harmon <ch...@bo...> writes: > But I think all of that can wait. I gave the patch a thorough read this morning. There are a couple of other problems, only one of them (I think) serious. alloc.c has a function called pa_alloc(), which for gencgc attempts to allocate pseudo-atomically -- that is, deferring handling interrupts and gcing until a known-safe point. (It doesn't attempt to do so for the cheney collector, but that is a tale for another day). It very much looks to me that pa_alloc() is in fact only really guarding against garbage collection; certainly that's all that the Cheneygc version defends against, as G=E1bor has already noted in a comment. It took me a long time to understand in the context of this that the pseudo_atomic macros in Cyrus' patch (SetSymbolValue in HEAD) are only the same as the regular pseudo-atomic implementation because that was convenient; if we only call pa_alloc() from interrupt-safe places, the only thing that do_pending_interrupt() might do is GC. This explains why the fact that in C-land using dynamic_space_free_pointer to hold the pseudo-atomic bits doesn't lose: the only time dynamic_space_free_pointer is checked for pa-ness is when it itself would have been set. (I had enviseaged lossage as follows: we have the atomic bit set in dynamic_space_free_pointer set, but take an interrupt; the interrupt handler quite rightly uses arch_pseudo_atomic_atomic() to see if we're meant to be atomic, finds that reg_ALLOC in the context is some random value, and probably decides not to defer the interrupt.) So I relaxed for a moment, before reading the following comment: if (get_pseudo_atomic_interrupted(th)) /* Even if we gc at this point, the new allocation will be * protected from being moved, because result is on the c stack * and points to it. */ do_pending_interrupt(); this comment is utterly untrue for gencgc/ppc. So a new lossage scenario presents itself: we are in pa_alloc(); alloc() decides to schedule a gc by putting a deferred handler on the interrupts, we come back to this test, get_pa_interrupted() says "yes", we do the gc, but our newly-allocated result from the allocation is not pointed to by any roots, so gets collected away. A resolution for this one issue might be if (get_pseudo_atomic_interrupted(th)) { #ifndef LISP_FEATURE_C_STACK_IS_CONTROL_STACK /* FIXME: stack-grows-which-way? */ current_control_stack_pointer +=3D 1; *current_control_stack_pointer =3D result; #endif do_pending_interrupt(); #ifndef LISP_FEATURE_C_STACK_IS_CONTROL_STACK current_control_stack_pointer -=3D 1; #endif } Can a GC expert confirm any of this reasoning? I'm aware that this doesn't make pa_alloc() actually pseudo-atomic on ppc/gencgc, but I think it does plug the heap integrity hole... Cheers, Christophe |