From: Matthew F. <mat...@gm...> - 2022-11-07 19:29:21
On Mon, Nov 7, 2022 at 11:13 AM Chris Cannam
<ca...@al...> wrote:
> On Mon, 7 Nov 2022, at 15:44, Matthew Fluet wrote:
> > Untested, but something like:
> >
> > [...]
> > + if (numObjptrs > 0 and l > 0) {
> > + markCard (s, ad);
> > + }
> > +
> > eltSize = bytesNonObjptrs + (numObjptrs * OBJPTR_SIZE);
> > GC_memmove (as + eltSize * ss, ad + eltSize * ds, eltSize * l);
> > }
> >
> > should do the trick.
>
> It does! With that added, it runs successfully - and produces the same results with copy-generational-ratio 10.0 as it did with the default arguments.
Excellent!
> I wonder what a minimal test-case for this would look like. If this copy is part of an optimisation for constructing a sequence, does this mean it is constructing it directly into the old generation heap, rather than in the nursery? If so, why? Is there a rule that it will do that for sequences above a certain size? Is that affected by copy-generational-ratio?
It doesn't necessarily mean that the sequence was allocated directly
in the old generation. A sequence allocation and initialization is a
multi-step process: allocate an uninitialized sequence of the correct
size, then initialize the elements. If there is non-trivial
computation involved in creating the elements, then it is possible for
a GC to occur, which will copy the partially initialized sequence from
the nursery to the old gen, then initialization resumes with
allocating elements (in the nursery) and updating the sequence to
point to the elements. For instance, a Vector.tabulate has this kind
of behavior (though, it does not invoke GC_sequenceCopy).
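To make the hazard concrete, here is a toy model of the write barrier that the patch above adds. It is only a sketch under stated assumptions: ToyHeap, toy_mark_card, toy_sequence_copy, and the 256-byte card size are all made up for illustration and are not the runtime's own definitions. Like the patch, it marks the card for the destination object only when the elements contain object pointers and the length is non-zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CARD_SIZE_LOG2 8                      /* assumed: 256-byte cards */
#define HEAP_BYTES 4096                       /* toy old generation size */
#define NUM_CARDS (HEAP_BYTES >> CARD_SIZE_LOG2)

/* Hypothetical stand-in for the old generation plus its card map. */
typedef struct {
  uint8_t heap[HEAP_BYTES];
  uint8_t cardMap[NUM_CARDS];  /* nonzero: card may hold a pointer into the nursery */
} ToyHeap;

/* Mark the card that contains the given heap offset. */
void toy_mark_card(ToyHeap *h, size_t offset) {
  assert(offset < HEAP_BYTES);
  h->cardMap[offset >> CARD_SIZE_LOG2] = 1;
}

/* Copy len bytes into the object that starts at dstObj, at byte offset
   dstByteOffset within it. If the elements hold object pointers and the
   copy is non-empty, mark the destination object's card first, so that a
   later minor GC knows to scan the object for nursery pointers. */
void toy_sequence_copy(ToyHeap *h, size_t dstObj, size_t dstByteOffset,
                       const void *src, size_t len, int hasObjptrs) {
  assert(dstObj < HEAP_BYTES);
  assert(dstByteOffset <= HEAP_BYTES - dstObj);
  assert(len <= HEAP_BYTES - dstObj - dstByteOffset);
  if (hasObjptrs && len > 0) {
    toy_mark_card(h, dstObj);
  }
  memmove(h->heap + dstObj + dstByteOffset, src, len);
}
```

Without the marking step, a minor GC in this model would have no record that the destination object was written to, which is the situation the patch fixes.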
There is a GC control that affects whether a large sequence is
directly allocated in the old generation:
https://github.com/MLton/mlton/blob/master/runtime/gc/sequence-allocate.c#L49
with the default value of 0x100000 bytes.
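As a rough sketch of that threshold rule (the helper name allocates_in_old_gen is hypothetical, and I am assuming the comparison is inclusive; check the linked source for the exact test):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Default quoted above: 0x100000 bytes (1 MiB). */
#define OLD_GEN_SEQUENCE_SIZE ((size_t)0x100000)

/* Hypothetical helper: true if a sequence occupying sequenceBytes would be
   allocated directly in the old generation under the threshold rule.
   Assumption: the comparison is inclusive (>=). */
bool allocates_in_old_gen(size_t sequenceBytes, size_t oldGenSequenceSize) {
  assert(oldGenSequenceSize > 0);
  return sequenceBytes >= oldGenSequenceSize;
}
```

So, under this reading, a 2 MiB array would bypass the nursery, while a 64 KiB one would not.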
That said, I suspect that it is the case that the array which has the
intergenerational pointer was directly allocated in the old generation.
A brief look at the Basis Library implementation suggests that the
initialization loops that perform GC_sequenceCopy are all
non-allocating loops. Essentially, all of the data to be copied into
the destination sequence is gathered ahead of time (typically in order
to calculate the size of the to-be-allocated sequence), and then it is
a simple tail-recursive loop over a list of slices performing
GC_sequenceCopy.
The oldGenSequenceSize is not affected by copy-generational-ratio.
What is affected by copy-generational-ratio is whether or not minor
GCs are executed at all. Here is the code that sets the `canMinor`
flag:
https://github.com/MLton/mlton/blob/master/runtime/gc/gc_state.c#L71
The critical condition is:
(float)h->size / (float)s->lastMajorStatistics.bytesLive <=
s->controls.ratios.copyGenerational
The default value for copy-generational-ratio is 4.0. Note also that
the default value for live-ratio is 8.0. (See
https://github.com/MLton/mlton/blob/master/runtime/gc/init.c#L287.)
After each (major) garbage collection, MLton tries to resize the heap
so as to be live-ratio times the bytesLive. There is some "wiggle
room" in the resizing; i.e., don't resize if the current heap size is
"close enough" and try not to resize beyond ramSlop (default 0.5) times
RAM. But, for "small programs" on machines with "big memory", the
heap will pretty often be 8.0 times the lastMajorStatistics.bytesLive.
So, you can see that with the default value of copy-generational-ratio
4.0, the runtime will never do generational collection. In the
absence of generational collection, the card marking doesn't matter
--- we're always doing a majorGC and tracing the whole heap. On the
other hand, with copy-generational-ratio 10.0, the runtime will always
do generational collection, so you are much more likely to trigger the
card marking bug.
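To spell out the arithmetic, here is a small sketch of just the ratio test quoted above. The helper name ratio_allows_minor is hypothetical, and it covers only this one condition; the runtime may check others before setting `canMinor`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper mirroring only the quoted ratio test:
   heap size / bytes live after the last major GC <= copy-generational-ratio. */
bool ratio_allows_minor(size_t heapSize, size_t bytesLive, float copyGenerational) {
  assert(bytesLive > 0);  /* avoid dividing by zero in this sketch */
  return (float)heapSize / (float)bytesLive <= copyGenerational;
}
```

With a heap that is 8.0 times bytesLive, ratio_allows_minor is false for copy-generational-ratio 4.0 (8.0 > 4.0) and true for 10.0 (8.0 <= 10.0).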
-Matthew