From: Philipp M. <ph...@ma...> - 2024-06-10 11:12:37
Hi Doug,
thank you for your continued patience.
>> Yeah, I thought about that already -- but one slot contains a SHA256,
>> so I'd need to use a bignum, which incurs a header word overhead (like
>> a structure). A SIMD pack has two words overhead.
>>
> Are you really certain that columnar storage is going to net out to fewer
> words of memory?
> Consider storing the sha256 in a structure with 4 64-bit raw slots.
Well, the SHA256 isn't unique per instance,
but gets de-duplicated upon import.
So having a pointer right now doesn't hurt.
I pondered moving the SHA256 into 1 or 4 arrays of (unsigned-byte 64)
or perhaps one (unsigned-byte 8) array, I just didn't do that yet.
For either of these, storing a number of type (integer 0 16e6)
(that gets multiplied by the appropriate stride before indexing),
so basically an (unsigned-byte 24) slot, is more than good enough -
I only have 12M different checksums, but I'm at 40M structures
pointing to them.
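
A minimal sketch of that layout, assuming a single flat (unsigned-byte 64)
vector with four words per digest. All names here (MAKE-DIGEST-TABLE,
STORE-DIGEST, DIGEST-WORD, ENTRY) are hypothetical and only illustrate the
idea; nothing like this exists in the code yet.

```lisp
;; Hypothetical sketch: SHA256 digests kept in one specialized vector,
;; referenced from structures by a small index instead of a pointer.

(defconstant +words-per-digest+ 4)      ; 4 x 64 bits = 256 bits

(deftype digest-index () '(unsigned-byte 24))

(defun make-digest-table (count)
  "Allocate zero-filled storage for COUNT digests."
  (make-array (* count +words-per-digest+)
              :element-type '(unsigned-byte 64)
              :initial-element 0))

(defun store-digest (table index w0 w1 w2 w3)
  "Write the four 64-bit words of a digest at INDEX and return INDEX."
  (declare (type (simple-array (unsigned-byte 64) (*)) table)
           (type digest-index index)
           (type (unsigned-byte 64) w0 w1 w2 w3))
  ;; The index is multiplied by the stride before touching the vector.
  (let ((base (* index +words-per-digest+)))
    (setf (aref table base) w0
          (aref table (+ base 1)) w1
          (aref table (+ base 2)) w2
          (aref table (+ base 3)) w3)
    index))

(defun digest-word (table index n)
  "Return word N (0..3) of the digest stored at INDEX."
  (declare (type (simple-array (unsigned-byte 64) (*)) table)
           (type digest-index index)
           (type (integer 0 3) n))
  (aref table (+ (* index +words-per-digest+) n)))

;; The structure then carries only the index.
(defstruct entry
  (digest 0 :type digest-index))
```

With about 12M distinct digests such a table would hold 48M words, i.e.
roughly 384 MB of raw digest data, independent of how many structures
refer to it.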
> If you consider the sum total of all things
> you put into this structure, a bit-packed defstruct is going to be a lot
> easier to deal with than a fancy encoding/representation technique that
> has some of the slots in one vector, some in another vector, and both
> vectors disembodied from the thing that you consider "the object" (i.e.
> the index to the vectors).
Yeah, well, I haven't decided yet, just looking for different ways.
> Every time someone asks why structures can't be packed as
> densely as in C, I wait for the follow-up question "and why can't you
> have type-based dispatch too?" Well, you can if you're willing to go to
> a completely statically compiled language model.
Are you saying I should look at sealed-generic-functions (or whatever
they were called)? I don't think so, that's an orthogonal issue, right?
>> (DEFTYPE my-struct-idx () SB-SYS:OTHER-IMMEDIATE-1-LOWTAG-TYPE)
>>
>> and dispatching on that (typecase, typep, CLOS methods) becomes
>> possible? That sounds like a really nice addition to the SBCL
>> internals documentation that I could write as soon as it works.
>>
> I will look the other way just long enough for you to prototype it and
> see if it works. I can't actually assert that the GC is fully robust
> against other random uses of the unallocated immediate lowtags.
Well, I hope that I can find some time to play around with that.
Thanks for 736d471b91a6d647fc40557841bfc5e44ba66160 -- though that
won't help with the sb-posix:lstat issue, will it?
I can't reproduce that in smaller test cases yet... my guess is that
the GC signal arrives at a thread but the restarted syscall writes to
the old location, so some
(sb-sys:with-pinned-objects ...)
or
(sb-alien-internals:maybe-with-pinned-objects)
might help. (Crashes with sb-unix:unix-lstat as well.)
I just need to update my SBCL etc.
Thanks!