[I'm CCing sbcl-devel, I hope you don't mind]
Stephan Frank wrote:
> At the beginning of the gmp_wrap.c file he defines a constant BIGNUM_TAG
> by which later the address is reduced to get the actual address of the
> lisp object. The constant is 7 and 3 for 64 and 32 bit systems
> respectively. Where does this number come from since the bignum widetag
> is 17 (or 10 for 32 bit SBCL) when looked up via sb-vm:bignum-widetag?
> How were the magic numbers 7 and 3 derived?
Here's how objects are represented on the heap in SBCL:
The header word has a one-byte widetag, and the rest can be anything. In
many cases (including bignums), variable-length objects store their
length in the remainder of the header word.
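To make the header layout concrete, here is a minimal C sketch. It assumes the widetag sits in the low 8 bits of the header word and the length (or other payload) in the bits above it; the helper names are made up for illustration and are not SBCL runtime functions.

```c
#include <stdint.h>

/* Assumption: the one-byte widetag occupies the low 8 bits of the header
 * word, and the remaining bits carry the payload (e.g. a length). */
#define N_WIDETAG_BITS 8
#define WIDETAG_MASK   ((uintptr_t)0xff)

/* Hypothetical helper: extract the widetag from a header word. */
static inline unsigned header_widetag(uintptr_t header) {
    return (unsigned)(header & WIDETAG_MASK);
}

/* Hypothetical helper: extract everything above the widetag. */
static inline uintptr_t header_payload(uintptr_t header) {
    return header >> N_WIDETAG_BITS;
}

/* Hypothetical helper: build a header word from widetag and payload. */
static inline uintptr_t make_header(unsigned widetag, uintptr_t payload) {
    return (payload << N_WIDETAG_BITS) | ((uintptr_t)widetag & WIDETAG_MASK);
}
```

For example, make_header(17, 3) would describe a three-word object carrying the widetag value 17 quoted above.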
The exception is CONSes. For some reason that surely made sense back in
the day, we twist the system really hard to make cons lists efficient,
and conses are just two tagged words, with no header word. AFAIK, this
is why characters (and single floats on x86-64) have widetags, even
though they're immediate values: when characters or single floats are
stored in conses, we can pretend they're one-word immediates, by
dispatching on their widetag.
Pointers are tagged for three reasons:
1. we want to be able to pun fixnums and other immediate values in
pointer-sized words;
2. we want to make some type tests quick;
3. we want to know when we're looking at a CONS or at a header word
(useful for the GC, otherwise, [mostly] not having interior pointers
makes things pretty simple).
All heap-allocated objects (or dynamic extent stack-allocated ones) are
aligned to two words (real words: 128 bits on x86-64, 64 on x86), so we
have plenty of free low-order bits (four on x86-64, three on x86).
That's what lowtag-mask represents.
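As a rough illustration of what the free low-order bits buy us, here is a small C sketch. It derives the mask from the two-word alignment, and the function names are invented for this example rather than taken from the SBCL runtime.

```c
#include <stdint.h>

/* One machine word: 8 bytes on x86-64, 4 on x86. */
#define WORD_BYTES  ((uintptr_t)sizeof(uintptr_t))

/* Two-word alignment leaves the low bits of every object address zero,
 * so the mask is 15 on x86-64 and 7 on x86. */
#define LOWTAG_MASK ((uintptr_t)(2 * WORD_BYTES - 1))

/* Hypothetical helper: the tag stored in the low bits. */
static inline uintptr_t lowtag_of(uintptr_t tagged) {
    return tagged & LOWTAG_MASK;
}

/* Hypothetical helper: the aligned address with the tag bits cleared. */
static inline uintptr_t untagged_address(uintptr_t tagged) {
    return tagged & ~LOWTAG_MASK;
}

/* Hypothetical helper: combine an aligned address with a lowtag. */
static inline uintptr_t tag_address(uintptr_t aligned, uintptr_t lowtag) {
    return aligned | (lowtag & LOWTAG_MASK);
}
```

Masking works for any lowtag; when the lowtag is known in advance, subtracting it gives the same address and folds nicely into an addressing offset.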
There are plenty of constraints, but suffice it to say that lowtags
and widetags are not the same thing. You can find their definitions in
src/compiler/generic/early-objdef.lisp, and their values are probably
best found by typing sb-vm::foo-*tag at the REPL.
The tagging scheme on 64 bit architectures is a bit strange, because
Alastair Bridgewater managed to give us 63-bit fixnums: if the least
significant bit is 0, we have a fixnum, otherwise, it's something else…
On other platforms, a least significant bit of 0 means an immediate
value (00 is for fixnums) and 1 a tagged reference.
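A small C sketch of the 64-bit fixnum rule, assuming exactly what is described above (low bit 0 marks a fixnum, and the value lives in the upper 63 bits). The function names are hypothetical.

```c
#include <stdint.h>

/* Assumption: on 64-bit platforms a word whose lowest bit is 0 is a
 * fixnum, with the integer value stored in the remaining 63 bits. */

/* Hypothetical helper: fixnum test on a 64-bit Lisp word. */
static inline int is_fixnum64(uint64_t word) {
    return (word & 1) == 0;
}

/* Hypothetical helper: box an integer that fits in 63 bits.
 * The shift is done on an unsigned value, so it is well defined. */
static inline uint64_t make_fixnum64(int64_t n) {
    return (uint64_t)n << 1;
}

/* Hypothetical helper: unbox a fixnum. The cast back to a signed type
 * assumes two's complement, which holds on the platforms discussed.
 * The word is even, so dividing by 2 is exact. */
static inline int64_t fixnum64_value(uint64_t word) {
    return (int64_t)word / 2;
}
```

Any word with the low bit set (a pointer or another immediate) fails the is_fixnum64 test and has to be decoded by its lowtag instead.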
src/compiler/generic/objdef.lisp defines the layout for primitive lisp
objects, along with their widetag and lowtag. For bignum, we have
  (define-primitive-object (bignum :lowtag other-pointer-lowtag
                                   :widetag bignum-widetag)
    (digits :rest-p t :c-type #!-alpha "sword_t" #!+alpha "u32"))
(alpha is/was a strange 32/64 beast).
So, to take a pointer to a bignum and get the base address, we have to
subtract other-pointer-lowtag, which is 15 on x86-64 and 7 on x86 (the
same values as lowtag-mask, but that's purely coincidental).
That only gives us a pointer to the beginning of the bignum object, to
the header word. So, we have to go one word forward to get to the
digits. -15+8 = -7 (x86-64), and -7+4 = -3 (x86), hence the constants
7 and 3.
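The arithmetic can be written out as a short C sketch. The lowtag values are the ones given above; the macro and function names (other than BIGNUM_TAG, which comes from the original question) are my own, and I haven't checked them against what gmp_wrap.c actually does.

```c
#include <stdint.h>

/* One machine word: 8 bytes on x86-64, 4 on x86. */
#define WORD_BYTES ((uintptr_t)sizeof(uintptr_t))

/* other-pointer-lowtag as stated above: 15 on x86-64, 7 on x86. */
#define OTHER_POINTER_LOWTAG ((uintptr_t)(sizeof(uintptr_t) == 8 ? 15 : 7))

/* Remove the lowtag, then skip the one-word header:
 * 15 - 8 = 7 on x86-64, 7 - 4 = 3 on x86. */
#define BIGNUM_TAG (OTHER_POINTER_LOWTAG - WORD_BYTES)

/* Hypothetical helper: from a tagged Lisp pointer to a bignum, get a
 * C pointer to its first digit word. */
static inline uintptr_t *bignum_digits(uintptr_t tagged) {
    return (uintptr_t *)(tagged - BIGNUM_TAG);
}
```

Given a fake object laid out as a header word followed by digit words, tagging its address with OTHER_POINTER_LOWTAG and passing it to bignum_digits should land on the word right after the header.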