[Nomen-dev] Threads again- run for the hills!
Brought to you by:
bhurt
|
From: Brian H. <bh...@sp...> - 2002-04-04 21:18:23
|
The idea came to me last night- pointer swizzling.
OK, a little definitional here. Pointer swizzling is when you can tell
from it's form that a pointer is invalid. You then use "invalid" pointers
to hold reference information, and supply an unswizzle routine to turn an
invalid pointer into a valid one.
Some examples make this clearer. Let's say you're on the x86- a 32-bit
architecture that allows misaligned stores. You could limit memory usage
to the low 2 gig (want more memory? Use a 64-bit processor!). If the
high bit is clear, it's a valid normal pointer. If the high bit is set,
it's an invalid pointer, and the other 31 bits form some form of index to
allow the unswizzle routine determine what to do about it. On GCC you
could do something like:
extern void * unswizzle(void *);
static __inline__ void * load_ptr(void * ptr) {
if ((((unsigned long) ptr) & 0x80000000ul) != 0) {
return unswizzle(ptr);
} else {
return ptr;
}
}
The performance hit on this is minimal- it compiles to:
movl ptr, %ebx ; or whichever register- we'd have this
; instruction anyways
testl %ebx, %ebx
jns 1f
pushl %eax
pushl %ebx
call unswizzle
movl %eax, %ebx
addl $4, %esp
popl %eax
1:
We've added only 12 bytes of object code total over the original code
size, and executed only an additional two instructions (and no memory
reads/writes) over what was before in the (hopefully) common path of the
pointer being unswizzled. Note that this cost is already incurred in many
garbage collection algorithms.
If you were on something like a PPC, a 32-bit architecture which doesn't
allow misaligned loads, you could declare that if the low order bit is
set, it's an invalid pointer. Even better, you could simply patch the
signal handler and have it handle the deswizzling, meaning you don't have
to check each pointer load. An interesting read to this is:
http://citeseer.nj.nec.com/wilson92pointer.html
In addition to the classic uses of pointer swizzling for distributed
shared memory and persistant object store, we can also use it for garbage
collection and locks.
Here's the rules:
1) The user space threads always swizzle the pointers before writting
them, and then sends a message to the GC that a new pointer has been
written (note, this isn't as expensive as it sounds- most of the time you
simply append the address of the pointer just written to an array. A
handfull of instructions, and already done in many GC systems).
2) The user space threads always check for a swizzled pointer when they
load a pointer, and calls unswizzle() if it's swizzled. Among other
things, unswizzle() grabs any locks necessary. Unswizzle may also
unserialize the object, retrieve it from a remote system, or wait for the
GC to finish copying it before returning it. Once unswizzle() returns,
the thread owns the object until:
3) The user space thread no longer needs the object, at which point if the
original pointer is (still) swizzled, it calls release_swizzle(). Note
again this is only a handfull of instructions. release_swizzle() unlocks
any locks required.
4) The garbage collection can copy and object by first acquiring the lock
for the object, going through and changing all references to the lock to
be swizzled, copying the object, then going through and changing all the
references back.
5) In addition, the lock anaylzer can go through and decide what pointers
to unswizzle. Note the decision to lock or not lock can be done per
pointer- not per object. So if you're going from object A to object B you
don't have to lock (as A's pointer to B is unswizzled), while if you go
from object C to object B you do have to lock (because C's pointer is
swizzled).
You know, I'm tempted to abuse C++ as an experimentation system. Wow-
mark the calander. This is the second good thing I've said about C++ in
my entire life. I think I need to lay down for a bit. But C++'s lack of
an already-existing GC system, operator overloading, and templates give me
enough rope to screw around with this without designing a special purpose
language.
Brian
|