Re: [Sablevm-developer] Threading support in SableVM

Chris Pickett wrote:
> On an SMP machine, the "main memory" is the non-cache heap memory, 
> visible to all processors.  Threads may reside on the same processor, 
> and if this is the case, their "working memories" are also visible to 
> each other, and no problems arise (functionally identical to the 
> uniprocessor case).  If they are on separate processors, in order to 
> meet the requirements of the JMM, each time a lock is acquired by a 
> thread, ALL of the lines brought into the cache by the current thread as 
> a result of reading values from the Java heap must be flushed:  they are 
> written back to main memory if they were touched by the thread, 
> otherwise they are simply invalidated.  When a lock is released, ONLY 
> those lines associated with the thread that have been modified since 
> acquiring the lock need flushing.
> 
> So ... assuming that's now correct, I think there are three things that 
> we might consider doing (some or all of which may be gibberish):
> 
> 1) Flush the entire processor cache as part of each MONITORENTER and 
> MONITOREXIT, or when entering or leaving a synchronized method.  This 
> would involve calling / executing one of:
>    a) WBINVD (not available in user mode),
>    b) CLFLUSH on the entire cache (one line at a time),
>    c) a kernel whole-cache flush routine,
>    d) flooding the cache by reading in a bunch of non-Java-heap data.
> 
> 2) Keep track of which Java heap addresses are read / written by a 
> thread, and flush only the cache lines that match those addresses as 
> part of MONITORENTER / MONITOREXIT, or when entering or leaving a 
> synchronized method.  This would involve calling / executing:
>    a) CLFLUSH for each line
>    b) a line-specific kernel cache flush routine.
> 
> 3) Use the memory barrier instructions:
>    a) MFENCE on each (Java only?) lock/unlock ensures that all loads and 
> stores occurring before the lock/unlock are globally visible before any 
> load or store that follows the MFENCE
>    b) *** while it appears that SFENCE (identical to MFENCE except only 
> stores are serialized) might be appropriate for the unlock operation, 
> this would mean a load operation depending on a store ordered before the 
> SFENCE might occur out-of-order, which would be bad. ***
> 
> ........
> 
> Finally:  After I wrote this, I looked again at question #118 of the 
> comp.programming.threads FAQ, and it seems to agree with what I've 
> written, and also makes me think that method (3) is the best.

That is consistent with my understanding as well. I think #3 is best
too, and it's probably OK to "punt" and say that a processor-specific
instruction sequence will be required for the read and write barriers
(and therefore an additional porting task). FYI the Linux kernel has
examples of asm() statements that create memory barriers for all its
supported architectures.
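
To make the porting task concrete, here is a rough sketch of what such
per-architecture barrier macros might look like, with a full barrier
issued on lock and unlock as in option (3a). All names here
(SVM_FULL_BARRIER, svm_lock_t, svm_monitor_enter, svm_monitor_exit) are
made up for illustration and are not existing SableVM code. It assumes
GCC or a compatible compiler, and the toy spin lock merely stands in for
a real monitor implementation.

```c
#include <assert.h>

/* Hypothetical barrier macros; each port would supply its own sequence. */
#if defined(__GNUC__) && \
    (defined(__x86_64__) || (defined(__i386__) && defined(__SSE2__)))
/* x86 with SSE2: the "memory" clobber also stops compiler reordering. */
#define SVM_FULL_BARRIER()  __asm__ __volatile__("mfence" ::: "memory")
#define SVM_READ_BARRIER()  __asm__ __volatile__("lfence" ::: "memory")
#define SVM_WRITE_BARRIER() __asm__ __volatile__("sfence" ::: "memory")
#elif defined(__GNUC__)
/* Other targets: fall back to the GCC builtin, which is a full barrier. */
#define SVM_FULL_BARRIER()  __sync_synchronize()
#define SVM_READ_BARRIER()  __sync_synchronize()
#define SVM_WRITE_BARRIER() __sync_synchronize()
#else
#error "memory barriers not yet ported to this compiler/architecture"
#endif

/* Toy spin lock standing in for a Java monitor (assumption). */
typedef volatile int svm_lock_t;

static void svm_monitor_enter(svm_lock_t *lock)
{
    /* Atomically set the lock word to 1; spin while it was already 1. */
    while (__sync_lock_test_and_set(lock, 1)) {
        /* busy-wait */
    }
    /* Option (3a): full barrier after acquiring the lock. */
    SVM_FULL_BARRIER();
}

static void svm_monitor_exit(svm_lock_t *lock)
{
    assert(*lock == 1);      /* caller must hold the lock */
    /* Full barrier rather than a store-only one, per the concern in (3b). */
    SVM_FULL_BARRIER();
    __sync_lock_release(lock);   /* writes 0 to the lock word */
}
```

A port to a new architecture would then only need to fill in the three
macros. The full barrier is used on both sides here purely to mirror
option (3a); whether a weaker barrier suffices on a given processor is a
separate question.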

-Archie

__________________________________________________________________________
Archie Cobbs      *        CTO, Awarix        *      http://www.awarix.com