Re: [Sablevm-developer] Threading support in SableVM
From: Chris P. <chr...@ma...> - 2004-02-19 22:15:19
Archie Cobbs wrote:
> Chris Pickett wrote:
>
>>I started looking at the POSIX 1003.1c (pthreads) spec (it's
>>available online) and also at the comp.programming.threads FAQ (1 Mb
>>html file kills Mozilla on my machine, better to download) ... and
>>discovered a few interesting things:
>>
>>1) The only way to ensure cache coherency in a portable manner is to use
>>the pthreads synchronization functions (e.g. lock and unlock). So I
>>think that means there is no need for us to consider the Linux kernel
>>cache flush architecture, nor any processor-specific cache flush
>>instructions.
>
> On a related note: Java semantics imply a read barrier at MONITORENTER
> and a write barrier with MONITOREXIT. With fat locks, you get this
> automatically because they are implemented using pthread mutexes.
> But with thin locks where there is no contention, technically SableVM
> is at fault because it doesn't explicitly impose the read/write barriers
> (does it?).

That's what I think :(

Etienne wrote about it here:

http://lists.debian.org/debian-ia64/2003/debian-ia64-200302/msg00035.html

(first hit if you google for "thin locks smp"!)

and the description of locks in SableVM is here:

http://www.usenix.org/publications/library/proceedings/jvm01/gagnon/gagnon_html/node14.html

After reading the comp.programming.threads FAQ stuff (just search the
document for "cache"), they say that although workable hacks exist, if
you want any portability or guarantees you need to use POSIX only, and
you should only use the hacks if you know /exactly/ what you're doing.
But at the same time, it sounds like strictly-POSIX thin locks don't
exist ... so it might be easier to try and introduce a cache flush
instruction or system cache flush call in places.
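To make the barrier requirement concrete, here is a minimal sketch in C of an uncontended thin-lock fast path with explicit ordering at enter and exit. It is not SableVM's lock code: the names thin_lock_t, thin_try_enter and thin_exit are hypothetical, the lock word layout is simplified to "0 or owner id", and it assumes a compiler with C11 <stdatomic.h>, which postdates this thread.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical thin lock word: 0 means unlocked, any other value is the
 * id of the owning thread. Real thin locks also pack a recursion count
 * and an inflation bit into the word; that is left out here. */
typedef struct {
    _Atomic uint32_t word;
} thin_lock_t;

/* Uncontended MONITORENTER path: a single compare-and-swap.
 * memory_order_acquire on success keeps this thread's later loads and
 * stores from being reordered before the lock is taken, and makes the
 * previous owner's released writes visible. Returns false if the lock
 * is already held (the caller would then fall back to a fat lock). */
static bool thin_try_enter(thin_lock_t *lock, uint32_t thread_id)
{
    uint32_t expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &lock->word, &expected, thread_id,
        memory_order_acquire,   /* ordering if the swap succeeds */
        memory_order_relaxed);  /* ordering if it fails */
}

/* Uncontended MONITOREXIT path: a release store.
 * memory_order_release keeps this thread's earlier loads and stores
 * from being reordered after the unlock. */
static void thin_exit(thin_lock_t *lock)
{
    atomic_store_explicit(&lock->word, 0, memory_order_release);
}
```

A thread would call thin_try_enter(&lock, my_id) and, on false, take the contended (fat lock) path. The acquire/release pair expresses the ordering portably and leaves it to the compiler to emit whatever barrier instruction, if any, the target needs.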
There are two solutions I can see:

1) Make the current thin locks optional

OR

2) Introduce explicit cache flushing where necessary

Personally, I would be happy enough with (1), since my speculative
multithreading work only needs to show relative speedup (and indeed, the
faster an "unmodified" SableVM is, the less that relative speedup will
be ...), but I'm actually just eager to take the path of least resistance :)

> On i386 it works out anyway because I think the compare-and-swap
> sequence enforces a memory barrier. But in general that's not true.

Well, SableVM doesn't work on an Athlon MP 2000+, which is i686. But
I'm not sure if it's because of a broken C&S or not. If it IS because
of a broken C&S, that's a good thing; however, if the C&S is /already/
imposing an MB, then that's bad because it means the problem is
elsewhere. I think.

(more reading ensues)

I looked up the IA-32 instruction set reference (split in 2 parts):

http://developer.intel.com/design/pentium4/manuals/253666.htm
http://developer.intel.com/design/pentium4/manuals/253667.htm

CMPXCHG doesn't mention flushing the processor's cache.

INVD ignores cache contents and invalidates the cache.

WBINVD writes back cache contents and invalidates the cache, and signals
other processors to do the same.

However, the documentation says:

The WBINVD instruction is a privileged instruction. When the processor
is running in protected mode, the CPL of a program or procedure must be
0 to execute this instruction. This instruction is also a serializing
instruction (see "Serializing Instructions" in Chapter 8 of the IA-32
Intel Architecture Software Developer's Manual, Volume 3).

I'm not sure if this is a problem, but if not, maybe all that's required
is WBINVD in the C&S for i386?

It would also be nice if we didn't have to call WBINVD on a uniprocessor
...

> I could be wrong about all this but this is what memory recalls.

Whether or not you are, thanks for discussing it, it's always helpful.

Cheers,
Chris