[Sablevm-developer] Flushing strategy ....

Hi Etienne,

In a previous posting you had the following to say on flushing strategy. 
Here are some more ideas.

Currently you have _svmf_flush(_svmt_word *pword). Flushing one word at a
time is very inefficient on many current-generation implementations. The
next best granularity would be the cache line. Best of all would be to
generate the code for a whole "block" and then call iflush just once with a
pointer and a length. The iflush can then be tuned to a particular
architecture (or, more precisely, to a particular implementation of the
architecture). With cache sizes increasing (from good fractions of a MB to
several MB), flushing whole caches is a bad idea, as it will greatly affect
performance. CPU bandwidth and memory bandwidth are both increasing with
successive generations ....
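
To put rough numbers on the granularity point, here is a minimal sketch in C.
The helper name lines_touched, the 32-byte line size and the 4-byte word size
are assumptions of mine for illustration, not anything taken from SableVM. It
only counts how many cache lines a generated block covers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of cache lines touched by the byte range
   [addr, addr + len). line_size is assumed to be a power of two. */
size_t lines_touched(uintptr_t addr, size_t len, size_t line_size)
{
    assert(line_size != 0 && (line_size & (line_size - 1)) == 0);
    if (len == 0)
        return 0;

    /* Round the first and last byte addresses down to a line boundary. */
    uintptr_t mask  = ~(uintptr_t)(line_size - 1);
    uintptr_t first = addr & mask;
    uintptr_t last  = (addr + len - 1) & mask;

    return (size_t)((last - first) / line_size) + 1;
}
```

Under those assumptions, a line-aligned 4096-byte block of generated code
costs 1024 per-word flush calls but covers only 128 cache lines, so a
line-granular flush does one eighth of the work.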

Let's consider the following cases.

1. Implementation with no caches, then obviously iflush will have to do 
nothing.

2. Implementation with split I and D caches with no coherency between them.
In this case one can either flush on a cache-line basis or, if the routine
figures out that the size of the generated code is >= the cache size, just
flush the whole caches.

3. Implementation with multi-level caches, with unified caches occurring at
higher levels. In this case we only need to flush to the first unified level
(this will work only for the UP case).

4. MP implications. Depending on the coherency of the various cache levels,
one will have to invalidate all the other caches and write out the
generated code to main memory.

5. For NUMA ??
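
Cases 1 to 3 can be sketched as a small decision routine. This is only an
illustration: the enum names and choose_plan are hypothetical, and cases 4
(MP) and 5 (NUMA) are deliberately left out since they are still open
questions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of the cache organisation (cases 1 to 3). */
enum cache_kind {
    CACHE_NONE,              /* case 1: no caches                     */
    CACHE_SPLIT_INCOHERENT,  /* case 2: split I/D, no coherency       */
    CACHE_UNIFIED_ABOVE_L1   /* case 3: unified cache at higher level */
};

/* Hypothetical outcome: what the flush routine should do. */
enum flush_plan {
    PLAN_NOTHING,
    PLAN_BY_LINE,
    PLAN_WHOLE_CACHE,
    PLAN_TO_UNIFIED_LEVEL
};

enum flush_plan choose_plan(enum cache_kind kind, size_t code_len,
                            size_t cache_size)
{
    switch (kind) {
    case CACHE_NONE:
        return PLAN_NOTHING;
    case CACHE_SPLIT_INCOHERENT:
        assert(cache_size > 0);
        /* Walking line by line stops paying off once the generated
           code is at least as large as the cache itself. */
        return code_len >= cache_size ? PLAN_WHOLE_CACHE : PLAN_BY_LINE;
    case CACHE_UNIFIED_ABOVE_L1:
        return PLAN_TO_UNIFIED_LEVEL;
    }
    return PLAN_WHOLE_CACHE; /* unknown organisation: be conservative */
}
```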

Bottom line: I think it is better to give iflush a second, size parameter
and leave the implementation specific to a particular architecture
implementation.
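
As a rough sketch of what such an interface could look like (the name
iflush_range and its return convention are my own assumptions, not existing
SableVM code): on compilers that provide GCC's __builtin___clear_cache, the
body can simply hand the range to the compiler, which picks whatever the
target needs. Other toolchains would need their own architecture-specific
body.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical range-based flush: makes the len bytes of freshly
   generated code starting at start visible to instruction fetch.
   Returns 1 if a flush was issued, 0 if the range was empty. */
int iflush_range(void *start, size_t len)
{
    if (len == 0)
        return 0;
    assert(start != NULL);

#if defined(__GNUC__) || defined(__clang__)
    /* GCC/Clang builtin; takes the begin and end of the range. */
    char *begin = (char *)start;
    __builtin___clear_cache(begin, begin + len);
#else
#error "no range flush known for this compiler; add one here"
#endif
    return 1;
}
```

The caller would generate a whole block first and then make one call, for
example iflush_range(code, code_len), instead of one call per word.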

What say you ?

Later,
-Gunda

Hi Grzegorz,

You are doing some very interesting work.

A simple comment: we will probably need to be ready to fine tune the
flushing strategy for specific architectures.  Should we act upon the
"data" cache or the "instruction" cache, or both? What is the ideal
granularity of this action (single word, cache line, or general flush)?
What do we flush (write back buffer only, all entries in the cache, ...)?
Yep, a lot of fun ahead...

The simplest strategy, in the short term, would be a full cache flush
(of both instruction and data caches). As this only happens "once"
for each executed method (at the end of method preparation), the
simple approach might have no significant impact on the running time,
while allowing the inline-threaded engine to work on modern
processors.

Of course, once we're done with the inline-threaded engine, we will
have to attack the multi-processor cache coherency problem...

Thanks a lot for this very important work.

Etienne
