[Sablevm-developer] Flushing strategy ....
From: Gunda D. <sab...@ho...> - 2002-12-10 22:17:43
Hi Etienne,

In a previous posting you had the following to say on flushing strategy. Here are some more ideas.

Currently you have _svmf_flush(_svmt_word *pword). This granularity is very inefficient for many present-generation architecture implementations. The next best granularity would be the cache line. The best would be to generate the code for a "block" and then just call iflush with a pointer and the length. The iflush can be tuned to a particular architecture (or, more precisely, to an implementation of the architecture).

With cache sizes increasing (good fractions of a MB to several MB), it will be a bad idea to flush whole caches, as this will greatly affect performance. The CPU bandwidth and the memory bandwidth are increasing with successive generations ....

Let's consider the following cases.

1. Implementation with no caches: obviously iflush will have to do nothing.

2. Implementation with split I and D caches and no coherency between them: one can either flush on a cache-line basis or, if the routine figures out that the size of the generated code is >= the cache size, just flush the whole caches.

3. Implementation with multi-level caches, with unified caches occurring at higher levels: we only need to flush to the first unified level (this will work only for the UP case).

4. MP implications: depending on the coherency of the various cache levels, one will have to invalidate all the other caches and write out the generated code to main memory.

5. For NUMA ??

Bottom line: I think it is better to give iflush a second, size parameter and leave the implementation specific to a particular architecture implementation.

What say you?

Later,
-Gunda

Hi Grzegorz,

You are doing some very interesting work.

A simple comment: we will probably need to be ready to fine tune the flushing strategy for specific architectures.
Should we act upon the "data" cache or the "instruction" cache, or both? What is the ideal granularity of this action (single word or cache line, or general flush)? What do we flush (write-back buffer only, all entries in the cache, ...)? Yep, a lot of fun ahead...

The simplest strategy, in the short term, would be a full cache flush (of both instruction and data caches). As this only happens "once" for each executed method (at the end of method preparation), the simple approach might have no significant impact on the running time, while allowing the inline-threaded engine to work on modern processors.

Of course, once we're done with the inline-threaded engine, we will have to attack the multi-processor cache coherency problem...

Thanks a lot for this very important work.

Etienne
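
For illustration only, the iflush(pointer, length) interface proposed above could be sketched roughly as follows. This is not SableVM code. The names SVM_CACHE_LINE_SIZE, SVM_ICACHE_SIZE, flush_plan and plan_iflush are hypothetical, and the cache geometry is an assumed example rather than a measured value. The sketch assumes GCC or Clang, whose __builtin___clear_cache stands in for the per-implementation flush primitive.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed example geometry; a real port would set these per CPU implementation. */
#define SVM_CACHE_LINE_SIZE 32u
#define SVM_ICACHE_SIZE (16u * 1024u)

typedef enum { FLUSH_NONE, FLUSH_LINES, FLUSH_ALL } flush_kind;

typedef struct
{
  flush_kind kind;
  uintptr_t first_line; /* address of the first cache line to flush */
  size_t line_count;    /* number of cache lines covering the range */
} flush_plan;

/* Decide how to flush the byte range [start, start + length). */
flush_plan plan_iflush (const void *start, size_t length)
{
  flush_plan plan = { FLUSH_NONE, 0, 0 };

  if (length == 0)
    return plan;

  /* Case 2 above: generated code at least as large as the cache. */
  if (length >= SVM_ICACHE_SIZE)
    {
      plan.kind = FLUSH_ALL;
      return plan;
    }

  /* Otherwise work per cache line: round the start down to a line boundary. */
  uintptr_t begin = (uintptr_t) start & ~(uintptr_t) (SVM_CACHE_LINE_SIZE - 1);
  uintptr_t end = (uintptr_t) start + length; /* one past the last byte */

  plan.kind = FLUSH_LINES;
  plan.first_line = begin;
  plan.line_count = (end - begin + SVM_CACHE_LINE_SIZE - 1) / SVM_CACHE_LINE_SIZE;
  return plan;
}

/* Flush a whole block of freshly generated code in a single call. */
void iflush (void *start, size_t length)
{
  flush_plan plan = plan_iflush (start, length);

  if (plan.kind == FLUSH_NONE)
    return;

  /* Stand-in for the tuned primitive: a real port would loop over
     plan.line_count lines for FLUSH_LINES and use a whole-cache
     operation for FLUSH_ALL. The builtin just takes the range. */
  __builtin___clear_cache ((char *) start, (char *) start + length);
}
```

The code generator would then call iflush(code_start, code_length) once per generated block instead of once per word. The planning step is kept separate so that the per-line versus whole-cache decision can be tuned for each implementation without touching the callers.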