From: SUGIOKA T. <su...@it...> - 2003-05-08 03:52:20
At 16:45 03/05/07 -0500, Paul Mundt <le...@li...> wrote:
>On Wed, May 07, 2003 at 06:18:11PM +0900, SUGIOKA Toshinobu wrote:
>> Thanks a lot. I tried this patch on my SH-7750R (which has 2-way cache) board.
>> With a few changes, it worked very well.
>> X window system with vnc-server runs on this kernel.
>>
>Okay, overall this looks good, though I have a few suggestions. For one,
>I just fixed up flush_icache_range() in HEAD to clear out the valid bit
>per-way, which it wasn't doing before. Please try this first and see if
>it works for you. If it turns out that flush_cache_all() is faster, then
>obviously we can use that, but I'd like to avoid excessive flushing.
>

flush_icache_range(unsigned long start, unsigned long end) in HEAD doesn't
work anyway. This function is used in fs/binfmt_elf.c and has to flush the
whole range from start to end, but the HEAD version flushes only 1 line
(32 bytes).
I think it's better to flush the whole i-cache, because the range may be
very large.

The HEAD function seems to be the same as the previous version of
flush_cache_sigtramp(). flush_cache_sigtramp() is the optimized version for
the signal trampoline, which needs only 1 line to be flushed.
# My patch writes back 2 cache lines, but 1 line should be sufficient for now.

>Secondly, there were a few fixups posted earlier for
>__flush_dcache_all() and flush_cache_4096_all() to deal with the new
>cache layout (including multiple ways), which also has been reported as
>working for 7750R and 7751R. In this case, I'd prefer to merge these
>cleanups first and then take a look at the rest of your fixes to see
>what is still relevant.

Well, I know those fixups, but...

(1) On older SH-4 CPUs, the current 2.4 branch works well. These functions
are optimized for those CPUs, but they assume a direct-mapped cache, so we
can't use them on a 2-way cache. The fixups simply extend the address range
that is pre-loaded to flush the d-cache. I think this is a bit dangerous,
because the LRU bit may not always work for this purpose.
And anyway, there is a more efficient method to flush a 2-way cache.

(2) On the new CPUs that have a 2-way cache, there is no restriction on
accessing the operand cache address array, so we can flush the d-cache by
clearing this array without a performance penalty.

(3) I don't like code like this, because it is hard to maintain.
+       mov.l   @(56,r6),r1     ! cpu_data->dcache.ways

(4) We don't need to consider multiple ways in functions that use
flush_cache_4096(); the ASSOC bit takes care of that. flush_dcache_page()
and __flush_cache_page() in the 2.4 branch seem to work without any change.

For performance reasons, I think it's better to use a different cache
flushing method for a 2-way cache than for a direct-mapped cache.

----
SUGIOKA Toshinobu
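
To illustrate the flush_icache_range() point above (one 32-byte line flushed
instead of the whole range), here is a minimal host-side sketch of walking a
range line by line. It is not the kernel code: sketch_flush_range(),
sketch_flush_one_line() and lines_flushed are hypothetical names, end is
assumed to be exclusive, and the per-line hardware operation is replaced by a
counter so the loop can be checked on any machine.

```c
#include <assert.h>

/* Cache line size on SH-4, as stated in the message (32 bytes). */
#define L1_CACHE_BYTES 32UL

/* Hypothetical stand-in for the per-line hardware flush: it only counts. */
unsigned long lines_flushed;

void sketch_flush_one_line(unsigned long addr)
{
    (void)addr;         /* a real implementation would touch the cache here */
    lines_flushed++;
}

/*
 * Flush every cache line that overlaps [start, end).
 * Returns the number of lines visited.
 * Assumption: end is exclusive and end + L1_CACHE_BYTES does not wrap.
 */
unsigned long sketch_flush_range(unsigned long start, unsigned long end)
{
    unsigned long addr;
    unsigned long count = 0;

    assert(start <= end);
    if (start == end)
        return 0;       /* empty range: nothing to flush */

    /* Round start down to a line boundary, then step one line at a time. */
    for (addr = start & ~(L1_CACHE_BYTES - 1); addr < end;
         addr += L1_CACHE_BYTES) {
        sketch_flush_one_line(addr);
        count++;
    }
    return count;
}
```

With this loop a 4096-byte range visits 128 lines, which is why flushing the
whole i-cache can be the cheaper choice once the range gets large.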