From: Josef W. <Jos...@gm...> - 2006-11-07 17:33:40
On Tuesday 07 November 2006 08:12, Ulrich Drepper wrote:
> In case a data set spreads two cache lines and the second cache line is
> at index zero (i.e., the first at the highest index), the tag used for
> the second cache line is wrong. It is one higher than the tag for the
> first, otherwise no wrap-around would happen. Patch is attached.

Indeed. And thanks; the callgrind version of the simulator has the same
bug. Wow. How did you find this?

> Which brings on the next step: now the cache_t2 structure consists of 8
> words and the char array. If you rearrange the struct to move the tags
> pointer before the desc_line element all commonly used elements are in
> the first 32 or 64 bytes (for 32 or 64 byte platforms respectively). If
> now cache_t2 is aligned for this value there is only one cache line
> needed for L2, I1, D1.

cache_t2 is surely resident in L1 for the whole run. So this is not about
reducing latency in a hot path, but about freeing a cache line in L1 to
make room for other uses.

The patch is fine; I just wonder if this really makes any difference in
practice. With a 32 kB L1 cache and 64-byte lines, you have 512 lines, so
freeing one gives you about 0.2% more space.

Josef
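
For readers following the tag issue, here is a minimal sketch of the wrap-around case, not the simulator's actual code. The geometry (64-byte lines, 512 sets) and the names `line_ref`, `locate` and `locate_last` are assumptions made up for illustration. It shows why the second line of a straddling access needs its own tag, computed from the address of its last byte, rather than the tag of the first line.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical geometry for illustration only: 64-byte lines, 512 sets. */
#define LINE_BITS 6
#define SET_BITS  9

typedef struct {
    uint64_t set;   /* set index the address maps to */
    uint64_t tag;   /* remaining high bits of the address */
} line_ref;

/* Split an address into its set index and tag. */
static line_ref locate(uint64_t addr)
{
    line_ref r;
    r.set = (addr >> LINE_BITS) & ((UINT64_C(1) << SET_BITS) - 1);
    r.tag = addr >> (LINE_BITS + SET_BITS);
    return r;
}

/* Locate the line that holds the last byte of an access of `size` bytes.
 * Its tag must be derived from that byte's address, not copied from the
 * first line: when the set index wraps to zero, the tag is one higher. */
static line_ref locate_last(uint64_t addr, unsigned size)
{
    assert(size > 0);
    return locate(addr + size - 1);
}
```

With this geometry, an 8-byte access at 0x7FFC starts in set 511 with tag 0 and ends in set 0 with tag 1, so reusing the first tag for the second line would look up the wrong entry.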