From: Nicholas N. <nj...@cs...> - 2006-11-22 11:48:07
On Wed, 22 Nov 2006, Ulrich Drepper wrote:

> Nicholas Nethercote wrote:
>> This is not such a hot path, the set1 != set2 case is much less common
>> than the set1 == set2 case.
>
> True, but still far too frequent to be ignored. Even the wrong case (i.e.,
> when the tag needs to be corrected) appeared in measurable amounts. I ran
> the broken and fixed cachegrind binaries and the introduced errors were
> measurable. And that's only a fraction of cases handle in this if branch.

I found the set1 != set2 case to account for 2--3% of all cases, and the tag
miscomputation to occur in about 0.05% of all cases. It's certainly worth
fixing, but the performance effect of the extra tag computation on this cold
path is negligible.

>> Cache optimisations are sufficiently subtle that I would consider very few
>> of them obvious.
>
> Well, it _is_ obvious that if in one version two cache lines are used and in
> the other just one, the second is faster.

What's less obvious is how much faster. I added the extra tag computation in
the set1 != set2 case and saw no difference. I then tried the struct
rearrangement you suggested and again saw no difference. I used
valgrind/perf/bz2.c as the benchmark for these tests.

I've committed a fix for the simulation error. Thanks for pointing this out!

Nick
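
For anyone following the thread without the source at hand, here is a minimal sketch of the set1 != set2 case discussed above. It is not Cachegrind's actual code: the cache geometry (64-byte lines, 256 sets) and the names RefInfo and classify are assumptions made up for illustration. It shows one way the two tags can differ: an access that straddles from the last set back to set 0 also crosses into the next tag, so reusing the first tag for the second line would be wrong there.

```c
#include <stdint.h>

/* Hypothetical cache geometry, chosen only for illustration. */
enum { LINE_BITS = 6, SET_BITS = 8 };              /* 64-byte lines, 256 sets */
enum { NUM_SETS = 1 << SET_BITS, TAG_SHIFT = LINE_BITS + SET_BITS };

/* Set index and tag for the first and last byte of one memory access. */
typedef struct {
    uint32_t set1, set2;
    uint64_t tag1, tag2;
} RefInfo;

/* Classify an access of `size` bytes (size >= 1) starting at `addr`. */
static RefInfo classify(uint64_t addr, unsigned size)
{
    uint64_t last = addr + size - 1;               /* last byte touched */
    RefInfo r;

    r.set1 = (uint32_t)(addr >> LINE_BITS) & (NUM_SETS - 1);
    r.set2 = (uint32_t)(last >> LINE_BITS) & (NUM_SETS - 1);
    r.tag1 = addr >> TAG_SHIFT;
    /* The extra tag computation: it only matters when set1 != set2,
       and it only gives a different value when the set index wraps. */
    r.tag2 = last >> TAG_SHIFT;
    return r;
}
```

With this geometry, a 4-byte access at 0x1000 stays within one line (set1 == set2), one at 0x103E straddles sets 64 and 65 with the same tag, and one at 0x3FFE straddles set 255 and set 0 with tags 0 and 1.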