From: Paul M. <pm...@en...> - 2002-07-12 17:36:21
|
> >Note: measurements on 2.4 do not make sense; reduction of cacheline >bouncing between 2.4 and 2.5 will change the results anyway and >if any of these patches are going to be applied to 2.4, reduction of >cacheline bouncing on ->d_count is going to go in before that one. I think there are some other possibilities for cache-bounce removal in struct dentry. The most obvious one is d_vfs_flags - not only does it get written to on every d_lookup (to set DCACHE_REFERENCED) but it also shares a cache line (on Pentium IV) with d_op, d_iname and part of d_name (along with d_sb and d_fsdata, but these don't seem to be so crucial). Some quick stats gathering suggested that DCACHE_REFERENCED is already set 95%-98% of the time, so this cache bounce is not even doing anything useful. I submitted this patch a while ago making the DCACHE_REFERENCED bit setting be conditional on it not being already set, which didn't generate any interest. One problem with this patch would be the additional branch prediction misses (on some architectures?) that would work against the benefits of not dirtying a cache line. Maybe we should have a function definition something like the following: static __inline__ void __ensure_bit_set(int nr, volatile unsigned long * addr) { #if defined (CONFIG_BIG_SMP) || defined(ARCH_HAVE_PREDICATED_WRITE) if(!test_bit(nr, addr)) #endif set_bit(nr, addr); } so that architectures that support conditional writes (arm and ia64?) and for SMP systems with enough processors that cache-bouncing is an issue, the test can be performed, and for others where the branch prediction miss would hurt us more than the cache dirtying it would just do it unconditionally. --- linux-2.5.13/fs/dcache.c Thu May 2 17:22:42 2002 +++ linux-2.5.13-nodref/fs/dcache.c Sat May 4 16:36:38 2002 @@ -883,7 +883,8 @@ if (memcmp(dentry->d_name.name, str, len)) continue; } - dentry->d_vfs_flags |= DCACHE_REFERENCED; + if(!(dentry->d_vfs_flags & DCACHE_REFERENCED)) + dentry->d_vfs_flags |= DCACHE_REFERENCED; return dentry; } return NULL; Perhaps another solution is to rearrange struct dentry - put the volatile stuff in one cache line (or set of lines), and the constant stuff in another. So probably d_count, d_lru and d_vfs_flags would be in the volatile line, and most other stuff in the the other. It would probably make sense to ensure that d_name and d_iname share a cache line if possible, even on smaller cache-line architectures, and maybe also d_hash and d_parent on that same line. Paul |
From: Paul M. <pm...@en...> - 2002-07-13 17:26:27
|
> >Frankly, I'd rather moved setting DCACHE_REFERENCED to dput(). We don't >care for that bit for dentries with positive ->d_count. > >So I'd just do > >vi fs/dcache.c -c '/|= DCACHE_R/d|/nr_un/pu|<|x' > >and be done with that. Linus? > Some possibly minor issues with that: - accessing foo/../bar, won't mark foo as referenced, even though it might be being referenced frequently. Probably not a common case for foo to be accessed exclusively in this way, but it could be fixed by marking a dentry referenced when following ".." - currently, negative dentries start off unreferenced and get marked referenced the second and subsequent time that they're used. This would change to starting off referenced (by the ref count set in lock_nd() after the ->lookup()) but then not being marked referenced ever again, as they're always looked at under dcache_lock, and no count is taken on them. So used-once negative dentries would hang around longer, and frequently-used negative dentries would be cleaned up sooner. - referenced bit will be set possibly long after the reference is actually taken/used, which might make dentry pruning a little less accurate. I was considering suggesting moving the reference bit setting to unlock_nd(), since that's another place where we're already changing d_count, but that still has the first two problems that I mentioned. Either way, moving d_vfsflags to the same cacheline as d_count would probably be a good idea. Paul |
From: Alexander V. <vi...@ma...> - 2002-07-13 17:33:14
|
On Sat, 13 Jul 2002, Paul Menage wrote: > - accessing foo/../bar, won't mark foo as referenced, even though it > might be being referenced frequently. Probably not a common case for foo > to be accessed exclusively in this way, but it could be fixed by marking > a dentry referenced when following ".." It certainly will. Look - until ->d_count hits zero referenced bit is not touched or looked at. At all. Look at the code. There is _no_ aging for dentries with positive ->d_count. They start aging only when after they enter unused_list... |
From: Alexander V. <vi...@ma...> - 2002-07-13 17:54:33
|
On Sat, 13 Jul 2002, Alexander Viro wrote: > > > On Sat, 13 Jul 2002, Paul Menage wrote: > > > - accessing foo/../bar, won't mark foo as referenced, even though it > > might be being referenced frequently. Probably not a common case for foo > > to be accessed exclusively in this way, but it could be fixed by marking > > a dentry referenced when following ".." > > It certainly will. Look - until ->d_count hits zero referenced bit is > not touched or looked at. At all. > > Look at the code. There is _no_ aging for dentries with positive ->d_count. > They start aging only when after they enter unused_list... On the second thought, I should apply that advice myself. It's true that they start aging only after they hit the unused list, but they are born old. Hrm... OK, it kills that variant and it really looks like there's no way around dirtying dentry first time we find it on d_lookup(). Which means that we probably want to put that |= under if - without any ifdefs... |
From: Paul M. <pm...@en...> - 2002-07-13 17:51:42
|
> > >On Sat, 13 Jul 2002, Paul Menage wrote: > >> - accessing foo/../bar, won't mark foo as referenced, even though it >> might be being referenced frequently. Probably not a common case for foo >> to be accessed exclusively in this way, but it could be fixed by marking >> a dentry referenced when following ".." > >It certainly will. Look - until ->d_count hits zero referenced bit is >not touched or looked at. At all. > >Look at the code. There is _no_ aging for dentries with positive ->d_count. >They start aging only when after they enter unused_list... > Yes, but with the fastwalk lookup, accessing foo/../bar won't do a dget() for foo (assuming it was cached), and hence will never do a dput() on it either. So if the only references to foo are as foo/../bar then its d_count will stay zero and it will never be marked referenced. (Or am I missing something?) Paul |
From: Alexander V. <vi...@ma...> - 2002-07-13 08:52:31
|
On Fri, 12 Jul 2002, Paul Menage wrote: > >Note: measurements on 2.4 do not make sense; reduction of cacheline > >bouncing between 2.4 and 2.5 will change the results anyway and > >if any of these patches are going to be applied to 2.4, reduction of > >cacheline bouncing on ->d_count is going to go in before that one. > > I think there are some other possibilities for cache-bounce removal in > struct dentry. The most obvious one is d_vfs_flags - not only does it > get written to on every d_lookup (to set DCACHE_REFERENCED) but it also > shares a cache line (on Pentium IV) with d_op, d_iname and part of > d_name (along with d_sb and d_fsdata, but these don't seem to be so > crucial). > > Some quick stats gathering suggested that DCACHE_REFERENCED is already > set 95%-98% of the time, so this cache bounce is not even doing anything > useful. I submitted this patch a while ago making the DCACHE_REFERENCED > bit setting be conditional on it not being already set, which didn't > generate any interest. One problem with this patch would be the > additional branch prediction misses (on some architectures?) that would > work against the benefits of not dirtying a cache line. Frankly, I'd rather moved setting DCACHE_REFERENCED to dput(). We don't care for that bit for dentries with positive ->d_count. So I'd just do vi fs/dcache.c -c '/|= DCACHE_R/d|/nr_un/pu|<|x' and be done with that. Linus? |