From: Eric A. <er...@an...> - 2009-03-27 20:10:35
On Fri, 2009-03-27 at 19:10 +0100, Andi Kleen wrote:
> On Fri, Mar 27, 2009 at 09:36:45AM -0700, Eric Anholt wrote:
> > > > You are aware that there is a fast path now (get_user_pages_fast)
> > > > which is significantly faster? (but has some limitations)
> > >
> > > In the code I have, get_user_pages_fast is just a wrapper that calls
> > > get_user_pages in the way that I'm calling it from the DRM.
> >
> > Ah, I see: that's a weak stub, and there is a real implementation. I
> > didn't know we could do weak stubs.
>
> The main limitation is that it only works for your current process,
> not another one. For more details you can check the git changelog
> that added it (8174c430e445a93016ef18f717fe570214fa38bf).
>
> And yes, it's only faster on architectures that support it; that's
> currently x86 and ppc.

OK. I'm not too excited here -- 10% of 2% of the CPU time doesn't get me
to the 10% loss that the slow path added up to. Most of the cost is in
k{un,}map_atomic of the returned pages.

If the gup somehow filled in the user's PTEs, I'd be happy to always use
that (since then I'd have the mapping already in place and could just use
it). But I think I can see why that can't be done.

I suppose I could rework this so that we get_user_pages_fast outside the
lock, then walk doing copy_from_user_inatomic, and fall back to
kmap_atomic of the page list if we fault on the user's address. It's
still going to be a cost in our hot path, though, so I'd rather not.

I'm working on a set of tests and microbenchmarks for GEM, so other
people will be able to play with this easily soon.

--
Eric Anholt
er...@an...                         eri...@in...
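[Editor's note: the rework proposed in the message above (pin pages with
get_user_pages_fast outside the lock, try copy_from_user_inatomic first,
fall back to kmap_atomic of the pinned page list on fault) could be
sketched roughly as below. This is a hypothetical illustration against
kernel APIs, not the actual DRM/i915 code; obj_page(), user_ptr, and the
surrounding locking are made-up names, and it is not compilable on its
own.]

    /* Sketch only: pin the user's pages up front, without holding the
     * device lock, so we have a guaranteed-resident fallback copy source. */
    npages = get_user_pages_fast(user_addr, num_pages, 0 /* read-only */, pages);
    if (npages < num_pages)
            goto release_and_bail;

    mutex_lock(&dev->struct_mutex);
    for (i = 0; i < num_pages; i++) {
            /* obj_page() is illustrative: the i-th backing page of the object */
            char *dst = kmap_atomic(obj_page(obj, i));

            /* Fast path: copy straight from the user pointer. We are in
             * atomic context, so a fault makes this return nonzero
             * instead of sleeping. */
            if (__copy_from_user_inatomic(dst, user_ptr + i * PAGE_SIZE,
                                          PAGE_SIZE)) {
                    /* Slow path: the user mapping faulted; copy from the
                     * page list pinned above instead. */
                    char *src = kmap_atomic(pages[i]);
                    memcpy(dst, src, PAGE_SIZE);
                    kunmap_atomic(src);
            }
            kunmap_atomic(dst);
    }
    mutex_unlock(&dev->struct_mutex);

As the message notes, even this version keeps the gup and the atomic
kmaps in the hot path; it only avoids taking the slow kmap_atomic copy
when the user's PTEs are already populated.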