On Mon, Sep 22, 2003 at 10:27:38PM -0500, Ray Bryant wrote:
> For large address spaces, and lots of threads (large here means 50-100GB,
> and 32 or more threads pinned to that many cpus), we're seeing severe
> contention on the mm->page_table_lock in do_anonymous_page() and
> handle_mm_fault(). The effect of this is that after a (Fortran) program
I have seen this problem on smaller NUMA systems with smaller loads
too:
Just have one process that exceeds its working set all the time. On another
CPU the swapper will run continuously and age its working set. The process
does page faults all the time because of its unmapped aged pages.
In one particularly bad case the two CPUs doing this were almost completely
busy just fighting for page_table_lock and sending TLB flush IPIs
to each other. The workload made nearly no progress.
I saw this case on 2.4, but I suspect it's in 2.6 too, just maybe
harder to trigger.
Possible solutions:
- Do batched swapping and take the lock only once for multiple pages
  (not sure if that would help in your case; making the normal user page
  fault use batched locking is probably a lot of work).
- Use a fairer queued lock for the page_table_lock.
  This would make the common case slightly slower though.
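
To make the two ideas a bit more concrete, here is a rough userspace sketch,
not kernel code. It assumes C11 atomics and pthreads; struct ticket_lock,
fault_worker() and run_demo() are made-up names for illustration, and the
"pages" counter is only a stand-in for real page table updates. The ticket
lock serves waiters in FIFO order (the fairness part), and each worker takes
the lock once per batch of pages instead of once per page (the batching part).

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical FIFO ticket lock; NOT the kernel's spinlock_t. */
struct ticket_lock {
    atomic_uint next;   /* next ticket to hand out */
    atomic_uint owner;  /* ticket currently being served */
};

static void ticket_lock_acquire(struct ticket_lock *l)
{
    /* Take a ticket, then wait until it is our turn. */
    unsigned int me = atomic_fetch_add_explicit(&l->next, 1,
                                                memory_order_relaxed);
    while (atomic_load_explicit(&l->owner, memory_order_acquire) != me)
        sched_yield();  /* userspace only: be polite while waiting */
}

static void ticket_lock_release(struct ticket_lock *l)
{
    /* Pass the lock to the next ticket holder in line. */
    atomic_fetch_add_explicit(&l->owner, 1, memory_order_release);
}

struct demo_state {
    struct ticket_lock lock;        /* stand-in for page_table_lock */
    unsigned long pages;            /* protected by lock */
    unsigned long acquisitions;     /* protected by lock */
    unsigned long pages_per_thread;
    unsigned long batch;
};

struct demo_result {
    unsigned long pages;
    unsigned long acquisitions;
};

static void *fault_worker(void *arg)
{
    struct demo_state *s = arg;
    unsigned long done = 0;

    while (done < s->pages_per_thread) {
        unsigned long n = s->pages_per_thread - done;
        if (n > s->batch)
            n = s->batch;

        /* One lock round trip covers up to 'batch' pages. */
        ticket_lock_acquire(&s->lock);
        s->acquisitions++;
        s->pages += n;  /* placeholder for the per-page work */
        ticket_lock_release(&s->lock);

        done += n;
    }
    return NULL;
}

struct demo_result run_demo(int nthreads, unsigned long pages_per_thread,
                            unsigned long batch)
{
    pthread_t tid[64];
    struct demo_state s = { .pages_per_thread = pages_per_thread,
                            .batch = batch };
    struct demo_result r;
    int i, rc;

    assert(nthreads > 0 && nthreads <= 64 && batch > 0);
    atomic_init(&s.lock.next, 0);
    atomic_init(&s.lock.owner, 0);

    for (i = 0; i < nthreads; i++) {
        rc = pthread_create(&tid[i], NULL, fault_worker, &s);
        assert(rc == 0);
    }
    for (i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);

    r.pages = s.pages;
    r.acquisitions = s.acquisitions;
    return r;
}
```

With a batch of 1 every page costs one trip through the lock; a larger batch
cuts the number of acquisitions accordingly. The FIFO hand-off is also where
the extra cost in the uncontended case comes from, since every acquire and
release is an atomic read-modify-write.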
-Andi