From: Samuel O. <so...@db...> - 2002-02-05 23:32:09
On Tue, 5 Feb 2002, William Lee Irwin III wrote:

> On Mon, 4 Feb 2002, William Lee Irwin III wrote:
> >> (2) The way the classzone_need_balance() logic interacts with partitioning
> >> by node is a little unclear. What sort of behavior do you expect?
> >> Does it actually happen?
>
> On Tue, Feb 05, 2002 at 09:59:57AM -0800, Samuel Ortiz wrote:
> > Well, the previous behaviour set the first zone of the zonelist as
> > unbalanced. The global swap daemon needed that to check if he really had
> > to try to swap pages out from a particular zone or not.
>
> And things then get chosen from the wrong nodes etc.

Yes, exactly. This is the wrong behaviour for NUMA.

> On Tue, Feb 05, 2002 at 09:59:57AM -0800, Samuel Ortiz wrote:
> > Now, as you can see from the patch, when we change node, the classzone
> > changes. So, if node 0 and 1 are unbalanced, the first zone for these
> > nodes from the zonelist will be marked as unbalanced. The expected
> > behaviour is to have each swap daemon trying to get pages from its node if
> > this node is totally unbalanced. The other daemons will keep sleeping if
> > their node is balanced.
> > I tried this on a 4 nodes SGI machine, where we have a unique zone per
> > node : the DMA one. When I try to allocate an array bigger than a node, I
> > see the following behaviour :
> > As the first node is getting out of memory, its swap daemon is waken
> > up. I see the node getting totally out of memory, and then because the
> > swap daemon is working on it, I see some free pages appearing.
> > I hope to be understandable enough. If not, please let me know.
>
> Looks like it's trying pretty hard to avoid getting pages from other
> nodes, which is definitely good.

Actually, a swap daemon will get pages only from its own node.

> I'm not sure how much classzone-style
> balancing really helps here, it seems like with everything ZONE_DMA and
> classzones being per-node it doesn't do much in your case, but it's
> apparently not hurting you either.
Very good news!

Well, it actually does something. We just don't see the benefit of
classzone since on our machine a classzone is actually one zone. However we
really need to mark it as unbalanced. Otherwise, the swap daemon won't work
on it.

> On Mon, 4 Feb 2002, William Lee Irwin III wrote:
> >> (3) with some stronger boundaries between nodes it should be possible
> >> to run these things in parallel, that is, if address spaces
>
> On Tue, Feb 05, 2002 at 09:59:57AM -0800, Samuel Ortiz wrote:
> > Very interesting indeed. The consequent work is worth it. I will try to
> > have a look at Rik' rmap patch and see how this can help getting this
> > stuff working.
>
> I can also pass along my attempt which built on top of -rmap and Momchil
> Velikov's radix tree pagecache, which focused almost exclusively on the
> locking. It's not what I would call bug-free by any means, but it looks
> like something you're generally interested in. I'll follow up after I put
> it back up somewhere downloadable and include some explanatory notes with
> the post.

Oh, that would be great. Please let me know as soon as you get something
downloadable.

Thanks again,
Samuel.
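
To make the per-node behaviour discussed in this message concrete, here is a
toy model in C. It is only a sketch under stated assumptions, not the patch
and not kernel code: every name in it (toy_node, toy_alloc, toy_kswapd, the
watermark fields) is hypothetical, and each node is assumed to hold a single
zone, as on the 4-node machine described in the thread. The allocator marks a
node's zone as unbalanced when it runs low, and a node's daemon does nothing
unless its own node carries that mark.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_NODES 4  /* hypothetical: matches the 4-node machine in the thread */

/* Toy model of a node with one zone; all fields are illustrative. */
struct toy_node {
    long free_pages;    /* pages currently free on this node */
    long pages_low;     /* below this, the node counts as unbalanced */
    long pages_high;    /* the daemon reclaims until this is reached */
    long reclaimable;   /* pages the daemon could still swap out */
    bool need_balance;  /* set by the allocator, cleared by the daemon */
};

/* Take pages from one node only; mark its zone unbalanced when it runs low. */
static long toy_alloc(struct toy_node *node, long want)
{
    long got = want < node->free_pages ? want : node->free_pages;

    node->free_pages -= got;
    if (node->free_pages < node->pages_low)
        node->need_balance = true;
    return got;
}

/*
 * One pass of the daemon for node `nid`. It returns the number of pages
 * freed. A daemon whose node is not marked unbalanced does nothing, and a
 * running daemon only ever touches its own node.
 */
static long toy_kswapd(struct toy_node *nodes, size_t nid)
{
    assert(nid < NR_NODES);
    struct toy_node *node = &nodes[nid];
    long freed = 0;

    if (!node->need_balance)
        return 0;  /* balanced node: the daemon keeps sleeping */

    while (node->free_pages < node->pages_high && node->reclaimable > 0) {
        node->reclaimable--;
        node->free_pages++;
        freed++;
    }
    if (node->free_pages >= node->pages_high)
        node->need_balance = false;
    return freed;
}
```

In this model, exhausting node 0 with toy_alloc() sets need_balance on node 0
only, so toy_kswapd() for node 1 returns immediately while the pass for node 0
frees pages up to pages_high. It also shows why the mark is needed even when a
classzone is a single zone: without it the daemon never starts working.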