From: Andrew T. <hab...@us...> - 2003-04-21 22:38:38
On Monday 21 April 2003 17:01, Erich Focht wrote:
> On Saturday 19 April 2003 17:20, Martin J. Bligh wrote:
> > I think the case when the busy node rebalance is broken (as far as I'm
> > concerned) is when the load is low. If I have 4 tasks on one node
> > (1 per cpu) and none on the other, that's just fine by me. That's what
> > I said I was going to fix last night. However, I realised that sucks
> > for other people ... we need a better metric here.
>
> Indeed... CPUs in one node are typically on a common FSB. This means
> that the bandwidth to memory inside a node is limited and the CPUs
> share it. It makes a lot of difference to have one bandwidth eater on
> one node which gets the full stream performance or have two of them,
> each getting only half of the performance.
>
> > Erich, I presume you want more perfect balancing across nodes for mem
> > bandwidth concerns? Ie on a 4 node, 2 cpu per node system (which is what
> > I thought yours was), you don't want 2/2/0/0 tasks for each node, you
> > really, really want 1/1/1/1? Or is 2/2/0/0 just as good?
>
> The TX7 normally has 4 CPUs per node. My particular configuration has
> 2. I absolutely prefer 1/1/1/1 :-)

So when balancing from 4/0/0/0 to 1/1/1/1, are we going to pull tasks in a
run state?

IMO, how we figure out which tasks benefit from 4/0/0/0 vs 1/1/1/1 is what
2.7-NUMA is about :) I can't possibly see something working and fully tested
over many workloads within a couple of months. Right now on exec we get
1/1/1/1, and on fork we get 4/0/0/0, and that's about all we can do at this
point.

However, I would like to start discussing how we figure this out - shared mem,
pipes, sockets, page faults <- a whole lot of things to look at to help the
kernel figure out what's best.

-Andrew
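
[For readers following the thread: below is a minimal userspace sketch of the
"spread at exec()" behaviour Andrew describes, where each new task lands on the
least-loaded node so four tasks end up 1/1/1/1 rather than 4/0/0/0. It is
illustrative only, not the actual 2.5 scheduler code; NR_NODES, node_nr_running
and sched_best_node are made-up names for this example.]

/*
 * Hypothetical sketch (not from the 2.5 NUMA scheduler): at exec() time,
 * place the task on the node with the fewest runnable tasks, so a batch
 * of new tasks spreads 1/1/1/1 instead of piling up 4/0/0/0.
 */
#include <stdio.h>

#define NR_NODES 4

static int node_nr_running[NR_NODES];   /* runnable tasks per node (made up) */

/* Return the node with the fewest runnable tasks. */
static int sched_best_node(void)
{
    int node, best = 0;

    for (node = 1; node < NR_NODES; node++)
        if (node_nr_running[node] < node_nr_running[best])
            best = node;
    return best;
}

int main(void)
{
    /* Simulate four exec()s: each one picks the emptiest node -> 1/1/1/1. */
    for (int i = 0; i < 4; i++) {
        int node = sched_best_node();
        node_nr_running[node]++;
        printf("task %d -> node %d\n", i, node);
    }
    return 0;
}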