Re: [Lse-tech] [patch] HT scheduler, sched-2.5.59-E2

On Tuesday 04 February 2003 04:23, Ingo Molnar wrote:
> On Mon, 3 Feb 2003, Andrew Theurer wrote:
> > In case anyone is interested, here is SchedD7, with and without HT bits,
> > compared to SchedD7 minus HT bits plus numa-ht topology, compared to a
> > run with E2:
> >
> > The system is a dual P4 xeon, serverworks chipset, kernel is 2.5.59. I
> > ran "kernbench" with -j2, -j4, and -j8.  Kernbench result is an average
> > of 10 runs.  The kernels are:
>
> thanks for the analysis - looks like the -E2 scheduler got the best
> numbers in every benchmark, the difference is especially visible in the
> -j2 test, where the difference between schedD7-noht and schedE2 is more
> than 30%. But -E2 even beats the best NUMA-HT scheduler
> (2.5.59-D7-numaht2) by 8%.

I think it's going to be hard, if not impossible, to beat E2 with a numa-sched
for HT.  The approaches are a little different, and E2 just may be the more
efficient approach.  The numa scheduler tries to find the best cpu right when
the new process is exec'd, but does not do the "active" load balance that the
D7 & E2 patches do.  The sched_best_cpu() from numa_sched can take some time to
complete, while active load balance can help when some tasks exit and an
HT-type imbalance exists.  Although the HT-numa way is a good improvement over
stock 2.5.59, it just doesn't approach the performance of E2.
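
To make the difference concrete, here is a rough toy model in C.  It is not
the code from numa_sched or from the D7/E2 patches.  The names toy_best_cpu()
and toy_active_balance(), the 2-package x 2-sibling layout, and the
"idle package vs. package with two or more tasks" rule are all assumptions
made up for illustration.  It only shows why placement at exec time cannot
fix an imbalance that appears later when tasks exit, while a periodic active
balance can.

```c
#include <assert.h>

#define NR_CPUS          4   /* assumed: dual Xeon with HT, 2 packages x 2 siblings */
#define SIBLINGS_PER_PKG 2
#define NR_PKGS          (NR_CPUS / SIBLINGS_PER_PKG)

/* Toy model: nr_running[cpu] counts runnable tasks on each logical CPU. */

static int pkg_of(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return cpu / SIBLINGS_PER_PKG;
}

/* Total runnable tasks on all siblings of one physical package. */
static int pkg_load(const int nr_running[NR_CPUS], int pkg)
{
    int load = 0;
    for (int i = 0; i < SIBLINGS_PER_PKG; i++)
        load += nr_running[pkg * SIBLINGS_PER_PKG + i];
    return load;
}

/*
 * Hypothetical exec-time placement: prefer the least loaded package,
 * then the least loaded sibling inside it.  Runs once per exec and
 * never revisits the decision.
 */
static int toy_best_cpu(const int nr_running[NR_CPUS])
{
    int best = 0;
    for (int cpu = 1; cpu < NR_CPUS; cpu++) {
        int lc = pkg_load(nr_running, pkg_of(cpu));
        int lb = pkg_load(nr_running, pkg_of(best));
        if (lc < lb || (lc == lb && nr_running[cpu] < nr_running[best]))
            best = cpu;
    }
    return best;
}

/*
 * Hypothetical active balance: if one package is completely idle while
 * another has two or more runnable tasks, move one task across.
 * Returns 1 if a task was migrated, 0 otherwise.
 */
static int toy_active_balance(int nr_running[NR_CPUS])
{
    for (int idle = 0; idle < NR_PKGS; idle++) {
        if (pkg_load(nr_running, idle) != 0)
            continue;
        for (int busy = 0; busy < NR_PKGS; busy++) {
            if (pkg_load(nr_running, busy) < 2)
                continue;
            /* Take the task from the busiest sibling of the busy package. */
            int base = busy * SIBLINGS_PER_PKG;
            int src = base;
            for (int i = 1; i < SIBLINGS_PER_PKG; i++)
                if (nr_running[base + i] > nr_running[src])
                    src = base + i;
            nr_running[src]--;
            nr_running[idle * SIBLINGS_PER_PKG]++;
            return 1;
        }
    }
    return 0;
}
```

In this model, start with one task on each of the four logical cpus and let
the two tasks on the second package exit.  That leaves both siblings of the
first package busy and the second package idle.  toy_best_cpu() would only
help if another exec happened; toy_active_balance() moves one of the running
tasks over so each package runs one task.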

Also, FYI, last time Michael tested with D7, he had some weirdness with a
"real" NUMA system.  Hopefully we can reproduce that soon on E2 and track
down any issues.

-Andrew Theurer