From: Erich F. <ef...@es...> - 2002-10-27 23:33:06
On Sunday 27 October 2002 19:16, Martin J. Bligh wrote:
> > OK, I went to your latest patches (just 1 and 2). And they worked!
> > You've fixed the performance degradation problems for kernel compile
> > (now a 14% improvement in systime), that core set works without
> > further futzing about or crashing, with or without TSC, on either
> > version of gcc ... congrats!
>
> So I have a slight correction to make to the above ;-) Your patches
> do work just fine, no crashes any more. HOWEVER ... turns out I only
> had the first patch installed, not both. Silly mistake, but turns out
> to be very interesting.
>
> So your second patch is the balance on exec stuff ... I've looked at
> it, and think it's going to be very expensive to do in practice, at
> least the simplistic "recalc everything on every exec" approach. It
> does benefit the low end schedbench results, but not the high end
> ones, and you can see the cost of your second patch in the system
> times of the kernbench.

This is interesting, indeed. As you might have seen from the tests I
posted on LKML, I could not see that effect on our IA64 NUMA machine.
Which raises the question: is it expensive to recalculate the load when
doing an exec (which I should also see), or is the strategy of equally
distributing the jobs across the nodes bad for certain
load+architecture combinations?

As I'm not seeing the effect, maybe you could do the following
experiment: in sched_best_node() keep only the "while" loop at the
beginning. This leads to a cheap selection of the next node, just a
simple round robin.

Regarding the schedbench results: are they averages over multiple runs?
The numa_test needs to be repeated a few times to get statistically
meaningful results.

Thanks,
Erich

> In summary, I think I like the first patch alone better than the
> combination, but will have a play at making a cross between the two.
> As I have very little context about the scheduler, would appreciate
> any help anyone would like to volunteer ;-)
>
> Corrected results are:
>
> Kernbench:
>                          Elapsed       User     System        CPU
> 2.5.44-mm4               19.676s   192.794s    42.678s    1197.4%
> 2.5.44-mm4-hbaum         19.422s   189.828s    40.204s    1196.2%
> 2.5.44-mm4-focht-1        19.46s   189.838s    37.938s      1171%
> 2.5.44-mm4-focht-12       20.32s       190s      44.4s    1153.6%
>
> Schedbench 4:
>                          Elapsed  TotalUser   TotalSys    AvgUser
> 2.5.44-mm4                 32.45      49.47     129.86       0.82
> 2.5.44-mm4-hbaum           31.31      43.85     125.29       0.84
> 2.5.44-mm4-focht-1         38.61      45.15     154.48       1.06
> 2.5.44-mm4-focht-12        23.23      38.87      92.99       0.85
>
> Schedbench 8:
>                          Elapsed  TotalUser   TotalSys    AvgUser
> 2.5.44-mm4                 39.90      61.48     319.26       2.79
> 2.5.44-mm4-hbaum           32.63      46.56     261.10       1.99
> 2.5.44-mm4-focht-1         37.76      61.09     302.17       2.55
> 2.5.44-mm4-focht-12        28.40      34.43     227.25       2.09
>
> Schedbench 16:
>                          Elapsed  TotalUser   TotalSys    AvgUser
> 2.5.44-mm4                 62.99      93.59    1008.01       5.11
> 2.5.44-mm4-hbaum           49.78      76.71     796.68       4.43
> 2.5.44-mm4-focht-1         51.69      60.23     827.20       4.95
> 2.5.44-mm4-focht-12        51.24      60.86     820.08       4.23
>
> Schedbench 32:
>                          Elapsed  TotalUser   TotalSys    AvgUser
> 2.5.44-mm4                 88.13     194.53    2820.54      11.52
> 2.5.44-mm4-hbaum           54.67     147.30    1749.77       7.91
> 2.5.44-mm4-focht-1         56.71     123.62    1815.12       7.92
> 2.5.44-mm4-focht-12        55.69     118.85    1782.25       7.28
>
> Schedbench 64:
>                          Elapsed  TotalUser   TotalSys    AvgUser
> 2.5.44-mm4                159.92     653.79   10235.93      25.16
> 2.5.44-mm4-hbaum           65.20     300.58    4173.26      16.82
> 2.5.44-mm4-focht-1         55.60     232.36    3558.98      17.61
> 2.5.44-mm4-focht-12        56.03     234.45    3586.46      15.76