Re: [Lse-tech] Re: [patch] sched-2.5.59-A2

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 454-5900

>> I repeated the tests with your B0 version and it's still not
>> satisfying. Maybe too aggressive NODE_REBALANCE_IDLE_TICK, maybe the
>> difference is that the other calls of load_balance() never have the
>> chance to balance across nodes.
> 
> Nope, I found the problem. The topo cleanups are broken - we end up 
> taking all mem accesses, etc to node 0.

Kernbench:
                                   Elapsed        User      System         CPU
                        2.5.59     20.032s     186.66s      47.73s       1170%
               2.5.59-ingo-mjb     19.986s    187.044s     48.592s     1178.8%

NUMA schedbench 4:
                                   AvgUser     Elapsed   TotalUser    TotalSys
                        2.5.59        0.00       36.38       90.70        0.62
               2.5.59-ingo-mjb        0.00       34.70       88.58        0.69

NUMA schedbench 8:
                                   AvgUser     Elapsed   TotalUser    TotalSys
                        2.5.59        0.00       42.78      249.77        1.85
               2.5.59-ingo-mjb        0.00       49.33      256.59        1.69

NUMA schedbench 16:
                                   AvgUser     Elapsed   TotalUser    TotalSys
                        2.5.59        0.00       56.84      848.00        2.78
               2.5.59-ingo-mjb        0.00       65.67      875.05        3.58

NUMA schedbench 32:
                                   AvgUser     Elapsed   TotalUser    TotalSys
                        2.5.59        0.00      116.36     1807.29        5.75
               2.5.59-ingo-mjb        0.00      142.77     2039.47        8.42

NUMA schedbench 64:
                                   AvgUser     Elapsed   TotalUser    TotalSys
                        2.5.59        0.00      240.01     3634.20       14.57
               2.5.59-ingo-mjb        0.00      293.48     4534.99       20.62

System times are little higher (multipliers are set at busy = 10,
idle = 10) .... I'll try setting the idle multipler to 100, but
the other thing to do would be into increase the cross-node migrate
resistance by setting some minimum imbalance offsets. That'll 
probably have to be node-specific ... something like the number
of cpus per node ... but probably 0 for the simple HT systems.

M.