From: cardente, j. <car...@em...> - 2001-01-26 19:41:09
> [snip]
>
> I can think of a totally synthetic benchmark in which threads do a
> (possibly random) amount of compute followed (randomly) by a sleep
> (for a random time) or a yield or even some I/O. The computation
> could be constructed to pollute the cache a bit etc. The number of
> threads spawned would, of course, be parameterized as well.
>
> I guess the key would be in the choice of the parameters, but what
> would it mean in terms of acceptability as a kernel scheduler
> benchmark?
>
> Shailabh Nagar
> (914) 945 2851
> na...@us...

I ran into a similar benchmark "appropriateness" issue while doing some
(rudimentary) performance evaluations of the 2.2.x kernel's affinity
support on SMP systems. Specifically, I was trying to quantify how cache
pollution affected performance and how effective the static bonus added
to the goodness value was at minimizing it. Unfortunately, I was under a
time limit and couldn't find an off-the-shelf multi-threaded
compute-bound benchmark, so I coded up a synthetic one (a rough sketch
of what such a benchmark might look like is appended below my sig). I
ended up seeing some interesting behaviors (e.g. the affinity bonus
reducing performance at #threads == #cpus), but because the benchmark
was synthetic, judging the worth of the results was not easy. I've yet
to revisit the experiment with the 2.4 kernel and better benchmarks.

Perhaps a reasonable approach to this problem would be to develop a
configurable synthetic benchmark that could emulate the behavior of a
target real-world app. That might preserve the scheduler-specific
behaviors of a workload without dragging along any parasitic issues
(like the networking problems that have been brought up). Of course,
the degrees of configurability may be too large and complex to emulate
all possible workloads, but I'm guessing something could be banged
together that would give a good first-order approximation. Wrap that
with scripts that create the config file by profiling a running app
with performance tools, and the world would be a better place...
well, easier to benchmark at least ;-) (one possible config format is
also sketched below).

I personally regard micro-benchmarks as precision tools for chasing
sub-system-specific issues identified by running more complex
real-world benchmarks, and therefore not the proper tool for estimating
overall performance. Re-running the real-world benchmark is the only
way to determine the impact of any changes, and periodically
check-pointing against it avoids chasing micro-benchmark-specific
behaviors (at least not for too long). Why use micro-benchmarks at all,
then? To avoid unrelated issues (again, the network stuff) and the
overhead of setting up the real benchmark (maybe not significant for
the chat script, but what about a TPC-C/D workload?). Of course, I'm
probably preaching to the choir...

Thanks,
John

P.s. If I do return to my affinity experiments, anybody got any
suggestions for a workload? How about monitoring tools? I basically
hacked the kernel with stats and threw in a /proc interface, but I'm
sure there are tools already available (a simple /proc/stat sampler is
sketched at the very end).

----------------------------------------------------------------------
John Cardente                                      car...@em...
Principal Engineer                                 508-898-7340
EMC Enterprise Engineering
4400 Computer Dr, Westboro, MA 01580
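
Sketch 1: roughly the synthetic benchmark described in the quoted text
above -- threads doing a random-length compute burst over a private
working set (to pollute the cache), then randomly sleeping or yielding.
This is a minimal illustration only; the knob values (NTHREADS,
WSET_BYTES, ROUNDS) are made-up placeholders, not parameters from any
actual experiment.

    /*
     * synthbench.c -- sketch of a synthetic scheduler benchmark: each
     * thread does a random amount of compute over a private working
     * set (polluting the cache), then randomly sleeps, yields, or
     * keeps running.
     * Build: gcc -O2 synthbench.c -o synthbench -lpthread
     */
    #include <pthread.h>
    #include <sched.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <unistd.h>

    #define NTHREADS   8            /* parameterized in a real version  */
    #define WSET_BYTES (256*1024)   /* per-thread working-set size      */
    #define ROUNDS     500          /* compute/sleep cycles per thread  */

    static void *worker(void *arg)
    {
        unsigned seed = (unsigned)(long)arg;
        char *wset = malloc(WSET_BYTES);  /* private cache-polluting buffer */
        volatile long sum = 0;
        int r, i, iters;

        for (r = 0; r < ROUNDS; r++) {
            /* random-length compute burst touching the working set */
            for (iters = 1 + rand_r(&seed) % 32; iters > 0; iters--)
                for (i = 0; i < WSET_BYTES; i += 64) /* cache-line stride */
                    sum += ++wset[i];

            /* then randomly sleep, yield, or stay runnable */
            switch (rand_r(&seed) % 3) {
            case 0:  usleep(rand_r(&seed) % 10000); break;
            case 1:  sched_yield();                 break;
            default: break;
            }
        }
        free(wset);
        return NULL;
    }

    int main(void)
    {
        pthread_t t[NTHREADS];
        time_t t0 = time(NULL);
        long i;

        for (i = 0; i < NTHREADS; i++)
            pthread_create(&t[i], NULL, worker, (void *)i);
        for (i = 0; i < NTHREADS; i++)
            pthread_join(t[i], NULL);

        printf("%d threads, %d rounds: %ld sec\n",
               NTHREADS, ROUNDS, (long)(time(NULL) - t0));
        return 0;
    }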
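
Sketch 2: one hypothetical shape for the "emulate a real app" config
file, plus a loader. The field names (nthreads, compute_usecs,
sleep_usecs, wset_kbytes) are invented for illustration; a profiling
script would be what actually fills the file in from observations of
the target app.

    /*
     * Hypothetical workload.conf format (field names invented for
     * illustration).  One line per thread class:
     *   <nthreads> <compute_usecs> <sleep_usecs> <wset_kbytes>
     * e.g.
     *   4  500   2000  128
     *   2  8000  0     1024
     */
    #include <stdio.h>

    struct tclass {
        int nthreads;       /* threads of this class to spawn         */
        int compute_usecs;  /* mean compute-burst length              */
        int sleep_usecs;    /* mean sleep between bursts (0 = none)   */
        int wset_kbytes;    /* working-set size, for cache pollution  */
    };

    /* Load up to max thread classes from path; returns count or -1. */
    int load_conf(const char *path, struct tclass *tc, int max)
    {
        FILE *f = fopen(path, "r");
        int n = 0;

        if (!f)
            return -1;
        while (n < max &&
               fscanf(f, "%d %d %d %d",
                      &tc[n].nthreads, &tc[n].compute_usecs,
                      &tc[n].sleep_usecs, &tc[n].wset_kbytes) == 4)
            n++;
        fclose(f);
        return n;
    }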
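
Sketch 3: a bare-bones user-space monitor that avoids kernel hacks by
sampling the per-CPU tick counters in /proc/stat and diffing them over
an interval. This assumes the "cpuN user nice system idle" line format;
anything fancier (per-process migration counts, say) would still need
something like the /proc hack mentioned above.

    /*
     * cpumon.c -- sample per-CPU tick counters from /proc/stat twice
     * and print the busy/idle deltas.  Assumes lines of the form
     * "cpuN user nice system idle ...".
     * Build: gcc -O2 cpumon.c -o cpumon
     */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    #define MAXCPU 16

    static int sample(unsigned long busy[], unsigned long idle[])
    {
        char line[256];
        int ncpu = 0;
        FILE *f = fopen("/proc/stat", "r");

        if (!f)
            return -1;
        while (ncpu < MAXCPU && fgets(line, sizeof(line), f)) {
            unsigned long u, n, s, i;
            int cpu;

            /* skip the aggregate "cpu " line; keep "cpu0", "cpu1"... */
            if (strncmp(line, "cpu", 3) != 0 ||
                line[3] < '0' || line[3] > '9')
                continue;
            if (sscanf(line + 3, "%d %lu %lu %lu %lu",
                       &cpu, &u, &n, &s, &i) == 5) {
                busy[ncpu] = u + n + s;
                idle[ncpu] = i;
                ncpu++;
            }
        }
        fclose(f);
        return ncpu;
    }

    int main(void)
    {
        unsigned long b0[MAXCPU], i0[MAXCPU], b1[MAXCPU], i1[MAXCPU];
        int n, c;

        sample(b0, i0);
        sleep(5);                       /* measurement interval */
        n = sample(b1, i1);
        for (c = 0; c < n; c++)
            printf("cpu%d: busy=%lu idle=%lu ticks\n",
                   c, b1[c] - b0[c], i1[c] - i0[c]);
        return 0;
    }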