From: Mark P. <pel...@au...> - 2003-08-31 03:58:55
Andrew Morton wrote:

>Mark Peloquin <pel...@au...> wrote:
>
>> Here is a link to the latest history graphs.
>>
>> http://ltcperf.ncsa.uiuc.edu/data/history-graphs/
>>
>> Nightly Regression Summary for 2.6.0-test4-mm2 vs 2.6.0-test4-mm3
>
>Thanks.
>
>It would be nice if you could add a few words of interpretation to your
>email announcements. You are more skilled in understanding these tests and
>have the benefit of having been running them for some time. In other
>words: make it easy for us ;)
>

I apologize for my haste. Steve Pratt usually posts comments on the
results; however, he was not available today, and I hoped the new
thumbnails would make it easier for others to find problems and initiate
discussion about the results. I will make sure the necessary commentary
is included in future postings.

>The results which we're most interested in at this time are specjbb and
>volanomark.
>
>But
>
> http://ltcperf.ncsa.uiuc.edu/data/2.6.0-test4-mm3/2.6.0-test4-vs-2.6.0-test4-mm3/specjbb.html
>
>has errors where the numbers should be and
>
> http://ltcperf.ncsa.uiuc.edu/data/2.6.0-test4-mm3/2.6.0-test4-vs-2.6.0-test4-mm3/volanomark.html
>
>has only a single result, which suspiciously claims that -mm3 is 330%
>faster.
>

The problem lies in the 2.6.0-test4 folder, which contained unwanted data
from a stray benchmark run. Our scripting, not expecting stray data, used
it as the 2.6.0-test4 results, corrupting the
2.6.0-test4-vs-2.6.0-test4-mm3 comparisons as well as the history graphs.
For the time being, disregard the 2.6.0-test4 results in
2.6.0-test4-vs-2.6.0-test4-mm3; the 2.6.0-test4 comparisons against any
other kernel (mm2, for example) contain the correct 2.6.0-test4 results.

I've fixed the problem and am regenerating the corrected comparisons and
history graphs. Regenerating the history graphs takes a few hours; I'll
post the corrected data asap.
>The comparative graphs are really nice (bit too much rawiobench stuff though).
>
>I think
>
> http://ltcperf.ncsa.uiuc.edu/data/history-graphs/specjbb.results.avg.plot.16.png
>
>is telling me that -mm2 got a lot faster, and -mm3 faster still.
>

mm1 results were worse for both the 16 and 19 warehouse data points. mm2
recovered that loss and is slightly faster than 2.6.0-test4; mm3 made
further improvements.

>But that doesn't gel with the tables of numbers which we saw with
>mm2.
>
>And
>
> http://ltcperf.ncsa.uiuc.edu/data/history-graphs/specjbb.utilization.idle.avg.plot.16.png
> http://ltcperf.ncsa.uiuc.edu/data/history-graphs/specjbb.utilization.idle.avg.plot.19.png
>
>are showing good reductions in idle time.
>

mm2/mm3 have notably improved the user and idle time figures, with an
increase in system time (not sure if this is a good thing or not) and a
decrease in the number of context switches (a good thing).

specjbb comparison of 2.6.0-test4 vs 2.6.0-test4-mm3

Results: Throughput (Graph)
tolerance = 0.00 + 3.00% of 2.6.0-test4

            2.6.0-test4  2.6.0-test4-mm3
  # of WHs      OPs/sec      OPs/sec    %diff         diff    tolerance
---------- ------------ ------------ -------- ------------ ------------
         1      9783.46     10063.75     2.86       280.29       293.50
         4     33783.93     35417.80     4.84      1633.87      1013.52  *
         7     54401.52     53841.78    -1.03      -559.74      1632.05
        10     56861.59     57359.70     0.88       498.11      1705.85
        13     56024.86     55679.72    -0.62      -345.14      1680.75
        16     43874.77     51468.65    17.31      7593.88      1316.24  *
        19     32658.83     35740.45     9.44      3081.62       979.76  *

Results: User CPU Utilization (Graph)
tolerance = 0.00 + 3.00% of 2.6.0-test4

            2.6.0-test4  2.6.0-test4-mm3
  # of WHs         %CPU         %CPU    %diff         diff    tolerance
---------- ------------ ------------ -------- ------------ ------------
         1        11.90        11.81    -0.76        -0.09         0.36
         4        49.40        49.47     0.14         0.07         1.48
         7        86.51        86.26    -0.29        -0.25         2.60
        10        97.91        97.83    -0.08        -0.08         2.94
        13        97.55        97.19    -0.37        -0.36         2.93
        16        82.99        95.09    14.58        12.10         2.49  *
        19        67.40        91.93    36.39        24.53         2.02  *

Results: Idle CPU Utilization (Graph)
tolerance = 1.00 + 3.00% of 2.6.0-test4

            2.6.0-test4  2.6.0-test4-mm3
  # of WHs         %CPU         %CPU    %diff         diff    tolerance
---------- ------------ ------------ -------- ------------ ------------
         1        87.30        87.30     0.00         0.00         3.62
         4        49.53        49.51    -0.04        -0.02         2.49
         7        12.40        12.34    -0.48        -0.06         1.37
        10         0.36         0.35    -2.78        -0.01         1.01
        13         1.20         0.77   -35.83        -0.43         1.04
        16        15.17         2.28   -84.97       -12.89         1.46  *
        19        30.66         4.28   -86.04       -26.38         1.92  *

Results: System CPU Utilization (Graph)
tolerance = 1.00 + 3.00% of 2.6.0-test4

            2.6.0-test4  2.6.0-test4-mm3
  # of WHs         %CPU         %CPU    %diff         diff    tolerance
---------- ------------ ------------ -------- ------------ ------------
         1         0.72         0.81    12.50         0.09         1.02
         4         0.99         0.94    -5.05        -0.05         1.03
         7         1.07         1.35    26.17         0.28         1.03
        10         1.74         1.82     4.60         0.08         1.05
        13         1.25         2.04    63.20         0.79         1.04
        16         1.84         2.62    42.39         0.78         1.06
        19         1.91         3.79    98.43         1.88         1.06  *

Results: Context Switches (Graph)
tolerance = 0.00 + 10.00% of 2.6.0-test4

            2.6.0-test4  2.6.0-test4-mm3
  # of WHs    cswch/sec    cswch/sec    %diff         diff    tolerance
---------- ------------ ------------ -------- ------------ ------------
         1       179.49       179.42    -0.04        -0.07        17.95
         4       183.36       183.52     0.09         0.16        18.34
         7       221.62       217.25    -1.97        -4.37        22.16
        10       573.51       554.60    -3.30       -18.91        57.35
        13      2586.15      1765.67   -31.73      -820.48       258.61  *
        16     18150.40      5299.69   -70.80    -12850.71      1815.04  *
        19     25743.95     10634.13   -58.69    -15109.82      2574.40  *

>I think
>
> http://ltcperf.ncsa.uiuc.edu/data/history-graphs/volanomark.throughput.plot.1.png
>
>is telling me that -mm still hasn't fixed the volanomark problems, but given
>the problems with the tabulated results I'm not very confident in that.
>

I've looked at the corrected comparisons, and volanomark results are still
down by about 11%.

volanomark comparison of 2.6.0-test4 vs 2.6.0-test4-mm3

Results: Throughput (Graph)
tolerance = 0.00 + 3.00% of 2.6.0-test4

            2.6.0-test4  2.6.0-test4-mm3
                Msgs/sec     Msgs/sec    %diff         diff    tolerance
---------- ------------ ------------ -------- ------------ ------------
         1        40757        36197   -11.19     -4560.00      1222.71  *
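For anyone reading the tables, the "*" flag is mechanical: each table
header gives a tolerance of the form "base + pct% of 2.6.0-test4", and a
row is starred when the absolute diff exceeds that tolerance. A minimal
sketch of that rule (my reading of the published numbers, not the actual
regression scripts):

```python
# Hypothetical reconstruction of the "*" flagging rule used in the tables:
# tolerance = base + pct% of the baseline (2.6.0-test4) value, and a row is
# flagged when |candidate - baseline| exceeds that tolerance.

def tolerance(baseline: float, base: float, pct: float) -> float:
    """Tolerance for one row, e.g. base=0.00, pct=3.00 -> 0.00 + 3% of baseline."""
    return base + baseline * pct / 100.0

def flag(baseline: float, candidate: float, base: float, pct: float) -> str:
    """Return '*' when the change is outside tolerance, else ''."""
    diff = candidate - baseline
    return "*" if abs(diff) > tolerance(baseline, base, pct) else ""

# 16-warehouse throughput row: diff 7593.88 vs tolerance 1316.24 -> starred.
print(flag(43874.77, 51468.65, base=0.00, pct=3.00))  # prints "*"
# 1-warehouse throughput row: diff 280.29 vs tolerance 293.50 -> not starred.
print(repr(flag(9783.46, 10063.75, base=0.00, pct=3.00)))  # prints "''"
```

This reproduces the starred rows above, e.g. 0.00 + 3% of 9783.46 gives
the 293.50 tolerance shown for the 1-warehouse throughput row.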