From: Josef W. <Jos...@gm...> - 2005-12-12 22:48:52

On Monday 12 December 2005 17:27, Nicholas Nethercote wrote:
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe pni monitor
> ds_cpl cid

It is a Prescott. "pni" means Prescott New Instructions, i.e. SSE3.

Josef
From: Nicholas N. <nj...@cs...> - 2005-12-12 20:42:51

On Mon, 12 Dec 2005, Nicholas Nethercote wrote:
> How do I find out? I can't see anything relevant on the box.

Ok, according to
http://gentoo-wiki.com/Safe_Cflags#Pentium_4_.28Prescott.29_.28Intel.29,
if cpu family is 15 and model is 4 it's a Prescott. So, super-long pipeline.

Nick
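The family/model heuristic above, together with the "pni" flag noted in the adjacent reply, can be sketched as a small `/proc/cpuinfo` check. This is an illustrative sketch, not anything from the thread's actual tooling: the function names are invented, and the family-15/model-4 rule is only the gentoo-wiki heuristic quoted above, not an exhaustive Prescott test.

```python
def parse_cpuinfo(text):
    """Parse 'key : value' lines of /proc/cpuinfo text into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, val = line.partition(":")
        if sep:
            info[key.strip()] = val.strip()
    return info

def looks_like_prescott(info):
    """Heuristic from the thread: cpu family 15 + model 4 => Prescott
    (per the gentoo-wiki page); the 'pni' (SSE3) flag is another tell."""
    if info.get("cpu family") == "15" and info.get("model") == "4":
        return True
    return "pni" in info.get("flags", "").split()

# Usage (Linux only):
#   looks_like_prescott(parse_cpuinfo(open("/proc/cpuinfo").read()))
```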
From: Nicholas N. <nj...@cs...> - 2005-12-12 19:47:48

On Mon, 12 Dec 2005, Julian Seward wrote:
> It might be a P4 Prescott. Can you find out? There were two different P4
> incarnations with significantly different uarchitectures. Willamette/Northwood
> was the original one, with a ~20 stage pipe, whereas Prescott has 31 stages.
> (I think Prescotts are labelled "Pentium 4 3.0 E", where the "E" is the
> clue.)

How do I find out? I can't see anything relevant on the box.

> Ehm ... nanosleep causes the process to be descheduled and so won't
> it have no effect on the total CPU time? You're measuring CPU and
> not wallclock, right?

I'm measuring wall-clock (real). Should I be measuring user time?

Nick
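The wallclock-versus-CPU-time question above can be demonstrated directly: a sleep inflates elapsed (real) time but consumes essentially no user CPU time, which is exactly why the 0.1s nanosleep in the benchmark only matters for wall-clock measurement. A minimal sketch (my own, using `time.sleep` in place of nanosleep):

```python
import os
import time

def measure(fn):
    """Run fn() and return (wall_seconds, user_cpu_seconds)."""
    wall0 = time.perf_counter()   # wall-clock ("real") time
    user0 = os.times().user       # user CPU time of this process
    fn()
    return time.perf_counter() - wall0, os.times().user - user0

wall, user = measure(lambda: time.sleep(0.2))
# wall includes the full 0.2s of sleep; user stays near zero,
# because a sleeping process is descheduled and burns no CPU.
```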
From: Nicholas N. <nj...@cs...> - 2005-12-12 19:42:08

On Mon, 12 Dec 2005, Cerion Armour-Brown wrote:
>>> Hence, for ffbench:
>>>
>>> P4 Northwood 1.7GHz        nt: 4.7s  nl:11.3s ( 2.4x)  mc:25.0s ( 5.3x)
>>> P3 Tualatin 1.13GHz        nt: 6.3s  nl:11.4s ( 1.8x)  mc:30.2s ( 4.8x)
>>> MPC7447A 1.25Ghz (ppc G4)  nt: 5.4s  nl: 8.2s ( 1.5x)  mc:25.8s ( 4.8x)
>>
>> Athlon XP 2100+ (1.7Ghz)   nt: 3.6s  nl: 6.6s ( 1.8x)  mc:16.0s ( 4.4x)
>> Opteron 250 (2.4Ghz)       nt: 1.1s  nl: 2.4s ( 2.2x)  mc: 9.0s ( 8.4x)
>
> PPC970FX (2.5GHz)          nt: 2.3s  nl: 3.6s ( 1.6x)  mc:11.6s ( 5.1x)

This is great! :)  I look forward to seeing more numbers as we build up the
performance suite.

Nick
From: Cerion Armour-B. <ce...@op...> - 2005-12-12 18:28:28

On Monday 12 December 2005 18:38, Tom Hughes wrote:
> In message <200...@ac...>
>           Julian Seward <js...@ac...> wrote:
>>> First attempt at some performance tracking tools.
>>
>> Hence, for ffbench:
>>
>> P4 Northwood 1.7GHz        nt: 4.7s  nl:11.3s ( 2.4x)  mc:25.0s ( 5.3x)
>> P3 Tualatin 1.13GHz        nt: 6.3s  nl:11.4s ( 1.8x)  mc:30.2s ( 4.8x)
>> MPC7447A 1.25Ghz (ppc G4)  nt: 5.4s  nl: 8.2s ( 1.5x)  mc:25.8s ( 4.8x)
>
> Athlon XP 2100+ (1.7Ghz)   nt: 3.6s  nl: 6.6s ( 1.8x)  mc:16.0s ( 4.4x)
> Opteron 250 (2.4Ghz)       nt: 1.1s  nl: 2.4s ( 2.2x)  mc: 9.0s ( 8.4x)

PPC970FX (2.5GHz)          nt: 2.3s  nl: 3.6s ( 1.6x)  mc:11.6s ( 5.1x)

>> Here are the numbers for sarp (which is a bad case for memcheck):
>>
>> P4 Northwood 1.7GHz        nt: 0.1s  nl: 0.5s ( 4.8x)  mc:20.3s (184.8x)
>> P3 Tualatin 1.13GHz        nt: 0.1s  nl: 0.6s ( 5.3x)  mc:29.3s (266.0x)
>> MPC7447A 1.25Ghz (ppc G4)  nt: 0.1s  nl: 0.6s ( 5.2x)  mc:22.0s (199.7x)
>
> Athlon XP 2100+ (1.7Ghz)   nt: 0.1s  nl: 0.4s ( 4.0x)  mc:15.0s (149.6x)
> Opteron 250 (2.4Ghz)       nt: 0.1s  nl: 0.3s ( 3.2x)  mc:12.8s (127.8x)

PPC970FX (2.5GHz)          nt: 0.1s  nl: 0.3s ( 3.2x)  mc:10.9s (108.9x)

Cerion
From: Tom H. <to...@co...> - 2005-12-12 17:38:42
In message <200...@ac...>
Julian Seward <js...@ac...> wrote:
>
> > First attempt at some performance tracking tools.
>
> This is great stuff. Here are some prelim numbers. I'm disregarding
> sarp for the time being (will get to that). Hence, for ffbench:
>
> P4 Northwood 1.7GHz nt: 4.7s nl:11.3s ( 2.4x) mc:25.0s ( 5.3x)
> P3 Tualatin 1.13GHz nt: 6.3s nl:11.4s ( 1.8x) mc:30.2s ( 4.8x)
> MPC7447A 1.25Ghz (ppc G4) nt: 5.4s nl: 8.2s ( 1.5x) mc:25.8s ( 4.8x)
Athlon XP 2100+ (1.7Ghz) nt: 3.6s nl: 6.6s ( 1.8x) mc:16.0s ( 4.4x)
Opteron 250 (2.4Ghz) nt: 1.1s nl: 2.4s ( 2.2x) mc: 9.0s ( 8.4x)
> P4 Northwood 1.7GHz nt: 0.1s nl: 0.5s ( 4.8x) mc:20.3s (184.8x)
> P3 Tualatin 1.13GHz nt: 0.1s nl: 0.6s ( 5.3x) mc:29.3s (266.0x)
> MPC7447A 1.25Ghz (ppc G4) nt: 0.1s nl: 0.6s ( 5.2x) mc:22.0s (199.7x)
Athlon XP 2100+ (1.7Ghz) nt: 0.1s nl: 0.4s ( 4.0x) mc:15.0s (149.6x)
Opteron 250 (2.4Ghz) nt: 0.1s nl: 0.3s ( 3.2x) mc:12.8s (127.8x)
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
From: Julian S. <js...@ac...> - 2005-12-12 17:12:53

>> P4 Northwood 1.7GHz        nt: 4.7s  nl:11.3s ( 2.4x)  mc:25.0s ( 5.3x)
>> P3 Tualatin 1.13GHz        nt: 6.3s  nl:11.4s ( 1.8x)  mc:30.2s ( 4.8x)
>> MPC7447A 1.25Ghz (ppc G4)  nt: 5.4s  nl: 8.2s ( 1.5x)  mc:25.8s ( 4.8x)
>
> Here are my numbers on a dual P4 3.0 GHz:
>
> ffbench:  nt: 0.8s  nl: 4.2s ( 5.0x)  mc:10.7s (12.7x)
> sarp:     nt: 0.1s  nl: 0.3s ( 2.9x)  mc:13.7s (124.2x)
>
> Much worse than yours. I'm not sure what kind of P4 it is; /proc/cpuinfo
> says (this info repeated twice, once per CPU):

It might be a P4 Prescott. Can you find out? There were two different P4
incarnations with significantly different uarchitectures. Willamette/Northwood
was the original one, with a ~20 stage pipe, whereas Prescott has 31 stages.
(I think Prescotts are labelled "Pentium 4 3.0 E", where the "E" is the
clue.)

> If you're right about the branch prediction, perhaps this machine has a
> longer pipeline and so mispredicts are hitting harder?

Maybe. I have got tired of listening to myself wittering on about branch
mispredicts and am in mid-experiment to try and build a
more-or-less-mispredict-free dispatcher.

>> In this case, I'm wary of trusting these ratios much given that the run
>> time of the native case is small enough (<= 0.1s) that measurement noise
>> could be significant.
>
> If you look at the code I inserted a 0.1s nanosleep to mitigate this;
> remove that and natively it will probably be measured as 0.00s. So the
> slow-down is even worse than 100--200x.

Ehm ... nanosleep causes the process to be descheduled, so it should have
no effect on the total CPU time. You're measuring CPU and not wallclock,
right?

> It's interesting to see that 2.4.X does very poorly on ffbench under
> Memcheck (under Nulgrind it's only slightly slower than 3.1.X):
>
> ffbench:  nt: 0.8s  nl: 4.9s ( 6.0x)  mc:40.8s (49.7x)
> sarp:     nt: 0.1s  nl: 0.2s ( 2.1x)  mc:11.1s (100.6x)

Yes. That's due to the UCode JIT being microarchitecturally naive and
doing a lot of fxsave/fxrstors around FP insns, with catastrophic effects
on performance. That's a baseline (nl) overhead though - I'm surprised it
carries over into memcheck too. Ah well.

> It is a good idea to build in compensation for different processor speeds.
> The details are tricky; if we mandate a 1 second minimum for native, sarp
> will run for a couple of minutes, which is a pain.

As Dirk points out, we need at least a ~0.3s minimum. I don't mind if the
benchmark suite takes several minutes to complete. Anyway, once COMPVBITS
is merged, sarp only has a slowdown of 26, right?

J
From: Nicholas N. <nj...@cs...> - 2005-12-12 16:27:43

On Mon, 12 Dec 2005, Julian Seward wrote:
> Hence, for ffbench:
>
> P4 Northwood 1.7GHz        nt: 4.7s  nl:11.3s ( 2.4x)  mc:25.0s ( 5.3x)
> P3 Tualatin 1.13GHz        nt: 6.3s  nl:11.4s ( 1.8x)  mc:30.2s ( 4.8x)
> MPC7447A 1.25Ghz (ppc G4)  nt: 5.4s  nl: 8.2s ( 1.5x)  mc:25.8s ( 4.8x)
>
> Here are the numbers for sarp (which is a bad case for memcheck):
>
> P4 Northwood 1.7GHz        nt: 0.1s  nl: 0.5s ( 4.8x)  mc:20.3s (184.8x)
> P3 Tualatin 1.13GHz        nt: 0.1s  nl: 0.6s ( 5.3x)  mc:29.3s (266.0x)
> MPC7447A 1.25Ghz (ppc G4)  nt: 0.1s  nl: 0.6s ( 5.2x)  mc:22.0s (199.7x)

Here are my numbers on a dual P4 3.0 GHz:

ffbench:  nt: 0.8s  nl: 4.2s ( 5.0x)  mc:10.7s (12.7x)
sarp:     nt: 0.1s  nl: 0.3s ( 2.9x)  mc:13.7s (124.2x)

Much worse than yours. I'm not sure what kind of P4 it is; /proc/cpuinfo
says (this info repeated twice, once per CPU):

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 15
model           : 3
model name      : Intel(R) Pentium(R) 4 CPU 3.00GHz
stepping        : 4
cpu MHz         : 2992.664
cache size      : 1024 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 3
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe pni monitor ds_cpl cid
bogomips        : 5976.88

If you're right about the branch prediction, perhaps this machine has a
longer pipeline and so mispredicts are hitting harder?

The site that has the ffbench program has another one, fbench, which does
some different FP operations. I think I'll add that to the suite.

> In this case, I'm wary of trusting these ratios much given that the run
> time of the native case is small enough (<= 0.1s) that measurement noise
> could be significant.

If you look at the code, I inserted a 0.1s nanosleep to mitigate this;
remove that and natively it will probably be measured as 0.00s. So the
slow-down is even worse than 100--200x.

I'm imagining that the performance suite will consist of some
small-but-real programs (eg. ffbench), and artificial programs like sarp
that test specific cases -- it shows a specific performance bug in
Memcheck, in that a simple operation (a large change in the SP) becomes
many operations (Memcheck has to set all the affected A+V bits). And this
program does much better in the COMPVBITS branch:

sarp:     nt: 0.1s  nl: 0.3s ( 2.9x)  mc: 2.9s (26.2x)

It's interesting to see that 2.4.X does very poorly on ffbench under
Memcheck (under Nulgrind it's only slightly slower than 3.1.X):

ffbench:  nt: 0.8s  nl: 4.9s ( 6.0x)  mc:40.8s (49.7x)
sarp:     nt: 0.1s  nl: 0.2s ( 2.1x)  mc:11.1s (100.6x)

> How about the following suggestion: all programs in the performance
> suite take a single command line arg, an integer, which controls how many
> iterations of the basic work-unit are to be done. The perl script
> starts off feeding it '1', then increasing it (exponentially) until the
> native run time exceeds some minimum value for reliable timing, say 1
> second. Doing this would get us reliable numbers on very fast machines
> without making it run inordinately long on slower machines.

It is a good idea to build in compensation for different processor speeds.
The details are tricky; if we mandate a 1 second minimum for native, sarp
will run for a couple of minutes, which is a pain. The minimum time could
be a parameter in the .vgperf file, perhaps.

As well as consistency across different machines, consistency on
individual machines will be important -- ie. we want to get similar
results on each run. This will be important when I get around to adding
some kind of performance-tracking infrastructure. It will take some more
programs and experience to see how to handle this.

Nick
From: <sv...@va...> - 2005-12-12 15:54:57

Author: njn
Date: 2005-12-12 15:54:50 +0000 (Mon, 12 Dec 2005)
New Revision: 5325

Log:
Make it clearer that internal errors are Valgrind's fault.

Modified:
   trunk/coregrind/m_signals.c

Modified: trunk/coregrind/m_signals.c
===================================================================
--- trunk/coregrind/m_signals.c 2005-12-10 23:46:11 UTC (rev 5324)
+++ trunk/coregrind/m_signals.c 2005-12-12 15:54:50 UTC (rev 5325)
@@ -1719,7 +1719,7 @@
       Valgrind internally.
    */
    VG_(message)(Vg_DebugMsg,
-      "INTERNAL ERROR: Valgrind received a signal %d (%s) - exiting",
+      "VALGRIND INTERNAL ERROR: Valgrind received a signal %d (%s) - exiting",
       sigNo, signame(sigNo));

    VG_(message)(Vg_DebugMsg,
From: Dirk M. <dm...@gm...> - 2005-12-12 08:36:36

On Monday 12 December 2005 03:16, Julian Seward wrote:
> second. Doing this would get us reliable numbers on very fast machines
> without making it run inordinately long on slower machines.

It would also be useful to "pre-heat" the CPU, otherwise measurements are
basically useless on speedstepping processor architectures (it takes
0.1-0.3s of CPU load until you can trust that the CPU actually runs at
full speed).

Dirk
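Dirk's "pre-heat" suggestion amounts to burning CPU for a fixed interval before the timed run, so a speedstepping CPU has already ramped up to full clock. A minimal sketch (mine, not from the thread's harness); the 0.3s default is just the upper end of the range he mentions:

```python
import time

def preheat(seconds=0.3):
    """Busy-loop for `seconds` so the CPU leaves its low-power state
    before any measurement starts. Returns the loop count (unused,
    but keeps the work from being optimized away)."""
    deadline = time.perf_counter() + seconds
    n = 0
    while time.perf_counter() < deadline:
        n += 1          # trivial work; the point is sustained CPU load
    return n

preheat()               # call immediately before the timed region
```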
From: <js...@ac...> - 2005-12-12 04:18:44

Nightly build on phoenix ( SuSE 10.0 ) started at 2005-12-12 03:30:01 GMT

Checking out vex source tree ... done
Building vex ... done
Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed

Regression test results follow

== 208 tests, 6 stderr failures, 1 stdout failure =================
memcheck/tests/leak-tree (stderr)
memcheck/tests/mempool (stderr)
memcheck/tests/stack_switch (stderr)
memcheck/tests/x86/scalar (stderr)
none/tests/mremap2 (stdout)
none/tests/x86/faultstatus (stderr)
none/tests/x86/int (stderr)
From: <js...@ac...> - 2005-12-12 03:47:09

Nightly build on g5 ( YDL 4.0, ppc970 ) started at 2005-12-12 04:40:00 CET

Checking out vex source tree ... done
Building vex ... done
Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed

Regression test results follow

== 175 tests, 15 stderr failures, 0 stdout failures =================
memcheck/tests/badjump (stderr)
memcheck/tests/badjump2 (stderr)
memcheck/tests/leak-cycle (stderr)
memcheck/tests/leak-tree (stderr)
memcheck/tests/mempool (stderr)
memcheck/tests/partiallydefinedeq (stderr)
memcheck/tests/pointer-trace (stderr)
memcheck/tests/supp1 (stderr)
memcheck/tests/supp_unknown (stderr)
memcheck/tests/toobig-allocs (stderr)
memcheck/tests/xml1 (stderr)
massif/tests/toobig-allocs (stderr)
none/tests/faultstatus (stderr)
none/tests/fdleak_cmsg (stderr)
none/tests/mremap (stderr)
From: Tom H. <to...@co...> - 2005-12-12 03:42:22

Nightly build on dunsmere ( athlon, Fedora Core 4 ) started at 2005-12-12 03:30:05 GMT

Results unchanged from 24 hours ago

Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed

Regression test results follow

== 210 tests, 7 stderr failures, 1 stdout failure =================
memcheck/tests/leak-tree (stderr)
memcheck/tests/mempool (stderr)
memcheck/tests/pointer-trace (stderr)
memcheck/tests/stack_switch (stderr)
memcheck/tests/x86/scalar (stderr)
none/tests/mremap2 (stdout)
none/tests/x86/faultstatus (stderr)
none/tests/x86/int (stderr)
From: Tom H. <th...@cy...> - 2005-12-12 03:29:30

Nightly build on alvis ( i686, Red Hat 7.3 ) started at 2005-12-12 03:15:04 GMT

Results unchanged from 24 hours ago

Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed

Regression test results follow

== 209 tests, 17 stderr failures, 1 stdout failure =================
memcheck/tests/addressable (stderr)
memcheck/tests/describe-block (stderr)
memcheck/tests/erringfds (stderr)
memcheck/tests/leak-0 (stderr)
memcheck/tests/leak-cycle (stderr)
memcheck/tests/leak-regroot (stderr)
memcheck/tests/leak-tree (stderr)
memcheck/tests/leakotron (stdout)
memcheck/tests/match-overrun (stderr)
memcheck/tests/mempool (stderr)
memcheck/tests/partial_load_dflt (stderr)
memcheck/tests/partial_load_ok (stderr)
memcheck/tests/partiallydefinedeq (stderr)
memcheck/tests/pointer-trace (stderr)
memcheck/tests/sigkill (stderr)
memcheck/tests/stack_changes (stderr)
none/tests/x86/faultstatus (stderr)
none/tests/x86/int (stderr)
From: Tom H. <th...@cy...> - 2005-12-12 03:27:10

Nightly build on aston ( x86_64, Fedora Core 3 ) started at 2005-12-12 03:05:12 GMT

Results differ from 24 hours ago

Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed

Regression test results follow

== 227 tests, 6 stderr failures, 1 stdout failure =================
memcheck/tests/mempool (stderr)
memcheck/tests/x86/scalar (stderr)
memcheck/tests/x86/scalar_supp (stderr)
none/tests/amd64/faultstatus (stderr)
none/tests/mremap2 (stdout)
none/tests/x86/faultstatus (stderr)
none/tests/x86/int (stderr)

=================================================
== Results from 24 hours ago ==
=================================================

Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed

Regression test results follow

== 227 tests, 6 stderr failures, 2 stdout failures =================
memcheck/tests/mempool (stderr)
memcheck/tests/x86/scalar (stderr)
memcheck/tests/x86/scalar_supp (stderr)
none/tests/amd64/faultstatus (stderr)
none/tests/mremap2 (stdout)
none/tests/tls (stdout)
none/tests/x86/faultstatus (stderr)
none/tests/x86/int (stderr)

=================================================
== Difference between 24 hours ago and now ==
=================================================

*** old.short	Mon Dec 12 03:19:07 2005
--- new.short	Mon Dec 12 03:27:05 2005
***************
*** 8,10 ****
! == 227 tests, 6 stderr failures, 2 stdout failures =================
  memcheck/tests/mempool (stderr)
--- 8,10 ----
! == 227 tests, 6 stderr failures, 1 stdout failure =================
  memcheck/tests/mempool (stderr)
***************
*** 14,16 ****
  none/tests/mremap2 (stdout)
- none/tests/tls (stdout)
  none/tests/x86/faultstatus (stderr)
--- 14,15 ----
From: Tom H. <th...@cy...> - 2005-12-12 03:26:16

Nightly build on dellow ( x86_64, Fedora Core 4 ) started at 2005-12-12 03:10:11 GMT

Results unchanged from 24 hours ago

Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed

Regression test results follow

== 227 tests, 5 stderr failures, 1 stdout failure =================
memcheck/tests/mempool (stderr)
memcheck/tests/x86/scalar (stderr)
none/tests/amd64/faultstatus (stderr)
none/tests/mremap2 (stdout)
none/tests/x86/faultstatus (stderr)
none/tests/x86/int (stderr)
From: Tom H. <th...@cy...> - 2005-12-12 03:20:36

Nightly build on gill ( x86_64, Fedora Core 2 ) started at 2005-12-12 03:00:03 GMT

Results unchanged from 24 hours ago

Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed

Regression test results follow

== 227 tests, 6 stderr failures, 0 stdout failures =================
memcheck/tests/mempool (stderr)
memcheck/tests/pointer-trace (stderr)
none/tests/amd64/faultstatus (stderr)
none/tests/fdleak_fcntl (stderr)
none/tests/x86/faultstatus (stderr)
none/tests/x86/int (stderr)
From: Julian S. <js...@ac...> - 2005-12-12 02:17:13

> First attempt at some performance tracking tools.

This is great stuff. Here are some prelim numbers. I'm disregarding
sarp for the time being (will get to that). Hence, for ffbench:

P4 Northwood 1.7GHz        nt: 4.7s  nl:11.3s ( 2.4x)  mc:25.0s ( 5.3x)
P3 Tualatin 1.13GHz        nt: 6.3s  nl:11.4s ( 1.8x)  mc:30.2s ( 4.8x)
MPC7447A 1.25Ghz (ppc G4)  nt: 5.4s  nl: 8.2s ( 1.5x)  mc:25.8s ( 4.8x)

ffbench is atypically favourable for V. The inner loop consists of one
very long basic block, which vex's IR optimisation does well on, and the
expensive fixed cost of jumping between bbs is pretty small.

What are we to make of this? First off, it's nice to see that the ppc
compilation pipeline produces code quality at least as good as x86, if
not better. Perhaps ppc is a bit of an easier target; the condition code
stuff is not quite as difficult to simulate as on x86, and it doesn't
have the FP register stack idiocy to contend with.

Interesting that P4 falls relatively far behind here with 'none' (nl).
Given that the P3 is running an identical Linux distro and Valgrind
setup, the performance differences must be microarchitectural, and, I'm
betting, centre around the P4's worse behaviour on branch mispredicts.

Curious to see though that P4 makes up ground with memcheck (2.4x -> 5.3x)
as compared to the 7447's showing (1.5x -> 4.8x). Perhaps the P4's
aggressive out-of-orderness chews through the memcheck instrumentation
and helper calls better than the 7447's relatively modest superscalar
implementation.

Here are the numbers for sarp (which is a bad case for memcheck):

P4 Northwood 1.7GHz        nt: 0.1s  nl: 0.5s ( 4.8x)  mc:20.3s (184.8x)
P3 Tualatin 1.13GHz        nt: 0.1s  nl: 0.6s ( 5.3x)  mc:29.3s (266.0x)
MPC7447A 1.25Ghz (ppc G4)  nt: 0.1s  nl: 0.6s ( 5.2x)  mc:22.0s (199.7x)

In this case, I'm wary of trusting these ratios much given that the run
time of the native case is small enough (<= 0.1s) that measurement noise
could be significant.

How about the following suggestion: all programs in the performance suite
take a single command line arg, an integer, which controls how many
iterations of the basic work-unit are to be done. The perl script starts
off feeding it '1', then increasing it (exponentially) until the native
run time exceeds some minimum value for reliable timing, say 1 second.
Doing this would get us reliable numbers on very fast machines without
making it run inordinately long on slower machines.

J
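The calibration scheme proposed above - feed each benchmark an iteration count, doubling it until the native run is long enough to time reliably - can be sketched as a small driver loop. This is only an illustration: `run_native` and the 1-second threshold stand in for whatever the perl harness would actually use.

```python
def calibrate(run_native, min_secs=1.0):
    """Double the iteration count until one native run of the benchmark
    takes at least `min_secs`, then return that count.

    `run_native(n)` runs the benchmark for n work-units and returns the
    elapsed seconds (e.g. by timing a subprocess that takes n as its
    single command-line argument)."""
    n = 1
    while run_native(n) < min_secs:
        n *= 2          # exponential growth, as suggested
    return n

# e.g. if each work-unit takes 10ms natively, this settles on 128 units
# (128 * 0.01s = 1.28s >= 1s), after probing 1, 2, 4, ..., 64 first.
```

The same count would then be passed to the nulgrind and memcheck runs, so the nt/nl/mc ratios are computed over an identical amount of work.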