From: Rick K. <rk...@il...> - 2009-11-17 16:34:19
(forgot to copy the list on this reply)...

----- Forwarded Message -----
From: "Rick Kufrin" <rk...@il...>
To: jj...@nu...
Sent: Tuesday, November 17, 2009 10:33:02 AM GMT -06:00 US/Canada Central
Subject: Re: [PerfSuite-users] Questions about "Wall clock time"

Jie -

It seems that the discrepancy in the reported elapsed time is due to differences in how your machine's clock speed is reported. I see from the content of the "brand string" element in your XML document that it is a Xeon E5540, 2.53 GHz. This information comes from the CPUID instruction. However, the "clockspeed" element reported in the document is 1.6 GHz; that information comes from /proc/cpuinfo. If you replace the clockspeed of 1600 with 2530, the numbers will be much closer.

I am guessing there is some variable clock speed going on with your platform, and that the discrepancy stems from that, not from overhead generated by PerfSuite.

Rick

----- Original Message -----
From: "Jie Jiang" <jj...@nu...>
To: rk...@il...
Cc: per...@li...
Sent: Tuesday, November 17, 2009 8:38:49 AM GMT -06:00 US/Canada Central
Subject: Re: [PerfSuite-users] Questions about "Wall clock time"

Hi Rick,

Enclosed is the original XML file. The wallclock time in the XML file is very different from the cputime. Please check it.

Regards,
Jie

On Mon, 2009-11-16 at 11:43 -0600, rk...@il... wrote:
> Jie -
>
> My initial guess is that psprocess is miscalculating the wall clock time you mention. To be sure, I would like to see the original XML document that you got from the benchmark (the complete document). Can you please send me a copy?
>
> The wall clock you found in the XML document is not (I believe) displayed by psprocess, but in this instance it would seem to be more accurate; I am saying that based on its agreement with the "time" command.
>
> The difference between these two measures of time is that one is calculated from the elapsed clock ticks (this is the one that is off in your report); the other is obtained from information in the /proc filesystem related to the process/thread being measured. Another important difference is that the first is meant to be literally "wall clock time", while the second is "CPU time", that is, the amount of time actually spent using the processor (these can be very different depending on the application).
>
> Rick
>
> ----- Original Message -----
> From: "Jie Jiang" <jj...@nu...>
> To: rk...@il...
> Cc: per...@li...
> Sent: Monday, November 16, 2009 8:03:13 AM GMT -06:00 US/Canada Central
> Subject: [PerfSuite-users] Questions about "Wall clock time"
>
> Hi Rick,
>
> When processing the collected data with "psprocess", it always shows the
> "Wall clock time" result.
> I have two questions about the "Wall clock time".
> First, it is much larger than the run time of the target program.
>
> [root@node2 bin]# time psrun -c test_config1.xml ./cg.A
> libpsrun.c:181 : SIGPROF ignored on startup. Handler=0x1, flags=14000000
> PerfSuite debugging enabled (debug level: PS_DEBUG_OFF) [PID 5562]
> Library version: threaded
> [PID 5562] Environment (entry of psrun_init)
> [PID 5562] PSRUN_DOFORK = (null)
> [PID 5562] LD_PRELOAD = libpsrun.so.0
> [PID 5562] PSRUN_PID = 5562
> [PID 5562] PS_HWPC_FILE = cg.A
>
>
> NAS Parallel Benchmarks (NPB3.2-SER) - CG Benchmark
>
> Size: 14000
> Iterations: 15
>
> Initialization time = 0.656 seconds
>
>   iteration          ||r||                zeta
>       1      0.25789587124191E-12   19.9997581277040
>       2      0.25434985977194E-14   17.1140495745506
>       3      0.25346577542259E-14   17.1296668946143
>       4      0.25342984287709E-14   17.1302113581192
>       5      0.25247550490803E-14   17.1302338856353
>       6      0.25375789728060E-14   17.1302349879482
>       7      0.25309911213776E-14   17.1302350498916
>       8      0.24971158788969E-14   17.1302350537510
>       9      0.24662516791025E-14   17.1302350540101
>      10      0.25086578290790E-14   17.1302350540284
>      11      0.24878397192172E-14   17.1302350540298
>      12      0.24359141964394E-14   17.1302350540299
>      13      0.24247346800617E-14   17.1302350540299
>      14      0.24157219672237E-14   17.1302350540299
>      15      0.24243304908282E-14   17.1302350540299
> Benchmark completed
> VERIFICATION SUCCESSFUL
> Zeta is 0.171302350540E+02
> Error is 0.526781606656E-13
>
>
> CG Benchmark Completed.
> Class           = A
> Size            = 14000
> Iterations      = 15
> Time in seconds = 2.06
> Mop/s total     = 724.79
> Operation type  = floating point
> Verification    = SUCCESSFUL
> Version         = 3.2.1
> Compile date    = 09 Nov 2009
>
> Compile options:
> F77        = ifort
> FLINK      = $(F77)
> F_LIB      = (none)
> F_INC      = (none)
> FFLAGS     = -O -g
> FLINKFLAGS = -O
> RAND       = randi8
>
>
> Please send all errors/feedbacks to:
>
> NPB Development Team
> np...@na...
>
>
>
> real    0m2.756s
> user    0m2.711s
> sys     0m0.022s
>
> [root@node2 bin]# psprocess -m test_metric.xml cg.A.5562.node2.xml
> PerfSuite Hardware Performance Summary Report
>
> Version    : 1.0
> Created    : Mon Nov 16 20:46:23 CST 2009
> Generator  : psprocess 0.5
> XML Source : cg.A.5562.node2.xml
>
> Execution Information
> ============================================================================================
> Collector  : libpshwpc
> Date       : Mon Nov 16 20:45:34 2009
> Host       : node2
> Process ID : 5562
> Thread     : 0
> User       : root
> Command    : cg.A
>
> Processor and System Information
> ============================================================================================
> Node CPUs     : 8
> Vendor        : Intel
> Family        : Pentium Pro (P6)
> Brand         : Intel(R) Xeon(R) CPU E5540 @ 2.53GHz
> CPU Revision  : 5
> Clock (MHz)   : 1600.000
> Memory (MB)   : 16078.69
> Pagesize (KB) : 4
>
> Cache Information
> ============================================================================================
> Cache levels : 3
> --------------------------------
> Level 1
> Type         : instruction
> Size (KB)    : 32
> Linesize (B) : 64
> Assoc        : 4
> Type         : data
> Size (KB)    : 32
> Linesize (B) : 64
> Assoc        : 8
> --------------------------------
> Level 2
> Type         : unified
> Size (KB)    : 256
> Linesize (B) : 64
> Assoc        : 8
> --------------------------------
> Level 3
> Type         : unified
> Size (KB)    : 8192
> Linesize (B) : 64
> Assoc        : 16
>
> Index Description                                                      Counter Value
> ============================================================================================
> 1 MEM_LOAD_RETIRED:LLC_UNSHARED_HIT (description not available).... 338818848
> 2 MEM_LOAD_RETIRED:LLC_MISS (description not available)............ 3219718
> 3 UNHALTED_CORE_CYCLES (description not available)................. 7312056865
>
> Event Index
> ============================================================================================
> 1: MEM_LOAD_RETIRED:LLC_UNSHARED_HIT 2: MEM_LOAD_RETIRED:LLC_MISS
> 3: UNHALTED_CORE_CYCLES
>
> Statistics
> ============================================================================================
> Counting domain........................................................ user
> Multiplexed............................................................ no
> Wall clock time (seconds).............................................. 4.310
> ----------------------------------------------
> Here we can see that the "Wall clock time" reported by psprocess (4.31 s)
> is considerably larger than the runtime of cg.A (both as reported by
> cg.A itself, 2.06 s, and by the time command, about 2.7 s).
> Where does the rest of the time go? What causes the overhead?
> And what is the real meaning of "Wall clock time" here?
>
> Second, in the output XML file of psrun, there is a count of CPU time:
> <cputime units="seconds">
> <usertime>2.002680</usertime>
> <systemtime>0.000010</systemtime>
> </cputime>
>
> We can see that this is quite close to the real run time of cg.A.
> Why does psprocess not show these values?
> Will you add this function in the upcoming ps-1.0?
>
> Regards,
> Jie
>
> ------------------------------------------------------------------------------
> Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day
> trial. Simplify your report design, integration and deployment - and focus on
> what you do best, core application coding. Discover what's new with
> Crystal Reports now. http://p.sf.net/sfu/bobj-july
> _______________________________________________
> PerfSuite-users mailing list
> Per...@li...
> https://lists.sourceforge.net/lists/listinfo/perfsuite-users
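
Rick's clock-speed explanation at the top of this thread can be checked with a little arithmetic. The sketch below is only an illustration: it assumes that psprocess obtains wall clock time by dividing an elapsed tick count by the "clockspeed" value (the thread says the time is "calculated from the elapsed clock ticks" but does not give the exact formula), and the function names are hypothetical, not part of PerfSuite.

```python
# Minimal sketch: rescale the reported wall clock time from the
# /proc/cpuinfo clock speed to the CPUID brand-string clock speed.
# Assumption (not confirmed by the thread): seconds = ticks / clock rate.
# Function names are hypothetical and are not PerfSuite APIs.

def ticks_from_seconds(seconds: float, clock_mhz: float) -> float:
    """Recover the elapsed tick count implied by a time and a clock speed."""
    return seconds * clock_mhz * 1e6


def seconds_from_ticks(ticks: float, clock_mhz: float) -> float:
    """Convert an elapsed tick count to seconds at a given clock speed."""
    return ticks / (clock_mhz * 1e6)


reported_wall_s = 4.310   # "Wall clock time (seconds)" in the psprocess report
cpuinfo_mhz = 1600.0      # "Clock (MHz)" in the report, from /proc/cpuinfo
brand_mhz = 2530.0        # 2.53 GHz from the CPUID brand string

ticks = ticks_from_seconds(reported_wall_s, cpuinfo_mhz)
corrected_wall_s = seconds_from_ticks(ticks, brand_mhz)

print(f"implied elapsed ticks : {ticks:.0f}")
print(f"wall clock at 2530 MHz: {corrected_wall_s:.3f} s")
```

Under that assumption the corrected figure comes out at roughly 2.73 seconds, which is close to the 2.756 seconds of "real" time reported by the time command and consistent with Rick's remark that the numbers will be much closer.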