From: <rk...@il...> - 2009-11-16 17:44:19
Jie -

My initial guess is that psprocess is miscalculating the wall clock time you mention. To be sure, I would like to see the original XML document that you got from the benchmark (the complete document). Can you please send me a copy?

The wall clock you found in the XML document is not (I believe) displayed by psprocess, but in this instance it would seem to be more accurate; I am saying that based on its agreement with the "time" command.

The difference between these two measures of time is that one is calculated from the elapsed clock ticks (this is the one that is off in your report); the other is obtained from information in the /proc filesystem about the process/thread being measured. Another important difference is that the first is meant to be literally "wall clock time", while the second is "CPU time", that is, the amount of time actually spent using the processor (these can be very different depending on the application).

Rick

----- Original Message -----
From: "Jie Jiang" <jj...@nu...>
To: rk...@il...
Cc: per...@li...
Sent: Monday, November 16, 2009 8:03:13 AM GMT -06:00 US/Canada Central
Subject: [PerfSuite-users] Questions about "Wall clock time"

Hi Rick,

When processing the collected data with "psprocess", it always shows the "Wall clock time" result. I have two questions about the "Wall clock time".

First, it is much larger than the run time of the target program.

[root@node2 bin]# time psrun -c test_config1.xml ./cg.A
libpsrun.c:181 : SIGPROF ignored on startup.
Handler=0x1, flags=14000000
PerfSuite debugging enabled (debug level: PS_DEBUG_OFF)
[PID 5562] Library version: threaded
[PID 5562] Environment (entry of psrun_init)
[PID 5562] PSRUN_DOFORK = (null)
[PID 5562] LD_PRELOAD = libpsrun.so.0
[PID 5562] PSRUN_PID = 5562
[PID 5562] PS_HWPC_FILE = cg.A

NAS Parallel Benchmarks (NPB3.2-SER) - CG Benchmark

Size: 14000
Iterations: 15

Initialization time = 0.656 seconds

iteration    ||r||                   zeta
    1        0.25789587124191E-12    19.9997581277040
    2        0.25434985977194E-14    17.1140495745506
    3        0.25346577542259E-14    17.1296668946143
    4        0.25342984287709E-14    17.1302113581192
    5        0.25247550490803E-14    17.1302338856353
    6        0.25375789728060E-14    17.1302349879482
    7        0.25309911213776E-14    17.1302350498916
    8        0.24971158788969E-14    17.1302350537510
    9        0.24662516791025E-14    17.1302350540101
   10        0.25086578290790E-14    17.1302350540284
   11        0.24878397192172E-14    17.1302350540298
   12        0.24359141964394E-14    17.1302350540299
   13        0.24247346800617E-14    17.1302350540299
   14        0.24157219672237E-14    17.1302350540299
   15        0.24243304908282E-14    17.1302350540299

Benchmark completed
VERIFICATION SUCCESSFUL
Zeta is 0.171302350540E+02
Error is 0.526781606656E-13

CG Benchmark Completed.
Class = A
Size = 14000
Iterations = 15
Time in seconds = 2.06
Mop/s total = 724.79
Operation type = floating point
Verification = SUCCESSFUL
Version = 3.2.1
Compile date = 09 Nov 2009

Compile options:
F77 = ifort
FLINK = $(F77)
F_LIB = (none)
F_INC = (none)
FFLAGS = -O -g
FLINKFLAGS = -O
RAND = randi8

Please send all errors/feedbacks to:

NPB Development Team
np...@na...
real 0m2.756s
user 0m2.711s
sys 0m0.022s

[root@node2 bin]# psprocess -m test_metric.xml cg.A.5562.node2.xml
PerfSuite Hardware Performance Summary Report

Version    : 1.0
Created    : Mon Nov 16 20:46:23 CST 2009
Generator  : psprocess 0.5
XML Source : cg.A.5562.node2.xml

Execution Information
============================================================================================
Collector  : libpshwpc
Date       : Mon Nov 16 20:45:34 2009
Host       : node2
Process ID : 5562
Thread     : 0
User       : root
Command    : cg.A

Processor and System Information
============================================================================================
Node CPUs     : 8
Vendor        : Intel
Family        : Pentium Pro (P6)
Brand         : Intel(R) Xeon(R) CPU E5540 @ 2.53GHz
CPU Revision  : 5
Clock (MHz)   : 1600.000
Memory (MB)   : 16078.69
Pagesize (KB) : 4

Cache Information
============================================================================================
Cache levels : 3
--------------------------------
Level 1
Type         : instruction
Size (KB)    : 32
Linesize (B) : 64
Assoc        : 4

Type         : data
Size (KB)    : 32
Linesize (B) : 64
Assoc        : 8
--------------------------------
Level 2
Type         : unified
Size (KB)    : 256
Linesize (B) : 64
Assoc        : 8
--------------------------------
Level 3
Type         : unified
Size (KB)    : 8192
Linesize (B) : 64
Assoc        : 16

Index Description Counter Value
============================================================================================
1 MEM_LOAD_RETIRED:LLC_UNSHARED_HIT (description not available).... 338818848
2 MEM_LOAD_RETIRED:LLC_MISS (description not available)............ 3219718
3 UNHALTED_CORE_CYCLES (description not available)................. 7312056865

Event Index
============================================================================================
1: MEM_LOAD_RETIRED:LLC_UNSHARED_HIT 2: MEM_LOAD_RETIRED:LLC_MISS 3: UNHALTED_CORE_CYCLES

Statistics
============================================================================================
Counting domain........................................................ user
Multiplexed............................................................ no
Wall clock time (seconds).............................................. 4.310

----------------------------------------------

Here we can see that the "Wall clock time" output by psprocess (4.31 s) is considerably larger than the run time of cg.A, both according to the output of cg.A itself (2.06 s) and according to the time command (about 2.7 s). Where does the rest of the time go? What causes the overhead? And what is the real meaning of "Wall clock time" here?

Second, in the output XML file of psrun, there is a count of CPU time:

<cputime units="seconds">
<usertime>2.002680</usertime>
<systemtime>0.000010</systemtime>
</cputime>

We can see that this is quite close to the real run time of cg.A. Why does psprocess not show these values? Will you add this function in the upcoming ps-1.0?

Regards,
Jie
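
The distinction drawn in the reply, between a wall clock time derived from elapsed clock ticks and a CPU time taken from /proc, can be checked against the numbers quoted in this thread. The Python sketch below is only an illustration and is not PerfSuite code. It sums the user and system times from the <cputime> fragment quoted in the message, then tests one unconfirmed hypothesis: that the 4.310 s figure came from dividing a tick count by the reported 1600 MHz clock while the counter actually advanced at the 2.53 GHz shown in the brand string. The function names and the ticks-divided-by-clock formula are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the psrun XML output quoted in the message. The layout
# of the rest of the document is not shown in the thread, so only this
# element is parsed.
CPUTIME_XML = """
<cputime units="seconds">
  <usertime>2.002680</usertime>
  <systemtime>0.000010</systemtime>
</cputime>
"""


def cpu_seconds(fragment: str) -> float:
    """Return user + system CPU time from a <cputime> element."""
    node = ET.fromstring(fragment)
    return float(node.findtext("usertime")) + float(node.findtext("systemtime"))


def rescale_wall_clock(reported_s: float, assumed_mhz: float, actual_mhz: float) -> float:
    """Recompute elapsed seconds as if ticks had been divided by another clock rate.

    ASSUMPTION: wall clock = elapsed ticks / clock rate. This is a guess at
    how the value was derived, not something confirmed in the thread.
    """
    ticks = reported_s * assumed_mhz * 1e6  # ticks implied by the reported value
    return ticks / (actual_mhz * 1e6)


# 4.310 s and 1600 MHz come from the psprocess report; 2530 MHz is read off
# the "E5540 @ 2.53GHz" brand string in the same report.
cpu = cpu_seconds(CPUTIME_XML)
wall = rescale_wall_clock(4.310, assumed_mhz=1600.0, actual_mhz=2530.0)

print(f"CPU time (user + system):        {cpu:.3f} s")
print(f"Wall clock rescaled to 2.53 GHz: {wall:.3f} s")
```

Under that assumption the rescaled wall clock comes out near 2.73 s, close to the 2.756 s reported by the time command, while the CPU time stays near 2.0 s. This is consistent with the guess that the wall clock was miscalculated, but only the complete XML document requested in the reply could confirm it.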