From: <rk...@il...> - 2009-08-26 13:12:53
|
Jie,

Thanks for reporting on your further experiments. With the issue still present when using PAPI, it seems similar to an issue we have seen on our Altix. Unfortunately, this remains an unresolved issue that may be related to the way that psrun operates internally.

I'm afraid I do not have a solution at present, but I have found that using the PerfSuite API directly produces the proper results. If you are able and willing to do so, using the API involves inserting calls to the following routines:

call psf_hwpc_init() - from the main thread
call psf_hwpc_start() - from within a parallel region
call psf_hwpc_stop(filename) - from within a parallel region

There is an example (in C) in the PerfSuite distribution of calling the API from an OpenMP program. You will find it in:

$PREFIX/share/perfsuite/examples/cpi/cpi-omp.c

I'm afraid I do not have a better solution at this time, but it is an issue we are following up on.

Rick |
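The calling sequence Rick describes (initialize once from the main thread, then start and stop from inside a parallel region) might look roughly like the C sketch below. The three `*_stub` functions are hypothetical stand-ins that only count calls, so the file builds without PerfSuite installed; they are not the library's API. For the real calls, see the cpi-omp.c example named in the message.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the routines named in the message
 * (psf_hwpc_init / psf_hwpc_start / psf_hwpc_stop). They only count
 * calls; replace them with the real PerfSuite calls in actual use. */
static int n_init, n_start, n_stop;

static int hwpc_init_stub(void)
{
    n_init++;                 /* called once, before any parallel region */
    return 0;
}

static int hwpc_start_stub(void)
{
    #pragma omp atomic
    n_start++;                /* called by every thread in the region */
    return 0;
}

static int hwpc_stop_stub(const char *prefix)
{
    (void)prefix;             /* the real routine takes an output file name */
    #pragma omp atomic
    n_stop++;
    return 0;
}

/* Placeholder for the code being measured. */
static double do_work(void)
{
    double sum = 0.0;
    for (int i = 1; i <= 100000; i++)
        sum += 1.0 / i;
    return sum;
}

/* Returns 0 if every thread that started measurement also stopped it. */
int run_profiled_region(void)
{
    n_init = n_start = n_stop = 0;

    if (hwpc_init_stub() != 0)          /* main thread only */
        return -1;
    assert(n_init == 1);

    #pragma omp parallel
    {
        hwpc_start_stub();              /* inside the parallel region */
        double s = do_work();
        (void)s;
        hwpc_stop_stub("is.A");         /* inside the parallel region */
    }

    printf("init=%d start=%d stop=%d\n", n_init, n_start, n_stop);
    return (n_start >= 1 && n_start == n_stop) ? 0 : -1;
}
```

Built with an OpenMP flag such as gcc's -fopenmp, the region runs on OMP_NUM_THREADS threads and each one makes its own start and stop call; without the flag the pragmas are ignored and the region runs once on the main thread.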
From: Robbie <jj...@nu...> - 2009-08-26 12:28:01
|
Rick,

Thanks for your suggestion. However, when I tried to monitor an OpenMP program with the default configuration file, I still got some errors and the data files were not created as expected. Below are the platform information and the way I measured the NPB-OMP program with psrun. Note that the target OpenMP program is compiled by gcc/gfortran with the -fopenmp option.

jiejiang@UT43:~/NPB3.2.1/NPB3.2-OMP/bin$ uname -a
Linux UT43 2.6.27-perfctr #2 SMP Tue Apr 28 20:29:12 CST 2009 i686 GNU/Linux
jiejiang@UT43:~/NPB3.2.1/NPB3.2-OMP/bin$ perfex -i
PerfCtr Info:
abi_version 0x05020501
driver_version 2.6.37 DEBUG
cpu_type 14 (Intel Pentium M)
cpu_features 0x7 (rdpmc,rdtsc,pcint)
cpu_khz 798049
tsc_to_cpu_mult 1
cpu_nrctrs 2
cpus [0], total: 1
cpus_forbidden [], total: 0
jiejiang@UT43:~/NPB3.2.1/NPB3.2-OMP/bin$ ls
bt.A ep.A is.A is.B
jiejiang@UT43:~/NPB3.2.1/NPB3.2-OMP/bin$ export OMP_NUM_THREADS=2
jiejiang@UT43:~/NPB3.2.1/NPB3.2-OMP/bin$ psrun -p ./is.A

NAS Parallel Benchmarks (NPB3.2-OMP) - IS Benchmark
Size: 8388608 (class A)
Iterations: 10
Number of available threads: 2
iteration 1 2 3 4 5 6 7 8 9 10

IS Benchmark Completed
Class = A
Size = 8388608
Iterations = 10
Time in seconds = 1.46
Total threads = 2
Avail threads = 2
Mop/s total = 57.29
Mop/s/thread = 28.64
Operation type = keys ranked
Verification = SUCCESSFUL
Version = 3.2.1
Compile date = 25 Aug 2009
Compile options:
CC = gcc
CLINK = $(CC)
C_LIB = -lm
C_INC = (none)
CFLAGS = -O -g -fopenmp
CLINKFLAGS = -O -fopenmp
Please send all errors/feedbacks to:
NPB Development Team
np...@na...
Inconsistency detected by ld.so: dl-close.c: 719: _dl_close: Assertion `map->l_init_called' failed!
jiejiang@UT43:~/NPB3.2.1/NPB3.2-OMP/bin$ ls
bt.A ep.A is.A is.A.0.30613.UT43.xml is.B

The program execution finishes with the error message in the last line, and there is only ONE XML output file, not two as expected. The same thing happens with the papi_profile_cycles.xml configuration file. What's wrong?

Regards,
Jie Jiang |
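Jie's check (one XML file where two were expected) relies on each thread writing its own output file. As a rough aid for that kind of check, the sketch below splits a name such as is.A.0.30613.UT43.xml into its fields. The layout `<prefix>.<thread>.<pid>.<host>.xml` is an assumption inferred from the single file name in the listing, not something the thread confirms, and `struct run_file` and `parse_run_file` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Fields of an output file name, under the ASSUMED layout
 * <prefix>.<thread>.<pid>.<host>.xml (inferred from one example). */
struct run_file {
    char prefix[64];
    int  thread;
    long pid;
    char host[64];
};

/* Returns 0 on success, -1 if the name does not fit the assumed layout. */
int parse_run_file(const char *name, struct run_file *out)
{
    char buf[256];
    char *host, *pid, *thr, *end;
    size_t len;

    assert(name != NULL && out != NULL);

    len = strlen(name);
    if (len < 5 || len >= sizeof buf || strcmp(name + len - 4, ".xml") != 0)
        return -1;
    memcpy(buf, name, len - 4);          /* copy without the ".xml" suffix */
    buf[len - 4] = '\0';

    /* Peel fields off from the right, because the prefix may contain dots. */
    if ((host = strrchr(buf, '.')) == NULL) return -1;
    *host++ = '\0';
    if ((pid = strrchr(buf, '.')) == NULL) return -1;
    *pid++ = '\0';
    if ((thr = strrchr(buf, '.')) == NULL) return -1;
    *thr++ = '\0';

    out->pid = strtol(pid, &end, 10);
    if (*pid == '\0' || *end != '\0') return -1;
    out->thread = (int)strtol(thr, &end, 10);
    if (*thr == '\0' || *end != '\0') return -1;

    if (strlen(buf) >= sizeof out->prefix || strlen(host) >= sizeof out->host)
        return -1;
    strcpy(out->prefix, buf);
    strcpy(out->host, host);
    return 0;
}
```

Collecting the thread field over all matching files would give the set of threads that actually wrote data; under the assumed layout, the run above produced a file for thread 0 only.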
From: Rick K. <rk...@il...> - 2009-08-25 14:41:41
|
Jie,

My guess is that what is happening here is related to the use of the "itimer.xml" configuration file. The problem is that signal delivery is not defined with POSIX threads, and the results are unpredictable. POSIX threads enter the picture when you are using OpenMP.

Does your system happen to have kernel support for hardware counters? If so, you may have better luck by profiling with performance counters such as total cycles rather than itimers.

Rick

----- Original Message -----
From: "Robbie" <jj...@nu...>
To: per...@li...
Sent: Tuesday, August 25, 2009 9:32:33 AM GMT -06:00 US/Canada Central
Subject: [PerfSuite-users] Problems about using Perfsuite to monitor OpenMP program (NPB-3.2.1)

Hi,

Recently I have been trying to use perfsuite-0.6.2 to monitor an OpenMP program (NPB-3.2.1/NPB-OMP) and have encountered some problems. The target platform is Xeon64/Linux:

uname -a:
Linux node3 2.6.28 #2 SMP Tue Mar 3 15:49:55 CST 2009 x86_64 x86_64 x86_64 GNU/Linux

The perfsuite configure options are listed as follows:

./configure --prefix=/usr/local/perfsuite/psuite --with-tclinclude=/usr/local/perfusite/tcl/include --with-tdom=/usr/local/perfsuite/tdom --enable-mpi F77=ifort MPICPPFLAGS=-I/home/jiangjie/install/mvapich2/include --disable-binutils --with-tclsh=/usr/local/perfsuite/tcl/bin/tclsh8.5 --enable-debug

My OpenMP test suite is NPB3.2.1/NPB3.2-OMP, and the compiler is ifort (version 10.1) with the --openmp option. Different experiments returned several different results:

(1) Experiment 1 (export OMP_NUM_THREADS=2)

[jiangjie@node1 bin]$ psrun -c /usr/local/perfsuite/psuite/share/perfsuite/xml/pshwpc/itimer.xml -p ./bt.A
libpsrun.c:181 : SIGPROF ignored on startup.
Handler=0x1, flags=14000000 PerfSuite debugging enabled (debug level: PS_DEBUG_OFF) [PID 14245] Library version: threaded [PID 14245] Environment (entry of psrun_init) [PID 14245] PSRUN_DOFORK = (null) [PID 14245] LD_PRELOAD = libpsrun_r.so.0 [PID 14245] PSRUN_PID = 14245 [PID 14245] PS_HWPC_FILE = bt.A 00400000-004b2000 r-xp 00000000 08:01 60425445 /home/jiangjie/NPB3.2.1/NPB3.2-OMP/bin/bt.A 006b2000-006b7000 rwxp 000b2000 08:01 60425445 /home/jiangjie/NPB3.2.1/NPB3.2-OMP/bin/bt.A 006b7000-032fb000 rwxp 006b7000 00:00 0 10378000-10399000 rwxp 10378000 00:00 0 2aaaaaaab000-2aaaaaac5000 r-xp 00000000 08:01 44958009 /lib64/ld-2.5.so 2aaaaaac5000-2aaaaaac7000 rwxp 2aaaaaac5000 00:00 0 2aaaaacc4000-2aaaaacc5000 r-xp 00019000 08:01 44958009 /lib64/ld-2.5.so 2aaaaacc5000-2aaaaacc6000 rwxp 0001a000 08:01 44958009 /lib64/ld-2.5.so 2aaaaacc6000-2aaaaacca000 r-xp 00000000 08:01 60425291 /usr/local/perfsuite/psuite/lib/libpsrun_r.so.0.0.1 2aaaaacca000-2aaaaaec9000 ---p 00004000 08:01 60425291 /usr/local/perfsuite/psuite/lib/libpsrun_r.so.0.0.1 2aaaaaec9000-2aaaaaeca000 rwxp 00003000 08:01 60425291 /usr/local/perfsuite/psuite/lib/libpsrun_r.so.0.0.1 2aaaaaeca000-2aaaaaecb000 rwxp 2aaaaaeca000 00:00 0 2aaaaaef1000-2aaaaaf73000 r-xp 00000000 08:01 44957999 /lib64/libm-2.5.so 2aaaaaf73000-2aaaab172000 ---p 00082000 08:01 44957999 /lib64/libm-2.5.so 2aaaab172000-2aaaab173000 r-xp 00081000 08:01 44957999 /lib64/libm-2.5.so 2aaaab173000-2aaaab174000 rwxp 00082000 08:01 44957999 /lib64/libm-2.5.so 2aaaab174000-2aaaab1db000 r-xp 00000000 08:01 66789180 /opt/intel10.1/fce/10.1.017/lib/libguide.so 2aaaab1db000-2aaaab2db000 ---p 00067000 08:01 66789180 /opt/intel10.1/fce/10.1.017/lib/libguide.so 2aaaab2db000-2aaaab2e0000 rwxp 00067000 08:01 66789180 /opt/intel10.1/fce/10.1.017/lib/libguide.so 2aaaab2e0000-2aaaab2e8000 rwxp 2aaaab2e0000 00:00 0 2aaaab2e8000-2aaaab2fd000 r-xp 00000000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaab2fd000-2aaaab4fc000 ---p 00015000 08:01 44957988 
/lib64/libpthread-2.5.so 2aaaab4fc000-2aaaab4fd000 r-xp 00014000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaab4fd000-2aaaab4fe000 rwxp 00015000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaab4fe000-2aaaab502000 rwxp 2aaaab4fe000 00:00 0 2aaaab502000-2aaaab646000 r-xp 00000000 08:01 44958004 /lib64/libc-2.5.so 2aaaab646000-2aaaab846000 ---p 00144000 08:01 44958004 /lib64/libc-2.5.so 2aaaab846000-2aaaab84a000 r-xp 00144000 08:01 44958004 /lib64/libc-2.5.so 2aaaab84a000-2aaaab84b000 rwxp 00148000 08:01 44958004 /lib64/libc-2.5.so 2aaaab84b000-2aaaab850000 rwxp 2aaaab84b000 00:00 0 2aaaab850000-2aaaab85d000 r-xp 00000000 08:01 44958007 /lib64/libgcc_s-4.1.1-20070105.so.1 2aaaab85d000-2aaaaba5c000 ---p 0000d000 08:01 44958007 /lib64/libgcc_s-4.1.1-20070105.so.1 2aaaaba5c000-2aaaaba5d000 rwxp 0000c000 08:01 44958007 /lib64/libgcc_s-4.1.1-20070105.so.1 2aaaaba5d000-2aaaaba5e000 rwxp 2aaaaba5d000 00:00 0 2aaaaba5e000-2aaaaba60000 r-xp 00000000 08:01 44958023 /lib64/libdl-2.5.so 2aaaaba60000-2aaaabc60000 ---p 00002000 08:01 44958023 /lib64/libdl-2.5.so 2aaaabc60000-2aaaabc61000 r-xp 00002000 08:01 44958023 /lib64/libdl-2.5.so 2aaaabc61000-2aaaabc62000 rwxp 00003000 08:01 44958023 /lib64/libdl-2.5.so 2aaaabc62000-2aaaabc6f000 r-xp 00000000 08:01 60425170 /usr/local/perfsuite/psuite/lib/libperfsuite_r.so.1.0.1 2aaaabc6f000-2aaaabe6e000 ---p 0000d000 08:01 60425170 /usr/local/perfsuite/psuite/lib/libperfsuite_r.so.1.0.1 2aaaabe6e000-2aaaabe6f000 rwxp 0000c000 08:01 60425170 /usr/local/perfsuite/psuite/lib/libperfsuite_r.so.1.0.1 2aaaabe6f000-2aaaabe71000 rwxp 2aaaabe6f000 00:00 0 2aaaabe71000-2aaaabe83000 r-xp 00000000 08:01 60425187 /usr/local/perfsuite/psuite/lib/libpshwpc_r.so.1.0.1 2aaaabe83000-2aaaac082000 ---p 00012000 08:01 60425187 /usr/local/perfsuite/psuite/lib/libpshwpc_r.so.1.0.1 2aaaac082000-2aaaac083000 rwxp 00011000 08:01 60425187 /usr/local/perfsuite/psuite/lib/libpshwpc_r.so.1.0.1 2aaaac083000-2aaaac084000 rwxp 2aaaac083000 00:00 0 2aaaac0aa000-2aaaac0ca000 
r-xp 00000000 08:01 44958013 /lib64/libexpat.so.0.5.0
2aaaac0ca000-2aaaac2c9000 ---p 00020000 08:01 44958013 /lib64/libexpat.so.0.5.0
2aaaac2c9000-2aaaac2cc000 rwxp 0001f000 08:01 44958013 /lib64/libexpat.so.0.5.0
7fff65329000-7fff6533e000 rwxp 7fff65329000 00:00 0 [stack]
ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0 [vdso]

NAS Parallel Benchmarks (NPB3.2-OMP) - BT Benchmark
No input file inputbt.data. Using compiled defaults
Size: 64x 64x 64
Iterations: 200 dt: 0.000800
Number of available threads: 2
Thread created...
Thread returned from hwpc_start with status 0
Thread created...
Fatal profiling error: cannot access TSD data in handler.
Please report this as an internal PerfSuite bug to per...@li....
Thread executing cleanup handler
Thread returned from hwpc_stop with status 0

(2) Experiment 2 (export OMP_NUM_THREADS=8)

[jiangjie@node1 bin]$ psrun -c /usr/local/perfsuite/psuite/share/perfsuite/xml/pshwpc/itimer.xml -p ./cg.A
libpsrun.c:181 : SIGPROF ignored on startup. Handler=0x1, flags=14000000
PerfSuite debugging enabled (debug level: PS_DEBUG_OFF)
[PID 15326] Library version: threaded
[PID 15326] Environment (entry of psrun_init)
[PID 15326] PSRUN_DOFORK = (null)
[PID 15326] LD_PRELOAD = libpsrun_r.so.0
[PID 15326] PSRUN_PID = 15326
[PID 15326] PS_HWPC_FILE = cg.A
00400000-00406000 r-xp 00000000 08:01 60425440 /home/jiangjie/NPB3.2.1/NPB3.2-OMP/bin/cg.A
00606000-00607000 rw-p 00006000 08:01 60425440 /home/jiangjie/NPB3.2.1/NPB3.2-OMP/bin/cg.A
00607000-036c9000 rw-p 00607000 00:00 0
19af3000-19b14000 rw-p 19af3000 00:00 0
2aaaaaaab000-2aaaaaac5000 r-xp 00000000 08:01 44958009 /lib64/ld-2.5.so
2aaaaaac5000-2aaaaaac7000 rw-p 2aaaaaac5000 00:00 0
2aaaaacc4000-2aaaaacc5000 r--p 00019000 08:01 44958009 /lib64/ld-2.5.so
2aaaaacc5000-2aaaaacc6000 rw-p 0001a000 08:01 44958009 /lib64/ld-2.5.so
2aaaaacc6000-2aaaaacca000 r-xp 00000000 08:01 60425291 /usr/local/perfsuite/psuite/lib/libpsrun_r.so.0.0.1
2aaaaacca000-2aaaaaec9000 ---p 00004000 08:01 60425291
/usr/local/perfsuite/psuite/lib/libpsrun_r.so.0.0.1 2aaaaaec9000-2aaaaaeca000 rw-p 00003000 08:01 60425291 /usr/local/perfsuite/psuite/lib/libpsrun_r.so.0.0.1 2aaaaaeca000-2aaaaaecb000 rw-p 2aaaaaeca000 00:00 0 2aaaaaef1000-2aaaaaf87000 r-xp 00000000 08:01 53419189 /usr/lib64/libgfortran.so.1.0.0 2aaaaaf87000-2aaaab186000 ---p 00096000 08:01 53419189 /usr/lib64/libgfortran.so.1.0.0 2aaaab186000-2aaaab188000 rw-p 00095000 08:01 53419189 /usr/lib64/libgfortran.so.1.0.0 2aaaab188000-2aaaab20a000 r-xp 00000000 08:01 44957999 /lib64/libm-2.5.so 2aaaab20a000-2aaaab409000 ---p 00082000 08:01 44957999 /lib64/libm-2.5.so 2aaaab409000-2aaaab40a000 r--p 00081000 08:01 44957999 /lib64/libm-2.5.so 2aaaab40a000-2aaaab40b000 rw-p 00082000 08:01 44957999 /lib64/libm-2.5.so 2aaaab40b000-2aaaab40c000 rw-p 2aaaab40b000 00:00 0 2aaaab40c000-2aaaab413000 r-xp 00000000 08:01 53419101 /usr/lib64/libgomp.so.1.0.0 2aaaab413000-2aaaab612000 ---p 00007000 08:01 53419101 /usr/lib64/libgomp.so.1.0.0 2aaaab612000-2aaaab613000 rw-p 00006000 08:01 53419101 /usr/lib64/libgomp.so.1.0.0 2aaaab613000-2aaaab620000 r-xp 00000000 08:01 44958007 /lib64/libgcc_s-4.1.1-20070105.so.1 2aaaab620000-2aaaab81f000 ---p 0000d000 08:01 44958007 /lib64/libgcc_s-4.1.1-20070105.so.1 2aaaab81f000-2aaaab820000 rw-p 0000c000 08:01 44958007 /lib64/libgcc_s-4.1.1-20070105.so.1 2aaaab820000-2aaaab835000 r-xp 00000000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaab835000-2aaaaba34000 ---p 00015000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaaba34000-2aaaaba35000 r--p 00014000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaaba35000-2aaaaba36000 rw-p 00015000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaaba36000-2aaaaba3b000 rw-p 2aaaaba36000 00:00 0 2aaaaba3b000-2aaaabb7f000 r-xp 00000000 08:01 44958004 /lib64/libc-2.5.so 2aaaabb7f000-2aaaabd7f000 ---p 00144000 08:01 44958004 /lib64/libc-2.5.so 2aaaabd7f000-2aaaabd83000 r--p 00144000 08:01 44958004 /lib64/libc-2.5.so 2aaaabd83000-2aaaabd84000 rw-p 00148000 08:01 44958004 
/lib64/libc-2.5.so 2aaaabd84000-2aaaabd89000 rw-p 2aaaabd84000 00:00 0 2aaaabd89000-2aaaabd96000 r-xp 00000000 08:01 60425170 /usr/local/perfsuite/psuite/lib/libperfsuite_r.so.1.0.1 2aaaabd96000-2aaaabf95000 ---p 0000d000 08:01 60425170 /usr/local/perfsuite/psuite/lib/libperfsuite_r.so.1.0.1 2aaaabf95000-2aaaabf96000 rw-p 0000c000 08:01 60425170 /usr/local/perfsuite/psuite/lib/libperfsuite_r.so.1.0.1 2aaaabf96000-2aaaabf98000 r-xp 00000000 08:01 44958023 /lib64/libdl-2.5.so 2aaaabf98000-2aaaac198000 ---p 00002000 08:01 44958023 /lib64/libdl-2.5.so 2aaaac198000-2aaaac199000 r--p 00002000 08:01 44958023 /lib64/libdl-2.5.so 2aaaac199000-2aaaac19a000 rw-p 00003000 08:01 44958023 /lib64/libdl-2.5.so 2aaaac19a000-2aaaac19b000 rw-p 2aaaac19a000 00:00 0 2aaaac19b000-2aaaac1a2000 r-xp 00000000 08:01 44957732 /lib64/librt-2.5.so 2aaaac1a2000-2aaaac3a2000 ---p 00007000 08:01 44957732 /lib64/librt-2.5.so 2aaaac3a2000-2aaaac3a3000 r--p 00007000 08:01 44957732 /lib64/librt-2.5.so 2aaaac3a3000-2aaaac3a4000 rw-p 00008000 08:01 44957732 /lib64/librt-2.5.so 2aaaac3a4000-2aaaac3a6000 rw-p 2aaaac3a4000 00:00 0 2aaaac3a6000-2aaaac3b8000 r-xp 00000000 08:01 60425187 /usr/local/perfsuite/psuite/lib/libpshwpc_r.so.1.0.1 2aaaac3b8000-2aaaac5b7000 ---p 00012000 08:01 60425187 /usr/local/perfsuite/psuite/lib/libpshwpc_r.so.1.0.1 2aaaac5b7000-2aaaac5b8000 rw-p 00011000 08:01 60425187 /usr/local/perfsuite/psuite/lib/libpshwpc_r.so.1.0.1 2aaaac5b8000-2aaaac5b9000 rw-p 2aaaac5b8000 00:00 0 2aaaac5df000-2aaaac5ff000 r-xp 00000000 08:01 44958013 /lib64/libexpat.so.0.5.0 2aaaac5ff000-2aaaac7fe000 ---p 00020000 08:01 44958013 /lib64/libexpat.so.0.5.0 2aaaac7fe000-2aaaac801000 rw-p 0001f000 08:01 44958013 /lib64/libexpat.so.0.5.0 7fffc37a5000-7fffc37ba000 rw-p 7fffc37a5000 00:00 0 [stack] ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0 [vdso] NAS Parallel Benchmarks (NPB3.2-OMP) - CG Benchmark Size: 14000 Iterations: 15 Number of available threads: 8 Thread created... Thread created... 
Thread returned from hwpc_start with status 0
Thread returned from hwpc_start with status 0
Thread created...
Thread created...
Thread created...
Thread created...
Thread created...
Fatal profiling error: cannot access TSD data in handler.
Please report this as an internal PerfSuite bug to per...@li....
[PID 15326] Notice: caught SIGSEGV in wrap-up.

(3) Experiment 3 (export OMP_NUM_THREADS=8)

[jiangjie@node1 bin]$ psrun -c /usr/local/perfsuite/psuite/share/perfsuite/xml/pshwpc/itimer.xml -p ./cg.A
libpsrun.c:181 : SIGPROF ignored on startup. Handler=0x1, flags=14000000
PerfSuite debugging enabled (debug level: PS_DEBUG_OFF)
[PID 15784] Library version: threaded
[PID 15784] Environment (entry of psrun_init)
[PID 15784] PSRUN_DOFORK = (null)
[PID 15784] LD_PRELOAD = libpsrun_r.so.0
[PID 15784] PSRUN_PID = 15784
[PID 15784] PS_HWPC_FILE = cg.A
00400000-004a7000 r-xp 00000000 08:01 60425440 /home/jiangjie/NPB3.2.1/NPB3.2-OMP/bin/cg.A
006a7000-006ae000 rwxp 000a7000 08:01 60425440 /home/jiangjie/NPB3.2.1/NPB3.2-OMP/bin/cg.A
006ae000-037a2000 rwxp 006ae000 00:00 0
15b6b000-15b8c000 rwxp 15b6b000 00:00 0
2aaaaaaab000-2aaaaaac5000 r-xp 00000000 08:01 44958009 /lib64/ld-2.5.so
2aaaaaac5000-2aaaaaac7000 rwxp 2aaaaaac5000 00:00 0
2aaaaacc4000-2aaaaacc5000 r-xp 00019000 08:01 44958009 /lib64/ld-2.5.so
2aaaaacc5000-2aaaaacc6000 rwxp 0001a000 08:01 44958009 /lib64/ld-2.5.so
2aaaaacc6000-2aaaaacca000 r-xp 00000000 08:01 60425291 /usr/local/perfsuite/psuite/lib/libpsrun_r.so.0.0.1
2aaaaacca000-2aaaaaec9000 ---p 00004000 08:01 60425291 /usr/local/perfsuite/psuite/lib/libpsrun_r.so.0.0.1
2aaaaaec9000-2aaaaaeca000 rwxp 00003000 08:01 60425291 /usr/local/perfsuite/psuite/lib/libpsrun_r.so.0.0.1
2aaaaaeca000-2aaaaaecb000 rwxp 2aaaaaeca000 00:00 0
2aaaaaef1000-2aaaaaf73000 r-xp 00000000 08:01 44957999 /lib64/libm-2.5.so
2aaaaaf73000-2aaaab172000 ---p 00082000 08:01 44957999 /lib64/libm-2.5.so
2aaaab172000-2aaaab173000 r-xp 00081000 08:01 44957999 /lib64/libm-2.5.so
2aaaab173000-2aaaab174000 rwxp 00082000 08:01 44957999 /lib64/libm-2.5.so 2aaaab174000-2aaaab1db000 r-xp 00000000 08:01 66789180 /opt/intel10.1/fce/10.1.017/lib/libguide.so 2aaaab1db000-2aaaab2db000 ---p 00067000 08:01 66789180 /opt/intel10.1/fce/10.1.017/lib/libguide.so 2aaaab2db000-2aaaab2e0000 rwxp 00067000 08:01 66789180 /opt/intel10.1/fce/10.1.017/lib/libguide.so 2aaaab2e0000-2aaaab2e8000 rwxp 2aaaab2e0000 00:00 0 2aaaab2e8000-2aaaab2fd000 r-xp 00000000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaab2fd000-2aaaab4fc000 ---p 00015000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaab4fc000-2aaaab4fd000 r-xp 00014000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaab4fd000-2aaaab4fe000 rwxp 00015000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaab4fe000-2aaaab502000 rwxp 2aaaab4fe000 00:00 0 2aaaab502000-2aaaab646000 r-xp 00000000 08:01 44958004 /lib64/libc-2.5.so 2aaaab646000-2aaaab846000 ---p 00144000 08:01 44958004 /lib64/libc-2.5.so 2aaaab846000-2aaaab84a000 r-xp 00144000 08:01 44958004 /lib64/libc-2.5.so 2aaaab84a000-2aaaab84b000 rwxp 00148000 08:01 44958004 /lib64/libc-2.5.so 2aaaab84b000-2aaaab850000 rwxp 2aaaab84b000 00:00 0 2aaaab850000-2aaaab85d000 r-xp 00000000 08:01 44958007 /lib64/libgcc_s-4.1.1-20070105.so.1 2aaaab85d000-2aaaaba5c000 ---p 0000d000 08:01 44958007 /lib64/libgcc_s-4.1.1-20070105.so.1 2aaaaba5c000-2aaaaba5d000 rwxp 0000c000 08:01 44958007 /lib64/libgcc_s-4.1.1-20070105.so.1 2aaaaba5d000-2aaaaba5e000 rwxp 2aaaaba5d000 00:00 0 2aaaaba5e000-2aaaaba60000 r-xp 00000000 08:01 44958023 /lib64/libdl-2.5.so 2aaaaba60000-2aaaabc60000 ---p 00002000 08:01 44958023 /lib64/libdl-2.5.so 2aaaabc60000-2aaaabc61000 r-xp 00002000 08:01 44958023 /lib64/libdl-2.5.so 2aaaabc61000-2aaaabc62000 rwxp 00003000 08:01 44958023 /lib64/libdl-2.5.so 2aaaabc62000-2aaaabc6f000 r-xp 00000000 08:01 60425170 /usr/local/perfsuite/psuite/lib/libperfsuite_r.so.1.0.1 2aaaabc6f000-2aaaabe6e000 ---p 0000d000 08:01 60425170 /usr/local/perfsuite/psuite/lib/libperfsuite_r.so.1.0.1 
2aaaabe6e000-2aaaabe6f000 rwxp 0000c000 08:01 60425170 /usr/local/perfsuite/psuite/lib/libperfsuite_r.so.1.0.1
2aaaabe6f000-2aaaabe71000 rwxp 2aaaabe6f000 00:00 0
2aaaabe71000-2aaaabe83000 r-xp 00000000 08:01 60425187 /usr/local/perfsuite/psuite/lib/libpshwpc_r.so.1.0.1
2aaaabe83000-2aaaac082000 ---p 00012000 08:01 60425187 /usr/local/perfsuite/psuite/lib/libpshwpc_r.so.1.0.1
2aaaac082000-2aaaac083000 rwxp 00011000 08:01 60425187 /usr/local/perfsuite/psuite/lib/libpshwpc_r.so.1.0.1
2aaaac083000-2aaaac084000 rwxp 2aaaac083000 00:00 0
2aaaac0aa000-2aaaac0ca000 r-xp 00000000 08:01 44958013 /lib64/libexpat.so.0.5.0
2aaaac0ca000-2aaaac2c9000 ---p 00020000 08:01 44958013 /lib64/libexpat.so.0.5.0
2aaaac2c9000-2aaaac2cc000 rwxp 0001f000 08:01 44958013 /lib64/libexpat.so.0.5.0
7fff47070000-7fff47085000 rwxp 7fff47070000 00:00 0 [stack]
ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0 [vdso]

NAS Parallel Benchmarks (NPB3.2-OMP) - CG Benchmark
Size: 14000
Iterations: 15
Number of available threads: 8
Thread created...
Thread created...
Thread created...
Thread created...
Thread created...
Fatal profiling error: cannot access TSD data in handler.
Please report this as an internal PerfSuite bug to per...@li....
Thread returned from hwpc_start with status 0
Thread returned from hwpc_start with status 0
Thread created...
Thread created...
system error(35): __kmp_reap_monitor: monitor did not reap properly: Resource deadlock avoided
OMP: Error #107: Fatal system error detected.
Thread executing cleanup handler

(4) Experiment 4 (export OMP_NUM_THREADS=8)

[jiangjie@node1 bin]$ psrun -c /usr/local/perfsuite/psuite/share/perfsuite/xml/pshwpc/itimer.xml -p ./cg.A
libpsrun.c:181 : SIGPROF ignored on startup.
Handler=0x1, flags=14000000 PerfSuite debugging enabled (debug level: PS_DEBUG_OFF) [PID 15916] Library version: threaded [PID 15916] Environment (entry of psrun_init) [PID 15916] PSRUN_DOFORK = (null) [PID 15916] LD_PRELOAD = libpsrun_r.so.0 [PID 15916] PSRUN_PID = 15916 [PID 15916] PS_HWPC_FILE = cg.A 00400000-004a7000 r-xp 00000000 08:01 60425440 /home/jiangjie/NPB3.2.1/NPB3.2-OMP/bin/cg.A 006a7000-006ae000 rwxp 000a7000 08:01 60425440 /home/jiangjie/NPB3.2.1/NPB3.2-OMP/bin/cg.A 006ae000-037a2000 rwxp 006ae000 00:00 0 19c52000-19c73000 rwxp 19c52000 00:00 0 2aaaaaaab000-2aaaaaac5000 r-xp 00000000 08:01 44958009 /lib64/ld-2.5.so 2aaaaaac5000-2aaaaaac7000 rwxp 2aaaaaac5000 00:00 0 2aaaaacc4000-2aaaaacc5000 r-xp 00019000 08:01 44958009 /lib64/ld-2.5.so 2aaaaacc5000-2aaaaacc6000 rwxp 0001a000 08:01 44958009 /lib64/ld-2.5.so 2aaaaacc6000-2aaaaacca000 r-xp 00000000 08:01 60425291 /usr/local/perfsuite/psuite/lib/libpsrun_r.so.0.0.1 2aaaaacca000-2aaaaaec9000 ---p 00004000 08:01 60425291 /usr/local/perfsuite/psuite/lib/libpsrun_r.so.0.0.1 2aaaaaec9000-2aaaaaeca000 rwxp 00003000 08:01 60425291 /usr/local/perfsuite/psuite/lib/libpsrun_r.so.0.0.1 2aaaaaeca000-2aaaaaecb000 rwxp 2aaaaaeca000 00:00 0 2aaaaaef1000-2aaaaaf73000 r-xp 00000000 08:01 44957999 /lib64/libm-2.5.so 2aaaaaf73000-2aaaab172000 ---p 00082000 08:01 44957999 /lib64/libm-2.5.so 2aaaab172000-2aaaab173000 r-xp 00081000 08:01 44957999 /lib64/libm-2.5.so 2aaaab173000-2aaaab174000 rwxp 00082000 08:01 44957999 /lib64/libm-2.5.so 2aaaab174000-2aaaab1db000 r-xp 00000000 08:01 66789180 /opt/intel10.1/fce/10.1.017/lib/libguide.so 2aaaab1db000-2aaaab2db000 ---p 00067000 08:01 66789180 /opt/intel10.1/fce/10.1.017/lib/libguide.so 2aaaab2db000-2aaaab2e0000 rwxp 00067000 08:01 66789180 /opt/intel10.1/fce/10.1.017/lib/libguide.so 2aaaab2e0000-2aaaab2e8000 rwxp 2aaaab2e0000 00:00 0 2aaaab2e8000-2aaaab2fd000 r-xp 00000000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaab2fd000-2aaaab4fc000 ---p 00015000 08:01 44957988 
/lib64/libpthread-2.5.so 2aaaab4fc000-2aaaab4fd000 r-xp 00014000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaab4fd000-2aaaab4fe000 rwxp 00015000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaab4fe000-2aaaab502000 rwxp 2aaaab4fe000 00:00 0 2aaaab502000-2aaaab646000 r-xp 00000000 08:01 44958004 /lib64/libc-2.5.so 2aaaab646000-2aaaab846000 ---p 00144000 08:01 44958004 /lib64/libc-2.5.so 2aaaab846000-2aaaab84a000 r-xp 00144000 08:01 44958004 /lib64/libc-2.5.so 2aaaab84a000-2aaaab84b000 rwxp 00148000 08:01 44958004 /lib64/libc-2.5.so 2aaaab84b000-2aaaab850000 rwxp 2aaaab84b000 00:00 0 2aaaab850000-2aaaab85d000 r-xp 00000000 08:01 44958007 /lib64/libgcc_s-4.1.1-20070105.so.1 2aaaab85d000-2aaaaba5c000 ---p 0000d000 08:01 44958007 /lib64/libgcc_s-4.1.1-20070105.so.1 2aaaaba5c000-2aaaaba5d000 rwxp 0000c000 08:01 44958007 /lib64/libgcc_s-4.1.1-20070105.so.1 2aaaaba5d000-2aaaaba5e000 rwxp 2aaaaba5d000 00:00 0 2aaaaba5e000-2aaaaba60000 r-xp 00000000 08:01 44958023 /lib64/libdl-2.5.so 2aaaaba60000-2aaaabc60000 ---p 00002000 08:01 44958023 /lib64/libdl-2.5.so 2aaaabc60000-2aaaabc61000 r-xp 00002000 08:01 44958023 /lib64/libdl-2.5.so 2aaaabc61000-2aaaabc62000 rwxp 00003000 08:01 44958023 /lib64/libdl-2.5.so 2aaaabc62000-2aaaabc6f000 r-xp 00000000 08:01 60425170 /usr/local/perfsuite/psuite/lib/libperfsuite_r.so.1.0.1 2aaaabc6f000-2aaaabe6e000 ---p 0000d000 08:01 60425170 /usr/local/perfsuite/psuite/lib/libperfsuite_r.so.1.0.1 2aaaabe6e000-2aaaabe6f000 rwxp 0000c000 08:01 60425170 /usr/local/perfsuite/psuite/lib/libperfsuite_r.so.1.0.1 2aaaabe6f000-2aaaabe71000 rwxp 2aaaabe6f000 00:00 0 2aaaabe71000-2aaaabe83000 r-xp 00000000 08:01 60425187 /usr/local/perfsuite/psuite/lib/libpshwpc_r.so.1.0.1 2aaaabe83000-2aaaac082000 ---p 00012000 08:01 60425187 /usr/local/perfsuite/psuite/lib/libpshwpc_r.so.1.0.1 2aaaac082000-2aaaac083000 rwxp 00011000 08:01 60425187 /usr/local/perfsuite/psuite/lib/libpshwpc_r.so.1.0.1 2aaaac083000-2aaaac084000 rwxp 2aaaac083000 00:00 0 2aaaac0aa000-2aaaac0ca000 
r-xp 00000000 08:01 44958013 /lib64/libexpat.so.0.5.0
2aaaac0ca000-2aaaac2c9000 ---p 00020000 08:01 44958013 /lib64/libexpat.so.0.5.0
2aaaac2c9000-2aaaac2cc000 rwxp 0001f000 08:01 44958013 /lib64/libexpat.so.0.5.0
7fff36965000-7fff3697b000 rwxp 7fff36965000 00:00 0 [stack]
ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0 [vdso]

NAS Parallel Benchmarks (NPB3.2-OMP) - CG Benchmark
Size: 14000
Iterations: 15
Number of available threads: 8
Thread created...
Thread created...
Thread created...
Thread created...
Thread created...
Thread created...
Thread created...
Fatal profiling error: cannot access TSD data in handler.
Please report this as an internal PerfSuite bug to per...@li....
Thread created...
Fatal profiling error: cannot access TSD data in handler.
Please report this as an internal PerfSuite bug to per...@li....
libpsrun fatal error: calling sequence not allowed

It is strange that sometimes it works well. Some of the aforementioned problems also occur on my x86/Linux platform. We also notice that the message "Fatal profiling error: cannot access TSD data in handler." always appears. Note that for all serial and MPI NPB benchmarks, perfsuite works well. Any suggestions?

Regards,
Jie Jiang |
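Rick's point about signals and POSIX threads, together with the recurring "cannot access TSD data in handler" message, can be illustrated with a small C sketch. A signal sent to the process as a whole is handled by some thread that has it unblocked, and that thread may never have set up its per-thread data. The code below is not PerfSuite's implementation: `thread_state`, `unprepared_worker` and `signal_lands_in_unprepared_thread` are hypothetical names, and the signal is sent with kill() rather than by an interval timer so that the outcome is repeatable.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/* Per-thread state a profiler might create when it starts measuring a
 * thread. Hypothetical: this is not PerfSuite's data structure. */
static _Thread_local int *thread_state;       /* NULL until a thread sets it */
static volatile sig_atomic_t handled, state_missing;

static void on_sigprof(int sig)
{
    (void)sig;
    state_missing = (thread_state == NULL);   /* handler finds no state */
    handled = 1;
}

/* A thread that never sets up per-thread state; it only waits. */
static void *unprepared_worker(void *arg)
{
    struct timespec nap = { 0, 1000000 };     /* 1 ms */
    (void)arg;
    while (!handled)
        nanosleep(&nap, NULL);
    return NULL;
}

/* Returns 1 if the handler ran in a thread that had no per-thread state. */
int signal_lands_in_unprepared_thread(void)
{
    static int main_state = 1;
    struct sigaction sa = { 0 }, old_sa;
    sigset_t prof, old_mask;
    pthread_t tid;
    int rc;

    handled = 0;
    state_missing = 0;
    thread_state = &main_state;               /* only the caller is prepared */

    sa.sa_handler = on_sigprof;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGPROF, &sa, &old_sa);

    sigemptyset(&prof);
    sigaddset(&prof, SIGPROF);
    pthread_sigmask(SIG_UNBLOCK, &prof, &old_mask);
    rc = pthread_create(&tid, NULL, unprepared_worker, NULL);
    assert(rc == 0);                          /* worker inherits the open mask */

    /* Block SIGPROF in the caller so the demo is deterministic: the only
     * thread left to take a process-directed SIGPROF is the worker. */
    pthread_sigmask(SIG_BLOCK, &prof, NULL);
    kill(getpid(), SIGPROF);
    pthread_join(tid, NULL);

    pthread_sigmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGPROF, &old_sa, NULL);
    return state_missing;
}
```

In a real run nothing forces the choice of thread the way the mask does here, which may be one reason the four experiments above fail in different ways. On older C libraries the program needs to be linked with -pthread.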
From: Robbie <jj...@nu...> - 2009-08-25 14:33:06
|
Hi,

Recently I have been trying to use perfsuite-0.6.2 to monitor an OpenMP program (NPB-3.2.1/NPB-OMP) and have encountered some problems. The target platform is Xeon64/Linux:

uname -a:
Linux node3 2.6.28 #2 SMP Tue Mar 3 15:49:55 CST 2009 x86_64 x86_64 x86_64 GNU/Linux

The perfsuite configure options are listed as follows:

./configure --prefix=/usr/local/perfsuite/psuite --with-tclinclude=/usr/local/perfusite/tcl/include --with-tdom=/usr/local/perfsuite/tdom --enable-mpi F77=ifort MPICPPFLAGS=-I/home/jiangjie/install/mvapich2/include --disable-binutils --with-tclsh=/usr/local/perfsuite/tcl/bin/tclsh8.5 --enable-debug

My OpenMP test suite is NPB3.2.1/NPB3.2-OMP, and the compiler is ifort (version 10.1) with the --openmp option. Different experiments returned several different results:

(1) Experiment 1 (export OMP_NUM_THREADS=2)

[jiangjie@node1 bin]$ psrun -c /usr/local/perfsuite/psuite/share/perfsuite/xml/pshwpc/itimer.xml -p ./bt.A
libpsrun.c:181 : SIGPROF ignored on startup. Handler=0x1, flags=14000000
PerfSuite debugging enabled (debug level: PS_DEBUG_OFF)
[PID 14245] Library version: threaded
[PID 14245] Environment (entry of psrun_init)
[PID 14245] PSRUN_DOFORK = (null)
[PID 14245] LD_PRELOAD = libpsrun_r.so.0
[PID 14245] PSRUN_PID = 14245
[PID 14245] PS_HWPC_FILE = bt.A
00400000-004b2000 r-xp 00000000 08:01 60425445 /home/jiangjie/NPB3.2.1/NPB3.2-OMP/bin/bt.A
006b2000-006b7000 rwxp 000b2000 08:01 60425445 /home/jiangjie/NPB3.2.1/NPB3.2-OMP/bin/bt.A
006b7000-032fb000 rwxp 006b7000 00:00 0
10378000-10399000 rwxp 10378000 00:00 0
2aaaaaaab000-2aaaaaac5000 r-xp 00000000 08:01 44958009 /lib64/ld-2.5.so
2aaaaaac5000-2aaaaaac7000 rwxp 2aaaaaac5000 00:00 0
2aaaaacc4000-2aaaaacc5000 r-xp 00019000 08:01 44958009 /lib64/ld-2.5.so
2aaaaacc5000-2aaaaacc6000 rwxp 0001a000 08:01 44958009 /lib64/ld-2.5.so
2aaaaacc6000-2aaaaacca000 r-xp 00000000 08:01 60425291 /usr/local/perfsuite/psuite/lib/libpsrun_r.so.0.0.1
2aaaaacca000-2aaaaaec9000 ---p 00004000 08:01 60425291
/usr/local/perfsuite/psuite/lib/libpsrun_r.so.0.0.1 2aaaaaec9000-2aaaaaeca000 rwxp 00003000 08:01 60425291 /usr/local/perfsuite/psuite/lib/libpsrun_r.so.0.0.1 2aaaaaeca000-2aaaaaecb000 rwxp 2aaaaaeca000 00:00 0 2aaaaaef1000-2aaaaaf73000 r-xp 00000000 08:01 44957999 /lib64/libm-2.5.so 2aaaaaf73000-2aaaab172000 ---p 00082000 08:01 44957999 /lib64/libm-2.5.so 2aaaab172000-2aaaab173000 r-xp 00081000 08:01 44957999 /lib64/libm-2.5.so 2aaaab173000-2aaaab174000 rwxp 00082000 08:01 44957999 /lib64/libm-2.5.so 2aaaab174000-2aaaab1db000 r-xp 00000000 08:01 66789180 /opt/intel10.1/fce/10.1.017/lib/libguide.so 2aaaab1db000-2aaaab2db000 ---p 00067000 08:01 66789180 /opt/intel10.1/fce/10.1.017/lib/libguide.so 2aaaab2db000-2aaaab2e0000 rwxp 00067000 08:01 66789180 /opt/intel10.1/fce/10.1.017/lib/libguide.so 2aaaab2e0000-2aaaab2e8000 rwxp 2aaaab2e0000 00:00 0 2aaaab2e8000-2aaaab2fd000 r-xp 00000000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaab2fd000-2aaaab4fc000 ---p 00015000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaab4fc000-2aaaab4fd000 r-xp 00014000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaab4fd000-2aaaab4fe000 rwxp 00015000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaab4fe000-2aaaab502000 rwxp 2aaaab4fe000 00:00 0 2aaaab502000-2aaaab646000 r-xp 00000000 08:01 44958004 /lib64/libc-2.5.so 2aaaab646000-2aaaab846000 ---p 00144000 08:01 44958004 /lib64/libc-2.5.so 2aaaab846000-2aaaab84a000 r-xp 00144000 08:01 44958004 /lib64/libc-2.5.so 2aaaab84a000-2aaaab84b000 rwxp 00148000 08:01 44958004 /lib64/libc-2.5.so 2aaaab84b000-2aaaab850000 rwxp 2aaaab84b000 00:00 0 2aaaab850000-2aaaab85d000 r-xp 00000000 08:01 44958007 /lib64/libgcc_s-4.1.1-20070105.so.1 2aaaab85d000-2aaaaba5c000 ---p 0000d000 08:01 44958007 /lib64/libgcc_s-4.1.1-20070105.so.1 2aaaaba5c000-2aaaaba5d000 rwxp 0000c000 08:01 44958007 /lib64/libgcc_s-4.1.1-20070105.so.1 2aaaaba5d000-2aaaaba5e000 rwxp 2aaaaba5d000 00:00 0 2aaaaba5e000-2aaaaba60000 r-xp 00000000 08:01 44958023 /lib64/libdl-2.5.so 
2aaaaba60000-2aaaabc60000 ---p 00002000 08:01 44958023 /lib64/libdl-2.5.so 2aaaabc60000-2aaaabc61000 r-xp 00002000 08:01 44958023 /lib64/libdl-2.5.so 2aaaabc61000-2aaaabc62000 rwxp 00003000 08:01 44958023 /lib64/libdl-2.5.so 2aaaabc62000-2aaaabc6f000 r-xp 00000000 08:01 60425170 /usr/local/perfsuite/psuite/lib/libperfsuite_r.so.1.0.1 2aaaabc6f000-2aaaabe6e000 ---p 0000d000 08:01 60425170 /usr/local/perfsuite/psuite/lib/libperfsuite_r.so.1.0.1 2aaaabe6e000-2aaaabe6f000 rwxp 0000c000 08:01 60425170 /usr/local/perfsuite/psuite/lib/libperfsuite_r.so.1.0.1 2aaaabe6f000-2aaaabe71000 rwxp 2aaaabe6f000 00:00 0 2aaaabe71000-2aaaabe83000 r-xp 00000000 08:01 60425187 /usr/local/perfsuite/psuite/lib/libpshwpc_r.so.1.0.1 2aaaabe83000-2aaaac082000 ---p 00012000 08:01 60425187 /usr/local/perfsuite/psuite/lib/libpshwpc_r.so.1.0.1 2aaaac082000-2aaaac083000 rwxp 00011000 08:01 60425187 /usr/local/perfsuite/psuite/lib/libpshwpc_r.so.1.0.1 2aaaac083000-2aaaac084000 rwxp 2aaaac083000 00:00 0 2aaaac0aa000-2aaaac0ca000 r-xp 00000000 08:01 44958013 /lib64/libexpat.so.0.5.0 2aaaac0ca000-2aaaac2c9000 ---p 00020000 08:01 44958013 /lib64/libexpat.so.0.5.0 2aaaac2c9000-2aaaac2cc000 rwxp 0001f000 08:01 44958013 /lib64/libexpat.so.0.5.0 7fff65329000-7fff6533e000 rwxp 7fff65329000 00:00 0 [stack] ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0 [vdso] NAS Parallel Benchmarks (NPB3.2-OMP) - BT Benchmark No input file inputbt.data. Using compiled defaults Size: 64x 64x 64 Iterations: 200 dt: 0.000800 Number of available threads: 2 Thread created... Thread returned from hwpc_start with status 0 Thread created... Fatal profiling error: cannot access TSD data in handler. Please report this as an internal PerfSuite bug to per...@li.... 
Thread executing cleanup handler Thread returned from hwpc_stop with status 0 (2) experiment 2 (export OMP_NUM_THREADS=8) [jiangjie@node1 bin]$ psrun -c /usr/local/perfsuite/psuite/share/perfsuite/xml/pshwpc/itimer.xml -p ./cg.A libpsrun.c:181 : SIGPROF ignored on startup. Handler=0x1, flags=14000000 PerfSuite debugging enabled (debug level: PS_DEBUG_OFF) [PID 15326] Library version: threaded [PID 15326] Environment (entry of psrun_init) [PID 15326] PSRUN_DOFORK = (null) [PID 15326] LD_PRELOAD = libpsrun_r.so.0 [PID 15326] PSRUN_PID = 15326 [PID 15326] PS_HWPC_FILE = cg.A 00400000-00406000 r-xp 00000000 08:01 60425440 /home/jiangjie/NPB3.2.1/NPB3.2-OMP/bin/cg.A 00606000-00607000 rw-p 00006000 08:01 60425440 /home/jiangjie/NPB3.2.1/NPB3.2-OMP/bin/cg.A 00607000-036c9000 rw-p 00607000 00:00 0 19af3000-19b14000 rw-p 19af3000 00:00 0 2aaaaaaab000-2aaaaaac5000 r-xp 00000000 08:01 44958009 /lib64/ld-2.5.so 2aaaaaac5000-2aaaaaac7000 rw-p 2aaaaaac5000 00:00 0 2aaaaacc4000-2aaaaacc5000 r--p 00019000 08:01 44958009 /lib64/ld-2.5.so 2aaaaacc5000-2aaaaacc6000 rw-p 0001a000 08:01 44958009 /lib64/ld-2.5.so 2aaaaacc6000-2aaaaacca000 r-xp 00000000 08:01 60425291 /usr/local/perfsuite/psuite/lib/libpsrun_r.so.0.0.1 2aaaaacca000-2aaaaaec9000 ---p 00004000 08:01 60425291 /usr/local/perfsuite/psuite/lib/libpsrun_r.so.0.0.1 2aaaaaec9000-2aaaaaeca000 rw-p 00003000 08:01 60425291 /usr/local/perfsuite/psuite/lib/libpsrun_r.so.0.0.1 2aaaaaeca000-2aaaaaecb000 rw-p 2aaaaaeca000 00:00 0 2aaaaaef1000-2aaaaaf87000 r-xp 00000000 08:01 53419189 /usr/lib64/libgfortran.so.1.0.0 2aaaaaf87000-2aaaab186000 ---p 00096000 08:01 53419189 /usr/lib64/libgfortran.so.1.0.0 2aaaab186000-2aaaab188000 rw-p 00095000 08:01 53419189 /usr/lib64/libgfortran.so.1.0.0 2aaaab188000-2aaaab20a000 r-xp 00000000 08:01 44957999 /lib64/libm-2.5.so 2aaaab20a000-2aaaab409000 ---p 00082000 08:01 44957999 /lib64/libm-2.5.so 2aaaab409000-2aaaab40a000 r--p 00081000 08:01 44957999 /lib64/libm-2.5.so 2aaaab40a000-2aaaab40b000 rw-p 
00082000 08:01 44957999 /lib64/libm-2.5.so 2aaaab40b000-2aaaab40c000 rw-p 2aaaab40b000 00:00 0 2aaaab40c000-2aaaab413000 r-xp 00000000 08:01 53419101 /usr/lib64/libgomp.so.1.0.0 2aaaab413000-2aaaab612000 ---p 00007000 08:01 53419101 /usr/lib64/libgomp.so.1.0.0 2aaaab612000-2aaaab613000 rw-p 00006000 08:01 53419101 /usr/lib64/libgomp.so.1.0.0 2aaaab613000-2aaaab620000 r-xp 00000000 08:01 44958007 /lib64/libgcc_s-4.1.1-20070105.so.1 2aaaab620000-2aaaab81f000 ---p 0000d000 08:01 44958007 /lib64/libgcc_s-4.1.1-20070105.so.1 2aaaab81f000-2aaaab820000 rw-p 0000c000 08:01 44958007 /lib64/libgcc_s-4.1.1-20070105.so.1 2aaaab820000-2aaaab835000 r-xp 00000000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaab835000-2aaaaba34000 ---p 00015000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaaba34000-2aaaaba35000 r--p 00014000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaaba35000-2aaaaba36000 rw-p 00015000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaaba36000-2aaaaba3b000 rw-p 2aaaaba36000 00:00 0 2aaaaba3b000-2aaaabb7f000 r-xp 00000000 08:01 44958004 /lib64/libc-2.5.so 2aaaabb7f000-2aaaabd7f000 ---p 00144000 08:01 44958004 /lib64/libc-2.5.so 2aaaabd7f000-2aaaabd83000 r--p 00144000 08:01 44958004 /lib64/libc-2.5.so 2aaaabd83000-2aaaabd84000 rw-p 00148000 08:01 44958004 /lib64/libc-2.5.so 2aaaabd84000-2aaaabd89000 rw-p 2aaaabd84000 00:00 0 2aaaabd89000-2aaaabd96000 r-xp 00000000 08:01 60425170 /usr/local/perfsuite/psuite/lib/libperfsuite_r.so.1.0.1 2aaaabd96000-2aaaabf95000 ---p 0000d000 08:01 60425170 /usr/local/perfsuite/psuite/lib/libperfsuite_r.so.1.0.1 2aaaabf95000-2aaaabf96000 rw-p 0000c000 08:01 60425170 /usr/local/perfsuite/psuite/lib/libperfsuite_r.so.1.0.1 2aaaabf96000-2aaaabf98000 r-xp 00000000 08:01 44958023 /lib64/libdl-2.5.so 2aaaabf98000-2aaaac198000 ---p 00002000 08:01 44958023 /lib64/libdl-2.5.so 2aaaac198000-2aaaac199000 r--p 00002000 08:01 44958023 /lib64/libdl-2.5.so 2aaaac199000-2aaaac19a000 rw-p 00003000 08:01 44958023 /lib64/libdl-2.5.so 2aaaac19a000-2aaaac19b000 
rw-p 2aaaac19a000 00:00 0 2aaaac19b000-2aaaac1a2000 r-xp 00000000 08:01 44957732 /lib64/librt-2.5.so 2aaaac1a2000-2aaaac3a2000 ---p 00007000 08:01 44957732 /lib64/librt-2.5.so 2aaaac3a2000-2aaaac3a3000 r--p 00007000 08:01 44957732 /lib64/librt-2.5.so 2aaaac3a3000-2aaaac3a4000 rw-p 00008000 08:01 44957732 /lib64/librt-2.5.so 2aaaac3a4000-2aaaac3a6000 rw-p 2aaaac3a4000 00:00 0 2aaaac3a6000-2aaaac3b8000 r-xp 00000000 08:01 60425187 /usr/local/perfsuite/psuite/lib/libpshwpc_r.so.1.0.1 2aaaac3b8000-2aaaac5b7000 ---p 00012000 08:01 60425187 /usr/local/perfsuite/psuite/lib/libpshwpc_r.so.1.0.1 2aaaac5b7000-2aaaac5b8000 rw-p 00011000 08:01 60425187 /usr/local/perfsuite/psuite/lib/libpshwpc_r.so.1.0.1 2aaaac5b8000-2aaaac5b9000 rw-p 2aaaac5b8000 00:00 0 2aaaac5df000-2aaaac5ff000 r-xp 00000000 08:01 44958013 /lib64/libexpat.so.0.5.0 2aaaac5ff000-2aaaac7fe000 ---p 00020000 08:01 44958013 /lib64/libexpat.so.0.5.0 2aaaac7fe000-2aaaac801000 rw-p 0001f000 08:01 44958013 /lib64/libexpat.so.0.5.0 7fffc37a5000-7fffc37ba000 rw-p 7fffc37a5000 00:00 0 [stack] ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0 [vdso] NAS Parallel Benchmarks (NPB3.2-OMP) - CG Benchmark Size: 14000 Iterations: 15 Number of available threads: 8 Thread created... Thread created... Thread returned from hwpc_start with status 0 Thread returned from hwpc_start with status 0 Thread created... Thread created... Thread created... Thread created... Thread created... Fatal profiling error: cannot access TSD data in handler. Please report this as an internal PerfSuite bug to per...@li.... [PID 15326] Notice: caught SIGSEGV in wrap-up. (3) Experiment 3 (export OMP_NUM_THREADS=8) [jiangjie@node1 bin]$ psrun -c /usr/local/perfsuite/psuite/share/perfsuite/xml/pshwpc/itimer.xml -p ./cg.A libpsrun.c:181 : SIGPROF ignored on startup. 
Handler=0x1, flags=14000000 PerfSuite debugging enabled (debug level: PS_DEBUG_OFF) [PID 15784] Library version: threaded [PID 15784] Environment (entry of psrun_init) [PID 15784] PSRUN_DOFORK = (null) [PID 15784] LD_PRELOAD = libpsrun_r.so.0 [PID 15784] PSRUN_PID = 15784 [PID 15784] PS_HWPC_FILE = cg.A 00400000-004a7000 r-xp 00000000 08:01 60425440 /home/jiangjie/NPB3.2.1/NPB3.2-OMP/bin/cg.A 006a7000-006ae000 rwxp 000a7000 08:01 60425440 /home/jiangjie/NPB3.2.1/NPB3.2-OMP/bin/cg.A 006ae000-037a2000 rwxp 006ae000 00:00 0 15b6b000-15b8c000 rwxp 15b6b000 00:00 0 2aaaaaaab000-2aaaaaac5000 r-xp 00000000 08:01 44958009 /lib64/ld-2.5.so 2aaaaaac5000-2aaaaaac7000 rwxp 2aaaaaac5000 00:00 0 2aaaaacc4000-2aaaaacc5000 r-xp 00019000 08:01 44958009 /lib64/ld-2.5.so 2aaaaacc5000-2aaaaacc6000 rwxp 0001a000 08:01 44958009 /lib64/ld-2.5.so 2aaaaacc6000-2aaaaacca000 r-xp 00000000 08:01 60425291 /usr/local/perfsuite/psuite/lib/libpsrun_r.so.0.0.1 2aaaaacca000-2aaaaaec9000 ---p 00004000 08:01 60425291 /usr/local/perfsuite/psuite/lib/libpsrun_r.so.0.0.1 2aaaaaec9000-2aaaaaeca000 rwxp 00003000 08:01 60425291 /usr/local/perfsuite/psuite/lib/libpsrun_r.so.0.0.1 2aaaaaeca000-2aaaaaecb000 rwxp 2aaaaaeca000 00:00 0 2aaaaaef1000-2aaaaaf73000 r-xp 00000000 08:01 44957999 /lib64/libm-2.5.so 2aaaaaf73000-2aaaab172000 ---p 00082000 08:01 44957999 /lib64/libm-2.5.so 2aaaab172000-2aaaab173000 r-xp 00081000 08:01 44957999 /lib64/libm-2.5.so 2aaaab173000-2aaaab174000 rwxp 00082000 08:01 44957999 /lib64/libm-2.5.so 2aaaab174000-2aaaab1db000 r-xp 00000000 08:01 66789180 /opt/intel10.1/fce/10.1.017/lib/libguide.so 2aaaab1db000-2aaaab2db000 ---p 00067000 08:01 66789180 /opt/intel10.1/fce/10.1.017/lib/libguide.so 2aaaab2db000-2aaaab2e0000 rwxp 00067000 08:01 66789180 /opt/intel10.1/fce/10.1.017/lib/libguide.so 2aaaab2e0000-2aaaab2e8000 rwxp 2aaaab2e0000 00:00 0 2aaaab2e8000-2aaaab2fd000 r-xp 00000000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaab2fd000-2aaaab4fc000 ---p 00015000 08:01 44957988 
/lib64/libpthread-2.5.so 2aaaab4fc000-2aaaab4fd000 r-xp 00014000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaab4fd000-2aaaab4fe000 rwxp 00015000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaab4fe000-2aaaab502000 rwxp 2aaaab4fe000 00:00 0 2aaaab502000-2aaaab646000 r-xp 00000000 08:01 44958004 /lib64/libc-2.5.so 2aaaab646000-2aaaab846000 ---p 00144000 08:01 44958004 /lib64/libc-2.5.so 2aaaab846000-2aaaab84a000 r-xp 00144000 08:01 44958004 /lib64/libc-2.5.so 2aaaab84a000-2aaaab84b000 rwxp 00148000 08:01 44958004 /lib64/libc-2.5.so 2aaaab84b000-2aaaab850000 rwxp 2aaaab84b000 00:00 0 2aaaab850000-2aaaab85d000 r-xp 00000000 08:01 44958007 /lib64/libgcc_s-4.1.1-20070105.so.1 2aaaab85d000-2aaaaba5c000 ---p 0000d000 08:01 44958007 /lib64/libgcc_s-4.1.1-20070105.so.1 2aaaaba5c000-2aaaaba5d000 rwxp 0000c000 08:01 44958007 /lib64/libgcc_s-4.1.1-20070105.so.1 2aaaaba5d000-2aaaaba5e000 rwxp 2aaaaba5d000 00:00 0 2aaaaba5e000-2aaaaba60000 r-xp 00000000 08:01 44958023 /lib64/libdl-2.5.so 2aaaaba60000-2aaaabc60000 ---p 00002000 08:01 44958023 /lib64/libdl-2.5.so 2aaaabc60000-2aaaabc61000 r-xp 00002000 08:01 44958023 /lib64/libdl-2.5.so 2aaaabc61000-2aaaabc62000 rwxp 00003000 08:01 44958023 /lib64/libdl-2.5.so 2aaaabc62000-2aaaabc6f000 r-xp 00000000 08:01 60425170 /usr/local/perfsuite/psuite/lib/libperfsuite_r.so.1.0.1 2aaaabc6f000-2aaaabe6e000 ---p 0000d000 08:01 60425170 /usr/local/perfsuite/psuite/lib/libperfsuite_r.so.1.0.1 2aaaabe6e000-2aaaabe6f000 rwxp 0000c000 08:01 60425170 /usr/local/perfsuite/psuite/lib/libperfsuite_r.so.1.0.1 2aaaabe6f000-2aaaabe71000 rwxp 2aaaabe6f000 00:00 0 2aaaabe71000-2aaaabe83000 r-xp 00000000 08:01 60425187 /usr/local/perfsuite/psuite/lib/libpshwpc_r.so.1.0.1 2aaaabe83000-2aaaac082000 ---p 00012000 08:01 60425187 /usr/local/perfsuite/psuite/lib/libpshwpc_r.so.1.0.1 2aaaac082000-2aaaac083000 rwxp 00011000 08:01 60425187 /usr/local/perfsuite/psuite/lib/libpshwpc_r.so.1.0.1 2aaaac083000-2aaaac084000 rwxp 2aaaac083000 00:00 0 2aaaac0aa000-2aaaac0ca000 
r-xp 00000000 08:01 44958013 /lib64/libexpat.so.0.5.0 2aaaac0ca000-2aaaac2c9000 ---p 00020000 08:01 44958013 /lib64/libexpat.so.0.5.0 2aaaac2c9000-2aaaac2cc000 rwxp 0001f000 08:01 44958013 /lib64/libexpat.so.0.5.0 7fff47070000-7fff47085000 rwxp 7fff47070000 00:00 0 [stack] ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0 [vdso] NAS Parallel Benchmarks (NPB3.2-OMP) - CG Benchmark Size: 14000 Iterations: 15 Number of available threads: 8 Thread created... Thread created... Thread created... Thread created... Thread created... Fatal profiling error: cannot access TSD data in handler. Please report this as an internal PerfSuite bug to per...@li.... Thread returned from hwpc_start with status 0 Thread returned from hwpc_start with status 0 Thread created... Thread created... system error(35): __kmp_reap_monitor: monitor did not reap properly: Resource deadlock avoided OMP: Error #107: Fatal system error detected. Thread executing cleanup handler (4) Experiment 4 (export OMP_NUM_THREADS=8) [jiangjie@node1 bin]$ psrun -c /usr/local/perfsuite/psuite/share/perfsuite/xml/pshwpc/itimer.xml -p ./cg.A libpsrun.c:181 : SIGPROF ignored on startup. 
Handler=0x1, flags=14000000 PerfSuite debugging enabled (debug level: PS_DEBUG_OFF) [PID 15916] Library version: threaded [PID 15916] Environment (entry of psrun_init) [PID 15916] PSRUN_DOFORK = (null) [PID 15916] LD_PRELOAD = libpsrun_r.so.0 [PID 15916] PSRUN_PID = 15916 [PID 15916] PS_HWPC_FILE = cg.A 00400000-004a7000 r-xp 00000000 08:01 60425440 /home/jiangjie/NPB3.2.1/NPB3.2-OMP/bin/cg.A 006a7000-006ae000 rwxp 000a7000 08:01 60425440 /home/jiangjie/NPB3.2.1/NPB3.2-OMP/bin/cg.A 006ae000-037a2000 rwxp 006ae000 00:00 0 19c52000-19c73000 rwxp 19c52000 00:00 0 2aaaaaaab000-2aaaaaac5000 r-xp 00000000 08:01 44958009 /lib64/ld-2.5.so 2aaaaaac5000-2aaaaaac7000 rwxp 2aaaaaac5000 00:00 0 2aaaaacc4000-2aaaaacc5000 r-xp 00019000 08:01 44958009 /lib64/ld-2.5.so 2aaaaacc5000-2aaaaacc6000 rwxp 0001a000 08:01 44958009 /lib64/ld-2.5.so 2aaaaacc6000-2aaaaacca000 r-xp 00000000 08:01 60425291 /usr/local/perfsuite/psuite/lib/libpsrun_r.so.0.0.1 2aaaaacca000-2aaaaaec9000 ---p 00004000 08:01 60425291 /usr/local/perfsuite/psuite/lib/libpsrun_r.so.0.0.1 2aaaaaec9000-2aaaaaeca000 rwxp 00003000 08:01 60425291 /usr/local/perfsuite/psuite/lib/libpsrun_r.so.0.0.1 2aaaaaeca000-2aaaaaecb000 rwxp 2aaaaaeca000 00:00 0 2aaaaaef1000-2aaaaaf73000 r-xp 00000000 08:01 44957999 /lib64/libm-2.5.so 2aaaaaf73000-2aaaab172000 ---p 00082000 08:01 44957999 /lib64/libm-2.5.so 2aaaab172000-2aaaab173000 r-xp 00081000 08:01 44957999 /lib64/libm-2.5.so 2aaaab173000-2aaaab174000 rwxp 00082000 08:01 44957999 /lib64/libm-2.5.so 2aaaab174000-2aaaab1db000 r-xp 00000000 08:01 66789180 /opt/intel10.1/fce/10.1.017/lib/libguide.so 2aaaab1db000-2aaaab2db000 ---p 00067000 08:01 66789180 /opt/intel10.1/fce/10.1.017/lib/libguide.so 2aaaab2db000-2aaaab2e0000 rwxp 00067000 08:01 66789180 /opt/intel10.1/fce/10.1.017/lib/libguide.so 2aaaab2e0000-2aaaab2e8000 rwxp 2aaaab2e0000 00:00 0 2aaaab2e8000-2aaaab2fd000 r-xp 00000000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaab2fd000-2aaaab4fc000 ---p 00015000 08:01 44957988 
/lib64/libpthread-2.5.so 2aaaab4fc000-2aaaab4fd000 r-xp 00014000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaab4fd000-2aaaab4fe000 rwxp 00015000 08:01 44957988 /lib64/libpthread-2.5.so 2aaaab4fe000-2aaaab502000 rwxp 2aaaab4fe000 00:00 0 2aaaab502000-2aaaab646000 r-xp 00000000 08:01 44958004 /lib64/libc-2.5.so 2aaaab646000-2aaaab846000 ---p 00144000 08:01 44958004 /lib64/libc-2.5.so 2aaaab846000-2aaaab84a000 r-xp 00144000 08:01 44958004 /lib64/libc-2.5.so 2aaaab84a000-2aaaab84b000 rwxp 00148000 08:01 44958004 /lib64/libc-2.5.so 2aaaab84b000-2aaaab850000 rwxp 2aaaab84b000 00:00 0 2aaaab850000-2aaaab85d000 r-xp 00000000 08:01 44958007 /lib64/libgcc_s-4.1.1-20070105.so.1 2aaaab85d000-2aaaaba5c000 ---p 0000d000 08:01 44958007 /lib64/libgcc_s-4.1.1-20070105.so.1 2aaaaba5c000-2aaaaba5d000 rwxp 0000c000 08:01 44958007 /lib64/libgcc_s-4.1.1-20070105.so.1 2aaaaba5d000-2aaaaba5e000 rwxp 2aaaaba5d000 00:00 0 2aaaaba5e000-2aaaaba60000 r-xp 00000000 08:01 44958023 /lib64/libdl-2.5.so 2aaaaba60000-2aaaabc60000 ---p 00002000 08:01 44958023 /lib64/libdl-2.5.so 2aaaabc60000-2aaaabc61000 r-xp 00002000 08:01 44958023 /lib64/libdl-2.5.so 2aaaabc61000-2aaaabc62000 rwxp 00003000 08:01 44958023 /lib64/libdl-2.5.so 2aaaabc62000-2aaaabc6f000 r-xp 00000000 08:01 60425170 /usr/local/perfsuite/psuite/lib/libperfsuite_r.so.1.0.1 2aaaabc6f000-2aaaabe6e000 ---p 0000d000 08:01 60425170 /usr/local/perfsuite/psuite/lib/libperfsuite_r.so.1.0.1 2aaaabe6e000-2aaaabe6f000 rwxp 0000c000 08:01 60425170 /usr/local/perfsuite/psuite/lib/libperfsuite_r.so.1.0.1 2aaaabe6f000-2aaaabe71000 rwxp 2aaaabe6f000 00:00 0 2aaaabe71000-2aaaabe83000 r-xp 00000000 08:01 60425187 /usr/local/perfsuite/psuite/lib/libpshwpc_r.so.1.0.1 2aaaabe83000-2aaaac082000 ---p 00012000 08:01 60425187 /usr/local/perfsuite/psuite/lib/libpshwpc_r.so.1.0.1 2aaaac082000-2aaaac083000 rwxp 00011000 08:01 60425187 /usr/local/perfsuite/psuite/lib/libpshwpc_r.so.1.0.1 2aaaac083000-2aaaac084000 rwxp 2aaaac083000 00:00 0 2aaaac0aa000-2aaaac0ca000 
r-xp 00000000 08:01 44958013 /lib64/libexpat.so.0.5.0 2aaaac0ca000-2aaaac2c9000 ---p 00020000 08:01 44958013 /lib64/libexpat.so.0.5.0 2aaaac2c9000-2aaaac2cc000 rwxp 0001f000 08:01 44958013 /lib64/libexpat.so.0.5.0 7fff36965000-7fff3697b000 rwxp 7fff36965000 00:00 0 [stack] ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0 [vdso] NAS Parallel Benchmarks (NPB3.2-OMP) - CG Benchmark Size: 14000 Iterations: 15 Number of available threads: 8 Thread created... Thread created... Thread created... Thread created... Thread created... Thread created... Thread created... Fatal profiling error: cannot access TSD data in handler. Please report this as an internal PerfSuite bug to per...@li.... Thread created... Fatal profiling error: cannot access TSD data in handler. Please report this as an internal PerfSuite bug to per...@li.... libpsrun fatal error: calling sequence not allowed Strangely, psrun sometimes works correctly. Some of the aforementioned problems also occur on my x86/Linux platform. We also notice that the message "Fatal profiling error: cannot access TSD data in handler." always appears. Note that for all serial and MPI NPB benchmarks, PerfSuite works well. Any suggestions? Regards, Jie Jiang |
From: Rick K. <rk...@il...> - 2009-08-20 14:23:33
|
Nagaraju, Regarding installation of PerfSuite: please read the INSTALL file that is provided in the top level of the distribution. This contains instructions for compiling PerfSuite. If you have difficulties after doing this, then you can direct questions to this list. Regarding perfctr: again, review the instructions provided with that software. There is a perfctr mailing list to which questions may be directed, and occasionally some information is given via the PAPI mailing list. Rick ----- Original Message ----- From: "Nagaraju venkata Buddarapu" <ca...@gm...> To: per...@li... Sent: Thursday, August 20, 2009 9:14:04 AM GMT -06:00 US/Canada Central Subject: [PerfSuite-users] Perfsuite Installation Can somebody give an overview of PerfSuite installation? Please also let me know how to patch perfctr into kernel 2.6.28. _______________________________________________ PerfSuite-users mailing list Per...@li... https://lists.sourceforge.net/lists/listinfo/perfsuite-users |
From: Nagaraju v. B. <ca...@gm...> - 2009-08-20 14:14:20
|
Can somebody give an overview of PerfSuite installation? Please also let me know how to patch perfctr into kernel 2.6.28. |
From: Rick K. <rk...@il...> - 2009-08-19 21:11:45
|
Cheng, Well, I think the version number explains the negative sample counts. Version 0.6.2a6 dates from 2006, and the fix I mentioned earlier was checked in during September 2007, so you would need a release later than that to pick up the fix (but the overflow issue would still be present). I am not sure what to make of the change to the libc offset; the offset is computed from what is present in the load map for the process. I will think about it, and if I have any insight I will follow up. Rick Cheng Liao wrote: > Rick, > > Here is my 'psinv -V' output for the particular machine on which I am > seeing the negative numbers of > sampling points. > > [cliao@maui1 bin]$ psinv -V > -------------------------------------------- > PerfSuite 0.6.2a6 > psinv 0.6 > University of Illinois/NCSA > Open Source License > http://perfsuite.ncsa.uiuc.edu/ > http://perfsuite.sourceforge.net/ > -------------------------------------------- > I also have PerfSuite 0.6.2 installed on different machines, but > haven't run any large job on those machines yet. > > To track down the incorrect libc offset on Linux, I simply pored through > the xml file and ran addr2line manually > for the samples that were pertinent to the executable to see if I got > a lot of ??. No luck there, but I was able to > conclude the libc offset should be zero after I did a 'nm libc.so.6' > and saw pretty similar addresses from the nm > output and the xml file. > > The software company I am working for also has a large PC/Windows > customer base... > > Thanks, > Cheng > > |
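Rick's remark that the offset is computed from the process load map can be illustrated with a small sketch. This is not PerfSuite code. It is a minimal Python illustration, assuming the Linux `/proc/<pid>/maps` line format, of how a sampled PC might be matched to a module and turned into a module-relative offset. The names `Mapping`, `parse_maps_line` and `resolve_pc` are hypothetical, and the sample PC is made up for the example.

```python
from typing import NamedTuple, Optional


class Mapping(NamedTuple):
    start: int        # first address of the mapping
    end: int          # one past the last address
    perms: str        # e.g. "r-xp"
    file_offset: int  # offset into the backing file
    path: str         # backing file, or "" for anonymous mappings


def parse_maps_line(line: str) -> Mapping:
    """Parse one line in /proc/<pid>/maps format."""
    fields = line.split(None, 5)
    start_s, end_s = fields[0].split("-")
    path = fields[5].strip() if len(fields) > 5 else ""
    return Mapping(int(start_s, 16), int(end_s, 16), fields[1],
                   int(fields[2], 16), path)


def resolve_pc(pc: int, mappings: list) -> Optional[tuple]:
    """Return (module path, offset within the file) for an executable mapping."""
    for m in mappings:
        if m.start <= pc < m.end and "x" in m.perms:
            return m.path, pc - m.start + m.file_offset
    return None


# Two lines in the same format as a Linux load map.
MAPS = """\
00400000-004a7000 r-xp 00000000 08:01 60425440 /home/jiangjie/NPB3.2.1/NPB3.2-OMP/bin/cg.A
2aaaab502000-2aaaab646000 r-xp 00000000 08:01 44958004 /lib64/libc-2.5.so
"""
mappings = [parse_maps_line(line) for line in MAPS.splitlines()]

# Hypothetical PC that falls 0x1234 bytes into the libc text mapping.
print(resolve_pc(0x2AAAAB502000 + 0x1234, mappings))
```

Whether a symbol tool should be given the absolute PC or the module-relative offset depends on how the module was linked and loaded. That may be related to the libc behaviour Cheng describes, but this is an assumption and not a diagnosis.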
From: Cheng L. <cyl...@gm...> - 2009-08-19 20:43:52
|
Rick, Here is my 'psinv -V' output for the particular machine on which I am seeing the negative numbers of sampling points. [cliao@maui1 bin]$ psinv -V -------------------------------------------- PerfSuite 0.6.2a6 psinv 0.6 University of Illinois/NCSA Open Source License http://perfsuite.ncsa.uiuc.edu/ http://perfsuite.sourceforge.net/ -------------------------------------------- I also have PerfSuite 0.6.2 installed on different machines, but haven't run any large job on those machines yet. To track down the incorrect libc offset on Linux, I simply pored through the xml file and ran addr2line manually for the samples that were pertinent to the executable to see if I got a lot of ??. No luck there, but I was able to conclude the libc offset should be zero after I did a 'nm libc.so.6' and saw pretty similar addresses from the nm output and the xml file. The software company I am working for also has a large PC/Windows customer base... Thanks, Cheng On Wed, Aug 19, 2009 at 12:25 PM, Rick Kufrin <rk...@il...> wrote: > Cheng, > > These are some interesting observations and questions. I have a few > comments, but first one question: which release of PerfSuite are you > running? To find out, do "psinv -V", and it will appear at the top. The > reason why I ask this is that the appearance of negative sample counts is a > problem that I believe was addressed already, a fix checked in quite a while > ago. If the release you are using is after that fix, then apparently we > still have an issue (and I thank you for your report in advance). > > Second, I am glad you resolved (apparently) an incorrect offset, but I am > curious to know how you verify that this provided a fix. It is not unusual > to see question marks (PCs that cannot be mapped to source locations). > > Regarding the sampling bin size: that is a fixed size, and it is currently > the size of an unsigned short, which should allow for up to 64K samples to > map to any one bin. 
After that, overflow would occur and you would > effectively lose samples and get an incorrect profile. The only solution at > present, unfortunately and as you suggest, is to increase the sampling > period to reduce overall sample counts. > > Rick > > > Cheng Liao wrote: > >> Rick, >> The large ?? entry in my profile is caused by an incorrect offset for >> libc.so.6 base >> address. I need to manually set the libc offset to zero in the xml file, >> then run psprocess, to >> fix the problem. >> A new problem I have encountered is that for large jobs, the bin for the >> number of sampling points would overflow: >> <sample pc="59d770">27665</sample> >> <sample pc="59d772">26940</sample> >> <sample pc="59d776">27564</sample> >> <sample pc="59d77a">-31502</sample> >> <---------- negative >> <sample pc="59d77e">11502</sample> >> <sample pc="59d782">5323</sample> >> <sample pc="59d78a">21065</sample> >> <sample pc="59d78e">32422</sample> >> <sample pc="59d790">27709</sample> >> <sample pc="59d794">-30384</sample> >> <------------- negative >> <sample pc="59d798">24312</sample> >> Does PerfSuite support a larger sampling bin, or is using a higher >> profiling overflow threshold the only solution? >> Thanks, >> Cheng >> >> > |
From: Rick K. <rk...@il...> - 2009-08-19 19:26:33
|
Cheng, These are some interesting observations and questions. I have a few comments, but first one question: which release of PerfSuite are you running? To find out, do "psinv -V", and it will appear at the top. The reason why I ask this is that the appearance of negative sample counts is a problem that I believe was addressed already, a fix checked in quite a while ago. If the release you are using is after that fix, then apparently we still have an issue (and I thank you for your report in advance). Second, I am glad you resolved (apparently) an incorrect offset, but I am curious to know how you verify that this provided a fix. It is not unusual to see question marks (PCs that cannot be mapped to source locations). Regarding the sampling bin size: that is a fixed size, and it is currently the size of an unsigned short, which should allow for up to 64K samples to map to any one bin. After that, overflow would occur and you would effectively lose samples and get an incorrect profile. The only solution at present, unfortunately and as you suggest, is to increase the sampling period to reduce overall sample counts. Rick Cheng Liao wrote: > Rick, > > The large ?? entry in my profile is caused by an incorrect offset for > libc.so.6 base > address. I need to manually set the libc offset to zero in the xml > file, then run psprocess, to > fix the problem. 
> > A new problem I have encountered is that for large jobs, the bin for the > number of sampling points would overflow: > > <sample pc="59d770">27665</sample> > <sample pc="59d772">26940</sample> > <sample pc="59d776">27564</sample> > <sample pc="59d77a">-31502</sample> > <---------- negative > <sample pc="59d77e">11502</sample> > <sample pc="59d782">5323</sample> > <sample pc="59d78a">21065</sample> > <sample pc="59d78e">32422</sample> > <sample pc="59d790">27709</sample> > <sample pc="59d794">-30384</sample> > <------------- negative > <sample pc="59d798">24312</sample> > Does PerfSuite support a larger sampling bin, or is using a higher > profiling overflow threshold the only solution? > > Thanks, > Cheng > |
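The negative counts in the quoted XML are consistent with a 16-bit bin being printed as a signed value, which matches Rick's description of the bin as an unsigned short. The following Python sketch is illustrative only and is not part of PerfSuite. It scans a fragment of `<sample>` elements for negative counts and reinterprets them as unsigned 16-bit values. The `<samples>` wrapper element and the function names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

BIN_MODULUS = 1 << 16  # number of distinct values a 16-bit bin can hold


def as_unsigned16(value: int) -> int:
    """Reinterpret a count printed as signed 16-bit as unsigned 16-bit."""
    return value % BIN_MODULUS


# Three of the samples quoted in the thread, wrapped in a hypothetical
# <samples> root element so the fragment parses on its own.
FRAGMENT = """
<samples>
  <sample pc="59d776">27564</sample>
  <sample pc="59d77a">-31502</sample>
  <sample pc="59d794">-30384</sample>
</samples>
"""


def negative_bins(xml_text: str) -> dict:
    """Map the PC of every negative bin to its unsigned 16-bit reading."""
    root = ET.fromstring(xml_text.strip())
    fixed = {}
    for sample in root.iter("sample"):
        count = int(sample.text)
        if count < 0:
            fixed[sample.get("pc")] = as_unsigned16(count)
    return fixed


print(negative_bins(FRAGMENT))  # {'59d77a': 34034, '59d794': 35152}
```

This only recovers counts that exceeded 32767 but stayed below 65536. If a bin actually wrapped past the 16-bit limit, the lost samples cannot be reconstructed from the file, which is why a longer sampling period is suggested in the thread.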
From: Cheng L. <cyl...@gm...> - 2009-08-19 16:24:58
|
Rick, The large ?? entry in my profile is caused by an incorrect offset for the libc.so.6 base address. I need to manually set the libc offset to zero in the xml file, then run psprocess, to fix the problem. A new problem I have encountered is that for large jobs, the bin for the number of sampling points would overflow: <sample pc="59d770">27665</sample> <sample pc="59d772">26940</sample> <sample pc="59d776">27564</sample> <sample pc="59d77a">-31502</sample> <---------- negative <sample pc="59d77e">11502</sample> <sample pc="59d782">5323</sample> <sample pc="59d78a">21065</sample> <sample pc="59d78e">32422</sample> <sample pc="59d790">27709</sample> <sample pc="59d794">-30384</sample> <------------- negative <sample pc="59d798">24312</sample> Does PerfSuite support a larger sampling bin, or is using a higher profiling overflow threshold the only solution? Thanks, Cheng On Wed, Jul 22, 2009 at 5:30 AM, Rick Kufrin <rk...@il...> wrote: > Cheng, > > Thanks for the followup and further info. I'm glad to hear that you are > getting better results with the itimer.xml configuration versus profil.xml. > I have seen problems with profil.xml before, which uses the C library > profil() functionality, but they had seemed to be isolated to Itanium > platforms. Regardless, this is something we will note and see if we can > reproduce. > > Thanks again for the info - I am copying the SourceForge list for > archival/status purposes. > > Rick > > ----- Original Message ----- > From: "Cheng Liao" <cyl...@gm...> > To: "Rick Kufrin" <rk...@il...> > Sent: Tuesday, July 21, 2009 11:30:08 PM GMT -06:00 US/Canada Central > Subject: Re: [PerfSuite-users] psrun profile doesn't show any sampling > point > > Rick, > > Thought I would follow up. > > Running longer jobs helps, since not all CPU time would be reported by > profiling > samples. However, my real problem is I need to use itimer.xml instead of > profil.xml for psrun. For some reason profil.xml would show empty. 
Maybe I > have > messed up my build somewhere? > > Now, everything looks great, except sometimes a significant chunk of the > sampling points cannot find the function names and would show ?? on > the output. > > Thanks again, > Cheng > > > On Mon, Jul 20, 2009 at 1:28 PM, Cheng Liao < cyl...@gm... > wrote: > > > > Hi Rick, > > Thanks for your quick reply. Indeed the sample output I showed was from a > Nehalem machine. However, I saw the same problem on some Wolfdale and > Harpertown systems too. > > So far, I have tried only a 0-second and a 6-second job. Will make some > longer runs. > > Cheng > > > > > > On Mon, Jul 20, 2009 at 12:50 PM, Rick Kufrin < rk...@il... > > wrote: > > > Cheng, > > There are two issues that I can think of that you might be experiencing. > The first is simply that the program may not run long enough to generate > profiling samples. However, I am also wondering (based on the processor > brand string listed in the output) if you are working with an Intel Nehalem > CPU. If so, there are known problems with the most recent release of > PerfSuite on that processor. We have finished adding the support for that > system, but have not yet placed a new software release on SourceForge or at > NCSA. We hope to do so in the very near future, though. > > Sorry for the difficulties, please let me know if you also suspect this may > be what is affecting you. > > Rick > > Cheng Liao wrote: > > > > > > I just installed PerfSuite on some Linux x86_64 machines (only one with > PAPI and perfctr). > However, even though 'make -s check' outputs looked fine, all the psprocess > outputs would > be 'empty' and show no profile timing info. Is this something that can be > easily fixed? 
> Thanks, > Cheng > PerfSuite Hardware Performance Summary Report > > Version : 1.0 > Created : Mon Jul 20 11:20:26 AM PDT 2009 > Generator : psprocess 0.3 > XML Source : ls.15381.em64td.xml > > Execution Information > > ============================================================================================ > Collector : libpshwpc > Date : Sat Jul 18 17:31:54 2009 > Host : em64td > User : cliao > Command : ls > > Processor and System Information > > ============================================================================================ > Node CPUs : 8 > Vendor : Intel > Family : Pentium Pro (P6) > Brand : Intel(R) Xeon(R) CPU X5570 @ 2.93GHz > CPU Revision : 4 > Clock (MHz) : 2933.589 > Memory (MB) : 24096.66 > Pagesize (KB) : 4 > > Cache Information > > ============================================================================================ > Cache levels : 0 > > Profile Information > > ============================================================================================ > Class : profil > Version : 2.5 > Event : milliseconds > Period : 10 > Samples : 0 > Domain : user > Run Time : 0.00 (seconds) > Min Self % : (all) > > Module Summary > > -------------------------------------------------------------------------------- > Samples Self % Total % Module > > > File Summary > > -------------------------------------------------------------------------------- > Samples Self % Total % File > > > Function Summary > > -------------------------------------------------------------------------------- > Samples Self % Total % Function > > > Function:File:Line Summary > > -------------------------------------------------------------------------------- > Samples Self % Total % Function:File:Line > > ------------------------------------------------------------------------ > > > ------------------------------------------------------------------------------ > Enter the BlackBerry Developer Challenge This is your chance to win up to > $100,000 in prizes! 
For a limited time, vendors submitting new applications > to BlackBerry App World(TM) will have > the opportunity to enter the BlackBerry Developer Challenge. See full prize > details at: http://p.sf.net/sfu/Challenge > ------------------------------------------------------------------------ > > _______________________________________________ > PerfSuite-users mailing list > Per...@li... > https://lists.sourceforge.net/lists/listinfo/perfsuite-users > > > > > |
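The sample counts Cheng reports never exceed 32767 and occasionally turn negative, which is consistent with (though not proof of) 16-bit bins being printed as signed values. The sketch below works under that assumption: it parses `<sample>` elements like the ones in the message and reinterprets negative counts as unsigned 16-bit values. The names `unwrap` and `parse_samples` are illustrative and are not part of PerfSuite.

```python
import re

# Assumption: each bin is a 16-bit counter whose value was printed as signed.
BIN_BITS = 16

SAMPLE_RE = re.compile(r'<sample pc="([0-9a-f]+)">(-?\d+)</sample>')

def unwrap(count, bits=BIN_BITS):
    """Reinterpret a negative signed value as the unsigned value of the same width."""
    return count + (1 << bits) if count < 0 else count

def parse_samples(xml_text):
    """Return {pc: unwrapped_count} for every <sample> element in xml_text."""
    return {int(pc, 16): unwrap(int(n)) for pc, n in SAMPLE_RE.findall(xml_text)}

# Three of the lines quoted in the message above.
xml_text = '''
<sample pc="59d776">27564</sample>
<sample pc="59d77a">-31502</sample>
<sample pc="59d794">-30384</sample>
'''

for pc, n in parse_samples(xml_text).items():
    print(f"{pc:x} {n}")
```

This only recovers counts between 32768 and 65535. A bin that received more than 65535 samples would have wrapped completely and cannot be reconstructed from the file, so a longer sampling period remains the safer choice for long jobs.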
From: <rk...@il...> - 2009-07-30 15:52:22
|
Jie,

The extra output document you received when monitoring an OpenMP program is most likely due to the OpenMP runtime creating an additional monitoring thread, which PerfSuite tracks just like any other thread. This is mentioned in the "BUGS" file in the PerfSuite distribution (although, strictly speaking, I do not think it falls into the category of "bugs"). When doing your performance analysis, it is best to simply disregard the data associated with the extra thread.

On the other question you sent (regarding the lack of libiberty shared libraries), unfortunately there is currently no solution that I would consider "elegant" other than running configure with --disable-binutils. One can build a shared library version of libiberty from source, but that is additional work that seems difficult for most users. It might be possible to fix things within PerfSuite by hand-editing the Autotools-generated Makefiles, but I do not think that is something people should have to do. We have been working towards a better approach to handling the binutils libraries, but it is not included in current releases, so unfortunately I can only offer this workaround.

Rick

----- Original Message -----
From: jj...@nu...
To: rk...@il...
Cc: per...@li...
Sent: Thursday, July 30, 2009 10:03:14 AM GMT -06:00 US/Canada Central
Subject: Monitoring OpenMP program using Perfsuite

Rick,

There is a strange problem when monitoring an OpenMP program with psrun. The application is the NPB3.3-OMP/IS benchmark.

export OMP_NUM_THREADS=2
psrun -p ./is.A

After the program terminates, I get three XML output files:

is.A.0.22016.amd.xml
is.A.1.22016.amd.xml
is.A.2.22016.amd.xml

However, in the is.A.1.xx.xx.xml file, all fields except "Wall clock time (seconds)" are zero, and its "Wall clock time" field equals the same field in the other two XML output files.

Since I specified the total number of threads to be two, why does psrun produce three output files?

Jie Jiang |
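Rick's advice to disregard the extra thread can be applied mechanically once the counter values of each output file have been loaded. The sketch below assumes the values are already available as plain dictionaries; it does not parse PerfSuite XML, and the key `wall_clock_seconds` and the function name `drop_idle_reports` are hypothetical. The numbers are made up for illustration and do not come from a real run.

```python
def drop_idle_reports(reports, ignore=("wall_clock_seconds",)):
    """Keep only reports in which some counter outside `ignore` is non-zero."""
    kept = {}
    for filename, counters in reports.items():
        active = any(value != 0
                     for name, value in counters.items()
                     if name not in ignore)
        if active:
            kept[filename] = counters
    return kept

# Illustrative values only; the file names follow the pattern in Jie's report.
reports = {
    "is.A.0.22016.amd.xml": {"wall_clock_seconds": 1.5, "PAPI_TOT_CYC": 1200, "PAPI_FP_OPS": 300},
    "is.A.1.22016.amd.xml": {"wall_clock_seconds": 1.5, "PAPI_TOT_CYC": 0, "PAPI_FP_OPS": 0},
    "is.A.2.22016.amd.xml": {"wall_clock_seconds": 1.5, "PAPI_TOT_CYC": 1100, "PAPI_FP_OPS": 280},
}

print(sorted(drop_idle_reports(reports)))
```

Wall clock time is excluded from the test because, as Jie observed, the idle thread still reports the same wall clock time as the worker threads.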
From: Rick K. <rk...@il...> - 2009-07-28 15:13:25
|
Jie,

This is a common problem that arises from mixing static and shared code/libraries. At the end of your output, you will find a clue that this is what is going wrong:

> /usr/bin/ld: /usr/lib/../lib64/libbfd.a(bfd.o): relocation R_X86_64_32 against
> `bfd_section_hash_newfunc' can not be used when making a shared object; recompile
> with -fPIC
> /usr/lib/../lib64/libbfd.a: could not read symbols: Bad value

"libbfd.a" is a static library that the linker is probably using because the shared library version, "libbfd.so", may not be present on your machine. Do:

$ ls /usr/lib64/libbfd*

to see which libraries are on your machine. If there is no "libbfd.so" and you have root privileges, you can probably address this problem by creating a symbolic link named "/usr/lib64/libbfd.so" that points to "/usr/lib64/libbfd*.so". If this is not possible, a current workaround is to reconfigure PerfSuite with the flag "--disable-binutils", then run "make clean" and "make" again.

We are in the process of providing a more flexible way of dealing with this in the next release, but for now one of these approaches may help you out.

Rick |
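The check Rick suggests with `ls /usr/lib64/libbfd*` can also be scripted. The following is a minimal sketch, assuming the libraries live in `/usr/lib64` as in Jie's log; the function name `classify` is illustrative. It separates static archives from shared objects and reports whether the unversioned `libbfd.so` name that `-lbfd` looks for is present.

```python
import glob
import os

def classify(paths, stem="libbfd"):
    """Return (static archives, shared objects, whether '<stem>.so' exists)."""
    names = [os.path.basename(p) for p in paths]
    static = sorted(n for n in names if n.endswith(".a"))
    shared = sorted(n for n in names if n.endswith(".so") or ".so." in n)
    return static, shared, f"{stem}.so" in names

# On a machine without /usr/lib64 the glob is simply empty.
static, shared, has_link = classify(glob.glob("/usr/lib64/libbfd*"))
print("static archives:", static)
print("shared objects: ", shared)
if shared and not has_link:
    print("no unversioned libbfd.so; -lbfd may resolve to the static archive")
```

If the last message is printed, the symbolic link described in the reply is the likely fix; if no shared object is listed at all, only the `--disable-binutils` workaround applies.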
From: <jj...@nu...> - 2009-07-28 05:05:09
|
Hi,

I have recently been trying to install PerfSuite on an SMP machine with Intel Xeon E5450 processors running Linux (RHEL5).

The supporting software packages (kernel patch, perfctr library, PAPI, Tcl/Tk, tDOM, Expat) have been installed.

When compiling PerfSuite-0.6.2, I got the following error message:

.....
.....
Making all in tcllib
Making all in psutils
Making all in bfd
/bin/sh ../../../../libtool --tag=CC --mode=compile gcc -DHAVE_CONFIG_H -I. -I../../../.. -I/usr/local/perfsuite/tcl/include -DUSE_TCL_STUBS -g -O2 -MT libpsbfd_la-Bfd_control.lo -MD -MP -MF .deps/libpsbfd_la-Bfd_control.Tpo -c -o libpsbfd_la-Bfd_control.lo `test -f 'Bfd_control.c' || echo './'`Bfd_control.c
mkdir .libs
gcc -DHAVE_CONFIG_H -I. -I../../../.. -I/usr/local/perfsuite/tcl/include -DUSE_TCL_STUBS -g -O2 -MT libpsbfd_la-Bfd_control.lo -MD -MP -MF .deps/libpsbfd_la-Bfd_control.Tpo -c Bfd_control.c -fPIC -DPIC -o .libs/libpsbfd_la-Bfd_control.o
gcc -DHAVE_CONFIG_H -I. -I../../../.. -I/usr/local/perfsuite/tcl/include -DUSE_TCL_STUBS -g -O2 -MT libpsbfd_la-Bfd_control.lo -MD -MP -MF .deps/libpsbfd_la-Bfd_control.Tpo -c Bfd_control.c -o libpsbfd_la-Bfd_control.o >/dev/null 2>&1
mv -f .deps/libpsbfd_la-Bfd_control.Tpo .deps/libpsbfd_la-Bfd_control.Plo
/bin/sh ../../../../libtool --tag=CC --mode=compile gcc -DHAVE_CONFIG_H -I. -I../../../.. -I/usr/local/perfsuite/tcl/include -DUSE_TCL_STUBS -g -O2 -MT libpsbfd_la-Bfd_init.lo -MD -MP -MF .deps/libpsbfd_la-Bfd_init.Tpo -c -o libpsbfd_la-Bfd_init.lo `test -f 'Bfd_init.c' || echo './'`Bfd_init.c
gcc -DHAVE_CONFIG_H -I. -I../../../.. -I/usr/local/perfsuite/tcl/include -DUSE_TCL_STUBS -g -O2 -MT libpsbfd_la-Bfd_init.lo -MD -MP -MF .deps/libpsbfd_la-Bfd_init.Tpo -c Bfd_init.c -fPIC -DPIC -o .libs/libpsbfd_la-Bfd_init.o
gcc -DHAVE_CONFIG_H -I. -I../../../.. -I/usr/local/perfsuite/tcl/include -DUSE_TCL_STUBS -g -O2 -MT libpsbfd_la-Bfd_init.lo -MD -MP -MF .deps/libpsbfd_la-Bfd_init.Tpo -c Bfd_init.c -o libpsbfd_la-Bfd_init.o >/dev/null 2>&1
mv -f .deps/libpsbfd_la-Bfd_init.Tpo .deps/libpsbfd_la-Bfd_init.Plo
/bin/sh ../../../../libtool --tag=CC --mode=compile gcc -DHAVE_CONFIG_H -I. -I../../../.. -I/usr/local/perfsuite/tcl/include -DUSE_TCL_STUBS -g -O2 -MT libpsbfd_la-Bfd_inquire.lo -MD -MP -MF .deps/libpsbfd_la-Bfd_inquire.Tpo -c -o libpsbfd_la-Bfd_inquire.lo `test -f 'Bfd_inquire.c' || echo './'`Bfd_inquire.c
gcc -DHAVE_CONFIG_H -I. -I../../../.. -I/usr/local/perfsuite/tcl/include -DUSE_TCL_STUBS -g -O2 -MT libpsbfd_la-Bfd_inquire.lo -MD -MP -MF .deps/libpsbfd_la-Bfd_inquire.Tpo -c Bfd_inquire.c -fPIC -DPIC -o .libs/libpsbfd_la-Bfd_inquire.o
gcc -DHAVE_CONFIG_H -I. -I../../../.. -I/usr/local/perfsuite/tcl/include -DUSE_TCL_STUBS -g -O2 -MT libpsbfd_la-Bfd_inquire.lo -MD -MP -MF .deps/libpsbfd_la-Bfd_inquire.Tpo -c Bfd_inquire.c -o libpsbfd_la-Bfd_inquire.o >/dev/null 2>&1
mv -f .deps/libpsbfd_la-Bfd_inquire.Tpo .deps/libpsbfd_la-Bfd_inquire.Plo
/bin/sh ../../../../libtool --tag=CC --mode=compile gcc -DHAVE_CONFIG_H -I. -I../../../.. -I/usr/local/perfsuite/tcl/include -DUSE_TCL_STUBS -g -O2 -MT libpsbfd_la-Bfd_lookup.lo -MD -MP -MF .deps/libpsbfd_la-Bfd_lookup.Tpo -c -o libpsbfd_la-Bfd_lookup.lo `test -f 'Bfd_lookup.c' || echo './'`Bfd_lookup.c
gcc -DHAVE_CONFIG_H -I. -I../../../.. -I/usr/local/perfsuite/tcl/include -DUSE_TCL_STUBS -g -O2 -MT libpsbfd_la-Bfd_lookup.lo -MD -MP -MF .deps/libpsbfd_la-Bfd_lookup.Tpo -c Bfd_lookup.c -fPIC -DPIC -o .libs/libpsbfd_la-Bfd_lookup.o
gcc -DHAVE_CONFIG_H -I. -I../../../.. -I/usr/local/perfsuite/tcl/include -DUSE_TCL_STUBS -g -O2 -MT libpsbfd_la-Bfd_lookup.lo -MD -MP -MF .deps/libpsbfd_la-Bfd_lookup.Tpo -c Bfd_lookup.c -o libpsbfd_la-Bfd_lookup.o >/dev/null 2>&1
mv -f .deps/libpsbfd_la-Bfd_lookup.Tpo .deps/libpsbfd_la-Bfd_lookup.Plo
/bin/sh ../../../../libtool --tag=CC --mode=link gcc -g -O2 -lbfd -liberty -version-info 1:0:0 -o libpsbfd.la -rpath /usr/local/perfsuite/ps/lib/psbfd0.2 libpsbfd_la-Bfd_control.lo libpsbfd_la-Bfd_init.lo libpsbfd_la-Bfd_inquire.lo libpsbfd_la-Bfd_lookup.lo
gcc -shared .libs/libpsbfd_la-Bfd_control.o .libs/libpsbfd_la-Bfd_init.o .libs/libpsbfd_la-Bfd_inquire.o .libs/libpsbfd_la-Bfd_lookup.o -lbfd -liberty -Wl,-soname -Wl,libpsbfd.so.1 -o .libs/libpsbfd.so.1.0.0
/usr/bin/ld: /usr/lib/../lib64/libbfd.a(bfd.o): relocation R_X86_64_32 against `bfd_section_hash_newfunc' can not be used when making a shared object; recompile with -fPIC
/usr/lib/../lib64/libbfd.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
make[5]: *** [libpsbfd.la] Error 1
make[4]: *** [all-recursive] Error 1
make[3]: *** [all-recursive] Error 1
make[2]: *** [all-recursive] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2

BFD comes from the RHEL5 binutils-devel package (binutils-devel-2.17.50.0.6-9.el5.x86_64.rpm).

How can I solve this problem?

Regards,
Jie Jiang |
From: Rick K. <rk...@il...> - 2009-07-22 12:30:36
|
Cheng,

Thanks for the followup and further info. I'm glad to hear that you are getting better results with the itimer.xml configuration than with profil.xml. I have seen problems before with profil.xml, which uses the C library's profil() functionality, but they had seemed to be isolated to Itanium platforms. Regardless, this is something we will note and try to reproduce.

Thanks again for the info - I am copying the SourceForge list for archival/status purposes.

Rick

----- Original Message -----
From: "Cheng Liao" <cyl...@gm...>
To: "Rick Kufrin" <rk...@il...>
Sent: Tuesday, July 21, 2009 11:30:08 PM GMT -06:00 US/Canada Central
Subject: Re: [PerfSuite-users] psrun profile doesn't show any sampling point

Rick,

Thought I would follow up.

Running longer jobs helps, since not all CPU time would be reported by profiling samples. However, my real problem is that I need to use itimer.xml instead of profil.xml for psrun. For some reason, profil.xml produces an empty profile. Maybe I have messed up my build somewhere?

Now everything looks great, except that sometimes a significant chunk of the sampling points cannot be matched to function names and shows up as ?? in the output.

Thanks again,
Cheng |
From: Rick K. <rk...@il...> - 2009-07-20 21:41:56
|
Cheng,

There are two issues that I can think of that you might be experiencing. The first is simply that the program may not run long enough to generate profiling samples. However, based on the processor brand string listed in the output, I am also wondering whether you are working with an Intel Nehalem CPU. If so, there are known problems with the most recent release of PerfSuite on that processor. We have finished adding support for that system, but have not yet placed a new software release on SourceForge or at NCSA. We hope to do so in the very near future, though.

Sorry for the difficulties. Please let me know if you also suspect this may be what is affecting you.

Rick |
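Rick's first possibility, that the program is too short to be sampled, can be estimated from the report itself, which lists a period of 10 with milliseconds as the event in the user domain. As a rough sketch, assuming one sample is taken per elapsed period of user CPU time (the function name `expected_samples` is illustrative):

```python
def expected_samples(cpu_seconds, period_ms=10):
    """Upper bound on samples: one per full period of CPU time in the profiled domain."""
    return int(cpu_seconds * 1000 // period_ms)

# A command such as `ls` that uses a few milliseconds of CPU time
# finishes before the first period elapses.
print(expected_samples(0.004))

# A job with 6 seconds of user CPU time could yield up to this many samples.
print(expected_samples(6))
```

Under this estimate an `ls` run is expected to show zero samples regardless of platform, so an empty profile for `ls` does not by itself separate a too-short run from the other causes discussed in the thread. A test program that uses at least a few seconds of CPU time is a better check.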
From: Cheng L. <cyl...@gm...> - 2009-07-20 20:28:15
|
Hi Rick,

Thanks for your quick reply. Indeed, the sample output I showed was from a Nehalem machine. However, I saw the same problem on some Wolfdale and Harpertown systems too.

So far, I have tried only a 0-second job and a 6-second job. I will make some longer runs.

Cheng |
From: Cheng L. <cyl...@gm...> - 2009-07-20 19:30:20
|
I just installed perfsuite on some linux x86_64 machines (only one with papi and perfctr). However, even though 'make -s check' outputs looked fine, all the psprocess outputs would be 'empty' and show no profile timing info. Is this something that can be easily fixed?

Thanks,
Cheng

PerfSuite Hardware Performance Summary Report

Version    : 1.0
Created    : Mon Jul 20 11:20:26 AM PDT 2009
Generator  : psprocess 0.3
XML Source : ls.15381.em64td.xml

Execution Information
============================================================================================
Collector : libpshwpc
Date      : Sat Jul 18 17:31:54 2009
Host      : em64td
User      : cliao
Command   : ls

Processor and System Information
============================================================================================
Node CPUs     : 8
Vendor        : Intel
Family        : Pentium Pro (P6)
Brand         : Intel(R) Xeon(R) CPU X5570 @ 2.93GHz
CPU Revision  : 4
Clock (MHz)   : 2933.589
Memory (MB)   : 24096.66
Pagesize (KB) : 4

Cache Information
============================================================================================
Cache levels : 0

Profile Information
============================================================================================
Class      : profil
Version    : 2.5
Event      : milliseconds
Period     : 10
Samples    : 0
Domain     : user
Run Time   : 0.00 (seconds)
Min Self % : (all)

Module Summary
--------------------------------------------------------------------------------
Samples Self % Total % Module

File Summary
--------------------------------------------------------------------------------
Samples Self % Total % File

Function Summary
--------------------------------------------------------------------------------
Samples Self % Total % Function

Function:File:Line Summary
--------------------------------------------------------------------------------
Samples Self % Total % Function:File:Line |
From: Rick K. <rk...@il...> - 2009-06-26 15:20:19
|
Karim,

Please provide a little more detail to help us understand and resolve any errors you may be receiving. A good place to start in the installation process is to read the file called INSTALL that should be in the top-level directory of the unpacked distribution. If you follow those instructions and experience errors, feel free to post again.

Rick |
From: karim f. <fat...@gm...> - 2009-06-26 14:57:42
|
Hi to all,

Please can anyone help me and give me the procedure for installing PerfSuite on Ubuntu 8.04?

Thanks for the help.

--
Lecturer at the Institut Préparatoire aux Études d'Ingénieurs de Bizerte
Member of the URAPAD research unit
IEEE member
ACM member
UBUNTU-TN member
Dfsa member |
From: George M. <ge...@ma...> - 2009-06-15 22:19:55
|
Dear Rick,

Thanks for the answer. First, I should say that I would rather not run the program many times in order to profile all of the code; I don't think that is the right approach, especially when I use a lot of processors and have to run it again and again. Your code, however, seems very nice (although I am not an expert in Fortran) and does what I want. I don't need the XML file, since my only concern is the accuracy of the Mflops results. Thanks a lot for the tip. As far as I can understand, it reads the hardware counters, so maybe it is OK.

Thanks a lot,
Best regards,
George Markomanolis |
From: Rick K. <rk...@il...> - 2009-06-15 21:36:40
|
George,

No need to be sorry, your question is reasonable. Unfortunately, although I understand conceptually what you are trying to accomplish, that is:

1. Bracket multiple portions of your application and collect counter data independently for each
2. Output separate XML documents for each bracketed portion
3. Use psprocess to generate derived metrics for each XML document

... this mode of operation is not well-supported in PerfSuite. One could arrange to mimic this behavior, but it would be a lot of effort and code changes within the application to manage the data.

The simplest way of achieving this with the PS API is to do multiple runs of your application, each run collecting the performance data from a different region of the application, rather than attempting to collect data for multiple regions simultaneously.

It is still possible to get the information you are looking for through the PerfSuite API, and I am attaching a small example program written today to show the basic idea. The example has very few comments, but hopefully it is not too difficult to understand. Note that XML is not used at all in this way of measurement.

Regarding your second question today (about profiling), it is currently not possible through PerfSuite to profile based on more than one hardware event. I should emphasize this is a limitation of PerfSuite, and not the hardware capabilities or limitations of PAPI.

Rick |
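The attached example program is not preserved in this archive, and the following is not a reconstruction of it. It is only a minimal sketch of the bookkeeping idea described above: read cumulative counters before and after each bracketed region and keep the per-region totals inside the application, with no XML involved. `RegionCounters`, `FakeCounters`, and the `read_counters` callable are hypothetical names for illustration; they are not part of the PerfSuite or PAPI APIs, and the counter source is faked so the sketch runs anywhere.

```python
from contextlib import contextmanager


class RegionCounters:
    """Accumulate counter deltas separately for each named code region."""

    def __init__(self, read_counters):
        # read_counters: hypothetical callable returning a dict of
        # cumulative event counts, e.g. {"FP_OPS": ..., "TOT_CYC": ...}.
        self._read = read_counters
        self.totals = {}

    @contextmanager
    def region(self, name):
        before = self._read()
        try:
            yield
        finally:
            after = self._read()
            acc = self.totals.setdefault(name, {})
            # Add this visit's delta to the running total for the region.
            for event, value in after.items():
                acc[event] = acc.get(event, 0) + value - before[event]


class FakeCounters:
    """Stand-in counter source (assumption: NOT a PerfSuite or PAPI API)."""

    def __init__(self):
        self.counts = {"FP_OPS": 0, "TOT_CYC": 0}

    def work(self, fp_ops, cycles):
        # Pretend some computation advanced the counters.
        self.counts["FP_OPS"] += fp_ops
        self.counts["TOT_CYC"] += cycles

    def read(self):
        return dict(self.counts)


hw = FakeCounters()
rc = RegionCounters(hw.read)

with rc.region("solver"):
    hw.work(1000, 4000)
hw.work(50, 500)              # outside any region: not attributed
with rc.region("io"):
    hw.work(10, 9000)
with rc.region("solver"):     # a second visit adds to the same total
    hw.work(2000, 6000)

for name, counts in rc.totals.items():
    print(name, counts)
```

In a real program the fake reader would be replaced by whatever counter-read call the measurement library provides; the per-region dictionary is the part that stands in for the separate XML documents.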
From: George M. <ge...@ma...> - 2009-06-15 16:51:05
|
Dear all, sorry for the second email. I looked at the XML configuration files for profiling (psrun -C -c papi_profile_cycles.xml myprog), and as I have read, we can profile on only one event. My question is: how could I measure Mflops as in counting mode? Also, which threshold or sampling rate is the most appropriate for getting an accurate result (if it is possible to measure Mflops in a single run)? Thanks a lot, Best regards, George Markomanolis |
From: George M. <ge...@ma...> - 2009-06-15 14:31:53
|
Dear all, sorry if my question is simple, but I didn't find any information about it. I have a Fortran code, which I have profiled both with psrun and with the PSF_ commands. My question is whether I can profile two (or more) sections of my code and get a separate Mflops value for each as output. For example, I would like to see Mflops per function rather than total Mflops for the whole program. I tried using the PSF_ commands to start profiling, write the file, and start profiling again, but no second file was written. Thanks a lot, Best regards, George Markomanolis |
From: George M. <ge...@ma...> - 2009-06-11 22:16:26
|
Dear Rick, Thanks again for the reply. I will write again when I have more formal results, because my goal right now is to profile four benchmarks from NAS. I used papi3_mflops and now I measure only the counters I want. Best regards, George Markomanolis |
From: Rick K. <rk...@il...> - 2009-06-11 21:03:58
|
George,

Do you have the raw event counts available to you from each method of measurement you are doing? The metric megaflops is a derived metric, the ratio of floating point operations to cycles, with the clock speed factored in. Either a reduction in floating point operations or an increase in cycles will reduce the value. Unfortunately, there is not enough information here for me to determine the cause of the ~ 6% difference you report.

The psrun command will include all event occurrences over the entire run of your program (from the time main() is entered until the program calls exit()). So some time is likely included from within MPI itself, as you guess.

You indicate that multiplexing is enabled, so I am guessing you are running psrun with its default configuration file. You may want to try the following:

    $ psrun -C -c papi3_mflops.xml ./your-program

... as the provided alternate configuration papi3_mflops.xml is one that only counts the two events you are interested in. Multiplexing would likely not be used for such a run. Again: "multiplexing" and "statistical sampling" are two different things, at least in "PerfSuite-speak".

Rick

George Markomanolis wrote:
> Dear Rick,
>
> First of all, thanks for the help. It was a simple problem, but because
> I am trying a lot of stuff I forgot about it. The problem is solved.
> About my second question: basically I want to measure only two
> hardware counters, PAPI_FP_OPS and PAPI_TOT_CYC, in order to get
> Mflops. I ask about how accurate it is because I compared results from
> profiling matrix multiplication (C, MPI, ScaLAPACK with psrun) with
> another profiling tool, and with PerfSuite I had 960 Mflops per CPU but
> with the other one almost 1020 Mflops. Multiplexing was enabled, so is
> it possible to lose so many flops? Does PerfSuite also measure the
> flops of MPI commands (all_reduce etc.)? I was trying to figure out why
> there was such a difference. If PerfSuite uses statistical sampling,
> then is it possible to lose some data?
>
> Best regards,
> George Markomanolis |
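The derived metric described in the reply above can be written out as a short calculation. This is a sketch of the arithmetic only, not PerfSuite's own code: the function name `mflops` is hypothetical, and the event counts and the 2000 MHz clock are made-up numbers chosen to land on the two values reported in the thread. They say nothing about the actual cause of the difference, which could not be determined from the information given.

```python
def mflops(fp_ops: int, cycles: int, clock_mhz: float) -> float:
    """Millions of floating point operations per second.

    Elapsed seconds = cycles / (clock_mhz * 1e6), so
    Mflops = fp_ops / seconds / 1e6 = fp_ops * clock_mhz / cycles.
    """
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return fp_ops * clock_mhz / cycles


# Illustrative (invented) counts: same floating point work, different cycles.
FP_OPS = 4_080_000_000
CLOCK_MHZ = 2000.0

fewer_cycles = mflops(FP_OPS, 8_000_000_000, CLOCK_MHZ)
more_cycles = mflops(FP_OPS, 8_500_000_000, CLOCK_MHZ)

print(f"{fewer_cycles:.1f} Mflops with 8.0e9 cycles")
print(f"{more_cycles:.1f} Mflops with 8.5e9 cycles")
```

With the operation count held fixed, about 6% more counted cycles (for example, cycles spent outside the numerical kernel but inside the measured interval) is enough to move the metric from roughly 1020 to roughly 960. A lower operation count with the same cycles would have the same effect, which is why the raw event counts are needed to tell the two apart.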