From: Robbie <jj...@nu...> - 2009-08-27 14:25:41
Rick,

Today I tested PerfSuite with the NPB-OMP benchmarks on a Xeon/Linux machine with the icc/ifort compilers. Luckily, all OpenMP benchmarks finished successfully and the performance data files were generated as expected.

I'm not clear about the reason behind this problem. Maybe the difference between icc/ifort and gcc/gfortran is a useful indication.

Jie

2009-08-26 at 08:12 -0500, rk...@il... wrote:
> Jie,
>
> Thanks for reporting on your further experiments. With the issue still present when using PAPI, it seems similar to an issue we have seen on our Altix. Unfortunately, this remains an unresolved issue that may be related to the way that psrun operates internally. I'm afraid I do not have a solution at present, but I have found that using the PerfSuite API directly produces the proper results. If you are able and willing to do so, using the API involves inserting a call to the following routines:
>
> call psf_hwpc_init() - from the main thread
> call psf_hwpc_start() - from within a parallel region
> call psf_hwpc_stop(filename) - from within a parallel region
>
> There is an example (in C) in the PerfSuite distribution of calling the API from an OpenMP program. You will find it in:
>
> $PREFIX/share/perfsuite/examples/cpi/cpi-omp.c
>
> I'm afraid I do not have a better solution at this time, but it is an issue we are following up on.
>
> Rick
>
>
> ----- Original Message -----
> From: "Robbie" <jj...@nu...>
> To: "Rick Kufrin" <rk...@il...>
> Cc: per...@li...
> Sent: Wednesday, August 26, 2009 7:27:05 AM GMT -06:00 US/Canada Central
> Subject: Re: [PerfSuite-users] Problems about using Perfsuite to monitor OpenMP program (NPB-3.2.1)
>
> Rick,
>
> Thanks for your suggestion.
> However, when I tried to monitor an OpenMP program with the default
> configuration file, I still got some errors and the data files are not
> created as expected.
>
> The following is the platform information and how I measure the NPB-OMP
> program with psrun.
> Note here the target OpenMP program is compiled by gcc/gfortran with
> the -fopenmp option.
>
>
> jiejiang@UT43:~/NPB3.2.1/NPB3.2-OMP/bin$ uname -a
> Linux UT43 2.6.27-perfctr #2 SMP Tue Apr 28 20:29:12 CST 2009 i686 GNU/Linux
> jiejiang@UT43:~/NPB3.2.1/NPB3.2-OMP/bin$ perfex -i
> PerfCtr Info:
> abi_version 0x05020501
> driver_version 2.6.37 DEBUG
> cpu_type 14 (Intel Pentium M)
> cpu_features 0x7 (rdpmc,rdtsc,pcint)
> cpu_khz 798049
> tsc_to_cpu_mult 1
> cpu_nrctrs 2
> cpus [0], total: 1
> cpus_forbidden [], total: 0
>
> jiejiang@UT43:~/NPB3.2.1/NPB3.2-OMP/bin$ ls
> bt.A ep.A is.A is.B
>
> jiejiang@UT43:~/NPB3.2.1/NPB3.2-OMP/bin$ export OMP_NUM_THREADS=2
> jiejiang@UT43:~/NPB3.2.1/NPB3.2-OMP/bin$ psrun -p ./is.A
>
>
> NAS Parallel Benchmarks (NPB3.2-OMP) - IS Benchmark
>
> Size: 8388608 (class A)
> Iterations: 10
> Number of available threads: 2
>
>
> iteration
> 1
> 2
> 3
> 4
> 5
> 6
> 7
> 8
> 9
> 10
>
>
> IS Benchmark Completed
> Class = A
> Size = 8388608
> Iterations = 10
> Time in seconds = 1.46
> Total threads = 2
> Avail threads = 2
> Mop/s total = 57.29
> Mop/s/thread = 28.64
> Operation type = keys ranked
> Verification = SUCCESSFUL
> Version = 3.2.1
> Compile date = 25 Aug 2009
>
> Compile options:
> CC = gcc
> CLINK = $(CC)
> C_LIB = -lm
> C_INC = (none)
> CFLAGS = -O -g -fopenmp
> CLINKFLAGS = -O -fopenmp
>
>
> Please send all errors/feedbacks to:
>
> NPB Development Team
> np...@na...
>
> Inconsistency detected by ld.so: dl-close.c: 719: _dl_close: Assertion `map->l_init_called' failed!
>
>
> jiejiang@UT43:~/NPB3.2.1/NPB3.2-OMP/bin$ ls
> bt.A ep.A is.A is.A.0.30613.UT43.xml is.B
>
>
> The program execution finishes with the error message in the last line
> and there is only ONE xml output file, not two as expected.
> This also happens with the papi_profile_cycles.xml configuration file.
>
> What's wrong?
>
> Regards,
> Jie Jiang
>
>
>
> Rick Kufrin wrote:
> > Jie,
> >
> > My guess is that what is happening here is related to the use of the "itimer.xml" configuration file. The problem is that signal delivery is not defined with POSIX threads, and the results are unpredictable. POSIX threads enter the picture when you are using OpenMP.
> >
> > Does your system happen to have kernel support for hardware counters? If so, you may have better luck by profiling with performance counters such as total cycles rather than itimers.
> >
> > Rick
> >
>
> _______________________________________________
> PerfSuite-users mailing list
> Per...@li...
> https://lists.sourceforge.net/lists/listinfo/perfsuite-users
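
The check done by eye in the quoted session (one XML file found where two were expected for OMP_NUM_THREADS=2) can be scripted. The following is only a sketch: the function names count_profiles and check_profiles are made up for illustration, and it assumes that psrun writes one file per thread, named like the is.A.0.30613.UT43.xml seen in the listing, into the directory being checked.

```shell
#!/bin/sh
# Hypothetical helpers, not part of PerfSuite.
# Assumed output naming, taken from the session above: <program>.<n>.<pid>.<host>.xml

# Print how many XML profiles exist for a program in a directory.
# $1 = directory, $2 = program name (for example is.A)
count_profiles() {
    n=0
    for f in "$1/$2".*.xml; do
        # An unmatched glob stays literal, so only count files that really exist.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Compare the number of profiles with the expected thread count.
# $1 = directory, $2 = program name,
# $3 = expected count (defaults to OMP_NUM_THREADS, or 1 if that is unset)
check_profiles() {
    expected=${3:-${OMP_NUM_THREADS:-1}}
    found=$(count_profiles "$1" "$2")
    if [ "$found" -eq "$expected" ]; then
        echo "ok: $found profile(s) for $2"
    else
        echo "mismatch: found $found profile(s) for $2, expected $expected"
        return 1
    fi
}
```

Run as `check_profiles . is.A` right after `psrun -p ./is.A` in the bin directory. For the listing quoted above, with OMP_NUM_THREADS=2 exported and a single XML file present, it would report a mismatch and return a non-zero status.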