From: Rick K. <rk...@nc...> - 2006-01-28 01:26:26
|
On Fri, 27 Jan 2006, Haoqiang H. Jin wrote:

> You probably placed
>    INCLUDE 'fperfsuite.h' (or similar)
> before
>    IMPLICIT DOUBLE PRECISION(A-H,O-Z)
>
> This is not allowed in Fortran. Try something like:
>
>    IMPLICIT DOUBLE PRECISION(A-H,O-Z)
>    INCLUDE 'fperfsuite.h'

Thanks, Henry! It looks like the reason I couldn't reproduce the problem is that I did it the second way. The first way, using g77 3.4.4, also failed for me and produced the error message that Tirath reported.

Rick
From: Haoqiang H. J. <hj...@na...> - 2006-01-28 01:10:36
|
Tirath,

On your second problem:

> g77 -c -O2 -I/usr/local/include -malign-double -fautomatic -Wno-globals -fno-globals gamess.f
> source/gamess.srcpp: In program `gamess':
> source/gamess.srcpp:320:
>          IMPLICIT DOUBLE PRECISION(A-H,O-Z)
>          1
> fperfsuite.h:45: (continued):
>          integer PS_PID, PS_PPID, PS_PGRP, PS_SESSION, PS_TTY,
>          2
> Statement at (1) invalid in context established by statement at (2)
> ... (it goes on like that)...
>
> Does anyone know of a workaround for situations like this?

You probably placed

   INCLUDE 'fperfsuite.h' (or similar)

before

   IMPLICIT DOUBLE PRECISION(A-H,O-Z)

This is not allowed in Fortran. Try something like:

   IMPLICIT DOUBLE PRECISION(A-H,O-Z)
   INCLUDE 'fperfsuite.h'

-Henry Jin
NASA Ames Research Center
From: Rick K. <rk...@nc...> - 2006-01-28 01:01:17
|
Tirath - looks like two issues here. For this one:

> To begin with, is it normal/explainable that running psrun in
> counting mode counts far more samples of a given event than when
> running in profile mode? For example, I've been getting the following
> with my program:
>
> PAPI_L2_LDM:
> Counting: 5452797 samples
> Profil: 305 samples (total)
>
> The case is similar for PAPI_RES_STL and PAPI_TOT_CYC, and I haven't
> tested any others yet.

If I understand your question properly, it is: why isn't the number of samples when profiling identical to the number of observed events when counting? Assuming I have that right, the answer is that the nature of the measurements is different.

For counting runs, the number of events reported should be the total number of events actually observed (except when multiplexing is involved, in which case it is an estimate). It's an aggregate count of total event occurrences.

For profiling runs, a sample is recorded only for every Nth occurrence of an event (where N is user-selectable in PerfSuite, either through the psrun command-line option "-t" or through the environment variable PS_HWPC_THRESHOLD). So if the total number of events that occurred during the run of an application is M, then the total number of samples S recorded should be M/N, or equivalently M = N*S. Substituting your numbers above:

   M = 5452797
   S = 305

and so N ~= 17878. The default threshold for PerfSuite 0.6.1 profiling is 10000, which is in the ballpark. In 0.6.2 it was raised to 100000, which is less likely to cause excessive overhead due to the interrupt handler that's invoked every time the threshold is reached.

You might try raising the threshold value and running your experiments again to see if you can bring the calculations more closely in line (although they will probably never match exactly).

> I figured I'd try invoking libpshwpc directly in the program instead
> of relying on psrun... but that may not be so simple as the code I
> have to work with is some pretty old Fortran. I'm pretty new to
> Fortran, so I don't know how correct my analysis is, but I think the
> problem is the code I have to work with relies heavily on implicit
> variables. Here's what I get when trying to compile:
>
> g77 -c -O2 -I/usr/local/include -malign-double -fautomatic -Wno-globals -fno-globals gamess.f
> source/gamess.srcpp: In program `gamess':
> source/gamess.srcpp:320:
>          IMPLICIT DOUBLE PRECISION(A-H,O-Z)
>          1
> fperfsuite.h:45: (continued):
>          integer PS_PID, PS_PPID, PS_PGRP, PS_SESSION, PS_TTY,
>          2
> Statement at (1) invalid in context established by statement at (2)
> ... (it goes on like that)...
>
> Does anyone know of a workaround for situations like this?

I tried to reproduce this error with g77 version 3.4.4 and had no problem. Do you know which version of g77 you're using? (g77 --version)

Rick
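The relationship M = N*S from the reply above can be checked with a few lines of arithmetic. This is only a sketch in Python using the numbers quoted in this thread; the function names are hypothetical and are not part of PerfSuite.

```python
def effective_threshold(total_events: int, samples: int) -> float:
    """Estimate N, the number of events per recorded sample (N = M / S)."""
    if samples <= 0:
        raise ValueError("samples must be positive")
    return total_events / samples


def expected_samples(total_events: int, threshold: int) -> float:
    """Estimate S, the number of samples a profiling run records (S = M / N)."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return total_events / threshold


if __name__ == "__main__":
    m = 5452797  # PAPI_L2_LDM events reported by the counting run
    s = 305      # samples reported by the profiling run
    print("events per sample:", round(effective_threshold(m, s)))
    print("samples at threshold 10000:", round(expected_samples(m, 10000)))
    print("samples at threshold 100000:", round(expected_samples(m, 100000)))
```

The counting and profiling numbers come from separate runs, so the estimate is only expected to land in the same range as the configured threshold, not to match it.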
From: Tirath R. <ti...@tp...> - 2006-01-27 17:16:35
|
Hi all,

To begin with, is it normal/explainable that running psrun in counting mode counts far more samples of a given event than when running in profile mode? For example, I've been getting the following with my program:

PAPI_L2_LDM:
Counting: 5452797 samples
Profil: 305 samples (total)

The case is similar for PAPI_RES_STL and PAPI_TOT_CYC, and I haven't tested any others yet.

I figured I'd try invoking libpshwpc directly in the program instead of relying on psrun... but that may not be so simple as the code I have to work with is some pretty old Fortran. I'm pretty new to Fortran, so I don't know how correct my analysis is, but I think the problem is the code I have to work with relies heavily on implicit variables. Here's what I get when trying to compile:

g77 -c -O2 -I/usr/local/include -malign-double -fautomatic -Wno-globals -fno-globals gamess.f
source/gamess.srcpp: In program `gamess':
source/gamess.srcpp:320:
         IMPLICIT DOUBLE PRECISION(A-H,O-Z)
         1
fperfsuite.h:45: (continued):
         integer PS_PID, PS_PPID, PS_PGRP, PS_SESSION, PS_TTY,
         2
Statement at (1) invalid in context established by statement at (2)
... (it goes on like that)...

Does anyone know of a workaround for situations like this?
From: Rick K. <rk...@nc...> - 2006-01-25 13:31:45
|
Greetings all,

This note is a followup to the problem reported by Tirath Ramdas last week about psprocess profiling reports differing under certain circumstances.

Our experiments indicate that the problem source is indeed step 2 as described below, the use of BFD from within the PerfSuite Tcl BFD extension. It doesn't seem to be consistent, but the likely cause is a failure related to BFD initialization that persists through the course of processing the input XML document. The result is that no PC (program counter) values are mapped successfully to source code lines. The problem seems to be more likely to occur if the binary being processed was compiled without debugging/symbol information (-g).

I've not been able to reproduce this error on the machines that I have access to. If anyone has a test case (source) on which it occurs and would like to forward it to me, please do.

In the meantime, a workaround has been implemented that bypasses the use of the PS Tcl extension in favor of using the "addr2line" utility as an external process. Those using psprocess on an IA-64 platform or who configured PerfSuite with --disable-binutils are already using this method, but for the rest the BFD Tcl extension is used by default. The workaround gives the user of psprocess the option to force use of addr2line by setting an environment variable, PSPROCESS_MAPPER. Tirath reports that the workaround produces more consistent results for the situations that previously failed.

This option has been implemented in the development version of PerfSuite (0.6.2a3) and will be included with the next release. We'll also take a closer look at the Tcl extension itself to see if we can isolate the core problem location and address it there as well (using the extension is faster than using an external process, especially for profiles with a large number of addresses to which samples were attributed).

Many thanks to Tirath for the report and for the assistance with tracking it down,

Rick

> psprocess' job is conceptually pretty simple: it has to:
>
> 1. Parse an XML document and extract the samples within (it uses Tcl/tDOM)
> 2. Map each sample to a source code location (using libbfd or addr2line)
> 3. Do some management/bookkeeping/summarization and display results
[ ... ]
>
> To rule out (or implicate) step 2, I'll work with you off-list to verify
> the PC->source mappings that are generated. To do this, I'll modify the
> portion of psprocess where this is going on to provide debugging output
> for the individual PC values and where they are mapped. This will take a
> few days (it'll be based on the 0.6.1 version of PerfSuite which your
> psinv output shows you are using). I'll supply you a patch which
> should simply be a drop-in file replacement (since psprocess is a
> script, not a compiled object/binary). There is currently no such
> tracing supported in psprocess but I think it will be a useful thing to
> incorporate in general to track things like this.
>
> If the mappings are all identical (as they should be, but that's what
> we're checking for), then it'll point to data management within psprocess
> as the problem and we can go from there.
>
> In any case, we'll summarize the results back to this mailing list and
> with luck will have pinpointed it for the upcoming 0.6.2a3 release.
From: Rick K. <rk...@nc...> - 2006-01-19 14:43:43
|
Tirath,

Thanks for the good synopsis of what's going on and your experiments and results - this is a good start to work with.

Backing up a bit, just by way of explanation, psprocess' job is conceptually pretty simple: it has to:

1. Parse an XML document and extract the samples within (it uses Tcl/tDOM)
2. Map each sample to a source code location (using libbfd or addr2line)
3. Do some management/bookkeeping/summarization and display results

I'm thinking that one of steps 2 or 3 is where things are going astray here. The unusual thing is that, according to your latest experiments, the issues arise when the shell's file management is in the picture, either through redirection or a pipe. That seems strange to me, but is something I can try out on this end.

To rule out (or implicate) step 2, I'll work with you off-list to verify the PC->source mappings that are generated. To do this, I'll modify the portion of psprocess where this is going on to provide debugging output for the individual PC values and where they are mapped. This will take a few days (it'll be based on the 0.6.1 version of PerfSuite which your psinv output shows you are using). I'll supply you a patch which should simply be a drop-in file replacement (since psprocess is a script, not a compiled object/binary). There is currently no such tracing supported in psprocess but I think it will be a useful thing to incorporate in general to track things like this.

If the mappings are all identical (as they should be, but that's what we're checking for), then it'll point to data management within psprocess as the problem and we can go from there.

In any case, we'll summarize the results back to this mailing list and with luck will have pinpointed it for the upcoming 0.6.2a3 release.

Rick

p.s. the BFD n/a output only means that the version of BFD on your system does not support a way of determining its version number. See:

http://sources.redhat.com/ml/binutils/2004-02/msg00196.html
From: Tirath R. <ti...@tp...> - 2006-01-19 10:10:53
|
Hi Rick and rest of list,

> Sometimes it's good to rule things out first and look more closely at
> what's remaining. In this case, it seems to me that PAPI is not the
> source of the problem because by the time the XML document is generated,
> PAPI is out of the picture. Its job is to accumulate sample counts that

Agreed, PAPI is most likely not to blame at all. psprocess and tcl are my most likely suspects I think, especially given my history of psprocess/tcl related issues (see earlier posts, "psprocess, PAPI support not found").

>> psprocess -e /opt/gamess_src/gamess.02.x /usr/scr/gamess.02.x.873.xml | less
> [ some mappings, no source line info ]

The missing mappings are for shared libraries -- specifically, ATLAS. The missing source line info I'm assuming was because the binary was built without debugging symbols (i.e. gcc WITHOUT -g)?

> mapping? Also, psprocess supports an "-o" option, which should cause the
> output to be written to a file. Does the same issue come up with that?

Ah, I didn't know about the "-o" option! D'oh.. that oughta teach me to RTFM from now on!

> I'm guessing that since your file names are identical (both the executable
> and the XML document), that nothing has changed along those lines, like a
> recompile, but it doesn't hurt to verify that with you.

Yes that's correct -- identical binaries. And yes, the different psprocess output as previously described was observed with identical XML input.

> It might be easier to help if you could provide access to the executable
> and the XML document in question - is this possible? (only reply to me or

Sure, will send the XML offlist. As for remote access... alas the boxen I have access to are all hidden behind VPN unfortunately. :(

Fortunately, the problem IS reproducible... very fortunate because in an unforgivable lapse of judgement I had deleted the original offending xmls!

Here is a twist though. With the newly generated xml, I can obtain all mappings except those I believe to be due to shared libraries only with:

(A)
A1: `psprocess -e /opt/gamess_src/gamess.02.x gamess.02.x.1049.xml`
A2: `psprocess -o tmp1.txt -e /opt/gamess_src/gamess.02.x gamess.02.x.1049.xml`
[ some mappings, no source line info ]

With all of the following, I get no mappings:

(B)
B1: `psprocess -e /opt/gamess_src/gamess.02.x gamess.02.x.1049.xml | less`
B2: `psprocess -e /opt/gamess_src/gamess.02.x gamess.02.x.1049.xml > tmp.txt`
[ no mappings ]

Note that the observed behaviour is now the __INVERSE__ of what was originally documented. However, as I mentioned in the first post, there has been one other case where the B2 command produced no mappings; if that weren't the case I would not have noticed a problem existed at all.

For the sake of recording all this, full outputs are listed at the bottom of this post for both case A and B. These outputs were validated multiple times with the same xml file.

I re-ran the job to produce a fresh xml - the same behaviour as noted above is observed. Let's call this case 0. (XML labelled appropriately will be sent offlist.)

I changed the counter active in the profiling config xml to PAPI_L1_LDM (as opposed to PAPI_L2_LDM): __all__ cases (B1, B2, A1, A2) produce the __same textual output__, i.e. some mappings, no source line info. Let's call this case 1. (XML labelled appropriately will be sent offlist.)

So crude preliminary observations:

PAPI_L1_LDM profiling xml output file and all other counter profiling xml files as far as I know: fine
PAPI_L2_LDM profiling xml output file: dramas

Let me know if there is any more testing you can think of that would be beneficial. psinv and psprocess output enclosed below. Let me know how you think we ought to proceed.
cheers,
-tirath

> the system and software - that can be gotten with the output of
> "psinv"

System Information -
  Processors:             1
  Total Memory (MB):      883.40
  System Page Size (KB):  4.00

Processor Information -
  Vendor:            Intel
  Processor family:  Pentium 4
  Brand:             Intel(R) Pentium(R) 4 CPU 3.00GHz
  Model (Type):      (unknown)
  Revision:          1
  Clock Speed:       3000.11 MHz

Cache and TLB Information -
  Cache levels:  2
  Caches/TLBs:   5

Cache Details -
  Level 1:
    Type:           Data
    Size:           16 KB
    Line size:      64 bytes
    Associativity:  8-way set associative

    Type:           Instruction Trace
    Size:           12K uOps
    Associativity:  8-way set associative

  Level 2:
    Type:           Unified
    Size:           1.00 MB
    Line size:      64 bytes
    Associativity:  8-way set associative

TLB Details -
  Level 1:
    Type:           Instruction
    Entries:        64
    Pagesize (KB):  4 2048 4096
    Associativity:  Fully associative

    Type:           Data
    Entries:        64
    Pagesize (KB):  4 4096
    Associativity:  Fully associative

> and "psinv -x".

[Note: "BFD n/a" may be an important clue!]

tramdas@PC00019127:/opt/gamess_src$ psinv -x
Software Version Information -
  Build date:       Jan 11 2006 19:00:17
  BFD:              n/a
  GNU C Library:    2.3.2
  PerfSuite:        0.6.1
  Real time clock:  CPU interval timer
  Tcl shell:        8.4.9
  tDOM versions:    0.7.8

##########
########## Astart, start output for A:
##########

PerfSuite Hardware Performance Summary Report

Version      : 1.0
Created      : Thu Jan 19 20:19:04 EST 2006
Generator    : psprocess 0.2
XML Source   : gamess.02.x.1049.xml
Executable   : /opt/gamess_src/gamess.02.x

Execution Information
============================================================================================
Date         : Thu Jan 19 19:46:48 2006
Host         : PC00019127.eng.monash.edu.au
User         : tramdas

Processor and System Information
============================================================================================
Node CPUs    : 1
Vendor       : Intel
Family       : Pentium 4
Brand        : Intel(R) Pentium(R) 4 CPU 3.00GHz
CPU Revision : 1
Clock (MHz)  : 3000.106
Memory (MB)  : 883.40
Pagesize (KB): 4

Cache Information
============================================================================================
Cache levels : 2
--------------------------------
Level 1
Type         : data
Size (KB)    : 16
Linesize (B) : 64
Assoc        : 8
Type         : instruction trace
Size (KuOps) : 12
Assoc        : 8
--------------------------------
Level 2
Type         : unified
Size (KB)    : 1024
Linesize (B) : 64
Assoc        : 8

Profile Information
============================================================================================
Class        : PAPI
Event        : PAPI_L2_LDM (Level 2 load misses)
Period       : 10000
Samples      : 292
Domain       : user
Run Time     : 494.21 (seconds)
Min Self %   : (all)

Module Summary
--------------------------------------------------------------------------------
Samples  Self %  Total %  Module
    222  76.03%   76.03%  /opt/gamess_src/gamess.02.x
     65  22.26%   98.29%  /usr/lib/atlas/sse2/libblas.so.3.0
      4   1.37%   99.66%  /lib/libc-2.3.2.so
      1   0.34%  100.00%  /lib/libpthread-0.10.so

File Summary
--------------------------------------------------------------------------------
Samples  Self %  Total %  File
    223  76.37%   76.37%  ?
     69  23.63%  100.00%  ??

Function Summary
--------------------------------------------------------------------------------
Samples  Self %  Total %  Function
     80  27.40%   27.40%  sotran_
     69  23.63%   51.03%  ??
     43  14.73%   65.75%  dirfck_
     24   8.22%   73.97%  vclr_
     16   5.48%   79.45%  prefin_
     16   5.48%   84.93%  jacdia_
      5   1.71%   86.64%  xyzint_
      5   1.71%   88.36%  sonewt_
      4   1.37%   89.73%  twoei_
      4   1.37%   91.10%  shlden_
      3   1.03%   92.12%  shells_
      3   1.03%   93.15%  coproj_
      3   1.03%   94.18%  symmos_
      2   0.68%   94.86%  bndord_
      2   0.68%   95.55%  vadd_
      2   0.68%   96.23%  schwdn_
      2   0.68%   96.92%  sograd_
      2   0.68%   97.60%  symtrd_
      1   0.34%   97.95%  extrap_
      1   0.34%   98.29%  prcalc_
      1   0.34%   98.63%  dmtx_
      1   0.34%   98.97%  zmat2_
      1   0.34%   99.32%  symdia_
      1   0.34%   99.66%  pthread_mutex_lock
      1   0.34%  100.00%  salcao_

Function:File:Line Summary
--------------------------------------------------------------------------------
Samples  Self %  Total %  Function:File:Line
     80  27.40%   27.40%  sotran_:?:0
     69  23.63%   51.03%  ??:??:0
     43  14.73%   65.75%  dirfck_:?:0
     24   8.22%   73.97%  vclr_:?:0
     16   5.48%   79.45%  prefin_:?:0
     16   5.48%   84.93%  jacdia_:?:0
      5   1.71%   86.64%  sonewt_:?:0
      5   1.71%   88.36%  xyzint_:?:0
      4   1.37%   89.73%  shlden_:?:0
      4   1.37%   91.10%  twoei_:?:0
      3   1.03%   92.12%  symmos_:?:0
      3   1.03%   93.15%  shells_:?:0
      3   1.03%   94.18%  coproj_:?:0
      2   0.68%   94.86%  bndord_:?:0
      2   0.68%   95.55%  vadd_:?:0
      2   0.68%   96.23%  schwdn_:?:0
      2   0.68%   96.92%  sograd_:?:0
      2   0.68%   97.60%  symtrd_:?:0
      1   0.34%   97.95%  extrap_:?:0
      1   0.34%   98.29%  prcalc_:?:0
      1   0.34%   98.63%  dmtx_:?:0
      1   0.34%   98.97%  zmat2_:?:0
      1   0.34%   99.32%  symdia_:?:0
      1   0.34%   99.66%  pthread_mutex_lock:?:0
      1   0.34%  100.00%  salcao_:?:0

##########
########## Aend, end output for A.
##########

##########
########## Bstart, start output for B:
##########

PerfSuite Hardware Performance Summary Report

Version      : 1.0
Created      : Thu Jan 19 20:25:23 EST 2006
Generator    : psprocess 0.2
XML Source   : gamess.02.x.1049.xml
Executable   : /opt/gamess_src/gamess.02.x

Execution Information
============================================================================================
Date         : Thu Jan 19 19:46:48 2006
Host         : PC00019127.eng.monash.edu.au
User         : tramdas

Processor and System Information
============================================================================================
Node CPUs    : 1
Vendor       : Intel
Family       : Pentium 4
Brand        : Intel(R) Pentium(R) 4 CPU 3.00GHz
CPU Revision : 1
Clock (MHz)  : 3000.106
Memory (MB)  : 883.40
Pagesize (KB): 4

Cache Information
============================================================================================
Cache levels : 2
--------------------------------
Level 1
Type         : data
Size (KB)    : 16
Linesize (B) : 64
Assoc        : 8
Type         : instruction trace
Size (KuOps) : 12
Assoc        : 8
--------------------------------
Level 2
Type         : unified
Size (KB)    : 1024
Linesize (B) : 64
Assoc        : 8

Profile Information
============================================================================================
Class        : PAPI
Event        : PAPI_L2_LDM (Level 2 load misses)
Period       : 10000
Samples      : 292
Domain       : user
Run Time     : 494.21 (seconds)
Min Self %   : (all)

Module Summary
--------------------------------------------------------------------------------
Samples  Self %  Total %  Module
    222  76.03%   76.03%  /opt/gamess_src/gamess.02.x
     65  22.26%   98.29%  /usr/lib/atlas/sse2/libblas.so.3.0
      4   1.37%   99.66%  /lib/libc-2.3.2.so
      1   0.34%  100.00%  /lib/libpthread-0.10.so

File Summary
--------------------------------------------------------------------------------
Samples  Self %  Total %  File
    292 100.00%  100.00%  ??

Function Summary
--------------------------------------------------------------------------------
Samples  Self %  Total %  Function
    292 100.00%  100.00%  ??

Function:File:Line Summary
--------------------------------------------------------------------------------
Samples  Self %  Total %  Function:File:Line
    292 100.00%  100.00%  ??:??:0

##########
########## Bend, end output for B.
##########
From: Rick K. <rk...@nc...> - 2006-01-17 14:37:07
|
Tirath,

> I've encountered something that seems quite weird to me...

That doesn't just seem weird - it *is* weird. I haven't personally seen that before and have had no similar reports, but I'd like to learn what's behind it.

> I profiled a job with psrun, and it all works fine with most
> counters, however this happens only with PAPI_L2_LDM (as far as I've
> experienced):

Sometimes it's good to rule things out first and look more closely at what's remaining. In this case, it seems to me that PAPI is not the source of the problem because by the time the XML document is generated, PAPI is out of the picture. Its job is to accumulate sample counts that are attributed to particular values of the program counter. These are what the XML doc produced by profiling should contain. One could map a value of the program counter back to a source code line by using "addr2line -e EXE -f <pcvalue>".

After the XML is written, the software that comes into play is: psprocess itself, Tcl, and either the BFD library from GNU binutils or the addr2line utility (also from binutils). So likely one of these is where things are heading down the wrong path. I can't think of a reason for non-deterministic behavior like you're describing by the introduction of "less" into the mix.

> psprocess -e /opt/gamess_src/gamess.02.x /usr/scr/gamess.02.x.873.xml
[ no mappings ]
> psprocess -e /opt/gamess_src/gamess.02.x /usr/scr/gamess.02.x.873.xml | less
[ some mappings, no source line info ]

> I guess this isn't really a huge problem (so long as the veracity of
> the info is not disputed), but has anyone else experienced this weird
> behaviour??? The last few times I tried redirecting to a text file
> (i.e.: psprocess -e /opt/gamess_src/gamess.02.x /usr/scr/gamess.02.x.873.xml > tmp.txt)
> it has been working fine, but it didn't work
> yesterday, which is how I noticed there was a problem to begin with.

I'm curious about the behavior of redirection that didn't work. Was no output produced at all, or was it a similar situation of no source line mapping? Also, psprocess supports an "-o" option, which should cause the output to be written to a file. Does the same issue come up with that?

I'm guessing that since your file names are identical (both the executable and the XML document), nothing has changed along those lines, like a recompile, but it doesn't hurt to verify that with you.

It might be easier to help if you could provide access to the executable and the XML document in question - is this possible? (only reply to me or place on an accessible server, of course!). Also some information about the system and software - that can be gotten with the output of "psinv" and "psinv -x".

Rick
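The manual mapping mentioned above ("addr2line -e EXE -f <pcvalue>") can be scripted when checking many program counter values by hand. The following is a rough sketch in Python, not how psprocess itself does the mapping; the function names are hypothetical, and it assumes the binutils addr2line utility is on the PATH.

```python
import shutil
import subprocess
from typing import List, Tuple


def build_addr2line_cmd(executable: str, pc_values: List[int]) -> List[str]:
    """Build an addr2line command line for a list of program counter values."""
    # -e names the executable; -f also prints the enclosing function name.
    return ["addr2line", "-e", executable, "-f"] + [hex(pc) for pc in pc_values]


def map_pcs(executable: str, pc_values: List[int]) -> List[Tuple[str, str]]:
    """Return (function, file:line) pairs, or [] if addr2line is unavailable."""
    if not pc_values or shutil.which("addr2line") is None:
        return []
    result = subprocess.run(
        build_addr2line_cmd(executable, pc_values),
        capture_output=True, text=True, check=False,
    )
    lines = result.stdout.splitlines()
    # With -f, addr2line prints two lines per address: function, then file:line.
    return list(zip(lines[0::2], lines[1::2]))


if __name__ == "__main__":
    # 0x8048000 is a made-up address used only for illustration.
    for function, location in map_pcs("/opt/gamess_src/gamess.02.x", [0x8048000]):
        print(function, location)
```

For addresses it cannot resolve, addr2line prints "??" for the function and "??:0" for the location.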
From: Tirath R. <ti...@in...> - 2006-01-17 05:04:26
|
hi all,

I've encountered something that seems quite weird to me...

I profiled a job with psrun, and it all works fine with most counters, however this happens only with PAPI_L2_LDM (as far as I've experienced):

--start
psprocess -e /opt/gamess_src/gamess.02.x /usr/scr/gamess.02.x.873.xml
...
File Summary
--------------------------------------------------------------------------------
Samples  Self %  Total %  File
    305 100.00%  100.00%  ??

Function Summary
--------------------------------------------------------------------------------
Samples  Self %  Total %  Function
    305 100.00%  100.00%  ??

Function:File:Line Summary
--------------------------------------------------------------------------------
Samples  Self %  Total %  Function:File:Line
    305 100.00%  100.00%  ??:??:0
--end

BUT, when I pipe it through `less`, i.e.:

--start
psprocess -e /opt/gamess_src/gamess.02.x /usr/scr/gamess.02.x.873.xml | less
...
File Summary
--------------------------------------------------------------------------------
Samples  Self %  Total %  File
    243  79.67%   79.67%  ?
     62  20.33%  100.00%  ??

Function Summary
--------------------------------------------------------------------------------
Samples  Self %  Total %  Function
     78  25.57%   25.57%  sotran_
     62  20.33%   45.90%  ??
     41  13.44%   59.34%  dirfck_
     28   9.18%   68.52%  vclr_
...

Function:File:Line Summary
--------------------------------------------------------------------------------
Samples  Self %  Total %  Function:File:Line
     78  25.57%   25.57%  sotran_:?:0
     62  20.33%   45.90%  ??:??:0
     41  13.44%   59.34%  dirfck_:?:0
     28   9.18%   68.52%  vclr_:?:0
     23   7.54%   76.07%  prefin_:?:0
--end

(The missing line summaries are fine because I compiled the executable without debugging symbols)

I guess this isn't really a huge problem (so long as the veracity of the info is not disputed), but has anyone else experienced this weird behaviour??? The last few times I tried redirecting to a text file (i.e.: psprocess -e /opt/gamess_src/gamess.02.x /usr/scr/gamess.02.x.873.xml > tmp.txt) it has been working fine, but it didn't work yesterday, which is how I noticed there was a problem to begin with.

-tirath
From: Giuseppe G. <gg9...@un...> - 2006-01-16 15:16:13
|
Tirath Ramdas wrote:

> Hi Giuseppe,
>
> On 16/01/2006, at 10:16 PM, Giuseppe Grieco wrote:
>
>> <snip>
>> PS_HWPC_CONFIG = fp.xml (I tried also using absolute path)
>> <snip>
>
> I'm not sure if this will help, but did you try to `export PS_HWPC_CONFIG`?
>
> -tirath

Dear Tirath,

I tried again and it works now. Before, I was setting the environment variable in the makefile and it did not work. Setting it in my shell works.

Thanks,
Giuseppe.
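A likely explanation for the difference (an assumption, since the thread does not confirm the mechanism) is that a program only sees variables that are actually present in its process environment. A plain assignment in a makefile sets a make variable, and a program launched later from the shell never sees it. The sketch below, in Python, shows the general behaviour with a child process. Only the name PS_HWPC_CONFIG and the value fp.xml come from the thread; the helper function is hypothetical.

```python
import os
import subprocess
import sys


def child_sees(name: str, env: dict) -> str:
    """Run a child Python process and return the value it sees for `name`."""
    code = "import os, sys; sys.stdout.write(os.environ.get(sys.argv[1], '<unset>'))"
    result = subprocess.run(
        [sys.executable, "-c", code, name],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Environment without the variable: the child cannot see it.
    base = {k: v for k, v in os.environ.items() if k != "PS_HWPC_CONFIG"}
    print(child_sees("PS_HWPC_CONFIG", base))

    # Environment with the variable set, as after `export` in the shell.
    exported = dict(base, PS_HWPC_CONFIG="fp.xml")
    print(child_sees("PS_HWPC_CONFIG", exported))
```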
From: Giuseppe G. <gg9...@un...> - 2006-01-16 11:18:46
|
Dear all,

I am trying to use a different configuration file for counting events. It works when running my code with

% psrun -c fp.xml a.out

but it does not work when I use the libpshwpc library. In my makefile I set

PS_HWPC_CONFIG = fp.xml (I also tried using an absolute path)

but the XML output file is produced according to the default configuration file. What am I doing wrong?

Thanks,
Giuseppe.
From: Rick K. <rk...@nc...> - 2006-01-12 14:06:00
|
Tirath,

On Thu, 12 Jan 2006, Tirath Ramdas wrote:

> Ah yes, you're right that was the problem... the Debian package
> tcl8.4-dev was installed, but for some reason the tcl.h and other
> header files were installed in /usr/include/tcl8.4, which means even
> specifying a ./configure --tclinclude=/usr/include/tcl8.4 would fail
> since the configure script automatically adds a "/include" to
> whatever tclinclude is specified!

Good news, glad to hear it. Not the part about configure adding "/include", seems to me that's not very friendly of it! That's a problem on this end which I will fix before the next release. It falls into the "what was I thinking?" category.

> As a kludgy (but tolerable I hope) solution, I set and exported
> C_INCLUDE_PATH="/usr/include/tcl8.4". I guess now that the tools are
> all built I should be able to get rid of it... but I'll leave it in
> my .bashrc for now, I can't see any problems arising because of this,
> correct me if I'm wrong.
>
> Anyway, all is well now! Thanks for your help Rick!

Your solution sounds good to me. I'd never heard of C_INCLUDE_PATH before, it's good to know about it. I would imagine that now you have the Tcl-related libraries built, you should be OK commenting it out and seeing how things work.

I'm glad the issues were sorted out. Thanks back to you for the problem report...

Rick
From: Tirath R. <ti...@tp...> - 2006-01-12 02:43:32
|
On 12/01/2006, at 12:50 AM, Rick Kufrin wrote:

> configure:22048: result: failed
>
> ... so I think that what might be going on here is that your system
> currently lacks the Tcl development packages and only has the prebuilt Tcl
> libraries and shells. I'm not familiar with Debian but a Google search
> resulted in a package called "tcl-devel" that looks promising.

Ah yes, you're right that was the problem... the Debian package tcl8.4-dev was installed, but for some reason the tcl.h and other header files were installed in /usr/include/tcl8.4, which means even specifying a ./configure --tclinclude=/usr/include/tcl8.4 would fail since the configure script automatically adds a "/include" to whatever tclinclude is specified!

As a kludgy (but tolerable I hope) solution, I set and exported C_INCLUDE_PATH="/usr/include/tcl8.4". I guess now that the tools are all built I should be able to get rid of it... but I'll leave it in my .bashrc for now, I can't see any problems arising because of this, correct me if I'm wrong.

Anyway, all is well now! Thanks for your help Rick!

cheers,
-tirath
From: Rick K. <rk...@nc...> - 2006-01-11 14:00:39
|
Giuseppe,

> I am testing a process that typically lasts 30-40 seconds and it is not
> so short that many zeros appear. For this process I have variations up
> to 100 % in the number of total floating point operations.
>
> I used perfsuite also to test a simple matrix-by-matrix multiplication
> that lasts between 0.5 and 1.0 seconds, but variations in this parameter
> are up to 20% and all counting parameters work.

OK, thanks for the further information about the extent of the variation in your experiments. I am still putting inaccuracies due to multiplexing at the top of the list.

> Does the number of subroutines I call influence this parameter? What
> can I do to have a more precise test?

For aggregate counting with psrun it shouldn't matter. Profiling is a different situation (as the comments forwarded by Harry M yesterday pointed out). I'd suggest using a non-multiplexed configuration file first to see if your results vary less. You can use an alternate config file with psrun through the "-c" option, for example:

psrun -c fp.xml a.out

where "fp.xml" is an XML file that contains the following:

<?xml version="1.0" encoding="UTF-8" ?>
<ps_hwpc_eventlist class="PAPI">
  <ps_hwpc_event type="preset" name="PAPI_FP_INS" />
  <ps_hwpc_event type="preset" name="PAPI_TOT_CYC" />
</ps_hwpc_eventlist>

This file should count only floating point instructions and total cycles on your system.

Rick
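When trying several small event lists like the fp.xml above, it can help to generate the files rather than edit them by hand. The following is only a sketch in Python using the standard library. The element and attribute names are copied from the example in the message above; the function names are hypothetical and nothing here is part of PerfSuite.

```python
import xml.etree.ElementTree as ET


def build_event_config(names, event_class="PAPI"):
    """Return the text of a config file listing the given preset events."""
    root = ET.Element("ps_hwpc_eventlist", {"class": event_class})
    for name in names:
        ET.SubElement(root, "ps_hwpc_event", {"type": "preset", "name": name})
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8" ?>\n' + body + "\n"


def list_event_names(xml_text):
    """Parse a config file's text and return the event names it lists."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [event.get("name") for event in root.findall("ps_hwpc_event")]


if __name__ == "__main__":
    text = build_event_config(["PAPI_FP_INS", "PAPI_TOT_CYC"])
    with open("fp.xml", "w", encoding="utf-8") as handle:
        handle.write(text)
    print(list_event_names(text))
```

The resulting file would then be passed to psrun with the "-c" option as described above.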
From: Rick K. <rk...@nc...> - 2006-01-11 13:50:58
|
Tirath,

Thanks for sending the config.log output - those things are pretty long but are helpful when trying to see what's happening during the build and install on someone's machine. I think I may have a clue as to what's going on:

On Wed, 11 Jan 2006, Tirath Ramdas wrote:
> I tried setting TCLLIB="/usr/local/share/perfsuite/tcllib/pshwpc/".
> With this, doing `package require pspapi` results in:
>
> "couldn't load file "/usr/local/lib/papi/libpspapi.so": /usr/local/
> lib/papi/libpspapi.so: cannot open shared object file: No such file
> or directory".
>
> I tried to find libpspapi.so (`sudo find /usr -name libpspapi.so`...
> that includes /usr/source) but no such file was found.
>
> Find attached the full config.log.
>

The library in question (pspapi) is a C-coded Tcl extension that's part of PerfSuite that exposes a portion of PAPI to Tcl scripts. psprocess is a Tcl script that uses it (the small graphical utility psconfig does as well). When PerfSuite is built, it has to check that the Tcl development libraries and header files are available on the target machine, which it does during the "configure" process. Here are the relevant portions from your configure output:

    configure:21992: checking for use of Tcl library
    configure:22019: gcc -c -g -O2 conftest.c >&5
    conftest.c:2:21: tcl.h: No such file or directory
    conftest.c: In function `main':
    conftest.c:7: error: `Tcl_AppInit' undeclared (first use in this function)
    conftest.c:7: error: (Each undeclared identifier is reported only once
    conftest.c:7: error: for each function it appears in.)
    configure:22025: $? = 1
    configure: failed program was:
    |
    | #include <tcl.h>
    | int main(argc, argv)
    | int argc;
    | char **argv;
    | {
    | Tcl_Main(argc,argv,Tcl_AppInit);
    | exit(0);
    | }
    |
    configure:22048: result: failed

... so I think that what might be going on here is that your system currently lacks the Tcl development packages and only has the prebuilt Tcl libraries and shells. I'm not familiar with Debian but a Google search resulted in a package called "tcl-devel" that looks promising. You might try installing the Tcl development things (you don't need Tk, which is a related package), reconfiguring PerfSuite, and trying again. If configure no longer complains at the above tests and if you see a subdirectory in $PREFIX/lib called "papi" after installation, we've located the culprit.

Rick |
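Before rerunning configure, it can save time to check where tcl.h actually lives, since the failing test above only reports that the compiler could not find it. A minimal sketch follows; the list of candidate directories is an assumption (common locations plus any /usr/include/tcl* directory), not something PerfSuite's configure script is known to search.

```python
import glob
import os


def find_header(header, candidate_dirs):
    """Return the directories from candidate_dirs that contain the named header file."""
    return [d for d in candidate_dirs if os.path.isfile(os.path.join(d, header))]


def default_candidates():
    """Assumed search locations; adjust for the system at hand."""
    dirs = ["/usr/include", "/usr/local/include"]
    # Some distributions place Tcl headers in a versioned subdirectory.
    dirs += sorted(glob.glob("/usr/include/tcl*"))
    return dirs


if __name__ == "__main__":
    hits = find_header("tcl.h", default_candidates())
    if hits:
        for directory in hits:
            print("tcl.h found in", directory)
    else:
        print("tcl.h not found; the Tcl development package may be missing")
```

If the header turns up in a directory the compiler does not search by default, the configure test will still fail until that directory is made visible to the compiler.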
From: Giuseppe G. <gg9...@un...> - 2006-01-11 09:59:39
|
>Giuseppe,
>
>>> I am using perfsuite with PAPI, but I am not sure if multiplexing is
>>> enabled. How to verify it? Is it necessary to install MPX?
>>> Variation that occurs are up to 100 % so unuseful for me, and I always
>>> use the same intel compiler with the same option. Thanks, Giuseppe.
>
>It's not necessary to install MPX. That's a software layer that was
>developed by John May at Lawrence Livermore. It's the basis for PAPI's
>multiplexing and so comes with PAPI. It's used when you are using
>PerfSuite in counting (not profiling) mode, and you can tell when it has
>been enabled by either looking in the "Statistics" section of psprocess
>output or looking at the XML output files that psrun creates.
>
>If you have a program that runs for a very short time, it's possible that
>you may see many event counts of zero, and typically that occurs because
>the process is so short-lived, there is not enough time to cycle through
>all the events. You can force runs to happen without multiplexing by
>creating your own configuration file; there is a short description of how
>to do so here:
>
>http://perfsuite.ncsa.uiuc.edu/libpshwpc/sampleconfig.html
>
>Rick

Dear Rick,

I am testing a process that typically lasts 30-40 seconds, so it is not so short that many zeros appear. For this process I see variations of up to 100% in the total number of floating point operations.

I also used PerfSuite to test a simple matrix-by-matrix multiplication that lasts between 0.5 and 1.0 seconds; there the variation in this parameter is up to 20%, and all counting parameters work.

Does the number of subroutines I call influence this parameter? What can I do to get a more precise test?

Thanks, Giuseppe. |
From: Rick K. <rk...@nc...> - 2006-01-11 05:48:57
|
Tirath, First of all, you get hundreds of gold stars for going so deeply into trying to isolate the problem on your own :) Thanks for the detailed problem report and your attempts to set things right. I appreciate the effort to do a little Tcl debugging - I know Tcl's not on the list of in-vogue languages, and the intent of PerfSuite is for users to not even be aware that Tcl is involved (or have to debug Tcl code). The problems you are seeing can come from several places, but as a first cut, would you try setting the environment variable TCLLIBPATH to "/usr/local/share/perfsuite/tcllib", and then doing the psprocess run on your XML document again? If that doesn't work, then please send the output of your configure script (it should be contained in the file config.log in the directory where you built PerfSuite) so we can see where things were placed on your system. Two other things that would be helpful are your kernel version (uname -r) and the CPU type that you're using. Rick On Wed, 11 Jan 2006, Tirath Ramdas wrote: > Hi all, > > I've run psrun on a test Fortran code, and the xml was produced. I > did so with the profil.xml configuration and psprocess was able to > handle the psrun generated xml, however when I try to do so with a > modified papi3_p4.xml [2], psprocess dies: > PAPI support not found (required for this XML doc) > > I've dug a little deeper, doing this in tclsh: > % package require pspapi > can't find package pspapi > % package names > http tdom tcltest msgcat opt Tk Tcl > > Prior to this I've never mucked around with tcl. I tried a few things > like pkg_mkIndex and mucking around with tcl_pkgPath, with no > success. 
For what it's worth though, I did see this: > > % pkg_mkIndex -verbose /usr/local/lib > warning: error while loading libperfctr.so: couldn't find procedure > Perfctr_Init > warning: error while loading libpapi.so: couldn't find procedure > Papi_Init > warning: error while loading libpshwpc.so: couldn't find procedure > Pshwpc_Init > warning: error while loading libpshwpc_r.so: couldn't find procedure > Pshwpc_r_Init > warning: error while loading libpsrun.so: couldn't load file "./ > libpsrun.so": ./libpsrun.so: undefined symbol: _ps_c2fstring > warning: error while loading libpsrun_r.so: couldn't load file "./ > libpsrun_r.so": ./libpsrun_r.so: undefined symbol: _ps_c2fstring > > I installed papi and perfsuite in /usr/local. tdom was obtained with > apt-get, and is in /usr. I'm stumped! Any pointers anyone? > > regards, > -tirath > [0] I did make -k check; everything was okay. > [1] This is a Debian system. > [2] The modification: I omitted PAPI_L2_DCR, which according to psinv > -p is not available on my system. > [3] Contents of /usr/local/lib: > firmware libpshwpc.la libpsrun_r.a > libpapi.a libpshwpc_r.a libpsrun_r.la > libpapi.so libpshwpc_r.la libpsrun_r.so > libperfctr.a libpshwpc_r.so libpsrun_r.so.0 > libperfctr.so libpshwpc_r.so.0 libpsrun_r.so.0.0.0 > libperfctr.so.5 libpshwpc_r.so.0.0.0 libpsrun.so > libperfctr.so.5.2.6.15 libpshwpc.so libpsrun.so.0 > libperfctr.so.5.2.6.18 libpshwpc.so.0 libpsrun.so.0.0.0 > libperfsuite.a libpshwpc.so.0.0.0 pkgIndex.tcl > libperfsuite_r.a libpsrun.a python2.3 > libpshwpc.a libpsrun.la site_ruby > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! 
> http://ads.osdn.com/?ad_id=7637&alloc_id=16865&op=click > _______________________________________________ > PerfSuite-users mailing list > Per...@li... > https://lists.sourceforge.net/lists/listinfo/perfsuite-users > |
From: Tirath R. <ti...@tp...> - 2006-01-11 05:01:25
|
Hi all,

I've run psrun on a test Fortran code, and the xml was produced. I did so with the profil.xml configuration and psprocess was able to handle the psrun generated xml, however when I try to do so with a modified papi3_p4.xml [2], psrun works but psprocess dies:

    "PAPI support not found (required for this XML doc)"

I've dug a little deeper, doing this in tclsh:

    % package require pspapi
    can't find package pspapi
    % package names
    http tdom tcltest msgcat opt Tk Tcl

My TCL skills are virtually zero. I tried a few things like pkg_mkIndex and mucking around with tcl_pkgPath, with no success. For what it's worth though, I did see this:

    % pkg_mkIndex -verbose /usr/local/lib
    warning: error while loading libperfctr.so: couldn't find procedure Perfctr_Init
    warning: error while loading libpapi.so: couldn't find procedure Papi_Init
    warning: error while loading libpshwpc.so: couldn't find procedure Pshwpc_Init
    warning: error while loading libpshwpc_r.so: couldn't find procedure Pshwpc_r_Init
    warning: error while loading libpsrun.so: couldn't load file "./libpsrun.so": ./libpsrun.so: undefined symbol: _ps_c2fstring
    warning: error while loading libpsrun_r.so: couldn't load file "./libpsrun_r.so": ./libpsrun_r.so: undefined symbol: _ps_c2fstring

I installed papi and perfsuite in /usr/local. tdom was obtained with apt-get, and is in /usr. I'm stumped! Any pointers anyone?

regards,
-tirath

[0] I did make -k check; everything was okay.
[1] This is a Debian system.
[2] The modification: I omitted PAPI_L2_DCR, which according to psinv -p is not available on my system.
[3] Contents of /usr/local/lib:

    firmware                libpshwpc.la          libpsrun_r.a
    libpapi.a               libpshwpc_r.a         libpsrun_r.la
    libpapi.so              libpshwpc_r.la        libpsrun_r.so
    libperfctr.a            libpshwpc_r.so        libpsrun_r.so.0
    libperfctr.so           libpshwpc_r.so.0      libpsrun_r.so.0.0.0
    libperfctr.so.5         libpshwpc_r.so.0.0.0  libpsrun.so
    libperfctr.so.5.2.6.15  libpshwpc.so          libpsrun.so.0
    libperfctr.so.5.2.6.18  libpshwpc.so.0        libpsrun.so.0.0.0
    libperfsuite.a          libpshwpc.so.0.0.0    pkgIndex.tcl
    libperfsuite_r.a        libpsrun.a            python2.3
    libpshwpc.a             libpsrun.la           site_ruby
|
From: Tirath R. <ti...@in...> - 2006-01-11 04:47:13
|
Hi all, I've run psrun on a test Fortran code, and the xml was produced. I did so with the profil.xml configuration and psprocess was able to handle the psrun generated xml, however when I try to do so with a modified papi3_p4.xml [2], psprocess dies: PAPI support not found (required for this XML doc) I've dug a little deeper, doing this in tclsh: % package require pspapi can't find package pspapi % package names http tdom tcltest msgcat opt Tk Tcl Prior to this I've never mucked around with tcl. I tried a few things like pkg_mkIndex and mucking around with tcl_pkgPath, with no success. For what it's worth though, I did see this: % pkg_mkIndex -verbose /usr/local/lib warning: error while loading libperfctr.so: couldn't find procedure Perfctr_Init warning: error while loading libpapi.so: couldn't find procedure Papi_Init warning: error while loading libpshwpc.so: couldn't find procedure Pshwpc_Init warning: error while loading libpshwpc_r.so: couldn't find procedure Pshwpc_r_Init warning: error while loading libpsrun.so: couldn't load file "./ libpsrun.so": ./libpsrun.so: undefined symbol: _ps_c2fstring warning: error while loading libpsrun_r.so: couldn't load file "./ libpsrun_r.so": ./libpsrun_r.so: undefined symbol: _ps_c2fstring I installed papi and perfsuite in /usr/local. tdom was obtained with apt-get, and is in /usr. I'm stumped! Any pointers anyone? regards, -tirath [0] I did make -k check; everything was okay. [1] This is a Debian system. [2] The modification: I omitted PAPI_L2_DCR, which according to psinv -p is not available on my system. 
[3] Contents of /usr/local/lib: firmware libpshwpc.la libpsrun_r.a libpapi.a libpshwpc_r.a libpsrun_r.la libpapi.so libpshwpc_r.la libpsrun_r.so libperfctr.a libpshwpc_r.so libpsrun_r.so.0 libperfctr.so libpshwpc_r.so.0 libpsrun_r.so.0.0.0 libperfctr.so.5 libpshwpc_r.so.0.0.0 libpsrun.so libperfctr.so.5.2.6.15 libpshwpc.so libpsrun.so.0 libperfctr.so.5.2.6.18 libpshwpc.so.0 libpsrun.so.0.0.0 libperfsuite.a libpshwpc.so.0.0.0 pkgIndex.tcl libperfsuite_r.a libpsrun.a python2.3 libpshwpc.a libpsrun.la site_ruby |
From: Rick K. <rk...@nc...> - 2006-01-11 04:39:45
|
Giuseppe, > I am using perfsuite with PAPI, but I am not sure if multiplexing is > enabled. How to verify it? Is it necessary to install MPX? > Variation that occurs are up to 100 % so unuseful for me, and I always > use the same intel compiler with the same option. Thanks, Giuseppe. > It's not necessary to install MPX. That's a software layer that was developed by John May at Lawrence Livermore. It's the basis for PAPI's multiplexing and so comes with PAPI. It's used when you are using PerfSuite in counting (not profiling) mode, and you can tell when it has been enabled by either looking in the "Statistics" section of psprocess output or looking at the XML output files that psrun creates. If you have a program that runs for a very short time, it's possible that you may see many event counts of zero, and typically that occurs because the process is so short-lived, there is not enough time to cycle through all the events. You can force runs to happen without multiplexing by creating your own configuration file; there is a short description of how to do so here: http://perfsuite.ncsa.uiuc.edu/libpshwpc/sampleconfig.html Rick |
From: Harry M. <hj...@ta...> - 2006-01-10 16:11:59
|
I'll defer to Rick, but I think that the perfsuite tools use statistical sampling, and thus especially over short time courses, there will be significant differences in reporting just because of the sampling interval. For infrequently used routines in some profiling I've done recently, the values vary by over 30% between identical runs (of ~10s of CPU time).

The following is perhaps slightly off-topic, but it does a lot to explain what profilers will give you and what they cannot and why:

Rob Fowler <rj...@ri...> kindly wrote in response to a similar query from me:

=====

I haven't tried to digest all of the details, but things like this are the reason why we tend to emphasize loop-level analysis and to take statement-level numbers with a grain of salt. There are three primary contributors to phenomena like this:

* First, optimizing compilers will be aggressively rearranging the code. The generated code is a shuffle of instructions from different statements.

* Second, remember that the performance of modern CPUs depends on a lot of instruction level parallelism and there can be dozens of instructions "in flight" at a time and instructions can be issued and completed out of order.

* Third, when an event occurs, processors tend to be sloppy w.r.t. the attribution of the event to a specific instruction. The reported program counter for any one event is subject to "skew and smear", i.e. it's likely to be attributed to some nearby instruction that is currently in the pipeline. For example, it might be the most recent instruction to enter the pipeline. Thus, if you look at the instruction level, you can see seemingly nonsensical stuff like "loads" being charged to floating point instructions, vice versa, etc.

These three components all contribute to making instruction and statement level counts imprecise. On the other hand, averaged over hundreds of instructions, the aggregate numbers are very stable and reliable.
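The contrast between unreliable per-instruction numbers and stable aggregates can be illustrated with a toy simulation. This is a sketch under loudly stated assumptions: the loop body of 8 instructions, the 80/20 split of events, and the uniform skew of 0 to 3 instructions are all invented for illustration and do not model any real processor.

```python
import random
from collections import Counter


def attribute_events(true_pcs, max_skew, rng, n_instructions):
    """Charge each event to an instruction up to max_skew slots after the true one.

    The index wraps around, as it would for a loop body executed repeatedly.
    """
    reported = Counter()
    for pc in true_pcs:
        reported[(pc + rng.randint(0, max_skew)) % n_instructions] += 1
    return reported


N_INSTRUCTIONS = 8  # hypothetical loop body
rng = random.Random(0)

# Assumed workload: 8000 events really occur at instruction 3,
# and 2000 more are spread over the whole loop body.
true_pcs = [3] * 8000 + [rng.randrange(N_INSTRUCTIONS) for _ in range(2000)]

exact = attribute_events(true_pcs, 0, rng, N_INSTRUCTIONS)    # precise attribution
smeared = attribute_events(true_pcs, 3, rng, N_INSTRUCTIONS)  # skewed attribution

if __name__ == "__main__":
    for pc in range(N_INSTRUCTIONS):
        print("instruction %d: exact=%5d smeared=%5d" % (pc, exact[pc], smeared[pc]))
    print("loop total: exact=%d smeared=%d" % (sum(exact.values()), sum(smeared.values())))
```

In this model the per-instruction columns disagree badly, while the loop totals are identical by construction, which is the sense in which loop-level numbers are the ones to trust.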
** A brief religious statement: On encountering the attribution problem for deeply pipelined, out-of-order processors, some architects have just chosen to ignore it. Others, i.e., the Alpha architects, abandoned conventional event counts in favor of other mechanisms. Still others have risen to the challenge and have implemented clever and relatively expensive mechanisms to restore precise attribution, i.e., Power 5. Since performance issues on high-ILP machines are not a matter of any single instruction, but rather "how they play with a few dozen of their closest friends", I believe that while precise attribution may be an admirable goal, it is not necessary and not worth paying a lot for, at least not for the kinds of analyses we do and certainly not for tools that use coarse-grain calipers for measurement.

A recurring scenario that we've run into is a loop in which, say, 80% of the cost (or other measures) has been attributed to a single statement. The developer sees a big potential for improvement and rearranges the code to try to get a big win by reducing the cost of that one statement. The confusing result is that the overall cost is unchanged, but now a different statement gets charged the 80%.

I hope this helps.

-- Rob

On Tuesday 10 January 2006 05:58, Rick Kufrin wrote:
> On Tue, 10 Jan 2006, Giuseppe Grieco wrote:
> > I would like to know why it happens that monitoring the same process,
> > the number of total floating point operations changes.
> > I guess it should remain the same. Is it?
> > Thanks, Giuseppe
>
> Giuseppe,
>
> I think it depends on a number of factors. If you are using default
> configuration files for PerfSuite, remember that multiplexing (timesharing
> of the available performance counter registers) will be occurring. By
> nature this results in estimates of the true number of event occurrences
> and will be inexact, so the counts can be expected to vary from run to
> run.
>
> Also, depending on the CPU type and the underlying access method (Perfmon
> or PAPI), the "floating point operation" count may also include vector
> operations (e.g. x86 SSE), which are not all strictly floating point
> operations. Some things are compiler-dependent.
>
> Finally, things can vary to some extent just to different runtime
> conditions although it's hard to say since you don't indicate to what
> degree you are seeing variation.
>
> Rick

--
Cheers, Harry
Harry J Mangalam - 949 856 2847 (vox; email for fax) - hj...@ta...
<<plain text preferred>> |
From: Giuseppe G. <gg9...@un...> - 2006-01-10 15:25:14
|
Dear Rick,

I am using PerfSuite with PAPI, but I am not sure whether multiplexing is enabled. How can I verify it? Is it necessary to install MPX? The variation I see is up to 100%, which makes the results useless for me, and I always use the same Intel compiler with the same options.

Thanks, Giuseppe. |
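When reporting variation like this, it helps to state exactly how the percentage was computed. One possible definition, offered here only as an assumption about what "up to 100%" might mean, is the range of the repeated counts divided by their mean. The run totals below are invented numbers, not measurements from this thread.

```python
from statistics import mean


def relative_spread(counts):
    """Return (max - min) / mean for a list of repeated measurements, as a fraction."""
    if not counts:
        raise ValueError("need at least one measurement")
    average = mean(counts)
    if average == 0:
        return 0.0
    return (max(counts) - min(counts)) / average


# Hypothetical floating point totals from five runs of the same binary.
runs = [1000, 1500, 900, 2000, 1100]

if __name__ == "__main__":
    print("relative spread: %.0f%%" % (100 * relative_spread(runs)))
```

Quoting the number of runs alongside the spread makes it easier for others to judge whether the variation is unusual.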
From: Rick K. <rk...@nc...> - 2006-01-10 13:58:33
|
On Tue, 10 Jan 2006, Giuseppe Grieco wrote:
> I would like to know why it happens that monitoring the same process,
> the number of total floating point operations changes.
> I guess it should remain the same. Is it?
> Thanks, Giuseppe

Giuseppe,

I think it depends on a number of factors. If you are using default configuration files for PerfSuite, remember that multiplexing (timesharing of the available performance counter registers) will be occurring. By nature this results in estimates of the true number of event occurrences and will be inexact, so the counts can be expected to vary from run to run.

Also, depending on the CPU type and the underlying access method (Perfmon or PAPI), the "floating point operation" count may also include vector operations (e.g. x86 SSE), which are not all strictly floating point operations. Some things are compiler-dependent.

Finally, things can vary to some extent just due to different runtime conditions, although it's hard to say since you don't indicate to what degree you are seeing variation.

Rick |
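Why multiplexing produces estimates rather than exact counts can be seen in a deliberately simplified model. The sketch below assumes a round-robin scheme in which an event is counted only during its group's time slices and the observed count is then scaled up to the full run. This is an illustration of the general idea only and is not a description of how PAPI implements multiplexing; the workload numbers are invented.

```python
def multiplexed_estimate(events_per_slice, n_groups, start_group, group=0):
    """Estimate a total count when the event is only counted in some time slices.

    events_per_slice: true number of events in each time slice
    n_groups:         number of event groups sharing the counters round-robin
    start_group:      rotation offset at program start (differs between runs)
    """
    observed = 0
    active_slices = 0
    for t, count in enumerate(events_per_slice):
        if (t + start_group) % n_groups == group:
            observed += count
            active_slices += 1
    if active_slices == 0:
        return 0.0
    # Scale the partial observation up to the length of the whole run.
    return observed * len(events_per_slice) / active_slices


# Hypothetical bursty workload: the true total is the same in every run.
workload = [120, 80, 400, 40, 0, 900, 300, 60, 20, 500, 100, 480]
true_total = sum(workload)

# Same program, three different rotation offsets, three different estimates.
estimates = [multiplexed_estimate(workload, 3, start) for start in range(3)]
unmultiplexed = multiplexed_estimate(workload, 1, 0)

if __name__ == "__main__":
    print("true total:   ", true_total)
    print("multiplexed:  ", estimates)
    print("unmultiplexed:", unmultiplexed)
```

In this model the estimates are correct on average but individually far off whenever the program's activity is uneven across time slices, while counting a single group recovers the true total.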
From: Giuseppe G. <gg9...@un...> - 2006-01-10 13:00:03
|
I would like to know why, when monitoring the same process, the total number of floating point operations changes from run to run. I guess it should remain the same. Should it?

Thanks, Giuseppe |
From: Rick K. <rk...@nc...> - 2005-12-22 19:28:26
|
Greetings,

I have been adding some additional metrics to the default set of metrics calculated by the psprocess command when doing aggregate counting. These metrics are processed on the fly from an XML database that psprocess reads at runtime. The database used is located after installation in the directory:

    PREFIX/share/perfsuite/xml/pshwpc

There are two variants: PAPI_metrics.xml and perfmon_metrics.xml. The one used depends on whether you are using PAPI or perfmon for any particular run.

The metrics that exist to date are just those that were thought to be a decent set, but the set could be extended. Users can add their own custom metrics contained in a separate XML document (the "-m" option to psprocess), but that is localized.

So this email is to let you know that if there are metrics that you wish were presented but are not (and you know the definition of such a metric), please feel free to send an email, and if the metric seems useful, I'll be happy to add it to the default set of metrics for the next release of PerfSuite.

Thanks,
Rick |
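As an example of the kind of definition that is useful to send along with a metric request, here is the arithmetic for one simple derived metric written out explicitly. This is only a sketch: the metric name fp_ins_per_cycle and the function are hypothetical, the input numbers are invented, and nothing here reflects the XML format that psprocess uses for metric definitions.

```python
def derived_metrics(counts):
    """Compute a derived metric from raw event counts keyed by PAPI preset name."""
    fp_ins = counts["PAPI_FP_INS"]
    cycles = counts["PAPI_TOT_CYC"]
    if cycles <= 0:
        raise ValueError("PAPI_TOT_CYC must be positive")
    # Floating point instructions retired per processor cycle.
    return {"fp_ins_per_cycle": fp_ins / cycles}


if __name__ == "__main__":
    # Hypothetical counts, not taken from a real run.
    example = {"PAPI_FP_INS": 250000, "PAPI_TOT_CYC": 1000000}
    print(derived_metrics(example))
```

Stating a metric as a formula over named events, as above, leaves no ambiguity about which counters it needs.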