From: Tirath R. <ti...@tp...> - 2006-01-19 10:10:53
Hi Rick and rest of list,

> Sometimes it's good to rule things out first and look more closely at
> what's remaining. In this case, it seems to me that PAPI is not the
> source of the problem because by the time the XML document is generated,
> PAPI is out of the picture. Its job is to accumulate sample counts that

Agreed, PAPI is most likely not to blame at all. psprocess and tcl are my most likely suspects, I think, especially given my history of psprocess/tcl related issues (see earlier posts, "psprocess, PAPI support not found").

>> psprocess -e /opt/gamess_src/gamess.02.x /usr/scr/gamess.02.x.873.xml | less
> [ some mappings, no source line info ]

The missing mappings are for shared libraries -- specifically, ATLAS. The missing source line info I'm assuming was because the binary was built without debugging symbols (i.e. gcc WITHOUT -g)?

> mapping? Also, psprocess supports an "-o" option, which should cause the
> output to be written to a file. Does the same issue come up with that?

Ah, I didn't know about the "-o" option! D'oh.. that oughta teach me to RTFM from now on!

> I'm guessing that since your file names are identical (both the executable
> and the XML document), that nothing has changed along those lines, like a
> recompile, but it doesn't hurt to verify that with you.

Yes that's correct -- identical binaries. And yes, the different psprocess output as previously described was observed with identical XML input.

> It might be easier to help if you could provide access to the executable
> and the XML document in question - is this possible? (only reply to me or

Sure, will send the XML offlist. As for remote access... alas the boxen I have access to are all hidden behind VPN unfortunately. :(

Fortunately, the problem IS reproducible... very fortunate because in an unforgivable lapse of judgement I had deleted the original offending xmls! Here is a twist though.
With the newly generated xml, I obtain all mappings (except those I believe to be due to shared libraries) only with:

(A)
A1: `psprocess -e /opt/gamess_src/gamess.02.x gamess.02.x.1049.xml`
A2: `psprocess -o tmp1.txt -e /opt/gamess_src/gamess.02.x gamess.02.x.1049.xml`
[ some mappings, no source line info ]

With all of the following, I get no mappings:

(B)
B1: `psprocess -e /opt/gamess_src/gamess.02.x gamess.02.x.1049.xml | less`
B2: `psprocess -e /opt/gamess_src/gamess.02.x gamess.02.x.1049.xml > tmp.txt`
[ no mappings ]

Note that the observed behaviour is now the __INVERSE__ of what was originally documented. However, as I mentioned in the first post, there has been one other case where the B2 command produced no mappings; if that weren't the case I would not have noticed a problem existed at all.

For the sake of recording all this, full outputs are listed at the bottom of this post for both case A and B. These outputs were validated multiple times with the same xml file.

I re-ran the job to produce a fresh xml - the same behaviour as noted above is observed. Let's call this case 0. (XML labelled appropriately will be sent offlist.)

I changed the counter active in the profiling config xml to PAPI_L1_LDM (as opposed to PAPI_L2_LDM): __all__ cases (B1, B2, A1, A2) produce the __same textual output__, i.e. some mappings, no source line info. Let's call this case 1. (XML labelled appropriately will be sent offlist.)

So crude preliminary observations:

PAPI_L1_LDM profiling xml output file and all other counter profiling xml files as far as I know: fine
PAPI_L2_LDM profiling xml output file: dramas

Let me know if there is any more testing you can think of that would be beneficial. psinv and psprocess output enclosed below. Let me know how you think we ought to proceed.
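
One way to pin down exactly where the case A and case B reports diverge is to diff the saved text outputs while skipping the "Created" timestamp, which changes on every run. A minimal Python sketch, assuming the reports have been saved to files such as tmp1.txt (from "-o") and tmp.txt (from the shell redirect); the name `report_diff` is purely illustrative and not part of PerfSuite:

```python
import difflib


def report_diff(text_a, text_b, ignore_prefixes=("Created",)):
    """Return unified-diff lines between two psprocess text reports.

    Lines starting with any prefix in ignore_prefixes (a tuple; by
    default the "Created : <timestamp>" header line) are dropped
    before comparing, because they differ between runs regardless of
    the profile content.
    """
    def keep(line):
        return not line.lstrip().startswith(ignore_prefixes)

    lines_a = [line for line in text_a.splitlines() if keep(line)]
    lines_b = [line for line in text_b.splitlines() if keep(line)]
    return list(difflib.unified_diff(lines_a, lines_b, "A", "B", lineterm=""))


# Example use with the files named above (hypothetical paths):
#   with open("tmp1.txt") as fa, open("tmp.txt") as fb:
#       print("\n".join(report_diff(fa.read(), fb.read())))
```

An empty result would mean the two reports agree apart from the timestamp; anything else shows the first summary section where the mappings differ.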
cheers,
-tirath

> the system and software - that can be gotten with the output of
> "psinv"

System Information -
  Processors:             1
  Total Memory (MB):      883.40
  System Page Size (KB):  4.00

Processor Information -
  Vendor:            Intel
  Processor family:  Pentium 4
  Brand:             Intel(R) Pentium(R) 4 CPU 3.00GHz
  Model (Type):      (unknown)
  Revision:          1
  Clock Speed:       3000.11 MHz

Cache and TLB Information -
  Cache levels:  2
  Caches/TLBs:   5

Cache Details -
  Level 1:
    Type:           Data
    Size:           16 KB
    Line size:      64 bytes
    Associativity:  8-way set associative

    Type:           Instruction Trace
    Size:           12K uOps
    Associativity:  8-way set associative

  Level 2:
    Type:           Unified
    Size:           1.00 MB
    Line size:      64 bytes
    Associativity:  8-way set associative

TLB Details -
  Level 1:
    Type:           Instruction
    Entries:        64
    Pagesize (KB):  4 2048 4096
    Associativity:  Fully associative

    Type:           Data
    Entries:        64
    Pagesize (KB):  4 4096
    Associativity:  Fully associative

> and "psinv -x".

[Note: "BFD n/a" may be an important clue!]

tramdas@PC00019127:/opt/gamess_src$ psinv -x
Software Version Information -
  Build date:       Jan 11 2006 19:00:17
  BFD:              n/a
  GNU C Library:    2.3.2
  PerfSuite:        0.6.1
  Real time clock:  CPU interval timer
  Tcl shell:        8.4.9
  tDOM versions:    0.7.8

##########
########## Astart, start output for A:
##########

PerfSuite Hardware Performance Summary Report

Version     : 1.0
Created     : Thu Jan 19 20:19:04 EST 2006
Generator   : psprocess 0.2
XML Source  : gamess.02.x.1049.xml
Executable  : /opt/gamess_src/gamess.02.x

Execution Information
============================================================================================
Date  : Thu Jan 19 19:46:48 2006
Host  : PC00019127.eng.monash.edu.au
User  : tramdas

Processor and System Information
============================================================================================
Node CPUs      : 1
Vendor         : Intel
Family         : Pentium 4
Brand          : Intel(R) Pentium(R) 4 CPU 3.00GHz
CPU Revision   : 1
Clock (MHz)    : 3000.106
Memory (MB)    : 883.40
Pagesize (KB)  : 4

Cache Information
============================================================================================
Cache levels  : 2
--------------------------------
Level 1
  Type          : data
  Size (KB)     : 16
  Linesize (B)  : 64
  Assoc         : 8

  Type          : instruction trace
  Size (KuOps)  : 12
  Assoc         : 8
--------------------------------
Level 2
  Type          : unified
  Size (KB)     : 1024
  Linesize (B)  : 64
  Assoc         : 8

Profile Information
============================================================================================
Class       : PAPI
Event       : PAPI_L2_LDM (Level 2 load misses)
Period      : 10000
Samples     : 292
Domain      : user
Run Time    : 494.21 (seconds)
Min Self %  : (all)

Module Summary
--------------------------------------------------------------------------------
Samples  Self %  Total %  Module
    222  76.03%   76.03%  /opt/gamess_src/gamess.02.x
     65  22.26%   98.29%  /usr/lib/atlas/sse2/libblas.so.3.0
      4   1.37%   99.66%  /lib/libc-2.3.2.so
      1   0.34%  100.00%  /lib/libpthread-0.10.so

File Summary
--------------------------------------------------------------------------------
Samples  Self %  Total %  File
    223  76.37%   76.37%  ?
     69  23.63%  100.00%  ??

Function Summary
--------------------------------------------------------------------------------
Samples  Self %  Total %  Function
     80  27.40%   27.40%  sotran_
     69  23.63%   51.03%  ??
     43  14.73%   65.75%  dirfck_
     24   8.22%   73.97%  vclr_
     16   5.48%   79.45%  prefin_
     16   5.48%   84.93%  jacdia_
      5   1.71%   86.64%  xyzint_
      5   1.71%   88.36%  sonewt_
      4   1.37%   89.73%  twoei_
      4   1.37%   91.10%  shlden_
      3   1.03%   92.12%  shells_
      3   1.03%   93.15%  coproj_
      3   1.03%   94.18%  symmos_
      2   0.68%   94.86%  bndord_
      2   0.68%   95.55%  vadd_
      2   0.68%   96.23%  schwdn_
      2   0.68%   96.92%  sograd_
      2   0.68%   97.60%  symtrd_
      1   0.34%   97.95%  extrap_
      1   0.34%   98.29%  prcalc_
      1   0.34%   98.63%  dmtx_
      1   0.34%   98.97%  zmat2_
      1   0.34%   99.32%  symdia_
      1   0.34%   99.66%  pthread_mutex_lock
      1   0.34%  100.00%  salcao_

Function:File:Line Summary
--------------------------------------------------------------------------------
Samples  Self %  Total %  Function:File:Line
     80  27.40%   27.40%  sotran_:?:0
     69  23.63%   51.03%  ??:??:0
     43  14.73%   65.75%  dirfck_:?:0
     24   8.22%   73.97%  vclr_:?:0
     16   5.48%   79.45%  prefin_:?:0
     16   5.48%   84.93%  jacdia_:?:0
      5   1.71%   86.64%  sonewt_:?:0
      5   1.71%   88.36%  xyzint_:?:0
      4   1.37%   89.73%  shlden_:?:0
      4   1.37%   91.10%  twoei_:?:0
      3   1.03%   92.12%  symmos_:?:0
      3   1.03%   93.15%  shells_:?:0
      3   1.03%   94.18%  coproj_:?:0
      2   0.68%   94.86%  bndord_:?:0
      2   0.68%   95.55%  vadd_:?:0
      2   0.68%   96.23%  schwdn_:?:0
      2   0.68%   96.92%  sograd_:?:0
      2   0.68%   97.60%  symtrd_:?:0
      1   0.34%   97.95%  extrap_:?:0
      1   0.34%   98.29%  prcalc_:?:0
      1   0.34%   98.63%  dmtx_:?:0
      1   0.34%   98.97%  zmat2_:?:0
      1   0.34%   99.32%  symdia_:?:0
      1   0.34%   99.66%  pthread_mutex_lock:?:0
      1   0.34%  100.00%  salcao_:?:0

##########
########## Aend, end output for A.
##########

##########
########## Bstart, start output for B:
##########

PerfSuite Hardware Performance Summary Report

Version     : 1.0
Created     : Thu Jan 19 20:25:23 EST 2006
Generator   : psprocess 0.2
XML Source  : gamess.02.x.1049.xml
Executable  : /opt/gamess_src/gamess.02.x

Execution Information
============================================================================================
Date  : Thu Jan 19 19:46:48 2006
Host  : PC00019127.eng.monash.edu.au
User  : tramdas

Processor and System Information
============================================================================================
Node CPUs      : 1
Vendor         : Intel
Family         : Pentium 4
Brand          : Intel(R) Pentium(R) 4 CPU 3.00GHz
CPU Revision   : 1
Clock (MHz)    : 3000.106
Memory (MB)    : 883.40
Pagesize (KB)  : 4

Cache Information
============================================================================================
Cache levels  : 2
--------------------------------
Level 1
  Type          : data
  Size (KB)     : 16
  Linesize (B)  : 64
  Assoc         : 8

  Type          : instruction trace
  Size (KuOps)  : 12
  Assoc         : 8
--------------------------------
Level 2
  Type          : unified
  Size (KB)     : 1024
  Linesize (B)  : 64
  Assoc         : 8

Profile Information
============================================================================================
Class       : PAPI
Event       : PAPI_L2_LDM (Level 2 load misses)
Period      : 10000
Samples     : 292
Domain      : user
Run Time    : 494.21 (seconds)
Min Self %  : (all)

Module Summary
--------------------------------------------------------------------------------
Samples  Self %  Total %  Module
    222  76.03%   76.03%  /opt/gamess_src/gamess.02.x
     65  22.26%   98.29%  /usr/lib/atlas/sse2/libblas.so.3.0
      4   1.37%   99.66%  /lib/libc-2.3.2.so
      1   0.34%  100.00%  /lib/libpthread-0.10.so

File Summary
--------------------------------------------------------------------------------
Samples  Self %  Total %  File
    292 100.00%  100.00%  ??

Function Summary
--------------------------------------------------------------------------------
Samples  Self %  Total %  Function
    292 100.00%  100.00%  ??

Function:File:Line Summary
--------------------------------------------------------------------------------
Samples  Self %  Total %  Function:File:Line
    292 100.00%  100.00%  ??:??:0

##########
########## Bend, end output for B.
##########
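
For reference, the Self % and Total % columns in both reports are consistent with each row's sample count divided by the total number of samples (292), and with the running sum of those counts. A minimal Python sketch under that assumption; it is inferred from the numbers above rather than taken from the psprocess source, and `summarize` and `unmapped_share` are hypothetical helper names:

```python
def summarize(rows):
    """rows: list of (samples, name), in the order shown in the report.

    Returns a list of (samples, self_pct, total_pct, name), where
    self_pct is the row's share of all samples and total_pct is the
    cumulative share up to and including that row, both rounded to
    two decimals.
    """
    total = sum(samples for samples, _ in rows)
    if total == 0:
        raise ValueError("no samples")
    out, running = [], 0
    for samples, name in rows:
        running += samples
        out.append((samples,
                    round(100.0 * samples / total, 2),
                    round(100.0 * running / total, 2),
                    name))
    return out


def unmapped_share(rows, marker="??"):
    """Percentage of samples attributed to the unresolved marker."""
    total = sum(samples for samples, _ in rows)
    if total == 0:
        raise ValueError("no samples")
    unmapped = sum(samples for samples, name in rows if name == marker)
    return round(100.0 * unmapped / total, 2)


# Module Summary rows, identical in outputs A and B.
modules = [(222, "/opt/gamess_src/gamess.02.x"),
           (65, "/usr/lib/atlas/sse2/libblas.so.3.0"),
           (4, "/lib/libc-2.3.2.so"),
           (1, "/lib/libpthread-0.10.so")]
for row in summarize(modules):
    print("%7d %7.2f%% %8.2f%%  %s" % row)
```

Applied to the File Summary rows, `unmapped_share` gives a single number that separates the two cases: the share of samples under "??" is the 69 of 292 shown in output A versus all 292 in output B.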