Re: [perfmon2] Pfmon and Hpcrun give inconsistent results.
From: Philip M. <mu...@cs...> - 2008-04-16 23:20:21
>> Hi Gary,
>> Phil,
>>
>> On Wed, Apr 16, 2008 at 11:22 AM, Philip Mucci <mu...@cs...> wrote:
>>> Folks,
>>>
>>> hpcrun does its sampling inside the target process using first-person
>>> access, not a third-person ptrace() like pfmon, so the process is
>>> implicitly blocked when processing samples, i.e. there are no dropped
>>> samples unless something else has gone wrong.
>>>
>> Thanks for clarifying this. It makes more sense given how PAPI works.
>
> I think this means that I just lost the best explanation I have had so far
> as to why I see these inconsistencies.
>
Yes, you are correct. The blocking is not an issue in hpcrun.

>>
>>> Another thing, you cannot rely on the sample count of hpcrun to
>>> compute cycles. Why? Because those are only the samples that have not
>>> been dropped. If samples occur outside of the sample space (as can
>>> happen when one has floating point exceptions), the address will be in
>>> kernel space and it will be dropped. pfmon has no concept of filtering
>>> out addresses, so even if you ask for user-space samples, you'll still
>>> get samples in the output with kernel addresses. I'm not sure what the
>>> default is for your version of pfmon.
>>>
>> pfmon does not do filtering of samples. It relies on the hardware via
>> the privilege levels. By default, pfmon only measures at the user level.
>> That does not mean you won't get kernel-level samples, because there
>> are boundary effects when sampling.
>>
>>> Which value is correct, according to /bin/time? 2 billion or 154 billion?
>>>
>> This is a valid point. Which value makes most sense related to time?
>
>
> Let me provide all the steps I use when running these tests. Maybe I
> am just doing something wrong and you can correct my misunderstanding.
>
> When I use "time" with my hpcrun test, it provides this information:
>
> time hpcrun -e CPU_CYCLES:32767 -o hpcrun.data -- ./code.exe >hpcrun 2>hpcrun.debug
> real 1m44.921s
> user 1m39.490s
> sys 0m2.420s
>
> The output from the hpcprof run on all of the data files produced by this
> test shows the following summary information:
>
> Columns correspond to the following events [event:period (events/sample)]
> CPU_CYCLES:32767 - CPU Cycles (29 samples) [not shown]
> CPU_CYCLES:32767 - CPU Cycles (9 samples) [not shown]
> CPU_CYCLES:32767 - CPU Cycles (29 samples) [not shown]
> CPU_CYCLES:32767 - CPU Cycles (29 samples) [not shown]
> CPU_CYCLES:32767 - CPU Cycles (41602 samples) [not shown]
> CPU_CYCLES:32767 - CPU Cycles (24490 samples) [not shown]
> CPU_CYCLES:32767 - CPU Cycles (7 samples) [not shown]
> CPU_CYCLES:32767 - CPU Cycles (5 samples) [not shown]
> CPU_CYCLES (min):32767 - CPU Cycles (The minimum for events of this type.) (1 samples)
> CPU_CYCLES (max):32767 - CPU Cycles (The maximum for events of this type.) (66127 samples)
> CPU_CYCLES (sum):32767 - CPU Cycles (Summed over all events of this type.) (66200 samples)
>

Gary, this executable must be multi-threaded? Does each of the threads do
the same amount of work? If so, the above is your clue. Most of the
threads are hitting the perfmon2 race where the signal comes in but gets
dropped, and thus monitoring is not restarted. PAPI from CVS has fixes
for this on perfmon2 platforms. Is this OS using perfmon2 or the old
'monolithic' perfmon interface? If this is perfmon1, then we may have an
issue here. But PAPI-CVS handles this properly for perfmon2 by using a
real-time signal. Judging from the other numbers you have below, I'd
guess that if you set the sample rate to something much lower (which is
certainly reasonable; a period of 32767 is awfully small for a 1600 MHz
processor), then you'd get more reasonable results.
Experience (from Mark and the Rice folks) has shown that this signal
dropping is much less likely to happen when the interrupt load is low.
So, I reckon if you set the sample period to 16,000,000 (approximately
100 samples/second), you'll get answers that match up.

Phil

P.S. Please get back to me on which version of PAPI and perfmon kernel
support you have.
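
For anyone following along, the arithmetic behind the numbers in this thread can be sketched in a few lines of Python. This is only a rough check, not part of hpcrun or pfmon. It assumes a constant 1600 MHz clock and treats the `user` time reported by `time` as the total busy time. The function names (`cycles_from_samples`, `cycles_from_time`, `period_for_rate`) are made up for illustration.

```python
# Rough sanity check of the figures quoted in this thread.
# Assumptions (not from any tool): constant 1600 MHz clock, and the
# "user" time from `time` stands in for total busy CPU time.

CLOCK_HZ = 1_600_000_000                 # nominal 1600 MHz processor
PERIOD = 32767                           # CPU_CYCLES sample period used with hpcrun
SAMPLES_PER_THREAD = [29, 9, 29, 29, 41602, 24490, 7, 5]  # from the hpcprof summary
USER_SECONDS = 1 * 60 + 39.490           # "user 1m39.490s"


def cycles_from_samples(samples: int, period: int) -> int:
    """Cycles implied by a sample count: each kept sample stands for `period` events."""
    return samples * period


def cycles_from_time(seconds: float, clock_hz: int) -> float:
    """Cycles implied by CPU time at a fixed clock rate."""
    return seconds * clock_hz


def period_for_rate(clock_hz: int, samples_per_second: int) -> int:
    """Sample period that yields roughly `samples_per_second` on a busy core."""
    return clock_hz // samples_per_second


total_samples = sum(SAMPLES_PER_THREAD)
sampled_cycles = cycles_from_samples(total_samples, PERIOD)
timed_cycles = cycles_from_time(USER_SECONDS, CLOCK_HZ)

print(f"total samples:         {total_samples}")
print(f"cycles from samples:   {sampled_cycles:.3e}")
print(f"cycles from user time: {timed_cycles:.3e}")
print(f"fraction accounted:    {sampled_cycles / timed_cycles:.2%}")
print(f"samples/s at {PERIOD}:   {CLOCK_HZ / PERIOD:,.0f}")
print(f"period for ~100/s:     {period_for_rate(CLOCK_HZ, 100):,}")
```

Under those assumptions the sample count accounts for roughly 2.17 billion cycles, while the user time implies roughly 159 billion, which is the same order-of-magnitude gap discussed above. The last line reproduces the suggested period of 16,000,000 for about 100 samples per second.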