Re: [perfmon2] Monitoring core and uncore events in the same testrun.
From: Dan T. <ter...@ee...> - 2009-09-24 00:07:03
Interestingly, there's a guy at ZIH Dresden who implemented a PAPI-C component
specifically to measure events on the uncore. He never tried to measure both
per-thread and uncore at the same time, and I doubt that it would work, but I
found it intriguing that he was able to get reasonable data. Like the dancing
dog, it's not how well he dances, but that he dances at all...

- d

> -----Original Message-----
> From: stephane eranian [mailto:er...@go...]
> Sent: Wednesday, September 23, 2009 5:12 AM
> To: Gar...@bu...
> Cc: per...@li...
> Subject: Re: [perfmon2] Monitoring core and uncore events in the same
> testrun.
>
> Gary,
>
> Sorry for the delay.
>
> The reason there is a restriction with the uncore PMU is that it is shared
> by all cores on the socket. Given the model used by perfmon, i.e., events
> are assigned to counters in user space, the kernel needs to enforce some
> access control to ensure no two sessions try to use the same resource,
> here the uncore registers.
>
> The current implementation uses a coarse-grain access control policy:
> - only system-wide sessions can access the uncore PMU
> - the first session to access the uncore PMU grabs it all
>
> The core and uncore PMU do not share any resource except the interrupt
> vector. Theoretically we could allow distinct uncore and core sessions.
>
> Some people have also argued that allowing uncore access to per-thread
> sessions may also be beneficial. The reason being that you'd want to know
> what is going on around you. It could be hinting at what you are
> experiencing in your core. I believe this is similar to what you are trying
> to do with your measurement. I think this is a perfectly good reason to do
> this.
>
> Going back to your example of a system-wide session, I think it would be
> easier to add enough smarts to the tool to suppress uncore events on all
> but the first cpu of each socket given the list of monitored cpus (either
> all or --cpu-list).
> I think adding this to pfmon may not be so trivial because of internal data
> structures, but it is doable.
>
> The alternative has some problems because you would not return an error
> when the uncore registers are written. Thus applications would not be able
> to tell whether a zero value on read is because no event occurred or
> because the event was suppressed.
>
> Another alternative would be to consider uncore sessions as a third kind of
> session, distinct from system-wide. We would allow uncore sessions when
> there are per-thread and system-wide sessions. Uncore sessions would only
> support uncore events, of course. You would need a distinct pfmon session
> for them.
>
>
> On Fri, Sep 18, 2009 at 8:41 PM, <Gar...@bu...> wrote:
> >
> > Stephane
> >
> > We would like to be able to collect both core and uncore counters with
> > pfmon during the same test run. This works (if you are careful) as shown
> > below:
> >
> > [kirk] (hpctk) test_cases> pfmon --system-wide -u -k --cpu-list 0,1 -e UNC_LLC_MISS:READ,UNHALTED_CORE_CYCLES,INSTRUCTIONS_RETIRED,FP_COMP_OPS_EXE:SSE_FP ./LoopTest
> >
> > .... application dribble ....
> >
> > CPU0     12080 UNC_LLC_MISS:READ
> > CPU0     26709 UNHALTED_CORE_CYCLES
> > CPU0      9766 INSTRUCTIONS_RETIRED
> > CPU0         0 FP_COMP_OPS_EXE:SSE_FP
> > CPU1       197 UNC_LLC_MISS:READ
> > CPU1     29020 UNHALTED_CORE_CYCLES
> > CPU1     10715 INSTRUCTIONS_RETIRED
> > CPU1         0 FP_COMP_OPS_EXE:SSE_FP
> >
> > But our system also has cpu cores 2-15 which cannot be included in the cpu
> > list because they share the same cpu socket as 0 or 1, so the uncore event
> > causes a problem creating the perfmon session on behalf of those cpu cores.
> >
> > Would it be possible for pfmon to detect when multiple cpu cores on the
> > same socket are included in the cpu list, and then only put the uncore
> > events in the event list used when creating a session to the first cpu
> > core on that socket?
> > Then sessions to other cpu cores that share the same socket would contain
> > only the core events so that perfmon would allow sessions to all the cores.
> >
> > One other possible approach I considered is to leave pfmon alone and change
> > perfmon to just remove the uncore event from the event list when the
> > session is created to the second cpu core on the same socket. This could
> > possibly be done where the error is currently being detected and then allow
> > the session to be created with a subset of the events (minus all uncore
> > events) requested by the caller.
> >
> > If either of these approaches could be implemented it would make it
> > possible for us to get all the data we need in a single test run (and that
> > makes sure the data is consistent and complete).
> >
> > Just interested in your thoughts.
> > Gary
> >
> > _______________________________________________
> > perfmon2-devel mailing list
> > per...@li...
> > https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
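
For readers of the archive, here is a rough sketch of the tool-side idea discussed in this thread: keep the uncore events only in the event list of the first monitored cpu of each socket, and give every other cpu on that socket the core events alone. This is not pfmon code. The function name `split_events_per_cpu`, the `UNC_` prefix test for spotting uncore events (borrowed from the `UNC_LLC_MISS:READ` name in the example), and the cpu-to-socket map are all assumptions made for illustration. In particular, the even/odd socket layout is only one layout consistent with cpus 0 and 1 sitting on different sockets.

```python
def split_events_per_cpu(cpu_list, cpu_to_socket, events, uncore_prefix="UNC_"):
    """Build a per-cpu event list for a system-wide run.

    The first cpu seen on each socket keeps the full event list; every later
    cpu on the same socket gets only the core (non-uncore) events.
    """
    # Assumption: uncore events can be recognised by their name prefix.
    core_events = [e for e in events if not e.startswith(uncore_prefix)]

    seen_sockets = set()
    plan = {}
    for cpu in cpu_list:
        socket = cpu_to_socket[cpu]
        if socket not in seen_sockets:
            # First monitored cpu on this socket: it owns the uncore events.
            seen_sockets.add(socket)
            plan[cpu] = list(events)
        else:
            # Another cpu on an already-covered socket: core events only.
            plan[cpu] = list(core_events)
    return plan


# Hypothetical topology: 16 cpus, even ones on socket 0, odd ones on socket 1.
cpu_to_socket = {cpu: cpu % 2 for cpu in range(16)}

events = [
    "UNC_LLC_MISS:READ",
    "UNHALTED_CORE_CYCLES",
    "INSTRUCTIONS_RETIRED",
    "FP_COMP_OPS_EXE:SSE_FP",
]

plan = split_events_per_cpu(range(16), cpu_to_socket, events)
for cpu in sorted(plan):
    print(f"CPU{cpu}: {','.join(plan[cpu])}")
```

Doing the filtering in the tool, as sketched here, means a suppressed uncore event is simply never requested for that cpu, so it does not show up as a zero count that could be mistaken for "no event occurred", which is the concern raised above about doing the suppression in the kernel.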