Re: [perfmon2] [PATCH] 1/1 - add kernel support for POWER4 and POWER6 to perfmon2
From: Stephane E. <er...@hp...> - 2007-11-29 09:03:52
Corey,

On Wed, Nov 28, 2007 at 05:44:04PM -0800, Corey Ashford wrote:
> Stephane Eranian wrote:
> >Corey,
> >
> ...
> >>I haven't looked at the sampling code in perfmon carefully. Does it
> >>know when the counter that has been assigned to count the event it wants
> >>to sample on does not support overflow interrupting? Is this something
> >>libpfm needs to be aware of?
> >>
> >
> >The kernel does not know which counter can/cannot overflow. Typically
> >you have to set a bit in the config register for the counter so that it
> >triggers an interrupt on overflow. The bit is forced in the default value
> >of each register. This is needed for 64-bit wide emulation. No matter what
> >libpfm does the bit is overridden by the kernel. But there is no explicit
> >flag that says: "register can overflow". In fact non counter registers
> >can also trigger interrupts, e.g., AMD Barcelona IBS registers.
> >
> ...
>
> Here's what I don't understand... I have looked at the perfmon interface
> doc and it seems to assume (or I am reading into it) that all counters
> can interrupt on overflow somehow, by turning on a bit somewhere:
>
> 3.4.3 Counter Overflow notifications
> When a counter overflows, it is possible to receive a notification.
> The notification is never automatic
> and must be requested by specifying the PFM_REGFL_OVFL_NOTIFY flag
> on the controlling PMC register.
> When the flag is not set, the counter simply wraps around on
> overflow.
>

Well, that's because this has been true of all other PMU models. Power 6
is the first one that implements non-interrupting counters. Conversely, we
now support "overflow" notification on non-counters, e.g., for AMD Barcelona
IBS. I think the term "overflow notification" is no longer entirely accurate.
I'd like to find a more generic term to describe the notification mechanism.

It is also important to realize that at the interface level, you can ask for
a notification on counter overflow, but it will ONLY be generated when the
64-bit counter actually overflows. When HW counters are narrower, they will
interrupt at a much higher frequency, but that will not be visible to
applications.

64-bit counter emulation is usually implemented using the interrupt
capability. The way you did it for Power6 counters 5 & 6 is just another
method of emulation, by active sampling and accumulation of the HW counter.
What you are missing at this point is interrupt capability for this
virtualized counter. I believe this can be emulated on top of what you have,
and here is how this could be done:

	- make virtual counters 5 & 6 writeable
	- keep your timer-based sampling and accumulation scheme
	- when you accumulate, check that the new value is greater than
	  the previous value, if not then:
		- set bit 5 or 6 in set->povfl_pmds, set->npend_ovfls++
		- post the PMU interrupt

That should work. The corner case, of course, is what happens if you also
have simultaneous interrupts from other counters. For that, I think you'd
have to craft the get_ovfl_pmds() routine to take this into account.

> My concern is that someone using an interface such as PAPI will try to
> set up sampling based on an instructions event counter, and let's say
> they just want to sample IP addresses. For POWER6, if they specify
> sampling in user + kernel + supervisor (aka hypervisor) mode, they will
> likely be assigned an event group that uses PMC5, which cannot
> interrupt. The sampling will not work, correct? It seems like there
> might be a disconnect here between libpfm knowing that the register the
> user wants to sample on cannot interrupt, and the user who assumes it can.
>

Let's see if we can implement the scheme I am proposing above. It could
solve the libpfm issue.

-- 
-Stephane
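
For illustration, the accumulate-and-check scheme proposed above could be
sketched roughly as follows. This is a standalone user-space model, not
perfmon2 kernel code. Only set->povfl_pmds and set->npend_ovfls come from the
message; struct virt_pmd, struct event_set, accumulate(),
post_pmu_interrupt(), the irqs_posted field and the 32-bit hardware width are
all hypothetical names and assumptions made for the sketch, and povfl_pmds is
simplified to a single 64-bit word.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed hardware counter width; the real width is PMU-specific. */
#define HW_COUNTER_BITS 32
#define HW_COUNTER_MASK ((UINT64_C(1) << HW_COUNTER_BITS) - 1)

/* Hypothetical 64-bit virtual counter backed by a narrower HW counter. */
struct virt_pmd {
    uint64_t value;    /* 64-bit software value, writable by the user */
    uint64_t last_hw;  /* hardware value seen at the previous poll */
};

/* Hypothetical, simplified stand-in for the per-set overflow state. */
struct event_set {
    uint64_t povfl_pmds;      /* bitmask of PMDs with a pending overflow */
    unsigned int npend_ovfls; /* number of pending overflows */
    unsigned int irqs_posted; /* models "post the PMU interrupt" */
};

/* Stand-in for posting the PMU interrupt; here it only counts calls. */
static void post_pmu_interrupt(struct event_set *set)
{
    set->irqs_posted++;
}

/*
 * Called from the timer-based sampling path with the current hardware
 * counter value. Accumulates the delta into the 64-bit virtual counter
 * and flags an overflow when the 64-bit value wraps.
 */
static void accumulate(struct event_set *set, struct virt_pmd *pmd,
                       unsigned int idx, uint64_t hw_now)
{
    assert(idx < 64);

    /* Masked subtraction tolerates one wrap of the narrow HW counter. */
    uint64_t delta = (hw_now - pmd->last_hw) & HW_COUNTER_MASK;
    uint64_t prev = pmd->value;

    pmd->last_hw = hw_now & HW_COUNTER_MASK;
    pmd->value = prev + delta; /* unsigned arithmetic wraps modulo 2^64 */

    /*
     * The new value being smaller than the previous one means the 64-bit
     * counter wrapped. A strict "<" is used so that a zero delta is not
     * reported as an overflow.
     */
    if (pmd->value < prev) {
        set->povfl_pmds |= UINT64_C(1) << idx;
        set->npend_ovfls++;
        post_pmu_interrupt(set);
    }
}
```

In this model, making the virtual counter writeable is what allows a sampling
period to be set: writing 2^64 - N to value makes the wrap, and therefore the
notification, occur after N further events. The corner case of simultaneous
interrupts from other counters is not modelled here.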