Re: [perfmon2] Question about overflow handling in perfmon2
From: stephane e. <er...@go...> - 2008-01-08 18:06:22
Corey,

For most architectures, masking is equivalent to stopping. The exception is IA-64, which lets you start/stop monitoring directly from user mode under certain conditions. That is why I had to introduce the concept of masking, which also means stopping, but in a manner that cannot be modified by the user.

Hope this helps.

On Jan 8, 2008 10:02 AM, Corey J Ashford <cja...@us...> wrote:
>
> Hi Stephane,
>
> Thanks for your response. I had missed that there was an arch-dependent
> part of masking. I need to read again about the semantics of "masking",
> then I'll give this a shot.
>
> Thank you!
>
> - Corey
>
> Corey Ashford
> Software Engineer
> IBM Linux Technology Center, Linux Toolchain
> Beaverton, OR
> 503-578-3507
> cja...@us...
>
> "stephane eranian" <er...@go...> wrote on 01/07/2008 10:30:02 PM:
>
> > Corey,
> >
> > On Jan 3, 2008 8:30 PM, Corey Ashford <cja...@us...> wrote:
> > > Hello,
> > >
> > > I've been debugging a PAPI test case - papi/src/ctests/overflow - on
> > > POWER5 (PAPI implemented on top of perfmon2). This test hangs in the
> > > kernel after one of its event counters overflows. The interrupt handler
> > > is entered, and the function pfm_overflow_handler discovers that
> > > overflow notification is enabled and so it turns on masking:
> > >
> > > (perfmon_intr.c)
> > > ...
> > >         } else {
> > >                 /*
> > >                  * When no sampling format is used, the default
> > >                  * is:
> > >                  *      - mask monitoring
> > >                  *      - notify user if requested
> > >                  *
> > >                  * If notification is not requested, monitoring is masked
> > >                  * and overflowed counters are not reset (saturation).
> > >                  * This mimics the behavior of the default sampling format.
> > >                  */
> > >                 ovfl_ctrl = PFM_OVFL_CTRL_NOTIFY;
> > >
> > >                 if (!must_switch || has_notify)
> > >                         ovfl_ctrl |= PFM_OVFL_CTRL_MASK;
> > >         }
> > > ...
> > >
> > > As you can see from the comments, the counters are not reset.
> > >
> >
> > Yes, the counters are reset when you issue pfm_restart(), at least in
> > this case where you are doing user-level sampling.
> > But note that you get out of pfm_overflow_handler() with
> > state = MASKED, and thus you have called pfm_mask_monitoring(), which is
> > supposed to stop monitoring. Thus you should not be getting any more
> > interrupts. It seems to me that pfm_arch_mask_monitoring() is not doing
> > the right thing on Power. From your description, it seems it may need to
> > do more to avoid re-entering the interrupt, i.e., it should clear that
> > most significant bit. Today this function is empty for Power.
> >
> > Could you try this out?
> >
> > > Consequently, on POWER5, even though we reset the Performance Monitor
> > > Alert Occurred bit in MMCR0, the interrupt is still pending because one
> > > or more counters (just one in this case) has its most significant bit
> > > (the "negative" bit) set. All of the other bits that could gate a
> > > performance monitor exception are reenabled by the
> > > pfm_power5_irq_handler. As a result, when the thread of execution exits
> > > the interrupt handler, the interrupt occurs again immediately.
> > >
> > > What's the right way to deal with this?
> > >
> > > Should pfm_power5_irq_handler check whether ctx->state has the
> > > PFM_OVFL_CTRL_MASK bit set, and if it does, not re-enable performance
> > > monitor interrupts upon exit?
> > >
> > > Thanks for your consideration,
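
The suggestion in this thread, that masking on Power should clear the most significant bit of any overflowed counter so the exception stops being pending, can be illustrated with a small user-space model. This is only a sketch, not the perfmon2 kernel code: the struct, every `sketch_` name, the six-counter layout, and the 32-bit counter width are assumptions made for illustration, and real code would read and write the PMC registers rather than an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions for this sketch: six counters, each 32 bits wide, where a set
 * most significant ("negative") bit keeps the exception pending. */
#define SKETCH_NUM_PMCS     6
#define SKETCH_PMC_OVFL_BIT UINT32_C(0x80000000)

/* Hypothetical stand-in for the hardware counter state. */
struct sketch_pmu {
    uint32_t pmc[SKETCH_NUM_PMCS]; /* modelled counter values */
    uint32_t saved_ovfl;           /* bit i set: counter i had its MSB cleared */
};

/* Returns 1 if any counter still has its most significant bit set, which in
 * this model means the interrupt would fire again on handler exit. */
int sketch_irq_pending(const struct sketch_pmu *pmu)
{
    assert(pmu != NULL);
    for (size_t i = 0; i < SKETCH_NUM_PMCS; i++) {
        if (pmu->pmc[i] & SKETCH_PMC_OVFL_BIT)
            return 1;
    }
    return 0;
}

/* Models what an arch-level masking hook might do: clear the MSB of each
 * overflowed counter and remember which ones were touched, so that a later
 * restart could decide how to reset them. */
void sketch_mask_monitoring(struct sketch_pmu *pmu)
{
    assert(pmu != NULL);
    for (size_t i = 0; i < SKETCH_NUM_PMCS; i++) {
        if (pmu->pmc[i] & SKETCH_PMC_OVFL_BIT) {
            pmu->saved_ovfl |= UINT32_C(1) << i;
            pmu->pmc[i] &= ~SKETCH_PMC_OVFL_BIT;
        }
    }
}
```

In this model, calling `sketch_mask_monitoring()` on a state where one counter has overflowed leaves `sketch_irq_pending()` returning 0, while the low bits of that counter and all other counters are untouched. Recording the cleared counters in `saved_ovfl` is a design choice of the sketch, so the overflow information is not lost when the bit is cleared.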