From: graydon h. <gr...@re...> - 2002-12-21 04:08:43
On Fri, 2002-12-20 at 18:53, Richard Henderson wrote:
> Alpha has the ability to enable performance counters on a per-process
> basis. I.e. enabled and disabled during the context switch between
> user/kernel *and* based on a bit in the process' task control block.
> Is there any way to use this with oprofile? If not, I can leave it
> running all the time, or only in particular protection levels, so it's
> not a big deal, but it seems very useful.

iirc the ppc hardware can do this too; it's a bit which marks a process
for profiling, meant to be propagated between parents and children by
the OS, and the counter control regs have some corresponding bits saying
whether you're counting the marked, the unmarked, or all the processes
in the system.

the ppc driver I was recently working on just sets this to "always on",
to achieve similar-to-existing behavior. so far that's always been my
goal with new ports, even when there's potentially exotic hardware
lurking beneath the surface.

> Is there any existing scheme for allocating event numbers and/or event
> names for a platform? Surely some of the events are common to most/all
> platforms; it'd be a shame to make this harder than necessary for apps
> that try to interpret the data in some way. Currently it appears the
> events have been given the name from the cpu reference manual, and the
> number chosen at random. Is this correct?

it's worse than that on ppc, in fact! the events don't have any names,
so I mangled the first few nouns and verbs in the event descriptions
into symbolic names :((

generally, yeah, we should extend libop or something with an "easy"
querying mechanism, for at least say cycles, instructions, micro-ops,
cache misses, and pipeline stalls. nearly everything can count those.
it would also be good if we improved the guess-an-appropriate-frequency
code.

> I notice that the code in 2.5.52 is lagging behind what's in oprofile
> cvs, or even the 0.4 release. How does development between the two
> source trees interact?

you're right, they're a bit out of sync. but we have an evil plan :)

the module accompanying the later 2.5 series kernels is plainly a better
driver than the 2.4-supporting one residing in oprofile cvs. it is
newer, has a cleaner design, and doesn't wrap system calls.
unfortunately it is missing the p4 and hammer (and possibly ia64?)
support we did in the fall, for the 2.4 driver.

so what Will and I have done recently (hopefully I'm not spilling any
more beans than I did with the last internal email I posted here) is
back-port the 2.5 driver to 2.4 and merge in the various CPU backends
from the "old" 2.4 driver (including recently working hyper-threading).

so we'll probably post most of this work sometime early in the new year,
and that'll make further syncing between 2.4 and 2.5 series kernel and
userspace quite a bit simpler, I think.

-graydon
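
for anyone following along, the per-process marking scheme discussed above (a mark bit per process, inherited by children, plus a counter-control mode choosing marked, unmarked, or all) could be modelled roughly like this. it is only a sketch of the idea: every name here (`count_mode`, `struct task`, `task_fork`, `counter_enabled`) is hypothetical and does not correspond to any real Alpha or ppc register layout, kernel structure, or oprofile interface.

```c
#include <stdbool.h>

/* Hypothetical model only: not a real register layout or kernel API. */

/* Which processes the counter control register is set to count. */
enum count_mode { COUNT_ALL, COUNT_MARKED, COUNT_UNMARKED };

/* Minimal stand-in for a task control block. */
struct task {
    int  pid;
    bool marked;   /* the per-process "profile me" bit */
};

/* On fork, the OS is expected to copy the parent's mark to the child. */
static struct task task_fork(const struct task *parent, int child_pid)
{
    struct task child = { child_pid, parent->marked };
    return child;
}

/* Decide whether the counters should run while task t is on the CPU. */
static bool counter_enabled(enum count_mode mode, const struct task *t)
{
    switch (mode) {
    case COUNT_MARKED:   return t->marked;
    case COUNT_UNMARKED: return !t->marked;
    case COUNT_ALL:      return true;
    }
    return true;   /* invalid mode: fall back to "always on" */
}
```

in this model the "always on" behaviour mentioned for the ppc driver corresponds to leaving the mode at `COUNT_ALL`, so `counter_enabled` returns true regardless of the mark bit.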