From: Richard H. <rt...@tw...> - 2002-12-20 23:53:49
|
Alpha has the ability to enable performance counters on a per-process basis. I.e. enabled and disabled during the context switch between user/kernel *and* based on a bit in the process' task control block. Is there any way to use this with oprofile? If not, I can leave it running all the time, or only in particular protection levels, so it's not a big deal, but it seems very useful.

Is there any existing scheme for allocating event numbers and/or event names for a platform? Surely some of the events are common to most/all platforms; it'd be a shame to make this harder than necessary for apps that try to interpret the data in some way. Currently it appears the events have been given the name from the cpu reference manual, and the number chosen at random. Is this correct?

I notice that the code in 2.5.52 is lagging behind what's in oprofile cvs, or even the 0.4 release. How does development between the two source trees interact?

r~ |
From: John L. <le...@mo...> - 2002-12-21 02:20:38
|
On Fri, Dec 20, 2002 at 03:53:38PM -0800, Richard Henderson wrote:

> names for a platform? Surely some of the events are common to most/all
> platforms; it'd be a shame to make this harder than necessary for apps
> that try to interpret the data in some way. Currently it appears the
> events have been given the name from the cpu reference manual, and the
> number chosen at random. Is this correct?

Which numbers are you referring to ? The event numbers are not random.

We need some work to virtualise the parameters across CPUs; currently they are rather Intel-specific (event vs. unit mask etc.). I'm not sure of the best solution here, but I'm a bit nervous of trying to draw parallels between counters across architectures.

The intention is that things can link against libop[++].a and get the information they need. I know this needs work.

> I notice that the code in 2.5.52 is lagging behind what's in oprofile
> cvs, or even the 0.4 release. How does development between the two
> source trees interact?

This is not true. oprofile cvs matches the current kernel. What you're probably seeing is that the old startup script (i.e. op_start) for 2.5 is currently installed as "op_start_25" instead of "op_start". This is just a silly bug that I hope to find time to fix tomorrow. But note we are deprecating op_start in favour of the more powerful opcontrol script, which should work under all circumstances.

regards
john

--
"ALL television is children's television." - Richard Adler |
From: Richard H. <rt...@tw...> - 2002-12-21 08:38:43
|
On Sat, Dec 21, 2002 at 02:16:30AM +0000, John Levon wrote:

> Which numbers are you referring to ? The event numbers are not random.

They aren't? Oh, well, that's not obvious then. Especially given the apparent lack of symbolic names for them. Perhaps I'm missing something?

> ... I'm a bit nervous of trying to draw
> parallels between counters across architectures.

Obvious stuff like cycles and cache misses should be easy.

Suggestions on what I should do with my events in the mean time?

> This is not true. oprofile cvs matches the current kernel. What you're
> probably seeing is that the old startup script (i.e. op_start) for 2.5
> is currently installed as "op_start_25" instead of "op_start".

No, I hadn't even noticed anything wrt op_start. What I noticed was the missing p4 support in the 2.5-bk tree.

r~ |
From: John L. <le...@mo...> - 2002-12-21 14:09:10
|
On Sat, Dec 21, 2002 at 12:38:27AM -0800, Richard Henderson wrote:

> No, I hadn't even noticed anything wrt op_start. What I noticed
> was the missing p4 support in the 2.5-bk tree.

Oh, sorry, yeah (this is what happens when you respond to email after coming back from the pub). As Graydon explained, we still need to forward-port their side port to the backport of this stuff ...

regards
john

--
"ALL television is children's television." - Richard Adler |
From: Philippe E. <ph...@wa...> - 2002-12-21 22:53:49
|
Richard Henderson wrote:
> On Sat, Dec 21, 2002 at 02:16:30AM +0000, John Levon wrote:
> [defining architecture-independent events]
>
>>... I'm a bit nervous of trying to draw
>>parallels between counters across architectures.
>
> Obvious stuff like cycles and cache misses should be easy.

Not so easy: e.g. the AMD Athlon doesn't get a cpu cycle count (the nearest equivalent is retired insns). Besides that, using the documentation's event names makes code review a lot easier and lets users "grep" the HW manufacturer's documentation to get further information.

> Suggestions on what I should do with my events in the mean time?

Use the documentation's event names please, unless you think they are really confusing. Anyway, later we can add some synthetic events mapped onto real events, available for all architectures.

>>This is not true. oprofile cvs matches the current kernel. What you're
>>probably seeing is that the old startup script (i.e. op_start) for 2.5
>>is currently installed as "op_start_25" instead of "op_start".

Hmm, I stopped looking at 2.5 after the module breakage in 2.5 (49/50?). The last thing I did on 2.5 was the separation of module samples for each task. I'm guessing it will work ;)

regards,
Philippe Elie |
From: Richard H. <rt...@tw...> - 2002-12-21 23:29:31
|
On Sat, Dec 21, 2002 at 11:50:35PM +0000, Philippe Elie wrote:

> > Obvious stuff like cycles and cache misses should be easy.
>
> not so easy e.g. AMD Athlon don't get a cpu cycle
> count (the most near equivalent is retired insn).

No, you misunderstand. I'm not suggesting that we try to force-map all cpus' events onto one another, just that we have a taxonomy.

> > Suggestions on what I should do with my events in the mean time?
>
> Use documentation events names please unless you
> think they are really confusing. Anyway later we
> can add some synthetic events mapped on real events
> available for all architecture.

I actually meant wrt event numbers. I've started choosing them as seems convenient for the architecture.

r~ |
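A minimal sketch of what such a taxonomy could look like, as opposed to a forced one-to-one mapping: each CPU keeps the event names from its own manual and merely tags them with a generic class, so a tool can ask "which event, if any, counts cycles here?" and get no answer on a CPU that has none. Everything in it is an assumption made for illustration: `enum ev_class`, `struct ev_tag`, `find_event` and the `example_cpu` entry are hypothetical and are not part of libop. Only the two EV5 event names are taken from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical generic event classes; not an existing libop type. */
enum ev_class {
    EVC_CYCLES,
    EVC_INSNS,
    EVC_CACHE_MISS,
    EVC_UNCLASSIFIED,
    EVC_COUNT
};

/* One row per (cpu, documented event name), tagged with a class. */
struct ev_tag {
    const char *cpu;
    const char *event_name;
    enum ev_class cls;
};

static const struct ev_tag tags[] = {
    { "ev5",         "CYCLES",                EVC_CYCLES },
    { "ev5",         "SPLIT_ISSUE_CYCLES",    EVC_UNCLASSIFIED },
    /* Placeholder CPU and event name, invented for the example. */
    { "example_cpu", "EXAMPLE_INSNS_RETIRED", EVC_INSNS },
};

/* Return the documented event name a CPU uses for a class, or NULL
   when that CPU has no event in the class. */
const char *find_event(const char *cpu, enum ev_class cls)
{
    assert(cls < EVC_COUNT);
    for (size_t i = 0; i < sizeof tags / sizeof tags[0]; ++i)
        if (tags[i].cls == cls && strcmp(tags[i].cpu, cpu) == 0)
            return tags[i].event_name;
    return NULL;
}
```

With this shape, `find_event("ev5", EVC_CYCLES)` yields the manual's own name, while a CPU without a cycle counter simply returns NULL instead of being mapped onto a near equivalent.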
From: John L. <le...@mo...> - 2002-12-22 01:52:09
|
On Sat, Dec 21, 2002 at 03:28:40PM -0800, Richard Henderson wrote:

> No, you misunderstand. I'm not suggesting that we try to force-map
> all cpu's events onto one other, just that we have a taxonomy.

This would probably be useful at some point as more archs get supported.

> I actually meant wrt event numbers. I've started choosing
> them as seems convenient for the architecture.

Are you referring to the event number field in libop/op_events.c here ?

I'm a bit confused by your comments. This number is the actual value that must be written into the perfctr programmable registers as defined in the Intel manuals. Is there some reason you're not doing the same for Alpha ?

I assume you've defined an op_cpu_type for Alpha and are adding the events to op_events.c. Is this currently too inflexible ? Are you doing something different ? I admit I've yet to read the Alpha PDFs I have ...

regards
john

--
"ALL television is children's television." - Richard Adler |
From: Richard H. <rt...@tw...> - 2002-12-22 08:30:09
|
On Sun, Dec 22, 2002 at 01:48:02AM +0000, John Levon wrote:

> Are you referring to the event number field in libop/op_events.c here ?

Yes.

> I'm a bit confused by your comments. This number is the actual value
> that must be written into the perfctr programmable registers as defined
> in the Intel manuals. Is there some reason you're not doing the same for
> Alpha ?

(1) There is typically one register that controls all of the counters. It's all encoded into a 64-bit value. I suppose one could take the view that this number should be the value placed into the field for that counter, but:

(2) Several events can be put onto more than one counter. However, the values placed into their respective fields are *never* the same.

(3) It makes the most sense to me that the CYCLES event is seen by the userland tool that interprets the event stream identically, no matter from which counter it originated.

Now, as it happens, I have *some* method to the selection of values to place in that field, e.g. from ev5_cpu_setup,

    /* Select desired events.  The event numbers are selected such
       that they map directly into the event selection fields:

            PCSEL0: 0, 1
            PCSEL1: 24-39
             CBOX1: 40-47
            PCSEL2: 48-63
             CBOX2: 64-71

       There are two special cases, in that CYCLES can be measured
       on PCSEL[02], and SCACHE_WRITE can be measured on CBOX[12].
       These event numbers are canonicalized to their first
       appearance.  */

    ctl = 0;
    for (i = 0; i < 3; ++i) {
        unsigned long event = counter_config[i].event;
        if (!counter_config[i].enabled)
            continue;

        /* Remap the duplicate events, as described above.  */
        if (i == 2) {
            if (event == 0)
                event = 12+48;
            else if (event == 2+41)
                event = 4+65;
        }

        /* Convert the event numbers onto mux_select bit mask.  */
        if (event < 2)
            ctl |= event << 31;
        else if (event < 24)
            /* error */;
        else if (event < 40)
            ctl |= (event - 24) << 4;
        else if (event < 48)
            ctl |= (event - 40) << 19 | 15 << 4;
        else if (event < 64)
            ctl |= event - 48;
        else if (event < 72)
            ctl |= (event - 64) << 22 | 15;
    }
    wrperfmon(2, ctl);

which means that the SPLIT_ISSUE_CYCLES event, which is obtained only on counter 1 by PCSEL1 field value 1, is described as

    { 2, OP_EV5, 1+24, &um_empty, "SPLIT_ISSUE_CYCLES",
      "Some but not all issuable instructions issued", 256 },

and CYCLES, which is obtained on counters 0 and 2, as mentioned above, is described as

    { 5, OP_EV5, 0, &um_empty, "CYCLES",
      "Total cycles", 256 },

(I've yet to figure out what op_unit_mask actually does. I suppose I'll have to read the x86 docs to figure it out.)

> I assume you've defined an op_cpu_type for Alpha and are adding the
> events to op_events.c.

Three cpu types (ev4, ev5, ev6), but yes.

r~ |
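The encoding in the message above can be tried outside the kernel with a stand-alone sketch like the following. It keeps the same ranges, shifts and remapping as the quoted ev5_cpu_setup fragment, but the packaging is an assumption made for illustration: `ev5_event_bits`, `ev5_build_ctl` and `struct counter_cfg` are hypothetical names, the `-1` error return is invented (the quoted code silently ignores bad event numbers), and the final `wrperfmon(2, ctl)` call is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one entry of the driver's counter_config[]. */
struct counter_cfg {
    int enabled;
    unsigned long event;
};

/* Bits to OR into the control word for one counter, or -1 when the
   event number falls outside every selection field. */
int64_t ev5_event_bits(int counter, unsigned long event)
{
    assert(counter >= 0 && counter < 3);

    /* The two duplicated events are remapped when requested on
       counter 2, exactly as in the quoted fragment. */
    if (counter == 2) {
        if (event == 0)
            event = 12 + 48;
        else if (event == 2 + 41)
            event = 4 + 65;
    }

    if (event < 2)                                  /* PCSEL0: 0, 1 */
        return (int64_t)event << 31;
    if (event < 24)                                 /* unused range */
        return -1;
    if (event < 40)                                 /* PCSEL1: 24-39 */
        return (int64_t)(event - 24) << 4;
    if (event < 48)                                 /* CBOX1: 40-47 */
        return ((int64_t)(event - 40) << 19) | (15 << 4);
    if (event < 64)                                 /* PCSEL2: 48-63 */
        return (int64_t)(event - 48);
    if (event < 72)                                 /* CBOX2: 64-71 */
        return ((int64_t)(event - 64) << 22) | 15;
    return -1;
}

/* Combine the three counters into one control word, skipping
   disabled counters and invalid event numbers. */
uint64_t ev5_build_ctl(const struct counter_cfg cfg[3])
{
    uint64_t ctl = 0;
    for (int i = 0; i < 3; ++i) {
        if (!cfg[i].enabled)
            continue;
        int64_t bits = ev5_event_bits(i, cfg[i].event);
        if (bits >= 0)
            ctl |= (uint64_t)bits;
    }
    return ctl;
}
```

For example, requesting CYCLES (event 0) on counters 0 and 2 together with SPLIT_ISSUE_CYCLES (event 1+24) on counter 1 contributes 0, 12 and 16 respectively. Like the quoted code, the sketch does not check that an event number is legal for the counter it was requested on.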
From: John L. <le...@mo...> - 2003-01-04 04:41:11
|
On Sun, Dec 22, 2002 at 12:29:55AM -0800, Richard Henderson wrote:

> (2) Several events can be put onto more than one counter.
> However, the values placed into their respective fields
> are *never* the same.

Ah, I see.

> which means that the SPLIT_ISSUE_CYCLES event, which is obtained
> only on counter 1 by PCSEL1 field value 1, is described as
>
> { 2, OP_EV5, 1+24, &um_empty, "SPLIT_ISSUE_CYCLES",
> "Some but not all issuable instructions issued", 256 },
>
> and CYCLES, which is obtained on counters 0 and 2, as mentioned
> above, is described as
>
> { 5, OP_EV5, 0, &um_empty, "CYCLES",
> "Total cycles", 256 },

OK, seems reasonable.

> (I've yet to figure out what op_unit_mask actually does.
> I suppose I'll have to read the x86 docs to figure it out.)

It is basically a sub-event type. For example, the cache events on x86 can be restricted to particular states of the cacheline (M,E,S,I) as a bitmask. It's obviously an x86-specific name, but it could easily be generalised if needed.

regards
john

--
"I will eat a rubber tire to the music of The Flight of the Bumblebee" |
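To make the "sub-event as a bitmask" idea concrete, here is a small sketch of how an event code and a unit mask combine into one event-select value on P6-family x86. The field positions (event select in bits 0-7, unit mask in bits 8-15, user and kernel enable in bits 16 and 17) and the M/E/S/I mask bits follow the Intel manuals as far as I recall them; treat them as assumptions and check the manual for a given CPU. `evtsel_encode` and the `UM_*` names are hypothetical, and the event code used in the usage note is only an example value.

```c
#include <assert.h>
#include <stdint.h>

/* Cacheline-state bits for the unit mask (assumed P6 layout). */
enum {
    UM_I = 0x01,    /* Invalid */
    UM_S = 0x02,    /* Shared */
    UM_E = 0x04,    /* Exclusive */
    UM_M = 0x08     /* Modified */
};

/* Pack an event code and unit mask into an event-select value.
   count_user / count_kernel choose the privilege levels counted. */
uint32_t evtsel_encode(unsigned event, unsigned unit_mask,
                       int count_user, int count_kernel)
{
    assert(event <= 0xff && unit_mask <= 0xff);

    uint32_t v = event | (unit_mask << 8);
    if (count_user)
        v |= 1u << 16;
    if (count_kernel)
        v |= 1u << 17;
    return v;
}
```

With an example event code of 0x29, `evtsel_encode(0x29, UM_M | UM_E | UM_S | UM_I, 1, 1)` counts the event in every cacheline state, while passing `UM_M` alone restricts it to modified lines. An architecture without sub-events would just use an empty mask, which is what `um_empty` in the quoted table entries appears to express.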
From: John L. <le...@mo...> - 2002-12-21 02:46:20
|
On Fri, Dec 20, 2002 at 03:53:38PM -0800, Richard Henderson wrote:

> Alpha has the ability to enable performance counters on a per-process
> basis. I.e. enabled and disabled during the context switch between
> user/kernel *and* based on a bit in the process' task control block.
> Is there any way to use this with oprofile? If not, I can leave it
> running all the time, or only in particular protection levels, so it's
> not a big deal, but it seems very useful.

opcontrol/op_start lets you specify kernel/user counting, but I don't think it's the same thing. We could generalise this fairly easily though, I think.

As for per-task counting, the only real problem is deciding how the user should specify which tasks are of interest. oprofiled itself doesn't care; it will work on whatever info it's given.

As I mentioned, the question of how to generalise the setting parameters is the problem. In terms of the kernel/user interface it's probably just a matter of the alpha-specific code using the oprofilefs_*() API to create parameters that are settable from user-space during opcontrol --setup.

regards
john

--
"ALL television is children's television." - Richard Adler |
From: graydon h. <gr...@re...> - 2002-12-21 04:08:43
|
On Fri, 2002-12-20 at 18:53, Richard Henderson wrote:

> Alpha has the ability to enable performance counters on a per-process
> basis. I.e. enabled and disabled during the context switch between
> user/kernel *and* based on a bit in the process' task control block.
> Is there any way to use this with oprofile? If not, I can leave it
> running all the time, or only in particular protection levels, so it's
> not a big deal, but it seems very useful.

iirc the ppc hardware can do this too; it's a bit which marks a process for profiling, meant to be propagated between parents and children by the OS, and the counter control regs have some corresponding bits saying whether you're counting the marked, the unmarked, or all the processes in the system.

the ppc driver I was recently working on just sets this to "always on", to achieve similar-to-existing behavior. so far that's always been my goal with new ports, even when there's potentially exotic hardware lurking beneath the surface.

> Is there any existing scheme to allocating event numbers and/or event
> names for a platform? Surely some of the events are common to most/all
> platforms; it'd be a shame to make this harder than necessary for apps
> that try to interpret the data in some way. Currently it appears the
> events have been given the name from the cpu reference manual, and the
> number chosen at random. Is this correct?

it's worse than that on ppc, in fact! the events don't have any names, so I mangled the first few nouns and verbs in the event descriptions into symbolic names :((

generally, yeah, we should extend libop or something with an "easy" querying mechanism, for at least say cycles, instructions, micro-ops, cache misses, and pipeline stalls. nearly everything can count those. it would also be good if we improved the guess-an-appropriate-frequency code.

> I notice that the code in 2.5.52 is lagging behind what's in oprofile
> cvs, or even the 0.4 release. How does development between the two
> source trees interact?

you're right, they're a bit out of sync. but we have an evil plan :)

the module accompanying the later 2.5 series kernels is plainly a better driver than the 2.4-supporting one residing in oprofile cvs. it is newer, has a cleaner design, and doesn't wrap system calls. unfortunately it is missing the p4 and hammer (and possibly ia64?) support we did in the fall, for the 2.4 driver.

so what will and I have done recently (hopefully I'm not spilling any more beans than I did with the last internal email I posted here) is back-port the 2.5 driver to 2.4 and merge in the various CPU backends from the "old" 2.4 driver (including recently working hyper-threading).

so we'll probably post most of this work sometime early in the new year, and that'll make further syncing between 2.4 and 2.5 series kernel and userspace quite a bit simpler, I think.

-graydon |