From: Chandra K. <moo...@gm...> - 2005-03-17 01:37:06
On Wed, 16 Mar 2005 10:28:49 +0100, Philip Mucci <mu...@cs...> wrote:
> Hi Chandra,
>
> Keep us posted...sounds pretty interesting. We'll put it on our wish
> list as well.
>
> Phil

Hi,

Here is a brain dump of what I'm thinking on event group multiplexing for the 970.

The API:

  opcontrol --vmlinux=/boot/kernel-2.6.11.3 --start \
    --group=PM_CYC_GRP1:10000:0:1:1,PM_INST_CMPL_GRP1:10000:0:1:1;2;15 \
    --group=PM_LD_MISS_1_GRP2:1000:0:1:1,PM_ST_REF_L1_GRP2:5000:0:1:1;1;12

The new option "--group" is an [infinitely] repeatable command-line argument and works thus:

  --group=<groupspec>

where

  <groupspec> = <eventspec>[,<eventspec>,...];ctr-causing-reprogram;num-ovfl-causing-reprogram

with the restriction that all event specs in a group spec belong to the same group (as listed by op_help). <eventspec> has its usual definition.

ctr-causing-reprogram names the PMC that has to overflow, and num-ovfl-causing-reprogram says how many times it has to overflow, before the overflow interrupt handler sends a notification to userspace to reprogram the PMU to the next group state. This allows for round-robin reprogramming of the PMU to monitor different event groups, and the weighting factors allow for watching some groups "longer" than others.

Now, for some implementation details:

I added a new escape code in the event buffer to indicate "GROUP_SWITCH". When the overflow handler determines that it is time for a group switch, it schedules an agent X for the purpose (more on this later). After agent X is done reprogramming the PMU, it inserts a CPU_GROUP event into the CPU buffer, which eventually makes its way into the event buffer. This information ends up in a "group" field in struct trans. (PC, counter) pairs are then interpreted in the context of this "group" field (among other things) to lead to the right odb file.

As for agent X, I'm trying to keep the reprogramming information out of the kernel.
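
To make the proposed --group syntax and the round-robin switching rule a bit more concrete, here is a rough Python sketch. It is only an illustration of the semantics described above, not oprofile code: the names GroupSpec, parse_groupspec and simulate_switches are hypothetical, each <eventspec> is kept as an opaque string, and the counter number is treated as a plain integer, since the numbering convention is not spelled out here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GroupSpec:
    """One --group argument (hypothetical container, not an oprofile type)."""
    events: List[str]      # raw <eventspec> strings, deliberately left unparsed
    reprogram_ctr: int     # ctr-causing-reprogram
    reprogram_ovfl: int    # num-ovfl-causing-reprogram


def parse_groupspec(spec: str) -> GroupSpec:
    """Parse '<eventspec>[,<eventspec>,...];ctr;num-ovfl' into a GroupSpec."""
    parts = spec.split(";")
    if len(parts) != 3:
        raise ValueError(f"expected 3 ';'-separated fields in {spec!r}")
    events = [e for e in parts[0].split(",") if e]
    if not events:
        raise ValueError(f"no event specs in {spec!r}")
    ctr = int(parts[1])
    ovfl = int(parts[2])
    if ovfl < 1:
        raise ValueError("num-ovfl-causing-reprogram must be at least 1")
    return GroupSpec(events, ctr, ovfl)


def simulate_switches(groups: List[GroupSpec], overflows: List[int]) -> List[int]:
    """Return the active group index after each counter overflow.

    'overflows' is a sequence of counter numbers, one per overflow interrupt.
    The active group advances round-robin once its designated counter has
    overflowed the configured number of times.
    """
    active = 0
    seen = 0
    history = []
    for ctr in overflows:
        if ctr == groups[active].reprogram_ctr:
            seen += 1
            if seen == groups[active].reprogram_ovfl:
                active = (active + 1) % len(groups)
                seen = 0
        history.append(active)
    return history


if __name__ == "__main__":
    g = parse_groupspec(
        "PM_CYC_GRP1:10000:0:1:1,PM_INST_CMPL_GRP1:10000:0:1:1;2;15"
    )
    print(g.events, g.reprogram_ctr, g.reprogram_ovfl)
```

A larger num-ovfl-causing-reprogram keeps a group active for more overflows of its designated counter, which is how the weighting comes about. One practical point when trying the syntax from a shell: an unquoted ';' terminates the command, so each --group argument would need quoting.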

An obvious agent for doing this reprogramming would be oprofiled, which sits in a while(1) { read(oprofile_buffer_devfd); ...} loop. There are two alternative proposals for asking oprofiled to effect the switch:

- I could encode a "please reprogram the PMU to the next group" message in the event buffer and cause buffer_ready (in event_buffer.c) to become true even if the buffer is not truly full.
- Or, I could just send an asynchronous UNIX signal to the oprofiled process (from within the kernel), which would cause it to reprogram the PMU.

Now, for some questions:

1) Any comments on the proposed API extension?
2) Implementation? From kernelspace or userspace? Use the event buffer, or asynchronous signaling?

Caveat: I'm avoiding thinking too much about other architectures for now. If I'm crippling something, please let me know...

Thanks,
Chandra