From: Ray B. <ra...@mp...> - 2006-02-14 19:55:20
|
I'm in the process of updating the PMU event names and unit masks for libpfm, the user side of the perfmon2 interface that Stephane Eranian is trying to get merged into the mainline kernel. (The latest public version of the PMU definitions is in version 3.28 of the BKDG, published in October 2005.)

In looking at the "hammer" event names in oprofile, it appears that this table needs to be updated as well. I'm wondering whether now would be a good time to make the event names the same across oprofile and perfmon. This is going to require some changes to the existing oprofile names.

What do people think? Is compatibility with existing code more important than conformity, or is it a good thing to move toward having the names be the same? Or does no one care? :-)

--
Ray Bryant
AMD Performance Labs Austin, Tx
512-602-0038 (o) 512-507-7807 (c) |
From: Ray B. <ra...@mp...> - 2006-02-17 08:50:34
Attachments:
merged_event_list
|
The attached file is a summary of the various names for the Opteron events in oprofile, Code Analyst, and perfmon. It also shows what names I am suggesting be used instead. Any comments would be appreciated. -- Ray Bryant AMD Performance Labs Austin, Tx 512-602-0038 (o) 512-507-7807 (c) |
From: Stephane E. <er...@hp...> - 2006-02-17 10:02:59
|
Ray,

I don't have a problem with your proposed changes. The important thing to understand about perfmon2 is that the kernel has no knowledge of events (encodings, names). As such, every application is free to use whatever naming it wants. The libpfm library is a simple helper library; it is NOT REQUIRED to use the perfmon interface. In fact, it does not use it at all. As an example, the HP Caliper tool does not use libpfm, yet it runs on top of perfmon.

From a user's perspective, I think your initiative makes a lot of sense. Having uniform naming across tools does bring clarity and portability between tools, which is good.

Now, one thing I ran into when I looked at the event description for AMD is the fact that for some events, multiple unit masks can be combined (OR'ed). It is not totally spelled out, but I suspect this is the case. For instance, for DC_REFILL, it looks like you could mix the various unit masks to build a combination, e.g., DC_REFILL_FROM_L2_INVALID_OR_SHARED. Unit mask combinations exist on Itanium, so I suspect this is also possible on AMD. Can you confirm?

On Thu, Feb 16, 2006 at 06:57:23PM -0600, Ray Bryant wrote:
> The attached file is a summary of the various names for the Opteron events in
> oprofile, Code Analyst, and perfmon. It also shows what names I am
> suggesting be used instead.
>
> Any comments would be appreciated.
> --
> Ray Bryant
> AMD Performance Labs Austin, Tx
> 512-602-0038 (o) 512-507-7807 (c)
> Merged Event List Version 1.0 2/14/2006
> Key:
> -CA Code Analyst
> -OP oprofile
> -OPN proposed new oprofile name
> -WHY why the change to the above
> -PF current perfmon2 name
> -PFN proposed new perfmon2 name
>
> Notes:
> (1) The CA name is normally the same as the name in the BKDG.
> (2) For perfmon, the event name is actually the name for a particular
> (event select,unit mask) pair. If the event name is of the form FOO_*,
> then it indicates the rest of the name is specified by the unit_mask name.
> (3) The initial set of events done for perfmon2 was known to be a partial > set, so a lot of the events are missing for that case. > > In general, I've tried to make the new oprofile name match the PFN and CA > names. However, since the PFN name includes the unit mask value, this is > not always possible. (e. g. see events F6/F7/F8, where in PFN, we name each event > pair, but oprofile just names the event select value). > ------------------------------------------------------------------------------------------- > > 00h -CA Dispatched FPU Operations > -OP DISPATCHED_FPU_OPS : Dispatched FPU ops > -OPN ...No Change... > -PF DISPATCHED_FP_OPS_* > -PFN DISPATCHED_FPU_OPS_* > > 01h -CA Cycles with no FPU Ops Retired > -OP Cycles with no FPU ops retired > -PF CYCLES_NO_FP_OPS_RETIRED > -PFN CYCLES_NO_FPU_OPS_RETIRED > > 02h -CA Dispatched Fast Flag FPU Operations > -OP FAST_FPU_OPS : Dispatched FPU ops that use the fast flag interface > -OPN DISPATCHED_FPU_OPS_FAST_FLAG > -WHY Match with perfmon2 > -PF DISPATCHED_FP_OPS_FAST_FLAG > -PFN DISPATCHED_FPU_OPS_FAST_FLAG > > 20h -CA Segment Register Loads > -OP SEG_REG_LOAD : Segment register load > -OP SEGMENT_REGISTER_LOADS > -WHY Match with BKDG > -PF SEG_REG_LOAD_* > -PF SEGMENT_REGISTER_LOADS_* > > 21h -CA Pipeline Restart Due to Self-Modifying Code > -OP SELF_MODIFY_RESYNC : Micro-architectural re-sync caused by self modifying code > -OPN PIPELINE_RESTART_DUE_TO_SELF_MODIFYING_CODE > -WHY Match with BKDG > -PF MICRO_ARCH_RESYNC_SELF_MOD_CODE > -PFN PIPELINE_RESTART_DUE_TO_SELF_MODIFYING_CODE > > 22h -CA Pipeline Restart Due to Probe Hit > -OP SNOOP_RESYNC : Micro-architectural re-sync caused by snoop > -OPN PIPELINE_RESTART_DUE_TO_PROBE_HIT > -WHY Match with BKDG > -PF MICRO_ARCH_RESYNC_SNOOP > -PFN PIPELINE_RESTART_DUE_TO_PROBE_HIT > > 23h -CA LS Buffer 2 Full > -OP LS_BUFFER_FULL : LS Buffer 2 Full > -OPN LS_BUFFER_2_FULL_CYCLES > -WHY Match with BKDG; tag as cycles > -PF LS_BUFFER_2_FULL > -PFN 
LS_BUFFER_2_FULL_CYCLES > > 24h -CA Locked Operations > -OP LOCKED_OP : Locked operation > -OPN LOCKED_OPS > -PF LOCKED_OPS_EXEC > -PFN LOCKED_OPS_EXECUTED > LOCKED_OPS_CYCLES_SPECULATIVE_PHASE > LOCKED_OPS_CYCLES_NON_SPECULATIVE_PHASE > > 25h -CA ...missing... > -OP OP_LATE_CANCEL : Micro-architectural late cancel of an operation > -OPN ...remove.... > -WHY Match with BKDG > -PF MICRO_ARCH_LATE_CANCEL > -PFN ...missing... not in BKDG > > 26h -CA Retired CLFLUSH Instructions > -OP CFLUSH_RETIRED : Retired CFLUSH instructions > -OPN RETIRED_CLFLUSH_INSTRUCTIONS > -WHY Match with BKDG > -PF CFLUSH_RETIRED_INST > -PFN RETIRED_CLFLUSH_INSTRUCTIONS > > 27h -CA Retired CPUID Instructions. > -OP CPUID_RETIRED : Retired CPUID instructions > -OPN RETIRED_CPUID_INSTRUCTIONS > -WHY Match with BKDG > -PF CPUID_RETIRED_INST > -PFN RETIRED_CPUID_INSTRUCTIONS > > 40h -CA Data Cache Accesses > -OP DATA_CACHE_ACCESSES : Data cache accesses > -OPN ...No Change... > -PF DC_ACCESS > -PFN DATA_CACHE_ACCESSES > > 41h -CA Data Cache Misses > -OP DATA_CACHE_MISSES : Data cache misses > -OPN ...No Change... > -PF DC_MISS > -PFN DATA_CACHE_MISSES > > 42h -CA Data Cache Refills from L2 or System > -OP DATA_CACHE_REFILLS_FROM_L2 : Data cache refills from L2 > -OPN DATA_CACHE_REFILLS_FROM_L2_OR_SYSTEM > -WHY Match with BKDG; see event mask change for details > -PF DC_REFILL_L2_* > -PFN DATA_CACHE_REFILLS_FROM_*_* > > 43h -CA Data Cache Refills from System > -OP DATA_CACHE_REFILLS_FROM_SYSTEM : Data cache refills from System > -OPN ...No Change... > -PF ...missing... > -PFN DATA_CACHE_REFILLS_FROM_SYSTEM_* > > 44h -CA Data Cache Lines Evicted > -OP DATA_CACHE_WRITEBACKS : Data cache write backs > -OPN DATA_CACHE_LINES_EVICTED > -WHY Match with BKDG > -PF ...missing... > -PFN DATA_CACHE_LINES_EVICTED_* > > 45h -CA L1 DTLB Miss and L2 DLTB Hit > -OP L1_DTLB_MISSES_L2_DTLB_HITS : L1 DTLB misses and L2 DTLB hits > -OPN L1_DTLB_MISS_AND_L2_DLTB_HIT > -WHY Match with BKDG > -PF ...missing... 
> -PFN L1_DTLB_MISS_AND_L2_DLTB_HIT > > 46h -CA L1 DTLB and L2 DLTB Miss > -OP L1_AND_L2_DTLB_MISSES : L1 and L2 DTLB misses > -OPN L1_DTLB_AND_L2_DLTB_MISS > -WHY Match with BKDG > -PF ...missing... > -PFN L1_DTLB_AND_L2_DLTB_MISS > > 47h -CA Misaligned Accesses > -OP MISALIGNED_DATA_REFS : Misaligned data references > -OPN MISALIGNED_ACCESSES > -WHY Match with BKDG > -PF ...missing... > -PFN MISALIGNED_ACCESSES > > 48h -CA Microarchitectural Late Cancel of an Access > -OP ACCESS_CANCEL_LATE : Micro-architectural late cancel of an access > -OPN MICROARCHITECTURAL_LATE_CANCEL_OF_AN_ACCESS > -WHY Match with BKDG > -PF ...missing... > -PFN MICROARCHITECTURAL_LATE_CANCEL_OF_AN_ACCESS > > 49h -CA Microarchitectural Early Cancel of an Access > -OP ACCESS_CANCEL_EARLY : Micro-architectural early cancel of an access > -OPN MICROARCHITECTURAL_EARLY_CANCEL_OF_AN_ACCESS > -WHY Match with BKDG > -PF ...missing... > -PFN MICROARCHITECTURAL_EARLY_CANCEL_OF_AN_ACCESS > > 4Ah -CA Single-bit ECC Errors Recorded by Scrubber > -OP ECC_BIT_ERR : One bit ECC error recorded by scrubbe > -OPN SCRUBBER_SINGLE_BIT_ECC_ERRORS > -WHY Match with perfmon2 > -PF ...missing... > -PFN SCRUBBER_SINGLE_BIT_ECC_ERRORS > PIGGYBACK_SCRUBBER_SINGLE_BIT_ECC_ERRORS > > 4Bh -CA Prefetch Instructions Dispatched > -OP DISPATCHED_PRE_INSTRS : Dispatched prefetch instructions > -OPN PREFETCH_INSTRUCTIONS_DISPATCHED > -WHY Match with BKDG > -PF ...missing... > -PFN PREFETCH_INSTRUCTIONS_DISPATCHED_* > > 4Ch -CA DCACHE Misses by Locked Instructions > -OP ...missing... > -OPN DCACHE_MISS_LOCKED_INSTRUCTIONS > -WHY Match with BKDG > -PF ...missing... > -PFN DCACHE_MISS_LOCKED_INSTRUCTIONS > > 65h -CA Memory Requests by Type > -OP ...missing... > -OPN MEMORY_REQUESTS > -PF ...missing... > -PFN MEMORY_REQUESTS_* > > 67h -CA Data Prefetcher > -OP ...missing... > -OPN DATA_PREFETCHES > -PF ...missing... > -PFN DATA_PREFETCHES_* > > 6Ch -CA System Read Responses by Coherency State > -OP ...missing... 
> -OPN SYSTEM_READ_RESPONSES > -WHY Match with perfmon2 > -PF ...missing... > -PFN SYSTEM_READ_RESPONSES_* > > 6Dh -CA Quadwords Written to System > -OP ...missing... > -OPN QUADWORD_WRITE_TRANSFERS > -WHY Match with perfmon2 > -PF ...missing... > -PF QUADWORD_WRITE_TRANSFERS > > 76h -CA CPU Clocks not Halted > -OP CPU_CLK_UNHALTED : Cycles outside of halt state > -OP ...No change... > -PF CPU_CLK_UNHALTED : CPU clock not in HLT or STPCLK > -PFN CPU_CLK_UNHALTED > > 7Dh -CA Requests to L2 Cache > -OP BU_INT_L2_REQ : Internal L2 request > -OPN REQUESTS_TO_L2 > -WHY Match with BKDG > -PF ...missing... > -PFN REQUESTS_TO_L2_* > > 7Eh -CA L2 Cache Misses > -OP BU_FILL_REQ : Fill request that missed in L2 > -OPN L2_CACHE_MISS > -WHY Match with BKDG > -PF ...missing... > -PFN L2_CACHE_MISS_* > > 7Fh -CA L2 Fill/Writeback > -OP BU_FILL_L2 : Fill into L2 > -OPN L2_CACHE_FILL_WRITEBACK > -WHY Match with BKDG > -PF ...missing... > -PFN L2_CACHE_FILL_WRITEBACK > > 80h -CA Instruction Cache Fetches > -OP ICACHE_FETCHES : Instruction cache fetches > -OPN INSTRUCTION_CACHE_FETCHES > -WHY Match with BKDG > -PF ...missing... > -PFN INSTRUCTION_CACHE_FETCHES > > 81h -CA Instruction Cache Misses > -OP ICACHE_MISSES : Instruction cache misses > -OPN INSTRUCTION_CACHE_MISSES > -WHY Match with BKDG > -PF ...missing... > -PFN INSTRUCTION_CACHE_MISSES > > 82h -CA Instruction Cache Refills from L2 > -OP IC_REFILL_FROM_L2 : Refill from L2 > -OPN INSTRUCTION_CACHE_REFILLS_FROM_L2 > -WHY Match with BKDG > -PF ...missing... > -PFN INSTRUCTION_CACHE_REFILLS_FROM_L2 > > 83h -CA Instruction Cache Refills from System > -OP IC_REFILL_FROM_SYS : Refill from system > -OPN INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM > -WHY Match with BKDG > -PF ...missing... > -PFN INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM > > 84h -CA L1 ITLB Miss, L2 ITLB Hit > -OP L1_ITLB_MISSES_L2_ITLB_HITS : L1 ITLB misses (and L2 ITLB hits) > -OPN L1_ITLB_MISS_AND_L2_ITLB_HIT > -PF ...missing... 
> -PFN L1_ITLB_MISS_AND_L2_ITLB_HIT > > 85h -CA L1 ITLB Miss, L2 ITLB Miss > -OP L1_AND_L2_ITLB_MISSES : L1 and L2 ITLB misses > -OPN L1_ITLB_MISS_AND_L2_ITLB_MISS > -WHY Personally, I prefer the AND here > -PF ...missing... > -PFN L1_ITLB_MISS_AND_L2_ITLB_MISS > > 86h -CA Pipeline Restart Due to Instruction Stream Probe > -OP IC_RESYNC_BY_SNOOP : Micro-architectural re-sync caused by snoop > -OPN PIPELINE_RESTART_DUE_TO_INSTRUCTION_STREAM_PROBE > -WHY Match with BKDG > -PF ...missing... > -PFN PIPELINE_RESTART_DUE_TO_INSTRUCTION_STREAM_PROBE > > 87h -CA Instruction Fetch Stall > -OP IC_FETCH_STALL : Instruction fetch stall > -OPN INSTRUCTION_FETCH_STALL > -WHY Match with BKDG > -PF ...missing... > -PFN INSTRUCTION_FETCH_STALL > > 88h -CA Return Stack Hits > -OP IC_STACK_HIT : Return stack hit > -OPN RETURN_STACK_HITS > -WHY Match with BKDG > -PF ...missing... > -PFN RETURN_STACK_HITS > > 89h -CA Return Stack Overflows > -OP IC_STACK_OVERFLOW : Return stack overflow > -OPN RETURN_STACK_OVERFLOWS > -WHY Match with BKDG > -PF ...missing... > -PFN RETURN_STACK_OVERFLOWS > > C0h -CA Retired Instructions > -OP RETIRED_INSNS : Retired instructions (includes exceptions, interrupts, re-syncs) > -OPN RETIRED_X86_INSTRUCTIONS > -WHY Match with Perfmon2; arguably, Permfon2 should just be RETIRED_INSTRUCTIONS > and if we make that change there, we can make this one RETIRED_INSTRUCTIONS as > well. Will check with Stephane. > -PF RETIRED_X86_INST : Retired x86 instructions including excepti > -PFN RETIRED_X86_INSTRUCTIONS > > 88h -CA Return Stack Hits > -OP IC_STACK_HIT > -OPN RETURN_STACK_HITS > -WHY Match with BKDG > -PF ...missing... > -PFN RETURN_STACK_HITS > > C1h -CA Retired uops > -OP RETIRED_OPS : Retired ops > -OPN RETIRED_UOPS : Retired micro-ops > -WHY Match with BKDG; its not retired ops, anyway. > -PF ...missing... 
> -PFN RETIRED_UOPS > > C2h -CA Retired Branch Instructions > -OP RETIRED_BRANCHES : Retired branches (conditional, unconditional, exceptions, interrupts) > -OPN RETIRED_BRANCH_INSTRUCTIONS > -WHY Match with BKDG > -PF ...missing... > -PFN RETIRED_BRANCH_INSTRUCTIONS > > C3h -CA Retired Mispredicted Branch Instructions > -OP RETIRED_BRANCHES_MISPREDICTED : Retired branches mispredicted > -OPN RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS > -WHY Match with BKDG > -PF ...missing... > -PFN RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS > > C4h -CA Retired Taken Branch Instructions > -OP RETIRED_TAKEN_BRANCHES : Retired taken branches > -OPN RETIRED_TAKEN_BRANCH_INSTRUCTIONS > -WHY Match with BKDG > -PF ...missing... > -PFN RETIRED_TAKEN_BRANCH_INSTRUCTIONS > > C5h -CA Retired Taken Branch Instructions Mispredicted > -OP RETIRED_TAKEN_BRANCHES_MISPREDICTED : Retired taken branches mispredicted > -OPN RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED > -WHY Match with BKDG > -PF ...missing... > -PFN RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED > > C6h -CA Retired Far Control Transfers > -OP RETIRED_FAR_CONTROL_TRANSFERS : Retired far control transfers > -OPN ...no change... > -PF ...missing... > -PFN RETIRED_FAR_CONTROL_TRANSFERS > > C7h -CA Retired Branch Resyncs > -OP RETIRED_RESYNC_BRANCHES : Retired re-sync branches (only non-control transfer branches) > -OPN RETIRED_BRANCH_RESYNCS > -WHY Match with BKDG > -PF ...missing... > -PFN RETIRED_BRANCH_RESYNCS > > C8h -CA Retired Near Returns > -OP RETIRED_NEAR_RETURNS : Retired near returns > -OPN ...no change... > -PF ...missing... > -PFN RETIRED_NEAR_RETURNS > > C9h -CA Retired Near Returns Mispredicted > -OP RETIRED_RETURNS_MISPREDICT : Retired near returns mispredicted > -OPN RETIRED_NEAR_RETURNS_MISPREDICTED > -WHY Match with BKDG > -PF ...missing... 
> -PFN RETIRED_NEAR_RETURNS_MISPREDICTED > > CAh -CA Retired Indirect Branches Mispredicted > -OP RETIRED_BRANCH_MISCOMPARE : Returned taken branches mispredicted due to address miscompare > -OPN RETIRED_INDIRECT_BRANCHES_MISPREDICTED > -WHY Match with BKDG > -PF ...missing... > -PFN RETIRED_INDIRECT_BRANCHES_MISPREDICTED > > CBh -CA Retired MMX/FP Instructions > -OP RETIRED_FPU_INSTRS : Retired FPU instructions > -OPN RETIRED_MMX/FP_INSTRUCTIONS > -PF COMBINED_MMX_3DNOW_RETIRED > COMBINED_PACKED_SSE_SSE2_RETIRED > COMBINED_SCALAR_SSE_SSE2_RETIRED > -PFN RETIRED_X87_INSTRUCTIONS > RETIRED_MMX_AND_3DNOW_INSTRUCTIONS > RETIRED_PACKED_SSE_AND_SSE2_INSTRUCTIONS > RETIRED_SCALAR_SSE_AND_SSE2_INSTRUCTIONS > > CCh -CA Retired Fastpath Double op Instructions > -OP RETIRED_FASTPATH_INSTRS : Retired FastPath double-op instructions > -OPN RETIRED_FASTPATH_DOUBLE_OP_INSTRUCTIONS > -WHY Match with BKDG > -PF ...missing... > -PF RETIRED_FASTPATH_DOUBLE_OP_INSTRUCTIONS_WITH_LOW_OP_IN_POSITION_* > > CDh -CA Interrupts-Masked Cycles > -OP INTERRUPTS_MASKED : Interrupts masked cycles (IF=0) > -OPN INTERRUPTS_MASKED_CYCLES > -WHY Match with BKDG -- make it clear this is cycles > -PF ...missing... > -PFN INTERRUPTS_MASKED_CYCLES > > CEh -CA Interrupts-Masked Cycles with Interrupt Pending > -OP INTERRUPTS_MASKED_PENDING : Interrupts masked while pending cycles (INTR while IF=0) > -OPN INTERRUPTS_MASKED_CYCLES_WITH_INTERRUPT_PENDING > -WHY Match with BKDG -- make it clear this is cycles > -PF ...missing... > -PFN INTERRUPTS_MASKED_CYCLES_WITH_INTERRUPT_PENDING > > CFh -CA Interrupts Taken > -OP HARDWARE_INTERRUPTS : Number of taken hardware interrupts > -OPN INTERRUPTS_TAKEN > -WHY Match with BKDG > -PF ...missing... > -PFN INTERRUPTS_TAKEN > > D0h -CA Decoder Empty > -OP DECODER_EMPTY : Nothing to dispatch (decoder empty) > -OPN ...no change... > -PF ...missing... > -PFN DECODER_EMPTY > > D1h -CA Dispatch Stalls > -OP DISPATCH_STALLS : Dispatch stalls > -OPN ...no change... 
> -PF ...missing... > -PFN DISPATCH_STALLS > > D2h -CA Dispatch Stall for Branch Abort to Retire > -OP DISPATCH_STALL_FROM_BRANCH_ABORT : Dispatch stall from branch abort to retire > -OPN DISPATCH_STALL_FOR_BRANCH_ABORT > -WHY Match with BKDG > -PF ...missing... > -PFN DISPATCH_STALL_FOR_BRANCH_ABORT > > D3h -CA Dispatch Stall for Serialization > -OP DISPATCH_STALL_SERIALIZATION : Dispatch stall for serialization > -OPN DISPATCH_STALL_FOR_SERIALIZATION > -WHY Match with BKDG > -PF ...missing... > -PFN DISPATCH_STALL_FOR_SERIALIZATION > > D4h -CA Dispatch Stall for Segment Load > -OP DISPATCH_STALL_SEG_LOAD : Dispatch stall for segment load > -OPN DISPATCH_STALL_FOR_SEGMENT_LOAD > -WHY Match with BKDG > -PF ...missing... > -PFN DISPATCH_STALL_FOR_SEGMENT_LOAD > > D5h -CA Dispatch Stall for Reorder Buffer Full > -OP DISPATCH_STALL_REORDER_BUFFER : Dispatch stall when reorder buffer is full > -OPN DISPATCH_STALL_FOR_REORDER_BUFFER_FULL > -WHY Match with BKDG > -PF ...missing... > -PFN DISPATCH_STALL_FOR_REORDER_BUFFER_FULL > > D6h -CA Dispatch Stall for Reservation Station Full > -OP DISPATCH_STALL_RESERVE_STATIONS : Dispatch stall when reservation stations are full > -OPN DISPATCH_STALL_FOR_RESERVATION_STATION_FULL > -WHY Match with BKDG > -PF ...missing... > -PFN DISPATCH_STALL_FOR_RESERVATION_STATION_FULL > > D7h -CA Dispatch Stall for FPU Full > -OP DISPATCH_STALL_FPU : Dispatch stall when FPU is full > -OPN DISPATCH_STALL_FOR_FPU_FULL > -WHY Match with BKDG > -PF ...missing... > -PFN DISPATCH_STALL_FOR_FPU_FULL > > D8h -CA Dispatch Stall for LS Full > -OP DISPATCH_STALL_LS : Dispatch stall when LS is full > -OPN DISPATCH_STALL_FOR_LS_FULL > -WHY Match with BKDG > -PF ...missing... > -PFN DISPATCH_STALL_FOR_LS_FULL > > D9h -CA Dispatch Stall Waiting for All Quiet > -OP DISPATCH_STALL_QUIET_WAIT : Dispatch stall when waiting for all to be quiet > -OPN DISPATCH_STALL_WAITING_FOR_ALL_QUIET > -WHY Match with BKDG > -PF ...missing... 
> -PFN DISPATCH_STALL_WAITING_FOR_ALL_QUIET > > DAh -CA Dispatch Stall for Far Transfer or Resync to Retire > -OP DISPATCH_STALL_PENDING : Dispatch stall when far control transfer or re-sync branch is pending > -OPN DISPATCH_STALL_FOR_FAR_TRANSFER_OR_RESYNC > -PF ...missing... > -PFN DISPATCH_STALL_FOR_FAR_TRANSFER_OR_RESYNC > > DBh -CA FPU Exceptions > -OP FPU_EXCEPTIONS : FPU exceptions > -OPN ...no change... > -PF ...missing... > -PFN FPU_EXECPTIONS_* > > DCh -CA DR0 Breakpoint Matches > -OP DR0_BREAKPOINTS : Number of breakpoints for DR0 > -OPN ...no change... > -PF ...missing... > -PFN DR0_BREAKPOINT_MATCHES > > DDh -CA DR1 Breakpoint Matches > -OP DR1_BREAKPOINTS : Number of breakpoints for DR1 > -OPN ...no change... > -PF ...missing... > -PFN DR1_BREAKPOINT_MATCHES > > DEh -CA DR2 Breakpoint Matches > -OP DR2_BREAKPOINTS : Number of breakpoints for DR2 > -OPN ...no change... > -PF ...missing... > -PFN DR2_BREAKPOINT_MATCHES > > DFh -CA DR3 Breakpoint Matches > -OP DR3_BREAKPOINTS : Number of breakpoints for DR3 > -OPN ...no change... > -PF ...missing... > > E0h -CA DRAM Accesses > -OP MEM_PAGE_ACCESS : Memory controller page access > -OPN DRAM_ACCESSES > -WHY Match with BKDG > -PF ...missing... > -PFN DRAM_ACCESSES_* > > E1h -CA Memory Controller Page Table Overflows > -OP MEM_PAGE_TBL_OVERFLOW : Memory controller page table overflow > -OPN MEMORY_CONTROLLER_PAGE_TABLE_OVERFLOWS > -WHY Match with BKDG > -PF ...missing... > -PFN MEMORY_CONTROLLER_PAGE_TABLE_OVERFLOWS > > E2h -CA ...missing... > -OP DRAM_SLOTS_MISSED : Memory controller DRAM command slots missed (in MemClks) > -OPN ...delete... > -PF ...missing... > -PFN ...missing... not in BKDG > > E3h -CA Memory Controller Turnarounds > -OP MEM_TURNAROUND : Memory controller turnaround > -OPN MEMORY_CONTROLLER_TURNAROUNDS > -WHY Match with BKDG > -PF ...missing... 
> -PFN MEMORY_CONTROLLER_TURNAROUNDS_* > > E4h -CA Memory Controller Bypass Counter Saturation > -OP MEM_BYPASS_SAT : Memory controller bypass saturation > -OPN MEMORY_COMTROLLER_BYPASS_COUNTER_SATURATION > -WHY Match with BKDG > -PF ...missing... > -PFN MEMORY_CONTROLLER_HIGH_PRIORITY_BYPASS > MEMORY_CONTROLLER_LOW_PRIORITY_BYPASS > DRAM_CONTROLLER_INTERFACE_BYPASS > DRAM_CONTROLLER_QUEUE_BYPASS > > E5h -CA Sized Blocks > -OP ...missing... > -OPN SIZED_BLOCKS > -WHY Match with BKDG > -PF ...missing... > -PFN SIZE_32_BYTE_WRITES > SIZE_64_BYTE_WRITES > SIZE_32_BYTE_READS > SIZE_64_BYTE_READS > > E8h -CA ECC Errors > -OP ...missing... > -OPN DRAM_ECC_ERRORS > -WHY Distinguish between DRAM and Scrubber Errors > -PF ...missing... > -PFN DRAM_ECC_ERRORS > > E9h -CA CPU/IO Requests to Memory/IO (RevE) > -OP ...missing... > -OPN CPU/IO_REQUESTS_TO_MEMORY/IO > -PF ...missing... > -PFN <Note: See the unit_mask list> > > EAh -CA Cache Block Commands (RevE) > -OP ...missing... > -OPN CACHE_BLOCK_COMMANDS > -WHY Match with BKDG > -PF ...missing... > -PFN CACHE_BLOCK_VICTIM_WRITEBACK > CACHE_BLOCK_DCACHE_LOAD_MISS > CACHE_BLOCK_SHARED_ICACHE_REFILL > CACHE_BLOCK_READ_BLOCK_MODIFIED > CACHE_BLOCK_READ_TO_DIRTY > > EBh -CA Sized Commands > -OP SIZED_COMMANDS : Sized Commands > -OPN ...no change... > -PF ...missing... > -PFN NON_POSTED_WRITE_BYTE > POSTED_WRITE_BYTE > POSTED_WRITE_DWORD > READ_BYTE_4_BYTES > READ_DWORD_1_16_DWORDS > READ_MODIFY_WRITE > > ECh -CA Probe Responses and Upstream Requests > -OP PROBE_RESULT : Probe Result > -OPN PROBE_RESPONSES_AND_UPSTREAM_REQUESTS > -WHY Match with BKDG > -PF ...missing... > -PFN PROBE_MISS > PROBE_HIT_CLEAN > PROBE_HIT_CLEAN_NO_MEMORY_CANCEL > PROBE_HIT_DIRTY_WITH_MEMORY_CANCEL > UPSTREAM_DISPLAY_REFRESH_READS > UPSTREAM_NON_DISPLAY_REFRESH_READS > UPSTREAM_WRITES > > EEh -CA GART Events > -OP ...missing. > -OPN GART_APERTURE_HIT > -WHY Unit Mask description > -PF ...missing... 
> -PFN GART_APERTURE_HIT_FROM_* > > F6h -CA HyperTransport Link 0 Transmit Bandwidth > -OP HYPERTRANSPORT_BUS0_WIDTH : HyperTransport(tm) bus 0 bandwidth > -OPN HYPERTRANSPORT_LINK0_BANDWIDTH > -WHY Its Hypertransport link, not bus, in AMD terminology > -PF HYPERTRANSPORT_BANDWIDTH > -PFN HT0_COMMAND_DWORD_SENT > HT0_DATA_DWORD_SENT > HT0_BUFFER_RELEASE_DWORD_SENT > HT0_NOP_DWORD_SENT > > F7h -CA HyperTransport Link 1 Transmit Bandwidth > -OP HYPERTRANSPORT_BUS1_WIDTH : HyperTransport(tm) bus 1 bandwidth > -OPN HYPERTRANSPORT_LINK1_BANDWIDTH > -WHY Its Hypertransport link, not bus, in AMD terminology > -PF ...missing... > -PFN HT0_COMMAND_DWORD_SENT > HT0_DATA_DWORD_SENT > HT0_BUFFER_RELEASE_DWORD_SENT > HT0_NOP_DWORD_SENT > > F8h -CA HyperTransport Link 2 Transmit Bandwidth > -OP HYPERTRANSPORT_BUS2_WIDTH : HyperTransport(tm) bus 2 bandwidth > -OPN HYPERTRANSPORT_LINK2_BANDWIDTH > -WHY Its Hypertransport link, not bus, in AMD terminology > -PF ...missing... > -PFN HT0_COMMAND_DWORD_SENT > HT0_DATA_DWORD_SENT > HT0_BUFFER_RELEASE_DWORD_SENT > HT0_NOP_DWORD_SENT -- -Stephane |
From: Ray B. <ra...@mp...> - 2006-02-17 23:22:32
|
On Friday 17 February 2006 03:58, Stephane Eranian wrote:
> Ray,
>
> I don't have a problem with your proposed changes. The important
> thing to understand about perfmon2 is that the kernel has no knowledge
> of events (encodings, names). As such every application is free to use
> whatever naming it wants. The libpfm library is a simple helper
> library, it is NOT REQUIRED to use the perfmon interface, in fact,
> it does not use it at all. As an example, the HP Caliper tool does not
> use libpfm yet it runs on top of perfmon.
>

Understood. I'm being imprecise when I talk about something running on top of perfmon2 and using the naming I am working on. I should be saying someone using libpfm to set up calls to perfmon2, etc.

> From a user's perspective, I think your initiative makes a lot of sense.
>
> Having a uniform naming across tools does bring clarity and portability
> between tools which is good.
>
> Now, one thing I ran into when I looked at the event description for AMD,
> is the fact that for some events, multiple unit masks can be combined
> (Or'ed). It is not totally spelled out but I suspect this is the case.
> For instance, DC_REFILL, it looks like you could mix the various
> unit masks to build a combination, e.g.,
> DC_REFILL_FROM_L2_INVALID_OR_SHARED. Unit mask combinations exist on
> Itanium, so I suspect this is also possible on AMD. Can you confirm?
>

Yes. In most cases you can combine the various unit mask bit values. However, for simplicity's sake, I'm not going to provide names for all such possible combinations in perfmon2. For example, for

Event Select: 20h - Segment Register Loads

the BKDG defines the unit masks as follows:

    UNIT_MASK 01h ES
    UNIT_MASK 02h CS
    UNIT_MASK 04h SS
    UNIT_MASK 08h DS
    UNIT_MASK 10h FS
    UNIT_MASK 20h GS
    UNIT_MASK 40h HS

I'm proposing that the perfmon2 names be:

    SEGMENT_REGISTER_LOADS_ES 0x0120
    SEGMENT_REGISTER_LOADS_CS 0x0220
    etc...

each for one type of segment register, and then a single combined event:

    SEGMENT_REGISTER_LOADS_ALL 0x7F20

So with these names you won't be able to count loads of, say, just ES and CS at the same time. There just get to be too many events that way.

With oprofile, one specifies the unit_mask as a number, so you can choose whatever set of segment registers you want:

    opcontrol ... --event=<event_name>:<count>:<unit-mask>:<kernel>:<user>

e.g.:

    opcontrol ... --event=SEGMENT_REGISTER_LOADS:30000:3:1:1

This would profile on segment register loads for both ES and CS, since the unit mask is 3 (01h OR'ed with 02h).

In some cases, the unit_mask is a specific value. For example, for event E9 (new in Revision E), you have the following:

Event Select: E9h - CPU/IO Requests to Memory/IO (Revision E)

    UNIT_MASK A2h Requests Local I/O to Local Memory
    UNIT_MASK A1h Requests Local I/O to Local I/O
    UNIT_MASK A3h Requests Local I/O to Local Any
    UNIT_MASK AAh Requests Local Any to Local Memory
    UNIT_MASK A5h Requests Local Any to Local I/O
    UNIT_MASK AFh Requests Local Any to Local Any
    UNIT_MASK 98h Requests Local CPU to Remote Memory
    UNIT_MASK 94h Requests Local CPU to Remote I/O
    UNIT_MASK 9Ch Requests Local CPU to Remote Any
    UNIT_MASK 92h Requests Local I/O to Remote Memory
    UNIT_MASK 91h Requests Local I/O to Remote I/O
    UNIT_MASK 93h Requests Local I/O to Remote Any
    UNIT_MASK 9Ah Requests Local Any to Remote Memory
    UNIT_MASK 95h Requests Local Any to Remote I/O
    UNIT_MASK 9Fh Requests Local Any to Remote Any
    UNIT_MASK B8h Requests Local CPU to Any Memory
    UNIT_MASK B4h Requests Local CPU to Any I/O
    UNIT_MASK BCh Requests Local CPU to Any Any
    UNIT_MASK B2h Requests Local I/O to Any Memory
    UNIT_MASK B1h Requests Local I/O to Any I/O
    UNIT_MASK B3h Requests Local I/O to Any Any
    UNIT_MASK BAh Requests Local Any to Any Memory
    UNIT_MASK B5h Requests Local Any to Any I/O
    UNIT_MASK BFh Requests Local Any to Any Any
    UNIT_MASK 64h Requests Remote CPU to Local I/O
    UNIT_MASK 61h Requests Remote I/O to Local I/O
    UNIT_MASK 65h Requests Remote Any to Local I/O

These are the only legal UNIT_MASK values, and you can't OR them together. In this case, I've chosen to make the perfmon2 name the unit mask name; that is, the name for perfmon2 event 0xA2E9 will be REQUESTS_LOCAL_I/O_TO_LOCAL_MEMORY. The same event under oprofile will be invoked as:

    opcontrol --event=CPU/IO_REQUESTS_TO_MEMORY_I/O:3000:162:1:1

(162 = 0xA2, I think... :-) ). There's just no good way to make the names the same here without having an absurdly long name for the perfmon2 event.

I'm hoping the CA folks will pick up the naming I am proposing for perfmon2. In most cases, I'm using the same names they are, but the E9 event is newer, so I'm not sure what is happening there.

> On Thu, Feb 16, 2006 at 06:57:23PM -0600, Ray Bryant wrote:
> > The attached file is a summary of the various names for the Opteron
> > events in oprofile, Code Analyst, and perfmon. It also shows what
> > names I am suggesting be used instead.
> >
> > Any comments would be appreciated.
> > --
> > Ray Bryant
> > AMD Performance Labs Austin, Tx
> > 512-602-0038 (o) 512-507-7807 (c)
> >

<snip>

--
Ray Bryant
AMD Performance Labs Austin, Tx
512-602-0038 (o) 512-507-7807 (c) |
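The encoding discussed in the message above can be sketched in a few lines of Python. This is only an illustration, not code from libpfm or oprofile: the helper names (`perfmon_code`, `combine_unit_masks`, `opcontrol_event`) are hypothetical, and the layout "unit mask in the high byte, event select in the low byte" is an assumption inferred from the example codes 0x0120, 0x7F20, and 0xA2E9 quoted in the thread.

```python
# Unit mask bits for event select 20h (Segment Register Loads), as listed
# in the message above.
SEGMENT_UNIT_MASKS = {
    "ES": 0x01, "CS": 0x02, "SS": 0x04, "DS": 0x08,
    "FS": 0x10, "GS": 0x20, "HS": 0x40,
}
EVENT_SEGMENT_REGISTER_LOADS = 0x20


def perfmon_code(event_select: int, unit_mask: int) -> int:
    """Pack (event select, unit mask) into one 16-bit code.

    Assumed layout, inferred from the thread's examples:
    unit mask in bits 15..8, event select in bits 7..0.
    """
    if not (0 <= event_select <= 0xFF and 0 <= unit_mask <= 0xFF):
        raise ValueError("event select and unit mask must each fit in one byte")
    return (unit_mask << 8) | event_select


def combine_unit_masks(*registers: str) -> int:
    """OR together the unit mask bits of the named segment registers."""
    mask = 0
    for reg in registers:
        mask |= SEGMENT_UNIT_MASKS[reg]
    return mask


def opcontrol_event(name: str, count: int, unit_mask: int,
                    kernel: int = 1, user: int = 1) -> str:
    """Format <event_name>:<count>:<unit-mask>:<kernel>:<user> for --event=."""
    return f"--event={name}:{count}:{unit_mask}:{kernel}:{user}"


if __name__ == "__main__":
    all_mask = combine_unit_masks(*SEGMENT_UNIT_MASKS)
    print(hex(perfmon_code(EVENT_SEGMENT_REGISTER_LOADS, all_mask)))
    print(opcontrol_event("SEGMENT_REGISTER_LOADS", 30000,
                          combine_unit_masks("ES", "CS")))
```

Under these assumptions, a bit-mask event such as 20h accepts any OR of its unit mask bits, while an enumerated event such as E9h would need a lookup of the legal values instead of `combine_unit_masks`.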
From: John L. <le...@mo...> - 2006-02-17 13:55:35
|
On Thu, Feb 16, 2006 at 06:57:23PM -0600, Ray Bryant wrote:
> The attached file is a summary of the various names for the Opteron events in
> oprofile, Code Analyst, and perfmon. It also shows what names I am
> suggesting be used instead.

You should run this past Andi Kleen.

> 25h -CA ...missing...
> -OP OP_LATE_CANCEL : Micro-architectural late cancel of an operation
> -OPN ...remove....

No, why remove it?

> C0h -CA Retired Instructions
> -OP RETIRED_INSNS : Retired instructions (includes exceptions, interrupts, re-syncs)
> -OPN RETIRED_X86_INSTRUCTIONS

No, fix perfmon, like you suggested.

> E2h -CA ...missing...
> -OP DRAM_SLOTS_MISSED : Memory controller DRAM command slots missed (in MemClks)
> -OPN ...delete...
> E8h -CA ECC Errors
> -OP ...missing...
> -OPN DRAM_ECC_ERRORS

OK, this one clearly makes no sense for oprofile. Have you verified all these 'missing' ones are actually meaningful for a statistical profiler?

regards
john |
From: Ray B. <ra...@mp...> - 2006-02-17 19:02:12
|
On Friday 17 February 2006 07:56, John Levon wrote:

<snip>

> > E2h -CA ...missing...
> > -OP DRAM_SLOTS_MISSED : Memory controller DRAM command slots missed
> > (in MemClks) -OPN ...delete...
> >
> > E8h -CA ECC Errors
> > -OP ...missing...
> > -OPN DRAM_ECC_ERRORS
>
> OK, this one clearly makes no sense for oprofile. Have you verified all
> these 'missing' ones are actually meaningful for a statistical profiler?
>

No. At the moment I was more concerned with making the event list match what is in the BIOS and Kernel Developer's Guide. I can certainly go through the list, determine which events are plausible for oprofile to use, and trim the list as appropriate. Would that help?

> regards
> john

--
Ray Bryant
AMD Performance Labs Austin, Tx
512-602-0038 (o) 512-507-7807 (c) |
From: William C. <wc...@re...> - 2006-07-20 19:16:06
|
Hi Ray, Do you have a patch for OProfile to provide the matching names? We would like to get this into OProfile to have consistent event names. Do the patches also change the event names on the subset of events available on the Athlon? -Will Ray Bryant wrote: > The attached file is a summary of the various names for the Opteron events in > oprofile, Code Analyst, and perfmon. It also shows what names I am > suggesting be used instead. > > Any comments would be appreciated. > > > ------------------------------------------------------------------------ > > Merged Event List Version 1.0 2/14/2006 > Key: > -CA Code Analyst > -OP oprofile > -OPN proposed new oprofile name > -WHY why the change to the above > -PF current perfmon2 name > -PFN proposed new perfmon2 name > > Notes: > (1) The CA name is normally the same as the name in the BKDG. > (2) For perfmon, the event name is actually the name for a particular > (event select,unit mask) pair. If the event name is of the form FOO_*, > then it indicates the rest of the name is specified by the unit_mask name. > (3) The initial set of events done for perfmon2 was known to be a partial > set, so a lot of the events are missing for that case. > > In general, I've tried to make the new oprofile name match the PFN and CA > names. However, since the PFN name includes the unit mask value, this is > not always possible. (e. g. see events F6/F7/F8, where in PFN, we name each event > pair, but oprofile just names the event select value). > ------------------------------------------------------------------------------------------- > > 00h -CA Dispatched FPU Operations > -OP DISPATCHED_FPU_OPS : Dispatched FPU ops > -OPN ...No Change... 
> -PF DISPATCHED_FP_OPS_* > -PFN DISPATCHED_FPU_OPS_* > > 01h -CA Cycles with no FPU Ops Retired > -OP Cycles with no FPU ops retired > -PF CYCLES_NO_FP_OPS_RETIRED > -PFN CYCLES_NO_FPU_OPS_RETIRED > > 02h -CA Dispatched Fast Flag FPU Operations > -OP FAST_FPU_OPS : Dispatched FPU ops that use the fast flag interface > -OPN DISPATCHED_FPU_OPS_FAST_FLAG > -WHY Match with perfmon2 > -PF DISPATCHED_FP_OPS_FAST_FLAG > -PFN DISPATCHED_FPU_OPS_FAST_FLAG > > 20h -CA Segment Register Loads > -OP SEG_REG_LOAD : Segment register load > -OPN SEGMENT_REGISTER_LOADS > -WHY Match with BKDG > -PF SEG_REG_LOAD_* > -PFN SEGMENT_REGISTER_LOADS_* > > 21h -CA Pipeline Restart Due to Self-Modifying Code > -OP SELF_MODIFY_RESYNC : Micro-architectural re-sync caused by self modifying code > -OPN PIPELINE_RESTART_DUE_TO_SELF_MODIFYING_CODE > -WHY Match with BKDG > -PF MICRO_ARCH_RESYNC_SELF_MOD_CODE > -PFN PIPELINE_RESTART_DUE_TO_SELF_MODIFYING_CODE > > 22h -CA Pipeline Restart Due to Probe Hit > -OP SNOOP_RESYNC : Micro-architectural re-sync caused by snoop > -OPN PIPELINE_RESTART_DUE_TO_PROBE_HIT > -WHY Match with BKDG > -PF MICRO_ARCH_RESYNC_SNOOP > -PFN PIPELINE_RESTART_DUE_TO_PROBE_HIT > > 23h -CA LS Buffer 2 Full > -OP LS_BUFFER_FULL : LS Buffer 2 Full > -OPN LS_BUFFER_2_FULL_CYCLES > -WHY Match with BKDG; tag as cycles > -PF LS_BUFFER_2_FULL > -PFN LS_BUFFER_2_FULL_CYCLES > > 24h -CA Locked Operations > -OP LOCKED_OP : Locked operation > -OPN LOCKED_OPS > -PF LOCKED_OPS_EXEC > -PFN LOCKED_OPS_EXECUTED > LOCKED_OPS_CYCLES_SPECULATIVE_PHASE > LOCKED_OPS_CYCLES_NON_SPECULATIVE_PHASE > > 25h -CA ...missing... > -OP OP_LATE_CANCEL : Micro-architectural late cancel of an operation > -OPN ...remove.... > -WHY Match with BKDG > -PF MICRO_ARCH_LATE_CANCEL > -PFN ...missing... 
not in BKDG > > 26h -CA Retired CLFLUSH Instructions > -OP CFLUSH_RETIRED : Retired CFLUSH instructions > -OPN RETIRED_CLFLUSH_INSTRUCTIONS > -WHY Match with BKDG > -PF CFLUSH_RETIRED_INST > -PFN RETIRED_CLFLUSH_INSTRUCTIONS > > 27h -CA Retired CPUID Instructions. > -OP CPUID_RETIRED : Retired CPUID instructions > -OPN RETIRED_CPUID_INSTRUCTIONS > -WHY Match with BKDG > -PF CPUID_RETIRED_INST > -PFN RETIRED_CPUID_INSTRUCTIONS > > 40h -CA Data Cache Accesses > -OP DATA_CACHE_ACCESSES : Data cache accesses > -OPN ...No Change... > -PF DC_ACCESS > -PFN DATA_CACHE_ACCESSES > > 41h -CA Data Cache Misses > -OP DATA_CACHE_MISSES : Data cache misses > -OPN ...No Change... > -PF DC_MISS > -PFN DATA_CACHE_MISSES > > 42h -CA Data Cache Refills from L2 or System > -OP DATA_CACHE_REFILLS_FROM_L2 : Data cache refills from L2 > -OPN DATA_CACHE_REFILLS_FROM_L2_OR_SYSTEM > -WHY Match with BKDG; see event mask change for details > -PF DC_REFILL_L2_* > -PFN DATA_CACHE_REFILLS_FROM_*_* > > 43h -CA Data Cache Refills from System > -OP DATA_CACHE_REFILLS_FROM_SYSTEM : Data cache refills from System > -OPN ...No Change... > -PF ...missing... > -PFN DATA_CACHE_REFILLS_FROM_SYSTEM_* > > 44h -CA Data Cache Lines Evicted > -OP DATA_CACHE_WRITEBACKS : Data cache write backs > -OPN DATA_CACHE_LINES_EVICTED > -WHY Match with BKDG > -PF ...missing... > -PFN DATA_CACHE_LINES_EVICTED_* > > 45h -CA L1 DTLB Miss and L2 DLTB Hit > -OP L1_DTLB_MISSES_L2_DTLB_HITS : L1 DTLB misses and L2 DTLB hits > -OPN L1_DTLB_MISS_AND_L2_DLTB_HIT > -WHY Match with BKDG > -PF ...missing... > -PFN L1_DTLB_MISS_AND_L2_DLTB_HIT > > 46h -CA L1 DTLB and L2 DLTB Miss > -OP L1_AND_L2_DTLB_MISSES : L1 and L2 DTLB misses > -OPN L1_DTLB_AND_L2_DLTB_MISS > -WHY Match with BKDG > -PF ...missing... > -PFN L1_DTLB_AND_L2_DLTB_MISS > > 47h -CA Misaligned Accesses > -OP MISALIGNED_DATA_REFS : Misaligned data references > -OPN MISALIGNED_ACCESSES > -WHY Match with BKDG > -PF ...missing... 
> -PFN MISALIGNED_ACCESSES > > 48h -CA Microarchitectural Late Cancel of an Access > -OP ACCESS_CANCEL_LATE : Micro-architectural late cancel of an access > -OPN MICROARCHITECTURAL_LATE_CANCEL_OF_AN_ACCESS > -WHY Match with BKDG > -PF ...missing... > -PFN MICROARCHITECTURAL_LATE_CANCEL_OF_AN_ACCESS > > 49h -CA Microarchitectural Early Cancel of an Access > -OP ACCESS_CANCEL_EARLY : Micro-architectural early cancel of an access > -OPN MICROARCHITECTURAL_EARLY_CANCEL_OF_AN_ACCESS > -WHY Match with BKDG > -PF ...missing... > -PFN MICROARCHITECTURAL_EARLY_CANCEL_OF_AN_ACCESS > > 4Ah -CA Single-bit ECC Errors Recorded by Scrubber > -OP ECC_BIT_ERR : One bit ECC error recorded by scrubbe > -OPN SCRUBBER_SINGLE_BIT_ECC_ERRORS > -WHY Match with perfmon2 > -PF ...missing... > -PFN SCRUBBER_SINGLE_BIT_ECC_ERRORS > PIGGYBACK_SCRUBBER_SINGLE_BIT_ECC_ERRORS > > 4Bh -CA Prefetch Instructions Dispatched > -OP DISPATCHED_PRE_INSTRS : Dispatched prefetch instructions > -OPN PREFETCH_INSTRUCTIONS_DISPATCHED > -WHY Match with BKDG > -PF ...missing... > -PFN PREFETCH_INSTRUCTIONS_DISPATCHED_* > > 4Ch -CA DCACHE Misses by Locked Instructions > -OP ...missing... > -OPN DCACHE_MISS_LOCKED_INSTRUCTIONS > -WHY Match with BKDG > -PF ...missing... > -PFN DCACHE_MISS_LOCKED_INSTRUCTIONS > > 65h -CA Memory Requests by Type > -OP ...missing... > -OPN MEMORY_REQUESTS > -PF ...missing... > -PFN MEMORY_REQUESTS_* > > 67h -CA Data Prefetcher > -OP ...missing... > -OPN DATA_PREFETCHES > -PF ...missing... > -PFN DATA_PREFETCHES_* > > 6Ch -CA System Read Responses by Coherency State > -OP ...missing... > -OPN SYSTEM_READ_RESPONSES > -WHY Match with perfmon2 > -PF ...missing... > -PFN SYSTEM_READ_RESPONSES_* > > 6Dh -CA Quadwords Written to System > -OP ...missing... > -OPN QUADWORD_WRITE_TRANSFERS > -WHY Match with perfmon2 > -PF ...missing... > -PFN QUADWORD_WRITE_TRANSFERS > > 76h -CA CPU Clocks not Halted > -OP CPU_CLK_UNHALTED : Cycles outside of halt state > -OPN ...No change... 
> -PF CPU_CLK_UNHALTED : CPU clock not in HLT or STPCLK > -PFN CPU_CLK_UNHALTED > > 7Dh -CA Requests to L2 Cache > -OP BU_INT_L2_REQ : Internal L2 request > -OPN REQUESTS_TO_L2 > -WHY Match with BKDG > -PF ...missing... > -PFN REQUESTS_TO_L2_* > > 7Eh -CA L2 Cache Misses > -OP BU_FILL_REQ : Fill request that missed in L2 > -OPN L2_CACHE_MISS > -WHY Match with BKDG > -PF ...missing... > -PFN L2_CACHE_MISS_* > > 7Fh -CA L2 Fill/Writeback > -OP BU_FILL_L2 : Fill into L2 > -OPN L2_CACHE_FILL_WRITEBACK > -WHY Match with BKDG > -PF ...missing... > -PFN L2_CACHE_FILL_WRITEBACK > > 80h -CA Instruction Cache Fetches > -OP ICACHE_FETCHES : Instruction cache fetches > -OPN INSTRUCTION_CACHE_FETCHES > -WHY Match with BKDG > -PF ...missing... > -PFN INSTRUCTION_CACHE_FETCHES > > 81h -CA Instruction Cache Misses > -OP ICACHE_MISSES : Instruction cache misses > -OPN INSTRUCTION_CACHE_MISSES > -WHY Match with BKDG > -PF ...missing... > -PFN INSTRUCTION_CACHE_MISSES > > 82h -CA Instruction Cache Refills from L2 > -OP IC_REFILL_FROM_L2 : Refill from L2 > -OPN INSTRUCTION_CACHE_REFILLS_FROM_L2 > -WHY Match with BKDG > -PF ...missing... > -PFN INSTRUCTION_CACHE_REFILLS_FROM_L2 > > 83h -CA Instruction Cache Refills from System > -OP IC_REFILL_FROM_SYS : Refill from system > -OPN INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM > -WHY Match with BKDG > -PF ...missing... > -PFN INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM > > 84h -CA L1 ITLB Miss, L2 ITLB Hit > -OP L1_ITLB_MISSES_L2_ITLB_HITS : L1 ITLB misses (and L2 ITLB hits) > -OPN L1_ITLB_MISS_AND_L2_ITLB_HIT > -PF ...missing... > -PFN L1_ITLB_MISS_AND_L2_ITLB_HIT > > 85h -CA L1 ITLB Miss, L2 ITLB Miss > -OP L1_AND_L2_ITLB_MISSES : L1 and L2 ITLB misses > -OPN L1_ITLB_MISS_AND_L2_ITLB_MISS > -WHY Personally, I prefer the AND here > -PF ...missing... 
> -PFN L1_ITLB_MISS_AND_L2_ITLB_MISS > > 86h -CA Pipeline Restart Due to Instruction Stream Probe > -OP IC_RESYNC_BY_SNOOP : Micro-architectural re-sync caused by snoop > -OPN PIPELINE_RESTART_DUE_TO_INSTRUCTION_STREAM_PROBE > -WHY Match with BKDG > -PF ...missing... > -PFN PIPELINE_RESTART_DUE_TO_INSTRUCTION_STREAM_PROBE > > 87h -CA Instruction Fetch Stall > -OP IC_FETCH_STALL : Instruction fetch stall > -OPN INSTRUCTION_FETCH_STALL > -WHY Match with BKDG > -PF ...missing... > -PFN INSTRUCTION_FETCH_STALL > > 88h -CA Return Stack Hits > -OP IC_STACK_HIT : Return stack hit > -OPN RETURN_STACK_HITS > -WHY Match with BKDG > -PF ...missing... > -PFN RETURN_STACK_HITS > > 89h -CA Return Stack Overflows > -OP IC_STACK_OVERFLOW : Return stack overflow > -OPN RETURN_STACK_OVERFLOWS > -WHY Match with BKDG > -PF ...missing... > -PFN RETURN_STACK_OVERFLOWS > > C0h -CA Retired Instructions > -OP RETIRED_INSNS : Retired instructions (includes exceptions, interrupts, re-syncs) > -OPN RETIRED_X86_INSTRUCTIONS > -WHY Match with Perfmon2; arguably, Permfon2 should just be RETIRED_INSTRUCTIONS > and if we make that change there, we can make this one RETIRED_INSTRUCTIONS as > well. Will check with Stephane. > -PF RETIRED_X86_INST : Retired x86 instructions including excepti > -PFN RETIRED_X86_INSTRUCTIONS > > 88h -CA Return Stack Hits > -OP IC_STACK_HIT > -OPN RETURN_STACK_HITS > -WHY Match with BKDG > -PF ...missing... > -PFN RETURN_STACK_HITS > > C1h -CA Retired uops > -OP RETIRED_OPS : Retired ops > -OPN RETIRED_UOPS : Retired micro-ops > -WHY Match with BKDG; its not retired ops, anyway. > -PF ...missing... > -PFN RETIRED_UOPS > > C2h -CA Retired Branch Instructions > -OP RETIRED_BRANCHES : Retired branches (conditional, unconditional, exceptions, interrupts) > -OPN RETIRED_BRANCH_INSTRUCTIONS > -WHY Match with BKDG > -PF ...missing... 
> -PFN RETIRED_BRANCH_INSTRUCTIONS > > C3h -CA Retired Mispredicted Branch Instructions > -OP RETIRED_BRANCHES_MISPREDICTED : Retired branches mispredicted > -OPN RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS > -WHY Match with BKDG > -PF ...missing... > -PFN RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS > > C4h -CA Retired Taken Branch Instructions > -OP RETIRED_TAKEN_BRANCHES : Retired taken branches > -OPN RETIRED_TAKEN_BRANCH_INSTRUCTIONS > -WHY Match with BKDG > -PF ...missing... > -PFN RETIRED_TAKEN_BRANCH_INSTRUCTIONS > > C5h -CA Retired Taken Branch Instructions Mispredicted > -OP RETIRED_TAKEN_BRANCHES_MISPREDICTED : Retired taken branches mispredicted > -OPN RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED > -WHY Match with BKDG > -PF ...missing... > -PFN RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED > > C6h -CA Retired Far Control Transfers > -OP RETIRED_FAR_CONTROL_TRANSFERS : Retired far control transfers > -OPN ...no change... > -PF ...missing... > -PFN RETIRED_FAR_CONTROL_TRANSFERS > > C7h -CA Retired Branch Resyncs > -OP RETIRED_RESYNC_BRANCHES : Retired re-sync branches (only non-control transfer branches) > -OPN RETIRED_BRANCH_RESYNCS > -WHY Match with BKDG > -PF ...missing... > -PFN RETIRED_BRANCH_RESYNCS > > C8h -CA Retired Near Returns > -OP RETIRED_NEAR_RETURNS : Retired near returns > -OPN ...no change... > -PF ...missing... > -PFN RETIRED_NEAR_RETURNS > > C9h -CA Retired Near Returns Mispredicted > -OP RETIRED_RETURNS_MISPREDICT : Retired near returns mispredicted > -OPN RETIRED_NEAR_RETURNS_MISPREDICTED > -WHY Match with BKDG > -PF ...missing... > -PFN RETIRED_NEAR_RETURNS_MISPREDICTED > > CAh -CA Retired Indirect Branches Mispredicted > -OP RETIRED_BRANCH_MISCOMPARE : Returned taken branches mispredicted due to address miscompare > -OPN RETIRED_INDIRECT_BRANCHES_MISPREDICTED > -WHY Match with BKDG > -PF ...missing... 
> -PFN RETIRED_INDIRECT_BRANCHES_MISPREDICTED > > CBh -CA Retired MMX/FP Instructions > -OP RETIRED_FPU_INSTRS : Retired FPU instructions > -OPN RETIRED_MMX/FP_INSTRUCTIONS > -PF COMBINED_MMX_3DNOW_RETIRED > COMBINED_PACKED_SSE_SSE2_RETIRED > COMBINED_SCALAR_SSE_SSE2_RETIRED > -PFN RETIRED_X87_INSTRUCTIONS > RETIRED_MMX_AND_3DNOW_INSTRUCTIONS > RETIRED_PACKED_SSE_AND_SSE2_INSTRUCTIONS > RETIRED_SCALAR_SSE_AND_SSE2_INSTRUCTIONS > > CCh -CA Retired Fastpath Double op Instructions > -OP RETIRED_FASTPATH_INSTRS : Retired FastPath double-op instructions > -OPN RETIRED_FASTPATH_DOUBLE_OP_INSTRUCTIONS > -WHY Match with BKDG > -PF ...missing... > -PFN RETIRED_FASTPATH_DOUBLE_OP_INSTRUCTIONS_WITH_LOW_OP_IN_POSITION_* > > CDh -CA Interrupts-Masked Cycles > -OP INTERRUPTS_MASKED : Interrupts masked cycles (IF=0) > -OPN INTERRUPTS_MASKED_CYCLES > -WHY Match with BKDG -- make it clear this is cycles > -PF ...missing... > -PFN INTERRUPTS_MASKED_CYCLES > > CEh -CA Interrupts-Masked Cycles with Interrupt Pending > -OP INTERRUPTS_MASKED_PENDING : Interrupts masked while pending cycles (INTR while IF=0) > -OPN INTERRUPTS_MASKED_CYCLES_WITH_INTERRUPT_PENDING > -WHY Match with BKDG -- make it clear this is cycles > -PF ...missing... > -PFN INTERRUPTS_MASKED_CYCLES_WITH_INTERRUPT_PENDING > > CFh -CA Interrupts Taken > -OP HARDWARE_INTERRUPTS : Number of taken hardware interrupts > -OPN INTERRUPTS_TAKEN > -WHY Match with BKDG > -PF ...missing... > -PFN INTERRUPTS_TAKEN > > D0h -CA Decoder Empty > -OP DECODER_EMPTY : Nothing to dispatch (decoder empty) > -OPN ...no change... > -PF ...missing... > -PFN DECODER_EMPTY > > D1h -CA Dispatch Stalls > -OP DISPATCH_STALLS : Dispatch stalls > -OPN ...no change... > -PF ...missing... > -PFN DISPATCH_STALLS > > D2h -CA Dispatch Stall for Branch Abort to Retire > -OP DISPATCH_STALL_FROM_BRANCH_ABORT : Dispatch stall from branch abort to retire > -OPN DISPATCH_STALL_FOR_BRANCH_ABORT > -WHY Match with BKDG > -PF ...missing... 
> -PFN DISPATCH_STALL_FOR_BRANCH_ABORT > > D3h -CA Dispatch Stall for Serialization > -OP DISPATCH_STALL_SERIALIZATION : Dispatch stall for serialization > -OPN DISPATCH_STALL_FOR_SERIALIZATION > -WHY Match with BKDG > -PF ...missing... > -PFN DISPATCH_STALL_FOR_SERIALIZATION > > D4h -CA Dispatch Stall for Segment Load > -OP DISPATCH_STALL_SEG_LOAD : Dispatch stall for segment load > -OPN DISPATCH_STALL_FOR_SEGMENT_LOAD > -WHY Match with BKDG > -PF ...missing... > -PFN DISPATCH_STALL_FOR_SEGMENT_LOAD > > D5h -CA Dispatch Stall for Reorder Buffer Full > -OP DISPATCH_STALL_REORDER_BUFFER : Dispatch stall when reorder buffer is full > -OPN DISPATCH_STALL_FOR_REORDER_BUFFER_FULL > -WHY Match with BKDG > -PF ...missing... > -PFN DISPATCH_STALL_FOR_REORDER_BUFFER_FULL > > D6h -CA Dispatch Stall for Reservation Station Full > -OP DISPATCH_STALL_RESERVE_STATIONS : Dispatch stall when reservation stations are full > -OPN DISPATCH_STALL_FOR_RESERVATION_STATION_FULL > -WHY Match with BKDG > -PF ...missing... > -PFN DISPATCH_STALL_FOR_RESERVATION_STATION_FULL > > D7h -CA Dispatch Stall for FPU Full > -OP DISPATCH_STALL_FPU : Dispatch stall when FPU is full > -OPN DISPATCH_STALL_FOR_FPU_FULL > -WHY Match with BKDG > -PF ...missing... > -PFN DISPATCH_STALL_FOR_FPU_FULL > > D8h -CA Dispatch Stall for LS Full > -OP DISPATCH_STALL_LS : Dispatch stall when LS is full > -OPN DISPATCH_STALL_FOR_LS_FULL > -WHY Match with BKDG > -PF ...missing... > -PFN DISPATCH_STALL_FOR_LS_FULL > > D9h -CA Dispatch Stall Waiting for All Quiet > -OP DISPATCH_STALL_QUIET_WAIT : Dispatch stall when waiting for all to be quiet > -OPN DISPATCH_STALL_WAITING_FOR_ALL_QUIET > -WHY Match with BKDG > -PF ...missing... > -PFN DISPATCH_STALL_WAITING_FOR_ALL_QUIET > > DAh -CA Dispatch Stall for Far Transfer or Resync to Retire > -OP DISPATCH_STALL_PENDING : Dispatch stall when far control transfer or re-sync branch is pending > -OPN DISPATCH_STALL_FOR_FAR_TRANSFER_OR_RESYNC > -PF ...missing... 
> -PFN DISPATCH_STALL_FOR_FAR_TRANSFER_OR_RESYNC > > DBh -CA FPU Exceptions > -OP FPU_EXCEPTIONS : FPU exceptions > -OPN ...no change... > -PF ...missing... > -PFN FPU_EXCEPTIONS_* > > DCh -CA DR0 Breakpoint Matches > -OP DR0_BREAKPOINTS : Number of breakpoints for DR0 > -OPN ...no change... > -PF ...missing... > -PFN DR0_BREAKPOINT_MATCHES > > DDh -CA DR1 Breakpoint Matches > -OP DR1_BREAKPOINTS : Number of breakpoints for DR1 > -OPN ...no change... > -PF ...missing... > -PFN DR1_BREAKPOINT_MATCHES > > DEh -CA DR2 Breakpoint Matches > -OP DR2_BREAKPOINTS : Number of breakpoints for DR2 > -OPN ...no change... > -PF ...missing... > -PFN DR2_BREAKPOINT_MATCHES > > DFh -CA DR3 Breakpoint Matches > -OP DR3_BREAKPOINTS : Number of breakpoints for DR3 > -OPN ...no change... > -PF ...missing... > > E0h -CA DRAM Accesses > -OP MEM_PAGE_ACCESS : Memory controller page access > -OPN DRAM_ACCESSES > -WHY Match with BKDG > -PF ...missing... > -PFN DRAM_ACCESSES_* > > E1h -CA Memory Controller Page Table Overflows > -OP MEM_PAGE_TBL_OVERFLOW : Memory controller page table overflow > -OPN MEMORY_CONTROLLER_PAGE_TABLE_OVERFLOWS > -WHY Match with BKDG > -PF ...missing... > -PFN MEMORY_CONTROLLER_PAGE_TABLE_OVERFLOWS > > E2h -CA ...missing... > -OP DRAM_SLOTS_MISSED : Memory controller DRAM command slots missed (in MemClks) > -OPN ...delete... > -PF ...missing... > -PFN ...missing... not in BKDG > > E3h -CA Memory Controller Turnarounds > -OP MEM_TURNAROUND : Memory controller turnaround > -OPN MEMORY_CONTROLLER_TURNAROUNDS > -WHY Match with BKDG > -PF ...missing... > -PFN MEMORY_CONTROLLER_TURNAROUNDS_* > > E4h -CA Memory Controller Bypass Counter Saturation > -OP MEM_BYPASS_SAT : Memory controller bypass saturation > -OPN MEMORY_CONTROLLER_BYPASS_COUNTER_SATURATION > -WHY Match with BKDG > -PF ...missing... 
> -PFN MEMORY_CONTROLLER_HIGH_PRIORITY_BYPASS > MEMORY_CONTROLLER_LOW_PRIORITY_BYPASS > DRAM_CONTROLLER_INTERFACE_BYPASS > DRAM_CONTROLLER_QUEUE_BYPASS > > E5h -CA Sized Blocks > -OP ...missing... > -OPN SIZED_BLOCKS > -WHY Match with BKDG > -PF ...missing... > -PFN SIZE_32_BYTE_WRITES > SIZE_64_BYTE_WRITES > SIZE_32_BYTE_READS > SIZE_64_BYTE_READS > > E8h -CA ECC Errors > -OP ...missing... > -OPN DRAM_ECC_ERRORS > -WHY Distinguish between DRAM and Scrubber Errors > -PF ...missing... > -PFN DRAM_ECC_ERRORS > > E9h -CA CPU/IO Requests to Memory/IO (RevE) > -OP ...missing... > -OPN CPU/IO_REQUESTS_TO_MEMORY/IO > -PF ...missing... > -PFN <Note: See the unit_mask list> > > EAh -CA Cache Block Commands (RevE) > -OP ...missing... > -OPN CACHE_BLOCK_COMMANDS > -WHY Match with BKDG > -PF ...missing... > -PFN CACHE_BLOCK_VICTIM_WRITEBACK > CACHE_BLOCK_DCACHE_LOAD_MISS > CACHE_BLOCK_SHARED_ICACHE_REFILL > CACHE_BLOCK_READ_BLOCK_MODIFIED > CACHE_BLOCK_READ_TO_DIRTY > > EBh -CA Sized Commands > -OP SIZED_COMMANDS : Sized Commands > -OPN ...no change... > -PF ...missing... > -PFN NON_POSTED_WRITE_BYTE > POSTED_WRITE_BYTE > POSTED_WRITE_DWORD > READ_BYTE_4_BYTES > READ_DWORD_1_16_DWORDS > READ_MODIFY_WRITE > > ECh -CA Probe Responses and Upstream Requests > -OP PROBE_RESULT : Probe Result > -OPN PROBE_RESPONSES_AND_UPSTREAM_REQUESTS > -WHY Match with BKDG > -PF ...missing... > -PFN PROBE_MISS > PROBE_HIT_CLEAN > PROBE_HIT_CLEAN_NO_MEMORY_CANCEL > PROBE_HIT_DIRTY_WITH_MEMORY_CANCEL > UPSTREAM_DISPLAY_REFRESH_READS > UPSTREAM_NON_DISPLAY_REFRESH_READS > UPSTREAM_WRITES > > EEh -CA GART Events > -OP ...missing. > -OPN GART_APERTURE_HIT > -WHY Unit Mask description > -PF ...missing... 
> -PFN GART_APERTURE_HIT_FROM_* > > F6h -CA HyperTransport Link 0 Transmit Bandwidth > -OP HYPERTRANSPORT_BUS0_WIDTH : HyperTransport(tm) bus 0 bandwidth > -OPN HYPERTRANSPORT_LINK0_BANDWIDTH > -WHY Its Hypertransport link, not bus, in AMD terminology > -PF HYPERTRANSPORT_BANDWIDTH > -PFN HT0_COMMAND_DWORD_SENT > HT0_DATA_DWORD_SENT > HT0_BUFFER_RELEASE_DWORD_SENT > HT0_NOP_DWORD_SENT > > F7h -CA HyperTransport Link 1 Transmit Bandwidth > -OP HYPERTRANSPORT_BUS1_WIDTH : HyperTransport(tm) bus 1 bandwidth > -OPN HYPERTRANSPORT_LINK1_BANDWIDTH > -WHY Its Hypertransport link, not bus, in AMD terminology > -PF ...missing... > -PFN HT0_COMMAND_DWORD_SENT > HT0_DATA_DWORD_SENT > HT0_BUFFER_RELEASE_DWORD_SENT > HT0_NOP_DWORD_SENT > > F8h -CA HyperTransport Link 2 Transmit Bandwidth > -OP HYPERTRANSPORT_BUS2_WIDTH : HyperTransport(tm) bus 2 bandwidth > -OPN HYPERTRANSPORT_LINK2_BANDWIDTH > -WHY Its Hypertransport link, not bus, in AMD terminology > -PF ...missing... > -PFN HT0_COMMAND_DWORD_SENT > HT0_DATA_DWORD_SENT > HT0_BUFFER_RELEASE_DWORD_SENT > HT0_NOP_DWORD_SENT |
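The naming scheme in the merged list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from oprofile or libpfm: the table holds a handful of entries copied from the list, and the helper names (`new_oprofile_name`, `perfmon_names`) are hypothetical. It shows the two conventions from the Notes: oprofile names one event select value, while perfmon2 names each (event select, unit mask) pair, so a `FOO_*` pattern expands into one name per unit mask.

```python
# Illustrative subset of the merged list (not the full table).
# event select -> (current oprofile name, proposed oprofile name)
# A proposed name of None stands for "...No Change..." in the list.
OPROFILE_RENAMES = {
    0x02: ("FAST_FPU_OPS", "DISPATCHED_FPU_OPS_FAST_FLAG"),
    0x21: ("SELF_MODIFY_RESYNC", "PIPELINE_RESTART_DUE_TO_SELF_MODIFYING_CODE"),
    0x26: ("CFLUSH_RETIRED", "RETIRED_CLFLUSH_INSTRUCTIONS"),
    0x40: ("DATA_CACHE_ACCESSES", None),
    0x7E: ("BU_FILL_REQ", "L2_CACHE_MISS"),
    0xCF: ("HARDWARE_INTERRUPTS", "INTERRUPTS_TAKEN"),
}


def new_oprofile_name(old_name):
    """Map an existing oprofile event name to its proposed name.

    Hypothetical helper: names marked "no change" and names that are
    not in the table are returned unchanged.
    """
    for current, proposed in OPROFILE_RENAMES.values():
        if current == old_name:
            return proposed or current
    return old_name


def perfmon_names(prefix, unit_mask_names):
    """Expand a perfmon-style FOO_* pattern into one name per unit mask."""
    return [f"{prefix}_{mask}" for mask in unit_mask_names]


# Event F6h: oprofile would use the single name
# HYPERTRANSPORT_LINK0_BANDWIDTH, while the proposed perfmon2 names
# cover each (event select, unit mask) pair separately.
HT0_UNIT_MASKS = [
    "COMMAND_DWORD_SENT",
    "DATA_DWORD_SENT",
    "BUFFER_RELEASE_DWORD_SENT",
    "NOP_DWORD_SENT",
]

if __name__ == "__main__":
    print(new_oprofile_name("BU_FILL_REQ"))
    print(perfmon_names("HT0", HT0_UNIT_MASKS))
```

A table like this could also drive a compatibility shim that accepts the old oprofile names for a while, which is one way to soften the compatibility-versus-conformity trade-off raised at the start of the thread.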
From: shobha r. <sho...@ya...> - 2006-07-20 20:15:16
|
I have an FC5 xen0 and xenU setup on a P6 system. I downloaded the oprofile-0.9.1 source and built it. When I run opcontrol --help, I do not get the newly added command line parameters. Thanks, William and Mark, for your tips. I also went the route of building xenoprof by downloading xen-unstable-src.tgz and doing a "make world" and "make install", but I still do not get the "--xen" options in "opcontrol --help". What am I missing? Shobha |
From: John L. <le...@mo...> - 2006-02-14 20:01:51
|
On Tue, Feb 14, 2006 at 02:19:28PM -0600, Ray Bryant wrote: > What do people think? Is compatibility with existing code more important > than conformity, or is it a good thing to move toward having the names be the > same? Or no one cares or what? :-) We're still pre-beta, and we can change names as needed. What are the differences? regards john |
From: Ray B. <ra...@mp...> - 2006-02-14 20:55:18
|
On Tuesday 14 February 2006 14:02, John Levon wrote: > On Tue, Feb 14, 2006 at 02:19:28PM -0600, Ray Bryant wrote: > > What do people think? Is compatibility with existing code more > > important than conformity, or is it a good thing to move toward having > > the names be the same? Or no one cares or what? :-) > > We're still pre-beta, and we can change names as needed. What are the > differences? > Hi John, Let me put together a full list so we can compare the various names and see if we can come up with common names we all like. In part, the reason for sending out the email was to make sure people were interested in this before I went to the effort of producing a combined list. > regards > john -- Ray Bryant AMD Performance Labs Austin, Tx 512-602-0038 (o) 512-507-7807 (c) |
From: William C. <wc...@re...> - 2006-02-14 20:09:43
|
Ray Bryant wrote: > I'm in the process of updating the PMU event names and unit masks for libpfm, > the user side of the perfmon2 interface that Stephane Eranian is trying to > get merged into the mainline kernel. (The latest public version of the PMU > definitions is in version 3.28 of the BKDG, published in October 2005). > > In looking at the "hammer" event names in oprofile it appears that this table > needs to be updated as well. I'm wondering if now wouldn't be a good time > to get the event names to be the same across oprofile and perfmon. This is > going to require some changes to the existing oprofile names. > > What do people think? Is compatibility with existing code more important > than conformity, or is it a good thing to move toward having the names be the > same? Or no one cares or what? :-) If it is along the lines of making the event names agree with documentation, I am for it. There is already enough confusion between the different architectures/implementations. Making sure that the events in the software are consistent with the processor documentation is a good thing. Is there a publicly available document with the current event names and masks? Any thoughts on managing the differences between architecture performance events, e.g. NetBurst and AMD64 cache events? Users do find the multitude of different events on the various architectures confusing. PAPI has some generic event names for some types of events, but these do not always map well to the actual events available on the particular processor. -Will |
From: Ray B. <ra...@mp...> - 2006-02-14 20:44:45
|
On Tuesday 14 February 2006 14:09, William Cohen wrote: > Ray Bryant wrote: > > I'm in the process of updating the PMU event names and unit masks for > > libpfm, the user side of the perfmon2 interface that Stephane Eranian is > > trying to get merged into the mainline kernel. (The latest public > > version of the PMU definitions is in version 3.28 of the BKDG, published > > in October 2005). > > > > In looking at the "hammer" event names in oprofile it appears that this > > table needs to be updated as well. I'm wondering if now wouldn't be a > > good time to get the event names to be the same across oprofile and > > perfmon. This is going to require some changes to the existing oprofile > > names. > > > > What do people think? Is compatibility with existing code more > > important than conformity, or is it a good thing to move toward having > > the names be the same? Or no one cares or what? :-) > > If it is along the lines of making the event names agree with > documentation, I am for it. There is already enough confusion between > the different architectures/implementations. Making sure that the events > in the software are consistent with the processor documentation is a > good thing. Is there a publicly available document with the current > event names and masks? Hi Will, Yes, the current document is at: http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/26094.PDF This is the "BIOS and Kernel Developer's Guide for AMD Athlon [tm] 64 and AMD Opteron [tm] Processors", Revision 3.28, October 2005, Publication 26094. The Performance Monitoring events are defined in chapter 10. AFAIK, this is the latest public document. What I am doing for libpfm/perfmon2 is making sure that the names match what is in the document as much as possible. This sometimes makes for longer names (e.g. DATA_CACHE_REFILL_FROM_SYSTEM instead of DC_REFILL) but should make it easier for people to match up what is in libpfm versus what is in the document. 
> > Any thoughts on managing the differences between architecture > performance events, e.g. NetBurst and AMD64 cache events? Users do find > the multitude of different events on the various architectures > confusing. PAPI has some generic event names for some types of events, > but these do not always map well to the actual events available on the > particular processor. > Well, no, I hadn't considered looking at all of the event names across architectures. That would require a fair amount of analysis and consideration in order to balance correctness versus commonality across architectures. At the moment, I'm just going to try to get the Opteron events to match the document. :-) I suppose the other way we could go would be to try to match the PAPI names. However, I know from experience that some of the PAPI names don't directly match Opteron events (sometimes the PAPI event turns out to be the sum of two or more Opteron events). > -Will -- Ray Bryant AMD Performance Labs Austin, Tx 512-602-0038 (o) 512-507-7807 (c) |
From: William C. <wc...@re...> - 2006-02-14 21:49:08
|
Ray Bryant wrote: > On Tuesday 14 February 2006 14:09, William Cohen wrote: > >>Ray Bryant wrote: >> >>>I'm in the process of updating the PMU event names and unit masks for >>>libpfm, the user side of the perfmon2 interface that Stephane Eranian is >>>trying to get merged into the mainline kernel. (The latest public >>>version of the PMU definitions is in version 3.28 of the BKDG, published >>>in October 2005). >>> >>>In looking at the "hammer" event names in oprofile it appears that this >>>table needs to be updated as well. I'm wondering if now wouldn't be a >>>good time to get the event names to be the same across oprofile and >>>perfmon. This is going to require some changes to the existing oprofile >>>names. >>> >>>What do people think? Is compatibility with existing code more >>>important than conformity, or is it a good thing to move toward having >>>the names be the same? Or no one cares or what? :-) >> >>If it is along the lines of making the event names agree with >>documentation, I am for it. There is already enough confusion between >>the different architectures/implementations. Making sure that the events >>in the software are consistent with the processor documentation is a >>good thing. Is there a publicly available document with the current >>event names and masks? > > > Hi Will, > > Yes, the current document is at: > http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/26094.PDF > > This is the "BIOS and Kernel Developer's Guide for AMD Athlon [tm] 64 and AMD > Opteron [tm] Processors", Revision 3.28, October 2005, Publication 26094. > > The Performance Monitoring events are defined in chapter 10. AFAIK, this is > the latest public document. > > What I am doing for libpfm/perfmon2 is making sure that the names match what > is in the document as much as possible. This sometimes makes for longer > names (e.g. 
DATA_CACHE_REFILL_FROM_SYSTEM instead of DC_REFILL) but should > make it easier for people to match up what is in libpfm versus what is in the > document. > > >>Any thoughts on managing the differences between architecture >>performance events, e.g. NetBurst and AMD64 cache events? Users do find >>the multitude of different events on the various architectures >>confusing. PAPI has some generic event names for some types of events, >>but these do not always map well to the actual events available on the >>particular processor. >> > > > Well, no, I hadn't considered looking at all of the event names across > architectures. That would require a fair amount of analysis and > consideration in order to balance correctness versus commonality across > architectures. At the moment, I'm just going to try to get the Opteron > events to match the document. :-) I am happy having the names in the software agree with the documentation. Having the names match up exactly eliminates one source of confusion. > I suppose the other way we could go would be to try to match the PAPI > names. However, I know from experience that some of the PAPI names don't > directly match Opteron events (sometimes the PAPI event turns out to be the > sum of two or more Opteron events). I wasn't expecting these changes to deal with the differences between the architectures. I just figured it would be a good time to mention the event portability issues. Yes, I know that in some cases the PAPI events have to be synthesized from multiple event counters. In other cases the events don't match exactly. -Will |
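The point about generic events being synthesized from multiple native counters can be sketched as follows. This is a minimal illustration, assuming a hypothetical generic event named `GENERIC_L1_ITLB_MISSES` that is not a PAPI, oprofile, or libpfm name. It is defined here as the sum of two native events from the merged list (an L1 ITLB miss either hits or misses in the L2 ITLB), and the sample counts are made-up numbers used only to exercise the function.

```python
# Hypothetical generic event -> native Opteron events whose counts are summed.
# The native names come from the proposed list; the generic name is invented
# for this sketch.
DERIVED_EVENTS = {
    "GENERIC_L1_ITLB_MISSES": (
        "L1_ITLB_MISS_AND_L2_ITLB_HIT",
        "L1_ITLB_MISS_AND_L2_ITLB_MISS",
    ),
}


def derived_count(generic_name, native_counts):
    """Return the count of a generic event as the sum of its native events.

    Raises KeyError if a required native event was not collected, so a
    partial sum is never reported as the generic value.
    """
    components = DERIVED_EVENTS[generic_name]
    missing = [name for name in components if name not in native_counts]
    if missing:
        raise KeyError(f"no counts collected for: {', '.join(missing)}")
    return sum(native_counts[name] for name in components)


if __name__ == "__main__":
    # Made-up sample counts, for illustration only.
    sample = {
        "L1_ITLB_MISS_AND_L2_ITLB_HIT": 1200,
        "L1_ITLB_MISS_AND_L2_ITLB_MISS": 300,
    }
    print(derived_count("GENERIC_L1_ITLB_MISSES", sample))
```

Note that a sum like this needs every component counted over the same interval, which is one reason a generic name does not always map cleanly onto a particular processor.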