From: Will D. <wil...@ar...> - 2010-05-17 17:59:34
|
Hello,

The following patch series adds userspace support for the Cortex-A9
CPU. Kernel support for this core is due to arrive in 2.6.35.

The current ARMv7 event structure is also modified to factor out
the common architectural events from the core-specific extensions.

All feedback welcome.


Will Deacon (3):
  ARM: factor out ARMv7 common architectural events
  ARM: correct usage of core terminology for v7 and MPCore
  ARM: add support for Cortex-A9 events

 events/Makefile.am                 |    2 +
 events/arm/armv7-ca9/events        |   50 ++++++++++++++++++++++++++++++++++++
 events/arm/armv7-ca9/unit_masks    |    4 +++
 events/arm/armv7-common/events     |   22 ++++++++++++++++
 events/arm/armv7-common/unit_masks |    4 +++
 events/arm/armv7/events            |   21 +--------------
 libop/op_cpu_type.c                |    5 ++-
 libop/op_cpu_type.h                |    3 +-
 libop/op_events.c                  |    1 +
 utils/ophelp.c                     |    8 +++++-
 10 files changed, 96 insertions(+), 24 deletions(-)
 create mode 100644 events/arm/armv7-ca9/events
 create mode 100644 events/arm/armv7-ca9/unit_masks
 create mode 100644 events/arm/armv7-common/events
 create mode 100644 events/arm/armv7-common/unit_masks
|
From: Will D. <wil...@ar...> - 2010-06-30 13:59:46
|
Hello,

This is a re-posting of the patch series originally posted last month:

http://marc.info/?l=oprofile-list&m=127411918506283&w=2

The only change is that I have rebased the code on top of CVS so that
it doesn't conflict with the Nehalem code.

It would be good to get this committed as Ubuntu are carrying these
patches for the Maverick release and it would be easier to maintain if
it was available upstream.

Thanks!


The following patch series adds userspace support for the Cortex-A9
CPU. Kernel support for this core is due to arrive in 2.6.35.

The current ARMv7 event structure is also modified to factor out
the common architectural events from the core-specific extensions.

All feedback welcome.


Will Deacon (3):
  ARM: factor out ARMv7 common architectural events
  ARM: correct usage of core terminology for v7 and MPCore
  ARM: add support for Cortex-A9 events

 events/Makefile.am                 |    2 +
 events/arm/armv7-ca9/events        |   50 ++++++++++++++++++++++++++++++++++++
 events/arm/armv7-ca9/unit_masks    |    4 +++
 events/arm/armv7-common/events     |   22 ++++++++++++++++
 events/arm/armv7-common/unit_masks |    4 +++
 events/arm/armv7/events            |   21 +--------------
 libop/op_cpu_type.c                |    5 ++-
 libop/op_cpu_type.h                |    3 +-
 libop/op_events.c                  |    1 +
 utils/ophelp.c                     |    8 +++++-
 10 files changed, 96 insertions(+), 24 deletions(-)
 create mode 100644 events/arm/armv7-ca9/events
 create mode 100644 events/arm/armv7-ca9/unit_masks
 create mode 100644 events/arm/armv7-common/events
 create mode 100644 events/arm/armv7-common/unit_masks
|
From: Will D. <wil...@ar...> - 2010-06-30 13:59:44
|
The ARMv7 architecture defines a set of common events that share the
same event ID space across all ARMv7 cores. Individual cores may
extend this base set if required.

This patch separates out the common event set into an armv7-common
directory which can be included by ARMv7 cores. This avoids
duplication and removes the potential for different event definitions
for the same event.

Signed-off-by: Will Deacon <wil...@ar...>
---
 events/arm/armv7-common/events     |   22 ++++++++++++++++++++++
 events/arm/armv7-common/unit_masks |    4 ++++
 events/arm/armv7/events            |   21 +--------------------
 3 files changed, 27 insertions(+), 20 deletions(-)
 create mode 100644 events/arm/armv7-common/events
 create mode 100644 events/arm/armv7-common/unit_masks

diff --git a/events/arm/armv7-common/events b/events/arm/armv7-common/events
new file mode 100644
index 0000000..c4fe8c7
--- /dev/null
+++ b/events/arm/armv7-common/events
@@ -0,0 +1,22 @@
+# ARM V7 events
+# From ARM ARM
+#
+event:0x00 counters:1,2,3,4,5,6 um:zero minimum:500 name:PMNC_SW_INCR : Software increment of PMNC registers
+event:0x01 counters:1,2,3,4,5,6 um:zero minimum:500 name:IFETCH_MISS : Instruction fetch misses from cache or normal cacheable memory
+event:0x02 counters:1,2,3,4,5,6 um:zero minimum:500 name:ITLB_MISS : Instruction fetch misses from TLB
+event:0x03 counters:1,2,3,4,5,6 um:zero minimum:500 name:DCACHE_REFILL : Data R/W operation that causes a refill from cache or normal cacheable memory
+event:0x04 counters:1,2,3,4,5,6 um:zero minimum:500 name:DCACHE_ACCESS : Data R/W from cache
+event:0x05 counters:1,2,3,4,5,6 um:zero minimum:500 name:DTLB_REFILL : Data R/W that causes a TLB refill
+event:0x06 counters:1,2,3,4,5,6 um:zero minimum:500 name:DREAD : Data read architecturally executed (note: architecturally executed = for instructions that are unconditional or that pass the condition code)
+event:0x07 counters:1,2,3,4,5,6 um:zero minimum:500 name:DWRITE : Data write architecturally executed
+event:0x08 counters:1,2,3,4,5,6 um:zero minimum:500 name:INSTR_EXECUTED : All executed instructions
+event:0x09 counters:1,2,3,4,5,6 um:zero minimum:500 name:EXC_TAKEN : Exception taken
+event:0x0A counters:1,2,3,4,5,6 um:zero minimum:500 name:EXC_EXECUTED : Exception return architecturally executed
+event:0x0B counters:1,2,3,4,5,6 um:zero minimum:500 name:CID_WRITE : Instruction that writes to the Context ID Register architecturally executed
+event:0x0C counters:1,2,3,4,5,6 um:zero minimum:500 name:PC_WRITE : SW change of PC, architecturally executed (not by exceptions)
+event:0x0D counters:1,2,3,4,5,6 um:zero minimum:500 name:PC_IMM_BRANCH : Immediate branch instruction executed (taken or not)
+event:0x0E counters:1,2,3,4,5,6 um:zero minimum:500 name:PC_PROC_RETURN : Procedure return architecturally executed (not by exceptions)
+event:0x0F counters:1,2,3,4,5,6 um:zero minimum:500 name:UNALIGNED_ACCESS : Unaligned access architecturally executed
+event:0x10 counters:1,2,3,4,5,6 um:zero minimum:500 name:PC_BRANCH_MIS_PRED : Branch mispredicted or not predicted. Counts pipeline flushes because of misprediction
+event:0x12 counters:1,2,3,4,5,6 um:zero minimum:500 name:PC_BRANCH_MIS_USED : Branch or change in program flow that could have been predicted
+event:0xFF counters:0 um:zero minimum:500 name:CPU_CYCLES : Number of CPU cycles
diff --git a/events/arm/armv7-common/unit_masks b/events/arm/armv7-common/unit_masks
new file mode 100644
index 0000000..4027469
--- /dev/null
+++ b/events/arm/armv7-common/unit_masks
@@ -0,0 +1,4 @@
+# ARM V7 PMNC possible unit masks
+#
+name:zero type:mandatory default:0x00
+	0x00 No unit mask
diff --git a/events/arm/armv7/events b/events/arm/armv7/events
index ffecf2b..d6d9227 100644
--- a/events/arm/armv7/events
+++ b/events/arm/armv7/events
@@ -1,24 +1,7 @@
 # ARM V7 events
 # From Cortex A8 DDI (ARM DDI 0344B, revision r1p1)
 #
-event:0x00 counters:1,2,3,4 um:zero minimum:500 name:PMNC_SW_INCR : Software increment of PMNC registers
-event:0x01 counters:1,2,3,4 um:zero minimum:500 name:IFETCH_MISS : Instruction fetch misses from cache or normal cacheable memory
-event:0x02 counters:1,2,3,4 um:zero minimum:500 name:ITLB_MISS : Instruction fetch misses from TLB
-event:0x03 counters:1,2,3,4 um:zero minimum:500 name:DCACHE_REFILL : Data R/W operation that causes a refill from cache or normal cacheable memory
-event:0x04 counters:1,2,3,4 um:zero minimum:500 name:DCACHE_ACCESS : Data R/W from cache
-event:0x05 counters:1,2,3,4 um:zero minimum:500 name:DTLB_REFILL : Data R/W that causes a TLB refill
-event:0x06 counters:1,2,3,4 um:zero minimum:500 name:DREAD : Data read architecturally executed (note: architecturally executed = for instructions that are unconditional or that pass the condition code)
-event:0x07 counters:1,2,3,4 um:zero minimum:500 name:DWRITE : Data write architecturally executed
-event:0x08 counters:1,2,3,4 um:zero minimum:500 name:INSTR_EXECUTED : All executed instructions
-event:0x09 counters:1,2,3,4 um:zero minimum:500 name:EXC_TAKEN : Exception taken
-event:0x0A counters:1,2,3,4 um:zero minimum:500 name:EXC_EXECUTED : Exception return architecturally executed
-event:0x0B counters:1,2,3,4 um:zero minimum:500 name:CID_WRITE : Instruction that writes to the Context ID Register architecturally executed
-event:0x0C counters:1,2,3,4 um:zero minimum:500 name:PC_WRITE : SW change of PC, architecturally executed (not by exceptions)
-event:0x0D counters:1,2,3,4 um:zero minimum:500 name:PC_IMM_BRANCH : Immediate branch instruction executed (taken or not)
-event:0x0E counters:1,2,3,4 um:zero minimum:500 name:PC_PROC_RETURN : Procedure return architecturally executed (not by exceptions)
-event:0x0F counters:1,2,3,4 um:zero minimum:500 name:UNALIGNED_ACCESS : Unaligned access architecturally executed
-event:0x10 counters:1,2,3,4 um:zero minimum:500 name:PC_BRANCH_MIS_PRED : Branch mispredicted or not predicted. Counts pipeline flushes because of misprediction
-event:0x12 counters:1,2,3,4 um:zero minimum:500 name:PC_BRANCH_MIS_USED : Branch or change in program flow that could have been predicted
+include:arm/armv7-common
 event:0x40 counters:1,2,3,4 um:zero minimum:500 name:WRITE_BUFFER_FULL : Any write buffer full cycle
 event:0x41 counters:1,2,3,4 um:zero minimum:500 name:L2_STORE_MERGED : Any store that is merged in L2 cache
 event:0x42 counters:1,2,3,4 um:zero minimum:500 name:L2_STORE_BUFF : Any bufferable store from load/store to L2 cache
@@ -49,5 +32,3 @@ event:0x5A counters:1,2,3,4 um:zero minimum:500 name:NEON_CYCLES : Number of cyc
 event:0x70 counters:1,2,3,4 um:zero minimum:500 name:PMU0_EVENTS : Number of events from external input source PMUEXTIN[0]
 event:0x71 counters:1,2,3,4 um:zero minimum:500 name:PMU1_EVENTS : Number of events from external input source PMUEXTIN[1]
 event:0x72 counters:1,2,3,4 um:zero minimum:500 name:PMU_EVENTS : Number of events from both external input sources PMUEXTIN[0] and PMUEXTIN[1]
-event:0xFF counters:0 um:zero minimum:500 name:CPU_CYCLES : Number of CPU cycles
-
-- 
1.6.3.3
|
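The events-file layout that this patch relies on (space-separated key:value fields, a lone colon before the free-text description, and an include: line that pulls in the common set) can be illustrated with a small C sketch. This is not OProfile's own parser: the demo_ names, the struct and the field limits are all assumptions made up for illustration, and the real code may well handle cases this sketch rejects.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for illustration; not OProfile's own event structure. */
struct demo_event {
    unsigned int id;           /* value after "event:" (hexadecimal) */
    unsigned int counter_mask; /* bit n is set when counter n is listed */
    unsigned int minimum;      /* value after "minimum:" */
    char um[32];               /* unit mask name after "um:" */
    char name[64];             /* event name after "name:" */
};

/* Copy src into dst if it is non-empty and fits; 0 on success, -1 otherwise. */
static int copy_field(char *dst, size_t dst_len, const char *src)
{
    size_t n = strlen(src);

    if (n == 0 || n >= dst_len)
        return -1;
    memcpy(dst, src, n + 1);
    return 0;
}

/* Return 1 and store the path if line is "include:<path>", else return 0. */
int demo_parse_include(const char *line, char *path, size_t path_len)
{
    static const char key[] = "include:";
    const char *src;
    size_t n;

    if (strncmp(line, key, sizeof(key) - 1) != 0)
        return 0;
    src = line + sizeof(key) - 1;
    n = strcspn(src, " \t\r\n");
    if (n == 0 || n >= path_len)
        return 0;
    memcpy(path, src, n);
    path[n] = '\0';
    return 1;
}

/* Parse one "event:..." line; return 0 on success, -1 on malformed input. */
int demo_parse_event(const char *line, struct demo_event *out)
{
    char buf[1024];
    char *tok;
    unsigned int seen = 0;

    if (strlen(line) >= sizeof(buf))
        return -1;
    strcpy(buf, line);
    memset(out, 0, sizeof(*out));

    for (tok = strtok(buf, " \t\r\n"); tok; tok = strtok(NULL, " \t\r\n")) {
        if (strcmp(tok, ":") == 0)
            break; /* the free-text description follows the lone colon */
        if (strncmp(tok, "event:", 6) == 0) {
            out->id = (unsigned int)strtoul(tok + 6, NULL, 16);
            seen |= 1u;
        } else if (strncmp(tok, "counters:", 9) == 0) {
            const char *p = tok + 9;

            /* comma-separated list of counter numbers */
            while (*p) {
                char *end;
                unsigned long c = strtoul(p, &end, 10);

                if (end == p || c >= 32)
                    return -1;
                out->counter_mask |= 1u << c;
                p = (*end == ',') ? end + 1 : end;
            }
            seen |= 2u;
        } else if (strncmp(tok, "um:", 3) == 0) {
            if (copy_field(out->um, sizeof(out->um), tok + 3) != 0)
                return -1;
            seen |= 4u;
        } else if (strncmp(tok, "minimum:", 8) == 0) {
            out->minimum = (unsigned int)strtoul(tok + 8, NULL, 10);
            seen |= 8u;
        } else if (strncmp(tok, "name:", 5) == 0) {
            if (copy_field(out->name, sizeof(out->name), tok + 5) != 0)
                return -1;
            seen |= 16u;
        }
    }
    /* all five fields must have been present */
    return seen == 31u ? 0 : -1;
}
```

A caller would try demo_parse_include() on each non-comment line first and, on a match, read the named directory's events file before continuing. That is the composition the armv7 and armv7-ca9 files depend on after this change.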
From: Will D. <wil...@ar...> - 2010-06-30 13:59:46
|
Currently, OProfile refers to the ARM11MPCore as "ARM MPCore". The
Cortex-A9 is also available in MP configurations, so this nomenclature
is ambiguous. Additionally, OProfile refers to the Cortex-A8 event set
as "ARMv7 PMNC" which doesn't make sense when the Cortex-A9 is added
to the equation.

This patch fixes up errors in terminology, but leaves the string which
is used to communicate with the Kernel alone in order to remain
backwards compatible. Furthermore, a new cpu_descr is added for
Cortex-A9.

Signed-off-by: Will Deacon <wil...@ar...>
---
 events/Makefile.am  |    2 ++
 libop/op_cpu_type.c |    5 +++--
 libop/op_cpu_type.h |    3 ++-
 libop/op_events.c   |    1 +
 utils/ophelp.c      |    8 +++++++-
 5 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/events/Makefile.am b/events/Makefile.am
index 6c51f78..546b631 100644
--- a/events/Makefile.am
+++ b/events/Makefile.am
@@ -38,7 +38,9 @@ event_files = \
 	arm/xscale1/events arm/xscale1/unit_masks \
 	arm/xscale2/events arm/xscale2/unit_masks \
 	arm/armv6/events arm/armv6/unit_masks \
+	arm/armv7-common/events arm/armv7-common/unit_masks \
 	arm/armv7/events arm/armv7/unit_masks \
+	arm/armv7-ca9/events arm/armv7-ca9/unit_masks \
 	arm/mpcore/events arm/mpcore/unit_masks \
 	avr32/events avr32/unit_masks \
 	mips/20K/events mips/20K/unit_masks \
diff --git a/libop/op_cpu_type.c b/libop/op_cpu_type.c
index 7af34d8..8c1d5cf 100644
--- a/libop/op_cpu_type.c
+++ b/libop/op_cpu_type.c
@@ -70,12 +70,12 @@ static struct cpu_descr const cpu_descrs[MAX_CPU_TYPE] = {
 	{ "ppc64 Cell Broadband Engine", "ppc64/cell-be", CPU_PPC64_CELL, 8 },
 	{ "AMD64 family10", "x86-64/family10", CPU_FAMILY10, 4 },
 	{ "ppc64 PA6T", "ppc64/pa6t", CPU_PPC64_PA6T, 6 },
-	{ "ARM MPCore", "arm/mpcore", CPU_ARM_MPCORE, 2 },
+	{ "ARM 11MPCore", "arm/mpcore", CPU_ARM_MPCORE, 2 },
 	{ "ARM V6 PMU", "arm/armv6", CPU_ARM_V6, 3 },
 	{ "ppc64 POWER5++", "ppc64/power5++", CPU_PPC64_POWER5pp, 6 },
 	{ "e300", "ppc/e300", CPU_PPC_E300, 4 },
 	{ "AVR32", "avr32", CPU_AVR32, 3 },
-	{ "ARM V7 PMNC", "arm/armv7", CPU_ARM_V7, 5 },
+	{ "ARM Cortex-A8", "arm/armv7", CPU_ARM_V7, 5 },
 	{ "Intel Architectural Perfmon", "i386/arch_perfmon", CPU_ARCH_PERFMON, 0},
 	{ "AMD64 family11h", "x86-64/family11h", CPU_FAMILY11H, 4 },
 	{ "ppc64 POWER7", "ppc64/power7", CPU_PPC64_POWER7, 6 },
@@ -84,6 +84,7 @@ static struct cpu_descr const cpu_descrs[MAX_CPU_TYPE] = {
 	{ "Intel Atom", "i386/atom", CPU_ATOM, 2 },
 	{ "Loongson2", "mips/loongson2", CPU_MIPS_LOONGSON2, 2 },
 	{ "Intel Nehalem microarchitecture", "i386/nehalem", CPU_NEHALEM, 4 },
+	{ "ARM Cortex-A9", "arm/armv7-ca9", CPU_ARM_V7_CA9, 7 },
 };
 
 static size_t const nr_cpu_descrs = sizeof(cpu_descrs) / sizeof(struct cpu_descr);
diff --git a/libop/op_cpu_type.h b/libop/op_cpu_type.h
index 10911eb..861fbb5 100644
--- a/libop/op_cpu_type.h
+++ b/libop/op_cpu_type.h
@@ -72,7 +72,7 @@ typedef enum {
 	CPU_PPC64_POWER5pp, /**< ppc64 Power5++ family */
 	CPU_PPC_E300, /**< e300 */
 	CPU_AVR32, /**< AVR32 */
-	CPU_ARM_V7, /**< ARM V7 */
+	CPU_ARM_V7, /**< ARM Cortex-A8 */
 	CPU_ARCH_PERFMON, /**< Intel architectural perfmon */
 	CPU_FAMILY11H, /**< AMD family 11h */
 	CPU_PPC64_POWER7, /**< ppc64 POWER7 family */
@@ -81,6 +81,7 @@ typedef enum {
 	CPU_ATOM, /* First generation Intel Atom */
 	CPU_MIPS_LOONGSON2, /* < loongson2 family */
 	CPU_NEHALEM, /* Intel Nehalem microarchitecture */
+	CPU_ARM_V7_CA9, /**< ARM Cortex-A9 */
 	MAX_CPU_TYPE
 } op_cpu;
 
diff --git a/libop/op_events.c b/libop/op_events.c
index 9121f47..f4aaeed 100644
--- a/libop/op_events.c
+++ b/libop/op_events.c
@@ -1007,6 +1007,7 @@ void op_default_event(op_cpu cpu_type, struct op_default_event_descr * descr)
 		case CPU_ARM_MPCORE:
 		case CPU_ARM_V6:
 		case CPU_ARM_V7:
+		case CPU_ARM_V7_CA9:
 		case CPU_AVR32:
 			descr->name = "CPU_CYCLES";
 			break;
diff --git a/utils/ophelp.c b/utils/ophelp.c
index 3b61bb4..a2d9442 100644
--- a/utils/ophelp.c
+++ b/utils/ophelp.c
@@ -535,10 +535,16 @@ int main(int argc, char const * argv[])
 
 	case CPU_ARM_V7:
 		event_doc =
-			"See ARM11 Technical Reference Manual\n"
+			"See Cortex-A8 Technical Reference Manual\n"
 			"Cortex A8 DDI (ARM DDI 0344B, revision r1p1)\n";
 		break;
 
+	case CPU_ARM_V7_CA9:
+		event_doc =
+			"See Cortex-A9 Technical Reference Manual\n"
+			"Cortex A9 DDI (ARM DDI 0388E, revision r2p0)\n";
+		break;
+
 	case CPU_PPC64_PA6T:
 		event_doc =
 			"See PA6T Power Implementation Features Book IV\n"
-- 
1.6.3.3
|
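The reason the display name can change while the second string stays put can be seen in a trimmed-down sketch of a descriptor table. Only the three ARM rows below are taken from the diff above. The demo_ type names, the field names and the lookup function are assumptions for illustration, not the definitions in libop, and the sketch assumes that lookups are keyed on the second (kernel-facing) string only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the real op_cpu enum; only three entries. */
typedef enum {
    DEMO_CPU_ARM_MPCORE,
    DEMO_CPU_ARM_V7,
    DEMO_CPU_ARM_V7_CA9,
    DEMO_MAX_CPU_TYPE
} demo_cpu;

/* Assumed field layout: display name, kernel-facing name, id, counters. */
struct demo_cpu_descr {
    const char *pretty;       /* shown to the user; free to change */
    const char *name;         /* exchanged with the kernel; must stay stable */
    demo_cpu cpu;
    unsigned int nr_counters;
};

/* Rows copied from the post-patch table in the diff above. */
static const struct demo_cpu_descr demo_descrs[] = {
    { "ARM 11MPCore",  "arm/mpcore",    DEMO_CPU_ARM_MPCORE, 2 },
    { "ARM Cortex-A8", "arm/armv7",     DEMO_CPU_ARM_V7,     5 },
    { "ARM Cortex-A9", "arm/armv7-ca9", DEMO_CPU_ARM_V7_CA9, 7 },
};

/* Look a descriptor up by its kernel-facing name; NULL if unknown. */
const struct demo_cpu_descr *demo_find_cpu(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(demo_descrs) / sizeof(demo_descrs[0]); i++) {
        if (strcmp(demo_descrs[i].name, name) == 0)
            return &demo_descrs[i];
    }
    return NULL;
}
```

Under that assumption, a kernel that reports "arm/armv7" resolves to the same entry before and after the rename. Only the text shown to the user differs.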
From: Will D. <wil...@ar...> - 2010-06-30 13:59:44
|
The Cortex-A9 is an ARMv7 core which implements its own set of
additional events on top of those defined by the architecture.

This patch adds support for these extra events.

Signed-off-by: Will Deacon <wil...@ar...>
---
 events/arm/armv7-ca9/events     |   50 +++++++++++++++++++++++++++++++++++++++
 events/arm/armv7-ca9/unit_masks |    4 +++
 2 files changed, 54 insertions(+), 0 deletions(-)
 create mode 100644 events/arm/armv7-ca9/events
 create mode 100644 events/arm/armv7-ca9/unit_masks

diff --git a/events/arm/armv7-ca9/events b/events/arm/armv7-ca9/events
new file mode 100644
index 0000000..c1e4084
--- /dev/null
+++ b/events/arm/armv7-ca9/events
@@ -0,0 +1,50 @@
+# ARM Cortex A9 events
+# From Cortex A9 TRM
+#
+include:arm/armv7-common
+event:0x40 counters:1,2,3,4,5,6 um:zero minimum:500 name:JAVA_BC_EXEC : Number of Java bytecodes decoded, including speculative ones
+event:0x41 counters:1,2,3,4,5,6 um:zero minimum:500 name:JAVA_SFTBC_EXEC : Number of software Java bytecodes decoded, including speculative ones
+event:0x42 counters:1,2,3,4,5,6 um:zero minimum:500 name:JAVA_BB_EXEC : Number of Jazelle taken branches executed, including those flushed due to a previous load/store which aborts late
+
+event:0x50 counters:1,2,3,4,5,6 um:zero minimum:500 name:CO_LF_MISS : Number of coherent linefill requests which miss in all other CPUs, meaning that the request is sent to external memory
+event:0x51 counters:1,2,3,4,5,6 um:zero minimum:500 name:CO_LF_HIT : Number of coherent linefill requests which hit in another CPU, meaning that the linefill data is fetched directly from the relevant cache
+
+event:0x60 counters:1,2,3,4,5,6 um:zero minimum:500 name:IC_DEP_STALL : Number of cycles where CPU is ready to accept new instructions but does not receive any because of the instruction side not being able to provide any and the instruction cache is currently performing at least one linefill
+event:0x61 counters:1,2,3,4,5,6 um:zero minimum:500 name:DC_DEP_STALL : Number of cycles where CPU has some instructions that it cannot issue to any pipeline and the LSU has at least one pending linefill request but no pending TLB requests
+event:0x63 counters:1,2,3,4,5,6 um:zero minimum:500 name:STREX_PASS : Number of STREX instructions architecturally executed and passed
+event:0x64 counters:1,2,3,4,5,6 um:zero minimum:500 name:STREX_FAILS : Number of STREX instructions architecturally executed and failed
+event:0x65 counters:1,2,3,4,5,6 um:zero minimum:500 name:DATA_EVICT : Number of eviction requests due to a linefill in the data cache
+event:0x66 counters:1,2,3,4,5,6 um:zero minimum:500 name:ISS_NO_DISP : Number of cycles where the issue stage does not dispatch any instruction
+event:0x67 counters:1,2,3,4,5,6 um:zero minimum:500 name:ISS_EMPTY : Number of cycles where the issue stage is empty
+event:0x68 counters:1,2,3,4,5,6 um:zero minimum:500 name:INS_RENAME : Number of instructions going through the Register Renaming stage
+
+event:0x6E counters:1,2,3,4,5,6 um:zero minimum:500 name:PRD_FN_RET : Number of procedure returns whose condition codes do not fail, excluding all exception returns
+
+event:0x70 counters:1,2,3,4,5,6 um:zero minimum:500 name:INS_MAIN_EXEC : Number of instructions being executed in main execution pipeline of the CPU, the multiply pipeline and the ALU pipeline
+event:0x71 counters:1,2,3,4,5,6 um:zero minimum:500 name:INS_SND_EXEC : Number of instructions being executed in the second execution pipeline (ALU) of the CPU
+event:0x72 counters:1,2,3,4,5,6 um:zero minimum:500 name:INS_LSU : Number of instructions being executed in the Load/Store unit
+event:0x73 counters:1,2,3,4,5,6 um:zero minimum:500 name:INS_FP_RR : Number of floating-point instructions going through the Register Rename stage
+event:0x74 counters:1,2,3,4,5,6 um:zero minimum:500 name:INS_NEON_RR : Number of NEON instructions going through the Register Rename stage
+
+event:0x80 counters:1,2,3,4,5,6 um:zero minimum:500 name:STALL_PLD : Number of cycles where CPU is stalled because PLD slots are all full
+event:0x81 counters:1,2,3,4,5,6 um:zero minimum:500 name:STALL_WRITE : Number of cycles where CPU is stalled because data side is full and executing writes to external memory
+event:0x82 counters:1,2,3,4,5,6 um:zero minimum:500 name:STALL_INS_TLB : Number of cycles where CPU is stalled because of main TLB misses on requests issued by the instruction side
+event:0x83 counters:1,2,3,4,5,6 um:zero minimum:500 name:STALL_DATA_TLB : Number of cycles where CPU is stalled because of main TLB misses on requests issued by the data side
+event:0x84 counters:1,2,3,4,5,6 um:zero minimum:500 name:STALL_INS_UTLB : Number of cycles where CPU is stalled because of micro TLB misses on the instruction side
+event:0x85 counters:1,2,3,4,5,6 um:zero minimum:500 name:STALL_DATA_ULTB : Number of cycles where CPU is stalled because of micro TLB misses on the data side
+event:0x86 counters:1,2,3,4,5,6 um:zero minimum:500 name:STALL_DMB : Number of cycles where CPU is stalled due to executed of a DMB memory barrier
+
+event:0x8A counters:1,2,3,4,5,6 um:zero minimum:500 name:CLK_INT_EN : Number of cycles during which the integer core clock is enabled
+event:0x8B counters:1,2,3,4,5,6 um:zero minimum:500 name:CLK_DE_EN : Number of cycles during which the Data Engine clock is enabled
+
+event:0x90 counters:1,2,3,4,5,6 um:zero minimum:500 name:INS_ISB : Number of ISB instructions architecturally executed
+event:0x91 counters:1,2,3,4,5,6 um:zero minimum:500 name:INS_DSB : Number of DSB instructions architecturally executed
+event:0x92 counters:1,2,3,4,5,6 um:zero minimum:500 name:INS_DMB : Number of DMB instructions speculatively executed
+event:0x93 counters:1,2,3,4,5,6 um:zero minimum:500 name:EXT_IRQ : Number of external interrupts executed by the processor
+
+event:0xA0 counters:1,2,3,4,5,6 um:zero minimum:500 name:PLE_CL_REQ_CMP : PLE cache line request completed
+event:0xA1 counters:1,2,3,4,5,6 um:zero minimum:500 name:PLE_CL_REQ_SKP : PLE cache line request skipped
+event:0xA2 counters:1,2,3,4,5,6 um:zero minimum:500 name:PLE_FIFO_FLSH : PLE FIFO flush
+event:0xA3 counters:1,2,3,4,5,6 um:zero minimum:500 name:PLE_REQ_COMP : PLE request completed
+event:0xA4 counters:1,2,3,4,5,6 um:zero minimum:500 name:PLE_FIFO_OF : PLE FIFO overflow
+event:0xA5 counters:1,2,3,4,5,6 um:zero minimum:500 name:PLE_REQ_PRG : PLE request programmed
diff --git a/events/arm/armv7-ca9/unit_masks b/events/arm/armv7-ca9/unit_masks
new file mode 100644
index 0000000..4027469
--- /dev/null
+++ b/events/arm/armv7-ca9/unit_masks
@@ -0,0 +1,4 @@
+# ARM V7 PMNC possible unit masks
+#
+name:zero type:mandatory default:0x00
+	0x00 No unit mask
-- 
1.6.3.3
|
From: Maynard J. <may...@us...> - 2010-06-30 15:26:55
|
Will Deacon wrote:
> Hello,
>

Thanks for the patch. Does it pass 'make check'? I'll let Richard review to give it his blessing, since he's the ARM expert around here.

-Maynard

> This is a re-posting of the patch series originally posted last month:
>
> http://marc.info/?l=oprofile-list&m=127411918506283&w=2
>
> The only change is that I have rebased the code on top of CVS so that
> it doesn't conflict with the Nehalem code.
>
> It would be good to get this committed as Ubuntu are carrying these
> patches for the Maverick release and it would be easier to maintain if
> it was available upstream.
>
> Thanks!
>
>
> The following patch series adds userspace support for the Cortex-A9
> CPU. Kernel support for this core is due to arrive in 2.6.35.
>
> The current ARMv7 event structure is also modified to factor out
> the common architectural events from the core-specific extensions.
>
> All feedback welcome.
>
>
> Will Deacon (3):
>   ARM: factor out ARMv7 common architectural events
>   ARM: correct usage of core terminology for v7 and MPCore
>   ARM: add support for Cortex-A9 events
>
>  events/Makefile.am                 |    2 +
>  events/arm/armv7-ca9/events        |   50 ++++++++++++++++++++++++++++++++++++
>  events/arm/armv7-ca9/unit_masks    |    4 +++
>  events/arm/armv7-common/events     |   22 ++++++++++++++++
>  events/arm/armv7-common/unit_masks |    4 +++
>  events/arm/armv7/events            |   21 +--------------
>  libop/op_cpu_type.c                |    5 ++-
>  libop/op_cpu_type.h                |    3 +-
>  libop/op_events.c                  |    1 +
>  utils/ophelp.c                     |    8 +++++-
>  10 files changed, 96 insertions(+), 24 deletions(-)
>  create mode 100644 events/arm/armv7-ca9/events
>  create mode 100644 events/arm/armv7-ca9/unit_masks
>  create mode 100644 events/arm/armv7-common/events
>  create mode 100644 events/arm/armv7-common/unit_masks
|
From: Will D. <wil...@ar...> - 2010-06-30 16:07:22
|
Hi Maynard,

Thanks for looking at this.

> Thanks for the patch. Does it pass 'make check'? I'll let Richard review to give it his blessing,
> since he's the ARM expert around here.

I just ran make check on a quad-core Cortex-A9 board and everything
appeared to PASS.

Cheers,

Will
|
From: Jean P. <jp...@mv...> - 2010-06-30 16:31:37
|
Hi Will,

Ok with those patches!

Acked-by: Jean Pihet <jp...@mv...>

Thx,
Jean

On Wed, Jun 30, 2010 at 15:59, Will Deacon <wil...@ar...> wrote:
> Hello,
>
> This is a re-posting of the patch series originally posted last month:
>
> http://marc.info/?l=oprofile-list&m=127411918506283&w=2
>
> The only change is that I have rebased the code on top of CVS so that
> it doesn't conflict with the Nehalem code.
>
> It would be good to get this committed as Ubuntu are carrying these
> patches for the Maverick release and it would be easier to maintain if
> it was available upstream.
>
> Thanks!
>
>
> The following patch series adds userspace support for the Cortex-A9
> CPU. Kernel support for this core is due to arrive in 2.6.35.
>
> The current ARMv7 event structure is also modified to factor out
> the common architectural events from the core-specific extensions.
>
> All feedback welcome.
>
>
> Will Deacon (3):
>   ARM: factor out ARMv7 common architectural events
>   ARM: correct usage of core terminology for v7 and MPCore
>   ARM: add support for Cortex-A9 events
>
>  events/Makefile.am                 |    2 +
>  events/arm/armv7-ca9/events        |   50 ++++++++++++++++++++++++++++++++++++
>  events/arm/armv7-ca9/unit_masks    |    4 +++
>  events/arm/armv7-common/events     |   22 ++++++++++++++++
>  events/arm/armv7-common/unit_masks |    4 +++
>  events/arm/armv7/events            |   21 +--------------
>  libop/op_cpu_type.c                |    5 ++-
>  libop/op_cpu_type.h                |    3 +-
>  libop/op_events.c                  |    1 +
>  utils/ophelp.c                     |    8 +++++-
>  10 files changed, 96 insertions(+), 24 deletions(-)
>  create mode 100644 events/arm/armv7-ca9/events
>  create mode 100644 events/arm/armv7-ca9/unit_masks
>  create mode 100644 events/arm/armv7-common/events
>  create mode 100644 events/arm/armv7-common/unit_masks
|
From: Richard P. <rp...@rp...> - 2010-07-01 11:07:13
|
On Wed, 2010-06-30 at 17:07 +0100, Will Deacon wrote:
> Hi Maynard,
>
> Thanks for looking at this.
>
> > Thanks for the patch. Does it pass 'make check'? I'll let Richard review to give it his blessing,
> > since he's the ARM expert around here.
>
> I just ran make check on a quad-core Cortex-A9 board and everything
> appeared to PASS.

Patches look good to me.

Acked-by: Richard Purdie <rp...@rp...>

Cheers,

Richard
|
From: Maynard J. <may...@us...> - 2010-07-01 15:24:33
|
On 07/01/2010 5:41 AM, Richard Purdie wrote:
> On Wed, 2010-06-30 at 17:07 +0100, Will Deacon wrote:
>> Hi Maynard,
>>
>> Thanks for looking at this.
>>
>>> Thanks for the patch. Does it pass 'make check'? I'll let Richard review to give it his blessing,
>>> since he's the ARM expert around here.
>>
>> I just ran make check on a quad-core Cortex-A9 board and everything
>> appeared to PASS.
>
> Patches look good to me.
>
> Acked-by: Richard Purdie<rp...@rp...>

Patches applied.

Will, for future reference, please include an update to the ChangeLog file in any patches submitted. Thanks.

-Maynard

>
> Cheers,
>
> Richard
>
|
From: Will D. <wil...@ar...> - 2010-07-01 15:26:37
|
Hi Maynard,

> Patches applied.

Thanks - that's great!

> Will, for future reference, please include an update to the ChangeLog file in
> any patches submitted. Thanks.

Ah yes, sorry about that. This is mostly cleanup, but the A9 support
should probably be mentioned somewhere.

Thanks again,

Will
|
From: Will D. <wil...@ar...> - 2010-05-17 17:59:38
|
From: Will D. <wil...@ar...> - 2010-05-17 17:59:39
|
Currently, OProfile refers to the ARM11MPCore as "ARM MPCore". The Cortex-A9 is also available in MP configurations, so this nomenclature is ambiguous. Additionally, OProfile refers to the Cortex-A8 event set as "ARMv7 PMNC" which doesn't make sense when the Cortex-A9 is added to the equation. This patch fixes up errors in terminology, but leaves the string which is used to communicate with the Kernel alone in order to remain backwards compatible. Furthermore, a new cpu_descr is added for Cortex-A9. Signed-off-by: Will Deacon <wil...@ar...> --- events/Makefile.am | 2 ++ libop/op_cpu_type.c | 5 +++-- libop/op_cpu_type.h | 3 ++- libop/op_events.c | 1 + utils/ophelp.c | 8 +++++++- 5 files changed, 15 insertions(+), 4 deletions(-) diff --git a/events/Makefile.am b/events/Makefile.am index 6c51f78..546b631 100644 --- a/events/Makefile.am +++ b/events/Makefile.am @@ -38,7 +38,9 @@ event_files = \ arm/xscale1/events arm/xscale1/unit_masks \ arm/xscale2/events arm/xscale2/unit_masks \ arm/armv6/events arm/armv6/unit_masks \ + arm/armv7-common/events arm/armv7-common/unit_masks \ arm/armv7/events arm/armv7/unit_masks \ + arm/armv7-ca9/events arm/armv7-ca9/unit_masks \ arm/mpcore/events arm/mpcore/unit_masks \ avr32/events avr32/unit_masks \ mips/20K/events mips/20K/unit_masks \ diff --git a/libop/op_cpu_type.c b/libop/op_cpu_type.c index 0262d02..0014e81 100644 --- a/libop/op_cpu_type.c +++ b/libop/op_cpu_type.c @@ -70,12 +70,12 @@ static struct cpu_descr const cpu_descrs[MAX_CPU_TYPE] = { { "ppc64 Cell Broadband Engine", "ppc64/cell-be", CPU_PPC64_CELL, 8 }, { "AMD64 family10", "x86-64/family10", CPU_FAMILY10, 4 }, { "ppc64 PA6T", "ppc64/pa6t", CPU_PPC64_PA6T, 6 }, - { "ARM MPCore", "arm/mpcore", CPU_ARM_MPCORE, 2 }, + { "ARM 11MPCore", "arm/mpcore", CPU_ARM_MPCORE, 2 }, { "ARM V6 PMU", "arm/armv6", CPU_ARM_V6, 3 }, { "ppc64 POWER5++", "ppc64/power5++", CPU_PPC64_POWER5pp, 6 }, { "e300", "ppc/e300", CPU_PPC_E300, 4 }, { "AVR32", "avr32", CPU_AVR32, 3 }, - { "ARM V7 PMNC", 
"arm/armv7", CPU_ARM_V7, 5 }, + { "ARM Cortex-A8", "arm/armv7", CPU_ARM_V7, 5 }, { "Intel Architectural Perfmon", "i386/arch_perfmon", CPU_ARCH_PERFMON, 0}, { "AMD64 family11h", "x86-64/family11h", CPU_FAMILY11H, 4 }, { "ppc64 POWER7", "ppc64/power7", CPU_PPC64_POWER7, 6 }, @@ -83,6 +83,7 @@ static struct cpu_descr const cpu_descrs[MAX_CPU_TYPE] = { { "Intel Core/i7", "i386/core_i7", CPU_CORE_I7, 4 }, { "Intel Atom", "i386/atom", CPU_ATOM, 2 }, { "Loongson2", "mips/loongson2", CPU_MIPS_LOONGSON2, 2 }, + { "ARM Cortex-A9", "arm/armv7-ca9", CPU_ARM_V7_CA9, 7 }, }; static size_t const nr_cpu_descrs = sizeof(cpu_descrs) / sizeof(struct cpu_descr); diff --git a/libop/op_cpu_type.h b/libop/op_cpu_type.h index 9108253..6391bc9 100644 --- a/libop/op_cpu_type.h +++ b/libop/op_cpu_type.h @@ -72,7 +72,7 @@ typedef enum { CPU_PPC64_POWER5pp, /**< ppc64 Power5++ family */ CPU_PPC_E300, /**< e300 */ CPU_AVR32, /**< AVR32 */ - CPU_ARM_V7, /**< ARM V7 */ + CPU_ARM_V7, /**< ARM Cortex-A8 */ CPU_ARCH_PERFMON, /**< Intel architectural perfmon */ CPU_FAMILY11H, /**< AMD family 11h */ CPU_PPC64_POWER7, /**< ppc64 POWER7 family */ @@ -80,6 +80,7 @@ typedef enum { CPU_CORE_I7, /* Intel Core i7, Nehalem */ CPU_ATOM, /* First generation Intel Atom */ CPU_MIPS_LOONGSON2, /* < loongson2 family */ + CPU_ARM_V7_CA9, /**< ARM Cortex-A9 */ MAX_CPU_TYPE } op_cpu; diff --git a/libop/op_events.c b/libop/op_events.c index 08f92a2..3cf7eef 100644 --- a/libop/op_events.c +++ b/libop/op_events.c @@ -1006,6 +1006,7 @@ void op_default_event(op_cpu cpu_type, struct op_default_event_descr * descr) case CPU_ARM_MPCORE: case CPU_ARM_V6: case CPU_ARM_V7: + case CPU_ARM_V7_CA9: case CPU_AVR32: descr->name = "CPU_CYCLES"; break; diff --git a/utils/ophelp.c b/utils/ophelp.c index 0bb8780..c511d47 100644 --- a/utils/ophelp.c +++ b/utils/ophelp.c @@ -534,10 +534,16 @@ int main(int argc, char const * argv[]) case CPU_ARM_V7: event_doc = - "See ARM11 Technical Reference Manual\n" + "See Cortex-A8 Technical Reference 
Manual\n" "Cortex A8 DDI (ARM DDI 0344B, revision r1p1)\n"; break; + case CPU_ARM_V7_CA9: + event_doc = + "See Cortex-A9 Technical Reference Manual\n" + "Cortex A9 DDI (ARM DDI 0388E, revision r2p0)\n"; + break; + case CPU_PPC64_PA6T: event_doc = "See PA6T Power Implementation Features Book IV\n" -- 1.6.3.3 |
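The commit message above separates two strings per CPU: a human-readable description, which the patch renames, and the short identifier exchanged with the kernel, which it leaves untouched. The following is a minimal sketch of why that keeps lookups backwards compatible. It is not OProfile's actual code: the reduced enum, the field name `pretty`, the `CPU_NO_GOOD` sentinel value and the helpers `find_cpu_by_name` and `pretty_name` are assumptions made for illustration; only the table entries themselves are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reduced, hypothetical mirror of the enum in libop/op_cpu_type.h. */
typedef enum {
	CPU_NO_GOOD = -1,	/* assumed sentinel for "unknown CPU" */
	CPU_ARM_MPCORE,
	CPU_ARM_V7,
	CPU_ARM_V7_CA9,
	MAX_CPU_TYPE
} op_cpu;

struct cpu_descr {
	char const * pretty;	/* display string (assumed field name) */
	char const * name;	/* identifier shared with the kernel */
	op_cpu cpu;
	unsigned int nr_counters;
};

/* Entries copied from the patch; all other CPUs omitted. */
static struct cpu_descr const cpu_descrs[] = {
	{ "ARM 11MPCore", "arm/mpcore", CPU_ARM_MPCORE, 2 },
	{ "ARM Cortex-A8", "arm/armv7", CPU_ARM_V7, 5 },
	{ "ARM Cortex-A9", "arm/armv7-ca9", CPU_ARM_V7_CA9, 7 },
};

static size_t const nr_cpu_descrs = sizeof(cpu_descrs) / sizeof(cpu_descrs[0]);

/* Map the kernel-facing identifier to a CPU type; the display string is never consulted. */
op_cpu find_cpu_by_name(char const * name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < nr_cpu_descrs; ++i) {
		if (strcmp(cpu_descrs[i].name, name) == 0)
			return cpu_descrs[i].cpu;
	}
	return CPU_NO_GOOD;
}

/* Return the display string for a CPU type, or NULL if it is not in the table. */
char const * pretty_name(op_cpu cpu)
{
	size_t i;

	for (i = 0; i < nr_cpu_descrs; ++i) {
		if (cpu_descrs[i].cpu == cpu)
			return cpu_descrs[i].pretty;
	}
	return NULL;
}
```

Under these assumptions, a kernel that still reports "arm/armv7" resolves to CPU_ARM_V7 exactly as before, and only the text shown to the user changes from "ARM V7 PMNC" to "ARM Cortex-A8".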
From: Will D. <wil...@ar...> - 2010-05-17 17:59:41
|
The Cortex-A9 is an ARMv7 core which implements its own set of additional events on top of those defined by the architecture. This patch adds support for these extra events. Signed-off-by: Will Deacon <wil...@ar...> --- events/arm/armv7-ca9/events | 50 +++++++++++++++++++++++++++++++++++++++ events/arm/armv7-ca9/unit_masks | 4 +++ 2 files changed, 54 insertions(+), 0 deletions(-) create mode 100644 events/arm/armv7-ca9/events create mode 100644 events/arm/armv7-ca9/unit_masks diff --git a/events/arm/armv7-ca9/events b/events/arm/armv7-ca9/events new file mode 100644 index 0000000..c1e4084 --- /dev/null +++ b/events/arm/armv7-ca9/events @@ -0,0 +1,50 @@ +# ARM Cortex A9 events +# From Cortex A9 TRM +# +include:arm/armv7-common +event:0x40 counters:1,2,3,4,5,6 um:zero minimum:500 name:JAVA_BC_EXEC : Number of Java bytecodes decoded, including speculative ones +event:0x41 counters:1,2,3,4,5,6 um:zero minimum:500 name:JAVA_SFTBC_EXEC : Number of software Java bytecodes decoded, including speculative ones +event:0x42 counters:1,2,3,4,5,6 um:zero minimum:500 name:JAVA_BB_EXEC : Number of Jazelle taken branches executed, including those flushed due to a previous load/store which aborts late + +event:0x50 counters:1,2,3,4,5,6 um:zero minimum:500 name:CO_LF_MISS : Number of coherent linefill requests which miss in all other CPUs, meaning that the request is sent to external memory +event:0x51 counters:1,2,3,4,5,6 um:zero minimum:500 name:CO_LF_HIT : Number of coherent linefill requests which hit in another CPU, meaning that the linefill data is fetched directly from the relevant cache + +event:0x60 counters:1,2,3,4,5,6 um:zero minimum:500 name:IC_DEP_STALL : Number of cycles where CPU is ready to accept new instructions but does not receive any because of the instruction side not being able to provide any and the instruction cache is currently performing at least one linefill +event:0x61 counters:1,2,3,4,5,6 um:zero minimum:500 name:DC_DEP_STALL : Number of cycles where CPU 
has some instructions that it cannot issue to any pipeline and the LSU has at least one pending linefill request but no pending TLB requests +event:0x63 counters:1,2,3,4,5,6 um:zero minimum:500 name:STREX_PASS : Number of STREX instructions architecturally executed and passed +event:0x64 counters:1,2,3,4,5,6 um:zero minimum:500 name:STREX_FAILS : Number of STREX instructions architecturally executed and failed +event:0x65 counters:1,2,3,4,5,6 um:zero minimum:500 name:DATA_EVICT : Number of eviction requests due to a linefill in the data cache +event:0x66 counters:1,2,3,4,5,6 um:zero minimum:500 name:ISS_NO_DISP : Number of cycles where the issue stage does not dispatch any instruction +event:0x67 counters:1,2,3,4,5,6 um:zero minimum:500 name:ISS_EMPTY : Number of cycles where the issue stage is empty +event:0x68 counters:1,2,3,4,5,6 um:zero minimum:500 name:INS_RENAME : Number of instructions going through the Register Renaming stage + +event:0x6E counters:1,2,3,4,5,6 um:zero minimum:500 name:PRD_FN_RET : Number of procedure returns whose condition codes do not fail, excluding all exception returns + +event:0x70 counters:1,2,3,4,5,6 um:zero minimum:500 name:INS_MAIN_EXEC : Number of instructions being executed in main execution pipeline of the CPU, the multiply pipeline and the ALU pipeline +event:0x71 counters:1,2,3,4,5,6 um:zero minimum:500 name:INS_SND_EXEC : Number of instructions being executed in the second execution pipeline (ALU) of the CPU +event:0x72 counters:1,2,3,4,5,6 um:zero minimum:500 name:INS_LSU : Number of instructions being executed in the Load/Store unit +event:0x73 counters:1,2,3,4,5,6 um:zero minimum:500 name:INS_FP_RR : Number of floating-point instructions going through the Register Rename stage +event:0x74 counters:1,2,3,4,5,6 um:zero minimum:500 name:INS_NEON_RR : Number of NEON instructions going through the Register Rename stage + +event:0x80 counters:1,2,3,4,5,6 um:zero minimum:500 name:STALL_PLD : Number of cycles where CPU is stalled 
because PLD slots are all full +event:0x81 counters:1,2,3,4,5,6 um:zero minimum:500 name:STALL_WRITE : Number of cycles where CPU is stalled because data side is full and executing writes to external memory +event:0x82 counters:1,2,3,4,5,6 um:zero minimum:500 name:STALL_INS_TLB : Number of cycles where CPU is stalled because of main TLB misses on requests issued by the instruction side +event:0x83 counters:1,2,3,4,5,6 um:zero minimum:500 name:STALL_DATA_TLB : Number of cycles where CPU is stalled because of main TLB misses on requests issued by the data side +event:0x84 counters:1,2,3,4,5,6 um:zero minimum:500 name:STALL_INS_UTLB : Number of cycles where CPU is stalled because of micro TLB misses on the instruction side +event:0x85 counters:1,2,3,4,5,6 um:zero minimum:500 name:STALL_DATA_ULTB : Number of cycles where CPU is stalled because of micro TLB misses on the data side +event:0x86 counters:1,2,3,4,5,6 um:zero minimum:500 name:STALL_DMB : Number of cycles where CPU is stalled due to executed of a DMB memory barrier + +event:0x8A counters:1,2,3,4,5,6 um:zero minimum:500 name:CLK_INT_EN : Number of cycles during which the integer core clock is enabled +event:0x8B counters:1,2,3,4,5,6 um:zero minimum:500 name:CLK_DE_EN : Number of cycles during which the Data Engine clock is enabled + +event:0x90 counters:1,2,3,4,5,6 um:zero minimum:500 name:INS_ISB : Number of ISB instructions architecturally executed +event:0x91 counters:1,2,3,4,5,6 um:zero minimum:500 name:INS_DSB : Number of DSB instructions architecturally executed +event:0x92 counters:1,2,3,4,5,6 um:zero minimum:500 name:INS_DMB : Number of DMB instructions speculatively executed +event:0x93 counters:1,2,3,4,5,6 um:zero minimum:500 name:EXT_IRQ : Number of external interrupts executed by the processor + +event:0xA0 counters:1,2,3,4,5,6 um:zero minimum:500 name:PLE_CL_REQ_CMP : PLE cache line request completed +event:0xA1 counters:1,2,3,4,5,6 um:zero minimum:500 name:PLE_CL_REQ_SKP : PLE cache line request 
skipped +event:0xA2 counters:1,2,3,4,5,6 um:zero minimum:500 name:PLE_FIFO_FLSH : PLE FIFO flush +event:0xA3 counters:1,2,3,4,5,6 um:zero minimum:500 name:PLE_REQ_COMP : PLE request completed +event:0xA4 counters:1,2,3,4,5,6 um:zero minimum:500 name:PLE_FIFO_OF : PLE FIFO overflow +event:0xA5 counters:1,2,3,4,5,6 um:zero minimum:500 name:PLE_REQ_PRG : PLE request programmed diff --git a/events/arm/armv7-ca9/unit_masks b/events/arm/armv7-ca9/unit_masks new file mode 100644 index 0000000..4027469 --- /dev/null +++ b/events/arm/armv7-ca9/unit_masks @@ -0,0 +1,4 @@ +# ARM V7 PMNC possible unit masks +# +name:zero type:mandatory default:0x00 + 0x00 No unit mask -- 1.6.3.3 |
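Each line in the events file above follows the same `event:… counters:… um:… minimum:… name:… : description` layout. The sketch below shows one way such a line could be split into its fields. It is an illustration only and not the parser OProfile uses: `struct parsed_event`, `parse_event_line` and the bitmask representation of the counter list are all hypothetical names and choices introduced here.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the fields of one event line. */
struct parsed_event {
	unsigned int code;		/* value after "event:" */
	unsigned int counter_mask;	/* bit n set if counter n is listed */
	unsigned int minimum;		/* value after "minimum:" */
	char um[32];			/* unit mask name, e.g. "zero" */
	char name[64];			/* event name, e.g. "CPU_CYCLES" */
};

/* Parse one event line; returns 0 on success, -1 if the line does not match. */
int parse_event_line(char const * line, struct parsed_event * ev)
{
	char counters[64];
	char const * p;

	/* %x accepts the leading "0x"; the field widths bound every string copy. */
	if (sscanf(line, "event:%x counters:%63s um:%31s minimum:%u name:%63s",
		   &ev->code, counters, ev->um, &ev->minimum, ev->name) != 5)
		return -1;

	/* Turn the comma-separated counter list into a bitmask. */
	ev->counter_mask = 0;
	p = counters;
	while (*p != '\0') {
		char * end;
		unsigned long n = strtoul(p, &end, 10);

		if (end == p || n >= 32)
			return -1;
		ev->counter_mask |= 1u << n;
		p = (*end == ',') ? end + 1 : end;
	}
	return 0;
}
```

With this sketch, the Cortex-A9 specific events (counters 1 to 6) yield a mask with bits 1 through 6 set, while CPU_CYCLES from the common file (counters:0) yields only bit 0. Lines such as `include:arm/armv7-common` are rejected and would need separate handling.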
From: Jean P. <jp...@mv...> - 2010-05-17 19:10:54
|
Hi Will,

On Mon, May 17, 2010 at 19:32, Will Deacon <wil...@ar...> wrote:
> Hello,
>
> The following patch series adds userspace support for the Cortex-A9
> CPU. Kernel support for this core is due to arrive in 2.6.35.
>
> The current ARMv7 event structure is also modified to factor out
> the common architectural events from the core-specific extensions.

I fully agree. With this code in we now have common and core-specific code in both kernel and user space.

> All feedback welcome.
>
> Will Deacon (3):
>   ARM: factor out ARMv7 common architectural events
>   ARM: correct usage of core terminology for v7 and MPCore
>   ARM: add support for Cortex-A9 events

Jean
|
From: Will D. <wil...@ar...> - 2010-05-27 16:19:53
|
Hi Jean,

> I fully agree. With this code in we now have common and core-specific
> code in both kernel and user space.

Thanks for the feedback. Are patches picked up from this mailing list or do I need to submit them elsewhere after getting relevant ACKs?

Cheers,

Will
|
From: Jean P. <jp...@mv...> - 2010-05-27 17:03:58
|
Hi Will,

On Thu, May 27, 2010 at 18:19, Will Deacon <wil...@ar...> wrote:
> Hi Jean,
>
> > I fully agree. With this code in we now have common and core-specific
> > code in both kernel and user space.
>
> Thanks for the feedback.

You are welcome. I quickly reviewed the patch and nothing bad popped up.

> Are patches picked up from this mailing list
> or do I need to submit them elsewhere after getting relevant ACKs?

You just need to sign off the patch, and Richard Purdie applies it to the CVS tree if it is relevant. Some guidelines are available at http://oprofile.sourceforge.net/contribute/. Do you need a formal ACK from me?

> Cheers,
>
> Will

Indeed, cheers!

Jean
|