From: William C. <wc...@re...> - 2014-02-04 20:11:06
|
This revised patch addresses Will Deacon's comment about possible follow-on implementations of the PMU, such as pmuv4, for armv8 processors. The name is armv8-pmuv3, to match what the kernel reports for perf events. If an aarch64 processor has implementation-specific events, it can be named appropriately and can include the events in this patch. "make distcheck" works fine with this version of the patch.

William Cohen (1):
  Provide basic AArch64 support

 events/Makefile.am                |  1 +
 events/arm/armv8-pmuv3/events     | 38 ++++++++++++++++++++++++++++++++++++++
 events/arm/armv8-pmuv3/unit_masks |  4 ++++
 libop/op_cpu_type.c               | 11 ++++++++++-
 libop/op_cpu_type.h               |  1 +
 libop/op_events.c                 |  1 +
 utils/opcontrol                   |  5 +++++
 utils/ophelp.c                    |  7 +++++++
 8 files changed, 67 insertions(+), 1 deletion(-)
 create mode 100644 events/arm/armv8-pmuv3/events
 create mode 100644 events/arm/armv8-pmuv3/unit_masks

--
1.8.3.1
From: Maynard J. <may...@us...> - 2014-02-11 17:39:42
|
How does this look, guys? *Will D*, please double-check that the new op_get_cpu_number() code is correct. *Will C*, please test the patch to make sure I didn't miss anything.

Thanks.

-Maynard

-------------------------------------------------------------------------------

Provide basic AArch64 (ARMv8) support

The AArch64 (ARMv8) support is provided as an ARM variant to allow use
in both 32-bit and 64-bit ARM environments. The support in this patch
is just the basic events described in the AArch64 documentation.
AArch64 processor implementations may provide additional
implementation-specific events. One could add code to recognize those
processor-specific implementations and include the armv8-pmuv3-common
base events into the event sets for the processor implementations.
The ARMv8 X-Gene processor type is included in this patch as an
implementation, although there are no known processor-specific events
to add at this time.

Below is an example run on the ARM Foundation simulator collecting data
on a build of OProfile.

$ cd oprofile
$ operf make
...
$ opreport -t 5
Using /home/wcohen/oprofile/oprofile/oprofile_data/samples/ for samples directory.

WARNING: Lost samples detected! See /home/wcohen/oprofile/oprofile/oprofile_data/samples/operf.log for details.
CPU: ARM AArch64
Counted CPU_CYCLES events (Cycle) with a unit mask of 0x00 (No unit mask) count 100000
CPU_CYCLES:100000|
  samples|      %|
------------------
    10943 90.5877 make
        CPU_CYCLES:100000|
          samples|      %|
        ------------------
             5281 48.2592 make
             4543 41.5151 libc-2.17.so
             1079  9.8602 kallsyms
               40  0.3655 ld-2.17.so
      735  6.0844 sh
        CPU_CYCLES:100000|
          samples|      %|
        ------------------
              321 43.6735 kallsyms
              298 40.5442 libc-2.17.so
               94 12.7891 bash
               22  2.9932 ld-2.17.so

Signed-off-by: William Cohen <wc...@re...>
---
 events/Makefile.am                       |    2 +
 events/arm/armv8-pmuv3-common/events     |   38 ++++++++++++++++++++++++++++++
 events/arm/armv8-pmuv3-common/unit_masks |    4 +++
 events/arm/armv8-xgene/events            |    7 +++++
 events/arm/armv8-xgene/unit_masks        |    3 ++
 libop/op_cpu_type.c                      |   11 ++++++++-
 libop/op_cpu_type.h                      |    1 +
 libop/op_events.c                        |    1 +
 utils/opcontrol                          |    5 ++++
 utils/ophelp.c                           |    7 +++++
 10 files changed, 78 insertions(+), 1 deletions(-)
 create mode 100644 events/arm/armv8-pmuv3-common/events
 create mode 100644 events/arm/armv8-pmuv3-common/unit_masks
 create mode 100644 events/arm/armv8-xgene/events
 create mode 100644 events/arm/armv8-xgene/unit_masks
 mode change 100755 => 100644 utils/opcontrol

Index: op-master/events/Makefile.am
===================================================================
--- op-master.orig/events/Makefile.am
+++ op-master/events/Makefile.am
@@ -59,6 +59,8 @@ event_files = \
     arm/armv7-ca7/events arm/armv7-ca7/unit_masks \
     arm/armv7-ca15/events arm/armv7-ca15/unit_masks \
     arm/mpcore/events arm/mpcore/unit_masks \
+    arm/armv8-pmuv3-common/events arm/armv8-pmuv3-common/unit_masks \
+    arm/armv8-xgene/events arm/armv8-xgene/unit_masks \
     avr32/events avr32/unit_masks \
     mips/20K/events mips/20K/unit_masks \
     mips/24K/events mips/24K/unit_masks \
Index: op-master/events/arm/armv8-pmuv3-common/events
===================================================================
--- /dev/null
+++ op-master/events/arm/armv8-pmuv3-common/events
@@ -0,0 +1,38 @@
+#
+# Copyright (c) Red Hat, 2014.
+# Contributed by William Cohen <wc...@re...>
+#
+# ARMv8 pmu v3 architected events
+
+event:0x00 um:zero minimum:500 name:SW_INCR : Instruction architecturally executed, condition code check pass, software increment
+event:0x01 um:zero minimum:5000 name:L1I_CACHE_REFILL : Level 1 instruction cache refill
+event:0x02 um:zero minimum:5000 name:L1I_TLB_REFILL : Level 1 instruction TLB refill
+event:0x03 um:zero minimum:5000 name:L1D_CACHE_REFILL : Level 1 data cache refill
+event:0x04 um:zero minimum:5000 name:L1D_CACHE : Level 1 data cache access
+event:0x05 um:zero minimum:5000 name:L1D_TLB_REFILL : Level 1 data TLB refill
+event:0x06 um:zero minimum:100000 name:LD_RETIRED : Instruction architecturally executed, condition code check pass, load
+event:0x07 um:zero minimum:100000 name:ST_RETIRED : Instruction architecturally executed, condition code check pass, store
+event:0x08 um:zero minimum:100000 name:INST_RETIRED : Instruction architecturally executed
+event:0x09 um:zero minimum:500 name:EXC_TAKEN : Exception taken
+event:0x0A um:zero minimum:500 name:EXC_RETURN : Instruction architecturally executed, condition code check pass, exception return
+event:0x0B um:zero minimum:500 name:CID_WRITE_RETIRED : Instruction architecturally executed, condition code check pass, write to CONTEXTIDR
+event:0x0C um:zero minimum:5000 name:PC_WRITE_RETIRED : Instruction architecturally executed, condition code check pass, software change of the PC
+event:0x0D um:zero minimum:5000 name:BR_IMMED_RETIRED : Instruction architecturally executed, immediate branch
+event:0x0E um:zero minimum:5000 name:BR_RETURN_RETIRED : Instruction architecturally executed, condition code check pass, procedure return
+event:0x0F um:zero minimum:500 name:UNALIGNED_LDST_RETIRED : Instruction architecturally executed, condition code check pass, unaligned load or store
+event:0x10 um:zero minimum:5000 name:BR_MIS_PRED : Mispredicted or not predicted branch speculatively executed
+event:0x11 um:zero minimum:100000 name:CPU_CYCLES : Cycle
+event:0x12 um:zero minimum:5000 name:BR_PRED : Predictable branch speculatively executed
+event:0x13 um:zero minimum:100000 name:MEM_ACCESS : Data memory access
+event:0x14 um:zero minimum:5000 name:L1I_CACHE : Level 1 instruction cache access
+event:0x15 um:zero minimum:5000 name:L1D_CACHE_WB : Level 1 data cache write-back
+event:0x16 um:zero minimum:5000 name:L2D_CACHE : Level 2 data cache access
+event:0x17 um:zero minimum:5000 name:L2D_CACHE_REFILL : Level 2 data cache refill
+event:0x18 um:zero minimum:5000 name:L2D_CACHE_WB : Level 2 data cache write-back
+event:0x19 um:zero minimum:5000 name:BUS_ACCESS : Bus access
+event:0x1A um:zero minimum:500 name:MEMORY_ERROR : Local memory error
+event:0x1B um:zero minimum:100000 name:INST_SPEC : Operation speculatively executed
+event:0x1C um:zero minimum:5000 name:TTBR_WRITE_RETIRED : Instruction architecturally executed, condition code check pass, write to TTBR
+event:0x1D um:zero minimum:5000 name:BUS_CYCLES : Bus cycle
+event:0x1F um:zero minimum:5000 name:L1D_CACHE_ALLOCATE : Level 1 data cache allocation without refill
+event:0x20 um:zero minimum:5000 name:L2D_CACHE_ALLOCATE : Level 2 data cache allocation without refill
Index: op-master/events/arm/armv8-pmuv3-common/unit_masks
===================================================================
--- /dev/null
+++ op-master/events/arm/armv8-pmuv3-common/unit_masks
@@ -0,0 +1,4 @@
+# ARMv8 architected events unit masks
+#
+name:zero type:mandatory default:0x00
+    0x00 No unit mask
Index: op-master/events/arm/armv8-xgene/events
===================================================================
--- /dev/null
+++ op-master/events/arm/armv8-xgene/events
@@ -0,0 +1,7 @@
+#
+# Copyright (c) Red Hat, 2014.
+# Contributed by William Cohen <wc...@re...>
+#
+# Basic ARM V8 events
+#
+include:arm/armv8-pmuv3-common
Index: op-master/events/arm/armv8-xgene/unit_masks
===================================================================
--- /dev/null
+++ op-master/events/arm/armv8-xgene/unit_masks
@@ -0,0 +1,3 @@
+# ARMv8 architected events unit masks
+#
+include:arm/armv8-pmuv3-common
Index: op-master/libop/op_cpu_type.c
===================================================================
--- op-master.orig/libop/op_cpu_type.c
+++ op-master/libop/op_cpu_type.c
@@ -129,6 +129,7 @@ static struct cpu_descr const cpu_descrs
     { "e6500", "ppc/e6500", CPU_PPC_E6500, 6 },
     { "Intel Silvermont microarchitecture", "i386/silvermont", CPU_SILVERMONT, 2 },
     { "ARMv7 Krait", "arm/armv7-krait", CPU_ARM_KRAIT, 5 },
+    { "ARM X-Gene", "arm/armv8-xgene", CPU_ARM_V8_XGENE, 6 },
 };

 static size_t const nr_cpu_descrs = sizeof(cpu_descrs) / sizeof(struct cpu_descr);
@@ -395,6 +396,11 @@ static op_cpu _get_arm_cpu_type(void)
         case 0xc0f:
             return op_get_cpu_number("arm/armv7-ca15");
         }
+    } else if (vendorid == 0x50) { /* Applied Micro Circuits Corporation */
+        switch (cpuid) {
+        case 0x000:
+            return op_get_cpu_number("arm/armv8-xgene");
+        }
     } else if (vendorid == 0x69) { /* Intel xscale */
         switch (cpuid >> 9) {
         case 1:
@@ -631,7 +637,8 @@ static op_cpu __get_cpu_type_alt_method(
         (strncmp(uname_info.machine, "ppc64le", 7) == 0)) {
         return _get_ppc64_cpu_type();
     }
-    if (strncmp(uname_info.machine, "arm", 3) == 0) {
+    if (strncmp(uname_info.machine, "arm", 3) == 0 ||
+        strncmp(uname_info.machine, "aarch64", 7) == 0) {
         return _get_arm_cpu_type();
     }
     if (strncmp(uname_info.machine, "tile", 4) == 0) {
Index: op-master/libop/op_cpu_type.h
===================================================================
--- op-master.orig/libop/op_cpu_type.h
+++ op-master/libop/op_cpu_type.h
@@ -109,6 +109,7 @@ typedef enum {
     CPU_PPC_E6500, /**< e6500 */
     CPU_SILVERMONT, /** < Intel Silvermont microarchitecture */
     CPU_ARM_KRAIT, /**< ARM KRAIT */
+    CPU_ARM_V8_XGENE, /* ARM X-Gene */
     MAX_CPU_TYPE
 } op_cpu;

Index: op-master/libop/op_events.c
===================================================================
--- op-master.orig/libop/op_events.c
+++ op-master/libop/op_events.c
@@ -1253,6 +1253,7 @@ void op_default_event(op_cpu cpu_type, s
         case CPU_ARM_SCORPION:
         case CPU_ARM_SCORPIONMP:
         case CPU_ARM_KRAIT:
+        case CPU_ARM_V8_XGENE:
             descr->name = "CPU_CYCLES";
             break;

Index: op-master/utils/opcontrol
===================================================================
--- op-master.orig/utils/opcontrol
+++ op-master/utils/opcontrol
@@ -400,6 +400,11 @@ do_init()
             do_deinit
             exit 1
             ;;
+        aarch64/*)
+            echo "*** ARM AArch64 processors are not supported with opcontrol. Please use operf instead. ***"
+            do_deinit
+            exit 1
+            ;;
         esac
     fi

Index: op-master/utils/ophelp.c
===================================================================
--- op-master.orig/utils/ophelp.c
+++ op-master/utils/ophelp.c
@@ -656,6 +656,13 @@ int main(int argc, char const * argv[])
             "Cortex A15 DDI (ARM DDI 0438F, revision r3p1)\n";
         break;

+    case CPU_ARM_V8_XGENE:
+        event_doc =
+            "See ARM Architecture Reference Manual \n"
+            "ARMv8, for ARMv8-A architecture profile\n"
+            "DDI (ARM DDI0487A.a)\n";
+        break;
+
     case CPU_PPC64_PA6T:
         event_doc =
             "See PA6T Power Implementation Features Book IV\n"
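For readers following the detection logic in the op_cpu_type.c hunks, here is a minimal standalone sketch of the two checks the patch touches: accepting "aarch64" as an ARM machine name, and mapping an implementer/part pair to an event directory. The function names arm_event_dir() and is_arm_machine() are hypothetical and are not part of libop; the real code goes through op_get_cpu_number() and obtains vendorid/cpuid elsewhere. The 0x41 implementer value for ARM Ltd. is an assumption, since the quoted hunk does not show that condition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: map an implementer code and part number to the
 * event directory name used in the patch. Returns NULL when unknown.
 */
const char *arm_event_dir(unsigned int vendorid, unsigned int cpuid)
{
    if (vendorid == 0x41) {             /* ARM Ltd. (assumed, not shown in the hunk) */
        switch (cpuid) {
        case 0xc0f:
            return "arm/armv7-ca15";
        }
    } else if (vendorid == 0x50) {      /* Applied Micro Circuits Corporation */
        switch (cpuid) {
        case 0x000:
            return "arm/armv8-xgene";
        }
    }
    return NULL;
}

/*
 * Hypothetical helper mirroring the uname check in the patch:
 * both "arm*" and "aarch64" machines take the ARM detection path.
 */
int is_arm_machine(const char *machine)
{
    assert(machine != NULL);
    return strncmp(machine, "arm", 3) == 0 ||
           strncmp(machine, "aarch64", 7) == 0;
}
```

A caller would pass the machine field from uname(2) to is_arm_machine() first, and only then look up the event directory; an unknown pair yields NULL, so the caller can fall back to another detection method.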
From: Will D. <wil...@ar...> - 2014-02-11 17:58:28
|
On Tue, Feb 11, 2014 at 05:39:26PM +0000, Maynard Johnson wrote:
> How does this look, guys? *Will D*, please double-check that the new op_get_cpu_number()
> code is correct. *Will C*, please test the patch to make sure I didn't miss anything.

[...]

> Index: op-master/libop/op_cpu_type.c
> ===================================================================
> --- op-master.orig/libop/op_cpu_type.c
> +++ op-master/libop/op_cpu_type.c
> @@ -129,6 +129,7 @@ static struct cpu_descr const cpu_descrs
>      { "e6500", "ppc/e6500", CPU_PPC_E6500, 6 },
>      { "Intel Silvermont microarchitecture", "i386/silvermont", CPU_SILVERMONT, 2 },
>      { "ARMv7 Krait", "arm/armv7-krait", CPU_ARM_KRAIT, 5 },
> +    { "ARM X-Gene", "arm/armv8-xgene", CPU_ARM_V8_XGENE, 6 },

This should be "APM X-Gene" to keep the lawyers happy :)

http://www.apm.com/products/data-center/x-gene-family/x-gene/

> Index: op-master/libop/op_cpu_type.h
> ===================================================================
> --- op-master.orig/libop/op_cpu_type.h
> +++ op-master/libop/op_cpu_type.h
> @@ -109,6 +109,7 @@ typedef enum {
>      CPU_PPC_E6500, /**< e6500 */
>      CPU_SILVERMONT, /** < Intel Silvermont microarchitecture */
>      CPU_ARM_KRAIT, /**< ARM KRAIT */
> +    CPU_ARM_V8_XGENE, /* ARM X-Gene */

Same here (in the comment).

Will
From: Maynard J. <may...@us...> - 2014-02-11 18:22:56
|
On 02/11/2014 11:58 AM, Will Deacon wrote:
> On Tue, Feb 11, 2014 at 05:39:26PM +0000, Maynard Johnson wrote:
>> How does this look, guys? *Will D*, please double-check that the new op_get_cpu_number()
>> code is correct. *Will C*, please test the patch to make sure I didn't miss anything.
>
> [...]
>
>> Index: op-master/libop/op_cpu_type.c
>> ===================================================================
>> --- op-master.orig/libop/op_cpu_type.c
>> +++ op-master/libop/op_cpu_type.c
>> @@ -129,6 +129,7 @@ static struct cpu_descr const cpu_descrs
>>      { "e6500", "ppc/e6500", CPU_PPC_E6500, 6 },
>>      { "Intel Silvermont microarchitecture", "i386/silvermont", CPU_SILVERMONT, 2 },
>>      { "ARMv7 Krait", "arm/armv7-krait", CPU_ARM_KRAIT, 5 },
>> +    { "ARM X-Gene", "arm/armv8-xgene", CPU_ARM_V8_XGENE, 6 },
>
> This should be "APM X-Gene" to keep the lawyers happy :)

heh . . . I thought it was a typo! :-) I'll fix it when I commit it, pending Will C's successful testing.

-Maynard

>
> http://www.apm.com/products/data-center/x-gene-family/x-gene/
>
>> Index: op-master/libop/op_cpu_type.h
>> ===================================================================
>> --- op-master.orig/libop/op_cpu_type.h
>> +++ op-master/libop/op_cpu_type.h
>> @@ -109,6 +109,7 @@ typedef enum {
>>      CPU_PPC_E6500, /**< e6500 */
>>      CPU_SILVERMONT, /** < Intel Silvermont microarchitecture */
>>      CPU_ARM_KRAIT, /**< ARM KRAIT */
>> +    CPU_ARM_V8_XGENE, /* ARM X-Gene */
>
> Same here (in the comment).
>
> Will
>
From: William C. <wc...@re...> - 2014-02-11 22:50:50
|
On 02/11/2014 12:39 PM, Maynard Johnson wrote:
> How does this look, guys? *Will D*, please double-check that the new op_get_cpu_number()
> code is correct. *Will C*, please test the patch to make sure I didn't miss anything.
>
> Thanks.
>
> -Maynard

Hi Maynard,

I was able to get some operf to collect data from a physical Applied Micro X-Gene machine. Looks like the patch continues to work after the tweaks.

-Will

> [snip]
From: Maynard J. <may...@us...> - 2014-02-12 14:29:28
|
On 02/11/2014 04:50 PM, William Cohen wrote:
> On 02/11/2014 12:39 PM, Maynard Johnson wrote:
>> How does this look, guys? *Will D*, please double-check that the new op_get_cpu_number()
>> code is correct. *Will C*, please test the patch to make sure I didn't miss anything.
>>
>> Thanks.
>>
>> -Maynard
>
> Hi Maynard,
>
> I was able to get some operf to collect data from a physical Applied Micro X-Gene machine. Looks like the patch continues to work after the tweaks.

Patch applied. A final look at, and testing of, what I pushed upstream would be appreciated. The only changes I made to the patch most recently associated with this thread were to change "ARM" to "APM", as Will D pointed out, and to add a sentence to the commit message stating that the patch supports the APM X-Gene processor.

Thanks!
-Maynard

>
> -Will
>> [snip]
From: Maynard J. <may...@us...> - 2014-02-12 14:52:33
|
On 02/12/2014 08:29 AM, Maynard Johnson wrote:
> On 02/11/2014 04:50 PM, William Cohen wrote:
>> On 02/11/2014 12:39 PM, Maynard Johnson wrote:
>>> How does this look, guys? *Will D*, please double-check that the new op_get_cpu_number()
>>> code is correct. *Will C*, please test the patch to make sure I didn't miss anything.
>>>
>>> Thanks.
>>>
>>> -Maynard
>>
>> Hi Maynard,
>>
>> I was able to get some operf to collect data from a physical Applied Micro X-Gene machine. Looks like the patch continues to work after the tweaks.
> Patch applied. A final look and testing of what I pushed upstream would be appreciated, but the only changes I made from the patch most recently associated with this thread was to change "ARM" to "APM" that Will D pointed out, plus an extra sentence in the commit message stating that the patch supports the APM X-Gene processor. Thanks!

Somehow, I missed changing CPU_ARM_V8_XGENE to CPU_ARM_V8_APM_XGENE in libop/op_events.c when I committed the patch. I fixed that and pushed the fix upstream.

-Maynard

>
> -Maynard
>>
>> -Will
>>>
> [snip]
From: William C. <wc...@re...> - 2014-02-12 16:38:37
|
On 02/12/2014 09:52 AM, Maynard Johnson wrote:
> On 02/12/2014 08:29 AM, Maynard Johnson wrote:
>> On 02/11/2014 04:50 PM, William Cohen wrote:
>>> On 02/11/2014 12:39 PM, Maynard Johnson wrote:
>>>> How does this look, guys? *Will D*, please double-check that the new op_get_cpu_number()
>>>> code is correct. *Will C*, please test the patch to make sure I didn't miss anything.
>>>>
>>>> Thanks.
>>>>
>>>> -Maynard
>>>
>>> Hi Maynard,
>>>
>>> I was able to get some operf to collect data from a physical Applied Micro X-Gene machine. Looks like the patch continues to work after the tweaks.
>> Patch applied. A final look and testing of what I pushed upstream would be appreciated, but the only changes I made from the patch most recently associated with this thread was to change "ARM" to "APM" that Will D pointed out, plus an extra sentence in the commit message stating that the patch supports the APM X-Gene processor. Thanks!
> Somehow, I missed changing CPU_ARM_V8_XGENE to CPU_ARM_V8_APM_XGENE in libop/op_events.c when I committed the patch. I fixed that and pushed the fix upstream.
>
> -Maynard

Thanks so much for merging this in. As we get more experience with the APM X-Gene machines, there will probably be some additional patches for events.

-Will
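For anyone preparing such follow-up event patches, it may help to see how one line of an events file breaks down into fields. The sketch below parses a single entry in the format used by the events files in this thread (event code, unit mask name, minimum count, event name, description). It is only an illustration using sscanf(); struct pmu_event and parse_event_line() are hypothetical names, and OProfile's own parser in libop/op_events.c is written differently.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for the fields of one events-file entry. */
struct pmu_event {
    unsigned int code;          /* event:0x11 */
    char unit_mask[32];         /* um:zero */
    unsigned long minimum;      /* minimum:100000 */
    char name[64];              /* name:CPU_CYCLES */
    char description[128];      /* text after the " : " separator */
};

/*
 * Parse one line such as
 *   event:0x11 um:zero minimum:100000 name:CPU_CYCLES : Cycle
 * Returns 1 if all five fields were read, 0 otherwise (comments,
 * include: directives, and blank lines all fall into the 0 case).
 */
int parse_event_line(const char *line, struct pmu_event *ev)
{
    int fields;

    assert(line != NULL && ev != NULL);
    fields = sscanf(line,
                    "event:%x um:%31s minimum:%lu name:%63s : %127[^\n]",
                    &ev->code, ev->unit_mask, &ev->minimum,
                    ev->name, ev->description);
    return fields == 5;
}
```

Lines that are not event entries, such as the "include:arm/armv8-pmuv3-common" directive in the X-Gene files, are simply rejected here; a fuller reader would handle the include by opening the named directory's events file.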
From: William C. <wc...@re...> - 2014-02-04 20:11:06
|
The AArch64 support is provided as an ARM variant to allow use in both 32-bit and 64-bit ARM environments. The support in this patch is just the basic events described in the AArch64 documentation. AArch64 processor implementations may provide additional implementation-specific events. One could add code to recognize those processor-specific implementations and include the armv8-pmuv3 base events into the event sets for the processor implementation-specific events.

Below is an example run on the ARM Foundation simulator collecting data on a build of OProfile.

$ cd oprofile
$ operf make
...
$ opreport -t 5
Using /home/wcohen/oprofile/oprofile/oprofile_data/samples/ for samples directory.

WARNING: Lost samples detected! See /home/wcohen/oprofile/oprofile/oprofile_data/samples/operf.log for details.
CPU: ARM AArch64
Counted CPU_CYCLES events (Cycle) with a unit mask of 0x00 (No unit mask) count 100000
CPU_CYCLES:100000|
  samples|      %|
------------------
    10943 90.5877 make
        CPU_CYCLES:100000|
          samples|      %|
        ------------------
             5281 48.2592 make
             4543 41.5151 libc-2.17.so
             1079  9.8602 kallsyms
               40  0.3655 ld-2.17.so
      735  6.0844 sh
        CPU_CYCLES:100000|
          samples|      %|
        ------------------
              321 43.6735 kallsyms
              298 40.5442 libc-2.17.so
               94 12.7891 bash
               22  2.9932 ld-2.17.so

Signed-off-by: William Cohen <wc...@re...>
---
 events/Makefile.am                |  1 +
 events/arm/armv8-pmuv3/events     | 38 ++++++++++++++++++++++++++++++++++++++
 events/arm/armv8-pmuv3/unit_masks |  4 ++++
 libop/op_cpu_type.c               | 11 ++++++++++-
 libop/op_cpu_type.h               |  1 +
 libop/op_events.c                 |  1 +
 utils/opcontrol                   |  5 +++++
 utils/ophelp.c                    |  7 +++++++
 8 files changed, 67 insertions(+), 1 deletion(-)
 create mode 100644 events/arm/armv8-pmuv3/events
 create mode 100644 events/arm/armv8-pmuv3/unit_masks

diff --git a/events/Makefile.am b/events/Makefile.am
index ad45642..c17452e 100644
--- a/events/Makefile.am
+++ b/events/Makefile.am
@@ -59,6 +59,7 @@ event_files = \
     arm/armv7-ca7/events arm/armv7-ca7/unit_masks \
     arm/armv7-ca15/events arm/armv7-ca15/unit_masks \
     arm/mpcore/events arm/mpcore/unit_masks \
+    arm/armv8-pmuv3/events arm/armv8-pmuv3/unit_masks \
     avr32/events avr32/unit_masks \
     mips/20K/events mips/20K/unit_masks \
     mips/24K/events mips/24K/unit_masks \
diff --git a/events/arm/armv8-pmuv3/events b/events/arm/armv8-pmuv3/events
new file mode 100644
index 0000000..3cdff03
--- /dev/null
+++ b/events/arm/armv8-pmuv3/events
@@ -0,0 +1,38 @@
+#
+# Copyright (c) Red Hat, 2014.
+# Contributed by William Cohen <wc...@re...>
+#
+# ARMv8 pmu v3 architected events
+
+event:0x00 um:zero minimum:500 name:SW_INCR : Instruction architecturally executed, condition code check pass, software increment
+event:0x01 um:zero minimum:5000 name:L1I_CACHE_REFILL : Level 1 instruction cache refill
+event:0x02 um:zero minimum:5000 name:L1I_TLB_REFILL : Level 1 instruction TLB refill
+event:0x03 um:zero minimum:5000 name:L1D_CACHE_REFILL : Level 1 data cache refill
+event:0x04 um:zero minimum:5000 name:L1D_CACHE : Level 1 data cache access
+event:0x05 um:zero minimum:5000 name:L1D_TLB_REFILL : Level 1 data TLB refill
+event:0x06 um:zero minimum:100000 name:LD_RETIRED : Instruction architecturally executed, condition code check pass, load
+event:0x07 um:zero minimum:100000 name:ST_RETIRED : Instruction architecturally executed, condition code check pass, store
+event:0x08 um:zero minimum:100000 name:INST_RETIRED : Instruction architecturally executed
+event:0x09 um:zero minimum:500 name:EXC_TAKEN : Exception taken
+event:0x0A um:zero minimum:500 name:EXC_RETURN : Instruction architecturally executed, condition code check pass, exception return
+event:0x0B um:zero minimum:500 name:CID_WRITE_RETIRED : Instruction architecturally executed, condition code check pass, write to CONTEXTIDR
+event:0x0C um:zero minimum:5000 name:PC_WRITE_RETIRED : Instruction architecturally executed, condition code check pass, software change of the PC
+event:0x0D um:zero minimum:5000 name:BR_IMMED_RETIRED : Instruction architecturally executed, immediate branch
+event:0x0E um:zero minimum:5000 name:BR_RETURN_RETIRED : Instruction architecturally executed, condition code check pass, procedure return
+event:0x0F um:zero minimum:500 name:UNALIGNED_LDST_RETIRED : Instruction architecturally executed, condition code check pass, unaligned load or store
+event:0x10 um:zero minimum:5000 name:BR_MIS_PRED : Mispredicted or not predicted branch speculatively executed
+event:0x11 um:zero minimum:100000 name:CPU_CYCLES : Cycle
+event:0x12 um:zero minimum:5000 name:BR_PRED : Predictable branch speculatively executed
+event:0x13 um:zero minimum:100000 name:MEM_ACCESS : Data memory access
+event:0x14 um:zero minimum:5000 name:L1I_CACHE : Level 1 instruction cache access
+event:0x15 um:zero minimum:5000 name:L1D_CACHE_WB : Level 1 data cache write-back
+event:0x16 um:zero minimum:5000 name:L2D_CACHE : Level 2 data cache access
+event:0x17 um:zero minimum:5000 name:L2D_CACHE_REFILL : Level 2 data cache refill
+event:0x18 um:zero minimum:5000 name:L2D_CACHE_WB : Level 2 data cache write-back
+event:0x19 um:zero minimum:5000 name:BUS_ACCESS : Bus access
+event:0x1A um:zero minimum:500 name:MEMORY_ERROR : Local memory error
+event:0x1B um:zero minimum:100000 name:INST_SPEC : Operation speculatively executed
+event:0x1C um:zero minimum:5000 name:TTBR_WRITE_RETIRED : Instruction architecturally executed, condition code check pass, write to TTBR
+event:0x1D um:zero minimum:5000 name:BUS_CYCLES : Bus cycle
+event:0x1F um:zero minimum:5000 name:L1D_CACHE_ALLOCATE : Level 1 data cache allocation without refill
+event:0x20 um:zero minimum:5000 name:L2D_CACHE_ALLOCATE : Level 2 data cache allocation without refill
diff --git a/events/arm/armv8-pmuv3/unit_masks b/events/arm/armv8-pmuv3/unit_masks
new file mode 100644
index 0000000..7666c35
--- /dev/null
+++ b/events/arm/armv8-pmuv3/unit_masks
@@ -0,0 +1,4 @@
+# ARMv8 architected events unit masks
+#
+name:zero type:mandatory default:0x00
+    0x00 No unit mask
diff --git a/libop/op_cpu_type.c b/libop/op_cpu_type.c
index 1ae2913..396b35d 100644
--- a/libop/op_cpu_type.c
+++ b/libop/op_cpu_type.c
@@ -129,6 +129,7 @@ static struct cpu_descr const cpu_descrs[MAX_CPU_TYPE] = {
     { "e6500", "ppc/e6500", CPU_PPC_E6500, 6 },
     { "Intel Silvermont microarchitecture", "i386/silvermont", CPU_SILVERMONT, 2 },
     { "ARMv7 Krait", "arm/armv7-krait", CPU_ARM_KRAIT, 5 },
+    { "ARM AArch64", "arm/armv8-pmuv3", CPU_ARM_V8_PMUV3, 6 },
 };

 static size_t const nr_cpu_descrs = sizeof(cpu_descrs) / sizeof(struct cpu_descr);
@@ -394,6 +395,13 @@ static op_cpu _get_arm_cpu_type(void)
             return op_get_cpu_number("arm/armv7-ca9");
         case 0xc0f:
             return op_get_cpu_number("arm/armv7-ca15");
+        case 0xd00:
+            return op_get_cpu_number("arm/armv8-pmuv3");
+        }
+    } else if (vendorid == 0x50) { /* Applied Micro Circuits Corporation */
+        switch (cpuid) {
+        case 0x000:
+            return op_get_cpu_number("arm/armv8-pmuv3");
         }
     } else if (vendorid == 0x69) { /* Intel xscale */
         switch (cpuid >> 9) {
@@ -631,7 +639,8 @@ static op_cpu __get_cpu_type_alt_method(void)
         (strncmp(uname_info.machine, "ppc64le", 7) == 0)) {
         return _get_ppc64_cpu_type();
     }
-    if (strncmp(uname_info.machine, "arm", 3) == 0) {
+    if (strncmp(uname_info.machine, "arm", 3) == 0 ||
+        strncmp(uname_info.machine, "aarch64", 7) == 0) {
         return _get_arm_cpu_type();
     }
     if (strncmp(uname_info.machine, "tile", 4) == 0) {
diff --git a/libop/op_cpu_type.h b/libop/op_cpu_type.h
index 67e16de..733fb26 100644
--- a/libop/op_cpu_type.h
+++ b/libop/op_cpu_type.h
@@ -109,6 +109,7 @@ typedef enum {
     CPU_PPC_E6500, /**< e6500 */
     CPU_SILVERMONT, /** < Intel Silvermont microarchitecture */
     CPU_ARM_KRAIT, /**< ARM KRAIT */
+    CPU_ARM_V8_PMUV3, /* ARM V8 base architected events */
     MAX_CPU_TYPE
 } op_cpu;

diff --git a/libop/op_events.c b/libop/op_events.c
index 358a154..ee9001b 100644
--- a/libop/op_events.c
+++ b/libop/op_events.c
@@ -1253,6 +1253,7 @@ void op_default_event(op_cpu cpu_type, struct op_default_event_descr * descr)
         case CPU_ARM_SCORPION:
         case
CPU_ARM_SCORPIONMP: case CPU_ARM_KRAIT: + case CPU_ARM_V8_PMUV3: descr->name = "CPU_CYCLES"; break; diff --git a/utils/opcontrol b/utils/opcontrol index 38bb1ac..04a4a91 100755 --- a/utils/opcontrol +++ b/utils/opcontrol @@ -400,6 +400,11 @@ do_init() do_deinit exit 1 ;; + aarch64/*) + echo "*** ARM AArch64 processors are not supported with opcontrol. Please use operf instead. ***" + do_deinit + exit 1 + ;; esac fi diff --git a/utils/ophelp.c b/utils/ophelp.c index af4c1e5..ad42884 100644 --- a/utils/ophelp.c +++ b/utils/ophelp.c @@ -656,6 +656,13 @@ int main(int argc, char const * argv[]) "Cortex A15 DDI (ARM DDI 0438F, revision r3p1)\n"; break; + case CPU_ARM_V8_PMUV3: + event_doc = + "See ARM Architecture Reference Manual \n" + "ARMv8, for ARMv8-A architecture profile\n" + "DDI (ARM DDI0487A.a)\n"; + break; + case CPU_PPC64_PA6T: event_doc = "See PA6T Power Implementation Features Book IV\n" -- 1.8.3.1 |
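The CPU detection hunks in the patch above can be read in isolation as a two-level lookup: first decide whether the machine string is an ARM one, then map an implementer code and part number to an event-set name. What follows is only a minimal sketch of that idea, not OProfile's code. The function names is_arm_machine() and arm_event_set() are hypothetical, the values are assumed to have been parsed already, and the 0x41 implementer code for ARM Ltd. is an assumption because the hunk does not show that branch.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not an OProfile function): returns nonzero when the
 * uname machine string should be handled by the ARM detection path.
 * Mirrors the two strncmp() checks in the __get_cpu_type_alt_method() hunk.
 */
int is_arm_machine(const char *machine)
{
	return strncmp(machine, "arm", 3) == 0 ||
	       strncmp(machine, "aarch64", 7) == 0;
}

/*
 * Hypothetical helper (not an OProfile function): map an implementer code
 * and a part number to the event-set name used in the patch.
 * Returns NULL when the combination is not recognized.
 */
const char *arm_event_set(unsigned long implementer, unsigned long part)
{
	if (implementer == 0x41) {		/* ARM Ltd. (assumed value) */
		switch (part) {
		case 0xc0f:
			return "arm/armv7-ca15";
		case 0xd00:
			return "arm/armv8-pmuv3";
		}
	} else if (implementer == 0x50) {	/* Applied Micro, per the patch */
		switch (part) {
		case 0x000:
			return "arm/armv8-pmuv3";
		}
	}
	return NULL;				/* unknown CPU */
}
```

In this sketch both the 0xd00 part and the Applied Micro part resolve to the same base event set, which is the behaviour the patch description asks for while no processor-specific events are known.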
From: William C. <wc...@re...> - 2014-02-07 17:10:26
|
Hi Will, Does the following revised patch address your comments on the future versions of the PMU on ARMv8? -Will On 02/04/2014 03:10 PM, William Cohen wrote: > The AArch64 support is provided as an ARM variant to allow use in both > 32-bit and 64-bit ARM environments. The support in this patch is just > the basic events described in the AArch64 documentation. AArch64 > processor implementation may provide additional implementation > specific events. One could add code to recognize those processor > specific implementations and include the armv8-common base events into > the event sets for the processor implementation specific events. > > Below is example run on the ARM Foundation simulator collecting data > on a build of OProfile. > > $ cd oprofile > $ operf make > ... > $ opreport -t 5 > Using /home/wcohen/oprofile/oprofile/oprofile_data/samples/ for samples directory. > > WARNING: Lost samples detected! See /home/wcohen/oprofile/oprofile/oprofile_data/samples/operf.log for details. > CPU: ARM AArch64 > Counted CPU_CYCLES events (Cycle) with a unit mask of 0x00 (No unit mask) count 100000 > CPU_CYCLES:100000| > samples| %| > ------------------ > 10943 90.5877 make > CPU_CYCLES:100000| > samples| %| > ------------------ > 5281 48.2592 make > 4543 41.5151 libc-2.17.so > 1079 9.8602 kallsyms > 40 0.3655 ld-2.17.so > 735 6.0844 sh > CPU_CYCLES:100000| > samples| %| > ------------------ > 321 43.6735 kallsyms > 298 40.5442 libc-2.17.so > 94 12.7891 bash > 22 2.9932 ld-2.17.so > > Signed-off-by: William Cohen <wc...@re...> > --- > events/Makefile.am | 1 + > events/arm/armv8-pmuv3/events | 38 ++++++++++++++++++++++++++++++++++++++ > events/arm/armv8-pmuv3/unit_masks | 4 ++++ > libop/op_cpu_type.c | 11 ++++++++++- > libop/op_cpu_type.h | 1 + > libop/op_events.c | 1 + > utils/opcontrol | 5 +++++ > utils/ophelp.c | 7 +++++++ > 8 files changed, 67 insertions(+), 1 deletion(-) > create mode 100644 events/arm/armv8-pmuv3/events > create mode 100644 
events/arm/armv8-pmuv3/unit_masks > > diff --git a/events/Makefile.am b/events/Makefile.am > index ad45642..c17452e 100644 > --- a/events/Makefile.am > +++ b/events/Makefile.am > @@ -59,6 +59,7 @@ event_files = \ > arm/armv7-ca7/events arm/armv7-ca7/unit_masks \ > arm/armv7-ca15/events arm/armv7-ca15/unit_masks \ > arm/mpcore/events arm/mpcore/unit_masks \ > + arm/armv8-pmuv3/events arm/armv8-pmuv3/unit_masks \ > avr32/events avr32/unit_masks \ > mips/20K/events mips/20K/unit_masks \ > mips/24K/events mips/24K/unit_masks \ > diff --git a/events/arm/armv8-pmuv3/events b/events/arm/armv8-pmuv3/events > new file mode 100644 > index 0000000..3cdff03 > --- /dev/null > +++ b/events/arm/armv8-pmuv3/events > @@ -0,0 +1,38 @@ > +# > +# Copyright (c) Red Hat, 2014. > +# Contributed by William Cohen <wc...@re...> > +# > +# ARMv8 pmu v3 architected events > + > +event:0x00 um:zero minimum:500 name:SW_INCR : Instruction architecturally executed, condition code check pass, software increment > +event:0x01 um:zero minimum:5000 name:L1I_CACHE_REFILL : Level 1 instruction cache refill > +event:0x02 um:zero minimum:5000 name:L1I_TLB_REFILL : Level 1 instruction TLB refill > +event:0x03 um:zero minimum:5000 name:L1D_CACHE_REFILL : Level 1 data cache refill > +event:0x04 um:zero minimum:5000 name:L1D_CACHE : Level 1 data cache access > +event:0x05 um:zero minimum:5000 name:L1D_TLB_REFILL : Level 1 data TLB refill > +event:0x06 um:zero minimum:100000 name:LD_RETIRED : Instruction architecturally executed, condition code check pass, load > +event:0x07 um:zero minimum:100000 name:ST_RETIRED : Instruction architecturally executed, condition code check pass, store > +event:0x08 um:zero minimum:100000 name:INST_RETIRED : Instruction architecturally executed > +event:0x09 um:zero minimum:500 name:EXC_TAKEN : Exception taken > +event:0x0A um:zero minimum:500 name:EXC_RETURN : Instruction architecturally executed, condition code check pass, exception return > +event:0x0B um:zero minimum:500 
name:CID_WRITE_RETIRED : Instruction architecturally executed, condition code check pass, write to CONTEXTIDR > +event:0x0C um:zero minimum:5000 name:PC_WRITE_RETIRED : Instruction architecturally executed, condition code check pass, software change of the PC > +event:0x0D um:zero minimum:5000 name:BR_IMMED_RETIRED : Instruction architecturally executed, immediate branch > +event:0x0E um:zero minimum:5000 name:BR_RETURN_RETIRED : Instruction architecturally executed, condition code check pass, procedure return > +event:0x0F um:zero minimum:500 name:UNALIGNED_LDST_RETIRED : Instruction architecturally executed, condition code check pass, unaligned load or store > +event:0x10 um:zero minimum:5000 name:BR_MIS_PRED : Mispredicted or not predicted branch speculatively executed > +event:0x11 um:zero minimum:100000 name:CPU_CYCLES : Cycle > +event:0x12 um:zero minimum:5000 name:BR_PRED : Predictable branch speculatively executed > +event:0x13 um:zero minimum:100000 name:MEM_ACCESS : Data memory access > +event:0x14 um:zero minimum:5000 name:L1I_CACHE : Level 1 instruction cache access > +event:0x15 um:zero minimum:5000 name:L1D_CACHE_WB : Level 1 data cache write-back > +event:0x16 um:zero minimum:5000 name:L2D_CACHE : Level 2 data cache access > +event:0x17 um:zero minimum:5000 name:L2D_CACHE_REFILL : Level 2 data cache refill > +event:0x18 um:zero minimum:5000 name:L2D_CACHE_WB : Level 2 data cache write-back > +event:0x19 um:zero minimum:5000 name:BUS_ACCESS : Bus access > +event:0x1A um:zero minimum:500 name:MEMORY_ERROR : Local memory error > +event:0x1B um:zero minimum:100000 name:INST_SPEC : Operation speculatively executed > +event:0x1C um:zero minimum:5000 name:TTBR_WRITE_RETIRED : Instruction architecturally executed, condition code check pass, write to TTBR > +event:0x1D um:zero minimum:5000 name:BUS_CYCLES : Bus cycle > +event:0x1F um:zero minimum:5000 name:L1D_CACHE_ALLOCATE : Level 1 data cache allocation without refill > +event:0x20 um:zero minimum:5000 
name:L2D_CACHE_ALLOCATE : Level 2 data cache allocation without refill > diff --git a/events/arm/armv8-pmuv3/unit_masks b/events/arm/armv8-pmuv3/unit_masks > new file mode 100644 > index 0000000..7666c35 > --- /dev/null > +++ b/events/arm/armv8-pmuv3/unit_masks > @@ -0,0 +1,4 @@ > +# ARMv8 architected events unit masks > +# > +name:zero type:mandatory default:0x00 > + 0x00 No unit mask > diff --git a/libop/op_cpu_type.c b/libop/op_cpu_type.c > index 1ae2913..396b35d 100644 > --- a/libop/op_cpu_type.c > +++ b/libop/op_cpu_type.c > @@ -129,6 +129,7 @@ static struct cpu_descr const cpu_descrs[MAX_CPU_TYPE] = { > { "e6500", "ppc/e6500", CPU_PPC_E6500, 6 }, > { "Intel Silvermont microarchitecture", "i386/silvermont", CPU_SILVERMONT, 2 }, > { "ARMv7 Krait", "arm/armv7-krait", CPU_ARM_KRAIT, 5 }, > + { "ARM AArch64", "arm/armv8-pmuv3", CPU_ARM_V8_PMUV3, 6 }, > }; > > static size_t const nr_cpu_descrs = sizeof(cpu_descrs) / sizeof(struct cpu_descr); > @@ -394,6 +395,13 @@ static op_cpu _get_arm_cpu_type(void) > return op_get_cpu_number("arm/armv7-ca9"); > case 0xc0f: > return op_get_cpu_number("arm/armv7-ca15"); > + case 0xd00: > + return op_get_cpu_number("arm/armv8-pmuv3"); > + } > + } else if (vendorid == 0x50) { /* Applied Micro Circuits Corpation */ > + switch (cpuid) { > + case 0x000: > + return op_get_cpu_number("arm/armv8-pmuv3"); > } > } else if (vendorid == 0x69) { /* Intel xscale */ > switch (cpuid >> 9) { > @@ -631,7 +639,8 @@ static op_cpu __get_cpu_type_alt_method(void) > (strncmp(uname_info.machine, "ppc64le", 7) == 0)) { > return _get_ppc64_cpu_type(); > } > - if (strncmp(uname_info.machine, "arm", 3) == 0) { > + if (strncmp(uname_info.machine, "arm", 3) == 0 || > + strncmp(uname_info.machine, "aarch64", 7) == 0) { > return _get_arm_cpu_type(); > } > if (strncmp(uname_info.machine, "tile", 4) == 0) { > diff --git a/libop/op_cpu_type.h b/libop/op_cpu_type.h > index 67e16de..733fb26 100644 > --- a/libop/op_cpu_type.h > +++ b/libop/op_cpu_type.h > @@ -109,6 
+109,7 @@ typedef enum { > CPU_PPC_E6500, /**< e6500 */ > CPU_SILVERMONT, /** < Intel Silvermont microarchitecture */ > CPU_ARM_KRAIT, /**< ARM KRAIT */ > + CPU_ARM_V8_PMUV3, /* ARM V8 base architected events */ > MAX_CPU_TYPE > } op_cpu; > > diff --git a/libop/op_events.c b/libop/op_events.c > index 358a154..ee9001b 100644 > --- a/libop/op_events.c > +++ b/libop/op_events.c > @@ -1253,6 +1253,7 @@ void op_default_event(op_cpu cpu_type, struct op_default_event_descr * descr) > case CPU_ARM_SCORPION: > case CPU_ARM_SCORPIONMP: > case CPU_ARM_KRAIT: > + case CPU_ARM_V8_PMUV3: > descr->name = "CPU_CYCLES"; > break; > > diff --git a/utils/opcontrol b/utils/opcontrol > index 38bb1ac..04a4a91 100755 > --- a/utils/opcontrol > +++ b/utils/opcontrol > @@ -400,6 +400,11 @@ do_init() > do_deinit > exit 1 > ;; > + aarch64/*) > + echo "*** ARM AArch64 processors are not supported with opcontrol. Please use operf instead. ***" > + do_deinit > + exit 1 > + ;; > esac > fi > > diff --git a/utils/ophelp.c b/utils/ophelp.c > index af4c1e5..ad42884 100644 > --- a/utils/ophelp.c > +++ b/utils/ophelp.c > @@ -656,6 +656,13 @@ int main(int argc, char const * argv[]) > "Cortex A15 DDI (ARM DDI 0438F, revision r3p1)\n"; > break; > > + case CPU_ARM_V8_PMUV3: > + event_doc = > + "See ARM Architecture Reference Manual \n" > + "ARMv8, for ARMv8-A architecture profile\n" > + "DDI (ARM DDI0487A.a)\n"; > + break; > + > case CPU_PPC64_PA6T: > event_doc = > "See PA6T Power Implementation Features Book IV\n" > |
From: Will D. <wil...@ar...> - 2014-02-07 17:14:59
|
On Fri, Feb 07, 2014 at 05:09:59PM +0000, William Cohen wrote: > Hi Will, Hi Will, > Does the following revised patch address your comments on the future versions of the PMU on ARMv8? Yes, thanks! Acked-by: Will Deacon <wil...@ar...> Will |
From: Maynard J. <may...@us...> - 2014-02-07 23:00:56
|
On 02/04/2014 02:10 PM, William Cohen wrote: > This revised patch addresses Will Deacon's comment about possible > follow on implementations of the pmu unit such as pmuv4 for armv8 > processors. The name is armv8-pmuv3 to match up with what the kernel > reports for perf events. If an aarch64 processors has implementation > specific events, it can be named appropriately and it can include the > events in this patch. > > The "make distcheck" works fine with this version of the patch. > > William Cohen (1): > Provide basic AArch64 support > > events/Makefile.am | 1 + > events/arm/armv8-pmuv3/events | 38 ++++++++++++++++++++++++++++++++++++++ > events/arm/armv8-pmuv3/unit_masks | 4 ++++ Will and Will, Maybe I'm confused. The initial patch had events/arm/armv8-common/. Will D's comment was: The uses of armv8_common and CPU_ARM_V8_COMMON would be more precise if we mentioned pmuv3 in there somewhere. At some point we'll probably get PMUv4, and then the common events might not be common anymore. The events/arm/armv8-pmuv3 events and unit masks files have the comment "ARMv8 architected events" in them. Shouldn't we have a events/arm/armv8-pmuv3-common directory with the actual events defined, and then the processor-specific files (in events/arm/armv8-pmuv3) would just "include" the armv8-pmuv3-common? -Maynard > libop/op_cpu_type.c | 11 ++++++++++- > libop/op_cpu_type.h | 1 + > libop/op_events.c | 1 + > utils/opcontrol | 5 +++++ > utils/ophelp.c | 7 +++++++ > 8 files changed, 67 insertions(+), 1 deletion(-) > create mode 100644 events/arm/armv8-pmuv3/events > create mode 100644 events/arm/armv8-pmuv3/unit_masks > |
From: Will D. <wil...@ar...> - 2014-02-10 10:20:39
|
On Fri, Feb 07, 2014 at 11:00:45PM +0000, Maynard Johnson wrote: > On 02/04/2014 02:10 PM, William Cohen wrote: > > This revised patch addresses Will Deacon's comment about possible > > follow on implementations of the pmu unit such as pmuv4 for armv8 > > processors. The name is armv8-pmuv3 to match up with what the kernel > > reports for perf events. If an aarch64 processors has implementation > > specific events, it can be named appropriately and it can include the > > events in this patch. > > > > The "make distcheck" works fine with this version of the patch. > > > > William Cohen (1): > > Provide basic AArch64 support > > > > events/Makefile.am | 1 + > > events/arm/armv8-pmuv3/events | 38 ++++++++++++++++++++++++++++++++++++++ > > events/arm/armv8-pmuv3/unit_masks | 4 ++++ > Will and Will, Hi Maynard, > Maybe I'm confused. The initial patch had events/arm/armv8-common/. Will D's comment was: > The uses of armv8_common and CPU_ARM_V8_COMMON would be more precise > if we mentioned pmuv3 in there somewhere. At some point we'll probably > get PMUv4, and then the common events might not be common anymore. > > The events/arm/armv8-pmuv3 events and unit masks files have the comment > "ARMv8 architected events" in them. Shouldn't we have a > events/arm/armv8-pmuv3-common directory with the actual events defined, > and then the processor-specific files (in events/arm/armv8-pmuv3) would > just "include" the armv8-pmuv3-common? I don't think this patch adds any processor-specific events, so if/when they appear I was anticipating them including events/arm/armv8-pmuv3/events. So you'd have something like events/arm/armv8-ca57/events which would include:arm/armv8-pmuv3. Does that make sense? Will |
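To make the include idea above concrete, here is a rough sketch of how a reader of an events file might tell an "include:" line (as a future events/arm/armv8-ca57/events would contain) from an "event:" line (as in events/arm/armv8-pmuv3/events). This is not OProfile's parser; classify_line(), struct event_line, and the buffer sizes are all hypothetical and only cover the line shapes quoted in this thread.

```c
#include <stdio.h>
#include <string.h>

enum line_kind { LINE_OTHER, LINE_EVENT, LINE_INCLUDE };

/* Hypothetical record for one parsed line; field sizes are arbitrary. */
struct event_line {
	unsigned int code;	/* from "event:0x11" */
	char um[32];		/* from "um:zero" */
	unsigned long minimum;	/* from "minimum:100000" */
	char name[64];		/* from "name:CPU_CYCLES" */
	char include[64];	/* from "include:arm/armv8-pmuv3" */
};

/*
 * Classify one line of an events file and fill in the matching fields.
 * Comment lines, blank lines and anything unrecognized are LINE_OTHER.
 */
enum line_kind classify_line(const char *line, struct event_line *out)
{
	memset(out, 0, sizeof(*out));

	if (sscanf(line, "include:%63s", out->include) == 1)
		return LINE_INCLUDE;

	if (sscanf(line, "event:%x um:%31s minimum:%lu name:%63s",
		   &out->code, out->um, &out->minimum, out->name) == 4)
		return LINE_EVENT;

	return LINE_OTHER;
}
```

With something like this, a processor-specific file needs only one include line plus its own extra events, and the base file stays the single place where the architected events are listed.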
From: William C. <wc...@re...> - 2014-02-10 14:32:59
|
On 02/10/2014 05:20 AM, Will Deacon wrote: > On Fri, Feb 07, 2014 at 11:00:45PM +0000, Maynard Johnson wrote: >> On 02/04/2014 02:10 PM, William Cohen wrote: >>> This revised patch addresses Will Deacon's comment about possible >>> follow on implementations of the pmu unit such as pmuv4 for armv8 >>> processors. The name is armv8-pmuv3 to match up with what the kernel >>> reports for perf events. If an aarch64 processors has implementation >>> specific events, it can be named appropriately and it can include the >>> events in this patch. >>> >>> The "make distcheck" works fine with this version of the patch. >>> >>> William Cohen (1): >>> Provide basic AArch64 support >>> >>> events/Makefile.am | 1 + >>> events/arm/armv8-pmuv3/events | 38 ++++++++++++++++++++++++++++++++++++++ >>> events/arm/armv8-pmuv3/unit_masks | 4 ++++ >> Will and Will, > > Hi Maynard, > >> Maybe I'm confused. The initial patch had events/arm/armv8-common/. Will D's comment was: >> The uses of armv8_common and CPU_ARM_V8_COMMON would be more precise >> if we mentioned pmuv3 in there somewhere. At some point we'll probably >> get PMUv4, and then the common events might not be common anymore. >> >> The events/arm/armv8-pmuv3 events and unit masks files have the comment >> "ARMv8 architected events" in them. Shouldn't we have a >> events/arm/armv8-pmuv3-common directory with the actual events defined, >> and then the processor-specific files (in events/arm/armv8-pmuv3) would >> just "include" the armv8-pmuv3-common? > > I don't think this patch adds any processor-specific events, so if/when they > appear I was anticipating them including events/arm/armv8-pmuv3/events. > > So you'd have something like events/arm/armv8-ca57/events which would > include:arm/armv8-pmuv3. > > Does that make sense? > > Will > Hi Will and Maynard, The events described in the patch are the events described in the ARM Architecture Reference Manual ARMv8 manual. 
"common" was dropped so that the naming matches what the Linux kernel uses and to reduce name variations. There certainly can be armv8 processors that have additional events. The recently available "ARM Cortex-A57 MPCore Processor Technical Reference Manual" (http://infocenter.arm.com/help/topic/com.arm.doc.ddi0488d/DDI0488D_cortex_a57_mpcore_r1p1_trm.pdf) lists additional events in Section "11.8 Events" when compared to the "ARM Architecture Reference Manual ARMv8, for ARMv8-A architecture profile" (DDI0487A_a_armv8_arm.pdf). The expectation would be that there would be a /usr/share/oprofile/arm/armv8-a57/events file that would include the armv8-pmuv3 events. -Will |
From: Maynard J. <may...@us...> - 2014-02-10 17:05:31
|
On 02/10/2014 10:15 AM, William Cohen wrote: > On 02/10/2014 10:18 AM, Maynard Johnson wrote: >> On 02/10/2014 08:32 AM, William Cohen wrote: >>> On 02/10/2014 05:20 AM, Will Deacon wrote: >>>> On Fri, Feb 07, 2014 at 11:00:45PM +0000, Maynard Johnson wrote: >>>>> On 02/04/2014 02:10 PM, William Cohen wrote: >>>>>> This revised patch addresses Will Deacon's comment about possible >>>>>> follow on implementations of the pmu unit such as pmuv4 for armv8 >>>>>> processors. The name is armv8-pmuv3 to match up with what the kernel >>>>>> reports for perf events. If an aarch64 processors has implementation >>>>>> specific events, it can be named appropriately and it can include the >>>>>> events in this patch. >>>>>> >>>>>> The "make distcheck" works fine with this version of the patch. >>>>>> >>>>>> William Cohen (1): >>>>>> Provide basic AArch64 support >>>>>> >>>>>> events/Makefile.am | 1 + >>>>>> events/arm/armv8-pmuv3/events | 38 ++++++++++++++++++++++++++++++++++++++ >>>>>> events/arm/armv8-pmuv3/unit_masks | 4 ++++ >>>>> Will and Will, >>>> >>>> Hi Maynard, >>>> >>>>> Maybe I'm confused. The initial patch had events/arm/armv8-common/. Will D's comment was: >>>>> The uses of armv8_common and CPU_ARM_V8_COMMON would be more precise >>>>> if we mentioned pmuv3 in there somewhere. At some point we'll probably >>>>> get PMUv4, and then the common events might not be common anymore. >>>>> >>>>> The events/arm/armv8-pmuv3 events and unit masks files have the comment >>>>> "ARMv8 architected events" in them. Shouldn't we have a >>>>> events/arm/armv8-pmuv3-common directory with the actual events defined, >>>>> and then the processor-specific files (in events/arm/armv8-pmuv3) would >>>>> just "include" the armv8-pmuv3-common? >>>> >>>> I don't think this patch adds any processor-specific events, so if/when they >>>> appear I was anticipating them including events/arm/armv8-pmuv3/events. 
>>>> >>>> So you'd have something like events/arm/armv8-ca57/events which would >>>> include:arm/armv8-pmuv3. >> I know I'm beating a dead horse, but nothing either of you have said has made it clear to me why this scenario is different from the various armv7 processor models including armv7-common/events. If there's no clear distinction, then why veer from the existing pattern of directory naming, where said directory will include events and unit mask files that are intended (expected?) to be included by processor specific events/unit mask files? Granted, the pattern is not set in concrete -- ARM uses "common" in the directory name; Intel uses "arch_perfmon"; current IBM Power uses "architected_events_v1" -- but the names make the intent pretty clear. >> >> -Maynard > > Hi Maynard, > > I don't have a significant preference for the name. I would just like to come to a consensus on the name and get the patch merged in. There are the following possible choices: > > armv8: > armv8-common: (name scheme following earlier armvN processors in OProfile) > armv8-pmuv3: (name scheme used by the upstream kernel) > armv8-common-pmuv3: > > The last patch used armv8-pmuv3 to match the name used in the kernel, "arm/armv8-pmuv3" . However, the kernel is not particularly consistent in its naming of arm processors PMUs. The 32-bit arm processors PMUs have names "ARMv7 Cortex-A<N>", "v6", and "v6mpcore". > > All of the current arm processor have specific events sets and none display the armv7-common as the name of the events. The krait just includes the armv7-common. > > Of those choices above what would a user perfer to see? Is something with "armv8-common-pmuv3" or something with "common" in the name that the user wants to see? Will, if we employed a "common" directory to hold these events, we would still want to have another directory with processor specific events/unit masks (which would then "include:" the new common dir). 
So the user would not *see* the common dir -- just the processor specific one. See my response to Will D for more details. -Maynard > > -Will > |
From: Maynard J. <may...@us...> - 2014-02-11 15:28:42
|
On 02/11/2014 04:25 AM, Will Deacon wrote: > On Mon, Feb 10, 2014 at 10:43:55PM +0000, William Cohen wrote: >> On 02/10/2014 05:01 PM, Maynard Johnson wrote: >>> On 02/10/2014 01:37 PM, William Cohen wrote: >>>> On 02/10/2014 12:04 PM, Maynard Johnson wrote: >>> Close. Will D's preference was to make an armv8-ca57 directory now (as >>> the processor-specific implementation). So I swizzled your patch to >>> make appropriate changes. And here it is below. It compiles and runs >>> 'make distcheck'. *Will* and *Will*, once I receive your acks, I'll >>> push this upstream. Thanks! >>> >> >> The documentation on the applied micro x-gene that I have access doesn't >> state that it supports the additional cortex-a57 events and I am not sure >> that the applied micro x-gene supports those events. The patch was an >> attempt to be conservative and not assume that applied micro x-gene is the >> same as ARMv8 cortex a-57 by just the base armv8 events. However, maybe >> Will Deacon knows whether this is acceptable for the AP x-gene. > > x-gene will likely have its own set of events but, without documentation, > there's not much we can do to support the extra events there. > > However... > >>> events/Makefile.am | 2 + >>> events/arm/armv8-ca57/events | 7 +++++ >>> events/arm/armv8-ca57/unit_masks | 3 ++ >>> events/arm/armv8-pmuv3-common/events | 38 ++++++++++++++++++++++++++++++ >>> events/arm/armv8-pmuv3-common/unit_masks | 4 +++ > > [...] > >>> @@ -394,6 +395,13 @@ static op_cpu _get_arm_cpu_type(void) >>> return op_get_cpu_number("arm/armv7-ca9"); >>> case 0xc0f: >>> return op_get_cpu_number("arm/armv7-ca15"); >>> + case 0xd00: >>> + return op_get_cpu_number("arm/armv8-pmuv3"); >>> + } >>> + } else if (vendorid == 0x50) { /* Applied Micro Circuits Corpation */ >>> + switch (cpuid) { >>> + case 0x000: >>> + return op_get_cpu_number("arm/armv8-pmuv3"); > > ...these strings won't hit in the events directory. In the absence of x-gene Good catch! 
> specific events, we should probably direct the latter case at > "arm/armv8-pmuv3-common" and remove the first case (0xd00), since it will be > updated to point at, for example, the ca57 events in future. Unfortunately, with the patch that I most recently posted in this thread, that won't work because calling op_get_cpu_number("arm/armv8-pmuv3-common") will return CPU_NO_GOOD (note below there is no cpu_descr with a name field of "arm/armv8-pmuv3-common"): @@ -129,6 +129,7 @@ static struct cpu_descr const cpu_descrs[MAX_CPU_TYPE] = { { "e6500", "ppc/e6500", CPU_PPC_E6500, 6 }, { "Intel Silvermont microarchitecture", "i386/silvermont", CPU_SILVERMONT, 2 }, { "ARMv7 Krait", "arm/armv7-krait", CPU_ARM_KRAIT, 5 }, + { "ARM AArch64", "arm/armv8-ca57", CPU_ARM_V8_CA57, 6 }, }; Perhaps we should go back to the idea of defining just a "common" armv8 processor type, and skip the idea of trying to shoehorn in a pseudo processor-specific type when there is none yet. So the above patch excerpt becomes: @@ -129,6 +129,7 @@ static struct cpu_descr const cpu_descrs[MAX_CPU_TYPE] = { { "e6500", "ppc/e6500", CPU_PPC_E6500, 6 }, { "Intel Silvermont microarchitecture", "i386/silvermont", CPU_SILVERMONT, 2 }, { "ARMv7 Krait", "arm/armv7-krait", CPU_ARM_KRAIT, 5 }, + { "ARMv8 pmuv3 Common", "arm/armv8-pmuv3-common", CPU_ARM_V8_PMUV3_COMMON, 6 }, ^--- is this acceptable? This is what would be displayed on the first line of 'ophelp' for the CPU type. }; And of course we would remove the events/arm/armv8-ca57 events and unit_masks files, too. -Maynard > > Will > |
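The lookup failure described above can be shown with a cut-down table. The sketch below is an assumption-laden stand-in, not the libop source: lookup_cpu_number() is a hypothetical name playing the role of op_get_cpu_number(), the enum lists only two CPU types, and the struct field names are invented. It reuses the two table rows quoted in the message.

```c
#include <stddef.h>
#include <string.h>

/* Reduced stand-in for op_cpu; only the values needed here. */
typedef enum {
	CPU_NO_GOOD = -1,	/* no matching descriptor */
	CPU_ARM_KRAIT,
	CPU_ARM_V8_CA57,
	MAX_CPU_TYPE
} op_cpu;

/* Field names are hypothetical; the order follows the quoted rows. */
struct cpu_descr {
	const char *pretty;	/* human-readable CPU name */
	const char *name;	/* events directory name */
	op_cpu cpu;
	unsigned int nr_counters;
};

static const struct cpu_descr cpu_descrs[] = {
	{ "ARMv7 Krait", "arm/armv7-krait", CPU_ARM_KRAIT, 5 },
	{ "ARM AArch64", "arm/armv8-ca57", CPU_ARM_V8_CA57, 6 },
};

/*
 * Hypothetical equivalent of a name-to-type lookup: scan the table for an
 * exact match on the name field and return CPU_NO_GOOD if none is found.
 */
op_cpu lookup_cpu_number(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(cpu_descrs) / sizeof(cpu_descrs[0]); i++) {
		if (strcmp(cpu_descrs[i].name, name) == 0)
			return cpu_descrs[i].cpu;
	}
	return CPU_NO_GOOD;
}
```

With this table, both "arm/armv8-pmuv3" and "arm/armv8-pmuv3-common" fall through to CPU_NO_GOOD, because a directory of event files is not enough on its own; a name only resolves if some table row carries it.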
From: Will D. <wil...@ar...> - 2014-02-11 15:37:33
|
On Tue, Feb 11, 2014 at 03:28:11PM +0000, Maynard Johnson wrote: > On 02/11/2014 04:25 AM, Will Deacon wrote: > >>> @@ -394,6 +395,13 @@ static op_cpu _get_arm_cpu_type(void) > >>> return op_get_cpu_number("arm/armv7-ca9"); > >>> case 0xc0f: > >>> return op_get_cpu_number("arm/armv7-ca15"); > >>> + case 0xd00: > >>> + return op_get_cpu_number("arm/armv8-pmuv3"); > >>> + } > >>> + } else if (vendorid == 0x50) { /* Applied Micro Circuits Corpation */ > >>> + switch (cpuid) { > >>> + case 0x000: > >>> + return op_get_cpu_number("arm/armv8-pmuv3"); > > > > ...these strings won't hit in the events directory. In the absence of x-gene > Good catch! > > specific events, we should probably direct the latter case at > > "arm/armv8-pmuv3-common" and remove the first case (0xd00), since it will be > > updated to point at, for example, the ca57 events in future. > Unfortunately, with the patch that I most recently posted in this thread, > that won't work because calling > op_get_cpu_number("arm/armv8-pmuv3-common") will return CPU_NO_GOOD (note > below there is no cpu_descr with a name field of > "arm/armv8-pmuv3-common"): > > @@ -129,6 +129,7 @@ static struct cpu_descr const cpu_descrs[MAX_CPU_TYPE] = { > { "e6500", "ppc/e6500", CPU_PPC_E6500, 6 }, > { "Intel Silvermont microarchitecture", "i386/silvermont", CPU_SILVERMONT, 2 }, > { "ARMv7 Krait", "arm/armv7-krait", CPU_ARM_KRAIT, 5 }, > + { "ARM AArch64", "arm/armv8-ca57", CPU_ARM_V8_CA57, 6 }, > }; > > Perhaps we should go back to the idea of defining just a "common" armv8 > processor type, and skip the idea of trying to shoehorn in a pseudo > processor-specific type when there is none yet. 
So the above patch > excerpt becomes: > > @@ -129,6 +129,7 @@ static struct cpu_descr const cpu_descrs[MAX_CPU_TYPE] = { > { "e6500", "ppc/e6500", CPU_PPC_E6500, 6 }, > { "Intel Silvermont microarchitecture", "i386/silvermont", CPU_SILVERMONT, 2 }, > { "ARMv7 Krait", "arm/armv7-krait", CPU_ARM_KRAIT, 5 }, > + { "ARMv8 pmuv3 Common", "arm/armv8-pmuv3-common", CPU_ARM_V8_PMUV3_COMMON, 6 }, > ^--- is this acceptable? This is what would be displayed on the first line of 'ophelp' for the CPU type. > }; > > And of course we would remove the events/arm/armv8-ca57 events and > unit_masks files, too. I'd rather just add x-gene as the CPU and point it directly at the common events, since there's no publicly available documentation and it is the CPU which Will is playing with. e.g: { "APM X-Gene", "arm/armv8-pmuv3-common", CPU_ARM_V8_APM_XGENE, 6 }, If/when that documentation becomes available, then we can add the relevant files and redirect the lookup table to those. An alternative is to add the dummy files for X-gene instead of Cortex-A57, and include them in this series. Thoughts/opinions? Will |
From: Maynard J. <may...@us...> - 2014-02-11 17:36:34
|
On 02/11/2014 09:37 AM, Will Deacon wrote:
> On Tue, Feb 11, 2014 at 03:28:11PM +0000, Maynard Johnson wrote:
>> On 02/11/2014 04:25 AM, Will Deacon wrote:
>>>>> @@ -394,6 +395,13 @@ static op_cpu _get_arm_cpu_type(void)
>>>>>          return op_get_cpu_number("arm/armv7-ca9");
>>>>>      case 0xc0f:
>>>>>          return op_get_cpu_number("arm/armv7-ca15");
>>>>> +    case 0xd00:
>>>>> +        return op_get_cpu_number("arm/armv8-pmuv3");
>>>>> +    }
>>>>> + } else if (vendorid == 0x50) { /* Applied Micro Circuits Corporation */
>>>>> +    switch (cpuid) {
>>>>> +    case 0x000:
>>>>> +        return op_get_cpu_number("arm/armv8-pmuv3");
>>>
>>> ...these strings won't hit in the events directory. In the absence of x-gene
>> Good catch!
>>> specific events, we should probably direct the latter case at
>>> "arm/armv8-pmuv3-common" and remove the first case (0xd00), since it will be
>>> updated to point at, for example, the ca57 events in future.
>> Unfortunately, with the patch that I most recently posted in this thread,
>> that won't work because calling
>> op_get_cpu_number("arm/armv8-pmuv3-common") will return CPU_NO_GOOD (note
>> below there is no cpu_descr with a name field of
>> "arm/armv8-pmuv3-common"):
>>
>> @@ -129,6 +129,7 @@ static struct cpu_descr const cpu_descrs[MAX_CPU_TYPE] = {
>>    { "e6500", "ppc/e6500", CPU_PPC_E6500, 6 },
>>    { "Intel Silvermont microarchitecture", "i386/silvermont", CPU_SILVERMONT, 2 },
>>    { "ARMv7 Krait", "arm/armv7-krait", CPU_ARM_KRAIT, 5 },
>> +  { "ARM AArch64", "arm/armv8-ca57", CPU_ARM_V8_CA57, 6 },
>>  };
>>
>> Perhaps we should go back to the idea of defining just a "common" armv8
>> processor type, and skip the idea of trying to shoehorn in a pseudo
>> processor-specific type when there is none yet. So the above patch
>> excerpt becomes:
>>
>> @@ -129,6 +129,7 @@ static struct cpu_descr const cpu_descrs[MAX_CPU_TYPE] = {
>>    { "e6500", "ppc/e6500", CPU_PPC_E6500, 6 },
>>    { "Intel Silvermont microarchitecture", "i386/silvermont", CPU_SILVERMONT, 2 },
>>    { "ARMv7 Krait", "arm/armv7-krait", CPU_ARM_KRAIT, 5 },
>> +  { "ARMv8 pmuv3 Common", "arm/armv8-pmuv3-common", CPU_ARM_V8_PMUV3_COMMON, 6 },
>>        ^--- is this acceptable? This is what would be displayed on the first line of 'ophelp' for the CPU type.
>>  };
>>
>> And of course we would remove the events/arm/armv8-ca57 events and
>> unit_masks files, too.
>
> I'd rather just add x-gene as the CPU and point it directly at the common
> events, since there's no publicly available documentation and it is the CPU
> which Will is playing with.
>
> e.g:
>   { "APM X-Gene", "arm/armv8-pmuv3-common", CPU_ARM_V8_APM_XGENE, 6 },
>
> If/when that documentation becomes available, then we can add the relevant
> files and redirect the lookup table to those. An alternative is to add the
> dummy files for X-gene instead of Cortex-A57, and include them in this
> series.

Let's do the latter. I'll re-swizzle the patch and post it for one final review.

-Maynard

>
> Thoughts/opinions?
>
> Will
>
|
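The CPU_NO_GOOD result discussed in this message follows from the lookup being a plain string match against the name field of each table entry. The following is only a minimal sketch of that behaviour, not the actual libop code: the trimmed-down enum, the field names pretty/name/cpu/nr_counters, the function name sketch_get_cpu_number() and the "arm/armv8-xgene" directory name are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for OProfile's op_cpu enum. */
typedef enum {
    CPU_NO_GOOD = -1,
    CPU_ARM_KRAIT,
    CPU_ARM_V8_APM_XGENE,
    MAX_CPU_TYPE
} op_cpu;

/* Assumed layout: display name, events directory, enum value, counters. */
struct cpu_descr {
    const char *pretty;        /* first line printed by ophelp */
    const char *name;          /* events directory, e.g. "arm/armv7-krait" */
    op_cpu cpu;
    unsigned int nr_counters;
};

static const struct cpu_descr cpu_descrs[MAX_CPU_TYPE] = {
    { "ARMv7 Krait", "arm/armv7-krait", CPU_ARM_KRAIT, 5 },
    /* "arm/armv8-xgene" is a made-up directory name for this sketch. */
    { "APM X-Gene", "arm/armv8-xgene", CPU_ARM_V8_APM_XGENE, 6 },
};

/* Return the CPU type whose name field matches cpu_string exactly,
 * or CPU_NO_GOOD when no table entry carries that name. */
op_cpu sketch_get_cpu_number(const char *cpu_string)
{
    for (size_t i = 0; i < (size_t)MAX_CPU_TYPE; i++) {
        if (cpu_descrs[i].name != NULL &&
            strcmp(cpu_descrs[i].name, cpu_string) == 0)
            return cpu_descrs[i].cpu;
    }
    return CPU_NO_GOOD;
}
```

With this table, a lookup of "arm/armv8-pmuv3-common" fails because that string only names a directory of shared events and no entry lists it as its name, which is the situation Maynard describes.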
From: Maynard J. <may...@us...> - 2014-02-10 15:18:36
|
On 02/10/2014 08:32 AM, William Cohen wrote:
> On 02/10/2014 05:20 AM, Will Deacon wrote:
>> On Fri, Feb 07, 2014 at 11:00:45PM +0000, Maynard Johnson wrote:
>>> On 02/04/2014 02:10 PM, William Cohen wrote:
>>>> This revised patch addresses Will Deacon's comment about possible
>>>> follow on implementations of the pmu unit such as pmuv4 for armv8
>>>> processors. The name is armv8-pmuv3 to match up with what the kernel
>>>> reports for perf events. If an aarch64 processors has implementation
>>>> specific events, it can be named appropriately and it can include the
>>>> events in this patch.
>>>>
>>>> The "make distcheck" works fine with this version of the patch.
>>>>
>>>> William Cohen (1):
>>>>   Provide basic AArch64 support
>>>>
>>>>  events/Makefile.am                | 1 +
>>>>  events/arm/armv8-pmuv3/events     | 38 ++++++++++++++++++++++++++++++++++++++
>>>>  events/arm/armv8-pmuv3/unit_masks | 4 ++++
>>> Will and Will,
>>
>> Hi Maynard,
>>
>>> Maybe I'm confused. The initial patch had events/arm/armv8-common/. Will D's comment was:
>>>    The uses of armv8_common and CPU_ARM_V8_COMMON would be more precise
>>>    if we mentioned pmuv3 in there somewhere. At some point we'll probably
>>>    get PMUv4, and then the common events might not be common anymore.
>>>
>>> The events/arm/armv8-pmuv3 events and unit masks files have the comment
>>> "ARMv8 architected events" in them. Shouldn't we have a
>>> events/arm/armv8-pmuv3-common directory with the actual events defined,
>>> and then the processor-specific files (in events/arm/armv8-pmuv3) would
>>> just "include" the armv8-pmuv3-common?
>>
>> I don't think this patch adds any processor-specific events, so if/when they
>> appear I was anticipating them including events/arm/armv8-pmuv3/events.
>>
>> So you'd have something like events/arm/armv8-ca57/events which would
>> include:arm/armv8-pmuv3.

I know I'm beating a dead horse, but nothing either of you have said has made it clear to me why this scenario is different from the various armv7 processor models including armv7-common/events. If there's no clear distinction, then why veer from the existing pattern of directory naming, where said directory will include events and unit mask files that are intended (expected?) to be included by processor specific events/unit mask files? Granted, the pattern is not set in concrete -- ARM uses "common" in the directory name; Intel uses "arch_perfmon"; current IBM Power uses "architected_events_v1" -- but the names make the intent pretty clear.

-Maynard

>>
>> Does that make sense?
>>
>> Will
>>
>
> Hi Will and Maynard,
>
> The events described in the patch are the events described in the ARM Architecture Reference Manual ARMv8. Dropping "common" was meant to make the naming match what the Linux kernel uses and to reduce name variations. There certainly can be armv8 processors that have additional events. The recently available "ARM Cortex-A57 MPCore Processor Technical Reference Manual" ( http://infocenter.arm.com/help/topic/com.arm.doc.ddi0488d/DDI0488D_cortex_a57_mpcore_r1p1_trm.pdf) lists additional events in Section "11.8 Events" when compared to the "ARM Architecture Reference Manual ARMv8, for ARMv8-A architecture profile" (DDI0487A_a_armv8_arm.pdf). The expectation is that there would be a /usr/share/oprofile/arm/armv8-a57/events file that would include the armv8-pmuv3 events.
>
> -Will
>
|
From: Will D. <wil...@ar...> - 2014-02-10 16:13:23
|
On Mon, Feb 10, 2014 at 03:18:24PM +0000, Maynard Johnson wrote:
> On 02/10/2014 08:32 AM, William Cohen wrote:
> > On 02/10/2014 05:20 AM, Will Deacon wrote:
> >> I don't think this patch adds any processor-specific events, so if/when they
> >> appear I was anticipating them including events/arm/armv8-pmuv3/events.
> >>
> >> So you'd have something like events/arm/armv8-ca57/events which would
> >> include:arm/armv8-pmuv3.
>
> I know I'm beating a dead horse, but nothing either of you have said has
> made it clear to me why this scenario is different from the various armv7
> processor models including armv7-common/events. If there's no clear
> distinction, then why veer from the existing pattern of directory naming,
> where said directory will include events and unit mask files that are
> intended (expected?) to be included by processor specific events/unit mask
> files? Granted, the pattern is not set in concrete -- ARM uses "common"
> in the directory name; Intel uses "arch_perfmon"; current IBM Power uses
> "architected_events_v1" -- but the names make the intent pretty clear.

When we added armv7-common, there was a single, architecturally defined set
of events and `armv7-common' made sense. However, that later got named
`pmuv1' and `pmuv2' was added with the v7.1 architecture. v8 has pmuv3 and
I'm pretty sure we'll get a pmuv4 in the future. Rather than try to squeeze
this all under `armv8-common', the benefit of hindsight lets us use pmuvN to
describe current and future events with ease.

That said, I can also see the argument for not deviating from the armv7
stuff, so if you want to stick with that we can try it too (but I worry that
it will become a mess in due course).

Will
|
From: William C. <wc...@re...> - 2014-02-10 16:15:58
|
On 02/10/2014 10:18 AM, Maynard Johnson wrote:
> On 02/10/2014 08:32 AM, William Cohen wrote:
>> On 02/10/2014 05:20 AM, Will Deacon wrote:
>>> On Fri, Feb 07, 2014 at 11:00:45PM +0000, Maynard Johnson wrote:
>>>> On 02/04/2014 02:10 PM, William Cohen wrote:
>>>>> This revised patch addresses Will Deacon's comment about possible
>>>>> follow on implementations of the pmu unit such as pmuv4 for armv8
>>>>> processors. The name is armv8-pmuv3 to match up with what the kernel
>>>>> reports for perf events. If an aarch64 processors has implementation
>>>>> specific events, it can be named appropriately and it can include the
>>>>> events in this patch.
>>>>>
>>>>> The "make distcheck" works fine with this version of the patch.
>>>>>
>>>>> William Cohen (1):
>>>>>   Provide basic AArch64 support
>>>>>
>>>>>  events/Makefile.am                | 1 +
>>>>>  events/arm/armv8-pmuv3/events     | 38 ++++++++++++++++++++++++++++++++++++++
>>>>>  events/arm/armv8-pmuv3/unit_masks | 4 ++++
>>>> Will and Will,
>>>
>>> Hi Maynard,
>>>
>>>> Maybe I'm confused. The initial patch had events/arm/armv8-common/. Will D's comment was:
>>>>    The uses of armv8_common and CPU_ARM_V8_COMMON would be more precise
>>>>    if we mentioned pmuv3 in there somewhere. At some point we'll probably
>>>>    get PMUv4, and then the common events might not be common anymore.
>>>>
>>>> The events/arm/armv8-pmuv3 events and unit masks files have the comment
>>>> "ARMv8 architected events" in them. Shouldn't we have a
>>>> events/arm/armv8-pmuv3-common directory with the actual events defined,
>>>> and then the processor-specific files (in events/arm/armv8-pmuv3) would
>>>> just "include" the armv8-pmuv3-common?
>>>
>>> I don't think this patch adds any processor-specific events, so if/when they
>>> appear I was anticipating them including events/arm/armv8-pmuv3/events.
>>>
>>> So you'd have something like events/arm/armv8-ca57/events which would
>>> include:arm/armv8-pmuv3.
> I know I'm beating a dead horse, but nothing either of you have said has made it clear to me why this scenario is different from the various armv7 processor models including armv7-common/events. If there's no clear distinction, then why veer from the existing pattern of directory naming, where said directory will include events and unit mask files that are intended (expected?) to be included by processor specific events/unit mask files? Granted, the pattern is not set in concrete -- ARM uses "common" in the directory name; Intel uses "arch_perfmon"; current IBM Power uses "architected_events_v1" -- but the names make the intent pretty clear.
>
> -Maynard

Hi Maynard,

I don't have a strong preference for the name. I would just like to reach a consensus on the name and get the patch merged. The possible choices are:

armv8:
armv8-common: (naming scheme following earlier armvN processors in OProfile)
armv8-pmuv3: (naming scheme used by the upstream kernel)
armv8-common-pmuv3:

The last patch used armv8-pmuv3 to match the name used in the kernel, "arm/armv8-pmuv3". However, the kernel is not particularly consistent in its naming of ARM processor PMUs. The 32-bit ARM processor PMUs have names such as "ARMv7 Cortex-A<N>", "v6", and "v6mpcore".

All of the current ARM processors have specific event sets, and none displays armv7-common as the name of the events. The Krait just includes armv7-common.

Of the choices above, which would a user prefer to see? Does the user want to see "armv8-common-pmuv3", or something else with "common" in the name?

-Will
|
From: Maynard J. <may...@us...> - 2014-02-10 16:51:24
|
On 02/10/2014 10:13 AM, Will Deacon wrote:
> On Mon, Feb 10, 2014 at 03:18:24PM +0000, Maynard Johnson wrote:
>> On 02/10/2014 08:32 AM, William Cohen wrote:
>>> On 02/10/2014 05:20 AM, Will Deacon wrote:
>>>> I don't think this patch adds any processor-specific events, so if/when they
>>>> appear I was anticipating them including events/arm/armv8-pmuv3/events.
>>>>
>>>> So you'd have something like events/arm/armv8-ca57/events which would
>>>> include:arm/armv8-pmuv3.
>>
>> I know I'm beating a dead horse, but nothing either of you have said has
>> made it clear to me why this scenario is different from the various armv7
>> processor models including armv7-common/events. If there's no clear
>> distinction, then why veer from the existing pattern of directory naming,
>> where said directory will include events and unit mask files that are
>> intended (expected?) to be included by processor specific events/unit mask
>> files? Granted, the pattern is not set in concrete -- ARM uses "common"
>> in the directory name; Intel uses "arch_perfmon"; current IBM Power uses
>> "architected_events_v1" -- but the names make the intent pretty clear.
>
> When we added armv7-common, there was a single, architecturally defined set
> of events and `armv7-common' made sense. However, that later got named
> `pmuv1' and `pmuv2' was added with the v7.1 architecture. v8 has pmuv3 and
> I'm pretty sure we'll get a pmuv4 in the future. Rather than try to squeeze
> this all under `armv8-common',

Just to clarify, that's not what I was suggesting. I suggested "armv8-pmuv3-common".

I realize what's complicating the issue is that the patch doesn't contain any truly processor-specific events. So, if we go with an armv8-pmuv3-common directory to hold the actual events, we could create a pseudo (probably temporary) processor-specific directory (e.g., armv8-cx) that would have no events/unit masks for now and would just "include:" the armv8-pmuv3-common files.

That said, if you aren't enamored with that idea, then nack it, and I'll shut up. ;-)

-Maynard

> the benefit of hindsight lets us use pmuvN to
> describe current and future events with ease.
>
> That said, I can also see the argument for not deviating from the armv7
> stuff, so if you want to stick with that we can try it too (but I worry that
> it will become a mess in due course).
>
> Will
>
|
From: Will D. <wil...@ar...> - 2014-02-10 16:55:26
|
On Mon, Feb 10, 2014 at 04:51:05PM +0000, Maynard Johnson wrote:
> On 02/10/2014 10:13 AM, Will Deacon wrote:
> > On Mon, Feb 10, 2014 at 03:18:24PM +0000, Maynard Johnson wrote:
> >> On 02/10/2014 08:32 AM, William Cohen wrote:
> >>> On 02/10/2014 05:20 AM, Will Deacon wrote:
> >>>> I don't think this patch adds any processor-specific events, so if/when they
> >>>> appear I was anticipating them including events/arm/armv8-pmuv3/events.
> >>>>
> >>>> So you'd have something like events/arm/armv8-ca57/events which would
> >>>> include:arm/armv8-pmuv3.
> >>
> >> I know I'm beating a dead horse, but nothing either of you have said has
> >> made it clear to me why this scenario is different from the various armv7
> >> processor models including armv7-common/events. If there's no clear
> >> distinction, then why veer from the existing pattern of directory naming,
> >> where said directory will include events and unit mask files that are
> >> intended (expected?) to be included by processor specific events/unit mask
> >> files? Granted, the pattern is not set in concrete -- ARM uses "common"
> >> in the directory name; Intel uses "arch_perfmon"; current IBM Power uses
> >> "architected_events_v1" -- but the names make the intent pretty clear.
> >
> > When we added armv7-common, there was a single, architecturally defined set
> > of events and `armv7-common' made sense. However, that later got named
> > `pmuv1' and `pmuv2' was added with the v7.1 architecture. v8 has pmuv3 and
> > I'm pretty sure we'll get a pmuv4 in the future. Rather than try to squeeze
> > this all under `armv8-common',
> Just to clarify, that's not what I was suggesting. I suggested "armv8-pmuv3-common".

Sorry, I missed that. I don't have strong opinions on the name as long as it
contains `pmuv3' as a substring somewhere.

> I realize what's complicating the issue is that the patch doesn't contain
> any truly processor-specific events. So, if we go with an
> armv8-pmuv3-common directory to hold the actual events, we could create a
> pseudo (probably temporary) processor-specific directory (e.g., armv8-cx)
> that would have no events/unit masks for now and would just "include:" the
> armv8-pmuv3-common files.
>
> That said, if you aren't enamored with that idea, then nack it, and I'll shut up. ;-)

How about we meet halfway and add armv8-ca57 instead? :)

Then, I can add the Cortex-A57 events in a later patch.

Will
|
From: Maynard J. <may...@us...> - 2014-02-10 17:05:19
|
On 02/10/2014 10:55 AM, Will Deacon wrote:
> On Mon, Feb 10, 2014 at 04:51:05PM +0000, Maynard Johnson wrote:
>> On 02/10/2014 10:13 AM, Will Deacon wrote:
>>> On Mon, Feb 10, 2014 at 03:18:24PM +0000, Maynard Johnson wrote:
>>>> On 02/10/2014 08:32 AM, William Cohen wrote:
>>>>> On 02/10/2014 05:20 AM, Will Deacon wrote:
>>>>>> I don't think this patch adds any processor-specific events, so if/when they
>>>>>> appear I was anticipating them including events/arm/armv8-pmuv3/events.
>>>>>>
>>>>>> So you'd have something like events/arm/armv8-ca57/events which would
>>>>>> include:arm/armv8-pmuv3.
>>>>
>>>> I know I'm beating a dead horse, but nothing either of you have said has
>>>> made it clear to me why this scenario is different from the various armv7
>>>> processor models including armv7-common/events. If there's no clear
>>>> distinction, then why veer from the existing pattern of directory naming,
>>>> where said directory will include events and unit mask files that are
>>>> intended (expected?) to be included by processor specific events/unit mask
>>>> files? Granted, the pattern is not set in concrete -- ARM uses "common"
>>>> in the directory name; Intel uses "arch_perfmon"; current IBM Power uses
>>>> "architected_events_v1" -- but the names make the intent pretty clear.
>>>
>>> When we added armv7-common, there was a single, architecturally defined set
>>> of events and `armv7-common' made sense. However, that later got named
>>> `pmuv1' and `pmuv2' was added with the v7.1 architecture. v8 has pmuv3 and
>>> I'm pretty sure we'll get a pmuv4 in the future. Rather than try to squeeze
>>> this all under `armv8-common',
>> Just to clarify, that's not what I was suggesting. I suggested "armv8-pmuv3-common".
>
> Sorry, I missed that. I don't have strong opinions on the name as long as it
> contains `pmuv3' as a substring somewhere.
>
>> I realize what's complicating the issue is that the patch doesn't contain
>> any truly processor-specific events. So, if we go with an
>> armv8-pmuv3-common directory to hold the actual events, we could create a
>> pseudo (probably temporary) processor-specific directory (e.g., armv8-cx)
>> that would have no events/unit masks for now and would just "include:" the
>> armv8-pmuv3-common files.

I forgot to mention that when we have some processor-specific events from a particular armv8-pmuv3 implementation (e.g., armv8-ca57), we could rename the pseudo processor-specific directory to armv8-ca57 and place the new events in there.

>>
>> That said, if you aren't enamored with that idea, then nack it, and I'll shut up. ;-)
>
> How about we meet half way and we add armv8-ca57 instead? :)
>
> Then, I can add the Cortex-A57 events in a later patch.

I presume CA57 is next out of the block, so that sounds good to me. Thanks!

-Maynard

>
> Will
>
|
From: William C. <wc...@re...> - 2014-02-10 19:37:57
Attachments:
0001-Provide-basic-AArch64-ARMv8-support.patch
|
On 02/10/2014 12:04 PM, Maynard Johnson wrote:
> On 02/10/2014 10:15 AM, William Cohen wrote:
>>
>> Hi Maynard,
>>
>> I don't have a strong preference for the name. I would just like to reach a consensus on the name and get the patch merged. The possible choices are:
>>
>> armv8:
>> armv8-common: (naming scheme following earlier armvN processors in OProfile)
>> armv8-pmuv3: (naming scheme used by the upstream kernel)
>> armv8-common-pmuv3:
>>
>> The last patch used armv8-pmuv3 to match the name used in the kernel, "arm/armv8-pmuv3". However, the kernel is not particularly consistent in its naming of ARM processor PMUs. The 32-bit ARM processor PMUs have names such as "ARMv7 Cortex-A<N>", "v6", and "v6mpcore".
>>
>> All of the current ARM processors have specific event sets, and none displays armv7-common as the name of the events. The Krait just includes armv7-common.
>>
>> Of the choices above, which would a user prefer to see? Does the user want to see "armv8-common-pmuv3", or something else with "common" in the name?
> Will, if we employed a "common" directory to hold these events, we would still want to have another directory with processor specific events/unit masks (which would then "include:" the new common dir). So the user would not *see* the common dir -- just the processor specific one. See my response to Will D for more details.
>
> -Maynard
>>
>> -Will
>>

Hi Maynard,

I have reworked the patch with an armv8-pmuv3-common directory to hold the architected armv8 events. There is an armv8-pmuv3 directory that basically just includes those events. The documentation I have for the Applied Micro X-Gene just lists those architected armv8 events and does not list any of the additional events for the Cortex-A57. We probably want to be on the safe side and have a separate armv8-ca57 rather than just renaming armv8-pmuv3.

Does the attached patch implement things in the desired way?

-Will
|
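The layout described in this message, a processor directory that holds no events of its own and only pulls in the common set, can be modelled as a chain of event sets. The sketch below is an illustration in C under stated assumptions rather than OProfile's events-file parser: struct event_set, its include pointer (standing in for an "include:" line) and find_event() are hypothetical names, and the two events are merely examples of architected PMUv3 event numbers, not the contents of the attached patch.

```c
#include <stddef.h>
#include <string.h>

struct event {
    const char *name;
    unsigned int code;
};

/* Hypothetical model of one events directory. */
struct event_set {
    const char *dir;
    const struct event_set *include;  /* models an "include:" line, or NULL */
    const struct event *events;
    size_t nr_events;
};

/* Two example architected events; the real file lists many more. */
static const struct event common_events[] = {
    { "INST_RETIRED", 0x08 },
    { "CPU_CYCLES",   0x11 },
};

/* The shared directory that actually defines the events. */
static const struct event_set armv8_pmuv3_common = {
    "arm/armv8-pmuv3-common", NULL, common_events, 2
};

/* The processor directory: no events of its own, just the include. */
static const struct event_set armv8_pmuv3 = {
    "arm/armv8-pmuv3", &armv8_pmuv3_common, NULL, 0
};

/* Search a set and then every set it includes; NULL if not found. */
const struct event *find_event(const struct event_set *set, const char *name)
{
    for (; set != NULL; set = set->include) {
        for (size_t i = 0; i < set->nr_events; i++) {
            if (strcmp(set->events[i].name, name) == 0)
                return &set->events[i];
        }
    }
    return NULL;
}
```

A later armv8-ca57 set would carry its own processor-specific events and point its include at armv8_pmuv3_common in the same way, so the user only ever selects the processor directory.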