From: Ming L. <min...@ca...> - 2012-01-18 12:24:19
On Wed, Jan 18, 2012 at 7:39 PM, Shilimkar, Santosh <san...@ti...> wrote: > On Wed, Jan 18, 2012 at 10:33 AM, Ming Lei <min...@ca...> wrote: >> Hi Will and stephane, >> >> On Wed, Jan 18, 2012 at 12:18 PM, Ming Lei <min...@ca...> wrote: >>> Hi stephane & Will, >>> >>> On Tue, Jan 10, 2012 at 8:46 AM, stephane eranian >>> <er...@go...> wrote: >>>> See the dmesg from my 3.2 kernel: >>>> >>>> >>>> [ 0.000000] Booting Linux on physical CPU 0[ 0.000000] >> >> But if you test omap4 perf against -next kernel, pmu won't work because >> the commit[1] may put 'emu_sys_clkdm' clock domain into HW_AUTO mode, >> so writing pmu register may not take effect. >> >> I have found the similar problem on cam clock domain before[2]. >> CD_EMU is very simliar with CD_CAM in the point below: >> >> CD_EMU has no static or module wake-up dependency with any other clock >> domain of the device.[3] >> >> So the patch[4] can make omap4 pmu work on -next tree. >> >> Shilimkar, care to comment on the patch[4]? >> >> thanks, >> -- >> Ming Lei >> >> [1], commit 3c50729b3fa1cd8ca1f347e6caf1081204cf1a7c >> Author: Santosh Shilimkar <san...@ti...> >> Date: Wed Jan 5 22:03:17 2011 +0530 >> >> ARM: OMAP4: PM: Initialise all the clockdomains to supported states >> >> Initialise hardware supervised mode for all clockdomains if it's >> supported. Initiate sleep transition for other clockdomains, >> if they are not being used. 
>> >> [2], http://www.spinics.net/lists/linux-omap/msg61911.html >> >> [3], 3.6.12.3 of OMAP4 TRM >> >> [4], >> diff --git a/arch/arm/mach-omap2/clockdomains44xx_data.c >> b/arch/arm/mach-omap2/clockdomains44xx_data.c >> index 9299ac2..41d2260 100644 >> --- a/arch/arm/mach-omap2/clockdomains44xx_data.c >> +++ b/arch/arm/mach-omap2/clockdomains44xx_data.c >> @@ -390,7 +390,7 @@ static struct clockdomain emu_sys_44xx_clkdm = { >> .prcm_partition = OMAP4430_PRM_PARTITION, >> .cm_inst = OMAP4430_PRM_EMU_CM_INST, >> .clkdm_offs = OMAP4430_PRM_EMU_CM_EMU_CDOFFS, >> - .flags = CLKDM_CAN_HWSUP, >> + .flags = CLKDM_CAN_SWSUP, >> }; > NAK. > > You don't need this patch. What you saw on CAMERA was indeed > a known bug but emulation domain has no such issues. > > So the accesses to emulation register should continue to work > with the clock-domain being kept under hardware supervision. But why can this patch make omap4 pmu work? Without the patch, there are no CTI interrupts generated for pmu irq. thanks -- Ming Lei |
From: Shilimkar, S. <san...@ti...> - 2012-01-18 12:33:56
On Wed, Jan 18, 2012 at 1:24 PM, Ming Lei <min...@ca...> wrote: > On Wed, Jan 18, 2012 at 7:39 PM, Shilimkar, Santosh > <san...@ti...> wrote: >> On Wed, Jan 18, 2012 at 10:33 AM, Ming Lei <min...@ca...> wrote: >>> Hi Will and stephane, >>> >>> On Wed, Jan 18, 2012 at 12:18 PM, Ming Lei <min...@ca...> wrote: >>>> Hi stephane & Will, >>>> >>>> On Tue, Jan 10, 2012 at 8:46 AM, stephane eranian >>>> <er...@go...> wrote: >>>>> See the dmesg from my 3.2 kernel: >>>>> >>>>> >>>>> [ 0.000000] Booting Linux on physical CPU 0[ 0.000000] >>> >>> But if you test omap4 perf against -next kernel, pmu won't work because >>> the commit[1] may put 'emu_sys_clkdm' clock domain into HW_AUTO mode, >>> so writing pmu register may not take effect. >>> >>> I have found the similar problem on cam clock domain before[2]. >>> CD_EMU is very simliar with CD_CAM in the point below: >>> >>> CD_EMU has no static or module wake-up dependency with any other clock >>> domain of the device.[3] >>> >>> So the patch[4] can make omap4 pmu work on -next tree. >>> >>> Shilimkar, care to comment on the patch[4]? >>> >>> thanks, >>> -- >>> Ming Lei >>> >>> [1], commit 3c50729b3fa1cd8ca1f347e6caf1081204cf1a7c >>> Author: Santosh Shilimkar <san...@ti...> >>> Date: Wed Jan 5 22:03:17 2011 +0530 >>> >>> ARM: OMAP4: PM: Initialise all the clockdomains to supported states >>> >>> Initialise hardware supervised mode for all clockdomains if it's >>> supported. Initiate sleep transition for other clockdomains, >>> if they are not being used. 
>>> >>> [2], http://www.spinics.net/lists/linux-omap/msg61911.html >>> >>> [3], 3.6.12.3 of OMAP4 TRM >>> >>> [4], >>> diff --git a/arch/arm/mach-omap2/clockdomains44xx_data.c >>> b/arch/arm/mach-omap2/clockdomains44xx_data.c >>> index 9299ac2..41d2260 100644 >>> --- a/arch/arm/mach-omap2/clockdomains44xx_data.c >>> +++ b/arch/arm/mach-omap2/clockdomains44xx_data.c >>> @@ -390,7 +390,7 @@ static struct clockdomain emu_sys_44xx_clkdm = { >>> .prcm_partition = OMAP4430_PRM_PARTITION, >>> .cm_inst = OMAP4430_PRM_EMU_CM_INST, >>> .clkdm_offs = OMAP4430_PRM_EMU_CM_EMU_CDOFFS, >>> - .flags = CLKDM_CAN_HWSUP, >>> + .flags = CLKDM_CAN_SWSUP, >>> }; >> NAK. >> >> You don't need this patch. What you saw on CAMERA was indeed >> a known bug but emulation domain has no such issues. >> >> So the accesses to emulation register should continue to work >> with the clock-domain being kept under hardware supervision. > > But why can this patch make omap4 pmu work? Without the patch, > there are no CTI interrupts generated for pmu irq. > Interesting. For me debugger works which also relies on Emulation domain. Need to see why CTI is behaving like this. Regards Santosh |
From: Paul W. <pa...@pw...> - 2012-04-03 23:56:38
Hi

On Tue, 3 Apr 2012, Kevin Hilman wrote:

> Indeed, like you, I have to change the EMU clock domain to SWSUP[1] in
> order to see any interrupts and see anything in perf top. This isn't
> really a mergeable workaround, so I'll look into this a little closer
> with Santosh to see what we can do once we fully understand the HW
> problem.

Part of the problem is that the clockdomain data for the emu_sys
clockdomain is wrong. Here's something to try to fix it. It might just
be enough to get it to work.

- Paul

From: Paul Walmsley <pa...@pw...>
Date: Tue, 3 Apr 2012 17:13:48 -0600
Subject: [PATCH] ARM: OMAP44xx: clockdomain data: correct the emu_sys_clkdm CLKTRCTRL data

According to the 4430 ES2.0 TRM vX Table 3-744 "CM_EMU_CLKSTCTRL",
the emu_sys clockdomain data in mainline is incorrect.

The emu_sys clockdomain does not support the DISABLE_AUTO state, and
instead it supports the FORCE_WAKEUP state.

Signed-off-by: Paul Walmsley <pa...@pw...>
Cc: Benoît Cousson <b-c...@ti...>
Cc: Kevin Hilman <kh...@ti...>
Cc: Santosh Shilimkar <san...@ti...>
Cc: Ming Lei <min...@ca...>
---
 arch/arm/mach-omap2/clockdomains44xx_data.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/mach-omap2/clockdomains44xx_data.c b/arch/arm/mach-omap2/clockdomains44xx_data.c
index 9299ac2..bd7ed13 100644
--- a/arch/arm/mach-omap2/clockdomains44xx_data.c
+++ b/arch/arm/mach-omap2/clockdomains44xx_data.c
@@ -390,7 +390,7 @@ static struct clockdomain emu_sys_44xx_clkdm = {
 	.prcm_partition = OMAP4430_PRM_PARTITION,
 	.cm_inst = OMAP4430_PRM_EMU_CM_INST,
 	.clkdm_offs = OMAP4430_PRM_EMU_CM_EMU_CDOFFS,
-	.flags = CLKDM_CAN_HWSUP,
+	.flags = CLKDM_CAN_ENABLE_AUTO | CLKDM_CAN_FORCE_WAKEUP,
 };

 static struct clockdomain l3_dma_44xx_clkdm = {
--
1.7.9.5
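[Editorial note] For readers following the flag change, here is a minimal standalone sketch of how the capability flags named in the patch above relate to CLKDM_CAN_HWSUP and CLKDM_CAN_SWSUP. The bit positions are assumptions modelled on arch/arm/mach-omap2/clockdomain.h and should be checked against the tree in use. EMU_SYS_PATCHED_FLAGS and clkdm_can() are hypothetical names used only for this illustration; they are not kernel symbols.

```c
#include <stdbool.h>

/* Assumed bit positions, modelled on arch/arm/mach-omap2/clockdomain.h. */
#define CLKDM_CAN_FORCE_SLEEP   (1u << 0)
#define CLKDM_CAN_FORCE_WAKEUP  (1u << 1)
#define CLKDM_CAN_ENABLE_AUTO   (1u << 2)
#define CLKDM_CAN_DISABLE_AUTO  (1u << 3)

/* Composite capabilities: hardware- and software-supervised transitions. */
#define CLKDM_CAN_HWSUP (CLKDM_CAN_ENABLE_AUTO | CLKDM_CAN_DISABLE_AUTO)
#define CLKDM_CAN_SWSUP (CLKDM_CAN_FORCE_SLEEP | CLKDM_CAN_FORCE_WAKEUP)

/* Hypothetical name for the flags the patch assigns to emu_sys_44xx_clkdm. */
#define EMU_SYS_PATCHED_FLAGS (CLKDM_CAN_ENABLE_AUTO | CLKDM_CAN_FORCE_WAKEUP)

/* Hypothetical helper: true if every bit of 'cap' is set in 'flags'. */
static bool clkdm_can(unsigned int flags, unsigned int cap)
{
	return (flags & cap) == cap;
}
```

Under these assumptions the patched flags grant neither full HWSUP nor full SWSUP: the domain can enter hardware-automatic mode and can be forced awake, but it advertises no DISABLE_AUTO and no FORCE_SLEEP.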
From: Ming L. <min...@ca...> - 2012-04-04 03:42:38
On Wed, Apr 4, 2012 at 7:29 AM, Paul Walmsley <pa...@pw...> wrote: > Hi > > On Tue, 3 Apr 2012, Kevin Hilman wrote: > >> Indeed, like you, I have to change the EMU clock domain to SWSUP[1] in >> order to see any interrupts and see anything in perf top. This isn't >> really a mergeable workaround, so I'll look into this a little closer >> with Santosh to see what we can do once we fully understand the HW >> problem. > > Part of the problem is that the clockdomain data for the emu_sys > clockdomain is wrong. Here's something to try to fix it. It might just > be enough to get it to work. > > - Paul > > From: Paul Walmsley <pa...@pw...> > Date: Tue, 3 Apr 2012 17:13:48 -0600 > Subject: [PATCH] ARM: OMAP44xx: clockdomain data: correct the emu_sys_clkdm > CLKTRCTRL data > > According to the 4430 ES2.0 TRM vX Table 3-744 "CM_EMU_CLKSTCTRL", > the emu_sys clockdomain data in mainline is incorrect. > > The emu_sys clockdomain does not support the DISABLE_AUTO state, and > instead it supports the FORCE_WAKEUP state. > > Signed-off-by: Paul Walmsley <pa...@pw...> > Cc: Benoît Cousson <b-c...@ti...> > Cc: Kevin Hilman <kh...@ti...> > Cc: Santosh Shilimkar <san...@ti...> > Cc: Ming Lei <min...@ca...> > --- > arch/arm/mach-omap2/clockdomains44xx_data.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/arm/mach-omap2/clockdomains44xx_data.c b/arch/arm/mach-omap2/clockdomains44xx_data.c > index 9299ac2..bd7ed13 100644 > --- a/arch/arm/mach-omap2/clockdomains44xx_data.c > +++ b/arch/arm/mach-omap2/clockdomains44xx_data.c > @@ -390,7 +390,7 @@ static struct clockdomain emu_sys_44xx_clkdm = { > .prcm_partition = OMAP4430_PRM_PARTITION, > .cm_inst = OMAP4430_PRM_EMU_CM_INST, > .clkdm_offs = OMAP4430_PRM_EMU_CM_EMU_CDOFFS, > - .flags = CLKDM_CAN_HWSUP, > + .flags = CLKDM_CAN_ENABLE_AUTO | CLKDM_CAN_FORCE_WAKEUP, I tested the patch just now, but unfortunately, the change still doesn't make PMU to generate IRQs. 
Mark the flags as CLKDM_CAN_SWSUP may work, but PMU will stop producing IRQs after resuming from suspend. Thanks -- Ming Lei |
From: stephane e. <er...@go...> - 2012-01-18 21:58:17
Ming, Ok, so I used Linus' tree @ It already includes patches #1 and #2. I applied 4-6. Recompiled but my kernel does not boot, I don't see anything on the serial console. Could be a broken .config file. Could you send me your .config for Panda? Thanks. On Wed, Jan 18, 2012 at 11:07 AM, Ming Lei <min...@ca...> wrote: > Hi, > > On Wed, Jan 18, 2012 at 5:54 PM, stephane eranian <er...@go...> >> Should I use Will's -next tree as the base instead of Linus'? > > Either one is OK. If you use linus tree as base, you need to apply the #1 and > #2 patch manually. > >> Given that MARC is shutdown today, would you mind packing those patches >> into a tarball and sending them to me directly? > > See attachment, which includes the patches from #3 to #6. > >> >> When you mention Will's -next tree, are you talking about: >> git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git for-next/perf > > It is perf/omap4 brach, you can pick up the two patches[1][2] directly from > the branch. > > > thanks, > -- > Ming Lei > > [1], http://git.kernel.org/?p=linux/kernel/git/will/linux.git;a=commit;h=7924a3eba0766348d6d6a56cbb9873cdbcab0d8c > > [2], http://git.kernel.org/?p=linux/kernel/git/will/linux.git;a=commit;h=bde071f005e2dc71378aff69e86b961d8cd7922f |
From: Ming L. <min...@ca...> - 2012-01-19 01:22:00
Attachments:
conf.tar.gz
Hi, On Thu, Jan 19, 2012 at 5:58 AM, stephane eranian <er...@go...> wrote: > Ming, > > Ok, so I used Linus' tree @ > > It already includes patches #1 and #2. I applied 4-6. The patch #3 is missed? > Recompiled but my kernel does not boot, I don't see > anything on the serial console. Could be a broken I don't think that the patches can cause your non boot, you can try the linus tree kernel first, then try the patches. > .config file. Could you send me your .config for Panda? See the attachment. > > Thanks. > > On Wed, Jan 18, 2012 at 11:07 AM, Ming Lei <min...@ca...> wrote: >> Hi, >> >> On Wed, Jan 18, 2012 at 5:54 PM, stephane eranian <er...@go...> >>> Should I use Will's -next tree as the base instead of Linus'? >> >> Either one is OK. If you use linus tree as base, you need to apply the #1 and >> #2 patch manually. >> >>> Given that MARC is shutdown today, would you mind packing those patches >>> into a tarball and sending them to me directly? >> >> See attachment, which includes the patches from #3 to #6. >> >>> >>> When you mention Will's -next tree, are you talking about: >>> git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git for-next/perf >> >> It is perf/omap4 brach, you can pick up the two patches[1][2] directly from >> the branch. >> >> >> thanks, >> -- >> Ming Lei >> >> [1], http://git.kernel.org/?p=linux/kernel/git/will/linux.git;a=commit;h=7924a3eba0766348d6d6a56cbb9873cdbcab0d8c >> >> [2], http://git.kernel.org/?p=linux/kernel/git/will/linux.git;a=commit;h=bde071f005e2dc71378aff69e86b961d8cd7922f > -- > To unsubscribe from this list: send the line "unsubscribe linux-omap" in > the body of a message to maj...@vg... > More majordomo info at http://vger.kernel.org/majordomo-info.html |
From: stephane e. <er...@go...> - 2012-01-19 11:35:02
Hi, Ok some update on this. With your .config file + 3.2.0 (Linus) + patch 3, 4, 5, 6, I get a kernel that boots. It does recognize the PMU. However, it still does not count correctly and I believe for the same reason.: no interrupts are delivered. I run a cycle burner program on CPU0, I watch /proc/interrupts. and then I run libpfm4 program that does per-cpu monitoring on CPU0 and print the counts every second: $ sudo ./syst_count -d 10 -p -c 0 -e cpu_cycles <press CTRL-C to quit before 10s time limit> # 1s ----- CPU0 G0 1008129147 cpu_cycles (scaling 0.00%, ena=1000152588, run=1000152588) # 2s ----- CPU0 G0 2016240766 cpu_cycles (scaling 0.00%, ena=2000335693, run=2000335693) # 3s ----- CPU0 G0 3024249265 cpu_cycles (scaling 0.00%, ena=3000427245, run=3000427245) # 4s ----- CPU0 G0 4072779364 cpu_cycles (scaling 0.00%, ena=4040710449, run=4040710449) # 5s ----- CPU0 G0 785954705 cpu_cycles (scaling 0.00%, ena=5040954589, run=5040954589) # 6s ----- CPU0 G0 1803397848 cpu_cycles (scaling 0.00%, ena=6050384520, run=6050384520) # 7s ----- You clearly see that after 4s you've reached the 32-bit limit of the counter and then you wrap around. It should show 5 billions or so cycles. Over the entire run, no arm-pmu interrupt was delivered according to /proc/interrupts. I guess you can test the same condition using perf directly, use a program that burns cycles for a know duration. Try < 4s and then > 4s. I use 1s vs. 10s and I expect the count to be 10x larger in the latter test case. If it's not then, interrupts are not coming in, On Thu, Jan 19, 2012 at 2:21 AM, Ming Lei <min...@ca...> wrote: > Hi, > > On Thu, Jan 19, 2012 at 5:58 AM, stephane eranian > <er...@go...> wrote: >> Ming, >> >> Ok, so I used Linus' tree @ >> >> It already includes patches #1 and #2. I applied 4-6. > > The patch #3 is missed? > >> Recompiled but my kernel does not boot, I don't see >> anything on the serial console. 
Could be a broken > > I don't think that the patches can cause your non boot, you > can try the linus tree kernel first, then try the patches. > >> .config file. Could you send me your .config for Panda? > > See the attachment. > >> >> Thanks. >> >> On Wed, Jan 18, 2012 at 11:07 AM, Ming Lei <min...@ca...> wrote: >>> Hi, >>> >>> On Wed, Jan 18, 2012 at 5:54 PM, stephane eranian <er...@go...> >>>> Should I use Will's -next tree as the base instead of Linus'? >>> >>> Either one is OK. If you use linus tree as base, you need to apply the #1 and >>> #2 patch manually. >>> >>>> Given that MARC is shutdown today, would you mind packing those patches >>>> into a tarball and sending them to me directly? >>> >>> See attachment, which includes the patches from #3 to #6. >>> >>>> >>>> When you mention Will's -next tree, are you talking about: >>>> git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git for-next/perf >>> >>> It is perf/omap4 brach, you can pick up the two patches[1][2] directly from >>> the branch. >>> >>> >>> thanks, >>> -- >>> Ming Lei >>> >>> [1], http://git.kernel.org/?p=linux/kernel/git/will/linux.git;a=commit;h=7924a3eba0766348d6d6a56cbb9873cdbcab0d8c >>> >>> [2], http://git.kernel.org/?p=linux/kernel/git/will/linux.git;a=commit;h=bde071f005e2dc71378aff69e86b961d8cd7922f >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-omap" in >> the body of a message to maj...@vg... >> More majordomo info at http://vger.kernel.org/majordomo-info.html |
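[Editorial note] The wrap-around described in the message above (4072779364 at 4s, then 785954705 at 5s) is what a 32-bit counter produces when nothing folds the overflow into a wider total. The following is a minimal sketch of that arithmetic only, not the kernel's implementation; pmu_accumulate() is a hypothetical name.

```c
#include <stdint.h>

/*
 * Fold a new 32-bit hardware counter reading into a 64-bit running total.
 * Unsigned subtraction is modulo 2^32, so a single wrap between two reads
 * is handled correctly. More than one wrap between reads is still lost,
 * which is why an overflow interrupt (or frequent polling) is needed.
 */
static uint64_t pmu_accumulate(uint64_t total, uint32_t prev, uint32_t now)
{
	uint32_t delta = now - prev;	/* wraps modulo 2^32 */

	return total + delta;
}
```

Feeding in the 4s and 5s readings from the output above gives 5080922001, in line with the "5 billions or so cycles" expected at that point.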
From: Ming L. <min...@ca...> - 2012-01-19 12:45:17
Hi, On Thu, Jan 19, 2012 at 7:34 PM, stephane eranian <er...@go...> wrote: > Hi, > > Ok some update on this. > With your .config file + 3.2.0 (Linus) + patch 3, 4, 5, 6, I get a kernel that You forget patch 1 and patch 2? > boots. It does recognize the PMU. However, it still does not count correctly > and I believe for the same reason.: no interrupts are delivered. > > I run a cycle burner program on CPU0, I watch /proc/interrupts. > and then I run libpfm4 program that does per-cpu monitoring on CPU0 and > print the counts every second: I just run 'perf top', then watch output of '/proc/interrupts' in another terminal. I am sure I can see perf is OK and interrupts are generated on my pandaboard. > > $ sudo ./syst_count -d 10 -p -c 0 -e cpu_cycles > <press CTRL-C to quit before 10s time limit> > # 1s ----- > CPU0 G0 1008129147 cpu_cycles (scaling 0.00%, > ena=1000152588, run=1000152588) > # 2s ----- > CPU0 G0 2016240766 cpu_cycles (scaling 0.00%, > ena=2000335693, run=2000335693) > # 3s ----- > CPU0 G0 3024249265 cpu_cycles (scaling 0.00%, > ena=3000427245, run=3000427245) > # 4s ----- > CPU0 G0 4072779364 cpu_cycles (scaling 0.00%, > ena=4040710449, run=4040710449) > # 5s ----- > CPU0 G0 785954705 cpu_cycles (scaling 0.00%, > ena=5040954589, run=5040954589) > # 6s ----- > CPU0 G0 1803397848 cpu_cycles (scaling 0.00%, > ena=6050384520, run=6050384520) > # 7s ----- > > You clearly see that after 4s you've reached the 32-bit limit of the > counter and then you wrap around. > It should show 5 billions or so cycles. Over the entire run, no > arm-pmu interrupt was delivered according > to /proc/interrupts. > > I guess you can test the same condition using perf directly, use a > program that burns cycles > for a know duration. Try < 4s and then > 4s. I use 1s vs. 10s and I > expect the count to be > 10x larger in the latter test case. 
If it's not then, interrupts are > not coming in, > > > On Thu, Jan 19, 2012 at 2:21 AM, Ming Lei <min...@ca...> wrote: >> Hi, >> >> On Thu, Jan 19, 2012 at 5:58 AM, stephane eranian >> <er...@go...> wrote: >>> Ming, >>> >>> Ok, so I used Linus' tree @ >>> >>> It already includes patches #1 and #2. I applied 4-6. >> >> The patch #3 is missed? >> >>> Recompiled but my kernel does not boot, I don't see >>> anything on the serial console. Could be a broken >> >> I don't think that the patches can cause your non boot, you >> can try the linus tree kernel first, then try the patches. >> >>> .config file. Could you send me your .config for Panda? >> >> See the attachment. >> >>> >>> Thanks. >>> >>> On Wed, Jan 18, 2012 at 11:07 AM, Ming Lei <min...@ca...> wrote: >>>> Hi, >>>> >>>> On Wed, Jan 18, 2012 at 5:54 PM, stephane eranian <er...@go...> >>>>> Should I use Will's -next tree as the base instead of Linus'? >>>> >>>> Either one is OK. If you use linus tree as base, you need to apply the #1 and >>>> #2 patch manually. >>>> >>>>> Given that MARC is shutdown today, would you mind packing those patches >>>>> into a tarball and sending them to me directly? >>>> >>>> See attachment, which includes the patches from #3 to #6. >>>> >>>>> >>>>> When you mention Will's -next tree, are you talking about: >>>>> git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git for-next/perf >>>> >>>> It is perf/omap4 brach, you can pick up the two patches[1][2] directly from >>>> the branch. >>>> >>>> >>>> thanks, >>>> -- >>>> Ming Lei >>>> >>>> [1], http://git.kernel.org/?p=linux/kernel/git/will/linux.git;a=commit;h=7924a3eba0766348d6d6a56cbb9873cdbcab0d8c >>>> >>>> [2], http://git.kernel.org/?p=linux/kernel/git/will/linux.git;a=commit;h=bde071f005e2dc71378aff69e86b961d8cd7922f >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-omap" in >>> the body of a message to maj...@vg... 
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe linux-omap" in > the body of a message to maj...@vg... > More majordomo info at http://vger.kernel.org/majordomo-info.html |
From: stephane e. <er...@go...> - 2012-01-21 09:17:08
On Sat, Jan 21, 2012 at 4:25 AM, Ming Lei <min...@ca...> wrote: > On Fri, Jan 20, 2012 at 9:47 PM, stephane eranian > <er...@go...> wrote: >> Started afresh from: >> >> 90a4c0f uml: fix compile for x86-64 >> >> And added 3, 4, 5, 6: >> 603c316 arm: omap4: pmu: support runtime pm >> 4899fbd arm: omap4: support pmu >> d737bb1 arm: omap4: create pmu device via hwmod >> 4e0259e arm: omap4: hwmod: introduce emu hwmod >> >> Still no interrupts firing. I am using your .config file. >> >> My HW: >> CPU implementer : 0x41 >> CPU architecture: 7 >> CPU variant : 0x1 >> CPU part : 0xc09 >> CPU revision : 2 >> >> Hardware : OMAP4 Panda board >> Revision : 0020 >> >> There must be something I am missing here. > > Have you applied the patch in link[1]? > You mean this: > [1], http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=summary It does not point to a patch but to the entire tree. > > thanks, > -- > Ming Lei > > [1], http://marc.info/?l=linux-arm-kernel&m=132697975416659&w=2 |
From: stephane e. <er...@go...> - 2012-01-27 17:03:37
On Fri, Jan 27, 2012 at 5:59 PM, Will Deacon <wil...@ar...> wrote: > On Fri, Jan 27, 2012 at 03:57:25PM +0000, stephane eranian wrote: >> On Fri, Jan 27, 2012 at 4:54 PM, Will Deacon <wil...@ar...> wrote: >> > >> > Ok. Note that on ARM the PMU generates a standard IRQ (i.e. not an NMI) so >> > you may miss samples if they occur during critical kernel sections (and if >> > you look at a profile, spin_unlock_irqrestore will be quite high). >> > >> But I am only running a user space noploop. So it spends 99% in user space, no >> critical section. > > and your result is almost 99% of the way there :) > > There are also potential overheads from the PMU interrupts themselves, since > there is a latency between overflow and taking the interrupt and then > between there are actually reading the counter (they continue to count after > overflow). > > That said, if you see any bugs in the code please do shout! > I suspect there is something wrong, we shouldn't hit the max_rate_limit. You may have bursts of interrupts (samples). I'll check on that this week-end. >> > A7 and A15 have the ability to filter counters based on privilege level, so >> > you can get more accurate userspace counts there. >> >> Ok, that's better. Need to update libpfm4 for A15 with priv levels then! > > How do you handle that in libpfm4? On ARM, the event encodings remain the same, > you just need to set some extra bits to determine which levels are included or > excluded (you can do this with the perf tool by using the :{u,k,h} suffix on an > event description). > It depends what you call the encoding? If the priv level can be encoded in the attr->config field, then that's easy. If it needs to be set somewhere else, then we need to figure out how you encode it in the attr struct. Either in some other bits in attr->config or use attr->config1, for instance. You tell me. > Will |
From: Will D. <wil...@ar...> - 2012-01-27 17:10:20
On Fri, Jan 27, 2012 at 05:03:28PM +0000, stephane eranian wrote:
> On Fri, Jan 27, 2012 at 5:59 PM, Will Deacon <wil...@ar...> wrote:
> > That said, if you see any bugs in the code please do shout!
> >
> I suspect there is something wrong, we shouldn't hit the max_rate_limit.
> You may have bursts of interrupts (samples). I'll check on that this week-end.

Ok, thanks. Keep in mind that you probably have variable rate clocks, which
will affect the cycle counter frequency.

> >> > A7 and A15 have the ability to filter counters based on privilege level, so
> >> > you can get more accurate userspace counts there.
> >>
> >> Ok, that's better. Need to update libpfm4 for A15 with priv levels then!
> >
> > How do you handle that in libpfm4? On ARM, the event encodings remain the same,
> > you just need to set some extra bits to determine which levels are included or
> > excluded (you can do this with the perf tool by using the :{u,k,h} suffix on an
> > event description).
> >
> It depends what you call the encoding? If the priv level can be encoded in the
> attr->config field, then that's easy. If it needs to be set somewhere else, then
> we need to figure out how you encode it in the attr struct. Either in some other
> bits in attr->config or use attr->config1, for instance. You tell me.

The way it's done with perf is to set the exclude_{user,kernel,hv} fields in
the attr. The ARM perf backend then translates these into the relevant bits
which get ORed into the config_base before hitting the hardware.

Will
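[Editorial note] A rough sketch of the translation described in the message above: exclude flags from the attr are turned into filter bits and ORed into config_base. Everything here is a stand-in for illustration. struct filter_attr mimics only the three relevant bitfields of struct perf_event_attr, and the SKETCH_* bit positions are hypothetical; they are not the hardware encoding.

```c
#include <stdint.h>

/* Stand-in for the three relevant bitfields of struct perf_event_attr. */
struct filter_attr {
	unsigned int exclude_user   : 1;
	unsigned int exclude_kernel : 1;
	unsigned int exclude_hv     : 1;
};

/* Hypothetical filter bit positions, for illustration only. */
#define SKETCH_EXCLUDE_USER   (1u << 30)
#define SKETCH_EXCLUDE_KERNEL (1u << 31)
#define SKETCH_EXCLUDE_HV     (1u << 29)

/* OR the privilege filter bits into the event's base configuration. */
static uint32_t sketch_event_filter(uint32_t config_base, struct filter_attr attr)
{
	if (attr.exclude_user)
		config_base |= SKETCH_EXCLUDE_USER;
	if (attr.exclude_kernel)
		config_base |= SKETCH_EXCLUDE_KERNEL;
	if (attr.exclude_hv)
		config_base |= SKETCH_EXCLUDE_HV;
	return config_base;
}
```

The point of the design is that the event number itself stays unchanged: a user-space-only count is the same event with exclude_kernel and exclude_hv set, which is what the :u suffix requests in the perf tool.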
From: stephane e. <er...@go...> - 2012-01-30 17:16:04
On Mon, Jan 30, 2012 at 5:08 PM, Måns Rullgård <ma...@ma...> wrote: > stephane eranian <er...@go...> writes: > >> Same result for me on CPU1: >> >> top - 16:20:24 up 1:45, 1 user, load average: 0.29, 0.08, 0.07 >> Tasks: 70 total, 2 running, 68 sleeping, 0 stopped, 0 zombie >> Cpu(s): 30.7%us, 2.7%sy, 0.0%ni, 66.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st >> Mem: 940232k total, 228984k used, 711248k free, 82244k buffers >> Swap: 524240k total, 0k used, 524240k free, 91400k cached >> >> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ P COMMAND >> 3968 eranian 20 0 644 160 128 R 100 0.0 0:21.98 1 noploop >> 3969 eranian 20 0 2184 1056 804 R 3 0.1 0:00.53 0 top >> 82 root 20 0 0 0 0 S 1 0.0 0:01.35 0 >> kworker/0:1 >> >> With 3.3.0-rc1, if I revert the clockdomain patch, I get the same result. >> So it must be coming from somewhere else, as you suggested. >> >> If the processor was spending time processing interrupts, then this would be >> accounted for in as sys time. But that's not what I observe here. It's either >> idle or user. That line, leads me to believe that the processor can only run >> my program for 30% of the time. The rest is spent idling even though my >> program is non-blocking. How could that be possible? Power-saving? > > In top, press 1 to see the statistics for the CPUs separately. > Ok, when I pin my program to CPU1, and press 1 in top I get: asks: 69 total, 2 running, 67 sleeping, 0 stopped, 0 zombie Cpu0 : 0.9%us, 3.8%sy, 0.0%ni, 94.3%id, 0.0%wa, 0.0%hi, 0.9%si, 0.0%st Cpu1 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 940232k total, 75480k used, 864752k free, 8148k buffers Swap: 524240k total, 0k used, 524240k free, 37568k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 3788 eranian 20 0 644 160 128 R 100 0.0 0:47.93 noploop 3758 eranian 20 0 9900 1512 712 S 2 0.2 0:01.17 sshd 3789 eranian 20 0 2184 1056 804 R 2 0.1 0:01.22 top Which gives me the right answer. 
But in 'collapsed mode', press 1 again, the aggregate value is bogus.
Could be wrong math in top.

Ok, that was a false alarm then. Thanks for the help.

Still need to investigate why the frequency mode does
not yield the correct number of samples even with low frequency.

$ taskset -c 1 perf record -e cycles -F 100 noploop 10
$ perf report -D | tail -20
Aggregated stats:
           TOTAL events:        475
            MMAP events:         11
            COMM events:          2
            EXIT events:          2
          SAMPLE events:        460
cycles stats:
           TOTAL events:        475
            MMAP events:         11
            COMM events:          2
            EXIT events:          2
          SAMPLE events:        460

460 samples is way too low. Should be 100x10 = 1000 samples or close to it.
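[Editorial note] The expectation stated above (100 Hz for 10 seconds should give about 1000 samples) can be written out as a small sketch. The function names are hypothetical and the calculation assumes a steady workload that runs for the whole interval.

```c
#include <stdint.h>

/* Samples expected from 'perf record -F freq_hz' over 'seconds' of steady load. */
static uint64_t expected_samples(uint64_t freq_hz, uint64_t seconds)
{
	return freq_hz * seconds;
}

/* Percentage of the expected samples that was actually recorded. */
static unsigned int sample_yield_percent(uint64_t observed, uint64_t expected)
{
	return expected ? (unsigned int)(observed * 100 / expected) : 0;
}
```

With the numbers from the run above, 460 observed samples against 1000 expected is a yield of 46%.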
From: Will D. <wil...@ar...> - 2012-01-30 17:25:10
On Mon, Jan 30, 2012 at 05:15:53PM +0000, stephane eranian wrote:
> Still need to investigate why the frequency mode does
> not yield the correct number of samples even with low frequency.
>
>
> $ taskset -c 1 perf record -e cycles -F 100 noploop 10
> $ perf report -D | tail -20
> Aggregated stats:
>            TOTAL events:        475
>             MMAP events:         11
>             COMM events:          2
>             EXIT events:          2
>           SAMPLE events:        460
> cycles stats:
>            TOTAL events:        475
>             MMAP events:         11
>             COMM events:          2
>             EXIT events:          2
>           SAMPLE events:        460
>
> 460 samples is way too low. Should be 100x10 = 1000 samples or close to it.

Can you stick noploop.c somewhere (I'm lazy :) and I'll try it on one of my
A9 boards?

Thanks,

Will
From: stephane e. <er...@go...> - 2012-01-27 17:16:53
On Fri, Jan 27, 2012 at 6:10 PM, Will Deacon <wil...@ar...> wrote: > On Fri, Jan 27, 2012 at 05:03:28PM +0000, stephane eranian wrote: >> On Fri, Jan 27, 2012 at 5:59 PM, Will Deacon <wil...@ar...> wrote: >> > That said, if you see any bugs in the code please do shout! >> > >> I suspect there is something wrong, we shouldn't hit the max_rate_limit. >> You may have bursts of interrupts (samples). I'll check on that this week-end. > > Ok, thanks. Keep in mind that you probably have variable rate clocks, which > will affect the cycle counter frequency. > I assume it does not vary the clock if the workload is steady and just burning cycles, e.g.: for(;;); >> >> > A7 and A15 have the ability to filter counters based on privilege level, so >> >> > you can get more accurate userspace counts there. >> >> >> >> Ok, that's better. Need to update libpfm4 for A15 with priv levels then! >> > >> > How do you handle that in libpfm4? On ARM, the event encodings remain the same, >> > you just need to set some extra bits to determine which levels are included or >> > excluded (you can do this with the perf tool by using the :{u,k,h} suffix on an >> > event description). >> > >> It depends what you call the encoding? If the priv level can be encoded in the >> attr->config field, then that's easy. If it needs to be set somewhere else, then >> we need to figure out how you encode it in the attr struct. Either in some other >> bits in attr->config or use attr->config1, for instance. You tell me. > > The way it's done with perf is to set the exclude{user,kernel,hv} fields in > the attr. The ARM perf backend then translates these into the relevant bits > which get orred into the config_base before hitting the hardware. > Well, that's also how we do it with libpfm4 on X86. This is because with perf_events, the exclude_* fields have priority over what you set in the attr->config field. > Will |
From: stephane e. <er...@go...> - 2012-01-29 17:36:17
Hi,

Ok, so I did a few more tests and there is a serious issue when sampling in
frequency mode (the default). I noticed a wrong number of samples, so I
investigated this some more and instrumented the perf_event kernel code. I
found some erratic timer ticks causing broken period adjustments.

In fact, the problem is visible using top. I am running a noploop program on
CPU0 and nothing else besides top. The noploop program does: for(;;);. That is
100% user. On a 2-way system otherwise idle, I expect top to return 50% user,
50% idle.

Top with the commit:

top - 16:19:21 up 5 min, 1 user, load average: 0.23, 0.15, 0.07
Tasks: 70 total, 2 running, 68 sleeping, 0 stopped, 0 zombie
Cpu(s): 31.1%us, 2.0%sy, 0.0%ni, 66.2%id, 0.0%wa, 0.0%hi, 0.7%si, 0.0%st
        ^^^^^^^^ That's WRONG
Mem: 940292k total, 74984k used, 865308k free, 8020k buffers
Swap: 524240k total, 0k used, 524240k free, 37420k cached

  PID USER     PR NI VIRT  RES SHR S %CPU %MEM   TIME+ COMMAND
 3770 eranian  20  0  644  160 128 R   99  0.0 0:14.21 noploop
 3771 eranian  20  0 2184 1052 804 R    2  0.1 0:00.32 top
    1 root     20  0 2564 1528 952 S    0  0.2 0:01.26 init

I removed that one-liner patch from Ming.
The one fiddling with the clockdomains: --- a/arch/arm/mach-omap2/clockdomains44xx_data.c +++ b/arch/arm/mach-omap2/clockdomains44xx_data.c @@ -390,7 +390,7 @@ static struct clockdomain emu_sys_44xx_clkdm = { .prcm_partition = OMAP4430_PRM_PARTITION, .cm_inst = OMAP4430_PRM_EMU_CM_INST, .clkdm_offs = OMAP4430_PRM_EMU_CM_EMU_CDOFFS, - .flags = CLKDM_CAN_HWSUP, + .flags = CLKDM_CAN_SWSUP, When I rerun, the test, it now work: top - 16:02:51 up 15 min, 1 user, load average: 1.02, 0.46, 0.21 Tasks: 70 total, 2 running, 68 sleeping, 0 stopped, 0 zombie Cpu(s): 47.2%us, 1.0%sy, 0.0%ni, 50.8%id, 0.0%wa, 0.0%hi, 1.0%si, 0.0%st ^^^^^^^^ close enough (in it stabilize somehow around 49% which is good) Mem: 940292k total, 75288k used, 865004k free, 8004k buffers Swap: 524240k total, 0k used, 524240k free, 37408k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 3771 eranian 20 0 644 160 128 R 100 0.0 0:34.44 noploop Although the patch fixes PMU interrupts, it breaks the timer tick logic somehow. The perf problem is related to timer tick. I am hoping that the tradeoff is not: PMU interrupts but broken timer ticks vs. No PMU interrupts but working timer ticks On Fri, Jan 27, 2012 at 6:16 PM, stephane eranian <er...@go...> wrote: > On Fri, Jan 27, 2012 at 6:10 PM, Will Deacon <wil...@ar...> wrote: >> On Fri, Jan 27, 2012 at 05:03:28PM +0000, stephane eranian wrote: >>> On Fri, Jan 27, 2012 at 5:59 PM, Will Deacon <wil...@ar...> wrote: >>> > That said, if you see any bugs in the code please do shout! >>> > >>> I suspect there is something wrong, we shouldn't hit the max_rate_limit. >>> You may have bursts of interrupts (samples). I'll check on that this week-end. >> >> Ok, thanks. Keep in mind that you probably have variable rate clocks, which >> will affect the cycle counter frequency. 
>> > I assume it does not vary the clock if the workload is steady and just burning > cycles, e.g.: for(;;); > >>> >> > A7 and A15 have the ability to filter counters based on privilege level, so >>> >> > you can get more accurate userspace counts there. >>> >> >>> >> Ok, that's better. Need to update libpfm4 for A15 with priv levels then! >>> > >>> > How do you handle that in libpfm4? On ARM, the event encodings remain the same, >>> > you just need to set some extra bits to determine which levels are included or >>> > excluded (you can do this with the perf tool by using the :{u,k,h} suffix on an >>> > event description). >>> > >>> It depends what you call the encoding? If the priv level can be encoded in the >>> attr->config field, then that's easy. If it needs to be set somewhere else, then >>> we need to figure out how you encode it in the attr struct. Either in some other >>> bits in attr->config or use attr->config1, for instance. You tell me. >> >> The way it's done with perf is to set the exclude{user,kernel,hv} fields in >> the attr. The ARM perf backend then translates these into the relevant bits >> which get orred into the config_base before hitting the hardware. >> > Well, that's also how we do it with libpfm4 on X86. This is because > with perf_events, > the exclude_* fields have priority over what you set in the attr->config field. > >> Will |
From: stephane e. <er...@go...> - 2012-01-30 17:45:25
|
Will,

There you go, no attachment, as I am not sure the omap list
supports this.

There is something quite interesting to observe.

While I run perf record -e cycles -F 100 noploop 10, I watch
/proc/interrupts. The number of interrupts is way lower than
expected. Therefore the number of samples is way too low:

$ perf record -e cycles -F 100 noploop 10
$ perf report -D | tail -20
cycles stats:
           TOTAL events:        535
            MMAP events:         11
            COMM events:          2
            EXIT events:          2
          SAMPLE events:        520

The delta in /proc/interrupts on CPU1 is 520 interrupts.

So it looks like the frequency adjustment, which is hooked off the
timer tick, is not called at each timer tick, or the timer ticks are
not at regular intervals, or the math is wrong.

If I go with the fixed period mode:
$ perf stat -e cycles noploop 10
noploop for 10 seconds

 Performance counter stats for 'noploop 10':

       10079156960 cycles                    #    0.000 GHz

      10.004547117 seconds time elapsed

That means, if I want 100 samples/sec, the period is
10079156960/(10*100) = 10079157:

$ perf record -e cycles -c 10079157 noploop 10
$ perf report -D | tail -20
cycles stats:
           TOTAL events:       1003
            MMAP events:         11
            COMM events:          2
            EXIT events:          2
        THROTTLE events:          1
      UNTHROTTLE events:          1
          SAMPLE events:        986

Now, we're getting the right answer!

So with the right sampling period, everything works fine.
We need to elucidate what's going on in perf_event_task_tick().
I have tried with my throttling fix and it did not help. We are
not subject to throttling with such a low rate.

noploop.c:

#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <inttypes.h>
#include <unistd.h>

/* SIGALRM handler: terminate once the requested delay has elapsed. */
void handler(int sig)
{
        exit(0);
}

/* Burn cycles in user mode. */
void noploop(void)
{
        for(;;);
}

int main(int argc, char **argv)
{
        unsigned int delay;

        /* run time in seconds, default 1 */
        delay = argc > 1 ? atoi(argv[1]) : 1;

        signal(SIGALRM, handler);

        printf("noploop for %u seconds\n", delay);
        alarm(delay);

        noploop();

        return 0;
}

On Mon, Jan 30, 2012 at 6:24 PM, Will Deacon <wil...@ar...> wrote:
> On Mon, Jan 30, 2012 at 05:15:53PM +0000, stephane eranian wrote:
>> Still need to investigate why the frequency mode does
>> not yield the correct number of samples even with low frequency.
>>
>>
>> $ taskset -c 1 perf record -e cycles -F 100 noploop 10
>> $ perf report -D | tail -20
>> Aggregated stats:
>>            TOTAL events:        475
>>             MMAP events:         11
>>             COMM events:          2
>>             EXIT events:          2
>>           SAMPLE events:        460
>> cycles stats:
>>            TOTAL events:        475
>>             MMAP events:         11
>>             COMM events:          2
>>             EXIT events:          2
>>           SAMPLE events:        460
>>
>> 460 samples is way too low. Should be 100x10 = 1000 samples or close to it.
>
> Can you stick noploop.c somewhere (I'm lazy :) and I'll try it on one of my
> A9 boards?
>
> Thanks,
>
> Will
|
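The fixed-period arithmetic in the message above can be sketched in a few lines of C. This is only an illustration: `period_for_rate` is a hypothetical helper, not part of perf, and it rounds to the nearest integer, which is an assumption that happens to reproduce the 10079157 figure quoted for `-c`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a perf API): given an event count measured
 * over `seconds` seconds (e.g. from `perf stat -e cycles`), return the
 * sampling period that should yield about `target_hz` samples per second.
 * The result is rounded to the nearest integer.
 */
static uint64_t period_for_rate(uint64_t count, uint64_t seconds,
                                uint64_t target_hz)
{
        uint64_t samples = seconds * target_hz; /* samples wanted in total */

        assert(samples > 0);
        return (count + samples / 2) / samples;
}
```

Calling `period_for_rate(10079156960ULL, 10, 100)` gives the value to pass to `perf record -e cycles -c`.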
From: Will D. <wil...@ar...> - 2012-01-30 19:14:57
|
On Mon, Jan 30, 2012 at 05:45:19PM +0000, stephane eranian wrote:
> There you go, no attachment, not sure the omap list
> supports this.

Cheers Stephane.

> There is something quite interesting to observe.
>
> While I run perf record -e cycles -F 100 noploop 10, I watch
> /proc/interrupts. The number of interrupts is way lower than
> expected. Therefore the number of samples is way too low:
>
> $ perf record -e cycles -F 100 noploop 10
> $ perf report -D | tail -20
> cycles stats:
>            TOTAL events:        535
>             MMAP events:         11
>             COMM events:          2
>             EXIT events:          2
>           SAMPLE events:        520
>
> The delta in /proc/interrupts on CPU1 is 520 interrupts.

Yes, that is about half of what you'd expect. Running on my A9 platform
(vexpress) I get:

$ perf record -e cycles -F 100 noploop 10
$ perf report -D | tail -20
cycles stats:
           TOTAL events:       1007
            MMAP events:         18
            COMM events:          2
            EXIT events:          2
          SAMPLE events:        985

> So looks like the frequency adjustment which is hooked off of the
> timer tick is either not called at each timer tick, the timer ticks are
> not at regular interval, or the math is wrong.

My hunch is that the interval is probably varying, but I don't know much
about OMAP4 and its clocks.

> If I go with the fixed period mode:
> $ perf stat -e cycles noploop 10
> noploop for 10 seconds
>  Performance counter stats for 'noploop 10':
>        10079156960 cycles                    #    0.000 GHz
>       10.004547117 seconds time elapsed
>
> That means, if I want 100 samples/sec: = 10079156960/(10*100)=10079157
> $ perf record -e cycles -c 10079157 noploop 10
> $ perf report -D | tail -20
> cycles stats:
>            TOTAL events:       1003
>             MMAP events:         11
>             COMM events:          2
>             EXIT events:          2
>         THROTTLE events:          1
>       UNTHROTTLE events:          1
>           SAMPLE events:        986
>
> Now, we're getting the right answer!

Just to confirm, for me:

$ perf stat -e cycles ./noploop 10
noploop for 10 seconds

 Performance counter stats for './noploop 10':

        4001163930 cycles                    #    0.000 GHz

      10.006534024 seconds time elapsed

$ perf record -e cycles -c 4001163 ./noploop 10
$ perf report -D | tail -20
Aggregated stats:
           TOTAL events:       1020
            MMAP events:         18
            COMM events:          2
            EXIT events:          2
          SAMPLE events:        998
cycles stats:
           TOTAL events:       1020
            MMAP events:         18
            COMM events:          2
            EXIT events:          2
          SAMPLE events:        998

which is close enough :)

> We need to elucidate what's going on in perf_event_task_tick().
> I have tried with my throttling fix and it did not help. We are
> not subject to throttling with such a low rate.

Ok. I would start by looking at the clock ticks if I were you, since this
seems to be alright on my board.

Will
|
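One way to see how irregular ticks could produce roughly half the expected samples is a simplified model of frequency-mode period adjustment. This is an illustration only, not the kernel's perf_event_task_tick() code: the function names are hypothetical, and the model assumes that each adjustment treats the time since the previous one as exactly one nominal tick.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Simplified model (hypothetical, not kernel code): pick the period so
 * that `delta` events counted over an assumed `assumed_ns` nanoseconds
 * would produce `freq` samples per second.
 * Intermediate products must stay below 2^64.
 */
static uint64_t model_period(uint64_t delta, uint64_t assumed_ns,
                             uint64_t freq)
{
        assert(assumed_ns > 0 && freq > 0);
        return delta * NSEC_PER_SEC / (assumed_ns * freq);
}

/*
 * Sample rate that results when ticks really arrive every `actual_ns`
 * while the adjustment assumes they arrive every `assumed_ns`.
 */
static uint64_t model_sample_rate(uint64_t events_per_sec,
                                  uint64_t assumed_ns, uint64_t actual_ns,
                                  uint64_t freq)
{
        /* events actually counted between two adjustments */
        uint64_t delta = events_per_sec * actual_ns / NSEC_PER_SEC;
        uint64_t period = model_period(delta, assumed_ns, freq);

        assert(period > 0);
        return events_per_sec / period;
}
```

In this model, a 1 GHz event rate with matching assumed and actual tick intervals yields the requested 100 samples/sec, while ticks arriving at twice the assumed interval yield 50 samples/sec. Whether that is what happens on the Panda board is not established by the thread.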
From: stephane e. <er...@go...> - 2012-01-30 20:45:50
|
On Mon, Jan 30, 2012 at 8:14 PM, Will Deacon <wil...@ar...> wrote:
> On Mon, Jan 30, 2012 at 05:45:19PM +0000, stephane eranian wrote:
>> There you go, no attachment, not sure the omap list
>> supports this.
>
> Cheers Stephane.
>
>> There is something quite interesting to observe.
>>
>> While I run perf record -e cycles -F 100 noploop 10, I watch
>> /proc/interrupts. The number of interrupts is way lower than
>> expected. Therefore the number of samples is way too low:
>>
>> $ perf record -e cycles -F 100 noploop 10
>> $ perf report -D | tail -20
>> cycles stats:
>>            TOTAL events:        535
>>             MMAP events:         11
>>             COMM events:          2
>>             EXIT events:          2
>>           SAMPLE events:        520
>>
>> The delta in /proc/interrupts on CPU1 is 520 interrupts.
>
> Yes, that is about half of what you'd expect. Running on my A9 platform
> (vexpress) I get:
>
> $ perf record -e cycles -F 100 noploop 10
> $ perf report -D | tail -20
> cycles stats:
>            TOTAL events:       1007
>             MMAP events:         18
>             COMM events:          2
>             EXIT events:          2
>           SAMPLE events:        985
>
>> So looks like the frequency adjustment which is hooked off of the
>> timer tick is either not called at each timer tick, the timer ticks are
>> not at regular interval, or the math is wrong.
>
> My hunch is that that the interval is probably varying, but I don't know much
> about OMAP4 and its clocks.
>
Glad you tested this. At least it seems the generic perf_event code is
all right. I agree with you, something is fishy with the clocks.

Just out of curiosity, what is the HZ value for your board?
On my Panda it's 128Hz.

>> If I go with the fixed period mode:
>> $ perf stat -e cycles noploop 10
>> noploop for 10 seconds
>>  Performance counter stats for 'noploop 10':
>>        10079156960 cycles                    #    0.000 GHz
>>       10.004547117 seconds time elapsed
>>
>> That means, if I want 100 samples/sec: = 10079156960/(10*100)=10079157
>> $ perf record -e cycles -c 10079157 noploop 10
>> $ perf report -D | tail -20
>> cycles stats:
>>            TOTAL events:       1003
>>             MMAP events:         11
>>             COMM events:          2
>>             EXIT events:          2
>>         THROTTLE events:          1
>>       UNTHROTTLE events:          1
>>           SAMPLE events:        986
>>
>> Now, we're getting the right answer!
>
> Just to confirm, for me:
>
> $ perf stat -e cycles ./noploop 10
> noploop for 10 seconds
>
>  Performance counter stats for './noploop 10':
>
>         4001163930 cycles                    #    0.000 GHz
>
>       10.006534024 seconds time elapsed
>
> $ perf record -e cycles -c 4001163 ./noploop 10
> $ perf report -D | tail -20
> Aggregated stats:
>            TOTAL events:       1020
>             MMAP events:         18
>             COMM events:          2
>             EXIT events:          2
>           SAMPLE events:        998
> cycles stats:
>            TOTAL events:       1020
>             MMAP events:         18
>             COMM events:          2
>             EXIT events:          2
>           SAMPLE events:        998
>
> which is close enough :)
>
>> We need to elucidate what's going on in perf_event_task_tick().
>> I have tried with my throttling fix and it did not help. We are
>> not subject to throttling with such a low rate.
>
> Ok. I would start by looking at the clock ticks if I were you, since this
> seems to be alright on my board.
>
> Will
|
From: Ming L. <min...@ca...> - 2012-01-30 09:40:25
|
Hi,

On Mon, Jan 30, 2012 at 1:36 AM, stephane eranian <er...@go...> wrote:
> Hi,
>
> Ok, so I did a few more tests and there is a serious issue when sampling
> in frequency mode (the default). I noticed wrong number of samples, so
> I investigated this some more and instrumented the perf_event kernel code.
> I found some erratic timer ticks causing broken period adjustments.
>
> In fact, the problem is visible using top.
> I am running a noploop program on CPU0 and nothing else besides top.
> The noploop program does: for(;;);. That is 100% user. On a 2-way

Sometimes it is not 100% user, for example during irq/exception handling...

> system otherwise idle, I expect top to return 50% user 50% idle.
>
> Top with the commit:
>
> top - 16:19:21 up 5 min,  1 user,  load average: 0.23, 0.15, 0.07
> Tasks:  70 total,   2 running,  68 sleeping,   0 stopped,   0 zombie
> Cpu(s): 31.1%us,  2.0%sy,  0.0%ni, 66.2%id,  0.0%wa,  0.0%hi,  0.7%si,  0.0%st
>         ^^^^^^^^ That's WRONG

Did you reproduce the issue each time or just occasionally?

I don't see this issue on my board with 3.3-rc1 plus the 5 extra pmu/emu
patches.

top - 00:59:15 up 7 min,  1 user,  load average: 1.00, 0.73, 0.35
Tasks:  56 total,   2 running,  54 sleeping,   0 stopped,   0 zombie
Cpu(s): 42.6%us,  0.2%sy,  0.0%ni, 56.8%id,  0.0%wa,  0.0%hi,  0.4%si,  0.0%st
Mem:   1013560k total,    50960k used,   962600k free,     6272k buffers
Swap:        0k total,        0k used,        0k free,    29036k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1355 root      20   0  1460  260  216 R   99  0.0   5:07.38 busy
  532 root      20   0     0    0    0 S    0  0.0   0:00.23 kworker/1:1
 1356 root      20   0  2552 1120  916 R    0  0.1   0:01.93 top

>
> Mem:    940292k total,    74984k used,   865308k free,     8020k buffers
> Swap:   524240k total,        0k used,   524240k free,    37420k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  3770 eranian   20   0   644  160  128 R   99  0.0   0:14.21 noploop
>  3771 eranian   20   0  2184 1052  804 R    2  0.1   0:00.32 top
>     1 root      20   0  2564 1528  952 S    0  0.2   0:01.26 init
>
>
> I removed that one liner patch from Ming. The one fiddling with the
> clockdomains:
>
> --- a/arch/arm/mach-omap2/clockdomains44xx_data.c
> +++ b/arch/arm/mach-omap2/clockdomains44xx_data.c
> @@ -390,7 +390,7 @@ static struct clockdomain emu_sys_44xx_clkdm = {
>         .prcm_partition   = OMAP4430_PRM_PARTITION,
>         .cm_inst          = OMAP4430_PRM_EMU_CM_INST,
>         .clkdm_offs       = OMAP4430_PRM_EMU_CM_EMU_CDOFFS,
> -       .flags            = CLKDM_CAN_HWSUP,
> +       .flags            = CLKDM_CAN_SWSUP,

The patch should not affect the timer tick logic; all it does is
revert commit [1] with respect to the emu clock domain.

>
> When I rerun, the test, it now work:
>
> top - 16:02:51 up 15 min,  1 user,  load average: 1.02, 0.46, 0.21
> Tasks:  70 total,   2 running,  68 sleeping,   0 stopped,   0 zombie
> Cpu(s): 47.2%us,  1.0%sy,  0.0%ni, 50.8%id,  0.0%wa,  0.0%hi,  1.0%si,  0.0%st
>         ^^^^^^^^ close enough (in it stabilize somehow around 49%
> which is good)
>
> Mem:    940292k total,    75288k used,   865004k free,     8004k buffers
> Swap:   524240k total,        0k used,   524240k free,    37408k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  3771 eranian   20   0   644  160  128 R  100  0.0   0:34.44 noploop
>
> Although the patch fixes PMU interrupts, it breaks the timer tick logic somehow.
> The perf problem is related to timer tick.
>
> I am hoping that the tradeoff is not:
>    PMU interrupts but broken timer ticks
> vs.
>    No PMU interrupts but working timer ticks

[1], 3c50729b3fa1cd8ca1f347e6caf1081204cf1a7c
ARM: OMAP4: PM: Initialise all the clockdomains to supported states

thanks
--
Ming Lei
|
From: stephane e. <er...@go...> - 2012-01-30 10:25:08
|
Ok, let me try again with 3.3.0-rc1, that was with 3.2.0.
The only thing that changed was that one line and it made
a big difference.
|
From: stephane e. <er...@go...> - 2012-01-30 13:44:10
|
Same results for me with 3.3.0-rc1 + 5 patches.

top - 14:42:34 up 8 min,  1 user,  load average: 0.70, 0.29, 0.15
Tasks:  75 total,   2 running,  73 sleeping,   0 stopped,   0 zombie
Cpu(s): 32.9%us,  1.3%sy,  0.0%ni, 65.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:    940232k total,   118520k used,   821712k free,     8080k buffers
Swap:   524240k total,        0k used,   524240k free,    79432k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3868 eranian   20   0   644  160  128 R   99  0.0   0:53.34 noploop
 3870 eranian   20   0  2284 1060  804 R    3  0.1   0:00.63 top
    1 root      20   0  2564 1532  952 S    0  0.2   0:01.26 init

I am connecting to the board via ssh.
But the results don't look correct to me.
|
From: Ming L. <min...@ca...> - 2012-01-30 14:50:09
|
On Mon, Jan 30, 2012 at 9:43 PM, stephane eranian <er...@go...> wrote:
> Same results for me with 3.3.0-rc1 + 5 patches.

In fact, I think the only effect of the patch is to enable PMU
interrupt handling. Could that really cause so much difference?

Also, maybe you should run 'noploop' on CPU1; you may then observe
a more accurate result in 'top'.

On ARM, nearly all IRQs from the GIC are handled on CPU0 by default,
which may cause your issue.

>
>
> top - 14:42:34 up 8 min,  1 user,  load average: 0.70, 0.29, 0.15
> Tasks:  75 total,   2 running,  73 sleeping,   0 stopped,   0 zombie
> Cpu(s): 32.9%us,  1.3%sy,  0.0%ni, 65.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Mem:    940232k total,   118520k used,   821712k free,     8080k buffers
> Swap:   524240k total,        0k used,   524240k free,    79432k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  3868 eranian   20   0   644  160  128 R   99  0.0   0:53.34 noploop
>  3870 eranian   20   0  2284 1060  804 R    3  0.1   0:00.63 top
>     1 root      20   0  2564 1532  952 S    0  0.2   0:01.26 init
>
> I am connecting to the board via ssh.
> But the results don't look correct to me.

thanks,
--
Ming Lei
|
From: stephane e. <er...@go...> - 2012-01-30 16:02:17
|
Same result for me on CPU1:

top - 16:20:24 up  1:45,  1 user,  load average: 0.29, 0.08, 0.07
Tasks:  70 total,   2 running,  68 sleeping,   0 stopped,   0 zombie
Cpu(s): 30.7%us,  2.7%sy,  0.0%ni, 66.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:    940232k total,   228984k used,   711248k free,    82244k buffers
Swap:   524240k total,        0k used,   524240k free,    91400k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  P COMMAND
 3968 eranian   20   0   644  160  128 R  100  0.0   0:21.98 1 noploop
 3969 eranian   20   0  2184 1056  804 R    3  0.1   0:00.53 0 top
   82 root      20   0     0    0    0 S    1  0.0   0:01.35 0 kworker/0:1

With 3.3.0-rc1, if I revert the clockdomain patch, I get the same result.
So it must be coming from somewhere else, as you suggested.

If the processor were spending time processing interrupts, that would be
accounted for as sys time. But that's not what I observe here. It's either
idle or user. The Cpu(s) line leads me to believe that the processor can
only run my program for 30% of the time. The rest is spent idling even
though my program is non-blocking. How could that be possible? Power-saving?
|
From: Måns R. <ma...@ma...> - 2012-01-30 16:08:39
|
stephane eranian <er...@go...> writes:

> Same result for me on CPU1:
>
> top - 16:20:24 up  1:45,  1 user,  load average: 0.29, 0.08, 0.07
> Tasks:  70 total,   2 running,  68 sleeping,   0 stopped,   0 zombie
> Cpu(s): 30.7%us,  2.7%sy,  0.0%ni, 66.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Mem:    940232k total,   228984k used,   711248k free,    82244k buffers
> Swap:   524240k total,        0k used,   524240k free,    91400k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  P COMMAND
>  3968 eranian   20   0   644  160  128 R  100  0.0   0:21.98 1 noploop
>  3969 eranian   20   0  2184 1056  804 R    3  0.1   0:00.53 0 top
>    82 root      20   0     0    0    0 S    1  0.0   0:01.35 0
> kworker/0:1
>
> With 3.3.0-rc1, if I revert the clockdomain patch, I get the same result.
> So it must be coming from somewhere else, as you suggested.
>
> If the processor was spending time processing interrupts, then this would be
> accounted for in as sys time. But that's not what I observe here. It's either
> idle or user. That line, leads me to believe that the processor can only run
> my program for 30% of the time. The rest is spent idling even though my
> program is non-blocking. How could that be possible? Power-saving?

In top, press 1 to see the statistics for the CPUs separately.

--
Måns Rullgård
ma...@ma...
|
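The per-CPU view suggested above can also be computed by hand from the `cpuN` lines of /proc/stat. The following is a minimal sketch, assuming only the first four counters (user, nice, system, idle) are of interest; `struct cpu_sample`, `parse_cpu_line` and `user_percent` are hypothetical names, and ignoring iowait, irq, softirq and steal is a simplification.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* First four counters of one "cpuN" line of /proc/stat, in clock ticks. */
struct cpu_sample {
        unsigned long long user, nice, system, idle;
};

/*
 * Parse a per-CPU line such as "cpu1 100 0 0 900 ...".
 * Returns the CPU index, or -1 for anything else, including the
 * aggregate "cpu " line that sums all CPUs.
 */
static int parse_cpu_line(const char *line, struct cpu_sample *out)
{
        int cpu;

        if (strncmp(line, "cpu", 3) != 0 || !isdigit((unsigned char)line[3]))
                return -1;
        if (sscanf(line, "cpu%d %llu %llu %llu %llu", &cpu,
                   &out->user, &out->nice, &out->system, &out->idle) != 5)
                return -1;
        return cpu;
}

/* User-time share, in percent, between two samples of the same CPU. */
static double user_percent(const struct cpu_sample *before,
                           const struct cpu_sample *after)
{
        unsigned long long du = after->user - before->user;
        unsigned long long total = du
                + (after->nice - before->nice)
                + (after->system - before->system)
                + (after->idle - before->idle);

        return total ? 100.0 * (double)du / (double)total : 0.0;
}
```

Sampling the line for the CPU running noploop twice, a few seconds apart, and passing both samples to `user_percent` shows that CPU's own user share rather than the average over both CPUs.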
From: Will D. <wil...@ar...> - 2012-01-27 12:13:40
|
Hi guys,

On Sat, Jan 21, 2012 at 09:16:57AM +0000, stephane eranian wrote:
> On Sat, Jan 21, 2012 at 4:25 AM, Ming Lei <min...@ca...> wrote:
> > On Fri, Jan 20, 2012 at 9:47 PM, stephane eranian
> > <er...@go...> wrote:
> >> Started afresh from:
> >>
> >> 90a4c0f uml: fix compile for x86-64
> >>
> >> And added 3, 4, 5, 6:
> >> 603c316 arm: omap4: pmu: support runtime pm
> >> 4899fbd arm: omap4: support pmu
> >> d737bb1 arm: omap4: create pmu device via hwmod
> >> 4e0259e arm: omap4: hwmod: introduce emu hwmod
> >>
> >> Still no interrupts firing. I am using your .config file.
> >>
> >> My HW:
> >> CPU implementer : 0x41
> >> CPU architecture: 7
> >> CPU variant     : 0x1
> >> CPU part        : 0xc09
> >> CPU revision    : 2
> >>
> >> Hardware        : OMAP4 Panda board
> >> Revision        : 0020
> >>
> >> There must be something I am missing here.

Did this lead anywhere in the end? It seems as though Ming Lei has a working
setup but Stephane is unable to replicate it, despite applying the necessary
patches and trying an updated bootloader.

Drastic suggestion: Stephane, could you try a kernel *binary* from Ming Lei?
If that works then you're probably just missing a patch. If it doesn't, then
there must be something different between your boards.

Will
|