From: Weng M. <wen...@hu...> - 2013-12-20 08:11:27
|
From: Weng Meiling <wen...@hu...>

There is a situation where an event is triggered before oprofile_perf_start() finishes. Because the event is not yet stored in per_cpu(perf_events, cpu)[event], op_overflow_handler() prints its warning. During that window, if the unregistered event triggers again, the cpu prints the warning again. This can keep the cpu printing continuously and trigger a softlockup. So check in op_overflow_handler() whether event registration has finished.

The problem was once triggered on kernel 2.6.34, the main information:

<3>BUG: soft lockup - CPU#0 stuck for 60005ms! [opcontrol:8673]

Pid: 8673, comm: opcontrol
=====================SOFTLOCKUP INFO BEGIN=======================
[CPU#0] the task [opcontrol] is not waiting for a lock,maybe a delay or deadcricle!
<6>opcontrol R<c> running <c> 0 8673 7603 0x00000002
locked:
bf0e1928 mutex 0 [<bf0de0d8>] oprofile_start+0x10/0x68 [oprofile]
bf0e1a24 mutex 0 [<bf0e07f0>] op_arm_start+0x10/0x48 [oprofile]
c0628020 &ctx->mutex 0 [<c00af85c>] perf_event_create_kernel_counter+0xa4/0x14c
[<c00362b8>] (unwind_backtrace+0x0/0x164) from [<c0031db4>] (show_stack+0x10/0x14)
[<c0031db4>] (show_stack+0x10/0x14) from [<c008d964>] (show_lock_info+0x9c/0x168)
[<c008d964>] (show_lock_info+0x9c/0x168) from [<c008dbf4>] (softlockup_tick+0x1c4/0x234)
[<c008dbf4>] (softlockup_tick+0x1c4/0x234) from [<c0066d58>] (update_process_times+0x2c/0x50)
[<c0066d58>] (update_process_times+0x2c/0x50) from [<c00811cc>] (tick_sched_timer+0x268/0x2c4)
[<c00811cc>] (tick_sched_timer+0x268/0x2c4) from [<c0077340>] (__run_hrtimer+0x158/0x25c)
[<c0077340>] (__run_hrtimer+0x158/0x25c) from [<c0077e08>] (hrtimer_interrupt+0x13c/0x2f8)
[<c0077e08>] (hrtimer_interrupt+0x13c/0x2f8) from [<c003f82c>] (timer64_timer_interrupt+0x20/0x2c)
[<c003f82c>] (timer64_timer_interrupt+0x20/0x2c) from [<c008e54c>] (handle_IRQ_event+0x144/0x2ec)
[<c008e54c>] (handle_IRQ_event+0x144/0x2ec) from [<c00900dc>] (handle_level_irq+0xc0/0x13c)
[<c00900dc>] (handle_level_irq+0xc0/0x13c) from [<c002b080>] (asm_do_IRQ+0x80/0xbc)
[<c002b080>] (asm_do_IRQ+0x80/0xbc) from [<c0274b8c>] (__irq_svc+0x4c/0xe4)
Exception stack(0xc4099db8 to 0xc4099e00)
9da0: c0357538 00000000
9dc0: 00000000 c0380cc0 c4098000 00000202 00000028 c4098000 3fca9fbc c4098000
9de0: c0028b08 00000000 c4098000 c4099e00 c005eb50 c005e544 20000113 ffffffff
[<c0274b8c>] (__irq_svc+0x4c/0xe4) from [<c005e544>] (__do_softirq+0x64/0x25c)
[<c005e544>] (__do_softirq+0x64/0x25c) from [<c005eb50>] (irq_exit+0x48/0x5c)
[<c005eb50>] (irq_exit+0x48/0x5c) from [<c002b084>] (asm_do_IRQ+0x84/0xbc)
[<c002b084>] (asm_do_IRQ+0x84/0xbc) from [<c0274b8c>] (__irq_svc+0x4c/0xe4)
Exception stack(0xc4099e58 to 0xc4099ea0)
9e40: c0628010 20000093
9e60: 00000001 00000000 00000000 60000013 c00aff24 cc4f6c00 00000001 c4098000
9e80: 00000000 00000000 00000000 c4099ea0 c0084fa0 c0084fa4 60000013 ffffffff
[<c0274b8c>] (__irq_svc+0x4c/0xe4) from [<c0084fa4>] (smp_call_function_single+0xc0/0x1d8)
[<c0084fa4>] (smp_call_function_single+0xc0/0x1d8) from [<c00af86c>] (perf_event_create_kernel_counter+0xb4/0x14c)
[<c00af86c>] (perf_event_create_kernel_counter+0xb4/0x14c) from [<bf0e0700>] (op_perf_start+0x54/0xf0 [oprofile])
[<bf0e0700>] (op_perf_start+0x54/0xf0 [oprofile]) from [<bf0e0800>] (op_arm_start+0x20/0x48 [oprofile])
[<bf0e0800>] (op_arm_start+0x20/0x48 [oprofile]) from [<bf0de100>] (oprofile_start+0x38/0x68 [oprofile])
[<bf0de100>] (oprofile_start+0x38/0x68 [oprofile]) from [<bf0dfac0>] (enable_write+0x34/0x54 [oprofile])
[<bf0dfac0>] (enable_write+0x34/0x54 [oprofile]) from [<c00e5368>] (vfs_write+0xa8/0x150)
[<c00e5368>] (vfs_write+0xa8/0x150) from [<c00e5698>] (sys_write+0x3c/0x100)
[<c00e5698>] (sys_write+0x3c/0x100) from [<c002c500>] (ret_fast_syscall+0x0/0x30)
=====================SOFTLOCKUP INFO END=========================
<0>Kernel panic - not syncing: softlockup: hung tasks

Cc: <st...@vg...> # 2.6.34+
Signed-off-by: Weng Meiling <wen...@hu...>
---
 drivers/oprofile/oprofile_perf.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/oprofile/oprofile_perf.c b/drivers/oprofile/oprofile_perf.c
index d5b2732..a9e5761 100644
--- a/drivers/oprofile/oprofile_perf.c
+++ b/drivers/oprofile/oprofile_perf.c
@@ -38,6 +38,9 @@ static void op_overflow_handler(struct perf_event *event,
 	int id;
 	u32 cpu = smp_processor_id();
 
+	if (!oprofile_perf_enabled)
+		return;
+
 	for (id = 0; id < num_counters; ++id)
 		if (per_cpu(perf_events, cpu)[id] == event)
 			break;
--
1.8.3
|
From: Weng M. <wen...@hu...> - 2013-12-30 09:07:56
|
Hi Robert,

What do you think about this patch?

On 2013/12/20 15:49, Weng Meiling wrote:
> From: Weng Meiling <wen...@hu...>
>
> There is a situation event is triggered before oprofile_perf_start() finish.
> Because the event is still not stored in per_cpu(perf_events, cpu)[event],
> op_overflow_handler() will print the warning. During the time, if unregistered
> event is triggered again, the cpu will print again. This may make cpu keeping
> on printing and trigger softlockup. So check whether events register finished
> in op_overflow_handler().
>
> [soft-lockup report and patch quoted in full in the original message]
|
From: Robert R. <rr...@ke...> - 2014-01-13 08:46:09
|
Weng,

sorry for answering late, your mail hit the holiday season.

On 20.12.13 15:49:01, Weng Meiling wrote:
> From: Weng Meiling <wen...@hu...>
>
> There is a situation event is triggered before oprofile_perf_start() finish.
> Because the event is still not stored in per_cpu(perf_events, cpu)[event],
> op_overflow_handler() will print the warning. During the time, if unregistered
> event is triggered again, the cpu will print again. This may make cpu keeping
> on printing and trigger softlockup. So check whether events register finished
> in op_overflow_handler().
>
> The problem was once triggered on kernel 2.6.34, the main information:
> <3>BUG: soft lockup - CPU#0 stuck for 60005ms! [opcontrol:8673]
>
> Pid: 8673, comm: opcontrol
> =====================SOFTLOCKUP INFO BEGIN=======================
> [CPU#0] the task [opcontrol] is not waiting for a lock,maybe a delay or deadcricle!
> <6>opcontrol R<c> running <c> 0 8673 7603 0x00000002
> locked:
> bf0e1928 mutex 0 [<bf0de0d8>] oprofile_start+0x10/0x68 [oprofile]
> bf0e1a24 mutex 0 [<bf0e07f0>] op_arm_start+0x10/0x48 [oprofile]
> c0628020 &ctx->mutex 0 [<c00af85c>] perf_event_create_kernel_counter+0xa4/0x14c

I rather suspect the code of perf_install_in_context() in 2.6.34 to
cause the locking issue. There was a lot of rework in between. Can you
further explain the locking and why your fix should solve it?

It would be better to go through the bunch of fixes between 2.6.34 and
the current kernel. Or to use the latest kernel and/or operf if
possible.

See also below.

> [remainder of the back trace elided]
>
> Cc: <st...@vg...> # 2.6.34+
> Signed-off-by: Weng Meiling <wen...@hu...>
> ---
>  drivers/oprofile/oprofile_perf.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/oprofile/oprofile_perf.c b/drivers/oprofile/oprofile_perf.c
> index d5b2732..a9e5761 100644
> --- a/drivers/oprofile/oprofile_perf.c
> +++ b/drivers/oprofile/oprofile_perf.c
> @@ -38,6 +38,9 @@ static void op_overflow_handler(struct perf_event *event,
>  	int id;
>  	u32 cpu = smp_processor_id();
>  
> +	if (!oprofile_perf_enabled)
> +		return;
> +
>  	for (id = 0; id < num_counters; ++id)
>  		if (per_cpu(perf_events, cpu)[id] == event)
>  			break;

Your newly introduced check does basically the same as the existing
lookup, except that it now prevents printing the warning and thus stays
shorter in the interrupt handler. So it might work accidentally due to
different timing, but it does not solve the problem.

Using oprofile_perf_enabled would also require protection by
oprofile_perf_mutex. But op_overflow_handler() does not contain code
protected by either oprofile_perf_enabled or oprofile_perf_mutex, and
since the mutex can't be taken in the interrupt handler you also can't
use oprofile_perf_enabled there.

-Robert
|
From: Weng M. <wen...@hu...> - 2014-01-14 01:53:39
|
On 2014/1/13 16:45, Robert Richter wrote:
> Weng,
>
> sorry for answering late, your mail hit the holiday season.
>
> On 20.12.13 15:49:01, Weng Meiling wrote:
>> [patch description and soft-lockup report elided]
>
> I rather suspect the code of perf_install_in_context() of 2.6.34 to
> cause the locking issue. There was a lot of rework in between there.
> Can you further explain the locking and why your fix should solve it?

Thanks for your answer!

The lockup happens when the event's sample_period is small, which leads
to the cpu continuously printing the warning for the triggered
unregistered event. The thread context can't be executed, and that
triggers the softlockup.

As you said below, the patch is not appropriate: it just prevents
printing the warning and thus stays shorter in the interrupt handler,
but it can't solve the problem. The problem was once triggered on
kernel 2.6.34; I'll try to trigger it on a current kernel and resend a
correct patch.

Thanks!

Weng Meiling

> It would be better to go through the bunch of fixes between 2.6.34 and
> current kernel. Or, to use the latest kernel and/or operf if possible.
>
> See also below.
>
>> [back trace and patch elided]
>
> Your newly introduced check does basically the same as this existing
> check, except that it now prevents printing the warning and thus stays
> shorter in the interrupt handler. So it might work accidentally due to
> different timing and does not solve the problem.
>
> Using oprofile_perf_enabled would also require protection by the
> oprofile_perf_mutex. But op_overflow_handler() does not contain code
> protected either by oprofile_perf_enabled nor oprofile_perf_mutex.
> Since the mutex can't be used in the interrupt handler you also can't
> use oprofile_perf_enabled there.
>
> -Robert
|
From: Weng M. <wen...@hu...> - 2014-02-11 04:35:56
|
Hi Will,

>>> how userland can be notified about throttling. Throttling could be
>>> worth for operf too, not only for the oprofile kernel driver.
>>>
>>> From a quick look it seems there is also code in x86 that dynamically
>>> adjusts the rate which might be worth being implemented for ARM too.
>>
>> Are you referring to the perf_sample_event_took callback? If so, that
>> certainly looks worth persuing. I'll stick it on my list, thanks!

Is there any progress on this work? It is important for me. Sorry to
trouble you.

Thanks!

Weng Meiling
|
From: Will D. <wil...@ar...> - 2014-02-11 15:52:28
|
On Tue, Feb 11, 2014 at 04:33:51AM +0000, Weng Meiling wrote:
> Hi Will,

Hello,

> >>> how userland can be notified about throttling. Throttling could be
> >>> worth for operf too, not only for the oprofile kernel driver.
> >>>
> >>> From a quick look it seems there is also code in x86 that dynamically
> >>> adjusts the rate which might be worth being implemented for ARM too.
> >>
> >> Are you referring to the perf_sample_event_took callback? If so, that
> >> certainly looks worth persuing. I'll stick it on my list, thanks!
>
> Is there any progress on this work? Because this is important for me.
> Sorry for trouble you.

Oops, I totally forgot about this. Does the below patch work for you?

Will

--->8

diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index 361a1aaee7c8..a6bc431cde70 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -302,6 +302,8 @@ static irqreturn_t armpmu_dispatch_irq(int irq, void *dev)
 	struct arm_pmu *armpmu;
 	struct platform_device *plat_device;
 	struct arm_pmu_platdata *plat;
+	int ret;
+	u64 start_clock, finish_clock;
 
 	if (irq_is_percpu(irq))
 		dev = *(void **)dev;
@@ -309,10 +311,15 @@ static irqreturn_t armpmu_dispatch_irq(int irq, void *dev)
 	plat_device = armpmu->plat_device;
 	plat = dev_get_platdata(&plat_device->dev);
 
+	start_clock = sched_clock();
 	if (plat && plat->handle_irq)
-		return plat->handle_irq(irq, dev, armpmu->handle_irq);
+		ret = plat->handle_irq(irq, dev, armpmu->handle_irq);
 	else
-		return armpmu->handle_irq(irq, dev);
+		ret = armpmu->handle_irq(irq, dev);
+	finish_clock = sched_clock();
+
+	perf_sample_event_took(finish_clock - start_clock);
+	return ret;
 }
 
 static void
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 56003c6edfd3..6fcc293d77a4 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -237,6 +237,8 @@ void perf_sample_event_took(u64 sample_len_ns)
 	u64 local_samples_len;
 	u64 allowed_ns = ACCESS_ONCE(perf_sample_allowed_ns);
 
+	pr_info("perf_sample_event_took(%llu ns)\n", sample_len_ns);
+
 	if (allowed_ns == 0)
 		return;
|
From: Will D. <wil...@ar...> - 2014-02-11 18:05:29
|
On Tue, Feb 11, 2014 at 03:52:07PM +0000, Will Deacon wrote:
> On Tue, Feb 11, 2014 at 04:33:51AM +0000, Weng Meiling wrote:
> > Is there any progress on this work? Because this is important for me.
> > Sorry for trouble you.
>
> Oops, I totally forgot about this. Does the below patch work for you?

[...]

> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 56003c6edfd3..6fcc293d77a4 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -237,6 +237,8 @@ void perf_sample_event_took(u64 sample_len_ns)
>  	u64 local_samples_len;
>  	u64 allowed_ns = ACCESS_ONCE(perf_sample_allowed_ns);
>  
> +	pr_info("perf_sample_event_took(%llu ns)\n", sample_len_ns);
> +
>  	if (allowed_ns == 0)
>  		return;

Ignore this hunk, it was there as a debugging aid. Testing the other
half of the patch would be useful though!

Will
|
From: Robert R. <rr...@ke...> - 2014-01-14 15:06:06
|
On 14.01.14 09:52:11, Weng Meiling wrote:
> On 2014/1/13 16:45, Robert Richter wrote:
>> On 20.12.13 15:49:01, Weng Meiling wrote:
>>> [patch description and soft-lockup report elided]
>>
>> I rather suspect the code of perf_install_in_context() of 2.6.34 to
>> cause the locking issue. There was a lot of rework in between there.
>> Can you further explain the locking and why your fix should solve it?
>
> Thanks for your answer!
> The locking happens when the event's sample_period is small which leads to cpu
> keeping printing the warning for the triggered unregistered event. So the thread
> context can't be executed and trigger softlockup.
> As you said below, the patch is not appropriate, and the patch just
> prevents printing the warning and thus stays shorter in the interrupt handler,
> it can't solve the problem. The problem was once triggered on kernel 2.6.34, I'll
> try to trigger it in current kernel and resend a correct patch.

Weng,

so an interrupt storm due to warning messages causes the lockup.

I was looking further at it and wrote a patch that enables the event
only after it has been added to the perf_events list. This should fix
the spurious overflows and their warning messages. Could you reproduce
the issue with a mainline kernel and then test with the patch below
applied?

Thanks,

-Robert

From: Robert Richter <rr...@ke...>
Date: Tue, 14 Jan 2014 15:19:54 +0100
Subject: [PATCH] oprofile_perf

Signed-off-by: Robert Richter <rr...@ke...>
---
 drivers/oprofile/oprofile_perf.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/oprofile/oprofile_perf.c b/drivers/oprofile/oprofile_perf.c
index d5b2732..2b07c95 100644
--- a/drivers/oprofile/oprofile_perf.c
+++ b/drivers/oprofile/oprofile_perf.c
@@ -38,6 +38,9 @@ static void op_overflow_handler(struct perf_event *event,
 	int id;
 	u32 cpu = smp_processor_id();
 
+	/* sync perf_events with op_create_counter(): */
+	smp_rmb();
+
 	for (id = 0; id < num_counters; ++id)
 		if (per_cpu(perf_events, cpu)[id] == event)
 			break;
@@ -68,6 +71,7 @@ static void op_perf_setup(void)
 		attr->config = counter_config[i].event;
 		attr->sample_period = counter_config[i].count;
 		attr->pinned = 1;
+		attr->disabled = 1;
 	}
 }
 
@@ -94,6 +98,11 @@ static int op_create_counter(int cpu, int event)
 
 	per_cpu(perf_events, cpu)[event] = pevent;
 
+	/* sync perf_events with overflow handler: */
+	smp_wmb();
+
+	perf_event_enable(pevent);
+
 	return 0;
 }
--
1.8.4.2
|
From: Weng M. <wen...@hu...> - 2014-01-15 02:03:05
|
On 2014/1/14 23:05, Robert Richter wrote:
> [discussion elided]
>
> I was looking further at it and wrote a patch that enables the event
> after it was added to the perf_events list. This should fix spurious
> overflows and its warning messages. Could you reproduce the issue with
> a mainline kernel and then test with the patch below applied?

It's my pleasure. But one more question, please see below.

> @@ -94,6 +98,11 @@ static int op_create_counter(int cpu, int event)
>  
>  	per_cpu(perf_events, cpu)[event] = pevent;
>  
> +	/* sync perf_events with overflow handler: */
> +	smp_wmb();
> +
> +	perf_event_enable(pevent);
> +

Shouldn't this step go before the check for
pevent->state != PERF_EVENT_STATE_ACTIVE? Because attr->disabled is
true, pevent->state will not be PERF_EVENT_STATE_ACTIVE right after
perf_event_create_kernel_counter() returns.

>  	return 0;
>  }
|
From: Robert R. <rr...@ke...> - 2014-01-15 10:24:57
|
On 15.01.14 10:02:44, Weng Meiling wrote:
> On 2014/1/14 23:05, Robert Richter wrote:
>> @@ -94,6 +98,11 @@ static int op_create_counter(int cpu, int event)
>>  
>>  	per_cpu(perf_events, cpu)[event] = pevent;
>>  
>> +	/* sync perf_events with overflow handler: */
>> +	smp_wmb();
>> +
>> +	perf_event_enable(pevent);
>> +
>
> Should this step go before the if check:pevent->state != PERF_EVENT_STATE_ACTIVE ?
> Because the attr->disabled is true, So after the perf_event_create_kernel_counter
> the pevent->state is not PERF_EVENT_STATE_ACTIVE.

Right, the check is a problem. We need to move it after the event was
enabled. On error, we need to NULL the event, see below.

-Robert

---
 drivers/oprofile/oprofile_perf.c | 27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/drivers/oprofile/oprofile_perf.c b/drivers/oprofile/oprofile_perf.c
index d5b2732..9dfb236 100644
--- a/drivers/oprofile/oprofile_perf.c
+++ b/drivers/oprofile/oprofile_perf.c
@@ -38,6 +38,9 @@ static void op_overflow_handler(struct perf_event *event,
 	int id;
 	u32 cpu = smp_processor_id();
 
+	/* sync perf_events with op_create_counter(): */
+	smp_rmb();
+
 	for (id = 0; id < num_counters; ++id)
 		if (per_cpu(perf_events, cpu)[id] == event)
 			break;
@@ -68,6 +71,7 @@ static void op_perf_setup(void)
 		attr->config = counter_config[i].event;
 		attr->sample_period = counter_config[i].count;
 		attr->pinned = 1;
+		attr->disabled = 1;
 	}
 }
 
@@ -85,16 +89,23 @@ static int op_create_counter(int cpu, int event)
 	if (IS_ERR(pevent))
 		return PTR_ERR(pevent);
 
-	if (pevent->state != PERF_EVENT_STATE_ACTIVE) {
-		perf_event_release_kernel(pevent);
-		pr_warning("oprofile: failed to enable event %d "
-				"on CPU %d\n", event, cpu);
-		return -EBUSY;
-	}
-
 	per_cpu(perf_events, cpu)[event] = pevent;
 
-	return 0;
+	/* sync perf_events with overflow handler: */
+	smp_wmb();
+
+	perf_event_enable(pevent);
+
+	if (pevent->state == PERF_EVENT_STATE_ACTIVE)
+		return 0;
+
+	perf_event_release_kernel(pevent);
+	per_cpu(perf_events, cpu)[event] = NULL;
+
+	pr_warning("oprofile: failed to enable event %d on CPU %d\n",
+		event, cpu);
+
+	return -EBUSY;
 }
 
 static void op_destroy_counter(int cpu, int event)
--
1.8.4.2
|
From: Weng M. <wen...@hu...> - 2014-02-15 02:41:43
|
Hi Will,

I tested the kernel with this patch; the problem has been fixed. When the
event's sample_period is small, the CPU no longer stalls, apart from
printing the warning "oprofile: ignoring spurious overflow", which is
normal for an unregistered event.

So would you please send a formal one? :) Thanks very much!

On 2014/2/11 23:52, Will Deacon wrote:
> On Tue, Feb 11, 2014 at 04:33:51AM +0000, Weng Meiling wrote:
>> Hi Will,
>
> Hello,
>
>>>>> how userland can be notified about throttling. Throttling could be
>>>>> worthwhile for operf too, not only for the oprofile kernel driver.
>>>>>
>>>>> From a quick look it seems there is also code in x86 that dynamically
>>>>> adjusts the rate, which might be worth implementing for ARM too.
>>>>
>>>> Are you referring to the perf_sample_event_took callback? If so, that
>>>> certainly looks worth pursuing. I'll stick it on my list, thanks!
>>>>
>>
>> Is there any progress on this work? Because this is important for me.
>> Sorry to trouble you.
>
> Oops, I totally forgot about this. Does the below patch work for you?
>
> Will
>
> --->8
>
> diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
> index 361a1aaee7c8..a6bc431cde70 100644
> --- a/arch/arm/kernel/perf_event.c
> +++ b/arch/arm/kernel/perf_event.c
> @@ -302,6 +302,8 @@ static irqreturn_t armpmu_dispatch_irq(int irq, void *dev)
>  	struct arm_pmu *armpmu;
>  	struct platform_device *plat_device;
>  	struct arm_pmu_platdata *plat;
> +	int ret;
> +	u64 start_clock, finish_clock;
>
>  	if (irq_is_percpu(irq))
>  		dev = *(void **)dev;
> @@ -309,10 +311,15 @@ static irqreturn_t armpmu_dispatch_irq(int irq, void *dev)
>  	plat_device = armpmu->plat_device;
>  	plat = dev_get_platdata(&plat_device->dev);
>
> +	start_clock = sched_clock();
>  	if (plat && plat->handle_irq)
> -		return plat->handle_irq(irq, dev, armpmu->handle_irq);
> +		ret = plat->handle_irq(irq, dev, armpmu->handle_irq);
>  	else
> -		return armpmu->handle_irq(irq, dev);
> +		ret = armpmu->handle_irq(irq, dev);
> +	finish_clock = sched_clock();
> +
> +	perf_sample_event_took(finish_clock - start_clock);
> +	return ret;
>  }
>
>  static void
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 56003c6edfd3..6fcc293d77a4 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -237,6 +237,8 @@ void perf_sample_event_took(u64 sample_len_ns)
>  	u64 local_samples_len;
>  	u64 allowed_ns = ACCESS_ONCE(perf_sample_allowed_ns);
>
> +	pr_info("perf_sample_event_took(%llu ns)\n", sample_len_ns);
> +
>  	if (allowed_ns == 0)
>  		return;
>
|
From: Will D. <wil...@ar...> - 2014-02-17 10:08:44
|
On Sat, Feb 15, 2014 at 02:41:09AM +0000, Weng Meiling wrote:
> Hi Will,
>
> I tested the kernel with this patch; the problem has been fixed. When the
> event's sample_period is small, the CPU no longer stalls, apart from
> printing the warning "oprofile: ignoring spurious overflow", which is
> normal for an unregistered event.
>
> So would you please send a formal one? :) Thanks very much!

It's already in -next.

Will
|
From: Weng M. <wen...@hu...> - 2014-02-17 11:41:01
|
On 2014/2/17 18:08, Will Deacon wrote:
> On Sat, Feb 15, 2014 at 02:41:09AM +0000, Weng Meiling wrote:
>> Hi Will,
>>
>> I tested the kernel with this patch; the problem has been fixed. When the
>> event's sample_period is small, the CPU no longer stalls, apart from
>> printing the warning "oprofile: ignoring spurious overflow", which is
>> normal for an unregistered event.
>>
>> So would you please send a formal one? :) Thanks very much!
>
> It's already in -next.
>
> Will

OK, thanks!
|
From: Robert R. <rr...@ke...> - 2014-01-16 11:52:56
|
(cc'ing Will)

Weng, thanks for testing.

On 16.01.14 17:33:04, Weng Meiling wrote:
> Using the same test case, the problem also exists in the same kernel
> with the new patch applied:
>
> # opcontrol --start
>
> Using 2.6+ OProfile kernel interface.
> Using log file /var/lib/oprofile/samples/oprofiled.log
> Daemon started.
> [  508.456878] INFO: rcu_sched self-detected stall on CPU { 0} (t=2100 jiffies g=685 c=684 q=83)
> [  571.496856] INFO: rcu_sched self-detected stall on CPU { 0} (t=8404 jiffies g=685 c=684 q=83)
> [  634.526855] INFO: rcu_sched self-detected stall on CPU { 0} (t=14707 jiffies g=685 c=684 q=83)

Yes, the patch does not prevent an interrupt storm. The same happened on
x86, and was solved there, too, by limiting the minimum cycle period,
since the kernel was not able to ratelimit.

> ARM: events: increase minimum cycle period to 100k
>
> -event:0xFF counters:0 um:zero minimum:500 name:CPU_CYCLES : CPU cycle
> +event:0xFF counters:0 um:zero minimum:100000 name:CPU_CYCLES : CPU cycle

However, an arbitrary hardcoded value might not fit all kinds of CPUs,
especially on ARM where the variety is high. It also looks like there is
no way other than patching the events file to force values lower than the
minimum on CPUs where this might be necessary.

The problem of too-low sample periods could be solved on ARM by using
perf's interrupt throttling; you might play around with:

 /proc/sys/kernel/perf_event_max_sample_rate:100000

I am not quite sure whether this works, especially for kernel counters,
nor how userland can be notified about throttling. Throttling could be
worthwhile for operf too, not only for the oprofile kernel driver.

From a quick look it seems there is also code in x86 that dynamically
adjusts the rate, which might be worth implementing for ARM too.

-Robert
|
From: Weng M. <wen...@hu...> - 2014-01-16 01:09:53
|
On 2014/1/15 18:24, Robert Richter wrote:
> On 15.01.14 10:02:44, Weng Meiling wrote:
>> On 2014/1/14 23:05, Robert Richter wrote:
>>> @@ -94,6 +98,11 @@ static int op_create_counter(int cpu, int event)
>>>
>>>  	per_cpu(perf_events, cpu)[event] = pevent;
>>>
>>> +	/* sync perf_events with overflow handler: */
>>> +	smp_wmb();
>>> +
>>> +	perf_event_enable(pevent);
>>> +
>>
>> Should this step go before the check pevent->state != PERF_EVENT_STATE_ACTIVE?
>> Because attr->disabled is true, so after perf_event_create_kernel_counter()
>> pevent->state is not PERF_EVENT_STATE_ACTIVE.
>
> Right, the check is a problem. We need to move it after the event was
> enabled. On error, we need to NULL the event, see below.
>
> -Robert
>
> [...]

OK, I'll test the patch, and send the result as soon as possible.
|
From: Weng M. <wen...@hu...> - 2014-01-16 09:34:18
|
Hi Robert,

The test case which triggered the problem on kernel 2.6.34 is:

 opcontrol --init
 opcontrol --no-vmlinux
 opcontrol --event=CPU_CYCLES:500:0:1:1
 opcontrol --start

Running the test case on a Linux 3.13-rc1 kernel, the last step
"opcontrol --start" stalls, cannot be killed, and prints the following
messages:

 # opcontrol --start
 Using 2.6+ OProfile kernel interface.
 Using log file /var/lib/oprofile/samples/oprofiled.log
 Daemon started.
 [  864.450714] INFO: rcu_sched self-detected stall on CPU { 0} (t=2100 jiffies g=176 c=175 q=73)

Then the watchdog reboots the OS, because the environment causing the
kernel 2.6.34 warning message has gone. We use a pandaboard to build up
the test environment now, but the dmesg is lost after the reboot, so I
can't get the dmesg messages.

Using the same test case, the problem also exists in the same kernel with
the new patch applied:

 # opcontrol --start
 Using 2.6+ OProfile kernel interface.
 Using log file /var/lib/oprofile/samples/oprofiled.log
 Daemon started.
 [  508.456878] INFO: rcu_sched self-detected stall on CPU { 0} (t=2100 jiffies g=685 c=684 q=83)
 [  571.496856] INFO: rcu_sched self-detected stall on CPU { 0} (t=8404 jiffies g=685 c=684 q=83)
 [  634.526855] INFO: rcu_sched self-detected stall on CPU { 0} (t=14707 jiffies g=685 c=684 q=83)

During the test, I went through the code again. One thing confuses me:
when the event's sample_period is small, even with the event enabled only
after it is added to the perf_events list, the CPU won't keep printing the
warning but will instead take the other branch into oprofile_add_sample(),
so the interrupt storm continues anyway. In that case, it may be best to
solve the problem from userland, as the oprofile tools do: the tool
version must be below v0.9.9, or the problem can't be reproduced, because
v0.9.9 added the following patch, which increases the minimum cycle
period:

 ARM: events: increase minimum cycle period to 100k

 On ARM, we intentionally leave the minimum event counters low since the
 performance profile of the cores can vary dramatically between CPUs and
 their implementations. However, since the default event is CPU_CYCLES,
 it's best to err on the side of caution and raise the limit to something
 more realistic so we don't lock-up on the unsuspecting user (as opposed
 to somebody passing an explicit event period). This patch raises the
 CPU_CYCLES minimum event count to 100k on ARM.

 Signed-off-by: Will Deacon <wil...@ar...>
 Authored by: Will Deacon, 2013-07-29; committed by Maynard Johnson, 2013-07-29.

 --- a/events/arm/armv7-common/events
 +++ b/events/arm/armv7-common/events
 @@ -33,4 +33,4 @@
  event:0x1C counters:1,2,3,4,5,6 um:zero minimum:500 name:TTBR_WRITE_RETIRED : Write to TTBR architecturally executed, condition code pass
  event:0x1D counters:1,2,3,4,5,6 um:zero minimum:500 name:BUS_CYCLES : Bus cycle
 -event:0xFF counters:0 um:zero minimum:500 name:CPU_CYCLES : CPU cycle
 +event:0xFF counters:0 um:zero minimum:100000 name:CPU_CYCLES : CPU cycle

This is just my personal opinion; maybe there is something wrong. What
do you think about it?

On 2014/1/16 9:09, Weng Meiling wrote:
> On 2014/1/15 18:24, Robert Richter wrote:
>> Right, the check is a problem. We need to move it after the event was
>> enabled. On error, we need to NULL the event, see below.
>>
>> [...]
>
> OK, I'll test the patch, and send the result as soon as possible.
|
From: Will D. <wil...@ar...> - 2014-01-16 19:37:15
|
On Thu, Jan 16, 2014 at 11:52:45AM +0000, Robert Richter wrote:
> (cc'ing Will)

Thanks Robert,

> On 16.01.14 17:33:04, Weng Meiling wrote:
>> Using the same test case, the problem also exists in the same kernel
>> with the new patch applied:
>>
>> # opcontrol --start
>>
>> Using 2.6+ OProfile kernel interface.
>> Using log file /var/lib/oprofile/samples/oprofiled.log
>> Daemon started.
>> [  508.456878] INFO: rcu_sched self-detected stall on CPU { 0} (t=2100 jiffies g=685 c=684 q=83)
>> [  571.496856] INFO: rcu_sched self-detected stall on CPU { 0} (t=8404 jiffies g=685 c=684 q=83)
>> [  634.526855] INFO: rcu_sched self-detected stall on CPU { 0} (t=14707 jiffies g=685 c=684 q=83)
>
> Yes, the patch does not prevent an interrupt storm. The same happened on
> x86, and was solved there, too, by limiting the minimum cycle period,
> since the kernel was not able to ratelimit.
>
>> ARM: events: increase minimum cycle period to 100k
>>
>> -event:0xFF counters:0 um:zero minimum:500 name:CPU_CYCLES : CPU cycle
>> +event:0xFF counters:0 um:zero minimum:100000 name:CPU_CYCLES : CPU cycle
>
> However, an arbitrary hardcoded value might not fit all kinds of CPUs,
> especially on ARM where the variety is high. It also looks like there is
> no way other than patching the events file to force values lower than
> the minimum on CPUs where this might be necessary.

Yeah, it's pretty much impossible to pick a one-size-fits-all value for
ARM.

> The problem of too-low sample periods could be solved on ARM by using
> perf's interrupt throttling; you might play around with:
>
>  /proc/sys/kernel/perf_event_max_sample_rate:100000
>
> I am not quite sure whether this works, especially for kernel counters,
> nor how userland can be notified about throttling. Throttling could be
> worthwhile for operf too, not only for the oprofile kernel driver.
>
> From a quick look it seems there is also code in x86 that dynamically
> adjusts the rate, which might be worth implementing for ARM too.

Are you referring to the perf_sample_event_took callback? If so, that
certainly looks worth pursuing. I'll stick it on my list, thanks!

Will
|
From: Weng M. <wen...@hu...> - 2014-01-17 03:38:23
|
On 2014/1/17 3:36, Will Deacon wrote:
> On Thu, Jan 16, 2014 at 11:52:45AM +0000, Robert Richter wrote:
>> (cc'ing Will)
>
> Thanks Robert,
>
>> The problem of too-low sample periods could be solved on ARM by using
>> perf's interrupt throttling; you might play around with:
>>
>>  /proc/sys/kernel/perf_event_max_sample_rate:100000
>>
>> I am not quite sure whether this works, especially for kernel counters,

I tried lowering the value of perf_event_max_sample_rate, and it works. I
tested the following values: 100, 500, 1000, 5000, 10000 and 50000; with
the last value the command starts to stall. Just a simple test. :)

>> nor how userland can be notified about throttling. Throttling could be
>> worthwhile for operf too, not only for the oprofile kernel driver.
>>
>> From a quick look it seems there is also code in x86 that dynamically
>> adjusts the rate, which might be worth implementing for ARM too.
>
> Are you referring to the perf_sample_event_took callback? If so, that
> certainly looks worth pursuing. I'll stick it on my list, thanks!

Thanks Will for doing this.

Thanks,
Weng Meiling
|