From: Richard W. <ri...@no...> - 2014-03-16 15:10:08
On 14.03.2014 15:57, Thomas Meyer wrote:
> On Wednesday, 06.11.2013, 20:59 +0100, Richard Weinberger wrote:
>> On 06.11.2013 20:52, Thomas Meyer wrote:
>>> On Wednesday, 06.11.2013, 13:40 +0100, Richard Weinberger wrote:
>>>> On Tue, Nov 5, 2013 at 9:21 PM, Thomas Meyer <th...@m3...> wrote:
>>>>> Hi,
>>>>>
>>>>> I'm running Fedora 20 inside a 3.12 UML kernel and the "yum upgrade -y"
>>>>> command seems to get stuck after a while/few minutes.
>>>>>
>>>>> Any ideas what's going on here? How to debug this?
>>>>>
>>>>> It looks like the process running yum is in state ptrace stopped, but
>>>>> doesn't continue.
>>>>
>>>> Got only yum stuck or the whole UML kernel?
>>>
>
> only some processes get stuck.
>
> After enabling hung task detection in the kernel I see this in the logs:
>
> [ 8040.100000] INFO: task jbd2/ubda-8:308 blocked for more than 120 seconds.
> [ 8040.100000] Not tainted 3.13.6 #24
> [ 8040.100000] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 8040.100000] jbd2/ubda-8 D 0000000040aa1c83 0 308 2 0x00000000
> [ 8040.100000] Stack:
> [ 8040.100000] 600795a0 603f3120 b066db70 6006476d
> [ 8040.100000] 1b066dbb0 b0631040 b066db80 6001c0a3
> [ 8040.100000] 600795a0 b066b380 b066dbe0 60314b4e
> [ 8040.100000] Call Trace:
> [ 8040.100000] [<600795a0>] ? rcu_sched_qs+0x0/0xd0
> [ 8040.100000] [<6006476d>] ? dequeue_task+0x1d/0x50
> [ 8040.100000] [<6001c0a3>] __switch_to+0x53/0x90
> [ 8040.100000] [<600795a0>] ? rcu_sched_qs+0x0/0xd0
> [ 8040.100000] [<60314b4e>] __schedule+0x1ae/0x470
> [ 8040.100000] [<60068900>] ? pick_next_task_fair+0x0/0x1a0
> [ 8040.100000] [<6006aa70>] ? prepare_to_wait+0x0/0x90
> [ 8040.100000] [<60314e43>] schedule+0x33/0x80
> [ 8040.100000] [<6020b9d4>] ? submit_bio+0xa4/0x1c0
> [ 8040.100000] [<60315030>] io_schedule+0x60/0x90
> [ 8040.100000] [<6010b7b0>] sleep_on_buffer+0x10/0x20
> [ 8040.100000] [<603153a3>] __wait_on_bit+0x63/0xa0
> [ 8040.100000] [<6010b7a0>] ? sleep_on_buffer+0x0/0x20
> [ 8040.100000] [<6010b7a0>] ? sleep_on_buffer+0x0/0x20
> [ 8040.100000] [<60315466>] out_of_line_wait_on_bit+0x86/0xa0
> [ 8040.100000] [<6010e312>] ? submit_bh+0x12/0x20
> [ 8040.100000] [<6006aed0>] ? wake_bit_function+0x0/0x40
> [ 8040.100000] [<600645e3>] ? __might_sleep+0x153/0x170
> [ 8040.100000] [<60064490>] ? __might_sleep+0x0/0x170
> [ 8040.100000] [<6010b890>] ? __wait_on_buffer+0x0/0x40
> [ 8040.100000] [<6010b8c4>] __wait_on_buffer+0x34/0x40
> [ 8040.100000] [<6019a166>] jbd2_journal_commit_transaction+0x1696/0x19a0
> [ 8040.100000] [<60315060>] ? _cond_resched+0x0/0x50
> [ 8040.100000] [<6010d200>] ? __brelse+0x0/0x30
> [ 8040.100000] [<600330c0>] ? block_signals+0x0/0x20
> [ 8040.100000] [<6006abff>] ? finish_wait+0x6f/0x90
> [ 8040.100000] [<6006a950>] ? __wake_up+0x0/0x70
> [ 8040.100000] [<600487c0>] ? del_timer+0x0/0x60
> [ 8040.100000] [<6006a950>] ? __wake_up+0x0/0x70
> [ 8040.100000] [<6019dd5f>] kjournald2+0xdf/0x2d0
> [ 8040.100000] [<60068900>] ? pick_next_task_fair+0x0/0x1a0
> [ 8040.100000] [<6006ae90>] ? autoremove_wake_function+0x0/0x40
> [ 8040.100000] [<6019dc80>] ? kjournald2+0x0/0x2d0
> [ 8040.100000] [<6006a670>] ? __init_waitqueue_head+0x0/0x10
> [ 8040.100000] [<6019dc80>] ? kjournald2+0x0/0x2d0
> [ 8040.100000] [<6006a670>] ? __init_waitqueue_head+0x0/0x10
> [ 8040.100000] [<6005ce44>] kthread+0x114/0x140
> [ 8040.100000] [<6006442d>] ? finish_task_switch.isra.72+0x2d/0x90
> [ 8040.100000] [<600656a2>] ? schedule_tail+0x22/0xd0
> [ 8040.100000] [<6001be21>] new_thread_handler+0x81/0xb0
> [ 8040.100000]
> [ 8040.100000] INFO: task yum:1082 blocked for more than 120 seconds.
> [ 8040.100000] Not tainted 3.13.6 #24
> [ 8040.100000] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 8040.100000] yum D 0000000040aa1c83 0 1082 1073 0x00000000
> [ 8040.100000] Stack:
> [ 8040.100000] 600795a0 603f3120 aae3fc80 6006476d
> [ 8040.100000] 162b83078 b0631040 aae3fc90 6001c0a3
> [ 8040.100000] 600795a0 aadeab40 aae3fcf0 60314b4e
> [ 8040.100000] Call Trace:
> [ 8040.100000] [<600795a0>] ? rcu_sched_qs+0x0/0xd0
> [ 8040.100000] [<6006476d>] ? dequeue_task+0x1d/0x50
> [ 8040.100000] [<6001c0a3>] __switch_to+0x53/0x90
> [ 8040.100000] [<600795a0>] ? rcu_sched_qs+0x0/0xd0
> [ 8040.100000] [<60314b4e>] __schedule+0x1ae/0x470
> [ 8040.100000] [<60068900>] ? pick_next_task_fair+0x0/0x1a0
> [ 8040.100000] [<6003344f>] ? set_signals+0x3f/0x50
> [ 8040.100000] [<6006ace0>] ? prepare_to_wait_event+0x0/0x110
> [ 8040.100000] [<60314e10>] ? schedule+0x0/0x80
> [ 8040.100000] [<60314e43>] schedule+0x33/0x80
> [ 8040.100000] [<6006ace0>] ? prepare_to_wait_event+0x0/0x110
> [ 8040.100000] [<6019d3a3>] jbd2_log_wait_commit+0xa3/0x110
> [ 8040.100000] [<6006ae90>] ? autoremove_wake_function+0x0/0x40
> [ 8040.100000] [<6019f228>] jbd2_complete_transaction+0x48/0x90
> [ 8040.100000] [<6015162b>] ext4_sync_file+0x28b/0x320
> [ 8040.100000] [<600dba2a>] ? vfs_read+0x13a/0x190
> [ 8040.100000] [<6010a133>] do_fsync+0x53/0x80
> [ 8040.100000] [<600db73d>] ? SyS_lseek+0x7d/0x90
> [ 8040.100000] [<6010a442>] SyS_fsync+0x12/0x20
> [ 8040.100000] [<6001fa68>] handle_syscall+0x68/0x90
> [ 8040.100000] [<600370fd>] userspace+0x4fd/0x600
> [ 8040.100000] [<60031e0f>] ? save_registers+0x1f/0x40
> [ 8040.100000] [<6003a2c7>] ? arch_prctl+0x177/0x1b0
> [ 8040.100000] [<6001bed5>] fork_handler+0x85/0x90
> [ 8040.100000]
>
> any ideas? some synchronisation error in ext4?

Hmm, maybe you suffer from the same issue this patch tries to address:
https://lkml.org/lkml/2014/2/14/733

Thanks,
//richard