From: Riccardo M. <ric...@gm...> - 2012-05-09 12:18:50
Hi Richard,

On Wed, May 9, 2012 at 12:06 PM, richard -rw- weinberger <ric...@gm...> wrote:
>> we're having issues with a UML machine that refuses to shut down: the
>> halt sequence commences and then we get the following kernel error
>> message and backtrace:
>>
>> Asking all remaining processes to terminate...
>> INFO: task killall5:5269 blocked for more than 120 seconds.
>> [ 1080.320000] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>> disables this message.
>> [ 1080.320000] killall5 D 00000000405070c3 0 5269
>> 5266 0x00000000
>> [ 1080.320000] 93d84c08 9354d288 9354ce38 6063e550 9200bc28
>> 6001a7f3 93d84780 92008000
>> [ 1080.320000] 92008000 93d84780 9200bc88 604345ad 9200bc48
>> 9354ce00 9200bc88 92008000
>> [ 1080.320000] 00000001 93d84780 93588060 00000001
>> ffffffffffffffff 00000000 9200bce8 604357cd
>> [ 1080.320000] Call Trace:
>> [ 1080.320000] 9200bc00: [<6001a7f3>] _switch_to+0x5e/0xae
>> [ 1080.320000] 9200bc30: [<604345ad>] schedule+0x240/0x27a
>> [ 1080.320000] 9200bc90: [<604357cd>] rwsem_down_failed_common+0xb8/0xd8
>> [ 1080.320000] 9200bcf0: [<60435814>] rwsem_down_read_failed+0x12/0x14
>> [ 1080.320000] 9200bd00: [<6002ecec>] call_rwsem_down_read_failed+0x14/0x24
>> [ 1080.320000] 9200bd48: [<60434fc2>] down_read+0x11/0x13
>> [ 1080.320000] 9200bd58: [<6007a6ec>] access_process_vm+0x45/0x147
>> [ 1080.320000] 9200bdc8: [<600cdd13>] proc_pid_cmdline+0x65/0xf9
>> [ 1080.320000] 9200be18: [<600cf298>] proc_info_read+0x68/0xcb
>> [ 1080.320000] 9200be68: [<6008f2c7>] vfs_read+0xa7/0x155
>> [ 1080.320000] 9200bea8: [<6008f42e>] sys_read+0x45/0x6c
>> [ 1080.320000] 9200bee8: [<6001cd9c>] handle_syscall+0x58/0x70
>> [ 1080.320000] 9200bf08: [<6002c00f>] userspace+0x2d4/0x381
>> [ 1080.320000] 9200bfc8: [<6001a6f3>] fork_handler+0x62/0x69
>> [ 1080.320000]
>>
>> The error repeats every 120 seconds, and the machine never shuts down.
>
> Can you please try a recent kernel?
> Maybe you need this commit:
> 3a3679078aed2c451ebc32836bbd3b8219a65e01 (um: Use RWSEM_GENERIC_SPINLOCK on x86)
>

We're trying 3.2.16 right now (also from devloop.org.uk); it might take
some time before we are able to confirm or disprove, since the bug does
not occur deterministically but only on a certain percentage of the runs.

Thanks,
Riccardo