#5 fault causes hang of process and /proc parsing utils

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2008-04-11
Created: 2008-04-11
Creator: Anonymous
Private: No

I'm seeing occasional problems where a process hangs. Once that happens, ps -ef also hangs, while ps -e still works.

Looking in /proc for the process, ls works within the process's directory but ls -l hangs.
/proc/pid/exe appears to be the problem.

kill, kill -9, and gdb each hang in turn.
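
For anyone trying to spot the stuck tasks without triggering another hang, here is a minimal sketch that lists tasks in uninterruptible sleep (state D). It assumes that /proc/<pid>/stat stays readable, which the fact that ps -e still works suggests but does not prove. The function name list_blocked_tasks and its optional argument are made up for illustration.

```shell
#!/bin/sh
# list_blocked_tasks [PROC_ROOT]   (hypothetical helper, not part of any tool)
# Prints "pid state comm" for every task whose state is D.
# It reads only <pid>/stat and never touches <pid>/exe or <pid>/cmdline.
list_blocked_tasks() {
    proc_root="${1:-/proc}"
    for stat_file in "$proc_root"/[0-9]*/stat; do
        [ -r "$stat_file" ] || continue
        line=$(cat "$stat_file" 2>/dev/null) || continue
        # Format is "pid (comm) state ..."; comm may itself contain spaces
        # or parentheses, so the state is taken after the LAST ") ".
        pid=${line%% *}
        rest=${line##*\) }
        state=${rest%% *}
        comm=${line#*\(}
        comm=${comm%\)*}
        if [ "$state" = "D" ]; then
            printf '%s %s %s\n' "$pid" "$state" "$comm"
        fi
    done
}
```

Calling list_blocked_tasks with no argument scans /proc; passing a directory makes it scan a copy or a test fixture instead.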

./java -version

Message from syslogd@ at Fri Apr 11 14:25:57 2008 ...
paradox kernel: ------------[ cut here ]------------
paradox kernel: invalid opcode: 0000 [#5] SMP
paradox kernel: Process java (pid: 28666, ti=f1d46000 task=f14fb330 task.ti=f1d46000)
paradox kernel: Stack: 00000001 00000296 c042359e 00000000 00000000 00000003 c0721c2c 00000000
paradox kernel: c078c580 f35c5f98 c04dc4d2 00000000 c1fa2fe0 c1fa2fe0 c045f7e4 00000002
paradox kernel: 00000044 f319e374 f1d46f04 fffb2000 00000002 00000000 001280d2 c071f5cc
paradox kernel: Call Trace:
paradox kernel: [<c042359e>] __wake_up+0x32/0x42
paradox kernel: [<c04dc4d2>] aufs_fault+0x68/0x1f0
paradox kernel: [<c045f7e4>] get_page_from_freelist+0x2b5/0x328
paradox kernel: [<c0465f24>] __do_fault+0x51/0x32c
paradox kernel: [<c0467e16>] handle_mm_fault+0x2be/0x5ef
paradox kernel: [<c065233f>] do_page_fault+0x204/0x583
paradox kernel: [<c065213b>] do_page_fault+0x0/0x583
paradox kernel: [<c0650eda>] error_code+0x72/0x78
paradox kernel: [<c0650000>] __mutex_lock_interruptible_slowpath+0x60/0x9e
paradox kernel: =======================
paradox kernel: Code: 24 0c c7 44 24 08 39 00 00 00 c7 44 24 04 dd 65 66 c0 89 54 24 10 c7 04 24 86 96 6c c0 e8 4b 2b f4 ff 81 7b 34 73 66 75 61 74 04 <0f> 0b eb fe 89 d8 e8 43 f0 fd ff 89 c3 8d 80 a8 00 00 00 e8 ec
paradox kernel: EIP: [<c04e962a>] au_robr_safe_file+0x61/0x123 SS:ESP 0068:f1d46e68

After printing this output, the terminal hung.
The pid listed in the output no longer seems to exist, but it is the same executable as the hung process that's locking up ps -ef and various other utils. I don't know if that's significant.
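
As a possible aid to reading the Code: line, the bytes just before the faulting <0f> 0b are 81 7b 34 73 66 75 61 74 04. On 32-bit x86 that looks like a compare of a 32-bit immediate against a field at offset 0x34, followed by a short conditional jump, and 0f 0b is the ud2 instruction, which matches the "invalid opcode" report. The sketch below only decodes the four immediate bytes; reading the result as a failed magic-number assertion in au_robr_safe_file is an assumption, not something the log states. The helper name decode_le32 is made up for illustration.

```shell
#!/bin/sh
# decode_le32 B0 B1 B2 B3   (hypothetical helper)
# Takes four hex bytes in memory order (little-endian) and prints the
# 32-bit value in hex, then the same bytes as ASCII, most significant first.
decode_le32() {
    printf '0x%s ' "$4$3$2$1"
    for b in "$4" "$3" "$2" "$1"; do
        # Turn the hex byte into a 3-digit octal escape that printf can emit.
        printf "\\$(printf '%03o' "0x$b")"
    done
    printf '\n'
}

# The immediate operand taken from the Code: line above.
decode_le32 73 66 75 61
```

The ASCII form of the immediate may help when searching the aufs source for the check that fired.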

/etc/mtab shows:
none composite_path aufs rw,xino=overlay_path/.aufs.xino,br:overlay_path=rw:read_only_path=ro 0 0

There was also a bind remount of the composite path to another location on the local filesystem (with the bind mount done after the aufs mount).

This seems to happen every couple of days, so if there's any specific information you'd like, I can probably grab it. I'm running kernel 2.6.24.4 with a CVS extract of aufs from 09.04.2008.

Cheers,

George
hickeng@uk.ibm.com
