[SSI-devel] unixnm.c oops
From: Roger T. <rog...@gm...> - 2005-04-21 03:06:35
I'm using SSI-1.2.2-Lustre-1.2.4 with Laura's IPC patch and crashed the whole cluster today. My root filesystem is not NFS exported, but it seems that whenever the whole cluster crashes, the CFS hard mount needs a manual fsck even though it is a journaling fs. This is a UP kernel for P3/Coppermine with highmem.

The initnode crashed so badly that I didn't get any response on the console. The failover node did get into kdb, however. Below is what I have from the failover node; I haven't rebooted it yet.

-Roger

The following is from node 2.

kdb> dmesg
<3>unixnmsvr_put: entry not found for node 2 and ino 33556058
<4>------------[ cut here ]------------
<4>kernel BUG at unixnm.c:696!
<4>invalid operand: 0000
<4>tun loop cls_u32 sch_sfq sch_htb softdog nfsd ip_vs_sed ipt_REJECT ipt_multiport ipt_state ip_conntrack ipt_TCPMSS iptable_filter ip_tables microcode ide-cd s
<4>CPU: 0
<4>EIP: 0060:[<c020bb58>] Not tainted
<4>EFLAGS: 00210246
<4>
<4>EIP is at unixnm_put [kernel] 0x188 (2.4.22-1.2199.nptl_ssi_9up)
<4>eax: 00000000 ebx: c0564768 ecx: 00000001 edx: d52b1f68
<4>esi: 00000014 edi: 0200065a ebp: cbeeff2c esp: cbeeff00
<4>ds: 0068 es: 0068 ss: 0068
<4>Process nautilus (pid: 133488, stackpage=cbeef000)
<4>Stack: 00000002 cbeeff1c 00000014 00000002 0200065a 00000000 cb943b00 fffffd44
<4>       cd6f3b80 cb943c30 cb943b00 cbeeff44 c03419a3 cb943c30 00000286 cb943c30
<4>       d7edf280 cbeeff54 c02f3df6 cb943c30 cb943b00 cbeeff6c c02f43d7 cb943c30
<4>Call Trace:
<4>[<c03419a3>] unix_release [kernel] 0x43 (0xcbeeff30)
<4>[<c02f3df6>] sock_release [kernel] 0x56 (0xcbeeff48)
<4>[<c02f43d7>] sock_close [kernel] 0x37 (0xcbeeff58)
<4>[<c014f50f>] fput [kernel] 0xef (0xcbeeff70)
more>
Only 'q' or 'Q' are processed at more prompt, input ignored
<4>[<c014dfab>] filp_close [kernel] 0x4b (0xcbeeff90)
<4>[<c014e02f>] sys_close [kernel] 0x4f (0xcbeeffac)
<4>[<c010bae7>] system_call [kernel] 0x33 (0xcbeeffc0)
<4>
<4>Code: 0f 0b b8 02 c6 d7 39 c0 e9 3a ff ff ff 89 7c 24 10 8d 55 f0
<4>
kdb>
kdb> bt
Stack traceback for pid 133488
0xcbeee000 133488 1 1 0 R 0xcbeee350 *nautilus
EBP        EIP        Function (args)
0xcbeeff2c 0xc020bb58 unixnm_put+0x188 (0xcb943c30, 0x286, 0xcb943c30, 0xd7edf280)
                      kernel .text 0xc0100000 0xc020b9d0 0xc020bba0
0xcbeeff44 0xc03419a3 unix_release+0x43 (0xcb943c30, 0xcb943b00)
                      kernel .text 0xc0100000 0xc0341960 0xc03419d0
0xcbeeff54 0xc02f3df6 sock_release+0x56 (0xcb943c30, 0xce89c700, 0x0, 0xce89c700)
                      kernel .text 0xc0100000 0xc02f3da0 0xc02f3e00
0xcbeeff6c 0xc02f43d7 sock_close+0x37 (0xcb943b00, 0xce89c700, 0xcb938580, 0xce89c700, 0xd323e980)
                      kernel .text 0xc0100000 0xc02f43a0 0xc02f43f0
0xcbeeff8c 0xc014f50f fput+0xef (0xce89c700, 0xd323e980, 0xce89c700, 0xe, 0x97c8a40)
                      kernel .text 0xc0100000 0xc014f420 0xc014f530
0xcbeeffa8 0xc014dfab filp_close+0x4b (0xce89c700, 0xd323e980, 0xcbeee000)
                      kernel .text 0xc0100000 0xc014df60 0xc014dfe0
0xcbeeffbc 0xc014e02f sys_close+0x4f (0xe, 0x0, 0x4d1af34, 0xe, 0x97c8a40)
                      kernel .text 0xc0100000 0xc014dfe0 0xc014e040
           0xc010bae7 system_call+0x33
                      kernel .text 0xc0100000 0xc010bab4 0xc010baec
kdb>