Re: [SSI-devel] Critical ext3 bug in latest SSI kernel from CVS OPENSSI-RH branc
From: Roger T. <pe...@ho...> - 2004-12-10 21:04:52
Indeed it just might be NFS related. I managed to complete an SSI kernel compile after turning off SSInfs, nfslock, and portmap. I only have one NFS client. It is a RedHat box running dhcpd, named, postfix, sendmail, spamassassin, gaim, and rrdtool with periodic graph updates on an automounted NFS import from the CVIP (cluster). One of the requirements for this client is no_subtree_check. Within this mount (/mnt/web/abc and /mnt/web/asdf) I also have files that are accessed locally on the SSI cluster by imap, named, apache, php, mysql, mailman, and tomcat apps, but not sendmail. One of the php apps is a webmail client.

Client

/mnt /etc/auto.mnt --ghost

abc -rw,hard,intr,rsize=8192,wsize=8192,tcp cluster:/mnt/web/abc
asdf -rw,hard,intr,rsize=8192,wsize=8192,tcp cluster:/mnt/web/asdf

Server

/mnt/web *.domain(rw,no_root_squash,sync,mp,no_subtree_check)

/dev/hda4 /mnt/web ext3 defaults,node=1 1 2

/dev/hda4 is ide0(3,4)

>
>Roger Tsang wrote:
>>It's not helping. I can compile the SSI kernel when booted from a
>>Fedora Core 2 supplied kernel, and there are never crashes like this while
>>the system is under load using that kernel. It can't be a hardware problem
>>then.
>
>At least a couple of these panics look to involve NFS. What kind of NFS
>traffic are you generating?
> >John > > >> >>Dec 10 13:12:51 node1 rpc.mountd: authenticated mount request from >>asdf:917 for /mnt/web/asdf (/mnt/web) >>Dec 10 13:18:54 node1 rpc.mountd: authenticated unmount request from >>asdf:933 for /mnt/web/asdf (/mnt/web) >>Dec 10 13:20:05 node1 kernel: Unable to handle kernel NULL pointer >>dereference at virtual address 00000078 >>Dec 10 13:20:05 node1 kernel: printing eip: >>Dec 10 13:20:05 node1 kernel: c025340c >>Dec 10 13:20:05 node1 kernel: *pde = 2267a001 >>Dec 10 13:20:05 node1 kernel: *pte = 00000000 >>Dec 10 13:20:05 node1 kernel: Oops: 0000 >>Dec 10 13:20:05 node1 kernel: r128 agpgart nfsd ipt_REJECT ipt_multiport >>ipt_state ip_conntrack iptable_filter ip_tables floppy ide-cd sr_mod cdrom >>keybdev mousedev hid input usb-uhci ehci >>Dec 10 13:20:05 node1 kernel: CPU: 0 >>Dec 10 13:20:05 node1 kernel: EIP: 0060:[<c025340c>] Not tainted >>Dec 10 13:20:05 node1 kernel: EFLAGS: 00010286 >>Dec 10 13:20:05 node1 kernel: >>Dec 10 13:20:05 node1 kernel: EIP is at cfsd_lookup [kernel] 0x2c >>(2.4.22-1.2199.nptl_ssi_5develsmp) >>Dec 10 13:20:05 node1 kernel: eax: f539a000 ebx: ebef3680 ecx: >>00000070 edx: 00000000 >>Dec 10 13:20:05 node1 kernel: esi: f5d51a80 edi: f44cea80 ebp: >>f539bcb8 esp: f539bc88 >>Dec 10 13:20:05 node1 kernel: ds: 0068 es: 0068 ss: 0068 >>Dec 10 13:20:05 node1 kernel: Process nfsd (pid: 67570, >>stackpage=f539b000) >>Dec 10 13:20:05 node1 kernel: Stack: 41b9e8d5 0001c859 00000001 00000000 >>ac781cca 00000000 f1e7ab80 00000000 >>Dec 10 13:20:05 node1 kernel: 00000000 ebef3680 f5d51a80 e90983bc >>f539bdfc c024ec63 f467ee80 e90983bc >>Dec 10 13:20:05 node1 kernel: f539bcdc f539bd30 00000000 f7f82400 >>00000000 00000000 00000003 f7f82400 >>Dec 10 13:20:05 node1 kernel: Call Trace: >>Dec 10 13:20:05 node1 kernel: [<c024ec63>] cfs_proc_lookup [kernel] 0x1b3 >>(0xf539bcbc) >>Dec 10 13:20:05 node1 kernel: [<c02415d5>] msgsend [kernel] 0xf5 >>(0xf539bcfc) >>Dec 10 13:20:05 node1 kernel: [<c023fe11>] get_hold [kernel] 0x31 
>>(0xf539bd30) >>Dec 10 13:20:05 node1 kernel: [<c0241745>] process_msgs [kernel] 0xb5 >>(0xf539bd4c) >>Dec 10 13:20:05 node1 kernel: [<c023fd1e>] tok_wait [kernel] 0x19e >>(0xf539bd54) >>Dec 10 13:20:05 node1 kernel: [<c025a61c>] cfstok_req [kernel] 0x1fc >>(0xf539bd88) >>Dec 10 13:20:05 node1 kernel: [<c0248775>] cfs_fh_to_dentry [kernel] 0xe5 >>(0xf539bdbc) >>Dec 10 13:20:05 node1 kernel: [<c024d351>] cfs_lookup [kernel] 0xe1 >>(0xf539be00) >>Dec 10 13:20:05 node1 kernel: [<f8a663f0>] lookup_it [nfsd] 0x40 >>(0xf539be3c) >>Dec 10 13:20:05 node1 kernel: [<f8a66a68>] nfsd_findparent [nfsd] 0x68 >>(0xf539be54) >>Dec 10 13:20:05 node1 kernel: [<f8a73efc>] .rodata.str1.1 [nfsd] 0x160 >>(0xf539be60) >>Dec 10 13:20:05 node1 kernel: [<f8a66ea8>] find_fh_dentry [nfsd] 0x1a8 >>(0xf539be80) >>Dec 10 13:20:05 node1 kernel: [<f8a67269>] fh_verify [nfsd] 0x199 >>(0xf539beb4) >>Dec 10 13:20:05 node1 kernel: [<c039ed8b>] svc_sock_enqueue [kernel] 0x13b >>(0xf539bef0) >>Dec 10 13:20:05 node1 kernel: [<f8a6f6aa>] nfsd3_proc_getattr [nfsd] 0x6a >>(0xf539bf10) >>Dec 10 13:20:05 node1 kernel: [<f8a77804>] nfsd_procedures3 [nfsd] 0x24 >>(0xf539bf3c) >>Dec 10 13:20:05 node1 kernel: [<f8a64639>] nfsd_dispatch [nfsd] 0xc9 >>(0xf539bf48) >>Dec 10 13:20:05 node1 kernel: [<f8a64570>] nfsd_dispatch [nfsd] 0x0 >>(0xf539bf60) >>Dec 10 13:20:05 node1 kernel: [<c039ea45>] svc_process [kernel] 0x355 >>(0xf539bf68) >>Dec 10 13:20:05 node1 kernel: [<f8a77804>] nfsd_procedures3 [nfsd] 0x24 >>(0xf539bf8c) >>Dec 10 13:20:05 node1 kernel: [<f8a77118>] nfsd_version3 [nfsd] 0x0 >>(0xf539bf90) >>Dec 10 13:20:05 node1 kernel: [<f8a77138>] nfsd_program [nfsd] 0x0 >>(0xf539bf94) >>Dec 10 13:20:05 node1 kernel: [<f8a6441b>] nfsd [nfsd] 0x1fb (0xf539bfb0) >>Dec 10 13:20:05 node1 kernel: [<f8a64220>] nfsd [nfsd] 0x0 (0xf539bfe0) >>Dec 10 13:20:05 node1 kernel: [<c01077ed>] kernel_thread_helper [kernel] >>0x5 (0xf539bff0) >>Dec 10 13:20:05 node1 kernel: >>Dec 10 13:20:05 node1 kernel: Code: 39 41 08 74 14 f0 ff 
4a 70 0f 88 93 1d >>00 00 89 41 08 c7 45 >> >> >>> >>>I think I figured out the problem. I noticed a pattern in the virtual >>>addresses at which the kernel oops is happening. Whenever I receive >>>kernel oops consecutively there is a high chance that I would see the >>>same virtual address, indicative of hardware problems. I know that at >>>this stage chances are low it would be a critical kernel bug, so... >>> >>>I opened up the machine and found memory in slot 1 and 2, but nothing >>>occupying slot 0. I moved the memory module from slot 2 to slot 0 and >>>re-ran the memory test. Now I have doubts about the reliability of >>>memtest86 because the test alone never detected any errors before. >>> >>>It seems the location of the memory modules affected system stability >>>under load - during kernel compile and heavy NFS server activity for >>>example. Hopefully this is the very last thing I have to do so I can >>>move on. >>> >>>Dec 10 10:30:05 node1 kernel: Unable to handle kernel NULL pointer >>>dereference at virtual address 00000070 >>>Dec 10 10:30:05 node1 kernel: printing eip: >>>Dec 10 10:30:05 node1 kernel: c024c894 >>>Dec 10 10:30:05 node1 kernel: *pde = 1443b001 >>>Dec 10 10:30:05 node1 kernel: *pte = 00000000 >>>Dec 10 10:30:05 node1 kernel: Oops: 0002 >>>Dec 10 10:30:05 node1 kernel: r128 agpgart nfsd ipt_REJECT ipt_multiport >>>ipt_state ip_conntrack iptable_filter ip_tables floppy ide-cd sr_mod >>>cdrom keybdev mousedev hid input usb-uhci ehci >>>Dec 10 10:30:05 node1 kernel: CPU: 0 >>>Dec 10 10:30:05 node1 kernel: EIP: 0060:[<c024c894>] Not tainted >>>Dec 10 10:30:05 node1 kernel: EFLAGS: 00010286 >>>Dec 10 10:30:05 node1 kernel: >>>Dec 10 10:30:05 node1 kernel: EIP is at cfs_local_readdir [kernel] 0x84 >>>(2.4.22-1.2199.nptl_ssi_5develsmp) >>>Dec 10 10:30:05 node1 kernel: eax: f58ab180 ebx: ffffffec ecx: >>>00000070 edx: 00000000 >>>Dec 10 10:30:05 node1 kernel: esi: 00000070 edi: 00000000 ebp: >>>f536fb94 esp: f536fb04 >>>Dec 10 10:30:05 node1 
kernel: ds: 0068 es: 0068 ss: 0068 >>>Dec 10 10:30:05 node1 kernel: Process nfsd (pid: 67591, >>>stackpage=f536f000) >>>Dec 10 10:30:05 node1 kernel: Stack: c8e83080 00004000 00000004 f536fb18 >>>00000000 00000000 00000000 f58ab180 >>>Dec 10 10:30:05 node1 kernel: 00000000 c057fce0 00000001 00008000 >>>00000001 00000000 00000000 00000000 >>>Dec 10 10:30:05 node1 kernel: 00000000 00000000 00000000 00000000 >>>00000000 00000000 00000000 00000000 >>>Dec 10 10:30:05 node1 kernel: Call Trace: >>>Dec 10 10:30:05 node1 kernel: [<c024ce73>] cfs_readdir [kernel] 0x553 >>>(0xf536fb98) >>>Dec 10 10:30:05 node1 kernel: [<f8a66400>] filldir_one [nfsd] 0x0 >>>(0xf536fba4) >>>Dec 10 10:30:05 node1 kernel: [<c0257384>] hfind [kernel] 0x14 >>>(0xf536fbf8) >>>Dec 10 10:30:05 node1 kernel: [<c0257478>] svrtok_lookup [kernel] 0x78 >>>(0xf536fc18) >>>Dec 10 10:30:05 node1 kernel: [<c019aec0>] ext3_lookup [kernel] 0x110 >>>(0xf536fc20) >>>Dec 10 10:30:05 node1 kernel: [<c016d6bd>] vfs_readdir [kernel] 0xad >>>(0xf536fc4c) >>>Dec 10 10:30:05 node1 kernel: [<f8a66400>] filldir_one [nfsd] 0x0 >>>(0xf536fc58) >>>Dec 10 10:30:05 node1 kernel: [<f8a66400>] filldir_one [nfsd] 0x0 >>>(0xf536fc68) >>>Dec 10 10:30:05 node1 kernel: [<f8a66600>] nfsd_get_name [nfsd] 0x190 >>>(0xf536fc70) >>>Dec 10 10:30:05 node1 kernel: [<f8a66400>] filldir_one [nfsd] 0x0 >>>(0xf536fc78) >>>Dec 10 10:30:05 node1 kernel: [<c02479ec>] cfs_hpget [kernel] 0xec >>>(0xf536fc90) >>>Dec 10 10:30:05 node1 kernel: [<f8a66b90>] splice [nfsd] 0x30 >>>(0xf536fd48) >>>Dec 10 10:30:05 node1 kernel: [<c0248775>] cfs_fh_to_dentry [kernel] 0xe5 >>>(0xf536fdbc) >>>Dec 10 10:30:05 node1 kernel: [<c025a8e7>] cfstok_relse [kernel] 0x47 >>>(0xf536fdd4) >>>Dec 10 10:30:05 node1 kernel: [<c024d395>] cfs_lookup [kernel] 0x125 >>>(0xf536fe00) >>>Dec 10 10:30:05 node1 kernel: [<c0172f21>] dput [kernel] 0x31 >>>(0xf536fe3c) >>>Dec 10 10:30:05 node1 kernel: [<f8a66abd>] nfsd_findparent [nfsd] 0xbd >>>(0xf536fe54) >>>Dec 10 10:30:05 node1 
kernel: [<f8a73efc>] .rodata.str1.1 [nfsd] 0x160 >>>(0xf536fe60) >>>Dec 10 10:30:05 node1 kernel: [<f8a66ed7>] find_fh_dentry [nfsd] 0x1d7 >>>(0xf536fe80) >>>Dec 10 10:30:05 node1 kernel: [<f8a67269>] fh_verify [nfsd] 0x199 >>>(0xf536feb4) >>>Dec 10 10:30:05 node1 kernel: [<c039ed8b>] svc_sock_enqueue [kernel] >>>0x13b (0xf536fef0) >>>Dec 10 10:30:05 node1 kernel: [<f8a6f6aa>] nfsd3_proc_getattr [nfsd] 0x6a >>>(0xf536ff10) >>>Dec 10 10:30:05 node1 kernel: [<f8a77804>] nfsd_procedures3 [nfsd] 0x24 >>>(0xf536ff3c) >>>Dec 10 10:30:05 node1 kernel: [<f8a64639>] nfsd_dispatch [nfsd] 0xc9 >>>(0xf536ff48) >>>Dec 10 10:30:05 node1 kernel: [<f8a64570>] nfsd_dispatch [nfsd] 0x0 >>>(0xf536ff60) >>>Dec 10 10:30:05 node1 kernel: [<c039ea45>] svc_process [kernel] 0x355 >>>(0xf536ff68) >>>Dec 10 10:30:05 node1 kernel: [<f8a77804>] nfsd_procedures3 [nfsd] 0x24 >>>(0xf536ff8c) >>>Dec 10 10:30:05 node1 kernel: [<f8a77118>] nfsd_version3 [nfsd] 0x0 >>>(0xf536ff90) >>>Dec 10 10:30:05 node1 kernel: [<f8a77138>] nfsd_program [nfsd] 0x0 >>>(0xf536ff94) >>>Dec 10 10:30:05 node1 kernel: [<f8a6441b>] nfsd [nfsd] 0x1fb (0xf536ffb0) >>>Dec 10 10:30:05 node1 kernel: [<f8a64220>] nfsd [nfsd] 0x0 (0xf536ffe0) >>>Dec 10 10:30:05 node1 kernel: [<c01077ed>] kernel_thread_helper [kernel] >>>0x5 (0xf536fff0) >>>Dec 10 10:30:05 node1 kernel: >>>Dec 10 10:30:05 node1 kernel: Code: f0 ff 4f 70 0f 88 c1 15 00 00 b8 00 >>>e0 ff ff bb fe ff ff ff >>>Dec 10 10:41:26 node1 kernel: Unable to handle kernel NULL pointer >>>dereference at virtual address 000000cc >>>Dec 10 10:41:26 node1 kernel: printing eip: >>>Dec 10 10:41:26 node1 kernel: c014926c >>>Dec 10 10:41:26 node1 kernel: *pde = 32150001 >>>Dec 10 10:41:26 node1 kernel: *pte = 27ae5067 >>>Dec 10 10:41:26 node1 kernel: Oops: 0000 >>>Dec 10 10:41:26 node1 kernel: r128 agpgart nfsd ipt_REJECT ipt_multiport >>>ipt_state ip_conntrack iptable_filter ip_tables floppy ide-cd sr_mod >>>cdrom keybdev mousedev hid input usb-uhci ehci >>>Dec 10 10:41:26 node1 
kernel: CPU: 0 >>>Dec 10 10:41:26 node1 kernel: EIP: 0060:[<c014926c>] Not tainted >>>Dec 10 10:41:26 node1 kernel: EFLAGS: 00013206 >>>Dec 10 10:41:26 node1 kernel: >>>Dec 10 10:41:26 node1 kernel: EIP is at generic_file_write [kernel] 0x1c >>>(2.4.22-1.2199.nptl_ssi_5develsmp) >>>Dec 10 10:41:26 node1 kernel: eax: 00000000 ebx: c1f65e78 ecx: >>>00001000 edx: ffffffea >>>Dec 10 10:41:26 node1 kernel: esi: ffffffff edi: 00000000 ebp: >>>c1f65dfc esp: c1f65dd4 >>>Dec 10 10:41:26 node1 kernel: ds: 0068 es: 0068 ss: 0068 >>>Dec 10 10:41:26 node1 kernel: Process cfs_async (pid: 12, >>>stackpage=c1f65000) >>>Dec 10 10:41:26 node1 kernel: Stack: c1f65e78 f15b4000 00000000 c1f65e98 >>>e9597d00 c1f64000 00001000 c1f65e78 >>>Dec 10 10:41:26 node1 kernel: ffffffff 00000000 c1f65e20 c0192d45 >>>c1f65e78 e7b18000 00001000 c1f65e98 >>>Dec 10 10:41:26 node1 kernel: c1f64000 ffffffff ffffffff c1f65ef4 >>>c0253f44 c1f65e78 e7b18000 00001000 >>>Dec 10 10:41:26 node1 kernel: Call Trace: >>>Dec 10 10:41:26 node1 kernel: [<c0192d45>] ext3_file_write [kernel] 0x35 >>>(0xc1f65e00) >>>Dec 10 10:41:26 node1 kernel: [<c0253f44>] cfsd_write [kernel] 0xe4 >>>(0xc1f65e24) >>>Dec 10 10:41:26 node1 kernel: [<c0253c11>] cfsd_close [kernel] 0x61 >>>(0xc1f65e48) >>>Dec 10 10:41:26 node1 kernel: [<c0253e48>] cfsd_read [kernel] 0x108 >>>(0xc1f65e60) >>>Dec 10 10:41:26 node1 kernel: [<c024f284>] cfs_proc_write [kernel] 0x1c4 >>>(0xc1f65ef8) >>>Dec 10 10:41:26 node1 kernel: [<c01222d8>] recalc_task_prio [kernel] 0xa8 >>>(0xc1f65f28) >>>Dec 10 10:41:26 node1 kernel: [<c024b134>] cfs_async_handler_write >>>[kernel] 0x104 (0xc1f65f58) >>>Dec 10 10:41:26 node1 kernel: [<c02930c6>] nsc_async_daemon [kernel] >>>0x1b6 (0xc1f65fa4) >>>Dec 10 10:41:26 node1 kernel: [<c0292f10>] nsc_async_daemon [kernel] 0x0 >>>(0xc1f65fe0) >>>Dec 10 10:41:26 node1 kernel: [<c01077ed>] kernel_thread_helper [kernel] >>>0x5 (0xc1f65ff0) >>>Dec 10 10:41:26 node1 kernel: >>>Dec 10 10:41:26 node1 kernel: Code: 8b 80 cc 00 00 00 8b 
78 20 0f 88 a8 >>>00 00 00 b9 00 e0 ff ff >>>Dec 10 10:41:26 node1 kernel: <1>Unable to handle kernel NULL pointer >>>dereference at virtual address 000000cc >>>Dec 10 10:41:26 node1 kernel: printing eip: >>>Dec 10 10:41:26 node1 kernel: c014926c >>>Dec 10 10:41:26 node1 kernel: *pde = 32150001 >>>Dec 10 10:41:26 node1 kernel: *pte = 27ae5067 >>>Dec 10 10:41:26 node1 kernel: Oops: 0000 >>>Dec 10 10:41:26 node1 kernel: r128 agpgart nfsd ipt_REJECT ipt_multiport >>>ipt_state ip_conntrack iptable_filter ip_tables floppy ide-cd sr_mod >>>cdrom keybdev mousedev hid input usb-uhci ehci >>>Dec 10 10:41:26 node1 kernel: CPU: 0 >>>Dec 10 10:41:26 node1 kernel: CPU: 0 >>>Dec 10 10:41:26 node1 kernel: EIP: 0060:[<c014926c>] Not tainted >>>Dec 10 10:41:26 node1 kernel: EFLAGS: 00013206 >>>Dec 10 10:41:26 node1 kernel: >>>Dec 10 10:41:26 node1 kernel: EIP is at generic_file_write [kernel] 0x1c >>>(2.4.22-1.2199.nptl_ssi_5develsmp) >>>Dec 10 10:41:26 node1 kernel: eax: 00000000 ebx: c1f61e78 ecx: >>>00001000 edx: ffffffea >>>Dec 10 10:41:26 node1 kernel: esi: ffffffff edi: 00000000 ebp: >>>c1f61dfc esp: c1f61dd4 >>>Dec 10 10:41:26 node1 kernel: ds: 0068 es: 0068 ss: 0068 >>>Dec 10 10:41:26 node1 kernel: Process cfs_async (pid: 14, >>>stackpage=c1f61000) >>>Dec 10 10:41:26 node1 kernel: Stack: c1f61e78 ec740000 00000000 c1f61e98 >>>e88e7080 c1f60000 00000c8c c1f61e78 >>>Dec 10 10:41:26 node1 kernel: ffffffff 00000000 c1f61e20 c0192d45 >>>c1f61e78 e9094000 00001000 c1f61e98 >>>Dec 10 10:41:26 node1 kernel: c1f60000 ffffffff ffffffff c1f61ef4 >>>c0253f44 c1f61e78 e9094000 00001000 >>>Dec 10 10:41:26 node1 kernel: Call Trace: >>>Dec 10 10:41:26 node1 kernel: [<c0192d45>] ext3_file_write [kernel] 0x35 >>>(0xc1f61e00) >>>Dec 10 10:41:26 node1 kernel: [<c0253f44>] cfsd_write [kernel] 0xe4 >>>(0xc1f61e24) >>>Dec 10 10:41:26 node1 kernel: [<c0253c11>] cfsd_close [kernel] 0x61 >>>(0xc1f61e48) >>>Dec 10 10:41:26 node1 kernel: [<c0253e48>] cfsd_read [kernel] 0x108 >>>(0xc1f61e60) >>>Dec 10 
10:41:26 node1 kernel: [<c024f284>] cfs_proc_write [kernel] 0x1c4
>>>(0xc1f61ef8)
>>>Dec 10 10:41:26 node1 kernel: [<c01222d8>] recalc_task_prio [kernel] 0xa8
>>>(0xc1f61f28)
>>>Dec 10 10:41:26 node1 kernel: [<c024b134>] cfs_async_handler_write
>>>[kernel] 0x104 (0xc1f61f58)
>>>Dec 10 10:41:26 node1 kernel: [<c02930c6>] nsc_async_daemon [kernel]
>>>0x1b6 (0xc1f61fa4)
>>>Dec 10 10:41:26 node1 kernel: [<c0292f10>] nsc_async_daemon [kernel] 0x0
>>>(0xc1f61fe0)
>>>Dec 10 10:41:26 node1 kernel: [<c01077ed>] kernel_thread_helper [kernel]
>>>0x5 (0xc1f61ff0)
>>>Dec 10 10:41:26 node1 kernel:
>>>Dec 10 10:41:26 node1 kernel: Code: 8b 80 cc 00 00 00 8b 78 20 0f 88 a8
>>>00 00 00 b9 00 e0 ff ff
>>>
>>>
>>>>
>>>>Hi John,
>>>>
>>>>I logged a few ext3 errors last night. I think this might be because the
>>>>filesystem was mangled previously, but e2fsck -f from a Linux rescue CD
>>>>did not catch anything. I ran find -inum <directory_inode>; the inode
>>>>turned out to be a .png file, and find segfaulted after showing that
>>>>first file. When I reboot into the Fedora Core 2 supplied 2.6.5 kernel
>>>>and issue the same find command on the same partition, it completes.
>>>>
>>>>So I assume the only thing left is to back up all the files and mke2fs
>>>>everything on that disk. So I did that, fingers crossed. While
>>>>restoring the files I noticed that .cfs_unlink is still inside the
>>>>mount on dev(3,4), but this may be of no concern.
>>>>
>>>>I'm gonna try recompiling the SSI kernel in SSI again so that I can test
>>>>Aneesh's HA-LVS fix and put a little load on the system. The question
>>>>remains: could this be an outstanding ext3 or cfs bug? Or could this
>>>>just be carry-over damage from the glibc library issue?
>>>>
>>>>Maybe some structures haven't been initialized or have been randomly
>>>>assigned?
>>>> >>>>Dec 9 18:42:57 node1 kernel: attempt to access beyond end of device >>>>Dec 9 18:42:57 node1 kernel: 03:02: rw=0, want=110692356, >>>>limit=12289725 >>>>Dec 9 18:42:57 node1 kernel: EXT3-fs error (device ide0(3,2)): >>>>ext3_readdir: directory #722804 contains a hole at offset 0 >>>>Dec 9 18:42:57 node1 kernel: Unable to handle kernel NULL pointer >>>>dereference at virtual address 0000003a >>>>Dec 9 18:42:57 node1 kernel: printing eip: >>>>Dec 9 18:42:57 node1 kernel: c019d486 >>>>Dec 9 18:42:57 node1 kernel: *pde = 25913001 >>>>Dec 9 18:42:57 node1 kernel: *pte = 25717067 >>>>Dec 9 18:42:57 node1 kernel: Oops: 0000 >>>>Dec 9 18:42:57 node1 kernel: loop r128 agpgart nfsd ipt_REJECT >>>>ipt_multiport ipt_state ip_conntrack iptable_filter ip_tables floppy >>>>ide-cd sr_mod cdrom keybdev mousedev hid input usb-uhci >>>>Dec 9 18:42:57 node1 kernel: CPU: 0 >>>>Dec 9 18:42:57 node1 kernel: EIP: 0060:[<c019d486>] Not tainted >>>>Dec 9 18:42:57 node1 kernel: EFLAGS: 00013202 >>>>Dec 9 18:42:57 node1 kernel: >>>>Dec 9 18:42:57 node1 kernel: EIP is at ext3_handle_error [kernel] 0x26 >>>>(2.4.22-1.2199.nptl_ssi_5develsmp) >>>>Dec 9 18:42:57 node1 kernel: eax: 00000002 ebx: f7edb400 ecx: >>>>00000001 edx: f47b8000 >>>>Dec 9 18:42:57 node1 kernel: esi: 00000000 edi: e1b69300 ebp: >>>>f3e7bc1c esp: f3e7bc08 >>>>Dec 9 18:42:57 node1 kernel: ds: 0068 es: 0068 ss: 0068 >>>>Dec 9 18:42:57 node1 kernel: Process nfsd (pid: 67586, >>>>stackpage=f3e7b000) >>>>Dec 9 18:42:57 node1 kernel: Stack: c03e9a4d f3e7bc34 f7edb400 f7edb400 >>>>00000000 f3e7bc38 c019d58a f7edb400 >>>>Dec 9 18:42:57 node1 kernel: c06e8600 c03e5366 c06e9820 f3e7bcf4 >>>>f3e7bcd8 c0192c76 f7edb400 c03e5366 >>>>Dec 9 18:42:57 node1 kernel: c03e9458 000b0774 00000000 c007e080 >>>>f4755000 c05e45e0 f3e7bc80 00000000 >>>>Dec 9 18:42:57 node1 kernel: Call Trace: >>>>Dec 9 18:42:57 node1 kernel: [<c019d58a>] ext3_error [kernel] 0x5a >>>>(0xf3e7bc20) >>>>Dec 9 18:42:57 node1 kernel: [<c0192c76>] ext3_readdir 
[kernel] 0x3d6 >>>>(0xf3e7bc3c) >>>>Dec 9 18:42:57 node1 kernel: [<c0253a2e>] cfsd_open [kernel] 0xce >>>>(0xf3e7bcbc) >>>>Dec 9 18:42:57 node1 kernel: [<c024c8d0>] cfs_local_readdir [kernel] >>>>0xc0 (0xf3e7bcdc) >>>>Dec 9 18:42:57 node1 kernel: [<f8a73840>] nfs3svc_encode_entry [nfsd] >>>>0x0 (0xf3e7bce8) >>>>Dec 9 18:42:57 node1 kernel: [<c024ce73>] cfs_readdir [kernel] 0x553 >>>>(0xf3e7bd74) >>>>Dec 9 18:42:57 node1 kernel: [<f8a73840>] nfs3svc_encode_entry [nfsd] >>>>0x0 (0xf3e7bd80) >>>>Dec 9 18:42:57 node1 kernel: [<c0337ea5>] skb_release_data [kernel] >>>>0x85 (0xf3e7bdbc) >>>>Dec 9 18:42:57 node1 kernel: [<f8a69257>] _nfsd_open [nfsd] 0x2a7 >>>>(0xf3e7bdd8) >>>>Dec 9 18:42:57 node1 kernel: [<c016d6bd>] vfs_readdir [kernel] 0xad >>>>(0xf3e7be28) >>>>Dec 9 18:42:57 node1 kernel: [<f8a73840>] nfs3svc_encode_entry [nfsd] >>>>0x0 (0xf3e7be34) >>>>Dec 9 18:42:57 node1 kernel: [<f8a6b798>] nfsd_readdir [nfsd] 0xc8 >>>>(0xf3e7be4c) >>>>Dec 9 18:42:57 node1 kernel: [<f8a73840>] nfs3svc_encode_entry [nfsd] >>>>0x0 (0xf3e7be54) >>>>Dec 9 18:42:57 node1 kernel: [<c039ed8b>] svc_sock_enqueue [kernel] >>>>0x13b (0xf3e7bef0) >>>>Dec 9 18:42:57 node1 kernel: [<f8a70ba1>] nfsd3_proc_readdir [nfsd] >>>>0xe1 (0xf3e7bf04) >>>>Dec 9 18:42:57 node1 kernel: [<f8a73840>] nfs3svc_encode_entry [nfsd] >>>>0x0 (0xf3e7bf18) >>>>Dec 9 18:42:57 node1 kernel: [<f8a77a20>] nfsd_procedures3 [nfsd] 0x240 >>>>(0xf3e7bf3c) >>>>Dec 9 18:42:57 node1 kernel: [<f8a64639>] nfsd_dispatch [nfsd] 0xc9 >>>>(0xf3e7bf48) >>>>Dec 9 18:42:57 node1 kernel: [<f8a64570>] nfsd_dispatch [nfsd] 0x0 >>>>(0xf3e7bf60) >>>>Dec 9 18:42:57 node1 kernel: [<c039ea45>] svc_process [kernel] 0x355 >>>>(0xf3e7bf68) >>>>Dec 9 18:42:57 node1 kernel: [<f8a77a20>] nfsd_procedures3 [nfsd] 0x240 >>>>(0xf3e7bf8c) >>>>Dec 9 18:42:57 node1 kernel: [<f8a77118>] nfsd_version3 [nfsd] 0x0 >>>>(0xf3e7bf90) >>>>Dec 9 18:42:57 node1 kernel: [<f8a77138>] nfsd_program [nfsd] 0x0 >>>>(0xf3e7bf94) >>>>Dec 9 18:42:57 node1 kernel: [<f8a6441b>] 
nfsd [nfsd] 0x1fb >>>>(0xf3e7bfb0) >>>>Dec 9 18:42:57 node1 kernel: [<f8a64220>] nfsd [nfsd] 0x0 (0xf3e7bfe0) >>>>Dec 9 18:42:57 node1 kernel: [<c01077ed>] kernel_thread_helper [kernel] >>>>0x5 (0xf3e7bff0) >>>>Dec 9 18:42:57 node1 kernel: >>>>Dec 9 18:42:57 node1 kernel: Code: 0f b7 46 3a 83 c8 02 66 89 46 3a f6 >>>>43 34 01 75 53 89 1c 24 >>>>Dec 9 18:45:06 node1 syslogd 1.4.1: restart. >>>> >>>> >>>>This is even worse.. I don't have a 600GB ide partition. >>>> >>>>Dec 10 04:04:18 node1 init: Trying to re-exec init >>>>Dec 10 04:07:40 node1 kernel: attempt to access beyond end of device >>>>Dec 10 04:07:40 node1 kernel: 03:04: rw=0, want=636727300, >>>>limit=12932325 >>>>Dec 10 04:07:40 node1 kernel: EXT3-fs error (device ide0(3,4)): >>>>ext3_readdir: directory #31618 contains a hole at offset 0 >>>>Dec 10 04:07:40 node1 kernel: Unable to handle kernel NULL pointer >>>>dereference at virtual address 0000003a >>>>Dec 10 04:07:40 node1 kernel: printing eip: >>>>Dec 10 04:07:40 node1 kernel: c019d486 >>>>Dec 10 04:07:40 node1 kernel: *pde = 22fda001 >>>>Dec 10 04:07:40 node1 kernel: *pte = 00000000 >>>>Dec 10 04:07:40 node1 kernel: Oops: 0000 >>>>Dec 10 04:07:40 node1 kernel: nfsd ipt_REJECT ipt_multiport ipt_state >>>>ip_conntrack iptable_filter ip_tables floppy ide-cd sr_mod cdrom keybdev >>>>mousedev hid input usb-uhci ehci-hcd usbcore >>>>Dec 10 04:07:40 node1 kernel: CPU: 0 >>>>Dec 10 04:07:40 node1 kernel: EIP: 0060:[<c019d486>] Not tainted >>>>Dec 10 04:07:40 node1 kernel: EFLAGS: 00010202 >>>>Dec 10 04:07:40 node1 kernel: >>>>Dec 10 04:07:40 node1 kernel: EIP is at ext3_handle_error [kernel] 0x26 >>>>(2.4.22-1.2199.nptl_ssi_5develsmp) >>>>Dec 10 04:07:40 node1 kernel: eax: 00000002 ebx: f4852000 ecx: >>>>00000001 edx: f487a000 >>>>Dec 10 04:07:40 node1 kernel: esi: 00000000 edi: e97efa80 ebp: >>>>e1ab7d40 esp: e1ab7d2c >>>>Dec 10 04:07:40 node1 kernel: ds: 0068 es: 0068 ss: 0068 >>>>Dec 10 04:07:40 node1 kernel: Process updatedb (pid: 71556, 
>>>>stackpage=e1ab7000) >>>>Dec 10 04:07:40 node1 kernel: Stack: c03e9a4d e1ab7d58 f4852000 f4852000 >>>>00000000 e1ab7d5c c019d58a f4852000 >>>>Dec 10 04:07:40 node1 kernel: c06e8600 c03e5366 c06e9820 e1ab7e18 >>>>e1ab7dfc c0192c76 f4852000 c03e5366 >>>>Dec 10 04:07:40 node1 kernel: c03e9458 00007b82 00000000 c025ab05 >>>>00000000 f2e03f00 00000002 00000000 >>>>Dec 10 04:07:40 node1 kernel: Call Trace: >>>>Dec 10 04:07:40 node1 kernel: [<c019d58a>] ext3_error [kernel] 0x5a >>>>(0xe1ab7d44) >>>>Dec 10 04:07:40 node1 kernel: [<c0192c76>] ext3_readdir [kernel] 0x3d6 >>>>(0xe1ab7d60) >>>>Dec 10 04:07:40 node1 kernel: [<c025ab05>] cfstok_relsex [kernel] 0x215 >>>>(0xe1ab7d78) >>>>Dec 10 04:07:40 node1 kernel: [<c0253a2e>] cfsd_open [kernel] 0xce >>>>(0xe1ab7de0) >>>>Dec 10 04:07:40 node1 kernel: [<c024c8d0>] cfs_local_readdir [kernel] >>>>0xc0 (0xe1ab7e00) >>>>Dec 10 04:07:40 node1 kernel: [<c016e0b0>] filldir64 [kernel] 0x0 >>>>(0xe1ab7e0c) >>>>Dec 10 04:07:40 node1 kernel: [<c024ce73>] cfs_readdir [kernel] 0x553 >>>>(0xe1ab7e98) >>>>Dec 10 04:07:40 node1 kernel: [<c016e0b0>] filldir64 [kernel] 0x0 >>>>(0xe1ab7ea4) >>>>Dec 10 04:07:40 node1 kernel: [<c025a8e7>] cfstok_relse [kernel] 0x47 >>>>(0xe1ab7f1c) >>>>Dec 10 04:07:40 node1 kernel: [<c016d6bd>] vfs_readdir [kernel] 0xad >>>>(0xe1ab7f4c) >>>>Dec 10 04:07:40 node1 kernel: [<c016e0b0>] filldir64 [kernel] 0x0 >>>>(0xe1ab7f58) >>>>Dec 10 04:07:40 node1 kernel: [<c016e2e2>] sys_getdents64 [kernel] 0x52 >>>>(0xe1ab7f70) >>>>Dec 10 04:07:40 node1 kernel: [<c016e0b0>] filldir64 [kernel] 0x0 >>>>(0xe1ab7f78) >>>>Dec 10 04:07:40 node1 kernel: [<c0157e5e>] sys_fchdir [kernel] 0x4e >>>>(0xe1ab7f90) >>>>Dec 10 04:07:40 node1 kernel: [<c010be37>] system_call [kernel] 0x33 >>>>(0xe1ab7fc0) >>>>Dec 10 04:07:40 node1 kernel: >>>>Dec 10 04:07:40 node1 kernel: Code: 0f b7 46 3a 83 c8 02 66 89 46 3a f6 >>>>43 34 01 75 53 89 1c 24 >>>>Dec 10 04:10:01 node1 syslogd 1.4.1: restart. 
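For scale, the want= and limit= fields in the "attempt to access beyond end of device" messages above can be converted to sizes. The sketch below assumes both fields are counted in 1 KiB blocks, which is consistent with the "600GB" remark but is not stated in the logs themselves. The function names are made up for illustration.

```python
import re

# Assumption: want= and limit= are counted in 1 KiB blocks.
# The log lines themselves do not state the unit.
BLOCK_BYTES = 1024

# Quote markers (">") and whitespace may sit between the two fields
# when the log line has been wrapped inside a quoted mail.
FIELDS_RE = re.compile(r"want=(\d+),[\s>]*limit=(\d+)")


def parse_want_limit(text):
    """Return (want, limit) as integers, or None if the fields are absent."""
    match = FIELDS_RE.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def blocks_to_gib(blocks, block_bytes=BLOCK_BYTES):
    """Convert a block count to GiB under the assumed block size."""
    return blocks * block_bytes / 2**30


if __name__ == "__main__":
    sample = "03:04: rw=0, want=636727300, >>>>limit=12932325"
    want, limit = parse_want_limit(sample)
    print(f"requested offset: {blocks_to_gib(want):.1f} GiB")
    print(f"device size:      {blocks_to_gib(limit):.1f} GiB")
```

Under that assumption the read on 03:04 was aimed roughly 600 GiB into a partition of about 12 GiB, which suggests the block number taken from the directory was bogus rather than slightly off.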
>>>>
>>>>My daemon monitors the log and forces a reboot if necessary.
>>>>
>>>>Anyway, when I returned to find that the system had rebooted, I stopped
>>>>some apps until lsof showed nothing for the mount. Then I unmounted the
>>>>partition, ran fsck, mounted it again, and ran find -inum on that
>>>>directory... It turned out to be a .png file and not a directory, and
>>>>find immediately segfaulted.
>>>>
>>>>Dec 10 06:18:56 node1 kernel: inode busy: dev 03:04:17202 (f10a4a80)
>>>>mode 40755 count 1
>>>>Dec 10 06:18:56 node1 kernel: inode busy: dev 03:04:24136 (f1118580)
>>>>mode 40755 count 1
>>>>Dec 10 06:18:56 node1 kernel: inode busy: dev 03:04:900036 (f1187800)
>>>>mode 10600 count 2
>>>>Dec 10 06:18:56 node1 kernel: inode busy: dev 03:04:900034 (f11d3080)
>>>>mode 10600 count 2
>>>>Dec 10 06:18:56 node1 kernel: inode busy: dev 03:04:900094 (f1258080)
>>>>mode 40710 count 2
>>>>Dec 10 06:18:56 node1 kernel: inode busy: dev 03:04:343753 (f3851300)
>>>>mode 40755 count 1
>>>>Dec 10 06:18:56 node1 kernel: inode busy: dev 03:04:801613 (f3851800)
>>>>mode 40755 count 1
>>>>Dec 10 06:18:56 node1 kernel: inode busy: dev 03:04:245281 (f3170800)
>>>>mode 40750 count 1
>>>>Dec 10 06:18:56 node1 kernel: inode busy: dev 03:04:21703 (f31aea80)
>>>>mode 40755 count 1
>>>>Dec 10 06:18:56 node1 kernel: inode busy: dev 03:04:2 (f3cea580) mode
>>>>40755 count 1
>>>>Dec 10 06:18:56 node1 kernel: VFS: Busy inodes after unmount.
>>>>Self-destruct in 5 seconds. Have a nice day...
>>>>Dec 10 06:22:03 node1 kernel: kjournald starting. Commit interval 5
>>>>seconds
>>>>Dec 10 06:22:03 node1 kernel: EXT3 FS 2.4-0.9.19, 19 August 2002 on
>>>>ide0(3,4), internal journal
>>>>Dec 10 06:22:03 node1 kernel: EXT3-fs: mounted filesystem with ordered
>>>>data mode.
>>>>Dec 10 06:24:14 node1 kernel: ------------[ cut here ]------------
>>>>Dec 10 06:24:14 node1 kernel: kernel BUG at buffer.c:2646!
>>>>Dec 10 06:24:14 node1 kernel: invalid operand: 0000 >>>>Dec 10 06:24:14 node1 kernel: r128 agpgart nfsd ipt_REJECT ipt_multiport >>>>ipt_state ip_conntrack iptable_filter ip_tables floppy ide-cd sr_mod >>>>cdrom keybdev mousedev hid input usb-uhci ehci >>>>Dec 10 06:24:14 node1 kernel: CPU: 0 >>>>Dec 10 06:24:14 node1 kernel: EIP: 0060:[<c015e089>] Not tainted >>>>Dec 10 06:24:14 node1 kernel: EFLAGS: 00010206 >>>>Dec 10 06:24:14 node1 kernel: >>>>Dec 10 06:24:14 node1 kernel: EIP is at grow_buffers [kernel] 0x39 >>>>(2.4.22-1.2199.nptl_ssi_5develsmp) >>>>Dec 10 06:24:14 node1 kernel: eax: 000001ff ebx: 00000000 ecx: >>>>00000200 edx: 00000000 >>>>Dec 10 06:24:14 node1 kernel: esi: 00004000 edi: c0228ac0 ebp: >>>>c3305c58 esp: c3305c38 >>>>Dec 10 06:24:14 node1 kernel: ds: 0068 es: 0068 ss: 0068 >>>>Dec 10 06:24:14 node1 kernel: Process find (pid: 88368, >>>>stackpage=c3305000) >>>>Dec 10 06:24:14 node1 kernel: Stack: c3305cec c0257a2b c3305c44 00000032 >>>>00004000 00000000 00004000 c0228ac0 >>>>Dec 10 06:24:14 node1 kernel: c3305c78 c015ba0c 00004000 001205d2 >>>>c0228ac0 c3305dac 00000000 e58c7580 >>>>Dec 10 06:24:14 node1 kernel: c3305d38 c0195ba1 00004000 001205d2 >>>>c0228ac0 c3305cbc 00000000 00000001 >>>>Dec 10 06:24:14 node1 kernel: Call Trace: >>> >>> >>>>Dec 10 06:24:14 node1 kernel: [<c0257a2b>] cfs_tokmsg_seq [kernel] 0xab >>>>(0xc3305c3c) >>>>Dec 10 06:24:14 node1 kernel: [<c0228ac0>] ssidev_poll_return [kernel] >>>>0x0 (0xc3305c54) >>>>Dec 10 06:24:14 node1 kernel: [<c015ba0c>] getblk [kernel] 0x3c >>>>(0xc3305c5c) >>>>Dec 10 06:24:14 node1 kernel: [<c0228ac0>] ssidev_poll_return [kernel] >>>>0x0 (0xc3305c68) >>>>Dec 10 06:24:14 node1 kernel: [<c0195ba1>] ext3_getblk [kernel] 0x91 >>>>(0xc3305c7c) >>>>Dec 10 06:24:14 node1 kernel: [<c0228ac0>] ssidev_poll_return [kernel] >>>>0x0 (0xc3305c88) >>>>Dec 10 06:24:14 node1 kernel: [<c0258057>] svrcfstok_send [kernel] 0x177 >>>>(0xc3305cf0) >>>>Dec 10 06:24:14 node1 kernel: [<c0195e90>] ext3_bread 
[kernel] 0x30 >>>>(0xc3305d3c) >>>>Dec 10 06:24:14 node1 kernel: [<c019292c>] ext3_readdir [kernel] 0x8c >>>>(0xc3305d60) >>>>Dec 10 06:24:14 node1 kernel: [<c0253a2e>] cfsd_open [kernel] 0xce >>>>(0xc3305de0) >>>>Dec 10 06:24:14 node1 kernel: [<c024c8d0>] cfs_local_readdir [kernel] >>>>0xc0 (0xc3305e00) >>>>Dec 10 06:24:14 node1 kernel: [<c016e0b0>] filldir64 [kernel] 0x0 >>>>(0xc3305e0c) >>>>Dec 10 06:24:14 node1 kernel: [<c024ce73>] cfs_readdir [kernel] 0x553 >>>>(0xc3305e98) >>>>Dec 10 06:24:14 node1 kernel: [<c016e0b0>] filldir64 [kernel] 0x0 >>>>(0xc3305ea4) >>>>Dec 10 06:24:14 node1 kernel: [<c023fb37>] tok_hold [kernel] 0x87 >>>>(0xc3305ec4) >>>>Dec 10 06:24:14 node1 kernel: [<c025a8e7>] cfstok_relse [kernel] 0x47 >>>>(0xc3305f30) >>>>Dec 10 06:24:14 node1 kernel: [<c016d6bd>] vfs_readdir [kernel] 0xad >>>>(0xc3305f4c) >>>>Dec 10 06:24:14 node1 kernel: [<c016e0b0>] filldir64 [kernel] 0x0 >>>>(0xc3305f58) >>>>Dec 10 06:24:14 node1 kernel: [<c016e2e2>] sys_getdents64 [kernel] 0x52 >>>>(0xc3305f70) >>>>Dec 10 06:24:14 node1 kernel: [<c016e0b0>] filldir64 [kernel] 0x0 >>>>(0xc3305f78) >>>>Dec 10 06:24:14 node1 kernel: [<c016cac4>] sys_fcntl64 [kernel] 0x54 >>>>(0xc3305f9c) >>>>Dec 10 06:24:14 node1 kernel: [<c010be37>] system_call [kernel] 0x33 >>>>(0xc3305fc0) >>>>Dec 10 06:24:14 node1 kernel: >>>>Dec 10 06:24:14 node1 kernel: Code: 0f 0b 56 0a a8 47 3e c0 8d 87 00 fe >>>>ff ff 3d 00 0e 00 00 76 >>>> >>>>I reboot into Fedora Core 2 supplied kernel-2.6 and exec the same find >>>>command in the same partition without problems, and it turns out to be a >>>>file not a directory. >>>> >>>> >>>>> >>>>>Roger Tsang wrote: >>>>> >>>>>>John, >>>>>> >>>>>>I believe I resolved this issue. I had unwittingly replaced some >>>>>>glibc kernel headers with SSI kernel headers by symlinking to >>>>>>/usr/src/linux/include. It is tradition on older RedHat systems to do >>>>>>this symlink when upgrading kernels, but apparently not for Fedora >>>>>>Core 2. 
I fixed this by reinstalling glibc-kernheaders and >>>>>>recompiling nfs-utils, util-linux, and Apache2. Mount and umount are >>>>>>from the util-linux package, so simply executing the bad umount >>>>>>probably caused the kernel oops. >>>>>> >>>>>>-Roger >>>>> >>>>> >>>>>Okay, but I have difficulty understanding how building umount with the >>>>>SSI kernel headers would cause a problem with umount. Bruce suggested >>>>>that maybe mount was messing things up and then umount found the >>>>>problem. >>>>> >>>>>If it comes back, let me know. >>>>> >>>>>John >>>>> >>>>>> >>>>>>> >>>>>>>Roger Tsang wrote: >>>>>>> >>>>>>>>I've rebuilt the SSI userspace tools too and doesn't help >>>>>>>>unfortunately. >>>>>>> >>>>>>> >>>>>>> >>>>>>>How are you producing the problem, exactly? Simple umounting a >>>>>>>filesystem doesn't panic for me. >>>>>>> >>>>>>>John Byrne >>>>>>> >>>>>>> >>>>>>>> >>>>>>>>Dec 8 12:56:16 node1 init: Trying to re-exec init >>>>>>>>Dec 8 13:10:24 node1 kernel: kernel BUG in header file at line 325 >>>>>>>>Dec 8 >>>>>>>>13:10:24 node1 kernel: ------------[ cut here ]------------ Dec 8 >>>>>>>>13:10:24 node1 kernel: kernel BUG at panic.c:297! 
>>>>>>>>Dec 8 13:10:24 node1 kernel: invalid operand: 0000 >>>>>>>>Dec 8 13:10:24 node1 kernel: r128 agpgart nfsd ipt_REJECT >>>>>>>>ipt_multiport >>>>>>>>ipt_state ip_conntrack iptable_filter ip_tables floppy ide-cd sr_mod >>>>>>>>cdrom >>>>>>>>dm-mod keybdev mousedev hid input usb-uh >>>>>>>>Dec 8 13:10:24 node1 kernel: CPU: 0 >>>>>>>>Dec 8 13:10:24 node1 kernel: EIP: 0060:[<c0128a39>] Not >>>>>>>>tainted Dec >>>>>>>> 8 13:10:24 node1 kernel: EFLAGS: 00210286 >>>>>>>>Dec 8 13:10:24 node1 kernel: >>>>>>>>Dec 8 13:10:24 node1 kernel: EIP is at __out_of_line_bug [kernel] >>>>>>>>0x19 >>>>>>>>(2.4.22-1.2199.nptl_ssi_5develsmp) >>>>>>>>Dec 8 13:10:24 node1 kernel: eax: 00000026 ebx: c7795d80 ecx: >>>>>>>>00000000 edx: c066f980 >>>>>>>>Dec 8 13:10:24 node1 kernel: esi: d1bf97bc edi: c7795dec ebp: >>>>>>>>f4695c44 esp: f4695c3c >>>>>>>>Dec 8 13:10:24 node1 kernel: ds: 0068 es: 0068 ss: 0068 >>>>>>>>Dec 8 13:10:24 node1 kernel: Process nfsd (pid: 67604, >>>>>>>>stackpage=f4695000) >>>>>>>>Dec 8 13:10:24 node1 kernel: Stack: c03e2f90 00000145 f4695c64 >>>>>>>>c0173b39 >>>>>>>>00000145 000001f0 d1bf97bc fffffff4 >>>>>>>>Dec 8 13:10:24 node1 kernel: f2348680 f233c580 f4695c88 >>>>>>>>c01684b8 >>>>>>>>f2348680 d1bf97bc 00000000 00000000 >>>>>>>>Dec 8 13:10:24 node1 kernel: cebfd3c0 f51c1080 f2348680 >>>>>>>>f4695c9c >>>>>>>>c016858e d1bf97bc f2348680 00000000 >>>>>>>>Dec 8 13:10:24 node1 kernel: Call Trace: >>>>>>>>Dec 8 13:10:24 node1 kernel: [<c0173b39>] d_alloc [kernel] 0x199 >>>>>>>>(0xf4695c48) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<c01684b8>] lookup_hash_it [kernel] >>>>>>>>0x88 >>>>>>>>(0xf4695c68) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<c016858e>] lookup_hash [kernel] 0x1e >>>>>>>>(0xf4695c8c) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<c0253434>] cfsd_lookup [kernel] 0x54 >>>>>>>>(0xf4695ca0) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<c035a850>] ip_finish_output2 >>>>>>>>[kernel] 0x0 >>>>>>>>(0xf4695cac) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<c024ec63>] cfs_proc_lookup 
[kernel] >>>>>>>>0x1b3 >>>>>>>>(0xf4695cd8) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<c03466b7>] nf_hook_slow [kernel] >>>>>>>>0xf7 >>>>>>>>(0xf4695cec) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<c035a980>] ip_queue_xmit2 [kernel] >>>>>>>>0x0 >>>>>>>>(0xf4695d08) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<c035a980>] ip_queue_xmit2 [kernel] >>>>>>>>0x0 >>>>>>>>(0xf4695d18) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<c035962a>] ip_queue_xmit [kernel] >>>>>>>>0x48a >>>>>>>>(0xf4695d24) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<c025ab05>] cfstok_relsex [kernel] >>>>>>>>0x215 >>>>>>>>(0xf4695d5c) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<c023fb37>] tok_hold [kernel] 0x87 >>>>>>>>(0xf4695d84) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<c025a547>] cfstok_req [kernel] 0x127 >>>>>>>>(0xf4695da4) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<c024d351>] cfs_lookup [kernel] 0xe1 >>>>>>>>(0xf4695e1c) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<c0168545>] lookup_hash_it [kernel] >>>>>>>>0x115 >>>>>>>>(0xf4695e58) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<c0168602>] lookup_one_len_it >>>>>>>>[kernel] 0x62 >>>>>>>>(0xf4695e7c) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<c0168645>] lookup_one_len [kernel] >>>>>>>>0x25 >>>>>>>>(0xf4695ea8) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<f8a684cb>] nfsd_lookup [nfsd] 0xdb >>>>>>>>(0xf4695ec0) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<c039ed8b>] svc_sock_enqueue [kernel] >>>>>>>>0x13b >>>>>>>>(0xf4695ef0) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<f8a6f8c0>] nfsd3_proc_lookup [nfsd] >>>>>>>>0xa0 >>>>>>>>(0xf4695f0c) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<f8a7198e>] nfs3svc_decode_diropargs >>>>>>>>[nfsd] >>>>>>>>0x7e (0xf4695f24) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<f8a7784c>] nfsd_procedures3 [nfsd] >>>>>>>>0x6c >>>>>>>>(0xf4695f3c) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<f8a64639>] nfsd_dispatch [nfsd] 0xc9 >>>>>>>>(0xf4695f48) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<f8a64570>] nfsd_dispatch [nfsd] 0x0 >>>>>>>>(0xf4695f60) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<c039ea45>] svc_process [kernel] 
>>>>>>>>0x355 >>>>>>>>(0xf4695f68) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<f8a7784c>] nfsd_procedures3 [nfsd] >>>>>>>>0x6c >>>>>>>>(0xf4695f8c) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<f8a77118>] nfsd_version3 [nfsd] 0x0 >>>>>>>>(0xf4695f90) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<f8a77138>] nfsd_program [nfsd] 0x0 >>>>>>>>(0xf4695f94) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<f8a6441b>] nfsd [nfsd] 0x1fb >>>>>>>>(0xf4695fb0) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<f8a64220>] nfsd [nfsd] 0x0 >>>>>>>>(0xf4695fe0) >>>>>>>>Dec 8 13:10:24 node1 kernel: [<c01077ed>] kernel_thread_helper >>>>>>>>[kernel] >>>>>>>>0x5 (0xf4695ff0) >>>>>>>>Dec 8 13:10:24 node1 kernel: >>>>>>>>Dec 8 13:10:24 node1 kernel: Code: 0f 0b 29 01 3d 27 3e c0 eb 0d 90 >>>>>>>>90 90 >>>>>>>>90 90 90 90 90 90 90 >>>>>>>> >>>>>>>> >>>>>>>>Dec 8 14:30:12 node1 kernel: Unable to handle kernel NULL pointer >>>>>>>>dereference at virtual address 00000070 >>>>>>>>Dec 8 14:30:12 node1 kernel: printing eip: >>>>>>>>Dec 8 14:30:12 node1 kernel: c024c894 >>>>>>>>Dec 8 14:30:12 node1 kernel: *pde = 33261001 >>>>>>>>Dec 8 14:30:12 node1 kernel: *pte = 00000000 >>>>>>>>Dec 8 14:30:12 node1 kernel: Oops: 0002 >>>>>>>>Dec 8 14:30:12 node1 kernel: r128 agpgart nfsd ipt_REJECT >>>>>>>>ipt_multiport >>>>>>>>ipt_state ip_conntrack iptable_filter ip_tables floppy ide-cd sr_mod >>>>>>>>cdrom >>>>>>>>dm-mod keybdev mousedev hid input usb-uh >>>>>>>>Dec 8 14:30:12 node1 kernel: CPU: 0 >>>>>>>>Dec 8 14:30:12 node1 kernel: EIP: 0060:[<c024c894>] Not >>>>>>>>tainted Dec >>>>>>>> 8 14:30:12 node1 kernel: EFLAGS: 00010286 >>>>>>>>Dec 8 14:30:12 node1 kernel: >>>>>>>>Dec 8 14:30:12 node1 kernel: EIP is at cfs_local_readdir [kernel] >>>>>>>>0x84 >>>>>>>>(2.4.22-1.2199.nptl_ssi_5develsmp) >>>>>>>>Dec 8 14:30:12 node1 kernel: eax: f31f5680 ebx: ffffffec ecx: >>>>>>>>00000070 edx: 00000000 >>>>>>>>Dec 8 14:30:12 node1 kernel: esi: 00000070 edi: 00000000 ebp: >>>>>>>>f4289b94 esp: f4289b04 >>>>>>>>Dec 8 14:30:12 node1 kernel: ds: 0068 es: 
0068 ss: 0068 >>>>>>>>Dec 8 14:30:12 node1 kernel: Process nfsd (pid: 67587, >>>>>>>>stackpage=f4289000) >>>>>>>>Dec 8 14:30:12 node1 kernel: Stack: f31e6b80 00004000 00000004 >>>>>>>>f4289b18 >>>>>>>>00000000 00000000 00000000 f31f5680 >>>>>>>>Dec 8 14:30:12 node1 kernel: 00000000 c057fce0 00000001 >>>>>>>>00008000 >>>>>>>>00000001 00000000 00000000 00000000 >>>>>>>>Dec 8 14:30:12 node1 kernel: 00000000 00000000 00000000 >>>>>>>>00000000 >>>>>>>>00000000 00000000 00000000 00000000 >>>>>>>>Dec 8 14:30:12 node1 kernel: Call Trace: >>>>>>>>Dec 8 14:30:12 node1 kernel: [<c024ce73>] cfs_readdir [kernel] >>>>>>>>0x553 >>>>>>>>(0xf4289b98) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<f8a66400>] filldir_one [nfsd] 0x0 >>>>>>>>(0xf4289ba4) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<c0257384>] hfind [kernel] 0x14 >>>>>>>>(0xf4289bf8) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<c0257478>] svrtok_lookup [kernel] >>>>>>>>0x78 >>>>>>>>(0xf4289c18) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<c019aec0>] ext3_lookup [kernel] >>>>>>>>0x110 >>>>>>>>(0xf4289c20) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<c016d6bd>] vfs_readdir [kernel] 0xad >>>>>>>>(0xf4289c4c) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<f8a66400>] filldir_one [nfsd] 0x0 >>>>>>>>(0xf4289c58) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<f8a66400>] filldir_one [nfsd] 0x0 >>>>>>>>(0xf4289c68) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<f8a66600>] nfsd_get_name [nfsd] >>>>>>>>0x190 >>>>>>>>(0xf4289c70) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<f8a66400>] filldir_one [nfsd] 0x0 >>>>>>>>(0xf4289c78) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<c02479ec>] cfs_hpget [kernel] 0xec >>>>>>>>(0xf4289c90) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<f8a66b90>] splice [nfsd] 0x30 >>>>>>>>(0xf4289d48) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<c0248775>] cfs_fh_to_dentry [kernel] >>>>>>>>0xe5 >>>>>>>>(0xf4289dbc) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<c025a8e7>] cfstok_relse [kernel] >>>>>>>>0x47 >>>>>>>>(0xf4289dd4) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<c024d395>] cfs_lookup [kernel] 
0x125 >>>>>>>>(0xf4289e00) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<c0172f21>] dput [kernel] 0x31 >>>>>>>>(0xf4289e3c) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<f8a66abd>] nfsd_findparent [nfsd] >>>>>>>>0xbd >>>>>>>>(0xf4289e54) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<f8a73efc>] .rodata.str1.1 [nfsd] >>>>>>>>0x160 >>>>>>>>(0xf4289e60) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<f8a66ed7>] find_fh_dentry [nfsd] >>>>>>>>0x1d7 >>>>>>>>(0xf4289e80) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<f8a67269>] fh_verify [nfsd] 0x199 >>>>>>>>(0xf4289eb4) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<c039ed8b>] svc_sock_enqueue [kernel] >>>>>>>>0x13b >>>>>>>>(0xf4289ef0) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<f8a6f6aa>] nfsd3_proc_getattr [nfsd] >>>>>>>>0x6a >>>>>>>>(0xf4289f10) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<f8a77804>] nfsd_procedures3 [nfsd] >>>>>>>>0x24 >>>>>>>>(0xf4289f3c) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<f8a64639>] nfsd_dispatch [nfsd] 0xc9 >>>>>>>>(0xf4289f48) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<f8a64570>] nfsd_dispatch [nfsd] 0x0 >>>>>>>>(0xf4289f60) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<c039ea45>] svc_process [kernel] >>>>>>>>0x355 >>>>>>>>(0xf4289f68) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<f8a77804>] nfsd_procedures3 [nfsd] >>>>>>>>0x24 >>>>>>>>(0xf4289f8c) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<f8a77118>] nfsd_version3 [nfsd] 0x0 >>>>>>>>(0xf4289f90) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<f8a77138>] nfsd_program [nfsd] 0x0 >>>>>>>>(0xf4289f94) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<f8a6441b>] nfsd [nfsd] 0x1fb >>>>>>>>(0xf4289fb0) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<f8a64220>] nfsd [nfsd] 0x0 >>>>>>>>(0xf4289fe0) >>>>>>>>Dec 8 14:30:12 node1 kernel: [<c01077ed>] kernel_thread_helper >>>>>>>>[kernel] >>>>>>>>0x5 (0xf4289ff0) >>>>>>>>Dec 8 14:30:12 node1 kernel: >>>>>>>>Dec 8 14:30:12 node1 kernel: Code: f0 ff 4f 70 0f 88 c1 15 00 00 b8 >>>>>>>>00 e0 >>>>>>>>ff ff bb fe ff ff ff >>>>>>>> >>>>>>>> >>>>>>>>Dec 8 14:34:06 node1 kernel: Unable to handle kernel NULL pointer 
>>>>>>>>dereference at virtual address 000000cc >>>>>>>>Dec 8 14:34:06 node1 kernel: printing eip: >>>>>>>>Dec 8 14:34:06 node1 kernel: c014926c >>>>>>>>Dec 8 14:34:06 node1 kernel: *pde = 2879a001 >>>>>>>>Dec 8 14:34:06 node1 kernel: *pte = 2821c067 >>>>>>>>Dec 8 14:34:06 node1 kernel: Oops: 0000 >>>>>>>>Dec 8 14:34:06 node1 kernel: nfsd ipt_REJECT ipt_multiport >>>>>>>>ipt_state >>>>>>>>ip_conntrack iptable_filter ip_tables floppy ide-cd sr_mod cdrom >>>>>>>>dm-mod >>>>>>>>keybdev mousedev hid input usb-uhci ehci-hcd u >>>>>>>>Dec 8 14:34:06 node1 kernel: CPU: 0 >>>>>>>>Dec 8 14:34:06 node1 kernel: EIP: 0060:[<c014926c>] Not >>>>>>>>tainted Dec >>>>>>>> 8 14:34:06 node1 kernel: EFLAGS: 00010206 >>>>>>>>Dec 8 14:34:06 node1 kernel: >>>>>>>>Dec 8 14:34:06 node1 kernel: EIP is at generic_file_write [kernel] >>>>>>>>0x1c >>>>>>>>(2.4.22-1.2199.nptl_ssi_5develsmp) >>>>>>>>Dec 8 14:34:06 node1 kernel: eax: 00000000 ebx: c1f6be78 ecx: >>>>>>>>000001eb edx: ffffffea >>>>>>>>Dec 8 14:34:06 node1 kernel: esi: ffffffff edi: 00000000 ebp: >>>>>>>>c1f6bdfc esp: c1f6bdd4 >>>>>>>>Dec 8 14:34:06 node1 kernel: ds: 0068 es: 0068 ss: 0068 >>>>>>>>Dec 8 14:34:06 node1 kernel: Process cfs_async (pid: 13, >>>>>>>>stackpage=c1f6b000) >>>>>>>>Dec 8 14:34:06 node1 kernel: Stack: c1f6be78 f02157a6 00000000 >>>>>>>>c1f6be98 >>>>>>>>e171ba80 c1f6a000 000007a6 c1f6be78 >>>>>>>>Dec 8 14:34:06 node1 kernel: ffffffff 00000000 c1f6be20 >>>>>>>>c0192d45 >>>>>>>>c1f6be78 df723000 000001eb c1f6be98 >>>>>>>>Dec 8 14:34:06 node1 kernel: c1f6a000 ffffffff ffffffff >>>>>>>>c1f6bef4 >>>>>>>>c0253f44 c1f6be78 df723000 000001eb >>>>>>>>Dec 8 14:34:06 node1 kernel: Call Trace: >>>>>>>>Dec 8 14:34:06 node1 kernel: [<c0192d45>] ext3_file_write [kernel] >>>>>>>>0x35 >>>>>>>>(0xc1f6be00) >>>>>>>>Dec 8 14:34:06 node1 kernel: [<c0253f44>] cfsd_write [kernel] 0xe4 >>>>>>>>(0xc1f6be24) >>>>>>>>Dec 8 14:34:06 node1 kernel: [<c0253c11>] cfsd_close [kernel] 0x61 >>>>>>>>(0xc1f6be48) >>>>>>>>Dec 8 14:34:06 node1 
kernel: [<c0253e48>] cfsd_read [kernel] 0x108 >>>>>>>>(0xc1f6be60) >>>>>>>>Dec 8 14:34:06 node1 kernel: [<c024f284>] cfs_proc_write [kernel] >>>>>>>>0x1c4 >>>>>>>>(0xc1f6bef8) >>>>>>>>Dec 8 14:34:06 node1 kernel: [<c024b134>] cfs_async_handler_write >>>>>>>>[kernel] 0x104 (0xc1f6bf58) >>>>>>>>Dec 8 14:34:06 node1 kernel: [<c02930c6>] nsc_async_daemon [kernel] >>>>>>>>0x1b6 >>>>>>>>(0xc1f6bfa4) >>>>>>>>Dec 8 14:34:06 node1 kernel: [<c0292f10>] nsc_async_daemon [kernel] >>>>>>>>0x0 >>>>>>>>(0xc1f6bfe0) >>>>>>>>Dec 8 14:34:06 node1 kernel: [<c01077ed>] kernel_thread_helper >>>>>>>>[kernel] >>>>>>>>0x5 (0xc1f6bff0) >>>>>>>>Dec 8 14:34:06 node1 kernel: >>>>>>>>Dec 8 14:34:06 node1 kernel: Code: 8b 80 cc 00 00 00 8b 78 20 0f 88 >>>>>>>>a8 00 >>>>>>>>00 00 b9 00 e0 ff ff >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>>I built the latest SSI kernel from CVS OPENSSI-RH branch using the >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>Fedora >>>>>>>> >>>>>>>>>Core base kernel from srpms OPENSSI-FC branch. >>>>>>>>> >>>>>>>>>There is a critical ext3 bug. After reboot I lost >>>>>>>>>/boot/initrd-2.4.22...develsmp.img when I umount some other >>>>>>>>>partition on >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>the >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>------------------------------------------------------- >>>>>>>>SF email is sponsored by - The IT Product Guide >>>>>>>>Read honest & candid reviews on hundreds of IT Products from real >>>>>>>>users. >>>>>>>>Discover which products truly live up to the hype. Start reading >>>>>>>>now. http://productguide.itmanagersjournal.com/ >>>>>>>>_______________________________________________ >>>>>>>>ssic-linux-devel mailing list >>>>>>>>ssi...@li... >>>>>>>>https://lists.sourceforge.net/lists/listinfo/ssic-linux-devel