Thread: [SSI] Kernel oops when 2nd node boots


Hi,

I have installed SSI 0.7.0 on a PC and created a 3-node cluster. I'm using CFS
and the root filesystem is on an IDE disk. The master node boots without
problems, but the second and third nodes get an oops during boot when
cmount runs. Both node-2 and node-3 panic the same way. The oops happens
at cfs_get_uniqueid+0x2a.

I found that if I do "umount /etc/sysconfig/network-scripts-1" on the master
node before attempting to boot the other nodes, I don't get the oops when node-2
and node-3 boot. Instead they boot and join the cluster.

Here are some details about the oops.

Console messages during boot and kdb output:

....
RAMDISK: Compressed image found at block 0
Freeing initrd memory: 1552k freed
VFS: Mounted root (ext2 filesystem).
Freeing unused kernel memory: 208k freed
Note: unable to open serial console.
Mounting /proc
Loading 8390 module
Loading ne2k-pci module
ne2k-pci.c:v1.02 10/19/2000 D. Becker/P. Gortmaker
  http://www.scyld.com/network/ne2k-pci.html
eth0: RealTek RTL-8029 found at 0xfce0, IRQ 9, 00:00:E8:88:52:E5.
Gathering cluster info
Configuring cluster
Running pre-root cluster initialization
RTNL: assertion failed at devinet.c(805):inetdev_event
RTNL: assertion failed at devinet.c(805):inetdev_event
spawn_daemon_thread: Truncated daemon name ics_llunack_daemon
spawn_daemon_thread: Truncated daemon name ics_probe_clms_daemon
Searching for an existing root node...
Found node 1 as the root node.
spawn_daemon_proc: Truncated daemon name nm cli nd daemon
spawn_daemon_proc: Truncated daemon name nm cli send daemon
ipcnameserver ready completed

This is a NonStop Clusters kernel.
    This Cluster Node: 3
    Cluster Master Node(s): 1:192.168.1.2

Name server registered with clms
ipcname_read completed
spawn_daemon_proc: Truncated daemon name CFS delrel daemon
Mounting root in linuxrc
Unmounting /proc
Attempting pivot_root
Running post-root cluster initialization
Starting init
/etc/rc.d/nodeup 3 running
                        Welcome to Red Hat Linux
Unable to handle kernel paging request at virtual address ce17ea54
*pde = 00000000

Entering kdb (current=0xc394a000, pid 196687) Oops: Oops
due to oops @ 0xc01abd5a
eax = 0xce17e9cc ebx = 0xc467a480 ecx = 0xc467a480 edx = 0x00000003 
esi = 0xc033349c edi = 0xc11d8000 esp = 0xc394be3c eip = 0xc01abd5a 
ebp = 0xc394be3c xss = 0x00000018 xcs = 0x00000010 eflags = 0x00010246 
xds = 0x00000018 xes = 0x00000018 origeax = 0xffffffff &regs = 0xc394be08
kdb> bt
    EBP       EIP         Function(args)
0xc394be3c 0xc01abd5a cfs_get_uniqueid+0x2a (0xc467a480, 0xc3906000, 0xc394beb4)
                               kernel .text 0xc0100000 0xc01abd30 0xc01abd70
0xc394be58 0xc0138f96 do_kern_mount+0x166 (0xc3905000, 0x40000000, 0xc3904000, 0xc3906000, 0xc3904000)
                               kernel .text 0xc0100000 0xc0138e30 0xc0138fd0
0xc394be8c 0xc01499e8 do_add_mount+0x48 (0xc394beb4, 0xc3905000, 0x40000000, 0x0, 0xc3904000)
                               kernel .text 0xc0100000 0xc01499a0 0xc0149ab0
0xc394bee0 0xc0149ca8 do_mount+0x138
                               kernel .text 0xc0100000 0xc0149b70 0xc0149cd0
0xc394bf5c 0xc01dcd67 ssisys_discover_mounts+0x1c7
                               kernel .text 0xc0100000 0xc01dcba0 0xc01dcde0
kdb> ps
Task Addr  Pid      Parent   [*] cpu  State Thread     Command
....
0xc3ac4000 00066323 00000001  1  000  stop  0xc3ac4280 rc.nodeup
0xc3a58000 00196674 00066323  1  000  stop  0xc3a58280 rc.sysinit.node
0xc394a000 00196687 00196674  1  000  run   0xc394a280*cmount
kdb> id 0xc01abd5a
0xc01abd5a cfs_get_uniqueid+0x2a:         mov    0x88(%eax),%eax
0xc01abd60 cfs_get_uniqueid+0x30:         pop    %ebp
0xc01abd61 cfs_get_uniqueid+0x31:         ret    
0xc01abd62 cfs_get_uniqueid+0x32:         lea    0x0(%esi,1),%esi
0xc01abd69 cfs_get_uniqueid+0x39:         lea    0x0(%edi,1),%edi
...
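
As a sanity check on the register dump, the faulting virtual address
(ce17ea54) should equal eax plus the 0x88 displacement of the faulting
"mov 0x88(%eax),%eax". The following is only a minimal sketch of that
arithmetic; the function and constant names are made up for illustration
and do not come from the SSI source.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the oops message and the kdb register dump. */
static const uint32_t EAX_AT_FAULT = 0xce17e9ccu; /* eax */
static const uint32_t FAULT_ADDR   = 0xce17ea54u; /* "paging request at" */
static const uint32_t DISPLACEMENT = 0x88u;       /* mov 0x88(%eax),%eax */

/* Effective address of a base+displacement memory operand (hypothetical helper). */
static uint32_t effective_address(uint32_t base, uint32_t disp)
{
    return base + disp;
}

/* Aborts if the address computed from eax disagrees with the reported one. */
static void check_fault_address(void)
{
    assert(effective_address(EAX_AT_FAULT, DISPLACEMENT) == FAULT_ADDR);
}
```

The two values agree, so the bad pointer appears to be whatever was in eax
when that instruction ran.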

Extract from System.map

c01abd30 T cfs_get_uniqueid
c01abd70 t cfs_statfs

Output from "objdump --source -d cluster/ssi/cfs/inode.o"
(this kernel was compiled with "-g")

...
00000500 <cfs_get_uniqueid>:

unsigned long
cfs_get_uniqueid(struct vfsmount *mnt, void *raw_data)
{
     500:       55                      push   %ebp
     501:       89 e5                   mov    %esp,%ebp
     503:       8b 45 0c                mov    0xc(%ebp),%eax
     506:       8b 4d 08                mov    0x8(%ebp),%ecx
        struct cfs_mount_data *data = (struct cfs_mount_data *)raw_data;
        struct cfsmountargs *argp;

        if (data == NULL)
     509:       85 c0                   test   %eax,%eax
     50b:       74 15                   je     522 <cfs_get_uniqueid+0x22>
                return mnt->mnt_uniqueid;

        if (data->magic != CFS_MOUNT_MAGIC)
     50d:       81 38 cf cf cf cf       cmpl   $0xcfcfcfcf,(%eax)
     513:       75 0d                   jne    522 <cfs_get_uniqueid+0x22>
                return mnt->mnt_uniqueid;

        if (data->mode != CFS_NOTIFY && data->mode != CFS_DISCOVER)
     515:       8b 50 04                mov    0x4(%eax),%edx
     518:       83 fa 01                cmp    $0x1,%edx
     51b:       74 0a                   je     527 <cfs_get_uniqueid+0x27>
     51d:       83 fa 03                cmp    $0x3,%edx
     520:       74 05                   je     527 <cfs_get_uniqueid+0x27>
                return mnt->mnt_uniqueid;
     522:       8b 41 3c                mov    0x3c(%ecx),%eax
     525:       eb 09                   jmp    530 <cfs_get_uniqueid+0x30>

        argp = (struct cfsmountargs *) data->payload;
     527:       8b 40 08                mov    0x8(%eax),%eax
        return argp->uniqueid;
     52a:       8b 80 88 00 00 00       mov    0x88(%eax),%eax
}
     530:       5d                      pop    %ebp
     531:       c3                      ret    
     532:       8d b4 26 00 00 00 00    lea    0x0(%esi,1),%esi
     539:       8d bc 27 00 00 00 00    lea    0x0(%edi,1),%edi
...
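
To tie the oops address to the listing above: the EIP minus the System.map
address of cfs_get_uniqueid gives the offset into the function, and adding
that offset to the function's start in the objdump output gives the
instruction in inode.o. Below is a rough sketch of that calculation; the
helper names are hypothetical and the constants are the ones quoted in
this mail.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of an address from the start of its containing symbol. */
static uint32_t symbol_offset(uint32_t addr, uint32_t symbol_start)
{
    return addr - symbol_start;
}

/* Map a runtime EIP onto the address shown in the objdump listing. */
static uint32_t objdump_address(uint32_t eip, uint32_t runtime_start,
                                uint32_t objdump_start)
{
    return objdump_start + symbol_offset(eip, runtime_start);
}

/* Aborts if the numbers from the oops, System.map and objdump disagree. */
static void check_faulting_instruction(void)
{
    const uint32_t eip           = 0xc01abd5au; /* from the oops */
    const uint32_t runtime_start = 0xc01abd30u; /* System.map: cfs_get_uniqueid */
    const uint32_t objdump_start = 0x500u;      /* inode.o: <cfs_get_uniqueid> */

    assert(symbol_offset(eip, runtime_start) == 0x2au);
    assert(objdump_address(eip, runtime_start, objdump_start) == 0x52au);
}
```

Offset 0x52a is the "mov 0x88(%eax),%eax" that implements
"return argp->uniqueid;", so the fault seems to occur while dereferencing
argp, i.e. the pointer loaded from data->payload.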

Any idea what's going wrong here?

Thanks,
Sven-Olof
