Re: [SSI-users] DRBD failover issue, kernel panic no init found
From: Vijay S. <vij...@ni...> - 2006-07-08 01:14:14
I don't think it's an SSIfailover or rc.sysrecover issue. It seems that it
tries to re-spawn INIT without failing over DRBD first.

If I try cfs_setroot ext3 /dev/drbd/0 on a node, nothing seems to happen.
Is that the correct behavior?

I'm attaching a JPEG screenshot of the error. Hopefully it goes through.

/vijay

On Fri, 2006-07-07 at 17:56 -0400, Roger Tsang wrote:
> I already migrated to SSI-1.9.x with the latest DRBD which is based on
> drbd-0.7.20 but haven't checked in the update yet. The latest one in
> CVS is based on drbd-0.7.19 and works fine unless you changed your
> drbd devices' al-extents parameter. In that case you would want
> drbd-ssi based on drbd-0.7.20.
>
> It seems to me, without much info, your SSIfailover and rc.sysrecover
> system scripts weren't set up properly. You have to modify them for
> drbd for root filesystem failover. The drbd-ssi-1.2.2-20050712
> tarball should include a sample of these two system scripts.
>
> Roger
>
>
> On 7/7/06, Vijay Swami <vij...@ni...> wrote:
> > Roger,
> >
> > It's drbd-ssi-1.2.2-20050712. It seems like as soon as the master goes
> > down, the slave does not resort to using its own copy. I don't see any
> > EXT3 FSCK messages like I've seen by searching the list for successful
> > failover messages. The error is almost certainly that it can't find
> > /sbin/init, probably because once the connection to the master dies,
> > it's not properly using its 'own' copy.
> >
> > I'm wondering if this is related to the Debian /dev/drbd0 vs /dev/drbd/0
> > issue that I've read about. I noticed you said you ran into this problem
> > on FC2, which is what RHEL3 is based on. Were you using OpenSSI 1.2.x
> > or 1.9.x, and which version of DRBD?
> >
> > Thanks.
> >
> > /vijay
> >
> > On Thu, 2006-07-06 at 22:50 -0400, Roger Tsang wrote:
> > > Are you using drbd-ssi (openSSI modified drbd) and have enabled
> > > SSI-failover service?
> > >
> > > If you like to try the latest drbd-ssi I'll be making a drbd-ssi
> > > tarball based on drbd-0.7.20, hopefully sometime this week.
> > >
> > > Roger
> > >
> > >
> > > On 7/6/06, Vijay Swami <vij...@ni...> wrote:
> > > > Pertinent details:
> > > > * 2 node cluster
> > > > * RHEL3
> > > > * Kernel/OpenSSI ver: 2.4.21-27.0.2.EL_ssi_2 (install from RPM)
> > > > * drbd version 0.7.11 (api:77)
> > > > * SunFire X4100 2 CPU AMD Opteron hardware
> > > >
> > > > drbd.conf snippet:
> > > >
> > > > on host1 {
> > > >     device    /dev/drbd/0;
> > > >     disk      /dev/sda2;
> > > >     nodenum   1;
> > > >     address   192.168.1.1:7788;
> > > >     meta-disk /dev/sda3[0];
> > > >
> > > >
> > > > on host2 {
> > > >     device    /dev/drbd/0;
> > > >     disk      /dev/sda2;
> > > >     nodenum   2;
> > > >     address   192.168.1.2:7788;
> > > >     meta-disk /dev/sda3[0];
> > > >
> > > > /etc/fstab:
> > > >
> > > > /dev/drbd/0  /         ext3    defaults,chard,errors=remount-ro,node=1:2  0 1
> > > > /dev/sda1    /boot     ext3    defaults,node=1:2                          1 2
> > > > none         /dev/pts  devpts  gid=5,mode=620,node=*                      0 0
> > > > none         /proc     proc    defaults,node=*                            0 0
> > > > #none        /dev/shm  tmpfs   defaults                                   0 0
> > > >
> > > > df output:
> > > >
> > > > Filesystem     1K-blocks     Used  Available  Use%  Mounted on
> > > > /dev/1/drbd/0   10080520  2329744    7238708   25%  /
> > > > /dev/1/sda1       248895    61612     174433   27%  /boot
> > > >
> > > > The root file system is configured with DRBD, and working great. Each
> > > > node can act as an init node when booted up individually. The other node
> > > > joins the cluster, and things work great.
> > > >
> > > > However, I'm having trouble with the failover.
> > > >
> > > > If I 'unplug' the master node, the slave recognizes the master has gone
> > > > down, and attempts to take over. It prints DRBD timeout messages to the
> > > > console (as expected), then:
> > > >
> > > > ipcnameserver ready
> > > > Kernel Panic: no init found. Cannot restart
> > > >
> > > > .. and then it reboots.
> > > >
> > > > I have a feeling it's something fairly simple here, as DRBD itself works
> > > > fine when the nodes are booted. I'm assuming it's looking for /sbin/init
> > > > to start on the slave node but can't find it? I don't know why, since
> > > > DRBD itself works on either node acting as an init node when they are
> > > > booted, regardless of sequence.
> > > >
> > > > Any ideas?
> > > >
> > > > Thanks.
> > > >
> > > > /vijay
> > > >
> > > > _______________________________________________
> > > > Ssic-linux-users mailing list
> > > > Ssi...@li...
> > > > https://lists.sourceforge.net/lists/listinfo/ssic-linux-users
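
For anyone comparing their own setup against the one posted in this thread, here is a minimal sketch (not something from Roger or Vijay) that reads an fstab and reports whether the root entry carries the `chard` and `node=` options seen in the posted /etc/fstab. The function names `parse_fstab` and `root_failover_report` are hypothetical, and the colon-separated reading of `node=1:2` is an assumption taken from the posted file. It only inspects text; a clean report does not show that SSIfailover or rc.sysrecover are set up correctly.

```python
# Sanity-check sketch: inspect the root entry of an OpenSSI-style fstab.
# Assumption: "node=1:2" is a colon-separated list of node numbers, as the
# fstab posted in this thread suggests. Helper names are hypothetical.

SAMPLE_FSTAB = """\
/dev/drbd/0  /         ext3    defaults,chard,errors=remount-ro,node=1:2  0 1
/dev/sda1    /boot     ext3    defaults,node=1:2                          1 2
none         /dev/pts  devpts  gid=5,mode=620,node=*                      0 0
none         /proc     proc    defaults,node=*                            0 0
#none        /dev/shm  tmpfs   defaults                                   0 0
"""


def parse_fstab(text):
    """Return one dict per non-comment fstab line with at least four fields."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and commented-out entries
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete fstab entry
        device, mountpoint, fstype, options = fields[:4]
        entries.append(
            {
                "device": device,
                "mountpoint": mountpoint,
                "fstype": fstype,
                "options": options.split(","),
            }
        )
    return entries


def root_failover_report(entries):
    """Summarize the root entry, or return None if there is no '/' entry."""
    root = next((e for e in entries if e["mountpoint"] == "/"), None)
    if root is None:
        return None
    nodes = []
    for opt in root["options"]:
        if opt.startswith("node="):
            nodes = opt[len("node="):].split(":")
    return {
        "device": root["device"],
        "has_chard": "chard" in root["options"],
        "nodes": nodes,
    }


if __name__ == "__main__":
    print(root_failover_report(parse_fstab(SAMPLE_FSTAB)))
```

To check a real system, pass the contents of /etc/fstab to `parse_fstab` instead of `SAMPLE_FSTAB`. The reported device path is also worth comparing with the `device` line in drbd.conf, given the /dev/drbd0 vs /dev/drbd/0 question raised above.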