Re: [SSI-users] DRBD Root Failover with Fedora Core 2
From: John D. <jo...@na...> - 2005-11-23 05:37:28
As a follow-up to my previous post, I have determined that Fedora Core 2 refuses to start the drbd service on Node2. Why is that, and what can I do to correct it?

Specifically, "service drbd start" returns an error about "rc.nodeinfo". This file is present, and when I manually add the line "drbd all Y", I can then issue "onnode 2 service drbd start" (which then attempts to start the service on *every* node!). But when it did start on Node2, it began to sync with Node1! Yay!

So, here are my final hurdles:

1) How can I make Fedora start drbd services automatically at boot?
2) How can I make a service, like httpd, start on a different node (say, node 3)? Is that necessary, or will bash_ll handle this for me?
3) If node 3 dies, or gets abducted by aliens, will node 2 fire httpd back up automatically, since the process fell off?

John David wrote:
> Hi all,
>
> I have a 2-node cluster up and running using OpenSSI-1.2.2 and Fedora
> Core 2, but I am running into a few bumps getting drbd to do anything:
>
> 1. I have installed drbd successfully on Node1, and it boots properly
> (although it does do a lot of waiting and counting before actually
> booting). For example, it says that it is waiting for a root node for
> a very long time, then when that is over, it waits for 120 seconds
> more to see a peer node (which I can escape by typing "yes").
> Although this is not a problem right now, Node2's etherboot fails
> after waiting so long. Node2 will connect if I reboot it after Node1
> has finished, but that's awfully temperamental for a HA server, eh?
> That essentially means a power outage will always result in Node1
> being the only one up. I'm hoping that one of my other problems is
> the reason for this delay in booting up.
>
> 2. Node1 likes drbd, but Node2 doesn't have any idea what drbd is:
>
> Node1 offers information through /proc/drbd:
> [root@node1 root]# cat /proc/drbd
> version: 0.7.11 (api:77/proto:74)
> SVN Revision: 1799 build by root@node1, 2005-11-22 19:23:24
> 0: cs:WFConnection st:Primary/Unknown ld:Consistent
> ns:0 nr:0 dw:59196 dr:31353 al:0 bm:63 lo:0 pe:0 ua:0 ap:0
>
> Node2 remains angry and uncooperative:
> [root@node2 root]# cat /proc/drbd
> cat: /proc/drbd: No such file or directory
>
> Node1 appears to have mounted the drbd device properly (but I have no
> idea how this is supposed to look). The results for the "mount"
> command are identical for both nodes:
> /dev/1/drbd/0 on / type ext3 (rw,chard)
> none on /proc type proc (rw)
> none on /sys type sysfs (rw)
> none on /dev/pts type devpts (rw,gid=5,mode=620)
> devfs on /dev type devfs (rw)
> /dev/1/hda1 on /boot type ext3 (rw)
>
> Node1 has a /dev/drbd/0:
> [root@node1 root]# ls -asl /dev/drbd/0
> 0 brw------- 1 root root 147, 0 Dec 31 1969 /dev/drbd/0
>
> Node2 does not:
> [root@node2 root]# ls -asl /dev/drbd/0
> ls: /dev/drbd/0: No such file or directory
>
> This is really strange to me, because if they are booting using the
> same initrd image (updated by ssi-ksync), why isn't Node2 loading the
> drbd module and behaving more like Node1? Moreover, if this node is
> supposed to perform root failover, it would need to provide drbd
> service to the other nodes -- which it obviously isn't ready to do if
> it doesn't have a /dev/drbd/0 device on boot.
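
For anyone who wants to compare the two nodes from a script rather than by eye, here is a minimal sketch that parses the /proc/drbd text quoted above into a dictionary. It assumes the drbd 0.7-style layout shown in the Node1 output (a numbered device line followed by a line of counters). The function names parse_drbd_status and read_drbd_status are made up for this illustration and are not part of drbd or OpenSSI.

```python
import re

# Sample copied from the Node1 output quoted above.
SAMPLE = """\
version: 0.7.11 (api:77/proto:74)
SVN Revision: 1799 build by root@node1, 2005-11-22 19:23:24
0: cs:WFConnection st:Primary/Unknown ld:Consistent
ns:0 nr:0 dw:59196 dr:31353 al:0 bm:63 lo:0 pe:0 ua:0 ap:0
"""

# A device line starts with the minor number, e.g. "0: cs:...".
# Leading whitespace is tolerated in case the real file indents it.
DEVICE_LINE = re.compile(r"^\s*(\d+):\s+(.*)$")


def parse_drbd_status(text):
    """Return {minor: {key: value}} for each device found in the text."""
    devices = {}
    current = None
    for line in text.splitlines():
        match = DEVICE_LINE.match(line)
        if match:
            current = devices.setdefault(int(match.group(1)), {})
            rest = match.group(2)
        elif current is not None:
            rest = line  # counter line belonging to the current device
        else:
            continue  # header lines before the first device
        for token in rest.split():
            key, sep, value = token.partition(":")
            if sep:
                current[key] = value
    return devices


def read_drbd_status(path="/proc/drbd"):
    """Parse the given file, or return None if it does not exist."""
    try:
        with open(path) as handle:
            return parse_drbd_status(handle.read())
    except FileNotFoundError:
        return None  # same symptom as "No such file or directory" on Node2


if __name__ == "__main__":
    print(parse_drbd_status(SAMPLE))
    print(read_drbd_status())
```

On a node in the state Node2 is in above, read_drbd_status() would return None because /proc/drbd is absent; on Node1 the "cs", "st" and "ld" entries carry the connection state, role and disk state from the device line.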