Thread: [SSI-users] root failover and local boot device
From: <len...@pa...> - 2004-11-23 16:07:49
Hi,

I'm planning to install OpenSSI (for Debian Sarge) on a two-node system with a shared SCSI bus, i.e.:

- node1 (no internal disk)
- node2 (no internal disk)
- shared storage: 4 SCSI disks

While reading http://openssi.org/docs/debian/INSTALL.shtml , I noticed the following, which is not very clear to me:

--- begin ---
If you have enabled root failover you MUST configure a local boot device on the new node. Otherwise, configuring a local boot device is optional. If you are going to configure a local boot device, it is highly recommended that the boot device have the same name as the first node's boot device. Remember that we assumed at the beginning of these instructions that the first node's boot device is located on the first partition of the first drive (e.g., /dev/hda1 or /dev/sda1).
--- end ---

Does this mean that each of the nodes should have its "private" boot disk, e.g. node1 uses sda1 and node2 uses sdb1? If this is the case, what would be the consequences of having a different boot device name for each node?

Apparently, ssi-chnode copies files to the boot device. Would it be possible to use a floppy (size?) or a network-bootable image as boot device (e.g. using PXE or Etherboot)?

cu,

--
len...@pa...
gpg fingerprint: A41E A399 5160 BAB9 AEF1 58F2 B92A F4AB 9FFB 3707
gpg key id: 9FFB3707

"Those who do not understand Unix are condemned to reinvent it, poorly."
    -- Henry Spencer
From: Brian J. W. <Bri...@hp...> - 2004-11-24 01:16:58
Frank Lenaerts wrote:
> I'm planning to install OpenSSI (for Debian Sarge) on a two node
> system with a shared SCSI bus [...]
>
> Does this mean that each of the nodes should have its "private"
> bootdisk e.g. node1 uses sda1 and node2 uses sdb1? If this is the
> case, what would be the consequences of having a different boot
> device name for both nodes?

I was attempting to keep the GRUB boot block instructions as simple as possible, both for the sake of inexperienced users and because I'm no GRUB expert myself. What I'm trying to say by "the first partition of the first drive" is that you should use the partition that GRUB would call 'hd0,0'. Typically, this is /dev/hda1 or /dev/sda1.

If you're willing to be more adventurous with configuring GRUB, then you can put the local boot partition on any device that you care to use. You just need to substitute 'hd0,0' and 'hd0' in the GRUB boot block instructions, as needed.

As far as OpenSSI is concerned, the local boot partition can be a different device on different nodes. See /etc/clustertab to see how this is configured. ssi-chnode is just a front-end interface for editing /etc/clustertab.

> Apparently, ssi-chnode copies files to the boot device.

Indirectly. It calls ssi-ksync, which does the actual copying of boot materials to all local boot partitions listed in /etc/clustertab.

> Would it be possible to use a floppy (size?)

The size of the OpenSSI kernel/ramdisk would be well beyond a floppy's capacity.

> or use a network bootable image as boot device (e.g. using PXE or etherboot)?

By default, all non-root nodes network boot, using images in /tftpboot on the shared root.

Alternatively, since both of your potential root nodes are booting from a shared disk, you could attempt to configure them to boot from the same exact partition. I haven't tested ssi-ksync or the rest of openssi-tools with this configuration, but it might work.

Regards,
Brian
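[Editor's note: for readers unfamiliar with PXE, network booting a non-root node comes down to a DHCP entry that points the node at the TFTP server holding the /tftpboot images. A minimal ISC dhcpd.conf stanza might look like the sketch below; the addresses, MAC, and loader filename are placeholders, not values taken from an OpenSSI installation.]

```
# dhcpd.conf -- illustrative PXE stanza (all values are placeholders)
subnet 10.0.0.0 netmask 255.255.255.0 {
    host node2 {
        hardware ethernet 00:11:22:33:44:55;   # MAC of node2's boot NIC
        fixed-address 10.0.0.2;
        next-server 10.0.0.1;                  # TFTP server (the node serving /tftpboot)
        filename "pxelinux.0";                 # boot loader image under /tftpboot
    }
}
```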
From: <len...@pa...> - 2004-11-24 08:11:01
On Tue, Nov 23, 2004 at 05:16:41PM -0800, Brian J. Watson wrote:
> [...]
>
> As far as OpenSSI is concerned, the local boot partition can be a
> different device on different nodes. See /etc/clustertab to see how
> this is configured. ssi-chnode is just a front-end interface for
> editing /etc/clustertab.

Looking at README.clustertab, it seems that only swap would be node specific.

> > Apparently, ssi-chnode copies files to the boot device.
>
> Indirectly. It calls ssi-ksync, which does the actual copying of boot
> materials to all local boot partitions listed in /etc/clustertab.

I haven't looked at ssi-ksync yet, but in my case it wouldn't have to copy anything, as I would only have a "local" swap partition.

> Alternatively, since both of your potential root nodes are booting from
> a shared disk, you could attempt to configure them to boot from the
> same exact partition. I haven't tested ssi-ksync or the rest of
> openssi-tools with this configuration, but it might work.

That is what I was thinking, i.e.:

- install everything on node1 on shared storage (note that even the grub stuff could be on the same disk, as both nodes see e.g. /dev/sda as hd0)
- if node1 is not up, node2 can be booted from the same partition
- if node1 is up, node2 boots via the network

I will have a look at ssi-ksync.

cu,

--
len...@pa...
From: Brian J. W. <Bri...@hp...> - 2004-11-24 20:22:26
Frank Lenaerts wrote:
> Looking at README.clustertab, it seems that only swap would be node
> specific.

You're referring to README.clusterfstab, which I guess is confusingly named. I'll rename it to README.fstab, since the cluster part is implied by it being part of OpenSSI.

If you want to learn about /etc/clustertab, read the comments at the beginning of the file.

> > > Apparently, ssi-chnode copies files to the boot device.
> >
> > Indirectly. It calls ssi-ksync, which does the actual copying of boot
> > materials to all local boot partitions listed in /etc/clustertab.
>
> I haven't looked at ssi-ksync yet, but in my case it wouldn't have to
> copy anything as I would only have a "local" swap partition.

It has nothing to do with swap. It copies boot materials (kernels, ramdisks, and grub.conf) from /boot into all local boot partitions. It also runs ssi-ksync-network, which regenerates the network boot images under /tftpboot.

> That is what I was thinking i.e.
>
> - install everything on node1 on shared storage (note that even the
>   grub stuff could be on the same disk as both nodes see e.g. /dev/sda
>   as hd0)
>
> - if node1 would not be up, node2 could be booted from the same partition
>
> - if node1 is up, node2 boots via the network

I don't see a problem with booting both nodes from the same boot partition, regardless of who's up. The boot partition is effectively read-only when a node is booting and loads its kernel and ramdisk from the partition. There could be trouble if you boot node 1, mount the boot partition on /boot (which you might want to do), then boot node 2 while you're simultaneously changing stuff in /boot (which you can consciously avoid doing). The worst that might happen as a result is that you'll need to boot node 2 again.

> I will have a look at ssi-ksync.

My main concern about ssi-ksync is that it won't recognize that both nodes 1 and 2 are sharing the same boot partition, so it will attempt to copy the boot materials on top of themselves. I think this will effectively do nothing, but I'm not sure.

Regards,
Brian
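[Editor's note: the double-copy concern Brian raises could in principle be avoided by deduplicating the device list before copying. The shell sketch below illustrates the idea only; it assumes a simplified hypothetical clustertab format of "node:boot-device" per line, which may not match the real /etc/clustertab layout, and it does not modify any disk.]

```shell
#!/bin/sh
# Illustrative sketch only: list each local boot device from a
# clustertab-like file exactly once before syncing /boot onto it.
# Assumed (hypothetical) line format: "node:boot-device".
CLUSTERTAB=${1:-/etc/clustertab}

# Take the device field, drop comments and blank lines, keep each device once.
cut -d: -f2 "$CLUSTERTAB" | grep -v '^#' | grep -v '^$' | sort -u |
while read dev; do
    # The real ssi-ksync would mount the device and copy kernels,
    # ramdisks, and grub.conf here; we only report what would happen.
    echo "would sync /boot to $dev"
done
```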
From: Bruce W. <br...@ka...> - 2004-11-30 17:35:44
Frank,

Please share your experience so we can document on the website how to use this configuration.

thanks,
bruce
From: <len...@te...> - 2005-03-07 22:51:22
On Tue, Nov 30, 2004 at 09:35:32AM -0800, Bruce Walker wrote:
> Frank,

Hi,

> Please share your experience so we can document on website how to
> use this configuration.

Last weekend, I finally had some time to set it up. Below, I describe my experiences so far:

* http://www.openssi.org/cgi-bin/view?page=docs2/1.2/debian/INSTALL.html

  The lines:

    deb http://ftp.easynet.fr/openssi/openssidebs-devel ./
    deb-src http://ftp.easynet.fr/openssi/openssidebs-devel ./

  should become (the terminating / is necessary because the packages are right below the given directory; see man sources.list for more info):

    deb http://ftp.easynet.fr/openssi/openssidebs-devel/ ./
    deb-src http://ftp.easynet.fr/openssi/openssidebs-devel/ ./

* The on-screen documentation seems to be oriented towards Red Hat, e.g. it refers to /etc/sysconfig/loadlevellist while the installation document refers to /cluster/etc/loadlevellist (which is correct). This, however, is just a cosmetic problem.

* Adding new nodes: NICs and netbooting

  Each of my nodes has 4 network interfaces (4 separate cards). My goal would be to have 2 NICs for public addresses (with local NIC failover in case of problems with the public network, cfr. Sun Cluster's NAFO) and 2 NICs for the private interconnect (to avoid a SPOF). As I installed Debian Sarge using the net installer, I wanted to use eth0 as public interface and eth1 as first interconnect.

  As new nodes "have to" network boot, and most (if not all) PC BIOSes only allow you to select "boot from LAN", which only uses the first NIC (i.e. there does not seem to be a way to set up netbooting from a specific NIC, e.g. eth1), I decided to use the primary (core LAN) NIC as interconnect, because only this one would be used for netbooting.

  => for netbooting (via PXE), normally only eth0 can be used: this means that eth0 should be used as interconnect
  => as only eth0 can be used for PXE, a spare interconnect cannot be set up?
* Adding new nodes: shared SCSI setup

  With a shared SCSI setup, both nodes would boot from the shared disk. In this case, the nodes should not be netbooting at all, i.e. the whole PXE/Etherboot setup seems superfluous. Because the documentation insists on netbooting, I was initially thinking that it would work as follows:

  - node1 tries to boot from the network, but as there is no boot server available, it boots from disk (the BIOS falls back to the alternative boot option)
  - node2 tries to boot from the network; as node1 is running a boot server, this is possible

  Is it correct that in my case netbooting would not be necessary?

* TFTP server issues:

  - The TFTP server is in package tftpd-hpa (not tftp-hpa).
  - Why exactly does the TFTP server have to run as root instead of nobody? If it runs as nobody, I get the message "cannot set groups for user nobody".

* GRUB issues:

  - I set up a separate swap partition for each node but did not set up a local boot device (even though I configured root failover). Following my reasoning that both nodes boot from the shared disk (i.e. even node2 does not netboot), everything was already set up (except for the extra swap partition). node2 would just reuse the GRUB configuration etc.
  - I noticed that the default kernel to boot was still the one from the initial Debian Sarge installation, so I changed the default.

* Root failover setup:

  - manually added /dev/sda1 to /etc/clustertab for the second node, as only node1 had a boot device associated with it
  - changed /etc/fstab:
    - included node2 in the node list for the root filesystem
    - changed devices into their corresponding labels, added the chard mount option and added node=1:2 (because all my filesystems are failover)

* Shutdown a node

  - As an initial root failover test, I issued a shutdown -h now on node1: this fails because it cannot find awk, sort (and possibly some other tools).
    After entering the root password, I get into maintenance mode, where I can see that most filesystems (all except /proc and /) are already unmounted. As awk and sort are in /usr/bin and /usr is a separate partition, this is of course a problem. I think this is due to my separate /usr filesystem, but I can't imagine I'm the first one to encounter this.

  - In maintenance mode, I pinged another machine on the network (don't remember why) but I could not CTRL-C it anymore, so I had to use the power button to turn the machine off. At that point, node2 said the following:

      Taking over master from node 1.
      Node 1 has gone down!!!
      ...
      Kernel panic
      ...
      kdb

    => not exactly what I want ;-)

* Questions:

  - Working remotely: it seems that 2 sshd processes are running. Using ssh to the individual nodes does not work ("host key verification failed"). Using ssh from the nodes also does not work (same error). Any idea?
  - What is the purpose of /etc/rc.nodeinfo? I didn't find any documentation about it and the file doesn't contain any comments. It seems to indicate what should be started where etc. The reason I ask is that I saw messages like "Service ifupdown has no entry in rc.nodeinfo", "Service reboot has no entry in rc.nodeinfo", ...
  - When booting, right after "Waiting for 5 seconds, press ENTER" (from the initrd), I always see the message "kill: 1: No such process". What's that?
  - When issuing a shutdown -h now on a node (to shut down one node), everything seems to be stopped twice (the messages are shown twice). When all processes are gone, node1 seems to be stopped, but the console of node2 seems to hang. As soon as node1 is powered off, node2 says "nm_log_daemon: Missed 2 packets from node 1; Taking over master from node 1; Node 1 has gone down!!!; Kernel panic".

I hope some of my questions and/or remarks can be cleared up. If more information or testing is needed, just let me know.
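[Editor's note: to make the root failover fstab change described above concrete, the sketch below shows the kind of root entry Frank appears to mean. The labels, field placement, and node=1:2 syntax are illustrative reconstructions from his description, not verified OpenSSI syntax; check README.clusterfstab for the real format.]

```
# /etc/fstab (illustrative sketch, syntax not verified against OpenSSI)
# Root filesystem identified by label, hard-mounted via CFS ('chard'),
# served by node 1 and failing over to node 2 (node=1:2).
LABEL=root  /  ext3  defaults,chard  0  1  node=1:2
```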
cu,

--
len...@pa...