Thread: Re: [SSI-users] Debian Boot Problem (fwd)
From: Aneesh K. KV <ane...@di...> - 2004-01-27 16:46:22
Jiann-Ming Su wrote:

> On Mon, 26 Jan 2004, Aneesh Kumar KV wrote:
>
>> I am not sure what is wrong. Are you sure you are using the correct
>> entry? The first things to check are:
>>
>> 1) Is it picking the correct kernel version? (Check your hda1
>> (root (hd0,0)) and make sure that /boot is there in that directory.)
>> I doubt it is hd0,0. Since you say your root is sda4, I guess it
>> should be hd0,3, in case you don't have a separate boot as sda1. Even
>> if you have a separate boot, the name of the image should then be
>> /vmlinuz-2.4.20-openssi and initrd /initrd-2.4.20ssi.img.gz
>>
>> But in all the above cases it should not boot the kernel.
>>
>> 2) Is it reading the ramdisk? (It will print a console message
>> saying that the ramdisk is found.)
>> 3) Is it configuring the cluster? (It should configure before
>> running init.)
>
> Okay, turns out the problem I described to you was due to a corrupt initrd
> file. I ran mkinitrd again, and was able to get a bit farther along.
> Here's roughly what I'm getting now:
>
> Gathering cluster info
>
> Running pre-root cluster initialization
> RTNL : assertion failed at devinet.c(825)
> RTNL : assertion failed at devinet.c(825)
>
> This is a CI/OpenSSI kernel
>   This cluster is node: 1
>   Cluster master node(s): 1:192.168.0.1
>
> Name server registered with clms
> ipcnamserver ready completed
> ipcname_read completed
> Mounting root in linuxrc
> do_ssisys: Illegal op 44 in state 2
> warning: can't open /etc/fstab: No such file or directory
> Usage mount ......

Can you build the ramdisk using the -k option for mkinitrd? This will keep the temporary files. Then please do the following:

$ cd <temporary directory>
$ more initrd/linuxrc.conf
$ more initrd/linuxrc

I am still not sure why you are getting "do_ssisys: Illegal op 44 in state 2". Can you send me the last portion of the .config that you used to build the kernel, the part that is relevant to the SSI cluster?

-aneesh

--
ph: 603-884-5742
From: Aneesh K. KV <ane...@di...> - 2004-01-27 20:46:27
Jiann-Ming Su wrote:

> On Tue, 27 Jan 2004, Aneesh Kumar KV wrote:
>
>> Can you build the ramdisk using the -k option for mkinitrd? This will keep
>> the temporary files. Then please do the following:
>>
>> $ cd <temporary directory>
>> $ more initrd/linuxrc.conf
>> $ more initrd/linuxrc
>>
>> I am still not sure why you are getting the do_ssisys: Illegal op 44 in
>> state 2. Can you send me the last portion of the .config that you used
>> to build the kernel that is relevant to the SSI cluster?
>
> Please see the attachments for all of the above...
>
> ------------------------------------------------------------------------
>
> DELAY=0
> FSTYPES=ext3,ext2
> VERSION=0.1.55ssi
> CLUSTER_ROOT=/dev/scsi/host0/bus0/target0/lun0/part4

I guess this is the problem. It should be /dev/sda4 for you. I guess you used the -r /dev/sda4 option while building the ramdisk. I am not sure what went wrong with mkinitrd; I will look into this. One easy fix would be to build the ramdisk, then unzip it, mount it, and edit this file manually:

gzip -d <image.gz>
mount -o loop image /mnt
cd /mnt
vi linuxrc.conf

Also make sure you have /mnt/dev/sda4.

> CLUSTER_FSTYPE=
> CLUSTER_FSOPT=chard,errors=remount-ro
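The manual fix described in this message comes down to changing one key in linuxrc.conf once the image is loop-mounted. Below is a minimal sketch of that edit, assuming a plain KEY=value file like the one in the attachment. `set_conf_key` is a hypothetical helper written for illustration; it is not part of OpenSSI or mkinitrd.

```shell
# Hypothetical helper: rewrite one KEY=value line in a config file.
# Assumes the key and value contain no "|" or "&" (sed metacharacters here).
set_conf_key() {
    file=$1 key=$2 value=$3
    # Write to a temporary file first so a failed sed leaves the original intact.
    sed "s|^$key=.*|$key=$value|" "$file" > "$file.new" &&
        mv "$file.new" "$file"
}
```

With the image mounted on /mnt as in the message, this would be called as `set_conf_key /mnt/linuxrc.conf CLUSTER_ROOT /dev/sda4`. The image still has to be unmounted afterwards, and recompressed if the boot entry expects the .gz name.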
From: Aneesh K. KV <ane...@di...> - 2004-01-27 21:20:06
Jiann-Ming Su wrote:

> On Tue, 27 Jan 2004, Aneesh Kumar KV wrote:
>
>> Can you build the ramdisk using the -k option for mkinitrd? This will keep
>> the temporary files. Then please do the following:
>>
>> $ cd <temporary directory>
>> $ more initrd/linuxrc.conf
>> $ more initrd/linuxrc
>>
>> I am still not sure why you are getting the do_ssisys: Illegal op 44 in
>> state 2. Can you send me the last portion of the .config that you used
>> to build the kernel that is relevant to the SSI cluster?
>
> Please see the attachments for all of the above...
>
> ------------------------------------------------------------------------
>
> DELAY=0
> FSTYPES=ext3,ext2
> VERSION=0.1.55ssi
> CLUSTER_ROOT=/dev/scsi/host0/bus0/target0/lun0/part4
> CLUSTER_FSTYPE=
> CLUSTER_FSOPT=chard,errors=remount-ro

Since you used the devfs name and you didn't have that name in /etc/fstab, you get CLUSTER_FSTYPE as "", and that's why the mount failed. So when fixing the above file, please make sure you also set the right filesystem type. A valid entry would look like:

CLUSTER_FSTYPE=ext3

-aneesh

ph: 603-884-5742
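The two symptoms named in this message (a devfs-style CLUSTER_ROOT and an empty CLUSTER_FSTYPE) can be checked before rebooting. The following is only a sketch under the assumption that linuxrc.conf is a flat KEY=value file as attached above. `check_linuxrc_conf` is a hypothetical name, and the `/dev/scsi/*` pattern is just the devfs form seen in this thread.

```shell
# Hypothetical checker for the two problems discussed in this thread.
# Prints one line per problem and returns non-zero if any is found.
check_linuxrc_conf() {
    conf=$1
    status=0
    root=$(sed -n 's/^CLUSTER_ROOT=//p' "$conf")
    fstype=$(sed -n 's/^CLUSTER_FSTYPE=//p' "$conf")
    if [ -z "$fstype" ]; then
        echo "CLUSTER_FSTYPE is empty"
        status=1
    fi
    case $root in
        /dev/scsi/*)
            echo "CLUSTER_ROOT uses a devfs name: $root"
            status=1
            ;;
    esac
    return $status
}
```

Run against the file kept by `mkinitrd -k` (initrd/linuxrc.conf in the temporary directory), it would flag both lines of the attachment quoted above.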
From: Jiann-Ming Su <js...@em...> - 2004-01-28 20:25:05
On Tue, 27 Jan 2004, Aneesh Kumar KV wrote:

> Since you used the devfs name and you didn't have that name in /etc/fstab,
> you get CLUSTER_FSTYPE as "", and that's why the mount failed. So when
> fixing the above file, please make sure you also set the right fstype.
>
> A valid entry would look like
>
> CLUSTER_FSTYPE=ext3

Okay, I modified the linuxrc.conf in the initrd image. I went back to reboot, and now it appears that fstab didn't make it into the initrd image:

mounting root in linuxrc
do_ssiys: Illegal op 44 in state 2
warning: can't open /etc/fstab: No such file or directory
mount: /dev/scsi/host0/bus0/target0/lun0/part4: only devices specified as UUID= or LABEL= can be mouned chard
ERROR: mounting root filesystem failed
Unable to continue. Halting.

Should fstab be in the initrd? And do I need to label my filesystem?

--
Jiann-Ming Su  js...@em...  404-712-2603
Development Team Systems Administrator
General Libraries Systems Division
From: David B. Z. <dav...@hp...> - 2004-01-29 05:20:03
Jiann-Ming Su wrote:

> Okay, I modified the linuxrc.conf in the initrd image. I went back to
> reboot, and now it appears that fstab didn't make it into the initrd image:
>
> mounting root in linuxrc
> do_ssiys: Illegal op 44 in state 2
> warning: can't open /etc/fstab: No such file or directory
> mount: /dev/scsi/host0/bus0/target0/lun0/part4: only devices specified as UUID= or LABEL= can be mouned chard
> ERROR: mounting root filesystem failed
> Unable to continue. Halting.
>
> Should fstab be in the initrd? And, do I need to label my filesystem?

In the Red Hat mkinitrd code we use touch to create an empty /etc/fstab in the ramdisk. Doing that should clean up the warning.

In a Red Hat install, during ssi-create, if you enable a shared root filesystem, the device field of the "/" entry in /etc/fstab is replaced with "UUID=...". More importantly, the generated grub.conf stanza gets "root=UUID=..." as a kernel option. The mkfs for ext2/ext3 always puts a UUID in, so we use this value, which is already in the superblock. It doesn't matter that the normal Red Hat install uses LABEL=/ in /etc/fstab and grub.conf.

--
David B. Zafman | Hewlett-Packard Company
mailto:dav...@hp... | http://www.hp.com
"Computer Science" is no more about computers than astronomy is about telescopes - E. W. Dijkstra
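The two points in this reply can be sketched as small shell helpers: creating the empty /etc/fstab inside an unpacked ramdisk tree, and reading the device field of the "/" entry to see whether it is a UUID= or LABEL= form. Both function names are hypothetical and the tree path is whatever directory the ramdisk was unpacked or mounted into. This is an illustration, not the actual Red Hat or OpenSSI code.

```shell
# Hypothetical helpers; neither is taken from the mkinitrd or ssi-create scripts.

# Create an empty etc/fstab inside an unpacked ramdisk tree
# (touch leaves an existing file's contents alone).
ensure_fstab() {
    tree=$1
    mkdir -p "$tree/etc" && touch "$tree/etc/fstab"
}

# Print the device field (first column) of the "/" entry in an fstab file,
# skipping comment lines.
root_device() {
    awk '$1 !~ /^#/ && $2 == "/" { print $1; exit }' "$1"
}
```

For example, `root_device /etc/fstab` would print something of the form `UUID=...` on a system set up the way the reply describes, and a plain device name otherwise.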
From: Aneesh K. KV <ane...@di...> - 2004-01-30 23:33:36
David B. Zafman wrote:

> Jiann-Ming Su wrote:
>
>> Okay, I modified the linuxrc.conf in the initrd image. I went back to
>> reboot, and now it appears that fstab didn't make it into the initrd image:
>>
>> mounting root in linuxrc
>> do_ssiys: Illegal op 44 in state 2
>> warning: can't open /etc/fstab: No such file or directory
>> mount: /dev/scsi/host0/bus0/target0/lun0/part4: only devices
>> specified as UUID= or LABEL= can be mouned chard
>> ERROR: mounting root filesystem failed
>> Unable to continue. Halting.
>>
>> Should fstab be in the initrd? And, do I need to label my filesystem?
>
> In the Red Hat mkinitrd code we use touch to create an empty /etc/fstab
> in the ramdisk. Doing that should clean up the warning.
> In a Red Hat install, during ssi-create, if you enable a shared root
> filesystem, the device field of the "/" entry in /etc/fstab is
> replaced with "UUID=...". More importantly, the generated grub.conf
> stanza gets "root=UUID=..." as a kernel option. The mkfs for
> ext2/ext3 always puts a UUID in, so we use this value, which is already
> in the superblock. It doesn't matter that the normal Red Hat install
> uses LABEL=/ in /etc/fstab and grub.conf.

I haven't tested the Debian failover configuration yet (I don't have the hardware to do so). The following things are yet to be tested on Debian:

1) syncing images between different nodes
2) failover
3) NFS configuration

-aneesh

--
ph: 603-884-5742
From: Aneesh K. KV <ane...@di...> - 2004-01-30 21:43:50
Jiann-Ming Su wrote:

> On Thu, 29 Jan 2004, Aneesh Kumar KV wrote:
>
>> I haven't tested the Debian failover configuration yet (I don't have
>> the hardware to do so). The following things are yet to be tested on
>> Debian
>
> Yeah, that seems to be our problem. I was trying to use our external RAID
> array in failover mode. We really don't have the controllers to support
> cluster mode, yet. But, I figure I'd go ahead and configure it as such.
> To do this, I had to disable the controller BIOS, which causes the system
> BIOS not to know about the RAID array, which caused me to build a grub
> iso image, which requires patching the grub source. Well, this was
> getting really tedious, so I've backed off of the failover configuration
> and now the openssi kernel boots properly.

Cool.

> I see a /etc/rcSSI.d directory. What services should be started in there?
> When I connect a second node to the network and run openssi-config-node,
> it doesn't see any mac addresses trying to boot. I remember in the RH
> cluster I played with, the MAC addresses trying to boot would show up when
> I ran the openssi-config-node script. Inetd seems to be running,
> and I have the following in /etc/inetd.conf:
>
>   tftp dgram udp wait root /usr/sbin/in.tftpd /var/ftpd
>
> Though, there is no /var/ftpd directory.

/etc/rcSSI.d is similar to /etc/rcS.d. It contains the SSI-modified runlevel startup used by the joining node. There are some differences in the way the second node boots. All service configuration is now done via /etc/rc.nodeinfo (you have to edit that file manually; you can find the doc at openssi/docs/rc-design-notes), and you need to use invoke-rc.d to start a service on a running cluster.

Debian doesn't support reading the MAC address from /var/log/messages; I never got it working. So for Debian you have to find the MAC address of the system and pass it as a command-line argument to ssi-addnode.

-aneesh

--
ph: 603-884-5742
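Since the MAC address has to be looked up by hand on Debian, one way to obtain it is to read it from `ifconfig` output on the node in question. The sketch below assumes the classic net-tools layout, where the first line of an interface block contains `HWaddr` followed by the address; newer ifconfig versions print `ether` instead. `mac_from_ifconfig` is a hypothetical helper. How the address is then passed to ssi-addnode is not shown in this thread (see `ssi-addnode --help`).

```shell
# Hypothetical helper: read ifconfig output on stdin and print the first
# hardware address found after the word "HWaddr".
mac_from_ifconfig() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "HWaddr") { print $(i + 1); exit }
    }'
}
```

On a node with net-tools installed this would be used as `ifconfig eth0 | mac_from_ifconfig`, with `eth0` replaced by the interface that will be used for network boot.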
From: Aneesh K. KV <ane...@di...> - 2004-02-02 21:14:03
Jiann-Ming Su wrote:

> On Fri, 30 Jan 2004, Aneesh Kumar KV wrote:
>
>> Debian doesn't support reading the MAC address from /var/log/messages. I
>> never got it working. So for Debian you have to find the MAC address of
>> the system and pass it as a command-line argument to ssi-addnode.
>
> Okay, I ran openssi-config-node, and ran into a few issues. First, it's
> looking for /boot/grub/grub.conf, which may or may not exist in Debian,
> all depending on how grub was installed. So, I had to create a symlink.
> When the script continues, I get:
>
> (W)rite new configuration, (R)econfigure, or (Q)uit without writing [W]:
> The configuration changes have been saved.
> Do you wish to configure another node (y/n) [n]: n
> Rebuilding the ramdisk and Updating the dhcpd.conf entries......
> Updating /boot/initrd-2.4.20ssi.img.gz: Cannot copy device file, directory does not exist /tmp/initrd.yyiTBy//cluster/node1/dev/scsi/host0/bus0/target0/lun0
> FAILED
> Synchronizing network boot images: expr: syntax error
> /sbin/ssi-ksync-network: mkelf-linux: command not found
> cp: cannot stat `/bootkernel': No such file or directory
> cp: cannot stat `/bootinitrd': No such file or directory
> FAILED
> ERROR: /sbin/ssi-ksync: ssi-ksync-network failed
> The ssi-ksync command failed to sync your local boot devices.
>
> All new nodes are allowed to join the cluster. If you wish to setup a
> local boot device for a node, wait until it's fully up, create a Linux
> filesystem on one of its local disks using fdisk and mkfs, then run
> ssi-chnode to configure the filesystem as a local boot device.

As I said in my previous mail, automatic syncing of images is not yet tested on Debian. (Most of the testing before was on Alpha, and now on a single Debian machine with lilo.) I would ask you to update the grub.conf and tftp image manually. You can do it by passing /etc/lilo.conf as the bootloader (see ssi-addnode --help).

-aneesh

--
ph: 603-884-5742
From: Aneesh K. KV <ane...@di...> - 2004-02-02 22:09:53
Jiann-Ming Su wrote:

> On Mon, 2 Feb 2004, Jiann-Ming Su wrote:
>
>> Okay, I ran openssi-config-node, and ran into a few issues. First, it's
>> looking for /boot/grub/grub.conf, which may or may not exist in Debian,
>> all depending on how grub was installed. So, I had to create a symlink.
>> When the script continues, I get:
>>
>> (W)rite new configuration, (R)econfigure, or (Q)uit without writing [W]:
>> The configuration changes have been saved.
>> Do you wish to configure another node (y/n) [n]: n
>> Rebuilding the ramdisk and Updating the dhcpd.conf entries......
>> Updating /boot/initrd-2.4.20ssi.img.gz: Cannot copy device file, directory does not exist /tmp/initrd.yyiTBy//cluster/node1/dev/scsi/host0/bus0/target0/lun0
>> FAILED
>> Synchronizing network boot images: expr: syntax error
>> /sbin/ssi-ksync-network: mkelf-linux: command not found
>> cp: cannot stat `/bootkernel': No such file or directory
>> cp: cannot stat `/bootinitrd': No such file or directory
>> FAILED
>> ERROR: /sbin/ssi-ksync: ssi-ksync-network failed
>> The ssi-ksync command failed to sync your local boot devices.
>>
>> All new nodes are allowed to join the cluster. If you wish to setup a
>> local boot device for a node, wait until it's fully up, create a Linux
>> filesystem on one of its local disks using fdisk and mkfs, then run
>> ssi-chnode to configure the filesystem as a local boot device.
>
> Okay, as you've probably figured out by now, I shoot first and ask questions
> later... anyway, I modified /sbin/ssi-ksync-network as follows:
>
>   kernel=`tail +$line $config | grep 'kernel' | head -1 | awk '{print $3}'`
>   initrd=`tail +$line $config | grep 'initrd' | head -1 | awk '{print $3}'`
>
> I simply changed "print $2" to "print $3". Also, /sbin/ssi-ksync was
> looking for "/etc/init.d/dhcpd reload." In Debian, it's "/etc/init.d/dhcp
> force-reload."
So now, when I run openssi-config-node, I get this: > >### BEGIN ### >Do you wish to configure another node (y/n) [n]: >Rebuilding the ramdisk and Updating the dhcpd.conf entries...... >Updating /boot/initrd-2.4.20ssi.img.gz: Cannot copy device file, directory does >not exist /tmp/initrd.xDgfRo//cluster/node1/dev/scsi/host0/bus0/target0/lun0 >FAILED >Synchronizing network boot images: expr: syntax error >succeeded >Stopping DHCP server: dhcp. >Starting DHCP server: dhcp. >Synchronizing local boot devices > syncing /dev/scsi/host0/bus0/target0/lun0/part1 on node 1: /sbin/ssi-ksync: > /sbin/nash: No such file or directory >/sbin/ssi-ksync: /sbin/nash: No such file or directory >/sbin/ssi-ksync: /sbin/nash: No such file or directory >/sbin/ssi-ksync: /sbin/nash: No such file or directory >/sbin/ssi-ksync: /sbin/nash: No such file or directory >/sbin/ssi-ksync: /sbin/nash: No such file or directory >/sbin/ssi-ksync: /sbin/nash: No such file or directory >/sbin/ssi-ksync: /sbin/nash: No such file or directory >/sbin/ssi-ksync: /sbin/nash: No such file or directory >/sbin/ssi-ksync: /sbin/nash: No such file or directory >/sbin/ssi-ksync: /sbin/nash: No such file or directory >/sbin/ssi-ksync: /sbin/nash: No such file or directory >/sbin/ssi-ksync: /sbin/nash: No such file or directory >/sbin/ssi-ksync: /sbin/nash: No such file or directory >/sbin/ssi-ksync: /sbin/nash: No such file or directory >/sbin/ssi-ksync: /sbin/nash: No such file or directory >/sbin/ssi-ksync: /sbin/nash: No such file or directory >succeeded > >All new nodes are allowed to join the cluster. If you wish to setup a >local boot device for a node, wait until it's fully up, create a Linux >filesystem on one of its local disks using fdisk and mkfs, then run >ssi-chnode to configure the filesystem as a local boot device. >### END ### > > >Looks like debian doesn't distribute /sbin/nash in its mkinitrd package. > > > No debian doesn't use /sbin/nash. 
If you can send me a working patch for syncing images, I will be more than happy to commit it to CVS.

Things to be done:

1) ssi_arch.pm right now uses cluster_mkinitrd. This should change. I am waiting for Brian's reply on this.
2) Understand and modify ssi-ksync. I never looked at this.

-aneesh

--
ph: 603-884-5742
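The quoted change to /sbin/ssi-ksync-network (switching `print $2` to `print $3`) suggests that the position of the image path on the `kernel` and `initrd` lines differs between grub configurations. One possible way to avoid hard-coding the field number is sketched below: take the first argument after the keyword that starts with "/". This is an assumption-laden illustration, not the ssi-ksync-network code. `grub_field` is a hypothetical name, and a path written directly against a device prefix such as `(hd0,0)/vmlinuz` would not be matched. It also uses `tail -n +N`, since the bare `tail +N` form is obsolete and rejected by some tail implementations.

```shell
# Hypothetical helper: starting at line $2 of grub config $1, find the first
# line whose first word is $3 ("kernel" or "initrd") and print the first
# argument on it that begins with "/".
grub_field() {
    config=$1 line=$2 keyword=$3
    tail -n "+$line" "$config" |
        awk -v kw="$keyword" '$1 == kw {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^\//) { print $i; exit }
        }'
}
```

In the style of the quoted script, this would be called as `kernel=$(grub_field "$config" "$line" kernel)` and `initrd=$(grub_field "$config" "$line" initrd)`.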
From: David B. Z. <dav...@hp...> - 2004-02-02 22:43:02
On Feb 2, 2004, at 2:09 PM, Aneesh Kumar KV wrote:

> No, Debian doesn't use /sbin/nash. If you can send me a working patch
> for syncing images, I will be more than happy to commit it to CVS.
>
> Things to be done:
>
> 1) ssi_arch.pm right now uses cluster_mkinitrd. This should change. I
> am waiting for Brian's reply on this.
> 2) Understand and modify ssi-ksync. I never looked at this.
>
> -aneesh

For root failover with a UUID in the root= option of /etc/grub.conf, the mkinitrd-generated ramdisk uses nash to perform a mkrootdev before mounting /dev/root. In addition, the Red Hat code uses the nash showlabels command to find the UUID of the root partition. This lookup is performed by the kernel RPM package installation scripts.

David B. Zafman | Hewlett-Packard Company
mailto:dav...@hp... | http://www.hp.com
"Computer Science" is no more about computers than astronomy is about telescopes - E. W. Dijkstra
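On a system without nash, a script would at least need to recognise the `root=UUID=...` form on the kernel command line before it could resolve it to a device. The sketch below shows only that parsing step, in plain shell. `root_uuid` is a hypothetical helper, and it does not reproduce what nash's mkrootdev or showlabels actually do.

```shell
# Hypothetical helper: given a kernel command line as one string, print the
# UUID from a root=UUID=... argument. Returns non-zero if there is none.
root_uuid() {
    # Intentionally unquoted so the command line is split on whitespace.
    for arg in $1; do
        case $arg in
            root=UUID=*)
                echo "${arg#root=UUID=}"
                return 0
                ;;
        esac
    done
    return 1
}
```

On a running system it could be fed the current command line with `root_uuid "$(cat /proc/cmdline)"`. Mapping the UUID back to a block device is a separate step that this sketch leaves out.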
From: Jiann-Ming Su <js...@em...> - 2004-02-04 17:28:25
On Tue, 3 Feb 2004, Jiann-Ming Su wrote:

> On Mon, 2 Feb 2004, Aneesh Kumar KV wrote:
>
> I don't see a pxelinux.0 file in /tftpboot. There's a combined, initrd, and
> kernel files in there, plus a pxelinux.cfg directory which contains a
> default file. My second node isn't booting because dhcp is telling it
> to look for the wrong file.

Okay, making some progress... turns out there's a /usr/lib/syslinux/pxelinux.0 file that I had to copy to /boot. So, now the second node is able to boot, but I get a kernel panic:

Running pre-root cluster initialization
ics_getifconfig: Invalid ics_netmask
ics_lltransport_config: ifconfig parameters incorrect eth1
RTNL: assertion failed at devinet.c(825)
Kernel panic: ics_seticsinfo: node 0 out of range.

--
Jiann-Ming Su  js...@em...  404-712-2603
Development Team Systems Administrator
General Libraries Systems Division
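The missing pxelinux.0 described in this message is the kind of problem a quick pre-flight check can catch before a node is powered on. The sketch below assumes only what the message mentions: a boot directory that should contain `pxelinux.0` and `pxelinux.cfg/default`. `check_pxe_dir` is a hypothetical helper, and the directory argument is whichever directory the TFTP server actually serves on a given system.

```shell
# Hypothetical helper: report which PXE boot files are missing from a
# directory. Returns non-zero if anything is missing.
check_pxe_dir() {
    dir=$1
    status=0
    for f in pxelinux.0 pxelinux.cfg/default; do
        [ -e "$dir/$f" ] || { echo "missing: $dir/$f"; status=1; }
    done
    return $status
}
```

For the layout described in the quoted text, `check_pxe_dir /tftpboot` would have reported the missing pxelinux.0. It does not check the DHCP configuration or the netmask error shown in the panic output.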