From: Marc G. <gr...@at...> - 2007-10-12 07:53:46
On Friday 12 October 2007 09:25:45 Gordan Bobic wrote:
> On Fri, 12 Oct 2007, Marc Grimme wrote:
> >>>> It throws up a worrying error when it boots:
> >>>> GFS: fsid=cluster:root.0: warning: assertion
> >>>> "gfs_glock_is_locked_by_me(ip->i_gl)" failed
> >>>> GFS: fsid=cluster:root.0 function = gfs_readpage
> >>>> GFS: fsid=cluster:root.0 file =
> >>>> /builddir/build/BUILD/gfs-kmod-0.1.16/_kmod_build_/src/gfs/ops_address.c, line 279
> >>>> GFS: fsid=cluster:root.0: time = 1192119131
> >>>>
> >>>> I wonder if this may be caused by a file system perhaps not having
> >>>> been cleanly unmounted on a previous try while I was building it...
> >>>
> >>> Perhaps just fsck it when you're in the initrd.
> >>
> >> Yup, just did. Some minor things were broken with the fs metadata. But
> >> when I reboot, I still get a similar message when HALd loads. I wonder
> >> if I can safely switch that off - assuming that's causing it...
> >
> > You could also file it to the gfs list and see what they'll tell you. It
> > seems to me I've seen this message also.
>
> If you mean the RedHat's Linux Cluster list - I already did. :-)

Yes, I saw it one minute after writing the answer. ;-)

> >> Now, in theory, I should be able to bring up another node on the same
> >> file system. All I would need to do is clone the /boot partition to the
> >> other box, and it should just come up.
> >
> > Why clone it rather than use the same one? Isn't that possible? We are
> > always doing it this way.
>
> Because I'm not booting this off DHCP. I'm booting the kernel and the
> initrd off the local disk. So I need to clone the boot partition with the
> kernel and the initrd to each of the nodes.

OK. How about PXE? IMHO you could use one shared boot image, couldn't you?

> >> What do I need to do to achieve this, and can it all be done with the
> >> one node that is already running? I'm assuming that I'll have to do
> >> something like:
> >>
> >> mount --bind /cluster/cdsl/4/ /cdsl.local/
> >
> > exactly if nodeid is 4. But again the initrd should do this job
> > automatically.
>
> So, I wouldn't need to do this at all? The initrd will automagically link
> /cdsl.local to /cluster/cdsl/nodeid ?

Yes, this is done in linuxrc.generic.sh, lines 354-360:

    clusterfs_mount_cdsl $newroot $cdsl_local_dir $nodeid $cdsl_prefix
    if [ $return_c -ne 0 ]; then
        echo_local "Could not mount cdsl $cdsl_local_dir to ${cdsl_prefix}/$nodeid. Exiting"
        exit_linuxrc 1
    fi
    step "CDSL tree mounted"

> >> As far as unsharing things under /var, I _think_ only /var/lock actually
> >> needs to be unshared. Can I do this with the running image with:
> >>
> >> com-mkcdsl -r / -a /var/lock
> >
> > you can skip the -r/ it is default.
> > How about /var/run, /var/log, /var/cache, /var/tmp, /var/spool. All of
> > these normally need to be hostdependent.
>
> I'm not sure why /var/cache and /var/spool would need to be host
> dependent. I can see reasons why I'd want them to be shared.

I was thinking of e.g. /var/spool/mail, or simply going by the name. But it's up to you.

> I agree that /var/run and /var/lock should be private.
>
> It would be _nice_ to have a shared /var/log, but from past experience,
> the logs will get messed up when multiple syslogs try to write to them.
> Is there a shared logging solution for this? I know I can pick a master
> log node and get syslog pointed at this, but this won't work for all the
> other non-syslog services (e.g. Apache).

I was about to say "use a syslog server", but you're right, that does not work with Apache. For Apache, for example, we've written a log analysis tool to merge the logs. It's in the addons channel and is called mgrep. I think I also read a howto on integrating Apache into syslog somewhere.

> I plan to link /var/tmp to /tmp, and have /tmp mounted to a big local
> partition (local disks are only planned to have /boot, /tmp and swap).
>
> Which brings me to the next question - how do I use a local disk partition
> instead of the initrd? What's the procedure for that? It seems a more
> efficient solution than relying on a ramdisk that eats memory after
> booting up when there is plenty of local disk space available. How do I
> use /etc/sysconfig/comoonics-chroot ?

Yes. I suppose you don't want to configure your local disk with LVM ;-) so I'll explain it without. It's basically quite easy:

1. For every node: set aside one partition for the chroot (let's say it is /dev/sda4) and make it at least 500M.
2. For every node: mkfs.ext3 /dev/sda4
3. Add the following to the com_info section for every node:
   <chrootenv mountpoint="/var/comoonics/chroot" fstype="ext3" device="/dev/sda4" chrootdir="/var/comoonics/chroot"/>
4. Make a new initrd.
5. Reboot every node.

That's it; now everything should be running on your local disk instead of tmpfs.

Marc.
--
Gruss / Regards,
Marc Grimme
http://www.atix.de/ http://www.open-sharedroot.org/
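Step 3 of the procedure above is the part most likely to be mistyped when repeated per node, so here is a minimal sketch that prints the element for a given partition. The helper name `chrootenv_xml` is hypothetical and not part of the comoonics tools; the attribute names and the default mount point are copied from the message, and the output still has to be pasted into each node's com_info section by hand.

```shell
#!/bin/sh
# Sketch only: print the <chrootenv> element from step 3 for one node.
# chrootenv_xml is a hypothetical helper, not a comoonics command.
# $1 = local partition prepared with mkfs.ext3 (e.g. /dev/sda4)
# $2 = mount point (optional; defaults to the path used in the message)
chrootenv_xml() {
    device=$1
    mountpoint=${2:-/var/comoonics/chroot}
    printf '<chrootenv mountpoint="%s" fstype="ext3" device="%s" chrootdir="%s"/>\n' \
        "$mountpoint" "$device" "$mountpoint"
}
```

Calling `chrootenv_xml /dev/sda4` prints the line shown in step 3; steps 1, 2, 4 and 5 remain manual.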
From: Gordan B. <go...@bo...> - 2007-10-12 07:35:59
On Fri, 12 Oct 2007, Marc Grimme wrote:
> On Thursday 11 October 2007 19:10:51 Gordan Bobic wrote:
>> Is the ram disk that remains mounted on /var/comoonics/chroot supposed to
>> be 176MB? About 75MB of this is kernel drivers. Surely these are no longer
>> required once the root image is mounted, because they can be loaded from
>> the GFS file system now. I don't mind a MB or two remaining, but 176 seems
>> a little excessive...
>
> Feel free to remove those files. Yes I agree. It would be ok to delete the
> kernel modules.
> Again, normally this chroot is automatically moved to a local disk (the same
> that is used for swap and /tmp) because these data are not important. That's
> basically the reason why we do not clean up the tmpfs.

I'm figuring that just adding

    rm -rf /lib/modules

would help. I have just added this to the clean_initrd function. I'll have a look at what else can be pruned after the GFS root is mounted.

Gordan
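The pruning idea in this message can be sketched as a small function that takes the tree to clean as an argument. This is an illustration only, not the actual clean_initrd code: the name `prune_modules` is hypothetical, and it is an assumption that the modules sit under lib/modules inside the chroot tree. The guard against an empty or "/" argument is there so that the running system's own /lib/modules is never removed by accident.

```shell
#!/bin/sh
# Sketch only: remove kernel modules from a chroot/initrd tree once the
# real root file system is mounted. prune_modules is a hypothetical name.
# $1 = top of the tree to prune (e.g. /var/comoonics/chroot)
prune_modules() {
    root=$1
    # Refuse an empty or "/" root so the live system's modules stay intact.
    case "$root" in
        ""|/) echo "refusing to prune '$root'" >&2; return 1 ;;
    esac
    if [ -d "$root/lib/modules" ]; then
        rm -rf "$root/lib/modules"
    fi
}
```

Running it against the chroot path only frees space if the modules really live there; checking with `du -sh` first would confirm that.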
From: Gordan B. <go...@bo...> - 2007-10-12 07:28:03
On Fri, 12 Oct 2007, Marc Grimme wrote:
>> Is there a switch for this, or do I just remove it from linuxrc.generic
>> script?
> Just remove it from there. We should think about a switch. Again you're the
> first to use it without CLVM.

LOL! Living on the bleeding edge! :-)

> I'll file a RFE and we'll decide to include it in the next major release. Or
> you'll do it.
> Quick guess:
> I would just check if the major of the rootdevice is 8 and if so not start
> clvmd. Mark? What do you think?

Hmm... That sounds like a good idea because it doesn't require a switch, only a check. My idea was to add a kernel boot parameter and then check for that. But I like your idea better. :-)

Gordan
From: Gordan B. <go...@bo...> - 2007-10-12 07:25:59
On Fri, 12 Oct 2007, Marc Grimme wrote:
>>>> It throws up a worrying error when it boots:
>>>> GFS: fsid=cluster:root.0: warning: assertion
>>>> "gfs_glock_is_locked_by_me(ip->i_gl)" failed
>>>> GFS: fsid=cluster:root.0 function = gfs_readpage
>>>> GFS: fsid=cluster:root.0 file =
>>>> /builddir/build/BUILD/gfs-kmod-0.1.16/_kmod_build_/src/gfs/ops_address.c, line 279
>>>> GFS: fsid=cluster:root.0: time = 1192119131
>>>>
>>>> I wonder if this may be caused by a file system perhaps not having been
>>>> cleanly unmounted on a previous try while I was building it...
>>>
>>> Perhaps just fsck it when you're in the initrd.
>>
>> Yup, just did. Some minor things were broken with the fs metadata. But
>> when I reboot, I still get a similar message when HALd loads. I wonder if
>> I can safely switch that off - assuming that's causing it...
>
> You could also file it to the gfs list and see what they'll tell you. It seems
> to me I've seen this message also.

If you mean the RedHat's Linux Cluster list - I already did. :-)

>> Now, in theory, I should be able to bring up another node on the same file
>> system. All I would need to do is clone the /boot partition to the other
>> box, and it should just come up.
>
> Why clone it rather than use the same one? Isn't that possible? We are
> always doing it this way.

Because I'm not booting this off DHCP. I'm booting the kernel and the initrd off the local disk. So I need to clone the boot partition with the kernel and the initrd to each of the nodes.

>> What do I need to do to achieve this, and can it all be done with the one
>> node that is already running? I'm assuming that I'll have to do something
>> like:
>>
>> mount --bind /cluster/cdsl/4/ /cdsl.local/
>
> exactly if nodeid is 4. But again the initrd should do this job automatically.

So, I wouldn't need to do this at all? The initrd will automagically link /cdsl.local to /cluster/cdsl/nodeid ?

>> As far as unsharing things under /var, I _think_ only /var/lock actually
>> needs to be unshared. Can I do this with the running image with:
>>
>> com-mkcdsl -r / -a /var/lock
> you can skip the -r/ it is default.
> How about /var/run, /var/log, /var/cache, /var/tmp, /var/spool. All of these
> normally need to be hostdependent.

I'm not sure why /var/cache and /var/spool would need to be host dependent. I can see reasons why I'd want them to be shared.

I agree that /var/run and /var/lock should be private.

It would be _nice_ to have a shared /var/log, but from past experience, the logs will get messed up when multiple syslogs try to write to them. Is there a shared logging solution for this? I know I can pick a master log node and get syslog pointed at this, but this won't work for all the other non-syslog services (e.g. Apache).

I plan to link /var/tmp to /tmp, and have /tmp mounted to a big local partition (local disks are only planned to have /boot, /tmp and swap).

Which brings me to the next question - how do I use a local disk partition instead of the initrd? What's the procedure for that? It seems a more efficient solution than relying on a ramdisk that eats memory after booting up when there is plenty of local disk space available. How do I use /etc/sysconfig/comoonics-chroot ?

Gordan
From: Mark H. <hla...@at...> - 2007-10-12 07:07:17
On Friday 12 October 2007 08:20:05 Marc Grimme wrote:
> On Thursday 11 October 2007 19:05:24 Gordan Bobic wrote:
> > Is there a switch for this, or do I just remove it from linuxrc.generic
> > script?
>
> Just remove it from there. We should think about a switch. Again you're the
> first to use it without CLVM.
> I'll file a RFE and we'll decide to include it in the next major release.
> Or you'll do it.
> Quick guess:
> I would just check if the major of the rootdevice is 8 and if so not start
> clvmd. Mark? What do you think?

That's fine with me. But you should extend the check from major 8 to all of the major numbers used for SCSI disks. There is a list in the kernel documentation.

Mark
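The check discussed in this message could look roughly like the sketch below. It assumes the SCSI disk block majors listed in the kernel's device list (8, 65-71 and 128-135); the function names `is_scsi_disk_major` and `clvmd_decision` are hypothetical and do not exist in linuxrc.generic.sh. A root device on LVM would sit on a device-mapper major instead and therefore fall through to "start".

```shell
#!/bin/sh
# Sketch only: decide from the root device's major number whether clvmd
# needs to be started. Function names are hypothetical.

# Succeeds if the decimal major number in $1 is one of the SCSI disk
# majors from the kernel's device list: 8, 65-71, 128-135.
is_scsi_disk_major() {
    case "$1" in
        8|6[5-9]|7[01]|12[89]|13[0-5]) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints "skip" for a plain SCSI disk root, "start" for anything else.
clvmd_decision() {
    if is_scsi_disk_major "$1"; then
        echo "skip"
    else
        echo "start"
    fi
}
```

With GNU coreutils, the decimal major of a device node can be read with something like `major=$(( 0x$(stat -c %t /dev/sdb2) ))`, since `%t` reports the major in hexadecimal; how the initrd would actually obtain it is left open here.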
From: Marc G. <gr...@at...> - 2007-10-12 06:24:15
On Thursday 11 October 2007 19:10:51 Gordan Bobic wrote:
> Is the ram disk that remains mounted on /var/comoonics/chroot supposed to
> be 176MB? About 75MB of this is kernel drivers. Surely these are no longer
> required once the root image is mounted, because they can be loaded from
> the GFS file system now. I don't mind a MB or two remaining, but 176 seems
> a little excessive...

Feel free to remove those files. Yes I agree. It would be ok to delete the kernel modules.
Again, normally this chroot is automatically moved to a local disk (the same that is used for swap and /tmp) because these data are not important. That's basically the reason why we do not clean up the tmpfs.
Again I'll file a RFE...

Marc
--
Gruss / Regards,
Marc Grimme
http://www.atix.de/ http://www.open-sharedroot.org/
From: Marc G. <gr...@at...> - 2007-10-12 06:20:19
On Thursday 11 October 2007 19:05:24 Gordan Bobic wrote:
> Is there a switch for this, or do I just remove it from linuxrc.generic
> script?

Just remove it from there. We should think about a switch. Again you're the first to use it without CLVM.
I'll file a RFE and we'll decide to include it in the next major release. Or you'll do it.
Quick guess:
I would just check if the major of the rootdevice is 8 and if so not start clvmd. Mark? What do you think?

--
Gruss / Regards,
Marc Grimme
Phone: +49-89 452 3538-14
http://www.atix.de/ http://www.open-sharedroot.org/
From: Marc G. <gr...@at...> - 2007-10-12 06:17:20
On Thursday 11 October 2007 18:53:12 Gordan Bobic wrote:
> Is this right?
>
> # cat /proc/mounts
> rootfs / rootfs rw 0 0
> none /dev tmpfs rw 0 0
> none /var/comoonics/chroot tmpfs rw 0 0
> none /var/comoonics/chroot/dev/pts devpts rw 0 0
> proc /var/comoonics/chroot/proc proc rw 0 0
> none /var/comoonics/chroot/sys sysfs rw 0 0
> none /var/comoonics/chroot/sys/kernel/config configfs rw 0 0
> /dev/sdb2 / gfs rw,hostdata=jid=0:id=196611:first=1 0 0

This is the root itself.

> /dev/sdb2 /cdsl.local gfs rw,hostdata=jid=0:id=196611:first=1 0 0

And this is the bind mount: /cluster/cdsl/1 on /cdsl.local. It just looks horrible in /proc/mounts.

> /proc /proc proc rw 0 0
> /sys /sys sysfs rw 0 0
> /proc/bus/usb /proc/bus/usb usbfs rw 0 0
> devpts /dev/pts devpts rw 0 0
> /dev/sda1 /boot ext3 rw,noatime,data=ordered 0 0
> tmpfs /dev/shm tmpfs rw 0 0
> none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0
>
>
> Specifically, the two /dev/sdb2 lines?

Yes. You're done with this.

> Gordan
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Splunk Inc.
> Still grepping through log files to find problems? Stop.
> Now Search log events and configuration files using AJAX and a browser.
> Download your FREE copy of Splunk now >> http://get.splunk.com/
> _______________________________________________
> Open-sharedroot-devel mailing list
> Ope...@li...
> https://lists.sourceforge.net/lists/listinfo/open-sharedroot-devel

--
Gruss / Regards,
Marc Grimme
Phone: +49-89 452 3538-14
http://www.atix.de/ http://www.open-sharedroot.org/
From: Marc G. <gr...@at...> - 2007-10-12 06:16:09
On Thursday 11 October 2007 18:50:10 Gordan Bobic wrote:
> On Thu, 11 Oct 2007, Marc Grimme wrote:
> >> It throws up a worrying error when it boots:
> >> GFS: fsid=cluster:root.0: warning: assertion
> >> "gfs_glock_is_locked_by_me(ip->i_gl)" failed
> >> GFS: fsid=cluster:root.0 function = gfs_readpage
> >> GFS: fsid=cluster:root.0 file =
> >> /builddir/build/BUILD/gfs-kmod-0.1.16/_kmod_build_/src/gfs/ops_address.c, line 279
> >> GFS: fsid=cluster:root.0: time = 1192119131
> >>
> >> I wonder if this may be caused by a file system perhaps not having been
> >> cleanly unmounted on a previous try while I was building it...
> >
> > Perhaps just fsck it when you're in the initrd.
>
> Yup, just did. Some minor things were broken with the fs metadata. But
> when I reboot, I still get a similar message when HALd loads. I wonder if
> I can safely switch that off - assuming that's causing it...

You could also file it to the gfs list and see what they'll tell you. It seems to me I've seen this message also.

> Now, in theory, I should be able to bring up another node on the same file
> system. All I would need to do is clone the /boot partition to the other
> box, and it should just come up.

Why clone it rather than use the same one? Isn't that possible? We are always doing it this way.

> What do I need to do to achieve this, and can it all be done with the one
> node that is already running? I'm assuming that I'll have to do something
> like:
>
> mount --bind /cluster/cdsl/4/ /cdsl.local/

Exactly, if nodeid is 4. But again the initrd should do this job automatically.

> As far as unsharing things under /var, I _think_ only /var/lock actually
> needs to be unshared. Can I do this with the running image with:
>
> com-mkcdsl -r / -a /var/lock

You can skip the -r /, it is the default.
How about /var/run, /var/log, /var/cache, /var/tmp, /var/spool? All of these normally need to be host-dependent.

> ?
>
> Thanks.
>
> Gordan

Marc.
--
Gruss / Regards,
Marc Grimme
Phone: +49-89 452 3538-14
http://www.atix.de/ http://www.open-sharedroot.org/

** Visit us at LinuxWorld Conference & Expo 31.10. - 01.11.2007
   in Jaarbeurs Utrecht - The Netherlands
   ATIX stand: Hall 9 / B 005 **

ATIX Informationstechnologie und Consulting AG
Einsteinstr. 10
85716 Unterschleissheim
Deutschland/Germany
Phone: +49-89 452 3538-0
Fax: +49-89 990 1766-0
Registergericht: Amtsgericht Muenchen
Registernummer: HRB 168930
USt.-Id.: DE209485962
Vorstand: Marc Grimme, Mark Hlawatschek, Thomas Merz (Vors.)
Vorsitzender des Aufsichtsrats: Dr. Martin Buss
From: Gordan B. <go...@bo...> - 2007-10-11 17:11:01
Is the ram disk that remains mounted on /var/comoonics/chroot supposed to be 176MB? About 75MB of this is kernel drivers. Surely these are no longer required once the root image is mounted, because they can be loaded from the GFS file system now. I don't mind a MB or two remaining, but 176 seems a little excessive...

Gordan
From: Gordan B. <go...@bo...> - 2007-10-11 17:05:39
Is there a switch for this, or do I just remove it from linuxrc.generic script?

Gordan
From: Gordan B. <go...@bo...> - 2007-10-11 16:53:17
Is this right?

# cat /proc/mounts
rootfs / rootfs rw 0 0
none /dev tmpfs rw 0 0
none /var/comoonics/chroot tmpfs rw 0 0
none /var/comoonics/chroot/dev/pts devpts rw 0 0
proc /var/comoonics/chroot/proc proc rw 0 0
none /var/comoonics/chroot/sys sysfs rw 0 0
none /var/comoonics/chroot/sys/kernel/config configfs rw 0 0
/dev/sdb2 / gfs rw,hostdata=jid=0:id=196611:first=1 0 0
/dev/sdb2 /cdsl.local gfs rw,hostdata=jid=0:id=196611:first=1 0 0
/proc /proc proc rw 0 0
/sys /sys sysfs rw 0 0
/proc/bus/usb /proc/bus/usb usbfs rw 0 0
devpts /dev/pts devpts rw 0 0
/dev/sda1 /boot ext3 rw,noatime,data=ordered 0 0
tmpfs /dev/shm tmpfs rw 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0

Specifically, the two /dev/sdb2 lines?

Gordan
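For anyone wanting to spot this pattern quickly in a longer mount table, here is a minimal sketch that lists every /dev device appearing at more than one mount point. The function name `multi_mounted` is made up for illustration. It only reports what /proc/mounts shows, so a bind mount and a genuine second mount of the same device look identical in its output.

```shell
#!/bin/sh
# Sketch only: list block devices that appear at more than one mount point.
# multi_mounted is a hypothetical helper name.
# $1 = mounts file to read (optional; defaults to /proc/mounts)
multi_mounted() {
    awk '$1 ~ /^\/dev\// { n[$1]++; m[$1] = m[$1] " " $2 }
         END { for (d in n) if (n[d] > 1) print d ":" m[d] }' "${1:-/proc/mounts}"
}
```

Run against the listing above, it would report /dev/sdb2 with the mount points / and /cdsl.local.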
From: Gordan B. <go...@bo...> - 2007-10-11 16:50:20
On Thu, 11 Oct 2007, Marc Grimme wrote:
>> It throws up a worrying error when it boots:
>> GFS: fsid=cluster:root.0: warning: assertion
>> "gfs_glock_is_locked_by_me(ip->i_gl)" failed
>> GFS: fsid=cluster:root.0 function = gfs_readpage
>> GFS: fsid=cluster:root.0 file =
>> /builddir/build/BUILD/gfs-kmod-0.1.16/_kmod_build_/src/gfs/ops_address.c,
>> line 279
>> GFS: fsid=cluster:root.0: time = 1192119131
>>
>> I wonder if this may be caused by a file system perhaps not having been
>> cleanly unmounted on a previous try while I was building it...
>
> Perhaps just fsck it when you're in the initrd.

Yup, just did. Some minor things were broken with the fs metadata. But when I reboot, I still get a similar message when HALd loads. I wonder if I can safely switch that off - assuming that's causing it...

Now, in theory, I should be able to bring up another node on the same file system. All I would need to do is clone the /boot partition to the other box, and it should just come up.

What do I need to do to achieve this, and can it all be done with the one node that is already running? I'm assuming that I'll have to do something like:

    mount --bind /cluster/cdsl/4/ /cdsl.local/

As far as unsharing things under /var, I _think_ only /var/lock actually needs to be unshared. Can I do this with the running image with:

    com-mkcdsl -r / -a /var/lock

?

Thanks.

Gordan
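If more than one directory under /var ends up being unshared, the com-mkcdsl call from this message would be repeated once per directory. The sketch below only prints the commands so they can be reviewed before anything is changed; `mkcdsl_plan` is a hypothetical helper, and the only com-mkcdsl options it uses are the `-r` and `-a` forms quoted in the message. Which directories to pass is a decision for the administrator.

```shell
#!/bin/sh
# Sketch only: print (do not run) one com-mkcdsl call per directory.
# mkcdsl_plan is a hypothetical helper; the -r/-a usage is taken from the
# message above.
# $1 = root of the shared file system, remaining args = directories
mkcdsl_plan() {
    root=$1
    shift
    for dir in "$@"; do
        echo "com-mkcdsl -r $root -a $dir"
    done
}
```

For example, `mkcdsl_plan / /var/lock /var/run` prints two commands that can be checked and then run by hand.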
From: Marc G. <gr...@at...> - 2007-10-11 16:32:46
On Thursday 11 October 2007 18:20:44 Gordan Bobic wrote:
> On Thu, 11 Oct 2007, Marc Grimme wrote:
> >> OK - my fault - I had done something stupid.
> >> I touched /etc/sysconfig/comoonics-chroot to silence the warning at time
> >> of building the initrd.
> >> I have now instead removed the inclusion of this file, and that gets
> >> further - it actually mounts the GFS file system on /mnt/newroot! :-)
> >>
> >> It then proceeds to boot the GFS setup! :-)
> >>
> >> Then it seems to get stuck, printing endlessly:
> >> dlm: root: remove fr 0 ID 3
> >>
> >> Now what...
> >
> > Good question ;-)
> > Now we are in GFS/DLM. Which packages of GFS/DLM combi are you using and
> > can you manually mount the fs? (com-step before)
>
> No, wait - it was just me making a schoolboy mistake again. :-)
> I forgot to switch off the eth1 (iSCSI SAN) interface at boot-time, along
> with the iscsi inits (because they were started up by the initrd image).
>
> Switched those off and now it boots fully! :-)

Great.

> It throws up a worrying error when it boots:
> GFS: fsid=cluster:root.0: warning: assertion
> "gfs_glock_is_locked_by_me(ip->i_gl)" failed
> GFS: fsid=cluster:root.0 function = gfs_readpage
> GFS: fsid=cluster:root.0 file =
> /builddir/build/BUILD/gfs-kmod-0.1.16/_kmod_build_/src/gfs/ops_address.c,
> line 279
> GFS: fsid=cluster:root.0: time = 1192119131
>
> I wonder if this may be caused by a file system perhaps not having been
> cleanly unmounted on a previous try while I was building it...
>
> Gordan

Perhaps just fsck it when you're in the initrd.

--
Gruss / Regards,
Marc Grimme
Phone: +49-89 452 3538-14
http://www.atix.de/ http://www.open-sharedroot.org/
From: Gordan B. <go...@bo...> - 2007-10-11 16:20:56
On Thu, 11 Oct 2007, Marc Grimme wrote:
>> OK - my fault - I had done something stupid.
>> I touched /etc/sysconfig/comoonics-chroot to silence the warning at time
>> of building the initrd.
>> I have now instead removed the inclusion of this file, and that gets
>> further - it actually mounts the GFS file system on /mnt/newroot! :-)
>>
>> It then proceeds to boot the GFS setup! :-)
>>
>> Then it seems to get stuck, printing endlessly:
>> dlm: root: remove fr 0 ID 3
>>
>> Now what...
>
> Good question ;-)
> Now we are in GFS/DLM. Which packages of GFS/DLM combi are you using and can
> you manually mount the fs? (com-step before)

No, wait - it was just me making a schoolboy mistake again. :-)
I forgot to switch off the eth1 (iSCSI SAN) interface at boot-time, along with the iscsi inits (because they were started up by the initrd image).

Switched those off and now it boots fully! :-)

It throws up a worrying error when it boots:
GFS: fsid=cluster:root.0: warning: assertion "gfs_glock_is_locked_by_me(ip->i_gl)" failed
GFS: fsid=cluster:root.0 function = gfs_readpage
GFS: fsid=cluster:root.0 file = /builddir/build/BUILD/gfs-kmod-0.1.16/_kmod_build_/src/gfs/ops_address.c, line 279
GFS: fsid=cluster:root.0: time = 1192119131

I wonder if this may be caused by a file system perhaps not having been cleanly unmounted on a previous try while I was building it...

Gordan
From: Marc G. <gr...@at...> - 2007-10-11 16:10:15
On Thursday 11 October 2007 18:02:20 Gordan Bobic wrote:
> OK - my fault - I had done something stupid.
> I touched /etc/sysconfig/comoonics-chroot to silence the warning at time
> of building the initrd.
> I have now instead removed the inclusion of this file, and that gets
> further - it actually mounts the GFS file system on /mnt/newroot! :-)
>
> It then proceeds to boot the GFS setup! :-)
>
> Then it seems to get stuck, printing endlessly:
> dlm: root: remove fr 0 ID 3
>
> Now what...

Good question ;-)
Now we are in GFS/DLM. Which packages of GFS/DLM combi are you using and can you manually mount the fs? (com-step before)

Marc.

> Gordan

--
Gruss / Regards,
Marc Grimme
Phone: +49-89 452 3538-14
http://www.atix.de/ http://www.open-sharedroot.org/
From: Gordan B. <go...@bo...> - 2007-10-11 16:04:01
On Thu, 11 Oct 2007, Marc Grimme wrote:
> I would say retry with the bootimage rpm from here:
> http://downloads.atix.de/yum/comoonics/redhat-el5/preview/SRPMS/comoonics-bootimage-1.3-20.src.rpm
> http://downloads.atix.de/yum/comoonics/redhat-el5/preview/noarch/RPMS/comoonics-bootimage-1.3-20.noarch.rpm
> because at least we should get a way better logfile.
> And then resend the logfile if problems still exist.

See other email - next step, new problem. :-)
And I think I yum updated that package this morning. :-)

Gordan
From: Gordan B. <go...@bo...> - 2007-10-11 16:02:28
OK - my fault - I had done something stupid.
I touched /etc/sysconfig/comoonics-chroot to silence the warning at time of building the initrd.
I have now instead removed the inclusion of this file, and that gets further - it actually mounts the GFS file system on /mnt/newroot! :-)

It then proceeds to boot the GFS setup! :-)

Then it seems to get stuck, printing endlessly:
dlm: root: remove fr 0 ID 3

Now what...

Gordan
From: Marc G. <gr...@at...> - 2007-10-11 15:56:27
Hmm, it looks quite strange. I mean the logs, especially this line:

    res: -> chroot_mount=, chroot_path=

It comes from linuxrc.generic.sh, line 285. Normally, if you didn't specify anything, it should look as follows:

    Building comoonics chroot environment[ OK ]
    res: /var/comoonics/chroot -> chroot_mount=/var/comoonics/chroot, chroot_path=/var/comoonics/chroot
    chroot environment created

I would say retry with the bootimage rpm from here:
http://downloads.atix.de/yum/comoonics/redhat-el5/preview/SRPMS/comoonics-bootimage-1.3-20.src.rpm
http://downloads.atix.de/yum/comoonics/redhat-el5/preview/noarch/RPMS/comoonics-bootimage-1.3-20.noarch.rpm
because at least we should get a way better logfile.
And then resend the logfile if problems still exist.

Marc.

On Thursday 11 October 2007 17:19:13 Gordan Bobic wrote:
> On Thu, 11 Oct 2007, Mark Hlawatschek wrote:
> > can you please post the output of lsmod and uname -a. Could you also
> > verify that the /lib/modules for your kernel are installed in the initrd
> > ? configfs support is enabled with the load of the dlm module.
>
> lsmod | grep dlm:
> lock_dlm
> dlm lock_dlm
> gfs2 lock_dlm,gfs
> configfs dlm
>
> uname -a:
> Linux (none) 2.6.18-8.1.14.el5 #1 SMP x86_64 GNU/Linux
>
> Gordan

--
Gruss / Regards,
Marc Grimme
Phone: +49-89 452 3538-14
http://www.atix.de/ http://www.open-sharedroot.org/
From: Gordan B. <go...@bo...> - 2007-10-11 15:19:20
On Thu, 11 Oct 2007, Mark Hlawatschek wrote:
> can you please post the output of lsmod and uname -a. Could you also verify
> that the /lib/modules for your kernel are installed in the initrd ?
> configfs support is enabled with the load of the dlm module.

lsmod | grep dlm:
lock_dlm
dlm lock_dlm
gfs2 lock_dlm,gfs
configfs dlm

uname -a:
Linux (none) 2.6.18-8.1.14.el5 #1 SMP x86_64 GNU/Linux

Gordan
From: Mark H. <hla...@at...> - 2007-10-11 15:08:09
On Thursday 11 October 2007 16:55:28 Gordan Bobic wrote:
> On Thu, 11 Oct 2007, Marc Grimme wrote:
> > What does mount (/proc/mounts) and ifconfig -a look like?
>
> /proc/mounts:
> rootfs / rootfs rw 0 0
> proc /proc proc rw 0 0
> none /sys sysfs rw 0 0
> none /dev tmpfs rw 0 0
> /proc/bus/usb /proc/bus/usb usbfs rw 0 0
> none /dev/pts devpts rw 0 0
>
>
> ifconfig -a:
> lo looks normal
> eth0 is disabled
> eth1 is set for 10.10.10.152/255.255.255.0, and verified working. I can
> ping things and access the subnet. But then again I knew that because I
> can mount the iSCSI share manually. I just cannot seem to remount it as
> root.
>
> It all goes wrong, seemingly, when it tries to mount configfs. That's the
> first failure message.

Can you please post the output of lsmod and uname -a? Could you also verify that the /lib/modules for your kernel are installed in the initrd? configfs support is enabled with the load of the dlm module.

Thanks
Mark
From: Gordan B. <go...@bo...> - 2007-10-11 14:55:46
On Thu, 11 Oct 2007, Marc Grimme wrote:
> What does mount (/proc/mounts) and ifconfig -a look like?

/proc/mounts:
rootfs / rootfs rw 0 0
proc /proc proc rw 0 0
none /sys sysfs rw 0 0
none /dev tmpfs rw 0 0
/proc/bus/usb /proc/bus/usb usbfs rw 0 0
none /dev/pts devpts rw 0 0

ifconfig -a:
lo looks normal
eth0 is disabled
eth1 is set for 10.10.10.152/255.255.255.0, and verified working. I can ping things and access the subnet. But then again I knew that because I can mount the iSCSI share manually. I just cannot seem to remount it as root.

It all goes wrong, seemingly, when it tries to mount configfs. That's the first failure message.

Gordan
From: Marc G. <gr...@at...> - 2007-10-11 14:35:32
The cluster.conf looks good. What do mount (/proc/mounts) and ifconfig -a look like?

On Thursday 11 October 2007 16:20:22 Gordan Bobic wrote:
> On Thu, 11 Oct 2007, Marc Grimme wrote:
> > What does your cluster.conf look like?
>
> Pasted:
>
> <?xml version="1.0"?>
> <cluster config_version="3" name="cluster1">
>   <cman expected_votes="1"/>
>   <fence_daemon post_fail_delay="0" post_join_delay="3"/>
>   <clusternodes>
>     <clusternode name="node3" nodeid="3" votes="1">
>       <com_info>
>         <syslog name="localhost"/>
>         <rootvolume name="/dev/sdb"/>
>         <eth name="eth1" ip="10.10.10.152" mac="00:18:8B:F9:72:DC" mask="255.255.255.0" gateway=""/>
>         <fenceackserver user="fence" passwd="test123"/>
>       </com_info>
>       <fence>
>         <method name="1"/>
>       </fence>
>     </clusternode>
>
>     <clusternode name="node4" nodeid="4" votes="2">
>       <com_info>
>         <syslog name="localhost"/>
>         <rootvolume name="/dev/sdb"/>
>         <eth name="eth1" ip="10.10.10.153" mac="00:18:8B:F9:6E:B6" mask="255.255.255.0" gateway=""/>
>         <fenceackserver user="fence" passwd="test123"/>
>       </com_info>
>       <fence>
>         <method name="1"/>
>       </fence>
>     </clusternode>
>   </clusternodes>
>   <cman/>
>   <fencedevices/>
>   <rm>
>     <failoverdomains/>
>     <resources/>
>   </rm>
> </cluster>

--
Gruss / Regards,
Marc Grimme
Phone: +49-89 452 3538-14
http://www.atix.de/ http://www.open-sharedroot.org/
From: Gordan B. <go...@bo...> - 2007-10-11 14:20:32
|
On Thu, 11 Oct 2007, Marc Grimme wrote:
> what does your cluster.conf look like?

Pasted:

<?xml version="1.0"?>
<cluster config_version="3" name="cluster1">
  <cman expected_votes="1"/>
  <fence_daemon post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="node3" nodeid="3" votes="1">
      <com_info>
        <syslog name="localhost"/>
        <rootvolume name="/dev/sdb"/>
        <eth name="eth1" ip="10.10.10.152" mac="00:18:8B:F9:72:DC" mask="255.255.255.0" gateway=""/>
        <fenceackserver user="fence" passwd="test123"/>
      </com_info>
      <fence>
        <method name="1"/>
      </fence>
    </clusternode>

    <clusternode name="node4" nodeid="4" votes="2">
      <com_info>
        <syslog name="localhost"/>
        <rootvolume name="/dev/sdb"/>
        <eth name="eth1" ip="10.10.10.153" mac="00:18:8B:F9:6E:B6" mask="255.255.255.0" gateway=""/>
        <fenceackserver user="fence" passwd="test123"/>
      </com_info>
      <fence>
        <method name="1"/>
      </fence>
    </clusternode>
  </clusternodes>
  <cman/>
  <fencedevices/>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster> |
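For anyone checking the vote settings in a cluster.conf like the one pasted above, here is a minimal sketch that tallies them. It is an illustration only, not how cman itself parses the file. Assumptions: the config is trimmed to the elements relevant to quorum, a missing `votes` attribute is treated as 1, and quorum is taken to mean "more than half of expected_votes". The helper names `summarize` and `is_quorate` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the pasted cluster.conf (quorum-relevant elements only).
CLUSTER_CONF = """\
<cluster config_version="3" name="cluster1">
  <cman expected_votes="1"/>
  <clusternodes>
    <clusternode name="node3" nodeid="3" votes="1"/>
    <clusternode name="node4" nodeid="4" votes="2"/>
  </clusternodes>
  <cman/>
</cluster>
"""


def summarize(conf_text):
    """Collect per-node votes and the cman settings found in the config."""
    root = ET.fromstring(conf_text)
    cman_elems = root.findall("cman")
    # Assumption: a node without a votes attribute counts as one vote.
    votes = {n.get("name"): int(n.get("votes", "1"))
             for n in root.iter("clusternode")}
    expected = None
    two_node = False
    for cman in cman_elems:
        if cman.get("expected_votes") is not None:
            expected = int(cman.get("expected_votes"))
        if cman.get("two_node") == "1":
            two_node = True
    return {
        "cman_count": len(cman_elems),
        "votes": votes,
        "total_votes": sum(votes.values()),
        "expected_votes": expected,
        "two_node": two_node,
    }


def is_quorate(member_votes, expected_votes):
    """Simple-majority rule (assumed): more than half of the expected votes."""
    return 2 * member_votes > expected_votes


summary = summarize(CLUSTER_CONF)
print(summary)
print("node3 alone quorate:",
      is_quorate(summary["votes"]["node3"], summary["expected_votes"]))
```

The tally makes three things in the pasted file easy to spot: there are two `<cman>` elements, node4 carries two votes while node3 carries one, and no `two_node` attribute is set. Whether any of these is related to the configfs/ccsd failure is not established in this thread.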
From: Marc G. <gr...@at...> - 2007-10-11 14:13:35
|
What does your cluster.conf look like?

On Thursday 11 October 2007 15:22:28 Gordan Bobic wrote:
> On Thu, 11 Oct 2007, Marc Grimme wrote:
> >> Almost everything works now.
> >> I had to move the scsi_start to _before_ the dm initialisation, so that
> >> the locally attached SCSI drives show up before the iSCSI devices.
> >>
> >> So, the start-up order is:
> >> SCSI
> >> iSCSI
> >> dm
> >>
> >> Now things just seem to end with a failure to get things going:
> >>
> >> Mounting configfs [FAILED]
> >> Starting service /sbin/ccsd [FAILED]
> >>
> >> No glaring errors on the console.
> >>
> >> I have attached the comoonics-boot.log
> >>
> >> Can anyone hazard a guess as to what goes wrong? Could it be that there
> >> is no two-node option in cluster.conf, so it can't get quorate with just
> >> one node?
> >
> > I would also need /var/comoonics/chroot/var/log/comoonics-boot.log (fixed
> > in cvs) and please also boot with bootoption com-debug.
>
> Rebooted with com-debug. No
> /var/comoonics/chroot/var/log/comoonics-boot.log file appears in the
> initrd.
> New /var/log/comoonics-boot.log attached.
>
> Gordan

--
Gruss / Regards,

Marc Grimme |