From: Marc G. <gr...@at...> - 2007-10-12 07:53:46
On Friday 12 October 2007 09:25:45 Gordan Bobic wrote:
> On Fri, 12 Oct 2007, Marc Grimme wrote:
> >>>> It throws up a worrying error when it boots:
> >>>> GFS: fsid=cluster:root.0: warning: assertion
> >>>> "gfs_glock_is_locked_by_me(ip->i_gl)" failed
> >>>> GFS: fsid=cluster:root.0 function = gfs_readpage
> >>>> GFS: fsid=cluster:root.0 file = /builddir/build/BUILD/gfs-kmod-0.1.16/_kmod_build_/src/gfs/ops_address.c, line 279
> >>>> GFS: fsid=cluster:root.0: time = 1192119131
> >>>>
> >>>> I wonder if this may be caused by a file system perhaps not having
> >>>> been cleanly unmounted on a previous try while I was building it...
> >>>
> >>> Perhaps just fschk it when you're in the initrd.
> >>
> >> Yup, just did. Some minor things were broken with the fs metadata. But
> >> when I reboot, I still get a similar message when HALd loads. I wonder
> >> if I can safely switch that off - assuming that's causing it...
> >
> > You could also file it to the gfs list and see what they'll tell you. It
> > seems to me I've seen this message also.
>
> If you mean the RedHat's Linux Cluster list - I already did. :-)

Yes, I saw that a minute after writing the answer. ;-)

>
> >> Now, in theory, I should be able to bring up another node on the same
> >> file system. All I would need to do is clone the /boot partition to the
> >> other box, and it should just come up.
> >
> > Why cloning it and not using the same. Isn't that possible. We are always
> > doing it this way.
>
> Because I'm not booting this off DHCP. I'm booting the kernel and the
> initrd off the local disk. So I need to clone the boot partition with the
> kernel and the initrd to each of the nodes.

OK. How about PXE? IMHO you could use one shared boot image, couldn't you?

>
> >> What do I need to do to achieve this, and can it all be done with the
> >> one node that is already running? I'm assuming that I'll have to do
> >> something like:
> >>
> >> mount --bind /cluster/cdsl/4/ /cdsl.local/
> >
> > exactly if nodeid is 4. But again the initrd should do this job
> > automatically.
>
> So, I wouldn't need to do this at all? The initrd will automagically link
> /cdsl.local to /cluster/cdsl/nodeid ?

Yes, this is done in linuxrc.generic.sh, lines 354-360:

clusterfs_mount_cdsl $newroot $cdsl_local_dir $nodeid $cdsl_prefix
if [ $return_c -ne 0 ]; then
   echo_local "Could not mount cdsl $cdsl_local_dir to ${cdsl_prefix}/$nodeid. Exiting"
   exit_linuxrc 1
fi
step "CDSL tree mounted"

>
> >> As far as unsharing things under /var, I _think_ only /var/lock actually
> >> needs to be unshared. Can I do this with the running image with:
> >>
> >> com-mkcdsl -r / -a /var/lock
> >
> > you can skip the -r/ it is default.
> > How about /var/run, /var/log, /var/cache, /var/tmp, /var/spool. All of
> > these normally need to be hostdependent.
>
> I'm not sure why /var/cache and /var/spool would need to be host
> dependent. I can see reasons why I'd want to them to be shared.

I think e.g. /var/spool/mail should be, just judging from the name. But it's
up to you.

>
> I agree that /var/run and /var/lock should be private.
>
> It would be _nice_ to have a shared /var/log, but from past experience,
> the logs will get messed up when multiple syslogs try to write to them.
> Is there a shared logging solution for this? I know I can pick a master
> log node and get syslog pointed at this, but this won't work for all the
> other non-syslog services (e.g. Apache).

I was about to say: use a syslog server. But right, with Apache that does not
work. For Apache we've written a log analysis tool to merge the logs; it's in
the addons channel and is called mgrep. I think I also read a howto on
integrating Apache into syslog somewhere.
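If you just need a quick manual merge in the meantime, something along these
lines should work as a starting point (a rough, untested sketch - the per-node
log paths and node names are only an example, and it assumes all nodes log in
the same timezone, since the offset in the timestamp is ignored):

# Merge combined/common-format Apache access logs from several nodes by
# timestamp: prefix each line with a sortable YYYYMMDDHHMMSS key, sort on
# it, then strip the key again.
for node in node1 node2 node3; do
    cat /var/log/httpd/access_log.$node
done | awk '
BEGIN {
    split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", m, " ")
    for (i = 1; i <= 12; i++) mon[m[i]] = sprintf("%02d", i)
}
{
    ts = substr($4, 2)            # strip the leading "[" of the timestamp field
    split(ts, a, "[/:]")          # a[1]=day a[2]=month a[3]=year a[4..6]=h m s
    printf "%s%s%s%s%s%s %s\n", a[3], mon[a[2]], a[1], a[4], a[5], a[6], $0
}' | sort -k1,1 | cut -d' ' -f2- > access_log.merged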
>
> I plan to link /var/tmp to /tmp, and have /tmp mounted to a big local
> partition (local disks are only planned to have /boot, /tmp and swap).
>
> Which brings me to the next question - how do I use a local disk partition
> instead of the initrd? What's the procedure for that? It seems a more
> efficient solution than relying on a ramdisk that eats memory after
> booting up when there is plenty of local disk space available. How do I
> use /etc/sysconfig/comoonics-chroot ?

Yes. I suppose you don't want to configure your local disk with LVM ;-), so
I'll explain it without. It's basically quite easy:

1. For every node: spare one partition for the chroot (let's say it is
   /dev/sda4) and let it be at least 500M.
2. For every node: mkfs.ext3 /dev/sda4
3. Add the following to the com_info section of every node (see the P.S.
   below for how it fits into cluster.conf):
   <chrootenv mountpoint="/var/comoonics/chroot" fstype="ext3" device="/dev/sda4" chrootdir="/var/comoonics/chroot"/>
4. Make a new initrd.
5. Reboot every node.

That's it, now everything should be running on your local disk instead of
tmpfs.

Marc.

--
Gruss / Regards,
Marc Grimme
http://www.atix.de/
http://www.open-sharedroot.org/
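P.S.: Regarding step 3 - the chrootenv line goes into the com_info block that
each clusternode already has in your cluster.conf. Roughly like this (the
surrounding elements are only sketched from memory; keep whatever is already
in your com_info and just add the chrootenv line):

<clusternode name="node1">
  <com_info>
    <!-- existing com_info entries stay as they are -->
    <chrootenv mountpoint="/var/comoonics/chroot" fstype="ext3" device="/dev/sda4" chrootdir="/var/comoonics/chroot"/>
  </com_info>
  <!-- rest of the clusternode section (fencing etc.) unchanged -->
</clusternode>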