From: Dan M. <dan...@or...> - 2009-04-28 18:01:51
> you might try the patch or file itself attached.

Yes, this patch seems to get the distribution set properly.
clutype still gets set to gfs... is that OK even if I'm
using ocfs2?

> /etc/xen should only be tested for Dom0 "NOT" DomU.

I think this problem goes away with the correct distro setting.

> > And why when "Loading modules for all found network cards"
> > do I get "FATAL: Module xennet not found"? I have xen
> > networking compiled into my kernel so there is no module
> > for it. Should this be fatal?
> Hm. Up to now it seems to be ;-) . Nowadays I only see kernels
> that are modularized. So this is a use case where the error
> detection detects a nonexistent error. I'll have to think about it.

Perhaps also check lib/modules/build/.config to see if the
config option is set to "=y"? I am now also seeing FATAL
error reports when trying to load the ocfs2 modules, scsi,
dm, and others. I have all of these compiled into my kernel.

> -----Original Message-----
> From: Marc Grimme [mailto:gr...@at...]
> Sent: Tuesday, April 28, 2009 9:58 AM
> To: ope...@li...
> Cc: Dan Magenheimer
> Subject: Re: [OSR-users] New Preview RPMs for next Release
> Candidate of comoonics-bootimage
>
> Hi Dan,
> you might try the patch or file itself attached.
>
> Do a
> --------------------------------X8-------------------------------------
> source /opt/atix/comoonics/bootimage/boot-scripts/etc/std-lib.sh
> sourceLibs /opt/atix/comoonics/bootimage/boot-script
> sourceRootfsLibs /opt/atix/comoonics/bootimage/boot-script
> getDistributionList
> --------------------------------X8-------------------------------------
>
> and let me know what the output is.
>
> Regards Marc.
>
> On Tuesday 28 April 2009 17:29:16 Dan Magenheimer wrote:
> > Hi Marc --
> >
> > As you probably know, Oracle's Enterprise Linux (EL)
> > is a "clone" of Red Hat Enterprise Linux (RHEL) and is
> > essentially identical except for bug fixes.
> > I don't know if OSR needs to distinguish between EL and RHEL,
> > but I'm sure you know if they do.
> >
> > I'm told that you can distinguish between EL5 and RHEL5
> > for RH/EL5ga, RH/EL5u1 and RH/EL5u2 because the file
> > /etc/redhat-release has
> >
> > "Enterprise Linux Enterprise Linux release 5.X (codename)"
> >
> > in EL but has
> >
> > "Red Hat Enterprise Linux release 5.x (different_codename)"
> >
> > in RHEL.
> >
> > (Sorry, I don't know all the codenames.)
> >
> > Also, the file /etc/enterprise-release exists on EL but
> > not on RHEL and has the same contents as /etc/redhat-release.
> >
> > HOWEVER, STARTING IN RH/EL5u3, this changes. The file
> > /etc/redhat-release is the SAME for EL5u3 and RHEL5u3:
> >
> > "Red Hat Enterprise Linux release 5.3 (codename)"
> >
> > And for EL5u3, the file /etc/enterprise-release is different
> > than /etc/redhat-release. For EL5u3, /etc/enterprise-release has
> >
> > "Enterprise Linux Enterprise Linux release 5.3 (codename)"
> >
> > but /etc/redhat-release has:
> >
> > "Red Hat Enterprise Linux release 5.3 (different_codename)"
> >
> > It looks to me like the OSR scripts already distinguish
> > between EL5 and RHEL5, so there is probably a bug somewhere.
> >
> > Hope that helps!
> >
> > Thanks,
> > Dan
> >
> > > -----Original Message-----
> > > From: Marc Grimme [mailto:gr...@at...]
> > > Sent: Tuesday, April 28, 2009 1:05 AM
> > > To: Dan Magenheimer
> > > Cc: ope...@li...
> > > Subject: Re: [OSR-users] New Preview RPMs for next Release
> > > Candidate of comoonics-bootimage
> > >
> > > On Tuesday 28 April 2009 02:51:49 Dan Magenheimer wrote:
> > > > Attached is the output from the messages command to
> > > > the rescueshell for a freshly created initrd (with
> > > > a smaller lib/modules).
> > > >
> > > > Some messages I see on the console which do not appear
> > > > in the "messages" output:
> > > >
> > > > Detecting Hardware ./etc/hardware-lib.sh: line 458:
> > > > unknown_hardware_detect: command not found
> > >
> > > That's the wrong distribution detection.
> > > This function is called as ${distribution}_hardware_detect,
> > > which will fail in your case. Send me a cat /etc/*-release and
> > > ls -1 /etc/*-release and I'll make a patch for it.
> > >
> > > > Loading modules for all found network cardsFATAL: Module
> > > > xennet not found.
> > > > error: "xen.independent_wallclock" is an unknown key
> > > >
> > > > > -----Original Message-----
> > > > > From: Dan Magenheimer
> > > > > Sent: Monday, April 27, 2009 6:14 PM
> > > > > To: Marc Grimme; ope...@li...
> > > > > Subject: Re: [OSR-users] New Preview RPMs for next Release
> > > > > Candidate of comoonics-bootimage
> > > > >
> > > > > Hi Marc --
> > > > >
> > > > > Thanks for the reply. First, let me clarify that I am
> > > > > trying two different approaches:
> > > > >
> > > > > (A) Use your entire OSR RHEL5+OCFS2 howto (with
> > > > >     a Xen-paravirtualized 2.6.29 kernel running as a
> > > > >     guest on Xen-3.4.0), or
> > > > > (B) Build my own initrd and use your OSR scripts
> > > > >     only after my initrd finishes.
> > > > >
> > > > > I couldn't get (A) to work... the initrd built using
> > > > > the howto was failing very early, so I decided to
> > > > > try with (B). I hoped that (B) would be easier because
> > > > > your code is very general to handle many different
> > > > > kinds of systems, and mine could be much more specific.
> > > > >
> > > > > HOWEVER, I just discovered one problem with (A).
> > > > > Your mkinitrd process builds a huge (200MB) initrd
> > > > > and there appears to be a BUG IN XEN that fails
> > > > > to load large initrds (larger than about 100MB)!
> > > > >
> > > > > Your initrd is so large because my lib/modules/2.6.29
> > > > > is very large. If I delete that from the initrd built
> > > > > using the howto, the initrd.gz is only about 24MB
> > > > > and I am able to boot and see the ATIX logo and
> > > > > it drops into the rescue shell (because I haven't
> > > > > specified the MAC addresses, I think). However, other
> > > > > boot errors complain about modules that are missing.
> > > > > I will try to build a kernel with fewer modules and
> > > > > see how that goes.
> > > > >
> > > > > Other feedback:
> > > > >
> > > > > I wonder if your scripts detecting "distribution"
> > > > > and "shortdistribution" are correct for all versions
> > > > > of RHEL and Oracle Enterprise Linux? I am getting
> > > > > "unknown" for both (from listparameters in the rescue
> > > > > shell), though I am booting Oracle Enterprise Linux 5
> > > > > update 2.
> > > > >
> > > > > I also see that your test for detecting xen in xen-lib.sh
> > > > > is not very good. You may want to test for /proc/xen
> > > > > instead of (or in addition to) /etc/xen.
> > > > >
> > > > > And why when "Loading modules for all found network cards"
> > > > > do I get "FATAL: Module xennet not found"? I have xen
> > > > > networking compiled into my kernel so there is no module
> > > > > for it. Should this be fatal?
> > > > >
> > > > > > Ok. But still we need the MAC address for detecting the node's
> > > > > > identity in the /etc/cluster/cluster.conf.
> > > > >
> > > > > Is that really necessary? Ocfs2 only requires the
> > > > > node name, not the MAC address. But I guess I can
> > > > > configure the MAC address in the xen guest config file
> > > > > so I can live with this.
> > > > >
> > > > > Thanks,
> > > > > Dan
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Marc Grimme [mailto:gr...@at...]
> > > > > > Sent: Monday, April 27, 2009 12:52 AM
> > > > > > To: ope...@li...
> > > > > > Cc: Dan Magenheimer
> > > > > > Subject: Re: [OSR-users] New Preview RPMs for next Release
> > > > > > Candidate of comoonics-bootimage
> > > > > >
> > > > > > On Friday 24 April 2009 00:42:17 Dan Magenheimer wrote:
> > > > > > > Hi Marc --
> > > > > > >
> > > > > > > Thanks for the help. I got past my rpm problems and am
> > > > > > > now much further along but have hit another roadblock.
> > > > > > >
> > > > > > > First, FYI, I am using a different approach than the
> > > > > > > RHEL5 OCFS2 Shared Root Mini Howto, because of the
> > > > > > > way that I want to use the shared root. Specifically,
> > > > > > > I am first building and booting a root ocfs2 filesystem,
> > > > > > > using my own kernel (2.6.29) and my own initrd.img. With
> > > > > > > this (and before I install any OSR stuff), I am able
> > > > > > > to boot it as a Xen paravirtualized guest, using
> > > > > > > the Xen kernel= and ramdisk= config options; the root
> > > > > > > disk is NOT an LVM because I don't need a /boot. The
> > > > > >
> > > > > > No problem with this; my osr-ocfs2 cluster is running exactly
> > > > > > the same. No LVM, direct boot via kernel and initrd. That
> > > > > > should not be a problem.
> > > > > >
> > > > > > > 2.6.29 kernel has CONFIG_IP_PNP and I pass in the
> > > > > > > IP address and hostname from the Xen config file.
> > > > > > > This all seems to work fine (for a single node).
> > > > > >
> > > > > > Ok. But still we need the MAC address for detecting the node's
> > > > > > identity in the /etc/cluster/cluster.conf.
> > > > > > Also you might try to set the onboot flag in the cluster.conf
> > > > > > at the nic config to "no":
> > > > > > <com_info>
> > > > > > ..
> > > > > > <eth name="eth0" mac="..." onboot="no"/>
> > > > > > ..
> > > > > > </com_info>
> > > > > > I didn't test it yet (it is not YET in my test cases) but I'm
> > > > > > pretty confident it should work.
> > > > > >
> > > > > > > Next, I install the OSR rpms directly in the running
> > > > > > > ocfs2-root guest, then shut it down.
> > > > > >
> > > > > > Ok. So far so good.
> > > > > >
> > > > > > > Next, I mount the ocfs2-root-disk from another guest
> > > > > > > (that also has the OSR rpms installed) and follow
> > > > > > > the howto steps to create the cdsl infrastructure
> > > > > > > and links. Then I shut down the other guest.
> > > > > >
> > > > > > Could you recall the exact steps and outputs?
> > > > > >
> > > > > > > Next, I try to boot the OSR-modified ocfs2-root guest,
> > > > > > > but it has problems. It appears that /var doesn't
> > > > > > > exist as I get many messages such as:
> > > > > >
> > > > > > The failure to mount /var is very strange. It should put you in
> > > > > > a rescue shell. Then type messages and send me the output.
> > > > > >
> > > > > > > /etc/rc.d/rc.sysinit: /var/log/dmesg: No such file or directory
> > > > > >
> > > > > > That's just before it tries to boot. That's somehow too
> > > > > > far advanced.
> > > > > >
> > > > > > > and then the boot process seems to hang trying to start the
> > > > > > > System Logger. No, it just takes a very long time and
> > > > > > > eventually I get to a login prompt. (Or I can boot
> > > > > > > single-user mode and get the same error messages, but
> > > > > > > get to a bash prompt.)
> > > > > > >
> > > > > > > With a "ls -l /var", I see:
> > > > > > >
> > > > > > > lrwxrwxrwx 1 root root 14 <date> /var -> cdsl.local/var
> > > > > >
> > > > > > That's perfectly ok.
> > > > > >
> > > > > > > (Note no leading / before cdsl.local)
> > > > > > >
> > > > > > > but "ls -ld /cdsl.local" shows it is empty.
> > > > > >
> > > > > > That's strange.
> > > > > >
> > > > > > > Browsing around, I see that /cluster/cdsl is populated
> > > > > > > (with subdirectories 0 ... 7 and default) and each has
> > > > > > > an etc and a var subdirectory.
> > > > > > > /cluster/shared has
> > > > > > > a var subdirectory and a var/lib subdirectory.
> > > > > >
> > > > > > That's again perfectly ok.
> > > > > >
> > > > > > > So I'm guessing that cdsl.local should somehow be
> > > > > > > linked to /cluster but isn't. True?
> > > > > >
> > > > > > Right, but it is not linked but bind mounted. That means:
> > > > > > mount --bind /cluster/cdsl/<nodeid> /cdsl.local
> > > > > > but that's done in the initrd automatically so you should not
> > > > > > have to bother about that.
> > > > > >
> > > > > > > One other thing I should mention... since my cluster.conf
> > > > > > > has 8 nodes numbered 0 to 7, in the "mount --bind"
> > > > > > > command during the cdsl setup steps, I used cluster/cdsl/0
> > > > > > > instead of cluster/cdsl/1 to bind to cdsl.local.
> > > > > >
> > > > > > How did you "use" that? That should be done automatically,
> > > > > > shouldn't it?
> > > > > >
> > > > > > > Any ideas? Maybe your initrd creates some necessary links
> > > > > > > and mine does not? (I tried booting with your initrd,
> > > > > > > but my ocfs2-root failed to mount giving a kernel panic...
> > > > > > > have you tested with linux-2.6.29? The error message
> > > > > > > "Heartbeat has to be started to mount a read-write
> > > > > > > clustered device" looks like it comes from a somewhat
> > > > > > > recent ocfs2 kernel patch I found here:
> > > > > > >
> > > > > > > http://www.mail-archive.com/ocf...@os.../msg00293.html
> > > > > > >
> > > > > > > and I worked around it by mounting with -o heartbeat=local)
> > > > > > >
> > > > > > > Sorry this is so long!
> > > > > >
> > > > > > What does your /etc/cluster/cluster.conf look like?
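A minimal sketch of how the bind mount quoted above could be checked from a booted node or the rescue shell. The function name check_cdsl_local and its optional root-prefix argument are hypothetical and not part of the OSR scripts; only the paths /cluster/cdsl/<nodeid> and /cdsl.local come from this thread.

```shell
#!/bin/sh
# Hypothetical diagnostic, NOT part of the OSR boot scripts.
# $1 = node id used under /cluster/cdsl (e.g. 0)
# $2 = optional root prefix (default: empty, i.e. the live system)
check_cdsl_local() {
    nodeid=$1
    root=${2:-}
    src="$root/cluster/cdsl/$nodeid"
    dst="$root/cdsl.local"

    # The per-node tree must exist before it can be bind mounted.
    if [ ! -d "$src" ]; then
        echo "missing source directory $src"
        return 1
    fi
    # An empty (or missing) cdsl.local suggests the bind mount did not happen.
    if [ -z "$(ls -A "$dst" 2>/dev/null)" ]; then
        echo "$dst is empty; expected: mount --bind $src $dst"
        return 1
    fi
    echo "$dst is populated"
}
```

Calling "check_cdsl_local 0" on the affected guest should report the empty /cdsl.local described above. It only inspects directories and never mounts anything itself.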
> > > > >
> > > > > --
> > > > > Gruss / Regards,
> > > > >
> > > > > Marc Grimme
> > > > > http://www.atix.de/ http://www.open-sharedroot.org/
> > > >
> > > > ------------------------------------------------------------------------------
> > > > Register Now & Save for Velocity, the Web Performance & Operations
> > > > Conference from O'Reilly Media. Velocity features a full day of
> > > > expert-led, hands-on workshops and two days of sessions from industry
> > > > leaders in dedicated Performance & Operations tracks. Use code vel09scf
> > > > and Save an extra 15% before 5/3. http://p.sf.net/sfu/velocityconf
> > > > _______________________________________________
> > > > Open-sharedroot-users mailing list
> > > > Ope...@li...
> > > > https://lists.sourceforge.net/lists/listinfo/open-sharedroot-users
> > >
> > > --
> > > Gruss / Regards,
> > >
> > > Marc Grimme
> > > Phone: +49-89 452 3538-14
> > > http://www.atix.de/ http://www.open-sharedroot.org/
> > >
> > > ATIX Informationstechnologie und Consulting AG | Einsteinstrasse 10 |
> > > 85716 Unterschleissheim | www.atix.de | www.open-sharedroot.org
> > >
> > > Registergericht: Amtsgericht Muenchen, Registernummer: HRB 168930,
> > > USt.-Id.: DE209485962 | Vorstand: Marc Grimme, Mark Hlawatschek,
> > > Thomas Merz (Vors.) | Vorsitzender des Aufsichtsrats: Dr. Martin Buss
>
> --
> Gruss / Regards,
>
> Marc Grimme
> Phone: +49-89 452 3538-14
> http://www.atix.de/ http://www.open-sharedroot.org/
>
> ATIX Informationstechnologie und Consulting AG | Einsteinstrasse 10 |
> 85716 Unterschleissheim | www.atix.de | www.open-sharedroot.org
>
> Registergericht: Amtsgericht Muenchen, Registernummer: HRB 168930,
> USt.-Id.: DE209485962 | Vorstand: Marc Grimme, Mark Hlawatschek,
> Thomas Merz (Vors.) | Vorsitzender des Aufsichtsrats: Dr. Martin Buss
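A rough sketch of the .config check suggested at the top of this message: treat a failed modprobe as fatal only when the driver is not built into the kernel either. Both function names are hypothetical and not taken from the OSR scripts, and the mapping from a module name to its CONFIG_ option (for example xennet to CONFIG_XEN_NETDEV_FRONTEND) is an assumption that would have to be verified per kernel.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of the OSR scripts: succeeds if a kernel
# option is built in ("=y") according to the given kernel .config file.
# $1 = path to a kernel .config, $2 = option name, e.g. CONFIG_OCFS2_FS
# Returns 0 if built in, 1 if not, 2 if the .config cannot be read.
kernel_option_builtin() {
    config_file=$1
    option=$2
    [ -r "$config_file" ] || return 2
    grep -q "^${option}=y\$" "$config_file"
}

# Hypothetical wrapper: try modprobe first, and only report a fatal error
# when the driver is neither loadable as a module nor built in.
# $1 = module name, $2 = matching config option (assumed mapping),
# $3 = path to the kernel .config
load_module_or_builtin() {
    module=$1
    option=$2
    config_file=$3
    if modprobe "$module" 2>/dev/null; then
        echo "loaded module $module"
    elif kernel_option_builtin "$config_file" "$option"; then
        echo "$module is built in ($option=y), nothing to load"
    else
        echo "FATAL: $module not found and $option is not built in" >&2
        return 1
    fi
}
```

Usage might look like "load_module_or_builtin xennet CONFIG_XEN_NETDEV_FRONTEND /lib/modules/$(uname -r)/build/.config". That path is a guess based on the one mentioned above, and the .config would also have to be present inside the initrd for the check to work at boot time.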