From: Marc G. <gr...@at...> - 2009-04-28 15:57:56
Hi Dan,
you might try the attached patch, or the attached file itself.
Run the following
--------------------------------X8-------------------------------------
source /opt/atix/comoonics/bootimage/boot-scripts/etc/std-lib.sh
sourceLibs /opt/atix/comoonics/bootimage/boot-script
sourceRootfsLibs /opt/atix/comoonics/bootimage/boot-script
getDistributionList
--------------------------------X8-------------------------------------
and let me know what the output is.
Regards Marc.
On Tuesday 28 April 2009 17:29:16 Dan Magenheimer wrote:
> Hi Marc --
>
> As you probably know, Oracle's Enterprise Linux (EL)
> is a "clone" of Red Hat Enterprise Linux (RHEL) and is
> essentially identical except for bug fixes. I don't
> know if OSR needs to distinguish between EL and RHEL,
> but I'm sure you know if they do.
>
> I'm told that you can distinguish between EL5 and RHEL5
> for RH/EL5ga, RH/EL5u1 and RH/EL5u2 because the file
> /etc/redhat-release has
>
> "Enterprise Linux Enterprise Linux release 5.X (codename)"
>
> in EL but has
>
> "Red Hat Enterprise Linux release 5.x (different_codename)"
>
> in RHEL.
>
> (Sorry, I don't know all the codenames.)
>
> Also, the file /etc/enterprise-release exists on EL but
> not on RHEL and has the same contents as /etc/redhat-release.
>
> HOWEVER, STARTING IN RH/EL5u3, this changes. The file
> /etc/redhat-release is the SAME for EL5u3 and RHEL5u3:
>
> "Red Hat Enterprise Linux release 5.3 (codename)"
>
> And for EL5u3, the file /etc/enterprise-release is different
> than /etc/redhat-release. For EL5u3, /etc/enterprise-release has
>
> "Enterprise Linux Enterprise Linux release 5.3 (codename)"
>
> but /etc/redhat-release has:
>
> "Red Hat Enterprise Linux release 5.3 (different_codename)"
>
> It looks to me like the OSR scripts already distinguish
> between EL5 and RHEL5, so there is probably a bug somewhere.
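A minimal sketch of the check described above, assuming that /etc/enterprise-release exists only on EL and that the release strings start as quoted. The function name detect_el_or_rhel and its optional root-directory argument are made up for illustration; this is not the OSR getDistributionList code.

```shell
# Sketch only: detect_el_or_rhel and its ROOT argument are hypothetical names,
# not part of the OSR scripts. ROOT defaults to "/" and exists so the check
# can be tried against a test directory tree.
detect_el_or_rhel() {
    root="${1:-/}"
    if [ -f "$root/etc/enterprise-release" ]; then
        # Per the description above, only EL ships this file, and it is
        # still present in EL5u3 where redhat-release looks like RHEL.
        echo "el"
    elif [ -f "$root/etc/redhat-release" ]; then
        case "$(cat "$root/etc/redhat-release")" in
            "Enterprise Linux Enterprise Linux release"*) echo "el" ;;
            "Red Hat Enterprise Linux"*)                  echo "rhel" ;;
            *)                                            echo "unknown" ;;
        esac
    else
        echo "unknown"
    fi
}
```

Checking /etc/enterprise-release first matters for 5.3 and later, because there /etc/redhat-release alone no longer tells the two apart.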
>
> Hope that helps!
>
> Thanks,
> Dan
>
> > -----Original Message-----
> > From: Marc Grimme [mailto:gr...@at...]
> > Sent: Tuesday, April 28, 2009 1:05 AM
> > To: Dan Magenheimer
> > Cc: ope...@li...
> > Subject: Re: [OSR-users] New Preview RPMs for next Release
> > Candidate of
> > comoonics-bootimage
> >
> > On Tuesday 28 April 2009 02:51:49 Dan Magenheimer wrote:
> > > Attached is the output from the messages command to
> > > the rescueshell for a freshly created initrd (with
> > > a smaller lib/modules).
> > >
> > > Some messages I see on the console which do not appear
> > > in the "messages" output:
> > >
> > > Detecting Hardware ./etc/hardware-lib.sh: line 458:
> > > unknown_hardware_detect: command not found
> >
> > That's caused by the wrong distribution detection.
> > This function is called as ${distribution}_hardware_detect, which will
> > fail in your case. Send me the output of cat /etc/*-release and
> > ls -1 /etc/*-release and I'll make a patch for it.
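To illustrate the failure: the function name is built from the detected distribution, so a distribution of "unknown" leads to a call to unknown_hardware_detect, which does not exist. The sketch below shows one way such a dispatch could be guarded. All names except the ${distribution}_hardware_detect pattern are hypothetical, and the generic fallback is an assumption, not what hardware-lib.sh does.

```shell
# Sketch only: call_distribution_hook, rhel5_hardware_detect and
# generic_hardware_detect are hypothetical names for illustration.
rhel5_hardware_detect()   { echo "rhel5 hardware detection"; }
generic_hardware_detect() { echo "generic hardware detection"; }

call_distribution_hook() {
    distribution="$1"
    hook="${distribution}_hardware_detect"
    # "type" succeeds only if the name resolves to a function or command.
    if type "$hook" >/dev/null 2>&1; then
        "$hook"
    else
        # Assumed fallback: warn on stderr and run a generic detection.
        echo "no $hook defined, using generic detection" >&2
        generic_hardware_detect
    fi
}
```

Calling call_distribution_hook unknown then prints a warning instead of "command not found", while the real fix remains a correct distribution detection.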
> >
> > > Loading modules for all found network cardsFATAL: Module xennet not found.
> > > error: "xen.independent_wallclock" is an unknown key
> > >
> > > > -----Original Message-----
> > > > From: Dan Magenheimer
> > > > Sent: Monday, April 27, 2009 6:14 PM
> > > > To: Marc Grimme; ope...@li...
> > > > Subject: Re: [OSR-users] New Preview RPMs for next Release
> > > > Candidate of
> > > > comoonics-bootimage
> > > >
> > > >
> > > > Hi Marc --
> > > >
> > > > Thanks for the reply. First, let me clarify that I am
> > > > trying two different approaches:
> > > >
> > > > (A) Use your entire OSR RHEL5+OCFS2 howto (with
> > > > a Xen-paravirtualized 2.6.29 kernel running as a
> > > > guest on Xen-3.4.0), or
> > > > (B) Build my own initrd and use your OSR scripts
> > > > only after my initrd finishes.
> > > >
> > > > I couldn't get (A) to work... the initrd built using
> > > > the howto was failing very early, so I decided to
> > > > try with (B). I hoped that (B) would be easier because
> > > > your code is very general to handle many different
> > > > kinds of systems, and mine could be much more specific.
> > > >
> > > > HOWEVER, I just discovered one problem with (A).
> > > > Your mkinitrd process builds a huge (200MB) initrd
> > > > and there appears to be a BUG IN XEN that fails
> > > > to load large initrds (larger than about 100MB)!
> > > >
> > > > Your initrd is so large because my lib/modules/2.6.29
> > > > is very large. If I delete that from initrd built
> > > > using the howto, the initrd.gz is only about 24MB
> > > > and I am able to boot and see the ATIX logo and
> > > > it drops into the rescue shell (because I haven't
> > > > specified the MAC-Addresses I think). However other
> > > > boot errors complain about modules that are missing.
> > > > I will try to build a kernel with fewer modules and
> > > > see how that goes.
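A small sketch of a size check that could be run before booting a freshly built initrd. The function name initrd_too_large and its arguments are hypothetical; the default limit of 100 MB is only the approximate value reported above, not a documented Xen limit.

```shell
# Sketch only: initrd_too_large and its arguments are hypothetical.
# Usage: initrd_too_large FILE [LIMIT_MB]
# Returns 0 if FILE is larger than LIMIT_MB megabytes (default 100),
# 1 if it is not, and 2 if FILE cannot be read.
initrd_too_large() {
    initrd="$1"
    limit_mb="${2:-100}"
    size_bytes=$(wc -c < "$initrd") || return 2
    [ "$((size_bytes))" -gt "$((limit_mb * 1024 * 1024))" ]
}
```

It checks the size of the file as it is on disk, so for a compressed initrd it measures the compressed size.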
> > > >
> > > > Other feedback:
> > > >
> > > > I wonder if your scripts detecting "distribution"
> > > > and "shortdistribution" are correct for all versions
> > > > of RHEL and Oracle Enterprise Linux? I am getting
> > > > "unknown" for both (from listparameters in the rescue
> > > > shell), though I am booting Oracle Enterprise Linux 5
> > > > update 2.
> > > >
> > > > I also see that your test for detecting xen in xen-lib.sh
> > > > is not very good. You may want to test for /proc/xen
> > > > instead of (or in addition to) /etc/xen.
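A sketch of the suggested test, checking /proc/xen in addition to /etc/xen. The function name detect_xen and its optional root-directory argument are hypothetical and only there so the check can be tried on a test tree; the actual test in xen-lib.sh may look different.

```shell
# Sketch only: detect_xen and its ROOT argument are hypothetical names.
# Returns 0 if either directory exists below ROOT (default "/").
detect_xen() {
    root="${1:-/}"
    # /proc/xen reflects the running kernel; /etc/xen typically only
    # shows that Xen configuration files are installed.
    [ -d "$root/proc/xen" ] || [ -d "$root/etc/xen" ]
}
```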
> > > >
> > > > And why when "Loading modules for all found network cards"
> > > > do I get "FATAL: Module xennet not found"? I have xen
> > > > networking compiled into my kernel so there is no module
> > > > for it. Should this be fatal?
> > > >
> > > > > Ok. But we still need the MAC address for detecting the node's
> > > > > identity in /etc/cluster/cluster.conf.
> > > >
> > > > Is that really necessary? Ocfs2 only requires the
> > > > node name, not the mac address. But I guess I can
> > > > configure the mac address in the xen guest config file
> > > > so I can live with this.
> > > >
> > > > Thanks,
> > > > Dan
> > > >
> > > > > -----Original Message-----
> > > > > From: Marc Grimme [mailto:gr...@at...]
> > > > > Sent: Monday, April 27, 2009 12:52 AM
> > > > > To: ope...@li...
> > > > > Cc: Dan Magenheimer
> > > > > Subject: Re: [OSR-users] New Preview RPMs for next Release
> > > > > Candidate of
> > > > > comoonics-bootimage
> > > > >
> > > > > On Friday 24 April 2009 00:42:17 Dan Magenheimer wrote:
> > > > > > Hi Marc --
> > > > > >
> > > > > > Thanks for the help. I got past my rpm problems and am
> > > > > > now much further along but have hit another roadblock.
> > > > > >
> > > > > > First, FYI, I am using a different approach than the
> > > > > > RHEL5 OCFS2 Shared Root Mini Howto, because of the
> > > > > > way that I want to use the shared root. Specifically,
> > > > > > I am first building and booting a root ocfs2 filesystem,
> > > > > > using my own kernel (2.6.29) and my own initrd.img. With
> > > > > > this (and before I install any OSR stuff), I am able
> > > > > > to boot it as a Xen paravirtualized guest, using
> > > > > > the Xen kernel= and ramdisk= config options; the root
> > > > > > disk is NOT an LVM because I don't need a /boot. The
> > > > >
> > > > > No problem with this; my osr-ocfs2 cluster runs exactly the same
> > > > > way: no LVM, direct boot via kernel and initrd. That should not
> > > > > be a problem.
> > > > >
> > > > > > 2.6.29 kernel has CONFIG_IP_PNP and I pass in the
> > > > > > IP address and hostname from the Xen config file.
> > > > > > This all seems to work fine (for a single node).
> > > > >
> > > > > Ok. But we still need the MAC address for detecting the node's
> > > > > identity in /etc/cluster/cluster.conf.
> > > > > You might also try setting the onboot flag in the NIC config in
> > > > > cluster.conf to "no":
> > > > > <com_info>
> > > > > ..
> > > > > <eth name="eth0" mac="..." onboot="no"/>
> > > > > ..
> > > > > </com_info>
> > > > > I haven't tested it yet (it is not YET in my test cases), but I'm
> > > > > pretty confident it should work.
> > > > >
> > > > > > Next, I install the OSR rpms directly in the running
> > > > > > ocfs2-root guest, then shut it down.
> > > > >
> > > > > Ok. So far so good.
> > > > >
> > > > > > Next, I mount the ocfs2-root-disk from another guest
> > > > > > (that also has the OSR rpms installed) and follow
> > > > > > the howto steps to create the cdsl infrastructure
> > > > > > and links. Then I shut down the other guest.
> > > > >
> > > > > Could you recall the exact steps and outputs?
> > > > >
> > > > > > Next, I try to boot the OSR-modified ocfs2-root guest,
> > > > > > but it has problems. It appears that /var doesn't
> > > > > > exist as I get many messages such as:
> > > > >
> > > > > It is very strange that /var is not mounted. It should put you in
> > > > > a rescue shell.
> > > > > Then type messages and send me the output.
> > > > >
> > > > > > /etc/rc.d/rc.sysinit: /var/log/dmesg: No such file or directory
> > > > >
> > > > > That's just before it tries to boot. That's somehow too far advanced.
> > > > >
> > > > > > and then the boot process seems to hang trying to start the
> > > > > > System Logger. No, it just takes a very long time and
> > > > > > eventually I get to a login prompt. (Or I can boot
> > > > > > single-user mode and get the same error messages, but
> > > > > > get to a bash prompt.)
> > > > > >
> > > > > > With a "ls -l /var", I see:
> > > > > >
> > > > > > lrwxrwxrwx 1 root root 14 <date> /var -> cdsl.local/var
> > > > >
> > > > > That's perfectly ok.
> > > > >
> > > > > > (Note no leading / before cdsl.local)
> > > > > >
> > > > > > but "ls -ld /cdsl.local" shows it is empty.
> > > > >
> > > > > That's strange.
> > > > >
> > > > > > Browsing around, I see that /cluster/cdsl is populated
> > > > > > (with subdirectories 0 ... 7 and default) and each has
> > > > > > an etc and a var subdirectory. /cluster/shared has
> > > > > > a var subdirectory and a var/lib subdirectory.
> > > > >
> > > > > That's again perfectly ok.
> > > > >
> > > > > > So I'm guessing that cdsl.local should somehow be
> > > > > > linked to /cluster but isn't. True?
> > > > >
> > > > > Right, but it is not linked but bind mounted. That means:
> > > > > mount --bind /cluster/cdsl/<nodeid> /cdsl.local
> > > > > But that's done automatically in the initrd, so you should not
> > > > > have to bother about that.
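A sketch of a quick diagnostic for the symptom described above: /var is a relative link to cdsl.local/var, so it only resolves once /cdsl.local has content, for instance through the bind mount. The function name check_cdsl_var and its optional root-directory argument are hypothetical and not part of the comoonics tools.

```shell
# Sketch only: check_cdsl_var and its ROOT argument are hypothetical names.
# Prints "ok", "dangling" or "not-a-link" for ROOT/var (ROOT defaults to "/").
check_cdsl_var() {
    root="${1:-/}"
    if [ ! -L "$root/var" ]; then
        echo "not-a-link"
    elif [ -d "$root/var/" ]; then
        # The link var -> cdsl.local/var resolves to a directory.
        echo "ok"
    else
        # The link exists but its target is missing, for example because
        # nothing is bind mounted on cdsl.local yet.
        echo "dangling"
    fi
}
```

A result of "dangling" on the booted system would match an empty /cdsl.local, which points at the bind mount not having happened.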
> > > > >
> > > > > > One other thing I should mention... since my cluster.conf
> > > > > > has 8 nodes numbered 0 to 7, in the "mount --bind"
> > > > > > command during the cdsl setup steps, I used cluster/cdsl/0
> > > > > > instead of cluster/cdsl/1 to bind to cdsl.local.
> > > > >
> > > > > How did you "use" that? That should be done automatically,
> > > > > shouldn't it?
> > > > >
> > > > > > Any ideas? Maybe your initrd creates some necessary links
> > > > > > and mine does not? (I tried booting with your initrd,
> > > > > > but my ocfs2-root failed to mount giving a kernel panic...
> > > > > > have you tested with linux-2.6.29? The error message
> > > > > > "Heartbeat has to be started to mount a read-write
> > > > > > clustered device" looks like it comes from a somewhat
> > > > > > recent ocfs2 kernel patch I found here:
> > > > > > http://www.mail-archive.com/ocf...@os.../msg00293.html
> > > > > > and I worked around it by mounting with -o heartbeat=local)
> > > > > >
> > > > > > Sorry this is so long!
> > > > >
> > > > > What does your /etc/cluster/cluster.conf look like?
> > > > >
> > > > > --
> > > > > Gruss / Regards,
> > > > >
> > > > > Marc Grimme
> > > > > http://www.atix.de/ http://www.open-sharedroot.org/
> >
> > > ---------------------------------------------------------------------------
> > > --- Register Now & Save for Velocity, the Web Performance & Operations
> > > Conference from O'Reilly Media. Velocity features a full day of
> > > expert-led, hands-on workshops and two days of sessions from industry
> > > leaders in dedicated Performance & Operations tracks. Use code vel09scf
> > > and Save an extra 15% before 5/3. http://p.sf.net/sfu/velocityconf
> > > _______________________________________________
> > > Open-sharedroot-users mailing list
> > > Ope...@li...
> > > https://lists.sourceforge.net/lists/listinfo/open-sharedroot-users
> >
> > --
> > Gruss / Regards,
> >
> > Marc Grimme
> > Phone: +49-89 452 3538-14
> > http://www.atix.de/ http://www.open-sharedroot.org/
> >
> > ATIX Informationstechnologie und Consulting AG | Einsteinstrasse 10 |
> > 85716 Unterschleissheim | www.atix.de | www.open-sharedroot.org
> >
> > Registergericht: Amtsgericht Muenchen, Registernummer: HRB
> > 168930, USt.-Id.:
> > DE209485962 | Vorstand: Marc Grimme, Mark Hlawatschek, Thomas
> > Merz (Vors.) |
> > Vorsitzender des Aufsichtsrats: Dr. Martin Buss
>
--
Gruss / Regards,
Marc Grimme
Phone: +49-89 452 3538-14
http://www.atix.de/ http://www.open-sharedroot.org/
ATIX Informationstechnologie und Consulting AG | Einsteinstrasse 10 |
85716 Unterschleissheim | www.atix.de | www.open-sharedroot.org
Registergericht: Amtsgericht Muenchen, Registernummer: HRB 168930, USt.-Id.:
DE209485962 | Vorstand: Marc Grimme, Mark Hlawatschek, Thomas Merz (Vors.) |
Vorsitzender des Aufsichtsrats: Dr. Martin Buss