From: Dan M. <dan...@or...> - 2009-05-12 15:45:45
Sorry to take a while to respond to this.

I decided to fall back to a 2.6.18-92 (EL5u2) kernel. I still have to customize it, but thought it would be closer to your tested environment. I've worked around a few problems and eventually got an EL5u2 (with modified kernel) OCFS2 OSR to boot.

First, I still had to go back and patch the boot-lib.sh file to properly recognize "enterprise linux enterprise linux"... is there an rpm on download.atix that has that fixed yet?

Next, I am still seeing many "FATAL" messages from modprobe. Perhaps the usages of modprobe in your scripts should use the "-q" option? Anyway, I don't think any of the FATAL messages is really fatal.

When I used the mkinitrd -l option, one of the DLM modules did not find its way into the initrd. (Console output read "unknown filesystem type 'ocfs2_dlmfs'".) With no -l option this problem went away... and since the 2.6.18 config file I am using has fewer modules, I didn't run into the xen bug I saw before that failed to properly load large ramdisks.

Last, in working through the above failed boots, failure always drops into a bash shell instead of a rescueshell.

I've attached console output of a failed boot (not the final successful boot)... let me know if you need more.

Thanks,
Dan

P.S. My rpm list is the same as before except I used bootimage-1.4-21 instead of 1.4.19.

> -----Original Message-----
> From: Marc Grimme [mailto:gr...@at...]
> Sent: Wednesday, May 06, 2009 1:40 AM
> To: Dan Magenheimer
> Cc: ope...@li...
> Subject: Re: [OSR-users] New Preview RPMs for next Release
> Candidate of comoonics-bootimage
>
>
> On Tuesday 05 May 2009 01:29:28 Dan Magenheimer wrote:
> > Hi Marc --
> >
> > I've worked through the OSR setup process again in my
> > environment and am documenting it.
> >
> > For the most part, it is working, but intermittently.
> > Booting some nodes works fine once and then fails the
> > next time with no changes. One common failure appears
> > to be due to a failure in "Detecting nodeid & nodename..."
> That's very strange; perhaps something has not stabilized.
> Could you start with com-debug and send the output?
> >
> > But a problem I've seen with this new cluster: When
> > I have a problem (such as the above "Detecting..."),
> > the boot process on this cluster no longer falls into
> > a rescue shell but instead into a bash shell. So I
> > can't look at the repository.
> That's also very strange; that should never happen ;-) . Is this only
> while "Detecting..." or in any other cases?
> Again, logs would be very interesting.
> >
> > One difference I've used with this cluster is that I did
> > the com.../mkinitrd with your new "-l" option. I wonder
> > if maybe this option is failing to copy a shared library
> > or something else the rescueshell needs so the rescueshell
> > fails to work?
> No, I don't think so. The -l option reduces the number of
> modules loaded into initrd. That's it. So I would doubt it.
> >
> > Thanks,
> > Dan
> Thanks Marc.
>
> --
> Gruss / Regards,
>
> Marc Grimme
> http://www.atix.de/ http://www.open-sharedroot.org/
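
On the "-q" suggestion above: modprobe's -q (--quiet) option suppresses the error message for a module that cannot be found, while still returning a non-zero exit status. A minimal sketch of how a boot script might use it follows. The helper name load_module_quietly and the MODPROBE override are hypothetical, chosen for illustration, and are not taken from the comoonics-bootimage scripts.

```shell
# Hypothetical helper, not part of comoonics-bootimage.
# MODPROBE may be overridden (e.g. for testing); defaults to the real modprobe.
MODPROBE="${MODPROBE:-modprobe}"

load_module_quietly() {
    module="$1"
    # -q keeps modprobe from printing "FATAL: Module ... not found",
    # but the exit status still tells us whether the load succeeded.
    if "$MODPROBE" -q "$module"; then
        return 0
    fi
    # Report the failure once, in a form that does not look fatal.
    echo "module $module not loaded (continuing)" >&2
    return 1
}
```

A caller that can live without the module would ignore the return value; one that cannot (for example the DLM filesystem module mentioned above) could check it and drop to the rescue shell.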