
#782 Maui stops at the re-boot

4.2.1 (deprecated)
open-fixed
9
2006-04-24
2006-04-04
DongInn Kim
No

Hi,

I tested OSCAR 4.2.1 r4500 on FC3 x86 and found a
problem with the Maui service after the head node
rebooted.

After the head node rebooted I ran oscar_wizard and
did step 8, "Test Cluster Setup". Every test related
to the Maui service failed because Maui was not
running.

I experimented with the Maui service by starting Maui
manually to see what happens.
It runs for only about 10 to 15 seconds and then dies.
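
The manual check described above can be scripted. The following is a minimal sketch, not part of OSCAR or Maui: `survives` is a hypothetical helper, and it assumes the command under test stays in the foreground (a daemon that forks into the background would need a PID-file or process-table check instead). The demo at the bottom uses a stand-in Python command rather than the real scheduler binary.

```python
import subprocess
import sys
import time


def survives(cmd, seconds, poll_interval=0.5):
    """Start cmd and watch it for `seconds`.

    Returns (True, None) if the process is still running at the end of the
    window, or (False, exit_code) if it exited early.
    """
    proc = subprocess.Popen(cmd)
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        code = proc.poll()  # None while the process is still running
        if code is not None:
            return False, code
        time.sleep(poll_interval)
    # Still alive after the observation window: stop it and report success.
    proc.terminate()
    proc.wait()
    return True, None


if __name__ == "__main__":
    # Stand-in command that exits after one second, mimicking a service
    # that dies shortly after start-up.
    stand_in = [sys.executable, "-c", "import time; time.sleep(1)"]
    print(survives(stand_in, seconds=3))
```

An observation window of 20 to 30 seconds would cover the 10 to 15 second lifetime reported here; the exit code returned for an early death can then be compared against the scheduler's own log.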

I have no idea why this happens after the head node reboots.

Before rebooting the head node I had been testing
PackageInUn.pm by uninstalling and re-installing some
packages. I wonder whether the Maui issue has something
to do with that testing. Next time I will try to
reproduce the Maui issue without running the
PackageInUn.pm tests.

Discussion

  • Bernard Li

    Bernard Li - 2006-04-04

    Logged In: YES
    user_id=879102

    More details:

    This also happens for me when I uninstall PVM and re-run the
    setup test: the TORQUE shell test fails, and I found that
    all the jobs are actually stuck in the queue and never run
    (because the Maui scheduler is not running).

    I tested on RHEL4u3 x86.

     
  • John

    John - 2006-04-05

    Logged In: YES
    user_id=505737

    Well, looks like we've got some kinda problem. Bumping.

     
  • John

    John - 2006-04-05
    • priority: 8 --> 9
     
  • Bernard Li

    Bernard Li - 2006-04-05

    Logged In: YES
    user_id=879102

    So what's the fix here? Rebuild 3.2.6p14?

     
  • Bernard Li

    Bernard Li - 2006-04-13
    • assigned_to: nobody --> naughtont
     
  • Thomas Naughton

    Thomas Naughton - 2006-04-18

    Logged In: YES
    user_id=288102

    I have tested "maui-oscar-3.2.6p14-1" with
    "oscar-4.2.1b6r4484" and do not see any Maui problems:
    1) after restarting compute nodes
    2) after restarting the headnode

    After #1 and #2 I was able to test the cluster and get
    successful results. The 'service maui status' showed maui
    and friends running.

    Note, I do see strange failures with "PVM (via Torque)" on
    the initial test and have to CTRL+C, but subsequent tests
    worked just fine with no change other than re-running the
    tests...??? But I don't believe that is related to the
    version of Maui in use, so the problem this bug was
    filed for seems fixed based on my tests on FC4/x86.

     
  • Thomas Naughton

    Thomas Naughton - 2006-04-18

    Logged In: YES
    user_id=288102

    So, we need to upgrade the version of Maui in branch-4-2.
    The simplest thing is to just reuse the RPMS in trunk which
    are based on maui-oscar-3.2.6p14-1.

    I know that trunk is using the distro/ dir with a call to
    generic-setup from maui/scripts/setup. So, we'll have to
    update that, along with the config.xml version info.

    I can make these changes, or at least copy things from trunk.
    I'm not sure what the supported distro list is for 4.2.1, or
    whether it includes items not supported in trunk.

    Lastly, there are a few minor updates I'd like to do to the
    scripts/ area: remove dead code in server uninstall, and fix
    mode on post_clients file.

    I'll do these changes (minus any new rebuilds for non-trunk
    supported distros) today if that is ok.

     
  • Thomas Naughton

    Thomas Naughton - 2006-04-18

    Logged In: YES
    user_id=288102

    # TJN: 4/18/06
    # A quick audit of SVN for existing (pre-built) Maui RPMS in trunk.

    # Supported oscar-4.2.1 distro/arches that are MISSING from trunk/
    # for the Maui package.

    TODO:
    fc2-x86
    mdk10.1-x86
    mdk10-x86
    rhel3-ia64
    rhel4-ia64

    # Supported oscar-4.2.1 distro/arches that are AVAILABLE from trunk/
    # for the Maui package.

    DONE:
    rhel4-x86
    rhel4-x86_64
    fc3-x86
    fc3-x86_64
    fc4-x86
    rhel3-x86
    rhel3-x86_64
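
The audit above amounts to a set difference between the distro/arch targets supported by oscar-4.2.1 and those that already have Maui RPMS in trunk. A minimal sketch of that comparison follows; the `audit` helper is hypothetical, the supported set is assumed to be the union of the two lists in this comment, and the real audit was done by inspecting SVN by hand.

```python
# Targets with pre-built Maui RPMS in trunk (the DONE list above).
IN_TRUNK = {
    "rhel4-x86", "rhel4-x86_64",
    "fc3-x86", "fc3-x86_64",
    "fc4-x86",
    "rhel3-x86", "rhel3-x86_64",
}

# Assumption: the oscar-4.2.1 supported set is the TODO and DONE lists combined.
SUPPORTED = IN_TRUNK | {
    "fc2-x86", "mdk10.1-x86", "mdk10-x86", "rhel3-ia64", "rhel4-ia64",
}


def audit(supported, available):
    """Split supported targets into (todo, done) relative to what is available."""
    todo = sorted(supported - available)  # supported but missing from trunk
    done = sorted(supported & available)  # supported and already built
    return todo, done


if __name__ == "__main__":
    todo, done = audit(SUPPORTED, IN_TRUNK)
    print("TODO:", *todo, sep="\n  ")
    print("DONE:", *done, sep="\n  ")
```

Keeping the two sets as data makes it easy to re-run the comparison if the supported list changes, for example if fc2-x86 turns out not to be supported.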

     
  • Thomas Naughton

    Thomas Naughton - 2006-04-18

    Logged In: YES
    user_id=288102

    It appears that fc2-x86 may not be supported.
    Also, the powers that be suggest just using RHEL3-built RPMS
    across all distros. So.... stay tuned oscar fans! :)

     
  • Bernard Li

    Bernard Li - 2006-04-24
    • assigned_to: naughtont --> bernardli
    • status: open --> open-fixed
     
  • Bernard Li

    Bernard Li - 2006-04-24

    Logged In: YES
    user_id=879102

    Maui 3.2.6p14 RHEL3 x86|x86_64|ia64 RPMs checked into
    branch-4-2 r4615,4616.

     