Moab_Adaptive_Computing

Note: this is a user-contributed document. While this howto contains some very useful information, it is not necessarily updated for new versions of xCAT and therefore may be inaccurate in some cases. Use it with this in mind. We welcome corrections and contributions - see Editing xCAT Documentation. And see the [XCAT_Documentation] page for documentation that is tested and updated with each xCAT release (to the best of our ability).

Moab Adaptive Computing Installation

1) Introduction
2) Prerequisites
3) Install Moab on the management node
4) Configure Moab using the moab.cfg file
5) Install Moab Service Manager (MSM) on the management node
6) Configure MSM using the msm.cfg file
7) Verify the installation

1) Introduction

Moab Adaptive Computing Suite (Moab®) can dynamically provision compute machines to requested operating systems and power off compute machines when not in use.

2) Prerequisites

Set up and configure xCAT correctly according to the xCAT documentation (xcat.wiki.sourceforge.net/HowTos).

Test all nodes that Moab will control through xCAT, verifying that they respond to the rpower, nodestat, and rinstall xCAT commands for every os/arch/profile combination that jobs submitted to Moab will use.
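For example, with a node group named moab (the group created for Moab later in this howto), the basic checks might look like the following; the node and group names are only placeholders:

# Verify power control and status reporting for every node Moab will manage
rpower moab stat
nodestat moab

# Verify provisioning on a test node; repeat for each os/arch/profile
# combination that jobs will request
rinstall n01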

This setup is currently known to work with nodes using the xCAT IPMI and iLO hardware management plugins.

Have a valid Moab license file (moab.lic) with provisioning and green computing enabled. For information on acquiring an evaluation license, please contact info@clusterresources.com.

3) Install Moab on the management node

Moab is the intelligence engine that coordinates the capabilities of xCAT and Torque to dynamically provision compute nodes to the requested operating system. Moab also schedules workload on the system and powers off idle nodes.

Download Moab from the Cluster Resources website:

http://www.clusterresources.com/product/mwm/index.php

Install Moab following the installation documentation:

http://www.clusterresources.com/moabdocs/2.0installation.shtml
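Follow the linked documentation for the details; a typical tarball build on the management node looks roughly like the following (the version number is a placeholder, and the install prefix matches the paths used later in this howto):

# Unpack, build, and install Moab (version number is a placeholder)
tar xzf moab-<version>.tar.gz
cd moab-<version>
./configure --prefix=/opt/moab
make
make install

# Place the license file in the Moab home directory
cp moab.lic /opt/moab/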

4) Configure Moab using the moab.cfg file
Moab stores its configuration in the moab.cfg file (/opt/moab/moab.cfg). A sample configuration file, set up and optimized for adaptive computing, follows:

# Example moab.cfg

SCHEDCFG[Moab]          SERVER=gpc-sched:42559
ADMINCFG[1]             USERS=root,egan

LOGLEVEL                7

# How often (in seconds) to refresh information from Torque and MSM
RMPOLLINTERVAL       60

RESERVATIONDEPTH        10
DEFERTIME               0

###################################################################
# Location of msm directory                                       #
# www.clusterresources.com/moabdocs/a.fparameters.shtml#toolsdir  #
###################################################################

TOOLSDIR                /opt/moab/tools


###############################################################################
# TORQUE and MSM configuration                                                #
# http://www.clusterresources.com/products/mwm/docs/a.fparameters.shtml#rmcfg #
###############################################################################

RMCFG[torque]           TYPE=PBS

RMCFG[msm]        TYPE=NATIVE:msm FLAGS=autosync,NOCREATERESOURCE RESOURCETYPE=PROV
RMCFG[msm]        TIMEOUT=60
RMCFG[msm]        PROVDURATION=10:00

AGGREGATENODEACTIONS    TRUE

###############################################################################
# ON DEMAND PROVISIONING SETUP                                                #
# www.clusterresources.com/moabdocs/3.5credoverview.shtml#qos                 #
# www.clusterresources.com/moabdocs/5.2nodeallocation.shtml#PRIORITY          #
# www.clusterresources.com/moabdocs/a.fparameters.shtml#jobprioaccrualpolicy  #
###############################################################################

QOSCFG[od]              QFLAGS=PROVISION
USERCFG[DEFAULT]        QLIST=od

JOBPRIOACCRUALPOLICY    ACCRUE
JOBPRIOEXCEPTIONS       BATCHHOLD,SYSTEMHOLD,DEPENDS

NODEALLOCATIONPOLICY    PRIORITY
NODECFG[DEFAULT]        PRIORITYF=1000*OS+1000*POWER

NODEAVAILABILITYPOLICY  DEDICATED

CLASSCFG[DEFAULT]       DEFAULT.OS=scinetcompute

###############################################################
# GREEN POLICIES                                              #
# www.clusterresources.com/moabdocs/23.0greencomputing.shtml  #
###############################################################

NODECFG[DEFAULT]        POWERPOLICY=GREEN
PARCFG[ALL]             NODEPOWEROFFDURATION=20:00

NODEIDLEPOWERTHRESHOLD  600

# END Example moab.cfg
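After editing moab.cfg, Moab has to reread its configuration. If the scheduler is already running, it can be recycled with mschedctl; otherwise the changes are picked up the next time Moab starts.

# Recycle a running Moab so it rereads moab.cfg
mschedctl -R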

5) Install Moab Service Manager (MSM) on the management node

Configuration in xCAT:

Create a node group with a name of your choosing (for example, moab) and add all nodes Moab will manage to that group.
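For example, using the xCAT object definition commands, adding nodes n01-n10 to a group named moab could look like this (the node names are only placeholders):

# Append the group 'moab' to the groups attribute of nodes n01-n10
chdef -t node -o n01-n10 -p groups=moab

# Verify the group membership
nodels moab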

Compile a list of the combinations of OS, architecture, and profile (as defined in the xCAT setup), plus any node features, that will be used. Generate a text file that contains these combinations in the format that follows (later referred to as _IMAGESFNAME).

# Example image definition file
#
# image_name => arbitrary name, specified by jobs at submission time
# arch       => arch as used by xCAT for provisioning a node
# os         => os as used by xCAT for provisioning a node
# profile    => profile as used by xCAT for provisioning a node
# nodeset    => netboot|install - is the image stateless or stateful
# feature    => arbitrary feature names, used to identify features of a node
#

# image_name     arch    os          profile    nodeset   features

# physics group
phys_a           x86     centos5.2   gaussian   netboot   infiniband,bigmem
phys_b           x86_64  centos5.2   gaussian   netboot   infiniband,bigmem
phys_c           x86     centos5.1   gaussian   netboot   infiniband
phys_d           x86_64  centos5.1   gaussian   netboot   infiniband,bigmem
phys_e           x86_64  sles11      vasp       netboot   storage,video

# biology group
bio_a           x86     centos5.2    dft        netboot   infiniband,bigmem
bio_b           x86_64  centos5.2    dft        netboot   infiniband,bigmem
bio_c           x86     centos5.1    dft
bio_d           x86_64  centos5.1    dft        netboot   infiniband,bigmem
bio_e           x86_64  sles11       dft        netboot   storage,video

# END Example image definition file

Installing/Configuring MSM:

Ensure the following Perl modules are available:
DBD::SQLite
Proc::Daemon
XML::Simple

Create an MSM directory (/opt/moab/tools/msm).
Extract msm.tgz to /opt/moab/tools/msm.
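A minimal sketch of those steps, assuming the required Perl modules are installed from CPAN and that msm.tgz has already been downloaded to the current directory:

# Install the required Perl modules if they are missing (CPAN is one option)
cpan DBD::SQLite Proc::Daemon XML::Simple

# Create the MSM directory and unpack the MSM tarball into it
mkdir -p /opt/moab/tools/msm
tar xzf msm.tgz -C /opt/moab/tools/msm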

6) Configure MSM using the msm.cfg file

Create an MSM configuration (/opt/moab/tools/msm/msm.cfg) as follows:

# Example msm.cfg

RMCFG[msm]        PORT=24603
RMCFG[msm]        POLLINTERVAL=10
RMCFG[msm]        LOGFILE=/opt/moab/log/msm.log
RMCFG[msm]        LOGLEVEL=8
RMCFG[msm]        DEFAULTNODEAPP=xcat

APPCFG[xcat]      DESCRIPTION="xCAT plugin"
APPCFG[xcat]      MODULE=Moab::MSM::App::xCAT
APPCFG[xcat]      REPORTSTATE=FALSE
APPCFG[xcat]      LOGLEVEL=8

# The xCAT group name containing all nodes Moab will be managing
APPCFG[xcat]       _NODERANGE=moab

# This value (in seconds) should be greater than the time it takes
# 'nodestat moab_group_name' + 'rpower moab_group_name' to complete
# on your cluster.
APPCFG[xcat]       POLLINTERVAL=30

# Timeout for nodestat + rpower commands, assumed to have failed if not done
# in this much time (seconds).
APPCFG[xcat]       _TIMEOUT=300

# Your xCAT feature group names
APPCFG[xcat]       _FEATUREGROUPS=infiniband,bigmem,video

# Full path to your image definition file, make sure it is readable
# by the effective UID moab will be running as.
APPCFG[xcat]       _IMAGESFNAME=/opt/moab/tools/msm/images.txt

# Use this configuration parameter to have MSM perform nodestat/rpower
# operations in small groups that are executed in parallel.  This is useful
# if nodestat or rpower commands take considerable time on your cluster.
APPCFG[xcat]       _MAXRANGECOUNT=10

# Use this configuration parameter to tell MSM not to start new cluster
# query operations while a previous one is still executing.
APPCFG[xcat]       _LIMITCLUSTERQUERY=1


# END Example msm.cfg

7) Verify the installation

When Moab starts, it immediately communicates with its configured resource managers. In this case Moab communicates with Torque to get compute node and job queue information. It then communicates with MSM to determine the state of the nodes according to xCAT. It aggregates this information and processes the jobs discovered from Torque.
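If Moab is not yet running, start it as root; the moab binary daemonizes on its own (this assumes the binary installed above is on root's PATH):

# Start the Moab scheduler daemon as root
moab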

When a job is submitted, Moab determines whether nodes need to be provisioned to a particular operating system to satisfy the requirements of the job. If any nodes need to be provisioned, Moab performs this action by creating a provisioning system job (a job that is internal to Moab). This system job communicates with xCAT to provision the nodes and remains active while the nodes are provisioning. Once the system job has provisioned the nodes, it informs the user’s job that the nodes are ready, at which time the user’s job starts running on the newly provisioned nodes.

When a node has been idle for NODEIDLEPOWERTHRESHOLD seconds, Moab creates a power-off system job. This job communicates with xCAT to power off the nodes and remains active in the job queue until the nodes have powered off. The system job then informs Moab that the nodes are powered off but still available to run jobs, and exits.

To verify correct communication between Moab and MSM, run the mdiag -R -v msm command.

$ mdiag -R -v msm
diagnosing resource managers

RM[msm]       State: Active  Type: NATIVE:MSM  ResourceType: PROV
  Timeout:            30000.00 ms
  Cluster Query URL:  $HOME/tools/msm/contrib/cluster.query.xcat.pl
  Workload Query URL: exec://$TOOLSDIR/msm/contrib/workload.query.pl
  Job Start URL:      exec://$TOOLSDIR/msm/contrib/job.start.pl
  Job Cancel URL:     exec://$TOOLSDIR/msm/contrib/job.modify.pl
  Job Migrate URL:    exec://$TOOLSDIR/msm/contrib/job.migrate.pl
  Job Submit URL:     exec://$TOOLSDIR/msm/contrib/job.submit.pl
  Node Modify URL:    exec://$TOOLSDIR/msm/contrib/node.modify.pl
  Node Power URL:     exec://$TOOLSDIR/msm/contrib/node.power.pl
  RM Start URL:       exec://$TOOLSDIR/msm/bin/msmd
  RM Stop URL:        exec://$TOOLSDIR/msm/bin/msmctl?-k
  System Modify URL:  exec://$TOOLSDIR/msm/contrib/node.modify.pl
  Environment:        MSMHOMEDIR=/home/wightman/test/scinet/tools//msm;MSMLIBDIR=/home/wightman/test/scinet/tools//msm
  Objects Reported:   Nodes=10 (0 procs)  Jobs=0
  Flags:              autosync
  Partition:          SHARED
  Event Management:   (event interface disabled)
  RM Performance:     AvgTime=0.10s  MaxTime=0.25s  (38 samples)
  RM Languages:       NATIVE
  RM Sub-Languages:   -

To verify that nodes are configured to provision, use the checknode -v <nodeid> command. Each node will show a list of available operating systems.

$ checknode n01
node n01

State:      Idle  (in current state for 00:00:00)
Configured Resources: PROCS: 4  MEM: 1024G  SWAP: 4096M  DISK: 1024G
Utilized   Resources: ---
Dedicated  Resources: ---
Generic Metrics:    watts=25.00,temp=40.00
Power Policy:       Green (global policy)   Selected Power State: Off
Power State:   Off
Power:      Off
  MTBF(longterm):   INFINITY  MTBF(24h):   INFINITY
Opsys:      compute   Arch:      ---
  OS Option: compute
  OS Option: computea
  OS Option: gpfscompute
  OS Option: gpfscomputea
Speed:      1.00      CPULoad:   0.000
Flags:      rmdetected
RM[msm]:    TYPE=NATIVE:MSM  ATTRO=POWER
EffNodeAccessPolicy: SINGLEJOB

Total Time: 00:02:30  Up: 00:02:19 (92.67%)  Active: 00:00:11 (7.33%)

To verify that nodes are configured for Green power management, run the mdiag -G command. Each node will show its power state.

$ mdiag -G
NOTE:  power management enabled for all nodes
Partition ALL:  power management enabled
  Partition NodeList:
Partition local:  power management enabled
  Partition NodeList:
  node n01 is in state Idle, power state On (green powerpolicy enabled)
  node n02 is in state Idle, power state On (green powerpolicy enabled)
  node n03 is in state Idle, power state On (green powerpolicy enabled)
  node n04 is in state Idle, power state On (green powerpolicy enabled)
  node n05 is in state Idle, power state On (green powerpolicy enabled)
  node n06 is in state Idle, power state On (green powerpolicy enabled)
  node n07 is in state Idle, power state On (green powerpolicy enabled)
  node n08 is in state Idle, power state On (green powerpolicy enabled)
  node n09 is in state Idle, power state On (green powerpolicy enabled)
  node n10 is in state Idle, power state On (green powerpolicy enabled)
Partition SHARED:  power management enabled

To submit a job that dynamically provisions compute nodes, run the msub -l os=<image> command.

$ msub -l os=computea job.sh

yuby.3
$ showq

active jobs------------------------
JOBID              USERNAME      STATE PROCS   REMAINING            STARTTIME

provision-4            root    Running     8    00:01:00  Fri Jun 19 09:12:56

1 active job               8 of 40 processors in use by local jobs (20.00%)
                           2 of 10 nodes active      (20.00%)

eligible jobs----------------------
JOBID              USERNAME      STATE PROCS     WCLIMIT            QUEUETIME

yuby.3             wightman       Idle     8    00:10:00  Fri Jun 19 09:12:55

1 eligible job

blocked jobs-----------------------
JOBID              USERNAME      STATE PROCS     WCLIMIT            QUEUETIME


0 blocked jobs

Total jobs:  2

Notice that Moab created a provisioning system job named provision-4 to provision the nodes. When provision-4 detects that the nodes are correctly provisioned to the requested OS, the submitted job yuby.3 runs:

$ showq

active jobs------------------------
JOBID              USERNAME      STATE PROCS   REMAINING            STARTTIME

yuby.3             wightman    Running     8    00:08:49  Fri Jun 19 09:13:29

1 active job               8 of 40 processors in use by local jobs (20.00%)
                           2 of 10 nodes active      (20.00%)

eligible jobs----------------------
JOBID              USERNAME      STATE PROCS     WCLIMIT            QUEUETIME


0 eligible jobs

blocked jobs-----------------------
JOBID              USERNAME      STATE PROCS     WCLIMIT            QUEUETIME


0 blocked jobs

Total jobs:  1

The checkjob command shows information about the provisioning job as well as the submitted job. If any errors occur, run the checkjob -v <jobid> command to diagnose failures.
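For the example above, that means:

# Inspect the provisioning system job and the user job
checkjob -v provision-4
checkjob -v yuby.3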

© 2009 Cluster Resources, Incorporated


