From: Nicholas H. <he...@se...> - 2003-11-13 22:01:23

On a sourceforge mirror near you:

Name    : Clubmask
Version : 0.6
Release : b1
Group   : Cluster Resource Management and Scheduling
Vendor  : Liniac Project, University of Pennsylvania
License : GPL-2
URL     : http://clubmask.sourceforge.net
Download: http://sourceforge.net/project/showfiles.php?group_id=1316&release_id=197383

What is Clubmask
------------------------------------------------------------------------------
Clubmask is a resource manager designed to allow Bproc-based clusters to enjoy the full scheduling power and configuration of the Maui HPC Scheduler. Clubmask uses a modified version of the Supermon resource monitoring software to gather resource information from the cluster nodes. This information is combined with job submission data and delivered to the Maui scheduler. Maui issues job control commands back to Clubmask, which then starts or stops the job scripts using the Bproc environment.

Clubmask also provides built-in support for a supermon2ganglia translator that allows a standard Ganglia web backend to contact supermon and get XML data that will display through the Ganglia web interface.

Clubmask is currently running on around 10 clusters, varying in size from 8 to 128 nodes, and has been tested up to 5000 jobs.

Notes/warnings on this release:
------------------------------------------------------------------------------
Before upgrading, please make sure to save your /etc/clubmask/clubmask.conf file, as it may get overwritten. There are a few new variables in clubmask.conf, so beware! To use the resource requests, you must be running the latest snapshot of Maui.

Changes since 0.5:
------------------------------------------------------------------------------
- Change the job name from the god-awful absolute timestamp to a more normal "string.number" format, where "string" is an arbitrary job name and "number" is the Nth time that the job name is being used, e.g. root.1, root.2, ...
- Fix cmnodesshknownhosts to get the -n information from the bproc node number that is given as the argument.
- Update to latest supermon APIs.
- Feature Request #790938: add 'cmsubmit -r <resid>' to run a job in a Maui reservation.
- Fixed bug #791396: make sure processes get killed in interactive jobs.
- Make sure bproc is running when starting resource_manager.
- Fix cmsubmit -h; it is now cleaner and easier to understand.
- Add support for resource requirements on the nodes. Swap, mem, disk, qos, reservation, and processors per node are supported now; see cmsubmit -h for more information.
- Add infrastructure for architecture, os, network, and arbitrary features as node resource requests. We do not get this information dynamically yet, so there is no need to let people muck with it.
- Add a supermon_state daemon to manage the node list for supermon; this keeps that logic out of resource_manager.
- Make sure there is at most one 'R' command in the pipeline for down nodes at any given time. There is no sense in asking nodes to revive if they have not responded to the last request yet.
- Clean up setup to perform RPM builds more cleanly.
- Split /etc/clubmask/clubmask.conf into /etc/clubmask/{system,clubmask}.conf so that variables that need user editing live in clubmask.conf and the rest of the system variables live in system.conf. This lets a user update to a newer version of Clubmask and just copy over the old clubmask.conf to restore their configuration.
- Migrate all docs from Docbook XML to LyX/LaTeX. All of the docs -- pdf, html single, and html multiple -- can be generated with a simple 'make' in the docs/ directory.
- Add --secret-key to setup.py args for building maui and clubmask with the same checksum key. This removes the need to edit setup.py when installing clubmask.

Links
-------------
Bproc:          http://bproc.sourceforge.net
Ganglia:        http://ganglia.sourceforge.net
Maui Scheduler: http://www.supercluster.org/maui
Supermon:       http://supermon.sourceforge.net

Cheers~
Nic
--
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania

From: <er...@he...> - 2003-11-10 21:26:05

On Thu, Nov 06, 2003 at 10:27:37AM -0500, gor...@ph... wrote:
> Never saw this message before;
>
> bproc: iod: no daemon present to forward IO
>
> What does this message mean?

It means there's no daemon :) Normally the bproc master and slave daemons fork to create an I/O daemon. The purpose of that daemon is basically to copy I/O from a remote job to whatever file descriptor used to be on that job's stdout. It's a bit of a hack to make printf() work.

> Subsequent to this message, jobs which are rfork'd out to a slave node die
> as soon as they begin to produce output.
> Jobs which are bpsh'ed out work ok. bproc version 3.2.5 on kernel 2.4.21.

Are you seeing this on the master or slave? I presume the daemon is getting created and dying for some reason. Is there something in particular that you're doing to cause this problem? I haven't seen this one myself. My only guess as far as a cause is concerned would be resource exhaustion of some kind.

- Erik

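The forwarding Erik describes is essentially a copy loop: the I/O daemon reads whatever the remote job writes and pushes it to the descriptor that used to be the job's stdout on the front end. The following is a minimal illustrative sketch, not BProc's actual iod code; the descriptor names are assumptions.

```c
#include <unistd.h>

/* Minimal illustration of an I/O-forwarding loop (not the real bproc iod):
 * copy everything arriving from the remote job's connection to the file
 * descriptor that used to be the job's stdout on the front end.
 * 'remote_fd' and 'saved_stdout_fd' are hypothetical names. */
static int forward_io(int remote_fd, int saved_stdout_fd)
{
    char buf[4096];
    ssize_t n;

    while ((n = read(remote_fd, buf, sizeof(buf))) > 0) {
        ssize_t off = 0;
        while (off < n) {                       /* handle short writes */
            ssize_t w = write(saved_stdout_fd, buf + off, n - off);
            if (w < 0)
                return -1;                      /* front-end fd went away */
            off += w;
        }
    }
    return (n < 0) ? -1 : 0;                    /* 0 on clean EOF */
}
```
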
From: <er...@he...> - 2003-11-10 21:19:39

On Fri, Oct 31, 2003 at 12:08:09AM +0100, J.A. Magallon wrote:
> Hi all...
>
> I would like to try bproc-4.0.0-pre1, but I have a couple questions:
> - What advantages does it offer vs 3.2.6 ?
> - Is it source-compatible, for example, to build MPICH-1.2.5.2 ?

The biggest change is the "bpfs" virtual file system stuff for node status. This completely replaces the old way of getting node status information, which involved talking to the master daemon. The advantage is that it's MUCH faster for doing the kinds of things a scheduler does (chmod, chown, etc.). On our 1024 node system, the scheduler is about 100x faster as a result. Opteron support is also only in 4.0.0pre1 at this point. That should be easy to copy to the other one, though.

There are a bunch of smallish API changes too. The goal there was to clean things up a bit and get rid of some crud. That's definitely a work in progress.

The MPICH hacks are going to require some changes. Somebody here has done them, so it's probably time to create a new tarball of that stuff. This should be done for "Clustermatic 4" in about a week or so.

- Erik

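Erik's description of bpfs boils down to node state being exposed as files, so scheduler operations such as handing a node to a user become plain chmod/chown calls. The sketch below is only illustrative: the /bpfs mount point and per-node path are assumptions, not a documented layout.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Illustrative only: the bpfs mount point and per-node file layout shown
 * here ("/bpfs/<node number>") are assumptions, not a documented API.
 * The point is that node ownership/permissions become ordinary file
 * operations instead of round trips to the master daemon. */
int assign_node_to_user(int node, uid_t uid, gid_t gid)
{
    char path[64];

    snprintf(path, sizeof(path), "/bpfs/%d", node);   /* hypothetical path */

    if (chown(path, uid, gid) != 0)      /* give the node to the job owner */
        return -1;
    if (chmod(path, 0700) != 0)          /* and make it exclusive */
        return -1;
    return 0;
}
```
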
From: <gor...@ph...> - 2003-11-06 15:29:46

Never saw this message before;

bproc: iod: no daemon present to forward IO

What does this message mean? Subsequent to this message, jobs which are rfork'd out to a slave node die as soon as they begin to produce output. Jobs which are bpsh'ed out work ok. bproc version 3.2.5 on kernel 2.4.21.

From: J.A. M. <jam...@ab...> - 2003-10-30 23:08:17

Hi all...

I would like to try bproc-4.0.0-pre1, but I have a couple questions:
- What advantages does it offer vs 3.2.6 ?
- Is it source-compatible, for example, to build MPICH-1.2.5.2 ?

TIA

--
J.A. Magallon <jamagallon()able!es>     \  Software is like sex:
werewolf!able!es                         \  It's better when it's free
Mandrake Linux release 9.2 (Cooker) for i586
Linux 2.4.23-pre8-jam2 (gcc 3.3.1 (Mandrake Linux 9.2 3.3.1-4mdk))

From: <er...@he...> - 2003-10-30 15:15:19

On Tue, Oct 28, 2003 at 03:16:10PM -0800, Dale Harris wrote:
> So has anyone done any preliminary work to get bproc working with the
> 2.6 kernel?

Not yet. It's at the top of the to-do list, though. I hope to get to it after Supercomputing '03 (which is about 2 weeks from now)...

- Erik

From: Dale H. <ro...@ma...> - 2003-10-28 23:18:21

So has anyone done any preliminary work to get bproc working with the 2.6 kernel?

--
Dale Harris
ro...@ma...
/.-)

From: MIYOSHI,DENNIS (HP-Loveland,ex1) <den...@hp...> - 2003-10-26 01:28:10

I actually got the Red Hat 9.0 TFTP server to work with PXE. I start it with the supplied service definition in /etc/xinetd.d. I also used the Red Hat 9.0 DHCP supplied RPM.

Best regards,
Dennis E. Miyoshi, PE
Hendrix Release Manager
Hewlett-Packard Company
825 14th Street, S.W., MS E-200
Loveland, CO 80537
(970) 898-6110

-----Original Message-----
From: bpr...@li... [mailto:bpr...@li...] On Behalf Of Larry Baker
Sent: Friday, October 24, 2003 11:33 AM
To: bpr...@li...
Subject: [BProc] Red Hat PXE fixes for Clustermatic 3

My system is a 4 node Linux Beowulf cluster, using the Clustermatic 3 kit. I was not able to get the PXE package that came with Red Hat 8.0 to work (pxe-0.1-33). I tried the newer version that comes with Red Hat 9 (pxe-0.1-36), but it would not work either. I downloaded the source RPMs for the two versions and found they were almost identical. So, I modified the Red Hat 9 PXE package. The changes I made are:

1. The PXE package includes a multicast TFTP server, /usr/sbin/in.mtftpd. However, it does not include a service definition file for mtftp in /etc/xinetd.d. I fixed that. Note: On Red Hat 8 you still have to manually edit /etc/services to add entries for pxe and mtftp. They are already there in Red Hat 9.

2. The linux.0 layer 0 bootstrap sets up the downloaded initrd ram disk as /dev/ram1 (0x0101), but it is mounted on /dev/ram0 (0x0100). Without a kernel command line "root=/dev/ram0" to override it, the boot fails. I changed the code in prepare.c to set up the ram disk as /dev/ram0.

3. By default, pxe-0.1-33 redirects the console to COM1; pxe-0.1-36 does not. The default kernel command line is hard-coded into linux.0. To override it, you must manually enter a replacement at the console. I modified linux.c and download.c to add an optional layer 3 file containing the default kernel command line (like the APPEND option in syslinux/pxelinux).

Unfortunately, PXE can select only one default kernel/initrd/command line combination. This is not a problem for me, since I have a small, homogeneous cluster. Installation instructions are in the HTML file, pxe-0.1-36a.htm.

Larry Baker
US Geological Survey
ba...@us... <mailto:ba...@us...>

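For readers puzzling over item 2, the device numbers 0x0101 and 0x0100 are just the old-style packed major/minor pairs for /dev/ram1 and /dev/ram0 (RAM disk major 1, minors 1 and 0). The fragment below is a hedged illustration of the kind of one-line change described for prepare.c, not the actual Red Hat source; the field name in the comment is hypothetical.

```c
/* Illustrative only -- not the actual Red Hat prepare.c code.
 * The old 16-bit dev_t encoding is (major << 8) | minor, so the RAM disk
 * devices are:
 *   /dev/ram0 -> (1 << 8) | 0 == 0x0100
 *   /dev/ram1 -> (1 << 8) | 1 == 0x0101
 * The fix described above amounts to handing the bootstrap 0x0100 (where
 * the initrd is actually mounted) rather than 0x0101. */
#define RAMDISK_MAJOR 1

static unsigned short ramdisk_dev(int minor)
{
    return (unsigned short)((RAMDISK_MAJOR << 8) | minor);
}

/* e.g. boot_params.root_dev = ramdisk_dev(0);   -- hypothetical field name */
```
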
From: Greg W. <gw...@la...> - 2003-10-22 21:48:07

At 1:46 PM -0600 22/10/03, er...@he... wrote:
> On Wed, Oct 22, 2003 at 06:57:20AM +0200, Francois Thomas wrote:
>> Has someone already tested bproc on IBM Power4 running Linux ?
>> I have seen some notes about ppc support but would bproc work on ppc64 ? Or
>> what would it take to run ?
>> I am running a cluster of SLES8 ppc64 nodes and would like to give a try at
>> bproc.
>
> I haven't tried that one. We haven't got any Power4 machines around
> here. If it's reasonably similar to the 32 bit power pc (I've tried
> G3 and G4) it should be easy to do a port.
>
> - Erik

It's on my todo list when I have some time and when I can get a version of Linux that works on the G5.

Greg

From: <er...@he...> - 2003-10-22 21:21:03

On Tue, Oct 21, 2003 at 12:48:27PM -0400, Nicholas Henke wrote:
> On Tue, 2003-10-21 at 11:54, er...@he... wrote:
> > I got an oops out of the NMI watchdog which was enlightening (or at
> > least indicated which code was at fault). The following patch may
> > have fixed it for me. I say "may have" since I've had some trouble
> > reproducing the problem reliably.
> >
> > This patch turns off "sigbypass" which is a little optimization where
> > a process sending a signal to a ghost doesn't bother the ghost.
> > Instead it just throws a signal forwarding message right on the
> > message queue. I'm not sure how the code is broken. I haven't had
> > time to look into it yet.
> >
> > Please give it a try and let me know if you still see the deadlock.
>
> Thanks for the quick patch, I am running now to see if it deadlocks.
>
> BTW -- did you do anything special to get NMI to dump an oops for you?
> Can you tell me the basic setup -- I am just booting with
> nmi_watchdog=1, and I see the interrupts in /proc/interrupts. Does
> something more need to be done ?

That's all I did.

- Erik

From: <er...@he...> - 2003-10-22 20:49:55

On Wed, Oct 22, 2003 at 06:57:20AM +0200, Francois Thomas wrote:
> Has someone already tested bproc on IBM Power4 running Linux ?
> I have seen some notes about ppc support but would bproc work on ppc64 ? Or
> what would it take to run ?
> I am running a cluster of SLES8 ppc64 nodes and would like to give a try at
> bproc.

I haven't tried that one. We haven't got any Power4 machines around here. If it's reasonably similar to the 32 bit power pc (I've tried G3 and G4) it should be easy to do a port.

- Erik

From: <er...@he...> - 2003-10-22 05:01:30

On Tue, Oct 21, 2003 at 03:13:13PM -0400, Nicholas Henke wrote:
> On Tue, 2003-10-21 at 11:54, er...@he... wrote:
> > I got an oops out of the NMI watchdog which was enlightening (or at
> > least indicated which code was at fault). The following patch may
> > have fixed it for me. I say "may have" since I've had some trouble
> > reproducing the problem reliably.
> >
> > This patch turns off "sigbypass" which is a little optimization where
> > a process sending a signal to a ghost doesn't bother the ghost.
> > Instead it just throws a signal forwarding message right on the
> > message queue. I'm not sure how the code is broken. I haven't had
> > time to look into it yet.
> >
> > Please give it a try and let me know if you still see the deadlock.
> >
> > --- hooks.c   29 Aug 2003 21:46:57 -0000  1.53
> > +++ hooks.c   21 Oct 2003 15:41:44 -0000
> > @@ -314,6 +314,11 @@
> >                * t->sigmask */
> >      struct bproc_krequest_t *req;
> >      struct siginfo tmpinfo;
> > +
> > +    return 0;    /* XXX disable sigbypass for now.
> > +                  * There seems to be something busted or
> > +                  * unsafe about this code... */
> > +
> >      if (!BPROC_ISGHOST(t) || !t->bproc.ghost->sigbypass)
> >          return 0;
>
> Well, that seems to have worked. I just passed 10M iterations, which it
> was unable to do previously. I am now doing 100M just to be an idiot :)

Ok, thanks for the feedback. Now to figure out what's wrong with that code...

- Erik

From: Francois T. <FT...@fr...> - 2003-10-22 04:58:54

Has someone already tested bproc on IBM Power4 running Linux ? I have seen some notes about ppc support, but would bproc work on ppc64 ? Or what would it take to run ? I am running a cluster of SLES8 ppc64 nodes and would like to give bproc a try.

Salutations/Regards.
============================================
Dr. Francois THOMAS, EMEA-PSSC RS/6000 SP Group
Tel : (33)-4-67344061, GSM : (33)-6-83258855
Fax : (33)-4-67346477
ft...@fr..., ICQ# 95392338, http://ft-fr.userv.ibm.com
============================================

From: Nicholas H. <he...@se...> - 2003-10-22 01:21:28

On Tue, 2003-10-21 at 11:54, er...@he... wrote:
> I got an oops out of the NMI watchdog which was enlightening (or at
> least indicated which code was at fault). The following patch may
> have fixed it for me. I say "may have" since I've had some trouble
> reproducing the problem reliably.
>
> This patch turns off "sigbypass" which is a little optimization where
> a process sending a signal to a ghost doesn't bother the ghost.
> Instead it just throws a signal forwarding message right on the
> message queue. I'm not sure how the code is broken. I haven't had
> time to look into it yet.
>
> Please give it a try and let me know if you still see the deadlock.

Thanks for the quick patch, I am running now to see if it deadlocks.

BTW -- did you do anything special to get NMI to dump an oops for you? Can you tell me the basic setup -- I am just booting with nmi_watchdog=1, and I see the interrupts in /proc/interrupts. Does something more need to be done ?

Thanks!
Nic
--
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania

From: <er...@he...> - 2003-10-21 20:25:19

On Tue, Oct 14, 2003 at 12:30:41PM -0400, Nicholas Henke wrote:
> On Tue, 2003-10-14 at 11:56, er...@he... wrote:
> >
> > Hrm. I'm glad you brought this up. I've recently seen a similar
> > problem. I thought it had something to do with the recent network
> > upgrade we did. Sounds like probably not.
>
> Fun ;/ Thanks for taking a look at it.
>
> > It's pretty mysterious to me. BProc really doesn't do much with
> > interrupts turned off. I've been working on reproducing it more
> > reliably here.
> >
> > I don't know a good way to shake the kernel loose in that case. One
> > thing we were going to try was to instrument it a bit with POST codes
> > or try and poke around in memory a bit with a bus analyzer.
>
> Ok -- way over my head there. I wonder if a hardware watchdog card would
> help - or if that would give the same results as the nmi_watchdog... aka
> nothing.

I got an oops out of the NMI watchdog which was enlightening (or at least indicated which code was at fault). The following patch may have fixed it for me. I say "may have" since I've had some trouble reproducing the problem reliably.

This patch turns off "sigbypass", which is a little optimization where a process sending a signal to a ghost doesn't bother the ghost. Instead it just throws a signal forwarding message right on the message queue. I'm not sure how the code is broken. I haven't had time to look into it yet.

Please give it a try and let me know if you still see the deadlock.

--- hooks.c   29 Aug 2003 21:46:57 -0000  1.53
+++ hooks.c   21 Oct 2003 15:41:44 -0000
@@ -314,6 +314,11 @@
               * t->sigmask */
     struct bproc_krequest_t *req;
     struct siginfo tmpinfo;
+
+    return 0;    /* XXX disable sigbypass for now.
+                  * There seems to be something busted or
+                  * unsafe about this code... */
+
     if (!BPROC_ISGHOST(t) || !t->bproc.ghost->sigbypass)
         return 0;

From: Nicholas H. <he...@se...> - 2003-10-21 19:39:22

On Tue, 2003-10-21 at 11:54, er...@he... wrote:
> I got an oops out of the NMI watchdog which was enlightening (or at
> least indicated which code was at fault). The following patch may
> have fixed it for me. I say "may have" since I've had some trouble
> reproducing the problem reliably.
>
> This patch turns off "sigbypass" which is a little optimization where
> a process sending a signal to a ghost doesn't bother the ghost.
> Instead it just throws a signal forwarding message right on the
> message queue. I'm not sure how the code is broken. I haven't had
> time to look into it yet.
>
> Please give it a try and let me know if you still see the deadlock.
>
> --- hooks.c   29 Aug 2003 21:46:57 -0000  1.53
> +++ hooks.c   21 Oct 2003 15:41:44 -0000
> @@ -314,6 +314,11 @@
>                * t->sigmask */
>      struct bproc_krequest_t *req;
>      struct siginfo tmpinfo;
> +
> +    return 0;    /* XXX disable sigbypass for now.
> +                  * There seems to be something busted or
> +                  * unsafe about this code... */
> +
>      if (!BPROC_ISGHOST(t) || !t->bproc.ghost->sigbypass)
>          return 0;

Well, that seems to have worked. I just passed 10M iterations, which it was unable to do previously. I am now doing 100M just to be an idiot :)

Nice catch~
Nic
--
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania

From: <er...@he...> - 2003-10-20 16:36:55

On Sun, Oct 19, 2003 at 02:08:51AM -0700, Parshuram Limaye wrote:
> Hi,
>
> I just installed the bproc 4.0pre release on the master and slave node. I
> patched the kernel on both sides, built it, booted it, and then installed
> bproc on both systems. On one I run bpmaster and on the other bpslave.
> The problem is that I am not able to see any /etc/beowulf/node script;
> what I got is just a single file, config, which I modified as:
>
> on master
>   interface eth0 192.168.1.3 255.255.255.0
>   nodes 1
>   iprange 192.168.1.1 192.168.1.1
>
> on slave
>   interface eth1 192.168.1.1 255.255.255.0
>   nodes 1
>   iprange 192.168.1.1 192.168.1.1
>
> When I run bpmaster on the master and bpslave on the slave system I get
> no error, but when running bpstat it gives an error saying
> "bproc_notifier: no such file or directory". How can I verify that the
> node is up, because there is no file named bproc in /var/run/ either?
> Please help in solving this problem.

Most likely you forgot to mount bpfs. See "Node status" in the release notes.

- Erik

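For anyone hitting the same error, mounting bpfs before running bpstat is the usual fix. The snippet below is a hedged sketch: the filesystem type string "bpfs" comes from the 4.0.0pre1 discussion above, but the mount point /bpfs is an assumption; the "Node status" section of the release notes has the exact instructions.

```c
#include <stdio.h>
#include <sys/mount.h>

/* Hedged sketch: mount the bpfs node-status file system so that tools like
 * bpstat can read node state. The mount point "/bpfs" is an assumption;
 * consult the BProc 4.0.0pre1 release notes for the documented location. */
int main(void)
{
    if (mount("none", "/bpfs", "bpfs", 0, NULL) != 0) {
        perror("mount bpfs");
        return 1;
    }
    return 0;
}
```
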
From: Parshuram L. <par...@ya...> - 2003-10-19 09:08:52

Hi,

I just installed the bproc 4.0pre release on the master and slave node. I patched the kernel on both sides, built it, booted it, and then installed bproc on both systems. On one I run bpmaster and on the other bpslave. The problem is that I am not able to see any /etc/beowulf/node script; what I got is just a single file, config, which I modified as:

on master

interface eth0 192.168.1.3 255.255.255.0
nodes 1
iprange 192.168.1.1 192.168.1.1

on slave

interface eth1 192.168.1.1 255.255.255.0
nodes 1
iprange 192.168.1.1 192.168.1.1

When I run bpmaster on the master and bpslave on the slave system I get no error, but when running bpstat it gives an error saying "bproc_notifier: no such file or directory". How can I verify that the node is up, because there is no file named bproc in /var/run/ either? Please help in solving this problem.

Parshuram Limaye

From: Nicholas H. <he...@se...> - 2003-10-17 14:40:45

On Thu, 2003-10-16 at 19:01, er...@he... wrote:
> On Thu, Oct 16, 2003 at 05:23:05PM -0400, Nicholas Henke wrote:
> > Any ideas on how hard it would be to add access control lists to nodes?
> > We are getting pretty hard pressure here to support multiple users per
> > node. I would be doing the coding, just looking for ideas and a sanity
> > check.
>
> Here's my plan on that one:
>
> 1: Wait till I port to 2.6
>
> 2: Then use all the existing POSIX ACL stuff. This should be trivial
> at that point since "bpfs" (the node file system in BProc 4.0.0pre1)
> already has support for arbitrary extended file attributes.
>
> Until then, the UNIX file system-like semantics are pretty limiting
> for that case and it won't be easy to fix.
>
> It might be fairly easy to skip straight to step 2 (on BProc 4, not
> 3.2.6) since there are some ACL patches for Linux 2.4. I haven't
> tried that so I have no idea what the feasibility of that will be.

I looked at the info available on the ACL support for 2.4, and it looks fairly sane. From what I can see, those are the same patches that the Lustre folks are also using. Once I get 4.0pre up on struggles, I am going to look at the bpfs stuff to see what it would take to get this done. I hope to have a better idea in a few weeks :)

Thanks for the info ~
Nic
--
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania

From: <er...@he...> - 2003-10-16 23:33:46

On Thu, Oct 16, 2003 at 05:23:05PM -0400, Nicholas Henke wrote:
> Any ideas on how hard it would be to add access control lists to nodes?
> We are getting pretty hard pressure here to support multiple users per
> node. I would be doing the coding, just looking for ideas and a sanity
> check.

Here's my plan on that one:

1: Wait till I port to 2.6

2: Then use all the existing POSIX ACL stuff. This should be trivial at that point since "bpfs" (the node file system in BProc 4.0.0pre1) already has support for arbitrary extended file attributes.

Until then, the UNIX file system-like semantics are pretty limiting for that case and it won't be easy to fix.

It might be fairly easy to skip straight to step 2 (on BProc 4, not 3.2.6) since there are some ACL patches for Linux 2.4. I haven't tried that, so I have no idea what the feasibility of that will be.

- Erik

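Since POSIX ACLs on Linux are stored as extended attributes (system.posix_acl_access), attaching one to a bpfs node entry would, in principle, look like any other xattr write. The sketch below only illustrates that mechanism; the /bpfs node path is an assumption, and whether bpfs actually honors the ACL attribute depends on the port Erik describes.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Hedged sketch: store an extended attribute on a bpfs node file, e.g.
 * "/bpfs/0" (hypothetical path). POSIX ACLs live in the
 * "system.posix_acl_access" attribute as a packed binary structure;
 * here we only show the shape of the setxattr() call with an opaque blob. */
int set_node_acl(const char *node_path, const void *acl_blob, size_t len)
{
    if (setxattr(node_path, "system.posix_acl_access", acl_blob, len, 0) != 0) {
        perror("setxattr");
        return -1;
    }
    return 0;
}
```
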
From: Nicholas H. <he...@se...> - 2003-10-16 21:23:14

Any ideas on how hard it would be to add access control lists to nodes? We are getting pretty hard pressure here to support multiple users per node. I would be doing the coding, just looking for ideas and a sanity check.

Nic
--
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania

From: Nicholas H. <he...@se...> - 2003-10-14 16:30:44

On Tue, 2003-10-14 at 11:56, er...@he... wrote:
> Hrm. I'm glad you brought this up. I've recently seen a similar
> problem. I thought it had something to do with the recent network
> upgrade we did. Sounds like probably not.

Fun ;/ Thanks for taking a look at it.

> It's pretty mysterious to me. BProc really doesn't do much with
> interrupts turned off. I've been working on reproducing it more
> reliably here.
>
> I don't know a good way to shake the kernel loose in that case. One
> thing we were going to try was to instrument it a bit with POST codes
> or try and poke around in memory a bit with a bus analyzer.

Ok -- way over my head there. I wonder if a hardware watchdog card would help - or if that would give the same results as the nmi_watchdog... aka nothing.

> Can you resend the attachment? I didn't get it for some reason.

Heh, probably forgot to attach it -- it is now.

Nic
--
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania

From: <er...@he...> - 2003-10-14 16:24:13

On Mon, Oct 13, 2003 at 02:17:16PM -0400, Nicholas Henke wrote:
> Howdy~
> I am back to torturing bproc again, just trying to make sure a kernel
> upgrade is going to be stable. Attached is a tar.gz with a script to run
> remote_fork (.c included). There is a 'NODES=' section at the top to
> edit for your nodes to use.
>
> If you do a './run.sh 10000000' ( 10 million iterations ), at some
> point, usually 1.5 million, the head node will hard lock -- not even
> nmi_watchdog can rescue it.
>
> If you have a way to rescue a kernel from this hard of a lock, I would
> love to know about it, so I could give this bug a whirl, but otherwise I
> am pretty stuck.

Hrm. I'm glad you brought this up. I've recently seen a similar problem. I thought it had something to do with the recent network upgrade we did. Sounds like probably not.

It's pretty mysterious to me. BProc really doesn't do much with interrupts turned off. I've been working on reproducing it more reliably here.

I don't know a good way to shake the kernel loose in that case. One thing we were going to try was to instrument it a bit with POST codes or try and poke around in memory a bit with a bus analyzer.

Can you resend the attachment? I didn't get it for some reason.

- Erik

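The "instrument it a bit with POST codes" idea refers to writing byte values to I/O port 0x80, which POST diagnostic cards (and some boards) latch and display, so you can see how far a suspect code path got even after a hard lock with interrupts off. A minimal illustration of the technique, not taken from the BProc source, might look like this:

```c
#include <stdio.h>
#include <sys/io.h>   /* outb(), ioperm(); x86 Linux, needs root */

/* Illustrative only: emit a progress byte to the legacy POST-code port
 * (0x80). Sprinkling distinct codes through a suspect path shows the last
 * value reached even when the machine locks solid. In BProc's case this
 * would be done from kernel code with the kernel's own outb(); the
 * user-space form below just demonstrates the mechanism. */
int main(void)
{
    if (ioperm(0x80, 1, 1) != 0) {   /* grant access to port 0x80 (root only) */
        perror("ioperm");
        return 1;
    }
    outb(0x42, 0x80);                /* hypothetical progress marker */
    return 0;
}
```
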
From: Nicholas H. <he...@se...> - 2003-10-13 18:17:45

Howdy~

I am back to torturing bproc again, just trying to make sure a kernel upgrade is going to be stable. Attached is a tar.gz with a script to run remote_fork (.c included). There is a 'NODES=' section at the top to edit for your nodes to use.

If you do a './run.sh 10000000' ( 10 million iterations ), at some point, usually 1.5 million, the head node will hard lock -- not even nmi_watchdog can rescue it.

If you have a way to rescue a kernel from this hard of a lock, I would love to know about it, so I could give this bug a whirl, but otherwise I am pretty stuck.

Nic
--
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania

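The attachment itself isn't preserved in the archive, but the test Nicholas describes is easy to picture: a loop that repeatedly migrates a trivial child to a slave node and reaps it. The sketch below is a guess at roughly what remote_fork.c does, assuming the usual fork-like semantics of bproc_rfork() from libbproc; it is not the actual attachment.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/bproc.h>   /* bproc_rfork(); libbproc header */

/* Hedged sketch of a remote-fork stress test (not the original remote_fork.c):
 * repeatedly rfork a do-nothing child onto 'node' and wait for it, so that
 * process creation/teardown between front end and slave is exercised hard. */
int main(int argc, char **argv)
{
    int  node  = (argc > 1) ? atoi(argv[1]) : 0;
    long iters = (argc > 2) ? atol(argv[2]) : 1000000L;

    for (long i = 0; i < iters; i++) {
        pid_t pid = bproc_rfork(node);
        if (pid < 0) {
            perror("bproc_rfork");
            return 1;
        }
        if (pid == 0)
            _exit(0);              /* child: now on the slave node, exit at once */
        waitpid(pid, NULL, 0);     /* parent: reap before the next iteration */
        if (i % 100000 == 0)
            printf("iteration %ld\n", i);
    }
    return 0;
}
```
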
From: Dale H. <ro...@ma...> - 2003-10-09 15:32:36

On Wed, Oct 08, 2003 at 09:59:41PM -0600, Maurice Hilarius elucidated:
> > Hey, of course, this may not be too important for many people out there.
> > But are there any debs for bproc and these clustermatic tools? Yes, I
> > suppose that is something I could do... someday, in my copious time.
>
> There are RPMs, because we built them and put them on our ftp server for
> public access.

Hey Maurice,

Yeah, that's cool. But that's not debs. ;-) Yes, I know I could take alien and convert RPMs to debs, etc, etc. But that just isn't as clean a solution as making the debs from scratch and getting them included into Debian officially.

FYI, Maurice, I'm thinking about this for the old cluster, on which I recently installed Debian, and not demeter, which is still Red Hat and bproc, of course. But who knows, I might install Debian on it some day, too. *shrug* ;-)

--
Dale Harris
ro...@ma...
/.-)
