PBS Cluster Viz
Status: Beta
Brought to you by: paultcochrane

Project page:          http://pbsclusterviz.sourceforge.net
Project documentation: http://pbsclusterviz.sourceforge.net/

--------------------------------------------------------------------------
For the impatient
--------------------------------------------------------------------------

$ python setup.py install
$ pbsnodes -x > pbsnodes.xml
# assuming your cluster nodes all start with 'lcn'
# and your cluster name is "Cluster"
$ gen_nodes_file -x pbsnodes.xml -n Cluster -p lcn -o nodes
$ cluster_status

--------------------------------------------------------------------------
Installation
--------------------------------------------------------------------------

Installation in a central location:

$ python setup.py install

Installation in your home directory:

$ python setup.py install --prefix=$HOME/pbsclusterviz

Then you need to extend your PYTHONPATH environment variable like so:

$ export PYTHONPATH=$PYTHONPATH:$HOME/pbsclusterviz/lib/python2.x/site-packages

and extend your PATH like so:

$ export PATH=$PATH:$HOME/pbsclusterviz/bin

Dependencies:

* Python version 2.4+
    $ aptitude install python          or  $ yum install python
* The Visualisation Toolkit version 5.4+ with Python bindings
    $ aptitude install python-vtk      or  $ yum install vtk-python
* Python XML libraries (libxml2)
    $ aptitude install python-libxml2  or  $ yum install libxml2-python
* Python Tkinter (for full interactive usage)
    $ aptitude install python-tk       or  $ yum install tkinter
* Sphinx (in order to build the HTML docs)
    $ aptitude install python-sphinx   or  $ yum install python-sphinx

--------------------------------------------------------------------------
Usage
--------------------------------------------------------------------------

Firstly, a nodes file needs to be generated. This specifies how the nodes
of the entire cluster system will be displayed. The cluster nodes are
displayed as boxes laid out in a grid: many rows of cluster nodes, each a
set number of nodes wide (e.g. 40 nodes wide would be appropriate for
very large installations; the default of 10 nodes should be OK for
smaller ones). Some experimentation is usually needed to find the right
numbers. When the nodes file is initially generated, a basic structure is
created which can later be fine-tuned by hand.

***********************
Generating a nodes file
***********************

$ pbsnodes -x > pbsnodes.xml
$ gen_nodes_file -x pbsnodes.xml \
      -n <node_section_name> -p <node_prefix> -o nodes

Many cluster installations are collections of one or more smaller
clusters of computers, each with their own naming scheme. For instance, a
cluster called "LinuxCluster" could have nodes labelled lcn01, lcn02,
lcn03, etc. Therefore, one needs to specify a node prefix so that
gen_nodes_file can pick the relevant nodes out of the pbsnodes XML file.
The node section name appears as a comment in the generated nodes file.

To account for more than one cluster in an entire cluster system, one
appends to an existing nodes file with the -a option to gen_nodes_file.
For example, with three clusters "TinyCluster", "LinuxCluster" and
"BigOldBull", where the nodes are labelled tcn<xx>, lcn<xx> and bobn<xx>
respectively, one would run gen_nodes_file like so:

$ gen_nodes_file -x pbsnodes.xml -n TinyCluster -p tcn -o nodes
$ gen_nodes_file -x pbsnodes.xml -n LinuxCluster -p lcn -o nodes -a
$ gen_nodes_file -x pbsnodes.xml -n BigOldBull -p bobn -o nodes -a

The output is a plain text file called 'nodes' which you can then alter
to your heart's content.

If you later add a new cluster to your configuration, you merely need to
run

$ gen_nodes_file -x pbsnodes.xml -n NewCluster -p newn -o nodes -a

to add the new cluster's nodes to your load and job status visualisation.
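If you maintain several such sub-clusters, it can be handy to wrap the
above calls in a small shell script so that the nodes file can be rebuilt
in one step. The following is only a sketch based on the example commands
above; the cluster names and node prefixes are placeholders to be
replaced with those of your own installation:

#!/bin/sh
# regenerate_nodes.sh -- rebuild the nodes file for all sub-clusters
# (the cluster names and node prefixes below are examples only)
pbsnodes -x > pbsnodes.xml
gen_nodes_file -x pbsnodes.xml -n TinyCluster  -p tcn  -o nodes
gen_nodes_file -x pbsnodes.xml -n LinuxCluster -p lcn  -o nodes -a
gen_nodes_file -x pbsnodes.xml -n BigOldBull   -p bobn -o nodes -a

Note that regenerating the nodes file from scratch overwrites any
hand-tuning, so keep a copy of the file if you have customised the
layout.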
Now that the nodes file has been generated, you're ready to begin
visualising the load and job status of your cluster system.

**********************************
Visualising the system load status
**********************************

To generate an interactive three-dimensional view of the current load of
all nodes in your cluster system, you merely need to run the
'cluster_status' command:

$ cluster_status

To view the current job-level utilisation of all nodes, just press the
"j" key when viewing the cluster. If you wish, you can specify a
previously generated pbsnodes XML file:

$ cluster_status -x pbsnodes.xml

The title of the output image is controlled by the configuration file
(default: clusterviz.conf). In the [load viewer] section you merely need
to set the value of the 'title' key to the title you wish to use. For
example:

[load viewer]
title = My awesome cluster load status

To specify an alternate configuration file, you can use the '-c' option:

$ cluster_status -c mycluster.conf

If you wish to use the program non-interactively and thereby generate an
image of the cluster status at that point in time, just use the '-N'
option:

$ cluster_status -N

When the program is called, the cluster status image appears briefly on
the screen and then disappears. The image is saved to the file
'cluster_load_status.png' by default.

*********************************
Visualising the system job status
*********************************

In order to view the job status of your cluster system, you merely need
to use the 'cluster_status' command with the '-m/--display_mode' option,
e.g.:

$ cluster_status -m job

To change the default title of the generated image, set the value of the
'title' key in the [job viewer] section of the configuration file
(clusterviz.conf):

[job viewer]
title = My awesome cluster job status

As before, in order to generate an output image one needs to use the
'-N/--non_interactive' option:

$ cluster_status -m job -N

By default this will generate an image with the filename
'cluster_job_status.png'.

********************
Updating the display
********************

The display output can be updated by pressing the "u" key when in
interactive mode.

***********************************
Remote pbsnodes XML file generation
***********************************

Instead of using a local 'pbsnodes.xml' file, one can also fetch this
file from a remote host. This is a practical solution when the computer
on which the cluster status is being visualised is not part of the
PBS-based cluster system and so cannot extract the pbsnodes information
itself.

With the '-s' option to 'cluster_status' it is possible to generate the
'pbsnodes.xml' file on a remote host and have it read on the local host.
Normally this happens via 'ssh': after adding a line similar to the
following to the [main] section of the configuration file

[main]
syscall = ssh login-node 'pbsnodes -x' > pbsnodes.xml

one then merely needs to start 'cluster_status' like so:

$ cluster_status -s

For this to run smoothly (and without you needing to enter your password
each time the display is updated) you should enable password-less 'ssh'
access to the remote server. This article gives a good overview:
http://www.howtoforge.com/ssh-best-practices
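As a minimal sketch, password-less access can usually be set up with an
ssh key pair (the host name 'login-node' below is taken from the
configuration example above; adapt it, and the key options, to your
site's policy):

$ ssh-keygen -t rsa              # accept the defaults; leave the passphrase empty
$ ssh-copy-id login-node         # install the public key on the remote host
$ ssh login-node 'pbsnodes -x'   # should now run without a password prompt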
****************************************************
Generating movies of your cluster status information
****************************************************

The cluster load and job status images are saved to disk with the
respective filenames cluster_load_status.png and cluster_job_status.png.
An extra image file with the current timestamp is also saved for each
type of status image. These files can then be used to create movies of
the evolution of the cluster status over time, which can give insight
into patterns not otherwise obvious from the static images.

The best way to produce such movies is to run 'cluster_status -m load'
and 'cluster_status -m job' as cron jobs. For instance, one could save
images every ten minutes and then, after a day or even a week, generate
an MPEG movie file from the collected images (an example cron setup is
sketched in the appendix at the end of this file).

--------------------------------------------------------------------------
Example
--------------------------------------------------------------------------

In the examples/ directory of the distribution you will find some
pre-generated pbsnodes XML files and an example configuration file. The
first example works for the RRZN cluster system
(http://www.rrzn.uni-hannover.de/computeserver.html). Change into the
examples/ directory and run the following command:

$ cluster_status -x pbsnodes_rrzn.xml -n nodes.rrzn -c rrznviz.conf -i

--------------------------------------------------------------------------
Documentation
--------------------------------------------------------------------------

The documentation is distributed with 'pbsclusterviz' in the 'doc/'
directory. Make sure that you have installed the 'sphinx' package so that
you can build the documentation.

To build the HTML documentation, change into the 'doc/' directory and run

$ make html

then point your browser at '.../doc/_build/html/index.html'.

To build the PDF documentation, change into the 'doc/' directory and run

$ make latex
$ cd _build/latex
$ make all-pdf

then open the file 'PBSClusterViz.pdf' with your favourite PDF viewer.
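*************************************
Appendix: example cron setup (sketch)
*************************************

The following is only a sketch of the movie workflow described above: the
crontab syntax is standard, but the working directory, the timestamped
filename pattern and the ffmpeg invocation are assumptions which you will
need to adapt to your system and to the filenames pbsclusterviz actually
writes.

# crontab entries (edit with 'crontab -e'): save load and job status
# images every ten minutes, non-interactively; use the full path to
# cluster_status if it is not on cron's PATH
*/10 * * * * cd /path/to/status/dir && cluster_status -m load -N
*/10 * * * * cd /path/to/status/dir && cluster_status -m job -N

The collected timestamped images can later be encoded into a movie, for
example with ffmpeg (adjust the glob pattern to the actual timestamped
filenames):

$ ffmpeg -pattern_type glob -i 'cluster_load_status_*.png' cluster_load_status.mpg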