ToDo

Daniel Povey

Things remaining to do

Just a note (mainly to the other participants) about what remains to be done in this project.

  • Testing the setup
  • Adding information on how to set up a GPU image (Dan will do this himself, he has some notes prepared)
  • Adding a script to make it easier to attach a volume to a node and export it. There should be a script checked in called kl-attach-volume, but it was copied from an older setup and is not in the usual style of these scripts.
  • Possibly adding an interactive script that prompts for the required details when adding a user. Adding a user currently takes a lot of steps.
  • More tutorial examples on how to administer GridEngine, e.g. examples of the kinds of tasks an administrator needs to do.
  • Possibly modifying kl-remove nodes so that it prompts for confirmation if you attempt to remove a node that has attached storage.
  • E-mail: making it possible for mem-killer.pl and similar scripts to send mail. By default Amazon nodes are allowed to send email, I think, but it tends to end up in the spam filter on (e.g.) gmail. The way to stop this is: request an "elastic IP" from Amazon using ec2-allocate-address, attach it to the master using ec2-associate-address (presumably before setting up the nodes), and ask Amazon to 'whitelist' the address at https://portal.aws.amazon.com/gp/aws/html-forms-controller/contactus/ec2-email-limit-rdns-request. Then the master will be able to send email. It should be possible to set up the master as a mail relay for the nodes, but this might be tricky given that the master and the nodes share the same image. In principle, though, the init scripts could set it up.
  • Once email works, scripts could be set up to inform the administrator by email if disks are getting full.
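As an illustration of the last item, here is a minimal sketch of a disk-space check that builds a warning email. It is not part of the checked-in scripts. The function names, the 90% threshold, the example paths and the assumption that an SMTP server is listening on localhost on the master are all hypothetical.

```python
import shutil
import smtplib
from email.message import EmailMessage


def full_disks(paths, threshold=0.9, usage_fn=shutil.disk_usage):
    """Return (path, fraction_used) for each path at or above the threshold.

    usage_fn is a parameter only so the check can be exercised without
    touching real filesystems; by default it is shutil.disk_usage.
    """
    alerts = []
    for path in paths:
        usage = usage_fn(path)
        fraction = usage.used / usage.total
        if fraction >= threshold:
            alerts.append((path, fraction))
    return alerts


def build_alert(alerts, sender, recipient):
    """Build (but do not send) a plain-text warning message."""
    msg = EmailMessage()
    msg["Subject"] = "Disk space warning: %d filesystem(s) nearly full" % len(alerts)
    msg["From"] = sender
    msg["To"] = recipient
    lines = ["%s is %.0f%% full" % (path, 100 * frac) for path, frac in alerts]
    msg.set_content("\n".join(lines))
    return msg


def send_alert(msg, host="localhost"):
    """Send the message through an SMTP server (assumed to run on the master)."""
    with smtplib.SMTP(host) as smtp:
        smtp.send_message(msg)
```

A cron job on the master could call full_disks(["/", "/export"]) and, if the result is non-empty, pass it through build_alert and send_alert. The paths and addresses would need to match the actual cluster.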

In the longer term:

  • Automatic load balancing (like MIT StarCluster)?
  • Automatic backups, maybe? On the cluster I was setting up, I had the local disks on the nodes back each other up in a round-robin fashion, or at least that was my intention.
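One possible reading of the round-robin backup idea is that each node's local disk is backed up onto the next node in a fixed ordering, with the last node wrapping around to the first. The sketch below only computes that assignment. The function name and node names are hypothetical, and the actual copying (e.g. by a nightly job) is left out.

```python
def round_robin_backups(nodes):
    """Map each node to the node that stores its backup.

    Each node is backed up on the next node in the list, and the last
    node is backed up on the first, so every node holds exactly one
    backup and no node backs up onto itself.
    """
    if len(nodes) < 2:
        raise ValueError("need at least two nodes to back each other up")
    return {node: nodes[(i + 1) % len(nodes)] for i, node in enumerate(nodes)}
```

For example, round_robin_backups(["node1", "node2", "node3"]) pairs node1 with node2, node2 with node3, and node3 with node1. The loss of any single node then still leaves a copy of its data elsewhere.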
