slurm-roll / Discussion / General Discussion: Unable to submit job

Martin Brodbeck - 2013-10-29

I'm not sure if this is a question for this forum here or if it should rather go to a slurm forum. But I've installed a new cluster with Rocks 6.1 and the latest slurm roll. Everything went fine with the installation. But now I try to submit a job with sbatch and get the following error:

sbatch: error: Batch job submission failed: Unable to contact slurm controller (connect failure)

Having a look at slurmctld.log shows this:

fatal: It appears you don't have any association data from your database. The priority/multifactor plugin requires this information to run correctly. Please check your database connection and try again.

I've done nothing else with the cluster so far. So I was expecting that it just runs out of the box. But maybe I missed something?

Martin Brodbeck - 2013-10-29

It seems that a "sacctmgr add cluster myname" has solved the problem. But I wonder why that was necessary and if the cluster is now well-configured. :)

Last edit: Martin Brodbeck 2013-10-29

- Werner Saar - 2013-10-29
  
  Hi,
  
  This command is executed when you run "rocks run roll slurm|sh".
  Did you see an error when you ran this command?
  
  Best regards
  
  Werner
  
Martin Brodbeck - 2013-10-30

I've added the slurm roll at the very beginning in the installation process. That is, I added the slurm roll together with base, kernel, os and so on. I thought that the installation steps are then performed automatically just as if I had installed a different roll like torque...
So, no, there wasn't an error message, but maybe I missed it because the installation process took place in the background?

Thanks,
Martin

Last edit: Martin Brodbeck 2013-10-30

- Werner Saar - 2013-10-30
  
  Hi,
  
  Sorry. Please run this command:
  
  rocks run roll slurm > /tmp/slurm.script
  
  You will find the following lines:
  
  service slurmdbd start
  sleep 60
  sacctmgr -i create cluster $CLUSTER
  sleep 20
  service slurm start
  
  As you can see, I wait 60 seconds after starting slurmdbd, which was always
  enough in my tests. But if this time is too short,
  the command to create the cluster will fail. I think this was the
  reason for the failure.
  
  I will try to find a better solution.
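
One way to avoid depending on a fixed sleep is to poll until slurmdbd answers and only then create the cluster. The following is a minimal sketch, not the roll's actual script: `retry` is a hypothetical helper, the attempt count and delay are arbitrary, and it assumes that `sacctmgr -n show cluster` exits with an error while slurmdbd is still unreachable.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY COMMAND [ARGS...]
# Hypothetical helper: run COMMAND until it succeeds, at most ATTEMPTS times,
# sleeping DELAY seconds between tries. Returns 0 on success, 1 otherwise.
retry() {
    attempts=$1
    delay=$2
    shift 2
    try=1
    while [ "$try" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Do not sleep after the final failed attempt.
        if [ "$try" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        try=$((try + 1))
    done
    return 1
}

# Intended use in place of the fixed "sleep 60" (commented out so the
# sketch can be loaded on a machine without slurm):
#   service slurmdbd start
#   retry 30 5 sacctmgr -n show cluster >/dev/null 2>&1 || exit 1
#   sacctmgr -i create cluster "$CLUSTER"
#   service slurm start
```

With these example values the script would wait at most about two and a half minutes, but would continue as soon as slurmdbd responds.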
  
  Thank you for your help.
  
  Best regards
  
  Werner
  
Martin Brodbeck - 2013-10-30

Thanks, Werner. So, do you think that this "sacctmgr add cluster myname" was all I had to do in order to fix the slurm installation? It seems that everything is working well, though...

- Werner Saar - 2013-10-30
  
  Hi,
  
  When you install the headnode, you have to give the name of the cluster.
  The command "rocks run roll slurm|sh" writes the name of the cluster to
  the file /etc/slurm/headnode.conf.
  
  If the name of your cluster is myname, then:
  
  sacctmgr -i add cluster myname
  
  is enough and everything is OK.
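
To confirm afterwards that the cluster is registered in the accounting database, the cluster list can be checked for the name. This is a minimal sketch, assuming the cluster is called myname as in the example above; `cluster_listed` is a hypothetical helper that reads one cluster name per line on standard input.

```shell
#!/bin/sh
# Hypothetical helper: succeed if NAME ($1) appears as a whole line on stdin.
cluster_listed() {
    grep -qxF -- "$1"
}

# Intended use (commented out so the sketch runs without slurm installed):
#   sacctmgr -n -P show cluster format=cluster | cluster_listed myname \
#       || sacctmgr -i add cluster myname
```

Matching the whole line avoids a false positive when another cluster name merely contains myname.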
  
  Best regards
  
  Werner
  