Hi Werner & all,
I installed the latest slurm-roll version on a Rocks 7.0 cluster, and it works OK.
(SELinux is disabled, and I use only the few rolls compatible with slurm.)
My issue is that I use some code where jobs are submitted from within other jobs.
As an example, I may launch CC.job, which contains a loop:
...
for X in CASE CONTROL
do
sbatch $X.job
done
...
So from within CC.job, CASE.job and CONTROL.job should be submitted.
This does not work: CC.job is executed, but the sbatch $X.job calls fail with:
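For context, CC.job is itself submitted with sbatch and looks roughly like this (the #SBATCH options shown here are only illustrative placeholders, not my real settings):

#!/bin/bash
#SBATCH --job-name=CC
#SBATCH --output=CC.%j.out
# pre-processing runs here on the allocated compute node,
# then the two sub-jobs are submitted:
for X in CASE CONTROL
do
    sbatch $X.job
done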
error: Unable to allocate resources: Access/permission denied
(I can use sinfo and squeue from the compute nodes.)
Is there a change that can be made to allow submission from a compute node?
I guess it must be linked to the PAM settings...
Did someone run into this and solve it?
PYB
The list of hosts that can submit jobs is set by AllocNodes. The slurm default is "ALL", but the Rocks roll sets it to the cluster head node only. Look in slurm.conf for the line:
PartitionName=DEFAULT AllocNodes=... State=UP
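If you change AllocNodes on that line to ALL (or to a comma-separated list of node names), the compute nodes will be allowed to submit as well. For example, something along these lines, where the other options are just whatever your slurm.conf already defines:

PartitionName=DEFAULT AllocNodes=ALL State=UP

After editing, restart slurmctld on the head node or run "scontrol reconfigure" so the new setting is picked up.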
Ian
Thank you Ian, this worked like a charm!
Steps:
1 - update slurm.conf on the front end:
PartitionName=XXX AllocNodes=ALL ...
2 - restart slurmctld on the front end:
systemctl restart slurmctld
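To double check from a compute node, something like this should now work (the --wrap command is just a trivial test job):

scontrol show partition | grep AllocNodes
sbatch --wrap="hostname"

The first line should report AllocNodes=ALL for the partition, and the second should return a job id instead of the permission error.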