Hello all,
After a fresh Rocks 7.0 installation, I installed a compute node. Installed rolls:
NAME                      VERSION     ARCH    ENABLED
base:                     7.0         x86_64  yes
CentOS:                   7.4.1708    x86_64  yes
core:                     7.0         x86_64  yes
ganglia:                  7.0         x86_64  yes
hpc:                      7.0         x86_64  yes
kernel:                   7.0         x86_64  yes
Updates-CentOS-7.4.1708:  2017-12-01  x86_64  yes
After trying to install the latest Slurm version, I get this error:
Created symlink from /etc/systemd/system/multi-user.target.wants/slurmdbd.service to /usr/lib/systemd/system/slurmdbd.service.
Created symlink from /etc/systemd/system/multi-user.target.wants/mariadb.service to /usr/lib/systemd/system/mariadb.service.
sacctmgr: error: slurm_persist_conn_open_without_init: failed to open persistent connection to 127:6819: Invalid argument
sacctmgr: error: slurmdbd: Sending PersistInit msg: Invalid argument
sacctmgr: error: Problem talking to the database: Invalid argument
sacctmgr: error: slurm_persist_conn_open_without_init: failed to open persistent connection to 127:6819: Invalid argument
sacctmgr: error: slurmdbd: Sending PersistInit msg: Invalid argument
sacctmgr: error: Problem talking to the database: Invalid argument
sacctmgr: error: slurm_persist_conn_open_without_init: failed to open persistent connection to 127:6819: Invalid argument
sacctmgr: error: slurmdbd: Sending PersistInit msg: Invalid argument
sacctmgr: error: Problem talking to the database: Invalid argument
sacctmgr: error: slurm_persist_conn_open_without_init: failed to open persistent connection to 127:6819: Invalid argument
sacctmgr: error: slurmdbd: Sending PersistInit msg: Invalid argument
sacctmgr: error: Problem talking to the database: Invalid argument
sacctmgr: error: slurm_persist_conn_open_without_init: failed to open persistent connection to 127:6819: Invalid argument
sacctmgr: error: slurmdbd: Sending PersistInit msg: Invalid argument
sacctmgr: error: Problem talking to the database: Invalid argument
sacctmgr: error: slurm_persist_conn_open_without_init: failed to open persistent connection to 127:6819: Invalid argument
sacctmgr: error: slurmdbd: Sending PersistInit msg: Invalid argument
sacctmgr: error: Problem talking to the database: Invalid argument
Created symlink from /etc/systemd/system/multi-user.target.wants/slurmctld.service to /usr/lib/systemd/system/slurmctld.service.
########################################################################
# WARNING: The command:                                                #
#                                                                      #
#   sacctmgr -i create cluster rocks.vm                                #
#                                                                      #
# failed. Please run this command again                                #
########################################################################
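One detail in the errors above: sacctmgr is trying to reach slurmdbd at 127:6819, and the host part "127" is not a complete IP address. I don't know whether this is the root cause, and the log does not say where the value comes from, so treat the following only as a rough sketch for spotting this pattern in the output. The helper names parse_endpoint and is_full_ip are made up for illustration; only the Python standard library is used.

```python
import ipaddress
import re

# Matches the "host:port" pair in the sacctmgr connection error shown above.
ENDPOINT_RE = re.compile(r"failed to open persistent connection to (\S+):(\d+):")


def parse_endpoint(log_line):
    """Return (host, port) from a sacctmgr connection error line, or None."""
    match = ENDPOINT_RE.search(log_line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def is_full_ip(host):
    """True only if host is a complete IPv4 or IPv6 address string."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


line = (
    "sacctmgr: error: slurm_persist_conn_open_without_init: failed to open "
    "persistent connection to 127:6819: Invalid argument"
)
host, port = parse_endpoint(line)
print(host, port, is_full_ip(host))  # prints: 127 6819 False
```

A host that fails this check could still be a valid hostname, so a False result only means "not a literal IP address", nothing more.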
The full output is available here: https://pastebin.com/2s1uSJeJ
Looking at the slurmctld status, I get this error:
systemctl status slurmctld.service
● slurmctld.service - Slurm controller daemon
   Loaded: loaded (/usr/lib/systemd/system/slurmctld.service; enabled; vendor preset: disabled)
   Active: failed (Result: resources) since Sat 2020-03-28 14:20:33 SAST; 3min 35s ago
  Process: 3370 ExecStart=/usr/sbin/slurmctld $SLURMCTLD_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 1395 (code=exited, status=1/FAILURE)

Mar 28 14:20:33 127.0.0.1 systemd[1]: Starting Slurm controller daemon...
Mar 28 14:20:33 127.0.0.1 systemd[1]: PID file /var/run/slurmctld.pid not readable (yet?) after start.
Mar 28 14:20:33 127.0.0.1 systemd[1]: slurmctld.service never wrote its PID file. Failing.
Mar 28 14:20:33 127.0.0.1 systemd[1]: Failed to start Slurm controller daemon.
Mar 28 14:20:33 127.0.0.1 systemd[1]: Unit slurmctld.service entered failed state.
Mar 28 14:20:33 127.0.0.1 systemd[1]: slurmctld.service failed.
Found a fix by installing slurm-7.0.0.193-18.08.08.00.00.x86_64 first, then updating to the latest version.