
rocks sync problem

2018-05-18
  • mahmood naderan

    mahmood naderan - 2018-05-18

    Hi Werner,
    Do you understand the following error?

    ~~~
    [root@rocks7 ~]# rocks sync slurm
    compute-0-0: bash: compute-0-1: command not found
    pdsh@rocks7: compute-0-0: ssh exited with exit code 127
    compute-0-0: bash: compute-0-1: command not found
    pdsh@rocks7: compute-0-0: ssh exited with exit code 127
    [root@rocks7 ~]#
    ~~~
    
     
    • Werner Saar

      Werner Saar - 2018-05-18

      Hi,

      Sorry, I don't know which command is not found.

      Best regards

      Werner

      Sent from sourceforge.net because you indicated interest in https://sourceforge.net/p/slurm-roll/discussion/general/

       
      • mahmood naderan

        mahmood naderan - 2018-05-18

        How can I debug the sync slurm command?

         
        • Werner Saar

          Werner Saar - 2018-05-18

          Look at the directory
          /opt/rocks/lib/python2.7/site-packages/rocks/commands/sync/slurm.

          Remove the file __init__.pyo.

          Now you can modify the file __init__.py and add some print commands, for example.


           
          • mahmood naderan

            mahmood naderan - 2018-05-18

            OK, I removed the .pyo file. After making the following modifications and saving the file, I ran rocks sync slurm. No .pyo is generated. Is that normal?

            I modified

            ~~~
            hl = hostlist.collect_hostlist(myhostlist)
            print "printing hl=", hl
            os.system("pdsh -t 5 -u 30 -w %s /etc/slurm/slurm-prep.sh start " % (hl))
            os.system("pdsh -t 5 -u 30 -w %s /usr/bin/systemctl restart slurmd.service" % (hl))
            print "ran os"
            ~~~

            In the output, I see

            ~~~
            printing hl= compute-[0]-[0-6]
            compute-0-0: bash: compute-0-1: command not found
            pdsh@rocks7: compute-0-0: ssh exited with exit code 127
            compute-0-0: bash: compute-0-1: command not found
            pdsh@rocks7: compute-0-0: ssh exited with exit code 127
            ran os
            ~~~

            Then I tried

            ~~~
            [root@rocks7 ~]# pdsh -t 5 -u 30 -w compute-[0]-[0-6] /etc/slurm/slurm-prep.sh start
            compute-0-0: bash: compute-0-1: command not found
            pdsh@rocks7: compute-0-0: ssh exited with exit code 127
            [root@rocks7 ~]# ls -l /etc/slurm/slurm-prep.sh
            ls: cannot access /etc/slurm/slurm-prep.sh: No such file or directory
            ~~~

            Is that normal? I don't remember intentionally deleting such a file!

             
            • Werner Saar

              Werner Saar - 2018-05-18

              Hi,

              The .pyo file is not needed and is not rebuilt during operation. Don't worry.


               
              • mahmood naderan

                mahmood naderan - 2018-05-18

                What about last question I asked?

                ~~~
                [root@rocks7 ~]# ls -l /etc/slurm/slurm-prep.sh
                ls: cannot access /etc/slurm/slurm-prep.sh: No such file or directory
                ~~~
                 
                • Werner Saar

                  Werner Saar - 2018-05-18

                  The file /etc/slurm/slurm-prep.sh exists only on compute nodes and is
                  not needed on the head-node.


                   
                  • mahmood naderan

                    mahmood naderan - 2018-05-18

                    I really don't understand what is going wrong.

                    ~~~
                    [root@rocks7 ~]# pdsh -t 5 -u 30 -w compute-[0]-[0-6] /etc/slurm/slurm-prep.sh start
                    compute-0-0: bash: compute-0-1: command not found
                    pdsh@rocks7: compute-0-0: ssh exited with exit code 127
                    [root@rocks7 ~]# pdsh -t 5 -u 30 -w compute-0-0 /etc/slurm/slurm-prep.sh start
                    compute-0-0: Wrote: /etc/auto.home
                    compute-0-0: Wrote: /etc/auto.master
                    compute-0-0: Wrote: /etc/auto.misc
                    compute-0-0: Wrote: /etc/auto.net
                    compute-0-0: Wrote: /etc/auto.share
                    compute-0-0: Wrote: /etc/auto.smb
                    compute-0-0: Wrote: /etc/group
                    compute-0-0: Wrote: /etc/munge/munge.key
                    compute-0-0: Wrote: /etc/passwd
                    compute-0-0: Wrote: /etc/shadow
                    compute-0-0: Wrote: /etc/slurm/cgroup.conf
                    compute-0-0: Wrote: /etc/slurm/gres.conf.1
                    compute-0-0: Wrote: /etc/slurm/gres.conf.2
                    compute-0-0: Wrote: /etc/slurm/gres.conf.3
                    compute-0-0: Wrote: /etc/slurm/gres.conf.4
                    compute-0-0: Wrote: /etc/slurm/head.conf
                    compute-0-0: Wrote: /etc/slurm/node.conf
                    compute-0-0: Wrote: /etc/slurm/parts.conf
                    compute-0-0: Wrote: /etc/slurm/slurm.conf
                    compute-0-0: Wrote: /etc/slurm/topo.conf
                    compute-0-0: Wrote: /etc/ssh/shosts.equiv
                    compute-0-0: Wrote: /etc/ssh/ssh_known_hosts
                    compute-0-0: compute-0-0.local
                    compute-0-0: NodeName=compute-0-0 NodeAddr=10.1.1.254 CPUs=32 Weight=20511900 Feature=rack-0,32CPUs
                    [root@rocks7 ~]# pdsh -t 5 -u 30 -w compute-0-1 /etc/slurm/slurm-prep.sh start
                    compute-0-1: Wrote: /etc/auto.home
                    compute-0-1: Wrote: /etc/auto.master
                    compute-0-1: Wrote: /etc/auto.misc
                    compute-0-1: Wrote: /etc/auto.net
                    compute-0-1: Wrote: /etc/auto.share
                    compute-0-1: Wrote: /etc/auto.smb
                    compute-0-1: Wrote: /etc/group
                    compute-0-1: Wrote: /etc/munge/munge.key
                    compute-0-1: Wrote: /etc/passwd
                    compute-0-1: Wrote: /etc/shadow
                    compute-0-1: Wrote: /etc/slurm/cgroup.conf
                    compute-0-1: Wrote: /etc/slurm/gres.conf.1
                    compute-0-1: Wrote: /etc/slurm/gres.conf.2
                    compute-0-1: Wrote: /etc/slurm/gres.conf.3
                    compute-0-1: Wrote: /etc/slurm/gres.conf.4
                    compute-0-1: Wrote: /etc/slurm/head.conf
                    compute-0-1: Wrote: /etc/slurm/node.conf
                    compute-0-1: Wrote: /etc/slurm/parts.conf
                    compute-0-1: Wrote: /etc/slurm/slurm.conf
                    compute-0-1: Wrote: /etc/slurm/topo.conf
                    compute-0-1: Wrote: /etc/ssh/shosts.equiv
                    compute-0-1: Wrote: /etc/ssh/ssh_known_hosts
                    compute-0-1: compute-0-1.local
                    compute-0-1: NodeName=compute-0-1 NodeAddr=10.1.1.253 CPUs=32 Weight=20511899 Feature=rack-0,32CPUs
                    [root@rocks7 ~]# pdsh -t 5 -u 30 -w compute-0-2 /etc/slurm/slurm-prep.sh start
                    compute-0-2: Wrote: /etc/auto.home
                    compute-0-2: Wrote: /etc/auto.master
                    compute-0-2: Wrote: /etc/auto.misc
                    compute-0-2: Wrote: /etc/auto.net
                    compute-0-2: Wrote: /etc/auto.share
                    compute-0-2: Wrote: /etc/auto.smb
                    compute-0-2: Wrote: /etc/group
                    compute-0-2: Wrote: /etc/munge/munge.key
                    compute-0-2: Wrote: /etc/passwd
                    compute-0-2: Wrote: /etc/shadow
                    compute-0-2: Wrote: /etc/slurm/cgroup.conf
                    compute-0-2: Wrote: /etc/slurm/gres.conf.1
                    compute-0-2: Wrote: /etc/slurm/gres.conf.2
                    compute-0-2: Wrote: /etc/slurm/gres.conf.3
                    compute-0-2: Wrote: /etc/slurm/gres.conf.4
                    compute-0-2: Wrote: /etc/slurm/head.conf
                    compute-0-2: Wrote: /etc/slurm/node.conf
                    compute-0-2: Wrote: /etc/slurm/parts.conf
                    compute-0-2: Wrote: /etc/slurm/slurm.conf
                    compute-0-2: Wrote: /etc/slurm/topo.conf
                    compute-0-2: Wrote: /etc/ssh/shosts.equiv
                    compute-0-2: Wrote: /etc/ssh/ssh_known_hosts
                    compute-0-2: compute-0-2.local
                    compute-0-2: NodeName=compute-0-2 NodeAddr=10.1.1.252 CPUs=32 Weight=20511898 Feature=rack-0,32CPUs
                    [root@rocks7 ~]# pdsh -t 5 -u 30 -w compute-0-3 /etc/slurm/slurm-prep.sh start
                    compute-0-3: Wrote: /etc/auto.home
                    compute-0-3: Wrote: /etc/auto.master
                    compute-0-3: Wrote: /etc/auto.misc
                    compute-0-3: Wrote: /etc/auto.net
                    compute-0-3: Wrote: /etc/auto.share
                    compute-0-3: Wrote: /etc/auto.smb
                    compute-0-3: Wrote: /etc/group
                    compute-0-3: Wrote: /etc/munge/munge.key
                    compute-0-3: Wrote: /etc/passwd
                    compute-0-3: Wrote: /etc/shadow
                    compute-0-3: Wrote: /etc/slurm/cgroup.conf
                    compute-0-3: Wrote: /etc/slurm/gres.conf.1
                    compute-0-3: Wrote: /etc/slurm/gres.conf.2
                    compute-0-3: Wrote: /etc/slurm/gres.conf.3
                    compute-0-3: Wrote: /etc/slurm/gres.conf.4
                    compute-0-3: Wrote: /etc/slurm/head.conf
                    compute-0-3: Wrote: /etc/slurm/node.conf
                    compute-0-3: Wrote: /etc/slurm/parts.conf
                    compute-0-3: Wrote: /etc/slurm/slurm.conf
                    compute-0-3: Wrote: /etc/slurm/topo.conf
                    compute-0-3: Wrote: /etc/ssh/shosts.equiv
                    compute-0-3: Wrote: /etc/ssh/ssh_known_hosts
                    compute-0-3: compute-0-3.local
                    compute-0-3: NodeName=compute-0-3 NodeAddr=10.1.1.251 CPUs=32 Weight=20511897 Feature=rack-0,32CPUs
                    [root@rocks7 ~]# pdsh -t 5 -u 30 -w compute-0-4 /etc/slurm/slurm-prep.sh start
                    compute-0-4: Wrote: /etc/auto.home
                    compute-0-4: Wrote: /etc/auto.master
                    compute-0-4: Wrote: /etc/auto.misc
                    compute-0-4: Wrote: /etc/auto.net
                    compute-0-4: Wrote: /etc/auto.share
                    compute-0-4: Wrote: /etc/auto.smb
                    compute-0-4: Wrote: /etc/group
                    compute-0-4: Wrote: /etc/munge/munge.key
                    compute-0-4: Wrote: /etc/passwd
                    compute-0-4: Wrote: /etc/shadow
                    compute-0-4: Wrote: /etc/slurm/cgroup.conf
                    compute-0-4: Wrote: /etc/slurm/gres.conf.1
                    compute-0-4: Wrote: /etc/slurm/gres.conf.2
                    compute-0-4: Wrote: /etc/slurm/gres.conf.3
                    compute-0-4: Wrote: /etc/slurm/gres.conf.4
                    compute-0-4: Wrote: /etc/slurm/head.conf
                    compute-0-4: Wrote: /etc/slurm/node.conf
                    compute-0-4: Wrote: /etc/slurm/parts.conf
                    compute-0-4: Wrote: /etc/slurm/slurm.conf
                    compute-0-4: Wrote: /etc/slurm/topo.conf
                    compute-0-4: Wrote: /etc/ssh/shosts.equiv
                    compute-0-4: Wrote: /etc/ssh/ssh_known_hosts
                    compute-0-4: compute-0-4.local
                    compute-0-4: NodeName=compute-0-4 NodeAddr=10.1.1.250 CPUs=32 Weight=20511896 Feature=rack-0,32CPUs
                    [root@rocks7 ~]# pdsh -t 5 -u 30 -w compute-0-5 /etc/slurm/slurm-prep.sh start
                    compute-0-5: Wrote: /etc/auto.home
                    compute-0-5: Wrote: /etc/auto.master
                    compute-0-5: Wrote: /etc/auto.misc
                    compute-0-5: Wrote: /etc/auto.net
                    compute-0-5: Wrote: /etc/auto.share
                    compute-0-5: Wrote: /etc/auto.smb
                    compute-0-5: Wrote: /etc/group
                    compute-0-5: Wrote: /etc/munge/munge.key
                    compute-0-5: Wrote: /etc/passwd
                    compute-0-5: Wrote: /etc/shadow
                    compute-0-5: Wrote: /etc/slurm/cgroup.conf
                    compute-0-5: Wrote: /etc/slurm/gres.conf.1
                    compute-0-5: Wrote: /etc/slurm/gres.conf.2
                    compute-0-5: Wrote: /etc/slurm/gres.conf.3
                    compute-0-5: Wrote: /etc/slurm/gres.conf.4
                    compute-0-5: Wrote: /etc/slurm/head.conf
                    compute-0-5: Wrote: /etc/slurm/node.conf
                    compute-0-5: Wrote: /etc/slurm/parts.conf
                    compute-0-5: Wrote: /etc/slurm/slurm.conf
                    compute-0-5: Wrote: /etc/slurm/topo.conf
                    compute-0-5: Wrote: /etc/ssh/shosts.equiv
                    compute-0-5: Wrote: /etc/ssh/ssh_known_hosts
                    compute-0-5: compute-0-5.local
                    compute-0-5: NodeName=compute-0-5 NodeAddr=10.1.1.249 CPUs=56 Weight=20535895 Feature=rack-0,56CPUs
                    [root@rocks7 ~]# pdsh -t 5 -u 30 -w compute-0-6 /etc/slurm/slurm-prep.sh start
                    compute-0-6: Wrote: /etc/auto.home
                    compute-0-6: Wrote: /etc/auto.master
                    compute-0-6: Wrote: /etc/auto.misc
                    compute-0-6: Wrote: /etc/auto.net
                    compute-0-6: Wrote: /etc/auto.share
                    compute-0-6: Wrote: /etc/auto.smb
                    compute-0-6: Wrote: /etc/group
                    compute-0-6: Wrote: /etc/munge/munge.key
                    compute-0-6: Wrote: /etc/passwd
                    compute-0-6: Wrote: /etc/shadow
                    compute-0-6: Wrote: /etc/slurm/cgroup.conf
                    compute-0-6: Wrote: /etc/slurm/gres.conf.1
                    compute-0-6: Wrote: /etc/slurm/gres.conf.2
                    compute-0-6: Wrote: /etc/slurm/gres.conf.3
                    compute-0-6: Wrote: /etc/slurm/gres.conf.4
                    compute-0-6: Wrote: /etc/slurm/head.conf
                    compute-0-6: Wrote: /etc/slurm/node.conf
                    compute-0-6: Wrote: /etc/slurm/parts.conf
                    compute-0-6: Wrote: /etc/slurm/slurm.conf
                    compute-0-6: Wrote: /etc/slurm/topo.conf
                    compute-0-6: Wrote: /etc/ssh/shosts.equiv
                    compute-0-6: Wrote: /etc/ssh/ssh_known_hosts
                    compute-0-6: compute-0-6.local
                    compute-0-6: NodeName=compute-0-6 NodeAddr=10.1.1.248 CPUs=56 Weight=20535894 Feature=rack-0,56CPUs
                    [root@rocks7 ~]#
                    ~~~
                    
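Editor's note: the symptom above (the host range fails while each individual host works) is consistent with the shell expanding the unquoted glob pattern compute-[0]-[0-6] against matching filenames in the working directory before pdsh ever sees it, so the extra hostnames become the remote "command". This is a hypothesis, not something confirmed in the thread; a minimal, self-contained sketch of the effect:

```python
import os
import subprocess
import tempfile

# Sketch of the suspected failure mode: when files whose names match the
# pattern exist in the current directory, the shell expands the unquoted
# glob into several words; quoting passes the pattern through literally.
with tempfile.TemporaryDirectory() as d:
    for name in ("compute-0-0", "compute-0-1"):
        open(os.path.join(d, name), "w").close()

    unquoted = subprocess.run("echo compute-[0]-[0-6]",
                              shell=True, cwd=d,
                              capture_output=True, text=True)
    quoted = subprocess.run("echo 'compute-[0]-[0-6]'",
                            shell=True, cwd=d,
                            capture_output=True, text=True)

print(unquoted.stdout.strip())  # glob expanded into the matching filenames
print(quoted.stdout.strip())    # pattern passed through unchanged
```

In the pdsh case, the second expanded word would land after the -w option's argument and be treated as the command to run remotely, matching the "bash: compute-0-1: command not found" errors.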
                     
  • mahmood naderan

    mahmood naderan - 2018-05-18

    So, the solution is to put the value of -w in single quotes.
    I think there was an update to a package. You can also check that.

    ~~~
    os.system("pdsh -t 5 -u 30 -w '%s' /etc/slurm/slurm-prep.sh start >/dev/null" % (hl))
    os.system("pdsh -t 5 -u 30 -w '%s' /usr/bin/systemctl restart slurmd.service >/dev/null" % (hl))
    ~~~
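Editor's note: an alternative that sidesteps shell quoting entirely (a sketch, not what the slurm-roll ships) is to build the pdsh invocation as an argument list, so no shell ever parses or glob-expands the hostlist pattern:

```python
import subprocess  # needed only if the call below is uncommented

# Hypothetical rewrite of the sync command's call. "hl" stands in for the
# hostlist string printed by the debug run above.
hl = "compute-[0]-[0-6]"
cmd = ["pdsh", "-t", "5", "-u", "30", "-w", hl,
       "/etc/slurm/slurm-prep.sh", "start"]
# subprocess.call(cmd)  # not executed here: pdsh exists only on the cluster
print(" ".join(cmd))
```

With an argument list, the pattern reaches pdsh verbatim and pdsh performs its own host-range expansion, which is the intended behavior.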
    
     

