From: Yangyang Yi <yy...@si...> - 2021-01-19 09:12:45
Thanks for your reply, and happy new year. I see. I tried installing scipion-em-xmipp again and it reported a successful install, but jobs submitted via SLURM still fail to run, while a small job run directly on the head node works (CPU-only, since our head node has no GPU cards). So perhaps something is wrong in my config files, especially my host.conf? Here are my config files:

host.conf:

[localhost]
PARALLEL_COMMAND = mpirun -np %_(JOB_NODES)d %_(COMMAND)s
NAME = SCIPION_SLURM
MANDATORY = False
SUBMIT_COMMAND = sbatch %_(JOB_SCRIPT)s
CANCEL_COMMAND = scancel %_(JOB_ID)s
CHECK_COMMAND = squeue -h -j %_(JOB_ID)s
SUBMIT_TEMPLATE = #!/bin/bash
    ### Inherit all current environment variables
    #SBATCH --export=ALL
    ### Job name
    #SBATCH -J %_(JOB_NAME)s
    ### Queue name
    #SBATCH -p %_(JOB_QUEUE)s
    ### Standard output and standard error messages
    #SBATCH -o %_(JOB_LOGS)s.out
    #SBATCH -e %_(JOB_LOGS)s.err
    ### Specify the number of tasks and threads per task (ppn) for your job
    #SBATCH --ntasks=%_(JOB_NODES)d
    #SBATCH --cpus-per-task=%_(JOB_THREADS)d
    #SBATCH --mem=%_(JOB_MEMORY)s
    #SBATCH --gres=gpu:%_(GPU_COUNT)s
    ### Tell PBS the anticipated run-time for your job, where walltime=HH:MM:SS
    #SBATCH --time=%_(JOB_TIME)s:00:00
    # Use as working dir the path where qsub was launched
    WORKDIR=$SLURM_SUBMIT_DIR
    #################################
    ### Set environment variable so programs know the running mode is non-interactive
    export XMIPP_IN_QUEUE=1
    ### Switch to the working directory
    cd $WORKDIR
    # Make a copy of PBS_NODEFILE
    cp $SLURM_JOB_NODELIST > %_(JOB_NODEFILE)s
    # Calculate the number of processors allocated to this run.
    NPROCS=`wc -l < $SLURM_JOB_NODELIST`
    # Calculate the number of nodes allocated.
    ###NNODES=`uniq $SLURM_JOB_NODELIST | wc -l`
    ### Display the job context
    echo Running on host `hostname`
    echo Time is `date`
    echo Working directory is `pwd`
    ###echo Using ${NPROCS} processors across ${NNODES} nodes
    echo NODE LIST - config:
    echo $SLURM_JOB_NODELIST
    echo CUDA_VISIBLE_DEVICES: $CUDA_VISIBLE_DEVICES
    module load cuda-10.1
    module load impi-2019.4
    module load relion-3.1.0
    module load java-1.8.0
    #################################
    # echo '%_(JOB_COMMAND)s' >> /tmp/slurm-jobs.log
    %_(JOB_COMMAND)s
    ###find "$SLURM_SUBMIT_DIR" -type f -user $USER -perm 644 -exec chmod 664 {} + ;
# Next variable is used to provide a regex to check if a job is finished on a queue system
QUEUES = {"gpu": [["JOB_MEMORY", "8192", "Memory (MB)", "Select amount of memory (in megabytes) for this job"], ["JOB_TIME", "48", "Time (hours)", "Select the time expected (in hours) for this job"], ["GPU_COUNT", "8", "Number of GPUs", "Select the number of GPUs if protocol has been set up to use them"], ["QUEUE_FOR_JOBS", "N", "Use queue for jobs", "Send individual jobs to queue"]]}
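One thing I am not sure about in the template above: the node-list handling was carried over from a PBS script, but under SLURM $SLURM_JOB_NODELIST is a compressed hostname string (e.g. "gpu[04-08]"), not a file, so "cp $SLURM_JOB_NODELIST > ..." and "wc -l < $SLURM_JOB_NODELIST" will not do what the comments say. A minimal sketch of what I believe the SLURM equivalent would be (scontrol show hostnames is a standard SLURM command; %_(JOB_NODEFILE)s is the same Scipion placeholder already used in the template):

    # Expand the compressed node list into one hostname per line and save it
    scontrol show hostnames "$SLURM_JOB_NODELIST" > %_(JOB_NODEFILE)s
    # Number of allocated nodes = number of lines in that file
    NNODES=`wc -l < %_(JOB_NODEFILE)s`
    # SLURM already exports the allocated task count directly
    NPROCS=$SLURM_NTASKS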
scipion.conf:

[PYWORKFLOW]
CONDA_ACTIVATION_CMD = eval "$(/share/apps/software/anconda/bin/conda shell.bash hook)"
SCIPION_DOMAIN = pwem
SCIPION_FONT_NAME = Helvetica
SCIPION_LOG = ~/ScipionUserData/logs/scipion.log
SCIPION_LOGO = scipion_logo.gif
SCIPION_LOGS = ~/ScipionUserData/logs
SCIPION_NOTES_FILE = notes.txt
SCIPION_NOTIFY = True
SCIPION_PLUGIN_REPO_URL = http://scipion.i2pc.es/getplugins/
SCIPION_SOFTWARE = ${SCIPION_HOME}/software
SCIPION_SUPPORT_EMAIL = sc...@cn...
SCIPION_TESTS = ${SCIPION_HOME}/data/tests
SCIPION_TESTS_CMD = scipion3 tests
SCIPION_TESTS_OUTPUT = ~/ScipionUserData/Tests
SCIPION_TMP = ~/ScipionUserData/tmp
SCIPION_URL = http://scipion.cnb.csic.es/downloads/scipion
SCIPION_URL_SOFTWARE = http://scipion.cnb.csic.es/downloads/scipion/software
SCIPION_URL_TESTDATA = http://scipion.cnb.csic.es/downloads/scipion/data/tests
SCIPION_USER_DATA = ~/ScipionUserData
WIZARD_MASK_COLOR = [0.125, 0.909, 0.972]
SCIPION_NOTES_ARGS =
SCIPION_NOTES_PROGRAM =

[PLUGINS]
EM_ROOT = software/em
MAXIT_HOME = %(EM_ROOT)s/maxit-10.1
XMIPP_HOME = %(EM_ROOT)s/xmipp
CUDA_BIN = /share/apps/software/cuda-10.1/bin
CUDA_LIB = /share/apps/software/cuda-10.1/lib64
CHIMERA_HOME = %(EM_ROOT)s/chimerax-1.1
GCTF_HOME = %(EM_ROOT)s/Gctf_v1.18
GAUTOMATCH = Gautomatch_v0.56_sm30-75_cu10.1
MOTIONCOR2_CUDA_LIB = /share/apps/software/cuda-10.1/lib64
RELION_HOME = software/em/relion-3.1.0
RESMAP = ResMap-1.95-cuda-Centos7x64
GCTF_CUDA_LIB = /share/apps/software/cuda-10.1/lib64
MOTIONCOR2_BIN = MotionCor2_1.4.0_Cuda101
CRYOSPARC_HOME =
RESMAP_GPU_LIB = ResMap_krnl-cuda-V8.0.61-sm60_gpu.so
CRYO_PROJECTS_DIR = scipion_projects
GAUTOMATCH_HOME = software/em/gautomatch-0.56
RELION_CUDA_LIB = /share/apps/software/cuda-10.1/lib64
RELION_CUDA_BIN = /share/apps/software/cuda-10.1/bin
CTFFIND4_HOME = software/em/ctffind4-4.1.14
RESMAP_HOME = software/em/resmap-1.95
GAUTOMATCH_CUDA_LIB = /share/apps/software/cuda-10.1/lib64
GCTF = Gctf_v1.18_sm30-75_cu10.1
CISTEM_HOME = software/em/cistem-1.0.0-beta
MOTIONCOR2_HOME = software/em/motioncor2-1.4.0
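Since the same jobs run on the head node but fail on the GPU nodes, my guess is that the environment set up by CONDA_ACTIVATION_CMD above is not reproduced inside the batch jobs, so the xmipp binding never loads there. If I understand the tracebacks below correctly, pwem falls back to a "ghost" stub when the xmippLib binding cannot be imported, which would explain both the "GHOST in place" message and the missing Image.write. A quick check I could run on both the head node and a GPU node (scipion3, sbatch, and the "gpu" partition are from my setup above; this is just an import test, not an official diagnostic):

    # On the head node: should print the path of the compiled binding
    scipion3 python -c "import xmippLib; print(xmippLib.__file__)"

    # Same test on a GPU node through SLURM; if this raises ImportError,
    # the batch environment is missing the xmipp installation
    sbatch -p gpu --wrap='scipion3 python -c "import xmippLib; print(xmippLib.__file__)"'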
> On Dec 30, 2020, at 4:50 PM, Grigory Sharov <sha...@gm...> wrote:
>
> Hi Yangyang, the line
>
> 00016: GHOST in place, read call ignored
>
> means that you don't have xmipp installed, which is still required for all basic operations.
>
> Grigory
>
>
> On Wed, Dec 30, 2020, 08:44 Yangyang Yi <yy...@si...> wrote:
>
> Hi,
> Happy holidays!
>
> Thanks for the help in the last email about the Scipion 2.0 cluster config. That problem is solved: I had set "Use queue for jobs: Y" instead of "N", and some plugins had only downloaded partially. After correcting those parameters and reinstalling the affected plugins, it works.
>
> However, our users now prefer Scipion 3.0, and I found new errors with similar settings. All motion correction jobs work and the result files (micrographs and shifts) are generated in the extra directory, but CTF estimation after motion correction fails. I also imported some motion-corrected micrographs and still could not run CTF (with either Gctf or CTFFIND). The error is similar to the one below:
>
> CTFFIND:
> 00002: Hostname: gpu04
> 00003: PID: 138990
> 00004: pyworkflow: 3.0.8
> 00005: plugin: cistem
> 00006: plugin v: 3.0.9
> 00007: currentDir: /share/home/biotest/stu01/ScipionUserData/projects/test
> 00008: workingDir: Runs/001767_CistemProtCTFFind
> 00009: runMode: Restart
> 00010: MPI: 1
> 00011: threads: 1
> 00012: Starting at step: 1
> 00013: Running steps
> 00014: STARTED: estimateCtfStep, step 1, time 2020-12-30 15:35:57.887435
> 00015: Estimating CTF of micrograph: 1
> 00016: GHOST in place, read call ignored!.
> 00017: Traceback (most recent call last):
> 00018: File "/share/apps/software/anconda/envs/scipion3/lib/python3.8/site-packages/pyworkflow/protocol/protocol.py", line 189, in run
> 00019: self._run()
> 00020: File "/share/apps/software/anconda/envs/scipion3/lib/python3.8/site-packages/pyworkflow/protocol/protocol.py", line 240, in _run
> 00021: resultFiles = self._runFunc()
> 00022: File "/share/apps/software/anconda/envs/scipion3/lib/python3.8/site-packages/pyworkflow/protocol/protocol.py", line 236, in _runFunc
> 00023: return self._func(*self._args)
> 00024: File "/share/apps/software/anconda/envs/scipion3/lib/python3.8/site-packages/pwem/protocols/protocol_micrographs.py", line 247, in estimateCtfStep
> 00025: self._estimateCTF(mic, *args)
> 00026: File "/share/apps/software/anconda/envs/scipion3/lib/python3.8/site-packages/cistem/protocols/protocol_ctffind.py", line 108, in _estimateCTF
> 00027: self._doCtfEstimation(mic)
> 00028: File "/share/apps/software/anconda/envs/scipion3/lib/python3.8/site-packages/cistem/protocols/protocol_ctffind.py", line 78, in _doCtfEstimation
> 00029: ih.convert(micFn, micFnMrc, emlib.DT_FLOAT)
> 00030: File "/share/apps/software/anconda/envs/scipion3/lib/python3.8/site-packages/pwem/emlib/image/image_handler.py", line 170, in convert
> 00031: self._img.write(outputLoc)
> 00032: AttributeError: 'Image' object has no attribute 'write'
> 00033: Protocol failed: 'Image' object has no attribute 'write'
> 00034: FAILED: estimateCtfStep, step 1, time 2020-12-30 15:35:57.909908
>
> Gctf:
> 00002: Hostname: gpu08
> 00003: PID: 101015
> 00004: pyworkflow: 3.0.8
> 00005: plugin: gctf
> 00006: plugin v: 3.0.11
> 00007: currentDir: /share/home/biotest/stu01/ScipionUserData/projects/test
> 00008: workingDir: Runs/001508_ProtGctf
> 00009: runMode: Restart
> 00010: MPI: 1
> 00011: threads: 1
> 00012: Starting at step: 1
> 00013: Running steps
> 00014: STARTED: estimateCtfStep, step 1, time 2020-12-30 16:20:37.950836
> 00015: Estimating CTF of micrograph: 1
> 00016: GHOST in place, read call ignored!.
> 00017: ERROR: Gctf has failed on Runs/001508_ProtGctf/tmp/mic_0001/*.mrc
> 00018: Traceback (most recent call last):
> 00019: File "/share/apps/software/anconda/envs/scipion3/lib/python3.8/site-packages/gctf/protocols/protocol_gctf.py", line 82, in _estimateCtfList
> 00020: ih.convert(micFn, micFnMrc, emlib.DT_FLOAT)
> 00021: File "/share/apps/software/anconda/envs/scipion3/lib/python3.8/site-packages/pwem/emlib/image/image_handler.py", line 170, in convert
> 00022: self._img.write(outputLoc)
> 00023: AttributeError: 'Image' object has no attribute 'write'
> 00024: FINISHED: estimateCtfStep, step 1, time 2020-12-30 16:20:38.060255
>
> Are there any suggestions? Thanks!
>
> _______________________________________________
> scipion-users mailing list
> sci...@li...
> https://lists.sourceforge.net/lists/listinfo/scipion-users