From: Yangyang Yi <yy...@si...> - 2021-01-19 09:12:45
Thanks for your reply, and happy new year. I see. I tried installing scipion-em-xmipp again and it reported a successful install, but jobs submitted via SLURM still fail to run, while a small job run directly on the head node works (CPU-only, since our head node has no GPU cards). So perhaps something is wrong in my config files, especially my host.conf? Here are my config files:

host.conf:

[localhost]
PARALLEL_COMMAND = mpirun -np %_(JOB_NODES)d %_(COMMAND)s
NAME = SCIPION_SLURM
MANDATORY = False
SUBMIT_COMMAND = sbatch %_(JOB_SCRIPT)s
CANCEL_COMMAND = scancel %_(JOB_ID)s
CHECK_COMMAND = squeue -h -j %_(JOB_ID)s
SUBMIT_TEMPLATE = #!/bin/bash
    ### Inherit all current environment variables
    #SBATCH --export=ALL
    ### Job name
    #SBATCH -J %_(JOB_NAME)s
    ### Queue name
    #SBATCH -p %_(JOB_QUEUE)s
    ### Standard output and standard error messages
    #SBATCH -o %_(JOB_LOGS)s.out
    #SBATCH -e %_(JOB_LOGS)s.err
    ### Specify the number of tasks and threads per task (ppn) for your job
    #SBATCH --ntasks=%_(JOB_NODES)d
    #SBATCH --cpus-per-task=%_(JOB_THREADS)d
    #SBATCH --mem=%_(JOB_MEMORY)s
    #SBATCH --gres=gpu:%_(GPU_COUNT)s
    ### Tell PBS the anticipated run-time for your job, where walltime=HH:MM:SS
    #SBATCH --time=%_(JOB_TIME)s:00:00
    # Use as working dir the path where qsub was launched
    WORKDIR=$SLURM_SUBMIT_DIR
    #################################
    ### Set environment variable so programs know the running mode is non-interactive
    export XMIPP_IN_QUEUE=1
    ### Switch to the working directory
    cd $WORKDIR
    # Make a copy of PBS_NODEFILE
    cp $SLURM_JOB_NODELIST > %_(JOB_NODEFILE)s
    # Calculate the number of processors allocated to this run.
    NPROCS=`wc -l < $SLURM_JOB_NODELIST`
    # Calculate the number of nodes allocated.
    ###NNODES=`uniq $SLURM_JOB_NODELIST | wc -l`
    ### Display the job context
    echo Running on host `hostname`
    echo Time is `date`
    echo Working directory is `pwd`
    ###echo Using ${NPROCS} processors across ${NNODES} nodes
    echo NODE LIST - config:
    echo $SLURM_JOB_NODELIST
    echo CUDA_VISIBLE_DEVICES: $CUDA_VISIBLE_DEVICES
    module load cuda-10.1
    module load impi-2019.4
    module load relion-3.1.0
    module load java-1.8.0
    #################################
    # echo '%_(JOB_COMMAND)s' >> /tmp/slurm-jobs.log
    %_(JOB_COMMAND)s
    ###find "$SLURM_SUBMIT_DIR" -type f -user $USER -perm 644 -exec chmod 664 {} + ;
# Next variable is used to provide a regex to check if a job is finished on a queue system
QUEUES = {"gpu": [["JOB_MEMORY", "8192", "Memory (MB)", "Select amount of memory (in megabytes) for this job"], ["JOB_TIME", "48", "Time (hours)", "Select the time expected (in hours) for this job"], ["GPU_COUNT", "8", "Number of GPUs", "Select the number of GPUs if protocol has been set up to use them"], ["QUEUE_FOR_JOBS", "N", "Use queue for jobs", "Send individual jobs to queue"]]}
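One thing I am not sure about in the template above: the node-list handling was carried over from a PBS script, but under SLURM $SLURM_JOB_NODELIST is a compressed hostname string (e.g. "gpu[04-08]"), not a file, so "cp $SLURM_JOB_NODELIST > ..." and "wc -l < $SLURM_JOB_NODELIST" will not do what the comments say. A minimal sketch of what I believe the SLURM equivalent would be (scontrol show hostnames is a standard SLURM command; %_(JOB_NODEFILE)s is the same Scipion placeholder already used in the template):

    # Expand the compressed node list into one hostname per line and save it
    scontrol show hostnames "$SLURM_JOB_NODELIST" > %_(JOB_NODEFILE)s
    # Number of allocated nodes = number of lines in that file
    NNODES=`wc -l < %_(JOB_NODEFILE)s`
    # SLURM already exports the allocated task count directly
    NPROCS=$SLURM_NTASKS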
scipion.conf:

[PYWORKFLOW]
CONDA_ACTIVATION_CMD = eval "$(/share/apps/software/anconda/bin/conda shell.bash hook)"
SCIPION_DOMAIN = pwem
SCIPION_FONT_NAME = Helvetica
SCIPION_LOG = ~/ScipionUserData/logs/scipion.log
SCIPION_LOGO = scipion_logo.gif
SCIPION_LOGS = ~/ScipionUserData/logs
SCIPION_NOTES_FILE = notes.txt
SCIPION_NOTIFY = True
SCIPION_PLUGIN_REPO_URL = http://scipion.i2pc.es/getplugins/
SCIPION_SOFTWARE = ${SCIPION_HOME}/software
SCIPION_SUPPORT_EMAIL = sc...@cn...
SCIPION_TESTS = ${SCIPION_HOME}/data/tests
SCIPION_TESTS_CMD = scipion3 tests
SCIPION_TESTS_OUTPUT = ~/ScipionUserData/Tests
SCIPION_TMP = ~/ScipionUserData/tmp
SCIPION_URL = http://scipion.cnb.csic.es/downloads/scipion
SCIPION_URL_SOFTWARE = http://scipion.cnb.csic.es/downloads/scipion/software
SCIPION_URL_TESTDATA = http://scipion.cnb.csic.es/downloads/scipion/data/tests
SCIPION_USER_DATA = ~/ScipionUserData
WIZARD_MASK_COLOR = [0.125, 0.909, 0.972]
SCIPION_NOTES_ARGS =
SCIPION_NOTES_PROGRAM =

[PLUGINS]
EM_ROOT = software/em
MAXIT_HOME = %(EM_ROOT)s/maxit-10.1
XMIPP_HOME = %(EM_ROOT)s/xmipp
CUDA_BIN = /share/apps/software/cuda-10.1/bin
CUDA_LIB = /share/apps/software/cuda-10.1/lib64
CHIMERA_HOME = %(EM_ROOT)s/chimerax-1.1
GCTF_HOME = %(EM_ROOT)s/Gctf_v1.18
GAUTOMATCH = Gautomatch_v0.56_sm30-75_cu10.1
MOTIONCOR2_CUDA_LIB = /share/apps/software/cuda-10.1/lib64
RELION_HOME = software/em/relion-3.1.0
RESMAP = ResMap-1.95-cuda-Centos7x64
GCTF_CUDA_LIB = /share/apps/software/cuda-10.1/lib64
MOTIONCOR2_BIN = MotionCor2_1.4.0_Cuda101
CRYOSPARC_HOME =
RESMAP_GPU_LIB = ResMap_krnl-cuda-V8.0.61-sm60_gpu.so
CRYO_PROJECTS_DIR = scipion_projects
GAUTOMATCH_HOME = software/em/gautomatch-0.56
RELION_CUDA_LIB = /share/apps/software/cuda-10.1/lib64
RELION_CUDA_BIN = /share/apps/software/cuda-10.1/bin
CTFFIND4_HOME = software/em/ctffind4-4.1.14
RESMAP_HOME = software/em/resmap-1.95
GAUTOMATCH_CUDA_LIB = /share/apps/software/cuda-10.1/lib64
GCTF = Gctf_v1.18_sm30-75_cu10.1
CISTEM_HOME = software/em/cistem-1.0.0-beta
MOTIONCOR2_HOME = software/em/motioncor2-1.4.0
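Since the same jobs run on the head node but fail on the GPU nodes, my guess is that the environment set up by CONDA_ACTIVATION_CMD above is not reproduced inside the batch jobs, so the xmipp binding never loads there. If I understand the tracebacks below correctly, pwem falls back to a "ghost" stub when the xmippLib binding cannot be imported, which would explain both the "GHOST in place" message and the missing Image.write. A quick check I could run on both the head node and a GPU node (scipion3, sbatch, and the "gpu" partition are from my setup above; this is just an import test, not an official diagnostic):

    # On the head node: should print the path of the compiled binding
    scipion3 python -c "import xmippLib; print(xmippLib.__file__)"

    # Same test on a GPU node through SLURM; if this raises ImportError,
    # the batch environment is missing the xmipp installation
    sbatch -p gpu --wrap='scipion3 python -c "import xmippLib; print(xmippLib.__file__)"'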
> On Dec 30, 2020, at 4:50 PM, Grigory Sharov <sha...@gm...> wrote:
>
> Hi Yangyang, the line
>
> 00016: GHOST in place, read call ignored
>
> means that you don't have xmipp installed, which is still required for all basic operations.
>
> Grigory
>
>
> On Wed, Dec 30, 2020, 08:44 Yangyang Yi <yy...@si...> wrote:
>
> Hi,
> Happy holidays!
>
> Thanks for the help in the last email about the Scipion 2.0 cluster config. That problem is solved: I had set "Use queue for jobs: Y" instead of "N", and some plugins had only downloaded partially. After correcting those parameters and reinstalling the affected plugins, it works.
>
> However, our users now prefer Scipion 3.0, and I found new errors with similar settings. All motion correction jobs work and the result files (micrographs and shifts) are generated in the extra directory, but CTF estimation after motion correction fails. I also imported some motion-corrected micrographs and still could not run CTF (with either Gctf or CTFFIND). The error is similar to the one below:
>
> CTFFIND:
> 00002: Hostname: gpu04
> 00003: PID: 138990
> 00004: pyworkflow: 3.0.8
> 00005: plugin: cistem
> 00006: plugin v: 3.0.9
> 00007: currentDir: /share/home/biotest/stu01/ScipionUserData/projects/test
> 00008: workingDir: Runs/001767_CistemProtCTFFind
> 00009: runMode: Restart
> 00010: MPI: 1
> 00011: threads: 1
> 00012: Starting at step: 1
> 00013: Running steps
> 00014: STARTED: estimateCtfStep, step 1, time 2020-12-30 15:35:57.887435
> 00015: Estimating CTF of micrograph: 1
> 00016: GHOST in place, read call ignored!.
> 00017: Traceback (most recent call last):
> 00018: File "/share/apps/software/anconda/envs/scipion3/lib/python3.8/site-packages/pyworkflow/protocol/protocol.py", line 189, in run
> 00019: self._run()
> 00020: File "/share/apps/software/anconda/envs/scipion3/lib/python3.8/site-packages/pyworkflow/protocol/protocol.py", line 240, in _run
> 00021: resultFiles = self._runFunc()
> 00022: File "/share/apps/software/anconda/envs/scipion3/lib/python3.8/site-packages/pyworkflow/protocol/protocol.py", line 236, in _runFunc
> 00023: return self._func(*self._args)
> 00024: File "/share/apps/software/anconda/envs/scipion3/lib/python3.8/site-packages/pwem/protocols/protocol_micrographs.py", line 247, in estimateCtfStep
> 00025: self._estimateCTF(mic, *args)
> 00026: File "/share/apps/software/anconda/envs/scipion3/lib/python3.8/site-packages/cistem/protocols/protocol_ctffind.py", line 108, in _estimateCTF
> 00027: self._doCtfEstimation(mic)
> 00028: File "/share/apps/software/anconda/envs/scipion3/lib/python3.8/site-packages/cistem/protocols/protocol_ctffind.py", line 78, in _doCtfEstimation
> 00029: ih.convert(micFn, micFnMrc, emlib.DT_FLOAT)
> 00030: File "/share/apps/software/anconda/envs/scipion3/lib/python3.8/site-packages/pwem/emlib/image/image_handler.py", line 170, in convert
> 00031: self._img.write(outputLoc)
> 00032: AttributeError: 'Image' object has no attribute 'write'
> 00033: Protocol failed: 'Image' object has no attribute 'write'
> 00034: FAILED: estimateCtfStep, step 1, time 2020-12-30 15:35:57.909908
>
> Gctf:
> 00002: Hostname: gpu08
> 00003: PID: 101015
> 00004: pyworkflow: 3.0.8
> 00005: plugin: gctf
> 00006: plugin v: 3.0.11
> 00007: currentDir: /share/home/biotest/stu01/ScipionUserData/projects/test
> 00008: workingDir: Runs/001508_ProtGctf
> 00009: runMode: Restart
> 00010: MPI: 1
> 00011: threads: 1
> 00012: Starting at step: 1
> 00013: Running steps
> 00014: STARTED: estimateCtfStep, step 1, time 2020-12-30 16:20:37.950836
> 00015: Estimating CTF of micrograph: 1
> 00016: GHOST in place, read call ignored!.
> 00017: ERROR: Gctf has failed on Runs/001508_ProtGctf/tmp/mic_0001/*.mrc
> 00018: Traceback (most recent call last):
> 00019: File "/share/apps/software/anconda/envs/scipion3/lib/python3.8/site-packages/gctf/protocols/protocol_gctf.py", line 82, in _estimateCtfList
> 00020: ih.convert(micFn, micFnMrc, emlib.DT_FLOAT)
> 00021: File "/share/apps/software/anconda/envs/scipion3/lib/python3.8/site-packages/pwem/emlib/image/image_handler.py", line 170, in convert
> 00022: self._img.write(outputLoc)
> 00023: AttributeError: 'Image' object has no attribute 'write'
> 00024: FINISHED: estimateCtfStep, step 1, time 2020-12-30 16:20:38.060255
>
> Are there any suggestions? Thanks!
>
> _______________________________________________
> scipion-users mailing list
> sci...@li...
> https://lists.sourceforge.net/lists/listinfo/scipion-users