From: Joseph H. <jo...@ga...> - 2018-05-03 04:15:23
Dear Sir:

RELION 2D classification was running, but I stopped it. When I tried to restart it, I got the error message below and could not get RELION 2D running anymore. I also tried setting the number of MPI processes to 1, but it still failed. I use Scipion v1.2, installed from source. The message from run.stdout:

00309: The deprecated forms *will* disappear in a future version of Open MPI.
00310: Please update to the new syntax.
00311: --------------------------------------------------------------------------
00312: --------------------------------------------------------------------------
00313: [[50020,1],0]: A high-performance Open MPI point-to-point messaging module
00314: was unable to find any relevant network interfaces:
00315:
00316: Module: OpenFabrics (openib)
00317: Host: spgpu3
00318:
00319: Another transport will be used instead, although this may result in
00320: lower performance.
00321: --------------------------------------------------------------------------
00322: === RELION MPI setup ===
00323: + Number of MPI processes = 3
00324: + Master (0) runs on host = spgpu3
00325: + Slave 1 runs on host = spgpu3
00326: + Slave 2 runs on host = spgpu3
00327: =================
00328: uniqueHost spgpu3 has 2 ranks.
00329: Slave 1 will distribute threads over devices 1 2 3
00330: Thread 0 on slave 1 mapped to device 1
00331: Slave 2 will distribute threads over devices 1 2 3
00332: Thread 0 on slave 2 mapped to device 1
00333: Device 1 on spgpu3 is split between 2 slaves
00334: Running CPU instructions in double precision.
00335: [spgpu3:17287] 2 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics
00336: [spgpu3:17287] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
00337: + WARNING: Changing psi sampling rate (before oversampling) to 5.625 degrees, for more efficient GPU calculations
00338: Estimating initial noise spectra
00339: 1/ 20 sec ..~~(,_,"> fn_img= 067786@Runs/000695_XmippProtCropResizeParticles/extra/output_images.stk bg_avg= 0.429345 bg_stddev= 0.852297 bg_radius= 14.0174
00340: ERROR:
00341: ERROR: It appears that these images have not been normalised to an average background value of 0 and a stddev value of 1.
00342: Note that the average and stddev values for the background are calculated:
00343: (1) for single particles: outside a circle with the particle diameter
00344: (2) for helical segments: outside a cylinder (tube) with the helical tube diameter
00345: You can use the relion_preprocess program to normalise your images
00346: If you are sure you have normalised the images correctly (also see the RELION Wiki), you can switch off this error message using the --dont_check_norm command line option
00347: File: /usr/local/scipion/software/em/relion-2.1/src/ml_optimiser.cpp line: 1879
00348: --------------------------------------------------------------------------
00349: MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD
00350: with errorcode 1.
00351:
00352: NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
00353: You may or may not see output from other processes, depending on
00354: exactly when Open MPI kills them.
00355: --------------------------------------------------------------------------
00356: Traceback (most recent call last):
00357: File "/usr/local/scipion/pyworkflow/protocol/protocol.py", line 186, in run
00358: self._run()
00359: File "/usr/local/scipion/pyworkflow/protocol/protocol.py", line 233, in _run
00360: resultFiles = self._runFunc()
00361: File "/usr/local/scipion/pyworkflow/protocol/protocol.py", line 229, in _runFunc
00362: return self._func(*self._args)
00363: File "/usr/local/scipion/pyworkflow/em/packages/relion/protocol_base.py", line 880, in runRelionStep
00364: self.runJob(self._getProgram(), params)
00365: File "/usr/local/scipion/pyworkflow/protocol/protocol.py", line 1138, in runJob
00366: self._stepsExecutor.runJob(self._log, program, arguments, **kwargs)
00367: File "/usr/local/scipion/pyworkflow/protocol/executor.py", line 56, in runJob
00368: env=env, cwd=cwd)
00369: File "/usr/local/scipion/pyworkflow/utils/process.py", line 51, in runJob
00370: return runCommand(command, env, cwd)
00371: File "/usr/local/scipion/pyworkflow/utils/process.py", line 65, in runCommand
00372: check_call(command, shell=True, stdout=sys.stdout, stderr=sys.stderr, env=env, cwd=cwd)
00373: File "/usr/local/scipion/software/lib/python2.7/subprocess.py", line 186, in check_call
00374: raise CalledProcessError(retcode, cmd)
00375: CalledProcessError: Command 'mpirun -np 3 -bynode `which relion_refine_mpi` --gpu 1,2,3 --tau2_fudge 2 --scale --dont_combine_weights_via_disc --iter 25 --norm --psi_step 10.0 --ctf --offset_range 5.0 --oversampling 1 --pool 3 --o Runs/000827_ProtRelionClassify2D/extra/relion --i Runs/000827_ProtRelionClassify2D/input_particles.star --particle_diameter 200 --K 200 --flatten_solvent --zero_mask --offset_step 2.0 --angpix 7.134 --j 1' returned non-zero exit status 1
00376: Protocol failed: Command 'mpirun -np 3 -bynode `which relion_refine_mpi` --gpu 1,2,3 --tau2_fudge 2 --scale --dont_combine_weights_via_disc --iter 25 --norm --psi_step 10.0 --ctf --offset_range 5.0 --oversampling 1 --pool 3 --o Runs/000827_ProtRelionClassify2D/extra/relion --i Runs/000827_ProtRelionClassify2D/input_particles.star --particle_diameter 200 --K 200 --flatten_solvent --zero_mask --offset_step 2.0 --angpix 7.134 --j 1' returned non-zero exit status 1
00377: FAILED: runRelionStep, step 2
00378: 2018-05-03 10:35:36.970022

Thanks for your help.

Meng-Chiao (Joseph) Ho
Assistant Research Fellow
Institute of Biological Chemistry, Academia Sinica
No. 128, Sec 2, Academia Road, Nankang, Taipei 115, Taiwan
Tel: 886-2-27855696 ext 3080/3162
Email: jo...@ga...
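P.S. For readers hitting the same error: the normalisation RELION asks for (log lines 00341-00344 above) means the background pixels, i.e. those outside a circle of the particle radius, must have mean 0 and stddev 1. A minimal pure-Python sketch of that convention follows; this is a hypothetical illustration, not RELION's actual code, and in practice the relion_preprocess program mentioned in the message does this for you.

```python
import math

def normalise_particle(image, bg_radius):
    """Rescale a square particle image so that its background --
    pixels farther than bg_radius (in pixels) from the image
    centre -- has mean 0 and stddev 1, the convention the RELION
    error message describes.  `image` is a list of rows of floats."""
    n = len(image)
    c = (n - 1) / 2.0  # image centre
    # Collect background pixels (outside the particle circle).
    bg = [image[y][x]
          for y in range(n) for x in range(n)
          if math.hypot(x - c, y - c) > bg_radius]
    mean = sum(bg) / len(bg)
    var = sum((v - mean) ** 2 for v in bg) / len(bg)
    std = math.sqrt(var) or 1.0  # guard against an all-constant background
    # Subtract the background mean and divide by its stddev everywhere.
    return [[(v - mean) / std for v in row] for row in image]
```

Note that the bg_radius= 14.0174 reported in the log is consistent with the job's own parameters: particle_diameter 200 Å at 7.134 Å/pixel gives 200 / (2 × 7.134) ≈ 14.0 pixels.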