From: Kanika K. <kk...@uc...> - 2016-02-24 22:38:45
Hello,

I am trying localization with the tutorial data set itself. I did not use a hostfile because I am running on only one host. Instead, I specified the host and the number of CPUs with the following command:

mpirun --host 'HostName' -c 4 pytom /usr/local/pytom/bin/localization.py 'PathToJobFile'/job.xml 2 2 2

I get the following error:

It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  opal_init failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[pordoi:29128] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!

On Tue, Feb 23, 2016 at 3:44 PM, Dustin Morado <dus...@gm...> wrote:

> Hi Kanika,
>
> What was the command you ran exactly? It looks like there’s an error with
> your host file. Are you running this on a cluster, or just a workstation
> with openmpi?
>
> If it’s the latter you can simply run it like this:
> mpirun --hostfile localhost -c 8 pytomlocalization.py job.xml 2 2 2
>
> which will send one of each of the 8 chunks of the tomogram to a processor.
>
> If you are on a cluster, you might want to ask the cluster admin if
> there’s a specific submit script you need to run to have access to the
> correct default hostfile.
>
> --
> Cheers,
> Dustin
>
> > Thank you for that. When I run the process with openmpi, I get the
> > following error message:
> >
> > Open RTE detected a parse error in the hostfile:
> > /home/kkhanna/Documents/Pytom/Bacillus_ribosome/job.xml It occured on line
> > number 37 on token 1.
> > --------------------------------------------------------------------------
> > [pordoi:25878] [[4037,0],0] ORTE_ERROR_LOG: Error in file
> > base/rmaps_base_support_fns.c at line 176
> > [pordoi:25878] [[4037,0],0] ORTE_ERROR_LOG: Error in file rmaps_rr.c at
> > line 121
> > [pordoi:25878] [[4037,0],0] ORTE_ERROR_LOG: Error in file
> > base/rmaps_base_map_job.c at line 379
> > [pordoi:25878] 2 more processes have sent help message help-hostfile.txt
> > / parse_error
> > [pordoi:25878] Set MCA parameter "orte_base_help_aggregate" to 0 to see
> > all help / error messages
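
For reference, the earlier parse error names job.xml as the hostfile, which suggests the job file ended up where mpirun expected a hostfile path. Below is a minimal Python sketch of how the single-host command could be assembled so that the job file is only ever passed to the PyTom script. The helper name build_mpirun_command and its defaults are hypothetical, the script path is the one from the command above, and -np is used as the process-count option (the commands in this thread spell it -c). It only builds the argument list and does not address the opal_init failure itself.

```python
import shlex


def build_mpirun_command(job_file, n_procs, splits=(2, 2, 2), host="localhost",
                         script="/usr/local/pytom/bin/localization.py"):
    """Assemble an mpirun argument list for a single-host localization run.

    Hypothetical helper for illustration; the defaults are assumptions
    taken from the commands quoted in this thread.
    """
    if len(splits) != 3:
        raise ValueError("expected three split factors, e.g. (2, 2, 2)")
    return [
        "mpirun",
        "--host", host,        # a host name, not a path to a hostfile
        "-np", str(n_procs),   # number of processes to launch
        "pytom", script,
        job_file,              # the job XML is an argument of the PyTom script
        *[str(s) for s in splits],
    ]


cmd = build_mpirun_command("job.xml", 4)
print(shlex.join(cmd))
```

Keeping every mpirun option ahead of the program name, and the job file after it, makes it harder for job.xml to be picked up as the value of --hostfile.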