From: Dustin M. <dus...@gm...> - 2017-10-03 08:56:16
Hi Haixin,

This is a known problem with MPI and PyTom's dynamically linked libraries. It is mentioned in the PyTom FAQ:

http://pytom.org/doc/pytom/faq.html

You need to have the LD_PRELOAD variable set in your shell:

BASH:
export LD_PRELOAD='/usr/local/openmpi/lib/libmpi.so'

CSH:
setenv LD_PRELOAD '/usr/local/openmpi/lib/libmpi.so'

--
Hope this helps,
Dustin Morado
John Briggs Group
MRC LMB

___
Cheers,
Dustin M.

On Tue, Oct 3, 2017 at 4:06 AM, Haixin Sui <su...@ho...> wrote:

> I followed Dustin's advice on the job xml file and I also modified two of
> the basic programs as he suggested. The "localization.py" can now be
> successfully executed with the job.xml without using mpirun. If not working
> with parallel computing, the "localization.py" runs too slow to be
> practically useful.
>
> However, I failed to make the "localization.py" run with openmpi! I am
> not sure if this is a problem with the openmpi environment we have or not.
> I tested the openmpi on both my work station and my laptop using two
> simple testing programs provided by openmpi distribution package. The
> test programs worked well on both. However, when I did mpirun with
> localization.py, I got the same error message on both of the two
> computers. The error message has been included in this email. Any ideas
> about what went wrong?
>
> Thanks for your help in advance!
>
> Haixin Sui
>
> ----------------------------
> hsui@Thinkpad-HR16395:~/subtomo$ mpirun -n 2 localization.py job_test_1.xml 2 2 2
> PyTom v0.971
>
> This license affects the software package PyTom and all the herein
> distributed source / data files.
> Authors:
> Thomas Hrabe
> Yuxiang Chen
> Friedrich Foerster
>
> Copyright (c) 2008-2016
> Max-Planck-Institute for Biochemistry
> Dept. Molecular Structural Biology
> 82152 Martinsried, Germany
> http://www.pytom.org
>
> This program is free software: you can redistribute it and/or modify
> it under the terms of the GNU General Public License as published by
> the Free Software Foundation, either version 3 of the License, or
> (at your option) any later version.
>
> This program is distributed in the hope that it will be useful,
> but WITHOUT ANY WARRANTY; without even the implied warranty of
> MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> GNU General Public License for more details.
>
> The complete license can be obtained from
> http://www.gnu.org/licenses/gpl-3.0.html.
>
> PyTom v0.971
>
> This license affects the software package PyTom and all the herein
> distributed source / data files.
> Authors:
> Thomas Hrabe
> Yuxiang Chen
> Friedrich Foerster
>
> Copyright (c) 2008-2016
> Max-Planck-Institute for Biochemistry
> Dept. Molecular Structural Biology
> 82152 Martinsried, Germany
> http://www.pytom.org
>
> This program is free software: you can redistribute it and/or modify
> it under the terms of the GNU General Public License as published by
> the Free Software Foundation, either version 3 of the License, or
> (at your option) any later version.
>
> This program is distributed in the hope that it will be useful,
> but WITHOUT ANY WARRANTY; without even the implied warranty of
> MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> GNU General Public License for more details.
>
> The complete license can be obtained from
> http://www.gnu.org/licenses/gpl-3.0.html.
>
> *** An error occurred in MPI_Init
> *** on a NULL communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> *** and potentially your MPI job)
> [Thinkpad-HR16395:24468] Local abort before MPI_INIT completed completed
> successfully, but am not able to aggregate error messages, and not able to
> guarantee that all other processes were killed!
> -------------------------------------------------------
> Primary job terminated normally, but 1 process returned
> a non-zero exit code.. Per user-direction, the job has been aborted.
> -------------------------------------------------------
> *** An error occurred in MPI_Init
> *** on a NULL communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> *** and potentially your MPI job)
> [Thinkpad-HR16395:24469] Local abort before MPI_INIT completed completed
> successfully, but am not able to aggregate error messages, and not able to
> guarantee that all other processes were killed!
> --------------------------------------------------------------------------
> mpirun detected that one or more processes exited with non-zero status,
> thus causing
> the job to be terminated. The first process to do so was:
>
> Process name: [[24716,1],1]
> Exit code: 1
> --------------------------------------------------------------------------
>
>
> Preliminary test of our openmpi shows no problem with other testing
> programs.
> ===============================================
> hsui@Thinkpad-HR16395:~/subtomo$ mpirun -n 2 hello_c
> Hello, world, I am 0 of 2, (Open MPI v1.10.2, package: Open MPI
> buildd@lgw01-57 Distribution, ident: 1.10.2, repo rev:
> v1.10.1-145-g799148f, Jan 21, 2016, 126)
> Hello, world, I am 1 of 2, (Open MPI v1.10.2, package: Open MPI
> buildd@lgw01-57 Distribution, ident: 1.10.2, repo rev:
> v1.10.1-145-g799148f, Jan 21, 2016, 126)
>
> hsui@Thinkpad-HR16395:~/subtomo$ mpirun -n 2 ring_c
> Process 0 sending 10 to 1, tag 201 (2 processes in ring)
> Process 0 sent to 1
> Process 0 decremented value: 9
> Process 0 decremented value: 8
> Process 0 decremented value: 7
> Process 0 decremented value: 6
> Process 0 decremented value: 5
> Process 0 decremented value: 4
> Process 0 decremented value: 3
> Process 0 decremented value: 2
> Process 0 decremented value: 1
> Process 0 decremented value: 0
> Process 0 exiting
> Process 1 exiting
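
A note on the LD_PRELOAD advice at the top of this message: the path /usr/local/openmpi/lib/libmpi.so is only correct if Open MPI is installed there. The sketch below is one possible way to check where libmpi.so actually lives before exporting the variable. The helper name find_libmpi is hypothetical, and every candidate directory other than /usr/local/openmpi/lib is an assumption that may need adjusting for your system.

```sh
# find_libmpi DIR...: print the path of the first DIR/libmpi.so that exists.
# Returns non-zero if none of the given directories contains the library.
# (Hypothetical helper, not part of PyTom or Open MPI.)
find_libmpi() {
    for dir in "$@"; do
        if [ -f "$dir/libmpi.so" ]; then
            printf '%s\n' "$dir/libmpi.so"
            return 0
        fi
    done
    return 1
}

# Only the first directory comes from the reply; the others are guesses.
if libmpi=$(find_libmpi /usr/local/openmpi/lib /usr/lib/x86_64-linux-gnu /usr/lib64/openmpi/lib); then
    # Print the line to run rather than exporting it from inside this script.
    echo "export LD_PRELOAD='$libmpi'"
else
    echo "libmpi.so not found; adjust the candidate directories" >&2
fi
```

If a path is printed, run that export line in the same shell (or use the setenv form for CSH) and then retry the mpirun command.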