| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2016 | 2 | 13 | 9 | 4 | 5 | 2 | 8 | 3 | 25 | 7 | 49 | 15 |
| 2017 | 24 | 36 | 53 | 44 | 37 | 34 | 12 | 15 | 14 | 9 | 9 | 7 |
| 2018 | 16 | 9 | 27 | 39 | 8 | 24 | 22 | 11 | 1 | | | |
| 2019 | 4 | 5 | | 1 | 21 | 13 | 31 | 22 | 9 | 19 | 24 | 12 |
| 2020 | 30 | 12 | 16 | 4 | 37 | 17 | 19 | 15 | 26 | 84 | 64 | 55 |
| 2021 | 18 | 58 | 26 | 88 | 51 | 36 | 31 | 37 | 79 | 15 | 29 | 8 |
| 2022 | 5 | 8 | 29 | 21 | 11 | 11 | 18 | 16 | 6 | 10 | 23 | 1 |
| 2023 | 18 | | 4 | | 3 | 10 | 1 | | | 1 | 3 | 5 |
| 2024 | 2 | | | | | 1 | | | | | | |
| 2025 | 1 | | | 5 | | | | | | | | |
From: Grigory S. <sha...@gm...> - 2023-01-17 16:57:48
|
Hi, You most likely have a duplicated rlnSgdStepsizeScheme row in the corresponding run_optimiser.star file. If you find and delete the duplicate, the import of the run_data.star file will work Best regards, Grigory -------------------------------------------------------------------------------- Grigory Sharov, Ph.D. MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, UK. tel. +44 (0) 1223 267228 <+44%201223%20267228> e-mail: gs...@mr... On Tue, Jan 17, 2023 at 4:51 PM Leonardo Talachia Rosa via scipion-users < sci...@li...> wrote: > Dear Scipion team, > I hope you are well. > > I am trying to import a run_data.star file from relion 4.0, using the > Import particles protocol. This star file is an output of a 3D refine done > directly in relion. > The import fails giving the error message "invalid literal for int() with > base 10: '0.076970' " > I checked the .star file, and that is a value in the FOM tab of the first > particle. > The FOM values should not be a problem, since they are assigned in the > moment of picking, right? > I am new to scipion so any ideas would be welcome. > > Kind wishes, > Leonardo > > -- > ************************* > Leonardo TALACHIA ROSA, PhD > PostDoctoral Researcher > Microbiology and Structural Biology > Instituto de Química, Universidade de São Paulo - IQ-USP > Av. Prof. Lineu Prestes, 748 - Butantã, São Paulo - SP > ************************** > _______________________________________________ > scipion-users mailing list > sci...@li... > https://lists.sourceforge.net/lists/listinfo/scipion-users > |
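For anyone hitting the same problem, a quick way to spot such a duplicate is to count how often each _rln label appears in each data block of the STAR file. This is a minimal, generic sketch in plain Python (not a Scipion or RELION tool); the default filename is just an example:

```python
# List any _rln label declared more than once within the same data block
# of a STAR file (e.g. a duplicated rlnSgdStepsizeScheme row in run_optimiser.star).
from collections import Counter, defaultdict
import sys

path = sys.argv[1] if len(sys.argv) > 1 else "run_optimiser.star"

counts = defaultdict(Counter)   # data block name -> label -> number of occurrences
block = ""
with open(path) as fh:
    for line in fh:
        parts = line.split()
        token = parts[0] if parts else ""
        if token.startswith("data_"):
            block = token
        elif token.startswith("_rln"):
            counts[block][token] += 1

for block, labels in counts.items():
    for label, n in labels.items():
        if n > 1:
            print(f"{block}: {label} declared {n} times")
```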
From: Leonardo T. R. <leo...@us...> - 2023-01-17 16:51:21
|
Dear Scipion team, I hope you are well. I am trying to import a run_data.star file from relion 4.0, using the Import particles protocol. This star file is an output of a 3D refine done directly in relion. The import fails giving the error message "invalid literal for int() with base 10: '0.076970' " I checked the .star file, and that is a value in the FOM tab of the first particle. The FOM values should not be a problem, since they are assigned in the moment of picking, right? I am new to scipion so any ideas would be welcome. Kind wishes, Leonardo -- ************************* Leonardo TALACHIA ROSA, PhD PostDoctoral Researcher Microbiology and Structural Biology Instituto de Química, Universidade de São Paulo - IQ-USP Av. Prof. Lineu Prestes, 748 - Butantã, São Paulo - SP ************************** |
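The error message itself is ordinary Python behaviour: a string holding a float cannot be converted directly to an integer, so when the importer expects an integer column and finds the FOM value instead (for example because a duplicated label has shifted the columns, as suggested in the reply above), it fails exactly like this:

```python
# Reproducing the importer's error in isolation: int() refuses float-formatted text.
value = "0.076970"
try:
    int(value)
except ValueError as err:
    print(err)            # invalid literal for int() with base 10: '0.076970'
print(int(float(value)))  # parsing as a float first would succeed (prints 0)
```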
From: Chuchu W. <cc...@st...> - 2023-01-09 21:46:39
|
Dear Scipion Team, Happy New Year! I tried to install Scipion and the PySeg plug-in on our computer cluster, but because of many conflicts, it failed to work. I wonder if you know any Stanford Sherlock users who successfully installed your app (esp. the PySeg plug-in) and ran it smoothly; if so, I would like to ask them to share their installation steps. Best, Chuchu -- Chuchu Wang, Ph.D. Postdoc | Axel T. Brunger Lab Molecular & Cellular Physiology | Stanford Medicine Howard Hughes Medical Institute at Stanford Clark Center Rm. E300C, 318 Campus Drive, CA. 94305 |
From: Yunior C. F. R. <cfo...@cn...> - 2022-12-14 11:33:05
|
Hello everyone, As you may have seen, a new version of cryoSPARC (v4.1) has been released. This version, at the moment, is incompatible with our plugin. I strongly recommend that you do not update cryoSPARC. I'm working to make the plugin compatible with this version as soon as possible. Cheers, Yun |
From: Pablo C. <pc...@cn...> - 2022-11-28 08:47:21
|
I'd also look at https://conda.io/projects/conda/en/latest/user-guide/configuration/use-condarc.html to make the scipion environment "discoverable" by any user on the machine. We have an admin account that manages the installation, and the users don't have privileges to install anything. Still, pip in this case allows users to install packages locally. If you want to prevent this, I think you can do it with this: https://pip.pypa.io/en/stable/user_guide/#user-installs Defining PYTHONUSERBASE= to a protected path? On 27/11/22 15:52, Grigory Sharov wrote: > Dear Peter, > > I don't have any specific instructions for Rocky 9, I suggest you to > follow > https://scipion-em.github.io/docs/release-3.0.0/docs/scipion-modes/how-to-install.html > and CentOS7 notes. If you encounter problems please do post them here. > > Scipion does not provide any user management tools, so you'd have to > use standard unix acl / permissions > > Best regards, > Grigory > > -------------------------------------------------------------------------------- > Grigory Sharov, Ph.D. > > MRC Laboratory of Molecular Biology, > Francis Crick Avenue, > Cambridge Biomedical Campus, > Cambridge CB2 0QH, UK. > tel. +44 (0) 1223 267228 <tel:+44%201223%20267228> > e-mail: gs...@mr... > > > On Thu, Nov 17, 2022 at 12:20 PM Peter Gonzalez via scipion-users > <sci...@li...> wrote: > > Hello - I would like to join your list and get some guidance with > installing Scipion on a Rocky 9 workstation for multiple users. > > Any help appreciated. > > Thank you. > > — > Peter Gonzalez > Computational Technology Manager > Dept. of Physiology and Biophysics > Institute for Computational Biomedicine > Weill Cornell Medical College > 212-746-1457 > > _______________________________________________ > scipion-users mailing list > sci...@li... > https://lists.sourceforge.net/lists/listinfo/scipion-users > > > > _______________________________________________ > scipion-users mailing list > sci...@li... > https://lists.sourceforge.net/lists/listinfo/scipion-users -- Pablo Conesa - *Madrid Scipion <http://scipion.i2pc.es> team* |
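As a side note on the user-install question: the paths that a `pip install --user` would use can be inspected with the standard library, which makes it easy to check what pointing PYTHONUSERBASE at a protected path actually changes. A generic sketch, to be run with the interpreter of the shared Scipion installation:

```python
# Show where per-user pip installs would go for the current interpreter/account.
import os
import site
import sys

print("interpreter:        ", sys.executable)
print("PYTHONUSERBASE:     ", os.environ.get("PYTHONUSERBASE", "<not set>"))
print("user base:          ", site.getuserbase())
print("user site-packages: ", site.getusersitepackages())
print("user site enabled:  ", site.ENABLE_USER_SITE)
```

Setting PYTHONNOUSERSITE=1 in the environment of the shared installation is another stdlib-level way to keep per-user site-packages out of sys.path.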
From: Grigory S. <sha...@gm...> - 2022-11-27 15:02:59
|
Hi, does scipion3 tests relion.tests.test_protocols_3d.TestRelionInitialModel fail as well? Have you verified that your MPI works (without scipion or relion)? Best regards, Grigory -------------------------------------------------------------------------------- Grigory Sharov, Ph.D. MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, UK. tel. +44 (0) 1223 267228 <+44%201223%20267228> e-mail: gs...@mr... On Mon, Nov 21, 2022 at 12:37 PM helder veras <hel...@ho...> wrote: > Hi all! > > Recently, I opened a discussion in this mailing list regarding the > installation of Scipion inside a singularity container, which worked very > well, but now I'm facing a problem that I'm not sure if could be related to > that installation. > I'm trying to run the relion 3D classification protocol in GPU, but I > received the following error message: > (It seems an MPI issue, and if I use only 1 MPI it works. Interestingly, > the 2D classification which also calls the "relion_refine_mpi" program runs > without problems. It seems an issue specific to the 3D classification). > > Does anyone have any clues as to the possible cause of that problem? > > ps: sorry if this is not a scipion-related issue. > > Configuration tested: > 3 MPI + 1 thread > GPU nvidia A100 > UBUNTU 20.04 > cuda-11.7 > gcc version 9.4 > mpirun version 4.0.3 > > > Thank you!! > > Best, > > Helder > > - stderr: > > 00027: [gpu01:3062501] 5 more processes have sent help message > help-mpi-btl-openib-cpc-base.txt / no cpcs for port > 00028: [gpu01:3062501] Set MCA parameter "orte_base_help_aggregate" to 0 > to see all help / error messages > 00029: [gpu01:3062508] *** Process received signal *** > 00030: [gpu01:3062508] Signal: Segmentation fault (11) > 00031: [gpu01:3062508] Signal code: Address not mapped (1) > 00032: [gpu01:3062508] Failing at address: 0x30 > 00033: [gpu01:3062508] [ 0] > /lib/x86_64-linux-gnu/libpthread.so.0(+0x14420)[0x7fffef157420] > 00034: [gpu01:3062508] [ 1] > /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_mtl_ofi.so(ompi_mtl_ofi_progress_no_inline+0x1a2)[0x7fffecb041c2] > 00035: [gpu01:3062508] [ 2] > /lib/x86_64-linux-gnu/libopen-pal.so.40(opal_progress+0x34)[0x7fffeea71854] > 00036: [gpu01:3062508] [ 3] > /lib/x86_64-linux-gnu/libmpi.so.40(ompi_request_default_wait_all+0xe5)[0x7fffef43ce25] > 00037: [gpu01:3062508] [ 4] > /lib/x86_64-linux-gnu/libmpi.so.40(ompi_coll_base_bcast_intra_generic+0x4be)[0x7fffef491d4e] > 00038: [gpu01:3062508] [ 5] > /lib/x86_64-linux-gnu/libmpi.so.40(ompi_coll_base_bcast_intra_pipeline+0xd1)[0x7fffef492061] > 00039: [gpu01:3062508] [ 6] > /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_coll_tuned.so(ompi_coll_tuned_bcast_intra_dec_fixed+0x12e)[0x7fffec0b4dae] > 00040: [gpu01:3062508] [ 7] > /lib/x86_64-linux-gnu/libmpi.so.40(MPI_Bcast+0x120)[0x7fffef454b10] > 00041: [gpu01:3062508] [ 8] > /opt/software/em/relion-4.0/bin/relion_refine_mpi(_ZN7MpiNode16relion_MPI_BcastEPvlP15ompi_datatype_tiP19ompi_communicator_t+0x176)[0x55555565de56] > 00042: [gpu01:3062508] [ 9] > /opt/software/em/relion-4.0/bin/relion_refine_mpi(_ZN14MlOptimiserMpi10initialiseEv+0x178)[0x555555644d48] > 00043: [gpu01:3062508] [10] > /opt/software/em/relion-4.0/bin/relion_refine_mpi(main+0x71)[0x5555555fcf41] > 00044: [gpu01:3062508] [11] > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7fffeebd5083] > 00045: [gpu01:3062508] [12] > /opt/software/em/relion-4.0/bin/relion_refine_mpi(_start+0x2e)[0x55555560026e] > 00046: [gpu01:3062508] *** End of 
error message *** > > > - stdout: > > Logging configured. STDOUT --> > Runs/003851_ProtRelionClassify3D/logs/run.stdout , STDERR --> > Runs/003851_ProtRelionClassify3D/logs/run.stderr > ^[[32mRUNNING PROTOCOL -----------------^[[0m > Protocol starts > Hostname: gpu01.cnpem.local > PID: 3062463 > pyworkflow: 3.0.27 > plugin: relion > plugin v: 4.0.11 > currentDir: /home/helder.ribeiro/ScipionUserData/projects/scipion_teste > workingDir: Runs/003851_ProtRelionClassify3D > runMode: Restart > MPI: 3 > threads: 1 > Starting at step: 1 > Running steps > ^[[35mSTARTED^[[0m: convertInputStep, step 1, time 2022-11-21 > 13:12:45.527978 > Converting set from 'Runs/003330_ProtImportParticles/particles.sqlite' > into 'Runs/003851_ProtRelionClassify3D/input_particles.star' > ** Running command: ^[[32m relion_image_handler --i > Runs/002945_ProtImportVolumes/extra/import_output_volume.mrc --o > Runs/003851_ProtRelionClassify3D/tmp/import_output_volume.00.mrc --angpix > 1.10745 --new_box 220^[[0m > 000/??? sec ~~(,_,"> > [oo]^M 0/ 0 sec > ............................................................~~(,_,"> > ^[[35mFINISHED^[[0m: convertInputStep, step 1, time 2022-11-21 > 13:12:45.850105 > ^[[35mSTARTED^[[0m: runRelionStep, step 2, time 2022-11-21 13:12:45.875931 > ^[[32m mpirun -np 3 -bynode `which relion_refine_mpi` --i > Runs/003851_ProtRelionClassify3D/input_particles.star --particle_diameter > 226 --zero_mask --K 3 --firstiter_cc --ini_high 60.0 --sym c1 > --ref_angpix 1.10745 --ref > Runs/003851_ProtRelionClassify3D/tmp/import_output_volume.00.mrc --norm > --scale --o Runs/003851_ProtRelionClassify3D/extra/relion --oversampling > 1 --flatten_solvent --tau2_fudge 4.0 --iter 25 --pad 2 --healpix_order 2 > --offset_range 5.0 --offset_step 2.0 --dont_combine_weights_via_disc > --pool 3 --gpu --j 1^[[0m > RELION version: 4.0.0-commit-138b9c > Precision: BASE=double > > === RELION MPI setup === > + Number of MPI processes = 3 > + Leader (0) runs on host = gpu01 > + Follower 1 runs on host = gpu01 > + Follower 2 runs on host = gpu01 > ================= > uniqueHost gpu01 has 2 ranks. > GPU-ids not specified for this rank, threads will automatically be mapped > to available devices. > Thread 0 on follower 1 mapped to device 0 > GPU-ids not specified for this rank, threads will automatically be mapped > to available devices. > Thread 0 on follower 2 mapped to device 0 > Device 0 on gpu01 is split between 2 followers > Running CPU instructions in double precision. > Estimating initial noise spectra from 1000 particles > 000/??? sec ~~(,_,"> > > ..... > ^[[35mFAILED^[[0m: runRelionStep, step 2, time 2022-11-21 13:12:52.259808 > *** Last status is failed > ^[[32m------------------- PROTOCOL FAILED (DONE 2/3)^[[0m > > _______________________________________________ > scipion-users mailing list > sci...@li... > https://lists.sourceforge.net/lists/listinfo/scipion-users > |
From: Grigory S. <sha...@gm...> - 2022-11-27 15:00:26
|
Dear Kyrylo, while it seems to be not a scipion problem, you could run a test first: scipion3 tests relion.tests.test_protocols_3d.TestRelionInitialModel If that fails, it may be that your relion/mpi installation is not working. Best regards, Grigory -------------------------------------------------------------------------------- Grigory Sharov, Ph.D. MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, UK. tel. +44 (0) 1223 267228 <+44%201223%20267228> e-mail: gs...@mr... On Thu, Nov 24, 2022 at 3:13 PM Bisikalo, Kyrylo < kyr...@he...> wrote: > Hello, > > > > I have been trying to process a small dataset (~800 particles), but the 3D > initial model keeps failing mid-run. It displays the following error: > “Protocol failed: Command ' relion_refine --i > Runs/000899_ProtRelionInitialModel/input_particles.star --particle_diameter > 682 --ctf --zero_mask --o Runs/000899_ProtRelionInitialModel/extra/relion > --oversampling 1 --pad 1 --tau2_fudge 3.0 --flatten_solvent --sym i2 > --iter 150 --grad --K 1 --denovo_3dref --healpix_order 1 --offset_range 6 > --offset_step 2 --auto_sampling --dont_combine_weights_via_disc > --preread_images --pool 3 --gpu --j 2' died with <Signals.SIGSEGV: 11>.” > > > > What can be done to amend this? > > I have attached the log files to this email. > > > > Faithfully yours, > > Kyrylo Bisikalo > > Erasmus Mundus Joint Master’s Degree program > > “Advanced Spectroscopy in Chemistry” > _______________________________________________ > scipion-users mailing list > sci...@li... > https://lists.sourceforge.net/lists/listinfo/scipion-users > |
From: Grigory S. <sha...@gm...> - 2022-11-27 14:52:38
|
Dear Peter, I don't have any specific instructions for Rocky 9, I suggest you follow https://scipion-em.github.io/docs/release-3.0.0/docs/scipion-modes/how-to-install.html and the CentOS 7 notes. If you encounter problems, please do post them here. Scipion does not provide any user management tools, so you'd have to use standard Unix ACLs / permissions. Best regards, Grigory -------------------------------------------------------------------------------- Grigory Sharov, Ph.D. MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, UK. tel. +44 (0) 1223 267228 <+44%201223%20267228> e-mail: gs...@mr... On Thu, Nov 17, 2022 at 12:20 PM Peter Gonzalez via scipion-users < sci...@li...> wrote: > Hello - I would like to join your list and get some guidance with > installing Scipion on a Rocky 9 workstation for multiple users. > > Any help appreciated. > > Thank you. > > — > Peter Gonzalez > Computational Technology Manager > Dept. of Physiology and Biophysics > Institute for Computational Biomedicine > Weill Cornell Medical College > 212-746-1457 > > _______________________________________________ > scipion-users mailing list > sci...@li... > https://lists.sourceforge.net/lists/listinfo/scipion-users > |
From: Bisikalo, K. <kyr...@he...> - 2022-11-24 15:12:44
|
Hello, I have been trying to process a small dataset (~800 particles), but the 3D initial model keeps failing mid-run. It displays the following error: "Protocol failed: Command ' relion_refine --i Runs/000899_ProtRelionInitialModel/input_particles.star --particle_diameter 682 --ctf --zero_mask --o Runs/000899_ProtRelionInitialModel/extra/relion --oversampling 1 --pad 1 --tau2_fudge 3.0 --flatten_solvent --sym i2 --iter 150 --grad --K 1 --denovo_3dref --healpix_order 1 --offset_range 6 --offset_step 2 --auto_sampling --dont_combine_weights_via_disc --preread_images --pool 3 --gpu --j 2' died with <Signals.SIGSEGV: 11>." What can be done to amend this? I have attached the log files to this email. Faithfully yours, Kyrylo Bisikalo Erasmus Mundus Joint Master's Degree program "Advanced Spectroscopy in Chemistry" |
From: Pablo C. <pc...@cn...> - 2022-11-22 11:20:11
|
Hi! The easiest way is to export the project to SCIPION_HOME/config/you-name-it.json.template. Then, when running scipion3 template, you should see it listed as "local-you-name-it". On 22/11/22 11:48, Lugmayr, Wolfgang wrote: > Hi, > > as a starting point for new users, I want to make a predefined template for the tomo workflow. > > The idea is that users import a recommended workflow as "empty" template with the steps. > Then they edit the steps with their own data and follow the "boxes". > > How can this be done? > Can I export a existing project as template and put it somewhere so other users can import it? > Let's assume all work on the same Scipion installation so missing plugins is not an issue on this specific one. > > Cheers, > Wolfgang > > > _______________________________________________ > scipion-users mailing list > sci...@li... > https://lists.sourceforge.net/lists/listinfo/scipion-users -- Pablo Conesa - *Madrid Scipion <http://scipion.i2pc.es> team* |
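If it helps, the "export, then drop into SCIPION_HOME/config" step can be scripted. A minimal sketch, assuming the workflow has already been exported from the project GUI to a JSON file; all names and paths below are examples only:

```python
# Install an exported Scipion workflow as a local template by copying it into
# SCIPION_HOME/config with a .json.template extension (per the message above).
import os
import shutil

scipion_home = os.environ.get("SCIPION_HOME", "/opt/scipion")   # example default, adjust to your install
exported = "my-tomo-workflow.json"                              # example: workflow exported from the project
target = os.path.join(scipion_home, "config", "tomo-starter.json.template")

os.makedirs(os.path.dirname(target), exist_ok=True)
shutil.copyfile(exported, target)
print("template installed as", target)
```

After that, `scipion3 template` should list it as "local-tomo-starter", following the naming described above.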
From: Lugmayr, W. <w.l...@uk...> - 2022-11-22 11:04:07
|
Hi, as a starting point for new users, I want to make a predefined template for the tomo workflow. The idea is that users import a recommended workflow as "empty" template with the steps. Then they edit the steps with their own data and follow the "boxes". How can this be done? Can I export a existing project as template and put it somewhere so other users can import it? Let's assume all work on the same Scipion installation so missing plugins is not an issue on this specific one. Cheers, Wolfgang |
From: helder v. <hel...@ho...> - 2022-11-21 12:37:22
|
Hi all! Recently, I opened a discussion in this mailing list regarding the installation of Scipion inside a singularity container, which worked very well, but now I'm facing a problem that I'm not sure if could be related to that installation. I'm trying to run the relion 3D classification protocol in GPU, but I received the following error message: (It seems an MPI issue, and if I use only 1 MPI it works. Interestingly, the 2D classification which also calls the "relion_refine_mpi" program runs without problems. It seems an issue specific to the 3D classification). Does anyone have any clues as to the possible cause of that problem? ps: sorry if this is not a scipion-related issue. Configuration tested: 3 MPI + 1 thread GPU nvidia A100 UBUNTU 20.04 cuda-11.7 gcc version 9.4 mpirun version 4.0.3 Thank you!! Best, Helder * stderr: 00027: [gpu01:3062501] 5 more processes have sent help message help-mpi-btl-openib-cpc-base.txt / no cpcs for port 00028: [gpu01:3062501] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages 00029: [gpu01:3062508] *** Process received signal *** 00030: [gpu01:3062508] Signal: Segmentation fault (11) 00031: [gpu01:3062508] Signal code: Address not mapped (1) 00032: [gpu01:3062508] Failing at address: 0x30 00033: [gpu01:3062508] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x14420)[0x7fffef157420] 00034: [gpu01:3062508] [ 1] /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_mtl_ofi.so(ompi_mtl_ofi_progress_no_inline+0x1a2)[0x7fffecb041c2] 00035: [gpu01:3062508] [ 2] /lib/x86_64-linux-gnu/libopen-pal.so.40(opal_progress+0x34)[0x7fffeea71854] 00036: [gpu01:3062508] [ 3] /lib/x86_64-linux-gnu/libmpi.so.40(ompi_request_default_wait_all+0xe5)[0x7fffef43ce25] 00037: [gpu01:3062508] [ 4] /lib/x86_64-linux-gnu/libmpi.so.40(ompi_coll_base_bcast_intra_generic+0x4be)[0x7fffef491d4e] 00038: [gpu01:3062508] [ 5] /lib/x86_64-linux-gnu/libmpi.so.40(ompi_coll_base_bcast_intra_pipeline+0xd1)[0x7fffef492061] 00039: [gpu01:3062508] [ 6] /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_coll_tuned.so(ompi_coll_tuned_bcast_intra_dec_fixed+0x12e)[0x7fffec0b4dae] 00040: [gpu01:3062508] [ 7] /lib/x86_64-linux-gnu/libmpi.so.40(MPI_Bcast+0x120)[0x7fffef454b10] 00041: [gpu01:3062508] [ 8] /opt/software/em/relion-4.0/bin/relion_refine_mpi(_ZN7MpiNode16relion_MPI_BcastEPvlP15ompi_datatype_tiP19ompi_communicator_t+0x176)[0x55555565de56] 00042: [gpu01:3062508] [ 9] /opt/software/em/relion-4.0/bin/relion_refine_mpi(_ZN14MlOptimiserMpi10initialiseEv+0x178)[0x555555644d48] 00043: [gpu01:3062508] [10] /opt/software/em/relion-4.0/bin/relion_refine_mpi(main+0x71)[0x5555555fcf41] 00044: [gpu01:3062508] [11] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7fffeebd5083] 00045: [gpu01:3062508] [12] /opt/software/em/relion-4.0/bin/relion_refine_mpi(_start+0x2e)[0x55555560026e] 00046: [gpu01:3062508] *** End of error message *** * stdout: Logging configured. 
STDOUT --> Runs/003851_ProtRelionClassify3D/logs/run.stdout , STDERR --> Runs/003851_ProtRelionClassify3D/logs/run.stderr ^[[32mRUNNING PROTOCOL -----------------^[[0m Protocol starts Hostname: gpu01.cnpem.local PID: 3062463 pyworkflow: 3.0.27 plugin: relion plugin v: 4.0.11 currentDir: /home/helder.ribeiro/ScipionUserData/projects/scipion_teste workingDir: Runs/003851_ProtRelionClassify3D runMode: Restart MPI: 3 threads: 1 Starting at step: 1 Running steps ^[[35mSTARTED^[[0m: convertInputStep, step 1, time 2022-11-21 13:12:45.527978 Converting set from 'Runs/003330_ProtImportParticles/particles.sqlite' into 'Runs/003851_ProtRelionClassify3D/input_particles.star' ** Running command: ^[[32m relion_image_handler --i Runs/002945_ProtImportVolumes/extra/import_output_volume.mrc --o Runs/003851_ProtRelionClassify3D/tmp/import_output_volume.00.mrc --angpix 1.10745 --new_box 220^[[0m 000/??? sec ~~(,_,"> [oo]^M 0/ 0 sec ............................................................~~(,_,"> ^[[35mFINISHED^[[0m: convertInputStep, step 1, time 2022-11-21 13:12:45.850105 ^[[35mSTARTED^[[0m: runRelionStep, step 2, time 2022-11-21 13:12:45.875931 ^[[32m mpirun -np 3 -bynode `which relion_refine_mpi` --i Runs/003851_ProtRelionClassify3D/input_particles.star --particle_diameter 226 --zero_mask --K 3 --firstiter_cc --ini_high 60.0 --sym c1 --ref_angpix 1.10745 --ref Runs/003851_ProtRelionClassify3D/tmp/import_output_volume.00.mrc --norm --scale --o Runs/003851_ProtRelionClassify3D/extra/relion --oversampling 1 --flatten_solvent --tau2_fudge 4.0 --iter 25 --pad 2 --healpix_order 2 --offset_range 5.0 --offset_step 2.0 --dont_combine_weights_via_disc --pool 3 --gpu --j 1^[[0m RELION version: 4.0.0-commit-138b9c Precision: BASE=double === RELION MPI setup === + Number of MPI processes = 3 + Leader (0) runs on host = gpu01 + Follower 1 runs on host = gpu01 + Follower 2 runs on host = gpu01 ================= uniqueHost gpu01 has 2 ranks. GPU-ids not specified for this rank, threads will automatically be mapped to available devices. Thread 0 on follower 1 mapped to device 0 GPU-ids not specified for this rank, threads will automatically be mapped to available devices. Thread 0 on follower 2 mapped to device 0 Device 0 on gpu01 is split between 2 followers Running CPU instructions in double precision. Estimating initial noise spectra from 1000 particles 000/??? sec ~~(,_,"> ..... ^[[35mFAILED^[[0m: runRelionStep, step 2, time 2022-11-21 13:12:52.259808 *** Last status is failed ^[[32m------------------- PROTOCOL FAILED (DONE 2/3)^[[0m |
From: Peter G. <pe...@me...> - 2022-11-16 21:26:15
|
Hello - I would like to join your list and get some guidance with installing Scipion on a Rocky 9 workstation for multiple users. Any help appreciated. Thank you. — Peter Gonzalez Computational Technology Manager Dept. of Physiology and Biophysics Institute for Computational Biomedicine Weill Cornell Medical College 212-746-1457 |
From: Alberto G. M. <alb...@cn...> - 2022-11-16 06:45:49
|
Dear all, The Xmipp team has released the new version of Xmipp. Please welcome Xmipp 3.22.11-Iris. For this version we have updated 139 files; the main new features are listed here: https://github.com/I2PC/xmipp/wiki/Release-notes#release-32211---Iris, including some speed improvements and performance optimizations. Feel free to install it (https://github.com/I2PC/xmipp#standalone-installation), test it, report any problems (https://github.com/I2PC/xmipp/issues) if they arise and, above all, enjoy it! You can update scipion-em-xmipp with the plugin manager or with: scipion3 installp -p scipion-em-xmipp Xmipp-Team -- Alberto García Mena National Center for Biotechnology - CSIC Facility software support & Xmipp team @ Biocomputing Unit / I2PC |
From: Pablo C. <pc...@cn...> - 2022-11-10 19:12:16
|
Hi Helder! Good job, and thank you for sharing the singularity file. If you have a public url where (like a github repository) we can link to it in our site under the "Scipion ecosystem" menu. I'm glad it worked. All the best, Pablo On 10/11/22 19:14, helder veras wrote: > Hi Pablo! > > Your comment about the virtualenv was great! The container was not > accessing the correct python. I just exported the right scipion python > in the container and now everything is working!! > So now I'm executing the container from a GUI node and I'm able to > launch the scipion interface. After setting up the protocols, I could > successfully launch the job to the target queues. > > I've attached the new singularity definition file just in case someone > would be interested in testing/using it (I'll also improve the command > lines to make it easier for the users to launch the container in the > cluster and I can share it) > > Thank you (and the others) so much for all the help! > > Best, > > Helder > > ------------------------------------------------------------------------ > *De:* Pablo Conesa <pc...@cn...> > *Enviado:* quinta-feira, 10 de novembro de 2022 07:36 > *Para:* sci...@li... > <sci...@li...> > *Assunto:* Re: [scipion-users] scipion - singularity - HPC > > It may be related... here are some comments... > > > when you run: > > > python3 -m scipioninstaller -noXmipp -noAsk /opt > > > the installer is going to create an environment (conda environemnt if > conda is found otherwise virtualenv, log should give you a hint) for > scipion. We do not want to use System's python. Therefore, my guess is > that there is an scipion environment and later installp command should > be fine. > > > What it does not make sense is to: > > > pip install scipion-pyworkflow scipion-em > > > This is going to your system python, probably having an "empty" > scipion core installation in the system. This is probably what brings > the "Legacy" error. > > > Your call to the GPU node is ending up happening in the "empty > installation" on the system's python. > > > So, how through slurm, the call should end up in the right environment > on the node? > > > I think our calls have always an absolute path to the python of the > environment. This is meant to "define" the environment to use. > > > Does this work for the case of a virtualenv environment? Not sure. > > > Also, how environment variables are passed from slurm to the node may > be affecting. (note, I'm not an expert in Slurm). > > > First thing I'd do would be to verify which python is being called in > the node: system's one with pyworkflow and scipion-em or the correct > one created by the installer. > > > Can you see the command sent to slurm and received by the node? > > I'd also check if the installer created and environment (conda? or > virtualenv one?) > > > Let's hunt this! ;-) > > > > On 10/11/22 12:33, helder veras wrote: >> Hi Pablo! >> >> Thank you for the suggestion! >> >> I've tested to execute the protocols command in the login node (the >> node I'm currently running the GUI) and tested to execute the sbatch >> file that calls the protocols command and that runs on the computing >> node. Both returned the relion plugins as expected. >> There're two things that I noted during the scipion installation in >> the singularity container that maybe could be related to this problem. >> >> 1. Since I'm installing scipion inside the container, I didn't use a >> conda environment. 
For some reason, is that conda environment >> necessary for the correct installation of scipion and plugins? >> 2. After installation, I noted some errors related to the >> scipion-pyworkflow and scipion-em. The modules were not found >> when executing the protocols. So I installed them directly from >> pip inside the container. Do you think that could be a problem? >> >> I've attached the singularity definition file (a text file), as you >> can see this file contains the commands that I used to install >> scipion and plugins inside the container. >> >> Best, >> >> Helder >> ------------------------------------------------------------------------ >> *De:* Pablo Conesa <pc...@cn...> <mailto:pc...@cn...> >> *Enviado:* quinta-feira, 10 de novembro de 2022 03:38 >> *Para:* sci...@li... >> <mailto:sci...@li...> >> <sci...@li...> >> <mailto:sci...@li...> >> *Assunto:* Re: [scipion-users] scipion - singularity - HPC >> >> This could happen if you have a different node installation where you >> are missing some plugins. >> >> >> I.e: You have relion plugin in the "login node", but it is not >> present in the computing node? >> >> >> One way to check is to compare the output of: >> >> >> scipion3 protocols >> >> >> from both machines. Are they equal? >> >> >> On 8/11/22 21:06, helder veras wrote: >>> Hi Pablo! >>> >>> Thank you for your thoughts and comments! >>> >>> I tested to modify the hosts.conf file to execute the singularity >>> from the sbatch scripts. It seems the previous errors were solved, >>> but now I got this one: >>> >>> *run.stdout:* >>> RUNNING PROTOCOL -----------------ESC[0m >>> Protocol starts >>> Hostname: gpu01.cnpem.local >>> PID: 2044741 >>> pyworkflow: 3.0.27 >>> plugin: orphan >>> currentDir: /home/helder.ribeiro/ScipionUserData/projects/scipion_teste >>> workingDir: Runs/000641_ProtRelionClassify2D >>> runMode: Restart >>> MPI: 1 >>> threads: 1 >>> 'NoneType' object has no attribute 'Plugin' LegacyProtocol >>> installation couldn't be validated. Possible cause could be a >>> configuration issue. Try to run scipion config. >>> ESC[31mProtocol has validation errors: >>> 'NoneType' object has no attribute 'Plugin' LegacyProtocol >>> installation couldn't be validated. Possible cause could be a >>> configuration issue. Try to run scipion config.ESC[0m >>> ESC[32m------------------- PROTOCOL FAILED (DONE 0/0)ESC[0m >>> >>> Do you have any idea what could be causing this error? >>> >>> Best, >>> >>> Helder >>> ------------------------------------------------------------------------ >>> *De:* Pablo Conesa <pc...@cn...> <mailto:pc...@cn...> >>> *Enviado:* sexta-feira, 4 de novembro de 2022 03:12 >>> *Para:* sci...@li... >>> <mailto:sci...@li...> >>> <sci...@li...> >>> <mailto:sci...@li...> >>> *Assunto:* Re: [scipion-users] scipion - singularity - HPC >>> >>> Hi Helder! >>> >>> >>> I got no experience with singularity, but I guess you have a cluster >>> of singuarity nodes? >>> >>> >>> Nodes should have scipion installed as well in the same way(paths) >>> as the "login node". >>> >>> >>> I guess the challenge here is to make slurm "talk to other >>> singularity nodes"? 
>>> >>> >>> Regarding the error ...: >>> >>> >>> var/spool/slurmd/slurmd/job02147/slurm_script: line 28: gpu01: No >>> such file or directory >>> >>> uniq: gpu01: No such file or directory >>> python3: can't open file >>> '/opt/.scipion3/lib/python3.8/site-packages/pyworkflow/apps/pw_protocol_run.py': >>> [Errno 2] No such file or directory >>> >>> I only understand the last line (not sure if the first 2 are a >>> consequence of the 3rd one). That path look correct, providing you >>> have a "Scipion virtualenv" installation at /opt. >>> >>> >>> On 3/11/22 19:36, helder veras wrote: >>>> Hi Mohamad! >>>> >>>> Thank you for your reply! Yes, I'm using the host configuration >>>> file to launch slurm. >>>> Let me provide more details about my issue: >>>> >>>> I built a singularity container that has scipion and all the >>>> required dependencies and programs installed as well. This >>>> container works fine and I tested it on a desktop machine and on an >>>> HPC node without the queue option as well. Programs inside scipion >>>> are correctly executed and everything works fine. >>>> To be able to launch scipion using the queue option with slurm I >>>> had to bind the slurm/murge paths to the container and export some >>>> paths (just as presented in >>>> https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:usage_of_slurm_within_a_singularity_container >>>> <https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:usage_of_slurm_within_a_singularity_container> ) >>>> ( I also included slurm user on the container). By doing this >>>> Scipion was able to see the queue (which I changed in hosts.conf >>>> file) and successfully launch the job to the queue. The problem is >>>> that the sbatch script calls the pw_protocol_run.py that is inside >>>> the container, which raises the error in .err file: >>>> >>>> /var/spool/slurmd/slurmd/job02147/slurm_script: line 28: gpu01: No >>>> such file or directory >>>> uniq: gpu01: No such file or directory >>>> python3: can't open file >>>> '/opt/.scipion3/lib/python3.8/site-packages/pyworkflow/apps/pw_protocol_run.py': >>>> [Errno 2] No such file or directory >>>> >>>> I think the problem is that the slurm is trying to execute the >>>> script that is only available inside the container. >>>> Usage of Slurm within a Singularity Container - GWDG >>>> <https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:usage_of_slurm_within_a_singularity_container> >>>> Starting a singularity container is quite cumbersome for the >>>> operating system of the host and, in case too many requests to >>>> start a singularity container are received at the same time, it >>>> might fail. >>>> info.gwdg.de >>>> >>>> >>>> Best, >>>> >>>> Helder Ribeiro >>>> >>>> ------------------------------------------------------------------------ >>>> *De:* Mohamad HARASTANI <moh...@so...> >>>> <mailto:moh...@so...> >>>> *Enviado:* quinta-feira, 3 de novembro de 2022 09:29 >>>> *Para:* Mailing list for Scipion users >>>> <sci...@li...> >>>> <mailto:sci...@li...> >>>> *Assunto:* Re: [scipion-users] scipion - singularity - HPC >>>> Hello Helder, >>>> >>>> Have you taken a look at the host configuration here >>>> (https://scipion-em.github.io/docs/release-3.0.0/docs/scipion-modes/host-configuration.html >>>> <https://scipion-em.github.io/docs/release-3.0.0/docs/scipion-modes/host-configuration.html>)? 
>>>> >>>> Best of luck, >>>> Mohamad >>>> >>>> ------------------------------------------------------------------------ >>>> *From: *"Helder Veras Ribeiro Filho" >>>> <hel...@ln...> <mailto:hel...@ln...> >>>> *To: *sci...@li... >>>> <mailto:sci...@li...> >>>> *Sent: *Wednesday, November 2, 2022 5:07:53 PM >>>> *Subject: *[scipion-users] scipion - singularity - HPC >>>> >>>> Hello scipion group! >>>> >>>> I'm trying to launch Scipion from a singularity container in an HPC >>>> with the slurm as a scheduler. The container works fine and I'm >>>> able to execute Scipion routines correctly without using a queue. >>>> The problem is when I try to send Scipion jobs using the queue in >>>> the Scipion interface. I suppose that it is a slurm/singularity >>>> configuration problem. >>>> Could anyone who was successful in sending jobs to queue from a >>>> singularity launched scipion help me with some tips? >>>> >>>> Best, >>>> >>>> Helder >>>> >>>> *Helder Veras Ribeiro Filho, PhD*** >>>> Brazilian Biosciences National Laboratory - LNBio >>>> Brazilian Center for Research in Energy and Materials - CNPEM >>>> 10,000 Giuseppe Maximo Scolfaro St. >>>> Campinas, SP - Brazil13083-100 >>>> +55(19) 3512-1255 >>>> >>>> >>>> Aviso Legal: Esta mensagem e seus anexos podem conter informações >>>> confidenciais e/ou de uso restrito. Observe atentamente seu >>>> conteúdo e considere eventual consulta ao remetente antes de >>>> copiá-la, divulgá-la ou distribuí-la. Se você recebeu esta mensagem >>>> por engano, por favor avise o remetente e apague-a imediatamente. >>>> >>>> Disclaimer: This email and its attachments may contain confidential >>>> and/or privileged information. Observe its content carefully and >>>> consider possible querying to the sender before copying, disclosing >>>> or distributing it. If you have received this email by mistake, >>>> please notify the sender and delete it immediately. >>>> >>>> >>>> >>>> _______________________________________________ >>>> scipion-users mailing list >>>> sci...@li... >>>> <mailto:sci...@li...> >>>> https://lists.sourceforge.net/lists/listinfo/scipion-users >>>> <https://lists.sourceforge.net/lists/listinfo/scipion-users> >>>> >>>> >>>> _______________________________________________ >>>> scipion-users mailing list >>>> sci...@li... <mailto:sci...@li...> >>>> https://lists.sourceforge.net/lists/listinfo/scipion-users <https://lists.sourceforge.net/lists/listinfo/scipion-users> >>> -- >>> Pablo Conesa - *Madrid Scipion <http://scipion.i2pc.es> team* >>> >>> >>> _______________________________________________ >>> scipion-users mailing list >>> sci...@li... <mailto:sci...@li...> >>> https://lists.sourceforge.net/lists/listinfo/scipion-users <https://lists.sourceforge.net/lists/listinfo/scipion-users> >> -- >> Pablo Conesa - *Madrid Scipion <http://scipion.i2pc.es> team* >> >> >> _______________________________________________ >> scipion-users mailing list >> sci...@li... <mailto:sci...@li...> >> https://lists.sourceforge.net/lists/listinfo/scipion-users <https://lists.sourceforge.net/lists/listinfo/scipion-users> > -- > Pablo Conesa - *Madrid Scipion <http://scipion.i2pc.es> team* > > > _______________________________________________ > scipion-users mailing list > sci...@li... > https://lists.sourceforge.net/lists/listinfo/scipion-users -- Pablo Conesa - *Madrid Scipion <http://scipion.i2pc.es> team* |
From: helder v. <hel...@ho...> - 2022-11-10 18:30:39
|
Hi Pablo! Your comment about the virtualenv was great! The container was not accessing the correct python. I just exported the right scipion python in the container and now everything is working!! So now I'm executing the container from a GUI node and I'm able to launch the scipion interface. After setting up the protocols, I could successfully launch the job to the target queues. I've attached the new singularity definition file just in case someone would be interested in testing/using it (I'll also improve the command lines to make it easier for the users to launch the container in the cluster and I can share it) Thank you (and the others) so much for all the help! Best, Helder ________________________________ De: Pablo Conesa <pc...@cn...> Enviado: quinta-feira, 10 de novembro de 2022 07:36 Para: sci...@li... <sci...@li...> Assunto: Re: [scipion-users] scipion - singularity - HPC It may be related... here are some comments... when you run: python3 -m scipioninstaller -noXmipp -noAsk /opt the installer is going to create an environment (conda environemnt if conda is found otherwise virtualenv, log should give you a hint) for scipion. We do not want to use System's python. Therefore, my guess is that there is an scipion environment and later installp command should be fine. What it does not make sense is to: pip install scipion-pyworkflow scipion-em This is going to your system python, probably having an "empty" scipion core installation in the system. This is probably what brings the "Legacy" error. Your call to the GPU node is ending up happening in the "empty installation" on the system's python. So, how through slurm, the call should end up in the right environment on the node? I think our calls have always an absolute path to the python of the environment. This is meant to "define" the environment to use. Does this work for the case of a virtualenv environment? Not sure. Also, how environment variables are passed from slurm to the node may be affecting. (note, I'm not an expert in Slurm). First thing I'd do would be to verify which python is being called in the node: system's one with pyworkflow and scipion-em or the correct one created by the installer. Can you see the command sent to slurm and received by the node? I'd also check if the installer created and environment (conda? or virtualenv one?) Let's hunt this! ;-) On 10/11/22 12:33, helder veras wrote: Hi Pablo! Thank you for the suggestion! I've tested to execute the protocols command in the login node (the node I'm currently running the GUI) and tested to execute the sbatch file that calls the protocols command and that runs on the computing node. Both returned the relion plugins as expected. There're two things that I noted during the scipion installation in the singularity container that maybe could be related to this problem. 1. Since I'm installing scipion inside the container, I didn't use a conda environment. For some reason, is that conda environment necessary for the correct installation of scipion and plugins? 2. After installation, I noted some errors related to the scipion-pyworkflow and scipion-em. The modules were not found when executing the protocols. So I installed them directly from pip inside the container. Do you think that could be a problem? I've attached the singularity definition file (a text file), as you can see this file contains the commands that I used to install scipion and plugins inside the container. 
Best, Helder ________________________________ De: Pablo Conesa <pc...@cn...><mailto:pc...@cn...> Enviado: quinta-feira, 10 de novembro de 2022 03:38 Para: sci...@li...<mailto:sci...@li...> <sci...@li...><mailto:sci...@li...> Assunto: Re: [scipion-users] scipion - singularity - HPC This could happen if you have a different node installation where you are missing some plugins. I.e: You have relion plugin in the "login node", but it is not present in the computing node? One way to check is to compare the output of: scipion3 protocols from both machines. Are they equal? On 8/11/22 21:06, helder veras wrote: Hi Pablo! Thank you for your thoughts and comments! I tested to modify the hosts.conf file to execute the singularity from the sbatch scripts. It seems the previous errors were solved, but now I got this one: run.stdout: RUNNING PROTOCOL -----------------ESC[0m Protocol starts Hostname: gpu01.cnpem.local PID: 2044741 pyworkflow: 3.0.27 plugin: orphan currentDir: /home/helder.ribeiro/ScipionUserData/projects/scipion_teste workingDir: Runs/000641_ProtRelionClassify2D runMode: Restart MPI: 1 threads: 1 'NoneType' object has no attribute 'Plugin' LegacyProtocol installation couldn't be validated. Possible cause could be a configuration issue. Try to run scipion config. ESC[31mProtocol has validation errors: 'NoneType' object has no attribute 'Plugin' LegacyProtocol installation couldn't be validated. Possible cause could be a configuration issue. Try to run scipion config.ESC[0m ESC[32m------------------- PROTOCOL FAILED (DONE 0/0)ESC[0m Do you have any idea what could be causing this error? Best, Helder ________________________________ De: Pablo Conesa <pc...@cn...><mailto:pc...@cn...> Enviado: sexta-feira, 4 de novembro de 2022 03:12 Para: sci...@li...<mailto:sci...@li...> <sci...@li...><mailto:sci...@li...> Assunto: Re: [scipion-users] scipion - singularity - HPC Hi Helder! I got no experience with singularity, but I guess you have a cluster of singuarity nodes? Nodes should have scipion installed as well in the same way(paths) as the "login node". I guess the challenge here is to make slurm "talk to other singularity nodes"? Regarding the error ...: var/spool/slurmd/slurmd/job02147/slurm_script: line 28: gpu01: No such file or directory uniq: gpu01: No such file or directory python3: can't open file '/opt/.scipion3/lib/python3.8/site-packages/pyworkflow/apps/pw_protocol_run.py': [Errno 2] No such file or directory I only understand the last line (not sure if the first 2 are a consequence of the 3rd one). That path look correct, providing you have a "Scipion virtualenv" installation at /opt. On 3/11/22 19:36, helder veras wrote: Hi Mohamad! Thank you for your reply! Yes, I'm using the host configuration file to launch slurm. Let me provide more details about my issue: I built a singularity container that has scipion and all the required dependencies and programs installed as well. This container works fine and I tested it on a desktop machine and on an HPC node without the queue option as well. Programs inside scipion are correctly executed and everything works fine. To be able to launch scipion using the queue option with slurm I had to bind the slurm/murge paths to the container and export some paths (just as presented in https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:usage_of_slurm_within_a_singularity_container ) ( I also included slurm user on the container). 
By doing this Scipion was able to see the queue (which I changed in hosts.conf file) and successfully launch the job to the queue. The problem is that the sbatch script calls the pw_protocol_run.py that is inside the container, which raises the error in .err file: /var/spool/slurmd/slurmd/job02147/slurm_script: line 28: gpu01: No such file or directory uniq: gpu01: No such file or directory python3: can't open file '/opt/.scipion3/lib/python3.8/site-packages/pyworkflow/apps/pw_protocol_run.py': [Errno 2] No such file or directory I think the problem is that the slurm is trying to execute the script that is only available inside the container. Usage of Slurm within a Singularity Container - GWDG<https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:usage_of_slurm_within_a_singularity_container> Starting a singularity container is quite cumbersome for the operating system of the host and, in case too many requests to start a singularity container are received at the same time, it might fail. info.gwdg.de Best, Helder Ribeiro ________________________________ De: Mohamad HARASTANI <moh...@so...><mailto:moh...@so...> Enviado: quinta-feira, 3 de novembro de 2022 09:29 Para: Mailing list for Scipion users <sci...@li...><mailto:sci...@li...> Assunto: Re: [scipion-users] scipion - singularity - HPC Hello Helder, Have you taken a look at the host configuration here (https://scipion-em.github.io/docs/release-3.0.0/docs/scipion-modes/host-configuration.html)? Best of luck, Mohamad ________________________________ From: "Helder Veras Ribeiro Filho" <hel...@ln...><mailto:hel...@ln...> To: sci...@li...<mailto:sci...@li...> Sent: Wednesday, November 2, 2022 5:07:53 PM Subject: [scipion-users] scipion - singularity - HPC Hello scipion group! I'm trying to launch Scipion from a singularity container in an HPC with the slurm as a scheduler. The container works fine and I'm able to execute Scipion routines correctly without using a queue. The problem is when I try to send Scipion jobs using the queue in the Scipion interface. I suppose that it is a slurm/singularity configuration problem. Could anyone who was successful in sending jobs to queue from a singularity launched scipion help me with some tips? Best, Helder Helder Veras Ribeiro Filho, PhD Brazilian Biosciences National Laboratory - LNBio Brazilian Center for Research in Energy and Materials - CNPEM 10,000 Giuseppe Maximo Scolfaro St. Campinas, SP - Brazil 13083-100 +55(19) 3512-1255 Aviso Legal: Esta mensagem e seus anexos podem conter informações confidenciais e/ou de uso restrito. Observe atentamente seu conteúdo e considere eventual consulta ao remetente antes de copiá-la, divulgá-la ou distribuí-la. Se você recebeu esta mensagem por engano, por favor avise o remetente e apague-a imediatamente. Disclaimer: This email and its attachments may contain confidential and/or privileged information. Observe its content carefully and consider possible querying to the sender before copying, disclosing or distributing it. If you have received this email by mistake, please notify the sender and delete it immediately. 
_______________________________________________ scipion-users mailing list sci...@li...<mailto:sci...@li...> https://lists.sourceforge.net/lists/listinfo/scipion-users _______________________________________________ scipion-users mailing list sci...@li...<mailto:sci...@li...> https://lists.sourceforge.net/lists/listinfo/scipion-users -- Pablo Conesa - Madrid Scipion<http://scipion.i2pc.es> team _______________________________________________ scipion-users mailing list sci...@li...<mailto:sci...@li...> https://lists.sourceforge.net/lists/listinfo/scipion-users -- Pablo Conesa - Madrid Scipion<http://scipion.i2pc.es> team _______________________________________________ scipion-users mailing list sci...@li...<mailto:sci...@li...> https://lists.sourceforge.net/lists/listinfo/scipion-users -- Pablo Conesa - Madrid Scipion<http://scipion.i2pc.es> team |
From: Pablo C. <pc...@cn...> - 2022-11-10 12:37:03
|
It may be related... here are some comments... when you run: python3 -m scipioninstaller -noXmipp -noAsk /opt the installer is going to create an environment (conda environemnt if conda is found otherwise virtualenv, log should give you a hint) for scipion. We do not want to use System's python. Therefore, my guess is that there is an scipion environment and later installp command should be fine. What it does not make sense is to: pip install scipion-pyworkflow scipion-em This is going to your system python, probably having an "empty" scipion core installation in the system. This is probably what brings the "Legacy" error. Your call to the GPU node is ending up happening in the "empty installation" on the system's python. So, how through slurm, the call should end up in the right environment on the node? I think our calls have always an absolute path to the python of the environment. This is meant to "define" the environment to use. Does this work for the case of a virtualenv environment? Not sure. Also, how environment variables are passed from slurm to the node may be affecting. (note, I'm not an expert in Slurm). First thing I'd do would be to verify which python is being called in the node: system's one with pyworkflow and scipion-em or the correct one created by the installer. Can you see the command sent to slurm and received by the node? I'd also check if the installer created and environment (conda? or virtualenv one?) Let's hunt this! ;-) On 10/11/22 12:33, helder veras wrote: > Hi Pablo! > > Thank you for the suggestion! > > I've tested to execute the protocols command in the login node (the > node I'm currently running the GUI) and tested to execute the sbatch > file that calls the protocols command and that runs on the computing > node. Both returned the relion plugins as expected. > There're two things that I noted during the scipion installation in > the singularity container that maybe could be related to this problem. > > 1. Since I'm installing scipion inside the container, I didn't use a > conda environment. For some reason, is that conda environment > necessary for the correct installation of scipion and plugins? > 2. After installation, I noted some errors related to the > scipion-pyworkflow and scipion-em. The modules were not found when > executing the protocols. So I installed them directly from pip > inside the container. Do you think that could be a problem? > > I've attached the singularity definition file (a text file), as you > can see this file contains the commands that I used to install scipion > and plugins inside the container. > > Best, > > Helder > ------------------------------------------------------------------------ > *De:* Pablo Conesa <pc...@cn...> > *Enviado:* quinta-feira, 10 de novembro de 2022 03:38 > *Para:* sci...@li... > <sci...@li...> > *Assunto:* Re: [scipion-users] scipion - singularity - HPC > > This could happen if you have a different node installation where you > are missing some plugins. > > > I.e: You have relion plugin in the "login node", but it is not present > in the computing node? > > > One way to check is to compare the output of: > > > scipion3 protocols > > > from both machines. Are they equal? > > > On 8/11/22 21:06, helder veras wrote: >> Hi Pablo! >> >> Thank you for your thoughts and comments! >> >> I tested to modify the hosts.conf file to execute the singularity >> from the sbatch scripts. 
It seems the previous errors were solved, >> but now I got this one: >> >> *run.stdout:* >> RUNNING PROTOCOL -----------------ESC[0m >> Protocol starts >> Hostname: gpu01.cnpem.local >> PID: 2044741 >> pyworkflow: 3.0.27 >> plugin: orphan >> currentDir: /home/helder.ribeiro/ScipionUserData/projects/scipion_teste >> workingDir: Runs/000641_ProtRelionClassify2D >> runMode: Restart >> MPI: 1 >> threads: 1 >> 'NoneType' object has no attribute 'Plugin' LegacyProtocol >> installation couldn't be validated. Possible cause could be a >> configuration issue. Try to run scipion config. >> ESC[31mProtocol has validation errors: >> 'NoneType' object has no attribute 'Plugin' LegacyProtocol >> installation couldn't be validated. Possible cause could be a >> configuration issue. Try to run scipion config.ESC[0m >> ESC[32m------------------- PROTOCOL FAILED (DONE 0/0)ESC[0m >> >> Do you have any idea what could be causing this error? >> >> Best, >> >> Helder >> ------------------------------------------------------------------------ >> *De:* Pablo Conesa <pc...@cn...> <mailto:pc...@cn...> >> *Enviado:* sexta-feira, 4 de novembro de 2022 03:12 >> *Para:* sci...@li... >> <mailto:sci...@li...> >> <sci...@li...> >> <mailto:sci...@li...> >> *Assunto:* Re: [scipion-users] scipion - singularity - HPC >> >> Hi Helder! >> >> >> I got no experience with singularity, but I guess you have a cluster >> of singuarity nodes? >> >> >> Nodes should have scipion installed as well in the same way(paths) as >> the "login node". >> >> >> I guess the challenge here is to make slurm "talk to other >> singularity nodes"? >> >> >> Regarding the error ...: >> >> >> var/spool/slurmd/slurmd/job02147/slurm_script: line 28: gpu01: No >> such file or directory >> >> uniq: gpu01: No such file or directory >> python3: can't open file >> '/opt/.scipion3/lib/python3.8/site-packages/pyworkflow/apps/pw_protocol_run.py': >> [Errno 2] No such file or directory >> >> I only understand the last line (not sure if the first 2 are a >> consequence of the 3rd one). That path look correct, providing you >> have a "Scipion virtualenv" installation at /opt. >> >> >> On 3/11/22 19:36, helder veras wrote: >>> Hi Mohamad! >>> >>> Thank you for your reply! Yes, I'm using the host configuration file >>> to launch slurm. >>> Let me provide more details about my issue: >>> >>> I built a singularity container that has scipion and all the >>> required dependencies and programs installed as well. This container >>> works fine and I tested it on a desktop machine and on an HPC node >>> without the queue option as well. Programs inside scipion are >>> correctly executed and everything works fine. >>> To be able to launch scipion using the queue option with slurm I had >>> to bind the slurm/murge paths to the container and export some paths >>> (just as presented in >>> https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:usage_of_slurm_within_a_singularity_container >>> <https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:usage_of_slurm_within_a_singularity_container> ) >>> ( I also included slurm user on the container). By doing this >>> Scipion was able to see the queue (which I changed in hosts.conf >>> file) and successfully launch the job to the queue. 
The problem is >>> that the sbatch script calls the pw_protocol_run.py that is inside >>> the container, which raises the error in .err file: >>> >>> /var/spool/slurmd/slurmd/job02147/slurm_script: line 28: gpu01: No >>> such file or directory >>> uniq: gpu01: No such file or directory >>> python3: can't open file >>> '/opt/.scipion3/lib/python3.8/site-packages/pyworkflow/apps/pw_protocol_run.py': >>> [Errno 2] No such file or directory >>> >>> I think the problem is that the slurm is trying to execute the >>> script that is only available inside the container. >>> Usage of Slurm within a Singularity Container - GWDG >>> <https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:usage_of_slurm_within_a_singularity_container> >>> Starting a singularity container is quite cumbersome for the >>> operating system of the host and, in case too many requests to start >>> a singularity container are received at the same time, it might fail. >>> info.gwdg.de >>> >>> >>> Best, >>> >>> Helder Ribeiro >>> >>> ------------------------------------------------------------------------ >>> *De:* Mohamad HARASTANI <moh...@so...> >>> <mailto:moh...@so...> >>> *Enviado:* quinta-feira, 3 de novembro de 2022 09:29 >>> *Para:* Mailing list for Scipion users >>> <sci...@li...> >>> <mailto:sci...@li...> >>> *Assunto:* Re: [scipion-users] scipion - singularity - HPC >>> Hello Helder, >>> >>> Have you taken a look at the host configuration here >>> (https://scipion-em.github.io/docs/release-3.0.0/docs/scipion-modes/host-configuration.html >>> <https://scipion-em.github.io/docs/release-3.0.0/docs/scipion-modes/host-configuration.html>)? >>> >>> Best of luck, >>> Mohamad >>> >>> ------------------------------------------------------------------------ >>> *From: *"Helder Veras Ribeiro Filho" <hel...@ln...> >>> <mailto:hel...@ln...> >>> *To: *sci...@li... >>> <mailto:sci...@li...> >>> *Sent: *Wednesday, November 2, 2022 5:07:53 PM >>> *Subject: *[scipion-users] scipion - singularity - HPC >>> >>> Hello scipion group! >>> >>> I'm trying to launch Scipion from a singularity container in an HPC >>> with the slurm as a scheduler. The container works fine and I'm able >>> to execute Scipion routines correctly without using a queue. The >>> problem is when I try to send Scipion jobs using the queue in the >>> Scipion interface. I suppose that it is a slurm/singularity >>> configuration problem. >>> Could anyone who was successful in sending jobs to queue from a >>> singularity launched scipion help me with some tips? >>> >>> Best, >>> >>> Helder >>> >>> *Helder Veras Ribeiro Filho, PhD*** >>> Brazilian Biosciences National Laboratory - LNBio >>> Brazilian Center for Research in Energy and Materials - CNPEM >>> 10,000 Giuseppe Maximo Scolfaro St. >>> Campinas, SP - Brazil13083-100 >>> +55(19) 3512-1255 >>> >>> >>> Aviso Legal: Esta mensagem e seus anexos podem conter informações >>> confidenciais e/ou de uso restrito. Observe atentamente seu conteúdo >>> e considere eventual consulta ao remetente antes de copiá-la, >>> divulgá-la ou distribuí-la. Se você recebeu esta mensagem por >>> engano, por favor avise o remetente e apague-a imediatamente. >>> >>> Disclaimer: This email and its attachments may contain confidential >>> and/or privileged information. Observe its content carefully and >>> consider possible querying to the sender before copying, disclosing >>> or distributing it. If you have received this email by mistake, >>> please notify the sender and delete it immediately. 
>>> >>> >>> >>> _______________________________________________ >>> scipion-users mailing list >>> sci...@li... >>> <mailto:sci...@li...> >>> https://lists.sourceforge.net/lists/listinfo/scipion-users >>> <https://lists.sourceforge.net/lists/listinfo/scipion-users> >>> >>> >>> _______________________________________________ >>> scipion-users mailing list >>> sci...@li... <mailto:sci...@li...> >>> https://lists.sourceforge.net/lists/listinfo/scipion-users <https://lists.sourceforge.net/lists/listinfo/scipion-users> >> -- >> Pablo Conesa - *Madrid Scipion <http://scipion.i2pc.es> team* >> >> >> _______________________________________________ >> scipion-users mailing list >> sci...@li... <mailto:sci...@li...> >> https://lists.sourceforge.net/lists/listinfo/scipion-users <https://lists.sourceforge.net/lists/listinfo/scipion-users> > -- > Pablo Conesa - *Madrid Scipion <http://scipion.i2pc.es> team* > > > _______________________________________________ > scipion-users mailing list > sci...@li... > https://lists.sourceforge.net/lists/listinfo/scipion-users -- Pablo Conesa - *Madrid Scipion <http://scipion.i2pc.es> team* |
From: helder v. <hel...@ho...> - 2022-11-10 12:17:16
|
Hi Laura! Great!! Thank you! I'll take a look at these docker files! I'll be very helpful! ________________________________ De: Laura <lde...@cn...> Enviado: quinta-feira, 10 de novembro de 2022 07:03 Para: Mailing list for Scipion users <sci...@li...>; helder veras <hel...@ho...> Assunto: Re: [scipion-users] scipion - singularity - HPC Hi Helder, in our group we have not experience with singularity but I have prepared a similar installation with docker-slurm. Dockerfiles are here<https://github.com/I2PC/scipion-docker/>, one for master and one for worker, please have a look at the master-image/Dockerfile and the hosts.conf file, maybe it gives you some hint. Our installation works fine. best regards Laura On 10/11/22 12:33, helder veras wrote: Hi Pablo! Thank you for the suggestion! I've tested to execute the protocols command in the login node (the node I'm currently running the GUI) and tested to execute the sbatch file that calls the protocols command and that runs on the computing node. Both returned the relion plugins as expected. There're two things that I noted during the scipion installation in the singularity container that maybe could be related to this problem. 1. Since I'm installing scipion inside the container, I didn't use a conda environment. For some reason, is that conda environment necessary for the correct installation of scipion and plugins? 2. After installation, I noted some errors related to the scipion-pyworkflow and scipion-em. The modules were not found when executing the protocols. So I installed them directly from pip inside the container. Do you think that could be a problem? I've attached the singularity definition file (a text file), as you can see this file contains the commands that I used to install scipion and plugins inside the container. Best, Helder ________________________________ De: Pablo Conesa <pc...@cn...><mailto:pc...@cn...> Enviado: quinta-feira, 10 de novembro de 2022 03:38 Para: sci...@li...<mailto:sci...@li...> <sci...@li...><mailto:sci...@li...> Assunto: Re: [scipion-users] scipion - singularity - HPC This could happen if you have a different node installation where you are missing some plugins. I.e: You have relion plugin in the "login node", but it is not present in the computing node? One way to check is to compare the output of: scipion3 protocols from both machines. Are they equal? On 8/11/22 21:06, helder veras wrote: Hi Pablo! Thank you for your thoughts and comments! I tested to modify the hosts.conf file to execute the singularity from the sbatch scripts. It seems the previous errors were solved, but now I got this one: run.stdout: RUNNING PROTOCOL -----------------ESC[0m Protocol starts Hostname: gpu01.cnpem.local PID: 2044741 pyworkflow: 3.0.27 plugin: orphan currentDir: /home/helder.ribeiro/ScipionUserData/projects/scipion_teste workingDir: Runs/000641_ProtRelionClassify2D runMode: Restart MPI: 1 threads: 1 'NoneType' object has no attribute 'Plugin' LegacyProtocol installation couldn't be validated. Possible cause could be a configuration issue. Try to run scipion config. ESC[31mProtocol has validation errors: 'NoneType' object has no attribute 'Plugin' LegacyProtocol installation couldn't be validated. Possible cause could be a configuration issue. Try to run scipion config.ESC[0m ESC[32m------------------- PROTOCOL FAILED (DONE 0/0)ESC[0m Do you have any idea what could be causing this error? 
Best, Helder ________________________________ De: Pablo Conesa <pc...@cn...><mailto:pc...@cn...> Enviado: sexta-feira, 4 de novembro de 2022 03:12 Para: sci...@li...<mailto:sci...@li...> <sci...@li...><mailto:sci...@li...> Assunto: Re: [scipion-users] scipion - singularity - HPC Hi Helder! I got no experience with singularity, but I guess you have a cluster of singuarity nodes? Nodes should have scipion installed as well in the same way(paths) as the "login node". I guess the challenge here is to make slurm "talk to other singularity nodes"? Regarding the error ...: var/spool/slurmd/slurmd/job02147/slurm_script: line 28: gpu01: No such file or directory uniq: gpu01: No such file or directory python3: can't open file '/opt/.scipion3/lib/python3.8/site-packages/pyworkflow/apps/pw_protocol_run.py': [Errno 2] No such file or directory I only understand the last line (not sure if the first 2 are a consequence of the 3rd one). That path look correct, providing you have a "Scipion virtualenv" installation at /opt. On 3/11/22 19:36, helder veras wrote: Hi Mohamad! Thank you for your reply! Yes, I'm using the host configuration file to launch slurm. Let me provide more details about my issue: I built a singularity container that has scipion and all the required dependencies and programs installed as well. This container works fine and I tested it on a desktop machine and on an HPC node without the queue option as well. Programs inside scipion are correctly executed and everything works fine. To be able to launch scipion using the queue option with slurm I had to bind the slurm/murge paths to the container and export some paths (just as presented in https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:usage_of_slurm_within_a_singularity_container ) ( I also included slurm user on the container). By doing this Scipion was able to see the queue (which I changed in hosts.conf file) and successfully launch the job to the queue. The problem is that the sbatch script calls the pw_protocol_run.py that is inside the container, which raises the error in .err file: /var/spool/slurmd/slurmd/job02147/slurm_script: line 28: gpu01: No such file or directory uniq: gpu01: No such file or directory python3: can't open file '/opt/.scipion3/lib/python3.8/site-packages/pyworkflow/apps/pw_protocol_run.py': [Errno 2] No such file or directory I think the problem is that the slurm is trying to execute the script that is only available inside the container. Usage of Slurm within a Singularity Container - GWDG<https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:usage_of_slurm_within_a_singularity_container> Starting a singularity container is quite cumbersome for the operating system of the host and, in case too many requests to start a singularity container are received at the same time, it might fail. info.gwdg.de Best, Helder Ribeiro ________________________________ De: Mohamad HARASTANI <moh...@so...><mailto:moh...@so...> Enviado: quinta-feira, 3 de novembro de 2022 09:29 Para: Mailing list for Scipion users <sci...@li...><mailto:sci...@li...> Assunto: Re: [scipion-users] scipion - singularity - HPC Hello Helder, Have you taken a look at the host configuration here (https://scipion-em.github.io/docs/release-3.0.0/docs/scipion-modes/host-configuration.html)? 
Best of luck, Mohamad ________________________________ From: "Helder Veras Ribeiro Filho" <hel...@ln...><mailto:hel...@ln...> To: sci...@li...<mailto:sci...@li...> Sent: Wednesday, November 2, 2022 5:07:53 PM Subject: [scipion-users] scipion - singularity - HPC Hello scipion group! I'm trying to launch Scipion from a singularity container in an HPC with the slurm as a scheduler. The container works fine and I'm able to execute Scipion routines correctly without using a queue. The problem is when I try to send Scipion jobs using the queue in the Scipion interface. I suppose that it is a slurm/singularity configuration problem. Could anyone who was successful in sending jobs to queue from a singularity launched scipion help me with some tips? Best, Helder Helder Veras Ribeiro Filho, PhD Brazilian Biosciences National Laboratory - LNBio Brazilian Center for Research in Energy and Materials - CNPEM 10,000 Giuseppe Maximo Scolfaro St. Campinas, SP - Brazil 13083-100 +55(19) 3512-1255 Aviso Legal: Esta mensagem e seus anexos podem conter informações confidenciais e/ou de uso restrito. Observe atentamente seu conteúdo e considere eventual consulta ao remetente antes de copiá-la, divulgá-la ou distribuí-la. Se você recebeu esta mensagem por engano, por favor avise o remetente e apague-a imediatamente. Disclaimer: This email and its attachments may contain confidential and/or privileged information. Observe its content carefully and consider possible querying to the sender before copying, disclosing or distributing it. If you have received this email by mistake, please notify the sender and delete it immediately. _______________________________________________ scipion-users mailing list sci...@li...<mailto:sci...@li...> https://lists.sourceforge.net/lists/listinfo/scipion-users _______________________________________________ scipion-users mailing list sci...@li...<mailto:sci...@li...> https://lists.sourceforge.net/lists/listinfo/scipion-users -- Pablo Conesa - Madrid Scipion<http://scipion.i2pc.es> team _______________________________________________ scipion-users mailing list sci...@li...<mailto:sci...@li...> https://lists.sourceforge.net/lists/listinfo/scipion-users -- Pablo Conesa - Madrid Scipion<http://scipion.i2pc.es> team _______________________________________________ scipion-users mailing list sci...@li...<mailto:sci...@li...> https://lists.sourceforge.net/lists/listinfo/scipion-users |
From: Yunior C. F. R. <cfo...@cn...> - 2022-11-10 12:14:39
|
Hello everyone, I have released a new version of the plugin that is compatible with the 4.0.X versions of cryoSPARC. In this new version the refinement protocols do not generate the FSC, because the CS team has tightened security and that data is currently not accessible to the plugin. We have already contacted them and they will provide a way to access it in one of their upcoming releases or patches. Sorry for the inconvenience, but I could not wait any longer to release the plugin. As soon as access is granted, the FSC will be available again. Best regards, Yun |
From: Laura <lde...@cn...> - 2022-11-10 12:04:06
|
Hi Helder, in our group we have not experience with singularity but I have prepared a similar installation with docker-slurm. Dockerfiles are here <https://github.com/I2PC/scipion-docker/>, one for master and one for worker, please have a look at the master-image/Dockerfile and the hosts.conf file, maybe it gives you some hint. Our installation works fine. best regards Laura On 10/11/22 12:33, helder veras wrote: > Hi Pablo! > > Thank you for the suggestion! > > I've tested to execute the protocols command in the login node (the > node I'm currently running the GUI) and tested to execute the sbatch > file that calls the protocols command and that runs on the computing > node. Both returned the relion plugins as expected. > There're two things that I noted during the scipion installation in > the singularity container that maybe could be related to this problem. > > 1. Since I'm installing scipion inside the container, I didn't use a > conda environment. For some reason, is that conda environment > necessary for the correct installation of scipion and plugins? > 2. After installation, I noted some errors related to the > scipion-pyworkflow and scipion-em. The modules were not found when > executing the protocols. So I installed them directly from pip > inside the container. Do you think that could be a problem? > > I've attached the singularity definition file (a text file), as you > can see this file contains the commands that I used to install scipion > and plugins inside the container. > > Best, > > Helder > ------------------------------------------------------------------------ > *De:* Pablo Conesa <pc...@cn...> > *Enviado:* quinta-feira, 10 de novembro de 2022 03:38 > *Para:* sci...@li... > <sci...@li...> > *Assunto:* Re: [scipion-users] scipion - singularity - HPC > > This could happen if you have a different node installation where you > are missing some plugins. > > > I.e: You have relion plugin in the "login node", but it is not present > in the computing node? > > > One way to check is to compare the output of: > > > scipion3 protocols > > > from both machines. Are they equal? > > > On 8/11/22 21:06, helder veras wrote: >> Hi Pablo! >> >> Thank you for your thoughts and comments! >> >> I tested to modify the hosts.conf file to execute the singularity >> from the sbatch scripts. It seems the previous errors were solved, >> but now I got this one: >> >> *run.stdout:* >> RUNNING PROTOCOL -----------------ESC[0m >> Protocol starts >> Hostname: gpu01.cnpem.local >> PID: 2044741 >> pyworkflow: 3.0.27 >> plugin: orphan >> currentDir: /home/helder.ribeiro/ScipionUserData/projects/scipion_teste >> workingDir: Runs/000641_ProtRelionClassify2D >> runMode: Restart >> MPI: 1 >> threads: 1 >> 'NoneType' object has no attribute 'Plugin' LegacyProtocol >> installation couldn't be validated. Possible cause could be a >> configuration issue. Try to run scipion config. >> ESC[31mProtocol has validation errors: >> 'NoneType' object has no attribute 'Plugin' LegacyProtocol >> installation couldn't be validated. Possible cause could be a >> configuration issue. Try to run scipion config.ESC[0m >> ESC[32m------------------- PROTOCOL FAILED (DONE 0/0)ESC[0m >> >> Do you have any idea what could be causing this error? >> >> Best, >> >> Helder >> ------------------------------------------------------------------------ >> *De:* Pablo Conesa <pc...@cn...> <mailto:pc...@cn...> >> *Enviado:* sexta-feira, 4 de novembro de 2022 03:12 >> *Para:* sci...@li... 
>> <mailto:sci...@li...> >> <sci...@li...> >> <mailto:sci...@li...> >> *Assunto:* Re: [scipion-users] scipion - singularity - HPC >> >> Hi Helder! >> >> >> I got no experience with singularity, but I guess you have a cluster >> of singuarity nodes? >> >> >> Nodes should have scipion installed as well in the same way(paths) as >> the "login node". >> >> >> I guess the challenge here is to make slurm "talk to other >> singularity nodes"? >> >> >> Regarding the error ...: >> >> >> var/spool/slurmd/slurmd/job02147/slurm_script: line 28: gpu01: No >> such file or directory >> >> uniq: gpu01: No such file or directory >> python3: can't open file >> '/opt/.scipion3/lib/python3.8/site-packages/pyworkflow/apps/pw_protocol_run.py': >> [Errno 2] No such file or directory >> >> I only understand the last line (not sure if the first 2 are a >> consequence of the 3rd one). That path look correct, providing you >> have a "Scipion virtualenv" installation at /opt. >> >> >> On 3/11/22 19:36, helder veras wrote: >>> Hi Mohamad! >>> >>> Thank you for your reply! Yes, I'm using the host configuration file >>> to launch slurm. >>> Let me provide more details about my issue: >>> >>> I built a singularity container that has scipion and all the >>> required dependencies and programs installed as well. This container >>> works fine and I tested it on a desktop machine and on an HPC node >>> without the queue option as well. Programs inside scipion are >>> correctly executed and everything works fine. >>> To be able to launch scipion using the queue option with slurm I had >>> to bind the slurm/murge paths to the container and export some paths >>> (just as presented in >>> https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:usage_of_slurm_within_a_singularity_container ) >>> ( I also included slurm user on the container). By doing this >>> Scipion was able to see the queue (which I changed in hosts.conf >>> file) and successfully launch the job to the queue. The problem is >>> that the sbatch script calls the pw_protocol_run.py that is inside >>> the container, which raises the error in .err file: >>> >>> /var/spool/slurmd/slurmd/job02147/slurm_script: line 28: gpu01: No >>> such file or directory >>> uniq: gpu01: No such file or directory >>> python3: can't open file >>> '/opt/.scipion3/lib/python3.8/site-packages/pyworkflow/apps/pw_protocol_run.py': >>> [Errno 2] No such file or directory >>> >>> I think the problem is that the slurm is trying to execute the >>> script that is only available inside the container. >>> Usage of Slurm within a Singularity Container - GWDG >>> <https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:usage_of_slurm_within_a_singularity_container> >>> Starting a singularity container is quite cumbersome for the >>> operating system of the host and, in case too many requests to start >>> a singularity container are received at the same time, it might fail. >>> info.gwdg.de >>> >>> >>> Best, >>> >>> Helder Ribeiro >>> >>> ------------------------------------------------------------------------ >>> *De:* Mohamad HARASTANI <moh...@so...> >>> <mailto:moh...@so...> >>> *Enviado:* quinta-feira, 3 de novembro de 2022 09:29 >>> *Para:* Mailing list for Scipion users >>> <sci...@li...> >>> <mailto:sci...@li...> >>> *Assunto:* Re: [scipion-users] scipion - singularity - HPC >>> Hello Helder, >>> >>> Have you taken a look at the host configuration here >>> (https://scipion-em.github.io/docs/release-3.0.0/docs/scipion-modes/host-configuration.html)? 
>>> >>> Best of luck, >>> Mohamad >>> >>> ------------------------------------------------------------------------ >>> *From: *"Helder Veras Ribeiro Filho" <hel...@ln...> >>> <mailto:hel...@ln...> >>> *To: *sci...@li... >>> <mailto:sci...@li...> >>> *Sent: *Wednesday, November 2, 2022 5:07:53 PM >>> *Subject: *[scipion-users] scipion - singularity - HPC >>> >>> Hello scipion group! >>> >>> I'm trying to launch Scipion from a singularity container in an HPC >>> with the slurm as a scheduler. The container works fine and I'm able >>> to execute Scipion routines correctly without using a queue. The >>> problem is when I try to send Scipion jobs using the queue in the >>> Scipion interface. I suppose that it is a slurm/singularity >>> configuration problem. >>> Could anyone who was successful in sending jobs to queue from a >>> singularity launched scipion help me with some tips? >>> >>> Best, >>> >>> Helder >>> >>> *Helder Veras Ribeiro Filho, PhD*** >>> Brazilian Biosciences National Laboratory - LNBio >>> Brazilian Center for Research in Energy and Materials - CNPEM >>> 10,000 Giuseppe Maximo Scolfaro St. >>> Campinas, SP - Brazil13083-100 >>> +55(19) 3512-1255 >>> >>> >>> Aviso Legal: Esta mensagem e seus anexos podem conter informações >>> confidenciais e/ou de uso restrito. Observe atentamente seu conteúdo >>> e considere eventual consulta ao remetente antes de copiá-la, >>> divulgá-la ou distribuí-la. Se você recebeu esta mensagem por >>> engano, por favor avise o remetente e apague-a imediatamente. >>> >>> Disclaimer: This email and its attachments may contain confidential >>> and/or privileged information. Observe its content carefully and >>> consider possible querying to the sender before copying, disclosing >>> or distributing it. If you have received this email by mistake, >>> please notify the sender and delete it immediately. >>> >>> >>> >>> _______________________________________________ >>> scipion-users mailing list >>> sci...@li... >>> <mailto:sci...@li...> >>> https://lists.sourceforge.net/lists/listinfo/scipion-users >>> >>> >>> _______________________________________________ >>> scipion-users mailing list >>> sci...@li... <mailto:sci...@li...> >>> https://lists.sourceforge.net/lists/listinfo/scipion-users >> -- >> Pablo Conesa - *Madrid Scipion <http://scipion.i2pc.es> team* >> >> >> _______________________________________________ >> scipion-users mailing list >> sci...@li... <mailto:sci...@li...> >> https://lists.sourceforge.net/lists/listinfo/scipion-users > -- > Pablo Conesa - *Madrid Scipion <http://scipion.i2pc.es> team* > > > _______________________________________________ > scipion-users mailing list > sci...@li... > https://lists.sourceforge.net/lists/listinfo/scipion-users |
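To look at the docker-slurm setup Laura refers to, the repository can be fetched directly; the file locations below are taken from her message, except the hosts.conf path, which is an assumption about the repository layout.

  # Fetch the repository mentioned above and inspect the relevant files.
  git clone https://github.com/I2PC/scipion-docker.git
  less scipion-docker/master-image/Dockerfile
  less scipion-docker/master-image/hosts.conf   # assumed location of the hosts.conf she mentions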
From: helder v. <hel...@ho...> - 2022-11-10 11:33:53
|
Hi Pablo! Thank you for the suggestion! I've tested to execute the protocols command in the login node (the node I'm currently running the GUI) and tested to execute the sbatch file that calls the protocols command and that runs on the computing node. Both returned the relion plugins as expected. There're two things that I noted during the scipion installation in the singularity container that maybe could be related to this problem. 1. Since I'm installing scipion inside the container, I didn't use a conda environment. For some reason, is that conda environment necessary for the correct installation of scipion and plugins? 2. After installation, I noted some errors related to the scipion-pyworkflow and scipion-em. The modules were not found when executing the protocols. So I installed them directly from pip inside the container. Do you think that could be a problem? I've attached the singularity definition file (a text file), as you can see this file contains the commands that I used to install scipion and plugins inside the container. Best, Helder ________________________________ De: Pablo Conesa <pc...@cn...> Enviado: quinta-feira, 10 de novembro de 2022 03:38 Para: sci...@li... <sci...@li...> Assunto: Re: [scipion-users] scipion - singularity - HPC This could happen if you have a different node installation where you are missing some plugins. I.e: You have relion plugin in the "login node", but it is not present in the computing node? One way to check is to compare the output of: scipion3 protocols from both machines. Are they equal? On 8/11/22 21:06, helder veras wrote: Hi Pablo! Thank you for your thoughts and comments! I tested to modify the hosts.conf file to execute the singularity from the sbatch scripts. It seems the previous errors were solved, but now I got this one: run.stdout: RUNNING PROTOCOL -----------------ESC[0m Protocol starts Hostname: gpu01.cnpem.local PID: 2044741 pyworkflow: 3.0.27 plugin: orphan currentDir: /home/helder.ribeiro/ScipionUserData/projects/scipion_teste workingDir: Runs/000641_ProtRelionClassify2D runMode: Restart MPI: 1 threads: 1 'NoneType' object has no attribute 'Plugin' LegacyProtocol installation couldn't be validated. Possible cause could be a configuration issue. Try to run scipion config. ESC[31mProtocol has validation errors: 'NoneType' object has no attribute 'Plugin' LegacyProtocol installation couldn't be validated. Possible cause could be a configuration issue. Try to run scipion config.ESC[0m ESC[32m------------------- PROTOCOL FAILED (DONE 0/0)ESC[0m Do you have any idea what could be causing this error? Best, Helder ________________________________ De: Pablo Conesa <pc...@cn...><mailto:pc...@cn...> Enviado: sexta-feira, 4 de novembro de 2022 03:12 Para: sci...@li...<mailto:sci...@li...> <sci...@li...><mailto:sci...@li...> Assunto: Re: [scipion-users] scipion - singularity - HPC Hi Helder! I got no experience with singularity, but I guess you have a cluster of singuarity nodes? Nodes should have scipion installed as well in the same way(paths) as the "login node". I guess the challenge here is to make slurm "talk to other singularity nodes"? Regarding the error ...: var/spool/slurmd/slurmd/job02147/slurm_script: line 28: gpu01: No such file or directory uniq: gpu01: No such file or directory python3: can't open file '/opt/.scipion3/lib/python3.8/site-packages/pyworkflow/apps/pw_protocol_run.py': [Errno 2] No such file or directory I only understand the last line (not sure if the first 2 are a consequence of the 3rd one). 
That path look correct, providing you have a "Scipion virtualenv" installation at /opt. On 3/11/22 19:36, helder veras wrote: Hi Mohamad! Thank you for your reply! Yes, I'm using the host configuration file to launch slurm. Let me provide more details about my issue: I built a singularity container that has scipion and all the required dependencies and programs installed as well. This container works fine and I tested it on a desktop machine and on an HPC node without the queue option as well. Programs inside scipion are correctly executed and everything works fine. To be able to launch scipion using the queue option with slurm I had to bind the slurm/murge paths to the container and export some paths (just as presented in https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:usage_of_slurm_within_a_singularity_container ) ( I also included slurm user on the container). By doing this Scipion was able to see the queue (which I changed in hosts.conf file) and successfully launch the job to the queue. The problem is that the sbatch script calls the pw_protocol_run.py that is inside the container, which raises the error in .err file: /var/spool/slurmd/slurmd/job02147/slurm_script: line 28: gpu01: No such file or directory uniq: gpu01: No such file or directory python3: can't open file '/opt/.scipion3/lib/python3.8/site-packages/pyworkflow/apps/pw_protocol_run.py': [Errno 2] No such file or directory I think the problem is that the slurm is trying to execute the script that is only available inside the container. Usage of Slurm within a Singularity Container - GWDG<https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:usage_of_slurm_within_a_singularity_container> Starting a singularity container is quite cumbersome for the operating system of the host and, in case too many requests to start a singularity container are received at the same time, it might fail. info.gwdg.de Best, Helder Ribeiro ________________________________ De: Mohamad HARASTANI <moh...@so...><mailto:moh...@so...> Enviado: quinta-feira, 3 de novembro de 2022 09:29 Para: Mailing list for Scipion users <sci...@li...><mailto:sci...@li...> Assunto: Re: [scipion-users] scipion - singularity - HPC Hello Helder, Have you taken a look at the host configuration here (https://scipion-em.github.io/docs/release-3.0.0/docs/scipion-modes/host-configuration.html)? Best of luck, Mohamad ________________________________ From: "Helder Veras Ribeiro Filho" <hel...@ln...><mailto:hel...@ln...> To: sci...@li...<mailto:sci...@li...> Sent: Wednesday, November 2, 2022 5:07:53 PM Subject: [scipion-users] scipion - singularity - HPC Hello scipion group! I'm trying to launch Scipion from a singularity container in an HPC with the slurm as a scheduler. The container works fine and I'm able to execute Scipion routines correctly without using a queue. The problem is when I try to send Scipion jobs using the queue in the Scipion interface. I suppose that it is a slurm/singularity configuration problem. Could anyone who was successful in sending jobs to queue from a singularity launched scipion help me with some tips? Best, Helder Helder Veras Ribeiro Filho, PhD Brazilian Biosciences National Laboratory - LNBio Brazilian Center for Research in Energy and Materials - CNPEM 10,000 Giuseppe Maximo Scolfaro St. Campinas, SP - Brazil 13083-100 +55(19) 3512-1255 Aviso Legal: Esta mensagem e seus anexos podem conter informações confidenciais e/ou de uso restrito. 
Observe atentamente seu conteúdo e considere eventual consulta ao remetente antes de copiá-la, divulgá-la ou distribuí-la. Se você recebeu esta mensagem por engano, por favor avise o remetente e apague-a imediatamente. Disclaimer: This email and its attachments may contain confidential and/or privileged information. Observe its content carefully and consider possible querying to the sender before copying, disclosing or distributing it. If you have received this email by mistake, please notify the sender and delete it immediately. _______________________________________________ scipion-users mailing list sci...@li...<mailto:sci...@li...> https://lists.sourceforge.net/lists/listinfo/scipion-users _______________________________________________ scipion-users mailing list sci...@li...<mailto:sci...@li...> https://lists.sourceforge.net/lists/listinfo/scipion-users -- Pablo Conesa - Madrid Scipion<http://scipion.i2pc.es> team _______________________________________________ scipion-users mailing list sci...@li...<mailto:sci...@li...> https://lists.sourceforge.net/lists/listinfo/scipion-users -- Pablo Conesa - Madrid Scipion<http://scipion.i2pc.es> team |
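The pip fallback described in point 2 would look roughly like the lines below when run inside the container build. The package names scipion-pyworkflow and scipion-em come from the message; adding scipion-app is an assumption (it is the package that normally provides the scipion3 launcher), so treat this as a sketch rather than a recommended install path.

  # Sketch of the pip fallback, run inside the container during the build.
  python3 -m pip install --upgrade pip
  python3 -m pip install scipion-pyworkflow scipion-em scipion-app   # scipion-app added as an assumption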
From: Pablo C. <pc...@cn...> - 2022-11-10 08:39:02
|
This could happen if you have a different node installation where you are missing some plugins. I.e: You have relion plugin in the "login node", but it is not present in the computing node? One way to check is to compare the output of: scipion3 protocols from both machines. Are they equal? On 8/11/22 21:06, helder veras wrote: > Hi Pablo! > > Thank you for your thoughts and comments! > > I tested to modify the hosts.conf file to execute the singularity from > the sbatch scripts. It seems the previous errors were solved, but now > I got this one: > > *run.stdout:* > RUNNING PROTOCOL -----------------ESC[0m > Protocol starts > Hostname: gpu01.cnpem.local > PID: 2044741 > pyworkflow: 3.0.27 > plugin: orphan > currentDir: /home/helder.ribeiro/ScipionUserData/projects/scipion_teste > workingDir: Runs/000641_ProtRelionClassify2D > runMode: Restart > MPI: 1 > threads: 1 > 'NoneType' object has no attribute 'Plugin' LegacyProtocol > installation couldn't be validated. Possible cause could be a > configuration issue. Try to run scipion config. > ESC[31mProtocol has validation errors: > 'NoneType' object has no attribute 'Plugin' LegacyProtocol > installation couldn't be validated. Possible cause could be a > configuration issue. Try to run scipion config.ESC[0m > ESC[32m------------------- PROTOCOL FAILED (DONE 0/0)ESC[0m > > Do you have any idea what could be causing this error? > > Best, > > Helder > ------------------------------------------------------------------------ > *De:* Pablo Conesa <pc...@cn...> > *Enviado:* sexta-feira, 4 de novembro de 2022 03:12 > *Para:* sci...@li... > <sci...@li...> > *Assunto:* Re: [scipion-users] scipion - singularity - HPC > > Hi Helder! > > > I got no experience with singularity, but I guess you have a cluster > of singuarity nodes? > > > Nodes should have scipion installed as well in the same way(paths) as > the "login node". > > > I guess the challenge here is to make slurm "talk to other singularity > nodes"? > > > Regarding the error ...: > > > var/spool/slurmd/slurmd/job02147/slurm_script: line 28: gpu01: No such > file or directory > > uniq: gpu01: No such file or directory > python3: can't open file > '/opt/.scipion3/lib/python3.8/site-packages/pyworkflow/apps/pw_protocol_run.py': > [Errno 2] No such file or directory > > I only understand the last line (not sure if the first 2 are a > consequence of the 3rd one). That path look correct, providing you > have a "Scipion virtualenv" installation at /opt. > > > On 3/11/22 19:36, helder veras wrote: >> Hi Mohamad! >> >> Thank you for your reply! Yes, I'm using the host configuration file >> to launch slurm. >> Let me provide more details about my issue: >> >> I built a singularity container that has scipion and all the required >> dependencies and programs installed as well. This container works >> fine and I tested it on a desktop machine and on an HPC node without >> the queue option as well. Programs inside scipion are correctly >> executed and everything works fine. >> To be able to launch scipion using the queue option with slurm I had >> to bind the slurm/murge paths to the container and export some paths >> (just as presented in >> https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:usage_of_slurm_within_a_singularity_container >> <https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:usage_of_slurm_within_a_singularity_container> ) >> ( I also included slurm user on the container). 
By doing this Scipion >> was able to see the queue (which I changed in hosts.conf file) and >> successfully launch the job to the queue. The problem is that the >> sbatch script calls the pw_protocol_run.py that is inside the >> container, which raises the error in .err file: >> >> /var/spool/slurmd/slurmd/job02147/slurm_script: line 28: gpu01: No >> such file or directory >> uniq: gpu01: No such file or directory >> python3: can't open file >> '/opt/.scipion3/lib/python3.8/site-packages/pyworkflow/apps/pw_protocol_run.py': >> [Errno 2] No such file or directory >> >> I think the problem is that the slurm is trying to execute the script >> that is only available inside the container. >> Usage of Slurm within a Singularity Container - GWDG >> <https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:usage_of_slurm_within_a_singularity_container> >> Starting a singularity container is quite cumbersome for the >> operating system of the host and, in case too many requests to start >> a singularity container are received at the same time, it might fail. >> info.gwdg.de >> >> >> Best, >> >> Helder Ribeiro >> >> ------------------------------------------------------------------------ >> *De:* Mohamad HARASTANI <moh...@so...> >> <mailto:moh...@so...> >> *Enviado:* quinta-feira, 3 de novembro de 2022 09:29 >> *Para:* Mailing list for Scipion users >> <sci...@li...> >> <mailto:sci...@li...> >> *Assunto:* Re: [scipion-users] scipion - singularity - HPC >> Hello Helder, >> >> Have you taken a look at the host configuration here >> (https://scipion-em.github.io/docs/release-3.0.0/docs/scipion-modes/host-configuration.html >> <https://scipion-em.github.io/docs/release-3.0.0/docs/scipion-modes/host-configuration.html>)? >> >> Best of luck, >> Mohamad >> >> ------------------------------------------------------------------------ >> *From: *"Helder Veras Ribeiro Filho" <hel...@ln...> >> <mailto:hel...@ln...> >> *To: *sci...@li... >> <mailto:sci...@li...> >> *Sent: *Wednesday, November 2, 2022 5:07:53 PM >> *Subject: *[scipion-users] scipion - singularity - HPC >> >> Hello scipion group! >> >> I'm trying to launch Scipion from a singularity container in an HPC >> with the slurm as a scheduler. The container works fine and I'm able >> to execute Scipion routines correctly without using a queue. The >> problem is when I try to send Scipion jobs using the queue in the >> Scipion interface. I suppose that it is a slurm/singularity >> configuration problem. >> Could anyone who was successful in sending jobs to queue from a >> singularity launched scipion help me with some tips? >> >> Best, >> >> Helder >> >> *Helder Veras Ribeiro Filho, PhD*** >> Brazilian Biosciences National Laboratory - LNBio >> Brazilian Center for Research in Energy and Materials - CNPEM >> 10,000 Giuseppe Maximo Scolfaro St. >> Campinas, SP - Brazil13083-100 >> +55(19) 3512-1255 >> >> >> Aviso Legal: Esta mensagem e seus anexos podem conter informações >> confidenciais e/ou de uso restrito. Observe atentamente seu conteúdo >> e considere eventual consulta ao remetente antes de copiá-la, >> divulgá-la ou distribuí-la. Se você recebeu esta mensagem por engano, >> por favor avise o remetente e apague-a imediatamente. >> >> Disclaimer: This email and its attachments may contain confidential >> and/or privileged information. Observe its content carefully and >> consider possible querying to the sender before copying, disclosing >> or distributing it. If you have received this email by mistake, >> please notify the sender and delete it immediately. 
>> >> >> >> _______________________________________________ >> scipion-users mailing list >> sci...@li... >> <mailto:sci...@li...> >> https://lists.sourceforge.net/lists/listinfo/scipion-users >> <https://lists.sourceforge.net/lists/listinfo/scipion-users> >> >> >> _______________________________________________ >> scipion-users mailing list >> sci...@li... <mailto:sci...@li...> >> https://lists.sourceforge.net/lists/listinfo/scipion-users <https://lists.sourceforge.net/lists/listinfo/scipion-users> > -- > Pablo Conesa - *Madrid Scipion <http://scipion.i2pc.es> team* > > > _______________________________________________ > scipion-users mailing list > sci...@li... > https://lists.sourceforge.net/lists/listinfo/scipion-users -- Pablo Conesa - *Madrid Scipion <http://scipion.i2pc.es> team* |
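Pablo's comparison can be scripted in one short check. In the sketch below, scipion3 is assumed to be on the PATH on the login node, while scipion.sif and the partition name gpu are placeholders for the container image and Slurm partition actually used.

  # Compare the protocol list seen on the login node with the one seen
  # inside the container on a compute node.
  scipion3 protocols | sort > /tmp/protocols_login.txt
  srun -p gpu singularity exec scipion.sif scipion3 protocols | sort > /tmp/protocols_node.txt
  diff /tmp/protocols_login.txt /tmp/protocols_node.txt && echo "protocol lists match"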
From: helder v. <hel...@ho...> - 2022-11-08 20:20:45
|
Hi Pablo! Thank you for your thoughts and comments! I tested to modify the hosts.conf file to execute the singularity from the sbatch scripts. It seems the previous errors were solved, but now I got this one: run.stdout: RUNNING PROTOCOL -----------------ESC[0m Protocol starts Hostname: gpu01.cnpem.local PID: 2044741 pyworkflow: 3.0.27 plugin: orphan currentDir: /home/helder.ribeiro/ScipionUserData/projects/scipion_teste workingDir: Runs/000641_ProtRelionClassify2D runMode: Restart MPI: 1 threads: 1 'NoneType' object has no attribute 'Plugin' LegacyProtocol installation couldn't be validated. Possible cause could be a configuration issue. Try to run scipion config. ESC[31mProtocol has validation errors: 'NoneType' object has no attribute 'Plugin' LegacyProtocol installation couldn't be validated. Possible cause could be a configuration issue. Try to run scipion config.ESC[0m ESC[32m------------------- PROTOCOL FAILED (DONE 0/0)ESC[0m Do you have any idea what could be causing this error? Best, Helder ________________________________ De: Pablo Conesa <pc...@cn...> Enviado: sexta-feira, 4 de novembro de 2022 03:12 Para: sci...@li... <sci...@li...> Assunto: Re: [scipion-users] scipion - singularity - HPC Hi Helder! I got no experience with singularity, but I guess you have a cluster of singuarity nodes? Nodes should have scipion installed as well in the same way(paths) as the "login node". I guess the challenge here is to make slurm "talk to other singularity nodes"? Regarding the error ...: var/spool/slurmd/slurmd/job02147/slurm_script: line 28: gpu01: No such file or directory uniq: gpu01: No such file or directory python3: can't open file '/opt/.scipion3/lib/python3.8/site-packages/pyworkflow/apps/pw_protocol_run.py': [Errno 2] No such file or directory I only understand the last line (not sure if the first 2 are a consequence of the 3rd one). That path look correct, providing you have a "Scipion virtualenv" installation at /opt. On 3/11/22 19:36, helder veras wrote: Hi Mohamad! Thank you for your reply! Yes, I'm using the host configuration file to launch slurm. Let me provide more details about my issue: I built a singularity container that has scipion and all the required dependencies and programs installed as well. This container works fine and I tested it on a desktop machine and on an HPC node without the queue option as well. Programs inside scipion are correctly executed and everything works fine. To be able to launch scipion using the queue option with slurm I had to bind the slurm/murge paths to the container and export some paths (just as presented in https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:usage_of_slurm_within_a_singularity_container ) ( I also included slurm user on the container). By doing this Scipion was able to see the queue (which I changed in hosts.conf file) and successfully launch the job to the queue. The problem is that the sbatch script calls the pw_protocol_run.py that is inside the container, which raises the error in .err file: /var/spool/slurmd/slurmd/job02147/slurm_script: line 28: gpu01: No such file or directory uniq: gpu01: No such file or directory python3: can't open file '/opt/.scipion3/lib/python3.8/site-packages/pyworkflow/apps/pw_protocol_run.py': [Errno 2] No such file or directory I think the problem is that the slurm is trying to execute the script that is only available inside the container. 
Usage of Slurm within a Singularity Container - GWDG<https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:usage_of_slurm_within_a_singularity_container> Starting a singularity container is quite cumbersome for the operating system of the host and, in case too many requests to start a singularity container are received at the same time, it might fail. info.gwdg.de Best, Helder Ribeiro ________________________________ De: Mohamad HARASTANI <moh...@so...><mailto:moh...@so...> Enviado: quinta-feira, 3 de novembro de 2022 09:29 Para: Mailing list for Scipion users <sci...@li...><mailto:sci...@li...> Assunto: Re: [scipion-users] scipion - singularity - HPC Hello Helder, Have you taken a look at the host configuration here (https://scipion-em.github.io/docs/release-3.0.0/docs/scipion-modes/host-configuration.html)? Best of luck, Mohamad ________________________________ From: "Helder Veras Ribeiro Filho" <hel...@ln...><mailto:hel...@ln...> To: sci...@li...<mailto:sci...@li...> Sent: Wednesday, November 2, 2022 5:07:53 PM Subject: [scipion-users] scipion - singularity - HPC Hello scipion group! I'm trying to launch Scipion from a singularity container in an HPC with the slurm as a scheduler. The container works fine and I'm able to execute Scipion routines correctly without using a queue. The problem is when I try to send Scipion jobs using the queue in the Scipion interface. I suppose that it is a slurm/singularity configuration problem. Could anyone who was successful in sending jobs to queue from a singularity launched scipion help me with some tips? Best, Helder Helder Veras Ribeiro Filho, PhD Brazilian Biosciences National Laboratory - LNBio Brazilian Center for Research in Energy and Materials - CNPEM 10,000 Giuseppe Maximo Scolfaro St. Campinas, SP - Brazil 13083-100 +55(19) 3512-1255 Aviso Legal: Esta mensagem e seus anexos podem conter informações confidenciais e/ou de uso restrito. Observe atentamente seu conteúdo e considere eventual consulta ao remetente antes de copiá-la, divulgá-la ou distribuí-la. Se você recebeu esta mensagem por engano, por favor avise o remetente e apague-a imediatamente. Disclaimer: This email and its attachments may contain confidential and/or privileged information. Observe its content carefully and consider possible querying to the sender before copying, disclosing or distributing it. If you have received this email by mistake, please notify the sender and delete it immediately. _______________________________________________ scipion-users mailing list sci...@li...<mailto:sci...@li...> https://lists.sourceforge.net/lists/listinfo/scipion-users _______________________________________________ scipion-users mailing list sci...@li...<mailto:sci...@li...> https://lists.sourceforge.net/lists/listinfo/scipion-users -- Pablo Conesa - Madrid Scipion<http://scipion.i2pc.es> team |
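For reference, the change Helder describes (running the Scipion command through the container from the sbatch script) leaves the generated job script ending in something like the sketch below. The python3 call and its /opt/.scipion3/... path are the ones quoted in the .err output in this thread; scipion.sif, the partition name, the bind path and the --nv flag are assumptions that have to be adapted to the site.

  #!/bin/bash
  #SBATCH -p gpu                        # placeholder partition
  #SBATCH --gres=gpu:1                  # placeholder GPU request
  # ... rest of the job-script template from hosts.conf ...

  # Wrap the final command in "singularity exec" so the in-container
  # Python path resolves on the compute node.
  singularity exec --nv \
    -B /home/helder.ribeiro/ScipionUserData \
    scipion.sif \
    python3 /opt/.scipion3/lib/python3.8/site-packages/pyworkflow/apps/pw_protocol_run.py "$@"   # arguments appended by Scipion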
From: Pablo C. <pc...@cn...> - 2022-11-04 08:12:42
|
Hi Helder! I got no experience with singularity, but I guess you have a cluster of singuarity nodes? Nodes should have scipion installed as well in the same way(paths) as the "login node". I guess the challenge here is to make slurm "talk to other singularity nodes"? Regarding the error ...: var/spool/slurmd/slurmd/job02147/slurm_script: line 28: gpu01: No such file or directory uniq: gpu01: No such file or directory python3: can't open file '/opt/.scipion3/lib/python3.8/site-packages/pyworkflow/apps/pw_protocol_run.py': [Errno 2] No such file or directory I only understand the last line (not sure if the first 2 are a consequence of the 3rd one). That path look correct, providing you have a "Scipion virtualenv" installation at /opt. On 3/11/22 19:36, helder veras wrote: > Hi Mohamad! > > Thank you for your reply! Yes, I'm using the host configuration file > to launch slurm. > Let me provide more details about my issue: > > I built a singularity container that has scipion and all the required > dependencies and programs installed as well. This container works fine > and I tested it on a desktop machine and on an HPC node without the > queue option as well. Programs inside scipion are correctly executed > and everything works fine. > To be able to launch scipion using the queue option with slurm I had > to bind the slurm/murge paths to the container and export some paths > (just as presented in > https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:usage_of_slurm_within_a_singularity_container ) > ( I also included slurm user on the container). By doing this Scipion > was able to see the queue (which I changed in hosts.conf file) and > successfully launch the job to the queue. The problem is that the > sbatch script calls the pw_protocol_run.py that is inside the > container, which raises the error in .err file: > > /var/spool/slurmd/slurmd/job02147/slurm_script: line 28: gpu01: No > such file or directory > uniq: gpu01: No such file or directory > python3: can't open file > '/opt/.scipion3/lib/python3.8/site-packages/pyworkflow/apps/pw_protocol_run.py': > [Errno 2] No such file or directory > > I think the problem is that the slurm is trying to execute the script > that is only available inside the container. > Usage of Slurm within a Singularity Container - GWDG > <https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:usage_of_slurm_within_a_singularity_container> > Starting a singularity container is quite cumbersome for the operating > system of the host and, in case too many requests to start a > singularity container are received at the same time, it might fail. > info.gwdg.de > > > Best, > > Helder Ribeiro > > ------------------------------------------------------------------------ > *De:* Mohamad HARASTANI <moh...@so...> > *Enviado:* quinta-feira, 3 de novembro de 2022 09:29 > *Para:* Mailing list for Scipion users > <sci...@li...> > *Assunto:* Re: [scipion-users] scipion - singularity - HPC > Hello Helder, > > Have you taken a look at the host configuration here > (https://scipion-em.github.io/docs/release-3.0.0/docs/scipion-modes/host-configuration.html)? > > Best of luck, > Mohamad > > ------------------------------------------------------------------------ > *From: *"Helder Veras Ribeiro Filho" <hel...@ln...> > *To: *sci...@li... > *Sent: *Wednesday, November 2, 2022 5:07:53 PM > *Subject: *[scipion-users] scipion - singularity - HPC > > Hello scipion group! > > I'm trying to launch Scipion from a singularity container in an HPC > with the slurm as a scheduler. 
The container works fine and I'm able > to execute Scipion routines correctly without using a queue. The > problem is when I try to send Scipion jobs using the queue in the > Scipion interface. I suppose that it is a slurm/singularity > configuration problem. > Could anyone who was successful in sending jobs to queue from a > singularity launched scipion help me with some tips? > > Best, > > Helder > > *Helder Veras Ribeiro Filho, PhD*** > Brazilian Biosciences National Laboratory - LNBio > Brazilian Center for Research in Energy and Materials - CNPEM > 10,000 Giuseppe Maximo Scolfaro St. > Campinas, SP - Brazil13083-100 > +55(19) 3512-1255 > > > Aviso Legal: Esta mensagem e seus anexos podem conter informações > confidenciais e/ou de uso restrito. Observe atentamente seu conteúdo e > considere eventual consulta ao remetente antes de copiá-la, divulgá-la > ou distribuí-la. Se você recebeu esta mensagem por engano, por favor > avise o remetente e apague-a imediatamente. > > Disclaimer: This email and its attachments may contain confidential > and/or privileged information. Observe its content carefully and > consider possible querying to the sender before copying, disclosing or > distributing it. If you have received this email by mistake, please > notify the sender and delete it immediately. > > > > _______________________________________________ > scipion-users mailing list > sci...@li... > https://lists.sourceforge.net/lists/listinfo/scipion-users > > > _______________________________________________ > scipion-users mailing list > sci...@li... > https://lists.sourceforge.net/lists/listinfo/scipion-users -- Pablo Conesa - *Madrid Scipion <http://scipion.i2pc.es> team* |
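Pablo's point about identical paths can be verified quickly: the launcher script reported in the .err file should be reachable from both the login node and a compute node. In the sketch below the partition name gpu is a placeholder, and on a containerised setup the commands would be prefixed with "singularity exec <image>".

  # Check that the same Scipion installation path exists on both sides.
  ls /opt/.scipion3/lib/python3.8/site-packages/pyworkflow/apps/pw_protocol_run.py
  srun -p gpu ls /opt/.scipion3/lib/python3.8/site-packages/pyworkflow/apps/pw_protocol_run.py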
From: helder v. <hel...@ho...> - 2022-11-03 18:36:39
|
Hi Mohamad! Thank you for your reply! Yes, I'm using the host configuration file to launch slurm. Let me provide more details about my issue: I built a singularity container that has scipion and all the required dependencies and programs installed as well. This container works fine and I tested it on a desktop machine and on an HPC node without the queue option as well. Programs inside scipion are correctly executed and everything works fine. To be able to launch scipion using the queue option with slurm I had to bind the slurm/murge paths to the container and export some paths (just as presented in https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:usage_of_slurm_within_a_singularity_container ) ( I also included slurm user on the container). By doing this Scipion was able to see the queue (which I changed in hosts.conf file) and successfully launch the job to the queue. The problem is that the sbatch script calls the pw_protocol_run.py that is inside the container, which raises the error in .err file: /var/spool/slurmd/slurmd/job02147/slurm_script: line 28: gpu01: No such file or directory uniq: gpu01: No such file or directory python3: can't open file '/opt/.scipion3/lib/python3.8/site-packages/pyworkflow/apps/pw_protocol_run.py': [Errno 2] No such file or directory I think the problem is that the slurm is trying to execute the script that is only available inside the container. Usage of Slurm within a Singularity Container - GWDG<https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:usage_of_slurm_within_a_singularity_container> Starting a singularity container is quite cumbersome for the operating system of the host and, in case too many requests to start a singularity container are received at the same time, it might fail. info.gwdg.de Best, Helder Ribeiro ________________________________ De: Mohamad HARASTANI <moh...@so...> Enviado: quinta-feira, 3 de novembro de 2022 09:29 Para: Mailing list for Scipion users <sci...@li...> Assunto: Re: [scipion-users] scipion - singularity - HPC Hello Helder, Have you taken a look at the host configuration here (https://scipion-em.github.io/docs/release-3.0.0/docs/scipion-modes/host-configuration.html)? Best of luck, Mohamad ________________________________ From: "Helder Veras Ribeiro Filho" <hel...@ln...> To: sci...@li... Sent: Wednesday, November 2, 2022 5:07:53 PM Subject: [scipion-users] scipion - singularity - HPC Hello scipion group! I'm trying to launch Scipion from a singularity container in an HPC with the slurm as a scheduler. The container works fine and I'm able to execute Scipion routines correctly without using a queue. The problem is when I try to send Scipion jobs using the queue in the Scipion interface. I suppose that it is a slurm/singularity configuration problem. Could anyone who was successful in sending jobs to queue from a singularity launched scipion help me with some tips? Best, Helder Helder Veras Ribeiro Filho, PhD Brazilian Biosciences National Laboratory - LNBio Brazilian Center for Research in Energy and Materials - CNPEM 10,000 Giuseppe Maximo Scolfaro St. Campinas, SP - Brazil 13083-100 +55(19) 3512-1255 Aviso Legal: Esta mensagem e seus anexos podem conter informações confidenciais e/ou de uso restrito. Observe atentamente seu conteúdo e considere eventual consulta ao remetente antes de copiá-la, divulgá-la ou distribuí-la. Se você recebeu esta mensagem por engano, por favor avise o remetente e apague-a imediatamente. Disclaimer: This email and its attachments may contain confidential and/or privileged information. 
Observe its content carefully and consider possible querying to the sender before copying, disclosing or distributing it. If you have received this email by mistake, please notify the sender and delete it immediately. _______________________________________________ scipion-users mailing list sci...@li... https://lists.sourceforge.net/lists/listinfo/scipion-users |
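The bind mounts Helder describes amount to exposing the host's Slurm client tools, configuration and MUNGE socket inside the container, as in the GWDG recipe linked above. The sketch below is one possible form; every path and the image name scipion.sif are site-specific assumptions that have to match the host installation.

  # Launch Scipion from the container with the host's Slurm client visible inside,
  # so jobs can be submitted to the queue from the Scipion GUI.
  singularity exec \
    -B /etc/slurm \
    -B /var/run/munge \
    -B /usr/lib64/slurm \
    -B /usr/bin/sbatch -B /usr/bin/squeue -B /usr/bin/scancel -B /usr/bin/sacct \
    scipion.sif scipion3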