I am getting the following error:
/gpfs_fs/atol/tol/Trypanosoma/Trypanosoma_cruzi_CL/gDNA/rawdata/sff_ngs_qc/MaSuRCA_assembly_v3/CA/1-overlapper/0000000002/h0046172081r0028857550 failed, job index 1.
.
.
.
.
/gpfs_fs/atol/tol/Trypanosoma/Trypanosoma_cruzi_CL/gDNA/rawdata/sff_ngs_qc/MaSuRCA_assembly_v3/CA/1-overlapper/0000000002/h0046172081r0046172080 failed, job index 369.
================================================================================
runCA failed.
Stack trace:
at /usr/global/blp/MaSuRCA-2.3.2/bin/../CA/Linux-amd64/bin/runCA line 1121.
main::caFailure('369 overlapper jobs failed', undef) called at /usr/global/blp/MaSuRCA-2.3.2/bin/../CA/Linux-amd64/bin/runCA line 1271
main::checkOverlapper('normal') called at /usr/global/blp/MaSuRCA-2.3.2/bin/../CA/Linux-amd64/bin/runCA line 1320
main::checkOverlap('normal') called at /usr/global/blp/MaSuRCA-2.3.2/bin/../CA/Linux-amd64/bin/runCA line 5341
Failure message:
369 overlapper jobs failed
Any idea how to solve it?
It looks like you're running MaSuRCA. MaSuRCA uses a fork of an older version of CA and may have bugs that have since been fixed in CA. I suggest contacting the developers of MaSuRCA directly:
http://www.genome.umd.edu/masurca.html
That said, when overlapper fails, it is almost always a system resources issue and not a software issue. It looks like every one of your jobs failed, which makes me suspect a configuration or memory issue. You can check any of the 1-overlapper/*.out files for the exact error message that caused the failure.
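With hundreds of failed jobs, it can help to tally the distinct messages across all of the 1-overlapper/*.out files rather than opening them one by one. The following is a minimal sketch, not part of CA or MaSuRCA: the keyword list and the script name are assumptions, and the keywords should be adjusted to whatever the logs actually contain.

```python
import glob
import sys
from collections import Counter

# Assumed keywords that tend to mark a failure line; not taken from CA itself.
KEYWORDS = ("error", "failed", "killed", "cannot")


def count_error_lines(lines):
    """Count each distinct line that contains one of the keywords."""
    counts = Counter()
    for line in lines:
        text = line.strip()
        if text and any(word in text.lower() for word in KEYWORDS):
            counts[text] += 1
    return counts


def summarize_out_files(pattern):
    """Tally matching lines over every file that matches the glob pattern."""
    counts = Counter()
    for path in sorted(glob.glob(pattern)):
        with open(path, errors="replace") as handle:
            counts.update(count_error_lines(handle))
    return counts


if __name__ == "__main__" and len(sys.argv) == 2:
    # Print the ten most frequent messages with their counts.
    for message, n in summarize_out_files(sys.argv[1]).most_common(10):
        print(n, message)
```

Saved under a hypothetical name such as summarize_out.py, it could be run as `python summarize_out.py 'CA/1-overlapper/*.out'` (quote the pattern so the script, not the shell, expands it). If every job shows the same message, the cause is likely a single installation or configuration problem rather than per-job resource limits.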
Last edit: Sergey Koren 2015-05-19
Could you please explain what exactly you mean by "system resources" and how I can request them differently?
The system resources for overlapping (memory and cores) are configured through runCA:
http://wgs-assembler.sourceforge.net/wiki/index.php/RunCA#OVL_Overlapper
However, MaSuRCA controls the spec file for assembly in your run and I don't know how to alter its behavior. The output files I pointed you to have more information on what caused the overlap jobs to fail, which would give the MaSuRCA developers more guidance on what needs to be adjusted.
The .out files you suggested contain the following error message.
/usr/global/blp/MaSuRCA-2.3.2/CA/Linux-amd64/bin/overlap: symbol lookup error: /usr/global/blp/MaSuRCA-2.3.2/CA/Linux-amd64/bin/overlap: undefined symbol: _ZN9jellyfish10mer_dna_ns15mer_base_staticImLi0EE2k_E
Does this help in figuring out what's wrong?
Looking further into the jellyfish library revealed:
That is, it does not contain _ZN9jellyfish10mer_dna_ns15mer_base_staticImLi0EE2k_E! Is this a jellyfish version mismatch?
Yes, that error implies something is wrong in your compilation/installation of MaSuRCA (or rather the compilation of CA bundled with MaSuRCA). It looks like a symbol the overlap binary was linked against is missing at run time. You can get more details by running:
ldd /usr/global/blp/MaSuRCA-2.3.2/CA/Linux-amd64/bin/overlap
This bug is not in CA but in MaSuRCA (jellyfish is not part of the standard CA distribution). Thus, I do not know an appropriate fix. I'd recommend running test examples of MaSuRCA to make sure your installation is working and contacting the MaSuRCA developers for guidance in resolving the error.
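To narrow down a suspected version mismatch, one option is to check which shared libraries the overlap binary resolves to and whether any of them defines the missing symbol. The sketch below assumes a Linux system with `ldd` and GNU `nm` available; the helper names are hypothetical and it is not a MaSuRCA tool. The mangled name should demangle (for example with `c++filt`) to a static member `k_` of `jellyfish::mer_dna_ns::mer_base_static<unsigned long, 0>`.

```python
import subprocess
import sys

# Symbol reported as undefined in the .out files.
SYMBOL = "_ZN9jellyfish10mer_dna_ns15mer_base_staticImLi0EE2k_E"


def missing_libraries(ldd_output):
    """Names of libraries that ldd reports as 'not found'."""
    return [line.split("=>")[0].strip()
            for line in ldd_output.splitlines()
            if "=> not found" in line]


def resolved_libraries(ldd_output):
    """Absolute paths of the libraries that ldd managed to resolve."""
    paths = []
    for line in ldd_output.splitlines():
        parts = line.split("=>")
        if len(parts) == 2:
            fields = parts[1].split()
            if fields and fields[0].startswith("/"):
                paths.append(fields[0])
    return paths


def defines_symbol(nm_output, symbol):
    """True if an nm output line ends with the symbol (ignoring any @version)."""
    for line in nm_output.splitlines():
        fields = line.split()
        if fields and fields[-1].split("@")[0] == symbol:
            return True
    return False


def run(cmd):
    """Run a command and return its standard output as text."""
    return subprocess.run(cmd, capture_output=True, text=True).stdout


if __name__ == "__main__" and len(sys.argv) == 2:
    ldd_out = run(["ldd", sys.argv[1]])
    print("missing libraries:", missing_libraries(ldd_out))
    for lib in resolved_libraries(ldd_out):
        # List only the dynamic symbols that each library itself defines.
        if defines_symbol(run(["nm", "-D", "--defined-only", lib]), SYMBOL):
            print("symbol defined in", lib)
```

Run it with the path to the overlap binary as the only argument. If no resolved library defines the symbol, the binary is probably picking up a jellyfish library different from the one it was built against, for example a system-wide copy found ahead of the bundled one; that is an interpretation to confirm with the MaSuRCA developers.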
Linking bug in MaSuRCA installation, not related to Celera Assembler.