My run failed after correction during the assembly overlapping step of PBcR. Here is the full content of results/sample/0-overlaptrim-overlap/000001.out. 000002 and 000003 failed similarly. Do you have any recommendations?
STRING_NUM_BITS 31
OFFSET_BITS 31
STRING_NUM_MASK 2147483647
OFFSET_MASK 2147483647
MAX_STRING_NUM 2147483647
Hash_Mask_Bits 25
Max_Hash_Strings 11273
Max_Hash_Data_Len 20002740
Max_Hash_Load 0.750000
Kmer Length 22
Min Overlap Length 80
MAX_ERRORS 1967
ERRORS_FOR_FREE 1
Num_PThreads 4
Max_Reads_Per_Batch 11273
Max_Reads_Per_Thread 704
HASH_TABLE_SIZE 33554432
sizeof(Hash_Bucket_t) 216
hash table size: 6912 MB
check 128 MB
info 0 MB
start 0 MB
Initialize_Work_Area: MAX_ERRORS=1967 allocated 1MB
Initialize_Work_Area: MAX_ERRORS=1967 allocated 1MB
Initialize_Work_Area: MAX_ERRORS=1967 allocated 1MB
Initialize_Work_Area: MAX_ERRORS=1967 allocated 1MB
Build_Hash_Index from 1 to 11273
HASH LOADING STOPPED: strings 11273 out of 11273 max.
HASH LOADING STOPPED: length 20002740 out of 20002740 max.
HASH LOADING STOPPED: entries 8164529 out of 528482304 max (load 1.16).
String_Ct = 11273 Extra_String_Ct = 0 Extra_String_Subcount = 2978
Read 4521 kmers to mark to skip
Index built.
Starting 1 11273
Choose_And_Process_Stream_Segment()-- tid 0
Choose_And_Process_Stream_Segment()-- tid 3
WorkArea 0 allocates space 0 of size 16777216 for array 0 through 4092
Failed with 'Segmentation fault'
Backtrace (mangled):
/wgs-4604/Linux-amd64/bin/overlapInCore(_Z17AS_UTL_catchCrashiP7siginfoPv+0x2a)[0x4160fa]
/lib64/libpthread.so.0[0x3cf460eb10]
/wgs-4604/Linux-amd64/bin/overlapInCore(_Z20Process_String_OlapsPciS_j14Direction_TypeP9Work_Area+0x2ad)[0x410f7d]
/wgs-4604/Linux-amd64/bin/overlapInCore(_Z13Find_OverlapsPciS_j14Direction_TypeP9Work_Area+0x5e1)[0x40d301]
/wgs-4604/Linux-amd64/bin/overlapInCore(_Z16Process_OverlapsP8gkStreamP9Work_Area+0x139)[0x40da29]
/wgs-4604/Linux-amd64/bin/overlapInCore(_Z13OverlapDriverv+0x4af)[0x41288f]
/wgs-4604/Linux-amd64/bin/overlapInCore(main+0xfbf)[0x413b3f]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x3cf3e1d994]
/wgs-4604/Linux-amd64/bin/overlapInCore(__gxx_personality_v0+0x101)[0x4069c9]
Backtrace (demangled):
[0] /wgs-4604/Linux-amd64/bin/overlapInCore::AS_UTL_catchCrash(int, siginfo*, void*) + 0x2a [0x4160fa]
[1] /lib64/libpthread.so.0 [0x3cf460eb10]
[2] /wgs-4604/Linux-amd64/bin/overlapInCore::Process_String_Olaps(char*, int, char*, unsigned int, Direction_Type, Work_Area*) + 0x2ad [0x410f7d]
[3] /wgs-4604/Linux-amd64/bin/overlapInCore::Find_Overlaps(char*, int, char*, unsigned int, Direction_Type, Work_Area*) + 0x5e1 [0x40d301]
[4] /wgs-4604/Linux-amd64/bin/overlapInCore::Process_Overlaps(gkStream*, Work_Area*) + 0x139 [0x40da29]
[5] /wgs-4604/Linux-amd64/bin/overlapInCore::OverlapDriver() + 0x4af [0x41288f]
[6] /wgs-4604/Linux-amd64/bin/overlapInCore::(null) + 0xfbf [0x413b3f]
[7] /lib64/libc.so.6::(null) + 0xf4 [0x3cf3e1d994]
[8] /wgs-4604/Linux-amd64/bin/overlapInCore::(null) + 0x101 [0x4069c9]
GDB:
/results/sample/0-overlaptrim-overlap/overlap.sh: line 60: 8523 Segmentation fault $bin/overlapInCore -G --hashbits 25 --hashload 0.75 -t 4 $opt -k 22 -k /results/sample/0-mercounts/asm.nmers.obt.fasta -o /results/sample/0-overlaptrim-overlap/$bat/$job.ovb.WORKING.gz /results/sample/asm.gkpStore
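As a cross-check on the log above, the reported "hash table size: 6912 MB" follows from two other values it prints: 2^25 buckets (Hash_Mask_Bits 25, HASH_TABLE_SIZE 33554432) at 216 bytes per bucket. A minimal Python sketch of that arithmetic; the function name is made up for illustration and is not part of overlapInCore:

```python
def hash_table_mb(hash_bits, bucket_bytes):
    """Memory for a table of 2**hash_bits buckets, in MB (1 MB = 2**20 bytes)."""
    buckets = 1 << hash_bits              # 2**25 = 33554432 in the log
    return buckets * bucket_bytes / (1 << 20)

# Values taken from the log: Hash_Mask_Bits 25, sizeof(Hash_Bucket_t) 216
print(hash_table_mb(25, 216))             # 6912.0
```

That is 6.75 GB for the table alone, on top of the 128 MB "check" array the log also reports.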
Hi,
Are you running correction with Illumina data? The built-in CA overlapper is no longer recommended (it is only used as a fallback). I would recommend installing SMRTportal (blasr and sawriter), which will be used by default for Illumina correction if available. If you are using only PacBio data for self-correction, it should be using MHAP, not the CA overlapper.
Sergey
[Bri removed quoted email]
Last edit: Brian Walenz 2015-02-25
I am not using Illumina data, just PacBio for self-correction, and am attempting to use MHAP. Is there something wrong with my command?
/wgs-4604/Linux-amd64/bin/PBcR -pbCNS -partitions 400 -fastq $fastq -s ../../pacbio.spec -length 1000 -libraryname $sample -threads $nproc -genomeSize 4600000 localStaging=/tmp
pacbio.spec:
# original asm settings
utgErrorRate = 0.25
utgErrorLimit = 4.5
cnsErrorRate = 0.25
cgwErrorRate = 0.25
ovlErrorRate = 0.25
merSize=14
# grid info
useGrid = 0
scriptOnGrid = 0
frgCorrOnGrid = 0
ovlCorrOnGrid = 0
ovlMemory=8GB --hashload 0.7
ovlHashBits = 25
ovlThreads = 2
ovlHashBlockLength = 20000000
ovlRefBlockSize = 50000000
# for mer overlapper
merCompression = 1
merOverlapperSeedBatchSize = 500000
merOverlapperExtendBatchSize = 250000
frgCorrThreads = 8
frgCorrBatchSize = 100000
ovlCorrBatchSize = 100000
# non-Grid settings, if you set useGrid to 0 above these will be used
merylMemory = 128000
merylThreads = 8
ovlStoreMemory = 8192
ovlConcurrency = 8
cnsConcurrency = 8
merOverlapperThreads = 3
merOverlapperSeedConcurrency = 3
merOverlapperExtendConcurrency = 3
frgCorrConcurrency = 2
ovlCorrConcurrency = 4
cnsConcurrency = 4
Thanks,
Jeremy
Your command and spec should be fine, though the spec is quite old and most of its options are unnecessary (all those starting with mer, as well as the error rates). I would suggest just using the MHAP spec provided with our publication (or the one that comes with the E. coli sample dataset).
http://www.cbcb.umd.edu/software/PBcR/mhap/asm/pacbio.spec
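To see which entries in an older spec match that description, a small Python sketch can list the options whose names start with "mer" or end with "ErrorRate". The helper name and the matching rule are assumptions made for illustration; whether a given option is actually ignored depends on the PBcR version in use.

```python
def possibly_unneeded(spec_text):
    """Return option names that start with 'mer' or end with 'ErrorRate'."""
    flagged = []
    for line in spec_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if "=" not in line:
            continue                            # not a key=value setting
        name = line.split("=", 1)[0].strip()
        if name.startswith("mer") or name.endswith("ErrorRate"):
            flagged.append(name)
    return flagged

# A few lines copied from the spec posted earlier in this thread
spec = """
utgErrorRate = 0.25
merSize=14
ovlThreads = 2
merylMemory = 128000
merOverlapperThreads = 3
"""
print(possibly_unneeded(spec))
```

Note that a literal prefix match also flags merylMemory and merylThreads; the reply does not say whether those were meant to be included, so treat the output as a list to review rather than a list to delete.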
However, your spec file shouldn’t affect running MHAP. Can you run ls on Linux-amd64/lib/java to make sure the mhap jar is present? Can you also provide any output from the pipeline as it runs?
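The jar check can also be scripted. Below is a minimal Python sketch that lists jar files in the lib/java directory whose name contains "mhap". The install prefix is the one that appears in the backtrace in this thread; the assumption that the jar's file name contains "mhap" is mine, and the function name is hypothetical.

```python
from pathlib import Path

def find_mhap_jars(java_dir):
    """List .jar files directly under java_dir whose name contains 'mhap'."""
    root = Path(java_dir)
    if not root.is_dir():
        return []                     # directory missing: nothing to report
    return sorted(p.name for p in root.iterdir()
                  if p.suffix == ".jar" and "mhap" in p.name.lower())

# Install prefix taken from the backtrace in this thread; adjust as needed.
print(find_mhap_jars("/wgs-4604/Linux-amd64/lib/java"))
```

An empty list means either the directory does not exist or no matching jar was found in it.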
Thanks
Sergey,
I used your updated spec file, the mhap jar is present and it appears to be using MHAP now, instead of blasr. However, it's still failing during the overlapper step of the assembly, after self-correction. The output from PBcR is attached.
Thanks,
Jeremy
It looks like this segfault was fixed in revision 4624.
Thanks for the update. Glad to hear your issue is fixed.