From: Jason H. <jas...@zo...> - 2014-08-18 17:11:24
Hello PBcR and WGS community,

I’m working with what should be 100x PacBio coverage, and after running PBcR I end up with at best 7x-8x of corrected reads. My initial read set is about 11 million reads with an average length of 3000 bp. After error correction, my best run produced 1.2 million reads with an average length of 2000 bp.

My genome is from a terrestrial insect and has relatively high heterozygosity. To account for this I’ve increased both maxCoverage and the genome size, but I get fewer and shorter reads than with the default PBcR parameters.

My current run uses the spec file and command below. I’m using the latest version of WGS, 8.2b.

############## pacbio.spec #############
assemble = 0
localStaging = /wgs_pacbio_assembly/PBcR_self_correction/staging

#faster overlapper with more sensitive settings
mhap = "-k 16 --num-hashes 1256 --num-min-matches 3 --threshold 0.04"
merSize = 16

#system memory parameters to avoid fraction bug
ovlMemory = 512
ovlStoreMemory = 512000
merylMemory = 512000

#increase coverage depth to counter heterozygosity/error rate
#usually results in less corrected reads
maxCoverage = 60

#increase genome size to counter heterozygosity, actual genome size 350MB
#usually results in less corrected reads
genomeSize = 500000000
#####################################

$PBcR -pbCNS \
  -length 300 \
  -partitions 65 \
  -l corrected_pb_1 \
  -t 64 \
  -s pacbio.spec \
  -noclean \
  -fastq pb.fastq 2>&1 | tee self_corrected_pb_1.log

When I look at the corrected read lists in the temporary directory, I see reads marked as deleted even though their length looks like it should make the cut. For example:

>100003680002,3680002 mate=0,0 lib=corrected_pb_1,1 clr=LATEST,1,2219 deleted=1
cgtatgtaaaccaattttatactgatggggcgcgaaataacttttcttaagttccttgtgtccaaaca…

The sequence continues for a total of 2219 bp.

As it stands, none of the overlap-layout assemblers can do much with the low coverage I end up with, so I’m very eager to hear ideas on how to move this forward.
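For reference, the coverage figures quoted above can be reproduced with simple arithmetic. This is only a minimal sketch, assuming the approximate read counts and mean lengths given above and taking fold coverage as total bases divided by genome size; the function and variable names are illustrative and are not part of PBcR or WGS.

```python
def fold_coverage(num_reads, mean_read_len_bp, genome_size_bp):
    """Approximate fold coverage: total sequenced bases / genome size."""
    return num_reads * mean_read_len_bp / genome_size_bp


# Values taken from the description above (all approximate).
ACTUAL_GENOME_BP = 350_000_000   # stated actual genome size
SPEC_GENOME_BP = 500_000_000     # inflated genomeSize used in pacbio.spec

raw = fold_coverage(11_000_000, 3000, ACTUAL_GENOME_BP)
corrected = fold_coverage(1_200_000, 2000, ACTUAL_GENOME_BP)
corrected_vs_spec = fold_coverage(1_200_000, 2000, SPEC_GENOME_BP)

print(f"raw coverage at 350 Mb:       {raw:.1f}x")
print(f"corrected coverage at 350 Mb: {corrected:.1f}x")
print(f"corrected coverage at 500 Mb: {corrected_vs_spec:.1f}x")
```

With these inputs the raw data comes to roughly 94x and the corrected reads to just under 7x of the 350 Mb genome, or about 4.8x measured against the 500 Mb genomeSize in the spec file.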
Would you please take a look and let me know how you would proceed? I would be happy to supply any additional information and files.

-Jason