insert size estimation bug

de novo assembly & analysis of Illumina sequence data

Brought to you by: koadman

#4 insert size estimation bug

Status: New

Owner: nobody

Labels: None

Priority: Medium

Type: Defect

Updated: 2012-10-30

Created: 2012-10-30

Creator: Anonymous

Private: No

Originally created by: nyoun... (code.google.com)@gmail.com

First off, thanks for creating the pipeline. In my tests comparing de novo Illumina paired-end assemblies of methanogen genomes against closed (or almost closed) versions of the same genomes, your pipeline performs better than Velvet, Newbler, CLC Genomics, and SPAdes in terms of contiguity and fidelity (i.e. miscalled bases).

I've been using different numbers of reads as input and noticed a possible bug in the insert size estimation. With 2 million read pairs, the insert size is estimated as 100 bp. With more read pairs (4 million to 10 million), the estimate is ~430 bp, which is what it should be.

I've tried supplying an insert size in a library file, but because the pipeline is still able to produce an estimate (the incorrect 100 bp), the insert size I provide is not used.
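
For anyone who wants to cross-check the estimate outside the pipeline, here is a minimal sketch. It is not the pipeline's own estimator, whose method is not described in this ticket. It assumes the read pairs have been aligned to a reference or to the assembled contigs with some aligner and written out as SAM, and it takes the median of the template length column (TLEN, column 9) over properly paired reads. The function names and the `max_insert` cutoff are hypothetical choices made for illustration.

```python
import statistics


def insert_sizes_from_sam(lines, max_insert=2000):
    """Yield TLEN (SAM column 9) for properly paired reads.

    max_insert is an assumed cutoff that discards implausibly long templates.
    """
    for line in lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        tlen = int(fields[8])
        # Flag bit 0x2 marks a properly paired read. Keeping only positive
        # TLEN counts each pair once (the mate carries the negative value).
        if flag & 0x2 and 0 < tlen <= max_insert:
            yield tlen


def estimate_insert_size(lines, max_insert=2000):
    """Return (median insert size, number of pairs used), or None if no pairs."""
    sizes = list(insert_sizes_from_sam(lines, max_insert))
    if not sizes:
        return None
    return statistics.median(sizes), len(sizes)
```

Usage would look something like `estimate_insert_size(open("pairs.sam"))`, where `pairs.sam` is a placeholder file name. Running it on the 2 million pair subset and on the larger sets would show whether the 100 bp value comes from the data or from the estimation step.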

Thanks again for constructing the pipeline.

Nick
