Re: [Bio-bwa-help] inconsistent read and CIGAR lengths in 0.7.3a?
From: Heng Li <lh...@sa...> - 2013-03-18 16:04:23
Could you show an example? Is it possible to give me your reference genome (is it public)? You are aligning to an assembly. It is possible that I have overlooked some corner cases in the presence of many short contigs.

Thanks,

Heng

PS: I guess the total length of contigs is around 268Mb?

On Mar 18, 2013, at 11:50 AM, Douglas G. Scofield wrote:

> Hi, I've been mapping some 100-200 bp single-end reads composed of joined overlapping PE reads. This error of inconsistent read and CIGAR lengths comes from samtools view -Sb receiving bwa mem output, and has turned up in most mappings of individual lanes. The sequence length is always a length consistent with our dataset, and the CIGAR length is always large and of the same magnitude.
>
> ./bwa-0.7.3a/bwa mem -t 8 -M ref.fa joined-reads.fq.gz | samtools view -Sb - > joined.bam
> [M::main_mem] read 542310 sequences (80000143 bp)...
> [samopen] SAM header is present: 10253694 sequences.
> [M::main_mem] read 542074 sequences (80000131 bp)...
> [M::main_mem] read 542126 sequences (80000106 bp)...
> [M::main_mem] read 541780 sequences (80000144 bp)...
> [M::main_mem] read 542008 sequences (80000233 bp)...
> [M::main_mem] read 541446 sequences (80000252 bp)...
> [M::main_mem] read 541508 sequences (80000213 bp)...
> [M::main_mem] read 541508 sequences (80000083 bp)...
> [M::main_mem] read 541194 sequences (80000233 bp)...
> [M::main_mem] read 541150 sequences (80000155 bp)...
> [M::main_mem] read 541192 sequences (80000148 bp)...
> [M::main_mem] read 541256 sequences (80000178 bp)...
> [M::main_mem] read 541900 sequences (80000175 bp)...
> [M::main_mem] read 541452 sequences (80000147 bp)...
> Line 17958452, sequence length 192 vs 268435648 from CIGAR
> Parse error at line 17958452: CIGAR and sequence length are inconsistent
>
> Other examples of the error message, showing a 1-1 mapping from sequence length to CIGAR length:
>
> Line 21464310, sequence length 185 vs 268435641 from CIGAR
> Line 19254499, sequence length 176 vs 268435632 from CIGAR
> Line 18340511, sequence length 189 vs 268435645 from CIGAR
> Line 11148681, sequence length 182 vs 268435638 from CIGAR
> Line 21119586, sequence length 192 vs 268435648 from CIGAR
> Line 17331580, sequence length 191 vs 268435647 from CIGAR
> Line 10555478, sequence length 182 vs 268435638 from CIGAR
> Line 17958452, sequence length 192 vs 268435648 from CIGAR
> Line 11933371, sequence length 181 vs 268435637 from CIGAR
> Line 13738253, sequence length 179 vs 268435635 from CIGAR
> Line 23942863, sequence length 185 vs 268435641 from CIGAR
> Line 15246868, sequence length 190 vs 268435646 from CIGAR
> Line 14256497, sequence length 191 vs 268435647 from CIGAR
> Line 37720680, sequence length 192 vs 268435648 from CIGAR
> Line 17036331, sequence length 180 vs 268435636 from CIGAR
> Line 10740406, sequence length 186 vs 268435642 from CIGAR
> Line 12467794, sequence length 192 vs 268435648 from CIGAR
> Line 18170858, sequence length 189 vs 268435645 from CIGAR
> Line 10364600, sequence length 191 vs 268435647 from CIGAR
> Line 17908527, sequence length 192 vs 268435648 from CIGAR
> Line 11970932, sequence length 191 vs 268435647 from CIGAR
> Line 19278210, sequence length 186 vs 268435642 from CIGAR
> Line 10920332, sequence length 188 vs 268435644 from CIGAR
>
> Cheers,
>
> Doug
>
> Douglas G. Scofield
> Umeå Plant Sciences Centre
> Umeå University, Umeå Sweden
> dou...@pl...
> dou...@gm...
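The "1-1 mapping" in the quoted error lines can be checked mechanically: in every message, the CIGAR length minus the sequence length comes out to the same constant, 268435456, which is exactly 2^28 and in the range of the ~268Mb figure guessed at in the PS. Below is a minimal Python sketch of that check. It assumes the samtools messages have exactly the form quoted in this thread; the function name `cigar_excess` is hypothetical, and the sample is a subset of the quoted lines. It only confirms the arithmetic pattern and says nothing about the root cause.

```python
import re

# Matches the samtools message format quoted in this thread (an assumption:
# other samtools versions may word the message differently).
PATTERN = re.compile(r"Line (\d+), sequence length (\d+) vs (\d+) from CIGAR")


def cigar_excess(log_text):
    """Return the set of (CIGAR length - sequence length) values in log_text."""
    return {
        int(m.group(3)) - int(m.group(2))
        for m in PATTERN.finditer(log_text)
    }


# A few of the error lines from the original report.
sample = """\
Line 21464310, sequence length 185 vs 268435641 from CIGAR
Line 19254499, sequence length 176 vs 268435632 from CIGAR
Line 17958452, sequence length 192 vs 268435648 from CIGAR
"""

excess = cigar_excess(sample)
print(excess)                # {268435456}
print(excess == {2 ** 28})   # True
```

Running the same function over the full stderr of the pipeline would show whether any read deviates from this constant offset.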