From: Mark A. D. <ma...@de...> - 2010-12-11 14:11:02
I found it in the NA12878 HiSeq data set. Aligned by BWA and processed with the Picard process at the Broad. I'll dig into how this might have occurred, and I do think the spec should clearly state that such cigar operators are invalid.

On Dec 11, 2010, at 8:07 AM, Richard Durbin wrote:

> What BAM was this in? Was it a 1000 genomes DCC BAM?
>
> Sent from my iPhone
>
> On 10 Dec 2010, at 19:51, "Goncalo Abecasis" <go...@um...> wrote:
>
>> Seems like that should not pass; in fact, it is not clear to me how to interpret it.
>>
>> Goncalo
>>
>> From: Mark A. DePristo [mailto:ma...@de...]
>> Sent: Friday, December 10, 2010 2:34 PM
>> To: Samtools-devel
>> Subject: [Samtools-devel] Cigar elements not summing to read length?
>>
>> Hi all,
>>
>> Just a quick question. During some development in the GATK I discovered the following read in our HiSeq data set. It is 101 bp long, but the cigar string is 31M3D7M. It passes Picard validation, and the spec doesn't state this is disallowed. Is this a valid read? Is it an implicit clip? Perhaps the spec should be clarified.
>>
>> read length : 101
>> read start : 10058657 (10058657 unclipped)
>> cigar : 31M3D7M
>> read bases : TCACACCACTGCATTCCAGCCTGGGCAACAGAGCAAGACCCTGTCTCAAAAAAGAGAAAAAGAAATTTCAAGAAAAGATGATAGCTGTCCGAGATCGGAAG
>> original quals: @BC@@@BCADBCACCBBCCCDDCCBBACACCCCCCDCDAA;ACCCDCCCACDCCDCDDCBDDDDDCA+BCDDDDDD-DCADB:@=;>A>;;DCDAB:6:BC
>>
>> Best,
>>
>> Mark A. DePristo, Ph.D.
>> Manager, Medical and Population Genetics Analysis
>> Broad Institute of MIT and Harvard
>> dep...@br...
>> ma...@de...
> _______________________________________________
> Samtools-devel mailing list
> Sam...@li...
> https://lists.sourceforge.net/lists/listinfo/samtools-devel

Mark A. DePristo, Ph.D.
Manager, Medical and Population Genetics Analysis
Broad Institute of MIT and Harvard
dep...@br...
ma...@de...
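
For anyone who wants to screen their own reads for the mismatch discussed in this thread, here is a minimal sketch in Python. It assumes the rule that only the M, I, S, = and X operators consume read bases, so their lengths should sum to the read length. The function names (`cigar_query_length`, `cigar_matches_read_length`) are made up for illustration and are not part of samtools, Picard, or the GATK.

```python
import re

# CIGAR operators that consume read (query) bases. D, N, H and P do not.
QUERY_CONSUMING = set("MIS=X")
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_query_length(cigar):
    """Sum the lengths of the CIGAR elements that consume read bases."""
    elements = CIGAR_ELEMENT.findall(cigar)
    # Rebuild the string from the parsed elements; any difference means the
    # input contained something the pattern did not recognise.
    if "".join(length + op for length, op in elements) != cigar:
        raise ValueError("malformed CIGAR string: %r" % cigar)
    return sum(int(length) for length, op in elements if op in QUERY_CONSUMING)


def cigar_matches_read_length(cigar, read_length):
    """True if the CIGAR accounts for exactly read_length read bases."""
    return cigar_query_length(cigar) == read_length


# The read reported in this thread: 101 bp with cigar 31M3D7M.
print(cigar_query_length("31M3D7M"))              # 38 (31M + 7M; 3D consumes no read bases)
print(cigar_matches_read_length("31M3D7M", 101))  # False
```

The check only compares lengths. It does not say how a mismatched read should be interpreted, which is the open question above.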