From: Peter C. <p.j...@go...> - 2014-11-12 01:29:04
On Tue, Nov 11, 2014 at 5:56 PM, James Bonfield <jk...@sa...> wrote:
> Hello Andrew,
>
> On Mon, Nov 10, 2014 at 11:53:22AM -0800, Andrew McPherson wrote:
>> I would like to suggest a minor but important update/clarification for the
>> SAM format specification regarding unmapped reads.
>>
>> In the sam spec http://samtools.github.io/hts-specs/SAMv1.pdf, it says the
>> following about the flag:
>>
>> 0x10 SEQ being reverse complemented
>>
>> and further down in the notes:
>>
>> Bit 0x4 is the only reliable place to tell whether the read is unmapped. If
>> 0x4 is set, no assumptions can be made about RNAME, POS, CIGAR, MAPQ, bits
>> 0x2, 0x10, 0x100 and 0x800, and the bit 0x20 of the previous read in the
>> template.
>>
>> Thus for an unmapped read, we dont know if the reverse complement flag is
>> valid. This has important implications for extracting the original read
>> data from a bam file, since we dont know whether unmapped reads are
>> represented as-is in the bam, or have been reverse complemented. It would
>> be fairly simple to add the requirement that 0x10 is always valid,
>> regardless of 0x4.
>
> I completely agree. I'd forgotten about this issue, but now you've
> brought it up again it reminded me and sure enough I had the same
> thought:
>
> http://sourceforge.net/p/samtools/mailman/message/30514706/
>
> I wish I'd remembered it earlier (ie since I got more involved with
> the samtools core group). Feel free to create a github issue on this
> against hts-specs so we don't forget about it again by our next
> meeting (Monday).
>
> James

Ah - you found the old thread :)

I was sure this issue had come up but was unable to find that
conversation in my email archive :(

I agree, this needs to be made explicit, particularly for tasks
like "samtools bam2fq".

Peter
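
To make the ambiguity concrete, here is a minimal Python sketch of the
decision a bam2fq-style tool has to make when it recovers the original
read from a record. It is only an illustration, not how samtools does it:
the helper names (reverse_complement, original_read) and the
trust_reverse_when_unmapped switch are hypothetical, and only the flag
values 0x4 and 0x10 come from the spec text quoted above.

```python
# Illustrative sketch only; these helpers are hypothetical and not part of samtools.
FLAG_UNMAPPED = 0x4   # read is unmapped
FLAG_REVERSE = 0x10   # SEQ being reverse complemented

# Translation table for the plain bases plus N; other symbols pass through unchanged.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Complement each base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def original_read(flag, seq, qual, trust_reverse_when_unmapped=False):
    """Return (seq, qual) in what we believe is the original read orientation.

    For mapped reads, 0x10 says SEQ is stored reverse complemented, so it is
    undone here. For unmapped reads the quoted spec wording says no assumption
    can be made about 0x10, so by default the record is returned as stored.
    Setting trust_reverse_when_unmapped=True (an assumption, modelling the
    proposed rule that 0x10 is always valid) honours 0x10 regardless of 0x4.
    """
    unmapped = bool(flag & FLAG_UNMAPPED)
    reverse = bool(flag & FLAG_REVERSE)
    if unmapped and not trust_reverse_when_unmapped:
        return seq, qual
    if reverse:
        # Qualities are per base, so they are reversed along with the sequence.
        return reverse_complement(seq), qual[::-1]
    return seq, qual
```

With the default setting, a record with flag 0x14 (unmapped and 0x10 set)
comes back unchanged, whereas with trust_reverse_when_unmapped=True it is
reverse complemented. Two tools that pick different defaults would emit
different FASTQ for the same BAM, which is why the spec needs to say
explicitly whether 0x10 is meaningful when 0x4 is set.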