From: Heng Li <lh...@sa...> - 2010-06-22 03:59:43
On Tue, Jun 22, 2010 at 09:47:16AM +0800, Colin Hercus wrote:
> Hi Tim,
>
> Thanks for the quick reply.
>
> Thought of that one but I think it needs a large -ve number of N's,
> 68M-2000N32M

That is a problem because CIGAR disallows negative lengths. I do not have a
satisfactory solution. Perhaps I would store the split read in two records and
treat one part as a single-end read. This is certainly not optimal, though.

BTW, this is actually an example showing that arbitrarily defining orientation
in the @RG header is not always straightforward. When we have PacBio's strobe
reads, it will be even more difficult.

Heng

>
> [image: Screenshot.png]
> Colin
>
> On Tue, Jun 22, 2010 at 9:28 AM, Tim Fennell <tfe...@br...> wrote:
>
> > I think I'd be tempted to represent this as one primary record per end,
> > with the split end having a large N operation in the middle of its cigar.
> > So if the junction turned up at base 70/101 I'd pull together the split
> > read alignment and generate a single cigar of 69M2000N32M. The advantages I
> > see of doing it this way:
> >
> > 1) You still only have one sam record per end so all your usual inferences
> > apply
> > 2) All your bases from the one read are in one place
> > 3) You can actually count your split reads easily by asking how many reads
> > have jump-sized skips in them
> >
> > -t
> >
> > On Jun 21, 2010, at 9:22 PM, Colin Hercus wrote:
> >
> > > Hi,
> > >
> > > I'm not sure how to represent paired end reads when one read has
> > > split alignments and was wondering if someone could advise on the best
> > > method.
> > >
> > > The issue arises with Illumina mate pair libraries where the junction
> > > from circularisation may land in one of the reads and result in a split
> > > alignment for the read.
> > >
> > > SAM specifications allow two primary alignments for a split read and it
> > > seems OK to use this to split one read but then we have the other read
> > > which isn't split. To store two proper pairs with mate alignment
> > > locations, isize etc. we need to store two alignments for the unsplit
> > > read. So what do we do with the unsplit read, two records both primary
> > > or one primary and one secondary but both with a primary mate?
> > >
> > > Any suggestions on how to represent this in SAM would be appreciated.
> > >
> > > Thanks, Colin
> > >
> > > _______________________________________________
> > > Samtools-devel mailing list
> > > Sam...@li...
> > > https://lists.sourceforge.net/lists/listinfo/samtools-devel

--
The Wellcome Trust Sanger Institute is operated by Genome Research
Limited, a charity registered in England with number 1021457 and a
company registered in England with number 2742969, whose registered
office is 215 Euston Road, London, NW1 2BE.
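
For readers following the CIGAR discussion in this thread, here is a minimal Python sketch of the two points raised: a string such as 68M-2000N32M is not a valid CIGAR because lengths must be non-negative integers, and reads carrying a large N skip (as in 69M2000N32M) can be counted with a simple check. The function names (parse_cigar, query_length, has_jump_skip) and the min_skip threshold of 1000 are assumptions made for illustration; they do not come from samtools or from any poster in this thread.

```python
import re

# One CIGAR element: an unsigned decimal length followed by an operator.
_CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")

# Operators that consume bases of the read (query) sequence.
_QUERY_CONSUMING = set("MIS=X")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, op) pairs.

    Raises ValueError if any part of the string is not a valid element,
    which is what happens with a negative length such as '-2000N'.
    """
    ops = []
    pos = 0
    for match in _CIGAR_ELEMENT.finditer(cigar):
        if match.start() != pos:
            raise ValueError("invalid CIGAR near offset %d: %r" % (pos, cigar))
        ops.append((int(match.group(1)), match.group(2)))
        pos = match.end()
    if pos != len(cigar) or not ops:
        raise ValueError("invalid CIGAR: %r" % cigar)
    return ops


def query_length(cigar):
    """Number of read bases described by the CIGAR."""
    return sum(n for n, op in parse_cigar(cigar) if op in _QUERY_CONSUMING)


def has_jump_skip(cigar, min_skip=1000):
    """True if the CIGAR contains an N operation of at least min_skip bases.

    min_skip is a hypothetical threshold; pick one that suits the library.
    """
    return any(op == "N" and n >= min_skip for n, op in parse_cigar(cigar))


if __name__ == "__main__":
    print(parse_cigar("69M2000N32M"))
    print(query_length("69M2000N32M"))
    print(has_jump_skip("69M2000N32M"))
    try:
        parse_cigar("68M-2000N32M")
    except ValueError as err:
        print("rejected:", err)
```

With these definitions, 69M2000N32M accounts for 69 + 32 = 101 read bases, matching the 101-base read with a junction at base 70 in Tim's example, while the negative-skip form is rejected at parse time.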