Re: [Bio-bwa-help] mate-sw mapping
From: Heng Li <lh...@sa...> - 2010-10-29 15:56:50
On Oct 29, 2010, at 11:44 AM, Rusch, Michael wrote:

> OK, thanks. I did the alignment in batches. I'm not sure which batch these reads were in, but the stderr for most batches contained several copies of this:
>
> [infer_isize] fail to infer insert size: weird pairing
>
> One batch had a different error message:
>
> [infer_isize] fail to infer insert size: too few good pairs
>
> One batch had a mixture of the two.
>
> So, I'm assuming this is moving us closer to the cause here. I'm not sure what the definition of "weird" is in this case, or what sorts of problems I should be looking for...

That's why mate-sw is not working. You'd better plot the insert size distribution. Using 0.5.8c may also be worthwhile, as it is a little more robust in estimating the insert size distribution.

Heng

>
> Michael
>
> -----Original Message-----
> From: Heng Li [mailto:lh...@sa...]
> Sent: Friday, October 29, 2010 10:35 AM
> To: Rusch, Michael
> Subject: Re: [Bio-bwa-help] mate-sw mapping
>
>
> On Oct 29, 2010, at 11:28 AM, Rusch, Michael wrote:
>
>>> What is the inferred insert size by bwa?
>>
>> I'm not sure. How do I find that out?
>
> In the stderr output. You'd better grab the lines starting with something like "infer_isize". Thanks.
>
>>
>>> I think there is no way to force mate-sw, unless you change the source code.
>>
>> What I was doing was setting the insert size explicitly with -a, but I see now that that wouldn't work anyway because that's only used for setting the proper pair flag, right? That is, it doesn't factor into the SW realignment process, right?
>
> No, it does not work.
>
> Heng
>
>>
>> Michael
>>
>> -----Original Message-----
>> From: Heng Li [mailto:lh...@sa...]
>> Sent: Friday, October 29, 2010 10:23 AM
>> To: Rusch, Michael
>> Cc: bio...@li...
>> Subject: Re: [Bio-bwa-help] mate-sw mapping
>>
>>
>> On Oct 29, 2010, at 11:05 AM, Rusch, Michael wrote:
>>
>>> We have two sets of 100bp paired-end reads for the same thing that came from two different runs.
>>> Looking at one particular region, I see that with the data from one run we are getting soft-clipping at that region, and with the data from the other run we are just getting a drop in coverage, with no soft clipping. We did a significant amount of analysis to find that we have pairs of reads where one read maps just fine and the other is unmapped, but the unmapped read should be able to map perfectly with some soft clipping near the first read. I don't understand why these aren't being mapped Mate-SW.
>>>
>>> I pulled out one example read pair and attached it.
>>>
>>> Here are the alignments I get (with sequence and quality snipped out to not clutter up the email):
>>> myread 73 11 117860457 37 100M = 117860457 0 <snip> X0:i:1 X1:i:0 MD:Z:100 XG:i:0 AM:i:0 NM:i:0 SM:i:37 XM:i:0 XO:i:0 XT:A:U
>>> myread 133 11 117860457 0 * = 117860457 0
>>>
>>> From BLAT, I find that bases 44-100 of the second read should map in the reverse orientation to 11:117860750-117860806.
>>
>> What is the inferred insert size by bwa?
>>
>>> I don't understand why this wouldn't map Mate-SW. I even tried running sampe with insert sizes set explicitly to try to coerce it to map, but it just would not.
>>
>> I think there is no way to force mate-sw, unless you change the source code.
>>
>> Heng
>>
>>>
>>> For one read I wouldn't care, but we've found dozens of reads in this case that were unmapped but should have, as far as I can see, been mapped Mate-SW just to this one particular locus. It also doesn't make sense because in our other dataset, we got lots of mappings here that were mapped mate-sw.
>>>
>>> Michael
>>>
>>> Email Disclaimer: www.stjude.org/emaildisclaimer
>>> <s_2.fq><s_1.fq>
>>> _______________________________________________
>>> Bio-bwa-help mailing list
>>> Bio...@li...
>>> https://lists.sourceforge.net/lists/listinfo/bio-bwa-help
>>
>> --
>> The Wellcome Trust Sanger Institute is operated by Genome Research
>> Limited, a charity registered in England with number 1021457 and a
>> company registered in England with number 2742969, whose registered
>> office is 215 Euston Road, London, NW1 2BE.
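
The suggestion above to look at the insert size distribution can be followed without any plotting library by tabulating column 9 (TLEN) of the SAM output. The following is only a minimal sketch under stated assumptions: the input is SAM text such as `bwa sampe` writes, and the function names `insert_sizes`, `histogram` and `summarize`, the MAPQ cutoff of 20 and the 50 bp bin width are illustrative choices, not anything defined by bwa.

```python
from collections import Counter
import statistics


def insert_sizes(sam_lines, min_mapq=20):
    """Yield |TLEN| once per pair where both mates map to the same reference."""
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:               # not a usable alignment record
            continue
        flag, mapq, tlen = int(fields[1]), int(fields[4]), int(fields[8])
        paired = flag & 0x1
        either_unmapped = flag & (0x4 | 0x8)   # read or mate unmapped
        first_in_pair = flag & 0x40            # count each pair only once
        if not paired or either_unmapped or not first_in_pair:
            continue
        # RNEXT "=" means the mate is on the same reference sequence.
        if fields[6] != "=" or mapq < min_mapq or tlen == 0:
            continue
        yield abs(tlen)


def histogram(sizes, bin_width=50):
    """Count insert sizes per bin, keyed by the lower edge of each bin."""
    counts = Counter((s // bin_width) * bin_width for s in sizes)
    return dict(sorted(counts.items()))


def summarize(sizes):
    """Return the number of usable pairs and their median insert size."""
    sizes = list(sizes)
    return {"n": len(sizes), "median": statistics.median(sizes) if sizes else None}
```

With a hypothetical file `aln.sam`, something like `histogram(insert_sizes(open("aln.sam")))` gives per-bin counts to inspect. Records such as the flag 73 / flag 133 pair quoted in the thread are skipped, because one mate is unmapped and so carries no insert size. A very wide or multi-peaked histogram would be one thing to look for when the stderr reports a failure to infer the insert size.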