From: Feiyu Du <fd...@ge...> - 2010-04-22 14:49:22
Thanks a lot! It's very helpful.

Feiyu

Tim Fennell wrote:
> Hi Feiyu,
>
> The algorithm probably does need describing somewhere in detail, but I
> don't believe I have anything handy. Essentially what it does (for
> pairs; single-end data is also handled) is to find the 5' coordinates
> and mapping orientations of each read pair. When doing this it takes
> into account all clipping that has taken place as well as any gaps or
> jumps in the alignment. You can thus think of it as determining "if
> all the bases from the read were aligned, where would the 5' most base
> have been aligned". It then matches all read pairs that have
> identical 5' coordinates and orientations and marks as duplicates all
> but the "best" pair. "Best" is defined as the read pair having the
> highest sum of base qualities over bases with Q >= 15.
>
> Hope that helps,
>
> -t
>
>
> On Apr 20, 2010, at 1:54 PM, Feiyu Du wrote:
>
>> Hi,
>>
>> I am new to MarkDuplicates. Could someone describe how it works or point
>> to a helpful documentation site? Also, would quality trimming (resulting
>> in varied lengths of paired ends) have an effect on MarkDup outputs?
>>
>> thanks a lot,
>>
>> Feiyu
>>
>> _______________________________________________
>> Samtools-help mailing list
>> Sam...@li...
>> https://lists.sourceforge.net/lists/listinfo/samtools-help
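
The procedure Tim describes can be sketched roughly as follows. This is only an illustration of the description in this thread, not the actual MarkDuplicates code. The function names, the dict layout for reads and pairs, the handling of hard clips alongside soft clips, and the tie-breaking rule (first pair seen wins) are all assumptions made for the example.

```python
import re
from collections import defaultdict

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def unclipped_five_prime(pos, cigar, reverse):
    """Where the 5'-most base would sit if clipped bases had been aligned.

    pos is the 1-based leftmost aligned position, as in a SAM record.
    """
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    if not reverse:
        # Forward strand: the 5' end is on the left, so undo leading clips.
        lead = 0
        for n, op in ops:
            if op not in "SH":
                break
            lead += n
        return pos - lead
    # Reverse strand: the 5' end is on the right. Walk across the reference
    # span (M, D, N, =, X cover gaps and jumps), then add trailing clips.
    ref_len = sum(n for n, op in ops if op in "MDN=X")
    trail = 0
    for n, op in reversed(ops):
        if op not in "SH":
            break
        trail += n
    return pos + ref_len - 1 + trail

def score(quals, min_q=15):
    """Sum of base qualities, counting only bases with Q >= min_q."""
    return sum(q for q in quals if q >= min_q)

def mark_duplicates(pairs):
    """Return the names of pairs that would be flagged as duplicates."""
    groups = defaultdict(list)
    for pair in pairs:
        # Key on the 5' coordinate and orientation of both ends.
        key = tuple(sorted(
            (r["chrom"],
             unclipped_five_prime(r["pos"], r["cigar"], r["reverse"]),
             r["reverse"])
            for r in pair["reads"]))
        groups[key].append(pair)
    dups = set()
    for members in groups.values():
        best = max(members,
                   key=lambda p: sum(score(r["quals"]) for r in p["reads"]))
        dups.update(p["name"] for p in members if p is not best)
    return dups

# Hypothetical data: pair "b" is pair "a" with 5 low-quality bases
# soft-clipped at the 5' end of each read.
pairs = [
    {"name": "a", "reads": [
        {"chrom": "chr1", "pos": 100, "cigar": "50M", "reverse": False,
         "quals": [30] * 50},
        {"chrom": "chr1", "pos": 300, "cigar": "50M", "reverse": True,
         "quals": [30] * 50}]},
    {"name": "b", "reads": [
        {"chrom": "chr1", "pos": 105, "cigar": "5S45M", "reverse": False,
         "quals": [10] * 5 + [30] * 45},
        {"chrom": "chr1", "pos": 300, "cigar": "45M5S", "reverse": True,
         "quals": [30] * 45 + [10] * 5}]},
]

print(mark_duplicates(pairs))
```

In this made-up example both pairs resolve to the same 5' coordinates (100 on the forward strand, 349 on the reverse strand) even though their clipping differs, so they fall into one group. Pair "a" has the higher quality sum, so pair "b" is the one reported as a duplicate.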