From: Shankar A. S. <sha...@gm...> - 2010-03-26 20:53:00
Thanks, Heng.

I tried option 2 that you suggested. fixmate didn't work on the name-sorted BAM with the trailing read1 and read2 identifiers ("/[12]") left in the read names. Only the strand flag remains set.

Shankar

On Mon, Mar 22, 2010 at 5:30 PM, Heng Li <lh...@sa...> wrote:
> Fixmate tells read1 and read2 from the trailing "/[12]" in read names. As
> you do not have this information, fixmate cannot do the job and rmdup
> will fail. It is recommended to use an aligner that generates proper
> FLAGs or to keep "/[12]" in the read names. Manually fixing flags is
> often problematic.
>
> Heng
>
> On Mon, Mar 22, 2010 at 08:46:16PM +0000, Shankar Ajay Subramanian wrote:
>> Hi all,
>>
>> I've been trying to get a predictable outcome with rmdup but I can't seem to.
>>
>> Here are the steps that I've gone through to try and remove duplicate
>> read pairs:
>>
>> Step 1: Name sorted bam file
>>
>> 510.1.1.1204.14 16 chr12 132047285 91 100M * 0 0
>> CCTTGCCAAGACTTGACNNNNTNNTTCCTCCTCCCCATACAATCACTTATCTTTTGTAAATTAATATGTATTAATGTGGAGTCCTAATTAGGGAAAAAGA
>> ###########################?*894<==GHGGHFFBFFFDFG;HHECHHHHHHHHHHFHHHHHHHHHHHHHHHHFHHHHHHHHHHEEHHHHHH
>> 510.1.1.1204.14 0 chr12 132047018 72 1S84M15S * 0 0
>> NCAAACTACAAAGACCAAANTGANNTGACATTAAAAATACCTTAATTAGNTTATTTACGTGCATAATTTTTAAAAACTGAGTCTACNNNNNNNNNANNNN
>> #++**7::50FFFFF+++'#+*+##+++**FFF<FFFF<7FFFFF+++*#+*+'&FF=FFFF==<?8;=?##############################
>> 510.1.1.1214.14997 16 chr12 132047285 91 100M * 0 0
>> CCTTGCCAAGACTTGACNNNNTNNTTCCTCCTCCCCATACAATCACTTATCTTTTGTAAATTAATATGTATTAATGTGGAGTCCTAATTAGGGAAAAAGA
>> ###########################?*894<==GHGGHFFBFFFDFG;HHECHHHHHHHHHHFHHHHHHHHHHHHHHHHFHHHHHHHHHHEEHHHHHH
>> 510.1.1.1214.14997 0 chr12 132047018 72 1S84M15S * 0 0
>> NCAAACTACAAAGACCAAANTGANNTGACATTAAAAATACCTTAATTAGNTTATTTACGTGCATAATTTTTAAAAACTGAGTCTACNNNNNNNNNANNNN
>> #++**7::50FFFFF+++'#+*+##+++**FFF<FFFF<7FFFFF+++*#+*+'&FF=FFFF==<?8;=?##############################
>>
>> Step 2: samtools fixmate test.nameSrt.bam test.nameSrt.fxm.bam
>>
>> 510.1.1.1204.14 16 chr12 132047285 91 100M = 132047018 -367
>> CCTTGCCAAGACTTGACNNNNTNNTTCCTCCTCCCCATACAATCACTTATCTTTTGTAAATTAATATGTATTAATGTGGAGTCCTAATTAGGGAAAAAGA
>> ###########################?*894<==GHGGHFFBFFFDFG;HHECHHHHHHHHHHFHHHHHHHHHHHHHHHHFHHHHHHHHHHEEHHHHHH
>> 510.1.1.1204.14 32 chr12 132047018 72 1S84M15S = 132047285 367
>> NCAAACTACAAAGACCAAANTGANNTGACATTAAAAATACCTTAATTAGNTTATTTACGTGCATAATTTTTAAAAACTGAGTCTACNNNNNNNNNANNNN
>> #++**7::50FFFFF+++'#+*+##+++**FFF<FFFF<7FFFFF+++*#+*+'&FF=FFFF==<?8;=?##############################
>> 510.1.1.1214.14997 16 chr12 132047285 91 100M = 132047018 -367
>> CCTTGCCAAGACTTGACNNNNTNNTTCCTCCTCCCCATACAATCACTTATCTTTTGTAAATTAATATGTATTAATGTGGAGTCCTAATTAGGGAAAAAGA
>> ###########################?*894<==GHGGHFFBFFFDFG;HHECHHHHHHHHHHFHHHHHHHHHHHHHHHHFHHHHHHHHHHEEHHHHHH
>> 510.1.1.1214.14997 32 chr12 132047018 72 1S84M15S = 132047285 367
>> NCAAACTACAAAGACCAAANTGANNTGACATTAAAAATACCTTAATTAGNTTATTTACGTGCATAATTTTTAAAAACTGAGTCTACNNNNNNNNNANNNN
>> #++**7::50FFFFF+++'#+*+##+++**FFF<FFFF<7FFFFF+++*#+*+'&FF=FFFF==<?8;=?##############################
>>
>> Step 3: samtools rmdup test.nameSrt.fxm.bam test.nameSrt.fxm.uniq.bam
>>
>> 510.1.1.1204.14 16 chr12 132047285 91 100M = 132047018 -367
>> CCTTGCCAAGACTTGACNNNNTNNTTCCTCCTCCCCATACAATCACTTATCTTTTGTAAATTAATATGTATTAATGTGGAGTCCTAATTAGGGAAAAAGA
>> ###########################?*894<==GHGGHFFBFFFDFG;HHECHHHHHHHHHHFHHHHHHHHHHHHHHHHFHHHHHHHHHHEEHHHHHH
>> 510.1.1.1204.14 32 chr12 132047018 72 1S84M15S = 132047285 367
>> NCAAACTACAAAGACCAAANTGANNTGACATTAAAAATACCTTAATTAGNTTATTTACGTGCATAATTTTTAAAAACTGAGTCTACNNNNNNNNNANNNN
>> #++**7::50FFFFF+++'#+*+##+++**FFF<FFFF<7FFFFF+++*#+*+'&FF=FFFF==<?8;=?##############################
>> 510.1.1.1214.14997 16 chr12 132047285 91 100M = 132047018 -367
>> CCTTGCCAAGACTTGACNNNNTNNTTCCTCCTCCCCATACAATCACTTATCTTTTGTAAATTAATATGTATTAATGTGGAGTCCTAATTAGGGAAAAAGA
>> ###########################?*894<==GHGGHFFBFFFDFG;HHECHHHHHHHHHHFHHHHHHHHHHHHHHHHFHHHHHHHHHHEEHHHHHH
>> 510.1.1.1214.14997 32 chr12 132047018 72 1S84M15S = 132047285 367
>> NCAAACTACAAAGACCAAANTGANNTGACATTAAAAATACCTTAATTAGNTTATTTACGTGCATAATTTTTAAAAACTGAGTCTACNNNNNNNNNANNNN
>> #++**7::50FFFFF+++'#+*+##+++**FFF<FFFF<7FFFFF+++*#+*+'&FF=FFFF==<?8;=?##############################
>>
>> Changing the flags from 16 to 83 and 32 to 163 didn't seem to work
>> either. Is there something fundamentally wrong with what I am doing?
>>
>> Any help will be much appreciated.
>>
>> Thanks,
>> Shankar
>
> --
> The Wellcome Trust Sanger Institute is operated by Genome Research
> Limited, a charity registered in England with number 1021457 and a
> company registered in England with number 2742969, whose registered
> office is 215 Euston Road, London, NW1 2BE.
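
For anyone following the flag values in this thread (16, 32, 83, 163), here is a minimal Python sketch that decodes a SAM FLAG integer into its individual bits. The bit values are those defined for the FLAG field in the SAM specification; the short label strings and the function name decode_flag are just illustrative names chosen here, not anything from samtools itself.

```python
# Bit values follow the FLAG field definition in the SAM specification.
# The label strings are illustrative names chosen for this sketch.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value, lowest bit first."""
    return [name for bit, name in sorted(SAM_FLAG_BITS.items()) if flag & bit]


if __name__ == "__main__":
    # The four flag values that appear in the thread.
    for flag in (16, 32, 83, 163):
        print(flag, decode_flag(flag))
```

Decoded this way, 16 and 32 are strand bits only (read reverse, mate reverse), with no paired or read1/read2 bits set, which matches the observation that only the strand flag is set after fixmate. 83 and 163 are the values a fully flagged, properly paired read1/read2 in this orientation would carry.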