From: Klaudia W. <kw...@sa...> - 2010-02-26 15:34:33
Hi Alec,

I tried to change it, but I still get the same error message:

Illegal mate state

Could you please let me know which combinations are considered legal
mate combinations in Picard?

Many thanks,
Klaudia

On 25 Feb 2010, at 01:28, Alec Wysoker wrote:

> Hi Klaudia,
>
> Although the SAM spec does allow the info about which end is which
> to be omitted, we (Picard development team) feel that this is bad
> practice and we don't want to encourage it. Can you write a perl or
> python script to arbitrarily assign ends to be first or second of
> pair?
>
> -Alec
>
> Klaudia Walter wrote:
>> Hi all,
>>
>> I found the following flags paired up 17 with 33 and 19 with 35,
>> which do not contain the information whether they are the first or
>> the second mate, if I understand that correctly.
>>
>> 1st Example:
>>
>> SRR003669.14280418 17 1 1024770 78 51M = 1025080 259 TTTGGTCTGTTGTTCTAAGAATCGGAGAGAGAGAGGTTAAAATCTCCGACT :;7::;9<98=:=0:=@A>=>26:9B=A<B?A5B:<?;A???:=994;6:C RG:Z:SRR003669 MF:i:4 Aq:i:53 NM:i:3 UQ:i:72 H0:i:1 H1:i:0
>>
>> SRR003669.14280418 33 1 1025080 53 51M = 1024770 -259 TGGTCTATTGTTCTAAGAATCGGAGAGAGAGAGGTTAAAATCTCCAACTAT C99==@=??8>@;A;?=9@><6=1=9>;=<<8>=9@40A9A8>@6:>9?48 RG:Z:SRR003669 MF:i:4 Aq:i:53 NM:i:1 UQ:i:29 H0:i:0 H1:i:1
>>
>> 2nd Example:
>>
>> SRR003667.10102516 35 1 1102933 99 51M = 1103095 213 GTCAGTACTTTAGAGGATCCCCTTCCCCAGCAGGAATCCTGGGTGCTGAGG 3';-;+<35,;==;./<@8/;542864:901>*/5736))4*-)).*-/2) RG:Z:SRR003667 MF:i:18 Aq:i:57 NM:i:0 UQ:i:0 H0:i:1 H1:i:0
>>
>> SRR003667.10102516 19 1 1103095 99 51M = 1102933 -213 GGGGAGGGGTCTCAGGGCTCCTGACTTCTTCCATTCTTGCCCAGCCCACCC 40*595,+4><;<>69,?96=@0;<@=<><;<A>:><><<;>A;::<;96@ RG:Z:SRR003667 MF:i:18 Aq:i:57 NM:i:0 UQ:i:0 H0:i:1 H1:i:0
>>
>> I am not sure in which circumstances these flags are set. As a
>> solution for the SamToFastq tool, could not the mate with the
>> smaller chromosomal position be allocated as the first mate and the
>> other one as the second mate?
>>
>> Thanks,
>> Klaudia
>>
>>
>> On 23 Feb 2010, at 09:57, John Marshall wrote:
>>
>>> On 22 Feb 2010, at 22:56, Alec Wysoker wrote:
>>>> It looks like there is something strange with your input SAM
>>>> file. It appears that it does actually contain paired reads, but
>>>> there are two reads with the same name that are either both
>>>> marked as being the first of the pair or both marked as being the
>>>> second of the pair.
>>>
>>> I wonder whether Klaudia's input file contains non-primary
>>> alignments. I guess tools like SamToFastq need to allow for reads
>>> appearing in more than one SAM alignment record -- hopefully this
>>> could be as simple as ignoring non-primary records, though the
>>> hint about split hits in the flag field description suggests that
>>> might not be quite good enough.
>>>
>>> John
>>

--
The Wellcome Trust Sanger Institute is operated by Genome Research
Limited, a charity registered in England with number 1021457 and a
company registered in England with number 2742969, whose registered
office is 215 Euston Road, London, NW1 2BE.
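
For readers following the thread, here is a minimal Python sketch of the kind of script Alec suggests. It decodes the FLAG bits defined in the SAM spec, which shows that 17, 33, 19 and 35 carry neither the first-in-pair bit (0x40) nor the second-in-pair bit (0x80), and then assigns those bits using the heuristic Klaudia proposes (lower position becomes the first mate). The function names `decode_flag` and `assign_ends` are hypothetical, the position rule is only the suggestion made in this thread and not Picard's behaviour, and the sketch does not handle non-primary records or confirm that the resulting flags satisfy Picard's mate checks.

```python
# FLAG bits as defined in the SAM specification.
FLAG_NAMES = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "not_primary",
    0x200: "fails_qc",
    0x400: "duplicate",
}

PAIRED = 0x1
FIRST_IN_PAIR = 0x40
SECOND_IN_PAIR = 0x80


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_NAMES.items() if flag & bit]


def assign_ends(flag_a, pos_a, flag_b, pos_b):
    """Hypothetical helper: set first/second-in-pair bits on two mates.

    Assumption (from the thread, not from Picard): the mate with the
    smaller position becomes the first of the pair. Flags are returned
    unchanged if either record is unpaired or already has an end bit.
    """
    end_bits = FIRST_IN_PAIR | SECOND_IN_PAIR
    if not (flag_a & PAIRED and flag_b & PAIRED):
        return flag_a, flag_b
    if (flag_a & end_bits) or (flag_b & end_bits):
        return flag_a, flag_b
    if pos_a <= pos_b:
        return flag_a | FIRST_IN_PAIR, flag_b | SECOND_IN_PAIR
    return flag_a | SECOND_IN_PAIR, flag_b | FIRST_IN_PAIR


if __name__ == "__main__":
    # FLAG and POS values taken from the two examples in the message.
    for flag in (17, 33, 19, 35):
        print(flag, decode_flag(flag))
    print(assign_ends(17, 1024770, 33, 1025080))  # (81, 161)
    print(assign_ends(35, 1102933, 19, 1103095))  # (99, 147)
```

Applied to the examples above, the first pair would change from 17/33 to 81/161 and the second from 35/19 to 99/147. A full script would also need to rewrite the FLAG column of each SAM line and decide what to do with reads that appear in more than two records.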