|
From: SHAUN W. <sw...@st...> - 2011-11-15 11:25:55
|
Hi, I have a question about the picard tool SamToFastq. I have a bam file containing paired end reads and I have attempted to convert this back to fastq using the following code: java -Xmx10g -jar SamToFastq.jar VALIDATION_STRINGENCY=LENIENT INPUT=$i FASTQ=$i.fq SECOND_END_FASTQ=$i.2.fq The reads have been mapped with bwa and I have used lenient validation as Picard was complaining about some unmapped reads having a mapping quality>0. I'm not sure why bwa has given these reads a mapping quality, I suspect it is to do with the other pair but there are only ~200 out of ~200,000,000 so not that bothered about them. The main question I have is regarding reverse complement of sequence when converting to FastQ. I have the following paired end reads: HS18_6811:3:1304:5092:136893 1145 chr1 3009956 25 75M = 3009956 0 CTCTTCCGATCTTTTTCCATTTTCCATTTTCTTTGATTGTTGTGCCGATGTTCTCTATGGAATCTTCTGCACCTG ?ED@IIICKMBIK@KGGCJLBGHILKIMFIJGH<IJJKHJFIHLHLKIJHKOIAGFHJKL@DABJDJJIIIIFBC X0:i:1 X1:i:0 MD:Z:0G1A2A0G68 RG:Z:1 XG:i:0 AM:i:0 NM:i:4 SM:i:25 XM:i:4 XO:i:0 XT:A:U HS18_6811:3:1304:5092:136893 181 chr1 3009956 0 * = 3009956 0 CTCTTCCGATCTCAGGTGCAGAAGATTCCATAGAGAACATCGGCACAACAATCAAAGAAAATGGAAAATGGAAAA CFD@DIDKNHJ@ILDEJHEEFEFHGJFDICCGL=DG5CI:FIHJGCJCILIE@LKKEFICGHGIEHIDFFF?*59 RG:Z:1 The first is mapped to the reverse strand and flagged correctly. The second is unmapped but bwa gives it the reverse strand flag as this is where it's pair is mapped. In the output FastQ files both sequences are reverse complemented. I am unsure whether the unmapped read sequence in the bam file is the original read sequence or is rc'd when the bam file is written? Does the picard tool just check for a reverse flag or does it look to see if the read is mapped also? Thanks for any help Shaun Webb -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. |