Re: [maq-help] map NA18507
From: Joel M. <ano...@co...> - 2009-05-20 16:51:59
That makes sense, it also means fq_all2std.pl seqprb2std is bugged, at
least for pipeline 1.3.1+ data as it gave me the 10 4 10 10... scores.

Joel

David Dooling wrote:
> On Fri, May 15, 2009 at 11:01:55PM -0700, Joel Martin wrote:
>
>> the quality numbers look typically precalibration, here's one of mine,
>>
>> before ( prb file to phred score via fq_all2std.pl seqprb2std )
>> 10 4 10 10 10 0 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 7 10
>> 5 10 10 6 10 10 10 10 10 10 10 0
>> and after calibration
>> 29 26 31 31 29 24 27 32 31 30 29 31 27 28 31 31 30 32 28 30 32 32 30 29
>> 30 30 30 28 31 31 30 30 30 27 31 0
>>
>> not sure how you'd fix it, illumina is using a table lately for
>> calibration instead of requiring the phiX
>> values, I think, but i've little to do with that end of things. Could
>> just lower your thresholds :)
>> Erm, consed was converting from prb to qual in version 17 or 18, and I
>> don't remember seeing
>> scores this low so maybe there's a simple answer.
>
> I think the simpler answer is that those FASTQ files are using the
> standard (Sanger) offset of 33 (I = 40) rather than the Solexa
> convention of 64 (I = 9). If I recall correctly, the Solexa pipeline
> tops the quality off at 40 (and the recalibration may too). It would
> make more sense to have a bunch of high quality bases in a row capped
> at 40 than a bunch of mediocre reads with identical quality in a row.
>
>> kc...@wa... wrote:
>>
>>> Dear Colleagues,
>>>
>>> I am trying to map the NA18507 reads downloaded from
>>> http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?study=SRP000239
>>> to human reference sequence. However, I am getting strange mapping
>>> results. The maximal mapping quality is only 30 and the alternative
>>> mapping quality for a properly mapped pair is only 15.
>>>
>>> e.g.,
>>> SRR003971.245152 1 840373 + 219 18 30 15 15 1 3 0 1 36
>>> SRR003971.245152 1 840556 - -219 18 30 15 15 0 0 1 0 36
>>>
>>> I wonder what I have done wrong?
>>>
>>> Before mapping, I noticed that the fastq files seemed to be in Illumina
>>> format:
>>> @SRR002291.1 FC202P4AAXX_R1:1:1:881:635 length=36
>>> GAAAGAAGTTTTCTGTGAAATGGCTTTGTGATTTGT
>>> +SRR002291.1 FC202P4AAXX_R1:1:1:881:635 length=36
>>> IIIIIIIIIEIIIEI:9:3:IE947IG':.%@6E)8
>>> @SRR002291.2 FC202P4AAXX_R1:1:1:686:750 length=36
>>> GATCGTCTATTACCTTTAAATAACCTAGATCTCAAG
>>> +SRR002291.2 FC202P4AAXX_R1:1:1:686:750 length=36
>>> IIIIIIII<IICIIIII4:3I96197/,,;,;0+*,
>>>
>>> So I converted them using "fq_all2std.pl sol2std". According to the
>>> instruction these base qualities were only PF filtered but were not
>>> calibrated from alignment. Would base quality calibration affect the
>>> mapping quality calculations? How can I make it work?
>>>
>>> Many Thanks.
>>>
>>> -Ken
>>>
>>> _______________________________________________
>>> maq-help mailing list
>>> maq...@li...
>>> https://lists.sourceforge.net/lists/listinfo/maq-help
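
The offset question raised in the thread can be checked directly on a quality string. Below is a minimal Python sketch, not code from MAQ or fq_all2std.pl: the function and constant names are hypothetical, and it assumes that Solexa-scaled scores never go below -5, so an offset-64 file should contain no character below ';' (ASCII 59). It decodes the first quality line quoted in the thread under both offsets.

```python
# Sketch only: names below are hypothetical and not part of MAQ.
SANGER_OFFSET = 33     # standard (Sanger) FASTQ: score = ord(char) - 33
SOLEXA_OFFSET = 64     # Solexa/early Illumina FASTQ: score = ord(char) - 64
SOLEXA_MIN_SCORE = -5  # assumption: lowest score a Solexa-scaled file holds


def decode_quality(qual, offset):
    """Turn a FASTQ quality string into integer scores for a given offset."""
    return [ord(ch) - offset for ch in qual]


def could_be_offset64(qual):
    """True if every character is a legal score under the offset-64 reading."""
    return all(ord(ch) - SOLEXA_OFFSET >= SOLEXA_MIN_SCORE for ch in qual)


# First quality line quoted in the thread (read SRR002291.1).
qual = "IIIIIIIIIEIIIEI:9:3:IE947IG':.%@6E)8"

print(decode_quality("I", SANGER_OFFSET))  # [40]
print(decode_quality("I", SOLEXA_OFFSET))  # [9]
print(could_be_offset64(qual))             # False
```

Characters such as '%' and ')' would decode to scores far below -5 under offset 64, which is consistent with the reading that these files already use the offset-33 encoding. If so, a further sol2std conversion would not be needed.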