Re: [svtoolkit-help] Regarding sample test in Genome Strip
From: Mishra, P. <Pam...@bc...> - 2014-03-04 18:54:49
Hi Bob,

Thanks for replying. It's WGS data for sure and not exome. Also, the coverage is 6X.

There are 1576 calls and only 16 PASS calls for the 100 samples. For the two 50-sample subsets, this is the breakdown:

First 50 samples: 236 deletion calls with 7 PASS calls
Last 50 samples: 866 calls with 7 PASS calls

The issue is not the number of calls but the samples that it picks up (with discordant pairs). The GSSAMPLES field has only 1 sample name for the last 50 samples and 2 samples for the first 50 samples. These same 2 samples appear in the complete 100-sample run. For example:

20 62949798 DEL_3987 G <DEL> . COVERAGE;DEPTH CIEND=-42,42;CIPOS=-42,42;END=62959138;GSCOHERENCE=-0.2890065636733324;GSCOHFN=-0.2890065636733324;GSCOHPVALUE=0.7447;GSCOORDS=62949527,62949709,62959226,62959343;GSDEPTHCALLTHRESHOLD=47.76261138557784;GSDEPTHNOBSSAMPLES=1;GSDEPTHNTOTALSAMPLES=1;GSDEPTHOBSSAMPLES=A00548;GSDEPTHPVALUE=3.3E-5;GSDEPTHRANKSUMPVALUE=0.8873735;GSDEPTHRATIO=1.3081478463888137;GSDMAX=9616;GSDMIN=8617;GSDOPT=9608;GSDSPAN=9516;GSMEMBNPAIRS=0;GSMEMBNSAMPLES=0;GSMEMBOBSSAMPLES=NA;GSMEMBPVALUE=NA;GSMEMBSTATISTIC=NA;GSNDEPTHCALLS=43;GSNPAIRS=2;GSNSAMPLES=1;GSOBSINREADS=246;GSOBSOUTREADS=1505009;GSOUTLEFT=1;GSOUTLIERS=1;GSOUTRIGHT=0;GSREADGROUPS=1,1;GSREADNAMES=HWI-ST1324:20:C0WACACXX:8:2106:3126:57827,HWI-ST1324:20:C0WACACXX:8:2114:9563:13653;GSSAMPLES=A00548,A00548;GSUNOBSINREADS=14652;GSUNOBSOUTREADS=117262117;IMPRECISE;SVLEN=-9608;SVTYPE=DEL

This is not a PASS call site. But I see this more frequently. Here I see the same 2 samples appearing again. Is this normal? Shouldn't this be able to pick up discordant pairs for the same site from other samples? These samples are pretty closely related, from what I know. Is the coverage too low? Since Genome-strip was used for 1000 Genomes, which had a coverage of ~4X, I would presume it can work on 6X.

I will send you the VCF file once I get clearance to send this data.

Thanks,
Pamela

From: Bob Handsaker [mailto:han...@br...]
Sent: Tuesday, March 04, 2014 8:20 AM
To: svt...@li...
Subject: Re: [svtoolkit-help] Regarding sample test in Genome Strip

First, I want to check to make sure you are using WGS data, not exome capture. It would be quite unusual to see only 2 deletions on chr20 across 100 individuals.

Are you able to share your output VCFs (e.g. one from a 100-sample run, one from a 50-sample run) that have discordant results? You can email them directly to me.

Another thing to try is to genotype the sites (in all 100 samples is fine). If you can share those results and also run PlotGenotypingResults, I can take a look and see if I can help diagnose what's wrong.

Best,
-Bob

On 3/3/14 3:49 PM, Mishra, Pamela wrote:

Hi,

I am trying to find SVs for 100 samples on chromosome 20. I already ran preprocessing and discovery on these samples without any errors. The problem is that when I look at the final VCF file, I see only 2 samples being called as having SVs in the discovery phase. I checked the logs, and it looks like it is reading and computing stats for all 100 files.

When I split the 100 samples into 2 batches of 50 each, one batch gives me the same 2 samples that show up in the 100-sample list. The other batch gives me only 1 sample. This same sample does not show any SV when it is included in the 100 samples. I have run this kind of test several times, but I always find the same pattern. I am not sure why this is happening.

I understand there are a few filters listed here (http://sourceforge.net/p/svtoolkit/mailman/message/28046131/). But my expectation was that the samples that show discordant pairs should show up both in the list of 100 and when I run only 50.

Could you please share your suggestions and insight into this situation? Are there checks to be done? Any suggestion would be highly appreciated.

Thanks,
Pamela Mishra
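
For readers hitting the same question, the per-sample breakdown discussed in this thread can be checked with a short script instead of by eye. The following is only a sketch in Python, not part of SVToolkit: the function names `parse_info` and `tally_gssamples` are made up for illustration, and it assumes a tab-delimited discovery VCF whose INFO column carries a comma-separated GSSAMPLES entry with one name per supporting read pair, as the example record above appears to (GSNPAIRS=2, GSSAMPLES=A00548,A00548). Each sample is counted once per site.

```python
from collections import Counter


def parse_info(info):
    """Split a VCF INFO string into a dict; flag entries (no '=') map to True."""
    fields = {}
    for entry in info.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        fields[key] = value if sep else True
    return fields


def tally_gssamples(vcf_lines, pass_only=False):
    """Count, per sample, how many sites list that sample in GSSAMPLES.

    Assumption: GSSAMPLES may repeat a sample name (one per read pair),
    so names are de-duplicated within a site before counting.
    """
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8:
            continue  # not a complete VCF record
        if pass_only and cols[6] != "PASS":
            continue  # column 7 is FILTER
        samples = parse_info(cols[7]).get("GSSAMPLES")
        if not isinstance(samples, str) or samples in ("", "NA"):
            continue
        counts.update(set(samples.split(",")))
    return counts


# Abbreviated version of the record quoted in the thread (tabs restored).
example = (
    "20\t62949798\tDEL_3987\tG\t<DEL>\t.\tCOVERAGE;DEPTH\t"
    "END=62959138;GSNPAIRS=2;GSNSAMPLES=1;GSSAMPLES=A00548,A00548;"
    "IMPRECISE;SVTYPE=DEL"
)

all_sites = tally_gssamples([example])
pass_sites = tally_gssamples([example], pass_only=True)
print(all_sites)
print(pass_sites)
```

Running it over the 100-sample VCF and over each 50-sample VCF (for example by passing an open file handle as `vcf_lines`) and comparing the resulting counters would show directly which samples contribute discordant pairs in each run.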