On 8/6/12 12:26 AM, Jaemin Kim wrote:
> Hi, I've been asking quite a few questions about GenomeSTRiP and
> answers have been very helpful.
>
> I have two additional questions:
>
> 1. When you actually count the number of deletion sites in your paper,
> did you count the ones with "PASS" quality only?
Yes. There is a set of default filters applied if you use the standard
SVDiscovery Q script.
These may or may not be optimal for different data sets, and I would
encourage you to look at the distributions of the various metrics.
The default filters weren't too bad when applied out-of-the-box to 1000G
Phase 1, although that is similar 4x coverage data.
In the larger data set, the default filters had an estimated false
discovery rate of about 8%.
We used two additional filters for Phase 1 to get the estimated FDR down
to about 3-4%, most importantly a filter that mostly removed rare events:
GSNPAIRS/GSNSAMPLES > 1.1
For any low-coverage sequencing of several hundred samples or more this
filter is likely important.
We estimated the FDR of sites where there were two supporting read pairs
in two different samples at around 25%.
Of lesser importance, but still helpful, we also removed any sites that
were > 90% alpha satellite repeat (as called by RepeatMasker).
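
As a rough illustration of the first two steps (PASS only, then the
GSNPAIRS/GSNSAMPLES cutoff), here is a minimal Python sketch. It assumes
GSNPAIRS and GSNSAMPLES are numeric tags in the INFO column and that the
sites to keep are the ones with a ratio above 1.1, so that a site with
two read pairs in two samples (ratio 1.0) is dropped. The function names
and the handling of sites that lack the tags are my own choices for the
example, not part of Genome STRiP.

```python
def parse_info(info_field):
    """Parse a VCF INFO column into a dict; flag-style keys map to True."""
    info = {}
    for item in info_field.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        info[key] = value if sep else True
    return info


def keep_site(vcf_line, min_pairs_per_sample=1.1):
    """Return True for PASS sites with GSNPAIRS/GSNSAMPLES above the cutoff.

    Assumption: sites missing either tag (or with zero samples) are dropped.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    filter_col = fields[6]          # VCF column 7: FILTER
    info = parse_info(fields[7])    # VCF column 8: INFO
    if filter_col != "PASS":
        return False
    n_pairs = float(info.get("GSNPAIRS", 0))
    n_samples = float(info.get("GSNSAMPLES", 0))
    if n_samples == 0:
        return False
    return n_pairs / n_samples > min_pairs_per_sample


def filter_vcf_lines(lines):
    """Yield header lines unchanged and only the data lines that pass."""
    for line in lines:
        if line.startswith("#") or keep_site(line):
            yield line
```

The alpha satellite filter is not covered here, since it needs the
RepeatMasker annotation as a separate input.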
> 2. How did you calculate the deletion length? Did you use the GSCOORDS
> information (and take the subtraction between biggest and smallest
> coordinates)?
No. I use either INFO:END - POS or INFO:SVLEN. For Genome STRiP I think
these are always the same.
These are supposed to be the "most likely" start/end/length of the deletion.
On top of these, you can use INFO:CIPOS and INFO:CIEND which are
approximately 95% confidence intervals on POS and END respectively.
GSCOORDS is somewhat different - it is the inner/outer extent of the
aberrantly spaced read pairs and is mostly used for internal bookkeeping.
The outer coordinates will likely over-estimate the true position and
the inner coordinates will similarly under-estimate.
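
To make the length calculation concrete, here is a small Python sketch.
It assumes the usual VCF conventions: END and SVLEN are INFO tags, SVLEN
may be negative for a deletion (hence the abs), and CIPOS/CIEND are
pairs of offsets relative to POS and END. The function names are only
for illustration.

```python
def parse_info(info_field):
    """Parse a VCF INFO column into a dict; flag-style keys map to True."""
    info = {}
    for item in info_field.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        info[key] = value if sep else True
    return info


def deletion_length(pos, info):
    """Length as END - POS, falling back to |SVLEN| if END is absent."""
    if "END" in info:
        return int(info["END"]) - pos
    return abs(int(info["SVLEN"]))


def breakpoint_intervals(pos, info):
    """Return ((pos_lo, pos_hi), (end_lo, end_hi)) from CIPOS and CIEND.

    Assumption: a missing CIPOS or CIEND is treated as "0,0".
    """
    end = int(info["END"])
    ci_pos = [int(x) for x in info.get("CIPOS", "0,0").split(",")]
    ci_end = [int(x) for x in info.get("CIEND", "0,0").split(",")]
    return (
        (pos + ci_pos[0], pos + ci_pos[1]),
        (end + ci_end[0], end + ci_end[1]),
    )
```

For a record with POS 1000 and INFO "END=2000;CIPOS=-10,15;CIEND=-20,5",
this gives a length of 1000 and intervals of 990-1015 and 1980-2005.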
-Bob
>
>
> Thanks for your kindness.
>
>
> Regards,
>
> Jaemin Kim