Yes, that is correct. The minimum read count is used to reduce the number of windows passed to DESeq. This doesn't hurt sensitivity, though, since a window with that few reads, even at 10:0 treat:cont, won't be significant after applying a multiple testing correction.
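To give a rough feel for why a 10:0 window cannot survive correction, here is a minimal sketch. It assumes a plain two-sided binomial test with p = 0.5 as a stand-in (DESeq itself uses a negative binomial model, so this is not what DESeq computes), a simple Bonferroni correction, and a hypothetical count of one million tested windows.

```python
from math import comb


def two_sided_binomial_p(treat, cont):
    """Exact two-sided binomial p-value for a treat:cont split, assuming p = 0.5."""
    n = treat + cont
    observed = comb(n, treat)
    # Sum the probability of every outcome that is no more likely than the observed one.
    tail = sum(comb(n, k) for k in range(n + 1) if comb(n, k) <= observed)
    return tail / 2 ** n


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1.0."""
    return min(1.0, p_value * n_tests)


# Hypothetical number of windows tested genome-wide (an assumption, not from ScanSeqs).
N_WINDOWS = 1_000_000

raw_p = two_sided_binomial_p(10, 0)       # 2 / 1024, roughly 0.00195
adjusted_p = bonferroni(raw_p, N_WINDOWS)  # capped at 1.0
print(raw_p, adjusted_p)
```

Even the most extreme split possible with 10 reads gives a raw p-value near 0.002, which is far too large to survive correction across a genome's worth of windows.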
Yes, you can take each dataset (ChIP or input) and run ScanSeqs in single-dataset mode: don't provide a control, just the treatment. Then run the EnrichedRegionMaker and threshold on the number of reads (e.g. "-s 10, -I 0" for a minimum of 10 reads in a window). The output contains an egr file (chr start stop …) listing all of the regions with 10 or more reads. Then sum the lengths of the regions and divide by your genome length * 0.8 (~80% of higher eukaryotic genomes can be mapped).
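The last step can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name covered_fraction is made up, the region file is taken to be whitespace-delimited with chr, start and stop as its first three columns, the regions are assumed not to overlap, and each region's length is taken as stop - start.

```python
def covered_fraction(region_lines, genome_length, mappable_fraction=0.8):
    """Fraction of the mappable genome covered by the listed regions.

    region_lines: iterable of text lines with chr, start, stop as the first
    three whitespace-delimited columns (assumed layout).
    """
    total_length = 0
    for line in region_lines:
        line = line.strip()
        # Skip blank lines and comment/header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        start, stop = int(fields[1]), int(fields[2])
        total_length += stop - start
    return total_length / (genome_length * mappable_fraction)


# Toy example: two regions totalling 400 bp on a 1000 bp "genome".
example = ["chr1\t0\t100", "chr1\t200\t500"]
print(covered_fraction(example, genome_length=1000))
```

For a real run you would pass the open region file in place of the toy list and your assembly's total length as genome_length.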
-cheers, D
From: Gareth Wilson <gar...@ca...>
Date: Tue, 18 Oct 2011 09:19:35 -0600
To: David Nix <dav...@hc...>
Subject: Proportion of genome analysed?
Hi David,
I'm using MultipleReplicaScanSeqs on MeDIP cohorts with -m = 10. As far as I gather, each window is tested to see if the total read count is greater than or equal to the threshold and, if so, is passed to DESeq to be tested for differential counts. Do you have a method for determining what proportion of the total genome is covered at the required threshold?
Many Thanks,
Gareth.
------
Dr Gareth A Wilson
Bioinformatician
Medical Genomics Group
UCL Cancer Institute
Paul O'Gorman Building
University College London
72 Huntley Street
London
WC1E 6BT
tel: +44 (0) 20 7679 0999
------