From: Noboru Jo S. <ns...@uc...> - 2011-01-10 21:47:22
Hi David,

thanks for checking my data. It seems the difference is because I filtered by EmpFDR, not Qvalue. I have now rerun USeq: when I filter with -i 2,4 -s 5,1 I get tens of thousands of peaks, but with -i 1,4 -s 20,1 I get ten peaks. Can you confirm this by filtering windows by EmpFDR?

Overall, the peaks make some biological sense (GO, conservation), so it's not as if the peaks found are noise. The IP is not good, but there is some signal there, so I tend to trust the peaks that USeq is giving me. But, please, I would appreciate it if you could comment on this!

If you don't mind sharing, in your experience, how unusual is something like this? I mean a sample that has some signal, but because it's weak, one needs to give it a good shake to get something. Are the samples you analyze consistently better than this? I'm not the wet lab person, so I won't be offended by your criticism ;-)

Thanks again.

noboru

David Nix wrote:
> Hello Noboru,
>
> I'm not seeing the increase in the number of regions when you subsample the
> input control to match the ChIP sample.
>
> I see 283 regions with the full control and 18 with the matched control when
> thresholding using a qvalue of 20 (0.01) and a log2Ratio of 1 (2x).
>
> Here's what I did:
>
> 1) Run Tag2Point to convert your bed datasets to binary PointData
> 2) Run the PointDataManipulator to filter out duplicate reads. Both datasets
> look good with 94% unique
> 3) Run ScanSeqs to window scan your data
> 4) Run EnrichedRegionMaker to collapse overlapping windows that exceed the
> above thresholds into a list of putative peaks.
>
> For the reduced control dataset, I used the SubSamplePointData to randomly
> toss duplicate-filtered input PointData to 3398890 and then ran ScanSeqs and
> the EnrichedRegionMaker.
>
> I wonder where the discrepancy occurred? I've attached the two spreadsheet
> results from the EnrichedRegionMaker.
>
> -cheers, D
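
For anyone following the thresholds in this thread, here is a rough Python sketch of the arithmetic behind them. It assumes the scores are on a -10*log10 scale, which is what the pairing of 20 with 0.01 in the quoted message suggests, and that a log2Ratio of 1 means a 2x fold change. The function names are made up for illustration and are not part of USeq; the subsample function only shows the general idea of randomly reducing the control to a target read count, not how SubSamplePointData actually does it.

```python
import math
import random


def scaled_to_fraction(score):
    """Convert a -10*log10 scaled score back to a fraction (assumed scaling)."""
    return 10 ** (-score / 10.0)


def fraction_to_scaled(fraction):
    """Convert a fraction such as 0.01 to its -10*log10 scaled score."""
    return -10.0 * math.log10(fraction)


def log2_ratio_to_fold(log2_ratio):
    """Convert a log2 ratio to a plain fold change."""
    return 2.0 ** log2_ratio


def subsample(reads, target, seed=0):
    """Randomly keep `target` reads without replacement (illustrative only)."""
    reads = list(reads)
    if target >= len(reads):
        return reads
    return random.Random(seed).sample(reads, target)


if __name__ == "__main__":
    print(scaled_to_fraction(20))   # threshold of 20 as a fraction
    print(scaled_to_fraction(5))    # the looser threshold of 5 as a fraction
    print(log2_ratio_to_fold(1))    # log2Ratio of 1 as a fold change
    control = range(1000)           # stand-in for control read positions
    print(len(subsample(control, 100)))
```

Under the same assumed scaling, a threshold of 5 corresponds to a fraction of roughly 0.32, far more permissive than the 0.01 implied by 20, which would go some way toward explaining a jump from ten peaks to tens of thousands.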