Hi,
I was looking at the recommended usage of USeq, which involves the ScanSeqs step and requires a control sample.
Is there a way to assign confidence values to peaks purely based on statistical significance within the "treatment" PointData?
I was just wondering if I can use USeq on an experiment designed without a control input sample.
Thanks,
Deniz
Bottom line, without a control, your results will be riddled with false positives.
Many of the early ChIP-seq detectors, including USeq, used a Poisson distribution with a fixed global lambda to estimate confidence in a local accumulation of reads in a ChIP sample. This doesn't work because there are hundreds of peaks of all sizes that are significantly enriched in null data (IgG control, input DNA, ...). Where are these coming from? Likely from bias in DNA fragmentation, PCR amplification, alignment artifacts, and so on. See the USeq paper for a detailed exploration of this topic.
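To make the fixed-lambda problem concrete, here is a minimal Python sketch. It is not USeq code: the function names, read depth, window size, and genome size are all made up for illustration. It scores a window by the Poisson upper tail using one genome-wide expected count, so a pile-up of the same size gets the same tiny p-value whether it sits in a ChIP sample or in input DNA.

```python
import math

def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), with lam > 0, summed term by term in log space."""
    if k <= 0:
        return 1.0
    total = 0.0
    i = k
    while True:
        term = math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))
        total += term
        # Past the mode the terms only shrink, so stop once they no longer matter.
        if i > lam and term <= 1e-18 * total:
            break
        i += 1
    return min(total, 1.0)

def global_lambda(total_reads, window_bp, genome_bp):
    """Expected reads per window if reads were spread uniformly over the genome."""
    return total_reads * window_bp / genome_bp

# Hypothetical numbers, chosen only for illustration.
lam = global_lambda(total_reads=10_000_000, window_bp=200, genome_bp=3_000_000_000)

# A window holding 10 reads looks wildly significant under the global lambda,
# and it would look exactly as significant if it came from an input-only sample.
p_value = poisson_upper_tail(10, lam)
print(f"lambda = {lam:.3f}, P(X >= 10) = {p_value:.2e}")
```

The model has no term for local bias, which is why a control (or a second ChIP sample) is needed to tell real enrichment from artifact.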
I would recommend downloading an input control that best matches your ChIP sample. This won't be ideal but should provide a bit of protection against false positives.
We're starting to feel that even matched input samples are a poor control. It is better to perform a differential ChIP-seq experiment that directly compares two different ChIP experiments, say treatment A vs. B or time point 0 vs. 10 min.
Note, you can use ScanSeqs to scan your ChIP data. It will produce just a windowed hit count track. Then use the EnrichedRegionMaker to generate a spreadsheet of regions. I threw out all of the other stats.
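For a rough idea of what a windowed hit count track contains, here is a small Python sketch. It is a generic illustration, not the ScanSeqs algorithm: the fixed non-overlapping windows, the 50 bp window size, and the function names are assumptions made for this example.

```python
from collections import Counter

def windowed_hit_counts(positions, window_bp):
    """Count read positions per non-overlapping window, keyed by window start."""
    counts = Counter((pos // window_bp) * window_bp for pos in positions)
    return dict(sorted(counts.items()))

def top_windows(counts, n):
    """Return the n windows with the most hits as (window_start, count) pairs."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

# Hypothetical read start positions on one chromosome.
positions = [5, 10, 48, 60, 61, 62, 300]
counts = windowed_hit_counts(positions, window_bp=50)
best = top_windows(counts, n=2)
print(counts)
print(best)
```

With only hit counts available, ranking windows by count is about all that can be done, which is why no confidence values come out of a ChIP-only run.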
Good luck
Hi,
I've been trying to generate a spreadsheet of ERs with the EnrichedRegionMaker tool, as you mentioned in the last paragraph, based on only the ChIP data. I'm getting an error message saying that the control is required.
The command I use to retrieve the top 100 regions is:
java -jar useq-4.2/Apps/EnrichedRegionMaker -t ./PointData -f ScanSeqsResults/windowData52bp.swi -n 100
Error Message: Please provide control data.
I wonder what I'm doing wrong. Maybe someone can help me out here.
Thanks,
Deniz
Omit the treatment data too. If both treatment and control data are provided, the EnrichedRegionMaker uses them to rescan each enriched region for the highest sub-window peak and to rescore the entire enriched region for several statistics. It skips these steps if the data aren't provided. I don't have it configured to work with just treatment data.