Peak analysis without control

Help
2009-06-16
2012-09-20
  • Deniz Koellhofer

    Hi,
    I was looking at the recommended usage of USeq, which involves the ScanSeqs step and requires a control sample.
    Is there a way to assign confidence values to peaks purely based on statistical significance within the "treatment" PointData?

    I was just wondering if I can use USeq on an experiment designed without a control input sample.

    Thanks,
    Deniz

     
    • David Nix

      David Nix - 2009-06-16

      Bottom line, without a control, your results will be riddled with false positives.

      Many of the early chIP-seq detectors, including USeq, used a Poisson distribution with a fixed global lambda to estimate confidence in a local accumulation of reads in a chIP sample. This doesn't work because null data (IgG control, input DNA, etc.) contain hundreds of peaks of all sizes that are significantly enriched under that model. Where are these coming from? Likely from bias in DNA fragmentation, PCR amplification, alignment artifacts, and so on. See the USeq paper for a detailed exploration of this topic.
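
      To make the fixed-lambda idea concrete, here is a minimal Python sketch. It is not USeq's code; the function names, read counts, genome size, and window size are all assumptions chosen for illustration. It computes the Poisson upper-tail probability for a window's read count, first with a single global lambda and then with a higher local background, which is the situation a biased region would create.

```python
import math

def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), using the lower-tail sum."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)      # P(X = 0)
    cdf = term
    for i in range(1, k):      # accumulate P(X = 1) .. P(X = k - 1)
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)

def global_lambda(total_reads, genome_size, window_size):
    """Expected reads per window if reads were spread uniformly (hypothetical helper)."""
    return total_reads * window_size / genome_size

# Illustrative numbers only: 10 M reads, 3 Gb genome, 500 bp window.
lam = global_lambda(10_000_000, 3_000_000_000, 500)
observed = 12

# Under the global lambda the window looks highly significant ...
print("p (global lambda):", poisson_sf(observed, lam))
# ... but if the local background is really about 8 reads, it is unremarkable.
print("p (local background 8):", poisson_sf(observed, 8.0))
```

      The same window goes from a very small p-value to an unremarkable one once the background rate reflects local bias, which is why a control (or a second chIP sample) matters.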

      I would recommend downloading an input control that best matches your chIP sample. This won't be ideal but should provide a bit of protection against false positives.

      We're starting to feel that even matched input samples are a poor control. Better to perform a differential chIP-Seq experiment that directly compares two different chIP experiments, say treatment A vs B or time point 0 vs 10min.

      Note, you can use ScanSeqs to scan your chIP data. It will produce just a windowed hit count track. Then use the EnrichedRegionMaker to generate a spreadsheet of regions. I threw out all of the other stats.
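
      As a rough picture of what a windowed hit count track and a region list are, the following Python sketch counts read positions in sliding windows and merges adjacent windows that pass a count threshold. It is only an illustration under assumed parameters, not how ScanSeqs or EnrichedRegionMaker are implemented; the function names, window size, step, and threshold are hypothetical.

```python
import bisect

def window_counts(positions, window_size, step):
    """Count read positions in each half-open window [start, start + window_size)."""
    positions = sorted(positions)
    if not positions:
        return []
    windows = []
    start = 0
    while start <= positions[-1]:
        lo = bisect.bisect_left(positions, start)
        hi = bisect.bisect_left(positions, start + window_size)
        windows.append((start, start + window_size, hi - lo))
        start += step
    return windows

def merge_windows(windows, min_count, max_gap):
    """Merge windows with count >= min_count that lie within max_gap of each other."""
    regions = []
    for start, end, count in windows:
        if count < min_count:
            continue
        if regions and start - regions[-1][1] <= max_gap:
            prev = regions[-1]
            # Extend the previous region and keep its best window count.
            regions[-1] = (prev[0], max(prev[1], end), max(prev[2], count))
        else:
            regions.append((start, end, count))
    return regions

reads = [10, 12, 15, 18, 20, 500, 1000, 1005, 1010, 1012]
regions = merge_windows(window_counts(reads, 50, 25), min_count=3, max_gap=0)
# Rank regions by their best window count, highest first.
for region in sorted(regions, key=lambda r: r[2], reverse=True):
    print(region)
```

      With hit counts alone, regions can only be ranked, not assigned a confidence, so any cutoff (a count threshold or a top-N list) is arbitrary.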

      Good luck

       
      • Deniz Koellhofer

        Hi,
        I've been trying to generate a spreadsheet of ERs using the EnrichedRegionMaker tool like you mentioned in the last paragraph based on only the chIP data. I'm getting an error message that the control is required.
        The command I use to retrieve the top 100 regions is:
        java -jar useq-4.2/Apps/EnrichedRegionMaker -t ./PointData -f ScanSeqsResults/windowData52bp.swi -n 100

        Error Message: Please provide control data.

        Wonder what I'm doing wrong, maybe someone can help me out here.

        Thanks,
        Deniz

         
        • David Nix

          David Nix - 2009-07-29

          Omit the treatment data too. The EnrichedRegionMaker can, if provided, use both treatment and control data to rescan each enriched region for the highest sub window peak and rescore the entire enriched region for several statistics. It skips these steps if the data isn't provided. I don't have it configured to work with just treatment data.

           
