dspchip Wiki

Brought to you by: daweonline

Examples

Authors: Anonymous

Typical (or not) usage scenarios

dspchip was written to perform common DSP operations on quantitative genomic features. Such data include, but are not limited to, ChIP-seq (or ChIP-chip) data. When we started writing dspchip, we were dealing with histone modifications, which spread over kilobases and, most importantly, are poorly detected by available peak-finding software (we really like FindPeaks4, MACS and SICER).

Example 1: Calculate the log2ratio between two datasets

$ dspchip -i chip.bam -c mock.bam --pl=NL -s 5000

This calculates the log2ratio between the tags in chip.bam and mock.bam. The results are written to a bedgraph file, with values averaged over 5 kbp windows.
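To make the arithmetic concrete, here is a minimal Python sketch of a windowed log2ratio. It is not dspchip's code: the function names, the pseudocount of 1 (used only to avoid taking the log of zero) and the toy tag counts are all assumptions made for illustration, and the normalization step is left out.

```python
import math

def log2_ratio(chip, control, pseudocount=1.0):
    """Per-position log2(chip / control).

    The pseudocount is an assumption of this sketch (it avoids log(0));
    dspchip may handle empty positions differently.
    """
    return [math.log2((c + pseudocount) / (m + pseudocount))
            for c, m in zip(chip, control)]

def window_mean(values, step):
    """Average consecutive, non-overlapping windows of `step` positions."""
    out = []
    for start in range(0, len(values), step):
        chunk = values[start:start + step]
        out.append(sum(chunk) / len(chunk))
    return out

# Toy per-position tag counts (made up for illustration).
chip = [3, 7, 15, 31, 1, 0, 1, 3]
mock = [1, 1, 3, 7, 1, 0, 3, 7]

ratios = log2_ratio(chip, mock)
binned = window_mean(ratios, step=4)   # one value per 4-position window
print(binned)
```

With real data the window would span 5000 bp rather than 4 positions; the small step only keeps the toy input readable.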

Example 2: ChIP-seq analysis

$ dspchip -i chip.bam -c input.bam --pl=NLFTZ -e 50000 -s 5000 -p -l mad --wf=max --fir=hanning -n enriched

This calculates the log2ratio between chip.bam and input.bam (L) after energy normalization (N). Values are smoothed with a Hanning low-pass filter, assuming an expected signal size of 50 kbp (F). Negative values are removed (Z) after thresholding (T) based on a MAD estimate. The data are then processed to find peak boundaries. A BigWig file is written with a step size of 5 kbp using a "max" windowing function. Output file names are prefixed with "enriched".
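The filtering, thresholding and windowing steps can be sketched in plain Python as follows. This is an illustration of the general techniques, not dspchip's implementation: the function names, the cutoff rule (median + 3 x MAD, with values below the cutoff set to zero), the zero-padded edges and the filter width in bins are all assumptions, and the sketch starts from an already computed log2ratio track.

```python
import math
import statistics

def hann_lowpass(x, width):
    """F: smooth by convolving with a normalized Hann window.

    Positions outside the track are treated as zero (an assumption).
    """
    taps = [0.5 - 0.5 * math.cos(2 * math.pi * (k + 1) / (width + 1))
            for k in range(width)]
    total = sum(taps)
    taps = [t / total for t in taps]      # unit gain at DC
    half = width // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + k - half
            if 0 <= j < len(x):
                acc += t * x[j]
        out.append(acc)
    return out

def mad_threshold(x, k=3.0):
    """T: zero every value below median + k * MAD (hypothetical rule)."""
    med = statistics.median(x)
    mad = statistics.median([abs(v - med) for v in x])
    cutoff = med + k * mad
    return [v if v >= cutoff else 0.0 for v in x]

def drop_negatives(x):
    """Z: replace negative values with zero."""
    return [max(v, 0.0) for v in x]

def window_max(x, step):
    """Downsample by keeping the maximum of each window of `step` bins."""
    return [max(x[i:i + step]) for i in range(0, len(x), step)]

# Toy log2ratio track: noise around zero plus one broad enriched domain.
track = [0.1, -0.2, 0.0, 0.2, -0.1, 2.0, 2.5, 3.0,
         2.5, 2.0, 0.1, -0.1, 0.0, 0.2, -0.2, 0.0]

smoothed = hann_lowpass(track, width=5)
kept = drop_negatives(mad_threshold(smoothed))
binned = window_max(kept, step=4)
print(binned)
```

Only the broad domain survives the threshold here, and the "max" windowing keeps its highest smoothed value in each output bin. A window that averaged instead would dilute bins that only partly overlap the domain.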

Example 3: Correlate features

$ dspchip -i satellite.bed --pl=FZ -e 50000 -s 1000 -n satellite --fa=bed --csize=genome.tab
$ dspchip -i enriched.bigwig -c satellite.bigwig -e 50000 -C --pl='' --fa=bw

The first command reads a bed file containing satellite repeats and smooths it using FFT (assuming a window of 50 kbp). The second command produces per-chromosome correlation plots between the ChIP-seq analysis and the satellite repeats.
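As a rough picture of what a chromosome-based correlation involves, the sketch below computes a Pearson coefficient per chromosome for two binned tracks. The choice of Pearson correlation, the function name and the toy values are assumptions made here; the page does not say which statistic dspchip plots.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length tracks (NaN if one is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)

# Toy binned tracks keyed by chromosome (made-up values).
enriched = {"chr1": [0.0, 1.0, 2.0, 3.0], "chr2": [3.0, 2.0, 1.0, 0.0]}
satellite = {"chr1": [0.0, 2.0, 4.0, 6.0], "chr2": [0.0, 1.0, 2.0, 3.0]}

# One coefficient per chromosome present in both tracks.
per_chrom = {chrom: pearson(enriched[chrom], satellite[chrom])
             for chrom in enriched if chrom in satellite}
print(per_chrom)
```

Both tracks must use the same bin size for the positions to line up.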