Installation
===============
Prerequisites:
* gcc <http://gcc.gnu.org/>
* Python 2.7 <http://www.python.org/getit/releases/2.7/>
* NumPy <http://pypi.python.org/pypi/numpy>
* Mac OS X users also need to download and install Xcode <https://developer.apple.com/xcode/>.
* Mac OS X users also need to install the GNU Scientific Library (GSL) <http://gnu.askapache.com/gsl/>.
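Before installing, it can help to confirm that the command-line tools used in this document are on your PATH. The following is a minimal sketch, not part of MACE: it is written for Python 3 (``shutil.which`` is not available in Python 2.7), the helper name ``missing_commands`` and the ``REQUIRED`` list are hypothetical, and the executable name ``python2.7`` is an assumption that may differ on your system. It does not check NumPy or GSL.

```python
import shutil


def missing_commands(commands):
    """Return the entries of `commands` that cannot be found on PATH."""
    return [cmd for cmd in commands if shutil.which(cmd) is None]


# Hypothetical checklist: tools this document calls on the command line.
# The executable names are assumptions and may differ on your system.
REQUIRED = ["gcc", "python2.7", "bowtie", "samtools", "wigToBigWig"]

missing = missing_commands(REQUIRED)
print("Missing from PATH:", ", ".join(missing) if missing else "none")
```

Only gcc and Python are needed for the installation itself; the other tools are used later in the walkthrough.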
Procedure to install MACE (Linux and Mac OS X)::
tar zxf MACE-VERSION.tar.gz
cd MACE-VERSION
python setup.py install  # installs MACE system-wide
python setup.py install --root=/home/user/MACE  # alternatively, installs MACE under a user-specified location
export PYTHONPATH=/home/user/MACE/usr/local/lib/python2.7/site-packages:$PYTHONPATH
export PATH=/home/user/MACE/usr/local/bin:$PATH
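The two ``export`` lines above follow from the ``--root`` directory. As a rough sketch, the helper below (hypothetical, not part of MACE) derives both paths from the root. It assumes the ``usr/local`` prefix and ``python2.7`` directory shown above, which depend on how your Python was built, so check the actual layout under your root before relying on it.

```python
import posixpath


def install_paths(root, python_version="2.7", prefix="usr/local"):
    """Return (site-packages dir, bin dir) for an install made with --root=root.

    The default prefix and Python version mirror the example paths in this
    document; they are assumptions about the local Python installation.
    """
    lib_dir = posixpath.join(root, prefix, "lib",
                             "python" + python_version, "site-packages")
    bin_dir = posixpath.join(root, prefix, "bin")
    return lib_dir, bin_dir


lib_dir, bin_dir = install_paths("/home/user/MACE")
print("export PYTHONPATH=%s:$PYTHONPATH" % lib_dir)
print("export PATH=%s:$PATH" % bin_dir)
```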
Walkthrough example using Pugh's CTCF ChIP-exo data
====================================================
Step1: Download raw sequence data
-----------------------------------
Download the CTCF ChIP-exo data (accession number SRA044886) published in Cell in 2011::
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR346/SRR346401/SRR346401.fastq.gz
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR346/SRR346402/SRR346402.fastq.gz
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR346/SRR346403/SRR346403.fastq.gz
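The three download URLs share one pattern. The sketch below rebuilds them from the run accessions. The pattern is inferred only from the URLs above and may not hold for other accessions, and ``ena_fastq_url`` is a hypothetical helper name.

```python
def ena_fastq_url(run_accession):
    """Build a FASTQ URL following the pattern of the three URLs above.

    Assumption: the directory is the first six characters of the accession,
    as in the example URLs; other accessions may be laid out differently.
    """
    directory = run_accession[:6]
    return "ftp://ftp.sra.ebi.ac.uk/vol1/fastq/{0}/{1}/{1}.fastq.gz".format(
        directory, run_accession)


urls = [ena_fastq_url(acc) for acc in ("SRR346401", "SRR346402", "SRR346403")]
for url in urls:
    print("wget " + url)
```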
Step2: Align reads to reference genome
--------------------------------------
In this example, we use Bowtie to map reads to the human reference genome (GRCh37/hg19).
The color space human genome index files were downloaded from
ftp://ftp.cbcb.umd.edu/pub/data/bowtie_indexes/. Any aligner is fine, as long as it
can generate alignments in SAM or BAM format::
bowtie -S -C -q -m 1 PATH/bowtie/indexes/colorspace/hg19c SRR346401.fastq CTCF_replicate1.sam
bowtie -S -C -q -m 1 PATH/bowtie/indexes/colorspace/hg19c SRR346402.fastq CTCF_replicate2.sam
bowtie -S -C -q -m 1 PATH/bowtie/indexes/colorspace/hg19c SRR346403.fastq CTCF_replicate3.sam
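The Bowtie commands above read uncompressed ``.fastq`` files, while Step1 downloads ``.fastq.gz`` files, so the files need to be decompressed in between (``gunzip`` on the command line does this). As an alternative sketch using only the Python standard library, with ``gunzip`` here being a hypothetical helper name:

```python
import gzip
import shutil


def gunzip(src, dst=None):
    """Decompress the gzip file `src` to `dst` and return the output path.

    If `dst` is omitted, the output name is `src` without its .gz suffix.
    The compressed input file is left in place.
    """
    if dst is None:
        if not src.endswith(".gz"):
            raise ValueError("expected a .gz file: %s" % src)
        dst = src[:-3]
    # Stream the data so that large FASTQ files are never held in memory.
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

For example, ``gunzip("SRR346401.fastq.gz")`` would write ``SRR346401.fastq`` next to the downloaded file.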
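Convert SAM into BAM using samtools, then sort and index the BAM file. The sort command
below uses the samtools 0.1.x syntax; samtools 1.3 and later expect
samtools sort -o CTCF_replicate1.sorted.bam CTCF_replicate1.bam instead. If your aligner
already produced a sorted BAM file, only the index step is needed::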
samtools view -bS CTCF_replicate1.sam > CTCF_replicate1.bam
samtools sort CTCF_replicate1.bam CTCF_replicate1.sorted
samtools index CTCF_replicate1.sorted.bam
samtools view -bS CTCF_replicate2.sam > CTCF_replicate2.bam
samtools sort CTCF_replicate2.bam CTCF_replicate2.sorted
samtools index CTCF_replicate2.sorted.bam
samtools view -bS CTCF_replicate3.sam > CTCF_replicate3.bam
samtools sort CTCF_replicate3.bam CTCF_replicate3.sorted
samtools index CTCF_replicate3.sorted.bam
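The nine commands above are the same three steps repeated per replicate. A small sketch that generates them from the sample name is shown below. It only prints the commands and does not run them, and ``samtools_commands`` is a hypothetical helper name.

```python
def samtools_commands(sample):
    """Return the convert, sort and index commands for one sample name.

    The command strings copy the (samtools 0.1.x style) commands listed above.
    """
    return [
        "samtools view -bS {0}.sam > {0}.bam".format(sample),
        "samtools sort {0}.bam {0}.sorted".format(sample),
        "samtools index {0}.sorted.bam".format(sample),
    ]


for replicate in (1, 2, 3):
    for command in samtools_commands("CTCF_replicate%d" % replicate):
        print(command)
```

The printed lines can be pasted into a shell or saved as a script.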
Alternatively, download our sorted and indexed BAM files and skip Step1 and Step2::
wget http://dldcc-web.brc.bcm.edu/lilab/MACE/bam/CTCF_replicate1.sorted.bam
wget http://dldcc-web.brc.bcm.edu/lilab/MACE/bam/CTCF_replicate1.sorted.bam.bai
wget http://dldcc-web.brc.bcm.edu/lilab/MACE/bam/CTCF_replicate2.sorted.bam
wget http://dldcc-web.brc.bcm.edu/lilab/MACE/bam/CTCF_replicate2.sorted.bam.bai
wget http://dldcc-web.brc.bcm.edu/lilab/MACE/bam/CTCF_replicate3.sorted.bam
wget http://dldcc-web.brc.bcm.edu/lilab/MACE/bam/CTCF_replicate3.sorted.bam.bai
Step3: Preprocessing
--------------------
Preprocessing includes sequencing depth normalization, nucleotide composition bias
correction, signal consolidation and noise reduction.
Replicate BAM files are separated by ','. Reads mapped to the forward and reverse strands
define the two boundaries of a binding region, so two wiggle files representing the coverage
signal of the 5' ends of reads are produced. The wiggle files are converted into bigWig format
if 'wigToBigWig' can be found in your system $PATH::
preprocessor.py -i CTCF_replicate1.sorted.bam,CTCF_replicate2.sorted.bam,CTCF_replicate3.sorted.bam -r hg19.chrom.sizes -o CTCF_MACE
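The command above takes a chromosome sizes file (``hg19.chrom.sizes``), which this document does not say how to obtain. One possible way, sketched below under that caveat, is to read the ``@SQ`` header lines of a SAM file from Step2, since each carries a sequence name (``SN``) and length (``LN``). Both function names are hypothetical, and the output format assumed is the usual two-column, tab-separated "name, size" layout.

```python
def chrom_sizes_from_sam_header(lines):
    """Collect (name, length) pairs from the @SQ lines of a SAM header."""
    sizes = []
    for line in lines:
        if not line.startswith("@"):
            break  # header lines come first; stop at the first alignment
        if line.startswith("@SQ"):
            tags = dict(field.split(":", 1)
                        for field in line.rstrip("\n").split("\t")[1:])
            sizes.append((tags["SN"], int(tags["LN"])))
    return sizes


def write_chrom_sizes(sizes, path):
    """Write one 'name<TAB>length' line per sequence."""
    with open(path, "w") as out:
        for name, length in sizes:
            out.write("%s\t%d\n" % (name, length))


# Illustrative header; in practice pass an open SAM file instead of a list.
example_header = [
    "@HD\tVN:1.0\tSO:unsorted\n",
    "@SQ\tSN:chr1\tLN:249250621\n",
    "@SQ\tSN:chr2\tLN:243199373\n",
]
print(chrom_sizes_from_sam_header(example_header))
```

With a real file this could be used as ``write_chrom_sizes(chrom_sizes_from_sam_header(open("CTCF_replicate1.sam")), "hg19.chrom.sizes")``.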
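To convert wiggle into bigWig format manually, download the wigToBigWig program from UCSC
(<http://hgdownload.cse.ucsc.edu/admin/exe/>) and run it on each wiggle file, substituting
the file names that preprocessor.py produced for your output prefix::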
wigToBigWig CTCF_hg19_Forward.wig hg19.chrom.sizes CTCF_hg19_Forward_snr.bw
wigToBigWig CTCF_hg19_Reverse.wig hg19.chrom.sizes CTCF_hg19_Reverse_snr.bw
You can download our bigwig files directly::
wget http://dldcc-web.brc.bcm.edu/lilab/MACE/bigfile/CTCF_MACE_Forward.bw
wget http://dldcc-web.brc.bcm.edu/lilab/MACE/bigfile/CTCF_MACE_Reverse.bw
Step4: Border detection and border pairing
-------------------------------------------
Run mace.py to perform ChIP-exo peak calling and pairing from the two bigWig files::
mace.py -s hg19.chrom.sizes -f CTCF_MACE_Forward.bw -r CTCF_MACE_Reverse.bw -o CTCF_MACE
Step5: Visualize results using the UCSC Genome Browser
----------------------------------------------------
Note that the assembly version is hg19, and that peak and peak_pair files must first be
converted into BigBed <http://genome.ucsc.edu/FAQ/FAQformat.html#format1.5>.
Copy the following lines and paste them into UCSC <http://genome.ucsc.edu/cgi-bin/hgCustom>::
#MACE border pair and predicted motif
track type=bigBed name="CTCF MACE Border Pairs" visibility=2 color=102,0,0 windowingFunction=maximum db=hg19 bigDataUrl=http://dldcc-web.brc.bcm.edu/lilab/MACE/bigfile/CTCF_MACE_borderPair.bb
track type=bigBed name="CTCF FIMO Motif" visibility=2 color=255,0,0 windowingFunction=maximum db=hg19 bigDataUrl=http://dldcc-web.brc.bcm.edu/lilab/MACE/bigfile/CTCF_FIMO_motif.bb
#MACE consolidated signal
track type=bigWig name="CTCF MACE Forward Signal" visibility=2 color=0,0,153 windowingFunction=maximum db=hg19 bigDataUrl=http://dldcc-web.brc.bcm.edu/lilab/MACE/bigfile/CTCF_MACE_Forward.bw
track type=bigWig name="CTCF MACE Reverse Signal" visibility=2 color=153,0,0 windowingFunction=maximum db=hg19 bigDataUrl=http://dldcc-web.brc.bcm.edu/lilab/MACE/bigfile/CTCF_MACE_Reverse.bw
#Raw signals
track type=bigWig name="CTCF Raw Signal (rep1 Forward) " visibility=2 color=0,0,153 windowingFunction=maximum db=hg19 bigDataUrl=http://dldcc-web.brc.bcm.edu/lilab/MACE/bigfile/CTCF_raw_rep1_Forward.bw
track type=bigWig name="CTCF Raw Signal (rep1 Reverse) " visibility=2 color=153,0,0 windowingFunction=maximum db=hg19 bigDataUrl=http://dldcc-web.brc.bcm.edu/lilab/MACE/bigfile/CTCF_raw_rep1_Reverse.bw
track type=bigWig name="CTCF Raw Signal (rep2 Forward) " visibility=2 color=0,0,153 windowingFunction=maximum db=hg19 bigDataUrl=http://dldcc-web.brc.bcm.edu/lilab/MACE/bigfile/CTCF_raw_rep2_Forward.bw
track type=bigWig name="CTCF Raw Signal (rep2 Reverse) " visibility=2 color=153,0,0 windowingFunction=maximum db=hg19 bigDataUrl=http://dldcc-web.brc.bcm.edu/lilab/MACE/bigfile/CTCF_raw_rep2_Reverse.bw
track type=bigWig name="CTCF Raw Signal (rep3 Forward) " visibility=2 color=0,0,153 windowingFunction=maximum db=hg19 bigDataUrl=http://dldcc-web.brc.bcm.edu/lilab/MACE/bigfile/CTCF_raw_rep3_Forward.bw
track type=bigWig name="CTCF Raw Signal (rep3 Reverse) " visibility=2 color=153,0,0 windowingFunction=maximum db=hg19 bigDataUrl=http://dldcc-web.brc.bcm.edu/lilab/MACE/bigfile/CTCF_raw_rep3_Reverse.bw
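The track lines above differ only in type, name, color and file name. If you host your own bigWig or bigBed files, a sketch like the following can generate equivalent lines. ``track_line`` is a hypothetical helper, and ``BASE_URL`` is the download location used in this document; replace it with wherever your files are served from.

```python
BASE_URL = "http://dldcc-web.brc.bcm.edu/lilab/MACE/bigfile/"


def track_line(track_type, name, color, file_name, db="hg19", visibility=2):
    """Format one custom-track line in the same layout as the lines above."""
    return ('track type={0} name="{1}" visibility={2} color={3} '
            'windowingFunction=maximum db={4} bigDataUrl={5}{6}').format(
                track_type, name, visibility, color, db, BASE_URL, file_name)


# Forward tracks are drawn in blue and reverse tracks in red, as above.
STRAND_COLORS = {"Forward": "0,0,153", "Reverse": "153,0,0"}

for replicate in (1, 2, 3):
    for strand in ("Forward", "Reverse"):
        print(track_line(
            "bigWig",
            "CTCF Raw Signal (rep%d %s)" % (replicate, strand),
            STRAND_COLORS[strand],
            "CTCF_raw_rep%d_%s.bw" % (replicate, strand)))
```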
Online documentation
====================
http://chipexo.sourceforge.net/
Contact
========================
wangliguo78@gmail.com
Updated on: July 31, 2014