The ChIP-Seq Project
==============================================================================================
The ChIP-seq software provides methods for the analysis of ChIP-seq data and
other types of mass genome annotation data.
Giovanna Ambrosini EPFL SV/ISREC GR-BUCHER
Rel: 1.5.5 - 15.02.2019
- Modify Makefile to better handle system-wide code and data installation
- Move example data to FTP-Site (ftp://ccg.epfl.ch/chip-seq/data)
Rel: 1.5.4 - 03.12.2018
- Bug fix in the sga2bed program
- Add chr_NC_gi file for chromosome name to NCBI identifier conversion
Rel: 1.5.4 - 02.10.2018
- Replace format conversion Perl scripts sga2bed.pl sga2wig.pl bed2sga.pl with
corresponding C programs (sga2bed, sga2wig, bed2sga)
- Add format conversion program bed2bed_display to convert BED to BED track
suitable for visualizing signal peak regions
- chippeak: add check on chromosome boundaries after peak refinement
- chipcenter: add check on chromosome boundaries after read shifting
- Update User's Guide
Rel: 1.5.3 - 07.02.2017
Minor modifications:
- sga2bed.pl: add read from stdin
- sga2wigVS.pl: minor modifications to deal with equal genomic coordinates
- chipcor.c, chippeak.c, chippart.c, chipcenter.c: change default cut-off value to 1
Rel: 1.5.3 - 11.10.2016
- README file in tool directory has been modified
- bed2sga.pl: add (-narrowPeak) option to properly convert ENCODE narrowPeak files
Rel: 1.5.2 - 21.04.2016
- ChIP-Seq tools print software version 1.5.2 now
- Compilation warnings have been removed
- User's Guide has been updated
- README file in tool directory has been updated
- sga2wigFS.pl, sga2wigVS.pl: bug fix on chrom start position
- partit2bed.pl: bug fix on chrom size estimation
- sga2bed.pl: extend peak center coordinates by +/-<readlen>/2 (if <readlen> is
specified for an unoriented SGA), and add option for track color
Rel: 1.5.1 - 10.09.2015
- chipcenter: Add option for specifying an alternative feature name
for feature replacement
- counts_filter: add 'retain mode' option to retain only those features
which fall into selected regions
- Update chipcenter man page
Rel: 1.5.0 - 03.08.2015
- Add chipextract (C code): Feature Correlation and Extraction Tool
- Remove previous perl code (chip_extract.pl)
- Update man pages
Rel: 1.4-2 - 1.07.2015
- A few bug fixes in the chipscore program
- Add new applications: chip_extract.pl, wigFS2sga.pl
Rel: 1.4-1 - 17.03.2015
- A few bug fixes in the chipcor program
Rel: 1.4 - 07.01.2015
- ChIP-Seq tools read from stdin as well
- Add compactsga, counts_filter, and featreplace to main programs
- Upgrade bed2sga.pl and sga2fps.pl tools
Rel: 1.3 - 29-08-2012
- Correct a bug in the chippart program
- Upgrade auxiliary tools
Rel: 1.2 - 17-12-2010
- Correct a bug in the chippeak program
Rel: 1.1 - 09-04-2009
- Update Man pages
- Add auxiliary tools
Rel: 1.0 - 11-11-2008
- Initial Release
DESCRIPTION OF THE TOOLS
----------------------------------------------------------------------------------------------
This package provides a set of tools for common ChIP-Seq data analysis tasks,
including positional correlation analysis, peak detection, and genome partitioning
into signal-rich and signal-poor regions.
These tools are stand-alone programs that perform the following tasks:
1. Positional correlation and generation of an aggregation plot (AP) for two genomic features (chipcor);
2. Extraction of specific genome annotation features around reference genomic anchor points (chipextract);
3. Read shifting (chipcenter);
4. Narrow peak calling with a fixed peak width (chippeak);
5. Broad peak calling for extended regions of enrichment (e.g. histone marks) (chippart);
6. Feature selection tool based on a read count threshold (chipscore).
The programs use their own compact format for ChIP-Seq data representation called SGA (Simplified Genome Annotation).
SGA is a single-line-oriented and tab-delimited format with the following five obligatory fields:
1. Sequence name (Char String),
2. Feature (Char String),
3. Sequence Position (Integer),
4. Strand (+/- or 0),
5. Read Counts (Integer).
Any number of additional fields may be added containing application-specific information.
SGA files represent genome-wide read count distributions for one or more features.
The 'feature' field (field 2) identifies the molecular species targeted by the antibody in a ChIP-seq experiment.
Sequences are identified by NCBI/RefSeq chromosome ids, which are assembly-specific and therefore prevent mixing of different assemblies.
The 'read counts' field represents the number of sequence reads that have been mapped to a specific position in the genome.
Input features may be ChIP-Seq sequence read positions, peaks identified by ChIP-peak, or any type of genome annotation that can be mapped to a single base on a chromosome.
An example of an SGA-formatted file is shown below:
NC_000001.9 H3K4me3 4794 + 1
NC_000001.9 H3K4me3 6090 + 1
NC_000001.9 H3K4me3 6099 + 1
NC_000001.9 H3K4me3 6655 + 1
NC_000001.9 H3K4me3 18453 - 1
NC_000001.9 H3K4me3 19285 + 1
NC_000001.9 H3K4me3 44529 + 1
NC_000001.9 H3K4me3 46333 + 1
NC_000001.9 H3K4me3 46349 - 1
NC_000001.9 H3K4me3 52929 + 1
NC_000001.9 H3K4me3 59412 + 1
...
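The five obligatory fields described above can be checked mechanically before a file is passed to the tools. The sketch below is only an illustration and is not part of the ChIP-Seq distribution: the function name sga_check is hypothetical, and it assumes tab-separated fields with non-negative integer positions and read counts.

```shell
# sga_check: hypothetical helper, not shipped with the ChIP-Seq tools.
# Reads SGA lines on stdin, reports each malformed line on stdout, and
# returns a non-zero status if at least one line is malformed.
# Assumptions: fields are tab-separated; positions and read counts are
# non-negative integers. Additional fields beyond the fifth are accepted.
sga_check() {
    awk -F '\t' '
        NF < 5           { print "line " NR ": fewer than 5 fields";           bad = 1; next }
        $1 == ""         { print "line " NR ": empty sequence name";           bad = 1; next }
        $2 == ""         { print "line " NR ": empty feature name";            bad = 1; next }
        $3 !~ /^[0-9]+$/ { print "line " NR ": position is not an integer";    bad = 1; next }
        $4 != "+" && $4 != "-" && $4 != "0" {
                           print "line " NR ": strand must be +, - or 0";      bad = 1; next }
        $5 !~ /^[0-9]+$/ { print "line " NR ": read count is not an integer";  bad = 1; next }
        END              { exit (bad ? 1 : 0) }
    '
}
```

For instance, with a placeholder file name, `sga_check < sample.sga` prints nothing and returns 0 when every line has the expected layout.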
ChIP-Seq programs require SGA input files to be sorted by sequence name, position, and strand.
On UNIX systems, the following command sorts SGA files accordingly:
sort -s -k1,1 -k3,3n -k4,4 <SGA file>
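As a hedged illustration, the sort command above can be wrapped in two small shell functions, one that sorts and one that checks whether a file is already sorted. The function names sga_sort and sga_is_sorted are hypothetical. Setting LC_ALL=C is an assumption added here, not part of the original command; it forces byte-wise comparison so that the result does not depend on the user's locale.

```shell
# sga_sort: hypothetical wrapper around the sort command quoted above.
# Sorts by sequence name (field 1), then numerically by position (field 3),
# then by strand (field 4). -s keeps the input order of lines that tie on
# all three keys. LC_ALL=C is an added assumption (locale-independent order).
sga_sort() {
    LC_ALL=C sort -s -k1,1 -k3,3n -k4,4 "$@"
}

# sga_is_sorted: returns 0 if the given file is already in that order,
# non-zero otherwise. Uses the standard "sort -c" check mode with the
# same keys; the diagnostic that sort prints on disorder is discarded.
sga_is_sorted() {
    LC_ALL=C sort -c -s -k1,1 -k3,3n -k4,4 "$1" 2>/dev/null
}
```

A possible use, with placeholder file names: `sga_is_sorted reads.sga || sga_sort reads.sga > reads.sorted.sga`.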
WEB SITE
----------------------------------------------------------------------------------------------
ChIP-Seq has a Web interface which is freely available at:
http://ccg.vital-it.ch/chipseq/
FTP SITE (Data used to show examples in the User's Guide)
ftp://ccg.epfl.ch/chip-seq/data
PROGRAM INSTALLATION
==============================================================================================
Untar the file containing the source code (e.g. chip-seq.1.5.4.tar.gz).
For code compilation a suitable makefile is provided.
##############################################################################################
# For ChIP-Seq versions up to chip-seq.1.5.4.tar.gz
##############################################################################################
- To create the executable files, type:
make
- To delete the executable files and all the object files from the root directory, type:
make clean
- To install the man pages (root permissions required), type:
make man
- To install the executable files (default $binDir is ./bin.x86_64), type:
make install
- To delete the executable files and all the object files from the $binDir directory, type:
make cleanbin
##############################################################################################
# For ChIP-Seq version chip-seq.1.5.5.tar.gz and on
##############################################################################################
For code compilation and data/code installation a suitable makefile is provided.
- To create the executable files, type:
make
- To install the man pages (root permissions required), type:
make man
- To install the executable files (default $binDir is ./bin), type:
make install
- To install the executable files system-wide (e.g. in /usr/local/bin), type:
sudo make prefix=/usr/local install
- To delete the executable files and all the object files from the compilation directory, type:
make clean
- To delete the executable files and all the object files from the $binDir directory, type:
make uninstall
# Man Pages
- To install man pages system-wide, type:
sudo make prefix=/usr/local install-man
This command installs the chip-seq man pages in /usr/local/share/man/chip-seq/man1.
# Data files needed for format-conversion tasks
- To install data files needed for some format conversion programs, type:
sudo make prefix=/usr/local install-dat
This command installs the chr_NC_gi, chro_idx.nstorage, and chr_size files in /usr/local/share/chip-seq/.
# ChIP-Seq User's Manual
- To install the User's Guide system-wide, type:
sudo make prefix=/usr/local install-doc
This command installs the ChIP-Seq_Tools-UsersGuide.pdf file in /usr/local/share/chip-seq/doc/.