########################################
# HOW TO GET STARTED WITH RE-ANNOTATOR #
########################################

A. Things to do before using Re-Annotator for the first time

I) Install External Programs

The following programs should be installed on your system before running Re-Annotator:
a) Perl
b) BWA       (http://sourceforge.net/projects/bio-bwa/files/)
c) SAMtools  (https://sourceforge.net/projects/samtools/files/)
d) Annovar   (http://www.openbioinformatics.org/annovar/)
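
Before going further, it can help to confirm that these programs are reachable from the shell. The following is a minimal sketch and not part of Re-Annotator: the function name check_tools is hypothetical, and it assumes the tools are on your PATH (otherwise pass their full paths).

```sh
# Report every program from the argument list that cannot be found.
# Returns 0 if all were found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example call (assumes the tools are on PATH under these names):
# check_tools perl bwa samtools annotate_variation.pl
```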

II) Get External Data
a) Reference Genome Sequence
- Download the reference genome, e.g., hg19
  http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/chromFa.tar.gz
- unpack the archive (chromFa.tar.gz is a gzipped tar file, e.g., tar -xzf chromFa.tar.gz)
  make sure each chromosome is in its own file
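
To see how the sequences are distributed over the unpacked files, you can count the FASTA header lines (lines starting with ">") in each file. This is a minimal sketch; the function name is hypothetical, and the *.fa pattern assumes file names such as chr1.fa as found in the UCSC archive.

```sh
# Print "<file><TAB><number of sequences>" for each FASTA file given.
# A sequence starts at a header line beginning with ">".
count_fasta_records() {
    for f in "$@"; do
        # grep -c exits non-zero when the count is 0, so ignore its status
        n=$(grep -c '^>' "$f" || true)
        printf '%s\t%s\n' "$f" "$n"
    done
}

# Example call (directory name taken from the examples in this README):
# count_fasta_records ~/ReAnnotator/hg19/*.fa
```

A count of 1 for a file means that it holds exactly one sequence.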

b) Gene Database
- Download information on gene locations, e.g., RefSeq
  http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/refGene.txt.gz
- unzip (using gzip, e.g., gzip -d refGene.txt.gz)

c) Make sure the desired annotation databases are available in Annovar.
These can be, for instance:
- refGene
./annotate_variation.pl -downdb refGene -buildver hg19 humandb/

- a SNP database, e.g., snp129 (available from Annovar)
./annotate_variation.pl -downdb snp129 -buildver hg19 -webfrom annovar humandb/

III) Generate the mRNA Reference Sequence
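a) Execute the exome building script to build the mRNA reference sequence
(this README names it new_exomeBuilding.pl here and BuildExomeReference.pl in the example; use the name found in your copy of Re-Annotator)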
example:
$> ./BuildExomeReference.pl -i ~/ReAnnotator/refGene.txt -o ~/ReAnnotator/exomeRef/hg19exome -r ~/ReAnnotator/hg19/

b) Use "bwa index" to generate the BWA index files for
i) the exome reference sequence
ii) the whole genome reference sequence
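
The indexing step can be sketched as follows. The file names are taken from the ReAnnotator.sh example later in this README and may differ on your system; the helper function index_if_missing and the BWA variable are hypothetical. The sketch skips a reference whose .bwt index file already exists.

```sh
# Path to the bwa executable; override if bwa is not on PATH.
BWA=${BWA:-bwa}

# Run "bwa index" on a FASTA file unless its index (.bwt) is already there.
index_if_missing() {
    ref=$1
    if [ -e "$ref.bwt" ]; then
        echo "index exists, skipping: $ref"
    else
        "$BWA" index "$ref"
    fi
}

# Example calls:
# index_if_missing ~/ReAnnotator/exomeRef/hg19exome_inclUTR.fasta
# index_if_missing ~/ReAnnotator/hg19/hg19genome.fasta.gz
```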

IV) Complete config.sh
* provide the exact locations of the external programs
in the file config.sh
* config.sh is also the place to change the default settings
 - # of CPUs
 - # of mismatches
 - genome version, e.g., hg19, hg18, mm9, ...
 - genedb (refGene, Ensembl, ...)
 - snpdb  (snp135, snp136, ...)

NOTE: step III) has to be carried out every time the Gene Database is updated!
NOTE: update config.sh to meet the needs of your re-annotation
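
The settings from step IV could look roughly like this in a shell config file. All variable names, paths, and values below are hypothetical placeholders chosen for illustration; use the names that are actually present in the config.sh shipped with Re-Annotator.

```sh
# Hypothetical config.sh sketch: variable names, paths, and values are
# placeholders, not the names used by the real config.sh.
BWA_PATH=/usr/local/bin/bwa              # location of the bwa executable
SAMTOOLS_PATH=/usr/local/bin/samtools    # location of the samtools executable
ANNOVAR_DIR=/opt/annovar                 # directory holding annotate_variation.pl

NUM_CPUS=4              # number of CPUs
NUM_MISMATCHES=2        # number of mismatches
GENOME_VERSION=hg19     # genome version, e.g., hg19, hg18, mm9
GENE_DB=refGene         # gene database
SNP_DB=snp135           # SNP database
```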

B. Things to do at every run

I) Check that config.sh is set up correctly for the genome and exome you are about to use
 - regenerate the mRNA reference if you are using an updated database version or new genome release

II) Convert the probe file into a FASTA file
For Illumina probe files, the script parse_fastaFromOriginIlmnAnno.pl can be used.
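
For probes that are not in the Illumina annotation format, a FASTA file can be produced with a few lines of shell. This minimal sketch assumes a hypothetical tab-separated input with the probe ID in column 1 and the probe sequence in column 2; it is not a replacement for parse_fastaFromOriginIlmnAnno.pl, and the function name is made up for illustration.

```sh
# Convert tab-separated "ID<TAB>SEQUENCE" lines (stdin) to FASTA (stdout).
# Lines with an empty ID or an empty sequence are skipped.
tsv_to_fasta() {
    awk -F '\t' '$1 != "" && $2 != "" { print ">" $1; print $2 }'
}

# Example call (file names are placeholders):
# tsv_to_fasta < my_probes.tsv > my_probes.fasta
```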

III) Execute ./ReAnnotator.sh with the corresponding parameters
example:
./ReAnnotator.sh my_illumina_probes.fasta ~/ReAnnotator/exomeRef/hg19exome_inclUTR.fasta ~/ReAnnotator/hg19/hg19genome.fasta.gz ~/ReAnnotator/refGene.txt ~/ReAnnotator/outputs/ ~/ReAnnotator/tmp/

Prior to running the script, make sure that all the directories exist.
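
The directory check can be done with a short shell snippet before calling ReAnnotator.sh. This is a minimal sketch; the function name ensure_dirs is hypothetical, and the paths are those of the example call. mkdir -p creates a directory if it is missing and leaves an existing one untouched.

```sh
# Create each directory if it does not exist yet; existing ones are left as-is.
ensure_dirs() {
    for d in "$@"; do
        mkdir -p "$d" || return 1
    done
}

# Example call with the output and temporary directories from the example:
# ensure_dirs ~/ReAnnotator/outputs/ ~/ReAnnotator/tmp/
```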

The script will call the following scripts in sequence:
 a) run_realignment.sh
 b) run_coordinateConversion.sh
 c) run_genome_snp_annotation.sh
Source: README.txt, updated 2014-04-09