fusion-test - Browse Files at SourceForge.net

Name	Modified	Size	Downloads / Week
fusion_test.sh	2019-02-23	38.0 kB	0
README	2018-11-19	7.1 kB	2
MapSplice-v2.2.2.tar.gz	2018-11-19	8.7 MB	0
tophat.post.config	2018-02-11	85 Bytes	0
tophat.rna.config	2018-02-11	93 Bytes	0
Alignment.oscript	2018-02-02	1.0 kB	0
FusionMap_2015-03-31.tar.gz	2018-01-23	25.4 MB	0
fusionmap.sh	2018-01-22	7.1 kB	0
configuration.tmp	2017-12-29	1.2 kB	0
samtools-0.1.19.tar.gz	2017-07-25	3.4 MB	0
ericscript-0.5.5.tar.gz	2017-07-24	569.7 kB	0
SOAPfuse-v1.26.tar.gz	2016-07-06	42.8 MB	0
mcl	2016-03-03	406.9 kB	0
refGene_sorted.txt	2016-03-03	10.8 MB	0
ensGtp.txt	2016-03-03	7.4 MB	0
ensGene.txt	2016-03-03	39.6 MB	0
blastn	2016-03-03	33.5 MB	0
cytoBand.txt.gz	2015-11-15	6.6 kB	0
hg19table.txt.gz	2015-11-15	3.7 MB	0
hgnc_complete_set.txt.gz	2015-11-15	3.1 MB	0
bowtie-1.1.1.tar.gz	2015-11-15	14.2 MB	0
hg19table.txt	2015-11-15	122.3 kB	0
chimerascan-0.4.5.tar.gz	2015-11-15	4.1 MB	0
Homo_sapiens.GRCh37.60.chr.gtf.gz	2015-11-15	20.3 MB	0
Mono.tar.gz	2015-11-15	89.8 MB	0
Jinja2-2.7.3.tar.gz	2015-11-15	1.3 MB	0
scikit-learn-0.14.1.tar.gz	2015-11-15	6.8 MB	0
Homo_sapiens.GRCh37.69.gtf.gz	2015-11-15	26.2 MB	0
fusion_test.config	2015-11-15	70 Bytes	0
Totals: 29 Items		342.5 MB	2

# fusion_test.sh
# Author: Chin-Chen Pan
# Director, General and Surgical Pathology
# Professor, attending pathologist
# Department of Pathology and Laboratory Medicine
# Taipei Veterans General Hospital
# TAIWAN
# Version 4.5.2
# Date: Nov 2, 2018

[Introduction]

fusion_test.sh is a shell script that detects gene fusions in DNAseq or RNAseq data. It runs chimerascan, SOAPfuse, MapSplice2, FusionMap, fusioncatcher, TopHat and EricScript. The output files of chimerascan are further annotated by Jinja and Pegasus Fusion.

[Before running]

1. Prepare fusion_test.config. The file must contain exactly four words on a single line, separated by spaces. No other words or lines are allowed.

/path/to/programs /path/to/inputfile /path/to/outputfile thread_number

ex1:
/home/user_name /media/user_name/disk1/input /home/user_name/output 8

ex2:
~ ~/input ~/output 8
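
The format above can be checked before a long run. The following is only a minimal sketch: check_config is a hypothetical helper, not part of fusion_test.sh, and it assumes the four words are separated by whitespace and that thread_number is a positive integer.

```shell
#!/bin/sh
# Sketch: verify that a fusion_test.config-style file holds exactly one
# line with four whitespace-separated words.
# check_config is a hypothetical helper, not part of fusion_test.sh.
check_config() {
    cfg=$1
    [ -f "$cfg" ] || { echo "missing: $cfg" >&2; return 1; }
    # awk remembers the field count of line 1 and checks the total line count.
    awk 'NR == 1 { nf = NF } END { exit !(NR == 1 && nf == 4) }' "$cfg" || {
        echo "expected one line with four words in $cfg" >&2
        return 1
    }
    # The fourth word is thread_number (assumed to be a positive integer).
    threads=$(awk 'NR == 1 { print $4 }' "$cfg")
    case $threads in
        ''|*[!0-9]*) echo "thread_number is not an integer: $threads" >&2; return 1 ;;
    esac
    [ "$threads" -gt 0 ] || { echo "thread_number must be above 0" >&2; return 1; }
}
```

Calling check_config fusion_test.config returns 0 for the two examples above and a non-zero status for a file with a missing word or an extra line.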

2. python-dev, zlib1g-dev, python-pandas, libgdiplus, default-jre, R and the R package ada must be installed.

sudo apt-get install python-dev
sudo apt-get install zlib1g-dev
sudo apt-get install python-pandas
sudo apt-get install default-jre
sudo apt-get install r-base
sudo apt-get install libgdiplus
R
>install.packages("ada")
>q()

3. Install chimerascan, jinja, pegasus, sklearn (only from scikit-learn-0.14.1) and mono (only from mono-2.10.9). Please refer to the authors' websites.
If you place the folders of chimerascan-0.4.5 (including ez_setup.py), Jinja2-2.7.3, scikit-learn-0.14.1, mono-2.10.9 in the /path/to/programs, the script can automatically install the programs the first time you run it.
Pre-built Mono and MapSplice are provided.
Download SOAPfuse-v1.26, Homo_sapiens.GRCh37.69.gtf.gz, cytoBand.txt.gz and hgnc_complete_set.txt to /path/to/programs. The script will build SOAP-index automatically.
Download the TopHat2 prebuilt binary (tophat-2.0.0.Linux_x86_64) and the bowtie1 index from the TopHat site.
Copy blastn, mcl, ensGtp.txt, ensGene.txt and refGene_sorted.txt into the tophat-2.0.0.Linux_x86_64 folder. Those files can be downloaded here.
Create the directory /path/to/programs/blast, download human_genomic*, other_genomic* and nt* from the blast database (ftp://ftp.ncbi.nlm.nih.gov/blast/db/), and extract them under /path/to/programs/blast.
Download bedtools and extract the directory to /path/to/programs/bedtools2. The script will use it to install bedtools the first time you run it.
EricScript uses an old version of samtools (samtools-0.1.19). Download the prebuilt samtools here, decompress it, and copy the samtools binary to /usr/bin under the name samtools-0.1.19:

sudo cp /path/to/samtools-0.1.19/samtools /usr/bin/samtools-0.1.19

Install oshell to /path/to/programs/oshell according to the authors' website. http://www.arrayserver.com/wiki/index.php?title=Oshell#OmicScript_for_FusionMap

Copy Alignment.oscript to /path/to/programs/oshell.

Download ericscript-0.5.5.tar.gz here and decompress it to /path/to/programs.
Download ericscript_db_homosapiens_ensembl73 from the ericscript site and extract it to /path/to/programs/ericscript_db_homosapiens_ensembl73.

Copy configuration.tmp to /path/to/fusioncatcher/fusioncatcher_v0.99.5a/etc. It will be used as a template for configuration.cfg for fusioncatcher.

4. The following files and folders must be placed in /path/to/programs.

chimerascan-0.4.5
bowtie-1.1.1
chimerascan_hg19_ucsc_index (index for chimerascan)
Pegasus_dist.0.3.1
SOAPfuse-v1.26
SOAPfuse-index (index for SOAPfuse)
MapSplice-v2.2.2
chromFa (index for MapSplice)
Homo_sapiens.GRCh37.60.chr.gtf (required for Pegasus Fusion)
Homo_sapiens.GRCh37.69.gtf (required for MapSplice)
Mono
oshell
oshell/Alignment.oscript
OmicsoftFolders (index for FusionMap)
fusioncatcher
tophat-2.1.0.Linux_x86_64
BowtieIndex
blast
ericscript-0.5.5
ericscript_db_homosapiens_ensembl73

Some of the files can be downloaded here.
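
A quick way to see what is still missing from /path/to/programs is sketched below. missing_entries is a hypothetical helper, not part of fusion_test.sh; the entry names are copied from the list in step 4, and the check only tests that each entry exists, not that its content is complete.

```shell
#!/bin/sh
# Sketch: print every entry from step 4 that is absent from the programs
# folder. missing_entries is a hypothetical helper, not part of fusion_test.sh.
# Usage: missing_entries /path/to/programs
missing_entries() {
    programs=$1
    for entry in \
        chimerascan-0.4.5 bowtie-1.1.1 chimerascan_hg19_ucsc_index \
        Pegasus_dist.0.3.1 SOAPfuse-v1.26 SOAPfuse-index MapSplice-v2.2.2 \
        chromFa Homo_sapiens.GRCh37.60.chr.gtf Homo_sapiens.GRCh37.69.gtf \
        Mono oshell oshell/Alignment.oscript OmicsoftFolders fusioncatcher \
        tophat-2.1.0.Linux_x86_64 BowtieIndex blast ericscript-0.5.5 \
        ericscript_db_homosapiens_ensembl73
    do
        # -e accepts both files and directories.
        [ -e "$programs/$entry" ] || printf '%s\n' "$entry"
    done
}
```

An empty output means every listed file and folder was found.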

5. The original configuration files, ./SOAPfuse-v1.26/config.txt and ./oshell/Alignment.oscript, are used as templates. Do not change the content of these 2 files. You can create config2.txt and Alignment2.oscript with different parameters (used with the options -sf2 and -fm2).

6. You may write additional options to tophat.rna.config, tophat.genome.config and tophat.post.config, each containing a single line (only the first line of the files will be read), and save them to /path/to/programs/tophat-2.1.0.Linux_x86_64.

7. The seq files must be paired-end and named samplename_1.suffix and samplename_2.suffix. The suffix must be one of fastq, fq, fastq.gz or fq.gz.

8. In order to be compatible with SOAPfuse, the seq files must be placed in the following paths:

/path/to/inputfile/samplename/Lib/samplename_1.suffix
/path/to/inputfile/samplename/Lib/samplename_2.suffix
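
The naming and folder rules in steps 7 and 8 can be applied with a small helper such as the one below. This is a sketch only: stage_sample is a hypothetical name, not part of fusion_test.sh, and it copies the reads rather than moving them so the originals stay untouched.

```shell
#!/bin/sh
# Sketch: copy a pair of read files into the layout from steps 7 and 8.
# stage_sample is a hypothetical helper, not part of fusion_test.sh.
# Usage: stage_sample /path/to/inputfile samplename suffix reads_1 reads_2
stage_sample() {
    input_root=$1 sample=$2 suffix=$3 reads1=$4 reads2=$5
    # Only the four suffixes named in step 7 are accepted.
    case $suffix in
        fastq|fq|fastq.gz|fq.gz) ;;
        *) echo "unsupported suffix: $suffix" >&2; return 1 ;;
    esac
    [ -f "$reads1" ] && [ -f "$reads2" ] || { echo "read file not found" >&2; return 1; }
    lib="$input_root/$sample/Lib"
    mkdir -p "$lib" || return 1
    # Mate 1 and mate 2 get the _1 and _2 names that step 7 requires.
    cp "$reads1" "$lib/${sample}_1.$suffix" &&
    cp "$reads2" "$lib/${sample}_2.$suffix"
}
```

For example, stage_sample ~/input test1 fastq.gz run_R1.fastq.gz run_R2.fastq.gz would create ~/input/test1/Lib/test1_1.fastq.gz and ~/input/test1/Lib/test1_2.fastq.gz (the run_R1 and run_R2 file names are made up for illustration).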

[Running]

Syntax: sh fusion_test.sh samplename suffix -options

options:
-sf2: use second configuration file for SOAPfuse
-fm2: use second configuration file for FusionMap
-rna: RNAseq (default)
-dna: DNAseq for FusionMap
-transcriptome: Use transcriptome index in TopHat (default)
-genome: Use genome index in TopHat
-s: shutdown after finished
-kt: keep temporary files
-skc: skip chimerascan
-sks: skip SOAPfuse
-skm: skip MapSplice
-skf: skip FusionMap
-skt: skip fusioncatcher
-skh: skip TopHat
-ske: skip EricScript

ex1:
sh fusion_test.sh test1 fastq.gz -genome
ex2:
sh fusion_test.sh test2 fastq -skc -sf2 -skm -s
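
When several samples share the same suffix and options, the command lines can be generated and reviewed before anything is run. The sketch below only prints the commands; print_runs is a hypothetical helper, not part of fusion_test.sh.

```shell
#!/bin/sh
# Sketch: print one fusion_test.sh command line per sample so that a batch
# can be reviewed first. print_runs is a hypothetical helper.
# Usage: print_runs suffix "options" sample1 [sample2 ...]
print_runs() {
    suffix=$1 options=$2
    shift 2
    for sample in "$@"; do
        printf 'sh fusion_test.sh %s %s %s\n' "$sample" "$suffix" "$options"
    done
}
```

For example, print_runs fastq.gz "-genome" test1 test2 prints two command lines, and piping that output to sh runs them one after another. Since -s shuts the machine down after a run finishes, it is probably best left out of a batch like this.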

[How to build chimerascan index]
1. Download http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/chromFa.tar.gz
2. Uncompress chromFa.tar.gz
3. cat chr?.fa chr??.fa > ~/hg19.fa
4. python /usr/local/bin/chimerascan_index.py --bowtie-path=$HOME/bowtie-1.1.1 ~/hg19.fa ~/hg19table.txt ~/chimerascan_hg19_ucsc_index
(hg19table.txt can be downloaded here.)
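
The two glob patterns in step 3 decide which chromosomes end up in hg19.fa. The sketch below wraps that step in a function to make the selection explicit; concat_chromfa is a hypothetical helper, and it assumes the chr*.fa files from chromFa.tar.gz sit directly in the given directory.

```shell
#!/bin/sh
# Sketch of step 3: concatenate the per-chromosome FASTA files into one file.
# concat_chromfa is a hypothetical helper, not part of fusion_test.sh.
# Usage: concat_chromfa /path/to/chromFa /path/to/hg19.fa
concat_chromfa() {
    chromfa_dir=$1 out=$2
    # chr?.fa matches one-character names (chr1..chr9, chrM, chrX, chrY) and
    # chr??.fa matches two-character names (chr10..chr22). Longer names such
    # as chr1_gl000191_random.fa match neither pattern and are left out.
    ( cd "$chromfa_dir" && cat chr?.fa chr??.fa ) > "$out"
}
```

Counting the header lines afterwards with grep -c '^>' on the output file is a simple way to confirm how many sequences were included.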

[How to build SOAPfuse index]
perl SOAPfuse-S00-Generate_SOAPfuse_database.pl -wg ~/hg19/hg19.fa -gtf ~/Homo_sapiens.GRCh37.69.gtf.gz -cbd ~/cytoBand.txt.gz -gf ~/hgnc_complete_set.txt -sd ~/SOAPfuse-v1.26/ -dd ~/SOAPfuse-index/
(Homo_sapiens.GRCh37.69.gtf.gz, cytoBand.txt.gz and hgnc_complete_set.txt can be downloaded here.)

[How to build MapSplice index]
1. Uncompress chromFa.tar.gz into ~/chromFa.
2. Keep chr1.fa to chr22.fa, chrM.fa, chrX.fa and chrY.fa. Delete the others.
3. The program will build the index automatically.
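
Step 2 can be done by hand or with a short loop like the one sketched here. prune_chromfa is a hypothetical helper, not part of fusion_test.sh. It deletes files, so it is safer to try it on a copy of the chromFa folder first.

```shell
#!/bin/sh
# Sketch of step 2: delete every .fa file in the folder except
# chr1..chr22, chrM, chrX and chrY.
# prune_chromfa is a hypothetical helper, not part of fusion_test.sh.
# Usage: prune_chromfa /path/to/chromFa
prune_chromfa() {
    chromfa_dir=$1
    for fa in "$chromfa_dir"/*.fa; do
        [ -e "$fa" ] || continue      # the folder had no .fa files
        name=${fa##*/}                # strip the directory part
        case $name in
            chr[1-9].fa|chr1[0-9].fa|chr2[0-2].fa|chr[MXY].fa) ;;  # keep
            *) rm -- "$fa" ;;                                      # delete
        esac
    done
}
```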

[How to build FusionMap index]
Make the following empty folders in /path/to/programs.
OmicsoftFolders
---OmicsoftSGE
---Fusion
---ReferenceLibrary
Run with the -kt option the first time. The program will download the index files to /path/to/outputfile/filename/FusionMap/OmicsoftFolders. After it has finished, copy /path/to/outputfile/filename/FusionMap/OmicsoftFolders back to /path/to/programs/OmicsoftFolders.
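
The folder setup and the copy-back step might look like the sketch below. Both function names are hypothetical, and the sketch assumes that the three "---" entries are direct subfolders of OmicsoftFolders; adjust the paths if FusionMap expects a different nesting.

```shell
#!/bin/sh
# Sketch: create the empty folder skeleton under /path/to/programs.
# Assumption: OmicsoftSGE, Fusion and ReferenceLibrary are direct
# subfolders of OmicsoftFolders.
# Usage: make_omicsoft_folders /path/to/programs
make_omicsoft_folders() {
    programs=$1
    mkdir -p "$programs/OmicsoftFolders/OmicsoftSGE" \
             "$programs/OmicsoftFolders/Fusion" \
             "$programs/OmicsoftFolders/ReferenceLibrary"
}

# Sketch: after the first run with -kt, copy the downloaded index files
# back into the programs folder.
# Usage: copy_back_index /path/to/outputfile/filename /path/to/programs
copy_back_index() {
    run_dir=$1 programs=$2
    # The trailing "/." copies the folder contents, not the folder itself.
    cp -R "$run_dir/FusionMap/OmicsoftFolders/." "$programs/OmicsoftFolders/"
}
```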

Note: MapSplice and FusionCatcher unzip the input files into two temporary files, which are deleted after the procedure. To avoid this repeated unzipping, unzip the input files before running the script and use the unzipped files for all procedures.
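
One way to prepare unzipped inputs is sketched below. gunzip_pair is a hypothetical helper, not part of fusion_test.sh; it writes the uncompressed copies next to the originals and leaves the .gz files in place.

```shell
#!/bin/sh
# Sketch: write uncompressed copies of a gzipped read pair next to the
# originals. gunzip_pair is a hypothetical helper, not part of fusion_test.sh.
# Usage: gunzip_pair /path/to/inputfile/samplename/Lib samplename fastq.gz
gunzip_pair() {
    lib=$1 sample=$2 suffix=$3
    plain=${suffix%.gz}      # fastq.gz -> fastq, fq.gz -> fq
    [ "$plain" != "$suffix" ] || { echo "suffix is not gzipped: $suffix" >&2; return 1; }
    for mate in 1 2; do
        # gzip -dc writes to standard output and keeps the .gz file.
        gzip -dc "$lib/${sample}_$mate.$suffix" > "$lib/${sample}_$mate.$plain" || return 1
    done
}
```

After gunzip_pair ~/input/test1/Lib test1 fastq.gz, the script would be run with the unzipped suffix, for example sh fusion_test.sh test1 fastq.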

Source: README, updated 2018-11-19

fusion-test Files

Script for fusion detection in DNAseq and RNAseq
