deFuse is a software package for gene fusion discovery using RNA-Seq data. The software uses clusters of discordant paired-end alignments to inform a split-read alignment analysis for finding fusion boundaries. It also applies a number of heuristic filters to reduce the number of false positives and produces a fully annotated output for each predicted fusion.
The deFuse algorithm and results from an application to ovarian tumours and sarcomas have been published in PLoS Computational Biology:
deFuse: An Algorithm for Gene Fusion Discovery in Tumor RNA-Seq Data
deFuse has been used to discover gene fusions in tumour samples for the following publications:
If you are having trouble running deFuse, please post your question to the deFuse Help forum.
Four of the granulosa cell tumour datasets analysed in the PLoS Comp. Bio. paper have been submitted to the European Genome-Phenome Archive and can be accessed here. A mapping between IDs used in the PLoS Comp. Bio. paper and the New England Journal of Medicine paper is provided below.
N. Engl. J. Med. ID    PLoS Comp. Bio. ID
GCT0026                GRC4
GCT0028                GRC1
GCT0077                GRC2
GCT0078                GRC3
Please feel free to email the author if you have any questions or issues.
Improvements:
interrupted_indexN and splicing_indexN
interrupted_indexN and splicing_indexN using calculate_extra_annotations
Improvements:
Bugfixes:
merge_read_stats.pl
Other Changes:
bowtie_options if in the config if needed
Comments:
Fixes the following bugs:
Fastq error, unable to interpret readid
No dataset rebuild required.
Added a simulated test dataset and deFuse output based on default parameters here
Fixes the following bugs:
Error: expected GTAG and GTC to be same size message muted
In addition, the script get_reads.pl has been added. Pass it the config filename, output directory, and cluster_id of a given fusion to view the supporting spanning and split reads.
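As a rough illustration of the get_reads.pl call described here, the wrapper below passes the three inputs to the script. The option letters -c, -o and -i are assumptions that this page does not confirm (the first two mirror the reannotate.pl commands shown elsewhere on this page), so check the script's usage message for your version. The function name show_fusion_reads and the GET_READS variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper around get_reads.pl.
# ASSUMPTION: the option letters -c, -o and -i are not confirmed by this page.
GET_READS="${GET_READS:-./get_reads.pl}"

show_fusion_reads() {
    # $1 = config filename, $2 = output directory, $3 = cluster_id
    if [ "$#" -ne 3 ]; then
        echo "usage: show_fusion_reads CONFIG OUTPUT_DIR CLUSTER_ID" >&2
        return 2
    fi
    "$GET_READS" -c "$1" -o "$2" -i "$3"
}
```

For example, `show_fusion_reads config.txt output_directory 1234` would request the supporting reads for the fusion whose cluster_id is 1234 (an illustrative value).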
No dataset rebuild required.
Fixes the EST island filter, which was previously broken. Adds 50 A's to each cDNA sequence to allow alignment of poly-A tail reads.
Requires a dataset rebuild using the create_reference_dataset.pl script.
Version 0.4.1 fixes the "too many reference sequences" bug that occurs when using version 0.4.0 with the latest Ensembl release.
To start 0.4.1 where 0.4.0 left off, delete all files matching *cluster* from the output directory, then run 0.4.1 on that output directory. There is no need to rebuild the reference dataset.
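The cleanup step can be sketched as a small shell function. The function name remove_cluster_files is hypothetical, and the pattern *cluster* is taken literally from the instruction above. As a cautious assumption, the sketch only removes regular files directly inside the output directory.

```shell
#!/bin/sh
# Sketch: delete files matching *cluster* from a deFuse output directory
# so that a newer version can resume from the remaining intermediate files.
remove_cluster_files() {
    # $1 = output directory of the earlier run
    if [ ! -d "$1" ]; then
        echo "not a directory: $1" >&2
        return 1
    fi
    for f in "$1"/*cluster*; do
        # Skip the unexpanded pattern (no match) and anything
        # that is not a regular file.
        [ -f "$f" ] || continue
        echo "removing $f"
        rm -f -- "$f"
    done
    return 0
}
```

After `remove_cluster_files output_directory`, rerun the newer version with the same output directory.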
Version 0.4.0 provides several advantages over version 0.3 defuse:
From version 0.4.0 onwards, the prepackaged dataset is replaced by step-by-step instructions on how to build your own dataset. The 0.3 dataset is not compatible with version 0.4.0 or later.
Version 0.3.7 fixes speed and memory issues that would occur with some datasets. If deFuse was taking a long time and generating a very large clusters.txt file, this update should help. In order to run the new version on a dataset that failed with version 0.3.6 or lower, first remove all files matching clusters in the output directory created by the previous version, then restart version 0.3.7 using the old output directory. The new version should pick up where the old version left off.
Known issues:
Version 0.3.6 introduces new annotations that the adaboost classifier uses for slightly higher accuracy. The filtered output is now based on the probability produced by the adaboost classifier. deFuse 0.3.6 can also be used to quickly reannotate a deFuse 0.3.5 analysis; however, the repeats.txt file from the newly posted dataset is required. To reannotate:
rm output_directory/annotations.txt
./reannotate.pl -c config.txt -o output_directory
Known issues:
Version 0.3.5 is a minor update that fixes a number of small bugs and fixes reannotate.pl so that it can actually be used to classify the results of any deFuse 0.3.X run. This functionality was not working in 0.3.4. Run the following to annotate any 0.3.X run and obtain the new annotations and adaboost probability score. It is not necessary to run reannotate.pl if you have already run version 0.3.5 of defuse.pl.
rm output_directory/annotations.txt
./reannotate.pl -c config.txt -o output_directory -p max_threads
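The two steps can be wrapped in one function so that the stale annotations file is always removed before reannotating. This is only a sketch: the name reannotate_run and the REANNOTATE variable are hypothetical, and the default of one thread is an illustrative choice. The -c, -o and -p options are the ones used in the commands on this page.

```shell
#!/bin/sh
# Sketch: remove the old annotations and rerun reannotate.pl in one step.
REANNOTATE="${REANNOTATE:-./reannotate.pl}"

reannotate_run() {
    # $1 = config file, $2 = output directory,
    # $3 = max threads (optional; defaulting to 1 is an assumption)
    if [ "$#" -lt 2 ]; then
        echo "usage: reannotate_run CONFIG OUTPUT_DIR [MAX_THREADS]" >&2
        return 2
    fi
    # -f: do not fail if annotations.txt is already gone.
    rm -f -- "$2/annotations.txt" || return 1
    "$REANNOTATE" -c "$1" -o "$2" -p "${3:-1}"
}
```

For example, `reannotate_run config.txt output_directory 4` reannotates an existing 0.3.X run using up to four threads.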
Known issues:
Version 0.3.4 uses an adaboost classifier trained on 60 true positives and 61 false positives to produce a single probability score for each fusion. The R ada package is required. You can take full advantage of the adaboost classifier for results produced using other 0.3.X versions of defuse by simply running the reannotate.pl script *Edit: this does not work, please update to version 0.3.5*. Doing a full rerun using 0.3.4 should not be necessary. Once again, the dataset has not changed.
Changes:
Known issues:
Version 0.3.3 reworks the split alignments so that they are much quicker and do not hit the disk as much as in previous versions. If your deFuse runs are taking a long time, it's a good idea to upgrade. The dataset has not changed.
Changes:
Known issues:
If deFuse was previously creating very large files, consuming large amounts of memory and taking a long time to run, it could be because versions 0.3.0 and 0.3.1 were not properly filtering IG rearrangements. The problems are exacerbated if your RNA-Seq data was produced from a tumour with a high level of immune infiltration. This issue is fixed in version 0.3.2. Note that results from 0.3.0 and 0.3.1 are not wrong, but they include a superset of what you're interested in (assuming you are not interested in predicting IG rearrangements).
Version 0.3.2 also provides a number of other annotations that may be interesting for prioritizing fusions for validation or further experiments.
Changes:
Known issues:
Changes:
This version represents a major change to the way split reads are calculated and has produced more validated predictions than 0.2.0. It is not backward compatible with 0.2.0.
Changes:
This is the first official release of deFuse.
Thanks to Malachi Griffith for getting deFuse to work on LSF clusters.