Daniel N

Description

FusionCatcher searches for novel/known fusion genes, translocations, and chimeras in RNA-seq data (paired-end reads from Illumina NGS platforms like Solexa/HiSeq/NextSeq/MiSeq) from diseased samples.

The aims of FusionCatcher are:

  • very good detection rate for finding candidate fusion genes
  • very easy to use (i.e. no a priori knowledge of databases and bioinformatics is needed in order to run FusionCatcher)
  • to be as automatic as possible (i.e. FusionCatcher automatically chooses the best parameters for finding candidate fusion genes, e.g. detecting the adapters, quality-trimming the reads, and building the exon-exon junctions based on the length of the input reads) while providing the best possible detection rate (very good sensitivity with a very low rate of false positives).

FusionCatcher has been used for finding novel and known fusion genes in the following articles:

  • S. Kangaspeska, S. Hultsch, H. Edgren, D. Nicorici, A. Murumägi, O.P. Kallioniemi, Reanalysis of RNA-sequencing data reveals several additional fusion genes with multiple isoforms, PLOS One, Oct. 2012. http://dx.plos.org/10.1371/journal.pone.0048745
  • H. Edgren, A. Murumagi, S. Kangaspeska, D. Nicorici, V. Hongisto, K. Kleivi, I.H. Rye, S. Nyberg, M. Wolf, A.L. Borresen-Dale, O.P. Kallioniemi, Identification of fusion genes in breast cancer by paired-end RNA-sequencing, Genome Biology, Vol. 12, Jan. 2011. http://genomebiology.com/2011/12/1/R6
  • JN. Honeyman, EP. Simon, N. Robine, R. Chiaroni-Clarke, DG. Darcy, I. Isabel, P. Lim, CE. Gleason, JM. Murphy, BR. Rosenberg, L. Teegan, CN. Takacs, S. Botero, R. Belote, S. Germer, A-K. Emde, V. Vacic, U. Bhanot, MP. LaQuaglia, and S.M. Simon, Detection of a Recurrent DNAJB1-PRKACA Chimeric Transcript in Fibrolamellar Hepatocellular Carcinoma, Science 343 (6174), Feb. 2014, pp. 1010-1014, DOI:10.1126/science.1249484, http://www.sciencemag.org/content/343/6174/1010.abstract

FusionCatcher has also found the novel fusion genes TAF5L-C1ORF95, PPP1R13L-ZNF541, and SACS-SGCG in the U87MG human glioblastoma cell line using the publicly available RNA-seq datasets from the following articles:

FusionCatcher has also found the novel fusion genes TIAM1-ATP5O, [http://genomebiology.com/2013/14/2/R12/abstract CIRH1A-TMCO7], and [http://genomebiology.com/2013/14/2/R12/abstract PSMD8-SIPA1L3] in both the [http://www.ncbi.nlm.nih.gov/sra/SRX148575 T24] and [http://www.ncbi.nlm.nih.gov/sra/SRX148574 5637] bladder cancer cell lines using the publicly available RNA-seq dataset [http://www.ncbi.nlm.nih.gov/sra?term=SRA052960 SRA052960]. FusionCatcher correctly identifies the BDKRB2-BDKRB1 candidate fusion gene as a false positive event in both cell lines (for more information, see the [http://www.ensembl.org/Homo_sapiens/Gene/Summary?g=ENSG00000258691;r=14:96671181-96730266 RP11-404P21.8] gene).


FusionCatcher has found novel FGFR2 fusion genes in SNU-16 and KATOIII gastric cancer cell lines!


FusionCatcher has found the novel fusion genes FGFR2-PPAPDC1A, PVT1-SLC1A2, CD44-SLC1A2, CD44-PDHX, PVT1-PDHX, PVT1-APIP, FABP2-C4ORF3, CD44-FGFR2, FGFR2-MYC, FGFR2-PDHX, and C6ORF47-BAG6 in the gastric cancer [http://www.ncbi.nlm.nih.gov/sra/SRX181248 SNU-16] cell line. More information is available [fusionsSNU16 here]. (D. Nicorici, Novel FGFR2 fusion genes in SNU-16 gastric cancer cell line, Figshare, Nov. 2013, http://dx.doi.org/10.6084/m9.figshare.856657 )


FusionCatcher has found the novel fusion genes FGFR2-ULK4, CTNNB1-FGFR2, CTNNB1-ULK4, FGFR2-CEACAM5, GCNT3-FGFR2, PAFAH1B2-SIK3, FOXA2-LINC00261, UBA2-WTIP, AC156455.1-MLXIP, ANAPC15-LAMTOR1, and SNX19-FGFR2 in the gastric cancer [http://www.ncbi.nlm.nih.gov/sra/SRX181269 KATOIII] cell line. More information is available [fusionsKATOIII here]. (D. Nicorici, Novel FGFR2 fusion genes in KATOIII gastric cancer cell line, Figshare, Nov. 2013, http://dx.doi.org/10.6084/m9.figshare.856658 )


FusionCatcher has found the novel fusion genes CHCHD10-VPREB3, CANX-SQSTM1, KIFC3-LRRC36, TRIP12-DNER, ECEL1-CITED4, SREBF1-NUP210, KLHDC4-SLC7A5, P2RY6-ARHGEF17, HNRNPUL2-C11ORF49, NYAP2-AC016717.1, OBSL1-CHPF, PHGDH-ATAD3A, NCKIPSD-CELSR3, NCL-CALR, B3GAT3-GANAB, and CPPED1-COQ7 in the [http://en.wikipedia.org/wiki/HeLa HeLa] cell line. More information is available [fusionsHeLa here]. (D. Nicorici, Fusion genes in HeLa cervical cancer cell line, Figshare, Nov. 2013, http://dx.doi.org/10.6084/m9.figshare.856664 )

Manual/Help

More detailed information about FusionCatcher can be found in the [Manual].


Installation

wget http://sourceforge.net/projects/fusioncatcher/files/bootstrap.py && python bootstrap.py --download

or, using your local/favourite Python version (which FusionCatcher will also use later):

wget http://sourceforge.net/projects/fusioncatcher/files/bootstrap.py && /your/favourite/python bootstrap.py

To install FusionCatcher in /some/directory/fusioncatcher/, run:

wget http://sourceforge.net/projects/fusioncatcher/files/bootstrap.py
/your/favourite/python bootstrap.py --prefix=/some/directory/

Usage Example

Searching for fusion genes:

fusioncatcher \
-d /some/human/data/directory/ \
-i /some/input/directory/containing/fastq/files/ \
-o /some/output/directory/

where:

  • /some/human/data/directory/ - contains the build data generated by fusioncatcher-build (see [Manual#5.2_-_Downloading/building_organism's_data Download data] section for more information); the human build data can be downloaded directly from [https://mega.co.nz/#!SEFSFDpT!fYKp5Dk5hLK9CyQKtJPcxDZBb_Vw32Rv4xk2FMWp3BU here]

  • /some/input/directory/containing/fastq/files/ - contains the input FASTQ files (or SRA files, if the NCBI SRA Toolkit is installed)

  • /some/output/directory/ - contains output files:

    • final-list_candidate_fusion_genes.txt - final list of the newly found candidate fusion genes (it contains the fusion genes with their junction sequences and points); see [Manual#6.2_-_Output_data Table 1] for column descriptions;
    • preliminary-list_candidate_fusion_genes.txt - preliminary list of candidate fusion genes, from which the final list is derived (it contains the candidate fusion genes without their junction sequences and points); see [Manual#6.2_-_Output_data Table 2] for column descriptions;
    • candidate_fusion_genes_supporting-reads_BOWTIE.zip - sequences of the short reads supporting the newly found candidate fusion genes that were found using only the Bowtie aligner;
    • candidate_fusion_genes_supporting-reads_BLAT.zip - sequences of the short reads supporting the newly found candidate fusion genes that were found using the Bowtie and Blat aligners;
    • info.txt - information regarding genome version, Ensembl database version, versions of tools used, read counts, etc.;
    • fusioncatcher.log - log of the entire run (e.g. all commands/programs which have been run, command line arguments used, running time for each command, etc.).
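
For scripted runs, the command and output files described above can be wrapped in a few helper functions. The following is only a minimal Python sketch, not part of FusionCatcher: the function names (build_command, missing_outputs, read_final_list) are hypothetical, only the -d, -i and -o options shown above are used, and the final list is assumed to be a tab-delimited text file with a header row (see the Manual for the authoritative column descriptions).

```python
import csv
from pathlib import Path

# Output file names exactly as listed in the usage example above.
EXPECTED_OUTPUTS = [
    "final-list_candidate_fusion_genes.txt",
    "preliminary-list_candidate_fusion_genes.txt",
    "candidate_fusion_genes_supporting-reads_BOWTIE.zip",
    "candidate_fusion_genes_supporting-reads_BLAT.zip",
    "info.txt",
    "fusioncatcher.log",
]


def build_command(data_dir, input_dir, output_dir):
    """Return the fusioncatcher argument list (hypothetical helper).

    Only the -d, -i and -o options from the usage example are used.
    """
    return [
        "fusioncatcher",
        "-d", str(data_dir),
        "-i", str(input_dir),
        "-o", str(output_dir),
    ]


def missing_outputs(output_dir):
    """List the documented output files that are absent from output_dir."""
    out = Path(output_dir)
    return [name for name in EXPECTED_OUTPUTS if not (out / name).is_file()]


def read_final_list(output_dir):
    """Read the final candidate list as a list of dicts, one per row.

    Assumption: the file is tab-delimited and its first line is a header.
    No column names are hard-coded here.
    """
    path = Path(output_dir) / "final-list_candidate_fusion_genes.txt"
    with path.open(newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

A run could then be started with subprocess.run(build_command(data_dir, input_dir, output_dir), check=True), after which missing_outputs(output_dir) reports any expected file that was not produced and read_final_list(output_dir) loads the candidates for further filtering.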

