Massively parallel sequencing of cDNA reverse transcribed from RNA (RNA-seq) provides an accurate estimate of the quantity and composition of mRNAs. To characterize the transcriptome through the analysis of RNA-seq data, we developed PRADA. PRADA focuses on the processing and analysis of gene expression estimates, supervised and unsupervised gene fusion identification, and supervised intragenic deletion identification.
PRADA currently provides seven modules to process RNA-seq data and identify abnormalities:
preprocess: Generates aligned and recalibrated BAM files.
expression: Generates gene expression (RPKM) and quality metrics.
fusion: Identifies candidate gene fusions.
guess-ft: Supervised search for fusion transcripts.
guess-if: Supervised search for intragenic fusions.
homology: Calculates homology between two given genes.
frame: Predicts the functional consequence of a fusion transcript.
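The expression module reports RPKM (reads per kilobase of transcript per million mapped reads). As a rough illustration of what that unit means, the standard definition can be sketched in Python as below. This is not PRADA's own code; the function name `rpkm` and its arguments are hypothetical, and the numbers are made up for the example.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Standard RPKM definition: reads per kilobase per million mapped reads.

    Hypothetical helper for illustration only; not part of PRADA.
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Made-up numbers: 500 reads on a 2,000 bp gene, 20 million mapped reads.
example = rpkm(500, 2000, 20_000_000)
print(example)  # 12.5
```

Normalizing by both gene length and library size is what makes values comparable across genes and across samples.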
Features
- PRADA is written in the Python programming language and is intended to run in a command-line environment on UNIX or Linux operating systems.
- A detailed description of the installation steps and the usage of each module, with examples, is available in the documentation at https://sourceforge.net/p/prada/wiki/Home/attachment/pyPRADA.pdf
- The hg19 reference files are available for download at http://bioinformatics.mdanderson.org/Software/PRADA/
- To remove fusion artifacts, we filter out candidates in which a gene has multiple fusion partners in the same sample, as well as candidates whose partner genes are homologous to each other.
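
The multiple-partner part of this filter can be sketched as follows. This is a minimal illustration under stated assumptions, not PRADA's implementation: the function name `filter_promiscuous`, the `(sample, gene_a, gene_b)` tuple layout, and the gene names are all hypothetical, and the homology check is left out because its criteria are not described here.

```python
from collections import defaultdict


def filter_promiscuous(candidates):
    """Drop candidates in which either gene has more than one partner in its sample.

    candidates: iterable of (sample, gene_a, gene_b) tuples (hypothetical layout).
    """
    candidates = list(candidates)
    partners = defaultdict(set)  # (sample, gene) -> set of partner genes
    for sample, gene_a, gene_b in candidates:
        partners[(sample, gene_a)].add(gene_b)
        partners[(sample, gene_b)].add(gene_a)
    return [
        (sample, gene_a, gene_b)
        for sample, gene_a, gene_b in candidates
        if len(partners[(sample, gene_a)]) == 1
        and len(partners[(sample, gene_b)]) == 1
    ]


# Placeholder calls: GENE_C has two partners in sample S1, so both are dropped.
calls = [
    ("S1", "GENE_A", "GENE_B"),
    ("S1", "GENE_C", "GENE_D"),
    ("S1", "GENE_C", "GENE_E"),
    ("S2", "GENE_C", "GENE_D"),
]
kept = filter_promiscuous(calls)
print(kept)  # [('S1', 'GENE_A', 'GENE_B'), ('S2', 'GENE_C', 'GENE_D')]
```

Partners are counted per sample, so a gene pair that is discarded in one sample can still be kept in another.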