Name | Modified | Size | Downloads / Week |
---|---|---|---|
USAGE | 2014-03-31 | 285 Bytes | |
README | 2014-03-31 | 2.7 kB | |
NEWS | 2014-03-31 | 402 Bytes | |
FineSplice.py | 2014-03-31 | 15.9 kB | |
Totals: 4 Items | 19.2 kB | 0 |

FineSplice, a Python wrapper to TopHat2 for enhanced splice junction detection and quantification from RNA-Seq data

USAGE

  python FineSplice.py -i <filename>.bam -l <read length>

  -i <filename>.bam    (path to TopHat2 BAM file)
  -l <read length>     (read length)

  e.g. python FineSplice.py -i example.bam -l 50

PREREQUISITES

  TopHat2 alignment in BAM format
    Align with reference transcript annotations (i.e. -G/--GTF option)

  Python 2.x (version >= 2.6) with the following modules installed:
    pysam (version tested 0.7.4)
    scikit-learn (version tested 0.13.1)
    numpy (version tested 1.7.1, required by scikit-learn)
    scipy (version tested 0.7.2, required by scikit-learn)

  Get the latest versions at:
    http://code.google.com/p/pysam/ (pysam)
    http://scikit-learn.org/ (scikit-learn)

OUTPUT FORMAT

  FineSplice produces two output files:
    <filename>.accepted.junc
    <filename>.discarded.junc

  Each file is a tab-separated list of splice junctions with the following fields:
    1. SN (reference sequence name, i.e. chromosome)
    2. start (junction start genomic coordinate, 0-based)
    3. end (junction end genomic coordinate, 0-based)
    4. prob (posterior probability, i.e. junction reliability)
    5. unique (# uniquely mapping reads)
  [ 6. rescued (# multiple mapping reads rescued after filtering) ]

DESCRIPTION

  FineSplice is a post-processing tool for the reliable identification of expressed exon junctions from TopHat2 alignments. It enhances detection precision at a small loss in sensitivity.

  Following alignment with TopHat2 (using known transcript annotations), FineSplice takes the resulting BAM file as input and outputs a confident set of expressed junctions with the corresponding read counts.

  Potential false positives arising from spurious alignments are filtered out through a semi-supervised anomaly detection strategy based on logistic regression. Multiple mapping reads with a unique hit after filtering are rescued and reallocated to the most reliable candidate location.
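
The fields listed under OUTPUT FORMAT can be read back with a few lines of Python. The following is a minimal sketch and not part of FineSplice itself: the names `Junction`, `read_junctions` and `total_reads` are hypothetical, and it assumes the .junc files carry no header line, that the count columns are integers, and that a missing sixth column means zero rescued reads.

```python
import io
from collections import namedtuple

# Hypothetical record type mirroring the documented column order.
Junction = namedtuple("Junction", "sn start end prob unique rescued")


def read_junctions(handle):
    """Yield one Junction per non-empty line of a tab-separated .junc file."""
    for line in handle:
        line = line.rstrip("\r\n")
        if not line:
            continue
        fields = line.split("\t")
        # Assumption: the optional sixth column (rescued) defaults to 0.
        rescued = int(fields[5]) if len(fields) > 5 else 0
        yield Junction(fields[0], int(fields[1]), int(fields[2]),
                       float(fields[3]), int(fields[4]), rescued)


def total_reads(junction):
    """Unique plus rescued read count for one junction."""
    return junction.unique + junction.rescued


if __name__ == "__main__":
    # Made-up rows for illustration only, not real FineSplice output.
    sample = io.StringIO(u"chr1\t100\t200\t0.98\t12\t3\nchr2\t500\t900\t0.75\t4\n")
    for junction in read_junctions(sample):
        print(junction.sn, junction.start, junction.end, total_reads(junction))
```

With a real file, the same function can be applied to an open handle, e.g. `open("example.accepted.junc")`, where the file name follows the `<filename>.accepted.junc` pattern above.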
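
The rescue step in the DESCRIPTION can be illustrated with a small sketch. This is one possible reading of "multiple mapping reads with a unique hit after filtering", not the code in FineSplice.py: the function `rescue_multireads`, its arguments, and the representation of a junction as an `(SN, start, end)` tuple are all assumptions made for illustration.

```python
from collections import Counter


def rescue_multireads(multiread_hits, accepted):
    """Count multi-mapped reads left with exactly one accepted junction.

    multiread_hits: dict mapping a read name to its candidate junctions
    accepted: set of junctions retained after filtering
    Both use hypothetical (SN, start, end) tuples as junction keys.
    """
    rescued = Counter()
    for candidates in multiread_hits.values():
        surviving = set(candidates) & accepted
        # A read is rescued only if filtering leaves a single candidate.
        if len(surviving) == 1:
            rescued[surviving.pop()] += 1
    return rescued


if __name__ == "__main__":
    accepted = {("chr1", 100, 200), ("chr1", 300, 400)}
    hits = {
        "read_a": [("chr1", 100, 200), ("chr2", 5, 9)],      # one hit survives
        "read_b": [("chr1", 100, 200), ("chr1", 300, 400)],  # still ambiguous
    }
    print(rescue_multireads(hits, accepted))
```

Reads that remain ambiguous, or that lose all their candidates, are simply not counted in this sketch. How FineSplice treats them is not stated in this README.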

REFERENCE

  For further details check out our paper:

  Gatto, A. et al. (2014) FineSplice, enhanced splice junction detection and quantification: a novel pipeline based on the assessment of diverse RNA-Seq alignment solutions.

  If you use FineSplice in scientific publications, please cite our paper.

Get the latest version of FineSplice at: https://sourceforge.net/projects/finesplice/