OLego -- short or long RNA-seq read mapping to discover exon junction
Jie Wu (wuj@cshl.edu), Chaolin Zhang (cz2294@columbia.edu)
Please find the most recent documentation at http://zhanglab.c2b2.columbia.edu/index.php/OLego_Documentation
What is OLego?
======================
OLego is a program specifically designed for de novo spliced mapping of mRNA-seq reads. OLego adopts a seed-and-extend scheme, and does not rely on a separate external mapper. It achieves high sensitivity of junction detection by strategic searches with very small seeds (12-14 nt), efficiently mapped using Burrows-Wheeler transform (BWT) and FM-index. This also makes it particularly sensitive for discovering small exons. It is implemented in C++ with full support of multiple threading, to allow for fast processing of large-scale data.
OLego is an open source code project and released under GPLv3. The implementation of OLego relies heavily on BWA (version 0.5.9rc1, http://bio-bwa.sourceforge.net/).
Citation
======================
Wu,J., Anczukow,O., Krainer,A.R., Zhang,M.Q. , Zhang,C. , 2013. OLego: Fast and sensitive mapping of spliced mRNA-Seq reads using small seeds. Nucleic Acids Res. , In press.
Versions
======================
v1.1.5 ( 7-14-2014 )
---------------------
* Bugs fixed
* Optmization for speed
v1.1.2 ( 7-1-2013 )
---------------------
* Sensitivity improved in small exons and single anchor search.by allowing mismatch.
* Allows overlapping seeds to improve speed and seeding flexibility.
* Increase default seed size to 15 (max 1 nt overlapping ) to keep both senstivity and speed.
* A bug fixed (crashes when using -M 0 )
v1.1.1 ( 4-14-2013 )
---------------------
* Improved speed by filtering simple repetitive anchors.
* Default options optimized.
v1.1.0 ( 3-31-2013 )
---------------------
* Bug fixed for duplicate entries for some reads, sensitivity improved.
* Optimized on option -W.
* Bug fixed in sam2bed.pl.
* Bug fixed for option -e.
* Bug fixed for regression_model_gen.
* Add support for gzip input file for sam2bed.pl.
v1.0.8 ( 11-20-2012 )
---------------------
* Improvement in hit clustering.
* Fixed an overcounting problem in mismatch counting.
* Fixed bug in merging step.
* Fixed bug in XS tag for extra exon body reads.
* Allows pipe input/output with "-" for some of the scripts.
v1.0.6 ( 08-09-2012 )
---------------------
* Added option --max-multi (default:1000) to avoid huge data in a single line.
* Added option --num-reads-batch.
* Fixed a bug in the junction connecting step.
v1.0.5 ( 07-16-2012 )
---------------------
* Minor bug fixed (the old code crashes in a very rare case).
v1.0.4 ( 06-12-2012 )
---------------------
* Option changes ( do single-anchor search by default now ).
v1.0.3 ( 06-10-2012 )
---------------------
* Now supports strand specific library
* Fixed bugs about XS
v1.0.0 ( 05-15-2012 )
----------------------
* The initial Public release