dupRadar Code
Quality control for PCR artefacts in RNA-Seq data
Brought to you by: holgerklein, nkreim
File | Date | Commit |
---|---|---|
dupRadar | 2015-10-29 | [e18dba] first PE approximation: Rsubread doesnt drop so... |
README.txt | 2014-12-02 | [13bbc7] Removed java counting tool. |
######################
##                  ##
## dupRadar package ##
##                  ##
######################

License: GPL v3

R package dupRadar (c)

Holger Klein
Institute for Molecular Biology gGmbH, Mainz
and
Boehringer Ingelheim Pharma GmbH & Co. KG, Target Discovery Research
holger.klein@gmail.com

and

Sergi Sayols Puig
Institute for Molecular Biology gGmbH, Mainz
s.sayolspuig@imb-mainz.de

https://sourceforge.net/projects/dupradar/


Description

For RNA-Seq data, global measures of read duplication in an NGS data set
are misleading. In this kind of data, duplicate reads arise both from
technical causes, such as PCR artefacts, and from biological causes, such
as oversequencing of highly expressed genes. Global duplication rates for
RNA-Seq data are often larger than 60 or 70%, even in data sets that do
not suffer from PCR artefacts.

Hence we developed dupRadar, a QC tool for RNA-Seq that calculates
per-gene duplication rates and analyzes the dependence between
duplication rate and gene expression.

The dupRadar package is written in R and will be submitted to the
Bioconductor project. Internally it uses featureCounts() from the
Rsubread package. The package provides functions for plotting and
analyzing the duplication rates as a function of expression level.

It requires a duplicate-marked BAM file and a reference gene annotation
in GTF format. Duplicate reads can be marked with Picard's
MarkDuplicates, bamUtil's dedup, or biobambam's bammarkduplicates. The
latter two are faster than the Picard implementation.

Install the dupRadar R package using

    R CMD INSTALL dupRadar_x.y.z.tar.gz
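
Illustration

The Description contrasts a single global duplication rate with
per-gene rates. The Python sketch below is only meant to make that
contrast concrete; it is not dupRadar's implementation (which is written
in R and counts reads with featureCounts()). The GeneCounts record, the
function names, and the three example genes are hypothetical, and the
counts are made-up numbers chosen for illustration, not results from a
real data set. Expression is approximated here as reads per kilobase of
gene length, which is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GeneCounts:
    """Hypothetical per-gene summary taken from a duplicate-marked BAM file."""
    gene_id: str
    length_bp: int   # gene length in base pairs
    all_reads: int   # all reads assigned to the gene
    dup_reads: int   # subset of those reads flagged as duplicates


def duplication_rate(gene: GeneCounts) -> float:
    """Fraction of a gene's reads that are flagged as duplicates."""
    if gene.all_reads == 0:
        return float("nan")
    return gene.dup_reads / gene.all_reads


def reads_per_kb(gene: GeneCounts) -> float:
    """Simple expression proxy: reads per kilobase of gene length."""
    return gene.all_reads / (gene.length_bp / 1000)


def global_duplication_rate(genes: List[GeneCounts]) -> float:
    """Duplication rate pooled over all genes (the 'global' measure)."""
    total = sum(g.all_reads for g in genes)
    dups = sum(g.dup_reads for g in genes)
    if total == 0:
        return float("nan")
    return dups / total


# Made-up example: one highly expressed gene dominates the read pool.
genes = [
    GeneCounts("geneA", length_bp=2000, all_reads=100, dup_reads=5),
    GeneCounts("geneB", length_bp=1000, all_reads=1000, dup_reads=200),
    GeneCounts("geneC", length_bp=500, all_reads=100000, dup_reads=80000),
]

if __name__ == "__main__":
    print(f"global duplication rate: {global_duplication_rate(genes):.3f}")
    for g in genes:
        print(f"{g.gene_id}: reads/kb={reads_per_kb(g):.1f} "
              f"duplication rate={duplication_rate(g):.3f}")
```

In this toy input the pooled rate is driven almost entirely by the one
highly expressed gene, while the two weakly expressed genes have low
per-gene rates. That is the pattern a per-gene view can show and a
single global number hides.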