dupRadar Code
Quality control for PCR artefacts in RNA-Seq data
Brought to you by: holgerklein, nkreim
| File | Date | Author | Commit |
|---|---|---|---|
| dupRadar | 2015-10-29 | | [e18dba] first PE approximation: Rsubread doesnt drop so... |
| README.txt | 2014-12-02 | | [13bbc7] Removed java counting tool. |
######################
## ##
## dupRadar package ##
## ##
######################
License: GPL v3
R package dupRadar
(c) Holger Klein
Institute for Molecular Biology gGmbH, Mainz and
Boehringer Ingelheim Pharma GmbH & Co. KG, Target Discovery Research
holger.klein@gmail.com
and
Sergi Sayols Puig
Institute for Molecular Biology gGmbH, Mainz
s.sayolspuig@imb-mainz.de
https://sourceforge.net/projects/dupradar/
Description
For RNA-Seq data, global measures of read duplication in an NGS data set
are misleading. In this kind of data, duplicate reads can arise for
technical reasons, such as PCR artefacts, as well as for biological
reasons, such as read duplication due to oversequencing. Global
duplication rates are often larger than 60 or 70% even in RNA-Seq data
sets that do not suffer from PCR artefacts.
Hence we developed dupRadar, a QC tool for RNA-Seq that calculates
per-gene duplication rates and analyzes how the duplication rate depends
on gene expression.
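To illustrate the idea of a per-gene duplication rate, here is a minimal
Python sketch. It is not dupRadar's implementation (the package itself is
written in R); the gene names, the counts and the helper
per_gene_duprates() are made up for illustration. The sketch assumes that
the total read count, the number of reads flagged as duplicates and the
gene length are already known for every gene.

```python
def per_gene_duprates(counts):
    """Return (gene, reads per kilobase, duplication rate), sorted by expression.

    counts maps a gene name to (all_reads, duplicate_reads, gene_length_bp).
    This is a hypothetical helper for illustration, not a dupRadar function.
    """
    rows = []
    for gene, (all_reads, dup_reads, length_bp) in counts.items():
        if all_reads == 0:
            continue  # no reads: the duplication rate is undefined
        dup_rate = dup_reads / all_reads          # fraction of duplicate reads
        rpk = all_reads / (length_bp / 1000.0)    # simple expression proxy
        rows.append((gene, rpk, dup_rate))
    return sorted(rows, key=lambda row: row[1])


# Made-up toy data: (all reads, duplicate reads, gene length in bp).
toy = {
    "geneA": (10, 0, 2000),
    "geneB": (200, 20, 1000),
    "geneC": (50000, 40000, 2500),
    "geneD": (0, 0, 1500),
}

# One global number for the whole data set.
total_reads = sum(c[0] for c in toy.values())
total_dups = sum(c[1] for c in toy.values())
global_rate = total_dups / total_reads
print(f"global dupRate={global_rate:.2f}")

# One number per gene, ordered from low to high expression.
for gene, rpk, rate in per_gene_duprates(toy):
    print(f"{gene}\tRPK={rpk:.1f}\tdupRate={rate:.2f}")
```

In this toy data the global rate is dominated by the single highly
expressed gene, so it says little about the duplication rate of the
other genes.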
The dupRadar package is written in R and will be submitted to the
Bioconductor project. Internally it uses featureCounts() from the Rsubread
package. The package provides functions for plotting and analyzing
duplication rates as a function of expression level.
It requires a duplicate-marked BAM file and a reference gene
annotation in GTF format.
Duplicate reads can be marked with Picard's MarkDuplicates, bamUtil's dedup,
or biobambam's bammarkduplicates. The latter two are faster than the Picard
implementation.
Install the dupRadar R package with (x.y.z stands for the package version):
R CMD INSTALL dupRadar_x.y.z.tar.gz