
#6 geneBody_coverage graphs misleading

2.0
open
nobody
None
2021-06-01
2021-06-01
No

In geneBody_coverage.py, the algorithm for coverage appears to scale each sample so that the minimum coverage is 0 and the maximum is 1.0, which (as far as I can tell) happens in this line:

dataset.append((name, [(i -min(dat))/(max(dat) - min(dat)) for i in dat], skewness))    

While there isn't any issue with this approach on its own, the y-axis of the line graphs, 'coverage', likewise ranges from 0.0 to 1.0. This has the unfortunate effect of implying that '0 coverage' on these graphs means there were no reads in that region, when in reality it is simply the sample's minimum. If you have a sample where the minimum coverage was 30%, 80%, or 99% of the maximum, then that 30%, 80%, or 99% point will still be graphed as '0 coverage'.
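To make the effect concrete, here is a minimal sketch that applies the same min-max formula as the line quoted above to made-up numbers. The function name `min_max_scale` and the values in `raw` are hypothetical and are not taken from geneBody_coverage.py.

```python
def min_max_scale(dat):
    """Rescale values so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(dat), max(dat)
    return [(i - lo) / (hi - lo) for i in dat]


# Hypothetical per-position read counts: the minimum is 80% of the maximum.
raw = [80, 90, 100, 95, 85]
scaled = min_max_scale(raw)

print(min(raw) / max(raw))  # true minimum relative to the maximum
print(scaled)               # the point holding 80 reads is plotted at 0.0
```

In this example every position has at least 80 reads, yet the lowest one lands on the bottom of the axis exactly as a position with no reads would.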

As it stands, the current output is confusing: it wasn't clear to me what was being plotted until I looked at the code directly. Perhaps the y-axis and/or scale could be relabeled to more accurately reflect what is being graphed.
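One possible alternative, offered only as a sketch and not as what the script does today, would be to divide by the maximum alone, so that 0.0 on the axis really does mean no reads. The function name `max_scale`, the sample values, and the suggested axis label are all hypothetical.

```python
def max_scale(dat):
    """Express each value as a fraction of the sample's maximum."""
    hi = max(dat)
    return [i / hi for i in dat]


# Same hypothetical read counts as a sample whose minimum is 80% of its maximum.
raw = [80, 90, 100, 95, 85]
relative = max_scale(raw)

# A label that says what is actually plotted (hypothetical wording).
y_label = "Coverage (fraction of sample maximum)"

print(y_label)
print(relative)  # the lowest point now sits at 0.8 instead of 0.0
```

If the min-max scaling is kept instead, simply relabeling the axis (for example, to say the values are min-max normalized per sample) would already avoid the misreading.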
