Thread: [Useq-users] base-by-base read coverage stats


Hi David,

I have another question.  For the NovoalignParser (and other parsers, such as Tag2Point), the documentation says that it converts to "center position binary point data", but to get a base-by-base read coverage, we would need start and stop positions for each read, especially when reads have different sizes.  Has the format of the bar file been updated to include start and stop or is it still center-based?  If not, do you have a recommended method for getting base-by-base read coverage?
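For reference, here is a rough sketch of what base-by-base coverage from start and stop positions could look like. It is not USeq code and says nothing about how the bar format stores data; the function name `per_base_coverage` and the 0-based, stop-exclusive coordinate convention are assumptions made only for illustration. It uses a difference array so that reads of different lengths are each counted over their full span.

```python
from typing import Iterable, List, Tuple


def per_base_coverage(reads: Iterable[Tuple[int, int]], chrom_length: int) -> List[int]:
    """Return the read depth at every base of one chromosome.

    reads: (start, stop) pairs, 0-based with stop exclusive (assumed convention).
    """
    # diff[i] holds the change in depth when stepping onto base i.
    diff = [0] * (chrom_length + 1)
    for start, stop in reads:
        # Clip reads that run past the chromosome ends.
        start = max(start, 0)
        stop = min(stop, chrom_length)
        if start >= stop:
            continue
        diff[start] += 1
        diff[stop] -= 1

    # A running sum of the changes gives the depth at each base.
    depth = []
    running = 0
    for delta in diff[:chrom_length]:
        running += delta
        depth.append(running)
    return depth


# Three reads of different sizes on a toy 8 bp chromosome.
reads = [(0, 4), (2, 7), (5, 6)]
print(per_base_coverage(reads, 8))  # [1, 1, 2, 2, 1, 2, 1, 0]
```

With center-position point data only the midpoint of each read would be incremented, which is why start and stop are needed when read lengths vary.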

Thanks,

Andrew

Brad Cairns Lab
Huntsman Cancer Institute Room #4350
University of Utah
2000 Circle of Hope
Salt Lake City, UT 84112
(801) 585-1823

From: David Nix <Dav...@hc...>
Date: Tue, 15 Mar 2011 10:36:01 -0600
To: Andrew Oler <and...@hc...>
Subject: Re: Novo bis parser

It means exactly what it says.  Where there is an overlap in a pair of reads from the same template it calls a consensus so that you don’t double count the same base pairs.

Yes, 0.41 is a lot of overlap.  The library insert size was too small, so you're effectively cutting your data output by 40%.  This should be < 5%.

-cheers, D

On 3/14/11 10:15 AM, "Andrew Oler" <And...@hc...> wrote:

Hi David,

I'm running NBP and I was wondering if you could explain the stats at the end.

From the app description, what does this mean, exactly?  "Flattens overlapping reads in a pair to call consensus bps."  Does that mean to call consensus as to whether converted or not?  If overlapping, then it should have a higher accuracy, right?

I got these stats at the end.

2353965845 BPs overlapping paired sequence
5645468238 BPs paired sequence
0.417 Fraction overlapping bps from paired reads.
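As a quick check on these numbers, the reported fraction appears to be the overlapping bps divided by the total paired bps. That reading is an assumption based on the three lines above, not something taken from the NBP source; the variable names are made up for the example.

```python
# Values copied from the NBP summary above.
overlapping_bps = 2353965845
paired_bps = 5645468238

# Assumed definition: overlapping bps as a share of all paired bps.
fraction = overlapping_bps / paired_bps
print(f"{fraction:.3f}")  # 0.417
```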

This means that I had a lot of overlapping reads, right?   What do you usually get for this number?

Is there a way to get coverage statistics, e.g., what fraction of the genome is covered at least 5-fold?  Or coverage tracks?
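To make the question concrete, this is the kind of statistic meant: given a per-base depth array for a region, the fraction of positions at or above some fold. The sketch below is only an illustration with a hypothetical function name and toy depths; it is not an existing USeq application.

```python
from typing import Sequence


def fraction_covered(depth: Sequence[int], min_fold: int = 5) -> float:
    """Fraction of positions whose read depth is at least min_fold."""
    if not depth:
        return 0.0
    covered = sum(1 for d in depth if d >= min_fold)
    return covered / len(depth)


# Toy per-base depths for a 10 bp region.
depth = [0, 3, 5, 8, 12, 4, 5, 0, 6, 7]
print(fraction_covered(depth, 5))  # 0.6
```

For a whole genome the same count would be accumulated chromosome by chromosome and divided by the total number of bases considered.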

Thanks,

Andrew

Brad Cairns Lab
Huntsman Cancer Institute Room #4350
University of Utah
2000 Circle of Hope
Salt Lake City, UT 84112
(801) 585-1823
