From: Andrew O. <And...@hc...> - 2011-03-15 19:29:31
Hi David,

I have another question. For the NovoalignParser (and other parsers, such as Tag2Point), the documentation says that it converts to "center position binary point data", but to get a base-by-base read coverage, we would need start and stop positions for each read, especially when reads have different sizes. Has the format of the bar file been updated to include start and stop, or is it still center-based? If not, do you have a recommended method for getting base-by-base read coverage?

Thanks,

Andrew

Brad Cairns Lab
Huntsman Cancer Institute Room #4350
University of Utah
2000 Circle of Hope
Salt Lake City, UT 84112
(801) 585-1823

From: David Nix <Dav...@hc...>
Date: Tue, 15 Mar 2011 10:36:01 -0600
To: Andrew Oler <and...@hc...>
Subject: Re: Novo bis parser

It means exactly what it says. Where there is an overlap in a pair of reads from the same template, it calls a consensus so that you don't double count the same base pairs.

Yes, 0.41 is a lot of overlap. The library insert size was too small, and thus you're effectively cutting your data output by 40%. This should be < 5%.

-cheers, D

On 3/14/11 10:15 AM, "Andrew Oler" <And...@hc...> wrote:

Hi David,

I'm running NBP and I was wondering if you could explain the stats at the end.

From the app description, what does this mean, exactly? "Flattens overlapping reads in a pair to call consensus bps." Does that mean to call consensus as to whether converted or not? If overlapping, then it should have a higher accuracy, right?

I got these stats at the end.

2353965845 BPs overlapping paired sequence
5645468238 BPs paired sequence
0.417 Fraction overlapping bps from paired reads.

This means that I had a lot of overlapping reads, right? What do you usually get for this number?

Is there a way to get coverage statistics, e.g., what fraction of the genome is covered at least 5-fold? Or coverage tracks?

Thanks,

Andrew
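For reference, the 0.417 in the NBP stats quoted above is simply the ratio of the two base-pair counts. A quick check in Python (the function name is made up for illustration; this is not part of USeq):

```python
def overlap_fraction(overlapping_bps: int, paired_bps: int) -> float:
    """Fraction of paired-read base pairs that fall in the overlap of a pair.

    Hypothetical helper for illustration only; it just divides the two
    counts reported at the end of the NBP run.
    """
    if paired_bps <= 0:
        raise ValueError("paired_bps must be positive")
    return overlapping_bps / paired_bps


if __name__ == "__main__":
    # Counts copied from the stats in the message above.
    fraction = overlap_fraction(2_353_965_845, 5_645_468_238)
    print(f"{fraction:.3f}")  # 0.417
```

Read this way, roughly 42% of the sequenced base pairs in pairs were covered by both mates, which is the "cutting your data output by 40%" David mentions.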
From: David N. <dav...@gm...> - 2011-03-15 21:40:52
Your read sizes shouldn't change, thus it's easy to get the start and stop for each read when you know the center position. The ReadCoverage app does just this and calculates a per-base read coverage. It throws a warning if it finds reads of different lengths.

-cheers, D

On 3/15/11 1:14 PM, "Andrew Oler" <And...@hc...> wrote:

> Hi David,
>
> I have another question. For the NovoalignParser (and other parsers, such as
> Tag2Point), the documentation says that it converts to "center position binary
> point data", but to get a base-by-base read coverage, we would need start and
> stop positions for each read, especially when reads have different sizes. Has
> the format of the bar file been updated to include start and stop or is it
> still center-based? If not, do you have a recommended method for getting
> base-by-base read coverage?
>
> Thanks,
>
> Andrew

_______________________________________________
Useq-users mailing list
Use...@li...
https://lists.sourceforge.net/lists/listinfo/useq-users
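To illustrate the idea in the reply above (recovering start and stop from a center position when all reads share one length, then counting coverage per base), here is a minimal Python sketch. It is not the ReadCoverage implementation. The function names, the 0-based half-open coordinates, and the convention that the center is `start + read_length // 2` are all assumptions made for the example; check how your parser actually defines the center before relying on it.

```python
from typing import Iterable, List, Tuple


def read_span(center: int, read_length: int) -> Tuple[int, int]:
    """Return an assumed half-open [start, stop) span for a read.

    Assumption: center == start + read_length // 2 (0-based coordinates).
    """
    start = center - read_length // 2
    return start, start + read_length


def per_base_coverage(centers: Iterable[int], read_length: int,
                      chrom_length: int) -> List[int]:
    """Per-base read depth for one chromosome, using a difference array."""
    diff = [0] * (chrom_length + 1)
    for center in centers:
        start, stop = read_span(center, read_length)
        # Clip reads that hang off either end of the chromosome.
        start, stop = max(start, 0), min(stop, chrom_length)
        if start < stop:
            diff[start] += 1
            diff[stop] -= 1
    coverage, depth = [], 0
    for delta in diff[:chrom_length]:
        depth += delta
        coverage.append(depth)
    return coverage


def fraction_covered(coverage: List[int], min_depth: int) -> float:
    """Fraction of positions with depth >= min_depth (e.g. 5 for 5-fold)."""
    if not coverage:
        return 0.0
    return sum(1 for d in coverage if d >= min_depth) / len(coverage)


if __name__ == "__main__":
    # Toy data: two 4 bp reads centered at positions 5 and 6 on a 12 bp chromosome.
    cov = per_base_coverage([5, 6], read_length=4, chrom_length=12)
    print(cov)                       # [0, 0, 0, 1, 2, 2, 2, 1, 0, 0, 0, 0]
    print(fraction_covered(cov, 2))  # 0.25
```

The same `fraction_covered` call with `min_depth=5` would answer the "covered at least 5-fold" question from earlier in the thread. Note that the sketch only holds when every read has the same length; with mixed lengths the center alone does not determine start and stop.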