From: Gregor R. <gre...@fr...> - 2012-04-19 17:12:01
The correct values are:

1024 image size, 100nt region, window size = 1nt
1024 image size, 1024KB region, window size = 1000nt

Thanks,
Gregor

On 4/19/2012 7:06 PM, Gregor Rot wrote:
> Hi Tim,
>
> thank you for the explanation, very nice.
>
> One more thing: if you look at my picture: i have the image width set to
> 1024 and am viewing a region of 100bp. This would set the "window" to 1,
> since the region < image width.
>
> The red color is bigWig adapter. Since window size = 1, this shows
> coverage data nicely. Mean = sum (1 value per window). Indeed, it is
> showing 12930 as max value and this is the maximum value present in the
> bam coverage, i double checked this with my script.
>
> Is this interpretation correct?
>
> Now, if i zoom out and view a 10024KB region with the image size of
> 1024, the window size would be 1000nt now, so each pixel represents a
> window of 1024nt. The bigWig adapter would look for coverage inside the
> window (at each nucleotide) and return the mean value (out of 1000
> coverage values) for the window. Is this so?
>
> Another puzzling thing is the bam (blue color) graph. Still using glyph
> xyplot, but it only shows 8008 as the max value. But from your
> explanation, this one should be showing 12930 also?
>
> The thing is it would be really terrific to have the bigWig adapter
> return the sum for each window instead of mean. How difficult would that
> be? I can try to implement this, but i need some help for where to look.
>
> Thanks,
> Gregor
>
> On 4/19/2012 5:56 PM, Timothy Parnell wrote:
>> Hi,
>> Not sure if I can adequately explain all, but here goes
>>
>> The coverage method from Bio::DB::Sam returns a simple count of all
>> alignments covering each "window" or pixel in the current view. Depending
>> on the zoom factor of the current view, each window could represent
>> anything from 1 bp to the current view size divided by panel size in
>> pixels (10 Mb / 1024 pixels = 9766 bp per pixel).
>>
>> The summary method from Bio::DB::BigWig returns some statistics for each
>> window, including count, sum, etc. The wiggle_xyplot glyph by default
>> shows the mean value for each window, which is ok for something like
>> microarray probe values but not good for displaying sequence coverage
>> (sum would be better). I've been meaning to dig into the code and suggest
>> a patch to Lincoln to change this behavior.
>>
>> The wiggle_whiskers glyph displays the mean, standard deviation, and max
>> values for each window, denoting each as a different color. Hence, it
>> only works with BigWigs that return window statistics. This is a little
>> better.
>>
>> One way to ameliorate this problem is to convert the bam into binned
>> coverage (100, 500, or 1000 bp) wig files. My bam2wig.pl will now do
>> this.
>>
>> I'm not sure about the difference between bam and bigwig when zoomed
>> in at base pair level. My bam2wig.pl script can count alignments in a
>> number of different ways: at the start position, mid position, or each
>> alignment base. It may or may not count gaps, depending on whether
>> splices are enabled, and can skip low-scoring alignments. The Bam
>> coverage method very likely does not count gaps, and will count
>> regardless of score. It's hard to say without careful accounting of each
>> alignment at each base pair and comparing methods.
>>
>> I hope that answers at least some of your concerns,
>> Tim
>>
>>
>> On 4/19/12 7:54 AM, "Gregor Rot"<gre...@fr...> wrote:
>>
>>> Hi all,
>>>
>>> i have 3 simple questions:
>>>
>>> a) i converted my bam to bigWig (each aligned read contributes +1 at
>>> each position spanning the read).
>>> I am now comparing the bigWig track
>>> with the coverage bam track, the database definitions are:
>>>
>>> [bigwig_db:database]
>>> db_adaptor = Bio::DB::BigWigSet
>>> db_args = -dir /big_wig_folder
>>>
>>> [bam_db:database]
>>> db_adaptor = Bio::DB::Sam
>>> db_args = -bam /path_to_bam_file
>>>
>>> and i am using:
>>>
>>> glyph = wiggle_xyplot
>>>
>>> For feature i am using "coverage" for the bam track and "summary" for
>>> the bigWig track. What is the difference?
>>>
>>> If you look at figure bam_vs_bigwig_coverage.png, you will see that
>>> coverage at centre is 27 for bigWig (bottom) and 35 for the bam track
>>> (top). I checked the sam file and the correct coverage is 27 (bigWig), i
>>> don't know how the bam coverage is computed?
>>>
>>> b) if i zoom out to chr1 (scaling is set to local min/max), you see the
>>> result in figure scaling_1.png. The selected region has the highest peak
>>> (8500), but you can see other higher regions in the bam coverage track.
>>> Why? Also the y-axis on this track now shows only 347, but the bigWig
>>> track correctly shows 8554.
>>>
>>> c) If you look at figure whiskers.png, i am using the whiskers glyph for
>>> the bigWig track. What is the difference between xyplot and whiskers? I
>>> don't understand why the tops of the values are being cut off (yellow
>>> color), and at value 8000.
>>>
>>> ---
>>> To sum up, i would like to show the bigWig for coverage (it looks very
>>> nice, the combined forward/reverse strand with red/blue colors). The
>>> problem is i would need some kind of log-value scaling or something like
>>> that (because if a user zooms out to the entire chromosome it's very
>>> difficult to see where the peak regions are).
>>>
>>> Any help appreciated,
>>>
>>> Thanks,
>>> Gregor
>>>
>>> --
>>> Gregor Rot
>>> Bioinformatics Laboratory
>>> Faculty of computer and information science
>>> SI-1000 Ljubljana
>>> Slovenia
>>> http://www.fri.uni-lj.si/en/gregor-rot
>>
>
> _______________________________________________
> Gmod-gbrowse mailing list
> Gmo...@li...
> https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse
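
For readers following the window arithmetic in this thread, here is a minimal Python sketch of it. It is not GBrowse, Bio::DB::Sam, or Bio::DB::BigWig code: the function names `window_size` and `summarize` are hypothetical, rounding up to whole bases is an assumption, and the coverage values are made-up toy data. It only illustrates why a per-window mean hides a narrow peak that a per-window sum or max would keep.

```python
import math


def window_size(region_bp, image_width_px):
    """Bases per pixel, never less than 1 (assumption: round up to whole bases)."""
    return max(1, math.ceil(region_bp / image_width_px))


def summarize(coverage, win):
    """Split per-base coverage into windows of `win` bases.

    Returns one (sum, mean, max) tuple per window.
    """
    stats = []
    for start in range(0, len(coverage), win):
        chunk = coverage[start:start + win]
        total = sum(chunk)
        stats.append((total, total / len(chunk), max(chunk)))
    return stats


# Window sizes for the view settings mentioned in the thread.
print(window_size(100, 1024))          # 100 bp region on a 1024 px image
print(window_size(1_024_000, 1024))    # 1024 KB region (1 KB taken as 1000 bp)
print(window_size(10_000_000, 1024))   # 10 Mb region

# Toy coverage with a single narrow peak.
toy = [0, 0, 8, 0]
print(summarize(toy, 1))  # one base per window: mean equals sum
print(summarize(toy, 4))  # one window for all four bases
```

With a window of 1 the mean and the sum coincide, which matches the zoomed-in case described above. With a window of 4 the same toy peak of 8 is reported as a mean of 2.0, while the sum and the max are still 8.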