From: Mitch S. <mit...@be...> - 2007-02-16 04:08:25
Over the last week or so I've been experimenting with a different way of doing the rendering. Performance-wise, it takes significantly less time and space. Correctness-wise, I haven't found any problems but checking it is a bit difficult. Code-elegance wise, it's worse, but I think that with some Panel api changes it could be cleaned up a lot by moving most of the code into a Panel subclass (it currently fiddles in odd ways with some of the Panel's state).

For yeast_chr1, it takes 3 and a half minutes to render all tracks + all zooms and uses 93MB of RAM. This is about 1/4 the space and less than 1/4 of the time taken by current CVS HEAD with in-memory primitive storage. For Drosophila chr. 4 (all tracks + zooms), it takes 31 and a half minutes and uses 200 MB of RAM, which is less than 1/6 the space and about 40% of the time taken by CVS HEAD.

I've put a set of tiles generated with this code here:
http://genome.biowiki.org/gbrowse/dmel-noti/prototype_gbrowse.html
http://genome.biowiki.org/gbrowse/yeast_chr1-noti/prototype_gbrowse.html
and I'd appreciate any reports of incorrectly rendered tiles there.

The rest of this email is a description of why I took this approach and how it's done. If you just want to render big chromosomes without reading all the details, then I'll be committing this code soon, either to HEAD or on a branch.

There were two things that pointed me in this direction: the gridline thing, and the empty tile thing.

The gridline thing was when I tried to avoid storing gridline primitives by just drawing the first tile's gridlines on every tile, without going through TiledImage. At first I thought I had to use the first tile's gridlines because otherwise the gridlines would have been drawn off-tile (because without going through TiledImage I didn't have TiledImage's primitive position translation functionality). The problem was that the first tile's gridlines were _different_ from the rest, because of the Panel's edge behavior at the first gridline.
This all could have been solved (by copying and adjusting the gridline code, if nothing else), but there's a similar issue with "global feature" tracks like DNA and translation, and the ruler. For those, and for the gridlines, I wanted to be able to generate just one rendering tile's worth of primitives at a time, and not have to store all of the primitives for the entire chromosome, which take up a lot of space on these primitive-intensive tracks.

My solution was to create a Panel for each rendering tile, and use that to draw the gridlines and global features.

One problem with this approach is what happens when some primitive runs off the end of a rendering tile (which is one of the main reasons TiledImage exists in the first place). For the DNA and translation tracks, there are no labels or other primitives that extend in unpredictable ways, so it's not a big problem there. For the ruler (where the labels do have some extra width), my solution was to have the per-tile Panel extend a short distance (currently 100px) beyond the rendering tile on both sides. This way, any primitive which extends less than 100px off-tile does get rendered correctly on both sides of the tile boundary. The distance is set by the "$global_padding" variable in my experimental version of generate-tiles.pl.

The empty tile thing had to do with the fact that GD::Image::png was taking up a lot of time in the profiles I was generating. I figured I could use the glyph boxes used by the imagemap code to figure out which tiles were blank, and avoid generating those (by hardlinking the file name to a previous blank tile). This worked, and it gave a speedup on CVS HEAD of up to 12% on some tracks, but it got me thinking: if I knew the pixel span of each glyph in advance (which is what those boxes provide) then for each rendering tile I could use that information to only render the glyphs that overlap the tile.
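
The idea above, using each glyph's pixel span to decide which rendering tiles it touches, can be sketched roughly as follows. This is only an illustration, not the code from generate-tiles.pl: the `tiles_for_span` helper, the plain-hash glyph records, and the 1000px tile width are all hypothetical, and the sketch ignores the padding described earlier.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Hypothetical helper: given a glyph's inclusive pixel span [left, right]
# in chromosome-wide coordinates, return the indices of the rendering
# tiles that the span overlaps.
sub tiles_for_span {
    my ($left, $right, $tilewidth) = @_;
    my $first = floor($left / $tilewidth);
    my $last  = floor($right / $tilewidth);
    $first = 0 if $first < 0;    # clamp spans that start off the left edge
    return $last < $first ? () : ($first .. $last);
}

my $rendering_tilewidth = 1000;    # assumed value, for illustration only

# Made-up glyph boxes; in the real code these would come from the
# boxes that the imagemap code already uses.
my @glyph_boxes = (
    { name => 'geneA', left => 150,  right => 420 },
    { name => 'geneB', left => 980,  right => 1030 },    # straddles tiles 0 and 1
    { name => 'geneC', left => 3500, right => 3600 },
);

# Array of arrays: for each rendering tile, the glyphs overlapping it.
my @per_tile_glyphs;
for my $g (@glyph_boxes) {
    push @{ $per_tile_glyphs[$_] }, $g->{name}
        for tiles_for_span($g->{left}, $g->{right}, $rendering_tilewidth);
}

for my $x (0 .. $#per_tile_glyphs) {
    my $glyphs = $per_tile_glyphs[$x];
    print "tile $x: ", (defined $glyphs ? join(", ", @$glyphs) : "(blank)"), "\n";
}
```

A tile whose entry is undefined has no overlapping glyphs, which is the same test that lets a blank tile be hardlinked instead of rendered.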
I also spent some time reading the panel code, and I realized that I could get the TiledImage primitive position translation functionality almost for free by giving the Panel a negative pad_left value. And only rendering a tile's worth of glyphs at a time did something similar to TiledImage's "only render the current tile's worth of primitives" functionality, without having to store any primitives. For non-global features I'm still using a chromosome-wide Panel to do the bumping, so the layout is still right.

So this approach doesn't use TiledImage (or BatchTiledImage, or DBPrimStorage, or MemoryPrimStorage) at all. Which is a fairly radical change IMO but it's the only way I see to really scale to large chromosomes. I do like the fact that TiledImage is a nice clean abstraction, but there's no way to store a human chr. 1 worth of primitives in memory, and even if you had an infinitely fast disk storage method for primitives the (de)serialization overhead would still kill you, as far as I can tell. Actually, now that I think about it, I remember Data::Dumper (serialization) taking a nontrivial amount of time in the database primitive storage profiling I did last year, but I'm not sure about eval (deserialization).

As for whether we should ditch TiledImage, I think there are two remaining questions: rendering on demand and correctness. If this approach can do both of those things, then I think it's the way we should go.

I believe this can be applied to the rendering-on-demand scenario if we use mod_perl and do the layout step on startup. This would take a fair amount of RAM but it's only necessary for tracks that haven't been fully rendered yet. One plus of that approach is that handling single new features gets easier. Storing the layout in a database is theoretically possible but saving and restoring that information seems pretty complicated, unless we just serialize the entire panel.
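
To make the (de)serialization round trip mentioned above concrete, here is a minimal sketch of dumping one stored primitive with Data::Dumper and restoring it with eval. The record layout is invented for illustration and is not claimed to be the format that the TiledImage storage classes actually use.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical primitive record: a drawing call plus its arguments.
my $primitive = { call => 'line', args => [ 10, 5, 110, 5 ], color => 'black' };

# Serialization: Dumper produces Perl source of the form '$VAR1 = {...};'
my $serialized = Dumper($primitive);

# Deserialization: eval that source. $VAR1 is declared here so that the
# evaluated assignment compiles under 'use strict'.
my $VAR1;
my $restored = eval $serialized;
die "could not restore primitive: $@" if $@;

print "serialized form:\n$serialized";
print "restored call: ", $restored->{call}, "(", join(", ", @{ $restored->{args} }), ")\n";
```

Every stored primitive pays for one Dumper call on the way in and one eval on the way out, which is the per-primitive overhead the paragraph above is worried about.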
So far, I've been testing my changes by doing diffs of the tiles; I'm pretty sure I've only committed changes that generate tiles that are bit-exactly the same as before. The tiles that I've generated with this approach aren't the same bit-for-bit, but they do look the same (with one exception: right ends of genes are now getting rendered correctly). I think the difference is in the palette, so the tiles could still be correct even if they're different. So I'm not yet fully convinced that it's rendering exactly correctly, but it does look right to me.

If you're curious, I've appended the code for the meat of the tile rendering below. The things to pay attention to here are how the per-tile panel is set up, any if statement that checks $is_global, and the $small_tile_gd->copy call near the end. @per_tile_glyphs is an array of arrays; for each rendering tile, it has an array of the glyphs that overlap that tile.

Comments?
Mitch

for (my $x = $first_large_tile; $x <= $last_large_tile; $x++) {
    my $large_tile_gd;
    my $pixel_offset = (0 == $x) ? 0 : $global_padding;

    # we want to skip rendering whole tile if it's blank, but only if
    # there's a blank tile to which to hardlink that's already rendered
    if (defined($per_tile_glyphs[$x]) || (!defined($blankTile))) {

        # rendering tile bounds in pixel coordinates
        my $rtile_left = ($x * $rendering_tilewidth) - $pixel_offset;
        my $rtile_right = (($x + 1) * $rendering_tilewidth) + $global_padding - 1;

        # rendering tile bounds in bp coordinates
        my $first_base = ($rtile_left / $big_panel->scale) + 1;
        my $last_base = int(($rtile_right / $big_panel->scale) + 1);

        #print "pixel_offset: $pixel_offset first_base: $first_base last_base: $last_base " . tv_interval($start_time) . "\n";

        # set up the per-rendering-tile panel, with the right
        # bp coordinates and pixel width
        my %tpanel_args = %$panel_args;
        $tpanel_args{-start} = $first_base;
        $tpanel_args{-end} = $last_base;
        $tpanel_args{-stop} = $last_base;
        $tpanel_args{-width} = $rtile_right - $rtile_left + 1;
        my $tile_panel = Bio::Graphics::Panel->new(%tpanel_args);

        if ($is_global) {
            # for global features we can just render everything
            # using the per-tile panel
            my @segments = $CONFIG->name2segments($landmark_name . ":" . $first_base . ".." . $last_base, $db, undef, 1);
            my $small_segment = $segments[0];
            $tile_panel->add_track($small_segment, @$track_settings);
            $large_tile_gd = $tile_panel->gd();
        } else {
            # add generic track to the tile panel, so that the
            # gridlines have the right height
            $tile_panel->add_track(-glyph => 'generic', @$track_settings, -height => $image_height);
            $large_tile_gd = $tile_panel->gd();
            #print "got tile panel gd " . tv_interval($start_time) . "\n";

            if (defined $per_tile_glyphs[$x]) {
                # some glyphs call set_pen on the big_panel;
                # we want that to go to the right GD object
                $big_panel->{gd} = $large_tile_gd;

                # move rendering onto the tile
                $big_panel->pad_left(-$rtile_left);

                # draw the glyphs for the current rendering tile
                foreach my $glyph (@{$per_tile_glyphs[$x]}) {
                    # sometimes the glyph positions itself
                    # using the panel's pad_left, sometimes
                    # it just uses the x-coordinate it gets
                    # in the draw method. We want them both
                    # to be -$rtile_left.
                    $glyph->draw($large_tile_gd, -$rtile_left, 0);
                }
            }
        }
        $tile_panel->finished;
        $tile_panel = undef;
    }

    # now to break up the large tile into small tiles and write them to PNG on disk...
  SMALLTILE:
    for (my $y = 0; $y < $small_per_large; $y++) {
        my $small_tile_num = $x * $small_per_large + $y;
        if ( ($small_tile_num >= $first_tile) && ($small_tile_num <= $last_tile) ) {  # do we print it?
            my $outfile = "${tile_prefix}${small_tile_num}.png";
            if (!$is_global) {
                writeHTML($tile_prefix, $x, $y, $small_tile_num, $tilewidth_pixels, $image_height, $track_num, $html_current_outdir, $per_tile_glyphs[$x]);
                if (!defined($nonempty_smalltiles[$x]{$y})) {
                    if (defined($blankTile)) {
                        #print "linking $outfile to $blankTile\n";
                        # "or" rather than "||", so that a failed link is
                        # what triggers die ("||" would bind to $outfile)
                        link($blankTile, $outfile) or die "could not link blank tile: $!\n";
                        next SMALLTILE;
                    } else {
                        $blankTile = $outfile;
                    }
                }
            }
            open (TILE, ">${outfile}") or die "ERROR: could not open ${outfile}!\n";
            my $small_tile_gd = GD::Image->new($tilewidth_pixels, $image_height, 0);
            $small_tile_gd->copy($large_tile_gd, 0, 0, $y * $tilewidth_pixels + $pixel_offset, 0, $tilewidth_pixels, $image_height);
            print TILE $small_tile_gd->png or die "ERROR: could not write to ${outfile}!\n";
            warn "done printing ${outfile}\n" if $verbose >= 2;
        }
    }
}
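
The pixel arithmetic in the loop above can be checked in isolation. The following sketch reuses the same formulas for the rendering tile bounds; the `rendering_tile_bounds` wrapper and the 8000px and 1000px tile widths are assumptions made for illustration, and only the 100px padding comes from the description earlier in this message.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the bounds formulas used in the loop:
# returns (left, right, pixel_offset) for rendering tile $x.
sub rendering_tile_bounds {
    my ($x, $rendering_tilewidth, $global_padding) = @_;
    # the first tile has nothing to its left, so it gets no left padding
    my $pixel_offset = ($x == 0) ? 0 : $global_padding;
    my $rtile_left   = $x * $rendering_tilewidth - $pixel_offset;
    my $rtile_right  = ($x + 1) * $rendering_tilewidth + $global_padding - 1;
    return ($rtile_left, $rtile_right, $pixel_offset);
}

# Assumed sizes, for illustration only.
my ($rendering_tilewidth, $global_padding, $tilewidth_pixels) = (8000, 100, 1000);

for my $x (0, 1) {
    my ($left, $right, $offset) =
        rendering_tile_bounds($x, $rendering_tilewidth, $global_padding);
    my $width = $right - $left + 1;
    print "rendering tile $x: pixels $left..$right (width $width)\n";

    # A glyph drawn at chromosome-wide pixel $px ends up at $px - $left on
    # the tile, which is the shift that the negative pad_left provides.
    my $px = $x * $rendering_tilewidth + 250;
    print "  chromosome pixel $px -> tile-local pixel ", $px - $left, "\n";

    # Source x-coordinate of each small tile within the large tile;
    # the offset skips over the left padding.
    for my $y (0, 1) {
        print "  small tile $y is copied from x = ", $y * $tilewidth_pixels + $offset, "\n";
    }
}
```

Under these assumptions the second rendering tile starts 100px early, so its first small tile is copied from x = 100 rather than x = 0, which matches the role of $pixel_offset in the copy call.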
From: Mitch S. <mit...@be...> - 2007-02-16 18:08:32
I wrote:
> For yeast_chr1, it takes 3 and a half minutes to render all tracks + all
> zooms and uses 93MB of RAM. This is about 1/4 the space and less than
> 1/4 of the time taken by current CVS HEAD with in-memory primitive
> storage. For Drosophila chr. 4 (all tracks + zooms), it takes 31 and a
> half minutes and uses 200 MB of RAM, which is less than 1/6 the space
> and about 40% of the time taken by CVS HEAD.

Also, I'm not sure how much I believe DProf anymore, but here's what the profile looks like for Dmel chr. 4 mRNA at zoom 1. It's odd because I don't recall GD::Image::_new taking up so much time before. I'd expect it to increase in relative terms since I've been working on other stuff, but I think it's increased in absolute terms.

Total Elapsed Time = 143.3461 Seconds
  User+System Time = 142.6761 Seconds
Exclusive Times
%Time ExclSec CumulS  #Calls sec/call Csec/c  Name
 67.3   96.13 96.130   11492   0.0084 0.0084  GD::Image::_new
 23.0   32.89 32.890    8656   0.0038 0.0038  GD::Image::png
 5.35   7.631  7.631  115396   0.0000 0.0000  GD::Image::line
 4.43   6.321  9.205     864   0.0073 0.0107  Bio::Graphics::Glyph::collides
 3.44   4.910  4.924  115080   0.0000 0.0000  Bio::Graphics::Panel::map_pt
 3.12   4.456  4.456    1276   0.0035 0.0035  Bio::Graphics::Glyph::_collision_keys
 1.44   2.049  2.193   12817   0.0002 0.0002  main::writeHTML
 1.43   2.040  2.040    8656   0.0002 0.0002  GD::Image::copy
 1.10   1.567  3.156     412   0.0038 0.0077  Bio::Graphics::Glyph::add_collision
 1.08   1.546 138.26       1   1.5461 138.26  main::renderTileRange
 0.36   0.510  0.510   11492   0.0000 0.0000  GD::Image::DESTROY
From: Mitch S. <mit...@be...> - 2007-02-26 22:46:35
I wrote:
> So far, I've been testing my changes by doing diffs of the tiles; I'm
> pretty sure I've only committed changes that generate tiles that are
> bit-exactly the same as before. The tiles that I've generated with this
> approach aren't the same bit-for-bit, but they do look the same (with
> one exception: right ends of genes are now getting rendered correctly).
> I think the difference is in the palette, so the tiles could still be
> correct even if they're different. So I'm not yet fully convinced that
> it's rendering exactly correctly, but it does look right to me.
>

I got the image diff program that the cairo project uses for its tests, which handles palette differences without a problem. It also tells you how many pixels are different, which is a useful metric IMO.

I used it to compare the gbrowse-ajax-tiledimage version of tile generation with my experimental panel-based one, and after a few days of fiddling with the per-tile panel bounds, it's close. It produces the same images pixel-for-pixel as TiledImage, with the exception of the 2kbp and 1kbp zoom levels, where the gridlines are shifted by one pixel. I spent some time trying to chase this down, without much success. My best guess is that since 1kbp and 2kbp are right on a rounding boundary (1 and 0.5 bases per pixel, respectively) there's some kind of rounding difference.

At any rate, while the difference is annoying, I think it's close enough that I'm going to go ahead and commit this. For the time being I plan to continue with this approach, unless something comes up that shows it's unworkable.

Mitch
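
For readers wondering what a palette-insensitive pixel comparison involves, here is a rough sketch of the idea. It is not the cairo project's tool and does not use GD: the in-memory image layout (a palette plus rows of palette indices) and the `count_pixel_differences` function are hypothetical, and the two images are assumed to have the same dimensions.

```perl
use strict;
use warnings;

# Count the pixels whose resolved RGB colors differ between two images.
# Each image is a hash with a 'palette' (index -> [r, g, b]) and 'pixels'
# (rows of palette indices). Comparing resolved colors instead of raw
# indices means a reordered palette alone produces no differences.
sub count_pixel_differences {
    my ($img_a, $img_b) = @_;
    my $differences = 0;
    for my $row (0 .. $#{ $img_a->{pixels} }) {
        for my $col (0 .. $#{ $img_a->{pixels}[$row] }) {
            my $rgb_a = $img_a->{palette}[ $img_a->{pixels}[$row][$col] ];
            my $rgb_b = $img_b->{palette}[ $img_b->{pixels}[$row][$col] ];
            $differences++ if "@$rgb_a" ne "@$rgb_b";
        }
    }
    return $differences;
}

my ($white, $black) = ([ 255, 255, 255 ], [ 0, 0, 0 ]);

# Same picture, but the second image orders its palette the other way round.
my $tile_a   = { palette => [ $white, $black ], pixels => [ [ 0, 1, 0 ], [ 0, 1, 0 ] ] };
my $tile_b   = { palette => [ $black, $white ], pixels => [ [ 1, 0, 1 ], [ 1, 0, 1 ] ] };
# Same palette as $tile_a, with the vertical line in the second row
# shifted by one pixel.
my $tile_off = { palette => [ $white, $black ], pixels => [ [ 0, 1, 0 ], [ 0, 0, 1 ] ] };

print "a vs b:   ", count_pixel_differences($tile_a, $tile_b),   " differing pixels\n";
print "a vs off: ", count_pixel_differences($tile_a, $tile_off), " differing pixels\n";
```

A byte-for-byte diff of the first two tiles would report them as different, while a comparison of resolved colors reports none; a line shifted by one pixel shows up as two differing pixels per affected row.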