From: Mitch S. <li...@ar...> - 2006-12-13 10:20:32
I've implemented some of the things I talked about in my last message; there are two patches attached, one for GD and one for gmod-ajax. With both applied, my hardware can render Dmel chr. 4 mRNA (with the transcript glyph) at zoom 1 in about five and a half minutes.

More inline:

On Mon, 2006-12-04 at 08:22 -0800, Mitchell Skinner wrote:
> * the TiledImage::AUTOLOAD routine was taking up a
>   significant proportion of the time, especially on
>   primitive-intensive tracks like the yeast_chr1
>   translation tracks at full zoom. I'm working on
>   reimplementing the accessors using closures or
>   Class::Struct, and reimplementing the GD primitive
>   interception with closures

This is done.

> * on some tracks, GD::filledRectangle was taking the
>   lion's share of the time (96% on one run that I
>   measured for a track that was using the "transcript"
>   glyph). I looked at the gd c code, and
>   gdImageFilledRectangle calls the retail
>   gdImageSetPixel function (which is non-trivial) in
>   a nested loop.

The attached GD patch adds a special case to gdImageFilledRectangle that uses memset on non-truecolor images where the rectangle is being filled with a regular color. If no one here sees any problems with it, I'll be sending it on upstream. The optimal case for the memset version is very wide (say, tens of thousands of pixels) and short rectangles, just like the ones we're doing at full zoom. On rectangles like that, this patch makes gdImageFilledRectangle up to 40x faster on my system.

> * for the less primitive-intensive tracks (like gene
>   or mRNA) at high zoom levels, the gridlines are big
>   memory users (for the in-memory primitive storage
>   approach). Rendering Dmel chrom 4 mRNA at zoom 1
>   without gridlines reduced peak memory usage (RSS as
>   measured by ps) from 653 megabytes to 169 megabytes.
>   If we can find a way to avoid rendering the gridlines
>   (like using PNG transparency as mentioned on the
>   wiki) or if we can find a way to avoid storing the
>   gridline primitives then I don't think memory usage
>   will be a problem for these kinds of tracks.

I thought about taking the gridline code from TiledImagePanel and copying it into TiledImage, but that would mean that TiledImage would have to know about a lot more stuff than it does now; plus, blurring the line between TiledImagePanel and the rest of the pre-rendering code works against the goal of merging TiledImagePanel back into BioPerl proper.

Instead, I took an approach that's a bit less efficient but IMHO cleaner layering-wise: representing all of the gridline information without explicitly storing all of it. In other words, the fact that the gridline line primitives are so regular means that we can store the gridline information a little more compactly. I wrote a class that compactly stores sequences of integers where the numbers are mostly increasing by the same interval or mostly the same, and I used it to store the arguments for vertical line primitives. See CompactList.pm and TiledImage::line in the patch. This reduces memory usage on Dmel chr. 4 mRNA at zoom 1 from 653 megabytes to 225 megabytes, and I think it will scale fairly well to larger chromosomes. The runtime cost is several percent; I think it's a good tradeoff.

> * for the primitive-intensive tracks like the
>   translation and dna tracks, I'm hoping that rendering
>   smaller tile ranges will make the jobs fit in RAM.
>   I'm not sure if we can avoid generating all
>   primitives when we're not rendering the whole track,
>   but in any case we should be able to avoid storing
>   all those primitives.

I haven't done anything here yet, but I believe we can do something similar to the gridlines to store those primitives more compactly.
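
To make the compact-storage idea concrete, here is a minimal sketch of a run-based integer list. It is not the code from CompactList.pm (which is Perl and may be organized quite differently); it is written in C only for illustration, and the names Run, CompactList, cl_push, cl_get and cl_free are hypothetical. The assumption is that each run stores a start value, a step and a count, so a long stretch of evenly spaced (or identical) values costs one small record instead of one entry per value.

```c
#include <assert.h>
#include <stdlib.h>

/* One run stands for: start, start+step, ..., start+step*(count-1). */
typedef struct {
    long start;
    long step;
    size_t count;
} Run;

/* Hypothetical compact list: an array of runs plus the logical length. */
typedef struct {
    Run *runs;
    size_t nruns;
    size_t cap;
    size_t length;
} CompactList;

void cl_init(CompactList *cl)
{
    cl->runs = NULL;
    cl->nruns = 0;
    cl->cap = 0;
    cl->length = 0;
}

void cl_free(CompactList *cl)
{
    free(cl->runs);
    cl_init(cl);
}

/* Append a value; returns 0 on success, -1 if memory runs out. */
int cl_push(CompactList *cl, long value)
{
    if (cl->nruns > 0) {
        Run *last = &cl->runs[cl->nruns - 1];
        if (last->count == 1) {
            /* The second value of a run fixes its step (0 for repeats). */
            last->step = value - last->start;
            last->count = 2;
            cl->length++;
            return 0;
        }
        if (value == last->start + last->step * (long)last->count) {
            /* The value continues the current run: no new storage. */
            last->count++;
            cl->length++;
            return 0;
        }
    }
    /* Otherwise start a new run, growing the run array if needed. */
    if (cl->nruns == cl->cap) {
        size_t ncap = cl->cap ? cl->cap * 2 : 4;
        Run *p = realloc(cl->runs, ncap * sizeof *p);
        if (p == NULL)
            return -1;
        cl->runs = p;
        cl->cap = ncap;
    }
    cl->runs[cl->nruns++] = (Run){ value, 0, 1 };
    cl->length++;
    return 0;
}

/* Return the element at a logical index by walking the runs. */
long cl_get(const CompactList *cl, size_t index)
{
    assert(index < cl->length);
    for (size_t i = 0; i < cl->nruns; i++) {
        if (index < cl->runs[i].count)
            return cl->runs[i].start + cl->runs[i].step * (long)index;
        index -= cl->runs[i].count;
    }
    return 0; /* not reached when the assertion above holds */
}
```

With this layout, the x coordinates of evenly spaced vertical gridlines collapse into a single run, and a constant y coordinate repeated for every line is a run with step 0. Lookup walks the runs linearly, which is one plausible place for a runtime cost like the "several percent" mentioned above to come from, though that is a guess about the real implementation.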
I think this all suggests that pre-rendering large chromosomes will work; when I get a chance I'll be loading up and trying out the other Drosophila chromosomes.

In the attached patch I've gone ahead and taken out the code related to the primitive database. If people are cool with the patch, then I'll go ahead and commit; if not, I'd appreciate any feedback.

Mitch
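
For readers without the GD patch at hand, the memset special case for filled rectangles described above can be sketched roughly as follows. This is not the patch itself and does not use libgd's types: PaletteImage, set_pixel, fill_rect_slow and fill_rect_memset are hypothetical names, and the sketch assumes an 8-bit palette image stored as one byte array per row. The slow version mirrors the "per-pixel call in a nested loop" pattern; the fast version clips the rectangle once and then fills each row with a single memset.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical 8-bit palette image: one byte per pixel, one array per row. */
typedef struct {
    int width, height;
    unsigned char **rows;
} PaletteImage;

void image_free(PaletteImage *im)
{
    if (im == NULL)
        return;
    if (im->rows != NULL)
        for (int y = 0; y < im->height; y++)
            free(im->rows[y]);          /* free(NULL) is harmless */
    free(im->rows);
    free(im);
}

PaletteImage *image_new(int width, int height)
{
    assert(width > 0 && height > 0);
    PaletteImage *im = malloc(sizeof *im);
    if (im == NULL)
        return NULL;
    im->width = width;
    im->height = height;
    im->rows = calloc((size_t)height, sizeof *im->rows);
    if (im->rows == NULL) {
        free(im);
        return NULL;
    }
    for (int y = 0; y < height; y++) {
        im->rows[y] = calloc((size_t)width, 1);
        if (im->rows[y] == NULL) {
            image_free(im);
            return NULL;
        }
    }
    return im;
}

/* Swap two coordinates if they are given in descending order. */
static void order(int *a, int *b)
{
    if (*a > *b) {
        int t = *a;
        *a = *b;
        *b = t;
    }
}

/* Generic path: one bounds-checked call per pixel. */
void set_pixel(PaletteImage *im, int x, int y, unsigned char color)
{
    if (x >= 0 && x < im->width && y >= 0 && y < im->height)
        im->rows[y][x] = color;
}

void fill_rect_slow(PaletteImage *im, int x1, int y1, int x2, int y2,
                    unsigned char color)
{
    order(&x1, &x2);
    order(&y1, &y2);
    for (int y = y1; y <= y2; y++)
        for (int x = x1; x <= x2; x++)
            set_pixel(im, x, y, color);
}

/* Fast path: clip once, then fill each row with a single memset. */
void fill_rect_memset(PaletteImage *im, int x1, int y1, int x2, int y2,
                      unsigned char color)
{
    order(&x1, &x2);
    order(&y1, &y2);
    if (x1 < 0) x1 = 0;
    if (y1 < 0) y1 = 0;
    if (x2 > im->width - 1) x2 = im->width - 1;
    if (y2 > im->height - 1) y2 = im->height - 1;
    if (x1 > x2 || y1 > y2)
        return;                         /* rectangle lies outside the image */
    for (int y = y1; y <= y2; y++)
        memset(im->rows[y] + x1, color, (size_t)(x2 - x1 + 1));
}
```

The per-row memset only works because every pixel is a single byte with a plain palette index, which matches the restriction to non-truecolor images and regular colors stated above. Wide, short rectangles benefit most because the per-row overhead is paid only a few times while each memset covers many pixels.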