From: Mitch S. <li...@ar...> - 2006-11-26 15:38:17
Hello,

Reading the wiki, it looks like in-memory was the original approach, but it was too slow because it was rendering all the primitives for each tile. As I understand it, storing the primitives in a database buys you the ability to query for just those primitives that overlap the current tile.

After reading Andrew's message that said that database access was the bottleneck, I wanted to try another take on the in-memory approach. Instead of storing all the primitives in one big array, I'm keeping an array of arrays of primitives, with one array of primitives per rendering tile. This version of GDRecordPrimitives adds its primitive to the primitive array of each of the rendering tiles that overlap with the primitive. There's also a separate global primitive array.

On the yeast_chr1 that comes with gbrowse, rendering track 3 (named genes) at zoom level 1 with the in-memory patch takes about a third of the time that the db version does (see below). I haven't gotten chado going on my machine yet, so I'd be interested in seeing comparisons with/without the patches from anyone who wants to do one on their data (particularly with larger/more complex tracks).

I've attached two patches (against CVS HEAD); the first one (ti-prim-api.patch) changes the primitive storage api in TiledImage a little so that I could cleanly override those functions in BatchTiledImage. Basically, it moves some work between callers and callees so that the in-memory version doesn't have to do the serialization work.

The second patch (bti-memstorage.patch, which depends on the first patch) changes BatchTiledImage to override the primitive storage methods of TiledImage. I put this stuff in BatchTiledImage because BatchTiledImage is the class that knows about the rendering tile dimensions, which I wanted to use. I thought this was the minimally invasive way to do it; I wanted to make it easy to see what I was trying to do by reading the patch.
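
For readers who want a picture of the per-tile bucketing described above without opening the patch, here is a minimal sketch in Python. It is not the Perl code from bti-memstorage.patch. The class name, method names, and tile width are all hypothetical, and the sketch assumes that the global array holds primitives with no horizontal extent that must be replayed on every tile; the message itself does not say what the global array is used for.

```python
from collections import defaultdict


class TileBuckets:
    """Hypothetical in-memory primitive store, bucketed by rendering tile."""

    def __init__(self, tile_width):
        self.tile_width = tile_width       # width of one rendering tile, in pixels
        self.per_tile = defaultdict(list)  # tile index -> primitives overlapping it
        self.global_prims = []             # assumption: replayed on every tile

    def record(self, prim, x_min=None, x_max=None):
        """Store prim; with no extent given it is treated as global."""
        if x_min is None or x_max is None:
            self.global_prims.append(prim)
            return
        first = x_min // self.tile_width
        last = x_max // self.tile_width    # x_max is inclusive
        for tile in range(first, last + 1):
            self.per_tile[tile].append(prim)

    def primitives_for(self, tile):
        """Return only the primitives needed to render one tile."""
        return self.global_prims + self.per_tile.get(tile, [])


store = TileBuckets(tile_width=1000)
store.record("allocate-colour")                # no extent: goes to the global list
store.record("gene-A", x_min=100, x_max=300)   # overlaps tile 0 only
store.record("gene-B", x_min=900, x_max=2100)  # overlaps tiles 0, 1 and 2
print(store.primitives_for(1))                 # ['allocate-colour', 'gene-B']
```

The point of the layout is that rendering a tile touches only its own bucket plus the global list, rather than scanning every primitive, at the cost of storing a primitive once per tile it spans.
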

If there's consensus that this is the way to go, then more reorganization would probably be a good idea.

Regards,
Mitch

This is on a 2.2 GHz Athlon 64:

[mitch@firebolt server]$ patch < ti-prim-api.patch
patching file TiledImage.pm
[mitch@firebolt server]$ mkdir testdb
[mitch@firebolt server]$ time ./generate-tiles.pl -c ~/apache/conf/gbrowse.conf/ -o testdb/ -s yeast_chr1 -m 0 -v 1 --no-xml --print-tile-nums --render-gridlines -l I -r t3z1r1-2303 &> withdb.out

real    4m10.798s
user    2m16.425s
sys     0m6.968s
[mitch@firebolt server]$ patch < bti-memstorage.patch
patching file BatchTiledImage.pm
[mitch@firebolt server]$ mkdir testmem
[mitch@firebolt server]$ time ./generate-tiles.pl -c ~/apache/conf/gbrowse.conf/ -o testmem/ -s yeast_chr1 -m 0 -v 1 --no-xml --print-tile-nums --render-gridlines -l I -r t3z1r1-2303 &> withmem.out

real    1m23.400s
user    1m18.229s
sys     0m2.572s
[mitch@firebolt server]$ diff -r testdb testmem
[mitch@firebolt server]$
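
To check the "about a third of the time" figure against the transcript, the timings can be converted to seconds and compared. This is only a small Python sketch; the helper name `to_seconds` is made up for the illustration, and the input strings are copied from the two `time` outputs above.

```python
def to_seconds(stamp):
    """Convert a bash `time` value such as '4m10.798s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# Wall-clock ("real") times for the database run and the in-memory run.
db_real = to_seconds("4m10.798s")
mem_real = to_seconds("1m23.400s")
ratio = mem_real / db_real
print(f"{ratio:.2f}")  # 0.33

# Time not spent on the CPU: real minus (user + sys).
db_wait = db_real - to_seconds("2m16.425s") - to_seconds("0m6.968s")
mem_wait = mem_real - to_seconds("1m18.229s") - to_seconds("0m2.572s")
print(f"{db_wait:.1f} {mem_wait:.1f}")  # 107.4 2.6
```

The off-CPU time drops from roughly 107 seconds to under 3 seconds, which would be consistent with the database run spending much of its time waiting on the database, although the transcript alone does not prove where that time went. The empty output of `diff -r` indicates that the two runs produced identical output directories.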