From: Ian H. <ih...@be...> - 2006-11-26 20:12:07
Brilliant, thanks Mitch! This is very promising.

It might also be interesting to investigate (i) how many features the
in-memory approach can handle (e.g. on Drosophila, human chromosomes)
and (ii) the performance compared to the render-on-demand approach
(which fills the database with primitives, but does not render any
tiles until someone views that particular tile).

Ian

On Nov 26, 2006, at 7:38 AM, Mitch Skinner wrote:

> Hello,
>
> Reading the wiki, it looks like in-memory was the original approach,
> but it was too slow because it was rendering all the primitives for
> each tile. As I understand it, storing the primitives in a database
> buys you the ability to query for just those primitives that overlap
> the current tile.
>
> After reading Andrew's message that said that database access was the
> bottleneck, I wanted to try another take on the in-memory approach.
> Instead of storing all the primitives in one big array, I'm keeping an
> array of arrays of primitives, with one array of primitives per
> rendering tile. This version of GDRecordPrimitives adds its primitive
> to the primitive array of each of the rendering tiles that overlap
> with the primitive. There's also a separate global primitive array.
>
> On the yeast_chr1 that comes with gbrowse, rendering track 3 (named
> genes) at zoom level 1 with the in-memory patch takes about a third of
> the time that the db version does (see below). I haven't gotten chado
> going on my machine yet, so I'd be interested in seeing comparisons
> with/without the patches from anyone who wants to do one on their data
> (particularly with larger/more complex tracks).
>
> I've attached two patches (against CVS HEAD); the first one
> (ti-prim-api.patch) changes the primitive storage api in TiledImage a
> little so that I could cleanly override those functions in
> BatchTiledImage.
> Basically, it moves some work between callers and callees so that the
> in-memory version doesn't have to do the serialization work.
>
> The second patch (bti-memstorage.patch, which depends on the first
> patch) changes BatchTiledImage to override the primitive storage
> methods of TiledImage. I put this stuff in BatchTiledImage because
> BatchTiledImage is the class that knows about the rendering tile
> dimensions, which I wanted to use. I thought this was the minimally
> invasive way to do it; I wanted to make it easy to see what I was
> trying to do by reading the patch. If there's consensus that this is
> the way to go, then more reorganization would probably be a good idea.
>
> Regards,
> Mitch
>
> This is on a 2.2 GHz Athlon 64 -
>
> [mitch@firebolt server]$ patch < ti-prim-api.patch
> patching file TiledImage.pm
> [mitch@firebolt server]$ mkdir testdb
> [mitch@firebolt server]$ time ./generate-tiles.pl -c ~/apache/conf/gbrowse.conf/ -o testdb/ -s yeast_chr1 -m 0 -v 1 --no-xml --print-tile-nums --render-gridlines -l I -r t3z1r1-2303 &> withdb.out
>
> real 4m10.798s
> user 2m16.425s
> sys 0m6.968s
> [mitch@firebolt server]$ patch < bti-memstorage.patch
> patching file BatchTiledImage.pm
> [mitch@firebolt server]$ mkdir testmem
> [mitch@firebolt server]$ time ./generate-tiles.pl -c ~/apache/conf/gbrowse.conf/ -o testmem/ -s yeast_chr1 -m 0 -v 1 --no-xml --print-tile-nums --render-gridlines -l I -r t3z1r1-2303 &> withmem.out
>
> real 1m23.400s
> user 1m18.229s
> sys 0m2.572s
> [mitch@firebolt server]$ diff -r testdb testmem
> [mitch@firebolt server]$
>
> <bti-memstorage.patch>
> <ti-prim-api.patch>
> _______________________________________________
> Gmod-ajax mailing list
> Gmo...@li...
> https://lists.sourceforge.net/lists/listinfo/gmod-ajax
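
The per-tile storage described in the quoted message (one array of primitives per rendering tile, plus a separate global array) can be sketched roughly as below. This is only an illustration in Python, not the code from the attached Perl patches: the class name `TilePrimitiveIndex`, its methods, the fixed `tile_width`, and the assumption that each primitive has a horizontal pixel extent `[x_min, x_max]` are all hypothetical.

```python
from collections import defaultdict


class TilePrimitiveIndex:
    """Hypothetical in-memory store: one list of primitives per rendering tile."""

    def __init__(self, tile_width):
        if tile_width <= 0:
            raise ValueError("tile_width must be positive")
        self.tile_width = tile_width
        self.by_tile = defaultdict(list)  # tile number -> primitives overlapping it
        self.all_primitives = []          # separate global list of every primitive

    def record(self, x_min, x_max, primitive):
        """Add a primitive to every tile that its [x_min, x_max] extent overlaps."""
        if x_max < x_min:
            x_min, x_max = x_max, x_min
        self.all_primitives.append(primitive)
        first = x_min // self.tile_width
        last = x_max // self.tile_width
        for tile in range(first, last + 1):
            self.by_tile[tile].append(primitive)

    def primitives_for_tile(self, tile):
        """Return only the primitives that overlap the given tile."""
        return list(self.by_tile.get(tile, []))


# Made-up example data: three primitives on 1000-pixel-wide tiles.
index = TilePrimitiveIndex(tile_width=1000)
index.record(100, 300, "gene A")    # lies inside tile 0
index.record(900, 2100, "gene B")   # spans tiles 0, 1 and 2
index.record(2500, 2600, "gene C")  # lies inside tile 2
```

With this layout, rendering a tile only has to walk `primitives_for_tile(n)` rather than every primitive in the track, which is the same benefit the message attributes to the database query, at the cost of storing a primitive once for each tile it overlaps.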