Re: [Gmod-ajax] Out-of-memory crash

Sorry it took so long to get back to you.  I've replied inline below.

Caroline wrote:
>> It should also be possible to use SAM/BAM with JBrowse using
>> Bio::SamTools, but I haven't tested it yet.
>>     
>
> I had a go at this and it is. Patch attached (first time I've done this
> with git, give me a shout if it's not the right format). 
>   

Awesome.  The patch looks great; I've applied a slightly different 
version that makes Bio::DB::Sam optional:
http://github.com/jbrowse/jbrowse/commit/82f29deedb57051bff81f9dc311bfd80b554e8e9

It's currently just on the "lazyfeatures" branch because I don't think 
it's that useful without the other stuff on that branch.

Longer term, I'd like to try doing pulls from people's git repos, which 
will (e.g.) preserve authorship information in the git metadata.  But 
patches are also fine and I'm happy to get them.  This one worked fine 
for me.

It still uses much more memory than it should; that can be addressed but 
I think it'll take more work re-arranging the interface between 
JBrowse's JsonGenerator and its clients.

> Awesome! What's the plan for this? I'm trying to knock up a sequence
> server (perl, Catalyst) that will hand over our ChIPseq data from BAM
> files bit by bit and eventually do the same for remote BAM files, DAS
> servers and so on. You mentioned this on the mailing list ages ago, but
> this is the first chance I've had to get around to it. 
>
> Can you point me in the right direction for getting it to play nicely
> with JBrowse? What will the lazyfeatures track expect from the server?
> How does it deal with zooming - can the server just decide to only
> return a hist summary at some point? What about caching? Does the
> browser grab data in defined chunks? What else should I be worrying
> about?  

I wrote this to try and answer these questions:
http://biowiki.org/view/JBrowse/LazyFeatureLoading

The short version is: yes, there's a hist summary; currently the hist 
counts are generated at a zoom level that's hard-coded.  That's a 
terrible hack, and doing something smarter is definitely on the list.  
The client does grab data in defined chunks, and caches those.
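
To make the "defined chunks, cached" idea concrete, here is a minimal sketch in Python.  It assumes fixed-width coordinate chunks and a hypothetical fetch_chunk callback standing in for one server round-trip; neither is taken from the actual JBrowse format, so treat it as an illustration of the pattern only.

```python
from typing import Callable, Dict, List, Tuple

Feature = Tuple[int, int, str]  # (start, end, name), half-open coordinates


class ChunkCache:
    """Fetch features in fixed-size chunks and remember chunks already fetched."""

    def __init__(self, chunk_size: int,
                 fetch_chunk: Callable[[int], List[Feature]]):
        self.chunk_size = chunk_size
        # Hypothetical callback: one server request per chunk index.
        self.fetch_chunk = fetch_chunk
        self.cache: Dict[int, List[Feature]] = {}

    def features_in(self, start: int, end: int) -> List[Feature]:
        """Return features overlapping the half-open range [start, end)."""
        if end <= start:
            return []
        first = start // self.chunk_size
        last = (end - 1) // self.chunk_size
        seen = set()
        out: List[Feature] = []
        for idx in range(first, last + 1):
            if idx not in self.cache:
                # Only chunks we have never seen cost a round-trip.
                self.cache[idx] = self.fetch_chunk(idx)
            for feat in self.cache[idx]:
                # A feature spanning a chunk boundary appears in both
                # chunks, so de-duplicate before returning it.
                if feat[0] < end and feat[1] > start and feat not in seen:
                    seen.add(feat)
                    out.append(feat)
        return out
```

Scrolling back over a region that has already been viewed then costs no further requests, because every chunk it touches is already in the cache.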

After I had implemented lazy loading in JBrowse, I found out about the 
lazy/partial loading work that Heng Li has done for BAM and that Jim 
Kent has done for his BigBed/BigWig format.  There was a big thread on 
samtools-devel about it:
http://sourceforge.net/mailarchive/forum.php?thread_name=6dce9a0b0911150626o701e07baq2c97c4135e5ffda9%40mail.gmail.com&forum_name=samtools-devel

There are a few messages from me in there that try to compare the 
JBrowse approach to the BAM and BigBed approaches.

In the end, each of us came up with something different: Heng Li is 
using binning, Jim Kent is using R-trees, and I'm using NCLists.  I 
don't think we can directly adopt either of the other two solutions for 
JBrowse, because they're doing a lot of bit-twiddling that I think would 
be hard to do in a web browser (I'm happy to have someone prove me wrong 
though, and I'd be happy to talk about it in more detail if people are 
interested).  So my next thought was to wonder if I (or someone) could 
write a proxy that could act as a BAM/BigBed client and then serve JSON 
to JBrowse.  I think it could be done but it's not 100% clear in my head 
how to do it.  I'd be happy to talk about what I've been thinking so far 
if you're interested in tackling this.
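
For anyone who hasn't run into NCLists (nested containment lists), here is a minimal sketch of the general idea in Python.  It is not the JBrowse code or its JSON layout; the Node class and the function names are made up for illustration.  Intervals that are contained in another interval become its children, so within any one list both starts and ends are sorted, and a query can binary-search to the first candidate and then scan forward.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple


@dataclass
class Node:
    start: int
    end: int          # half-open: the interval is [start, end)
    name: str
    children: List["Node"] = field(default_factory=list)


def build_nclist(intervals: Iterable[Tuple[int, int, str]]) -> List[Node]:
    """Nest each interval under the most recent interval that contains it."""
    top: List[Node] = []
    stack: List[Node] = []
    # Sort by start ascending, end descending, so containers come first.
    for start, end, name in sorted(intervals, key=lambda iv: (iv[0], -iv[1])):
        node = Node(start, end, name)
        # Drop stack entries that end before this interval does;
        # they cannot contain it.
        while stack and stack[-1].end < end:
            stack.pop()
        (stack[-1].children if stack else top).append(node)
        stack.append(node)
    return top


def query(nodes: List[Node], qstart: int, qend: int) -> List[str]:
    """Names of all intervals overlapping [qstart, qend)."""
    # Binary search for the first sibling whose end is past qstart.
    lo, hi = 0, len(nodes)
    while lo < hi:
        mid = (lo + hi) // 2
        if nodes[mid].end > qstart:
            hi = mid
        else:
            lo = mid + 1
    hits: List[str] = []
    i = lo
    # Scan forward while siblings still start before the query ends.
    while i < len(nodes) and nodes[i].start < qend:
        hits.append(nodes[i].name)
        # Only an overlapping parent can have overlapping children.
        hits.extend(query(nodes[i].children, qstart, qend))
        i += 1
    return hits
```

The query needs nothing beyond comparisons and array indexing, which is the kind of work that is easy to do on JSON in a web browser.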

Earlier this year, I said that I didn't want to make the JBrowse JSON 
format a public thing because I wanted to be able to change it at will.  
Thinking about it some more, there are some aspects of the format that I 
think are pretty solid, and some other parts that are pretty likely to 
change.  It might be possible to split out the likely-to-change bits 
from the unlikely-to-change bits; earlier I was worried about splitting 
things up too much and ending up with too many server round-trips, but 
maybe not.  I'll write up a description of what's in there now and then 
we could talk about where to go from there.

Thanks for the patch,
Mitch