Re: [Gmod-ajax] next generation sequencing visualization

Hmmm, I think you can easily construct situations where people might 
want to eyeball reads at the basepair level, including insertions 
(which, fwiw, I think can be displayed a little more easily than your 
email suggests, Mitch -- e.g. as popups).

Technically I think this comes down to a volume-of-data issue. Point 
being that you can already visualize short reads in aggregate, by 
generating a WIG plot of read density (easy) or by generating your own 
image track (almost as easy).
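
A minimal sketch of the read-density route, assuming reads are already 
available as 0-based, half-open (start, end) intervals on one reference 
sequence. The function names and the bin_size parameter are hypothetical; 
only the fixedStep header layout comes from the WIG format itself.

```python
from typing import Iterable, List, Tuple


def read_density(reads: Iterable[Tuple[int, int]],
                 seq_length: int,
                 bin_size: int) -> List[int]:
    """Count the reads overlapping each fixed-width bin.

    Reads are assumed to be 0-based, half-open (start, end) intervals.
    """
    n_bins = (seq_length + bin_size - 1) // bin_size
    counts = [0] * n_bins
    for start, end in reads:
        if end <= start:
            continue  # skip empty or malformed intervals
        first = max(start, 0) // bin_size
        last = (min(end, seq_length) - 1) // bin_size
        # range() is empty when the read lies outside the sequence
        for b in range(first, last + 1):
            counts[b] += 1
    return counts


def to_wig(chrom: str, counts: List[int], bin_size: int) -> str:
    """Render bin counts as a fixedStep WIG block (WIG is 1-based)."""
    header = f"fixedStep chrom={chrom} start=1 step={bin_size} span={bin_size}"
    return "\n".join([header] + [str(c) for c in counts]) + "\n"
```

Note that the last bin may extend past the end of the sequence when 
seq_length is not a multiple of bin_size; a stricter tool may want that 
bin trimmed.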

The only thing you currently cannot do is load a genome's worth of short 
reads into your web browser (nor would you want to do this). So, at the 
level of core tech, this comes down to how you deal with annotation 
tracks containing millions of features. The obvious answer being that 
you load them incrementally (e.g. in chunks [as we currently handle 
sequence] or by CGI range queries).
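
To make the incremental idea concrete, here is a rough sketch of 
chunked loading with a cache. It is not how JBrowse does or will do it: 
ChunkedTrack, chunk_ids and the fetch_chunk callback are all hypothetical 
names, and the callback stands in for whatever actually retrieves a chunk 
(a static file, a CGI range query, etc.). It assumes a feature is stored 
in every chunk it overlaps.

```python
from typing import Callable, Dict, List, Tuple

Feature = Tuple[int, int, str]  # (start, end, name), 0-based half-open


def chunk_ids(start: int, end: int, chunk_size: int) -> List[int]:
    """Indices of the fixed-width chunks overlapping [start, end)."""
    if end <= start:
        return []
    return list(range(start // chunk_size, (end - 1) // chunk_size + 1))


class ChunkedTrack:
    """Fetch feature chunks only when a requested range first needs them."""

    def __init__(self, fetch_chunk: Callable[[int], List[Feature]],
                 chunk_size: int) -> None:
        self.fetch_chunk = fetch_chunk  # hypothetical loader callback
        self.chunk_size = chunk_size
        self.cache: Dict[int, List[Feature]] = {}

    def features_in(self, start: int, end: int) -> List[Feature]:
        seen = set()
        out: List[Feature] = []
        for cid in chunk_ids(start, end, self.chunk_size):
            if cid not in self.cache:
                self.cache[cid] = self.fetch_chunk(cid)
            for feat in self.cache[cid]:
                overlaps = feat[0] < end and feat[1] > start
                # a feature spanning a chunk boundary appears in both chunks
                if overlaps and feat not in seen:
                    seen.add(feat)
                    out.append(feat)
        return out
```

Scrolling within already-visited chunks then costs no further requests; 
only newly exposed chunks trigger a fetch.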

As an open source, developer-friendly project, we should be encouraging 
people (as a first resort) to make maximal use of the APIs and parts 
that we've already provided. That API should be extended only when it 
simply fails to meet a significant (empirical) demand.

So I think that I'd essentially agree with what Mitch said. Consider 
first what you can do using an image track (it might go further than you 
think -- e.g. you could display SNPs using a sequence logo) and whether 
it is at all possible that you could implement this yourself (obviously, 
with help from us).
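
As an illustration of the sequence-logo idea, a small sketch that turns 
the bases observed at one alignment column into letter heights, which 
could then be drawn into an image track. The function name is 
hypothetical, and it uses the plain information-content formula 
(2 bits minus the column entropy) with no small-sample correction.

```python
import math
from collections import Counter
from typing import Dict


def logo_heights(column: str) -> Dict[str, float]:
    """Letter heights in bits for the bases piled up at one position."""
    counts = Counter(b for b in column.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return {}  # no usable base calls at this position
    freqs = {b: n / total for b, n in counts.items()}
    entropy = -sum(f * math.log2(f) for f in freqs.values())
    info = 2.0 - entropy  # maximum is 2 bits for a 4-letter alphabet
    return {b: f * info for b, f in freqs.items()}
```

A position where all reads agree gives one letter at the full 2 bits, 
while an even two-way split gives two letters of 0.5 bits each.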

At some point we will implement partial loading extensions that will 
allow you to eyeball high-volume feature tracks. But this will happen 
faster if you can demonstrate that you have already pushed back to your 
users with simpler (image-based) alternatives and they are, 
nevertheless, in need of a high-volume solution!

BTW, Sean Eddy has a discussion thread on next-gen sequencing challenges:
http://selab.janelia.org/people/eddys/blog/?p=86

Ian

Mitch Skinner wrote:
> Steve Taylor wrote:
>> Yes...but we really need a decent alignment viewer at the bp level to 
>> see SNPs etc. Can GBrowse display alignments in the panel?
>>   
> 
> The volume of data is large, right?  So why would someone want to 
> eyeball it?  Won't people be running programs to identify SNPs, rather 
> than trying to do it manually?
> 
> I worked with biologists for several years, so I know how much they like 
> to eyeball things.  But if the data volume is large, IMHO it's important 
> to push back and advocate automated analysis instead.  I'd hate to do a 
> lot of work only to find that after the initial burst of enthusiasm no 
> one used it.
> 
> Currently, there's an assumption built fairly widely into JBrowse (and 
> all other genome browsers as far as I know), which is that the 
> coordinate system defined by the reference sequence doesn't change on 
> the fly.  So it'll take a fair chunk of work to be able to show 
> insertions from resequencing.
> 
> On the other hand, if you're talking about viewing just a small region, 
> and you want to view it in alignment coordinates, and all of your data 
> is in alignment coordinates, then the JBrowse part of the work should be 
> easy to do.  We've talked about displaying per-base data (like sequence, 
> or a predicted RNA fold) in features; it's not implemented but it should 
> be straightforward to do.
> 
> Mitch