From: Caroline <Car...@kc...> - 2009-07-08 16:51:51
Hi,

I've just had a quick look at JBrowse - am I right in thinking that it expects JSON-formatted feature data? Is there a definition of the data format anywhere?

We are generating lots of ChIP-seq data and we want to visualise it alongside other people's data and standard annotation tracks. I was thinking of storing our data (aligned reads and called peaks) in, say, BAM or BioHDF, and wrapping it in a REST API to which you could make GET requests like http://whatever.com/experimentname/chr/start/end and get back appropriately formatted feature data.

How easy would it be to get JBrowse to let users register a feature server like this as a track, and have it call the server for the feature data as required? Does this even seem like a sensible approach?

Cheers,
Cass
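The feature-server idea above could be sketched roughly as follows. This is only an illustration of the proposed URL scheme, not anything JBrowse provides: the experiment name, the in-memory feature list, and the JSON shape are all made up for the example (a real server would query a BAM/BioHDF backend).

```python
import json
import re

# Hypothetical feature store: (chrom, start, end, name) tuples for one experiment.
# In practice these would come from a BAM or BioHDF backend, not a list.
FEATURES = [
    ("chr1", 100, 250, "peak_1"),
    ("chr1", 400, 900, "peak_2"),
    ("chr2", 50, 120, "peak_3"),
]

# Matches the proposed /experimentname/chr/start/end URL scheme.
PATH_RE = re.compile(r"^/(?P<exp>[^/]+)/(?P<chr>[^/]+)/(?P<start>\d+)/(?P<end>\d+)$")

def handle_get(path):
    """Answer a GET like /experimentname/chr/start/end with JSON feature data."""
    m = PATH_RE.match(path)
    if m is None:
        return 404, "not found"
    start, end = int(m.group("start")), int(m.group("end"))
    hits = [
        {"chr": c, "start": s, "end": e, "name": n}
        for (c, s, e, n) in FEATURES
        if c == m.group("chr") and s < end and e > start  # interval overlap
    ]
    return 200, json.dumps({"features": hits})

status, body = handle_get("/chipseq1/chr1/0/500")
```

Any HTTP framework could sit in front of a handler like this; the interesting part is just the range query and the JSON payload.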
From: Mitch S. <mit...@be...> - 2009-07-08 19:34:51
On 07/08/2009 09:24 AM, Caroline wrote:
> I've just had a quick look at JBrowse - am I right in thinking that it
> expects JSON-formatted feature data? Is there a definition of the data
> format anywhere?

The use case of including external data in JBrowse is an important one, and I'm very keen on making it work, but there are a bunch of caveats.

You're right that the javascript client does expect JSON-formatted feature data, but I've been treating the JSON format as an internal implementation detail rather than a public interface. As JBrowse development has progressed, I've been changing the specifics of what's in the JSON, and I'd like to continue to have the freedom to do that for at least a little while longer. The public interface for getting data into JBrowse is an implementation of the bioperl Bio::DasI interface (like Bio::DB::GFF, Bio::DB::SeqFeature::Store, or Bio::DB::Das::Chado) or flatfiles (GFF or BED, currently). JBrowse includes perl scripts for generating the JSON from those sources.

If I'm understanding you correctly, you're suggesting that the JBrowse javascript client could get data directly from outside servers. It won't work exactly that way, though, because a javascript client can't get data from more than one server (this is a security restriction imposed by web browsers). We could theoretically get around that restriction if the data sources each served some javascript glue code (the "mashup" approach), but that means we'd have to completely trust each data source, and that makes me uncomfortable. ("Mashups are self-inflicted cross-site-scripting attacks" is the way Doug Crockford puts it.) The mashup approach might work okay for plain JBrowse as it is right now, but if someone embedded JBrowse in an application that involved logging in, and allowed arbitrary external URLs, then that approach would be a huge security hole. Instead, the web browser should (in my opinion) be getting everything from a single server.
That server might get data from other servers, though (some people distinguish "mashup in the browser" from "mashup in the server"; this would be the latter). So I've been thinking about writing a proxy to convert external URL sources (with, say, DAS or flatfiles) into JBrowse's JSON. The proxy would run on a JBrowse server, get data from an external URL, convert it, and then serve it to the client. It should be pretty straightforward to glue an external-URL front end to JBrowse's JSON-generating code.

The JSON-generating code expects a chromosome's worth of data at a time, though, so that approach might not scale well to very large feature sets. How many features (per chromosome) would be in a given track for you? On the other hand, large feature sets might be okay as long as the server serving them (and our hypothetical external-URL front end) supports HTTP caching (conditional GETs) properly. Then the conversion work would only have to be done when the data changes. Still, the JSON-generating code is implemented using bioperl, so there would still be some scalability issues. In that case, an implementation in another language is certainly a possibility.

I'm probably not going to be able to get to that for a month or two, though (maybe more). If someone else wants to tackle it, I'd be happy to help and answer questions. If you want something today, you could just run your own JBrowse server and aggregate other people's data and the standard annotation tracks yourself. If you do that, you might also be able to use Ian's TWiki plugin to allow other people to upload their own data; I'm not sure if he considers it a proof-of-concept or production-ready.

Regards,
Mitch
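The proxy-plus-conditional-GET idea described above, fetch from an external URL, convert, serve, and only re-convert when the upstream data changes, could be sketched like this. Everything here is hypothetical: the upstream URL, the ETag handling via a stub instead of a real HTTP client, and the toy "conversion" step standing in for JBrowse's actual bioperl JSON generation.

```python
import json

# Hypothetical cache: external URL -> (ETag we last saw, converted JSON we serve).
cache = {}

def fetch_upstream(url, etag):
    """Stand-in for a conditional GET (If-None-Match) against the data source.
    Returns (status, new_etag, body); 304 means our cached copy is still fresh."""
    # A real implementation would use an HTTP client here.
    upstream = {"http://example.org/das/features": ('"v1"', "chr1\t100\t250\tpeak_1")}
    cur_etag, body = upstream[url]
    if etag == cur_etag:
        return 304, cur_etag, None
    return 200, cur_etag, body

def convert_to_jbrowse_json(flat):
    """Stand-in for the real GFF/BED/DAS -> JBrowse-JSON conversion."""
    chrom, start, end, name = flat.split("\t")
    return json.dumps([{"chr": chrom, "start": int(start),
                        "end": int(end), "name": name}])

def proxy(url):
    """Serve converted JSON, re-converting only when upstream reports a change."""
    etag, converted = cache.get(url, (None, None))
    status, new_etag, body = fetch_upstream(url, etag)
    if status == 304:
        return converted  # upstream unchanged: skip the expensive conversion
    converted = convert_to_jbrowse_json(body)
    cache[url] = (new_etag, converted)
    return converted
```

The point of the design is the 304 branch: as long as the external server supports conditional GETs, the (potentially slow) conversion runs only when the data actually changes.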
From: Ian H. <ih...@be...> - 2009-07-08 20:01:06
Mitch Skinner wrote:
> If you want something today, you could just run your own JBrowse server
> and aggregate other people's data and the standard annotation tracks
> yourself. If you do that, you might also be able to use Ian's TWiki
> plugin to allow other people to upload their own data; I'm not sure if
> he considers it a proof-of-concept or production-ready.

The TWiki plugin should work, but it requires some knowledge of how to configure & administer a TWiki system, and also some (hopefully minimal) education of users in how to operate a TWiki in general, and this plugin in particular.

You should be aware that the user interface for the wiki plugin is less intuitive than JBrowse (it looks like a genome browser embedded in a wiki, so there is a bit of clutter). Also, validation of user input is almost nonexistent (also true of the command-line scripts, but more of a problem in a user-facing system). In these ways it's a proof-of-concept system, but it should also work. See the following ticket regarding input format validation:

http://jbrowse.lighthouseapp.com/projects/23792-jbrowse/tickets/28

As Mitch noted, the currently-supported interface to JBrowse is the server scripts; we hope to document the JSON before too long, so that people can extend JBrowse themselves, but we'd like to retain some flexibility for now.