Re: [Gmod-ajax] Openlazlo

yes, i think that's a fair summary of the consensus. i favor a mixture 
of divs (for feature tracks, especially those uploaded by general users) 
and images (for WIG tracks, and potentially for reference annotations 
that have been curated or are otherwise more official/trusted).

there are a few reasons why I believe we wouldn't want to drop to 
10-pixel resolution for histograms:

the main one is that i think you want the browser to be as high-res as 
possible. citing technological limitations of javascript as a reason to 
res down by a factor of 10 will not, i think, cut it with most people. 
the possibility of zooming in does not make up for this. (you can zoom 
in on a lot of applications, e.g. Photoshop or even web browsers, but 
that wouldn't really make up for a pixelated image)

you want the high-res display for all sorts of reasons. for example you 
might want to see whether rising/falling edges in the quantitative data 
line up with start/endpoints of features. you certainly wouldn't want 
feature co-ords to be rounded to the nearest 10 pixels & so there is 
also a consistency argument here (e.g. you will be able to pick out 
introns from a pixel-accurate feature view, but they'll be lost in a 
nearest-10-pixel histogram view)
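
to make the consistency point concrete, here's a toy sketch in Python. the exon coordinates, the 1 bp-per-pixel scale and the `snap` / `intron_width` helpers are all made up for illustration; none of this is real browser code.

```python
def snap(coord, step):
    """Round a coordinate to the nearest multiple of step (integer arithmetic)."""
    return (coord + step // 2) // step * step


def intron_width(exons, step=1):
    """Gap in pixels between the first exon's end and the second exon's start,
    after snapping both coordinates to the given pixel step."""
    (_, end1), (start2, _) = exons
    return snap(start2, step) - snap(end1, step)


# hypothetical gene: two exons separated by a 5 bp intron, drawn at 1 bp per pixel
exons = [(100, 148), (153, 200)]

print(intron_width(exons))      # pixel-accurate view
print(intron_width(exons, 10))  # coordinates snapped to the nearest 10 pixels
```

in the pixel-accurate case the gap is 5 pixels wide; after snapping, both exon boundaries land on 150 and the gap is 0, i.e. the intron is gone.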

or, the difference between 10 pixels and 1 pixel might be enough to stop 
you resolving two peaks that are close together. at 10 pixels they would 
look like a single peak. when you res down you're shoving everything 
through a low-pass filter.
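
a minimal sketch of the two-peaks case, again in Python. the signal is invented, and simple bin averaging is just my assumption about how a 10-pixel histogram would be computed; `downsample` and `count_peaks` are hypothetical helper names.

```python
def downsample(signal, bin_width):
    """Average consecutive runs of bin_width samples (a crude low-pass filter)."""
    bins = [signal[i:i + bin_width] for i in range(0, len(signal), bin_width)]
    return [sum(b) / len(b) for b in bins]


def count_peaks(values):
    """Count local maxima, treating the signal as bordered by zeros.
    A flat-topped plateau counts as a single peak."""
    padded = [0] + list(values) + [0]
    return sum(
        1
        for i in range(1, len(padded) - 1)
        if padded[i] > padded[i - 1] and padded[i] >= padded[i + 1]
    )


# invented data: 20 pixels wide, two sharp peaks 4 pixels apart
signal = [0] * 20
signal[3] = 10
signal[7] = 10

print(count_peaks(signal))                  # 1-pixel resolution
print(count_peaks(downsample(signal, 10)))  # 10-pixel bars
```

at 1-pixel resolution the two peaks are counted separately; at 10 pixels they fall into the same bar and show up as one low bump.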

yes, you are always losing *some* information like this at any zoom 
level (except the closest), and you *can* always zoom in, but are you 
going to do that at every position? scroll, zoom in, zoom back out, 
scroll again?

i think this goes against the way that people use browsers. they try to 
get as much information on the page as possible. this is just an 
empirical observation (albeit anecdotal): people like big displays that 
they can then sit back & gaze at. there is already a tension between 
wanting to display as much genomic context as possible (on the one 
hand), and being limited by the finite screen resolution (on the other).

aside from the dynamic view, there's also the static. people like to 
save or print out the browser view, for use in presentations, posters 
or talks. More or less every talk at the recent CSHL Biology of Genomes 
meeting (for example) had some kind of histogram, and as Lincoln pointed 
out, they were all high-res. This is one of the big (possibly 
under-appreciated) uses of the UCSC browser: people like to print the 
image. I doubt that they'd do that if the WIG tracks were all chunky.

I feel that we have to pay attention to the way that people actually use 
genome browsers, and be careful about saying "oh, you won't need that 
functionality, you can just do XYZ instead". while that may sometimes be 
true, i think that the burden of proof is on the side of novelty, where 
we have less direct experience of how people would take to the paradigm.

BTW, I think Edward Tufte (author of the classic "The Visual Display of 
Quantitative Information", 1983) would probably agree with me on this. 
From http://www.csiss.org/classics/content/44:

"Finally, Tufte was a proponent of high data density. He defined it as 
the ratio of the number of entries in a data matrix to the area of the 
data graphic. Of course, such a ratio is often hard to quantify, but his 
point was clear: don't waste a large graphic on a small amount of 
information. If there are only a couple data entries, a table within the 
text makes more sense than creating a histogram with only a handful of 
bars. One example of a graphic using a high density of data showed the 
amount of sunny versus clouded periods for each day throughout the year 
at a given location. Another featured a pollution map that was repeated 
multiple times to create a form of animation over the course of a day."

Of course, Tufte wrote his book on the cusp of computer apps that allow 
you to zoom in, but I think that this is the whole crux. It's far from 
obvious to me that people will trade a rich visual display for a worse 
display with a better zoom dial. Especially when we can give them both 
the rich display AND the zoom dial.

-Ian

Mitch Skinner wrote:
> On Mon, 2007-06-04 at 10:15 +0100, Steve Taylor wrote:
>> There are an increasing number of biological techniques that are
>> producing genome wide quantitative data that require graph like
>> visualisation, such as array intensities and ChIP-on-chip. While 
>> graphs/plots *can* be broken down into rectangles, how do you think the div
>> method will cope with the increased amounts of elements required?
> 
> This is a really good point that Ian and I have been talking about a
> lot.  The consensus is that there's some density threshold above which
> you'd have to use images.
> 
> Images fit well with that kind of dense quantitative data because:
> 1. there's no layout (bumping)
> 2. they can be very dense
> 3. As far as I know, there's no need to click on individual data points,
> the way you'd click on individual features (and therefore no need for an
> imagemap).  One of the nice things about using divs for feature tracks
> is that a div's visual span is the same as its region of clickability,
> so there's no need to specify that separately with something like an
> imagemap.  But if there's no clicking then it doesn't matter.
> 
> Not needing layout (or text labels) for that kind of data makes it
> easier to render images on demand, since there's no need to worry about
> distant elements affecting the image tile that you're rendering.
> 
> Not having any text (or anything aspect-ratio dependent) in the image
> tile also makes it possible to scale images on the client without making
> them look odd.  We could just horizontally stretch out the
> 1-base-per-pixel image for higher zoom levels, which (if you were
> pre-rendering) would save you a lot of disk space and processing time.
> Or, if you were rendering on demand, some bandwidth and server CPU.
> 
> Also, I can't think of any reason you'd want to edit that kind of data
> the way you might want to edit a feature.
> 
> Point being, feature tracks and measurement-type tracks are pretty
> different animals IMO, and while divs are a good solution for feature
> tracks, images are probably the right way to go for dense measurement
> tracks.  So I think we'd want to have both image-based tracks and
> div-based tracks side-by-side.  AFAICT this shouldn't be too hard.
> 
> 
> That's the consensus as I see it; personally I think it's fine to
> visualize ChIP/chip data with a bar graph where each bar is (say) 10
> pixels wide, as long as it's easy to zoom in to see more detail.
> Everyone I've talked to seems to think that's crazy, although no-one
> seems terribly bothered by the fact that at low zoom levels (say, a
> whole chromosome) even 1-pixel granular plots will still be averaging
> over a large area for each visible data point.  What I'd like to see
> (regardless of the image/div decision) are plots that show you not only
> a mean but also a max and a min (or a standard deviation, or something)
> for each plot point where averaging has been done.  Explicitly
> acknowledge the fact that the visualization is coarse, in other words.
> Then coarse visualizations become less scary.
> 
> That's all just my opinion, though.  I'm happy to implement whatever
> image-based stuff people think is important.  I may be somewhat biased
> about the div thing.
> 
> Mitch
> 
> 