RE: [matplotlib-devel] image module questions

John Hunter writes:

> Sorry for the confusion.  I meant that I was considering using
> antigrain to load/store/scale/convert/process the pixel buffers in an
> image module in the same way that PIL could be used, and that this has
> nothing per se to do with backend_agg.  By "backends", I meant the
> same old backends we already have, backend_gd, backend_wx, backend_ps,
> etc..., each of which would implement a new method draw_image to
> render the image object to its respective canvases/displays.  Whether
> this image object is PIL based or Agg based under the hood is an open
> question.
> 
> I hope this clarifies rather than muddies the waters....
> 
OK, I understand what you meant.

>     Perry> But initially, reading images into arrays seems like the
>     Perry> most flexible and straightforward thing to do.
> 
> Agreed - I like the idea of making the user deal with endianness, etc,
> in loading the image data into arrays, and passing those to image
> module.  Todd, is it reasonably straightforward to write extension
> code that is Numeric/numarray neutral for rank 2 and rank 3 arrays?
>
Todd and I just talked about this. There are two possible approaches,
one of which should work and one which we would have to think about.
The simpler approach is to write a C extension using the Numeric API.
As long as a small, rarely used subset of the Numeric API is avoided
(the UFunc API and some type conversion stuff in the type descriptor),
numarray can use the same code with only a change to the include file
(which could be handled by an #ifdef). The same C extension code could
then be used with either Numeric or numarray. The catch is that it must
be compiled for one or the other. I was thinking that the way around
that would be to do two things. First, have setup.py build for Numeric,
numarray, *or* both, depending on what it finds installed on the system.
The respective C extension modules would have different names (e.g.,
_image_Numeric.so/dll or _image_numarray.so/dll). Second, have a wrapper
module that uses numerix to determine which of these C extensions to
import. This is a bit clumsy (having to layer a module over it a la
numerix), but it means having only one C source file to handle both.
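
A rough sketch of what that wrapper could look like, assuming the two
extension names above and a hypothetical `preferred` argument standing
in for whatever numerix reports as the active array package. The
function name and signature are made up for illustration; this is not
existing matplotlib code.

```python
import importlib

# Hypothetical extension names, following the naming suggested above.
CANDIDATES = ("_image_numarray", "_image_Numeric")


def load_image_extension(preferred=None, candidates=CANDIDATES):
    """Return the first importable module from `candidates`.

    `preferred` (e.g. the name picked by the numerix setting) is tried
    first when it is one of the candidates.
    """
    names = list(candidates)
    if preferred in names:
        names.remove(preferred)
        names.insert(0, preferred)
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # not built for this array package; try the next one
    raise ImportError("no image extension found; tried: " + ", ".join(names))
```

The rest of the image module would then call into whatever module this
returns, without caring which array package it was compiled against.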

It might be possible for us to fiddle with numarray's structure
definition so that the same compiled C code works with both Numeric
and numarray arrays, but since the API functions that create arrays
will produce one type or the other, I'm not sure this is workable.
We will give it some thought.

My inclination is to use the first approach as a conservative but
workable solution.

>     Perry> These arrays are passed to matplotlib rendering methods or
>     Perry> functions and the dimensionality will tell the rendering
>     Perry> engine how to interpret it. The question is how much the
>     Perry> engine needs to know about the depth and representation of
>     Perry> the display buffer and how much of these details are
>     Perry> handled by agg (or other backends)
> 
> One thing that the rest of matplotlib tries to do is insulate as much
> complexity from the backends as possible.  For example, the backends
> only know one coordinate system (display) and the various figure
> objects transform themselves before requesting, for example,
> draw_lines or draw_text.  Likewise, the backends don't know about
> Line2D objects or Rectangle objects; they only know how to draw a line
> from x1, y1 to x2, y2, etc...
> 
> This suggests doing as much work as possible in the image module
> itself.  For example, if the image module converted all image data to
> an RGBA array of floats, this would be totally general and the
> backends would only have to worry about one thing vis-a-vis images:
> dumping rgba floats to the canvas.  Nothing about byte-order, RGB
> versus BGR, grayscale, colormaps and so on.  Most or all of these
> things could be supported in the image module: the image module scales
> the image, handles the interpolation, and converts all pixel formats
> to an array of rgba floats.  Then the backend takes over and renders.
> An array of RGBA UInt16s would probably suffice for just about
> everything, however.
> 
> The obvious potential downside here is performance and memory.  You
> might be in a situation where the user passes in UInt8, the image
> module converts to floats, and the backend converts back to UInt8 to
> pass to display.  Those of you who deal with image data a lot: to what
> extent is this a big concern?  My first reaction is that on a
> reasonably modern PC, even largish images could be handled with
> reasonable speed.
> 
I agree that this is the drawback. On the other hand, processor
memory has grown much faster than image display memory. With
a full-screen image of 1024x1280 pixels, we are only talking about
5MB or so for a 32-bit deep display. Converting it all to rgba floats
means 20MB, which, while large, is not overwhelming these days, and that
is the worst case for interactive use. I suppose it would be a bigger
concern for non-interactive situations (e.g. PDF). But it would seem
that by taking this simpler approach, we would have something that
works sooner, and then we could think about how to handle cases where
someone wants to generate a 4Kx4K PDF image (which is going to be
one big PDF file!). Let's not let extreme cases like this
drive the first implementation.
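
For reference, the arithmetic behind those numbers, assuming 4-byte
floats, four channels (rgba), and reading "4Kx4K" as 4096x4096. The
helper name is just for illustration.

```python
def buffer_bytes(width, height, channels, bytes_per_channel):
    """Size in bytes of an uncompressed pixel buffer."""
    return width * height * channels * bytes_per_channel


MIB = 1024 * 1024

display = buffer_bytes(1280, 1024, 4, 1)    # 32-bit deep display buffer
as_floats = buffer_bytes(1280, 1024, 4, 4)  # same image as rgba 4-byte floats
big = buffer_bytes(4096, 4096, 4, 4)        # assumed size of the 4Kx4K case

print(display / MIB, as_floats / MIB, big / MIB)  # 5.0 20.0 256.0
```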

> On the subject of PIL versus agg for the workhorse under the image
> module hood: I agree with the positive points about PIL that
> you and Andrew brought up (stability, portability, wide user base,
> relevant functionality already implemented).  In favor of agg I would
> add
> 
Well, I was just referring to the image file support that PIL supplies;
the rest of PIL seems mostly redundant with matplotlib and array
capabilities (if I remember right; I haven't really used PIL much).
How much image file format support is built into agg?

Perry