From: Michael D. <md...@st...> - 2007-09-12 14:04:22
|
[Background: I'm working on refactoring the transforms framework with the end goal of making it easier to add new kinds of non-linear transforms and projections to matplotlib. I've been talking a bit with John Hunter about this -- this question is mainly for John and Ken McIvor, though there are probably some other interested parties on this list as well.]

I've studied John's mpl1.py and Ken's mpl1_displaypdf.py to try to get a sense of where things could go. I appreciate the ideas both of these present as clean slates -- however, I think what I'm running into is "how to get there from here" in manageable steps.

My first baby step in this large task has been to try to remove transforms.py/.cpp and replace it with something based on standard 3x3 affine matrices, using Python/numpy only. The way transforms.py/.cpp works now, everything is built around live updates of a tree of interdependent bounding boxes and transforms, where a change to a single scalar in any object automatically propagates through the tree.

My first thought was to make something out of immutable transforms, where a transform would have to be calculated from its dependencies immediately before drawing, and therefore get rid of these "magical" side-effects by not allowing transforms to change in place. Reading between the lines, this seems to be what mpl1_displaypdf.py suggests. I quickly came to the conclusion that that is perhaps a step too far -- matplotlib is very much built around these side-effects, and I would hate to replace hundreds of lines of well-tested code. On the other hand, there is probably a pattern to those changes, and it may be worth the effort if others agree it's useful.

My second kick at the can was to build a live-updating tree of transforms. This is similar to what I saw in mpl1.py, using "changed" callbacks so that a change in one transform would affect all transforms that depend on it.
[I worry about a pure callback approach because of the likelihood of computing many partial values. For example, if 'a' depends on 'b' and 'c', and I change 'b' then 'c', 'a' will get recomputed twice. Instead, I used an "invalidation" technique, where a change in 'b' simply invalidates 'a', and 'a' doesn't get recomputed until it is later requested. This is something we used a lot when I programmed for gaming hardware. The resulting semantics are very similar to using callbacks, however.]

This approach got closer, until I hit the wall that dependencies work at an even lower level -- single lazy values get borrowed from one bounding box and referenced in another (e.g. Axes.autoscale_view()). Certainly, this could be implemented in my new affine-based framework, but then we're almost back to square one and have basically re-implemented transforms.py/.cpp as something that is probably slower -- though perhaps more flexible, in that more kinds of transforms could be added using only Python. Of course, autoscale_view() (and other instances of this pattern) could be rewritten to work differently, but it's hard to know where that might end.

(You can see my semi-working sketch of this here: http://matplotlib.svn.sourceforge.net/viewvc/matplotlib/branches/transforms/lib/matplotlib/affine.py?revision=3835&view=markup -- if you check out r3835 from my branch, simple_plot.py is working, with the exception of things that rely on this really low-level interdependence, e.g. the data limits.)

So, I feel like I'm going in a bit of a circle here, and I might need a reality check. I thought I'd better check in and see where you guys (who've thought about this a lot longer than I have) see this going. A statement of objectives for this part of the task would be helpful (e.g. what's the biggest problem with how transforms work now, and what model would be a better fit). John, I know you've mentioned some to me before, e.g.
the LazyValue concept is quirky and relies on C, and the PDF stateful transforms model is close but not quite what we need, etc. I feel I have a better sense of the overall code structure now, but you guys may have a better "gut" sense of what will fit best.

My next planned step -- moving more (affine) transformations to the backends, to allow the same path data to be transformed in the backend without retransmitting/converting the path data each time -- doesn't actually seem to depend on getting the above done. (The existing transforms.cpp code already has a way to get a representation as an affine matrix.) John, I have a note from our phone conversation indicating you thought these two things would be dependent, but I don't remember the reason you gave -- could you maybe refresh my memory? I was much less aware of the code structure then.

Sorry if this is sort of an open-ended question... Any pointers or impressions, no matter how small, are appreciated.

Cheers,
Mike

--
Michael Droettboom
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA |
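The "invalidation" technique Mike describes -- a change marks dependents stale, and nothing is recomputed until a value is actually requested -- can be sketched in a few lines. All class and method names below are illustrative, not taken from the branch:

```python
class LazyNode:
    """A value computed on demand from parent nodes."""

    def __init__(self, compute, *parents):
        self._compute = compute          # callable taking the parent values
        self._parents = parents
        self._children = []
        self._value = None
        self._valid = False
        for parent in parents:
            parent._children.append(self)

    def invalidate(self):
        # Mark this node and everything downstream stale; no math happens here.
        if self._valid:
            self._valid = False
            for child in self._children:
                child.invalidate()

    def get(self):
        # Recompute lazily, at most once per invalidation.
        if not self._valid:
            self._value = self._compute(*[p.get() for p in self._parents])
            self._valid = True
        return self._value


class Scalar(LazyNode):
    """A leaf value that can be set directly."""

    def __init__(self, value):
        LazyNode.__init__(self, None)    # leaf: compute is never called
        self._value = value
        self._valid = True

    def set(self, value):
        self._value = value
        for child in self._children:
            child.invalidate()
```

With this, changing 'b' then 'c' only marks 'a' stale; 'a' is recomputed once, at the next get() -- exactly the double-recomputation a pure callback scheme would incur.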
From: John H. <jd...@gm...> - 2007-09-12 14:41:13
|
On 9/12/07, Michael Droettboom <md...@st...> wrote: > If you check out r3835 from my branch, simple_plot.py is working, with > the exception of things that rely on this really low-level > interdependence, e.g. the data limits.) I am at 3836 in the transforms branch, but I do not see "pbox". Perhaps you forgot to svn add it? JDH |
From: Michael D. <md...@st...> - 2007-09-12 14:46:52
|
Yes. Sorry. It's in r3837 on the branch. Cheers, Mike John Hunter wrote: > On 9/12/07, Michael Droettboom <md...@st...> wrote: > >> If you check out r3835 from my branch, simple_plot.py is working, with >> the exception of things that rely on this really low-level >> interdependence, e.g. the data limits.) > > I am at 3836 in the transforms branch, but I do not see "pbox". > Perhaps you forgot to svn add it? > > JDH -- Michael Droettboom Operations and Engineering Division Space Telescope Science Institute Operated by AURA for NASA |
From: Michael D. <md...@st...> - 2007-09-12 14:47:19
|
I should also add -- it's only working with the Agg backend. John Hunter wrote: > On 9/12/07, Michael Droettboom <md...@st...> wrote: > >> If you check out r3835 from my branch, simple_plot.py is working, with >> the exception of things that rely on this really low-level >> interdependence, e.g. the data limits.) > > I am at 3836 in the transforms branch, but I do not see "pbox". > Perhaps you forgot to svn add it? > > JDH -- Michael Droettboom Operations and Engineering Division Space Telescope Science Institute Operated by AURA for NASA |
From: John H. <jd...@gm...> - 2007-09-12 15:41:47
|
On 9/12/07, Michael Droettboom <md...@st...> wrote:
> This approach got closer, until I hit the wall that dependencies work at
> an even lower level -- single lazy values get borrowed from one bounding
> box and referenced in another (e.g. Axes.autoscale_view()) Certainly,
> this could be implemented in my new affine-based framework, but then
> we're almost back to square one and have basically re-implemented
> transforms.py/.cpp into something that is probably slower -- though
> perhaps more flexible in that more kinds of transforms could be added
> using only Python. Of course, autoscale_view() (and other instances of
> this) could be rewritten to work differently, but it's hard to know
> where that might end.

The locators do have a reference to the datalim and viewlim intervals, which is what they use to compute their autoscale limits and tick locations, but they return scalars, and autoscale_view simply sets the new limits with these scalars. So the fact that there is a reference here is easy to work around. I made a minor change in your code (ticker.py and axis.py) to illustrate. Instead of relying on the Interval to pass information from the Axis -> Locator/Formatter, I simply set the axis instance instead. Then, eg, the Locator can do

  vmin, vmax = self.axis.get_view_interval()
  dmin, dmax = self.axis.get_data_interval()

so there are no confusing intertwined references to deal with, and the axis can be responsible for knowing its data and view limits, which seems reasonable. I made these changes just to the MaxNLocator and ScalarFormatter classes for proof of concept, but it should be trivial to port to the others. I think in general communicating by scalar values, passed explicitly or through callbacks, will make for clearer code than the deeply nested references we have been using in the existing code.
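The scalar-passing pattern John describes might be sketched like this. The MaxNLocator name and the get_view_interval/get_data_interval accessors follow the email; the Axis stub and the tick-rounding arithmetic are simplified stand-ins, not matplotlib's actual code:

```python
import math

class Axis:
    """Illustrative stub: the axis owns its data and view limits."""

    def __init__(self, dmin, dmax):
        self._data_interval = (dmin, dmax)
        self._view_interval = (dmin, dmax)

    def get_data_interval(self):
        return self._data_interval

    def get_view_interval(self):
        return self._view_interval

    def set_view_interval(self, vmin, vmax):
        self._view_interval = (vmin, vmax)


class MaxNLocator:
    """Place up to nbins+1 nicely rounded tick locations."""

    def __init__(self, nbins=5):
        self.nbins = nbins
        self.axis = None            # set by the Axis that owns this locator

    def set_axis(self, axis):
        self.axis = axis

    def __call__(self):
        # Pull plain scalars from the axis -- no shared Interval objects.
        vmin, vmax = self.axis.get_view_interval()
        step = (vmax - vmin) / self.nbins
        # Round the step up to a power of ten times 1, 2 or 5.
        mag = 10 ** math.floor(math.log10(step))
        for m in (1, 2, 5, 10):
            if m * mag >= step:
                step = m * mag
                break
        start = math.floor(vmin / step) * step
        ticks, t = [], start
        while t <= vmax + 1e-10:
            if t >= vmin - 1e-10:
                ticks.append(round(t, 10))
            t += step
        return ticks
```

The key point is the direction of the dependency: the locator asks the axis for scalars when called, instead of holding a live reference into a tree of lazily updated values.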
There are places where one bounding box value is shared with another (most clearly in sharex and sharey support, eg left = self._sharex.viewLim.xmin()). The ability to "share" an axis, eg so changes in pan and zoom on one are reflected in another, is extremely useful, but a better approach may be to use callbacks (or something like them) rather than shared, composited transforms which are updated in place.

I need to spend more time reading through your code before I comment further, but I just wanted to make a quick comment vis-a-vis the locators and formatters. I committed these changes to your branch, and autoscaling is now working there :-) I'll keep poking and learning more about what you are doing before commenting on some of your bigger questions.

I made a couple of comments in affine.py as well, prefixed by 'JDH'

JDH |
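The callback alternative John suggests for axis sharing could look roughly like this: a pan/zoom on one axes notifies followers with plain scalars, instead of two axes holding the same live viewLim object. All names here are illustrative:

```python
class ViewLim:
    """View limits that notify observers when they change."""

    def __init__(self, xmin, xmax):
        self.xmin, self.xmax = xmin, xmax
        self._observers = []

    def connect(self, func):
        # func is called with the new (xmin, xmax) scalars.
        self._observers.append(func)

    def set_x(self, xmin, xmax):
        self.xmin, self.xmax = xmin, xmax
        for func in self._observers:
            func(xmin, xmax)


class SharedAxes:
    """Illustrative stand-in for an Axes participating in sharex."""

    def __init__(self, viewlim):
        self.viewLim = viewlim

    def sharex(self, other):
        # Follow pan/zoom on `other` by listening for its limit changes.
        other.viewLim.connect(self._on_xlim_changed)

    def _on_xlim_changed(self, xmin, xmax):
        self.viewLim.xmin, self.viewLim.xmax = xmin, xmax
```

Each axes still owns its own limits; only scalar values cross the boundary, so there is no composited transform to keep consistent in place.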
From: Michael D. <md...@st...> - 2007-09-12 15:50:24
|
John Hunter wrote: > On 9/12/07, Michael Droettboom <md...@st...> wrote: > I need to spend more time reading through your code before I comment > further, but I just wanted to make a quick comment vis-a-vis the > locators and formatters. I commited these changes to your branch, and > autoscaling is now working there :-) I'll keep poking and learning > more about what you are doing before commenting on some of your bigger > questions. > > I made a couple of comments in affine.py as well, prefixed by 'JDH' Thanks for taking the time. Very helpful (and please excuse the mess in the code -- I was just trying to get something end-to-end working before refining/optimizing/documenting etc...) Cheers, Mike -- Michael Droettboom Operations and Engineering Division Space Telescope Science Institute Operated by AURA for NASA |
From: John H. <jd...@gm...> - 2007-09-12 16:06:34
|
On 9/12/07, Michael Droettboom <md...@st...> wrote:
> Thanks for taking the time. Very helpful (and please excuse the mess in
> the code -- I was just trying to get something end-to-end working before
> refining/optimizing/documenting etc...)

I think this is definitely the right approach -- get something that works in the existing framework and understand where the various issues are, and then try and peel away the stuff that is not ideal.

I looked at the tick labels -- if you just comment out the transformation offset

  trans = trans + Affine2D().translate(0, -1 * self._padPixels)

the tick labels show up too (minus the pad of course), so my guess is some reference is being lost in the addition.... |
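As an aside on why such an addition can drop a reference: in an affine framework like the branch's, `+` would compose two transforms into a brand-new object built from a 3x3 matrix product, so anything still holding the original object never sees the composite. A minimal numpy stand-in (not the branch's actual Affine2D) illustrates this:

```python
import numpy as np

class Affine2D:
    """Illustrative 3x3 homogeneous affine, not the branch's class."""

    def __init__(self, mtx=None):
        self.mtx = np.identity(3) if mtx is None else mtx

    def translate(self, tx, ty):
        # Mutate in place: prepend a translation.
        t = np.array([[1.0, 0.0, tx],
                      [0.0, 1.0, ty],
                      [0.0, 0.0, 1.0]])
        self.mtx = np.dot(t, self.mtx)
        return self

    def __add__(self, other):
        # Compose: apply self first, then other.  Note this returns a NEW
        # object -- code holding a reference to `self` will not see later
        # changes to the composite (the "lost reference" suspicion above).
        return Affine2D(np.dot(other.mtx, self.mtx))

    def transform(self, points):
        # points: Nx2 array -> Nx2 array, via homogeneous coordinates.
        xy = np.column_stack([points, np.ones(len(points))])
        return np.dot(self.mtx, xy.T).T[:, :2]
```

Since composition copies the matrices, later in-place updates to one operand leave the composite untouched, unlike the live trees in transforms.py/.cpp.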
From: John H. <jd...@gm...> - 2007-09-12 18:12:07
|
On 9/12/07, Michael Droettboom <md...@st...> wrote:
> So, I feel like I'm going in a bit of a circle here, and I might need a
> reality check. I thought I'd better check in and see where you guys
> (who've thought about this a lot longer than I have) see this going. A
> statement of objectives of this part of the task would be helpful.
> (e.g. what's the biggest problem with how transforms work now, and what
> model would be a better fit). John, I know you've mentioned some to me
> before, e.g. the LazyValue concept is quirky and relies on C and the PDF
> stateful transforms model is close, but not quite what we need, etc. I
> feel I have a better sense of the overall code structure now, but you
> guys may have a better "gut" sense of what will fit best.

Here is a brief summary of what I see as some of the problems with the existing approach to transformations, and what I would like to see improved in a refactoring. The three major objectives are clarity, extensibility and efficiency.

Clarity:

The existing transformation framework, written in C++ and making extensive use of deferred evaluation of binary operation trees and values by reference, is difficult for most developers to understand (and hence enhance). Additionally, since all the heavy lifting is done in C++, python developers who are not versed in C++ have an additional barrier to making contributions.

Extensibility:

We would like to make it fairly easy for users to add additional non-linear transformations. The current framework requires adding a new function at the C++ layer, and hacking into axes.py to support additional functions. We would like the existing nonlinear transformations (log and polar) to be part of a general infrastructure where users could supply their own nonlinear functions which map (possibly nonseparable) (xhat, yhat) -> separable (x, y). There are two parts to this: one pretty easy and one pretty hard.
The easy part is supporting a transformation which has a separation callable that takes, eg, an Nx2 array and returns an Nx2 array. For log, this will simply be log(XY); for polar, it will be r*cos(theta), r*sin(theta) with theta = X[:,0] and r = X[:,1]. Presumably we will want to take advantage of masked arrays to support invalid transformations, eg log of nonpositive data.

The harder part is to support axis, tick and label layout generically. Currently we do this by special casing log and polar, either with special tick locators and formatters (log) or special derived Axes (polar).

Efficiency:

There are three parts to the efficiency question: the efficiency of the transformation itself, the efficiency with which transformation data structures are updated in the presence of viewlim changes (panning and zooming, window resizing), and the efficiency of getting transformed data to the backends. My guess is that the new design may be slower, or not dramatically faster, for the first two (which are not the bottleneck in most cases anyhow), but you might get significant savings on the 3rd.

What we would like to support is something like an operation which pushes the partially transformed data to the backend; the backend then stores this data in a path or other data structure, and when the view limits are changed or the window is resized, the backend merely needs to get an updated affine to redraw the data. I say "partially transformed" because in the case of nonlinear or separable transformations, we will probably want to do the nonlinear/separation part first, and then push this to the backend, which can build a path (eg an agg::path in agg); on pan and zoom we would only need to let the backend know what the current affine is. There is more than one way to solve this problem: in mpl1 I used a path dictionary keyed off of a path id which was updated on renderer changes.
Then the front end (eg Line2D) could do something like

  def on_renderer_change(self, renderer):
      # on renderer change; path data already has the
      # separable/nonlinear part handled
      self.pathid = renderer.push_path(pathdata)

Additionally, you would need to track when either the data or the nonlinear mapping function is changed, in order to remove the old path and push out a new one. Then at draw time, you would not need to push any data to the backend:

  def on_draw(self, renderer):
      # on draw the line just needs to inform the backend to draw the
      # cached path with the current separable transformation
      renderer.draw_path(self.pathid, gc, septrans)

Ken, I believe, used some metaclass magic to solve this problem. It may be that the ideal approach is different from either of these, so feel free to experiment and I'll give it some more thought too. Ken will hopefully pipe in with his perspective too. |
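The backend half of this idea -- a path dictionary keyed by id, with redraws needing only a fresh affine -- might be sketched as follows. The push_path/draw_path names follow the email; the body is a minimal, purely illustrative sketch (here the "affine" is just a callable applied per vertex):

```python
import itertools

class Renderer:
    """Illustrative backend that caches partially transformed paths."""

    _ids = itertools.count()

    def __init__(self):
        self._paths = {}

    def push_path(self, pathdata):
        # Store the (already separated/nonlinearly transformed) vertices
        # once; return a handle the artist keeps.
        pathid = next(self._ids)
        self._paths[pathid] = pathdata
        return pathid

    def pop_path(self, pathid):
        # Called when the artist's data or mapping function changes.
        del self._paths[pathid]

    def draw_path(self, pathid, gc, affine):
        # On pan/zoom only the affine changes; the cached vertices are
        # reused.  gc is unused in this sketch; we just return the
        # transformed points instead of rasterizing them.
        return [affine(x, y) for x, y in self._paths[pathid]]
```

The point is the data flow: vertices cross the frontend/backend boundary once, and subsequent redraws send only a transform.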
From: Gael V. <gae...@no...> - 2007-09-12 22:32:59
|
On Wed, Sep 12, 2007 at 01:11:54PM -0500, John Hunter wrote:
> Then the front end (eg Line2D) could do something like
>   def on_renderer_change(self, renderer):
>       # on renderer change; path data already has the
>       # separable/nonlinear part handled
>       self.pathid = renderer.push_path(pathdata)
> Additionally, you would need to track when either the data or
> nonlinear mapping function are changed in order to remove the old
> path and push out a new one. Then at draw time, you would not need
> to push any data to the backend
>   def on_draw(self, renderer):
>       # on draw the line just needs to inform the backend to draw the
>       # cached path with the current separable transformation
>       renderer.draw_path(self.pathid, gc, septrans)

I am a bit tired, and I haven't been following the discussion too closely, but I have the feeling this is the kind of pattern that Traits makes both obvious and optimizes a lot. If you give me a concrete minimal example, I would be able to say more, and maybe try to see how it flows in Traits. I guess I am just trying to sell Traits, but it seems to me it solves the problem very well, and has been well optimized (quicker than vanilla Python for these kinds of things). If you are interested in checking this line out, posting a minimal example of what you are trying to achieve on the enthought-dev mailing-list would get you higher-quality answers.

Gaël |
From: Paul K. <pki...@ni...> - 2007-09-12 20:27:22
|
On Wed, Sep 12, 2007 at 01:11:54PM -0500, John Hunter wrote: > On 9/12/07, Michael Droettboom <md...@st...> wrote: > > > So, I feel like I'm going in a bit of a circle here, and I might need a > > reality check. I thought I'd better check in and see where you guys > > (who've thought about this a lot longer than I have) see this going. A > > statement of objectives of this part of the task would be helpful. > > (e.g. what's the biggest problem with how transforms work now, and what > > model would be a better fit). John, I know you've mentioned some to me > > before, e.g. the LazyValue concept is quirky and relies on C and the PDF > > stateful transforms model is close, but not quite what we need, etc. I > > feel I have a better sense of the overall code structure now, but you > > guys may have a better "gut" sense of what will fit best. > > Here is a brief summary of what I see some of the problems to be with > the existing approach to transformations, and what I would like to see > improved in a refactoring. The three major objectives are clarity, > extensibility and efficiency. > > Clarity: > > The existing transformation framework, written in C++ and > making extensive use of deferred evaluation of binary operation > trees and values by reference, is difficult for most developers to > understand (and hence enhance). Additionally, since all the heavy > lifting is done in C++, python developers who are not versed in C++ > have an additional barrier to making contributions. Indeed! > Extensibilty: > > We would like to make it fairly easy for users to add additional > non-linear transformations. The current framework requires adding a > new function at the C++ layer, and hacking into axes.py to support > additional functions. We would like the existing nonlinear > transformations (log and polar) to be part of a general > infrastructure where users could supply their own nonlinear > functions which map (possibly nonseparable) (xhat, yhat) -> > separable (x, y). 
> There are two parts to this: one pretty easy and
> one pretty hard.
>
> The easy part is supporting a transformation which has a separation
> callable that takes, eg an Nx2 array and returns an Nx2 array. For
> log, this will simply be log(XY), for polar, it will be
> r*cos(X[:,0]), r*sin(X[:,1]). Presumably we will want to take
> advantage of masked arrays to support invalid transformations, eg
> log of nonpositive data.
>
> The harder part is to support axis, tick and label layout
> generically. Currently we do this by special casing log and polar,
> either with special tick locators and formatters (log) or special
> derived Axes (polar).

Another hard part is grids. More generally, a straight line in x,y becomes curved in x',y'. Ideally, a sequence of points plotted on a straight line should lie directly on the transformed line. This would make the caps on the polar_bar demo follow the arcs of the grid. The extreme case is map projections, where for some projections a straight line will not even be connected.

Another issue is zooming and panning. For amusement, try it with polar_demo.

> Efficiency:
>
> There are three parts to the efficiency question: the efficiency of
> the transformation itself, the efficiency with which transformation
> data structures are updated in the presence of viewlim changes
> (panning and zooming, window resizing) and the efficiency in getting
> transformed data to the backends. My guess is that the new design
> may be slower or not dramatically faster for the first two (which
> are not the bottleneck in most cases anyhow) but you might get
> significant savings on the 3rd.

Changing the internal representation of things like collections so that the transform can be done using numpy vectors will help a lot.

- Paul |
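Paul's point about numpy vectors lines up with John's "easy part": the nonlinear stage becomes a whole-array callable, Nx2 in, Nx2 out, with masked arrays flagging invalid points. A sketch, assuming the column convention (theta, r) for polar and treating the function names as illustrative:

```python
import numpy as np
import numpy.ma as ma

def polar_transform(X):
    """Map an Nx2 array of (theta, r) pairs to (x, y) pairs in one shot."""
    theta, r = X[:, 0], X[:, 1]
    return np.column_stack([r * np.cos(theta), r * np.sin(theta)])

def log_transform(XY):
    """Elementwise log10 of an Nx2 array, masking nonpositive input as
    invalid rather than raising or producing NaNs."""
    return ma.log10(ma.masked_less_equal(XY, 0.0))
```

Because the whole collection is transformed as one array operation, there is no per-point Python loop, which is where the savings Paul mentions would come from.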
From: Peter W. <pw...@en...> - 2007-09-13 16:24:18
|
On Sep 12, 2007, at 3:27 PM, Paul Kienzle wrote: >> Extensibilty: >> >> We would like to make it fairly easy for users to add additional >> non-linear transformations. The current framework requires >> adding a >> new function at the C++ layer, and hacking into axes.py to support >> additional functions. We would like the existing nonlinear >> transformations (log and polar) to be part of a general >> infrastructure where users could supply their own nonlinear >> functions which map (possibly nonseparable) (xhat, yhat) -> >> separable (x, y). There are two parts to this: one pretty easy and >> one pretty hard. >> >> The easy part is supporting a transformation which has a separation >> callable that takes, eg an Nx2 array and returns and Nx2 array. >> For >> log, this will simply be log(XY), for polar, it will be >> r*cos(X[:,0]), r*sin(X[:,1]). Presumably we will want to take >> advantage of masked arrays to support invalid transformations, eg >> log of nonpositive data. >> >> The harder part is to support axis, tick and label layout >> generically. Currently we do this by special casing log and polar, >> either with special tick locators and formatters (log) or special >> derived Axes (polar). > > Another hard part is grids. More generally, a straight line in > x,y becomes curved in x',y'. Ideally, a sequence of points plotted > on a straight line should lie directly on the transformed line. This > would make the caps on the polar_bar demo follow the arcs of the grid. > > The extreme case is map projections, where for some projections, a > straight line will not even be connected. Just wanted to chime in because I've done some thinking on this problem for Chaco. Right now chaco's coordinate transformation process ("mapping") is handled by explicit objects that subclass from 1D and 2D mapper base classes. 
We're talking about moving to a scheme where the DisplayPDF GraphicsContext is extended into a MathematicalCanvas that is both aware of the transformation stack and also aware of "screen" properties such as subpixel alignment and such. You would then be able to hand off dataspace coordinates to methods like move_to(), line_to(), rect(), etc., so you could move_to() a dataspace coordinate and then draw a screen-aligned box. The MathCanvas would also have additional methods like geodesic_to() for rendering manifold-aware grids and axes. (Of course, grids aren't necessarily geodesics all the time.) I don't know if discontinuous map projections could be handled cleanly in such a framework without the renderer querying the canvas about the screenspace limits of the current transformation.

> Another issue is zooming and panning. For amusement, try it with
> polar_demo.

Yes, one of the problems with non-linear transformations is that panning is very much a screen-space interaction, and you have to map it back into data space to do proper data clipping and transformation. Unfortunately (and this is a problem even with logarithmic plots), the user may sometimes want to view things on the screen that are outside the valid domain of the coordinate transform, in which case the code handling the interaction (the "tool", in chaco parlance) has to be smart enough to maintain screen-space coordinates only.

-Peter |
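The screen-space panning problem Peter mentions can be made concrete for a logarithmic axis: the drag delta is linear in pixels, so it has to be applied in log space and mapped back through the inverse transform to get new data limits. A tiny sketch (function name and arguments are illustrative):

```python
import math

def pan_log_axis(xmin, xmax, pixel_dx, axes_width_px):
    """Pan a log10 x axis by pixel_dx pixels; returns new (xmin, xmax).

    Screen pixels map linearly onto log-data coordinates, so the pixel
    delta becomes a constant shift in log space -- i.e. a multiplicative
    change in data space.
    """
    lo, hi = math.log10(xmin), math.log10(xmax)
    shift = pixel_dx * (hi - lo) / axes_width_px
    return 10 ** (lo + shift), 10 ** (hi + shift)
```

Note the valid-domain issue Peter raises: this only works while both limits stay positive; a tool that lets the view slide past the domain boundary has to keep working in screen coordinates instead.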