Thread: [Matplotlib-users] filled contours, and missing data

[Matplotlib-users] filled contours, and missing data

From: Eric F. <ef...@ha...> - 2005-02-20 02:23:23

John et al.,

I would like to phase in matplotlib to replace Matlab ASAP for plotting 
physical oceanographic observations, primarily current profile 
measurements.  I (and many other physical oceanographers) primarily use 
contourf to plot filled contours; I only rarely use line contours.  It 
looks to me like gcntr.c has the necessary functionality--the ability to 
output polygons enclosing regions between a pair of specified levels. 
Is someone already working on exposing that functionality in matplotlib, 
or is it planned?

It appears that gcntr.c also has the ability to handle missing data via 
setting elements of the reg array to zero, and that this could be 
exposed fairly easily in the contour method in axes.py by adding "reg" 
to the set of kwargs.  Correct?  If so, is this also planned?

The question of missing data handling in contour plotting brings up the 
more general issue of how to handle data gaps in plots.  For example, 
the ocean current profiles that I measure using a Doppler profiler 
extend to varying depths, and sometimes have holes in the middle where 
there are not enough acoustic scatterers to give a signal.  This sort of 
thing--data gaps--is universal in physical oceanography.  One of 
Matlab's major strengths is the way it handles them, using nan as a bad 
value flag.  Plotting a line with the plot command, the line is broken 
at each nan; so if there is a hole in the data, the plot shows exactly 
that.  The same for contouring: nans are automatically used as a mask.
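
The line-breaking behaviour described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not matplotlib code: split_at_nan is a hypothetical helper name, and the depth/speed values are made up.

```python
import math

def split_at_nan(x, y):
    """Split paired x, y sequences into runs of consecutive valid points.

    A point is treated as missing when either coordinate is NaN.
    """
    segments = []
    current = []
    for xi, yi in zip(x, y):
        if math.isnan(xi) or math.isnan(yi):
            if current:            # close the run that just ended
                segments.append(current)
                current = []
        else:
            current.append((xi, yi))
    if current:                    # keep the final run, if any
        segments.append(current)
    return segments

nan = float("nan")
depth = [0.0, 10.0, 20.0, 30.0, 40.0]
speed = [0.5, 0.4, nan, 0.2, 0.1]      # hole in the middle of the profile
segments = split_at_nan(depth, speed)
```

Each run in segments could then be drawn as its own line, so the gap at the missing sample stays visible instead of being bridged.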

Obviously, not everyone needs this kind of automatic handling of data 
gaps, but I think it would be very useful for many applications, so I 
hope it can be considered as a possible goal.  At the plotting level, 
collections may make it easier to implement than would have been the 
case in the early days of matplotlib.  At the array manipulation level, 
the implementation could involve either masked arrays or nans.  I would 
greatly prefer the Matlab-style nan approach, but I don't know whether 
this would work with Numeric.  Maybe in Numeric3?  Numarray appears 
better equipped, with its ieeespecial.py module.

Thanks for the enormous amount of beautiful work you have already done!

Eric

Re: [Matplotlib-users] filled contours, and missing data

From: Perry G. <pe...@st...> - 2005-02-21 17:07:39

On Feb 19, 2005, at 9:23 PM, Eric Firing wrote:

> John et al.,
>
> I would like to phase in matplotlib to replace Matlab ASAP for 
> plotting physical oceanographic observations, primarily current 
> profile measurements.  I (and many other physical oceanographers) 
> primarily use contourf to plot filled contours; I only rarely use line 
> contours.  It looks to me like gcntr.c has the necessary 
> functionality--the ability to output polygons enclosing regions 
> between a pair of specified levels. Is someone already working on 
> exposing that functionality in matplotlib, or is it planned?
>
No one (as far as I know :-) is working on it right now. It is in our 
plans to add this capability. As you correctly note, the underlying C 
code can handle this capability. I'm not sure how long it will be; 
right now the priority is to finish contour labeling capability, and 
the person working on that also has other work that competes with her 
time to do this. I'm guessing that she could start looking at it in a 
couple weeks. Of course, if someone wants to help now, that would be 
great.

> It appears that gcntr.c also has the ability to handle missing data 
> via setting elements of the reg array to zero, and that this could be 
> exposed fairly easily in the contour method in axes.py by adding "reg" 
> to the set of kwargs.  Correct?  If so, is this also planned?
>
Correct. Yes (it is planned).

> The question of missing data handling in contour plotting brings up 
> the more general issue of how to handle data gaps in plots.  For 
> example, the ocean current profiles that I measure using a Doppler 
> profiler extend to varying depths, and sometimes have holes in the 
> middle where there are not enough acoustic scatterers to give a 
> signal.  This sort of thing--data gaps--is universal in physical 
> oceanography.  One of Matlab's major strengths is the way it handles 
> them, using nan as a bad value flag.  Plotting a line with the plot 
> command, the line is broken at each nan; so if there is a hole in the 
> data, the plot shows exactly that.  The same for contouring: nans are 
> automatically used as a mask.
>
> Obviously, not everyone needs this kind of automatic handling of data 
> gaps, but I think it would be very useful for many applications, so I 
> hope it can be considered as a possible goal.  At the plotting level, 
> collections may make it easier to implement than would have been the 
> case in the early days of matplotlib.  At the array manipulation 
> level, the implementation could involve either masked arrays or nans.  
> I would greatly prefer the Matlab-style nan approach, but I don't know 
> whether this would work with Numeric.  Maybe in Numeric3?  Numarray 
> appears better equipped, with its ieeespecial.py module.
>
I think you touch on the key issue. I think we'd have to figure out how 
to handle this between Numeric and numarray (and Numeric3 potentially). 
Would a mask array be a suitable substitute as an interim solution?

Perry

Re: [Matplotlib-users] filled contours, and missing data

From: Eric F. <ef...@ha...> - 2005-02-21 22:02:38

Perry,

>> I would like to phase in matplotlib to replace Matlab ASAP for 
>> plotting physical oceanographic observations, primarily current 
>> profile measurements.  I (and many other physical oceanographers) 
>> primarily use contourf to plot filled contours; I only rarely use line 
>> contours.  It looks to me like gcntr.c has the necessary 
>> functionality--the ability to output polygons enclosing regions 
>> between a pair of specified levels. Is someone already working on 
>> exposing that functionality in matplotlib, or is it planned?
>>
> No one (as far as I know :-) is working on it right now. It is in our 
> plans to add this capability. As you correctly note, the underlying C 
> code can handle this capability. I'm not sure how long it will be; right 
> now the priority is to finish contour labeling capability, and the 
> person working on that also has other work that competes with her time 
> to do this. I'm guessing that she could start looking at it in a couple 
> weeks. Of course, if someone wants to help now, that would be great.

I have started working on it.  I don't know how far I will get; the 
necessary change to the c extension code was easy, but my first attempt 
to make a PolyCollection work in place of a LineCollection is failing.
I will do a bit more research before asking for help, if necessary. 
(No promises--I don't have much time to work on this, and it is my first 
plunge into the innards of matplotlib.)

> 
>> It appears that gcntr.c also has the ability to handle missing data 
>> via setting elements of the reg array to zero, and that this could be 
>> exposed fairly easily in the contour method in axes.py by adding "reg" 
>> to the set of kwargs.  Correct?  If so, is this also planned?
>>
> Correct. Yes (it is planned).
> 
>> The question of missing data handling in contour plotting brings up 
>> the more general issue of how to handle data gaps in plots.  For 
>> example, the ocean current profiles that I measure using a Doppler 
>> profiler extend to varying depths, and sometimes have holes in the 
>> middle where there are not enough acoustic scatterers to give a 
>> signal.  This sort of thing--data gaps--is universal in physical 
>> oceanography.  One of Matlab's major strengths is the way it handles 
>> them, using nan as a bad value flag.  Plotting a line with the plot 
>> command, the line is broken at each nan; so if there is a hole in the 
>> data, the plot shows exactly that.  The same for contouring: nans are 
>> automatically used as a mask.
>>
>> Obviously, not everyone needs this kind of automatic handling of data 
>> gaps, but I think it would be very useful for many applications, so I 
>> hope it can be considered as a possible goal.  At the plotting level, 
>> collections may make it easier to implement than would have been the 
>> case in the early days of matplotlib.  At the array manipulation 
>> level, the implementation could involve either masked arrays or nans.  
>> I would greatly prefer the Matlab-style nan approach, but I don't know 
>> whether this would work with Numeric.  Maybe in Numeric3?  Numarray 
>> appears better equipped, with its ieeespecial.py module.
>>
> I think you touch on the key issue. I think we'd have to figure out how 
> to handle this between Numeric and numarray (and Numeric3 potentially). 
> Would a mask array be a suitable substitute as an interim solution?

Are you suggesting something like this?  Let each plotting function have 
a new kwarg, perhaps called "validmask", with the same dimensions as the 
dependent variable to be plotted, and with nonzero where the variable is 
valid and 0 where it is missing.  The mask would then be used (1) to 
limit the autoranging tests to the valid data, (2) in the case of line 
plotting, to break the line up into segments so that a LineCollection 
would be plotted, (3) in the case of contouring, to set the reg array, 
(4) for images or pcolors to similarly mask out the invalid regions with 
white, or transparent, or perhaps some settable color.
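
As a rough sketch of uses (1) and (4), assuming plain Python lists rather than any of the array packages: valid_limits and fill_invalid are hypothetical names, and -999.0 is just a stand-in for a missing sample.

```python
def valid_limits(values, validmask):
    """Return (min, max) over the entries whose mask element is nonzero."""
    valid = [v for v, ok in zip(values, validmask) if ok]
    if not valid:
        return None          # nothing valid: no limits can be derived
    return min(valid), max(valid)

def fill_invalid(rows, maskrows, fill=None):
    """Copy a 2-D list, replacing entries whose mask element is 0 by fill."""
    return [[v if ok else fill for v, ok in zip(row, mrow)]
            for row, mrow in zip(rows, maskrows)]

y = [3.0, -999.0, 1.0, 4.0]      # -999.0 stands for a missing sample
validmask = [1, 0, 1, 1]
limits = valid_limits(y, validmask)
```

The point of the sketch is that autoranging only ever sees the valid entries, so a bad-value flag in the data cannot stretch the axis limits.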

This could be implemented in matplotlib in a way that would not depend 
on any special features, or likely changes, in the 
Numeric/Numeric3/numarray set.

A numarray user could then use
def notnan(y):
     return numarray.ieeespecial.mask(y, numarray.ieeespecial.NAN)

and say
plot(x, y, validmask=notnan(y))
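
For readers without numarray, here is a library-independent sketch of the same kind of helper, assuming plain Python floats. It does not call numarray.ieeespecial; it follows the validmask convention described above, with nonzero where the value is valid and 0 where it is NaN.

```python
import math

def notnan(y):
    """Return 1 where y holds a usable number and 0 where it holds NaN."""
    return [0 if math.isnan(v) else 1 for v in y]

y = [1.5, float("nan"), 2.5]
validmask = notnan(y)
```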

In any case, this "validmask kwarg" solution seems to me like a 
perfectly good one from a user's standpoint, and a good bridge to the 
happy day when Numeric/Numeric3/numarray converge or evolve to a single, 
dominant numerical module with good nan handling built in.  (I very much 
hope such convergence will occur, and the sooner the better.)

Eric

Re: [Matplotlib-users] filled contours, and missing data

From: Stephen W. <ste...@cs...> - 2005-02-21 22:07:34

Eric Firing wrote:

> Are you suggesting something like this?  Let each plotting function 
> have a new kwarg, perhaps called "validmask", with the same dimensions 
> as the dependent variable to be plotted, and with nonzero where the 
> variable is valid and 0 where it is missing.

More or less, except that the mask is an attribute (?) of a MaskedArray 
object.  I for one would be in favor of this capability.

Re: [Matplotlib-users] filled contours, and missing data

From: Eric F. <ef...@ha...> - 2005-02-22 00:01:05

Stephen,

>> Are you suggesting something like this?  Let each plotting function 
>> have a new kwarg, perhaps called "validmask", with the same dimensions 
>> as the dependent variable to be plotted, and with nonzero where the 
>> variable is valid and 0 where it is missing.
> 
> 
> More or less, except that the mask is an attribute (?) of a MaskedArray 
> object.  I for one would be in favor of this capability.

I agree that this is an alternative, but I am not sure that it is better 
than what I described.  It requires all the machinery of the ma/MA 
module, which looks cumbersome to me.  What does it gain?  max and min 
will do the right thing on the masked array input, so one would not have 
to duplicate this inside matplotlib.  It is not hard to duplicate, 
however.  How much more ma/MA functionality would actually be useful?

When it was originally developed, the MaskedArray may have been a good 
way to get past Numeric's lack of nan-handling.  In the long run, 
however, it seems to me that Python needs a numeric module with good 
nan-handling (as in Matlab and Octave), and that this will render the 
Masked Array obsolete.  If so, then specifying a mask as a kwarg in 
matplotlib, and not using MA internally, may be simpler, more robust, 
and more flexible.

The user would still be free to use MA/ma externally, if desired.

A variation would be to support MA/ma in matplotlib only to the extent 
of checking for a MaskedArray input, and if it is present, breaking it 
apart and using the mask as if it had come via the kwarg. One could use 
either the kwarg or a Masked Array.
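
A minimal sketch of that variation, under stated assumptions: MaskedStandIn is a hypothetical stand-in for a masked array (it is not the MA/ma class), its mask is taken to be true where the data are invalid, and split_input is an invented name.

```python
class MaskedStandIn:
    """Hypothetical stand-in for a masked array: data plus a mask
    whose elements are true where the data are INVALID."""
    def __init__(self, data, mask):
        self.data = list(data)
        self.mask = list(mask)

def split_input(y, validmask=None):
    """Return (data, validmask) from either a masked object or a kwarg."""
    if isinstance(y, MaskedStandIn):
        # Invert: the stand-in flags bad points, validmask flags good ones.
        return y.data, [0 if bad else 1 for bad in y.mask]
    data = list(y)
    if validmask is None:
        validmask = [1] * len(data)   # no mask given: everything is valid
    return data, list(validmask)
```

After this step the rest of a plotting routine would only ever deal with plain data plus a validmask, whichever form the caller used.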

Eric

Re: [Matplotlib-users] filled contours, and missing data

From: Stephen W. <ste...@cs...> - 2005-02-22 03:05:49

Hello,

>
> I agree that this is an alternative, but I am not sure that it is 
> better than what I described.  It requires all the machinery of the 
> ma/MA module, which looks cumbersome to me.  What does it gain?

It would be more flexible.  Instead of having to actually replace data 
with NaN, you could create a mask which marked data to be ignored for 
the moment:  all negative values, say, or all values with a complex part 
less than 1e-5.  Much more flexible.  Having said that, I agree that NaN 
should also be ignored wherever it occurs.

>  One could use either the kwarg or a Masked Array.

-1 on the kwarg.  It seems to me that adding it to every plot command 
uglifies the interface significantly as well as being more work for John.

Stephen

Re: [Matplotlib-users] filled contours, and missing data

From: Stephen W. <ste...@cs...> - 2005-02-22 14:35:16

Stephen Walton wrote, in the context of using masked arrays rather than 
a keyword argument which would be the mask:

> It would be more flexible.  Instead of having to actually replace data 
> with NaN, you could create a mask which marked data to be ignored for 
> the moment:

This is, of course, incorrect.  Both approaches would allow arbitrary 
data sets to be masked as needed.

There was a mention over on an astronomy group of the progress being 
made in masked astronomical images.  Here too, the mask "comes along" 
with the data.  Perry et al., does STScI anticipate using 
numarray/Numeric masked arrays within PyFITS to handle this?

Re: [Matplotlib-users] filled contours, and missing data

From: Perry G. <pe...@st...> - 2005-02-22 21:03:55

On Feb 21, 2005, at 7:00 PM, Eric Firing wrote:

>
> Stephen,
>
>>> Are you suggesting something like this?  Let each plotting function 
>>> have a new kwarg, perhaps called "validmask", with the same 
>>> dimensions as the dependent variable to be plotted, and with nonzero 
>>> where the variable is valid and 0 where it is missing.
>> More or less, except that the mask is an attribute (?) of a 
>> MaskedArray object.  I for one would be in favor of this capability.
>
> I agree that this is an alternative, but I am not sure that it is 
> better than what I described.  It requires all the machinery of the 
> ma/MA module, which looks cumbersome to me.  What does it gain?  max 
> and min will do the right thing on the masked array input, so one 
> would not have to duplicate this inside matplotlib.  It is not hard to 
> duplicate, however.  How much more ma/MA functionality would actually 
> be useful?
>
> When it was originally developed, the MaskedArray may have been a good 
> way to get past Numeric's lack of nan-handling.  In the long run, 
> however, it seems to me that Python needs a numeric module with good 
> nan-handling (as in Matlab and Octave), and that this will render the 
> Masked Array obsolete.  If so, then specifying a mask as a kwarg in 
> matplotlib, and not using MA internally, may be simpler, more robust, 
> and more flexible.
>
> The user would still be free to use MA/ma externally, if desired.
>
> A variation would be to support MA/ma in matplotlib only to the extent 
> of checking for a MaskedArray input, and if it is present, breaking it 
> apart and using the mask as if it had come via the kwarg. One could 
> use either the kwarg or a Masked Array.
>

When we looked at the issue of using NaNs in place of masks or masked 
arrays, we concluded (well, I did anyway) that while NaNs could be used 
to replace masks in many instances, they could not be used in all. 
There are a lot of cases where people want to retain the value being 
masked (e.g., to do statistics on the rejected values). NaNs as masks 
only work for float and complex, not ints. So both approaches are 
useful and needed as far as I can tell.
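
Both points can be illustrated with a small sketch, assuming plain Python lists of integers. The rejection rule (counts above 100) and the name split_by_mask are made up for the example.

```python
def split_by_mask(values, rejected):
    """Partition values into (kept, dropped) lists without altering them."""
    kept = [v for v, bad in zip(values, rejected) if not bad]
    dropped = [v for v, bad in zip(values, rejected) if bad]
    return kept, dropped

counts = [12, 15, 900, 14, 1200]          # integer data: no NaN available
rejected = [c > 100 for c in counts]      # hypothetical rejection rule
kept, dropped = split_by_mask(counts, rejected)
mean_kept = sum(kept) / len(kept)
mean_dropped = sum(dropped) / len(dropped)
```

Because the mask lives beside the data rather than overwriting it, the rejected values remain available for statistics, which would not be the case had they been replaced by NaN.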

As far as keyword args go, it seems to me that they would be more 
convenient in many cases, but as Stephen mentions, may be a fair amount 
of work (and in essence, they are an attribute of the data, so that may 
be where they belong).

Perry

Re: [Matplotlib-users] filled contours, and missing data

From: John H. <jdh...@ac...> - 2005-02-22 21:33:15

>>>>> "Perry" == Perry Greenfield <pe...@st...> writes:

    Perry> As far as keyword args go, it seems to me that they would
    Perry> be more convenient in many cases, but as Stephen mentions,
    Perry> may be a fair amount of work (and in essence, they are an
    Perry> attribute of the data, so that may be where they belong).

In the context of plotting, it isn't clear that NaNess is an attribute
of the data.  If the data are y = sin(2*pi*t), then NaNess enters only
under certain transformations (eg log) of the data.  From my
perspective, or the perspective of a matplotlib Line, the data are
intact, it is just that under certain transformations the data are
invalid.  Thus the mask is only needed under certain views
(transformations) which the Line class is mostly unaware of.  I think
there are two cases to be distinguished: the case Eric mentioned where
some data points are NaN or None because the measurements are missing
or invalid, and the case where the data are valid but are nan under
certain transformations.

For the latter, it would be maximally useful to be able to do

  #x,y,xt,yt are numerix arrays; transform puts in NaN on domain error
  xt, yt = transform(x, y)  
  
and the drawing routine drops NaN points and handles the connecting
segments properly.  That it only works for float arrays is not a
problem since that is what we are using, eg in the line class

        self._x = asarray(x, Float)
        self._y = asarray(y, Float)

but it is my current understanding that this ain't gonna happen in a
consistent way for Numeric, Numeric3, and numarray across platforms,
because my brief forays into researching this issue turned up posts by
Tim Peters saying that there was no standard way of dealing with IEEE
754 special values across compilers.  If that's true, and we can't fix
it or work around it, then I think boolean masks passed as a kwarg may
be the easiest way to handle this across various numerix
implementations, assuming that Numeric3 comes to fruition and assuming
that there isn't a consistent MA approach that is accessible at the
API level, which I don't know to be true but I get the feeling that
this is a reasonable guess.
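
To make the second case concrete, here is a sketch in plain Python of a transform that puts in NaN on a domain error, followed by a step that drops the NaN points. log10_or_nan and drop_nan_points are hypothetical names, and the sketch leaves out the handling of connecting segments.

```python
import math

def log10_or_nan(values):
    """Apply log10, substituting NaN where the input is outside the domain."""
    out = []
    for v in values:
        out.append(math.log10(v) if v > 0 else float("nan"))
    return out

def drop_nan_points(x, y):
    """Keep only the points whose coordinates are both valid numbers."""
    return [(a, b) for a, b in zip(x, y)
            if not (math.isnan(a) or math.isnan(b))]

x = [1.0, 2.0, 3.0]
y = [100.0, -5.0, 0.01]          # the data themselves are all valid
yt = log10_or_nan(y)             # NaN appears only under the transform
points = drop_nan_points(x, yt)
```

The original y is left intact; only the transformed view contains NaN, which is the distinction drawn above.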

I'd be interested in getting feedback from those of you who have
informed opinions on these matters:

  - Is it possible to handle NaN in Numeric/numarray arrays across the
    big three platforms?  I'm pretty sure the answer here is no.

  - Are Numeric and numarray MAs similar enough to be used at the
    C-API level, which is where agg would be checking the mask?  Does
    anyone know whether MAs will be part of the Numeric3 core?  I
    haven't seen any reference to them in the PEP.

JDH

Re: [Matplotlib-users] filled contours, and missing data

From: Perry G. <pe...@st...> - 2005-02-22 21:48:43

On Feb 22, 2005, at 4:21 PM, John Hunter wrote:

>>>>>> "Perry" == Perry Greenfield <pe...@st...> writes:
>
>     Perry> As far as keyword args go, it seems to me that they would
>     Perry> be more convenient in many cases, but as Stephen mentions,
>     Perry> may be a fair amount of work (and in essence, they are an
>     Perry> attribute of the data, so that may be where they belong).
>
> In the context of plotting, it isn't clear that NaNess is an attribute
> of the data.  If the data are y = sin(2*pi*t), then NaNess enters only

I meant masks were an attribute of the data, but not NaNs (in other 
words, the way MA handles it may be the most appropriate)

Perry

[Matplotlib-users] filled contours: progress, a minor bug (?), and a point of curiosity

From: Eric F. <ef...@ha...> - 2005-02-22 04:14:50

Perry,  John,

Progress!  I found that the problem I was having with PolyCollection was 
this: the vertices argument must be a sequence (list or tuple) of tuples 
of tuples--if one gives it a list of *lists* of tuples, one gets

[first part of trace omitted]
  File "/usr/lib/python2.3/site-packages/matplotlib/collections.py", 
line 205, in draw
    self._offsets,  self._transOffset)
TypeError: CXX: type error

(The line number was smaller before I put in some debugging print 
statements.)

I think this fussiness qualifies as a bug; the docstring for 
PolyCollection says vertices can be a sequence of sequences of tuples.  
I don't know what the right way to fix it is, however, so I am working 
around it.
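
One possible workaround, sketched in plain Python, is to normalize whatever nesting one has into tuples of tuples before handing it over. This is an assumption about what such a workaround could look like, not necessarily the one used here; as_tuples is a hypothetical name.

```python
def as_tuples(polygons):
    """Convert any sequence of sequences of (x, y) pairs to tuples of tuples."""
    return tuple(tuple((float(x), float(y)) for x, y in poly)
                 for poly in polygons)

# A list of *lists* of tuples, the form that triggered the TypeError above.
verts = [[(0, 0), (1, 0), (1, 1)], [(2, 2), (3, 2), (3, 3)]]
fixed = as_tuples(verts)
```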

Having solved that problem, I am getting more optimistic about being 
able to come up with a usable filled contour capability fairly quickly.  
Still no promises, though.

All this brings to mind a question that has puzzled me for a long time: 
why does matplotlib internally use sequences of (x,y) tuples instead of 
numerix arrays--either a 2-D array, or a pair (or tuple) of 1-D arrays? 
I would think that running all plotted numbers through the conversion 
from arrays to Python tuples, and then from there into the native data 
types for each backend, would incur a big performance penalty when 
plotting large numbers of points.  Not that I am suggesting a 
redesign--I am just curious.

Eric

Re: [Matplotlib-users] filled contours: progress, a minor bug (?), and a point of curiosity

From: John H. <jdh...@ac...> - 2005-02-22 18:45:34

>>>>> "Eric" == Eric Firing <ef...@ha...> writes:

    Eric> Perry, John, Progress!  

Cool!

    Eric> I found that the problem I was having with PolyCollection
    Eric> was this: the vertices argument must be a sequence (list or
    Eric> tuple) of tuples of tuples--if one gives it a list of
    Eric> *lists* of tuples, one gets

    Eric> [first part of trace omitted] File
    Eric> "/usr/lib/python2.3/site-packages/matplotlib/collections.py",
    Eric> line 205, in draw self._offsets, self._transOffset)
    Eric> TypeError: CXX: type error

    Eric> (The line number was smaller before I put in some debugging
    Eric> print statements.)

    Eric> I think this fussiness qualifies as a bug; the docstring for
    Eric> PolyCollection says vertices can be a sequence of sequences
    Eric> of tuples.  I don't know what the right way to fix it is,
    Eric> however, so I am working around it.

Fair enough -- I just fixed all the agg collection drawing routines to
work with the sequence API and not require tuples.  Glad to see you're
making progress -- poly contouring is something I'd like to see added.

    Eric> Having solved that problem, I am getting more optimistic
    Eric> about being able to come up with a usable filled contour
    Eric> capability fairly quickly.  Still no promises, though.

Great -- be mindful of the contourf matlab docstrings.  Strict
adherence is not required, but it is nice to be compatible where
possible.

    Eric> All this brings to mind a question that has puzzled me for a
    Eric> long time: why does matplotlib internally use sequences of
    Eric> (x,y) tuples instead of numerix arrays--either a 2-D array,
    Eric> or a pair (or tuple) of 1-D arrays? I would think that
    Eric> running all plotted numbers through the conversion from
    Eric> arrays to Python tuples, and then from there into the native
    Eric> data types for each backend, would incur a big performance
    Eric> penalty when plotting large numbers of points.  Not that I
    Eric> am suggesting a redesign--I am just curious.

Historical and other reasons.  The historical part is that this part
of the code was written before Todd had solved the numeric/numarray
API compatibility problem for matplotlib.  That is now solved, and I've
been slowly adding some numerix code to backend agg, most recently in
0.72.  I don't think it would make a lot of difference for
collections.  In the first place, you'd have to create all these lists
of numarray lists, since the collection is by definition a list of
disconnected lines.  In the second place, there is a fair amount going
on in the inner loop that I think would offset the gains you get from
using numeric.  In draw_lines, where the x,y access is a major part of
the inner loop, I do use numerix.

The backend API is moving to a path drawing model, which may obviate
the need for specialized collection drawing methods. The collection
interface would remain unchanged, but we might get away w/o having
special methods to draw them.

JDH

Re: [Matplotlib-users] filled contours: progress, a minor bug (?), and a point of curiosity

From: Eric F. <ef...@ha...> - 2005-02-25 18:57:23

John,

>     Eric> Having solved that problem, I am getting more optimistic
>     Eric> about being able to come up with a usable filled contour
>     Eric> capability fairly quickly.  Still no promises, though.
> 
> Great -- be mindful of the contourf matlab docstrings.  Strict
> adherence is not required, but it is nice to be compatible where
> possible.

I have the basic filled contour functionality working, with the 
following caveats, comments, and questions:

0) I've done only the simplest of testing so far.

1) There is a fundamental difference in strategy between Matlab's 
contour patch generation algorithm and gcntr.c: Matlab makes all patches 
as simply connected regions without branch cuts, but gcntr polygons have 
branch cuts.  This means that we can't use the polygon edges; if one 
wants line contours at the contour levels, they must be drawn 
separately, by asking gcntr for lines, as contour does.  My inclination 
is to leave it this way: the user can simply call contourf to get the 
filled regions, and then call contour to add lines as needed.  Typically 
I draw lines at only a few of the color boundaries, and sometimes I draw 
additional lines within colored regions, so this is the way I normally 
use matlab contourf and contour anyway.

2) In the present version, there is much too much duplication of code 
between contour and contourf in axes.py; I copied the contour function 
to contourf, modified what I needed to, and moved only the 
ContourMappable class out to the module level.  I would like to factor 
out more of the common code.

3) The docstrings in axes.py are driving me nuts--lacking proper 
indentation, they make it very difficult to find the function 
definitions.  I presume this is because of the way boilerplate.py is 
generating the pylab.py functions and their docstrings.  I haven't 
looked at boilerplate.py (I haven't used it yet at all), but I suspect 
it would be easy to change things so that it would handle properly 
indented docstrings.  Is it OK if I do this?

4) ToDo: it is not standard in matlab, but for filled contouring I 
always use a matching colorbar--essentially a colorbar contoured with 
the same levels and colors as the contour plot itself, rather than one 
that shows the whole colormap.

5) ToDo: I haven't tried to do anything with region masking yet; maybe I 
will get to it soon, since it is something I need.

6) gcntr.c uses global variables, which presumably means that it will 
fail if called from more than one thread at a time.  Longer term, should 
I/we/someone modify it so that this is not the case?  Or is this 
characteristic of other routines used by matplotlib, so there is no 
point in worrying about gcntr.c in particular?

7) When the time comes to send you my modifications, how should I do it: 
diffs, or complete files? Send to you directly, or to the list?  If you 
would prefer diffs, please give me an example of the exact diff command 
options to use.  (I am working with matplotlib-0.72.1 as a starting 
point.)  Modified files will include axes.py, pylab.py (and/or 
boilerplate.py), _contour.c, and an example.
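
The matching colorbar mentioned in the ToDo on colorbars amounts to one color per interval between consecutive levels. A minimal sketch of that lookup in plain Python, with made-up levels and color names and a hypothetical band_index helper:

```python
from bisect import bisect_right

def band_index(value, levels):
    """Return the index of the band [levels[i], levels[i+1]) holding value,
    or None when value lies outside the contoured range."""
    if value < levels[0] or value >= levels[-1]:
        return None
    return bisect_right(levels, value) - 1

levels = [0.0, 0.1, 0.2, 0.4]             # contour levels, ascending
colors = ["blue", "green", "red"]         # one color per band (hypothetical)
# Each colorbar patch spans one band and carries that band's color.
bands = [(levels[i], levels[i + 1], colors[i]) for i in range(len(colors))]
```

A colorbar built from bands would show exactly the colors that appear in the filled contour plot, rather than the whole colormap.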

Eric

Re: [Matplotlib-users] filled contours, and missing data

From: Perry G. <pe...@st...> - 2005-02-22 21:45:40

On Feb 22, 2005, at 4:21 PM, John Hunter wrote:

>>>>>> "Perry" == Perry Greenfield <pe...@st...> writes:
>
>     Perry> As far as keyword args go, it seems to me that they would
>     Perry> be more convenient in many cases, but as Stephen mentions,
>     Perry> may be a fair amount of work (and in essence, they are an
>     Perry> attribute of the data, so that may be where they belong).
>
> In the context of plotting, it isn't clear that NaNess is an attribute
> of the data.  If the data are y = sin(2*pi*t), then NaNess enters only
> under certain transformations (eg log) of the data.  From my
> perspective, or the perspective of a matplotlib Line, the data are
> intact, it is just that under certain transformations the data are
> invalid.  Thus the mask is only needed under certain views
> (transformations) which the Line class is mostly unaware of.  I think
> there are two cases to be distinguished: the case Eric mentioned where
> some data points are NaN or None because the measurements are missing
> or invalid, and the case where the data are valid but are nan under
> certain transformations.
>
Right.

> For the latter, it would be maximally useful to be able to do
>
>   #x,y,xt,yt are numerix arrays; transform puts in NaN on domain error
>   xt, yt = transform(x, y)
>
> and the drawing routine drops NaN points and handles the connecting
> segments properly.  That it only works for float arrays is not a
> problem since that is what we are using, eg in the line class
>
>         self._x = asarray(x, Float)
>         self._y = asarray(y, Float)
>
> but it is my current understanding that this ain't gonna happen in a
> consistent way for Numeric, Numeric3, and numarray across platforms,
> because my brief forays into researching this issue turned up posts by
> Tim Peters saying that there was no standard way of dealing with IEEE
> 754 special values across compilers.  If that's true, and we can't fix
> it or work around it, then I think boolean masks passed as a kwarg may
> be the easiest way to handle this across various numerix
> implementations, assuming that Numeric3 comes to fruition and assuming
> that there isn't a consistent MA approach that is accessible at the
> API level, which I don't know to be true but I get the feeling that
> this is a reasonable guess.
>
I think you may be over-extending what Tim was saying. I believe the 
issue is writing C code that handles things like tests against NaN 
values correctly. In that, Tim is right, there is a great deal of 
inconsistency in how C compilers handle ieee special values. But 
generally speaking, the computations *are* handled consistently by the 
floating point processors. numarray solved this problem by not relying 
on the C compiler to test or set NaN values (it tests raw bit patterns 
instead) and so should handle this issue properly. Numeric doesn't, but 
Numeric3 is planned to. So I think NaNs can't be handled right now 
because of Numeric, but I don't think the C compiler issue that Tim 
mentions is a roadblock (it is a nuisance for implementation though).
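
The idea of testing raw bit patterns can be illustrated in Python with the struct module. This is only a sketch of the technique for IEEE 754 doubles, not numarray's actual code; is_nan_bits is a hypothetical name.

```python
import struct

def is_nan_bits(value):
    """Test for NaN by inspecting the IEEE 754 double bit pattern."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", value))
    exponent = (bits >> 52) & 0x7FF          # 11 exponent bits
    mantissa = bits & ((1 << 52) - 1)        # 52 fraction bits
    # NaN: exponent all ones and a nonzero fraction (zero fraction is inf).
    return exponent == 0x7FF and mantissa != 0
```

No floating-point comparison is involved, so the result does not depend on how a compiler treats comparisons against NaN.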

> I'd be interested in getting feedback from those of you who have
> informed opinions on these matters:
>
>   - Is it possible to handle NaN in Numeric/numarray arrays across the
>     big three platforms?  I'm pretty sure the answer here is no.
>
No for Numeric, yes for numarray/Numeric3

>   - Are Numeric and numarray MAs similar enough to be used at the
>     C-API level, which is where agg would be checking the mask?  Does
>     anyone know whether MAs will be part of the Numeric3 core?  I
>     haven't seen any reference to them in the PEP.
>
We'll have to raise it. I imagine that they should be. As I mentioned, 
NaNs don't solve all the problems that MA does. Since MA is layered on 
Numeric/numarray it shouldn't be hard to do so. As far as C-API, since 
it is a Python implementation, I presume it comes down to whether or 
not the needed attributes are the same for the Numeric and numarray 
variants. I'm not immediately familiar with that, but Todd should be 
able to give a fairly quick answer on that (he did the port to 
numarray).

Perry

Re: [Matplotlib-users] filled contours, and missing data

From: Todd M. <jm...@st...> - 2005-02-22 22:29:15

On Tue, 2005-02-22 at 16:46, Perry Greenfield wrote:
> >   - Are Numeric and numarray MAs similar enough to be used at the
> >     C-API level, which is where agg would be checking the mask?  Does
> >     anyone know whether MAs will be part of the Numeric3 core?  I
> >     haven't seen any reference to them in the PEP.
> >
> We'll have to raise it. I imagine that they should be. As I mentioned, 
> NaNs don't solve all the problems that MA does. Since MA is layered on 
> Numeric/numarray it shouldn't be hard to do so. As far as C-API, since 
> it is a Python implementation, I presume it comes down to whether or 
> not the needed attributes are the same for the numeric and Numarray 
> variants. I'm not immediately familiar with that, but Todd should be 
> able to give a fairly quick answer on that (he did the port to 
> numarray)

As Perry said,  numarray.ma is a port of MA to numarray so they're
fairly close.  Both packages are pure Python layered over ordinary
numarray or Numeric numerical arrays.   Accessing from C,  both packages
should yield data or mask information via a method callback.  The
resulting arrays are PyArrayObjects which are source compatible only: 
extensions using MA component arrays will need to be compiled for either
Numeric or numarray as we do now for _image, _transforms, and _contour.

Todd