From: Ben A. <bax...@co...> - 2008-07-25 15:09:01
I have noticed two bugs related to NaN handling in the scatter() function, and one other bug that seems to be in numpy.

1. The min and max for the axes are not computed properly when there are NaNs in the data. Example:

    import pylab as pl
    import numpy as np
    x = np.asarray([0, 1, 2, 3, None, 5, 6, 7, 8, 9], float)
    y = np.asarray([0, None, 2, 3, 4, 5, 6, 7, 8, 9], float)
    ax = pl.subplot(111)
    ax.scatter(x, y)
    pl.show()

The points with NaN values are left out of the plot as expected, but everything before the NaN is ignored when computing the axis ranges. (The X axis goes from 4 to 10, cutting off some data, when it should go from -1 to 10. The Y axis goes from 1 to 10 when it should also go from -1 to 10.) This is rather annoying since these simple calls fix the issue:

    ax.set_xlim(min(x), max(x))
    ax.set_ylim(min(y), max(y))

2. We see the same behavior for the 'c' axis. Example:

    import pylab as pl
    import numpy as np
    x = np.asarray([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], float)
    y = np.asarray([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], float)
    z = np.asarray([0, 1, 2, 3, 4, 5, None, 7, 8, 9], float)
    ax = pl.subplot(111)
    ax.scatter(x, y, c=z)
    pl.show()

Everything before point 7 has zero color. We can band-aid it by adding vmin and vmax to the call:

    ax.scatter(x, y, c=z, vmin=min(z), vmax=max(z))

Then only the one NaN point has zero color.

3. Both of the band-aid fixes above suffer from a bug (I think in numpy): min() and max() misbehave on a numpy array whose first value is NaN:

    x = np.asarray([None, 1, 2, 3, 4, 5, 6, 7, 8, 9], float)
    y = np.asarray([0, 1, 2, 3, 4, 5, 6, 7, 8, None], float)
    z = np.asarray([0, 1, 2, 3, 4, 5, None, 7, 8, 9], float)
    print min(x), max(x)  # prints 1.#QNAN 1.#QNAN
    print min(y), max(y)  # prints 0.0 8.0
    print min(z), max(z)  # prints 0.0 9.0

FYI, I am using matplotlib version 0.91.4 and NumPy 1.1.0 on Windows and Debian Linux.

Thanks,
-Ben
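
The min() and max() results in item 3 are consistent with how Python's built-in min() and max() work: they compare elements pairwise, and every ordering comparison against NaN is False, so a NaN in the first position is never displaced while a later NaN is silently skipped. The following is a minimal sketch of a NaN-safe replacement for the band-aid calls, not a fix to scatter() itself. The helper name nan_safe_limits is hypothetical, and the sketch uses plain Python floats rather than numpy arrays so that it stands alone. With NumPy, np.nanmin() and np.nanmax() should serve the same purpose.

```python
import math

nan = float("nan")


def nan_safe_limits(values):
    """Return (min, max) of values, ignoring NaN entries (hypothetical helper)."""
    finite = [v for v in values if not math.isnan(v)]
    if not finite:
        raise ValueError("no non-NaN values to compute limits from")
    return min(finite), max(finite)


# Same data as item 3, written as plain floats.
x = [nan, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
y = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, nan]
z = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, nan, 7.0, 8.0, 9.0]

# Built-in min()/max(): a leading NaN is kept because no later
# element ever compares as smaller or larger than NaN.
naive_x = (min(x), max(x))
naive_y = (min(y), max(y))

# NaN-safe limits, computed only from the non-NaN entries.
limits_x = nan_safe_limits(x)
limits_y = nan_safe_limits(y)
limits_z = nan_safe_limits(z)

print(naive_x, naive_y)
print(limits_x, limits_y, limits_z)
```

The resulting pairs could then be passed to ax.set_xlim() and ax.set_ylim(), or as vmin and vmax to scatter(), in place of the plain min()/max() calls.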