|
From: Eric F. <ef...@ha...> - 2011-02-18 19:17:36
|
On 02/18/2011 08:26 AM, Benjamin Root wrote: > > > On Thu, Feb 17, 2011 at 1:36 PM, Benjamin Root <ben...@ou... > <mailto:ben...@ou...>> wrote: > > On Tue, Feb 15, 2011 at 2:05 PM, Benjamin Root <ben...@ou... > <mailto:ben...@ou...>> wrote: > > On Tue, Feb 15, 2011 at 1:41 PM, Benjamin Root <ben...@ou... > <mailto:ben...@ou...>> wrote: > > > > On Tue, Feb 15, 2011 at 1:17 PM, Eric Firing > <ef...@ha... <mailto:ef...@ha...>> wrote: > > On 02/15/2011 08:50 AM, Benjamin Root wrote: > > On Tue, Feb 15, 2011 at 12:19 PM, Benjamin Root > <ben...@ou... <mailto:ben...@ou...> > <mailto:ben...@ou... <mailto:ben...@ou...>>> > wrote: > > On Tue, Feb 15, 2011 at 11:54 AM, Eric Firing > <ef...@ha... <mailto:ef...@ha...> > <mailto:ef...@ha... > <mailto:ef...@ha...>>> wrote: > > On 02/15/2011 07:40 AM, Benjamin Root wrote: > > I have come across a little inconsistency that > was unexpected > in the > > matplotlib API. The following is perfectly valid: > > > > import matplotlib.pyplot as plt > > plt.plot([], []) > > plt.show() > > > > > > However, this is not valid: > > > > import matplotlib.pyplot as plt > > plt.scatter([], []) > > plt.show() > > > > > > The immediate issue that comes up in scatter is > that it > attempts to find > > min/max of the input data for the purpose of > autoscaling > (this can > > probably be done better by just using > set_xmargin(0.05) and > > set_ymargin(0.05)). This can easily be bypassed > with an if > statement. > > However, then we discover that polygon collection > do not like > having > > empty offsets, which leads to a failure in the affine > transformation. > > > > So, the question is, is this a bug or a feature? I > personally believe > > that empty data is a perfectly valid scenario and > given that > other > > matplotlib functions handle it gracefully, we > should make the > > collections object more friendly to empty data. > > I agree; a call with empty data should > simply not plot anything. > > Eric > > > Digging further, it appears that the problem is > in _path.cpp for > _path_module::affine_transform() which > explicitly checks for an > empty vertices array and throws an exception if > it is empty. > > So, do we want to make _path.cpp empty-friendly > or should we just > make empty collections objects just avoid doing > anything that > requires doing an affine transform? > > Ben Root > > > > Ok, some more digging deepens the mystery. While an > empty-friendly > _path.cpp would be nice, it appears that the > collections and axes > objects are already doing all it can to avoid doing > transforms for empty > collections. > > However, it appears that the supposedly empty > collection object from > scatter([], []) is not really empty. Its > self._paths member contains a > list of unit_circle() from Path. This is also the > case for > EllipseCollection. Meanwhile, LineCollection and > PatchCollection > initializes their self._paths in accordance to their > given data. > > > One way to solve the problem would be to start each > draw() method with a short-circuit return in case there > is nothing to draw. It would be needed only for classes > for which empty self._paths is not a valid test. So for > CircleCollection it would be: > > @allow_rasterization > def draw(self, renderer): > # sizes is the area of the circle circumscribing > the polygon > # in points^2 > if len(self._sizes) == 0: > return > self._transforms = [ > transforms.Affine2D().scale( > (np.sqrt(x) * self.figure.dpi / 72.0) / > np.sqrt(np.pi)) > for x in self._sizes] > return Collection.draw(self, renderer) > > (Collection.draw returns nothing, so there is no > inconsistency in the return value.) > > Alternatively, it looks like maybe an empty > self._transforms could be used in a short-circuit test > at the start of Collection.draw. > > Eric > > > > Ben Root > > > > That wouldn't completely solve the problem. Affine > transforms are being done for get_datalim() as well. The > other issue (and I see that I mixed this up myself) is the > assumption elsewhere that a non-empty self._path attribute > means that there is something to plot. This is an > assumption that is made in axes.py on line 1413 and it is an > invalid assumption. > > As for your proposed solution in draw(), I prefer > short-circuiting in Collections.draw(). This makes for less > work for new Collection subclasses. > > Ben Root > > > > Working from an assumption that we can't possibly find all > bad/broken assumptions in mpl, I decided to go the route of Eric > and make the Collections object as robust as possible. I have > attached a patch that should enable empty scatter plots in > matplotlib and enable empty Collections as well. > > This has not been fully tested yet. Please be brutal! > > Ben Root > > > > Heh, looks like that patch breaks the plotting of errorbars (the > line connecting the center point to the error caps is missing). > > I will see about fixing that. > > Ben Root > > > Nailed it! > > After much examination and analysis, I realized that the transforms and > the backend renderers are actually quite robust to empty arrays of > points, so long as the arrays are shaped Nx2. It appears that in > collections.py, an attempt is made to force that shape, but this is done > inconsistently and wasn't technically correct. > > Consider the following snippet: > > offsets = np.asarray(offsets) > if len(offsets.shape) == 1: > offsets = offsets[np.newaxis,:] # Make it Nx2. > > If offsets was [[1, 4], [2, 5], [3, 6]], everything is fine. If offsets > was [1, 4], everything works fine as well. However, if offsets was [1, > 2, 3], this would go through, producing a 1x3 array (which isn't what we > want and can cause errors later). If offsets was [], then a (1, 0) > array is produced, and this causes problems later in the code when it > checks for empty arrays incorrectly. > > We can make this code better and clearer by shaping the array > appropriately like so: > > offsets = np.asarray(offsets) > offsets.shape = (-1, 2) > > There are multiple benefits to this. First, it explicitly makes it > clear our intent and requirement (that offsets needs to be Nx2). > Second, it guarantees no copying of the array. Third, any array that > can't fit this shape will throw an error. Fourth, we eliminate an > if-statement! > > So, with this cleaner enforcement, I applied it to other places in > collections.py where it wasn't being done at all. In addition, checks > for empty arrays were being done with a check for len(offsets) > 0, > which isn't quite right (a 1x0 array would pass this check, which is > incorrect), so I changed it to the clearer and more correct offsets.size > > 0. > > The attached patch will not only fix the issues in collections.py, but > it also slightly changes the implementation of scatter() in axes.py to > avoid doing min/max checks (which would fail for empty arrays) and > instead will set data margins if they haven't been set already (and if > there is data). > > I have tested the patch with the test suite and it passes. I also > double-checked that all the mplot3d examples (which are not in the test > suite) and they all rendered correctly. > > My only question is: Is it too late to commit to svn? I don't think so. Try it. It can't hurt. > > I hope this enlightens others. Yes, thank you! Eric > Ben Root |