From: Benjamin R. <ben...@ou...> - 2011-02-18 19:30:08
|
On Fri, Feb 18, 2011 at 1:17 PM, Eric Firing <ef...@ha...> wrote: > On 02/18/2011 08:26 AM, Benjamin Root wrote: > >> >> >> On Thu, Feb 17, 2011 at 1:36 PM, Benjamin Root <ben...@ou... >> <mailto:ben...@ou...>> wrote: >> >> On Tue, Feb 15, 2011 at 2:05 PM, Benjamin Root <ben...@ou... >> <mailto:ben...@ou...>> wrote: >> >> On Tue, Feb 15, 2011 at 1:41 PM, Benjamin Root <ben...@ou... >> <mailto:ben...@ou...>> wrote: >> >> >> >> On Tue, Feb 15, 2011 at 1:17 PM, Eric Firing >> <ef...@ha... <mailto:ef...@ha...>> wrote: >> >> On 02/15/2011 08:50 AM, Benjamin Root wrote: >> >> On Tue, Feb 15, 2011 at 12:19 PM, Benjamin Root >> <ben...@ou... <mailto:ben...@ou...> >> <mailto:ben...@ou... <mailto:ben...@ou...>>> >> >> wrote: >> >> On Tue, Feb 15, 2011 at 11:54 AM, Eric Firing >> <ef...@ha... <mailto:ef...@ha...> >> <mailto:ef...@ha... >> >> <mailto:ef...@ha...>>> wrote: >> >> On 02/15/2011 07:40 AM, Benjamin Root wrote: >> > I have come across a little inconsistency that >> was unexpected >> in the >> > matplotlib API. The following is perfectly valid: >> > >> > import matplotlib.pyplot as plt >> > plt.plot([], []) >> > plt.show() >> > >> > >> > However, this is not valid: >> > >> > import matplotlib.pyplot as plt >> > plt.scatter([], []) >> > plt.show() >> > >> > >> > The immediate issue that comes up in scatter is >> that it >> attempts to find >> > min/max of the input data for the purpose of >> autoscaling >> (this can >> > probably be done better by just using >> set_xmargin(0.05) and >> > set_ymargin(0.05)). This can easily be bypassed >> with an if >> statement. >> > However, then we discover that polygon collection >> do not like >> having >> > empty offsets, which leads to a failure in the >> affine >> transformation. >> > >> > So, the question is, is this a bug or a feature? I >> personally believe >> > that empty data is a perfectly valid scenario and >> given that >> other >> > matplotlib functions handle it gracefully, we >> should make the >> > collections object more friendly to empty data. >> >> I agree; a call with empty data should >> simply not plot anything. >> >> Eric >> >> >> Digging further, it appears that the problem is >> in _path.cpp for >> _path_module::affine_transform() which >> explicitly checks for an >> empty vertices array and throws an exception if >> it is empty. >> >> So, do we want to make _path.cpp empty-friendly >> or should we just >> make empty collections objects just avoid doing >> anything that >> requires doing an affine transform? >> >> Ben Root >> >> >> >> Ok, some more digging deepens the mystery. While an >> empty-friendly >> _path.cpp would be nice, it appears that the >> collections and axes >> objects are already doing all it can to avoid doing >> transforms for empty >> collections. >> >> However, it appears that the supposedly empty >> collection object from >> scatter([], []) is not really empty. Its >> self._paths member contains a >> list of unit_circle() from Path. This is also the >> case for >> EllipseCollection. Meanwhile, LineCollection and >> PatchCollection >> initializes their self._paths in accordance to their >> given data. >> >> >> One way to solve the problem would be to start each >> draw() method with a short-circuit return in case there >> is nothing to draw. It would be needed only for classes >> for which empty self._paths is not a valid test. So for >> CircleCollection it would be: >> >> @allow_rasterization >> def draw(self, renderer): >> # sizes is the area of the circle circumscribing >> the polygon >> # in points^2 >> if len(self._sizes) == 0: >> return >> self._transforms = [ >> transforms.Affine2D().scale( >> (np.sqrt(x) * self.figure.dpi / 72.0) / >> np.sqrt(np.pi)) >> for x in self._sizes] >> return Collection.draw(self, renderer) >> >> (Collection.draw returns nothing, so there is no >> inconsistency in the return value.) >> >> Alternatively, it looks like maybe an empty >> self._transforms could be used in a short-circuit test >> at the start of Collection.draw. >> >> Eric >> >> >> >> Ben Root >> >> >> >> That wouldn't completely solve the problem. Affine >> transforms are being done for get_datalim() as well. The >> other issue (and I see that I mixed this up myself) is the >> assumption elsewhere that a non-empty self._path attribute >> means that there is something to plot. This is an >> assumption that is made in axes.py on line 1413 and it is an >> invalid assumption. >> >> As for your proposed solution in draw(), I prefer >> short-circuiting in Collections.draw(). This makes for less >> work for new Collection subclasses. >> >> Ben Root >> >> >> >> Working from an assumption that we can't possibly find all >> bad/broken assumptions in mpl, I decided to go the route of Eric >> and make the Collections object as robust as possible. I have >> attached a patch that should enable empty scatter plots in >> matplotlib and enable empty Collections as well. >> >> This has not been fully tested yet. Please be brutal! >> >> Ben Root >> >> >> >> Heh, looks like that patch breaks the plotting of errorbars (the >> line connecting the center point to the error caps is missing). >> >> I will see about fixing that. >> >> Ben Root >> >> >> Nailed it! >> >> After much examination and analysis, I realized that the transforms and >> the backend renderers are actually quite robust to empty arrays of >> points, so long as the arrays are shaped Nx2. It appears that in >> collections.py, an attempt is made to force that shape, but this is done >> inconsistently and wasn't technically correct. >> >> Consider the following snippet: >> >> offsets = np.asarray(offsets) >> if len(offsets.shape) == 1: >> offsets = offsets[np.newaxis,:] # Make it Nx2. >> >> If offsets was [[1, 4], [2, 5], [3, 6]], everything is fine. If offsets >> was [1, 4], everything works fine as well. However, if offsets was [1, >> 2, 3], this would go through, producing a 1x3 array (which isn't what we >> want and can cause errors later). If offsets was [], then a (1, 0) >> array is produced, and this causes problems later in the code when it >> checks for empty arrays incorrectly. >> >> We can make this code better and clearer by shaping the array >> appropriately like so: >> >> offsets = np.asarray(offsets) >> offsets.shape = (-1, 2) >> >> There are multiple benefits to this. First, it explicitly makes it >> clear our intent and requirement (that offsets needs to be Nx2). >> Second, it guarantees no copying of the array. Third, any array that >> can't fit this shape will throw an error. Fourth, we eliminate an >> if-statement! >> >> So, with this cleaner enforcement, I applied it to other places in >> collections.py where it wasn't being done at all. In addition, checks >> for empty arrays were being done with a check for len(offsets) > 0, >> which isn't quite right (a 1x0 array would pass this check, which is >> incorrect), so I changed it to the clearer and more correct offsets.size >> > 0. >> >> The attached patch will not only fix the issues in collections.py, but >> it also slightly changes the implementation of scatter() in axes.py to >> avoid doing min/max checks (which would fail for empty arrays) and >> instead will set data margins if they haven't been set already (and if >> there is data). >> >> I have tested the patch with the test suite and it passes. I also >> double-checked that all the mplot3d examples (which are not in the test >> suite) and they all rendered correctly. >> >> My only question is: Is it too late to commit to svn? >> > > I don't think so. Try it. It can't hurt. > > > >> I hope this enlightens others. >> > > Yes, thank you! > > Eric > > Ben Root >> > > Ah, and so it is still open. Committed in r8988. Ben Root |