|
From: Benjamin R. <ben...@ou...> - 2011-02-15 17:40:32
|
I have come across a little inconsistency that was unexpected in the matplotlib API. The following is perfectly valid: import matplotlib.pyplot as plt plt.plot([], []) plt.show() However, this is not valid: import matplotlib.pyplot as plt plt.scatter([], []) plt.show() The immediate issue that comes up in scatter is that it attempts to find min/max of the input data for the purpose of autoscaling (this can probably be done better by just using set_xmargin(0.05) and set_ymargin(0.05)). This can easily be bypassed with an if statement. However, then we discover that polygon collection do not like having empty offsets, which leads to a failure in the affine transformation. So, the question is, is this a bug or a feature? I personally believe that empty data is a perfectly valid scenario and given that other matplotlib functions handle it gracefully, we should make the collections object more friendly to empty data. Ben Root |
|
From: Ben G. <bga...@gm...> - 2011-02-15 17:52:07
|
On Tue, 15 Feb 2011 11:40:01 -0600, Benjamin Root <ben...@ou...> wrote: > So, the question is, is this a bug or a feature? I personally believe that > empty data is a perfectly valid scenario and given that other matplotlib > functions handle it gracefully, we should make the collections object more > friendly to empty data. > I agree that this is certainly a bug. For instance, I have a short script which reads a stream of data from stdin and plots in realtime. Unfortunately this bug made it prohibitively difficult to add support for scatter plots. I'd definitely appreciate it if this were fixed. Cheers, - Ben |
|
From: Eric F. <ef...@ha...> - 2011-02-15 17:54:50
|
On 02/15/2011 07:40 AM, Benjamin Root wrote: > I have come across a little inconsistency that was unexpected in the > matplotlib API. The following is perfectly valid: > > import matplotlib.pyplot as plt > plt.plot([], []) > plt.show() > > > However, this is not valid: > > import matplotlib.pyplot as plt > plt.scatter([], []) > plt.show() > > > The immediate issue that comes up in scatter is that it attempts to find > min/max of the input data for the purpose of autoscaling (this can > probably be done better by just using set_xmargin(0.05) and > set_ymargin(0.05)). This can easily be bypassed with an if statement. > However, then we discover that polygon collection do not like having > empty offsets, which leads to a failure in the affine transformation. > > So, the question is, is this a bug or a feature? I personally believe > that empty data is a perfectly valid scenario and given that other > matplotlib functions handle it gracefully, we should make the > collections object more friendly to empty data. I agree; a call with empty data should simply not plot anything. Eric > > Ben Root > |
|
From: Benjamin R. <ben...@ou...> - 2011-02-15 18:20:00
|
On Tue, Feb 15, 2011 at 11:54 AM, Eric Firing <ef...@ha...> wrote: > On 02/15/2011 07:40 AM, Benjamin Root wrote: > > I have come across a little inconsistency that was unexpected in the > > matplotlib API. The following is perfectly valid: > > > > import matplotlib.pyplot as plt > > plt.plot([], []) > > plt.show() > > > > > > However, this is not valid: > > > > import matplotlib.pyplot as plt > > plt.scatter([], []) > > plt.show() > > > > > > The immediate issue that comes up in scatter is that it attempts to find > > min/max of the input data for the purpose of autoscaling (this can > > probably be done better by just using set_xmargin(0.05) and > > set_ymargin(0.05)). This can easily be bypassed with an if statement. > > However, then we discover that polygon collection do not like having > > empty offsets, which leads to a failure in the affine transformation. > > > > So, the question is, is this a bug or a feature? I personally believe > > that empty data is a perfectly valid scenario and given that other > > matplotlib functions handle it gracefully, we should make the > > collections object more friendly to empty data. > > I agree; a call with empty data should simply not plot anything. > > Eric > > Digging further, it appears that the problem is in _path.cpp for _path_module::affine_transform() which explicitly checks for an empty vertices array and throws an exception if it is empty. So, do we want to make _path.cpp empty-friendly or should we just make empty collections objects just avoid doing anything that requires doing an affine transform? Ben Root |
|
From: Benjamin R. <ben...@ou...> - 2011-02-15 18:50:48
|
On Tue, Feb 15, 2011 at 12:19 PM, Benjamin Root <ben...@ou...> wrote: > On Tue, Feb 15, 2011 at 11:54 AM, Eric Firing <ef...@ha...> wrote: > >> On 02/15/2011 07:40 AM, Benjamin Root wrote: >> > I have come across a little inconsistency that was unexpected in the >> > matplotlib API. The following is perfectly valid: >> > >> > import matplotlib.pyplot as plt >> > plt.plot([], []) >> > plt.show() >> > >> > >> > However, this is not valid: >> > >> > import matplotlib.pyplot as plt >> > plt.scatter([], []) >> > plt.show() >> > >> > >> > The immediate issue that comes up in scatter is that it attempts to find >> > min/max of the input data for the purpose of autoscaling (this can >> > probably be done better by just using set_xmargin(0.05) and >> > set_ymargin(0.05)). This can easily be bypassed with an if statement. >> > However, then we discover that polygon collection do not like having >> > empty offsets, which leads to a failure in the affine transformation. >> > >> > So, the question is, is this a bug or a feature? I personally believe >> > that empty data is a perfectly valid scenario and given that other >> > matplotlib functions handle it gracefully, we should make the >> > collections object more friendly to empty data. >> >> I agree; a call with empty data should simply not plot anything. >> >> Eric >> >> > Digging further, it appears that the problem is in _path.cpp for > _path_module::affine_transform() which explicitly checks for an empty > vertices array and throws an exception if it is empty. > > So, do we want to make _path.cpp empty-friendly or should we just make > empty collections objects just avoid doing anything that requires doing an > affine transform? > > Ben Root > > Ok, some more digging deepens the mystery. While an empty-friendly _path.cpp would be nice, it appears that the collections and axes objects are already doing all it can to avoid doing transforms for empty collections. However, it appears that the supposedly empty collection object from scatter([], []) is not really empty. Its self._paths member contains a list of unit_circle() from Path. This is also the case for EllipseCollection. Meanwhile, LineCollection and PatchCollection initializes their self._paths in accordance to their given data. Ben Root |
|
From: Eric F. <ef...@ha...> - 2011-02-15 19:17:21
|
On 02/15/2011 08:50 AM, Benjamin Root wrote:
> On Tue, Feb 15, 2011 at 12:19 PM, Benjamin Root <ben...@ou...
> <mailto:ben...@ou...>> wrote:
>
> On Tue, Feb 15, 2011 at 11:54 AM, Eric Firing <ef...@ha...
> <mailto:ef...@ha...>> wrote:
>
> On 02/15/2011 07:40 AM, Benjamin Root wrote:
> > I have come across a little inconsistency that was unexpected
> in the
> > matplotlib API. The following is perfectly valid:
> >
> > import matplotlib.pyplot as plt
> > plt.plot([], [])
> > plt.show()
> >
> >
> > However, this is not valid:
> >
> > import matplotlib.pyplot as plt
> > plt.scatter([], [])
> > plt.show()
> >
> >
> > The immediate issue that comes up in scatter is that it
> attempts to find
> > min/max of the input data for the purpose of autoscaling
> (this can
> > probably be done better by just using set_xmargin(0.05) and
> > set_ymargin(0.05)). This can easily be bypassed with an if
> statement.
> > However, then we discover that polygon collection do not like
> having
> > empty offsets, which leads to a failure in the affine
> transformation.
> >
> > So, the question is, is this a bug or a feature? I
> personally believe
> > that empty data is a perfectly valid scenario and given that
> other
> > matplotlib functions handle it gracefully, we should make the
> > collections object more friendly to empty data.
>
> I agree; a call with empty data should simply not plot anything.
>
> Eric
>
>
> Digging further, it appears that the problem is in _path.cpp for
> _path_module::affine_transform() which explicitly checks for an
> empty vertices array and throws an exception if it is empty.
>
> So, do we want to make _path.cpp empty-friendly or should we just
> make empty collections objects just avoid doing anything that
> requires doing an affine transform?
>
> Ben Root
>
>
>
> Ok, some more digging deepens the mystery. While an empty-friendly
> _path.cpp would be nice, it appears that the collections and axes
> objects are already doing all it can to avoid doing transforms for empty
> collections.
>
> However, it appears that the supposedly empty collection object from
> scatter([], []) is not really empty. Its self._paths member contains a
> list of unit_circle() from Path. This is also the case for
> EllipseCollection. Meanwhile, LineCollection and PatchCollection
> initializes their self._paths in accordance to their given data.
One way to solve the problem would be to start each draw() method with a
short-circuit return in case there is nothing to draw. It would be
needed only for classes for which empty self._paths is not a valid test.
So for CircleCollection it would be:
@allow_rasterization
def draw(self, renderer):
# sizes is the area of the circle circumscribing the polygon
# in points^2
if len(self._sizes) == 0:
return
self._transforms = [
transforms.Affine2D().scale(
(np.sqrt(x) * self.figure.dpi / 72.0) / np.sqrt(np.pi))
for x in self._sizes]
return Collection.draw(self, renderer)
(Collection.draw returns nothing, so there is no inconsistency in the
return value.)
Alternatively, it looks like maybe an empty self._transforms could be
used in a short-circuit test at the start of Collection.draw.
Eric
>
> Ben Root
>
|
|
From: Benjamin R. <ben...@ou...> - 2011-02-15 19:42:02
|
On Tue, Feb 15, 2011 at 1:17 PM, Eric Firing <ef...@ha...> wrote: > On 02/15/2011 08:50 AM, Benjamin Root wrote: > >> On Tue, Feb 15, 2011 at 12:19 PM, Benjamin Root <ben...@ou... >> <mailto:ben...@ou...>> wrote: >> >> On Tue, Feb 15, 2011 at 11:54 AM, Eric Firing <ef...@ha... >> <mailto:ef...@ha...>> wrote: >> >> On 02/15/2011 07:40 AM, Benjamin Root wrote: >> > I have come across a little inconsistency that was unexpected >> in the >> > matplotlib API. The following is perfectly valid: >> > >> > import matplotlib.pyplot as plt >> > plt.plot([], []) >> > plt.show() >> > >> > >> > However, this is not valid: >> > >> > import matplotlib.pyplot as plt >> > plt.scatter([], []) >> > plt.show() >> > >> > >> > The immediate issue that comes up in scatter is that it >> attempts to find >> > min/max of the input data for the purpose of autoscaling >> (this can >> > probably be done better by just using set_xmargin(0.05) and >> > set_ymargin(0.05)). This can easily be bypassed with an if >> statement. >> > However, then we discover that polygon collection do not like >> having >> > empty offsets, which leads to a failure in the affine >> transformation. >> > >> > So, the question is, is this a bug or a feature? I >> personally believe >> > that empty data is a perfectly valid scenario and given that >> other >> > matplotlib functions handle it gracefully, we should make the >> > collections object more friendly to empty data. >> >> I agree; a call with empty data should simply not plot anything. >> >> Eric >> >> >> Digging further, it appears that the problem is in _path.cpp for >> _path_module::affine_transform() which explicitly checks for an >> empty vertices array and throws an exception if it is empty. >> >> So, do we want to make _path.cpp empty-friendly or should we just >> make empty collections objects just avoid doing anything that >> requires doing an affine transform? >> >> Ben Root >> >> >> >> Ok, some more digging deepens the mystery. While an empty-friendly >> _path.cpp would be nice, it appears that the collections and axes >> objects are already doing all it can to avoid doing transforms for empty >> collections. >> >> However, it appears that the supposedly empty collection object from >> scatter([], []) is not really empty. Its self._paths member contains a >> list of unit_circle() from Path. This is also the case for >> EllipseCollection. Meanwhile, LineCollection and PatchCollection >> initializes their self._paths in accordance to their given data. >> > > One way to solve the problem would be to start each draw() method with a > short-circuit return in case there is nothing to draw. It would be needed > only for classes for which empty self._paths is not a valid test. So for > CircleCollection it would be: > > @allow_rasterization > def draw(self, renderer): > # sizes is the area of the circle circumscribing the polygon > # in points^2 > if len(self._sizes) == 0: > return > self._transforms = [ > transforms.Affine2D().scale( > (np.sqrt(x) * self.figure.dpi / 72.0) / np.sqrt(np.pi)) > for x in self._sizes] > return Collection.draw(self, renderer) > > (Collection.draw returns nothing, so there is no inconsistency in the > return value.) > > Alternatively, it looks like maybe an empty self._transforms could be used > in a short-circuit test at the start of Collection.draw. > > Eric > > > >> Ben Root >> >> > That wouldn't completely solve the problem. Affine transforms are being done for get_datalim() as well. The other issue (and I see that I mixed this up myself) is the assumption elsewhere that a non-empty self._path attribute means that there is something to plot. This is an assumption that is made in axes.py on line 1413 and it is an invalid assumption. As for your proposed solution in draw(), I prefer short-circuiting in Collections.draw(). This makes for less work for new Collection subclasses. Ben Root |
|
From: Benjamin R. <ben...@ou...> - 2011-02-17 19:36:34
|
On Tue, Feb 15, 2011 at 2:05 PM, Benjamin Root <ben...@ou...> wrote: > On Tue, Feb 15, 2011 at 1:41 PM, Benjamin Root <ben...@ou...> wrote: > >> >> >> On Tue, Feb 15, 2011 at 1:17 PM, Eric Firing <ef...@ha...> wrote: >> >>> On 02/15/2011 08:50 AM, Benjamin Root wrote: >>> >>>> On Tue, Feb 15, 2011 at 12:19 PM, Benjamin Root <ben...@ou... >>>> <mailto:ben...@ou...>> wrote: >>>> >>>> On Tue, Feb 15, 2011 at 11:54 AM, Eric Firing <ef...@ha... >>>> <mailto:ef...@ha...>> wrote: >>>> >>>> On 02/15/2011 07:40 AM, Benjamin Root wrote: >>>> > I have come across a little inconsistency that was unexpected >>>> in the >>>> > matplotlib API. The following is perfectly valid: >>>> > >>>> > import matplotlib.pyplot as plt >>>> > plt.plot([], []) >>>> > plt.show() >>>> > >>>> > >>>> > However, this is not valid: >>>> > >>>> > import matplotlib.pyplot as plt >>>> > plt.scatter([], []) >>>> > plt.show() >>>> > >>>> > >>>> > The immediate issue that comes up in scatter is that it >>>> attempts to find >>>> > min/max of the input data for the purpose of autoscaling >>>> (this can >>>> > probably be done better by just using set_xmargin(0.05) and >>>> > set_ymargin(0.05)). This can easily be bypassed with an if >>>> statement. >>>> > However, then we discover that polygon collection do not like >>>> having >>>> > empty offsets, which leads to a failure in the affine >>>> transformation. >>>> > >>>> > So, the question is, is this a bug or a feature? I >>>> personally believe >>>> > that empty data is a perfectly valid scenario and given that >>>> other >>>> > matplotlib functions handle it gracefully, we should make the >>>> > collections object more friendly to empty data. >>>> >>>> I agree; a call with empty data should simply not plot anything. >>>> >>>> Eric >>>> >>>> >>>> Digging further, it appears that the problem is in _path.cpp for >>>> _path_module::affine_transform() which explicitly checks for an >>>> empty vertices array and throws an exception if it is empty. >>>> >>>> So, do we want to make _path.cpp empty-friendly or should we just >>>> make empty collections objects just avoid doing anything that >>>> requires doing an affine transform? >>>> >>>> Ben Root >>>> >>>> >>>> >>>> Ok, some more digging deepens the mystery. While an empty-friendly >>>> _path.cpp would be nice, it appears that the collections and axes >>>> objects are already doing all it can to avoid doing transforms for empty >>>> collections. >>>> >>>> However, it appears that the supposedly empty collection object from >>>> scatter([], []) is not really empty. Its self._paths member contains a >>>> list of unit_circle() from Path. This is also the case for >>>> EllipseCollection. Meanwhile, LineCollection and PatchCollection >>>> initializes their self._paths in accordance to their given data. >>>> >>> >>> One way to solve the problem would be to start each draw() method with a >>> short-circuit return in case there is nothing to draw. It would be needed >>> only for classes for which empty self._paths is not a valid test. So for >>> CircleCollection it would be: >>> >>> @allow_rasterization >>> def draw(self, renderer): >>> # sizes is the area of the circle circumscribing the polygon >>> # in points^2 >>> if len(self._sizes) == 0: >>> return >>> self._transforms = [ >>> transforms.Affine2D().scale( >>> (np.sqrt(x) * self.figure.dpi / 72.0) / np.sqrt(np.pi)) >>> for x in self._sizes] >>> return Collection.draw(self, renderer) >>> >>> (Collection.draw returns nothing, so there is no inconsistency in the >>> return value.) >>> >>> Alternatively, it looks like maybe an empty self._transforms could be >>> used in a short-circuit test at the start of Collection.draw. >>> >>> Eric >>> >>> >>> >>>> Ben Root >>>> >>>> >>> >> That wouldn't completely solve the problem. Affine transforms are being >> done for get_datalim() as well. The other issue (and I see that I mixed >> this up myself) is the assumption elsewhere that a non-empty self._path >> attribute means that there is something to plot. This is an assumption that >> is made in axes.py on line 1413 and it is an invalid assumption. >> >> As for your proposed solution in draw(), I prefer short-circuiting in >> Collections.draw(). This makes for less work for new Collection subclasses. >> >> Ben Root >> > > > Working from an assumption that we can't possibly find all bad/broken > assumptions in mpl, I decided to go the route of Eric and make the > Collections object as robust as possible. I have attached a patch that > should enable empty scatter plots in matplotlib and enable empty Collections > as well. > > This has not been fully tested yet. Please be brutal! > > Ben Root > > > Heh, looks like that patch breaks the plotting of errorbars (the line connecting the center point to the error caps is missing). I will see about fixing that. Ben Root |
|
From: Eric F. <ef...@ha...> - 2011-02-18 19:17:36
|
On 02/18/2011 08:26 AM, Benjamin Root wrote: > > > On Thu, Feb 17, 2011 at 1:36 PM, Benjamin Root <ben...@ou... > <mailto:ben...@ou...>> wrote: > > On Tue, Feb 15, 2011 at 2:05 PM, Benjamin Root <ben...@ou... > <mailto:ben...@ou...>> wrote: > > On Tue, Feb 15, 2011 at 1:41 PM, Benjamin Root <ben...@ou... > <mailto:ben...@ou...>> wrote: > > > > On Tue, Feb 15, 2011 at 1:17 PM, Eric Firing > <ef...@ha... <mailto:ef...@ha...>> wrote: > > On 02/15/2011 08:50 AM, Benjamin Root wrote: > > On Tue, Feb 15, 2011 at 12:19 PM, Benjamin Root > <ben...@ou... <mailto:ben...@ou...> > <mailto:ben...@ou... <mailto:ben...@ou...>>> > wrote: > > On Tue, Feb 15, 2011 at 11:54 AM, Eric Firing > <ef...@ha... <mailto:ef...@ha...> > <mailto:ef...@ha... > <mailto:ef...@ha...>>> wrote: > > On 02/15/2011 07:40 AM, Benjamin Root wrote: > > I have come across a little inconsistency that > was unexpected > in the > > matplotlib API. The following is perfectly valid: > > > > import matplotlib.pyplot as plt > > plt.plot([], []) > > plt.show() > > > > > > However, this is not valid: > > > > import matplotlib.pyplot as plt > > plt.scatter([], []) > > plt.show() > > > > > > The immediate issue that comes up in scatter is > that it > attempts to find > > min/max of the input data for the purpose of > autoscaling > (this can > > probably be done better by just using > set_xmargin(0.05) and > > set_ymargin(0.05)). This can easily be bypassed > with an if > statement. > > However, then we discover that polygon collection > do not like > having > > empty offsets, which leads to a failure in the affine > transformation. > > > > So, the question is, is this a bug or a feature? I > personally believe > > that empty data is a perfectly valid scenario and > given that > other > > matplotlib functions handle it gracefully, we > should make the > > collections object more friendly to empty data. > > I agree; a call with empty data should > simply not plot anything. > > Eric > > > Digging further, it appears that the problem is > in _path.cpp for > _path_module::affine_transform() which > explicitly checks for an > empty vertices array and throws an exception if > it is empty. > > So, do we want to make _path.cpp empty-friendly > or should we just > make empty collections objects just avoid doing > anything that > requires doing an affine transform? > > Ben Root > > > > Ok, some more digging deepens the mystery. While an > empty-friendly > _path.cpp would be nice, it appears that the > collections and axes > objects are already doing all it can to avoid doing > transforms for empty > collections. > > However, it appears that the supposedly empty > collection object from > scatter([], []) is not really empty. Its > self._paths member contains a > list of unit_circle() from Path. This is also the > case for > EllipseCollection. Meanwhile, LineCollection and > PatchCollection > initializes their self._paths in accordance to their > given data. > > > One way to solve the problem would be to start each > draw() method with a short-circuit return in case there > is nothing to draw. It would be needed only for classes > for which empty self._paths is not a valid test. So for > CircleCollection it would be: > > @allow_rasterization > def draw(self, renderer): > # sizes is the area of the circle circumscribing > the polygon > # in points^2 > if len(self._sizes) == 0: > return > self._transforms = [ > transforms.Affine2D().scale( > (np.sqrt(x) * self.figure.dpi / 72.0) / > np.sqrt(np.pi)) > for x in self._sizes] > return Collection.draw(self, renderer) > > (Collection.draw returns nothing, so there is no > inconsistency in the return value.) > > Alternatively, it looks like maybe an empty > self._transforms could be used in a short-circuit test > at the start of Collection.draw. > > Eric > > > > Ben Root > > > > That wouldn't completely solve the problem. Affine > transforms are being done for get_datalim() as well. The > other issue (and I see that I mixed this up myself) is the > assumption elsewhere that a non-empty self._path attribute > means that there is something to plot. This is an > assumption that is made in axes.py on line 1413 and it is an > invalid assumption. > > As for your proposed solution in draw(), I prefer > short-circuiting in Collections.draw(). This makes for less > work for new Collection subclasses. > > Ben Root > > > > Working from an assumption that we can't possibly find all > bad/broken assumptions in mpl, I decided to go the route of Eric > and make the Collections object as robust as possible. I have > attached a patch that should enable empty scatter plots in > matplotlib and enable empty Collections as well. > > This has not been fully tested yet. Please be brutal! > > Ben Root > > > > Heh, looks like that patch breaks the plotting of errorbars (the > line connecting the center point to the error caps is missing). > > I will see about fixing that. > > Ben Root > > > Nailed it! > > After much examination and analysis, I realized that the transforms and > the backend renderers are actually quite robust to empty arrays of > points, so long as the arrays are shaped Nx2. It appears that in > collections.py, an attempt is made to force that shape, but this is done > inconsistently and wasn't technically correct. > > Consider the following snippet: > > offsets = np.asarray(offsets) > if len(offsets.shape) == 1: > offsets = offsets[np.newaxis,:] # Make it Nx2. > > If offsets was [[1, 4], [2, 5], [3, 6]], everything is fine. If offsets > was [1, 4], everything works fine as well. However, if offsets was [1, > 2, 3], this would go through, producing a 1x3 array (which isn't what we > want and can cause errors later). If offsets was [], then a (1, 0) > array is produced, and this causes problems later in the code when it > checks for empty arrays incorrectly. > > We can make this code better and clearer by shaping the array > appropriately like so: > > offsets = np.asarray(offsets) > offsets.shape = (-1, 2) > > There are multiple benefits to this. First, it explicitly makes it > clear our intent and requirement (that offsets needs to be Nx2). > Second, it guarantees no copying of the array. Third, any array that > can't fit this shape will throw an error. Fourth, we eliminate an > if-statement! > > So, with this cleaner enforcement, I applied it to other places in > collections.py where it wasn't being done at all. In addition, checks > for empty arrays were being done with a check for len(offsets) > 0, > which isn't quite right (a 1x0 array would pass this check, which is > incorrect), so I changed it to the clearer and more correct offsets.size > > 0. > > The attached patch will not only fix the issues in collections.py, but > it also slightly changes the implementation of scatter() in axes.py to > avoid doing min/max checks (which would fail for empty arrays) and > instead will set data margins if they haven't been set already (and if > there is data). > > I have tested the patch with the test suite and it passes. I also > double-checked that all the mplot3d examples (which are not in the test > suite) and they all rendered correctly. > > My only question is: Is it too late to commit to svn? I don't think so. Try it. It can't hurt. > > I hope this enlightens others. Yes, thank you! Eric > Ben Root |
|
From: Benjamin R. <ben...@ou...> - 2011-02-18 19:30:08
|
On Fri, Feb 18, 2011 at 1:17 PM, Eric Firing <ef...@ha...> wrote: > On 02/18/2011 08:26 AM, Benjamin Root wrote: > >> >> >> On Thu, Feb 17, 2011 at 1:36 PM, Benjamin Root <ben...@ou... >> <mailto:ben...@ou...>> wrote: >> >> On Tue, Feb 15, 2011 at 2:05 PM, Benjamin Root <ben...@ou... >> <mailto:ben...@ou...>> wrote: >> >> On Tue, Feb 15, 2011 at 1:41 PM, Benjamin Root <ben...@ou... >> <mailto:ben...@ou...>> wrote: >> >> >> >> On Tue, Feb 15, 2011 at 1:17 PM, Eric Firing >> <ef...@ha... <mailto:ef...@ha...>> wrote: >> >> On 02/15/2011 08:50 AM, Benjamin Root wrote: >> >> On Tue, Feb 15, 2011 at 12:19 PM, Benjamin Root >> <ben...@ou... <mailto:ben...@ou...> >> <mailto:ben...@ou... <mailto:ben...@ou...>>> >> >> wrote: >> >> On Tue, Feb 15, 2011 at 11:54 AM, Eric Firing >> <ef...@ha... <mailto:ef...@ha...> >> <mailto:ef...@ha... >> >> <mailto:ef...@ha...>>> wrote: >> >> On 02/15/2011 07:40 AM, Benjamin Root wrote: >> > I have come across a little inconsistency that >> was unexpected >> in the >> > matplotlib API. The following is perfectly valid: >> > >> > import matplotlib.pyplot as plt >> > plt.plot([], []) >> > plt.show() >> > >> > >> > However, this is not valid: >> > >> > import matplotlib.pyplot as plt >> > plt.scatter([], []) >> > plt.show() >> > >> > >> > The immediate issue that comes up in scatter is >> that it >> attempts to find >> > min/max of the input data for the purpose of >> autoscaling >> (this can >> > probably be done better by just using >> set_xmargin(0.05) and >> > set_ymargin(0.05)). This can easily be bypassed >> with an if >> statement. >> > However, then we discover that polygon collection >> do not like >> having >> > empty offsets, which leads to a failure in the >> affine >> transformation. >> > >> > So, the question is, is this a bug or a feature? I >> personally believe >> > that empty data is a perfectly valid scenario and >> given that >> other >> > matplotlib functions handle it gracefully, we >> should make the >> > collections object more friendly to empty data. >> >> I agree; a call with empty data should >> simply not plot anything. >> >> Eric >> >> >> Digging further, it appears that the problem is >> in _path.cpp for >> _path_module::affine_transform() which >> explicitly checks for an >> empty vertices array and throws an exception if >> it is empty. >> >> So, do we want to make _path.cpp empty-friendly >> or should we just >> make empty collections objects just avoid doing >> anything that >> requires doing an affine transform? >> >> Ben Root >> >> >> >> Ok, some more digging deepens the mystery. While an >> empty-friendly >> _path.cpp would be nice, it appears that the >> collections and axes >> objects are already doing all it can to avoid doing >> transforms for empty >> collections. >> >> However, it appears that the supposedly empty >> collection object from >> scatter([], []) is not really empty. Its >> self._paths member contains a >> list of unit_circle() from Path. This is also the >> case for >> EllipseCollection. Meanwhile, LineCollection and >> PatchCollection >> initializes their self._paths in accordance to their >> given data. >> >> >> One way to solve the problem would be to start each >> draw() method with a short-circuit return in case there >> is nothing to draw. It would be needed only for classes >> for which empty self._paths is not a valid test. So for >> CircleCollection it would be: >> >> @allow_rasterization >> def draw(self, renderer): >> # sizes is the area of the circle circumscribing >> the polygon >> # in points^2 >> if len(self._sizes) == 0: >> return >> self._transforms = [ >> transforms.Affine2D().scale( >> (np.sqrt(x) * self.figure.dpi / 72.0) / >> np.sqrt(np.pi)) >> for x in self._sizes] >> return Collection.draw(self, renderer) >> >> (Collection.draw returns nothing, so there is no >> inconsistency in the return value.) >> >> Alternatively, it looks like maybe an empty >> self._transforms could be used in a short-circuit test >> at the start of Collection.draw. >> >> Eric >> >> >> >> Ben Root >> >> >> >> That wouldn't completely solve the problem. Affine >> transforms are being done for get_datalim() as well. The >> other issue (and I see that I mixed this up myself) is the >> assumption elsewhere that a non-empty self._path attribute >> means that there is something to plot. This is an >> assumption that is made in axes.py on line 1413 and it is an >> invalid assumption. >> >> As for your proposed solution in draw(), I prefer >> short-circuiting in Collections.draw(). This makes for less >> work for new Collection subclasses. >> >> Ben Root >> >> >> >> Working from an assumption that we can't possibly find all >> bad/broken assumptions in mpl, I decided to go the route of Eric >> and make the Collections object as robust as possible. I have >> attached a patch that should enable empty scatter plots in >> matplotlib and enable empty Collections as well. >> >> This has not been fully tested yet. Please be brutal! >> >> Ben Root >> >> >> >> Heh, looks like that patch breaks the plotting of errorbars (the >> line connecting the center point to the error caps is missing). >> >> I will see about fixing that. >> >> Ben Root >> >> >> Nailed it! >> >> After much examination and analysis, I realized that the transforms and >> the backend renderers are actually quite robust to empty arrays of >> points, so long as the arrays are shaped Nx2. It appears that in >> collections.py, an attempt is made to force that shape, but this is done >> inconsistently and wasn't technically correct. >> >> Consider the following snippet: >> >> offsets = np.asarray(offsets) >> if len(offsets.shape) == 1: >> offsets = offsets[np.newaxis,:] # Make it Nx2. >> >> If offsets was [[1, 4], [2, 5], [3, 6]], everything is fine. If offsets >> was [1, 4], everything works fine as well. However, if offsets was [1, >> 2, 3], this would go through, producing a 1x3 array (which isn't what we >> want and can cause errors later). If offsets was [], then a (1, 0) >> array is produced, and this causes problems later in the code when it >> checks for empty arrays incorrectly. >> >> We can make this code better and clearer by shaping the array >> appropriately like so: >> >> offsets = np.asarray(offsets) >> offsets.shape = (-1, 2) >> >> There are multiple benefits to this. First, it explicitly makes it >> clear our intent and requirement (that offsets needs to be Nx2). >> Second, it guarantees no copying of the array. Third, any array that >> can't fit this shape will throw an error. Fourth, we eliminate an >> if-statement! >> >> So, with this cleaner enforcement, I applied it to other places in >> collections.py where it wasn't being done at all. In addition, checks >> for empty arrays were being done with a check for len(offsets) > 0, >> which isn't quite right (a 1x0 array would pass this check, which is >> incorrect), so I changed it to the clearer and more correct offsets.size >> > 0. >> >> The attached patch will not only fix the issues in collections.py, but >> it also slightly changes the implementation of scatter() in axes.py to >> avoid doing min/max checks (which would fail for empty arrays) and >> instead will set data margins if they haven't been set already (and if >> there is data). >> >> I have tested the patch with the test suite and it passes. I also >> double-checked that all the mplot3d examples (which are not in the test >> suite) and they all rendered correctly. >> >> My only question is: Is it too late to commit to svn? >> > > I don't think so. Try it. It can't hurt. > > > >> I hope this enlightens others. >> > > Yes, thank you! > > Eric > > Ben Root >> > > Ah, and so it is still open. Committed in r8988. Ben Root |