From: <jasonsage@cr...> - 2009-09-14 20:46:12

Robert Kern wrote:
> prctile does not handle the case where the exact percentile lies between two
> items. scoreatpercentile does.

If mlab is supposed to be compatible with matlab, then isn't this a problem?

From matlab, version 7.2.0.283 (R2006a)

    >> prctile([1 1 2 2 1 2 4 3 2 2 2 3 4 5 6 7 8 9 7 6 4 5 5],[0 25 50 75 100])

    ans =

        1.0000    2.0000    4.0000    5.7500    9.0000

Of course, the 75th percentile is different here too (5.75 instead of scipy's 5.5). I don't know how to explain that discrepancy.

Jason

--
Jason Grout
From: <jasonsage@cr...> - 2009-09-14 17:30:59

I tried the following (most output text is deleted):

    In [1]: ob1=[1,1,2,2,1,2,4,3,2,2,2,3,4,5,6,7,8,9,7,6,4,5,5]
    In [2]: import matplotlib.pyplot as plt
    In [3]: plt.figure()
    In [4]: plt.boxplot(ob1)
    In [5]: plt.savefig('test.png')
    In [6]: import scipy.stats
    In [7]: scipy.stats.scoreatpercentile(ob1,75)
    Out[7]: 5.5

Note that the 75th percentile is 5.5. R agrees with this calculation. However, in the boxplot, the top of the box is around 6, not 5.5. Isn't the top of the box supposed to be at the 75th percentile?

Thanks,

Jason

--
Jason Grout
From: Gökhan Sever <gokhansever@gm...> - 2009-09-14 18:50:09

On Mon, Sep 14, 2009 at 12:30 PM, <jasonsage@...> wrote:
> I tried the following (most output text is deleted):
>
>     In [1]: ob1=[1,1,2,2,1,2,4,3,2,2,2,3,4,5,6,7,8,9,7,6,4,5,5]
>     In [2]: import matplotlib.pyplot as plt
>     In [3]: plt.figure()
>     In [4]: plt.boxplot(ob1)
>     In [5]: plt.savefig('test.png')
>     In [6]: import scipy.stats
>     In [7]: scipy.stats.scoreatpercentile(ob1,75)
>     Out[7]: 5.5
>
> Note that the 75th percentile is 5.5. R agrees with this calculation.
> However, in the boxplot, the top of the box is around 6, not 5.5. Isn't
> the top of the box supposed to be at the 75th percentile?
>
> Thanks,
>
> Jason
>
> --
> Jason Grout

From matplotlib/lib/matplotlib/axes.py you can see how matplotlib is calculating percentiles. And yes, it doesn't conform with scipy's scoreatpercentile():

    # get median and quartiles
    q1, med, q3 = mlab.prctile(d,[25,50,75])

    I[36]: q1
    O[36]: 2.0

    I[37]: med
    O[37]: 4.0

    I[38]: q3
    O[38]: 6.0

Could this be due to rounding? I don't know, but I am curious to hear the explanations for this discrepancy.

--
Gökhan
From: Robert Kern <robert.kern@gm...> - 2009-09-14 19:07:39

On 2009-09-14 13:49 PM, Gökhan Sever wrote:
> On Mon, Sep 14, 2009 at 12:30 PM, <jasonsage@...> wrote:
> > I tried the following (most output text is deleted):
> >
> >     In [1]: ob1=[1,1,2,2,1,2,4,3,2,2,2,3,4,5,6,7,8,9,7,6,4,5,5]
> >     In [2]: import matplotlib.pyplot as plt
> >     In [3]: plt.figure()
> >     In [4]: plt.boxplot(ob1)
> >     In [5]: plt.savefig('test.png')
> >     In [6]: import scipy.stats
> >     In [7]: scipy.stats.scoreatpercentile(ob1,75)
> >     Out[7]: 5.5
> >
> > Note that the 75th percentile is 5.5. R agrees with this calculation.
> > However, in the boxplot, the top of the box is around 6, not 5.5. Isn't
> > the top of the box supposed to be at the 75th percentile?
> >
> > Thanks,
> >
> > Jason
> >
> > --
> > Jason Grout
>
> From matplotlib/lib/matplotlib/axes.py you can see how matplotlib is
> calculating percentiles. And yes, it doesn't conform with scipy's
> scoreatpercentile():
>
>     # get median and quartiles
>     q1, med, q3 = mlab.prctile(d,[25,50,75])
>
>     I[36]: q1
>     O[36]: 2.0
>
>     I[37]: med
>     O[37]: 4.0
>
>     I[38]: q3
>     O[38]: 6.0
>
> Could this be due to rounding? I don't know, but I am curious to hear
> the explanations for this discrepancy.

prctile does not handle the case where the exact percentile lies between two items. scoreatpercentile does.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco
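
The distinction described in this message can be sketched in plain Python. This is only an illustration: the function names are hypothetical, and the truncated-index rule is an assumption chosen because it reproduces the q1, med and q3 values reported earlier in the thread (2.0, 4.0, 6.0); it is not copied from mlab. The second function interpolates linearly between the two neighbouring items, which reproduces the 5.5 reported for scoreatpercentile.

```python
import math


def percentile_truncated_index(data, p):
    """Pick one existing item; never interpolate (assumed rule, see lead-in)."""
    xs = sorted(data)
    # Truncate the fractional position to an index, clamped to the last item.
    idx = min(int(p * len(xs) / 100.0), len(xs) - 1)
    return float(xs[idx])


def percentile_linear(data, p):
    """Interpolate linearly when the position falls between two items."""
    xs = sorted(data)
    pos = (len(xs) - 1) * p / 100.0   # fractional 0-based position
    lo = math.floor(pos)
    hi = min(lo + 1, len(xs) - 1)
    frac = pos - lo
    return xs[lo] + frac * (xs[hi] - xs[lo])


ob1 = [1, 1, 2, 2, 1, 2, 4, 3, 2, 2, 2, 3, 4, 5, 6, 7, 8, 9, 7, 6, 4, 5, 5]

print(percentile_truncated_index(ob1, 75))  # 6.0
print(percentile_linear(ob1, 75))           # 5.5
```

With 23 items the 75th percentile falls at 0-based position 16.5, halfway between a 5 and a 6, so the two rules disagree; at the 25th and 50th percentiles they happen to agree for this data.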
From: Gökhan Sever <gokhansever@gm...> - 2009-09-14 21:08:24

On Mon, Sep 14, 2009 at 3:45 PM, <jasonsage@...> wrote:
> Robert Kern wrote:
> > prctile does not handle the case where the exact percentile lies between two
> > items. scoreatpercentile does.
>
> If mlab is supposed to be compatible with matlab, then isn't this a problem?
>
> From matlab, version 7.2.0.283 (R2006a)
>
>     >> prctile([1 1 2 2 1 2 4 3 2 2 2 3 4 5 6 7 8 9 7 6 4 5 5],[0 25 50 75 100])
>
>     ans =
>
>         1.0000    2.0000    4.0000    5.7500    9.0000
>
> Of course, the 75th percentile is different here too (5.75 instead of
> scipy's 5.5). I don't know how to explain that discrepancy.
>
> Jason

Now there are 3 different 75th percentiles :). Any ideas which one is the most correct?

I have used matplotlib's percentile outputs in some of my abstracts and posters, though not yet in a paper. The difference amongst them is not big, but it still makes me wonder whether I should compare the results of similar functions against other programs when I do data analyses.

--
Gökhan
From: Robert Kern <robert.kern@gm...> - 2009-09-14 21:18:09

On 2009-09-14 16:08 PM, Gökhan Sever wrote:
> On Mon, Sep 14, 2009 at 3:45 PM, <jasonsage@...> wrote:
> > Robert Kern wrote:
> > > prctile does not handle the case where the exact percentile lies
> > > between two items. scoreatpercentile does.
> >
> > If mlab is supposed to be compatible with matlab, then isn't this a
> > problem?
> >
> > From matlab, version 7.2.0.283 (R2006a)
> >
> >     >> prctile([1 1 2 2 1 2 4 3 2 2 2 3 4 5 6 7 8 9 7 6 4 5 5],[0 25 50 75 100])
> >
> >     ans =
> >
> >         1.0000    2.0000    4.0000    5.7500    9.0000
> >
> > Of course, the 75th percentile is different here too (5.75 instead of
> > scipy's 5.5). I don't know how to explain that discrepancy.
> >
> > Jason
>
> Now there are 3 different 75th percentiles :). Any ideas which one is the
> most correct?

They are all reasonable. There are lots of different ways of handling this case. From the R documentation:

http://sekhon.berkeley.edu/stats/html/quantile.html

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco
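
One way to see why several answers can all be reasonable: many common quantile definitions differ only in where they place the p-th quantile among the sorted items before interpolating. The sketch below uses a two-parameter plotting-position rule, position = alpha + p * (n + 1 - alpha - beta) counted from 1. The function name and the (alpha, beta) parameterisation are illustrative choices, not an API from any of the packages discussed here. Under that assumption, (1, 1) reproduces the 5.5 reported for scipy and R, and (0.5, 0.5) reproduces the 5.75 reported from MATLAB.

```python
import math


def quantile_plotting_position(data, p, alpha, beta):
    """Quantile for fraction p (0..1) under an (alpha, beta) position rule."""
    xs = sorted(data)
    n = len(xs)
    # 1-based fractional position of the requested quantile, clamped to [1, n].
    h = alpha + p * (n + 1 - alpha - beta)
    h = min(max(h, 1.0), float(n))
    lo = int(math.floor(h))
    hi = min(lo + 1, n)
    frac = h - lo
    # Linear interpolation between the two neighbouring order statistics.
    return xs[lo - 1] + frac * (xs[hi - 1] - xs[lo - 1])


ob1 = [1, 1, 2, 2, 1, 2, 4, 3, 2, 2, 2, 3, 4, 5, 6, 7, 8, 9, 7, 6, 4, 5, 5]

print(quantile_plotting_position(ob1, 0.75, 1.0, 1.0))  # 5.5
print(quantile_plotting_position(ob1, 0.75, 0.5, 0.5))  # 5.75
print(quantile_plotting_position(ob1, 0.75, 0.0, 0.0))  # 6.0
```

All three calls interpolate between the same sorted values; only the assumed position of the 75% point changes (17.5, 17.75 and 18 out of 23), which is enough to produce three different quartiles for this small data set.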
From: Andrew Straw <strawman@as...> - 2009-12-21 00:47:52

Robert Kern wrote:
> On 2009-09-14 13:49 PM, Gökhan Sever wrote:
>> On Mon, Sep 14, 2009 at 12:30 PM, <jasonsage@...> wrote:
>>> I tried the following (most output text is deleted):
>>>
>>>     In [1]: ob1=[1,1,2,2,1,2,4,3,2,2,2,3,4,5,6,7,8,9,7,6,4,5,5]
>>>     In [2]: import matplotlib.pyplot as plt
>>>     In [3]: plt.figure()
>>>     In [4]: plt.boxplot(ob1)
>>>     In [5]: plt.savefig('test.png')
>>>     In [6]: import scipy.stats
>>>     In [7]: scipy.stats.scoreatpercentile(ob1,75)
>>>     Out[7]: 5.5
>>>
>>> Note that the 75th percentile is 5.5. R agrees with this calculation.
>>> However, in the boxplot, the top of the box is around 6, not 5.5. Isn't
>>> the top of the box supposed to be at the 75th percentile?
>>>
>>> Thanks,
>>>
>>> Jason
>>>
>>> --
>>> Jason Grout
>>
>> From matplotlib/lib/matplotlib/axes.py you can see how matplotlib is
>> calculating percentiles. And yes, it doesn't conform with scipy's
>> scoreatpercentile():
>>
>>     # get median and quartiles
>>     q1, med, q3 = mlab.prctile(d,[25,50,75])
>>
>>     I[36]: q1
>>     O[36]: 2.0
>>
>>     I[37]: med
>>     O[37]: 4.0
>>
>>     I[38]: q3
>>     O[38]: 6.0
>>
>> Could this be due to rounding? I don't know, but I am curious to hear
>> the explanations for this discrepancy.
>
> prctile does not handle the case where the exact percentile lies between two
> items. scoreatpercentile does.

Fixed in r8039.