From: Adrian C. <ac...@gm...> - 2008-05-20 09:02:29
Hey all,

Wherein we discover that stats are hard, even for the simple questions...

On Tue, 2008-05-20 at 10:18 +0200, Andrea Aime wrote:
> Jody Garnett wrote:
> > What a difficult question; is there a strict definition of the quantile
> > function we could grab from statistics or something?

I'm not sure the use of "Quantile" for this function is correct
terminology, but I don't have time to explore it rigorously. So far all
I've learned is that I've now forgotten how to use R.

As ever, Wikipedia is our friend these days:

  By a quantile, we mean the fraction (or percent) of points below the
  given value. That is, the 0.3 (or 30%) quantile is the point at which
  30% of the data fall below and 70% fall above that value.

Since the key footnote points us to R, we can start to trust this as an
authoritative source.

http://stat.ethz.ch/R-manual/R-devel/library/stats/html/quantile.html

In R, it seems you want the type=3 quantile method

  "Type 3  SAS definition: nearest even order statistic"

but, again, I don't have the time to answer this rigorously today.

> Quantile( {-1 -2 0 0 0 0 3 5 7 9}, 2) ==> ?
> Quantile( {-1 -2 0 0 0 0 3 5 7 9}, 3) ==> ?

eratosthenes:~> R
...
> x <- c(-1,-2,0,0,0,0,3,5,7,9)
> n <- 2
> quantile(x,probs=seq(0,1,1/n))
  0%  50% 100%
  -2    0    9
> n <- 3
> quantile(x,probs=seq(0,1,1/n))
       0% 33.33333% 66.66667%      100%
       -2         0         3         9

with each value shown being the rightmost (upper) bound of a class, so
that together they define the breaks which can be applied to the vector
to yield the resulting classes. (You don't care about the leftmost
value.)

> Quantile( {-10 -9 -2 0 0 0 1 2 4 9 9 9}, 3) ==> what now?

> x2 <- c(-10,-9,-2,0,0,0,1,2,4,9,9,9)
> n <- 3
> quantile(x2,probs=seq(0,1,1/n))
        0%  33.33333%  66.66667%       100%
-10.000000   0.000000   2.666667   9.000000
> quantile(x2,probs=seq(0,1,1/n),type=3)
       0% 33.33333% 66.66667%      100%
      -10         0         2         9

Note that the default method interpolates between data points (hence
the 2.666667), while type=3 always returns a value that actually occurs
in the data.

Also, you might look at the spreadsheet function definitions, since
they might explain the terminology needed.

--adrian
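For anyone without R to hand, the two behaviours in the session above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Quantile function under discussion and not R's source: the name `class_breaks` and the `method` labels are made up for this example, "interpolated" follows my reading of R's default (type 7), and "nearest" follows my reading of type 3 (round n*p to the nearest order statistic, ties going to the even one). Probabilities are kept as exact fractions so that ties are detected without floating-point fuzz.

```python
from fractions import Fraction
from math import floor


def class_breaks(values, n_classes, method="interpolated"):
    """Return the n_classes + 1 break values (0% ... 100%) for the data.

    Hypothetical helper for illustration only.
    method="interpolated": linear interpolation, as in R's default (type 7).
    method="nearest":      nearest even order statistic, as in R's type 3.
    """
    xs = sorted(values)
    n = len(xs)
    breaks = []
    for k in range(n_classes + 1):
        p = Fraction(k, n_classes)        # exact probability k / n_classes
        if method == "interpolated":
            h = (n - 1) * p               # fractional 0-based position
            j = floor(h)                  # index of the lower neighbour
            g = h - j                     # weight given to the upper neighbour
            upper = xs[min(j + 1, n - 1)]
            breaks.append(float(xs[j] + g * (upper - xs[j])))
        else:
            pos = round(n * p)            # Fraction rounds halves to even
            pos = min(max(pos, 1), n)     # clamp to the 1-based range 1..n
            breaks.append(xs[pos - 1])
    return breaks


x = [-1, -2, 0, 0, 0, 0, 3, 5, 7, 9]
x2 = [-10, -9, -2, 0, 0, 0, 1, 2, 4, 9, 9, 9]

print(class_breaks(x, 2))
print(class_breaks(x, 3))
print(class_breaks(x2, 3))
print(class_breaks(x2, 3, method="nearest"))
```

On the two vectors from this thread the sketch gives the same breaks as the R session: the interpolated method yields roughly 2.666667 for the second break of x2, while the nearest method yields 2, a value that is actually present in the data.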