From: Andrea A. <aa...@op...> - 2008-05-20 08:18:07
Jody Garnett wrote:
> What a difficult question; is there a strict definition of the quantile
> function we could grab from statistics or something?

I did not find much, and none of what I found talks about how to handle
flat areas in the data histogram:

http://www.gisbanker.com/introduction_part5.htm
http://www.geovista.psu.edu/grants/dg-qg/classing_epi/summary.html
http://www.censusmapper.com/CM_Help/classifyfield.htm

...
> Given your example I want to ask: what is more important; the number of
> classifications, or the fact that they are "even" in size...

If only the number were important, an equal interval classification would
have been chosen. Quantile is defined by the "even in size" property, but
given enough flat areas in your data histogram, how do you guess what the
even size would be?

The method I suggested guarantees neither the number of intervals nor the
equal size; it just avoids the silly interval structure... Do you have any
suggestion on how to deal with this?

What would you do with:

  Quantile( {-1 -2 0 0 0 0 3 5 7 9}, 2) ==> ?
  Quantile( {-1 -2 0 0 0 0 3 5 7 9}, 3) ==> ?

The method I proposed, that is, detect the flat area in the histogram and
avoid breaking the class until you get out of it, would generate the same
result for both:

  {-1 -2 0 0 0 0} {3 5 7 9}

For 3 intervals, another not totally silly output could be:

  {-1 -2} {0 0 0 0} {3 5 7 9}

Generally speaking: detect flat areas and, if they are big enough, make
each of them a class of its own, since they somehow represent an anomaly
in the data. Of course, applying this principle you could get more classes
than you asked for. For example:

  Quantile( {-10 -9 -2 0 0 0 1 2 4 9 9 9}, 3) ==> what now?

The "don't break if in flat area" approach would generate only 2 classes:

  {-10 -9 -2 0 0 0} {1 2 4 9 9 9}

The "break out flat areas if big enough" approach would generate 4:

  {-10 -9 -2} {0 0 0} {1 2 4} {9 9 9}

> If we go for even in size; you may get 2 categories when you asked for
> three
> Quantile( {0 0 0 0 3 5 7 9}, 2) ==> {0 0 0 0}, { 3 5 7 9 }
> Quantile( {0 0 0 0 3 5 7 9}, 3) ==> {0 0 0 0}, { 3 5 7 9 }
>
> This may be a strange case of what do you expect? If I am looking at a
> map of summary of I want to know what the colors represent; and if I ask
> the application to color equal quantities of data in different colors;
> for the data you provided we could only make a map with 2 categories;
> anything else would be a mistake ...
>
> So while I can think of silly ways to break the content up into {0 0}
> and {0 0} - they are just that - silly.

Yeah, silly. Unfortunately that's exactly what you get today out of the
quantile classification. I have cases, with real data, where the current
function generates 3 consecutive intervals at 0.

Cheers
Andrea
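
For illustration, the two strategies discussed above can be sketched in Python. This is only a sketch under assumptions that are not stated in the message: the target class size is taken to be ceil(n / k), a "flat area" is a run of equal values in the sorted data, and "big enough" is an arbitrary `min_run` threshold. The function names `quantile_no_flat_break` and `quantile_split_flat_runs` are hypothetical, and this is not the code of the current classification function. With these assumptions the sketch happens to reproduce the example outputs given above.

```python
from itertools import groupby
from math import ceil


def _greedy_classes(data, size):
    """Cut sorted data into classes of about `size` items,
    never breaking inside a run of equal values."""
    classes, start = [], 0
    while start < len(data):
        end = min(start + size, len(data))
        # Still inside a flat area: keep extending until the value changes.
        while end < len(data) and data[end] == data[end - 1]:
            end += 1
        classes.append(data[start:end])
        start = end
    return classes


def quantile_no_flat_break(values, k):
    """'Don't break if in flat area': may return fewer than k classes."""
    data = sorted(values)
    size = ceil(len(data) / k)  # assumed target class size
    return _greedy_classes(data, size)


def quantile_split_flat_runs(values, k, min_run=3):
    """'Break out flat areas if big enough': may return more than k classes.

    min_run is a hypothetical threshold for what counts as 'big enough'.
    """
    data = sorted(values)
    size = ceil(len(data) / k)  # assumed target class size
    classes, pending = [], []
    for _, group in groupby(data):
        run = list(group)
        if len(run) >= min_run:
            # Close the values collected so far, then give the flat run
            # a class of its own.
            classes.extend(_greedy_classes(pending, size))
            pending = []
            classes.append(run)
        else:
            pending.extend(run)
    classes.extend(_greedy_classes(pending, size))
    return classes


if __name__ == "__main__":
    sample = [-10, -9, -2, 0, 0, 0, 1, 2, 4, 9, 9, 9]
    print(quantile_no_flat_break(sample, 3))
    print(quantile_split_flat_runs(sample, 3))
```

Neither function guarantees the requested number of classes or equal class sizes; they only avoid producing several classes that all sit on the same value.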