
#879 Standard deviation not calculated correctly

Milestone: 1.0.x
Status: open
Owner: nobody
Labels: General (896)
Priority: 5
Updated: 2016-03-04
Created: 2008-08-02
Creator: Anonymous
Private: No

Maybe this has already been reported, but
Statistics.getStdDev() gives wrong results because of
the last line of the method:

return Math.sqrt(sum / (data.length - 1));

You should not subtract 1 from data.length.
Thanks, nice software.

Discussion

  • Christopher Katz

    I agree with your point, if we can state with certainty that the data array is the entire population of data that the standard deviation is calculated for. If we are only getting data on a sample of a population, a more accurate estimate of the standard deviation of the entire population subtracts 1 from the length of the array. But would there be a case when the method is only passed a sample of data instead of the entire population of data?

    This link gives a good description of the differences in standard deviation calculations:
    http://www.graphpad.com/faq/viewfaq.cfm?faq=1382

    Regardless, IMO subtracting 1 is the appropriate default in most cases, so the code should be kept the way it is.

    - Chris
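For reference, the two calculations being compared can be written out as below. This is the standard textbook notation, not something taken from the library's documentation: $\sigma$ is the population standard deviation (divide by $N$), and $s$ is the sample estimate (divide by $n - 1$).

```latex
\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}
\qquad
s = \sqrt{\frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2}
```

As a worked example, for the values 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5 and the sum of squared deviations is 32, so $\sigma = \sqrt{32/8} = 2$ while $s = \sqrt{32/7} \approx 2.14$.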

     
  • Yann Zimmermann

    Yann Zimmermann - 2015-09-15

    Hi,

    I am also disappointed by this function, which I have been using for several months. Because I just experienced a crash caused by it, I read the code and saw the division by "data.length - 1" instead of "data.length".

    I cannot say in which situations one should use "data.length - 1" or "data.length", but I think the choice should be clear from the function name and its documentation. The best option is probably to provide both functions and let the user choose the right one.

    Whatever is decided, I experienced a crash because, depending on the user inputs, I may call this function with a vector of size 1. I did not expect a problem with that because the Javadoc says the vector must be neither null nor empty, but it does not say the vector must have a size > 1.

    Given the Javadoc and the name of the function, I think dividing by "data.length - 1" really is a bug. It is not what the user expects from the function name and the Javadoc.

    I suggest fixing the bug in this function and providing another function, "getSampleStdDev".

    Thank you
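A minimal sketch of what the two-function suggestion could look like. Everything here is hypothetical: the class name `StdDevSketch`, the method name `getPopulationStdDev`, and the choice to throw for fewer than two values are assumptions for illustration, not the existing JFreeChart API. Only the name `getSampleStdDev` comes from the comment above.

```java
/** Hypothetical helper for illustration; not part of the JFreeChart API. */
final class StdDevSketch {

    private StdDevSketch() {
    }

    /** Population standard deviation: divides by n, so a single value is fine. */
    static double getPopulationStdDev(double[] data) {
        requireNonEmpty(data);
        return Math.sqrt(sumOfSquaredDeviations(data) / data.length);
    }

    /** Sample standard deviation: divides by n - 1, so it needs at least 2 values. */
    static double getSampleStdDev(double[] data) {
        requireNonEmpty(data);
        if (data.length < 2) {
            // Assumed policy: fail loudly instead of returning NaN.
            throw new IllegalArgumentException(
                    "Sample standard deviation requires at least 2 values.");
        }
        return Math.sqrt(sumOfSquaredDeviations(data) / (data.length - 1));
    }

    private static void requireNonEmpty(double[] data) {
        if (data == null || data.length == 0) {
            throw new IllegalArgumentException("Null or zero length 'data' argument.");
        }
    }

    /** Sum of (x - mean)^2 over all values. */
    private static double sumOfSquaredDeviations(double[] data) {
        double mean = 0.0;
        for (double value : data) {
            mean += value;
        }
        mean /= data.length;

        double sum = 0.0;
        for (double value : data) {
            double diff = value - mean;
            sum += diff * diff;
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, 3.0, 4.0};
        System.out.println("population: " + getPopulationStdDev(data));
        System.out.println("sample:     " + getSampleStdDev(data));
    }
}
```

With names like these, the divisor is visible at the call site, and the size-1 case is handled by an explicit, documented precondition rather than left to the arithmetic.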

     
  • simon04

    simon04 - 2016-03-04

    Statistics.getStdDev() returns NaN for an array of length 1. This is consistent with Wolfram Alpha not computing a standard deviation for 1 element: http://www.wolframalpha.com/input/?i=sample+standard+deviation+{1}+
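The NaN result follows from Java's floating-point rules and can be reproduced in isolation. The sketch below is not the JFreeChart source: the class `SingleValueStdDevDemo` and the body of `sampleStdDev` are assumptions, and only the final `return` expression mirrors the line quoted in the report. With one element the sum of squared deviations is 0.0 and the divisor is 0, so the method evaluates `0.0 / 0`, which is NaN rather than an ArithmeticException.

```java
/** Hypothetical demo class; only the final return expression is taken from the report. */
final class SingleValueStdDevDemo {

    private SingleValueStdDevDemo() {
    }

    /** Sample standard deviation, dividing by (data.length - 1). */
    static double sampleStdDev(double[] data) {
        double mean = 0.0;
        for (double value : data) {
            mean += value;
        }
        mean /= data.length;

        double sum = 0.0;
        for (double value : data) {
            double diff = value - mean;
            sum += diff * diff;
        }
        // For a single element: sum == 0.0 and data.length - 1 == 0.
        return Math.sqrt(sum / (data.length - 1));
    }

    public static void main(String[] args) {
        double result = sampleStdDev(new double[] {1.0});
        System.out.println(result);                // prints NaN
        System.out.println(Double.isNaN(result));  // prints true
    }
}
```

Callers that may pass a single value would therefore need to check the result with `Double.isNaN` or guard on the array length before calling.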

     
