The attached patch adds the calculation of the skewness, the kurtosis, and the
standard errors of mean, stddev, skewness, and kurtosis to the stats command.
It also changes the calculation formula for the variance to the "corrected two-pass
algorithm", since according to Numerical Recipes (3rd ed.) the formula used before
"can magnify the roundoff error by large factor and is generally unjustifiable in terms
of computing speed".
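For readers who don't have the book at hand, here is a minimal sketch of the corrected two-pass algorithm. It is not the code from the patch: the function name two_pass_variance is made up for illustration, and dividing by n (population variance) is an assumption of this sketch.

```c
#include <stddef.h>

/* Corrected two-pass variance, population form (divides by n).
 * Hypothetical helper for illustration, not the patch's code.
 * Returns 0.0 for an empty data set, an arbitrary choice of this sketch. */
double two_pass_variance(const double *x, size_t n)
{
    if (n == 0)
        return 0.0;

    /* Pass 1: compute the mean. */
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    double mean = sum / (double)n;

    /* Pass 2: accumulate squared deviations and the plain deviations. */
    double sq = 0.0;   /* sum of (x - mean)^2 */
    double dev = 0.0;  /* sum of (x - mean); zero in exact arithmetic */
    for (size_t i = 0; i < n; i++) {
        double d = x[i] - mean;
        dev += d;
        sq += d * d;
    }

    /* The dev*dev/n term corrects for roundoff error in the mean. */
    return (sq - dev * dev / (double)n) / (double)n;
}
```

The second sum would be exactly zero without roundoff, so subtracting its square only removes error introduced by an inexact mean.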
This patch is greatly appreciated. It has two minor issues, though:
In the function two_column_output one could put the standard errors above the output of the slope and intercept, which is already unaligned. For the function sgl_column_output I'm
not sure.
In writing the patch I also thought about this case and decided to skip the test and
risk an assignment of positive infinity since I didn't know what to assign to STATS_skewness and STATS_kurtosis when this happens.
This variant of the patch now re-indents the output of all results to match the new results. The maximum total line length is ~51, which is still acceptable.
In cases where the calculation is undefined (variance == 0), the user variables are set to NaN and the printed value is "undefined".
There is a last issue which needs to be clarified. Your patch changes the definition of the variance from the population variance (n) to the sample variance (n-1). This then propagates to the kurtosis and skewness. While this makes a lot of sense in some cases, it is a change in behaviour. In any case this should be documented and consistent with other parts of gnuplot. See also comments on [bugs:#1118].
Thank you for the reindentation and the good solution of the variance == 0 case.
I didn't notice this change when implementing the two-pass algorithm. Since I hope that
everybody who uses this function will have a large enough data set, where the difference
between n and n-1 does not matter, you can revert this change. If one uses the sample
variance one should include a test for the (n - 1 == 0) case and handle it like you
did in the (variance == 0) case.
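To make the n versus n-1 distinction and the suggested guard concrete, here is a small sketch. The function variance_from_ssd and its use_sample flag are hypothetical and not part of gnuplot; returning NaN for a zero denominator mirrors the (variance == 0) handling only by assumption.

```c
#include <math.h>
#include <stddef.h>

/* ssd is the sum of squared deviations from the mean.
 * use_sample != 0 selects the sample variance (divide by n - 1),
 * otherwise the population variance (divide by n).
 * Returns NaN when the denominator would be zero. */
double variance_from_ssd(double ssd, size_t n, int use_sample)
{
    size_t denom = n;
    if (use_sample)
        denom = (n > 0) ? n - 1 : 0;

    if (denom == 0)
        return NAN;  /* n == 0, or n == 1 with the sample variance */

    return ssd / (double)denom;
}
```

For four points with a sum of squared deviations of 5, the population variance is 1.25 and the sample variance is 5/3, so the difference only becomes negligible for large n.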
Thanks. Now in CVS.