I have patched versions of the source that work with MySQL 5.1 and accept any numeric value as input (instead of only REALs). I also added a MOMENT() UDF, based on KURTOSIS() and SKEWNESS(), that returns any central moment of the provided data.
If anyone's interested, I'll submit the patches this weekend.
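For anyone who wants to sanity-check results from MOMENT(), here is a minimal reference calculation of a central moment in Python. It is only a sketch of the textbook population definition (divide by n), not the patched C++ source; the name central_moment and the (k, values) argument order are my own assumptions, chosen to mirror the MOMENT(4, column) call that appears later in this thread.

```python
def central_moment(k, values):
    """Return the k-th central moment of values (population form, divides by n).

    Hypothetical helper for checking results by hand; not the UDF itself.
    """
    # Accept any numeric input (ints, floats, Decimal) by converting to float.
    data = [float(v) for v in values]
    if not data:
        # Assumption: behave like an SQL aggregate and give "no value" on empty input.
        return None
    n = len(data)
    mean = sum(data) / n
    # Average of the k-th powers of the deviations from the mean.
    return sum((x - mean) ** k for x in data) / n
```

With k = 2 this is the population variance, and with k = 1 it is zero up to rounding for any data.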
I've attached the files that I've patched so far.
However, it looks like this project might be defunct, so if nobody objects, I'm going to fork it. It's too valuable a resource to go entirely unsupported.
Error message when opening this (Ubuntu 10.04 with all patches applied):
bzip2: (stdin) is not a bzip2 file.
tar: Child returned status 2
tar: Exiting with failure status due to previous errors
Can you upload with different compression?
Uncompressed copy of the patched .cc source
Attached uncompressed version. Please let me know how it works out.
Well, apparently, the upload/download cycle via sf.net destroys my tar files. They work locally, before round-tripping them.
Both tar files are giving the following error messages:
tar: This does not look like a tar archive
tar: Skipping to next header
tar: Exiting with failure status due to previous errors
The kurtosis file was not uploaded correctly.
Other files are fine.
Perhaps it would be an idea to upload to http://forge.mysql.com/ as well?
Thanks for the feedback. I'll upload another kurtosis file soon. In the meantime, the kurtosis is (3 - MOMENT(4, column)) as I recall, if you need it in an emergency.
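Since the expression above is given from memory, it may be worth comparing against the textbook definition: excess kurtosis is m4 / m2^2 - 3, where m2 and m4 are the second and fourth central moments. Whether MOMENT() normalizes its result is not stated in this thread, so the following is only a reference calculation in Python with hypothetical helper names, not a description of the UDF.

```python
def central_moment(k, values):
    """k-th central moment, population form (hypothetical helper, not the UDF)."""
    data = [float(v) for v in values]
    if not data:
        return None
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)


def excess_kurtosis(values):
    """Textbook excess kurtosis: m4 / m2**2 - 3."""
    m2 = central_moment(2, values)
    m4 = central_moment(4, values)
    if m2 is None or m2 == 0:
        # Undefined for empty input or for data with zero variance.
        return None
    return m4 / (m2 * m2) - 3.0
```

Running the same column through both this function and the SQL expression is a quick way to see whether the two agree.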
When I fork this project, I'll probably host it on Google Code, but I'll keep that website in mind for posting updates.
Would anyone be interested in ORIGIN_MOMENT(k, c, [d]), to calculate the kth moment about the origin? It should be trivial enough to implement now that I know roughly how the API works.
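To make the proposal concrete, the kth moment about the origin is just the mean of the kth powers of the data. Below is a small Python sketch of that definition; origin_moment is a hypothetical name, and the optional d argument is left out because the thread does not say what it would mean.

```python
def origin_moment(k, values):
    """Return the k-th moment about the origin: the mean of x**k.

    Hypothetical reference helper; the proposed UDF's optional third
    argument is deliberately not modelled here.
    """
    data = [float(v) for v in values]
    if not data:
        # Assumption: empty input yields no value, as an SQL aggregate would.
        return None
    return sum(x ** k for x in data) / len(data)
```

The first origin moment is the mean, and the second origin moment minus the square of the first gives the population variance, which makes a handy cross-check against a central-moment function.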
Apparently, there's also something called an L-moment. Research is ongoing, but I may end up posting LMOMENT(k, c, [d]) as well.
Please excuse me talking to myself, but this seems a decent place to gather my thoughts.
TODO: rewrite MEDIAN() to accept any sortable non-numeric data. Use a Fibonacci heap?
TODO: write MODE() to accept any data. Also use a Fibonacci heap? Trivial to implement with SELECT c AS mode FROM (SELECT c, COUNT(*) AS n FROM tbl GROUP BY c ORDER BY n DESC, c ASC LIMIT 1) AS t, where tbl is whatever table holds c, but that's an awful lot of keystrokes for something so elementary.
TODO: write MEDIAN_FRACTION() and MODE_FRACTION() to return the fraction of the data that is equal to the median and mode respectively.
TODO: For MEDIAN() and MODE() on non-numeric data, what to do when there's a tie?
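One possible answer to the tie question, sketched in Python so the behaviour can be pinned down before any C++ is written. The assumptions here are mine: on an even count the median is the lower of the two middle values (so the result is always an actual data value, which matters for strings), and a tie for the mode goes to the smallest value, matching the ORDER BY n DESC, c ASC in the query above. It uses a plain sort and a counter rather than a Fibonacci heap, and all function names are hypothetical.

```python
from collections import Counter


def median_lower(values):
    """Lower median of any sortable data (numbers, strings, dates)."""
    data = sorted(values)
    if not data:
        return None
    # For an even count, pick the lower of the two middle elements.
    return data[(len(data) - 1) // 2]


def mode_smallest(values):
    """Most frequent value; ties are broken by taking the smallest value."""
    counts = Counter(values)
    if not counts:
        return None
    best = max(counts.values())
    return min(v for v, n in counts.items() if n == best)


def fraction_equal(values, target):
    """Fraction of the data equal to target (e.g. the median or the mode)."""
    data = list(values)
    if not data:
        return None
    return sum(1 for v in data if v == target) / len(data)
```

Under these rules MEDIAN_FRACTION() would be fraction_equal(data, median_lower(data)) and MODE_FRACTION() would be fraction_equal(data, mode_smallest(data)).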
Very much in favor of forking it, in particular to allow distributed version control.
Forked at http://code.google.com/p/mysql-udf-moments/source/browse/#svn%2Ftrunk
If you want to be involved in the development of the fork, please file a ticket via the new project page.