Great job on this library. I am finding it extremely useful. But I do dispute the usefulness of having methods write messages to System.err, as their callers will never see the report of whatever condition they are reporting.
Here is a particular example. The following calls write "Precision error in final section of NonCentralT.cumulative" to System.err. Even if the caller could detect this, which it cannot, what should it do with that information? Or is this a library bug?
new NonCentralT(29, -5.477225575051661).cumulative(2.0452296421327034) = 0.9999999999997987
new NonCentralT(29, -5.477225575051661).cumulative(2.756385903670605) = 0.9999999999998798
new NonCentralT(241, -3.8890872965260113).cumulative(2.596382898231599) = 0.9999999999396004
new NonCentralT(17, -4.242640687119285).cumulative(2.898230519677418) = 0.9999999999508804
Thank you.
Anonymous
Thank you for your report. In R, the first statement is equivalent to:
pt(2.0452296421327034, 29, -5.477225575051661)
And R throws a warning message:
Warning message:
In pt(2.0452296421327, 29, -5.47722557505166) :
full precision may not have been achieved in 'pnt{final}'
JDistlib does not have a sophisticated error reporting system, especially for this warning. I think it should not be an error, though. That is why I decided to report incidence like this to System.err.
Since some people would prefer such warnings to show on screen, I think I should provide a cost effective mechanism to do that. Will think of a way. Thanks!
Thanks for your prompt response! (Thank you also for the R connection.) Though I admit I don't have your familiarity with the algorithm, I'm not sure I agree with your statement "I think it should not be an error". If R (or JDistlib) can't provide an answer to the expected precision (because it limits the number of iterations, for example, or for any other reason), I'd call that an error.
In interactive applications, it is perfectly appropriate to display error messages to the user (for example, by writing to System.err, or the R console). But library code can't assume there is a user; it is appropriate for library methods that cannot fulfill their contract to indicate that to their callers by throwing exceptions. You do that in numerous instances (throwing variously IllegalArgumentException, ArithmeticException, or simply RuntimeException). Perhaps that should be done here.
One further note on exceptions: I think a good rule of thumb is to use unchecked exceptions when client code has a reasonable opportunity to avoid them, and checked exceptions otherwise. So in cases such as this, I think it's better to define your own checked exception and throw an instance of that. That way, the compiler alerts me to cases where, for example, subtle combinations of arguments might cause a method to fail to return a result. https://docs.oracle.com/javase/tutorial/essential/exceptions/runtime.html makes some good points on this subject.
Also, I went ahead and filed this: https://bugs.r-project.org/bugzilla/show_bug.cgi?id=16680 .
Last edit: Wes Munsil 2016-01-22
I disagree with throwing Exception. This is clearly a limitation of the underlying algorithm (as you probably have seen in your bug submission). A lot of people still use the results anyway. Throwing an exception would disrupt the overall flow and you cannot get the results. Those who demand much accuracy are in the minority. For these people, I need some kind of setting to raise exception to inaccuracy problems.
Yes, it is a limitation of the underlying algorithm, and the R folks' response is essentially "a bug that is documented is not a bug", and I find that amusing.
Thank you for adding the setting to allow these situations to result in exceptions being thrown--though I might have replaced the calls to System.err.println by calls to a utility method that does the println and the conditional throw: a lot less code, and a lot less repetition!
Finally, re your statement "Throwing an exception would disrupt the overall flow and you cannot get the results.": not necessarily. Consider
throw new jdistlib.exception.PrecisionException("precision has been lost because blah blah blah", approximateResult);
That way the caller has the information about the loss of precision as well as the imprecise result, and he can choose to use it or not.
Last edit: Wes Munsil 2016-01-26
To be fair, the algorithms for computing statistical distribution are not trivial. R folks are not inventing these algorithms out of the blue---they are simply using an existing, published algorithm. In this case, the published algorithm is just not good. There is nothing they can do about it until somebody publishes his/her improved algoritm (which is extremely hard to do).
Good idea about precision exception.
It turns out that it is not entirely possible to implement a clean precision exception because the precision issues could be encountered in one of subfunctions, where the final result has not necessarily been computed. It is far more of a problem that it's worth.
Certainly. My example was only an example of what might be done in certain cases. I didn't propose it as something that was universally applicable. In other cases, something else might be appropriate. Obviously I'm most interested in the cases that I have personally encountered. :-) But the library designer (that is, you) has complete freedom to design whatever set of exceptions is appropriate, and useful. I simply intended to illustrate that you have a great deal of latitude in that area.
Again, I am so appreciative of your work on this library, and your responsiveness!
Last edit: Wes Munsil 2016-01-27
It's a series of try-catch-rethrow type. I implemented to some degree, but I feel it will make me harder to sync with R code---especially for NonCentralChiSquare. I'll go ahead and do it.
Fixed for the upcoming release (either 0.4.2 or 0.5.0)