From: Davide P. C. <dp...@un...> - 2006-04-12 11:55:20
Folks:

Well, this mailing list has been pretty quiet, so I thought I'd bring up an issue I've been thinking about. On a number of occasions, I've seen WeBWorK reject a correct answer because of numeric instability at one of the test points. This is usually the result of one of the main sources of error in numeric computation: subtractive cancellation.

When two nearly identical numbers are subtracted, most of the digits cancel, leaving something near zero. That is correct and perfectly normal. The problem is that when you are only storing a fixed number of digits, you can start with 16 digits of precision and cancel 13 of them, leaving only 3, so the quality of the result drops dramatically. Moreover, those lowest 3 digits are where all the accumulated round-off errors live, so they are essentially random, and the cancellation has promoted them to the most significant positions; the result is essentially worthless.

For example, if you compute a - b where a = 3.14159265395 and b = 3.14159265213, then you get a - b = .00000000182. Even though the original numbers have 12 significant digits, the result has only 3, which is a significant loss of precision. And since those 3 digits are the least significant ones in the original numbers, they are the ones most contaminated by round-off accumulated in previous computations. This is how round-off error becomes significant in a computation. Note also that since the result has only 3 digits, if it gets compared to a student's answer that was computed in a slightly different way (with a slightly different 3-digit result), the relative error between the two answers will be quite high: it will be in the second or third digit, far larger than the usual tolerances, which are in the 4th digit or so.

I know that the zeroLevel and zeroLevelTol settings are intended to address this issue, but there is a problem with that approach: the subtractive cancellation can occur at any order of magnitude. For example, with numbers on the order of 1000, when 16 digits are being stored (the usual number for 64-bit floating-point reals), if say 12 digits cancel (leaving only 4), the result will be on the order of 1E-8, which is significantly ABOVE the zeroLevelTol of 1E-12, so it will not be detected by that method, and we will go on with only 4 digits of precision to obtain values that are substantially less precise than usual.

The only way out of this currently is to adjust the limits used in the problem to avoid the test points where subtractive cancellation occurs. Usually that means avoiding the points where the function is zero (those points almost always involve subtractive cancellation somehow). But it is not always easy to determine where they will be, especially when they depend on the random parameters, and since so few problem designers think carefully about their domains in any case, this is not a very reliable method even when the zeros are clear.

I have been looking into a method that could help with this sort of problem. Since the Parser handles the various operations (like subtraction) through overloaded operators, it would be possible to check each subtraction to see whether there has been a significant loss of precision (i.e., subtractive cancellation) and track the maximum loss over the evaluation of the function. When selecting test points, if the computation caused a significant loss, the test point could be discarded and another one selected.
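To give a sense of what I'm suggesting, here is a rough sketch of the kind of check an overloaded subtraction could make. This is just an illustration (the package and variable names are made up, and it is not code from the actual Parser):

  package CancelCheck;
  # Wrap a real number and overload subtraction so that we can estimate
  # how many significant digits were lost to cancellation.

  use strict;
  use warnings;
  use POSIX qw(floor);

  our $MaxLoss = 0;    # worst cancellation seen so far

  use overload
    '-'  => \&subtract,
    '0+' => sub { $_[0]{value} },
    '""' => sub { $_[0]{value} },
    fallback => 1;

  sub new {
    my ($class, $x) = @_;
    return bless { value => $x }, $class;
  }

  sub subtract {
    my ($self, $other, $swap) = @_;
    my $x = ref($self)  ? $self->{value}  : $self;
    my $y = ref($other) ? $other->{value} : $other;
    ($x, $y) = ($y, $x) if $swap;
    my $r = $x - $y;
    # The ratio of the larger operand to the result tells us roughly
    # how many digits cancelled.
    my $big = (abs($x) > abs($y)) ? abs($x) : abs($y);
    if ($big > 0) {
      my $loss = ($r == 0) ? 16 : floor(log($big / abs($r)) / log(10));
      $MaxLoss = $loss if $loss > $MaxLoss;
    }
    # (A real version would wrap the other operations as well, so that
    # values stay wrapped throughout the whole evaluation.)
    return CancelCheck->new($r);
  }

  package main;

  # The numbers from the example above: about 9 of the 12 digits cancel.
  my $a = CancelCheck->new(3.14159265395);
  my $b = CancelCheck->new(3.14159265213);
  my $c = $a - $b;
  print "a - b = $c (digits lost: about $CancelCheck::MaxLoss)\n";

The test-point selection could then reset the counter before each evaluation and throw out any point whose evaluation pushes the loss past some threshold.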
A more aggressive alternative would be to have subtraction convert the answer to zero when such a massive loss of precision occurs. This has the advantage of working no matter what the magnitude of the result is, but it might have unforeseen consequences and would have to be watched carefully. (There is a rough sketch of this variant at the end of this message.)

The drawback is that there would be more overhead involved with EVERY computation. Currently, the computations performed when two functions are compared are NOT performed through the overloaded operators (for efficiency), so changing things to allow tracking of cancellation would mean adding a lot of new overhead. It may not be worth the cost.

A less aggressive approach would be to add a numeric stability test to the diagnostic output that is available in the Parser's function checker, so that the problem author could get a report of the ranges where subtractive cancellation is a problem and adjust the limits to avoid them. This was one of the things I had in mind when I wrote the diagnostic display, but I hadn't gotten to it. (BTW, we still need a better way to get the diagnostic output into the WW page -- currently it is via warn messages, which fills up the server error log with lots of long messages that are not really errors. But that's a different discussion.) With this approach, the overhead would only occur at the request of the professor during problem development, not when the student is running the problem, and so would not be such a problem. But it suffers from the flaw that the problem developer needs to do something extra in order to get the diagnostics. (Of course, if we made that easier, say with a button on the professor's screen, and it became a normal part of the development process, that might not be so bad.)

Anyway, I have made some tests, and it IS feasible to track the subtractive cancellation, so it could be used to avoid the major cause of numeric instability.

Any comments?

Davide
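Here is the flush-to-zero variant mentioned above. Again, this is only an illustration (it assumes the made-up CancelCheck wrapper from the earlier sketch, and the 13-digit cutoff is just a guess at a reasonable threshold):

  # This would replace the overloaded subtraction above: when nearly all
  # of the significant digits cancel, return an exact zero instead of
  # keeping the leftover round-off noise.
  sub subtract_flush {
    my ($self, $other, $swap) = @_;
    my $x = ref($self)  ? $self->{value}  : $self;
    my $y = ref($other) ? $other->{value} : $other;
    ($x, $y) = ($y, $x) if $swap;
    my $r = $x - $y;
    my $big = (abs($x) > abs($y)) ? abs($x) : abs($y);
    # With 16 digits stored, losing 13 or more leaves essentially nothing
    # but noise, so call the result zero no matter what its magnitude is.
    $r = 0 if $big > 0 && abs($r) < $big * 1e-13;
    return CancelCheck->new($r);
  }

Unlike zeroLevelTol, the test here is relative to the size of the operands, so it works at any order of magnitude; but as I said above, it might have unforeseen consequences and would need to be watched.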