From: P. G. L. <gl...@um...> - 2006-04-12 12:56:27
Hi Davide,

I'm afraid I'm going to sully your elegant explanation with a stupid question. I (believe I) understand the issue of cancellation, however poor I may be at predicting it. In the example problem that we discussed before you sent this e-mail, I found that the issue with subtractive cancellation occurred when I used the Parser version of fun_cmp, but not the older version. Is there any reason besides blind luck that the older version didn't also mis-grade the problem?

I'm personally nervous about adding too much overhead to WeBWorK processing, though some of that may be residual from overloading a server back when we were running pre-mod_perl WeBWorK.

Gavin

--
P. Gavin LaRose / Instructional Tech., Math Dept., University of Michigan
gl...@um...  /  734.764.6454
http://www.math.lsa.umich.edu/~glarose/

"The tough problem is not in identifying winners; it is in making winners of ordinary people. That, after all, is the overwhelming purpose of education."  -K.P.Cross

On Wed, 12 Apr 2006, Davide P. Cervone wrote:

> Folks:
>
> Well, this mailing list has been pretty quiet, so I thought I'd bring up an issue I've been thinking about. On a number of occasions, I've seen WeBWorK reject a correct answer because of numeric instability at one of the test points. This is usually the result of one of the main sources of error in numeric computations: subtractive cancellation.
>
> When two nearly identical numbers are subtracted, most of the digits cancel, leaving something near zero. That is correct and perfectly normal. The problem, however, is that when you are only storing a fixed number of digits, you can go from having 16 digits of precision and cancel out 13 of them, leaving only 3, which is a massive loss of quality in the results. Moreover, those lowest 3 digits are where all the accumulated round-off errors live, so they are essentially random, and those random digits have now been promoted to the most significant positions; the results are essentially worthless.
>
> For example, if you compute a - b where a = 3.14159265395 and b = 3.14159265213, then you get a - b = .00000000182, and even though your original numbers have 12 significant digits, the result has only three, which is a significant loss of precision. And since these 3 digits are the least significant ones in the original numbers, they are subject to the most error due to accumulated round-off from previous computations. This is how round-off error becomes significant in a computation.
>
> Note also in this example that the result has only three digits, so if this gets compared to a student's answer that is computed in a slightly different way (with a slightly different 3-digit result), the relative error between the two answers will be quite high (it will be in the second or third digit, which is far higher than the usual tolerances, which are in the 4th digit or so). I know that the zeroLevel and zeroLevelTol settings are intended to address this issue, but there is a problem with that approach: the subtractive cancellation can occur at any order of magnitude.
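As a rough stand-alone illustration of the effect described above (plain Perl, not PG; the "student" value is an invented answer that differs only in the digits that were already unreliable):

    #!/usr/bin/perl
    use strict; use warnings;

    # The two nearly equal values from the example above.
    my $a = 3.14159265395;
    my $b = 3.14159265213;

    my $correct = $a - $b;     # about 1.82e-9: only ~3 significant digits survive
    # A hypothetical student value computed a slightly different way,
    # differing only where the original digits were already suspect.
    my $student = 1.83e-9;

    my $rel_err = abs($student - $correct) / abs($correct);
    printf "difference     = %.3e\n", $correct;
    printf "relative error = %.3e\n", $rel_err;
    # The relative error is about 5.5e-3 -- in the 3rd digit -- so a correct
    # answer can fail a comparison made with the usual ~1e-4 relative tolerance.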
> For example, with numbers on the order of 1000, when 16 digits are being stored (the usual number for 64-bit floating-point reals), if, say, 12 digits cancel (leaving only 4), the result will be on the order of 1E-8, which is significantly ABOVE the zeroLevelTol of 1E-12, and so will not be detected via that method, and we will go on with only 4 digits of precision to obtain values that are substantially less precise than usual.
>
> The only way out of this currently is to adjust the limits used in the problem to avoid the test points where subtractive cancellation occurs. Usually, this means avoiding the points where the function is zero (these points almost always involve subtractive cancellation somehow). But it is not always easy to determine where these will be, especially when it depends on the random parameters, and since so few problem designers think carefully about their domains in any case, this is not a very reliable method even when the zeros are clear.
>
> I have been looking into a method that could help with this sort of problem. Since the Parser handles the various operations (like subtraction) through overloaded operators, it would be possible to check each subtraction to see whether there has been a significant loss of precision (i.e., subtractive cancellation), and track the maximum loss for the evaluation of the function. When selecting test points, if the computation caused a significant loss, the test point could be discarded and another one selected.
>
> A more aggressive alternative would be to have subtraction convert the answer to zero when such a massive loss of precision occurs. This has the advantage of working no matter what the magnitude of the results, but might have unforeseen consequences, and would have to be watched carefully.
>
> The drawback to this approach is that there would be more overhead involved with EVERY computation. Currently, the computations performed when two functions are compared are NOT performed through the overloaded operators (for efficiency), and so changing things to allow tracking of cancellation would mean adding a lot of new overhead. It may not be worth the cost.
>
> A less aggressive approach would be to add a numeric stability test to the diagnostic output that is available in the Parser's function checker, so that the problem author could get a report of ranges where subtractive cancellation is a problem, and so can adjust the limits to avoid them. This was one of the things I had in mind when I wrote the diagnostic display, but hadn't gotten to it. (BTW, we still need a better way to get the diagnostic output into the WW page -- currently it is via warn messages, which fills up the server error log with lots of long messages that are not really errors. But that's a different discussion.) That is, the overhead would only occur at the request of the professor during problem development, not when the student is running the problem, and so would not be such a problem. But it suffers from the flaw that the problem developer needs to do something extra in order to get the diagnostics. (Of course, if we made that easier, say with a button on the professor's screen, and it became a normal part of the development process, that might not be so bad.)
>
> Anyway, I have made some tests, and it IS feasible to track the subtractive cancellation, and so it could be used to avoid the major cause of numeric instability. Any comments?
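As a rough sketch of the kind of tracking being proposed (plain Perl, not the Parser's actual overloaded operators; the subroutine name, the 16-digit figure, and the suggested cutoff are illustrative assumptions only):

    #!/usr/bin/perl
    use strict; use warnings;
    use POSIX qw(floor);

    my $maxLoss = 0;                       # worst cancellation seen so far

    # Estimate how many decimal digits cancel in x - y by comparing the
    # magnitude of the larger operand to the magnitude of the result.
    sub tracked_sub {
      my ($x, $y) = @_;
      my $r = $x - $y;
      my $big = abs($x) > abs($y) ? abs($x) : abs($y);
      if ($big != 0 && $r != 0) {
        my $loss = floor(log($big / abs($r)) / log(10));   # digits lost
        $maxLoss = $loss if $loss > $maxLoss;
      } elsif ($big != 0 && $r == 0) {
        $maxLoss = 16;                     # complete cancellation
      }
      return $r;
    }

    # Evaluate a function at a candidate test point via the tracked subtraction.
    my $x = 0.0001;
    my $value = tracked_sub(cos($x), 1);   # cos(x) - 1 near x = 0: heavy cancellation
    printf "value = %.3e, digits lost = %d\n", $value, $maxLoss;
    # A checker could discard this test point (and pick another) whenever
    # $maxLoss exceeds some threshold, say 8 of the roughly 16 available digits.

The same counter, reset before each test point, could also drive the diagnostic report mentioned above, flagging ranges of the domain where the loss is consistently large.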
>
> Davide