Menu

On Cygwin with g++ 10.2.0: ConicProj0 test exposes infinite loop in AlbersEqualArea::DDatanhee()

Anonymous
2021-01-01
2021-02-22
  • Anonymous

    Anonymous - 2021-01-01

    Having cmake'ed GeographicLib-1.51 on a Windows 7 with Cygwin

    $ uname -a
    CYGWIN_NT-6.1-WOW Annette-Pc2 3.1.7(0.340/5/3) 2020-08-22 19:03 i686 Cygwin
    $ g++ --version
    g++ (GCC) 10.2.0
    

    and running the tests, I found that the ConicProj0 test ran for minutes, so I interrupted it. Investigation showed that the loop in AlbersEqualArea::DDatanhee() controlled by while (os != s) apparently ran forever. The investigation was hampered by the fact that inserting std::cerr << debugging output; statements inside or near the loop sometimes corrected the problem and the infinite loop behavior went away. (Hello, Heisenbug.)

    I have heard rumors of more modern hardware supporting more precise floating point values than implied by the usual 32-bit or 64-bit storage formats in order to enable more precision in intermediate calculations. This is presumably mostly good, but may create problems on occasion.

    This discussion is from the gcc man page:

    -ffloat-store
    Do not store floating-point variables in registers, and inhibit other options that might change whether a
    floating-point value is taken from a register or memory.

    This option prevents undesirable excess precision on machines such as the 68000 where the floating
    registers (of the 68881) keep more precision than a "double" is supposed to have. Similarly for the x86
    architecture. For most programs, the excess precision does only good, but a few programs rely on the
    precise definition of IEEE floating point. Use -ffloat-store for such programs, after modifying them to
    store all pertinent intermediate computations into variables.

    Suspecting that the AlbersEqualArea::DDatanhee() loop could be a case of excess precision creating problems led me to declare both of the variables in the os != s condition as volatile. Like this:

       Math::real AlbersEqualArea::DDatanhee(real x, real y) const {
    -    real s = 0;
    +    volatile real s = 0;
         if (_e2 * (abs(x) + abs(y)) < real(0.5)) {
    -      real os = -1, z = 1, k = 1, t = 0, c = 0, en = 1;
    +      volatile real os = -1;
    +      real z = 1, k = 1, t = 0, c = 0, en = 1;
           while (os != s) {
    

    This causes the infinite loop behavior to go away.

    Best regards
    Thorkil Naur

     
  • Charles Karney

    Charles Karney - 2021-01-01

    Thanks for reporting (and diagnosing!) this. My first response is that os != s is likely too stringent a test. I'll look into this.

     
  • Anonymous

    Anonymous - 2021-01-24

    Following up on this, initially, I noted that the ConicProj0 looping problem also occurs with:

    $ uname -a
    Linux tn41.thorkilnaur.dk 5.6.13-100.fc30.i686 #1 SMP Fri May 15 00:02:53 UTC 2020 i686 i686 i386 GNU/Linux
    $ g++ --version
    g++ (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2)
    Copyright (C) 2019 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    $
    

    Investigating in detail under these circumstances, I found that the infinite loop indeed occurs because quantities are compared for equality, only one of which is rounded to ordinary double precision and the other is retained in extended precision.

    The relevant code from GeographicLib-1.51 is this extract from AlbersEqualArea::DDatanhee(real,real):

    370       while (os != s) {
    371         os = s;
    372         t = y * t + z; c += t; z *= x;
    373         t = y * t + z; c += t; z *= x;
    374         k += 2; en *= _e2;
    375         // Here en[l] = e2^l, k[l] = 2*l + 1,
    376         // c[l] = sum( x^i * y^j; i >= 0, j >= 0, i+j < 2*l)
    377         s += en * c / k;
    378       }
    

    In this loop, s is repeatedly increased by a decreasing term, en*c/k. And termination is conditioned on the event that this change of s doesn't change the value of s: So continue while os != s where os is the old value of s.

    Must terminate at some point, it seems, given the limited precision of floating point values and the fact that the added term, en*c/k, as is easily seen, becomes smaller and smaller with each execution of the loop. So what's going on?

    To figure out, I inspected the machine code generated for AlbersEqualArea::DDatanhee(real,real) and the execution details using gdb (http://www.gnu.org/software/gdb/).

    Under the circumstances mentioned, the machine code for the while (os != s) loop is the following:

    1   .L135:
    2       fxch    %st(3)      #
    3       fxch    %st(4)      #
    4       fxch    %st(1)      #
    5       fxch    %st(5)      #
    6       jmp .L97    #
    7   .L136:
    8       fxch    %st(3)      #
    9       fxch    %st(4)      #
    10      fxch    %st(1)      #
    11      fxch    %st(5)      #
    12      .p2align 4,,10
    13      .p2align 3
    14  .L97:
    15  # AlbersEqualArea.cpp:372:         t = y * t + z; c += t; z *= x;
    16      fmul    %st(6), %st #,
    17  # AlbersEqualArea.cpp:372:         t = y * t + z; c += t; z *= x;
    18      fadd    %st(5), %st #,
    19  # AlbersEqualArea.cpp:372:         t = y * t + z; c += t; z *= x;
    20      fadd    %st, %st(2) #,
    21  # AlbersEqualArea.cpp:372:         t = y * t + z; c += t; z *= x;
    22      fldl    16(%esp)    # %sfp
    23      fmul    %st, %st(6) #,
    24      fxch    %st(1)      #
    25  # AlbersEqualArea.cpp:373:         t = y * t + z; c += t; z *= x;
    26      fmul    %st(7), %st #,
    27  # AlbersEqualArea.cpp:373:         t = y * t + z; c += t; z *= x;
    28      fadd    %st(6), %st #,
    29  # AlbersEqualArea.cpp:373:         t = y * t + z; c += t; z *= x;
    30      fadd    %st, %st(3) #,
    31      fxch    %st(6)      #
    32  # AlbersEqualArea.cpp:373:         t = y * t + z; c += t; z *= x;
    33      fmulp   %st, %st(1) #,
    34      fxch    %st(1)      #
    35  # AlbersEqualArea.cpp:374:         k += 2; en *= _e2;
    36      fadds   24(%esp)    # %sfp
    37      fxch    %st(4)      #
    38  # AlbersEqualArea.cpp:374:         k += 2; en *= _e2;
    39      fmull   8(%esp) # %sfp
    40      fxch    %st(3)      #
    41      fstl    (%esp)  # %sfp
    42  # AlbersEqualArea.cpp:377:         s += en * c / k;
    43      fld %st(2)  #
    44      fmul    %st(4), %st #,
    45  # AlbersEqualArea.cpp:377:         s += en * c / k;
    46      fdiv    %st(5), %st #,
    47  # AlbersEqualArea.cpp:377:         s += en * c / k;
    48      faddp   %st, %st(1) #,
    49  # AlbersEqualArea.cpp:370:       while (os != s) {
    50      fldl    (%esp)  # %sfp
    51      fxch    %st(1)      #
    52      fucomi  %st(1), %st #,
    53      fstp    %st(1)      #
    54      jp  .L135   #,
    55      jne .L136   #,
    

    Interpretation of the machine code makes use of

    Intel® 64 and IA-32 Architectures
    Software Developers Manual
    Combined Volumes:
    1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D and 4
    

    found at https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf

    In this manual, "CHAPTER 8 PROGRAMMING WITH THE X87 FPU" explains that the floating point unit operates with 80-bit floating point values, that is, with extended precision when compared to the normal double values that comprise 64 bits. When a 64-bit floating point value is loaded from memory into a floating point register, it is suitably extended with zeroes. And, when storing a 80-bit value from a floating point register as a 64-bit value in memory, suitable rounding is performed.

    Now, there is no need to go into all details, but analysis of the machine code reveals that the variable s is maintained in the floating point register set throughout. That is, it is never loaded from memory or stored in memory as part of the loop.

    Not so with the os variable, used to keep the old value of s: As I read the code, the instruction

    41      fstl    (%esp)  # %sfp
    

    represents the saving, os = s, of the old value of s and thereby also indicating that os is maintained in memory and, thus, as a 64-bit value. Any additional precision of the value of s held in the floating point register will thus be lost.

    A bit later, the instruction

    50      fldl    (%esp)  # %sfp
    

    brings back the value of os from memory, in preparation for the comparison:

    52      fucomi  %st(1), %st #,
    

    Using the gdb debugger, the situation is depicted as follows in the moment just before the comparison is executed at a time when the loop has been going for a while:

    st0            0.00516878351637141068976 (raw 0x3ff7a95ee614da8860a7)
    st1            0.00516878351637141061903 (raw 0x3ff7a95ee614da886000)
    

    Here, st0 contains s and st1 contains os. Observe that they are identical, except for the least significant 8 bits: Where s has 0xa7, os has 0x00, presumably the result of having stored the value in memory as a 64-bit float and then loading it back into a floating point register.

    So now we understand the problem. Next is, how do we solve it?

    The earlier suggestion of using volatile to solve the problem was a bit of a stab in the dark: It solved the problem, yes, but the method seems brittle and hard to connect with the essence of the problem.

    Another possibility would be to use the earlier mentioned g++ option -ffloat-store. And, indeed, using that option solves the problem. However, the price is possibly some reduced performance and accuracy of the calculations.

    It seems best to attempt to solve the problem at the source code level, for example by adjusting the loop as follows:

      for (;;) {
        t = y * t + z; c += t; z *= x;
        t = y * t + z; c += t; z *= x;
        k += 2; en *= _e2;
        // Here en[l] = e2^l, k[l] = 2*l + 1,
        // c[l] = sum( x^i * y^j; i >= 0, j >= 0, i+j < 2*l)
        real d = en * c / k;
        if ( abs(d) <= abs(s) * std::numeric_limits<real>::epsilon() ) {
          break;
        }
        s += d;
      }
    

    This likewise eliminates the looping behaviour in the observed cases.

    Best regards
    Thorkil Naur

     
  • Charles Karney

    Charles Karney - 2021-01-25

    I agree that this is a good solution. I'm still rethinking my approach to computing this function and I may end up with a different way of handling this issue.

     
  • Charles Karney

    Charles Karney - 2021-02-22

    I've checked in (on the devel branch) various updates to AlbersEqualArea. These eliminate the problem you reported. Thanks!

     

Anonymous
Anonymous

Add attachments
Cancel