and running the tests, I found that the ConicProj0 test ran for minutes, so I interrupted it. Investigation showed that the loop in AlbersEqualArea::DDatanhee() controlled by while (os != s) apparently ran forever. The investigation was hampered by the fact that inserting std::cerr << debugging output; statements inside or near the loop sometimes corrected the problem and the infinite loop behavior went away. (Hello, Heisenbug.)
I have heard rumors of more modern hardware supporting more precise floating point values than implied by the usual 32-bit or 64-bit storage formats in order to enable more precision in intermediate calculations. This is presumably mostly good, but may create problems on occasion.
This discussion is from the gcc man page:
-ffloat-store
Do not store floating-point variables in registers, and inhibit other options that might change whether a
floating-point value is taken from a register or memory.
This option prevents undesirable excess precision on machines such as the 68000 where the floating
registers (of the 68881) keep more precision than a "double" is supposed to have. Similarly for the x86
architecture. For most programs, the excess precision does only good, but a few programs rely on the
precise definition of IEEE floating point. Use -ffloat-store for such programs, after modifying them to
store all pertinent intermediate computations into variables.
Suspecting that the AlbersEqualArea::DDatanhee() loop could be a case of excess precision creating problems led me to declare both of the variables in the os != s condition as volatile. Like this:
Thanks for reporting (and diagnosing!) this. My first response is that os != s is likely too stringent a test. I'll look into this.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Anonymous
Anonymous
-
2021-01-24
Following up on this, initially, I noted that the ConicProj0 looping problem also occurs with:
$ uname-aLinuxtn41.thorkilnaur.dk5.6.13-100.fc30.i686#1SMPFriMay1500:02:53UTC2020i686i686i386GNU/Linux
$ g++--versiong++(GCC)9.3.120200408(RedHat9.3.1-2)Copyright(C)2019FreeSoftwareFoundation, Inc.
Thisisfreesoftware; see the source for copying conditions. There is NOwarranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$
Investigating in detail under these circumstances, I found that the infinite loop indeed occurs because quantities are compared for equality, only one of which is rounded to ordinary double precision and the other is retained in extended precision.
The relevant code from GeographicLib-1.51 is this extract from AlbersEqualArea::DDatanhee(real,real):
370while(os!=s) {
371os=s;372t=y*t+z; c += t; z *= x;373t=y*t+z; c += t; z *= x;374k+=2; en *= _e2;375//Hereen[l] =e2^l, k[l] =2*l+1,
376//c[l] =sum(x^i*y^j; i >= 0, j >= 0, i+j < 2*l)377s+=en*c/k;378 }
In this loop, s is repeatedly increased by a decreasing term, en*c/k. And termination is conditioned on the event that this change of s doesn't change the value of s: So continue while os != s where os is the old value of s.
Must terminate at some point, it seems, given the limited precision of floating point values and the fact that the added term, en*c/k, as is easily seen, becomes smaller and smaller with each execution of the loop. So what's going on?
To figure out, I inspected the machine code generated for AlbersEqualArea::DDatanhee(real,real) and the execution details using gdb (http://www.gnu.org/software/gdb/).
Under the circumstances mentioned, the machine code for the while (os != s) loop is the following:
In this manual, "CHAPTER 8 PROGRAMMING WITH THE X87 FPU" explains that the floating point unit operates with 80-bit floating point values, that is, with extended precision when compared to the normal double values that comprise 64 bits. When a 64-bit floating point value is loaded from memory into a floating point register, it is suitably extended with zeroes. And, when storing a 80-bit value from a floating point register as a 64-bit value in memory, suitable rounding is performed.
Now, there is no need to go into all details, but analysis of the machine code reveals that the variable s is maintained in the floating point register set throughout. That is, it is never loaded from memory or stored in memory as part of the loop.
Not so with the os variable, used to keep the old value of s: As I read the code, the instruction
41fstl(%esp)#%sfp
represents the saving, os = s, of the old value of s and thereby also indicating that os is maintained in memory and, thus, as a 64-bit value. Any additional precision of the value of s held in the floating point register will thus be lost.
A bit later, the instruction
50fldl(%esp)#%sfp
brings back the value of os from memory, in preparation for the comparison:
52fucomi%st(1),%st#,
Using the gdb debugger, the situation is depicted as follows in the moment just before the comparison is executed at a time when the loop has been going for a while:
Here, st0 contains s and st1 contains os. Observe that they are identical, except for the least significant 8 bits: Where s has 0xa7, os has 0x00, presumably the result of having stored the value in memory as a 64-bit float and then loading it back into a floating point register.
So now we understand the problem. Next is, how do we solve it?
The earlier suggestion of using volatile to solve the problem was a bit of a stab in the dark: It solved the problem, yes, but the method seems brittle and hard to connect with the essence of the problem.
Another possibility would be to use the earlier mentioned g++ option -ffloat-store. And, indeed, using that option solves the problem. However, the price is possibly some reduced performance and accuracy of the calculations.
It seems best to attempt to solve the problem at the source code level, for example by adjusting the loop as follows:
for(;;) {t=y*t+z; c += t; z *= x;t=y*t+z; c += t; z *= x;k+=2; en *= _e2;//Hereen[l] =e2^l, k[l] =2*l+1,
//c[l] =sum(x^i*y^j; i >= 0, j >= 0, i+j < 2*l)reald=en*c/k;if(abs(d)<=abs(s)*std::numeric_limits<real>::epsilon()) {
break;
}
s+=d;
}
This likewise eliminates the looping behaviour in the observed cases.
Best regards
Thorkil Naur
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I agree that this is a good solution. I'm still rethinking my approach to computing this function and I may end up with a different way of handling this issue.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Having cmake'ed GeographicLib-1.51 on a Windows 7 with Cygwin
and running the tests, I found that the ConicProj0 test ran for minutes, so I interrupted it. Investigation showed that the loop in AlbersEqualArea::DDatanhee() controlled by while (os != s) apparently ran forever. The investigation was hampered by the fact that inserting std::cerr << debugging output; statements inside or near the loop sometimes corrected the problem and the infinite loop behavior went away. (Hello, Heisenbug.)
I have heard rumors of more modern hardware supporting more precise floating point values than implied by the usual 32-bit or 64-bit storage formats in order to enable more precision in intermediate calculations. This is presumably mostly good, but may create problems on occasion.
This discussion is from the gcc man page:
Suspecting that the AlbersEqualArea::DDatanhee() loop could be a case of excess precision creating problems led me to declare both of the variables in the os != s condition as volatile. Like this:
This causes the infinite loop behavior to go away.
Best regards
Thorkil Naur
Thanks for reporting (and diagnosing!) this. My first response is that
os != s
is likely too stringent a test. I'll look into this.Following up on this, initially, I noted that the ConicProj0 looping problem also occurs with:
Investigating in detail under these circumstances, I found that the infinite loop indeed occurs because quantities are compared for equality, only one of which is rounded to ordinary double precision and the other is retained in extended precision.
The relevant code from GeographicLib-1.51 is this extract from AlbersEqualArea::DDatanhee(real,real):
In this loop, s is repeatedly increased by a decreasing term, en*c/k. And termination is conditioned on the event that this change of s doesn't change the value of s: So continue while os != s where os is the old value of s.
Must terminate at some point, it seems, given the limited precision of floating point values and the fact that the added term, en*c/k, as is easily seen, becomes smaller and smaller with each execution of the loop. So what's going on?
To figure out, I inspected the machine code generated for AlbersEqualArea::DDatanhee(real,real) and the execution details using gdb (http://www.gnu.org/software/gdb/).
Under the circumstances mentioned, the machine code for the while (os != s) loop is the following:
Interpretation of the machine code makes use of
found at https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf
In this manual, "CHAPTER 8 PROGRAMMING WITH THE X87 FPU" explains that the floating point unit operates with 80-bit floating point values, that is, with extended precision when compared to the normal double values that comprise 64 bits. When a 64-bit floating point value is loaded from memory into a floating point register, it is suitably extended with zeroes. And, when storing a 80-bit value from a floating point register as a 64-bit value in memory, suitable rounding is performed.
Now, there is no need to go into all details, but analysis of the machine code reveals that the variable s is maintained in the floating point register set throughout. That is, it is never loaded from memory or stored in memory as part of the loop.
Not so with the os variable, used to keep the old value of s: As I read the code, the instruction
represents the saving, os = s, of the old value of s and thereby also indicating that os is maintained in memory and, thus, as a 64-bit value. Any additional precision of the value of s held in the floating point register will thus be lost.
A bit later, the instruction
brings back the value of os from memory, in preparation for the comparison:
Using the gdb debugger, the situation is depicted as follows in the moment just before the comparison is executed at a time when the loop has been going for a while:
Here, st0 contains s and st1 contains os. Observe that they are identical, except for the least significant 8 bits: Where s has 0xa7, os has 0x00, presumably the result of having stored the value in memory as a 64-bit float and then loading it back into a floating point register.
So now we understand the problem. Next is, how do we solve it?
The earlier suggestion of using volatile to solve the problem was a bit of a stab in the dark: It solved the problem, yes, but the method seems brittle and hard to connect with the essence of the problem.
Another possibility would be to use the earlier mentioned g++ option -ffloat-store. And, indeed, using that option solves the problem. However, the price is possibly some reduced performance and accuracy of the calculations.
It seems best to attempt to solve the problem at the source code level, for example by adjusting the loop as follows:
This likewise eliminates the looping behaviour in the observed cases.
Best regards
Thorkil Naur
I agree that this is a good solution. I'm still rethinking my approach to computing this function and I may end up with a different way of handling this issue.
I've checked in (on the
devel
branch) various updates toAlbersEqualArea
. These eliminate the problem you reported. Thanks!