|
From: Fernando P. <Fer...@co...> - 2006-04-28 23:20:58
|
Hi all,
this is somewhat off-topic, since it's really a gcc/g77 question. Yet for us
here (my group) it may lead to the decision to stop using g77 for all fortran
code and switch to another compiler for our python-wrapped libraries. So it
did arise in the context of python usage of in-house code, and I'm appealing
to anyone who may want to play a little with the question and help.
Feel free to reply off-list to keep the noise down on the list.
The problem arose in some in-house library, but can be boiled down to this:
planck[f77bug]> cat testbug.f
program testbug
c
implicit real *8 (a-h,o-z)
c
half = 0.5d0
x = 0.49d0
nnx = 100
iax = (x+half)*nnx
print *, 'Should be 99:',iax
stop
end
c EOF
planck[f77bug]> g77 -o testbug.g77 testbug.f
planck[f77bug]> ./testbug.g77
Should be 99: 98
This can be seen as computing (x/n+1/2)*n and comparing it to x+n/2. Yes, I
know about the dangers of floating point roundoff error (I didn't write the
original code), but a variation of this is used inside a library that began
crashing for certain inputs. The point is that this same code works fine with
the Intel and Lahey compilers, but not with g77. Now, to add a bit of mystery
to the question, I wrote the following C code:
planck[f77bug]> cat scanbug.c
#include <stdio.h>
int main(int argc, char* argv[]) {
double x;
double eps = 1e-2;
double x0 = 0.0;
double xmax = 1.0;
int nnx = 100;
int i = 0;
double dax;
int iax,iax_direct;
x = x0;
while (x<xmax) {
// This operation:
dax = nnx*(x+0.5);
iax = dax;
// And this one:
iax_direct = nnx*(x+0.5);
// look identical, it's jut that one of them does not use a temporary
// double variable to hold the result ( Does this mean that the int
// cast is done in a register straight out of the 80-bit value in the
// FPU? )
// And yet, they produce different results for certain values of x
if (iax != iax_direct) {
printf("ERROR at x=%e!\n",x);
}
x = x0 + i*eps;
i += 1;
}
}
// EOF
And this is really where my question is. The key issue is that nx*(x+0.5)
produces a different result when truncated to an int, depending on whether a
temporary double is involved or not. I tested with the Intel C compiler, and
it does never report a mismatch, yet a gcc compilation (3.4.3 and 4.0.2) does
report a number of them.
Any ideas/comments? Shouldn't the result be independent of the intermediate
double var? It is for icc, can this be considered a gcc bug?
Cheers,
f
|