MinGW-w64 - for 32 and 64 bit Windows / Bugs / #678 'long double' computations are done in 'double' with Openmp

#678 'long double' computations are done in 'double' with Openmp

Milestone: v1.0 (example)

Status: open

Owner: nobody

Labels: None

Priority: 5

Updated: 2017-10-05

Created: 2017-10-02

Creator: Michael Quellmalz

Private: No

When using parallel OpenMP code, computations with long double variables are actually done in double precision. The following example computes the machine epsilon (eps) for long double in several threads:

#include <stdio.h>
#include <omp.h>

int main(){
  volatile long double eps[] = {1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0}, one_plus_eps[8];
  int j;
  #pragma omp parallel for default(shared) private(j)
  for (j=0; j<8; j++)
  {
    //__asm__ ("fninit");
    do {
      eps[j] /= 2.0;
      one_plus_eps[j] = 1.0 + eps[j];
    }
    while (one_plus_eps[j] != 1.0);
    eps[j] *= 2.0;
    int id=omp_get_thread_num();
    printf ("thread #%d,\teps=%LE\n", id, eps[j]);
  }
  return 0;
}

Result: (compiled with Thread model: posix gcc version 7.2.0 (Rev1, Built by MSYS2 project))

$ gcc -D__USE_MINGW_ANSI_STDIO=1 -fopenmp ldbl-omp.c -o a.exe && ./a.exe
thread #1,      eps=2.220446E-016
thread #0,      eps=1.084202E-019
thread #2,      eps=2.220446E-016
thread #1,      eps=2.220446E-016
thread #0,      eps=1.084202E-019
thread #3,      eps=2.220446E-016
thread #2,      eps=2.220446E-016
thread #3,      eps=2.220446E-016

The correct value of epsilon for long double of 1.08E-19 is obtained only in thread 0, all other threads compute the epsilon of double precision.

The correct result can be obtained by uncommenting the instruction __asm__ ("fninit");

Discussion

Doug Semler - 2017-10-05

I'm actually investigating this; I discovered a couple bugs in the floating point environment functions that doesn't set the env properly. However I am still setting my build/test environment. I do have a question though to satisfy my curiosity what happens when you compile with -frounding-math -fsignaling-nans paramaters? (I'm not saying the results will be different, just that there may be a difference).

Also, I can't tell if you're compiling x86 or x64 code; there is a huge difference...if x86 code, can you see if there is a difference if you compile with '-mpfmath=sse -msse2" . Note, x64 is irrelevant.

(this will at least let me try and narrow things down)

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Michael Quellmalz - 2017-10-05

I compiled x64 code. The flags -frounding-math -fsignaling-nans do not make any difference.

When I compile x86 code (in the MSYS2 MINGW32 evironment), I get the same results. Also, neither -mpfmath=sse -msse2 nor -frounding-math -fsignaling-nans nor using both changes the result for me.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Kai Tietz - 2017-10-05

Well, the problem is that for new created threads we don't set the floating point environment. Even worse, we let msvcrt do the wrong thing (which resets it to its own settings).
By using winpthread with recent fix, this issue is resolved for all cases you are using pthread API to create the thread, but not in general ...

It might be a way to use our TLS routines in an executable. The problematic part is here that our code is used in DLL, which is used in a VC executable (or the other way around). As in such scenarios the FPU environment setting is for sure always wrong for one side.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

'long double' computations are done in 'double' with Openmp

A complete runtime environment for gcc

Group

Searches

Help

#678 'long double' computations are done in 'double' with Openmp

Discussion