From: Liu H. <lh_...@12...> - 2018-01-18 13:05:37
On 2018/1/18 9:27, lemonsqueeze wrote:
> Ah, found the big one:
> Was testing single threaded but code uses __thread thread-local storage
> which slows things down a lot on mingw.
>
> Tried a few alternatives:
> - Microsoft TlsGetValue() / TlsSetValue()
> - pthread_getspecific() / pthread_setspecific()
> - __thread
> - own array-based tls using pthread_self()
> - own array-based tls using GetCurrentThreadId()
>
> Fastest so far is the last one, getting comparable perf with linux
> (8% slower instead of 30%).
>

For `pthread_getspecific() / pthread_setspecific()`:
See `mingw-w64/winpthreads/src/threads.c`. `pthread_getspecific()`
contains calls to `pthread_spin_lock()` and `pthread_spin_unlock()`, in
addition to calls to `GetLastError()` and `SetLastError()`, so it could be
quite inefficient, and it gets worse if there is any contention at all.

For `__thread`:
At the moment this is implemented using emutls
(https://gcc.gnu.org/onlinedocs/gccint/Emulated-TLS.html). See
`gcc/libgcc/emutls.c` for its implementation. It contains a call to
`__gthread_getspecific()`, which, in the posix thread model, is
effectively `pthread_getspecific()`.

-- 
Best regards,
LH_Mouse
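
For reference, the "own array-based tls" idea from the quoted list could look roughly like the sketch below. This is not lemonsqueeze's actual code, only a guess at the general shape: a fixed table of slots, each claimed by a thread id with a compare-and-swap, so a lookup touches no lock and never calls `GetLastError()`. The names `tls_get`, `tls_set`, `tls_find_slot`, `current_thread_id` and `TLS_SLOTS` are made up for illustration. It also assumes thread ids are never 0 and, off Windows, that `pthread_t` converts to an integer (true on Linux/glibc, not guaranteed by POSIX).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
static uintptr_t current_thread_id(void)
{
    return (uintptr_t)GetCurrentThreadId();
}
#else
#include <pthread.h>
/* Assumption: pthread_t is an integer or pointer type on this platform. */
static uintptr_t current_thread_id(void)
{
    return (uintptr_t)pthread_self();
}
#endif

#define TLS_SLOTS 64 /* hypothetical upper bound on live threads */

struct tls_slot {
    _Atomic uintptr_t owner; /* thread id owning the slot, 0 = free */
    void *value;             /* only ever touched by the owning thread */
};

static struct tls_slot tls_table[TLS_SLOTS];

/* Find the calling thread's slot, claiming a free one on first use.
 * Returns NULL when the table is full. Slots are never released. */
static struct tls_slot *tls_find_slot(void)
{
    uintptr_t id = current_thread_id();
    assert(id != 0); /* 0 is reserved as the "free" marker */

    size_t start = (size_t)((id ^ (id >> 12)) % TLS_SLOTS);
    for (size_t i = 0; i < TLS_SLOTS; i++) {
        struct tls_slot *s = &tls_table[(start + i) % TLS_SLOTS];
        uintptr_t owner = atomic_load_explicit(&s->owner, memory_order_acquire);
        if (owner == id)
            return s;
        if (owner == 0) {
            uintptr_t expected = 0;
            if (atomic_compare_exchange_strong(&s->owner, &expected, id))
                return s;
            /* Lost the race: another thread took this slot, keep probing. */
        }
    }
    return NULL;
}

/* Return the calling thread's value, or NULL if none was set. */
static void *tls_get(void)
{
    struct tls_slot *s = tls_find_slot();
    return s ? s->value : NULL;
}

/* Store a value for the calling thread. Returns 0 on success, -1 if full. */
static int tls_set(void *value)
{
    struct tls_slot *s = tls_find_slot();
    if (s == NULL)
        return -1;
    s->value = value;
    return 0;
}
```

The obvious trade-offs: the thread count is capped, there is no destructor support, and because slots are never freed, a new thread that reuses the id of an exited one would see the old value. A real implementation would need to clear the slot on thread exit.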