Hi Alfonso,

I have applied the suggested patch and the CPU load is worse than before. It is consuming 100% on a Core 2 Duo 2.6 GHz.
Without the patch the load is around 60%, but the transceiver sometimes crashes.

Is anyone else facing the same issue?

Thanks,
Alex

2009/1/16 Alfonso De Gregorio <adg@crypto.lo.gy>
Hello David,

David A. Burgess wrote:


The bottom line is that the stuff in CommonLibs/Threads needs to be patched to work under most Linux variants.  In particular, I suspect that pthread_cond_timedwait does not behave as documented.

I have taken a look at the method implementing the timed condition wait.
There we need to take into account the possibility of spurious wakeups from pthread_cond_timedwait() (and this is what actually happens on most Linux variants): the function occasionally returns even though the condition variable wasn't signaled or broadcast.

According to the POSIX spec, pthread_cond_timedwait() and pthread_cond_wait() are required to return zero on a spurious wakeup: http://opengroup.org/onlinepubs/007908799/xsh/pthread_cond_wait.html
For this exact reason, the predicate associated with the condition wait needs to be re-evaluated upon such a return, and the function invoked again if the predicate still does not hold.
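
The re-evaluation described above can be sketched as follows. This is only a minimal illustration under my own assumptions, not the CommonLibs/Threads code: Flag, flagInit, flagSet, flagWait and deadlineAfter are hypothetical names, and the predicate is a plain boolean protected by the mutex.

```cpp
#include <pthread.h>
#include <time.h>
#include <errno.h>
#include <cassert>

// Hypothetical helper type (not from CommonLibs/Threads): a boolean
// predicate guarded by a mutex and a condition variable.
struct Flag {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    bool ready;
};

void flagInit(Flag& f)
{
    pthread_mutex_init(&f.mutex, NULL);
    pthread_cond_init(&f.cond, NULL);   // default attributes: CLOCK_REALTIME
    f.ready = false;
}

// Absolute deadline, timeoutMs milliseconds from now.
static struct timespec deadlineAfter(unsigned timeoutMs)
{
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);
    ts.tv_sec += timeoutMs / 1000;
    ts.tv_nsec += (long)(timeoutMs % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L) { ts.tv_sec += 1; ts.tv_nsec -= 1000000000L; }
    return ts;
}

// Set the predicate and wake one waiter.
void flagSet(Flag& f)
{
    pthread_mutex_lock(&f.mutex);
    f.ready = true;
    pthread_cond_signal(&f.cond);
    pthread_mutex_unlock(&f.mutex);
}

// Wait until the predicate holds or the deadline passes.
// Returns true if the flag was set, false on timeout.
bool flagWait(Flag& f, unsigned timeoutMs)
{
    struct timespec deadline = deadlineAfter(timeoutMs);
    pthread_mutex_lock(&f.mutex);
    int rc = 0;
    // A zero return with the predicate still false is a spurious wakeup:
    // wait again with the same absolute deadline.
    while (!f.ready && rc == 0)
        rc = pthread_cond_timedwait(&f.cond, &f.mutex, &deadline);
    assert(rc == 0 || rc == ETIMEDOUT);  // anything else is a usage error
    bool result = f.ready;
    pthread_mutex_unlock(&f.mutex);
    return result;
}
```

Because the loop tests the predicate rather than only the return code, a genuine signal ends the wait immediately, while a spurious wakeup just goes back to sleep until the same deadline.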

Why are we observing this behaviour on Linux? It seems to be related to the use of futex in the Linux implementation of pthread_cond_timedwait(): http://en.wikipedia.org/wiki/Spurious_wakeup
Like every blocking Linux syscall, futex returns abruptly when the process receives a signal.

Here is a proposed patch:

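  /** Block for the signal up to the cancellation timeout. */
  void Signal::wait(Mutex& wMutex, unsigned timeout) const
  {
      int rc = 0;
      // Deadline derived from the timeout; pthread_cond_timedwait()
      // expects it as an absolute time.
      Timeval then(timeout);
      struct timespec waitTime = then.timespec();

      // Wait again after any zero return until the deadline has passed.
      // Without a predicate, a zero return may be spurious or genuine.
      while (!then.passed() && rc == 0)
        rc = pthread_cond_timedwait(&mSignal, &wMutex.mMutex, &waitTime);
  }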


As far as I can tell, this does not explain the high CPU load.

Cheers,

--
 Alfonso De Gregorio        http://Crypto.lo.gy/