frozen TimerThread breaks loginTimeout

Rob Forbes
2008-01-31
2012-08-15
  • Rob Forbes
    2008-01-31

    Hello,

    I think I have found a bug in TimerThread that results in loginTimeout becoming ineffective, which can cause CLOSE_WAIT sockets to accumulate as a result of failed logins, eventually resulting in a "too many open files" error. I wonder if you could check my logic on this.

    The behavior we are seeing is an accumulation of CLOSE_WAIT sockets even though loginTimeout is set to 30. (BTW this is not the same issue as #1755448, because in that case loginTimeout was deliberately set to 0.) Adding a print statement to the callback that closes sockets after the timeout period shows that the timeout works for some length of time, then stops firing altogether. After putting in more print statements, I think I see what is happening:

    TimerThread has a loop in its run() method to process timers, which begins with the following code:

           while (true) {
               try {
                   try {
                       // If nextTimeout == 0 (i.e. there are no more requests
                       // in the queue) wait indefinitely -- wait(0)
                       timerList.wait(nextTimeout == 0 ? 0
                               : nextTimeout - System.currentTimeMillis());
    

    The bug is this: if nextTimeout is not 0, but is equal to System.currentTimeMillis() (i.e. if the next timer is set to go off at that very instant), the argument to wait() evaluates to 0, and wait(0) means wait forever, exactly as in the empty-queue case. This hangs the thread so that no more timers are processed and no more sockets are closed. If nextTimeout really is 0 (empty timer queue), the wait(0) is terminated when a new timer is added in setTimer() as follows:

           // If this request is now the first in the list, interrupt timer
           if (timerList.getFirst() == t) {
               nextTimeout = t.time;
               this.interrupt();
           }
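
    To make the arithmetic behind the wait() argument concrete, here is a minimal sketch of my own. It is not the jTDS code: the class and method names (WaitArgDemo, originalWaitArg, guardedWaitArg) are made up for illustration, and the guard is only one possible way to tell "queue empty" apart from "timer already due".

```java
public class WaitArgDemo {
    // Mirrors the expression passed to timerList.wait() in the quoted loop.
    static long originalWaitArg(long nextTimeout, long now) {
        return nextTimeout == 0 ? 0 : nextTimeout - now;
    }

    // Hypothetical guard (an assumption, not the library's code):
    // returns 0 only for an empty queue, and -1 to signal
    // "do not wait at all, the first timer is already due".
    static long guardedWaitArg(long nextTimeout, long now) {
        if (nextTimeout == 0) {
            return 0; // empty queue: waiting indefinitely is intended
        }
        long remaining = nextTimeout - now;
        return remaining > 0 ? remaining : -1;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();

        // Timer due at exactly "now": the original expression yields 0,
        // which Object.wait(long) treats as "no timeout".
        System.out.println("original, due now:   " + originalWaitArg(now, now));

        // The guarded variant reports the timer as already due instead.
        System.out.println("guarded, due now:    " + guardedWaitArg(now, now));

        // A timer 30 seconds in the future is unaffected by the guard.
        System.out.println("guarded, due in 30s: " + guardedWaitArg(now + 30_000L, now));
    }
}
```

    A caller using the guarded variant would skip the wait() call whenever the result is negative and go straight to processing the expired timer.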
    

    However, in the buggy case, there is already a timer in the list--the one which should have been processed immediately. Any timer added afterwards expires later than that one, so it never becomes the first entry, the conditional does not evaluate to true, and the thread is not interrupted--the timers just pile up without being processed.
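
    The pile-up can be imitated with a small stand-alone sketch. Everything in it is hypothetical (StuckHeadDemo, Request, the interrupts counter), and it assumes the list is kept sorted by expiry time, which is what the getFirst() test in setTimer() suggests. The counter stands in for this.interrupt().

```java
import java.util.LinkedList;
import java.util.ListIterator;

public class StuckHeadDemo {
    // Hypothetical stand-in for a timer entry; only the expiry time matters here.
    static final class Request {
        final long time;
        Request(long time) { this.time = time; }
    }

    private final LinkedList<Request> timerList = new LinkedList<>();
    int interrupts = 0; // how often the timer thread would have been interrupted

    // Inserts in expiry order (an assumption), then applies the same
    // "is the new request first in the list?" test as the quoted snippet.
    Request setTimer(long expiry) {
        Request t = new Request(expiry);
        ListIterator<Request> it = timerList.listIterator();
        while (it.hasNext()) {
            if (it.next().time > expiry) {
                it.previous(); // step back so t is inserted before the later entry
                break;
            }
        }
        it.add(t);
        if (timerList.getFirst() == t) {
            interrupts++; // stands in for this.interrupt()
        }
        return t;
    }

    int pending() {
        return timerList.size();
    }

    public static void main(String[] args) {
        StuckHeadDemo demo = new StuckHeadDemo();
        long now = System.currentTimeMillis();

        demo.setTimer(now); // the timer the thread is stuck on
        for (int i = 1; i <= 5; i++) {
            demo.setTimer(now + 30_000L * i); // later logins, each expiring later
        }

        // Only the very first insertion counted as "first in the list".
        System.out.println("interrupts=" + demo.interrupts + ", pending=" + demo.pending());
    }
}
```

    In this sketch the only interrupt comes from inserting the head timer itself; none of the later insertions would wake a thread that is already parked in wait(0).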

    I have put in a print statement to test for the nextTimeout == System.currentTimeMillis() condition, and a) it does print out and b) that message coincides with the time that the "closing socket" print ceases and lsof shows accumulating CLOSE_WAIT sockets.

    TIA,

    Rob