Re: [Sqlrelay-discussion] listener hang with open database port
From: Renat S. <sr...@st...> - 2010-05-27 07:21:13
Hi Cal,
I don't really understand what happens in your case, but I have some ideas.
> "waiting for the scaler..." which is from sqlrlistener.C around line
> 1285. It hangs at that point until I manually kill the listener
> process. I've been trying to study what is happening here between the
> listener and scaler, but haven't determined anything so far.
After this message the listener waits for the scaler to signal semaphore
number 7. You can see this with strace or by looking at a backtrace in gdb.
Try running a command like this against sqlr-listener (here I did it for
sqlr-scaler, and you can see that it waits for semaphore #6):
$ sudo -u sqlrelay strace -p 2201
Process 2201 attached - interrupt to quit
semop(294921, {{6, -1, 0}}, 1
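For reference, the semop() arguments in that strace line are the
semaphore set id, then {sem_num, sem_op, sem_flg}, then the number of
operations, so {6, -1, 0} is a blocking decrement of semaphore #6. Here
is a small Python sketch that pulls those fields out of such a line.
parse_semop is a made-up helper name, and the pattern only covers the
single-operation format shown here; other strace versions may print the
array differently.

```python
import re

# Matches the single-operation form shown above, e.g.
#   semop(294921, {{6, -1, 0}}, 1
# Assumption: exactly one sembuf in the array, printed as {{num, op, flg}}.
SEMOP_RE = re.compile(
    r"semop\((\d+),\s*\{\{(\d+),\s*(-?\d+),\s*([^}]+)\}\},\s*(\d+)"
)

def parse_semop(line):
    """Return the semop() fields from one strace line, or None if no match."""
    m = SEMOP_RE.search(line)
    if m is None:
        return None
    semid, sem_num, sem_op, sem_flg, nsops = m.groups()
    return {
        "semid": int(semid),      # id of the semaphore set
        "sem_num": int(sem_num),  # index of the semaphore inside the set
        "sem_op": int(sem_op),    # -1 means "wait until > 0, then decrement"
        "sem_flg": sem_flg.strip(),
        "nsops": int(nsops),
    }

info = parse_semop("semop(294921, {{6, -1, 0}}, 1")
```

With the line above, info["sem_num"] is 6 and info["sem_op"] is -1, which
is the scaler sitting in a wait on semaphore #6.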
The scaler always waits on semaphore #6 before it starts the procedure of
firing up new connections. Then it counts sessions and connections and
signals semaphore #7 to let the listener keep going.
I believe the listener could freeze at this point if there is no scaler
running at all, or if semaphore #4 is held by some other process and the
scaler can't acquire it.
You could examine the semaphore state with the patched sqlr-status: if
the value is 1 the semaphore is free to acquire, if it is 0 it is
already acquired.
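The 1/0 reading is just ordinary binary-semaphore behaviour. A minimal
Python sketch of it, with a threading.Semaphore as a stand-in for
semaphore #4 (try_acquire is a hypothetical helper, nothing from SQL
Relay):

```python
import threading

# Stand-in for semaphore #4; initial value 1 means "free".
sem4 = threading.Semaphore(1)

def try_acquire(sem):
    """Non-blocking attempt, similar in spirit to semop() with IPC_NOWAIT."""
    return sem.acquire(blocking=False)

first = try_acquire(sem4)    # succeeds, value drops from 1 to 0
second = try_acquire(sem4)   # fails, value is 0: already acquired
sem4.release()               # holder lets go, value goes back to 1
third = try_acquire(sem4)    # succeeds again
```

A blocking acquire at the point of the second attempt would wait
forever if the holder never released, which is the scenario where the
scaler gets stuck.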
You could try the "-fork" option to sqlr-start. In that case the scaler
doesn't use the connection counter in shared memory, and so doesn't use
semaphore #4. Or you could just remove the acquire and release of
semaphore #4 from scaler::countConnections(), because there is no need
to serialize access when reading a single value. Who cares if some
process writes another value a bit earlier or later.
But I don't really think that the problem is in the semaphores. You
should examine the state of the processes with strace and gdb first.
--
Renat Sabitov e-mail: sr...@st...
Stack Soft jid: sr...@ja...