[Queue-developers] Re: Re: Re: new intermediate development Queue version
From: QingLong <qin...@Bo...> - 2001-03-06 12:33:19
On Mon, Mar 05, 2001 at 09:35:12AM +0100, Gert Van den Eynde wrote:
> On Sat, 3 Mar 2001 05:34:27 +0300, QingLong said:
>>
>> I've made some changes to getrldavg() code that may influence
>> the misbehaviour you have reported recently. Please try updated code.
>
> Updated queue and queued and did the same tests as last week.
> Queue still locks up (or continuously keeps on trying) to get the load
> on the machines.
>
> Queued gives this as 'error' output:
>
> qlib.c Queue_net_connect(): connect()ing to 192.168.1.2:1423 ...
> qlib.c Queue_net_connect(): connect()ed to 192.168.1.2:1423 on socket 7.
> qlib.c Queue_nonblocking_rw(): failed to select() on fd 7:
> select(): Interrupted system call
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> qlib.c Queue_net_rw(): failed to get 1 4-byte items on fd 7; got 0 bytes.
> wakeup.c getrldavg(): failed to fread() from fd 7.
> wakeup.c getrldavg(): close(7).
> wakeup.c getrldavg(): ### failed to get load from dirac
> ### returning 1.00e+08 as rejection designator.
> qlib.c Queue_net_connect(): connect()ing to 192.168.1.3:1423 ...
> qlib.c Queue_net_connect(): connect()ed to 192.168.1.3:1423 on socket 7.
>

I suspect I know what the matter is.
AFAIR, you have a short sleeptime (2 seconds?), don't you?
Please perform a small test: try to run it with the default value of 120s.
Does it change anything?

If it does, then the problem is the alarms (used to schedule jobs)
interrupting select() on the network socket.

I am going to add a workaround for scheduled alarms to the network I/O
code. It will become unnecessary if we get rid of streams on network
sockets (and of using alarm() to time out reading/writing those streams)
and instead use select() on a bind()ed, listen()ing socket (put in
non-blocking mode) to multiplex the tasks of scheduling jobs and
accepting network connections.

QingLong.
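
For illustration only, here is a minimal sketch of one possible workaround
for the "select(): Interrupted system call" failure in the log: retry
select() when it fails with EINTR, recomputing the remaining timeout each
time so that frequent alarms cannot extend the wait. This is not the actual
Queue code; the names wait_readable_retry() and read_with_timeout() are
hypothetical, and the sketch assumes a POSIX system with clock_gettime().

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Current time in milliseconds on a monotonic clock. */
static long long now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Hypothetical helper (not part of qlib.c): wait until fd is readable
 * or timeout_ms has elapsed. A select() interrupted by a signal such as
 * SIGALRM is retried with whatever time is left.
 * Returns 1 if readable, 0 on timeout, -1 on any error other than EINTR. */
int wait_readable_retry(int fd, long timeout_ms)
{
    long long deadline = now_ms() + timeout_ms;

    for (;;) {
        long long left = deadline - now_ms();
        fd_set rfds;
        struct timeval tv;
        int rc;

        if (left < 0)
            left = 0;               /* past the deadline: poll once */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = left / 1000;
        tv.tv_usec = (left % 1000) * 1000;

        rc = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (rc > 0)
            return 1;               /* fd is readable */
        if (rc == 0)
            return 0;               /* timed out */
        if (errno != EINTR)
            return -1;              /* real error */
        /* EINTR: a signal handler ran; loop and try again. */
    }
}

/* Hypothetical helper: read up to len bytes, waiting at most timeout_ms
 * for data. Returns the read() result, or -1 with errno == ETIMEDOUT
 * if nothing arrived in time. */
ssize_t read_with_timeout(int fd, void *buf, size_t len, long timeout_ms)
{
    ssize_t n;
    int rc = wait_readable_retry(fd, timeout_ms);

    if (rc < 0)
        return -1;
    if (rc == 0) {
        errno = ETIMEDOUT;
        return -1;
    }
    do {
        n = read(fd, buf, len);     /* read() can also be interrupted */
    } while (n < 0 && errno == EINTR);
    return n;
}
```

With something along these lines, a 2-second sleeptime alarm firing during
the wait would cause another pass through the loop instead of a failed
load query. Whether a retry loop or blocking SIGALRM around the network
I/O is the better fit depends on how the scheduler relies on the alarm.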