queue-developers Mailing List for GNU Queue (Page 7)
Brought to you by:
wkrebs
From: QingLong <Qin...@Bo...> - 2001-03-09 12:12:53
|
Hello!

I've just committed a couple of patches to the queue-development CVS tree at sourceforge. A few more patches were committed to the repository later.

Considerable changes since the last intermediate version announcement:

--- Blocked a number of signals in queue (queue.c main()); more precisely, installed a signal counter as the handler for these signals (used Queue_io_ignore_unwanted_signals() from qlib.c to do this; see Queue_io_unwanted_sig[] in qlib.c for the list of signals).
--- Added a few extra signals to the list of counted signals in queued: SIGPIPE, SIGCONT, SIGTSTP, SIGTTIN, SIGTTOU, SIGURG, SIGIO (see queued.c main()). Not sure about the last two! (OOB mess?)
--- Added checks in Queue_nonblocking_rw() (qlib.c) around select(), read() and write() to see whether the signal that interrupted us was fatal. SIGPIPE is considered the only fatal traced (counted) signal; all others (SIGCHLD, SIGALRM, etc.) are considered harmless.
--- Added signal-related code in qlib.c to count ``unwanted'' signals, so that on being interrupted by a signal we can find out whether that signal was fatal (e.g. SIGPIPE indicates an attempt to perform operations on a closed connection socket).
--- Added autoconf checks for the ulong and ushort typedefs, in much the same style as GnuPG does, to make "types.h" (imported earlier from gnupg along with the crypto stuff) happy.
--- Added functions (in qlib.c) to ignore signals from a constant list (remembering each signal's sigaction structure when starting to ignore it) and to restore them later. These workaround functions are meant to protect select() from being undesirably interrupted by SIGALRM, SIGURG, SIGPIPE, etc., and to trace (in debug mode) which of these signals actually arrive while being ``ignore''d.
--- Added functions Queue_alarms_stop() and Queue_alarms_goon() in qlib.c: the former calls alarm(0) and remembers the previously scheduled alarm timer value, the latter restores that timer value, taking into account the time elapsed since the alarms were stopped. These workaround functions are meant to protect select() from being undesirably interrupted by SIGALRM.
--- Made the queued verbose flag equal to the debug flag (queued.c main()).
--- Fixed: Queue_user_id -> Queue_daemon_id in queued.c main().

Please try it and send your bug reports, opinions, suggestions, improvements to the mailing list. Thank you.

QingLong.
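The stop/resume pair described in the message above could be sketched roughly as follows. This is only an illustration of the idea, not the code in qlib.c: the names sketch_alarms_stop(), sketch_alarms_goon() and alarm_remaining() are hypothetical, and the choice to re-arm an overdue alarm with one second is an assumption, since the message does not say what happens in that case.

```c
#include <assert.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical state; the real Queue_alarms_stop()/Queue_alarms_goon()
   in qlib.c may keep this differently. */
static unsigned int saved_alarm_left;  /* seconds left when the timer was stopped */
static time_t       saved_alarm_when;  /* wall-clock time of the stop */
static int          alarms_stopped;    /* set between stop and goon */

/* Seconds that should remain on a timer that had `left` seconds to go
   once `elapsed` seconds have passed. Returns 0 when no alarm was
   pending, and at least 1 when one was (assumption: an overdue alarm
   should still fire, as soon as possible). */
unsigned int alarm_remaining(unsigned int left, unsigned int elapsed)
{
    if (left == 0)
        return 0;
    return (elapsed >= left) ? 1 : left - elapsed;
}

/* Cancel the pending alarm and remember how much time it had left. */
void sketch_alarms_stop(void)
{
    saved_alarm_left = alarm(0);   /* alarm(0) cancels and returns seconds left */
    saved_alarm_when = time(NULL);
    alarms_stopped = 1;
}

/* Re-arm the alarm, shortened by the time spent while it was stopped. */
void sketch_alarms_goon(void)
{
    time_t now = time(NULL);
    unsigned int elapsed;
    unsigned int left;

    assert(alarms_stopped);        /* goon only makes sense after stop */
    elapsed = (now > saved_alarm_when) ? (unsigned int)(now - saved_alarm_when) : 0;
    left = alarm_remaining(saved_alarm_left, elapsed);
    if (left > 0)
        alarm(left);
    alarms_stopped = 0;
}
```

A caller would bracket the sensitive select() with sketch_alarms_stop() and sketch_alarms_goon(), so that SIGALRM cannot arrive in between.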
From: W. G. K. <wer...@ya...> - 2001-03-06 22:11:50
|
Perhaps we should change this behavior. (It's actually very old and somewhat historical by now.) It's still a good idea not to start flooding the system with jobs after a crash, but if the queue is empty, jobs should be processed immediately. Jobs sitting in the queue could be started slowly, one after another, until a certain grace period expires.

QingLong wrote:

> On Tue, Mar 06, 2001 at 02:25:40PM +0100, Gert Van den Eynde wrote:
> >
> > I have even a more strange problem... (Yes, that's possible :-) )
> >
> > If I start queued simply with --foreground (so no --debug),
> > the now queue doesn't accept any jobs.
> > I get this as output from queue -i -v -w -p -h fermi -- hostname
> >
> > Requesting load average for queue "now" on host "fermi"...
> > The host "fermi" is not able to serve queue "now".
> > Failed to submit job in queue "now" to host "fermi".
> >
> This is not strange, look in queued main() (queued.c around line 900):
> |
> | /*
> |  * Go to sleep for a while before flooding the system with
> |  * jobs, in case it crashes again right away, or the
> |  * system manager wants to prevent jobs from running.
> |  * Send a SIGALRM to give it a kick-start.
> |  */
> |
> | if (!debug) {
> |     alarm(sleeptime);
> |
> |     /* WGK: Rather than do a sigpause(), here, we do a check_query
> |        here, which will cause us to wake up immediately if someone
> |        submits a new job in the first few minutes. This could cause
> |        the batchd to flood the system with new jobs in the event of an
> |        immediate query, but is unlikely to cause any real problems.*/
> |
> |     check_query();
> |
> |     (void) alarm(0);
> | }
> |
> One has to wait for sleeptime seconds after starting queued in non-debug mode
> until it will begin accepting jobs. Haven't I already pointed this out here?
>
> >
> > If I start queued with --foreground --debug,
> > I get from queue -i -v -w -p -h fermi -- hostname
> >
> > Requesting load average for queue "now" on host "fermi"...
> > Host "fermi" appears to be able to serve queue "now".
> > Ok, connecting to QueueD at it.
> > Trying "fermi"...
> > Going to submit job to queue "now" on host "fermi".
> > queue.c: main(): tty(in/out/err): 1 1 1.
> > fermi
> >
> Isn't this expected output?
> What is the problem here?
>
> QingLong.
>
> _______________________________________________
> Queue-developers mailing list Que...@li...
> To unsubscribe, subscribe, or set options:
> http://lists.sourceforge.net/lists/listinfo/queue-developers
From: QingLong <qin...@Bo...> - 2001-03-06 17:14:32
|
On Tue, Mar 06, 2001 at 02:06:39PM +0100, Gert Van den Eynde wrote:
>
> > Does it change anything? If it does, then the problem matter is alarms
>
> I'm afraid not:
>
> qlib.c Queue_net_connect(): connect()ing to 192.168.1.1:1423 ...
> qlib.c Queue_net_connect(): connect()ed to 192.168.1.1:1423 on socket 7.
> wakeup.c getrldavg(): close(7).
> wakeup.c getrldavg(): fermi returned load 1.15e+00.
> qlib.c Queue_net_connect(): connect()ing to 192.168.1.3:1423 ...
> qlib.c Queue_net_connect(): connect()ed to 192.168.1.3:1423 on socket 7.
> wakeup.c getrldavg(): close(7).
> wakeup.c getrldavg(): bohr returned load 1.08e+00.
> qlib.c Queue_net_connect(): connect()ing to 192.168.1.4:1423 ...
> qlib.c Queue_net_connect(): connect()ed to 192.168.1.4:1423 on socket 7.
> qlib.c Queue_nonblocking_rw(): failed to select() on fd 7:
> select(): Interrupted system call
> qlib.c Queue_net_rw(): failed to get 1 4-byte items on fd 7; got 0 bytes.
>
Maybe, maybe. Nevertheless I added the workaround I mentioned in my previous message, as I think that this is the right thing to do (here on the not-so-right way). Would you be so kind as to give it a try? Maybe that will reveal something interesting that will help me gain insight into the problem. Thank you.

QingLong.
From: QingLong <qin...@Bo...> - 2001-03-06 16:36:32
|
On Tue, Mar 06, 2001 at 02:25:40PM +0100, Gert Van den Eynde wrote:
>
> I have even a more strange problem... (Yes, that's possible :-) )
>
> If I start queued simply with --foreground (so no --debug),
> the now queue doesn't accept any jobs.
> I get this as output from queue -i -v -w -p -h fermi -- hostname
>
> Requesting load average for queue "now" on host "fermi"...
> The host "fermi" is not able to serve queue "now".
> Failed to submit job in queue "now" to host "fermi".
>
This is not strange, look in queued main() (queued.c around line 900):
|
| /*
|  * Go to sleep for a while before flooding the system with
|  * jobs, in case it crashes again right away, or the
|  * system manager wants to prevent jobs from running.
|  * Send a SIGALRM to give it a kick-start.
|  */
|
| if (!debug) {
|     alarm(sleeptime);
|
|     /* WGK: Rather than do a sigpause(), here, we do a check_query
|        here, which will cause us to wake up immediately if someone
|        submits a new job in the first few minutes. This could cause
|        the batchd to flood the system with new jobs in the event of an
|        immediate query, but is unlikely to cause any real problems.*/
|
|     check_query();
|
|     (void) alarm(0);
| }
|
One has to wait for sleeptime seconds after starting queued in non-debug mode until it will begin accepting jobs. Haven't I already pointed this out here?

>
> If I start queued with --foreground --debug,
> I get from queue -i -v -w -p -h fermi -- hostname
>
> Requesting load average for queue "now" on host "fermi"...
> Host "fermi" appears to be able to serve queue "now".
> Ok, connecting to QueueD at it.
> Trying "fermi"...
> Going to submit job to queue "now" on host "fermi".
> queue.c: main(): tty(in/out/err): 1 1 1.
> fermi
>
Isn't this expected output? What is the problem here?

QingLong.
From: Gert V. d. E. <gvd...@sc...> - 2001-03-06 13:45:26
|
Hi QingLong,

I have even a more strange problem... (Yes, that's possible :-) )

If I start queued simply with --foreground (so no --debug), the now queue doesn't accept any jobs. I get this as output from queue -i -v -w -p -h fermi -- hostname

Requesting load average for queue "now" on host "fermi"...
The host "fermi" is not able to serve queue "now".
Failed to submit job in queue "now" to host "fermi".

If I start queued with --foreground --debug, I get from queue -i -v -w -p -h fermi -- hostname

Requesting load average for queue "now" on host "fermi"...
Host "fermi" appears to be able to serve queue "now".
Ok, connecting to QueueD at it.
Trying "fermi"...
Going to submit job to queue "now" on host "fermi".
queue.c: main(): tty(in/out/err): 1 1 1.
fermi

Gert
From: Gert V. d. E. <gvd...@sc...> - 2001-03-06 13:09:19
|
Hi QingLong,

> I suspect I know what's the matter.
> AFAIR, you have a short sleeptime (2 seconds?), do you?

Yes...

> Please perform a small test: try to run it with default value of 120s.

Just did this (started queued with options --foreground --debug)

> Does it change anything? If it does, then the problem matter is alarms

I'm afraid not:

qlib.c Queue_net_connect(): connect()ing to 192.168.1.1:1423 ...
qlib.c Queue_net_connect(): connect()ed to 192.168.1.1:1423 on socket 7.
wakeup.c getrldavg(): close(7).
wakeup.c getrldavg(): fermi returned load 1.15e+00.
qlib.c Queue_net_connect(): connect()ing to 192.168.1.3:1423 ...
qlib.c Queue_net_connect(): connect()ed to 192.168.1.3:1423 on socket 7.
wakeup.c getrldavg(): close(7).
wakeup.c getrldavg(): bohr returned load 1.08e+00.
qlib.c Queue_net_connect(): connect()ing to 192.168.1.4:1423 ...
qlib.c Queue_net_connect(): connect()ed to 192.168.1.4:1423 on socket 7.
qlib.c Queue_nonblocking_rw(): failed to select() on fd 7:
select(): Interrupted system call
qlib.c Queue_net_rw(): failed to get 1 4-byte items on fd 7; got 0 bytes.
wakeup.c getrldavg(): failed to fread() from fd 7.
wakeup.c getrldavg(): close(7).
wakeup.c getrldavg(): ### failed to get load from wigner
### returning 1.00e+08 as rejection designator.

Hope to hear from you soon

Gert
From: QingLong <qin...@Bo...> - 2001-03-06 12:33:19
|
On Mon, Mar 05, 2001 at 09:35:12AM +0100, Gert Van den Eynde wrote:
> On Sat, 3 Mar 2001 05:34:27 +0300, QingLong said:
>>
>> I've made some changes to getrldavg() code that may influence
>> the misbehaviour you have reported recently. Please try updated code.
>
> Updated queue and queued and did the same tests as last week.
> Queue still locks up (or continuously keeps on trying) to get the load
> on the machines.
>
> Queued gives this as 'error' output:
>
> qlib.c Queue_net_connect(): connect()ing to 192.168.1.2:1423 ...
> qlib.c Queue_net_connect(): connect()ed to 192.168.1.2:1423 on socket 7.
> qlib.c Queue_nonblocking_rw(): failed to select() on fd 7:
> select(): Interrupted system call
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> qlib.c Queue_net_rw(): failed to get 1 4-byte items on fd 7; got 0 bytes.
> wakeup.c getrldavg(): failed to fread() from fd 7.
> wakeup.c getrldavg(): close(7).
> wakeup.c getrldavg(): ### failed to get load from dirac
> ### returning 1.00e+08 as rejection designator.
> qlib.c Queue_net_connect(): connect()ing to 192.168.1.3:1423 ...
> qlib.c Queue_net_connect(): connect()ed to 192.168.1.3:1423 on socket 7.
>
I suspect I know what's the matter. AFAIR, you have a short sleeptime (2 seconds?), don't you? Please perform a small test: try to run it with the default value of 120s. Does it change anything? If it does, then the problem is alarms (used to schedule jobs) interrupting select() on the network socket.

I am going to put a workaround for scheduled alarms into the network io code. It will become unnecessary if we get rid of streams on network sockets (and of using alarm() to time out reading/writing those streams) and use select() on the bind()en listen()ed socket (put in non-blocking mode) to multiplex the tasks of scheduling jobs and accepting network connections.

QingLong.
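The failure in the log above ("select(): Interrupted system call") is select() returning -1 with errno set to EINTR because a signal handler ran. A common way to tolerate this, shown here only as a minimal sketch, is to retry the call. The helper name wait_readable() is hypothetical and this is not the actual Queue_nonblocking_rw() from qlib.c; for simplicity the sketch restarts the full timeout on each retry, which a real implementation might want to avoid.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait until fd becomes readable or timeout_sec
   seconds pass. Returns 1 if readable, 0 on timeout, -1 on a real
   error. An interruption by a signal (EINTR) is not treated as an
   error; the select() is simply issued again. */
int wait_readable(int fd, int timeout_sec)
{
    assert(fd >= 0 && fd < FD_SETSIZE);  /* FD_SET is undefined outside this range */

    for (;;) {
        fd_set rfds;
        struct timeval tv;
        int rc;

        /* select() may modify both arguments, so rebuild them each time. */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;

        rc = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (rc > 0)
            return 1;
        if (rc == 0)
            return 0;
        if (errno != EINTR)
            return -1;
        /* EINTR: e.g. SIGALRM arrived; loop and retry with a fresh timeout. */
    }
}
```

With a 2-second job-scheduling alarm and a longer network timeout, a loop like this would ride out the SIGALRM instead of reporting a failed load query.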
From: QingLong <qin...@Bo...> - 2001-03-05 21:43:31
|
On Mon, Mar 05, 2001 at 06:50:46PM +0100, Kai Harrekilde-Petersen wrote:
>
> It looks like that you have added the variable Queue_user_id in
> queued.c,
> but it does not get defined, when compiling in a non-root environment.
> Here's the output from make/gcc:
>
> gcc -DHAVE_CONFIG_H -I. -I. -I. -g -O2 -c queued.c
> gcc -g -O2 -o queued ident.o handle.o lex.o logging.o mrestart.o
> pty.o qlib.o queued.o sha1.o wakeup.o -lfl getloadavg.o -lcrypt -lfl
> -lnsl -lrpcsvc
> queued.o: In function `main':
> /home/khp/projects/queue-cvs/devel/queued.c:598: undefined reference to
> `Queue_user_id'
> collect2: ld returned 1 exit status
> make: *** [queued] Error 1
>
> Queue_user_id is defined as 'extern' in queue.h, and is used on line
> 598.
> However, only queue.c defines Queue_user_id, and it is not linked into
> the queued executable.
>
Should be Queue_daemon_id, rather than Queue_user_id. Sorry. Please try it and tell us if it works.

>
> I'm sorry to bother you with these things, but we run with
> root-squashing on NFS, and I prefer to be able to test it by
> myself before passing it on to the other developers here.
>
That's nice that there is at least someone who runs Queue in non-root setup on a regular basis, as I do so only in test mode rather rarely. Please keep reporting. Thank you.

QingLong.
From: Kai Harrekilde-P. <kh...@ex...> - 2001-03-05 19:17:54
|
I've noticed lately that after restarting queued, I can submit very few jobs before queued claims that someone else is running, or the cookiefile is wrong.

This is the output from a simple test; I deleted the cookiefile, restarted all daemons (on 5 machines, named cluster00-04):

cluster02:~$ queuedir/sbin/queued -D
cluster02:~$ queuedir/bin/queue -w -n -i -- hostname
cluster00
cluster02:~$ queuedir/bin/queue -w -n -i -- hostname
Cookiefile authentication with server failed! Someone else is running Queue on this cluster or the other side has the wrong cookiefile!
cluster03
cluster02:~$ queuedir/bin/queue -w -n -i -- hostname
Cookiefile authentication with server failed! Someone else is running Queue on this cluster or the other side has the wrong cookiefile!
Cookiefile authentication with server failed! Someone else is running Queue on this cluster or the other side has the wrong cookiefile!
cluster04
cluster02:~$ queuedir/bin/queue -w -n -i -- hostname
Cookiefile authentication with server failed! Someone else is running Queue on this cluster or the other side has the wrong cookiefile!
Cookiefile authentication with server failed! Someone else is running Queue on this cluster or the other side has the wrong cookiefile!
Cookiefile authentication with server failed! Someone else is running Queue on this cluster or the other side has the wrong cookiefile!
cluster01
cluster02:~$
cluster02:~$ queuedir/bin/queue -w -n -i -- hostname
Cookiefile authentication with server failed! Someone else is running Queue on this cluster or the other side has the wrong cookiefile!
Cookiefile authentication with server failed! Someone else is running Queue on this cluster or the other side has the wrong cookiefile!
Cookiefile authentication with server failed! Someone else is running Queue on this cluster or the other side has the wrong cookiefile!
Cookiefile authentication with server failed! Someone else is running Queue on this cluster or the other side has the wrong cookiefile!
Cookiefile authentication with server failed! Someone else is running Queue on this cluster or the other side has the wrong cookiefile!
Queue: Failed to submit job to queue "now".
cluster02:~$

As ever, I compile/run with NO_ROOT.

The interesting part here is that each of the "remote" machines (cluster00, cluster03, cluster04, cluster01) are able to execute a command once, while the "local" machine never can/does.

Regards,

Kai
--
Kai Harrekilde-Petersen <kh...@ex...> Exbit Technology A/S
From: Kai Harrekilde-P. <kh...@ex...> - 2001-03-05 17:49:19
|
Hi QingLong,

It looks like that you have added the variable Queue_user_id in queued.c, but it does not get defined, when compiling in a non-root environment. Here's the output from make/gcc:

gcc -DHAVE_CONFIG_H -I. -I. -I. -g -O2 -c queued.c
gcc -g -O2 -o queued ident.o handle.o lex.o logging.o mrestart.o pty.o qlib.o queued.o sha1.o wakeup.o -lfl getloadavg.o -lcrypt -lfl -lnsl -lrpcsvc
queued.o: In function `main':
/home/khp/projects/queue-cvs/devel/queued.c:598: undefined reference to `Queue_user_id'
collect2: ld returned 1 exit status
make: *** [queued] Error 1

Queue_user_id is defined as 'extern' in queue.h, and is used on line 598. However, only queue.c defines Queue_user_id, and it is not linked into the queued executable.

I'm sorry to bother you with these things, but we run with root-squashing on NFS, and I prefer to be able to test it by myself before passing it on to the other developers here.

Regards,

Kai
--
Kai Harrekilde-Petersen <kh...@ex...> Exbit Technology A/S
From: QingLong <Qin...@Bo...> - 2001-03-03 02:15:58
|
Hello!

I've just committed a couple of patches to the queue-development CVS tree at sourceforge.

Considerable changes:

--- Eliminated the stream opened over a network socket in getrldavg() (in wakeup.c).
--- Added (in qlib.c) rather universal read/write functions operating on file descriptors, designed for network socket operations (read/write and netfread/netwfrite counterparts).

The intention is to get rid of running streams over network sockets entirely, as this appears to be causing much headache. Please stress-test this new code, and if it works well, we should probably consider using it all over the Queue system instead of streams over sockets. I guess this potential change should be discussed here for a while, as it would affect almost every part of the system.

Please send your bug reports, opinions, suggestions, improvements to the mailing list. Thank you.

QingLong.
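Descriptor-based replacements for fread()/fwrite() on a socket usually have to loop, because read() and write() may transfer fewer bytes than requested. The following is a minimal sketch of that pattern, assuming blocking descriptors; read_full() and write_full() are hypothetical names and are not the functions actually added to qlib.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read exactly len bytes from fd unless end of
   file or an error occurs first. Returns the number of bytes read
   (less than len only at end of file), or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t len)
{
    size_t got = 0;

    assert(buf != NULL || len == 0);
    while (got < len) {
        ssize_t n = read(fd, (char *)buf + got, len - got);
        if (n > 0)
            got += (size_t)n;
        else if (n == 0)
            break;                 /* end of file: peer closed the connection */
        else if (errno == EINTR)
            continue;              /* interrupted by a signal: retry */
        else
            return -1;
    }
    return (ssize_t)got;
}

/* Hypothetical helper: write exactly len bytes to fd.
   Returns len on success, or -1 on error. */
ssize_t write_full(int fd, const void *buf, size_t len)
{
    size_t put = 0;

    assert(buf != NULL || len == 0);
    while (put < len) {
        ssize_t n = write(fd, (const char *)buf + put, len - put);
        if (n > 0)
            put += (size_t)n;
        else if (n < 0 && errno == EINTR)
            continue;              /* interrupted by a signal: retry */
        else
            return -1;
    }
    return (ssize_t)put;
}
```

Reading a 4-byte load value, as getrldavg() appears to do in the logs on this page, would then be a single read_full(fd, &value, 4) call whose result is compared against 4.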
From: QingLong <qin...@Bo...> - 2001-03-01 08:28:37
|
Hello!

I've just committed a bunch of patches to the queue-development CVS tree at sourceforge.

Notes on the changes made:

--- Created a long-lived file descriptor (debug_fd) and a stream (debug_stream) used for debug output. Having this fd and stream open on the debug log eliminates the need to lift privileges each time one wants to append a record to the debug file. This way of writing the log file is also considered faster.
--- Changed the order and policy of std(in|out|err) redirection from/to user terminal, log file, debug log, /dev/null, etc. to separate messages generated by the user command from Queued's own debug and error output.
--- Changed the conditions under which some signals (listed in ignore[]) are ignored (queued.c main()). Now they are ignored only if running in the background (irrespective of debug mode). This change was made in an attempt to fix the bug recently reported by Gert Van den Eynde <gvd...@sc...> (``Alarm clock'' death).
--- Added _GNU_SOURCE and _XOPEN_SOURCE to config.h to enable those extensions when available.
--- Added a check for the [gs]et*[ug]id() functions to configure.in.
--- Performed some code cleaning: added header file inclusions and function prototypes, removed unused variables, synced printf() formats with argument types, etc.
--- Added return status checking/handling for a few library functions to ensure successful fd dup2()ing, close()ing, etc.
--- Added a few trace messages (activated by --debug or --verbose).
--- Fixed a typo (of my own) in queued.c readpro(): the macro is HAVE_GETRLIMIT rather than HAVE_RLIMIT.
--- Changed the debugdir ownership check: made it work correctly for non-root installation as well. The necessity of this change was pointed out by Kai Harrekilde-Petersen <kh...@ex...>. I hope I have managed to fix the buglet.

I have not yet committed changes to derived stuff like configure and Makefile.in, so one should probably run
    automake --add-missing
    autoheader
    autoconf
prior to ./configure. I hope you already have automake et al. installed on your systems.

As the changes made, namely the std(in/out/err) one, touch critical parts of the code, the new stuff does need steady testing. Please give it a try and send feedback to the mailing list. Thank you.

QingLong.
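The return-status checks around dup2() and close() mentioned in the change notes above might look roughly like the sketch below. The helper redirect_fd() is a hypothetical name chosen for illustration; the message does not show how queued actually performs its std(in|out|err) redirection.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: make descriptor `target` refer to the same open
   file as `src`, then close `src`. Returns 0 on success and -1 if
   either dup2() or close() fails, instead of silently ignoring the
   error. */
int redirect_fd(int src, int target)
{
    assert(target >= 0);

    if (src == target)
        return 0;                  /* nothing to do */
    if (dup2(src, target) < 0)
        return -1;                 /* e.g. EBADF when src is not open */
    if (close(src) < 0)
        return -1;
    return 0;
}
```

A daemon that has opened its log file on some descriptor could call redirect_fd(log_fd, STDERR_FILENO) and refuse to continue if the call returns -1.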
From: Gert V. d. E. <gvd...@sc...> - 2001-02-28 08:06:10
|
Dear QingLong,

> > How do I work around this ?
> >
> It looks like you are using development versions of the tools.
> If so, then you are on your own here, I am not able to help you,
> so you would probably have to try yourself to find out if this is
> autoconf/automake or Queue bug. And if you find that it is Queue
> configure.in that should be fixed, please, teach us how should we
> modify it to meet autoconf requirements. Thank you.

I'm sorry, I'm a coward under time pressure to get a queueing system working on our cluster and a non-expert on autoconf/automake, so I grabbed and installed the versions you mentioned. All compilation went well after that.

I've done some playing around, these are my observations:

- compiled queue and queued with --enable-root (no --enable-manager)

- queued started with --debug and -t 10 on 5 hosts, each maxexec 2
- submitted 12 jobs for the now queue (using -i -w -p)
  - 10 start immediately, as expected
  - 1 is on hold, as expected
  - last one gives me back the shell after some seconds with the announcement 'Alarm clock'

- queued started with --debug --foreground -t 10 on 5 hosts, each maxexec 1
- submitted 7 jobs for the now queue (using -v -i -w -p)
  - 5 start immediately, as expected
  - 2 are on hold
  - I get lots of messages from queued among which there are

file now/CFDIR/cfm701151391 has 0 length: Bad file descriptor
now/CFDIR/cfm701151391 has 0 length.
SENDMAIL: To 'root' from 'root': Subject: queued error on bohr:
file now/CFDIR/cfm701151391 has 0 length: Interrupted system call
Requesting load average for queue "now" on host "pauli"...
queue_persistent_connect(): connect()ing to 192.168.1.5:1423 ...
queue_persistent_connect(): connect()ed to 192.168.1.5:1423 on socket 6.
queue_reliable_fread(): select()ing on socket 6...
queue_reliable_fread(): select(6) timed out.
getrldavg(): failed to fread() from stream opened on socket 6.
getrldavg(): close(6).
getrldavg(): ### failed to get load from pauli
### returning 1.00e+08 as rejection designator.

Are these messages something you would expect from queued or are they indicating that something is wrong?

If you need more specific debugging information (or if you have a test-case I can run on our system), tell me what to look for....

Have a nice day,

Gert
From: Kai Harrekilde-P. <kh...@ex...> - 2001-02-27 20:31:11
|
QingLong,

I still need to apply the following one-liner on Linux 2.4 to avoid having queued to dump core upon submitting jobs.

Index: queued.c
===================================================================
RCS file: /cvsroot/queue/queue/queued.c,v
retrieving revision 1.9
diff -u -r1.9 queued.c
--- queued.c 2000/05/06 23:26:55 1.9
+++ queued.c 2001/02/27 20:29:58
@@ -2887,7 +2887,7 @@
 static struct {
 int r;
 enum keyword kwd;
-} rtab[] = {
+} rtab[RLIM_NLIMITS] = {
 #ifdef RLIMIT_CPU
 RLIMIT_CPU, K_RLIMITCPU,
 #endif

I believe it should be perfectly safe on all platforms.

Regards,

Kai
--
Kai Harrekilde-Petersen <kh...@ex...> Exbit Technology A/S
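The message does not explain why the unsized table makes queued dump core. One possible reading, which is an assumption here and is not confirmed anywhere in the thread, is that some code walks rtab for RLIM_NLIMITS entries while the initializer supplies fewer, so sizing the array keeps such a walk in bounds. A complementary defensive idiom is to take the loop bound from the table itself, as in this sketch; the keyword values and the function rlimit_keyword() are hypothetical stand-ins for whatever queued.c really defines.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical keyword values; the real `enum keyword` is defined in
   the Queue sources and is not shown in the message. */
enum keyword { K_NONE = 0, K_RLIMITCPU, K_RLIMITFSIZE };

/* Table mapping resource limits to keywords, in the style of the
   patched rtab, but left unsized on purpose. */
static const struct {
    int r;
    enum keyword kwd;
} rtab[] = {
#ifdef RLIMIT_CPU
    { RLIMIT_CPU, K_RLIMITCPU },
#endif
#ifdef RLIMIT_FSIZE
    { RLIMIT_FSIZE, K_RLIMITFSIZE },
#endif
};

/* Number of entries actually present in the table. */
#define RTAB_COUNT (sizeof rtab / sizeof rtab[0])

/* Look up the keyword for a resource; the loop never runs past the
   end of the table, however many RLIMIT_* macros the platform has. */
enum keyword rlimit_keyword(int resource)
{
    size_t i;

    for (i = 0; i < RTAB_COUNT; i++)
        if (rtab[i].r == resource)
            return rtab[i].kwd;
    return K_NONE;
}
```

Either approach removes the dependency between the number of initializers and the number of limits the running kernel happens to know about.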
From: QingLong <qin...@Bo...> - 2001-02-27 12:16:50
|
On Tue, Feb 27, 2001 at 09:10:30AM +0100, Gert Van den Eynde wrote:
>
> I get errors when I do autoheader for the new CVS sources.
> After doing autoheader -v -d, I could locate a (possible) bug
> in configure.in.
> The lines that contain the following give an error with autoheader:
>
> AC_MSG_WARN([Define/undefine RXVT_UTMP_FILE in config.h manually])
>
> Is it possible autoheader flips on the word Define ?
>
Maybe. But if this is the case, then this probably is a bug in autoconf, as the mentioned message string is properly [quoted]. I guess it's worth writing a bug report to the autoconf maintainers.

>
> I use the following versions:
>
> autoheader (GNU Autoconf) 2.49c
> Written by Roland McGrath.
>
> automake (GNU automake) 1.4d
>
AFAIK, the latest stable (released) versions of GNU autoconf and automake are 2.13 and 1.4 respectively. I had no problems with these versions. BTW, where have you got _GNU_ autoconf 2.49c? Hasn't it been ftp://alpha.gnu.org/?

>
> How do I work around this ?
>
It looks like you are using development versions of the tools. If so, then you are on your own here, I am not able to help you, so you would probably have to try yourself to find out if this is autoconf/automake or Queue bug. And if you find that it is Queue configure.in that should be fixed, please, teach us how should we modify it to meet autoconf requirements. Thank you.

QingLong.
From: Gert V. d. E. <gvd...@sc...> - 2001-02-27 08:13:05
|
Hi all, Hi QingLong

I get errors when I do autoheader for the new CVS sources. After doing autoheader -v -d, I could locate a (possible) bug in configure.in. The lines that contain the following give an error with autoheader:

AC_MSG_WARN([Define/undefine RXVT_UTMP_FILE in config.h manually])

Is it possible autoheader flips on the word Define ?

I use the following versions:

autoheader (GNU Autoconf) 2.49c
Written by Roland McGrath.

automake (GNU automake) 1.4d

If I eliminate these warnings (I leave AC_MSG_WARN([RXVT]) in configure.in), autoheader doesn't complain, neither does autoconf. However after that, I still have lines containing @AMDEP@ in my Makefile and Gnu Make doesn't like them :-)

How do I work around this ?

Thanks,

Gert

On Tue, 27 Feb 2001 00:56:48 +0300, QingLong said:

> Hello!
>
> There is a new queue-development version at sourceforge CVS.
> ChangeLog excerpt:
> --- Put config.h stuff in usual #ifndef wrapper to allow multiple
>     inclusion of the header file. (see acconfig.h)
> --- Added <search.h> to AC_CHECK_HEADERS list for remque() and insque()
>     prototypes in Linux. Added `#include <search.h>' where necessary.
> --- Created number of header files for holding type definitions,
>     prototypes and macros from/for corresponding C files for clean
>     and clear code structure and files layout. The layout also allows
>     providing finer and more complete source portions dependence
>     information in makefile. Began moving relevant stuff
>     to appropriate locations.
> --- Made queue and queued run under (having effective uid.gid set to)
>     user's real uid.gid and only elevate privileges to root level
>     on demand, e.g. to bind privileged TCP port or (de)allocate pty.
> --- Made queued create more proper/informative utmp entry:
>     write down real username instead of always putting "root".
> --- Added a few extra parameters to functions in wakeup.c to eliminate
>     need in having queue_verbose and queue_debug global variables both
>     in queue and queued (where they are irrelevant).
>     If there is a bunch of common code in client and daemon portions
>     of the system we should probably arrange it as a library.
> --- Did some code cleaning to avoid ambiguities and make gcc more happy
>     (decrease number of warnings).
> --- Added a few library functions return status checking/handling
>     to ensure successful socket creation, port bindings, etc.
>
> I have not yet committed changes to derived stuff like configure
> and Makefile.in, so please issue
>     automake --add-missing
>     autoheader
>     autoconf
> prior to running ./configure. I hope you already have automake and autoconf
> already installed on your systems.
>
> As the changes made (namely the one dealing with privileges)
> touch a few critical parts of code the version do need excessive testing.
> Please give it try and send feedback to the mailing list. Thank you.
>
> QingLong.
From: QingLong <Qin...@us...> - 2001-02-26 21:48:30
|
Hello!

There is a new queue-development version at sourceforge CVS.

ChangeLog excerpt:

--- Put the config.h contents in the usual #ifndef wrapper to allow multiple inclusion of the header file (see acconfig.h).
--- Added <search.h> to the AC_CHECK_HEADERS list for the remque() and insque() prototypes on Linux. Added `#include <search.h>' where necessary.
--- Created a number of header files holding type definitions, prototypes and macros from/for the corresponding C files, for a clean and clear code structure and file layout. The layout also allows providing finer and more complete source dependency information in the makefile. Began moving relevant stuff to appropriate locations.
--- Made queue and queued run under (having effective uid.gid set to) the user's real uid.gid and only elevate privileges to root level on demand, e.g. to bind a privileged TCP port or (de)allocate a pty.
--- Made queued create a more proper/informative utmp entry: write down the real username instead of always putting "root".
--- Added a few extra parameters to functions in wakeup.c to eliminate the need for the queue_verbose and queue_debug global variables in both queue and queued (where they are irrelevant). If there is a bunch of common code in the client and daemon portions of the system, we should probably arrange it as a library.
--- Did some code cleaning to avoid ambiguities and make gcc happier (decrease the number of warnings).
--- Added return status checking/handling for a few library functions to ensure successful socket creation, port bindings, etc.

I have not yet committed changes to derived stuff like configure and Makefile.in, so please issue
    automake --add-missing
    autoheader
    autoconf
prior to running ./configure. I hope you already have automake and autoconf installed on your systems.

As the changes made (namely the one dealing with privileges) touch a few critical parts of the code, this version does need extensive testing. Please give it a try and send feedback to the mailing list. Thank you.

QingLong.
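Running under the user's real uid and elevating only on demand is commonly done with seteuid(), relying on the saved set-user-ID of a setuid-root binary. The sketch below illustrates that general pattern only; priv_drop() and priv_raise() are hypothetical names, group IDs are left out for brevity, and nothing here is taken from the Queue sources.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/* Effective uid the process had before dropping (0 for a setuid-root
   binary); remembered so it can be restored on demand. */
static uid_t privileged_euid;
static int   priv_dropped;

/* Switch the effective uid to the real uid. Call once at startup.
   Returns 0 on success, -1 on failure. */
int priv_drop(void)
{
    privileged_euid = geteuid();
    if (seteuid(getuid()) < 0)
        return -1;
    priv_dropped = 1;
    return 0;
}

/* Temporarily regain the remembered effective uid, e.g. just before
   binding a privileged TCP port. The caller is expected to switch back
   with seteuid(getuid()) as soon as the privileged step is done.
   Returns 0 on success, -1 on failure. */
int priv_raise(void)
{
    assert(priv_dropped);          /* only meaningful after priv_drop() */
    return seteuid(privileged_euid);
}
```

In a program that is not installed setuid, both calls succeed but change nothing, which makes the same code path usable for the non-root setups discussed elsewhere on this page.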
From: W. G. K. <wer...@ya...> - 2001-02-23 14:49:09
|
Can someone help this person? |
From: Gert V. d. E. <gvd...@sc...> - 2001-02-23 07:46:57
|
Hi QingLong,

Thank you for the quick response...

On Thu, 22 Feb 2001 17:18:34 +0300, QingLong said:
[snip]
> Would you be so kind as to provide exact commands you have issued?

root@fermi:~> queued

gvdeynde@fermi:~ > queue -v -i -w -p -h fermi -- hostname
Requesting load average for queue "now" on host "fermi"...
The host "fermi"is not able to serve queue "now".
Failed to submit job in queue "now" to host "fermi".

root@fermi:~> killall queued
root@fermi:~/queue-development > queued --debug --foreground

gvdeynde@fermi:~ > queue -v -i -w -p -h fermi -- hostname
Requesting load average for queue "now" on host "fermi"...
Host "fermi" appears to be able to serve queue "now".
Ok, connecting to QueueD at it.
Trying "fermi"...
Going to submit job to queue "now" on host "fermi".
queue.c: main(): tty(in/out/err): 1 1 1.
queued handle.c handle(): going to try to run "hostname".
queued handle.c handle(): assembled full path: "/bin/hostname".
queued handle.c handle(): going to execve(/bin/hostname).
fermi

and the root window gives this as the output of queued:

SENDMAIL: To 'gvdeynde' from 'queued': Subject: batch queue_b on fermi: now/CFDIR/cfm694234292: Job is starting now.
now/CFDIR/cfm694234292: Job is starting now.

Concerning your remark on sleeptime: in my version of queued.c this is at line 799, I believe:

alarm(sleeptime)

I don't set the sleeptime on the queued command line, so it should be at the default of 120 s. If I set it manually to 2 s, with no debug options, and wait 2 seconds before I run the queue command, it works. If I run it earlier, I get the same error as before.

root@fermi:~ > queued -t 2
[After 2 seconds waiting]
gvdeynde@fermi:~ > queue -v -i -w -p -h fermi -- hostname
Requesting load average for queue "now" on host "fermi"...
Host "fermi" appears to be able to serve queue "now".
Ok, connecting to QueueD at it.
Trying "fermi"...
Going to submit job to queue "now" on host "fermi".
queue.c: main(): tty(in/out/err): 1 1 1.
fermi

So in theory, if I start queued without options, I have to wait 120 seconds before I can submit a job. If I try before that time, I get an error that the queue cannot be served. However, if I set sleeptime manually to 2 seconds, I can start jobs right away (or as good as), but queued wakes up every 2 seconds (I have no idea what the overhead of that is)....

Again, thanks for taking the time to help me out,

Gert |
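The wake-up behaviour described in this message can be illustrated with a short C sketch. It assumes a daemon that arms `alarm(sleeptime)` and then sleeps until SIGALRM arrives, so nothing is served before the first wake-up. The names `sleep_until_alarm` and `on_alarm` are hypothetical and are not taken from queued.c; the real code may differ considerably.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Number of SIGALRM wake-ups seen so far (hypothetical bookkeeping). */
static volatile sig_atomic_t wakeups = 0;

static void on_alarm(int signo)
{
    (void)signo;
    wakeups++;
}

/*
 * Arm a timer for `sleeptime` seconds and sleep until SIGALRM arrives.
 * Returns the total number of wake-ups so far, or -1 on error.
 */
int sleep_until_alarm(unsigned int sleeptime)
{
    struct sigaction sa;
    sigset_t block, old, wait_mask;
    int before;

    assert(sleeptime > 0);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) < 0)
        return -1;

    /* Block SIGALRM so it cannot slip in between alarm() and sigsuspend(). */
    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &block, &old) < 0)
        return -1;

    /* While waiting, use the previous mask with SIGALRM explicitly allowed. */
    wait_mask = old;
    sigdelset(&wait_mask, SIGALRM);

    before = (int)wakeups;
    alarm(sleeptime);
    while ((int)wakeups == before)
        sigsuspend(&wait_mask);   /* atomically unblock and wait */

    sigprocmask(SIG_SETMASK, &old, NULL);
    return (int)wakeups;
}
```

With a 120-second `sleeptime`, a loop built around such a call would not look at new work for up to two minutes after start-up, which matches the symptom reported above. A short `sleeptime` trades that latency for more frequent wake-ups.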
From: W. G. K. <wer...@ya...> - 2001-02-23 01:33:53
|
Mike Castle wrote:
> On Thu, Feb 22, 2001 at 05:18:34PM +0300, QingLong wrote:
> > My fault. I've been believing that everyone building queue-development
> > would run aclocal, automake and autoconf beforehand. I've been only
> > committing changes to Makefile.am, configure.in and acconfig.h,
> > skipping all derived stuff.
>
> If this is the case, you should probably remove configure from the
> repository.

./configure stays in the repository. It makes my life (and, presumably, that of everyone else who uses the CVS repository) much simpler.

I like to try to keep track of everything that actually goes into a finished release. ./configure is part of that process (and most users don't run autoconf), so there needs to be a history of it.

> Either don't put derived files in the repository, or make sure they stay up
> to date.

This is a good rule. It's a good idea to keep ./configure consistent so that testers can pull a copy and get it to work more easily. The easier it is for people to test configurations, the more feedback there is, and the better the final result. This is another reason why I like having configure in there.

But I don't want too much discussion of CVS repository rules on here. I want to make sure developers here focus on what's important:

1. coding
2. documenting changes.

It's very important that developers feel this is a supportive environment where they can be creative. Periodically (for legal as well as other reasons) I'll make sure the repository is consistent.

> mrc
> --
>        Mike Castle       Life is like a clock:  You can work constantly
>   da...@ix...            and be right all the time, or not work at all
> www.netcom.com/~dalgoda/ and be right at least twice a day.  -- mrc
>     We are all of us living in the shadow of Manhattan.  -- Watchmen
>
> _______________________________________________
> Queue-developers mailing list Que...@li...
> To unsubscribe, subscribe, or set options:
> http://lists.sourceforge.net/lists/listinfo/queue-developers |
From: Carlos B. <cb...@fc...> - 2001-02-22 20:21:33
|
Hello!

When I run queued -D, it says:

'rlimitrss' is not available on this system, ignored

It happened with all rlimits. What is the problem?

Kernel 2.2.18, Debian potato, queue 1.30.1. It has the rlimit patch.

--
Carlos Barros. |
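Independently of queue, a small C test can show whether the system headers and kernel expose a given resource limit. The sketch below is only an illustration under stated assumptions: the keyword 'rlimitrss' comes from the message above, the other keywords are guessed by analogy, and `known_rlimits`, `lookup_rlimit` and `query_rlimit` are hypothetical names, not queue's own code.

```c
#include <assert.h>
#include <string.h>
#include <sys/resource.h>

/* Maps a configuration keyword to an RLIMIT_* constant (hypothetical table). */
struct rlimit_name {
    const char *name;
    int resource;
};

static const struct rlimit_name known_rlimits[] = {
    { "rlimitcpu",    RLIMIT_CPU },     /* POSIX, always present */
    { "rlimitdata",   RLIMIT_DATA },
    { "rlimitstack",  RLIMIT_STACK },
    { "rlimitcore",   RLIMIT_CORE },
    { "rlimitnofile", RLIMIT_NOFILE },
#ifdef RLIMIT_RSS                       /* not in POSIX; only if headers define it */
    { "rlimitrss",    RLIMIT_RSS },
#endif
#ifdef RLIMIT_NPROC
    { "rlimitnproc",  RLIMIT_NPROC },
#endif
};

/* Return the RLIMIT_* constant for `name`, or -1 if it was not compiled in. */
int lookup_rlimit(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof known_rlimits / sizeof known_rlimits[0]; i++) {
        if (strcmp(known_rlimits[i].name, name) == 0)
            return known_rlimits[i].resource;
    }
    return -1;
}

/* Fill `out` with the current limit; 0 on success, -1 if unknown or refused. */
int query_rlimit(const char *name, struct rlimit *out)
{
    int resource = lookup_rlimit(name);

    if (resource < 0)
        return -1;
    return getrlimit(resource, out);
}
```

If `query_rlimit("rlimitrss", &rl)` succeeds in such a test program, the limit exists on the system, and a "not available" message is more likely to come from how the package was configured or built than from the kernel. That conclusion is an inference and is not confirmed in this thread.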
From: Mike C. <da...@ix...> - 2001-02-22 19:35:53
|
On Thu, Feb 22, 2001 at 05:18:34PM +0300, QingLong wrote:
> My fault. I've been believing that everyone building queue-development
> would run aclocal, automake and autoconf beforehand. I've been only
> committing changes to Makefile.am, configure.in and acconfig.h,
> skipping all derived stuff.

If this is the case, you should probably remove configure from the repository.

Either don't put derived files in the repository, or make sure they stay up to date.

mrc
--
       Mike Castle       Life is like a clock:  You can work constantly
  da...@ix...            and be right all the time, or not work at all
www.netcom.com/~dalgoda/ and be right at least twice a day.  -- mrc
    We are all of us living in the shadow of Manhattan.  -- Watchmen |
From: QingLong <Qin...@Bo...> - 2001-02-22 16:43:52
|
Hello!

There is a new intermediate queue-development version in the sourceforge CVS. This is from the ChangeLog:

--- Changed the way queued opens the debug log file to harden it against various race-condition and symbolic-link attacks in world-writable directories (queued.c main()).
--- Moved the debug file from /tmp/ to /tmp/queue/.
--- Added header files to the queued_SOURCES and queue_SOURCES lists in Makefile.am to provide more dependency information to make.

Hope you'll find it good.

BR, QingLong. |
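The kind of hardening mentioned in this ChangeLog can be sketched in C as follows. This is one common pattern, shown under assumptions: the message does not say how queued.c actually does it, and `open_debug_log` is a hypothetical helper. The idea is to use a private subdirectory, verify it with lstat(), and create the file with O_EXCL and O_NOFOLLOW so that a pre-planted file or symbolic link is refused.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create `name` inside the private directory `dir` and return a write-only
 * file descriptor, or -1 with errno set. Refuses to reuse an existing file.
 */
int open_debug_log(const char *dir, const char *name)
{
    struct stat st;
    char path[4096];
    int n;

    assert(dir != NULL && name != NULL);

    /* Create the directory; an existing one must pass the checks below. */
    if (mkdir(dir, 0700) < 0 && errno != EEXIST)
        return -1;

    /* lstat() does not follow symbolic links, so a planted link is rejected. */
    if (lstat(dir, &st) < 0)
        return -1;
    if (!S_ISDIR(st.st_mode) || st.st_uid != geteuid() ||
        (st.st_mode & (S_IWGRP | S_IWOTH)) != 0) {
        errno = EPERM;
        return -1;
    }

    n = snprintf(path, sizeof path, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof path) {
        errno = ENAMETOOLONG;
        return -1;
    }

    /* O_EXCL fails if the file already exists; O_NOFOLLOW refuses a symlink. */
    return open(path, O_WRONLY | O_CREAT | O_EXCL | O_NOFOLLOW, 0600);
}
```

A caller might use `open_debug_log("/tmp/queue", "debug.log")`, where the file name is an assumption. Because the directory is owned by the caller and not writable by others, another user cannot create or replace entries inside it, which is the point of moving the file out of /tmp/ itself.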
From: QingLong <qin...@Bo...> - 2001-02-22 14:58:30
|
On Thu, Feb 22, 2001 at 12:14:41PM +0100, Gert Van den Eynde wrote:
>
> and using the verbose option from queue gives me this:
>
> ---
> Requesting load average for queue "now" on host "fermi"...
> The host "fermi"is not able to serve queue "now".
> Failed to submit job in queue "now" to host "fermi".
> ---
>
> However, if I start queued in debug mode (queued --debug), I get this from queue --verbose ....
>
> ---
> Requesting load average for queue "now" on host "fermi"...
> Host "fermi" appears to be able to serve queue "now".
> Ok, connecting to QueueD at it.
> Trying "fermi"...
> Going to submit job to queue "now" on host "fermi".
> queue.c: main(): tty(in/out/err): 1 1 1.
> queued handle.c handle(): going to try to run "hostname".
> queued handle.c handle(): assembled full path: "/bin/hostname".
> queued handle.c handle(): going to execve(/bin/hostname).
> fermi
> ---
>

I believe I know what's happening. IMHO, the heart of the problem is the ``having a sleep() before going to job'' code in queued.c (approx. line 890), which is skipped in debug mode. It looks like that code isn't fully worked out and isn't working as it ought to. I have already pointed this problem out here.

The only known workaround for now is to wait for a while (to be precise: sleeptime), after which queued will wake up and start working.

Please tell me if my guess is correct.

QingLong. |
From: QingLong <qin...@Bo...> - 2001-02-22 14:10:38
|
Hello!

I am only working on the queue-development branch, so I'll only talk about it.

On Thu, Feb 22, 2001 at 12:14:41PM +0100, Gert Van den Eynde wrote:
>
> * I also tried to get the latest CVS going (queue-development).
> Something strange is going on during configuration and compilation.
> After the usual ./configure --enable-root and make, all seems to be well.
> When I do make install, it starts reconfiguring (and effectively changing
> config.h), recompiling and then the compilation breaks due to a missing
> cleanutent. It seems that during the reconfigure the support for rxvt utmp
> was added, but the file logging.c is not in the sources list in the makefile.
>

My fault. I've been believing that everyone building queue-development would run aclocal, automake and autoconf beforehand. I've been only committing changes to Makefile.am, configure.in and acconfig.h, skipping all derived stuff.

>
> * I managed to fix the above to have it compile, no probs.
> When I start this CVS queued without debugging option
> (on one host, just for testing, the other nodes are down)
> and I submit jobs to the now queue (the classical hostname job
> from the manual), I get emails like this:
[...]
> and using the verbose option from queue gives me this:
[...]
> However, if I start queued in debug mode (queued --debug), I get this from queue --verbose ....
>

Would you be so kind as to provide exact commands you have issued?

>
> - Is the developers version reliable (I know it's a developers version,
> but I am aware of projects where it is best to stick to the developers
> version than to the stable releases) ?
>

Well, all the changes I've made were meant to get it to work more reliably. I have managed to make it work rather reliably for me, although, of course, I haven't tried out all possible combinations of command line options, environment, etc.

Please help me to get queue to reproduce the faulty behaviour you were talking about.

Thank you.

QingLong. |