From: SourceForge.net <no...@so...> - 2005-03-14 23:44:46
Feature Requests item #1156875, was opened at 2005-03-04 12:03
Message generated for change (Comment added) made by sdeasey
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1156875&group_id=130646

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Zoran Vasiljevic (vasiljevic)
Assigned to: Zoran Vasiljevic (vasiljevic)
Summary: Watchdog process restarts failed server

Initial Comment:
We have been using this for quite some time and it has proved extremely useful. We double-fork the nsd process and make the first forked instance control the second. The first one (the watchdog) reacts to exit codes and signals caught during the watch and restarts the second instance (the worker) accordingly.

Also, we have added the "-restart" option to the "ns_shutdown" command. This just sends SIGINT to the worker process. The watchdog handles this signal and respawns the worker automatically.

During operation, the watchdog logs events and their cause to the system log file. This looks like:

Feb 28 04:00:05 Develop nsd[19400]: worker: started.
Mar 1 04:00:13 Develop nsd[4475]: watchdog: worker 19400 exited (2).
Mar 1 04:00:15 Develop nsd[21290]: worker: started.
Mar 1 04:00:18 Develop nsd[14705]: watchdog: worker 19399 exited (2).
Mar 1 04:00:20 Develop nsd[21300]: worker: started.

We have done all the changes with "--enable-watchdog", so anybody who needs this feature will have to compile with this option.

----------------------------------------------------------------------

>Comment By: Stephen Deasey (sdeasey)
Date: 2005-03-14 16:44

Message:
Logged In: YES
user_id=87254

Oops, looks like I forgot to attach the patch... Here it is.

Yes, the server will attempt to restart a number of times before giving up. I think this is as appropriate for config errors as it is for others, so there doesn't need to be a dual-standard fatal error.
----------------------------------------------------------------------

Comment By: Zoran Vasiljevic (vasiljevic)
Date: 2005-03-14 11:12

Message:
Logged In: YES
user_id=95086

I'd rather wait for Stephen's code and see what's done there, and then make final changes and commit. It should not take long.

----------------------------------------------------------------------

Comment By: Vlad Seryakov (seryakov)
Date: 2005-03-14 11:07

Message:
Logged In: YES
user_id=184124

Then I do not see any reason not to commit it. We can have discussions about it for another couple of years, but until we have it and use it, it is just discussion. We already agreed on having it in the core.

----------------------------------------------------------------------

Comment By: Zoran Vasiljevic (vasiljevic)
Date: 2005-03-14 11:00

Message:
Logged In: YES
user_id=95086

Stephen,
Would you mind uploading your patch changes so I can peek into them? I think we should soon settle on some solution and move on.

Vlad,
The patch already does that. After 16 unsuccessful restarts the watchdog exits. Between restarts we wait 1, 2, 4, 8, 16, 32, ... 16384 seconds and then exit. This would be 32767 seconds, or about 9 hours.

----------------------------------------------------------------------

Comment By: Vlad Seryakov (seryakov)
Date: 2005-03-14 07:53

Message:
Logged In: YES
user_id=184124

I would say: repeat a configured number of times and then exit. This way all socket-related problems will be cleared after a couple of restarts, and config problems will make the server exit.

----------------------------------------------------------------------

Comment By: Zoran Vasiljevic (vasiljevic)
Date: 2005-03-14 07:41

Message:
Logged In: YES
user_id=95086

Well, because I could not differentiate between config errors and later runtime errors, I thought it wiser to abort rather than to repeatedly restart a broken server. One possibility is to add a different Ns_Fatal call like:

    Ns_FatalEx(int exitcode, char *fmt, ...);

Or?
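
The restart schedule quoted in the 11:00 comment above (waits of 1, 2, 4, ... 16384 seconds, 32767 seconds in total) can be checked with a short sketch. The helper name total_backoff_seconds is hypothetical and does not come from the patch, which is not shown in this thread; the sketch only assumes that the delay starts at one second and doubles before each further attempt.

```c
#include <assert.h>

/*
 * Hypothetical helper, not taken from the watchdog patch.
 * Returns the total time spent waiting when the delay starts at
 * 1 second and doubles after every wait: 1 + 2 + 4 + ... (nwaits terms).
 */
long total_backoff_seconds(int nwaits)
{
    long delay = 1;
    long total = 0;
    int  i;

    /* Keep the doubling well inside the range of a 32-bit long. */
    assert(nwaits >= 0 && nwaits < 31);

    for (i = 0; i < nwaits; i++) {
        total += delay;
        delay *= 2;
    }
    return total;
}
```

The series 1 ... 16384 has fifteen terms, so total_backoff_seconds(15) yields the 32767 seconds mentioned above, which is roughly 9.1 hours. Fifteen waits would fit one wait between each of sixteen attempts, if that is how the patch counts them.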
----------------------------------------------------------------------

Comment By: Stephen Deasey (sdeasey)
Date: 2005-03-14 07:12

Message:
Logged In: YES
user_id=87254

Are you sure Ns_Fatal should not restart the server?

Many of the fatal errors are caused by bad configuration, and I can see why you might want the server to exit immediately. But there are also runtime fatal errors, and here I'd expect the server to be restarted.

Some config errors are due to external factors like missing directories or file system permissions, and it would be nice for the server to come back up as soon as the issue resolves itself.

----------------------------------------------------------------------

Comment By: Zoran Vasiljevic (vasiljevic)
Date: 2005-03-14 02:52

Message:
Logged In: YES
user_id=95086

Calling the watchdog before or after chroot has no real implications, I believe. Also, when the server exits with Ns_Fatal, it should not be restarted. I will look into your changes today.

----------------------------------------------------------------------

Comment By: Stephen Deasey (sdeasey)
Date: 2005-03-10 23:27

Message:
Logged In: YES
user_id=87254

Oh yeah, and I moved the watchdog stuff even earlier, before the prebind and chroot. Hmm, now that I think about it, it's only the prebind that really needs to happen after the watchdog is started, to ensure the sockets are always in a sane state...

----------------------------------------------------------------------

Comment By: Stephen Deasey (sdeasey)
Date: 2005-03-10 23:17

Message:
Logged In: YES
user_id=87254

Am I reading this right? It looks like if the server calls Ns_Fatal() it will exit with code 1, but 0 and 1 are treated as OK exit codes and the server will not be restarted.

I've attached a patch which changes the above, fixes the pid problem in a different way (because I completely forgot you had already posted below about that small glitch), uses Ns_ParseObjv(), adds the -w switch, and makes a couple of small name changes.

I don't want to go overboard with command line switches, but an option to never give up trying to restart the server would be nice. If you have that, you also need to turn off the restart timeout doubling at a certain point. I don't know what the cleanest way to do that is.

----------------------------------------------------------------------

Comment By: Zoran Vasiljevic (vasiljevic)
Date: 2005-03-08 13:09

Message:
Logged In: YES
user_id=95086

This is correct. As soon as we agree on some implementation, I will put in the -i processing and avoid starting the watchdog for inittab starts.

----------------------------------------------------------------------

Comment By: Vlad Seryakov (seryakov)
Date: 2005-03-08 12:48

Message:
Logged In: YES
user_id=184124

There should still be a possibility to run nsd from inittab, so when the -i switch is given, no watchdog should be running; let /sbin/init handle restarts.

----------------------------------------------------------------------

Comment By: Zoran Vasiljevic (vasiljevic)
Date: 2005-03-08 09:19

Message:
Logged In: YES
user_id=95086

Small glitch: after applying the patch, change nsmain.c from:

    nsconf.pid = serverPid;

to

    nsconf.pid = getpid();

otherwise the pid file will contain a bogus server pid if the watchdog restarted the server later.

----------------------------------------------------------------------

Comment By: Zoran Vasiljevic (vasiljevic)
Date: 2005-03-08 08:26

Message:
Logged In: YES
user_id=95086

Another try...

Important to note: SIGKILL (signal 9) cannot be handled, hence if somebody kills the watchdog with SIGKILL, the server will be left lingering without the watchdog. This is important to know. I do not see any way to recover in such cases (i.e. to stop the server).

Apart from that, all objections from Stephen are taken into account. Please try again. A new copy of the watchdog.patch file is attached.
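
Stephen's remark above, that a never-give-up mode needs the timeout doubling to stop at some point, could be handled with a simple cap. The sketch below is one possible shape, not the patch's approach: the function name next_retry_seconds and the constant RETRY_SECONDS_CAP are hypothetical, and the cap value of 16384 is only borrowed from the longest wait quoted elsewhere in this thread.

```c
#include <assert.h>

/* Assumed cap in seconds; the real limit used by the patch is not shown. */
#define RETRY_SECONDS_CAP 16384

/*
 * Hypothetical helper: doubles the current retry delay until it
 * reaches RETRY_SECONDS_CAP, then keeps returning the cap so the
 * watchdog can go on retrying at a fixed interval.
 */
int next_retry_seconds(int current)
{
    assert(current > 0);

    /* Compare against half the cap so the doubling cannot overshoot. */
    if (current >= RETRY_SECONDS_CAP / 2) {
        return RETRY_SECONDS_CAP;
    }
    return current * 2;
}
```

With this shape the give-up decision and the delay calculation stay separate: a watchdog that is allowed to give up counts attempts, while one that is not simply keeps calling next_retry_seconds().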
----------------------------------------------------------------------

Comment By: Vlad Seryakov (seryakov)
Date: 2005-03-07 22:20

Message:
Logged In: YES
user_id=184124

I just found this code in my old server sources. I just changed the internal names to Ns/NS. It used to run pretty stably.

-------------------------------------------------------------------

    #define NS_EXIT 99

    static void
    NsTerminate(int sig)
    {
        printf("NS[%d] signal %d received...",getpid(),sig);
        // Kill the server with the same signal
        if(nsPid > 0) kill(nsPid,sig);
        // Exit in case of fatal signal
        if(sig != SIGHUP) {
            printf("NS[%d] terminating...",getpid());
            exit(0);
        }
        // Reassign signal handler
        signal(sig,NsTerminate);
    }

    void
    NsWatchdog(int argc, char *argv[], char *envp[])
    {
        int failcount = 0;
        time_t start;
        int status;
        pid_t pid;

        signal(SIGTTOU,SIG_IGN);
        signal(SIGTTIN,SIG_IGN);
        signal(SIGTSTP,SIG_IGN);
        signal(SIGPIPE,SIG_IGN);
        signal(SIGQUIT,SIG_IGN);
        signal(SIGHUP,NsTerminate);
        signal(SIGINT,NsTerminate);
        signal(SIGTERM,NsTerminate);

        // Go background
        if((pid = fork())) {
            if(pid < 0) err_logger("warpConfigure: fork: %s",strerror(errno));
            exit(0);
        }
        setsid();

        for(;;) {
            // Execute the real server
            nsPid = fork();
            // Child, continue as server
            if(nsPid == 0) {
                exit(nsMain(argc, argv, ServerInit));
            }
            /* parent, behaves like a guardian */
            time(&start);
            printf("NS[%d] server process started",getpid());
            pid = waitpid(-1, &status, 0);
            if(WIFEXITED(status))
                printf("NS[%d] child process exited with status %d",pid,WEXITSTATUS(status));
            else if(WIFSIGNALED(status))
                printf("NS[%d] child process exited due to signal %d",pid,WTERMSIG(status));
            else
                printf("NS[%d] child process exited", pid);
            // Special exit code
            if(WIFEXITED(status)) {
                if(WEXITSTATUS(status) == NS_EXIT) {
                    printf("NS[%d] child configuration error, exiting",getpid());
                    exit(0);
                } else if(WEXITSTATUS(status) == SIGHUP) {
                } else {
                    if(time(0) - start < 10) failcount++;
                    if(failcount == 10) {
                        printf("Exiting due to repeated, frequent failures");
                        exit(1);
                    }
                }
            }
            sleep(3);
        }
    }

----------------------------------------------------------------------

Comment By: Stephen Deasey (sdeasey)
Date: 2005-03-07 21:23

Message:
Logged In: YES
user_id=87254

NsWatchdog() is called after the server drops root privs, so both the watchdog and the server run as the defined user. What happens if the server dies and on restart needs to rebind privileged ports?

A ps listing shows two running processes, parent and child. If I kill either one, the watchdog dies, the server continues to run. If I kill -9 the parent, the child continues to run. If I kill -9 the child, the server is restarted. Something seems not quite right here...

I'm a bit confused about how the code works. For example, NsWatchdog() seems to ignore all of its arguments? Here's the code which calls it:

    if (mode == 0) {
        i = ns_fork();
        if (i < 0) {
            Ns_Fatal("nsmain: fork() failed: '%s'", strerror(errno));
        }
        if (i > 0) {
            return 0;
        }
        setsid();
        i = NsWatchdog(argc, argv, initProc);
        if (i < 0) {
            Ns_Fatal("nsmain: watchdog failed: '%s'", strerror(errno));
        }
        if (i > 0) {
            return 0;
        }
        nsconf.pid = getpid();
    }

NsWatchdog() says it returns the worker pid; it also sets the global variable wpid. The code above ignores the returned value and the global variable, and instead calls getpid()... The variable 'i' is existing code, but still somehow doesn't seem suitable... Could that Ns_Fatal() above be moved into the NsWatchdog() function?

I think maybe some comment is needed here. The code structure is very like that just above, where ns_fork() is called, but this function will return *multiple* times, right? This is kind of surprising, and an extra twist on the already confusing fork() semantics (or maybe it's just me who gets confused by fork...). A return value of 0 here is a 'request for orderly shutdown', right?

How about some more logging to syslog? For example, distinguish between start and restart. Mention when the MAX_SLEEP_PERIOD has been reached, etc.

Couple of small things:

Can we refer to 'the server', rather than 'the worker'? Worker and Watchdog begin with 'w', and so does the global variable wpid... Maybe serverPid?

NsWatchdog is a static function; it doesn't need the Ns prefix.

In NsWatchdog(), the variable 'run' should be something like 'nretries', 'nap' should be something more like 'retrySeconds', and MAX_SLEEP_PERIOD should be MAX_RETRY_SECONDS.

The comment for WaitForWorker() is mislabeled.

Should SigHandler() be called WatchdogSigHandler()?

----------------------------------------------------------------------

Comment By: Vlad Seryakov (seryakov)
Date: 2005-03-06 16:03

Message:
Logged In: YES
user_id=184124

Looks good to me

----------------------------------------------------------------------

Comment By: Zoran Vasiljevic (vasiljevic)
Date: 2005-03-05 05:06

Message:
Logged In: YES
user_id=95086

Ah, correction:

The restart option sends SIGINT to the worker process, which causes the watchdog to restart it.

And, the patchfile is now attached!

----------------------------------------------------------------------

Comment By: Zoran Vasiljevic (vasiljevic)
Date: 2005-03-05 05:03

Message:
Logged In: YES
user_id=95086

Here is the patch. I have added a "-restart" option to "ns_shutdown". It is rather clumsy to parse but should do. We should rewrite this with your args parsing routine. The restart option sends SIGTERM to the worker process, which causes the watchdog to restart it.

----------------------------------------------------------------------

Comment By: Stephen Deasey (sdeasey)
Date: 2005-03-04 16:46

Message:
Logged In: YES
user_id=87254

I don't think this has to hide behind a config option. It's either a good idea or it's not. Sounds good to me.

Is there a patch? I'm wondering about some of the implementation details...

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1156875&group_id=130646
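
The control flow discussed throughout this item (fork a server process, wait for it, and decide from its exit status whether to restart) can be reduced to a small sketch. It is not the attached watchdog.patch, which does not appear in this thread. The names supervise, server_proc, demo_clean_exit and demo_failure are hypothetical, signal forwarding and syslog output are left out, and the policy is an assumption: only exit status 0 counts as an orderly shutdown, while every other exit or a death by signal triggers a restart. Which exit codes should suppress a restart is exactly what the comments above debate.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef int (*server_proc)(void);

/*
 * Hypothetical watchdog loop. Runs `server` in a child process and
 * restarts it whenever it fails. Returns 0 once the child exits with
 * status 0 (assumed here to mean an orderly shutdown), or -1 after
 * max_attempts failed runs or if fork()/waitpid() itself fails.
 */
int supervise(server_proc server, int max_attempts, unsigned retry_seconds)
{
    int nretries;

    assert(server != NULL && max_attempts > 0);

    for (nretries = 0; nretries < max_attempts; nretries++) {
        int   status;
        pid_t pid = fork();

        if (pid < 0) {
            return -1;
        }
        if (pid == 0) {
            /* Child: become the server; _exit avoids flushing
             * stdio buffers inherited from the parent twice. */
            _exit(server());
        }
        /* Parent: act as the guardian and wait for this child. */
        if (waitpid(pid, &status, 0) < 0) {
            return -1;
        }
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
            return 0;               /* clean exit: do not restart */
        }
        sleep(retry_seconds);       /* failure or signal: wait, retry */
    }
    return -1;                      /* gave up after max_attempts runs */
}

/* Stand-in "servers" for trying the loop out. */
int demo_clean_exit(void) { return 0; }
int demo_failure(void)    { return 2; }
```

Calling supervise(demo_clean_exit, 3, 0) returns 0 after a single run, while supervise(demo_failure, 3, 0) runs the child three times and then returns -1. A fixed retry_seconds is used here for brevity; a real watchdog would more likely grow the delay between attempts.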
From: SourceForge.net <no...@so...> - 2005-03-14 18:12:07
Feature Requests item #1156875, was opened at 2005-03-04 20:03
Message generated for change (Comment added) made by vasiljevic
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1156875&group_id=130646

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Zoran Vasiljevic (vasiljevic)
Assigned to: Zoran Vasiljevic (vasiljevic)
Summary: Watchdog process restarts failed server

Initial Comment:
We have been using this for quite some time and it has proved extremely useful. We double-fork the nsd process and make the first forked instance control the second. The first one (the watchdog) reacts to exit codes and signals caught during the watch and restarts the second instance (the worker) accordingly.

Also, we have added the "-restart" option to the "ns_shutdown" command. This just sends SIGINT to the worker process. The watchdog handles this signal and respawns the worker automatically.

During operation, the watchdog logs events and their cause to the system log file. This looks like:

Feb 28 04:00:05 Develop nsd[19400]: worker: started.
Mar 1 04:00:13 Develop nsd[4475]: watchdog: worker 19400 exited (2).
Mar 1 04:00:15 Develop nsd[21290]: worker: started.
Mar 1 04:00:18 Develop nsd[14705]: watchdog: worker 19399 exited (2).
Mar 1 04:00:20 Develop nsd[21300]: worker: started.

We have done all the changes with "--enable-watchdog", so anybody who needs this feature will have to compile with this option.

----------------------------------------------------------------------

>Comment By: Zoran Vasiljevic (vasiljevic)
Date: 2005-03-14 19:12

Message:
Logged In: YES
user_id=95086

I'd rather wait for Stephen's code and see what's done there, and then make final changes and commit. It should not take long.
From: SourceForge.net <no...@so...> - 2005-03-14 18:07:23
Feature Requests item #1156875, was opened at 2005-03-04 19:03
Message generated for change (Comment added) made by seryakov
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1156875&group_id=130646

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Zoran Vasiljevic (vasiljevic)
Assigned to: Zoran Vasiljevic (vasiljevic)
Summary: Watchdog process restarts failed server

Initial Comment:
We have been using this for quite some time and it has proved extremely useful. We double-fork the nsd process and make the first forked instance control the second. The first one (the watchdog) reacts to exit codes and signals caught during the watch and restarts the second instance (the worker) accordingly.

Also, we have added the "-restart" option to the "ns_shutdown" command. This just sends SIGINT to the worker process. The watchdog handles this signal and respawns the worker automatically.

During operation, the watchdog logs events and their cause to the system log file. This looks like:

Feb 28 04:00:05 Develop nsd[19400]: worker: started.
Mar 1 04:00:13 Develop nsd[4475]: watchdog: worker 19400 exited (2).
Mar 1 04:00:15 Develop nsd[21290]: worker: started.
Mar 1 04:00:18 Develop nsd[14705]: watchdog: worker 19399 exited (2).
Mar 1 04:00:20 Develop nsd[21300]: worker: started.

We have done all the changes with "--enable-watchdog", so anybody who needs this feature will have to compile with this option.

----------------------------------------------------------------------

>Comment By: Vlad Seryakov (seryakov)
Date: 2005-03-14 18:07

Message:
Logged In: YES
user_id=184124

Then I do not see any reason not to commit it. We can have discussions about it for another couple of years, but until we have it and use it, it is just discussion. We already agreed on having it in the core.
---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-14 18:00 Message: Logged In: YES user_id=95086 Stephen, You mind uploading your patch changes so I can peek into it? I think we should soon settle on some solution and move on. Vlad, The patch already does that. After 16 unsuccessfull restarts the watchdog exits. Between every restart we wait 1, 2, 4, 8, 16, 32, .... 16384 seconds and then exit. This would be: 32767 seconds or about 9 hours. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-14 14:53 Message: Logged In: YES user_id=184124 I would say, repeat configured number of time and then exit. This way all socket related problems will be cleared after couple of restartes, config problems will make server exit . ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-14 14:41 Message: Logged In: YES user_id=95086 Well, because I could not differentiate between config and later runtime errors, I thought its wiser to abort rather to repeatedly restart broken server. One possibility is to add different Ns_Fatal call like: Ns_FatalEx(int exitcode, char *fmt, ...); Or? ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-14 14:12 Message: Logged In: YES user_id=87254 Are you sure Ns_Fatal should not restart the server? Many of the fatal errors are caused by bad configuration and I can see why you might want the server to exit immediately. But there are also runtime fatal errors, and here I'd expect the server to be restarted. Some config errors are due to external factors like missing directories or file system permissions and it would be nice for the server to come back up as soon as the issue resolved itself. 
---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-14 09:52 Message: Logged In: YES user_id=95086 Calling watchdog before or after chroot has no real implications I believe. Also, when the server exits with Ns_Fatal, it hsould not be restarted, I will look into your changes today. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-11 06:27 Message: Logged In: YES user_id=87254 Oh yeah, and I moved the watchdog stuff eaven earlier, before the prebind and chroot. Hmm, now that I think about it it's only the prebind that really needs to happen after the watchdog is started, to ensure the sockets are always in a sane state... ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-11 06:17 Message: Logged In: YES user_id=87254 Am I reading this right? It looks like if the server calls Ns_Fatal() it will exit with code 1, but 0 and 1 are treated as OK exit codes and the server will not be restarted. I've attached a patch which changes the above, fixes the pid problem in a different way because I completely forgot you already posted below about that small glitch, use Ns_ParseObjv(), adds the -w switch, and a couple of small name changes. I don't want to go overboard with command line switches, but an option to specify not to give up trying to restart the server would be nice. If you have that, you also need to turn off the restart timeout doubling at a certain point. I don't know what the cleanest way to do that is. ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-08 20:09 Message: Logged In: YES user_id=95086 This is correct. As soon as we agree on some implementation, I will put the -i processing and avoid starting watchdog for inittab starts. 
----------------------------------------------------------------------

Comment By: Vlad Seryakov (seryakov)
Date: 2005-03-08 19:48

Message:
Logged In: YES
user_id=184124

It should still be possible to run nsd from inittab, so when the
-i switch is given, no watchdog should be running; let /sbin/init
handle restarts.

----------------------------------------------------------------------

Comment By: Zoran Vasiljevic (vasiljevic)
Date: 2005-03-08 16:19

Message:
Logged In: YES
user_id=95086

Small glitch:

After applying the patch, change nsmain.c from:

    nsconf.pid = serverPid;

to

    nsconf.pid = getpid();

otherwise the pid file will contain a bogus server pid if the
watchdog restarted the server later.

----------------------------------------------------------------------

Comment By: Zoran Vasiljevic (vasiljevic)
Date: 2005-03-08 15:26

Message:
Logged In: YES
user_id=95086

Another try...

Important to note: SIGKILL (signal 9) cannot be handled, hence if
somebody kills the watchdog with SIGKILL, the server will be left
lingering without the watchdog. This is important to know. I do
not see any way to recover in such cases (i.e. how to stop the
server).

Apart from that, all objections from Stephen are taken into
account. Please try again. A new copy of the watchdog.patch file
is attached.

----------------------------------------------------------------------

Comment By: Vlad Seryakov (seryakov)
Date: 2005-03-08 05:20

Message:
Logged In: YES
user_id=184124

I just found this code in my old server sources. I just changed
the internal names to Ns/NS; it used to run pretty stably.
-------------------------------------------------------------------

    #define NS_EXIT 99

    static void NsTerminate(int sig)
    {
        printf("NS[%d] signal %d received...",getpid(),sig);
        // Kill the server with the same signal
        if(nsPid > 0) kill(nsPid,sig);
        // Exit in case of fatal signal
        if(sig != SIGHUP) {
            printf("NS[%d] terminating...",getpid());
            exit(0);
        }
        // Reassign signal handler
        signal(sig,NsTerminate);
    }

    void NsWatchdog(int argc, char *argv[], char *envp[])
    {
        int failcount = 0;
        time_t start;
        int status;
        pid_t pid;

        signal(SIGTTOU,SIG_IGN);
        signal(SIGTTIN,SIG_IGN);
        signal(SIGTSTP,SIG_IGN);
        signal(SIGPIPE,SIG_IGN);
        signal(SIGQUIT,SIG_IGN);
        signal(SIGHUP,NsTerminate);
        signal(SIGINT,NsTerminate);
        signal(SIGTERM,NsTerminate);

        // Go background
        if((pid = fork())) {
            if(pid < 0) err_logger("warpConfigure: fork: %s",strerror(errno));
            exit(0);
        }
        setsid();

        for(;;) {
            // Execute the real server
            nsPid = fork();
            // Child, continue as server
            if(nsPid == 0) {
                exit(nsMain(argc, argv, ServerInit));
            }
            /* parent, behaves like a guardian */
            time(&start);
            printf("NS[%d] server process started",getpid());
            pid = waitpid(-1, &status, 0);
            if(WIFEXITED(status))
                printf("NS[%d] child process exited with status %d",pid,WEXITSTATUS(status));
            else if(WIFSIGNALED(status))
                printf("NS[%d] child process exited due to signal %d",pid,WTERMSIG(status));
            else
                printf("NS[%d] child process exited", pid);
            // Special exit code
            if(WIFEXITED(status)) {
                if(WEXITSTATUS(status) == NS_EXIT) {
                    printf("NS[%d] child configuration error, exiting",getpid());
                    exit(0);
                } else if(WEXITSTATUS(status) == SIGHUP) {
                } else {
                    if(time(0) - start < 10) failcount++;
                    if(failcount == 10) {
                        printf("Exiting due to repeated, frequent failures");
                        exit(1);
                    }
                }
            }
            sleep(3);
        }
    }

----------------------------------------------------------------------

Comment By: Stephen Deasey (sdeasey)
Date: 2005-03-08 04:23

Message:
Logged In: YES
user_id=87254

NsWatchdog() is called after the server drops root privs, so both
the watchdog and the server run as
the defined user. What happens if the server dies and on restart
needs to rebind privileged ports?

A ps listing shows two running processes, parent and child. If I
kill either one, the watchdog dies and the server continues to
run. If I kill -9 the parent, the child continues to run. If I
kill -9 the child, the server is restarted. Something seems not
quite right here...

I'm a bit confused about how the code works. For example,
NsWatchdog() seems to ignore all of its arguments? Here's the
code which calls it:

    if (mode == 0) {
        i = ns_fork();
        if (i < 0) {
            Ns_Fatal("nsmain: fork() failed: '%s'", strerror(errno));
        }
        if (i > 0) {
            return 0;
        }
        setsid();
        i = NsWatchdog(argc, argv, initProc);
        if (i < 0) {
            Ns_Fatal("nsmain: watchdog failed: '%s'", strerror(errno));
        }
        if (i > 0) {
            return 0;
        }
        nsconf.pid = getpid();
    }

NsWatchdog() says it returns the worker pid; it also sets the
global variable wpid. The code above ignores the returned value
and the global variable, and instead calls getpid()...

The variable 'i' is existing code, but still somehow doesn't seem
suitable...

Could that Ns_Fatal() above be moved into the NsWatchdog()
function?

I think maybe some comment is needed here. The code structure is
very like that just above where ns_fork() is called, but this
function will return *multiple* times, right? This is kind of
surprising, and an extra twist on the already confusing fork()
semantics (or maybe it's just me who gets confused by fork...).

A return value of 0 here is a 'request for orderly shutdown',
right?

How about some more logging to syslog? For example, distinguish
between start and restart. Mention when the MAX_SLEEP_PERIOD has
been reached, etc.

Couple of small things:

Can we refer to 'the server', rather than 'the worker'? Worker
and Watchdog both begin with 'w', and so does the global variable
wpid... Maybe serverPid?

NsWatchdog is a static function; it doesn't need the Ns prefix.

In NsWatchdog(), the variable 'run' should be something like
'nretries', 'nap' should be something more like 'retrySeconds'
and MAX_SLEEP_PERIOD should be MAX_RETRY_SECONDS.

The comment for WaitForWorker() is mislabeled.

Should SigHandler() be called WatchdogSigHandler()?

----------------------------------------------------------------------

Comment By: Vlad Seryakov (seryakov)
Date: 2005-03-06 23:03

Message:
Logged In: YES
user_id=184124

Looks good to me

----------------------------------------------------------------------

Comment By: Zoran Vasiljevic (vasiljevic)
Date: 2005-03-05 12:06

Message:
Logged In: YES
user_id=95086

Ah, correction:

The restart option sends SIGINT to the worker process, which
causes the watchdog to restart it.

And, the patchfile is now attached!

----------------------------------------------------------------------

Comment By: Zoran Vasiljevic (vasiljevic)
Date: 2005-03-05 12:03

Message:
Logged In: YES
user_id=95086

Here is the patch.

I have added a "-restart" option to "ns_shutdown". It is rather
clumsy to parse but should do. We should rewrite this with your
args parsing routine. The restart option sends SIGTERM to the
worker process, which causes the watchdog to restart it.

----------------------------------------------------------------------

Comment By: Stephen Deasey (sdeasey)
Date: 2005-03-04 23:46

Message:
Logged In: YES
user_id=87254

I don't think this has to hide behind a config option. It's
either a good idea or it's not. Sounds good to me.

Is there a patch? I'm wondering about some of the implementation
details...

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1156875&group_id=130646 |
From: SourceForge.net <no...@so...> - 2005-03-14 18:00:42
|
Feature Requests item #1156875, was opened at 2005-03-04 20:03
Message generated for change (Comment added) made by vasiljevic
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1156875&group_id=130646

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Zoran Vasiljevic (vasiljevic)
Assigned to: Zoran Vasiljevic (vasiljevic)
Summary: Watchdog process restarts failed server

----------------------------------------------------------------------

>Comment By: Zoran Vasiljevic (vasiljevic)
Date: 2005-03-14 19:00

Message:
Logged In: YES
user_id=95086

Stephen,

Would you mind uploading your patch changes so I can peek into
them? I think we should soon settle on some solution and move on.

Vlad,

The patch already does that. After 16 unsuccessful restarts the
watchdog exits. Between restarts we wait 1, 2, 4, 8, 16, 32, ...
16384 seconds and then exit. This adds up to 32767 seconds, or
about 9 hours. |
From: SourceForge.net <no...@so...> - 2005-03-14 14:53:42
|
Feature Requests item #1156875, was opened at 2005-03-04 19:03
Message generated for change (Comment added) made by seryakov
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1156875&group_id=130646

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Zoran Vasiljevic (vasiljevic)
Assigned to: Zoran Vasiljevic (vasiljevic)
Summary: Watchdog process restarts failed server

----------------------------------------------------------------------

>Comment By: Vlad Seryakov (seryakov)
Date: 2005-03-14 14:53

Message:
Logged In: YES
user_id=184124

I would say, repeat a configured number of times and then exit.
This way all socket-related problems will be cleared after a
couple of restarts, and config problems will make the server exit. |
From: SourceForge.net <no...@so...> - 2005-03-14 14:49:00
|
Feature Requests item #1159471, was opened at 2005-03-09 00:40 Message generated for change (Comment added) made by seryakov You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1159471&group_id=130646 Category: C-API Group: None Status: Open Resolution: None Priority: 5 Submitted By: Vlad Seryakov (seryakov) Assigned to: Vlad Seryakov (seryakov) Summary: Virtual Hosting Initial Comment: Okay, I did some digging and testing; it looks like it works and is much simpler. I tried to simplify the default AS 4.x virtual hosting and added pageroot virtual hosting, a simple way to use different pageroots on the same server. The change for default virtual hosting is: no defaultserver anymore, the server which registered the virtual hosts is the default server, so nssock is loaded in the default server; other than that, virtual servers are defined the same way. One thing I left is to change ns_info pageroot to use Ns_GetConn() and then, if it exists, use connPtr->pageroot, but this is a simple change if you approve the current virtual hosting patch. Here is the nsd.tcl config example:

    ns_section "ns/server/${server}/module/nssock/servers"
    ns_param test vlad.seryakov.com
    ns_param test vlad.seryakov.com:80

    ns_section "ns/server/${server}/module/nssock/pageroots"
    ns_param ${home}/html/test vlad.seryakov.com
    ns_param ${home}/html/test vlad.seryakov.com:80

---------------------------------------------------------------------- >Comment By: Vlad Seryakov (seryakov) Date: 2005-03-14 14:48 Message: Logged In: YES user_id=184124 The problem I see with this patch is that it will break all my AS installations, because it assumes that every server is under the servers/ directory and that the pageroot is always named pages/. In my on-server installs I do not have this complex directory structure; I have html/ under ns_info home and the pageroot is set in nsd.tcl. Also, what is the point of the ServerRoot dir? We have PageRoot already; if it is set to an absolute path, other apps use ns_info pageroot.
As I understand it, under ServerRoot there is only the pages directory? Why not use pageroot directly? ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-14 14:05 Message: Logged In: YES user_id=87254 Oh, forgot to mention. It depends on the Tcl Callbacks patch. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-14 14:03 Message: Logged In: YES user_id=87254 I've added the patch naviserver-4.0.10-server-root-vhost.patch. It adds the following new routines:

    Ns_ServerPath()
    Ns_PagePath()
    Ns_SetServerRootProc()
    Ns_ConnLocationAppend()
    Ns_SetConnLocationProc()
    NsServerRoot()
    NsPageRoot()
    ns_serverpath ?pathSegment ...?
    ns_pagepath ?pathSegment ...?
    ns_serverrootproc script ?arg?
    ns_locationproc script ?arg?

And the following new configuration options:

    ns_section "ns/server/${servername}/vhost"
    ns_param enabled false
    ns_param prefix ""
    ns_param pagedir pages   # overrides fastpath/pageroot
    ns_param stripwww true
    ns_param stripport true

This version of host-header-based virtual hosting, which depends on the existence of the pages directory, is a superset of the functionality provided by a static configuration. Applications which call Ns_ConnLocation() will need to be modified to call Ns_ConnLocationAppend() or they will not be vhost aware. I've reinstated the deprecated proc Ns_SetConnLocationProc() and changed its signature. It disappeared from ns.h almost 5 years ago, then reappeared ~2.5 years ago. I don't think this will be a problem. In turn, I've deprecated Ns_SetLocationProc(). It compiles and the server runs. I haven't had time to test it. Does this look acceptable?
---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-13 00:41 Message: Logged In: YES user_id=184124 Just to consider the possibility: instead of mallocing pageroot/location, by default it can use sockPtr->pageroot/sockPtr->location; then, when ns_conn pageroot newpageroot is called, it will set connPtr->pageroot to a malloced string, and ns_conn pageroot will check and return it instead of sockPtr->pageroot. This way there is no overhead at conn queue and a new pageroot/location can still be set in Tcl ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-13 00:02 Message: Logged In: YES user_id=184124 That is the problem with nssock: it is actually an extender of the driver but still somehow kept independent. nssock itself is useless without the driver and is only used to bind to more than one address for different servers. It could be moved into the core, but the problem will be how to define more than one instance. The mallocs are overhead indeed, but once copied the strings can be used independently and can be set in Tcl by using ns_conn location newLocation or ns_conn pageroot newPageroot. In this case they should be copies. Just do not make mass virtual hosting the only virtual hosting way; being able to change the pageroot in Tcl/C gives the developer more flexibility if required. For simple cases, mass virtual hosting is okay. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-12 23:48 Message: Logged In: YES user_id=87254 I'm not keen at all on adding virtual hosting to the nssock driver. There's nothing HTTP specific about the nssock driver and I'd like to keep it that way. There are a couple of problems with the other proposed solution. The paired functions Ns_SetLocationProc()/Ns_ConnSetLocationProc() etc.
seem excessive, and the enforced malloc()ing at runtime of the location and pageroot strings is an unwelcome overhead. Using Tls storage is clever but pretty ugly. I think dstrings are the way to go here. I'm not sure Tls is safe in this implementation. The same dstring is used for location and pageroot strings, so it depends what the caller decides to do with the result and in what order, whether or not one overwrites the other. I would like to explore adding mass virtual hosting into the core. Let me work up a patch, I think I can get to this this weekend... ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-12 21:57 Message: Logged In: YES user_id=184124 Attached is nssock with virtual hosting patch ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-12 20:39 Message: Logged In: YES user_id=184124 Actually, vhost can be combined with nssock: if the options are given it will enable virtual hosting, if not it works as a regular sock driver. This way it is always with the core and at the same time independent. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-10 21:38 Message: Logged In: YES user_id=184124 If loaded, the vhost module works as Stephen suggested: it strips port/host and uses pageroot if no other root is specified. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-10 21:37 Message: Logged In: YES user_id=184124 There is a new patch with Stephen's corrections/additions. I think we can provide a core module for virtual hosting; I called it nsvhost and we can extend this module to do all sorts of hosting. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-10 18:53 Message: Logged In: YES user_id=184124 And keeping port and www.
is sometimes necessary: you can do virtual hosting by port only (IDT does that, for example), and many sites work without the www. prefix, so just stripping them by default may not be appropriate. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-10 18:24 Message: Logged In: YES user_id=184124
>Looks good. However, isn't ns_conn pageroot unnecessary?
For templating engines ns_info pageroot is the only way to figure out the root; UrlToFile works for fastpath only.
>Would it be better to have the servers and pageroot config in one section, to avoid duplication?
They are mutually exclusive, that's why I put them in different sections. If a full virtual server is set, pageroots are ignored; this is for AOL-like virtual hosting with different servers (a rare situation though). So in most cases only pageroots will be used, thus only one simple syntax. As for PageRoot and ServerRoot, I think this is a little confusing. Currently, pageroot returns the full path and whoever calls pageroot assumes that it will return the full path. Virtual hosting using directories as hostnames is what I am currently using with the vhost module and I think it can be included as a standard feature for easy virtual hosting solutions. I do not think this should be the ONLY virtual hosting solution; everybody can write their own modules using SetLocation/SetPageRoot procs or register a filter which will set the pageroot for each connection. Let me prepare another patch-set with your suggestions included. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-10 08:00 Message: Logged In: YES user_id=87254 Looks good. However, isn't ns_conn pageroot unnecessary? Would it be better to have the servers and pageroot config in one section, to avoid duplication?
    ns_section "ns/server/${server}/module/nssock/servers"
    ns_param example.com exampleserver
    ns_param foo.example.com "exampleserver ${home}/whatever"

Here, the first entry maps the example.com host to the exampleserver and uses its default pageroot. The second entry supplies a new pageroot. How about mass virtual hosting, i.e. where you don't have to explicitly configure each host header to pageroot mapping, but construct the pageroot from the host header at runtime? I've attached the file nsmassvhost.c which implements the above. It uses the hooks Ns_PageRootProc and Ns_LocationProc, which would be unnecessary if the functionality was included as standard. It trims the port and any leading 'www.' from the host header. It would be nice to have this for the static mapping also, as at the moment to be robust you often need 4 mappings for each virtual host. It also uses the function Ns_ServerPath(). The idea here is to introduce the concept of the virtual server root as a distinct location in the file system, where the pageroot is a location below that. I want to change this so that the serverroot is dynamic and based on the host header (when configured), and the pageroot is simply the serverroot with "/pages" (or whatever is configured) appended. It would look something like this:

    /srv/server1/pages
    /srv/server1/example.com/pages
    /srv/server1/example.com/cache

The first path is the default or non-virtual hosted case. server1 is a server defined in the config file and has its own private tcl library. The second path is the pageroot of a virtual host. The third is an example of some data which is specific to the example.com virtual host.
So, without virtual hosts:

    Ns_ServerRoot() -> /srv/server1
    Ns_PageRoot()   -> /srv/server1/pages

With virtual hosts (and called in the context of a conn thread):

    Ns_ServerRoot() -> /srv/server1/example.com
    Ns_PageRoot()   -> /srv/server1/example.com/pages

The advantage of this system is that you don't have to restart your server every time you add or remove a virtual host. There is also a convenient location to store data associated with both virtual servers and virtual hosts. Easy to back up, remove, etc. I haven't had time to look at how this would be integrated into what you've got here, maybe at the weekend. Feel free to take a shot at it though :-) Does the scheme outlined above make sense to you? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1159471&group_id=130646 |
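The host header handling described above (trim the port and any leading 'www.', then place the page directory below a per-host server root) can be sketched in C. This is not the code from nsmassvhost.c: NormalizeHost() and BuildPageRoot() are hypothetical names, the fixed 256-byte host buffer is an arbitrary choice, and rejecting '/' and '..' in the host is an added assumption that the thread does not discuss. IPv6 literals in the Host header are not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if s starts with "www." (case-insensitive), else 0. */
static int
HasWwwPrefix(const char *s)
{
    const char *p = "www.";

    while (*p != '\0') {
        if (tolower((unsigned char) *s) != *p) {
            return 0;
        }
        s++;
        p++;
    }
    return 1;
}

/*
 * Copy host into dst without a leading "www." and without a trailing
 * ":port". Returns 0 on success, -1 if the result is empty, does not
 * fit, or contains '/' or ".." (hypothetical safety check).
 */
int
NormalizeHost(const char *host, char *dst, size_t dstSize)
{
    const char *start = host;
    const char *end;
    size_t      len;

    assert(host != NULL && dst != NULL);
    if (HasWwwPrefix(start)) {
        start += 4;
    }
    end = strchr(start, ':');
    len = (end != NULL) ? (size_t) (end - start) : strlen(start);
    if (len == 0 || len >= dstSize || memchr(start, '/', len) != NULL) {
        return -1;
    }
    memcpy(dst, start, len);
    dst[len] = '\0';
    return (strstr(dst, "..") != NULL) ? -1 : 0;
}

/*
 * Build "<serverRoot>/<host>/<pageDir>", or "<serverRoot>/<pageDir>"
 * when there is no host header. Returns 0 on success, -1 on error.
 */
int
BuildPageRoot(const char *serverRoot, const char *hostHeader,
              const char *pageDir, char *dst, size_t dstSize)
{
    char host[256];
    int  n;

    assert(serverRoot != NULL && pageDir != NULL && dst != NULL);
    if (hostHeader == NULL || hostHeader[0] == '\0') {
        n = snprintf(dst, dstSize, "%s/%s", serverRoot, pageDir);
    } else {
        if (NormalizeHost(hostHeader, host, sizeof(host)) != 0) {
            return -1;
        }
        n = snprintf(dst, dstSize, "%s/%s/%s", serverRoot, host, pageDir);
    }
    return (n < 0 || (size_t) n >= dstSize) ? -1 : 0;
}
```

With these assumptions, a request for "www.example.com:80" on a server rooted at /srv/server1 resolves to /srv/server1/example.com/pages, and a request without a host header falls back to /srv/server1/pages, matching the layout in the message above.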
From: SourceForge.net <no...@so...> - 2005-03-14 14:41:36
|
Feature Requests item #1156875, was opened at 2005-03-04 20:03 Message generated for change (Comment added) made by vasiljevic You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1156875&group_id=130646 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Zoran Vasiljevic (vasiljevic) Assigned to: Zoran Vasiljevic (vasiljevic) Summary: Watchdog process restarts failed server Initial Comment: We have been using this for quite some time and it proved extremely useful. We double-fork the nsd process and make the first forked instance control the second. The first one (the watchdog) reacts to exit codes and signals caught during the watch and correspondingly restarts the second instance (the worker). Also, we have added the "-restart" option to the "ns_shutdown" command. This just sends the SIGINT to the worker process. The watchdog is handling this signal and respawns the worker automatically. During operation, the watchdog logs events and their cause into the system log file. This looks like:

    Feb 28 04:00:05 Develop nsd[19400]: worker: started.
    Mar 1 04:00:13 Develop nsd[4475]: watchdog: worker 19400 exited (2).
    Mar 1 04:00:15 Develop nsd[21290]: worker: started.
    Mar 1 04:00:18 Develop nsd[14705]: watchdog: worker 19399 exited (2).
    Mar 1 04:00:20 Develop nsd[21300]: worker: started.

We have done all the changes with "--enable-watchdog" so anybody who needs this feature will have to compile with this option. ---------------------------------------------------------------------- >Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-14 15:41 Message: Logged In: YES user_id=95086 Well, because I could not differentiate between config and later runtime errors, I thought it's wiser to abort rather than repeatedly restart a broken server. One possibility is to add a different Ns_Fatal call like: Ns_FatalEx(int exitcode, char *fmt, ...); Or?
---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-14 15:12 Message: Logged In: YES user_id=87254 Are you sure Ns_Fatal should not restart the server? Many of the fatal errors are caused by bad configuration and I can see why you might want the server to exit immediately. But there are also runtime fatal errors, and here I'd expect the server to be restarted. Some config errors are due to external factors like missing directories or file system permissions and it would be nice for the server to come back up as soon as the issue resolved itself. ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-14 10:52 Message: Logged In: YES user_id=95086 Calling watchdog before or after chroot has no real implications I believe. Also, when the server exits with Ns_Fatal, it should not be restarted. I will look into your changes today. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-11 07:27 Message: Logged In: YES user_id=87254 Oh yeah, and I moved the watchdog stuff even earlier, before the prebind and chroot. Hmm, now that I think about it, it's only the prebind that really needs to happen after the watchdog is started, to ensure the sockets are always in a sane state... ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-11 07:17 Message: Logged In: YES user_id=87254 Am I reading this right? It looks like if the server calls Ns_Fatal() it will exit with code 1, but 0 and 1 are treated as OK exit codes and the server will not be restarted. I've attached a patch which changes the above, fixes the pid problem in a different way because I completely forgot you already posted below about that small glitch, uses Ns_ParseObjv(), adds the -w switch, and makes a couple of small name changes.
I don't want to go overboard with command line switches, but an option to specify not to give up trying to restart the server would be nice. If you have that, you also need to turn off the restart timeout doubling at a certain point. I don't know what the cleanest way to do that is. ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-08 21:09 Message: Logged In: YES user_id=95086 This is correct. As soon as we agree on some implementation, I will put in the -i processing and avoid starting the watchdog for inittab starts. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-08 20:48 Message: Logged In: YES user_id=184124 There should still be the possibility to run nsd from inittab, so when the -i switch is given, no watchdog should be running; let /sbin/init handle restarts ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-08 17:19 Message: Logged In: YES user_id=95086 Small glitch: After applying the patch, change nsmain.c from:

    nsconf.pid = serverPid;

to

    nsconf.pid = getpid();

otherwise the pid file will contain the bogus server pid if the watchdog restarted it later. ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-08 16:26 Message: Logged In: YES user_id=95086 Another try... Important to note: SIGKILL (signal 9) cannot be handled, hence if somebody kills the watchdog with SIGKILL, the server will be left lingering without the watchdog. This is important to know. I do not see any way to recover in such cases (i.e. how to stop the server). Apart from that, all objections from Stephen are taken into account. Please try again. A new copy of the watchdog.patch file is attached.
---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-08 06:20 Message: Logged In: YES user_id=184124 I just found this code in my old server sources; I just changed the internal names to Ns/NS. It used to be pretty stable.
-------------------------------------------------------------------

#define NS_EXIT 99

static void NsTerminate(int sig)
{
    printf("NS[%d] signal %d received...",getpid(),sig);
    // Kill the server with the same signal
    if(nsPid > 0) kill(nsPid,sig);
    // Exit in case of fatal signal
    if(sig != SIGHUP) {
        printf("NS[%d] terminating...",getpid());
        exit(0);
    }
    // Reassign signal handler
    signal(sig,NsTerminate);
}

void NsWatchdog(int argc, char *argv[], char *envp[])
{
    int failcount = 0;
    time_t start;
    int status;
    pid_t pid;

    signal(SIGTTOU,SIG_IGN);
    signal(SIGTTIN,SIG_IGN);
    signal(SIGTSTP,SIG_IGN);
    signal(SIGPIPE,SIG_IGN);
    signal(SIGQUIT,SIG_IGN);
    signal(SIGHUP,NsTerminate);
    signal(SIGINT,NsTerminate);
    signal(SIGTERM,NsTerminate);
    // Go background
    if((pid = fork())) {
        if(pid < 0) err_logger("warpConfigure: fork: %s",strerror(errno));
        exit(0);
    }
    setsid();
    for(;;) {
        // Execute the real server
        nsPid = fork();
        // Child, continue as server
        if(nsPid == 0) {
            exit(nsMain(argc, argv, ServerInit));
        }
        /* parent, behaves like a guardian */
        time(&start);
        printf("NS[%d] server process started",getpid());
        pid = waitpid(-1, &status, 0);
        if(WIFEXITED(status))
            printf("NS[%d] child process exited with status %d",pid,WEXITSTATUS(status));
        else if(WIFSIGNALED(status))
            printf("NS[%d] child process exited due to signal %d",pid,WTERMSIG(status));
        else
            printf("NS[%d] child process exited", pid);
        // Special exit code
        if(WIFEXITED(status)) {
            if(WEXITSTATUS(status) == NS_EXIT) {
                printf("NS[%d] child configuration error, exiting",getpid());
                exit(0);
            } else if(WEXITSTATUS(status) == SIGHUP) {
            } else {
                if(time(0) - start < 10) failcount++;
                if(failcount == 10) {
                    printf("Exiting due to repeated, frequent failures");
                    exit(1);
                }
            }
        }
        sleep(3);
    }
}

---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-08 05:23 Message: Logged In: YES user_id=87254 NsWatchdog() is called after the server drops root privs, so both the watchdog and the server run as the defined user. What happens if the server dies and on restart needs to rebind privileged ports? A ps listing shows two running processes, parent and child. If I kill either one, the watchdog dies, the server continues to run. If I kill -9 the parent, the child continues to run. If I kill -9 the child, the server is restarted. Something seems not quite right here... I'm a bit confused about how the code works. For example, NsWatchdog() seems to ignore all of its arguments? Here's the code which calls it:

    if (mode == 0) {
        i = ns_fork();
        if (i < 0) {
            Ns_Fatal("nsmain: fork() failed: '%s'", strerror(errno));
        }
        if (i > 0) {
            return 0;
        }
        setsid();
        i = NsWatchdog(argc, argv, initProc);
        if (i < 0) {
            Ns_Fatal("nsmain: watchdog failed: '%s'", strerror(errno));
        }
        if (i > 0) {
            return 0;
        }
        nsconf.pid = getpid();
    }

NsWatchdog() says it returns the worker pid, it also sets the global variable wpid. The code above ignores the returned value, and the global variable, and instead calls getpid()... The variable 'i' is existing code, but still somehow doesn't seem suitable... Could that Ns_Fatal() above be moved into the NsWatchdog() function? I think maybe some comment is needed here. The code structure is very much like that just above where ns_fork() is called, but this function will return *multiple* times, right? This is kind of surprising, and an extra twist on the already confusing fork() semantics (or maybe it's just me who gets confused by fork...). A return value of 0 here is a 'request for orderly shutdown', right? How about some more logging to syslog? For example, distinguish between start and restart. Mention when the MAX_SLEEP_PERIOD has been reached, etc.
A couple of small things: Can we refer to 'the server', rather than 'the worker'? Worker and Watchdog begin with 'w', and so does the global variable wpid... Maybe serverPid? NsWatchdog is a static function, it doesn't need the Ns prefix. In NsWatchdog(), the variable 'run' should be something like 'nretries', 'nap' should be something more like 'retrySeconds' and MAX_SLEEP_PERIOD should be MAX_RETRY_SECONDS. The comment for WaitForWorker() is mislabeled. Should SigHandler() be called WatchdogSigHandler()? ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-07 00:03 Message: Logged In: YES user_id=184124 Looks good to me ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-05 13:06 Message: Logged In: YES user_id=95086 Ah, correction: The restart option sends SIGINT to the worker process, which causes the watchdog to restart it. And, the patchfile is now attached! ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-05 13:03 Message: Logged In: YES user_id=95086 Here is the patch. I have added a "-restart" option to "ns_shutdown". It is rather clumsy to parse but should do. We should rewrite this with your args parsing routine. The restart option sends SIGTERM to the worker process which causes the watchdog to restart it. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-05 00:46 Message: Logged In: YES user_id=87254 I don't think this has to hide behind a config option. It's either a good idea or it's not. Sounds good to me. Is there a patch? I'm wondering about some of the implementation details... ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1156875&group_id=130646 |
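The restart-or-stop question debated in this thread comes down to how the watchdog reads the status that waitpid() returns for the server. The sketch below is one possible policy, not the logic of watchdog.patch: ClassifyStatus(), RunChild() and the WatchdogAction enum are hypothetical names, and CONFIG_ERROR_EXIT reuses the value 99 from the older NS_EXIT code quoted above purely as an assumption. Exit code 1 (what Ns_Fatal() produces, per the discussion) is treated as restartable here, which is the behaviour Stephen argues for rather than an agreed outcome.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical "bad configuration, do not restart" exit code. */
#define CONFIG_ERROR_EXIT 99

typedef enum { WD_STOP, WD_RESTART } WatchdogAction;

/* Decide what the watchdog does with a status filled in by waitpid(). */
WatchdogAction
ClassifyStatus(int status)
{
    if (WIFEXITED(status)) {
        int code = WEXITSTATUS(status);

        if (code == 0 || code == CONFIG_ERROR_EXIT) {
            return WD_STOP;     /* orderly shutdown or config error */
        }
        return WD_RESTART;      /* any other exit code, e.g. 1 */
    }
    return WD_RESTART;          /* terminated by a signal, e.g. kill -9 */
}

/*
 * Fork a child that exits at once with exitCode (it stands in for the
 * server) and store its wait status in *statusPtr.
 * Returns 0 on success, -1 if fork() or waitpid() failed.
 */
int
RunChild(int exitCode, int *statusPtr)
{
    pid_t pid;

    assert(statusPtr != NULL);
    pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        _exit(exitCode);
    }
    while (waitpid(pid, statusPtr, 0) < 0) {
        if (errno != EINTR) {
            return -1;
        }
    }
    return 0;
}
```

A separate exit code for configuration errors is what an Ns_FatalEx(int exitcode, ...) style call, as floated earlier in the thread, would make possible: the server could pick the code and the watchdog would only need a table like the one in ClassifyStatus().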
From: SourceForge.net <no...@so...> - 2005-03-14 14:12:37
|
Feature Requests item #1156875, was opened at 2005-03-04 12:03 Message generated for change (Comment added) made by sdeasey You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1156875&group_id=130646 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Zoran Vasiljevic (vasiljevic) Assigned to: Zoran Vasiljevic (vasiljevic) Summary: Watchdog process restarts failed server Initial Comment: We have been using this for quite some time and it proved extremely useful. We double-fork the nsd process and make the first forked instance control the second. The first one (the watchdog) reacts to exit codes and signals caught during the watch and correspondingly restarts the second instance (the worker). Also, we have added the "-restart" option to the "ns_shutdown" command. This just sends the SIGINT to the worker process. The watchdog is handling this signal and respawns the worker automatically. During operation, the watchdog logs events and their cause into the system log file. This looks like:

    Feb 28 04:00:05 Develop nsd[19400]: worker: started.
    Mar 1 04:00:13 Develop nsd[4475]: watchdog: worker 19400 exited (2).
    Mar 1 04:00:15 Develop nsd[21290]: worker: started.
    Mar 1 04:00:18 Develop nsd[14705]: watchdog: worker 19399 exited (2).
    Mar 1 04:00:20 Develop nsd[21300]: worker: started.

We have done all the changes with "--enable-watchdog" so anybody who needs this feature will have to compile with this option. ---------------------------------------------------------------------- >Comment By: Stephen Deasey (sdeasey) Date: 2005-03-14 07:12 Message: Logged In: YES user_id=87254 Are you sure Ns_Fatal should not restart the server? Many of the fatal errors are caused by bad configuration and I can see why you might want the server to exit immediately. But there are also runtime fatal errors, and here I'd expect the server to be restarted.
Some config errors are due to external factors like missing directories or file system permissions and it would be nice for the server to come back up as soon as the issue resolved itself. ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-14 02:52 Message: Logged In: YES user_id=95086 Calling watchdog before or after chroot has no real implications I believe. Also, when the server exits with Ns_Fatal, it hsould not be restarted, I will look into your changes today. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-10 23:27 Message: Logged In: YES user_id=87254 Oh yeah, and I moved the watchdog stuff eaven earlier, before the prebind and chroot. Hmm, now that I think about it it's only the prebind that really needs to happen after the watchdog is started, to ensure the sockets are always in a sane state... ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-10 23:17 Message: Logged In: YES user_id=87254 Am I reading this right? It looks like if the server calls Ns_Fatal() it will exit with code 1, but 0 and 1 are treated as OK exit codes and the server will not be restarted. I've attached a patch which changes the above, fixes the pid problem in a different way because I completely forgot you already posted below about that small glitch, use Ns_ParseObjv(), adds the -w switch, and a couple of small name changes. I don't want to go overboard with command line switches, but an option to specify not to give up trying to restart the server would be nice. If you have that, you also need to turn off the restart timeout doubling at a certain point. I don't know what the cleanest way to do that is. 
---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-08 13:09 Message: Logged In: YES user_id=95086 This is correct. As soon as we agree on some implementation, I will put the -i processing and avoid starting watchdog for inittab starts. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-08 12:48 Message: Logged In: YES user_id=184124 There still should be possibility to run nsd from inittab, so when -i switch is given, no watchdog should be running, let /sbin/init to handle restarts ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-08 09:19 Message: Logged In: YES user_id=95086 small glich: After applying the patch, change nsmain.c from: nsconf.pid = serverPid: to nsconf.pid = getpid(); otherwise the pid file will contain the bogus server pid if the watchdog restarted it later. ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-08 08:26 Message: Logged In: YES user_id=95086 Another try... Important to note: SIGKILL (signal 9) cannot be handled hence if somebody kills the watchdog with SIGKILL, the server will be left lingering w/o the watchdog. This is important to know. I do not see any possibility how to recover in such cases (i.e. how to stop the server). Apart from that, all objections from Stephen are taken into account. Please try again. A new copy of watchdog.patch file is attached. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-07 22:20 Message: Logged In: YES user_id=184124 I just found this code in my old server sources, i just chnaged internal name to Ns/NS, used to run pretty stable. 
------------------------------------------------------------------- #define NS_EXIT 99 static void NsTerminate(int sig) { printf("NS[%d] signal %d received...",getpid(),sig); // Kill the server with the same signal if(nsPid > 0) kill(nsPid,sig); // Exit in case of fatal signal if(sig != SIGHUP) { printf("NS[%d] terminating...",getpid()); exit(0); } // Reassign signal handler signal(sig,NsTerminate); } void NsWatchdog(int argc, char *argv[], char *envp[]) { int failcount = 0; time_t start; int status; pid_t pid; signal(SIGTTOU,SIG_IGN); signal(SIGTTIN,SIG_IGN); signal(SIGTSTP,SIG_IGN); signal(SIGPIPE,SIG_IGN); signal(SIGQUIT,SIG_IGN); signal(SIGHUP,NsTerminate); signal(SIGINT,NsTerminate); signal(SIGTERM,NsTerminate); // Go background if((pid = fork())) { if(pid < 0) err_logger("warpConfigure: fork: %s",strerror(errno)); exit(0); } setsid(); for(;;) { // Execute the real server nsPid = fork(); // Child, continue as server if(nsPid == 0) { exit(nsMain(argc, argv, ServerInit)); } /* parent, behaves like a guardian */ time(&start); printf("NS[%d] server process started",getpid()); pid = waitpid(-1, &status, 0); if(WIFEXITED(status)) printf("NS[%d] child process exited with status %d",pid,WEXITSTATUS(status)); else if(WIFSIGNALED(status)) printf("NS[%d] child process exited due to signal %d",pid,WTERMSIG(status)); else printf("NS[%d] child process exited", pid); // Special exit code if(WIFEXITED(status)) { if(WEXITSTATUS(status) == NS_EXIT) { printf("NS[%d] child configuration error, exiting",getpid()); exit(0); } else if(WEXITSTATUS(status) == SIGHUP) { } else { if(time(0) - start < 10) failcount++; if(failcount == 10) { printf("Exiting due to repeated, frequent failures"); exit(1); } } } sleep(3); } } ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-07 21:23 Message: Logged In: YES user_id=87254 NsWatchdog() is called after the server drops root privs, so both the watchdog and the server run as 
the defined user. What happens if the server dies and on restart needs to rebind privileged ports? A ps listing shows two running processes, parent and child. If I kill either one, the watchdog dies, the server continues to run. If I kill -9 the parent, the child continues to run. If I kill -9 the child, the server is restarted. Something seems not quite right here... I'm a bit confused about how the code works. For example, NsWatchdog() seems to ignore all of it's arguments? Here's the code which calls it: if (mode == 0) { i = ns_fork(); if (i < 0) { Ns_Fatal("nsmain: fork() failed: '%s'", strerror(errno)); } if (i > 0) { return 0; } setsid(); i = NsWatchdog(argc, argv, initProc); if (i < 0) { Ns_Fatal("nsmain: watchdog failed: '%s'", strerror(errno)); } if (i > 0) { return 0; } nsconf.pid = getpid(); } NsWatchdog() says it returns the worker pid, it also sets the global variable wpid. The code above ignores the returned value, and the global variable, and instead calls getpid()... The variable 'i' is existing code, but still somehow doesn't seem suitable... Could that Ns_Fatal() above be moved into the NsWatchdog() function? I think maybe some comment is needed here. The code structure is very like that just above where ns_fork() is called, but this function will return *multiple* times, right? This is kind of suprising, and an extra twist on the already confusing fork() semantics (or maybe it's just me who gets confused by fork...). A return value of 0 here is a 'request for orderly shutdown', right? How about some more logging to syslog? For example, distinguish between start and restart. Mention when the MAX_SLEEP_PERIOD has been reached, etc. Couple of small things: Can we refer to 'the server', rather than 'the worker'? Worker and Watchdog begin with 'w', and so does the global variable wpid... Maybe serverPid? NsWatchdog is a static function, it doesn't need the Ns prefix. 
In NsWatchdog(), the variable 'run' should be something like 'nretries', 'nap' should be something more like 'retrySeconds' and MAX_SLEEP_PERIOD should be MAX_RETRY_SECONDS. The comment for WaitForWorker() is mislabeled. Should SigHandler() be called WatchdogSigHandler()? ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-06 16:03 Message: Logged In: YES user_id=184124 Looks good to me ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-05 05:06 Message: Logged In: YES user_id=95086 Ah, correction: The restart option sends SIGINT to the worker process which causes the watchdog to restart it. And, the patchfile is now attached! ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-05 05:03 Message: Logged In: YES user_id=95086 Here is the patch. I have added "-restart" option to "ns_shutdown". It is rather clumsy to parse but should do. We should rewrite this with your args parsing routine. The restart option sends SIGTERM to the worker process which causes the watchdog to restart it. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-04 16:46 Message: Logged In: YES user_id=87254 I don't think this has to hide behind a config option. It's either a good idea or it's not. Sounds good to me. Is there a patch? I'm wondering about some of the implementation details... ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1156875&group_id=130646 |
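The restart decision debated in this thread can be sketched in isolation. This is only an illustration, not the code from any attached patch: the names ClassifyStatus, RunChildExit, WatchdogAction and CONFIG_ERROR_EXIT are hypothetical, and the policy (exit 0 and a dedicated "configuration error" code stop the watchdog, everything else restarts) is an assumption modelled on the NS_EXIT check in the code quoted above. The thread itself does not settle which exit codes should count as fatal.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical exit code meaning "configuration error, do not restart";
 * it mirrors NS_EXIT (99) in the code quoted in this thread. */
#define CONFIG_ERROR_EXIT 99

typedef enum { ACTION_RESTART, ACTION_STOP } WatchdogAction;

/* Map a status value filled in by waitpid() to a watchdog action.
 * Assumed policy: a clean exit (0) or a configuration error stops the
 * watchdog; any other exit code, or death by signal, asks for a restart. */
static WatchdogAction
ClassifyStatus(int status)
{
    if (WIFEXITED(status)) {
        int code = WEXITSTATUS(status);

        if (code == 0 || code == CONFIG_ERROR_EXIT) {
            return ACTION_STOP;
        }
        return ACTION_RESTART;
    }
    /* WIFSIGNALED, e.g. the server was killed with SIGKILL or crashed. */
    return ACTION_RESTART;
}

/* Helper for trying the classifier: fork a child that exits at once
 * with the given code and return its raw wait status, or -1 on error. */
static int
RunChildExit(int code)
{
    int   status = 0;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        _exit(code);
    }
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return status;
}
```

A watchdog loop would call waitpid() on the server pid and pass the status to ClassifyStatus(); keeping the policy in one small function makes it easy to change once the list agrees on how Ns_Fatal() exits should be treated.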
From: SourceForge.net <no...@so...> - 2005-03-14 14:05:07
|
Feature Requests item #1159471, was opened at 2005-03-08 17:40 Message generated for change (Comment added) made by sdeasey You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1159471&group_id=130646 Category: C-API Group: None Status: Open Resolution: None Priority: 5 Submitted By: Vlad Seryakov (seryakov) Assigned to: Vlad Seryakov (seryakov) Summary: Virtual Hosting Initial Comment: Okay, i did some digging and testing, looks to be working and much simpler. I tried to simplify default AS 4.x virtual hosting and added pageroot virtual hosting, simple way to use different pageroots on the same server. The change for default virtual hosting is: no defaultserver anymore, the server who registered virtual hosts is default server, so nssock is loaded in the default server, other than that virtual servers are defined the same way. One thing i left is to change ns_info pageroot to use Ns_GetConn() and then if exists use connPtr->pageroot, but this is simple change if you approve current virtual hosting patch. Here is the nsd.tcl config example:
ns_section "ns/server/${server}/module/nssock/servers"
ns_param test vlad.seryakov.com
ns_param test vlad.seryakov.com:80
ns_section "ns/server/${server}/module/nssock/pageroots"
ns_param ${home}/html/test vlad.seryakov.com
ns_param ${home}/html/test vlad.seryakov.com:80
---------------------------------------------------------------------- >Comment By: Stephen Deasey (sdeasey) Date: 2005-03-14 07:05 Message: Logged In: YES user_id=87254 Oh, forgot to mention. It depends on the Tcl Callbacks patch. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-14 07:03 Message: Logged In: YES user_id=87254 I've added the patch naviserver-4.0.10-server-root-vhost.patch.
It adds the following new routines:
Ns_ServerPath()
Ns_PagePath()
Ns_SetServerRootProc()
Ns_ConnLocationAppend()
Ns_SetConnLocationProc()
NsServerRoot()
NsPageRoot()
ns_serverpath ?pathSegment ...?
ns_pagepath ?pathSegment ...?
ns_serverrootproc script ?arg?
ns_locationproc script ?arg?
And the following new configuration options:
ns_section "ns/server/${servername}/vhost"
ns_param enabled false
ns_param prefix ""
ns_param pagedir pages # overrides fastpath/pageroot
ns_param stripwww true
ns_param stripport true
This version of host header based virtual hosting which depends on the existence of the pages directory is a superset of the functionality provided by a static configuration. Applications which call Ns_ConnLocation() will need to be modified to call Ns_ConnLocationAppend() or they will not be vhost aware. I've reinstated the deprecated proc Ns_SetConnLocationProc() and changed its signature. It disappeared from ns.h almost 5 years ago, then reappeared ~2.5 years ago. I don't think this will be a problem. In turn, I've deprecated the Ns_SetLocationProc(). It compiles and the server runs. I haven't had time to test it. Does this look acceptable? ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-12 17:41 Message: Logged In: YES user_id=184124 Just to consider the possibility, instead of mallocing pageroot/location, by default it can use sockPtr->pageroot/sockPtr->location, then when ns_conn pageroot newpageroot called, it will set connPtr->pageroot with malloced string and ns_conn pageroot will check and return it instead of sockPtr->pageroot.
This way, no overhead at conn queue and still new pageroot/location can be set in Tcl ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-12 17:02 Message: Logged In: YES user_id=184124 That is the problem with nssock, it is actually an extender of the driver but still somehow kept independent. nssock itself is useless without driver and only used to bind to more than one address for different servers. it could be moved in the core but the problem will be how to define more than one instance. mallocs are overhead indeed, but once copied they can be used independently and can be set in Tcl by using ns_conn location newLocation or ns_conn pageroot newPageroot. In this case they should be copies. Just do not make mass virtual hosting the only virtual hosting way, being able to change pageroot in the Tcl/C gives developer more flexibility if required. For simple cases, mass virtual hosting is okay. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-12 16:48 Message: Logged In: YES user_id=87254 I'm not keen at all on adding virtual hosting to the nssock driver. There's nothing HTTP specific about the nssock driver and I'd like to keep it that way. There are a couple of problems with the other proposed solution. The paired functions Ns_SetLocationProc()/Ns_ConnSetLocationProc() etc. seem excessive, and the enforced malloc()ing at runtime of the location and pageroot strings is an unwelcome overhead. Using Tls storage is clever but pretty ugly. I think dstrings are the way to go here. I'm not sure Tls is safe in this implementation. The same dstring is used for location and pageroot strings, so it depends what the caller decides to do with the result and in what order, whether or not one overwrites the other. I would like to explore adding mass virtual hosting into the core.
Let me work up a patch, I think I can get to this this weekend... ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-12 14:57 Message: Logged In: YES user_id=184124 Attached is nssock with virtual hosting patch ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-12 13:39 Message: Logged In: YES user_id=184124 Actually, vhost can be combined with nssock, if options given it will enable virtual hosting, if not works as regular sock driver. This way it is always with the core and at the same time independent. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-10 14:38 Message: Logged In: YES user_id=184124 If loaded, vhost module works as Stephen suggested, strips port/host and uses pageroot if other root is not specified. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-10 14:37 Message: Logged In: YES user_id=184124 There is new patch with Stephen's corrections/additions. I think we can provide core module for virtual hosting, i called it nsvhost and we can extend this module to do all sorts of hosting. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-10 11:53 Message: Logged In: YES user_id=184124 And keeping port and www. is sometimes necessary, you can do virtual hosting by port only, IDT does that for example, and many sites work without www. prefix, just stripping them by default may not be appropriate. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-10 11:24 Message: Logged In: YES user_id=184124 >Looks good. However, isn't ns_conn pageroot >unnecessary?
For templating engines ns_info pageroot is the only way to figure out the root, UrlToFile works for fastpath only. >Would it be better to have the servers and pageroot config >in one section, to avoid duplication? They are mutually exclusive, that's why i put them in different sections, if full virtual server is set, pageroots are ignored, this is for AOL like virtual hosting with different servers (rare situation though). So in most cases pageroots will be used only, thus only one simple syntax. As for PageRoot and ServerRoot, i think this is a little confusing. Currently, pageroot returns full path and whoever calls pageroot assumes that it will return full path. Virtual hosting using directories as hostnames is what i am currently using with vhost module and i think it can be included as a standard feature for easy virtualhosting solutions. I do not think this should be the ONLY virtual hosting solution, everybody can write their own modules using SetLocation/SetPageRoot procs or register filter which will set pageroot for each connection. Let me prepare another patch-set with your suggestions included. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-10 01:00 Message: Logged In: YES user_id=87254 Looks good. However, isn't ns_conn pageroot unnecessary? Would it be better to have the servers and pageroot config in one section, to avoid duplication?
ns_section "ns/server/${server}/module/nssock/servers"
ns_param example.com exampleserver
ns_param foo.example.com "exampleserver ${home}/whatever"
Here, the first entry maps the example.com host to the exampleserver and uses its default pageroot. The second entry supplies a new pageroot. How about mass virtual hosting, i.e. where you don't have to explicitly configure each host header to pageroot mapping, but construct the pageroot from the host header at runtime? I've attached the file nsmassvhost.c which implements the above.
It uses the hooks Ns_PageRootProc and Ns_LocationProc which would be unnecessary if the functionality was included as standard. It trims the port and any leading 'www.' from the host header. It would be nice to have this for the static mapping also, as at the moment to be robust you often need 4 mappings for each virtual host. It also uses the function Ns_ServerPath(). The idea here is to introduce the concept of the virtual server root as a distinct location in the file system, where the pageroot is a location below that. I want to change this so that the serverroot is dynamic and based on the host header (when configured), and the pageroot is simply the serverroot with "/pages" (or whatever is configured) appended. It would look something like this:
/srv/server1/pages
/srv/server1/example.com/pages
/srv/server1/example.com/cache
The first path is the default or non-virtual hosted case. server1 is a server defined in the config file and has its own private tcl library. The second path is the pageroot of a virtual host. The third is an example of some data which is specific to the example.com virtual host. So, without virtual hosts:
Ns_ServerRoot() -> /srv/server1
Ns_PageRoot() -> /srv/server1/pages
With virtual hosts (and called in the context of a conn thread):
Ns_ServerRoot() -> /srv/server1/example.com
Ns_PageRoot() -> /srv/server1/example.com/pages
The advantage of this system is that you don't have to restart your server every time you add or remove a virtual host. There is also a convenient location to store data associated with both virtual servers and virtual hosts. Easy to backup, remove, etc. I haven't had time to look at how this would be integrated into what you've got here, maybe at the weekend. Feel free to take a shot at it though :-) Does the scheme outlined above make sense to you?
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1159471&group_id=130646 |
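The mass virtual hosting scheme described in this thread (trim the port and a leading 'www.' from the Host header, then build the pageroot as server root + host + page directory) could look roughly like the sketch below. It is not taken from nsmassvhost.c or from any of the attached patches: NormalizeHost and BuildPageRoot are hypothetical names, the flags only loosely correspond to the stripwww/stripport options mentioned above, and the character whitelist is an added assumption to keep a hostile Host header from escaping the server root. IPv6 literals such as "[::1]" are not handled.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* Copy host into out in lower case, optionally dropping a trailing
 * ":port" and a leading "www.". Returns 0 on success, -1 if the result
 * is empty, does not fit, or contains characters that are unsafe in a
 * directory name. */
static int
NormalizeHost(const char *host, int stripWww, int stripPort,
              char *out, size_t outSize)
{
    size_t len = strlen(host);
    size_t i;

    if (stripPort) {
        const char *colon = strrchr(host, ':');
        if (colon != NULL) {
            len = (size_t) (colon - host);
        }
    }
    if (stripWww && len > 4 && strncasecmp(host, "www.", 4) == 0) {
        host += 4;
        len -= 4;
    }
    if (len == 0 || len >= outSize) {
        return -1;
    }
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char) host[i];

        if (!isalnum(c) && c != '.' && c != '-' && c != ':') {
            return -1;              /* rejects '/', spaces, etc. */
        }
        out[i] = (char) tolower(c);
    }
    out[len] = '\0';
    if (out[0] == '.' || strstr(out, "..") != NULL) {
        return -1;                  /* no "." or ".." path tricks */
    }
    return 0;
}

/* Build "<serverRoot>/<host>/<pageDir>", or "<serverRoot>/<pageDir>"
 * when host is NULL or empty (the non-virtual-hosted case).
 * Returns 0 on success, -1 if the path does not fit in out. */
static int
BuildPageRoot(const char *serverRoot, const char *host, const char *pageDir,
              char *out, size_t outSize)
{
    int n;

    if (host != NULL && host[0] != '\0') {
        n = snprintf(out, outSize, "%s/%s/%s", serverRoot, host, pageDir);
    } else {
        n = snprintf(out, outSize, "%s/%s", serverRoot, pageDir);
    }
    return (n < 0 || (size_t) n >= outSize) ? -1 : 0;
}
```

With a server root of /srv/server1 and a page directory of pages, a Host header of www.Example.com:80 would map to /srv/server1/example.com/pages, which matches the layout proposed above. Because the path is computed per request, adding a virtual host only requires creating its directory.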
From: SourceForge.net <no...@so...> - 2005-03-14 14:03:49
|
Feature Requests item #1159471, was opened at 2005-03-08 17:40 Message generated for change (Comment added) made by sdeasey You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1159471&group_id=130646 Category: C-API Group: None Status: Open Resolution: None Priority: 5 Submitted By: Vlad Seryakov (seryakov) Assigned to: Vlad Seryakov (seryakov) Summary: Virtual Hosting Initial Comment: Okay, i did some digging and testing, looks to be working and much simpler. I tried to simplify default AS 4.x virtual hosting and added pageroot virtual hosting, simple way to use different pageroots on the same server. The change for default virtual hosting is: no defaultserver anymore, the server who registered virtual hosts is default server, so nssock is loaded in the default server, other than that virtual servers are defined the same way. One thing i left is to change ns_info pageroot to use Ns_GetConn() and then if exists use connPtr->pageroot, but this is simple change if you approve current virtual hosting patch. Here is the nsd.tcl config example:
ns_section "ns/server/${server}/module/nssock/servers"
ns_param test vlad.seryakov.com
ns_param test vlad.seryakov.com:80
ns_section "ns/server/${server}/module/nssock/pageroots"
ns_param ${home}/html/test vlad.seryakov.com
ns_param ${home}/html/test vlad.seryakov.com:80
---------------------------------------------------------------------- >Comment By: Stephen Deasey (sdeasey) Date: 2005-03-14 07:03 Message: Logged In: YES user_id=87254 I've added the patch naviserver-4.0.10-server-root-vhost.patch. It adds the following new routines:
Ns_ServerPath()
Ns_PagePath()
Ns_SetServerRootProc()
Ns_ConnLocationAppend()
Ns_SetConnLocationProc()
NsServerRoot()
NsPageRoot()
ns_serverpath ?pathSegment ...?
ns_pagepath ?pathSegment ...?
ns_serverrootproc script ?arg?
ns_locationproc script ?arg?
And the following new configuration options:
ns_section "ns/server/${servername}/vhost"
ns_param enabled false
ns_param prefix ""
ns_param pagedir pages # overrides fastpath/pageroot
ns_param stripwww true
ns_param stripport true
This version of host header based virtual hosting which depends on the existence of the pages directory is a superset of the functionality provided by a static configuration. Applications which call Ns_ConnLocation() will need to be modified to call Ns_ConnLocationAppend() or they will not be vhost aware. I've reinstated the deprecated proc Ns_SetConnLocationProc() and changed its signature. It disappeared from ns.h almost 5 years ago, then reappeared ~2.5 years ago. I don't think this will be a problem. In turn, I've deprecated the Ns_SetLocationProc(). It compiles and the server runs. I haven't had time to test it. Does this look acceptable? ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-12 17:41 Message: Logged In: YES user_id=184124 Just to consider the possibility, instead of mallocing pageroot/location, by default it can use sockPtr->pageroot/sockPtr->location, then when ns_conn pageroot newpageroot called, it will set connPtr->pageroot with malloced string and ns_conn pageroot will check and return it instead of sockPtr->pageroot. This way, no overhead at conn queue and still new pageroot/location can be set in Tcl ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-12 17:02 Message: Logged In: YES user_id=184124 That is the problem with nssock, it is actually an extender of the driver but still somehow kept independent. nssock itself is useless without driver and only used to bind to more than one address for different servers. it could be moved in the core but the problem will be how to define more than one instance.
mallocs are overhead indeed, but once copied they can be used independently and can be set in Tcl by using ns_conn location newLocation or ns_conn pageroot newPageroot. In this case they should be copies. Just do not make mass virtual hosting the only virtual hosting way, being able to change pageroot in the Tcl/C gives developer more flexibility if required. For simple cases, mass virtual hosting is okay. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-12 16:48 Message: Logged In: YES user_id=87254 I'm not keen at all on adding virtual hosting to the nssock driver. There's nothing HTTP specific about the nssock driver and I'd like to keep it that way. There are a couple of problems with the other proposed solution. The paired functions Ns_SetLocationProc()/Ns_ConnSetLocationProc() etc. seem excessive, and the enforced malloc()ing at runtime of the location and pageroot strings is an unwelcome overhead. Using Tls storage is clever but pretty ugly. I think dstrings are the way to go here. I'm not sure Tls is safe in this implementation. The same dstring is used for location and pageroot strings, so it depends what the caller decides to do with the result and in what order, whether or not one overwrites the other. I would like to explore adding mass virtual hosting into the core. Let me work up a patch, I think I can get to this this weekend... ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-12 14:57 Message: Logged In: YES user_id=184124 Attached is nssock with virtual hosting patch ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-12 13:39 Message: Logged In: YES user_id=184124 Actually, vhost can be combined with nssock, if options given it will enable virtual hosting, if not works as regular sock driver.
This way it is always with the core and at the same time independent. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-10 14:38 Message: Logged In: YES user_id=184124 If loaded, vhost module works as Stephen suggested, strips port/host and uses pageroot if other root is not specified. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-10 14:37 Message: Logged In: YES user_id=184124 There is new patch with Stephen's corrections/additions. I think we can provide core module for virtual hosting, i called it nsvhost and we can extend this module to do all sorts of hosting. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-10 11:53 Message: Logged In: YES user_id=184124 And keeping port and www. is sometimes necessary, you can do virtual hosting by port only, IDT does that for example, and many sites work without www. prefix, just stripping them by default may not be appropriate. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-10 11:24 Message: Logged In: YES user_id=184124 >Looks good. However, isn't ns_conn pageroot >unnecessary? For templating engines ns_info pageroot is the only way to figure out the root, UrlToFile works for fastpath only. >Would it be better to have the servers and pageroot config >in one section, to avoid duplication? They are mutually exclusive, that's why i put them in different sections, if full virtual server is set, pageroots are ignored, this is for AOL like virtual hosting with different servers (rare situation though). So in most cases pageroots will be used only, thus only one simple syntax. As for PageRoot and ServerRoot, i think this is a little confusing.
Currently, pageroot returns full path and whoever calls pageroot assumes that it will return full path. Virtual hosting using directories as hostnames is what i am currently using with vhost module and i think it can be included as a standard feature for easy virtualhosting solutions. I do not think this should be the ONLY virtual hosting solution, everybody can write their own modules using SetLocation/SetPageRoot procs or register filter which will set pageroot for each connection. Let me prepare another patch-set with your suggestions included. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-10 01:00 Message: Logged In: YES user_id=87254 Looks good. However, isn't ns_conn pageroot unnecessary? Would it be better to have the servers and pageroot config in one section, to avoid duplication?
ns_section "ns/server/${server}/module/nssock/servers"
ns_param example.com exampleserver
ns_param foo.example.com "exampleserver ${home}/whatever"
Here, the first entry maps the example.com host to the exampleserver and uses its default pageroot. The second entry supplies a new pageroot. How about mass virtual hosting, i.e. where you don't have to explicitly configure each host header to pageroot mapping, but construct the pageroot from the host header at runtime? I've attached the file nsmassvhost.c which implements the above. It uses the hooks Ns_PageRootProc and Ns_LocationProc which would be unnecessary if the functionality was included as standard. It trims the port and any leading 'www.' from the host header. It would be nice to have this for the static mapping also, as at the moment to be robust you often need 4 mappings for each virtual host. It also uses the function Ns_ServerPath(). The idea here is to introduce the concept of the virtual server root as a distinct location in the file system, where the pageroot is a location below that.
I want to change this so that the serverroot is dynamic and based on the host header (when configured), and the pageroot is simply the serverroot with "/pages" (or whatever is configured) appended. It would look something like this:
/srv/server1/pages
/srv/server1/example.com/pages
/srv/server1/example.com/cache
The first path is the default or non-virtual hosted case. server1 is a server defined in the config file and has its own private tcl library. The second path is the pageroot of a virtual host. The third is an example of some data which is specific to the example.com virtual host. So, without virtual hosts:
Ns_ServerRoot() -> /srv/server1
Ns_PageRoot() -> /srv/server1/pages
With virtual hosts (and called in the context of a conn thread):
Ns_ServerRoot() -> /srv/server1/example.com
Ns_PageRoot() -> /srv/server1/example.com/pages
The advantage of this system is that you don't have to restart your server every time you add or remove a virtual host. There is also a convenient location to store data associated with both virtual servers and virtual hosts. Easy to backup, remove, etc. I haven't had time to look at how this would be integrated into what you've got here, maybe at the weekend. Feel free to take a shot at it though :-) Does the scheme outlined above make sense to you? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1159471&group_id=130646 |
From: SourceForge.net <no...@so...> - 2005-03-14 09:52:07
|
Feature Requests item #1156875, was opened at 2005-03-04 20:03 Message generated for change (Comment added) made by vasiljevic You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1156875&group_id=130646 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Zoran Vasiljevic (vasiljevic) Assigned to: Zoran Vasiljevic (vasiljevic) Summary: Watchdog process restarts failed server Initial Comment: We have been using this for quite some time and it proved extremely useful. We doublefork the nsd process and make the first forked instance control the second. The first one (the watchdog) reacts to exit codes and signals caught during the watch and correspondingly restarts the second instance (the worker). Also, we have added the "-restart" option to the "ns_shutdown" command. This just sends the SIGINT to the worker process. The watchdog is handling this signal and respawns the worker automatically. During operation, the watchdog logs events and their cause into the system log file. This looks like:
Feb 28 04:00:05 Develop nsd[19400]: worker: started.
Mar 1 04:00:13 Develop nsd[4475]: watchdog: worker 19400 exited (2).
Mar 1 04:00:15 Develop nsd[21290]: worker: started.
Mar 1 04:00:18 Develop nsd[14705]: watchdog: worker 19399 exited (2).
Mar 1 04:00:20 Develop nsd[21300]: worker: started.
We have done all the changes with "--enable-watchdog" so anybody who needs this feature will have to compile with this option. ---------------------------------------------------------------------- >Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-14 10:52 Message: Logged In: YES user_id=95086 Calling watchdog before or after chroot has no real implications I believe. Also, when the server exits with Ns_Fatal, it should not be restarted, I will look into your changes today.
---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-11 07:27 Message: Logged In: YES user_id=87254 Oh yeah, and I moved the watchdog stuff even earlier, before the prebind and chroot. Hmm, now that I think about it it's only the prebind that really needs to happen after the watchdog is started, to ensure the sockets are always in a sane state... ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-11 07:17 Message: Logged In: YES user_id=87254 Am I reading this right? It looks like if the server calls Ns_Fatal() it will exit with code 1, but 0 and 1 are treated as OK exit codes and the server will not be restarted. I've attached a patch which changes the above, fixes the pid problem in a different way because I completely forgot you already posted below about that small glitch, uses Ns_ParseObjv(), adds the -w switch, and makes a couple of small name changes. I don't want to go overboard with command line switches, but an option to specify not to give up trying to restart the server would be nice. If you have that, you also need to turn off the restart timeout doubling at a certain point. I don't know what the cleanest way to do that is. ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-08 21:09 Message: Logged In: YES user_id=95086 This is correct. As soon as we agree on some implementation, I will put the -i processing and avoid starting watchdog for inittab starts.
---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-08 20:48 Message: Logged In: YES user_id=184124 There still should be possibility to run nsd from inittab, so when -i switch is given, no watchdog should be running, let /sbin/init handle restarts ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-08 17:19 Message: Logged In: YES user_id=95086 small glitch: After applying the patch, change nsmain.c from: nsconf.pid = serverPid; to nsconf.pid = getpid(); otherwise the pid file will contain the bogus server pid if the watchdog restarted it later. ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-08 16:26 Message: Logged In: YES user_id=95086 Another try... Important to note: SIGKILL (signal 9) cannot be handled hence if somebody kills the watchdog with SIGKILL, the server will be left lingering w/o the watchdog. This is important to know. I do not see any possibility how to recover in such cases (i.e. how to stop the server). Apart from that, all objections from Stephen are taken into account. Please try again. A new copy of watchdog.patch file is attached. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-08 06:20 Message: Logged In: YES user_id=184124 I just found this code in my old server sources, i just changed internal name to Ns/NS, used to run pretty stable.
-------------------------------------------------------------------
#define NS_EXIT 99

static void NsTerminate(int sig)
{
    printf("NS[%d] signal %d received...",getpid(),sig);
    // Kill the server with the same signal
    if(nsPid > 0) kill(nsPid,sig);
    // Exit in case of fatal signal
    if(sig != SIGHUP) {
        printf("NS[%d] terminating...",getpid());
        exit(0);
    }
    // Reassign signal handler
    signal(sig,NsTerminate);
}

void NsWatchdog(int argc, char *argv[], char *envp[])
{
    int failcount = 0;
    time_t start;
    int status;
    pid_t pid;

    signal(SIGTTOU,SIG_IGN);
    signal(SIGTTIN,SIG_IGN);
    signal(SIGTSTP,SIG_IGN);
    signal(SIGPIPE,SIG_IGN);
    signal(SIGQUIT,SIG_IGN);
    signal(SIGHUP,NsTerminate);
    signal(SIGINT,NsTerminate);
    signal(SIGTERM,NsTerminate);

    // Go background
    if((pid = fork())) {
        if(pid < 0) err_logger("warpConfigure: fork: %s",strerror(errno));
        exit(0);
    }
    setsid();

    for(;;) {
        // Execute the real server
        nsPid = fork();
        // Child, continue as server
        if(nsPid == 0) {
            exit(nsMain(argc, argv, ServerInit));
        }
        /* parent, behaves like a guardian */
        time(&start);
        printf("NS[%d] server process started",getpid());
        pid = waitpid(-1, &status, 0);
        if(WIFEXITED(status))
            printf("NS[%d] child process exited with status %d",pid,WEXITSTATUS(status));
        else if(WIFSIGNALED(status))
            printf("NS[%d] child process exited due to signal %d",pid,WTERMSIG(status));
        else
            printf("NS[%d] child process exited", pid);
        // Special exit code
        if(WIFEXITED(status)) {
            if(WEXITSTATUS(status) == NS_EXIT) {
                printf("NS[%d] child configuration error, exiting",getpid());
                exit(0);
            } else if(WEXITSTATUS(status) == SIGHUP) {
            } else {
                if(time(0) - start < 10) failcount++;
                if(failcount == 10) {
                    printf("Exiting due to repeated, frequent failures");
                    exit(1);
                }
            }
        }
        sleep(3);
    }
}
----------------------------------------------------------------------
Comment By: Stephen Deasey (sdeasey) Date: 2005-03-08 05:23 Message: Logged In: YES user_id=87254 NsWatchdog() is called after the server drops root privs, so both the watchdog and the server run as
the defined user. What happens if the server dies and on restart needs to rebind privileged ports? A ps listing shows two running processes, parent and child. If I kill either one, the watchdog dies, the server continues to run. If I kill -9 the parent, the child continues to run. If I kill -9 the child, the server is restarted. Something seems not quite right here... I'm a bit confused about how the code works. For example, NsWatchdog() seems to ignore all of its arguments? Here's the code which calls it:
if (mode == 0) {
    i = ns_fork();
    if (i < 0) {
        Ns_Fatal("nsmain: fork() failed: '%s'", strerror(errno));
    }
    if (i > 0) {
        return 0;
    }
    setsid();
    i = NsWatchdog(argc, argv, initProc);
    if (i < 0) {
        Ns_Fatal("nsmain: watchdog failed: '%s'", strerror(errno));
    }
    if (i > 0) {
        return 0;
    }
    nsconf.pid = getpid();
}
NsWatchdog() says it returns the worker pid, it also sets the global variable wpid. The code above ignores the returned value, and the global variable, and instead calls getpid()... The variable 'i' is existing code, but still somehow doesn't seem suitable... Could that Ns_Fatal() above be moved into the NsWatchdog() function? I think maybe some comment is needed here. The code structure is very like that just above where ns_fork() is called, but this function will return *multiple* times, right? This is kind of surprising, and an extra twist on the already confusing fork() semantics (or maybe it's just me who gets confused by fork...). A return value of 0 here is a 'request for orderly shutdown', right? How about some more logging to syslog? For example, distinguish between start and restart. Mention when the MAX_SLEEP_PERIOD has been reached, etc. Couple of small things: Can we refer to 'the server', rather than 'the worker'? Worker and Watchdog begin with 'w', and so does the global variable wpid... Maybe serverPid? NsWatchdog is a static function, it doesn't need the Ns prefix.
In NsWatchdog(), the variable 'run' should be something like 'nretries', 'nap' should be something more like 'retrySeconds' and MAX_SLEEP_PERIOD should be MAX_RETRY_SECONDS. The comment for WaitForWorker() is mislabeled. Should SigHandler() be called WatchdogSigHandler()? ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-07 00:03 Message: Logged In: YES user_id=184124 Looks good to me ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-05 13:06 Message: Logged In: YES user_id=95086 Ah, correction: The restart option sends SIGINT to the worker process which causes the watchdog to restart it. And, the patchfile is now attached! ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-05 13:03 Message: Logged In: YES user_id=95086 Here is the patch. I have added a "-restart" option to "ns_shutdown". It is rather clumsy to parse but should do. We should rewrite this with your args parsing routine. The restart option sends SIGTERM to the worker process which causes the watchdog to restart it. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-05 00:46 Message: Logged In: YES user_id=87254 I don't think this has to hide behind a config option. It's either a good idea or it's not. Sounds good to me. Is there a patch? I'm wondering about some of the implementation details... ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1156875&group_id=130646 |
From: SourceForge.net <no...@so...> - 2005-03-13 17:17:51
|
Feature Requests item #1159307, was opened at 2005-03-08 19:41 Message generated for change (Comment added) made by seryakov You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1159307&group_id=130646 Category: None Group: None >Status: Closed Resolution: None Priority: 5 Submitted By: Vlad Seryakov (seryakov) Assigned to: Vlad Seryakov (seryakov) Summary: Move nsext, nspd from core to modules directory Initial Comment: Last time i used those modules was when i used the sybase driver, actually those things were created for sybase drivers only. Apart from that, all other DB modules use native access, and even for sybase FreeTDS exists with a working module (i use it for SQL Server access). No point keeping this. ---------------------------------------------------------------------- >Comment By: Vlad Seryakov (seryakov) Date: 2005-03-13 17:17 Message: Logged In: YES user_id=184124 moved ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-13 01:41 Message: Logged In: YES user_id=87254 You can file a SourceForge request ticket to get them to move stuff about, but I don't think this is necessary. We have no history to preserve. Just remove these modules from their current location and import them fresh into a modules directory. So, /cvsroot/naviserver/naviserver is the current location of the server, and /cvsroot/naviserver/modules/nsext will be the location of the nsext module. Keeping all non-core modules in their own directory will make it easier to check out all at once, generate a website, etc. I think the Makefile will need touching up to build out of core. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-13 00:48 Message: Logged In: YES user_id=184124 We do not have a contrib or modules directory, but yes, we move it from the core. I am not sure how to do this with CVS.
---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-13 00:32 Message: Logged In: YES user_id=87254 If by "get rid of" you mean extract into a non-core module, I'm all for this. Sounds good to me. ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-08 22:51 Message: Logged In: YES user_id=95086 I have written a Tcl threading extension, and part of it is thread shared arrays (aka nsv's) with a bind option to external key/value databases (like gdbm). So we actually store all things in thread shared arrays which are bound to persistent gdbm databases on the filesystem. It's all transparent to the application. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-08 20:21 Message: Logged In: YES user_id=184124 Not related to nsext, just out of curiosity, how do you keep any state, especially between multiple servers? ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-08 20:13 Message: Logged In: YES user_id=95086 We do not use any database modules at all, hence I do not really care. I do think that any important database vendor has thread-safe libraries nowadays, so the task of writing an in-process driver should not be a problem any more. Besides, AFAIK, all popular databases already have a corresponding in-process driver implementation. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1159307&group_id=130646 |
From: SourceForge.net <no...@so...> - 2005-03-13 01:50:51
|
Feature Requests item #1119365, was opened at 2005-02-09 08:43 Message generated for change (Settings changed) made by sdeasey You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1119365&group_id=130646 >Category: C-API Group: None Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Vlad Seryakov (seryakov) Assigned to: Vlad Seryakov (seryakov) Summary: UDP, Unix and raw sockets for binder Initial Comment: I find the current binder and -b command line option not very useful or easy to administer. Can we revert back to the 3.x binder and combine it with a simple watchdog process, so on start ns forks a watchdog/binder process, checks for exit status and accepts requests for socket allocation. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-02-16 12:21 Message: Logged In: YES user_id=184124 Watchdog is separate issue, binder extended. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-02-11 07:05 Message: Logged In: YES user_id=184124 In case of icmp sockets, they are just raw sockets without any ports, so for example for the SNMP monitoring package i have, i need 100 sockets pre-opened that i can use to perform pings. So, i specify -b 0/icmp/100 just to keep the syntax the same as for other protocols. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-02-10 20:59 Message: Logged In: YES user_id=87254 Yeah, 0.0.0.0 works for me, too :-) I don't know enough about icmp/raw sockets to know if that makes sense (what is 'count' for?), but /protocol looks good. I haven't looked at this at all for multi-protocol stuff, but it's obviously desirable. It's great that Vlad's already done this and I don't have to!
Although I run daemontools and Linux exclusively, I do have one instance where I ship an AOLserver out to customers and daemontools is just not appropriate there. So Zoran's watchdog sounds great, too. ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-02-10 12:39 Message: Logged In: YES user_id=95086 Aha! This I will have to check and see if this works for me also. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-02-10 12:30 Message: Logged In: YES user_id=184124 I use -b 0.0.0.0 currently, 4.0.10 and 4.1 worked fine. When using -b 0.0.0.0 the address for nssock should be 0.0.0.0 as well, not empty. ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-02-10 10:33 Message: Logged In: YES user_id=95086 This is what I've done (nsd/driver.c):
    void
    NsStartDrivers(void)
    {
        Driver *drvPtr;

        /*
         * Listen on all drivers.
         */

        drvPtr = firstDrvPtr;
        while (drvPtr != NULL) {
    #if 0 /* zv */
            drvPtr->sock = Ns_SockListenEx(drvPtr->bindaddr, drvPtr->port, drvPtr->backlog);
    #else
            /* Listen on all known interfaces/addresses */
            drvPtr->sock = Ns_SockListenEx(NULL, drvPtr->port, drvPtr->backlog);
    #endif
I tried -b 0.0.0.0 but somehow it didn't work hence I reverted to heavy guns :) ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-02-10 10:14 Message: Logged In: YES user_id=184124 Do you mean -b 0.0.0.0:80? It will listen on all interfaces for port 80? ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-02-10 03:55 Message: Logged In: YES user_id=95086 Eventually, the watchdog/binder combination seems not the way to go (privileges issue). I see.
Concerning binder: I would need an option (maybe even a compile-time one, don't care) for the server to listen on all network interfaces, including loopback. Apart from this, I have no immediate or mid-term needs to modify anything there so I suppose I will be perfectly happy with any changes you guys need in this area. Concerning watchdog: we can't use daemontools or init. Also, we do have Windows as a platform, remember. We have struggled to get a *minimum* interface to the rest of the system, hence our product is easily installable and removable from the system. Actually, the only point we have in common (config-wise) are entries in the startup machinery. Therefore, a control-process from within the nsd is ideal for our needs. I have done this with --enable-watchdog and it is #ifdef'ed in the code so for just about any other user of the server this is pretty invisible and obscure (no backward compat problems also). ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-02-09 14:00 Message: Logged In: YES user_id=184124 Here is the new syntax i added to the binder:
    addr:port[/protocol]
    port[/protocol]
    0/icmp[/count]
where protocol can be tcp, udp, unix; icmp is a special case of raw socket, added by count. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-02-09 13:55 Message: Logged In: YES user_id=87254 Adding support for socket types other than TCP to the binder sounds like a great idea. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-02-09 12:55 Message: Logged In: YES user_id=184124 I agree, i use inittab for aolserver myself, calling a bash script which parses nsd.tcl for address/port keywords and builds -b options on the fly. In case of inittab, no need for a watchdog as well.
It is not even on my wish list, i just figured if the watchdog is to be running as root, then it can include binder support. Let's put aside the binder-as-separate-process issue for a while, but i'd like to add to the regular 4.x binder what i did for supporting UDP/TCP/UNIX/RAW sockets. They still need to be pre-bound but then can be used from within AS, this is how the snmp and dns modules use those sockets. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-02-09 12:52 Message: Logged In: YES user_id=87254 Hmm... I guess you could split the config into multiple pieces, let admins manage everything except address and port to bind to. But would we mandate that in the default set-up, or just hope people didn't screw up in practice? ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-02-09 12:49 Message: Logged In: YES user_id=87254 I think the issue here is that you need root privs to bind to port 80. The address and port are specified in the config file. The config file is a Tcl script. So to allow someone to change your webserver config file you need to give away root privs on your server. I'm not super keen on this :-) IIRC in 3.x there was a lot of code to make sure that the config script was evaled in a safe interp, and the motivation for the binder was code reduction/simplification. This also impacts the chroot functionality. Ideally you want to chroot as the last thing you do before dropping root privs so that you can open all your files, link to libraries etc., and not have to recreate a complete copy of your environment in the jail. But you need to know which directory that is, and that's in the config file, which is a script... Personally, I use daemontools to manage my servers. You need to create a simple wrapper script for this to work and I find that is the ideal place to set some environment variables for the IP, PORT etc.
I can pass those along on the command line and extract them from the env in the tcl config script with [env get IP] and so on. The binder is a neat solution, but in practice it trips a lot of people up and it's just a pain. Giving away root is not a great solution though... :-( ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-02-09 10:23 Message: Logged In: YES user_id=95086 Honestly, I did forget already how 3.x did the binding :-What we do in 4.0 is to bind on all interfaces (I modified that in our private version). But I'm open to all variants. I suppose I should look back in 3.x code how that was done... Or, you have a crystal-clear picture already in which case I will simply believe you ;-) As soon as we create the sandbox, we can start hacking this in. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1119365&group_id=130646 |
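As an illustration of the binder syntax listed in this thread (addr:port[/protocol], port[/protocol], 0/icmp[/count]), here is a minimal sketch of how such a spec could be split into fields. It is not taken from the patch: BindSpec, parse_bind_spec and MAX_ICMP_SOCKETS are hypothetical names, defaulting to tcp and to a count of 1 is an assumption, and since the thread does not show how unix socket specs are written the sketch only accepts the protocol name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ICMP_SOCKETS 1024   /* hypothetical upper bound on [/count] */

/* Hypothetical holder for one parsed -b spec. */
typedef struct BindSpec {
    char addr[64];   /* empty string when no address was given */
    int  port;       /* 0..65535 */
    char proto[8];   /* "tcp", "udp", "unix" or "icmp"; "tcp" if omitted (assumption) */
    int  count;      /* number of sockets, icmp only; 1 if omitted (assumption) */
} BindSpec;

/* Parse one spec; returns 0 on success, -1 on a malformed spec. */
int parse_bind_spec(const char *spec, BindSpec *out)
{
    char buf[128];
    char *slash, *rest = NULL, *colon, *portstr, *end;
    long n;

    assert(spec != NULL && out != NULL);
    if (strlen(spec) >= sizeof(buf)) {
        return -1;
    }
    strcpy(buf, spec);
    memset(out, 0, sizeof(*out));
    strcpy(out->proto, "tcp");
    out->count = 1;

    /* Split off the optional "/protocol[/count]" suffix. */
    slash = strchr(buf, '/');
    if (slash != NULL) {
        *slash++ = '\0';
        rest = strchr(slash, '/');
        if (rest != NULL) {
            *rest++ = '\0';
            n = strtol(rest, &end, 10);
            if (*rest == '\0' || *end != '\0' || n < 1 || n > MAX_ICMP_SOCKETS) {
                return -1;
            }
            out->count = (int) n;
        }
        if (strcmp(slash, "tcp") != 0 && strcmp(slash, "udp") != 0
            && strcmp(slash, "unix") != 0 && strcmp(slash, "icmp") != 0) {
            return -1;
        }
        if (rest != NULL && strcmp(slash, "icmp") != 0) {
            return -1;              /* a count is only listed for icmp */
        }
        strcpy(out->proto, slash);  /* longest accepted name is 4 chars */
    }

    /* What is left is either "addr:port" or just "port". */
    colon = strrchr(buf, ':');
    if (colon != NULL) {
        *colon++ = '\0';
        if (strlen(buf) >= sizeof(out->addr)) {
            return -1;
        }
        strcpy(out->addr, buf);
        portstr = colon;
    } else {
        portstr = buf;
    }
    n = strtol(portstr, &end, 10);
    if (*portstr == '\0' || *end != '\0' || n < 0 || n > 65535) {
        return -1;
    }
    out->port = (int) n;
    return 0;
}
```

With this sketch, "0.0.0.0:80" yields address 0.0.0.0, port 80 and the assumed default protocol tcp, while "0/icmp/100" yields port 0, protocol icmp and a count of 100, matching the -b 0/icmp/100 example mentioned in the thread.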
From: SourceForge.net <no...@so...> - 2005-03-13 01:47:38
|
Feature Requests item #1161597, was opened at 2005-03-11 19:30 Message generated for change (Comment added) made by seryakov You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1161597&group_id=130646 Category: C-API Group: None Status: Open Resolution: None Priority: 5 Submitted By: Vlad Seryakov (seryakov) Assigned to: Vlad Seryakov (seryakov) Summary: Extend ns_info with info about traces/filters/procs Initial Comment: Attached is a patch that extends ns_info with 3 commands: ns_info traces, ns_info filters, ns_info requestprocs. They work the same way as ns_info callbacks, no functionality changed or added, just information commands ---------------------------------------------------------------------- >Comment By: Vlad Seryakov (seryakov) Date: 2005-03-13 01:47 Message: Logged In: YES user_id=184124 Also, it would be useful to put some info in the docs on how to get info about binary callbacks:
    Welcome to ossweb running at /usr/local/ns/bin/nsd (pid 5524)
    ossweb:nscp 1> ns_info traces
    p:0xb7187be0 a:0x80817d8

    # gdb
    (gdb) attach 12345
    (gdb) info symbol 0xb7187be0
    LogTrace in section .text
---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-13 01:44 Message: Logged In: YES user_id=184124 I started with minimal changes, #defines and & syntax can be changed of course. Now that you started TclCallbacks, this thing depends on when you commit your changes, it requires Ns_ArgProc and ns_info stuff. Once we have a unified API for callbacks i will change it to be server-specific as well, it took a while to figure out how it works :-)) ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-13 01:37 Message: Logged In: YES user_id=87254 This is a nice addition. ns_info filters/traces is server specific, but ns_info requestprocs returns info for all servers. Can you make this per-server also, to be consistent?
The URL walking procs look a little tricky, I think they would benefit from having their own more detailed comments. The comment currently says it calls a function for each node, and the effect depends on the function. But the function signature is Ns_ArgProc so there's really only one thing it can do. Why does it fix the stack size at 512? Can these magic numbers be #define'd? Using a while loop rather than an empty for loop would be a little easier to read. There's a lot of this kind of thing: (&(triePtr->branches))->n Could this be simplified to: triePtr->branches.n ? Otherwise, looks pretty good. I like it. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1161597&group_id=130646 |
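To illustrate the simplification asked about in the review above: when branches is a struct member embedded directly in the trie (not a pointer), taking its address and dereferencing it again is a no-op, so both spellings read the same field. The Branches and Trie types below are hypothetical stand-ins, not the real NaviServer definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real structures. */
typedef struct Branches {
    int n;                  /* number of branches */
} Branches;

typedef struct Trie {
    Branches branches;      /* embedded by value, not a pointer */
} Trie;

/* The spelling criticised in the review. */
int branch_count_verbose(const Trie *triePtr)
{
    assert(triePtr != NULL);
    return (&(triePtr->branches))->n;
}

/* The proposed simplification; same member, same value. */
int branch_count_simple(const Trie *triePtr)
{
    assert(triePtr != NULL);
    return triePtr->branches.n;
}
```

The two functions return the same value for any valid Trie, so under this assumption the change would be purely cosmetic.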
From: SourceForge.net <no...@so...> - 2005-03-13 01:47:19
|
Feature Requests item #1159307, was opened at 2005-03-08 12:41 Message generated for change (Settings changed) made by sdeasey You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1159307&group_id=130646 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Vlad Seryakov (seryakov) Assigned to: Vlad Seryakov (seryakov) >Summary: Move nsext, nspd from core to modules directory Initial Comment: Last time i used those modules was when i used the sybase driver, actually those things were created for sybase drivers only. Apart from that, all other DB modules use native access, and even for sybase FreeTDS exists with a working module (i use it for SQL Server access). No point keeping this. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-12 18:41 Message: Logged In: YES user_id=87254 You can file a SourceForge request ticket to get them to move stuff about, but I don't think this is necessary. We have no history to preserve. Just remove these modules from their current location and import them fresh into a modules directory. So, /cvsroot/naviserver/naviserver is the current location of the server, and /cvsroot/naviserver/modules/nsext will be the location of the nsext module. Keeping all non-core modules in their own directory will make it easier to check out all at once, generate a website, etc. I think the Makefile will need touching up to build out of core. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-12 17:48 Message: Logged In: YES user_id=184124 We do not have a contrib or modules directory, but yes, we move it from the core. I am not sure how to do this with CVS.
---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-12 17:32 Message: Logged In: YES user_id=87254 If by "get rid of" you mean extract into a non-core module, I'm all for this. Sounds good to me. ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-08 15:51 Message: Logged In: YES user_id=95086 I have written a Tcl threading extension, and part of it is thread shared arrays (aka nsv's) with a bind option to external key/value databases (like gdbm). So we actually store all things in thread shared arrays which are bound to persistent gdbm databases on the filesystem. It's all transparent to the application. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-08 13:21 Message: Logged In: YES user_id=184124 Not related to nsext, just out of curiosity, how do you keep any state, especially between multiple servers? ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-08 13:13 Message: Logged In: YES user_id=95086 We do not use any database modules at all, hence I do not really care. I do think that any important database vendor has thread-safe libraries nowadays, so the task of writing an in-process driver should not be a problem any more. Besides, AFAIK, all popular databases already have a corresponding in-process driver implementation. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1159307&group_id=130646 |
From: SourceForge.net <no...@so...> - 2005-03-13 01:44:55
|
Feature Requests item #1161597, was opened at 2005-03-11 19:30 Message generated for change (Comment added) made by seryakov You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1161597&group_id=130646 Category: C-API Group: None Status: Open Resolution: None Priority: 5 Submitted By: Vlad Seryakov (seryakov) Assigned to: Vlad Seryakov (seryakov) Summary: Extend ns_info with info about traces/filters/procs Initial Comment: Attached is a patch that extends ns_info with 3 commands: ns_info traces, ns_info filters, ns_info requestprocs. They work the same way as ns_info callbacks, no functionality changed or added, just information commands ---------------------------------------------------------------------- >Comment By: Vlad Seryakov (seryakov) Date: 2005-03-13 01:44 Message: Logged In: YES user_id=184124 I started with minimal changes, #defines and & syntax can be changed of course. Now that you started TclCallbacks, this thing depends on when you commit your changes, it requires Ns_ArgProc and ns_info stuff. Once we have a unified API for callbacks i will change it to be server-specific as well, it took a while to figure out how it works :-)) ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-13 01:37 Message: Logged In: YES user_id=87254 This is a nice addition. ns_info filters/traces is server specific, but ns_info requestprocs returns info for all servers. Can you make this per-server also, to be consistent? The URL walking procs look a little tricky, I think they would benefit from having their own more detailed comments. The comment currently says it calls a function for each node, and the effect depends on the function. But the function signature is Ns_ArgProc so there's really only one thing it can do. Why does it fix the stack size at 512? Can these magic numbers be #define'd? Using a while loop rather than an empty for loop would be a little easier to read.
There's a lot of this kind of thing: (&(triePtr->branches))->n Could this be simplified to: triePtr->branches.n ? Otherwise, looks pretty good. I like it. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1161597&group_id=130646 |
From: SourceForge.net <no...@so...> - 2005-03-13 01:44:55
|
Feature Requests item #1156875, was opened at 2005-03-04 12:03 Message generated for change (Settings changed) made by sdeasey You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1156875&group_id=130646 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Zoran Vasiljevic (vasiljevic) Assigned to: Zoran Vasiljevic (vasiljevic) >Summary: Watchdog process restarts failed server Initial Comment: We have been using this for quite some time and it proved extremely useful. We doublefork the nsd process and make the first forked instance control the second. The first one (the watchdog) reacts on exit codes and signals caught during the watch and correspondingly restarts the second instance (the worker). Also, we have added the "-restart" option to the "ns_shutdown" command. This just sends the SIGINT to the worker process. The watchdog is handling this signal and respawns the worker automatically. During operation, the watchdog logs events and their cause into the system log file. This looks like: Feb 28 04:00:05 Develop nsd[19400]: worker: started. Mar 1 04:00:13 Develop nsd[4475]: watchdog: worker 19400 exited (2). Mar 1 04:00:15 Develop nsd[21290]: worker: started. Mar 1 04:00:18 Develop nsd[14705]: watchdog: worker 19399 exited (2). Mar 1 04:00:20 Develop nsd[21300]: worker: started. We have done all the changes with "--enable-watchdog" so anybody who needs this feature will have to compile with this option. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-10 23:27 Message: Logged In: YES user_id=87254 Oh yeah, and I moved the watchdog stuff even earlier, before the prebind and chroot. Hmm, now that I think about it, it's only the prebind that really needs to happen after the watchdog is started, to ensure the sockets are always in a sane state...
---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-10 23:17 Message: Logged In: YES user_id=87254 Am I reading this right? It looks like if the server calls Ns_Fatal() it will exit with code 1, but 0 and 1 are treated as OK exit codes and the server will not be restarted. I've attached a patch which changes the above, fixes the pid problem in a different way because I completely forgot you already posted below about that small glitch, uses Ns_ParseObjv(), adds the -w switch, and makes a couple of small name changes. I don't want to go overboard with command line switches, but an option to specify not to give up trying to restart the server would be nice. If you have that, you also need to turn off the restart timeout doubling at a certain point. I don't know what the cleanest way to do that is. ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-08 13:09 Message: Logged In: YES user_id=95086 This is correct. As soon as we agree on some implementation, I will put in the -i processing and avoid starting the watchdog for inittab starts. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-08 12:48 Message: Logged In: YES user_id=184124 There still should be a possibility to run nsd from inittab, so when the -i switch is given, no watchdog should be running, let /sbin/init handle restarts ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-08 09:19 Message: Logged In: YES user_id=95086 small glitch: After applying the patch, change nsmain.c from: nsconf.pid = serverPid; to nsconf.pid = getpid(); otherwise the pid file will contain the bogus server pid if the watchdog restarted it later.
---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-08 08:26 Message: Logged In: YES user_id=95086 Another try... Important to note: SIGKILL (signal 9) cannot be handled, hence if somebody kills the watchdog with SIGKILL, the server will be left lingering w/o the watchdog. This is important to know. I do not see any possibility how to recover in such cases (i.e. how to stop the server). Apart from that, all objections from Stephen are taken into account. Please try again. A new copy of watchdog.patch file is attached. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-07 22:20 Message: Logged In: YES user_id=184124 I just found this code in my old server sources, i just changed internal name to Ns/NS, used to run pretty stable.
-------------------------------------------------------------------
#define NS_EXIT 99

static void NsTerminate(int sig)
{
    printf("NS[%d] signal %d received...",getpid(),sig);
    // Kill the server with the same signal
    if(nsPid > 0) kill(nsPid,sig);
    // Exit in case of fatal signal
    if(sig != SIGHUP) {
        printf("NS[%d] terminating...",getpid());
        exit(0);
    }
    // Reassign signal handler
    signal(sig,NsTerminate);
}

void NsWatchdog(int argc, char *argv[], char *envp[])
{
    int failcount = 0;
    time_t start;
    int status;
    pid_t pid;

    signal(SIGTTOU,SIG_IGN);
    signal(SIGTTIN,SIG_IGN);
    signal(SIGTSTP,SIG_IGN);
    signal(SIGPIPE,SIG_IGN);
    signal(SIGQUIT,SIG_IGN);
    signal(SIGHUP,NsTerminate);
    signal(SIGINT,NsTerminate);
    signal(SIGTERM,NsTerminate);
    // Go background
    if((pid = fork())) {
        if(pid < 0) err_logger("warpConfigure: fork: %s",strerror(errno));
        exit(0);
    }
    setsid();
    for(;;) {
        // Execute the real server
        nsPid = fork();
        // Child, continue as server
        if(nsPid == 0) {
            exit(nsMain(argc, argv, ServerInit));
        }
        /* parent, behaves like a guardian */
        time(&start);
        printf("NS[%d] server process started",getpid());
        pid = waitpid(-1, &status, 0);
        if(WIFEXITED(status))
            printf("NS[%d] child process exited with status %d",pid,WEXITSTATUS(status));
        else if(WIFSIGNALED(status))
            printf("NS[%d] child process exited due to signal %d",pid,WTERMSIG(status));
        else
            printf("NS[%d] child process exited", pid);
        // Special exit code
        if(WIFEXITED(status)) {
            if(WEXITSTATUS(status) == NS_EXIT) {
                printf("NS[%d] child configuration error, exiting",getpid());
                exit(0);
            } else if(WEXITSTATUS(status) == SIGHUP) {
            } else {
                if(time(0) - start < 10) failcount++;
                if(failcount == 10) {
                    printf("Exiting due to repeated, frequent failures");
                    exit(1);
                }
            }
        }
        sleep(3);
    }
}
---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-07 21:23 Message: Logged In: YES user_id=87254 NsWatchdog() is called after the server drops root privs, so both the watchdog and the server run as the defined user. What happens if the server dies and on restart needs to rebind privileged ports? A ps listing shows two running processes, parent and child. If I kill either one, the watchdog dies, the server continues to run. If I kill -9 the parent, the child continues to run. If I kill -9 the child, the server is restarted. Something seems not quite right here... I'm a bit confused about how the code works. For example, NsWatchdog() seems to ignore all of its arguments? Here's the code which calls it:
    if (mode == 0) {
        i = ns_fork();
        if (i < 0) {
            Ns_Fatal("nsmain: fork() failed: '%s'", strerror(errno));
        }
        if (i > 0) {
            return 0;
        }
        setsid();
        i = NsWatchdog(argc, argv, initProc);
        if (i < 0) {
            Ns_Fatal("nsmain: watchdog failed: '%s'", strerror(errno));
        }
        if (i > 0) {
            return 0;
        }
        nsconf.pid = getpid();
    }
NsWatchdog() says it returns the worker pid, it also sets the global variable wpid. The code above ignores the returned value, and the global variable, and instead calls getpid()... The variable 'i' is existing code, but still somehow doesn't seem suitable...
Could that Ns_Fatal() above be moved into the NsWatchdog() function? I think maybe some comment is needed here. The code structure is very like that just above where ns_fork() is called, but this function will return *multiple* times, right? This is kind of surprising, and an extra twist on the already confusing fork() semantics (or maybe it's just me who gets confused by fork...). A return value of 0 here is a 'request for orderly shutdown', right? How about some more logging to syslog? For example, distinguish between start and restart. Mention when the MAX_SLEEP_PERIOD has been reached, etc. Couple of small things: Can we refer to 'the server', rather than 'the worker'? Worker and Watchdog begin with 'w', and so does the global variable wpid... Maybe serverPid? NsWatchdog is a static function, it doesn't need the Ns prefix. In NsWatchdog(), the variable 'run' should be something like 'nretries', 'nap' should be something more like 'retrySeconds' and MAX_SLEEP_PERIOD should be MAX_RETRY_SECONDS. The comment for WaitForWorker() is mislabeled. Should SigHandler() be called WatchdogSigHandler()? ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-06 16:03 Message: Logged In: YES user_id=184124 Looks good to me ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-05 05:06 Message: Logged In: YES user_id=95086 Ah, correction: The restart option sends SIGINT to the worker process which causes the watchdog to restart it. And, the patchfile is now attached! ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-05 05:03 Message: Logged In: YES user_id=95086 Here is the patch. I have added a "-restart" option to "ns_shutdown". It is rather clumsy to parse but should do. We should rewrite this with your args parsing routine.
The restart option sends SIGTERM to the worker process which causes the watchdog to restart it. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-04 16:46 Message: Logged In: YES user_id=87254 I don't think this has to hide behind a config option. It's either a good idea or it's not. Sounds good to me. Is there a patch? I'm wondering about some of the implementation details... ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1156875&group_id=130646 |
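The restart policy discussed in this thread (stop on a configuration error, give up after repeated quick failures, otherwise respawn) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the attached patch: EXIT_CONFIG_ERROR stands in for NS_EXIT, whose real value is not shown in the thread, and wd_next_action() and wd_run_child() are hypothetical names. Unlike the quoted fragment, this sketch also counts a quick death by signal as a failure.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define EXIT_CONFIG_ERROR 2   /* assumption: placeholder for NS_EXIT */
#define MIN_UPTIME_SECS   10  /* a child dying sooner counts as a quick failure */
#define MAX_FAST_FAILURES 10  /* give up after this many quick failures */

typedef enum { WD_RESTART, WD_STOP_CONFIG, WD_STOP_FAILURES } wd_action;

/*
 * Decide what the watchdog should do after the child has exited.
 * 'status' is the value filled in by waitpid(), 'uptime_secs' is how
 * long the child ran, and '*failcount' carries the quick-failure count
 * from one call to the next.
 */
wd_action wd_next_action(int status, long uptime_secs, int *failcount)
{
    assert(failcount != NULL);

    if (WIFEXITED(status) && WEXITSTATUS(status) == EXIT_CONFIG_ERROR) {
        return WD_STOP_CONFIG;          /* restarting cannot fix a bad config */
    }
    if (uptime_secs < MIN_UPTIME_SECS) {
        (*failcount)++;
        if (*failcount >= MAX_FAST_FAILURES) {
            return WD_STOP_FAILURES;    /* repeated, frequent failures */
        }
    }
    return WD_RESTART;
}

/*
 * Fork a child that exits immediately with 'code' and return its
 * waitpid() status, or -1 on error. A real watchdog would run the
 * server in the child instead of calling _exit() right away.
 */
int wd_run_child(int code)
{
    int   status = 0;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        _exit(code);
    }
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return status;
}
```

Keeping the decision separate from the fork/wait loop makes it easy to log the reason for each restart, which is what the review comments above ask for.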
From: SourceForge.net <no...@so...> - 2005-03-13 01:44:07
|
Feature Requests item #1119365, was opened at 2005-02-09 08:43 Message generated for change (Settings changed) made by sdeasey You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1119365&group_id=130646 Category: None Group: None Status: Closed Resolution: None Priority: 5 Submitted By: Vlad Seryakov (seryakov) Assigned to: Vlad Seryakov (seryakov) >Summary: UDP, Unix and raw sockets for binder Initial Comment: I find the current binder and -b command line option not very useful or easy to administer. Can we revert back to the 3.x binder and combine it with a simple watchdog process, so that on start ns forks a watchdog/binder process which checks for exit status and accepts requests for socket allocation. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-02-16 12:21 Message: Logged In: YES user_id=184124 Watchdog is separate issue, binder extended. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-02-11 07:05 Message: Logged In: YES user_id=184124 In case of icmp sockets, they are just raw sockets without any ports, so for example for the SNMP monitoring package i have, i need 100 sockets pre-opened that i can use to perform pings. So, i specify -b 0/icmp/100 just to keep the syntax the same as with other protocols. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-02-10 20:59 Message: Logged In: YES user_id=87254 Yeah, 0.0.0.0 works for me, too :-) I don't know enough about icmp/raw sockets to know if that makes sense (what is 'count' for?), but /protocol looks good. I haven't looked at this at all for multi-protocol stuff, but it's obviously desirable. It's great that Vlad's already done this and I don't have to! 
Although I run daemontools and Linux exclusively, I do have one instance where I ship an AOLserver out to customers and daemontools is just not appropriate there. So Zoran's watchdog sounds great, too.
----------------------------------------------------------------------
Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-02-10 12:39 Message: Logged In: YES user_id=95086
Aha! This I will have to check and see if this works for me also.
----------------------------------------------------------------------
Comment By: Vlad Seryakov (seryakov) Date: 2005-02-10 12:30 Message: Logged In: YES user_id=184124
I use -b 0.0.0.0 currently, 4.0.10 and 4.1 worked fine. When you use -b 0.0.0.0 then the address for nssock should be 0.0.0.0 as well, not empty.
----------------------------------------------------------------------
Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-02-10 10:33 Message: Logged In: YES user_id=95086
This is what I've done (nsd/driver.c)
    void
    NsStartDrivers(void)
    {
        Driver *drvPtr;

        /*
         * Listen on all drivers.
         */

        drvPtr = firstDrvPtr;
        while (drvPtr != NULL) {
    #if 0 /* zv */
            drvPtr->sock = Ns_SockListenEx(drvPtr->bindaddr, drvPtr->port, drvPtr->backlog);
    #else
            /* Listen on all known interfaces/addresses */
            drvPtr->sock = Ns_SockListenEx(NULL, drvPtr->port, drvPtr->backlog);
    #endif
I tried -b 0.0.0.0 but somehow it didn't work hence I reverted to heavy guns :)
----------------------------------------------------------------------
Comment By: Vlad Seryakov (seryakov) Date: 2005-02-10 10:14 Message: Logged In: YES user_id=184124
Do you mean -b 0.0.0.0:80? It will listen on all interfaces for port 80?
----------------------------------------------------------------------
Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-02-10 03:55 Message: Logged In: YES user_id=95086
Eventually, the watchdog/binder combination seems not the way to go (privileges issue). I see.
Concerning binder: I would need an option (maybe even a compile-time one, don't care) for the server to listen on all network interfaces, including loopback. Apart from this, I have no immediate nor mid-term needs to modify anything there so I suppose I will be perfectly happy with any changes you guys need in this area.
Concerning watchdog: we can't use daemontools nor init. Also, we do have Windows as a platform, remember. We have struggled to get a *minimum* interface to the rest of the system, hence our product is easily installable and removable from the system. Actually, the only point we have in common (config-wise) are entries in the startup machinery. Therefore, a control-process from within the nsd is ideal for our needs. I have done this with --enable-watchdog and it is #ifdef'ed in the code so for just about any other user of the server this is pretty invisible and obscure (no backward compat problems also).
----------------------------------------------------------------------
Comment By: Vlad Seryakov (seryakov) Date: 2005-02-09 14:00 Message: Logged In: YES user_id=184124
Here is the new syntax i added to the binder:
    addr:port[/protocol]
    port[/protocol]
    0/icmp[/count]
where protocol can be tcp, udp or unix; icmp is a special case of raw socket, followed by a count.
----------------------------------------------------------------------
Comment By: Stephen Deasey (sdeasey) Date: 2005-02-09 13:55 Message: Logged In: YES user_id=87254
Adding support for socket types other than TCP to the binder sounds like a great idea.
----------------------------------------------------------------------
Comment By: Vlad Seryakov (seryakov) Date: 2005-02-09 12:55 Message: Logged In: YES user_id=184124
I agree, i use inittab for aolserver myself, calling a bash script which parses nsd.tcl for address/port keywords and builds -b options on the fly. In case of inittab, no need for a watchdog as well.
It is not even on my wish list, i just figured if watchdog is to be running as root, then it can include binder support. Let's put aside binder as separate process issue for a while, but i'd like to add to regular 4.x binder what i did for supporting UDP/TCP/UNIX/RAW sockets. They still need to be pre-bound but then can be used from within AS, this is how snmp and dns modules use those sockets. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-02-09 12:52 Message: Logged In: YES user_id=87254 Hmm... I guess you could split the config into multiple pieces, let admins manage everything except address and port to bind to. But would we mandate that in the default set-up, or just hope people didn't screw up in practice? ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-02-09 12:49 Message: Logged In: YES user_id=87254 I think the issue here is that you need root privs to bind to port 80. The address and port are specified in the config file. The config file is a Tcl script. So to allow someone to change your webserver config file you need to give away root privs on your server. I'm not super keen on this :-) IIRC in 3.x there was a lot of code to make sure that the config script was evaled in a safe interp, and the motivation for the binder was code reduction/simplification. This also impacts the chroot functionality. Ideally you want to chroot as the last thing you do before dropping root privs so that you can open all your files, link to libraries etc., and not have to recreate a complete copy of your environment in the jail. But you need to know which directory that is, and that's in the config file, which is a script... Personally, I use daemontools to manage my servers. You need to create a simple wrapper script for this to work and I find that is the ideal place to set some environment variables for the IP, PORT etc. 
I can pass those along on the command line and extract them from the env in the tcl config script with [env get IP] and so on. The binder is a neat solution, but in practice it trips a lot of people up and it's just a pain. Giving away root is not a great solution though... :-( ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-02-09 10:23 Message: Logged In: YES user_id=95086 Honestly, I did forget already how 3.x did the binding :-What we do in 4.0 is to bind on all interfaces (I modified that in our private version). But I'm open to all variants. I suppose I should look back in 3.x code how that was done... Or, you have a crystal-clear picture already in which case I will simply believe you ;-) As soon as we create the sandbox, we can start hacking this in. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1119365&group_id=130646 |
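The extended -b syntax described in this thread (addr:port[/protocol], port[/protocol], 0/icmp[/count]) could be parsed along the following lines. This is a hedged sketch, not the binder code from the server: BindSpec and parse_bind_spec() are hypothetical names, the default protocol of tcp and the default count of 1 are assumptions, and the unix protocol is left out because the thread does not say how a socket path fits into the syntax.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type; not a structure from the server source. */
typedef struct {
    char addr[64];   /* empty string when no address was given */
    int  port;
    char proto[8];   /* "tcp", "udp" or "icmp" */
    int  count;      /* number of raw sockets, only meaningful for icmp */
} BindSpec;

/* Parse a non-negative decimal number no larger than 65535. */
static int parse_int(const char *s, int *out)
{
    char *end;
    long  v;

    if (*s == '\0') {
        return -1;
    }
    v = strtol(s, &end, 10);
    if (*end != '\0' || v < 0 || v > 65535) {
        return -1;
    }
    *out = (int) v;
    return 0;
}

/* Returns 0 on success, -1 if the spec does not match the syntax. */
int parse_bind_spec(const char *spec, BindSpec *out)
{
    char  buf[128];
    char *proto, *count = NULL, *colon;

    assert(spec != NULL && out != NULL);
    if (strlen(spec) >= sizeof(buf)) {
        return -1;
    }
    strcpy(buf, spec);

    memset(out, 0, sizeof(*out));
    strcpy(out->proto, "tcp");           /* assumed default */
    out->count = 1;                      /* assumed default */

    proto = strchr(buf, '/');            /* optional "/protocol[/count]" */
    if (proto != NULL) {
        *proto++ = '\0';
        count = strchr(proto, '/');
        if (count != NULL) {
            *count++ = '\0';
        }
        if (strcmp(proto, "tcp") != 0 && strcmp(proto, "udp") != 0
            && strcmp(proto, "icmp") != 0) {
            return -1;
        }
        strcpy(out->proto, proto);
    }
    if (count != NULL) {                 /* a count is only valid for icmp */
        if (strcmp(out->proto, "icmp") != 0
            || parse_int(count, &out->count) != 0 || out->count < 1) {
            return -1;
        }
    }

    colon = strrchr(buf, ':');           /* optional "addr:" prefix */
    if (colon != NULL) {
        *colon++ = '\0';
        if (strlen(buf) >= sizeof(out->addr)) {
            return -1;
        }
        strcpy(out->addr, buf);
        return parse_int(colon, &out->port);
    }
    return parse_int(buf, &out->port);
}
```

With these assumptions, "0.0.0.0:80" yields address 0.0.0.0, port 80 and protocol tcp, while "0/icmp/100" yields port 0, protocol icmp and a count of 100.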
From: SourceForge.net <no...@so...> - 2005-03-13 01:42:44
|
Feature Requests item #1119257, was opened at 2005-02-09 05:32 Message generated for change (Settings changed) made by sdeasey You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1119257&group_id=130646 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Zoran Vasiljevic (vasiljevic) Assigned to: Stephen Deasey (sdeasey) >Summary: Cache API, add Tcl interface into core Initial Comment: Those commands are in the server itself: ns_cache_flush ns_cache_stats ns_cache_size ns_cache_names ns_cache_keys I would not touch them for compatibility reasons. However, they are pretty limited (introspection/management). What is missing is the type of functionality added by the nscache module, which I believe should have been done a long time ago already. Suggestion: Include nscache into the core Tcl commands. Stephen, you mentioned you've been working on an alternate cache implementation. Can we use this and add a better Tcl interface? ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-02-09 10:25 Message: Logged In: YES user_id=95086 We can synthesize your changes and Stephen's rewrite of the cache guts. I see no problem there. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-02-09 10:20 Message: Logged In: YES user_id=184124 I added the ns_cache incr command also some time ago, i think it is in the CVS version of nscache. Also i used to play with the core cache to add more precise size calculation including overhead. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1119257&group_id=130646 |
From: SourceForge.net <no...@so...> - 2005-03-13 01:41:26
|
Feature Requests item #1159307, was opened at 2005-03-08 12:41 Message generated for change (Comment added) made by sdeasey You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1159307&group_id=130646 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Vlad Seryakov (seryakov) Assigned to: Vlad Seryakov (seryakov) Summary: Get rid of nsext/nspd Initial Comment: Last time i used those modules was when i used sybase driver, actually those things were created for sybase drivers only. Except that, all other DB modules use native access and even for sybase FreeTDS exists with a working module (i use it for SQL Server access). No point keeping this. ---------------------------------------------------------------------- >Comment By: Stephen Deasey (sdeasey) Date: 2005-03-12 18:41 Message: Logged In: YES user_id=87254 You can file a SourceForge request ticket to get them to move stuff about, but I don't think this is necessary. We have no history to preserve. Just remove these modules from their current location and import them fresh into a modules directory. So, /cvsroot/naviserver/naviserver is the current location of the server, and /cvsroot/naviserver/modules/nsext will be the location of the nsext module. Keeping all non-core modules in their own directory will make it easier to check out all at once, generate a website, etc. I think the Makefile will need touching up to build out of core. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-12 17:48 Message: Logged In: YES user_id=184124 We do not have contrib or modules directory, but yes, we move it from the core. I am not sure how to do this with CVS. 
---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-12 17:32 Message: Logged In: YES user_id=87254 If by "get rid of" you mean extract into a non-core module, I'm all for this. Sounds good to me. ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-08 15:51 Message: Logged In: YES user_id=95086 I have written a Tcl threading extension and part of it are thread shared arrays (aka nsv's) with a bind option to external key/value databases (like gdbm). So we actually store all things in thread shared arrays which are bound to a persistent gdbm databases on the filesystem. It's all transparent to the application. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-08 13:21 Message: Logged In: YES user_id=184124 Not related to nsext, just out of curiosity, how do you keep any state especially between multiple servers? ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-08 13:13 Message: Logged In: YES user_id=95086 We do not use any database modules at all hence I do not really care. I do think that any important database vendor has thread-safe libraries nowdays so the task of writing an in-process driver should not be a problem any more. Besides, AFAIK, all popular databases have already a corresponding in-process driver implementation. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1159307&group_id=130646 |
From: SourceForge.net <no...@so...> - 2005-03-13 01:37:36
|
Feature Requests item #1161597, was opened at 2005-03-11 12:30 Message generated for change (Comment added) made by sdeasey You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1161597&group_id=130646 Category: C-API Group: None Status: Open Resolution: None Priority: 5 Submitted By: Vlad Seryakov (seryakov) Assigned to: Vlad Seryakov (seryakov) Summary: Extend ns_info with info about traces/filters/procs Initial Comment: Attached is patch that extends ns_info with 3 commands: ns_info traces ns_info filters ns_info requestprocs They work the same way as ns_info callbacks, no functionality changed or added, just information commands ---------------------------------------------------------------------- >Comment By: Stephen Deasey (sdeasey) Date: 2005-03-12 18:37 Message: Logged In: YES user_id=87254 This is a nice addition. ns_info filters/traces is server specific, but ns_info requestprocs returns info for all servers. Can you make this per-server also, to be consistent? The URL walking procs look a little tricky, I think they would benefit from having their own more detailed comments. The comment currently says it calls a function for each node, and the effect depends on the function. But the function signature is Ns_ArgProc so there's really only one thing it can do. Why does it fix the stack size at 512? Can these magic numbers be #define'd? Using a while loop rather than an empty for loop would be a little easier to read. There's a lot of this kind of thing: (&(triePtr->branches))->n Could this be simplified to: triePtr->branches.n ? Otherwise, looks pretty good. I like it. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1161597&group_id=130646 |
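The simplification suggested in the review above relies on the fact that taking the address of a struct member and immediately dereferencing it is the same as accessing the member directly. A small sketch shows the equivalence; the Index and Trie types here are hypothetical stand-ins, not the server's real trie structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the trie types; the real layouts differ. */
typedef struct {
    size_t  n;      /* number of elements */
    void  **els;
} Index;

typedef struct {
    Index branches;
} Trie;

/* The form used throughout the patch under review. */
size_t branch_count_verbose(const Trie *triePtr)
{
    assert(triePtr != NULL);
    return (&(triePtr->branches))->n;
}

/* The simpler form proposed in the comment; it reads the same field. */
size_t branch_count_direct(const Trie *triePtr)
{
    assert(triePtr != NULL);
    return triePtr->branches.n;
}
```

Both functions read the same field, so the change is purely cosmetic and safe to apply mechanically.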
From: SourceForge.net <no...@so...> - 2005-03-13 00:48:14
|
Feature Requests item #1159307, was opened at 2005-03-08 19:41 Message generated for change (Comment added) made by seryakov You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1159307&group_id=130646 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Vlad Seryakov (seryakov) Assigned to: Vlad Seryakov (seryakov) Summary: Get rid of nsext/nspd Initial Comment: Last time i used those modules was when i used sybase driver, actually those things were created for sybase drivers only. Except that, all other DB modules uses native access and even for sybase FreeTSD exists with working module(i use it for SQL Server access). No point keeping this. ---------------------------------------------------------------------- >Comment By: Vlad Seryakov (seryakov) Date: 2005-03-13 00:48 Message: Logged In: YES user_id=184124 We do not have contrib or modules directory, but yes, we move it from the core. I am not sure how to do this with CVS. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-13 00:32 Message: Logged In: YES user_id=87254 If by "get rid of" you mean extract into a non-core module, I'm all for this. Sounds good to me. ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-08 22:51 Message: Logged In: YES user_id=95086 I have written a Tcl threading extension and part of it are thread shared arrays (aka nsv's) with a bind option to external key/value databases (like gdbm). So we actually store all things in thread shared arrays which are bound to a persistent gdbm databases on the filesystem. It's all transparent to the application. 
---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-08 20:21 Message: Logged In: YES user_id=184124 Not related to nsext, just out of curiosity, how do you keep any state especially between multiple servers? ---------------------------------------------------------------------- Comment By: Zoran Vasiljevic (vasiljevic) Date: 2005-03-08 20:13 Message: Logged In: YES user_id=95086 We do not use any database modules at all hence I do not really care. I do think that any important database vendor has thread-safe libraries nowdays so the task of writing an in-process driver should not be a problem any more. Besides, AFAIK, all popular databases have already a corresponding in-process driver implementation. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1159307&group_id=130646 |
From: SourceForge.net <no...@so...> - 2005-03-13 00:46:33
|
Feature Requests item #1162223, was opened at 2005-03-13 00:22 Message generated for change (Comment added) made by seryakov You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1162223&group_id=130646 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Stephen Deasey (sdeasey) Assigned to: Stephen Deasey (sdeasey) Summary: Tcl Callbacks Initial Comment: The server offers a dual C/Tcl interface. There are a number of of places where Tcl code needs to be executed by the C core, or the core doesn't know whether a callback is C or Tcl. I've attached the patch naviserver-4.1-tclcallbacks.patch. It extracts the Tcl callback infrastructure from tclsched.c into tclcallbacks.c and makes it a public interface. Here it is: Ns_TclNewCallback() Ns_TclNewCallbackObj() Ns_TclEvalCallback() Ns_TclCallbackProc() Ns_TclFreeCallback() Ns_TclCallbackArgProc() It provides a standard interface for Tcl callbacks throughout the server, handling interp allocation, introspection, evaluation and cleanup. Also, tclcallbacks.c is somewhat analogous to callbacks.c, and so the generic ns_atexit etc. callbacks which were in tclsched.c have been moved to the tclcallbacks file. The first set of code to be converted to the new interface is tclsched.c itself, and I've converted to using Ns_ParseObjv() while I was in there. File size has dropped ~250 lines. One thing I'm wondering about is whether the Ns_TclEvalCallback() routine should take extra args to send to the Tcl proc. There seems to be no standard scheme for ordering the args so there still needs to be a way to manually construct the command, which is why the members of the Ns_TclCallback struct are public. But for this very reason, it might be worth providing a standard way to do this so all future callbacks are consistent. 
It would be nice in the future to come up with some way of registering these callback points such that you don't have to create an ns_info subcommand, for example, manually. It should be possible for loadable modules to register their callback points and have them show up in the right place. ---------------------------------------------------------------------- >Comment By: Vlad Seryakov (seryakov) Date: 2005-03-13 00:46 Message: Logged In: YES user_id=184124 It looks good, a unified callback interface would be better, but Tcl filters, traces and requestprocs use different arguments/parameters and currently the only way to get info about them is to use the ns_info interface and the Ns_ArgProc interface. Tcl_Callback are for simple proc/args callbacks, filters is a special case, see if you can convert them. But i am for this change. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-13 00:30 Message: Logged In: YES user_id=87254 I've attached a patch which converts the tclrequest.c file to use the new callback API as an example. It also uses the Ns_ParseObjv() API, and I've removed support for the legacy conn variable. It removes ~200 lines and all utility functions. The callback API provides default Ns_ArgProc introspection, but this will be much nicer when combined with Vlad's RFE 1161597 :-) ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1162223&group_id=130646 |
From: SourceForge.net <no...@so...> - 2005-03-13 00:41:03
|
Feature Requests item #1159471, was opened at 2005-03-09 00:40 Message generated for change (Comment added) made by seryakov You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1159471&group_id=130646 Category: C-API Group: None Status: Open Resolution: None Priority: 5 Submitted By: Vlad Seryakov (seryakov) Assigned to: Vlad Seryakov (seryakov) Summary: Virtual Hosting Initial Comment: Okay, i did some digging and testing, looks working and much simpler. I tried to simplify default AS 4.x virtual hosting and added pageroot virtual hosting, a simple way to use different pageroots on the same server. The change for default virtual hosting is: no defaultserver anymore, the server who registered virtual hosts is the default server, so nssock is loaded in the default server, other than that virtual servers are defined the same way. One thing i left is to change ns_info pageroot to use Ns_GetConn() and then if exists use connPtr->pageroot, but this is a simple change if you approve the current virtual hosting patch. Here is the nsd.tcl config example:
    ns_section "ns/server/${server}/module/nssock/servers"
    ns_param test vlad.seryakov.com
    ns_param test vlad.seryakov.com:80
    ns_section "ns/server/${server}/module/nssock/pageroots"
    ns_param ${home}/html/test vlad.seryakov.com
    ns_param ${home}/html/test vlad.seryakov.com:80
----------------------------------------------------------------------
>Comment By: Vlad Seryakov (seryakov) Date: 2005-03-13 00:41 Message: Logged In: YES user_id=184124
Just to consider the possibility, instead of mallocing pageroot/location, by default it can use sockPtr->pageroot/sockPtr->location, then when ns_conn pageroot newpageroot is called, it will set connPtr->pageroot with a malloced string and ns_conn pageroot will check and return it instead of sockPtr->pageroot.
This way, no overhead at conn queue and still new pageroot/location can be set in Tcl ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-13 00:02 Message: Logged In: YES user_id=184124 That is the problem with nssock, it is actually an extender of the driver but still somehow kept independent. nssock itself is useless without the driver and only used to bind to more than one address for different servers. it could be moved in the core but the problem will be how to define more than one instance. mallocs are overhead indeed, but once copied they can be used independently and can be set in Tcl by using ns_conn location newLocation or ns_conn pageroot newPageroot. In this case they should be copies. Just do not make mass virtual hosting the only virtual hosting way, being able to change pageroot in Tcl/C gives the developer more flexibility if required. For simple cases, mass virtual hosting is okay. ---------------------------------------------------------------------- Comment By: Stephen Deasey (sdeasey) Date: 2005-03-12 23:48 Message: Logged In: YES user_id=87254 I'm not keen at all on adding virtual hosting to the nssock driver. There's nothing HTTP specific about the nssock driver and I'd like to keep it that way. There are a couple of problems with the other proposed solution. The paired functions Ns_SetLocationProc()/Ns_ConnSetLocationProc() etc. seem excessive, and the enforced malloc()ing at runtime of the location and pageroot strings is an unwelcome overhead. Using Tls storage is clever but pretty ugly. I think dstrings are the way to go here. I'm not sure Tls is safe in this implementation. The same dstring is used for location and pageroot strings, so it depends what the caller decides to do with the result and in what order, whether or not one overwrites the other. I would like to explore adding mass virtual hosting into the core. 
Let me work up a patch, I think I can get to this this weekend... ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-12 21:57 Message: Logged In: YES user_id=184124 Attached is nssock with virtual hosting patch ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-12 20:39 Message: Logged In: YES user_id=184124 Actually, vhost can be combined with nssock, if options given it will enable virtual hosting, if not works as regular sock driver. This way it is always with the core and at the same time independent. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-10 21:38 Message: Logged In: YES user_id=184124 If loaded, vhost module works as Stephen suggested, strips port/host and uses pageroot if other root is not specified. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-10 21:37 Message: Logged In: YES user_id=184124 There is a new patch with Stephen's corrections/additions. I think we can provide a core module for virtual hosting, i called it nsvhost and we can extend this module to do all sorts of hosting. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-10 18:53 Message: Logged In: YES user_id=184124 And keeping port and www. is sometimes necessary, you can do virtual hosting by port only, IDT does that for example, and many sites work without www. prefix, just stripping them by default may not be appropriate. ---------------------------------------------------------------------- Comment By: Vlad Seryakov (seryakov) Date: 2005-03-10 18:24 Message: Logged In: YES user_id=184124 >Looks good. However, isn't ns_conn pageroot >unnecessary? 
For templating engines ns_info pageroot is the only way to figure out the root, UrlToFile works for fastpath only.
>Would it be better to have the servers and pageroot config
>in one section, to avoid duplication?
They are mutually exclusive, that's why i put them in different sections, if full virtual server is set, pageroots are ignored, this is for AOL like virtual hosting with different servers (rare situation though). So in most cases pageroots will be used only, thus only one simple syntax. As for PageRoot and ServerRoot, i think this is a little confusing. Currently, pageroot returns full path and whoever calls pageroot assumes that it will return full path. Virtual hosting using directories as hostnames is what i am currently using with vhost module and i think it can be included as a standard feature for easy virtual hosting solutions. I do not think this should be the ONLY virtual hosting solution, everybody can write their own modules using SetLocation/SetPageRoot procs or register a filter which will set pageroot for each connection. Let me prepare another patch-set with your suggestions included.
----------------------------------------------------------------------
Comment By: Stephen Deasey (sdeasey) Date: 2005-03-10 08:00 Message: Logged In: YES user_id=87254
Looks good. However, isn't ns_conn pageroot unnecessary? Would it be better to have the servers and pageroot config in one section, to avoid duplication?
    ns_section "ns/server/${server}/module/nssock/servers"
    ns_param example.com exampleserver
    ns_param foo.example.com "exampleserver ${home}/whatever"
Here, the first entry maps the example.com host to the exampleserver and uses its default pageroot. The second entry supplies a new pageroot. How about mass virtual hosting, i.e. where you don't have to explicitly configure each host header to pageroot mapping, but construct the pageroot from the host header at runtime? I've attached the file nsmassvhost.c which implements the above.
It uses the hooks Ns_PageRootProc and Ns_LocationProc which would be unnecessary if the functionality was included as standard. It trims the port and any leading 'www.' from the host header. It would be nice to have this for the static mapping also, as at the moment to be robust you often need 4 mappings for each virtual host. It also uses the function Ns_ServerPath(). The idea here is to introduce the concept of the virtual server root as a distinct location in the file system, where the pageroot is a location below that. I want to change this so that the serverroot is dynamic and based on the host header (when configured), and the pageroot is simply the serverroot with "/pages" (or whatever is configured) appended. It would look something like this:
    /srv/server1/pages
    /srv/server1/example.com/pages
    /srv/server1/example.com/cache
The first path is the default or non-virtual hosted case. server1 is a server defined in the config file and has its own private tcl library. The second path is the pageroot of a virtual host. The third is an example of some data which is specific to the example.com virtual host. So, without virtual hosts:
    Ns_ServerRoot() -> /srv/server1
    Ns_PageRoot() -> /srv/server1/pages
With virtual hosts (and called in the context of a conn thread):
    Ns_ServerRoot() -> /srv/server1/example.com
    Ns_PageRoot() -> /srv/server1/example.com/pages
The advantage of this system is that you don't have to restart your server every time you add or remove a virtual host. There is also a convenient location to store data associated with both virtual servers and virtual hosts. Easy to backup, remove, etc. I haven't had time to look at how this would be integrated into what you've got here, maybe at the weekend. Feel free to take a shot at it though :-) Does the scheme outlined above make sense to you?
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1159471&group_id=130646 |
From: SourceForge.net <no...@so...> - 2005-03-13 00:32:58
|
Feature Requests item #1159307, was opened at 2005-03-08 12:41
Message generated for change (Comment added) made by sdeasey
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1159307&group_id=130646

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Vlad Seryakov (seryakov)
Assigned to: Vlad Seryakov (seryakov)
Summary: Get rid of nsext/nspd

Initial Comment:
The last time I used those modules was when I used the Sybase driver; actually, those things were created for Sybase drivers only. Apart from that, all other DB modules use native access, and even for Sybase, FreeTDS exists with a working module (I use it for SQL Server access). There is no point in keeping this.

----------------------------------------------------------------------

>Comment By: Stephen Deasey (sdeasey)
Date: 2005-03-12 17:32

Message:
Logged In: YES
user_id=87254

If by "get rid of" you mean extract into a non-core module, I'm all for this. Sounds good to me.

----------------------------------------------------------------------

Comment By: Zoran Vasiljevic (vasiljevic)
Date: 2005-03-08 15:51

Message:
Logged In: YES
user_id=95086

I have written a Tcl threading extension, and part of it is thread-shared arrays (aka nsv's) with a bind option to external key/value databases (like gdbm). So we actually store everything in thread-shared arrays which are bound to persistent gdbm databases on the filesystem. It's all transparent to the application.

----------------------------------------------------------------------

Comment By: Vlad Seryakov (seryakov)
Date: 2005-03-08 13:21

Message:
Logged In: YES
user_id=184124

Not related to nsext, just out of curiosity: how do you keep any state, especially between multiple servers?
----------------------------------------------------------------------

Comment By: Zoran Vasiljevic (vasiljevic)
Date: 2005-03-08 13:13

Message:
Logged In: YES
user_id=95086

We do not use any database modules at all, hence I do not really care. I do think that any important database vendor has thread-safe libraries nowadays, so the task of writing an in-process driver should not be a problem any more. Besides, AFAIK, all popular databases already have a corresponding in-process driver implementation.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=719009&aid=1159307&group_id=130646
|