From: Oliver B. <ol...@g7...> - 2006-06-13 00:17:52
I am having problems using the webkit script from my working directory to start AppServer when my machine starts. The trouble seems to be that webkit loads its old process ID in $PID_FILE, then finds that that particular PID already exists (there are lots of low-numbered processes during startup) and so decides not to run AppServer.

Am I missing something? Perhaps webkit should use fcntl.flock?

Oliver
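For illustration, the kind of existence test described above can be sketched in Python. This is not the webkit script's actual code (webkit is a shell script); `pid_exists` is a hypothetical helper. It shows why a stale PID file can give a false positive: signal 0 only reports that some process currently has that PID, not that the process is the AppServer.

```python
import os


def pid_exists(pid):
    """Return True if any process currently has this PID.

    Hypothetical helper for illustration only. A True result says nothing
    about which program owns the PID, which is the weakness discussed here.
    """
    if pid <= 0:
        return False  # 0 and negative values address process groups, not a PID
    try:
        os.kill(pid, 0)  # signal 0 performs the checks but sends nothing
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

After a hard reset, a PID file left holding a low number will very likely match some unrelated system process, so a check like this reports "still running".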
From: Christoph Z. <ci...@on...> - 2006-06-14 01:00:35
Oliver Bock wrote:
> I am having problems using the webkit script from my working directory
> to start AppServer when my machine starts. The trouble seems to be that
> webkit loads its old process ID in $PID_FILE, then finds that that
> particular PID already exists (there are lots of low-numbered processes
> during startup) and so decides not to run AppServer.

Normally, this should not happen if the script is properly installed as an rc start and stop script. When the machine is shut down, webkit should be stopped properly and the PID_FILE deleted. But you're right, it may happen if Webware was not shut down properly and there is another process with the same PID running.

So instead of simply testing whether a process exists, we could check the process command as well and require it to be something like

    "python Launch.py ThreadedAppServer"

But this would still fail if you run several Webware instances. We need to check whether the process really belongs to the PID_FILE.

> Perhaps webkit should use fcntl.flock?

Just so I understand properly: You suggest that the PID_FILE is created with a lock (you can also use os.open for that), and then the start script checks with "fuser" whether the process in the PID_FILE is the same as the one that has PID_FILE locked? That should work, but it would waste a file descriptor, and I'm not sure whether fuser is available everywhere.

Maybe the following would be better. It does not need any locking.

    PID=`cat $PID_FILE`
    if tr \\000 \\040 < /proc/$PID/cmdline | grep -q " -i $PID_FILE "
    then conclude that Webware is still running...

What do you think?

-- Christoph
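The /proc-based check proposed above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: it assumes a Linux-style procfs where `/proc/<pid>/cmdline` holds NUL-separated arguments, and that the app server was started with `-i <pid file>` as two adjacent arguments, as the grep pattern in the message implies. The function names and the `proc_root` parameter are hypothetical, added so the sketch can be tried without a real /proc.

```python
import os
from pathlib import Path


def cmdline_args(pid, proc_root="/proc"):
    """Return a process's argument list, or None if it cannot be read."""
    try:
        raw = Path(proc_root, str(pid), "cmdline").read_bytes()
    except OSError:
        return None  # no such process, or no procfs on this platform
    # Arguments are separated (and terminated) by NUL bytes.
    return [os.fsdecode(arg) for arg in raw.split(b"\0") if arg]


def pid_file_in_use(pid_file, proc_root="/proc"):
    """Return True if the PID in pid_file was started with '-i <pid_file>'."""
    try:
        pid = int(Path(pid_file).read_text().strip())
    except (OSError, ValueError):
        return False  # missing or unreadable PID file
    args = cmdline_args(pid, proc_root)
    if not args:
        return False
    # Look for "-i" immediately followed by this exact PID file path.
    return any(flag == "-i" and value == str(pid_file)
               for flag, value in zip(args, args[1:]))
```

Matching on the PID file path rather than on the command name is what lets several Webware instances coexist, since each instance would carry its own path on its command line.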
From: Oliver B. <ol...@g7...> - 2006-06-14 01:48:47
> Normally, this should not happen if the script is properly installed as
> an rc start and stop script. When the machine is shut down, webkit should
> be stopped properly and the PID_FILE deleted.

You are right. I have verified ThreadedAppServer does respond to SIGTERM during host shutdown and deletes the PID file. I had to hard-reset my OS X box a few months ago and I think I've had a PID file with a low process ID causing problems since then. (I didn't have time to look into the problem immediately and just invoked AppServer manually the few times I've needed it. This is a testing system.)

> So instead of simply testing whether a process exists, we could check
> the process command as well and require it to be something like
> "python Launch.py ThreadedAppServer"
>
> But this would still fail if you run several Webware instances. We need
> to check whether the process really belongs to the PID_FILE.

For the reason you give, I too think this is a bad idea.

> > Perhaps webkit should use fcntl.flock?
>
> Just so I understand properly: You suggest that the PID_FILE is created
> with a lock (you can also use os.open for that), and then the start
> script checks with "fuser" whether the process in the PID_FILE is the
> same as the one that has PID_FILE locked? That should work, but it
> would waste a file descriptor, and I'm not sure whether fuser is
> available everywhere.

I'm not familiar with fuser, but I think you are right that os.open is more portable than fcntl.flock. Therefore I'm suggesting that ThreadedAppServer could lock PID_FILE. The OS will automatically release this lock when ThreadedAppServer exits. Then the webkit script can attempt to lock PID_FILE before starting ThreadedAppServer. If it fails then ThreadedAppServer must still be running. I had not considered the loss of a file descriptor, although I don't think this is too high a price.

> Maybe the following would be better. It does not need any locking.
>
> PID=`cat $PID_FILE`
> if tr \\000 \\040 < /proc/$PID/cmdline | grep -q " -i $PID_FILE "
> then conclude that Webware is still running...

Unfortunately OS X does not include /proc and therefore FreeBSD probably doesn't either. My impression is that /proc is a relatively recent innovation.

Oliver
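The locking scheme suggested in this message might look something like the sketch below. It is an illustration, not Webware code: `acquire_pid_lock` is a hypothetical name, and it uses `fcntl.flock`, which is available only on Unix-like systems and can behave differently on network filesystems. The idea is that the server keeps the returned descriptor open for its whole lifetime, and the OS drops the lock when the process exits, even after a crash.

```python
import fcntl
import os


def acquire_pid_lock(pid_file):
    """Try to take an exclusive, non-blocking lock on pid_file.

    Returns an open file descriptor on success; the caller must keep it
    open for as long as the lock should be held. Returns None if another
    open file already holds the lock.
    """
    fd = os.open(pid_file, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None  # someone else holds the lock: server still running
    # We own the lock, so it is safe to overwrite any stale PID.
    os.ftruncate(fd, 0)
    os.write(fd, f"{os.getpid()}\n".encode())
    return fd
```

With this approach a stale PID left in the file no longer matters: the start script only asks whether the lock can be taken, which costs one file descriptor in the running server.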
From: Christoph Z. <ci...@on...> - 2006-06-14 08:03:22
Oliver Bock wrote:
> Unfortunately OS X does not include /proc and therefore FreeBSD probably
> doesn't either. My impression is that /proc is a relatively recent
> innovation.

I did not know this about OS X. Actually, the procfs is not something new. FreeBSD has it, too (maybe a bit different there).

Anyway, we probably have to resort to locking at least for the generic start script (Webware comes with a bunch of different start scripts in the folder WebKit/Startscripts). But locking is also different on various platforms. We have to be careful to do it right. I found an article which may be helpful: http://www.unixreview.com/documents/s=1344/ur0402g/ I will try to come up with a solution based on that.

-- Christoph