From: Steve L. <ste...@hp...> - 2008-02-15 15:16:13
James Abley wrote:
> >
> > 2. what do people think we should do here? Not do any liveness checks
> > and rely on web page livenesspage tests? Tests you'd have to delay with
> > a Sleep{ } component so they don't fail early either.
> >
>
> I like the liveness checks myself, but as a suggestion: is there any
> reason for people to use Sleep in the manner I think you're
> suggesting? How about a component similar to
> java.util.concurrent.CountDownLatch, which would block doing the page
> liveness tests until the system is in an appropriate state?
>
> Cheers,
>
> James
>
> >
> > 3. There is a slowstart component in the workflow, though I see that it
> > is not declared in any .sf file, and hence won't have any tests/docs
> > either. It explicitly delays passing down liveness tests until after a
> > prespecified delay
> >
> > start extends SlowStart {
> >     delay 5000;
> >     action JettyServer;
> > }

I'm not overfond of sleeps in deployment descriptors, as they tend to be very brittle across system configurations. Different hardware, or a VM under load, and the delays that did work don't work so well.
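To make the latch suggestion concrete, here is a minimal sketch in plain Java of a gate that page liveness tests could block on instead of sleeping. It is only an illustration under stated assumptions: ReadinessGate, markReady(), isReady() and awaitReady() are hypothetical names, not SmartFrog components or APIs, and how a deployed system would signal readiness is left open.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical readiness gate (not a SmartFrog class): page liveness
 * tests block on it until the deployed system signals that it is ready.
 */
final class ReadinessGate {
    // A single count: one call to markReady() opens the gate for good.
    private final CountDownLatch ready = new CountDownLatch(1);

    /** Called by the deployed system once it reaches an appropriate state. */
    void markReady() {
        ready.countDown();
    }

    /** Non-blocking check of whether the gate has been opened. */
    boolean isReady() {
        return ready.getCount() == 0;
    }

    /**
     * Blocks until the gate opens or the timeout expires.
     * Returns true if the system became ready, false on timeout or interrupt.
     */
    boolean awaitReady(long timeoutMillis) {
        try {
            return ready.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Preserve the interrupt status for callers further up the stack.
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A test would call awaitReady() before fetching the page. The timeout is still an upper bound, so a system that never comes up is reported as a failure rather than hanging, but a fast machine no longer waits out a fixed sleep.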

I've just looked at the slowstart component, which is in the 3.12.022 release, though it ships without the .sf file you need in order to use it:

    SlowStart extends ActionCompound {
        sfClass "org.smartfrog.sfcore.workflow.combinators.SlowStart";
        slowstartSchema extends Schema {
            delay extends Integer {
                description "delay in milliseconds";
            }
        }
        delay 1000;
    }

On startup, it deploys its action child. When pinged, it always pings this child, but just ignores failures for the specified interval:

    protected void sfPingChild(Liveness child)
            throws SmartFrogLivenessException, RemoteException {
        if (!live) {
            long now = System.currentTimeMillis();
            if (now > endTime) {
                //timeout time is reached, time to go live
                sfLog().info("Going live at end of timeout");
                live = true;
            }
        }
        try {
            super.sfPingChild(child);
            // if we get here, liveness kicks in
            if (!live) {
                sfLog().info("Child is now live");
                live = true;
            }
        } catch (SmartFrogLivenessException e) {
            if (live) {
                //rethrow the exception when we are live
                throw e;
            } else {
                sfLog().ignore("We are not yet live", e);
            }
        }
    }

What is interesting is that even within that interval, say 1000 ms, if the child goes live, this fact is remembered, and from then on a failed ping is a cause for concern. Unlike the Delay component, which doesn't start its child for a given period, SlowStart starts it, but doesn't expect it to go live straight away.

Given that this slow starting is possibly a common behaviour of all web applications, I'm thinking of patching it in to all of them, rather than requiring all deployments to be wrapped in SlowStart declarations. I'd have a base delay for the application server, a value which could be overridden by an explicit delay for the webapp/ear or anything else that you deploy.

-steve