From: Steve L. <ste...@hp...> - 2008-02-15 15:16:13
James Abley wrote:
> >
> > 2. what do people think we should do here? Not do any liveness checks
> > and rely on web page livenesspage tests? Tests you'd have to delay with
> > a Sleep{ } component so they don't fail early either.
>
>
> I like the liveness checks myself, but as a suggestion: is there any
> reason for people to use Sleep in the manner I think you're
> suggesting? How about a component similar to
> java.util.concurrent.CountDownLatch, which would block doing the page
> liveness tests until the system is in an appropriate state?
>
> Cheers,
>
>
> James
>
>
> >
> > 3. There is a slowstart component in the workflow, though I see that it
> > is not declared in any .sf file, and hence won't have any tests/docs
> > either. It explicitly delays passing down liveness tests until after a
> > prespecified delay
> >
> > start extends SlowStart {
> > delay 5000;
> > action JettyServer;
> > }
I'm not overfond of sleeps in deployment descriptors, as they tend to be
very brittle across system configurations. On different hardware, or in a
VM under load, delays that used to work stop working so well.
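For comparison, the latch idea James raises can be sketched in plain Java. This is only an illustration of blocking on a readiness signal with an upper bound instead of sleeping for a fixed time; ReadinessGate and ReadinessGateDemo are hypothetical names, not SmartFrog components, and the 200 ms startup is a stand-in for a real server coming up.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper, not part of SmartFrog: blocks callers until readiness is signalled. */
class ReadinessGate {
    private final CountDownLatch ready = new CountDownLatch(1);

    /** Called once by whatever knows the system has reached an appropriate state. */
    void markReady() {
        ready.countDown();
    }

    /** Waits up to timeoutMillis; returns true if ready, false on timeout or interrupt. */
    boolean awaitReady(long timeoutMillis) {
        try {
            return ready.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }
}

class ReadinessGateDemo {
    public static void main(String[] args) {
        ReadinessGate gate = new ReadinessGate();
        // Simulated startup on another thread (assumption: stands in for the app server)
        Thread startup = new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            gate.markReady();
        });
        startup.start();
        // The upper bound is a safety net, not the expected wait time
        boolean ok = gate.awaitReady(5000);
        System.out.println(ok ? "ready: run page liveness tests" : "timed out waiting for readiness");
    }
}
```

The wait ends as soon as readiness is signalled, so a fast machine is not held back and a slow one is not failed early, up to the timeout.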
I've just looked at the slowstart component, which is in the 3.12.022
release, though without the .sf file you need in order to use it:
SlowStart extends ActionCompound {
    sfClass "org.smartfrog.sfcore.workflow.combinators.SlowStart";
    slowstartSchema extends Schema {
        delay extends Integer {
            description "delay in milliseconds";
        }
    }
    delay 1000;
}
On startup, it deploys its action child. When pinged, it always pings
this child, but ignores failures for the specified interval:
    protected void sfPingChild(Liveness child)
            throws SmartFrogLivenessException, RemoteException {
        if (!live) {
            long now = System.currentTimeMillis();
            if (now > endTime) {
                // timeout time is reached, time to go live
                sfLog().info("Going live at end of timeout");
                live = true;
            }
        }
        try {
            super.sfPingChild(child);
            // if we get here, liveness kicks in
            if (!live) {
                sfLog().info("Child is now live");
                live = true;
            }
        } catch (SmartFrogLivenessException e) {
            if (live) {
                // rethrow the exception when we are live
                throw e;
            } else {
                sfLog().ignore("We are not yet live", e);
            }
        }
    }
}
What is interesting is that even within that interval, say 1000 ms, if the
child goes live, this fact is remembered, and from then on a failed
ping is a cause for concern. Unlike the Delay component, which doesn't
start its child for a given period, SlowStart starts it straight away but
doesn't expect it to go live immediately.
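The behaviour described above can be modelled outside SmartFrog as a small state machine. This is a sketch only: GracePeriodPinger is a hypothetical class, time is passed in explicitly so the logic can be followed by hand, and a boolean stands in for the child ping succeeding or throwing.

```java
/** Hypothetical stand-alone model of the SlowStart ping logic; not the SmartFrog class. */
class GracePeriodPinger {
    private final long endTime;
    private boolean live = false;

    GracePeriodPinger(long startTimeMillis, long delayMillis) {
        this.endTime = startTimeMillis + delayMillis;
    }

    /**
     * @param now          current time in milliseconds
     * @param childHealthy whether the child's own ping succeeded
     * @return true if the ping is treated as a success, false if the failure is reported
     */
    boolean ping(long now, boolean childHealthy) {
        if (!live && now > endTime) {
            live = true; // grace period over: failures count from now on
        }
        if (childHealthy) {
            live = true; // the child came up, possibly before the delay ran out
            return true;
        }
        return !live; // a failure is ignored only while we are not yet live
    }

    boolean isLive() {
        return live;
    }
}

class GracePeriodDemo {
    public static void main(String[] args) {
        GracePeriodPinger p = new GracePeriodPinger(0, 1000);
        System.out.println(p.ping(100, false)); // true: failure ignored inside the grace period
        System.out.println(p.ping(200, true));  // true: child is up, now marked live
        System.out.println(p.ping(300, false)); // false: failure reported although t < 1000
    }
}
```

The third call is the interesting case: the delay has not expired, but because the child was seen alive at t=200 the failure is no longer forgiven.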
Given that this slow starting is possibly a common behaviour of all web
applications, I'm thinking of patching it in to all of them, rather than
requiring all deployments to be wrapped in SlowStart declarations. I'd
have a base delay for the application server, a value which could be
overridden by an explicit delay for the webapp/ear or anything else that
you deploy.
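As a rough illustration of the override rule proposed here, and not of any existing SmartFrog code: the application server carries a base delay, and a webapp or EAR that declares its own delay wins. StartupDelays and effectiveDelay are hypothetical names.

```java
import java.util.OptionalLong;

/** Hypothetical helper showing the proposed base-delay/override rule. */
class StartupDelays {
    /**
     * @param serverBaseDelayMillis default grace period declared on the application server
     * @param explicitDelayMillis   delay declared on the webapp/EAR, if any
     * @return the grace period that applies to this deployment
     */
    static long effectiveDelay(long serverBaseDelayMillis, OptionalLong explicitDelayMillis) {
        // An explicit per-deployment value replaces the server default outright
        return explicitDelayMillis.orElse(serverBaseDelayMillis);
    }
}

class StartupDelaysDemo {
    public static void main(String[] args) {
        System.out.println(StartupDelays.effectiveDelay(5000, OptionalLong.empty()));   // 5000
        System.out.println(StartupDelays.effectiveDelay(5000, OptionalLong.of(20000))); // 20000
    }
}
```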
-steve