From: <ad...@jb...> - 2005-12-05 18:28:11
"
Here's how I reproduced the bug.
1. Create the following classes
----- interface hatest.MyHASingletonServiceMBean -----
package hatest;

import org.jboss.ha.singleton.HASingletonMBean;

public interface MyHASingletonServiceMBean extends HASingletonMBean {
    // nothing to add
}
----------
----- class hatest.MyHASingletonService -----
package hatest;

import org.jboss.ha.singleton.HASingletonSupport;
import org.jboss.logging.Logger;

public class MyHASingletonService extends HASingletonSupport implements MyHASingletonServiceMBean {
    private static final Logger logger = Logger.getLogger(MyHASingletonService.class);

    public void startSingleton() {
        logger.info("I am the Master!");
    }

    public void stopSingleton() {
        logger.info("I am no longer the Master.");
        throw new RuntimeException("I don't want to die!");
    }
}
----------
2. Package the classes in a SAR with the following jboss-service.xml
----- ha-test.sar/META-INF/jboss-service.xml -----
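<?xml version="1.0" encoding="UTF-8"?>
<server>
  <mbean code="hatest.MyHASingletonService" name="hatest:service=MyHASingletonService">
    <depends>jboss:service=${jboss.partition.name:DefaultPartition}</depends>
  </mbean>
</server>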
----------
3. Deploy to JBoss
15:47:02,978 INFO [MyHASingletonService] I am the Master!
4. Redeploy the SAR by touching META-INF/jboss-service.xml
15:47:18,168 INFO [MyHASingletonService] I am no longer the Master.
15:47:18,170 WARN [MyHASingletonService] Stopping failed hatest:service=MyHASingletonService
java.lang.RuntimeException: I don't want to die!
at hatest.MyHASingletonService.stopSingleton(MyHASingletonService.java:15)
...
15:47:18,348 WARN [ServiceController] Problem starting service hatest:service=MyHASingletonService
java.lang.NullPointerException
at org.jboss.ha.jmx.HAServiceMBeanSupport.getServiceHAName(HAServiceMBeanSupport.java:361)
at org.jboss.ha.jmx.HAServiceMBeanSupport$1.replicantsChanged(HAServiceMBeanSupport.java:195)
...
Comment by Mirko Nasato [03/Dec/05 11:08 AM]
Attached the zipped ha-test.sar used to reproduce the bug.
Comment by Mirko Nasato [03/Dec/05 11:12 AM]
Regular (non-HA-singleton) MBeans do not have this problem, i.e. they are redeployed correctly even if they throw an exception when undeploying.
Comment by Mirko Nasato [03/Dec/05 11:51 AM]
In the real-world situation we weren't throwing a RuntimeException on purpose, of course. A ClassCastException was generated because of another problem: after another EAR was deployed, NonSerializableFactory replaced a JNDI object with another one from a different ClassLoader.
This bug effectively turned what was supposed to be a "high availability" service into a "zero availability" one.
Comment by Scott Marlow [05/Dec/05 08:51 AM]
This is a 50/50 problem. If stopping the old master (_stopOldMaster()) fails, should the operation continue? There are probably some cases where the answer would be yes and some where it would be no.
If we change this for the 4.0.4 release to catch the exception, log it, and resume starting the new master (makeThisNodeMaster()), we would return from _stopOldMaster() not knowing whether the old singleton has stopped or not.
I'll go ahead and make the change as it should help in the case that you hit.
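To make the trade-off concrete, here is a minimal, self-contained sketch of the "catch, log, and carry on" approach described above. It is not the actual JBoss change: FailoverSketch, Singleton, failover() and demo() are hypothetical names made up for illustration, and only the shape of the try/catch is the point.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in JBoss.
class FailoverSketch {

    /** Stand-in for a service with start/stop singleton callbacks. */
    interface Singleton {
        void startSingleton();
        void stopSingleton();
    }

    /**
     * Stops the old master, then starts the new one. A failure while
     * stopping is logged but does not prevent the new master from starting.
     * Returns true if the old master stopped cleanly, false if its state
     * is unknown because stopSingleton() threw.
     */
    static boolean failover(Singleton oldMaster, Singleton newMaster, List<String> log) {
        boolean stoppedCleanly = true;
        try {
            oldMaster.stopSingleton();
        } catch (RuntimeException e) {
            // We cannot tell how far the stop got, so record it and move on.
            stoppedCleanly = false;
            log.add("WARN stop failed: " + e.getMessage());
        }
        newMaster.startSingleton();
        return stoppedCleanly;
    }

    /** Runs the scenario from the bug report and returns the recorded events. */
    static List<String> demo() {
        final List<String> log = new ArrayList<>();
        Singleton failing = new Singleton() {
            public void startSingleton() { log.add("old master started"); }
            public void stopSingleton() { throw new RuntimeException("I don't want to die!"); }
        };
        Singleton fresh = new Singleton() {
            public void startSingleton() { log.add("new master started"); }
            public void stopSingleton() { log.add("new master stopped"); }
        };
        boolean clean = failover(failing, fresh, log);
        log.add("old master stopped cleanly: " + clean);
        return log;
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```

The boolean return value is one way to keep the "we don't know whether the old singleton stopped" information around instead of silently dropping it; whether the real code should expose that is a separate question.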
Comment by Scott Marlow [05/Dec/05 10:09 AM]
As noted in my comments, this doesn't completely solve the problem. The root cause of the exception still needs to be fixed, as it's unknown whether the singleton stopped or not.
Comment by Scott Marlow [05/Dec/05 10:10 AM]
The code change is in head and 4.0.4
"
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3910725#3910725
Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3910725