From: mnasato <do-...@jb...> - 2005-12-05 21:51:11
I'll explain my particular case in more detail, so you can see whether it applies to others as well. The first time the HASingleton threw an exception when stopping was when one node was shut down (because of other problems with that node). The second node (we have only 2 nodes in the cluster at the moment) refused to become the master:

2005-11-18 12:14:19,721 ERROR [ourapp.HASingletonScheduledService] _stopOldMaster failed. New master singleton will not start.

In this case not starting the new master was clearly not the best choice: the old master had certainly been stopped despite the exception, since the whole application server was being stopped. After the problem occurred we tried redeploying the EAR containing the service, to restore the service without affecting the other EARs running in the same appserver, but each time we got that NPE in HAServiceMBeanSupport.getServiceHAName(). So eventually we had to bring down both JBoss nodes, which meant an outage for all the applications deployed in that cluster, just to get that single HASingleton service to start up again.

I agree that in other cases it may not be a good idea to start the new master if the old one failed to stop, because you could end up with the HASingleton service running on more than one node. But I think this is somewhat less likely to happen since, as in our case, the service may be torn down anyway because it's being stopped as part of a server shutdown, or because it throws an exception while trying to close a resource that's already been closed, so it's effectively already stopped. And if it does happen, it's much easier to fix the situation. If your service is running on 2 nodes when it shouldn't be, you can just stop the HASingleton on one node using the JMX console as a temporary measure, or restart one JBoss node so that when it comes up again it's in a clean state. The situation we ended up in required the whole cluster to be shut down and restarted, which is far worse.
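To make the trade-off concrete, here is a rough sketch of the two behaviours being compared. This is not JBoss code: SingletonService, FakeService, StopFailurePolicy and failover() are all hypothetical names made up for illustration, and the real logic in the HASingleton support classes may well be structured differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; none of these types exist in JBoss.
class SingletonFailoverSketch {

    /** Minimal stand-in for a singleton service (assumed interface). */
    public interface SingletonService {
        void stopSingleton() throws Exception;
        void startSingleton();
    }

    /** What to do when stopping the old master throws. */
    public enum StopFailurePolicy { ABORT_FAILOVER, START_ANYWAY }

    /** Fake service whose stop releases its resources but may still throw. */
    public static class FakeService implements SingletonService {
        public boolean running;
        private final boolean failOnStop;

        public FakeService(boolean running, boolean failOnStop) {
            this.running = running;
            this.failOnStop = failOnStop;
        }

        @Override
        public void stopSingleton() {
            running = false; // effectively stopped, even if we throw below
            if (failOnStop) {
                throw new IllegalStateException("resource already closed");
            }
        }

        @Override
        public void startSingleton() {
            running = true;
        }
    }

    /** Moves mastership; returns true if the new master was started. */
    public static boolean failover(SingletonService oldMaster,
                                   SingletonService newMaster,
                                   StopFailurePolicy policy,
                                   List<String> log) {
        try {
            oldMaster.stopSingleton();
        } catch (Exception e) {
            log.add("stop of old master failed: " + e.getMessage());
            if (policy == StopFailurePolicy.ABORT_FAILOVER) {
                // The behaviour we hit: the cluster is left with no master.
                log.add("new master singleton will not start");
                return false;
            }
            // The alternative argued for here: accept the risk of a brief double master.
            log.add("starting new master anyway");
        }
        newMaster.startSingleton();
        return true;
    }

    public static void main(String[] args) {
        for (StopFailurePolicy policy : StopFailurePolicy.values()) {
            List<String> log = new ArrayList<>();
            FakeService oldMaster = new FakeService(true, true);
            FakeService newMaster = new FakeService(false, false);
            boolean started = failover(oldMaster, newMaster, policy, log);
            System.out.println(policy + ": new master started = " + started + ", log = " + log);
        }
    }
}
```

With ABORT_FAILOVER a failed stop leaves no node running the service, while START_ANYWAY risks two running copies, which is the case that can be repaired from the JMX console.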
Well, this is my biased point of view anyway ;-)

Thanks
Mirko

View the original post: http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3910787#3910787