From: Steve D. <st...@us...> - 2004-02-10 00:33:04
Stefan Beck wrote:
> Hello
>
> I'm using evms 2.2.2 and heartbeat 1.0.4 on debian sid with kernel 2.4.22.
>
> evms on a shared scsi-rack works fine so far.
> A manual forced takeover via '/usr/lib/heatbeat/hb_standby' works
> perfectly within seconds.
>
> But:
> If one node dies, the evms_failover script fails with returncode 7, so
> the volumes are not available on the new active node:
>
> heartbeat: 2004/02/09_15:08:56 ERROR: Return code 7 from
> /etc/ha.d/resource.d/evms_failover
>
> It seems that the evmscluster functionality stops working if the
> communication between the two node stops. Is this really the way it
> should work? Is there a workaround?

The evms_failover script makes sure that the CCM (the component that
handles cluster membership) is running before it attempts to make any
changes. The idea is that we don't want to make configuration changes
to the cluster while the membership is unstable. Apparently the CCM
does not provide a membership when one node is dead. I don't know
whether that is the proper behavior.

Workaround? Yes. Read on.

> evms (resp. evmsgui) can't be used for administering the cluster
> resources, too, when one node is down/dead.
>
> So if one node dies e.g. due to a hardware failure, one can't resize
> volumes, create new volumes,... until the second node (and heartbeat) is
> up again.

The Cluster Segment Manager (CSM) by design will not allow changes to
CSM containers if it cannot get a membership. As with the evms_failover
script, the intent is to prevent changes from being made while the
cluster does not have a stable membership.

The workaround is to set "admin_mode = yes" in the "csm" section of the
/etc/evms.conf file. The setting is already there, at the bottom of the
file, just commented out. When admin_mode is yes, the CSM bypasses the
membership check and allows changes to be made to cluster resources.

> Any comments on this?
>
> Thanks and regards
> Stefan

Hope this helps.

Steve D.
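
For anyone scripting the admin_mode workaround described above, here is a minimal Python sketch of the edit. It is not part of EVMS. The function name enable_csm_admin_mode is made up for illustration, and the code assumes the section is written as "csm {" ... "}" with "#" marking the commented-out admin_mode line. That layout is an assumption, so check it against your own /etc/evms.conf before relying on it.

```python
import re


def enable_csm_admin_mode(conf_text: str) -> str:
    """Return conf_text with 'admin_mode = yes' active in the csm section.

    Assumed (not verified) layout: a line starting with 'csm {' opens the
    section, a line starting with '}' closes it, and '#' comments a line out.
    """
    out = []
    in_csm = False
    done = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        if not in_csm and re.match(r"^csm\s*\{", stripped):
            in_csm = True
        elif in_csm and stripped.startswith("}"):
            # Section ends without an admin_mode line: add one before '}'.
            if not done:
                out.append("\tadmin_mode = yes")
                done = True
            in_csm = False
        elif in_csm and not done and re.match(r"^#?\s*admin_mode\s*=", stripped):
            # Uncomment (or overwrite) the setting, keeping the indentation.
            indent = line[: len(line) - len(line.lstrip())]
            line = f"{indent}admin_mode = yes"
            done = True
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical excerpt, only to show the transformation.
    sample = "csm {\n\t# admin_mode = yes\n}\n"
    print(enable_csm_admin_mode(sample), end="")
```

Run it against a copy of the file and compare the result with the original before replacing anything. Lines outside the csm section are passed through unchanged.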