Re: [Evms-cluster] evms takeover with heartbeat does not work when one node is down

Stefan Beck wrote:
> Hello
>
> I'm using evms 2.2.2 and heartbeat 1.0.4 on debian sid with kernel
> 2.4.22.
>
> evms on a shared scsi-rack works fine so far.
> A manual forced takeover via '/usr/lib/heartbeat/hb_standby' works
> perfectly within seconds.
>
> But:
> If one node dies, the evms_failover script fails with returncode 7, so
> the volumes are not available on the new active node:
>
> heartbeat: 2004/02/09_15:08:56 ERROR: Return code 7 from
> /etc/ha.d/resource.d/evms_failover
>
> It seems that the evmscluster functionality stops working if the
> communication between the two nodes stops. Is this really the way it
> should work? Is there a workaround?

The evms_failover script makes sure that the CCM (the component that
handles cluster membership) is running before it attempts to make any
changes.  The idea is that we don't want to make configuration changes to
the cluster while the membership is unstable.

Apparently the CCM does not provide a membership when one node is dead.  I
don't know if this is the proper behavior.

Workaround?  Yes.  Read on.

> evms (resp. evmsgui) can't be used for administering the cluster
> resources, too, when one node is down/dead.
>
> So if one node dies e.g. due to a hardware failure, one can't resize
> volumes, create new volumes,... until the second node (and heartbeat) is
> up again.

The Cluster Segment Manager by design will not allow changes to CSM
containers if it cannot get a membership.  As with the evms_failover
script, the design is to prevent changes being made while the cluster does
not have a stable membership.

The workaround is to set "admin_mode = yes" in the "csm" section of the
/etc/evms.conf file.  It's already in there at the bottom of the file, just
commented out.  When admin_mode is yes, the CSM will bypass the check for a
membership and will allow changes to be made to cluster resources.
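
As a rough sketch, the uncommented entry would look something like the
fragment below.  The brace-delimited section layout is an assumption based
on the usual evms.conf format; the file shipped with your EVMS version is
authoritative, so uncomment the existing line there rather than pasting
this in.

```
# /etc/evms.conf (fragment; layout assumed, check your installed file)
csm {
	# Bypass the cluster membership check so that CSM containers can
	# be changed while the other node is down.
	admin_mode = yes
}
```

Since this setting switches off the membership check, it probably makes
sense to comment it out again once both nodes are back up and the
membership is stable.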

> Any comments on this?
>
> Thanks and regards
> Stefan

Hope this helps.

Steve D.