I had a very good meeting with Peter Badovinatz at SA Forum this week concerning
the integration of CLMS from CI with the DLM. I believe we agreed that there
was a very good fit between what CLMS is providing and expecting and what
the DLM can provide and expects with respect to node downs and node ups.
At a somewhat superficial level:
a. DLM will register a nodedown and a nodeup routine with CLMS.
b. For each nodedown or nodeup, that routine will be called.
In the case of nodedown, the call will happen after all
ICS communication with the failed node has ended. Integration
between CLMS, DLM and STONITH was not discussed and I believe is
c. The registered nodedown and nodeup routines will arrange to have
a thread actually do the DLM processing (either by creating a
thread or by queueing the event to an existing thread).
d. When the DLM nodedown/nodeup routines have finished processing, they
will call clms_nodedown_callback() or clms_nodeup_callback()
to inform the local CLMS that they are done. CLMS will ensure
that the node does not change state until that nodedown/nodeup is
complete on all nodes.
e. In the case that a second nodedown is detected before the first
has completed, CLMS will just call the registered nodedown function.
DLM may choose to abort the first nodedown and do a combined
nodedown but it will not call the clms_nodedown_callback() for
either nodedown until both are completed and then it will call
the callback for each.
f. If a nodeup happens while one or more nodedowns are ongoing,
CLMS will call the registered nodeup routine as normal but the DLM
may defer processing the nodeup until the nodedowns are done.
This is ok and CLMS will wait for DLM to do the nodeup and to
call the clms_nodeup_callback() on all nodes before allowing
the node to proceed with its nodeup states.
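To make the ordering in (c) through (f) concrete, here is a rough
single-threaded sketch of the DLM side. It is only an illustration, not
the actual interface: clms_nodedown_callback() and clms_nodeup_callback()
are named above, but the int node argument is my assumption and they are
stubbed out here so the order of calls can be inspected. dlm_nodedown(),
dlm_nodeup(), dlm_worker_drain() and the event queue are hypothetical
names, and a real implementation would hand the queue to a thread and
lock it.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENTS 16

enum ev_type { EV_NODEDOWN, EV_NODEUP };
struct event { enum ev_type type; int node; };

/* Events queued by the registered routines, drained by the worker. */
struct event dlm_queue[MAX_EVENTS];
size_t dlm_queue_len;

/* Record of completions reported to CLMS, in the order they were made. */
struct event acked[MAX_EVENTS];
size_t acked_len;

/* Stubs for the CLMS completion calls; the signatures are assumed. */
void clms_nodedown_callback(int node)
{
    assert(acked_len < MAX_EVENTS);
    acked[acked_len++] = (struct event){ EV_NODEDOWN, node };
}

void clms_nodeup_callback(int node)
{
    assert(acked_len < MAX_EVENTS);
    acked[acked_len++] = (struct event){ EV_NODEUP, node };
}

/* Hypothetical routines the DLM would register with CLMS (steps a, b).
 * They only queue the event (step c); no DLM recovery is done here. */
void dlm_nodedown(int node)
{
    assert(dlm_queue_len < MAX_EVENTS);
    dlm_queue[dlm_queue_len++] = (struct event){ EV_NODEDOWN, node };
}

void dlm_nodeup(int node)
{
    assert(dlm_queue_len < MAX_EVENTS);
    dlm_queue[dlm_queue_len++] = (struct event){ EV_NODEUP, node };
}

/* Body of the worker, written as a plain function for this sketch. */
void dlm_worker_drain(void)
{
    size_t i;

    /* Step e: treat all pending nodedowns as one combined recovery.
     * The real lock recovery for the failed nodes would happen here,
     * before any of the nodedowns is reported as complete. */

    /* Only after the combined recovery is done, report each nodedown. */
    for (i = 0; i < dlm_queue_len; i++)
        if (dlm_queue[i].type == EV_NODEDOWN)
            clms_nodedown_callback(dlm_queue[i].node);

    /* Step f: nodeups were deferred until the nodedowns finished. */
    for (i = 0; i < dlm_queue_len; i++)
        if (dlm_queue[i].type == EV_NODEUP)
            clms_nodeup_callback(dlm_queue[i].node);

    dlm_queue_len = 0;
}
```

If CLMS calls dlm_nodedown(3), dlm_nodeup(7) and dlm_nodedown(5) in that
order before the worker runs, the sketch reports nodedown 3, nodedown 5
and then nodeup 7 back to CLMS, so the nodeup is held until both
nodedowns are complete.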
Kai will update and enhance his document on interfacing with
CLMS/ICS and publish that either with or before the 0.6.0 release.
At this point we see no reason why the DLM shouldn't work with
either CI by itself or SSI.
I hope Peter will correct me if I have missed anything.