From: Hans N. <han...@er...> - 2015-10-09 08:38:52
ack, code review only/Thanks HansN
On 10/05/2015 12:47 PM, pra...@or... wrote:
> osaf/services/saf/amf/amfd/sgproc.cc | 21 +++++++++++++++++++++
> 1 files changed, 21 insertions(+), 0 deletions(-)
>
>
> NG gets stuck in SHUTTING_DOWN state during a shutdown operation followed by controller failover.
>
> During a SHUTDOWN admin operation on an NG, the admin state is first set to SHUTTING_DOWN and
> checkpointed to the standby AMFD. On decoding it, the standby AMFD sets node->admin_ng,
> and clears it when the active AMFD checkpoints the LOCKED state. After a failover, when
> AMFD gets the quiescing success response from AMFND, it clears this pointer in
> process_su_si_response_for_ng(), assuming there is only one SU hosted on that node.
> When the response for a second SU then arrives, it is not processed from the NG
> perspective because AMFD has already cleared node->admin_ng. The issue does not occur when the node hosts
> only one application SU.
>
> The patch fixes the problem by not clearing node->admin_ng while the NG is in SHUTTING_DOWN state.
>
> diff --git a/osaf/services/saf/amf/amfd/sgproc.cc b/osaf/services/saf/amf/amfd/sgproc.cc
> --- a/osaf/services/saf/amf/amfd/sgproc.cc
> +++ b/osaf/services/saf/amf/amfd/sgproc.cc
> @@ -400,6 +400,27 @@ void process_su_si_response_for_ng(AVD_S
> ng->node_oper_list.erase(Amf::to_string(&node->name));
> TRACE("node_oper_list size:%u",ng->oper_list_size());
> }
> +
> + /*Handling for the case: there are pending assignments on more than one SU
> + on the same node of the nodegroup, with at least one quiescing assignment, and a
> + controller failover occurred.
> + The if block below is hit only when assignments for quiescing state are still pending
> + on at least one SU and on at least one node of the NG.
> + */
> + if ((ng->saAmfNGAdminState == SA_AMF_ADMIN_SHUTTING_DOWN) &&
> + (ng->admin_ng_pend_cbk.admin_oper == 0) &&
> + (ng->admin_ng_pend_cbk.invocation == 0)) {
> + /*During a SHUTDOWN admin operation on an NG, the admin state is first set to SHUTTING_DOWN
> + and checkpointed to the standby AMFD. On decoding it, the standby AMFD sets
> + node->admin_ng, and clears it when the active AMFD checkpoints the LOCKED state.
> + If the active AMFD sends the quiescing state and reboots after checkpointing only the
> + SHUTTING_DOWN state, the standby AMFD can still mark the NG LOCKED by processing the
> + assignment responses, because it has set node->admin_ng. So this pointer should be
> + cleared only when the NG is marked LOCKED, and in that case we will not be in this if block.
> + */
> + TRACE_1("'%s' in shutting_down state after failover.",ng->name.value);
> + goto done;
> + }
> /*If assignment changes are done on all the SUs on each node of nodegroup
> then reply to IMM for status of admin operation.*/
> if (ng->node_oper_list.empty())
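
For readers following the review, the failure sequence in the commit message can be modelled with a small standalone sketch. This is not AMFD code: the types NodeGroup and Node, the counter pending_su_responses, and the functions on_su_response() and simulate() are hypothetical simplifications made up for illustration. Only the idea is taken from the patch: the node's admin_ng pointer must survive until the NG is actually LOCKED.

```cpp
// Hypothetical, simplified model of the scenario; none of these names exist in AMFD.
enum class AdminState { UNLOCKED, SHUTTING_DOWN, LOCKED };

struct NodeGroup {
  AdminState admin_state = AdminState::SHUTTING_DOWN;
};

struct Node {
  NodeGroup* admin_ng = nullptr;  // set while an NG admin operation is in progress
  int pending_su_responses = 0;   // SUs with an outstanding quiescing assignment
};

// Handle one quiescing response for an SU hosted on `node`.
// keep_ptr_while_shutting_down = false models the behaviour before the patch,
// true models the behaviour the patch aims for.
void on_su_response(Node& node, bool keep_ptr_while_shutting_down) {
  if (node.pending_su_responses > 0) --node.pending_su_responses;

  NodeGroup* ng = node.admin_ng;
  if (ng == nullptr) return;  // response is not processed from the NG perspective

  if (node.pending_su_responses == 0) {
    // All SUs on the node have answered: the NG can be marked LOCKED,
    // and only now is it safe to drop the pointer.
    ng->admin_state = AdminState::LOCKED;
    node.admin_ng = nullptr;
    return;
  }

  if (keep_ptr_while_shutting_down &&
      ng->admin_state == AdminState::SHUTTING_DOWN) {
    return;  // patched behaviour: keep the pointer for the remaining SUs
  }

  node.admin_ng = nullptr;  // old behaviour: assumes one SU per node
}

// Run the scenario for a node hosting `su_count` SUs and return the final NG state.
AdminState simulate(int su_count, bool keep_ptr_while_shutting_down) {
  NodeGroup ng;
  Node node;
  node.admin_ng = &ng;
  node.pending_su_responses = su_count;
  for (int i = 0; i < su_count; ++i) {
    on_su_response(node, keep_ptr_while_shutting_down);
  }
  return ng.admin_state;
}
```

In this model, simulate(2, false) leaves the NG in SHUTTING_DOWN, simulate(2, true) reaches LOCKED, and simulate(1, false) also reaches LOCKED, which matches the observation that the issue does not occur with a single application SU on the node.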