#822 AMF: Comp Failover performed before comp cleanup done (missing case)

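Milestone: 4.3.3
Status: fixed
Owner: nobody
Labels: None
Type: defect
Component: amf
Part: nd
Version: 4.4
Severity: major
Updated: 2014-04-08
Created: 2014-03-26
Creator: Hans Feldt
Private: No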

Test case: SU lock with component failing in QUIESCED state, 2N

With a single sa-aware component in an SU, cleanup is not awaited before the oper-state(DISABLED) message is sent to the director. This causes activation of the standby component before cleanup has finished.
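The ordering problem can be illustrated with a small toy model. This is only a sketch and not OpenSAF code: the class `NodeDirectorModel`, its methods, and the single `"notify"` message (standing in for whatever message lets the director act on the failure) are all hypothetical names made up for illustration. It contrasts notifying the director immediately with holding the notification until cleanup has finished.

```python
from dataclasses import dataclass, field


@dataclass
class NodeDirectorModel:
    """Toy stand-in for a node director. Hypothetical, not OpenSAF code."""

    wait_for_cleanup: bool
    cleanup_done: bool = False
    pending: list = field(default_factory=list)
    # Each entry is (message, was cleanup finished when it was sent?).
    sent: list = field(default_factory=list)

    def _send(self, msg):
        self.sent.append((msg, self.cleanup_done))

    def component_failed(self):
        # "notify" is a placeholder for the message that lets the
        # director start activating the standby.
        if self.wait_for_cleanup:
            self.pending.append("notify")  # hold until cleanup completes
        else:
            self._send("notify")           # sent while cleanup is still running

    def cleanup_finished(self):
        self.cleanup_done = True
        for msg in self.pending:
            self._send(msg)
        self.pending.clear()


def run(wait_for_cleanup):
    """Simulate one component failure followed by cleanup completion."""
    nd = NodeDirectorModel(wait_for_cleanup)
    nd.component_failed()
    nd.cleanup_finished()
    return nd.sent


if __name__ == "__main__":
    print("immediate:", run(False))
    print("deferred: ", run(True))
```

In the immediate variant the director is told about the failure while cleanup is still outstanding, which is the situation described in this ticket; in the deferred variant the same message only goes out once cleanup is complete.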

Related

Tickets: #822
Wiki: ChangeLog-4.3.3
Wiki: ChangeLog-4.4.1

Discussion

  • Hans Feldt

    Hans Feldt - 2014-03-26
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -1,5 +1,5 @@
    -Test case: SU lock with component failing in QUIESCED state
    +Test case: SU lock with component failing in QUIESCED state, 2N
    
     With a single sa-aware component in an SU, cleanup is not awaited before the oper-state(DISABLED) message is sent to the director. This causes activation of the standby component before cleanup has finished.
    
    -And most severe this leads to that auto repair not gets performed.
    +The component gets instantiated again (auto repair) but not assigned standby.
    
     
  • Hans Feldt

    Hans Feldt - 2014-03-26
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -1,5 +1,3 @@
     Test case: SU lock with component failing in QUIESCED state, 2N
    
     With a single sa-aware component in an SU, cleanup is not awaited before the oper-state(DISABLED) message is sent to the director. This causes activation of the standby component before cleanup has finished.
    -
    -The component gets instantiated again (auto repair) but not assigned standby.
    
     
  • Praveen

    Praveen - 2014-03-27

    Since the SU contains only one component, and that component is the one that faulted, it was AMFND itself that responded to AMFD for the quiesced assignment during escalation. Here is the AMFND trace:

    Mar 27 10:13:29.486079 osafamfnd [26403:err.cc:1375] << avnd_err_esc_su_failover: retval=1
    Mar 27 10:13:29.486105 osafamfnd [26403:err.cc:0368] NO 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' faulted due to 'csiSetcallbackFailed' : Recovery is 'componentFailover'
    Mar 27 10:13:29.486116 osafamfnd [26403:err.cc:0474] >> avnd_err_recover: SU:safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1 Comp:safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1
    Mar 27 10:13:29.486127 osafamfnd [26403:err.cc:0682] >> avnd_err_rcvr_comp_failover: 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
    Mar 27 10:13:29.486137 osafamfnd [26403:di.cc:0662] >> avnd_di_object_upd_send: Comp 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
    Mar 27 10:13:29.486149 osafamfnd [26403:di.cc:0767] >> avnd_di_msg_send: Msg type '8'
    Mar 27 10:13:29.486159 osafamfnd [26403:di.cc:0958] >> avnd_diq_rec_add
    Mar 27 10:13:29.486170 osafamfnd [26403:di.cc:0973] << avnd_diq_rec_add
    Mar 27 10:13:29.486179 osafamfnd [26403:di.cc:1034] >> avnd_diq_rec_send
    Mar 27 10:13:29.486190 osafamfnd [26403:mds.cc:1150] >> avnd_mds_send: Msg type '1'
    Mar 27 10:13:29.487506 osafamfnd [26403:mds.cc:1205] << avnd_mds_send: 1
    Mar 27 10:13:29.487522 osafamfnd [26403:di.cc:1054] << avnd_diq_rec_send: 1
    Mar 27 10:13:29.487532 osafamfnd [26403:di.cc:0799] << avnd_di_msg_send: 1
    Mar 27 10:13:29.487542 osafamfnd [26403:di.cc:0681] << avnd_di_object_upd_send: 1
    Mar 27 10:13:29.487552 osafamfnd [26403:comp.cc:2471] >> avnd_comp_cmplete_all_assignment: Comp 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
    Mar 27 10:13:29.487563 osafamfnd [26403:comp.cc:1532] >> avnd_comp_csi_assign_done: 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1', (nil)
    Mar 27 10:13:29.487627 osafamfnd [26403:comp.cc:1543] IN Assigned 'all CSIs' QUIESCED to 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
    Mar 27 10:13:29.487642 osafamfnd [26403:cbq.cc:1003] >> avnd_comp_cbq_csi_rec_del: 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1', '(null)'
    Mar 27 10:13:29.487655 osafamfnd [26403:tmr.cc:0125] TR callback response timer stopped
    Mar 27 10:13:29.487667 osafamfnd [26403:cbq.cc:1032] << avnd_comp_cbq_csi_rec_del
    Mar 27 10:13:29.487677 osafamfnd [26403:comp.cc:1423] >> all_csis_at_rank_assigned: 'safSi=AmfDemo,safApp=AmfDemo1'rank=1
    Mar 27 10:13:29.487687 osafamfnd [26403:comp.cc:1449] << all_csis_at_rank_assigned: true
    Mar 27 10:13:29.487697 osafamfnd [26403:susm.cc:0940] >> avnd_su_si_oper_done: 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' '(null)'
    Mar 27 10:13:29.487708 osafamfnd [26403:susm.cc:0948] TR SI 'safSi=AmfDemo,safApp=AmfDemo1'
    Mar 27 10:13:29.487746 osafamfnd [26403:susm.cc:0955] NO Assigned 'safSi=AmfDemo,safApp=AmfDemo1' QUIESCED to 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
    Mar 27 10:13:29.487760 osafamfnd [26403:di.cc:0575] >> avnd_di_susi_resp_send: Sending Resp su=safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1, si=safSi=AmfDemo,safApp=AmfDemo1, curr_state=3, prv_state=1
    Mar 27 10:13:29.487772 osafamfnd [26403:di.cc:0586] TR curr_assign_state '3'
    Mar 27 10:13:29.487782 osafamfnd [26403:di.cc:0616] TR Sending. msg_id'104', node_id'131343', msg_act'5', su'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1', si'', ha_state'3', error'1', single_csi'0'
    Mar 27 10:13:29.487793 osafamfnd [26403:di.cc:0767] >> avnd_di_msg_send: Msg type '5'

    After this, AMFND triggers cleanup of the component:
    Mar 27 10:13:29.488412 osafamfnd [26403:cpm.cc:0186] << avnd_comp_pm_rec_del_all
    Mar 27 10:13:29.488421 osafamfnd [26403:comp.cc:1850] << avnd_comp_curr_info_del: 1
    Mar 27 10:13:29.488431 osafamfnd [26403:clc.cc:0763] >> avnd_comp_clc_fsm_run: Comp 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1', Ev '7'
    Mar 27 10:13:29.488442 osafamfnd [26403:clc.cc:0817] TR stopping all monitoring for this component
    Mar 27 10:13:29.488468 osafamfnd [26403:cpm.cc:0634] >> avnd_comp_pm_finalize: Comp 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
    Mar 27 10:13:29.488478 osafamfnd [26403:cpm.cc:0650] << avnd_comp_pm_finalize
    Mar 27 10:13:29.488488 osafamfnd [26403:clc.cc:0835] T1 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1':Entering CLC FSM: presence state:'SA_AMF_PRESENCE_INSTANTIATED(3)', Event:'AVND_COMP_CLC_PRES_FSM_EV_CLEANUP'
    Mar 27 10:13:29.488498 osafamfnd [26403:clc.cc:1816] >> avnd_comp_clc_inst_clean_hdler: 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1': Cleanup event in the instantiated state
    Mar 27 10:13:29.488509 osafamfnd [26403:clc.cc:2555] >> avnd_comp_clc_cmd_execute: 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1':CLC CLI command type:'AVND_COMP_CLC_CMD_TYPE_CLEANUP(3)'
    Mar 27 10:13:29.488528 osafamfnd [26403:clc.cc:2711] T1 CLC CLI script:'/opt/amf_demo/amf_demo_script'
    Mar 27 10:13:29.488538 osafamfnd [26403:clc.cc:2713] T1 CLC CLI command arguments[1] ='cleanup'
    Mar 27 10:13:29.488571 osafamfnd [26403:clc.cc:2716] T1 CLC CLI command timeout: In nano secs:10000000000 In milli secs: 10000
    Mar 27 10:13:29.488582 osafamfnd [26403:clc.cc:2720] T1 CLC CLI command env variable name = 'AMF_DEMO_VAR2': value ='CT_VALUE2'
    Mar 27 10:13:29.488592 osafamfnd [26403:clc.cc:2720] T1 CLC CLI command env variable name = 'AMF_DEMO_VAR2': value ='CT_VALUE2'
    Mar 27 10:13:29.488601 osafamfnd [26403:clc.cc:2720] T1 CLC CLI command env variable name = 'AMF_DEMO_VAR2': value ='CT_VALUE2'
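The ordering visible in the two trace excerpts above can also be checked mechanically. The following is a minimal sketch, assuming the trace line layout seen in this comment (timestamp, process name, `[pid:file:line]`, message); the helper names `parse` and `first_time` are hypothetical and not part of any OpenSAF tooling. It compares the time of the first `avnd_di_susi_resp_send` entry with the time the cleanup CLC command is executed.

```python
import re

# Layout assumed from the trace lines quoted in this ticket.
LINE_RE = re.compile(
    r"^\w{3} +\d+ (?P<h>\d{2}):(?P<m>\d{2}):(?P<s>\d{2})\.(?P<us>\d{6}) "
    r"(?P<proc>\S+) \[(?P<pid>\d+):(?P<file>[^:\]]+):(?P<line>\d+)\] "
    r"(?P<body>.*)$"
)


def parse(line):
    """Return (time of day in microseconds, source file, message) or None."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    seconds = int(m["h"]) * 3600 + int(m["m"]) * 60 + int(m["s"])
    return seconds * 1_000_000 + int(m["us"]), m["file"], m["body"]


def first_time(lines, marker):
    """Time of the first trace line whose message contains marker."""
    for line in lines:
        parsed = parse(line)
        if parsed is not None and marker in parsed[2]:
            return parsed[0]
    return None


# Two lines copied from the trace in this comment.
TRACE = [
    "Mar 27 10:13:29.487760 osafamfnd [26403:di.cc:0575] >> avnd_di_susi_resp_send: "
    "Sending Resp su=safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1, "
    "si=safSi=AmfDemo,safApp=AmfDemo1, curr_state=3, prv_state=1",
    "Mar 27 10:13:29.488509 osafamfnd [26403:clc.cc:2555] >> avnd_comp_clc_cmd_execute: "
    "'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1':"
    "CLC CLI command type:'AVND_COMP_CLC_CMD_TYPE_CLEANUP(3)'",
]

resp_time = first_time(TRACE, "avnd_di_susi_resp_send")
cleanup_time = first_time(TRACE, "AVND_COMP_CLC_CMD_TYPE_CLEANUP")

if __name__ == "__main__":
    print("response sent before cleanup started:", resp_time < cleanup_time)
    print("gap in microseconds:", cleanup_time - resp_time)
```

The same check could be run against a full trace file by passing its lines to `first_time`; only the time of day is compared, so it assumes both entries fall on the same day.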

     
  • Hans Feldt

    Hans Feldt - 2014-03-31
    • status: unassigned --> accepted
    • assigned_to: Hans Feldt
     
  • Hans Feldt

    Hans Feldt - 2014-03-31
    • status: accepted --> review
     
  • Hans Feldt

    Hans Feldt - 2014-04-08

    changeset: 5113:dab0d4067b90
    branch: opensaf-4.3.x
    parent: 5110:87f6fa7ae4fe
    user: Hans Feldt <hans.feldt@ericsson.com>
    date: Tue Apr 08 06:34:50 2014 +0200
    summary: avnd: send SUSI response after cleanup of comp [#822]

    changeset: 5114:9dbafd1322b9
    branch: opensaf-4.4.x
    parent: 5111:419fae714e2c
    user: Hans Feldt <hans.feldt@ericsson.com>
    date: Mon Mar 31 13:07:42 2014 +0200
    summary: amfnd: send SUSI response after cleanup of comp [#822]

    changeset: 5115:257088744782
    tag: tip
    parent: 5112:deeaff39c414
    user: Hans Feldt <hans.feldt@ericsson.com>
    date: Mon Mar 31 13:07:42 2014 +0200
    summary: amfnd: send SUSI response after cleanup of comp [#822]

     


  • Hans Feldt

    Hans Feldt - 2014-04-08
    • status: review --> fixed
    • assigned_to: Hans Feldt --> nobody
    • Milestone: future --> 4.3.3
     
