
#1276 AMF: saAmfSINumCurrStandbyAssignments holds an invalid value in the 2N model

4.5.2
fixed
None
defect
amf
-
minor
2015-05-22
2015-03-20
Srikanth R
No

Setup
Version : 4.6 FC
model : 2n
configuration : 1 App, 1 SG, 2 SUs with 4 comps each, 4 SIs with 1 CSI each
SU1 is mapped to pl-3 and SU2 to pl-4

Initial state
All AMF entities of the application are in the unlocked state. All SIs are fully assigned. SU1 is the standby SU and SU2 is the active SU.

Steps Performed :

-> Ran the command "/etc/init.d/opensafd stop" on the PL-3 node.

Mar 20 15:34:47 SYSTEST-PLD-1 opensafd: Stopping OpenSAF Services
Mar 20 15:34:47 SYSTEST-PLD-1 osafamfnd[6835]: NO Shutdown initiated

Now SU2 on PL-4 has the active assignments.

-> Started opensaf on PL-3 node.

Mar 20 15:36:43 SYSTEST-PLD-1 opensafd: Starting OpenSAF Services (Using TIPC)
Mar 20 15:36:45 SYSTEST-PLD-1 opensafd: OpenSAF(4.6.FC - ) services successfully started

Now SU2 on PL-4 is active and SU1 on PL-3 is standby.

-> Stopped opensaf on PL-4 node.

Mar 20 15:36:50 SYSTEST-PLD-1 osafamfnd[16251]: NO Assigned 'all SIs' ACTIVE of 'safSu=SU1,safSg=SG,safApp=test2nApp'
Mar 20 15:36:53 SYSTEST-PLD-1 kernel: [14611.120045] TIPC: Resetting link <1.1.3:eth2-1.1.4:eth2>, peer not responding
Mar 20 15:36:53 SYSTEST-PLD-1 kernel: [14611.120051] TIPC: Lost link <1.1.3:eth2-1.1.4:eth2> on network plane A
Mar 20 15:36:53 SYSTEST-PLD-1 kernel: [14611.120056] TIPC: Lost contact with <1.1.4>
Mar 20 15:37:08 SYSTEST-PLD-1 kernel: [14626.188976] TIPC: Established link <1.1.3:eth2-1.1.4:eth2> on network plane A

Now SU1 on PL-3 is active and SU2 is in the unassigned state.

-> Started opensaf on PL-4 node.

Mar 20 15:37:08 SYSTEST-PLD-1 kernel: [14626.188976] TIPC: Established link <1.1.3:eth2-1.1.4:eth2> on network plane A

Now the AMF assignment state of every SI shows as partially assigned, because saAmfSINumCurrStandbyAssignments is set to 2, which is invalid for the 2N model.
The callbacks delivered to the components are correct; only the IMM attribute is updated incorrectly by AMF.

saAmfSIPrefStandbyAssignments SA_UINT32_T 1 (0x1)
saAmfSIPrefActiveAssignments SA_UINT32_T 1 (0x1)
saAmfSINumCurrStandbyAssignments SA_UINT32_T 2 (0x2)
saAmfSINumCurrActiveAssignments SA_UINT32_T 1 (0x1)
saAmfSIAssignmentState SA_UINT32_T 3 (0x3)
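
The inconsistency in the dump above can be checked mechanically. The following is a minimal sketch, not part of AMF or the IMM tools: the helper names parse_attrs and find_violations are hypothetical, and the rule "current count must not exceed preferred count" is an assumption derived from the report's statement that a standby count of 2 is invalid for the 2N model.

```python
import re

# Attribute dump copied verbatim from this ticket.
IMMLIST_OUTPUT = """\
saAmfSIPrefStandbyAssignments SA_UINT32_T 1 (0x1)
saAmfSIPrefActiveAssignments SA_UINT32_T 1 (0x1)
saAmfSINumCurrStandbyAssignments SA_UINT32_T 2 (0x2)
saAmfSINumCurrActiveAssignments SA_UINT32_T 1 (0x1)
saAmfSIAssignmentState SA_UINT32_T 3 (0x3)
"""

# Matches lines of the form "<name> SA_UINT32_T <decimal> (0x<hex>)".
LINE_RE = re.compile(r"^(\w+)\s+SA_UINT32_T\s+(\d+)\s+\(0x[0-9a-fA-F]+\)\s*$")


def parse_attrs(text):
    """Return {attribute name: integer value} for every SA_UINT32_T line."""
    attrs = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            attrs[match.group(1)] = int(match.group(2))
    return attrs


def find_violations(attrs):
    """Assumed rule: a current count must not exceed the preferred count."""
    problems = []
    for role in ("Active", "Standby"):
        curr = attrs[f"saAmfSINumCurr{role}Assignments"]
        pref = attrs[f"saAmfSIPref{role}Assignments"]
        if curr > pref:
            problems.append(
                f"saAmfSINumCurr{role}Assignments={curr} exceeds preferred {pref}"
            )
    return problems


attrs = parse_attrs(IMMLIST_OUTPUT)
violations = find_violations(attrs)
for problem in violations:
    print(problem)
```

Run against the values in this report, the check flags only the standby counter; the active counter matches its preferred value.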

An AMF lock on the SI then resulted in the following values:
saAmfSIPrefStandbyAssignments SA_UINT32_T 1 (0x1)
saAmfSIPrefActiveAssignments SA_UINT32_T 1 (0x1)
saAmfSINumCurrStandbyAssignments SA_UINT32_T 1 (0x1)
saAmfSINumCurrActiveAssignments SA_UINT32_T 0 (0x0)

1 Attachments

Related

Tickets: #1276
Wiki: ChangeLog-4.5.2
Wiki: ChangeLog-4.6.1

Discussion

  • Nagendra Kumar

    Nagendra Kumar - 2015-03-30
    • status: unassigned --> assigned
    • assigned_to: Nagendra Kumar
     
  • Nagendra Kumar

    Nagendra Kumar - 2015-03-30

    From log analysis, an SI swap was issued:
    Mar 20 15:07:50.825598 osafamfd [2353:si.cc:0821] >> si_admin_op_cb: safSi=SI3,safApp=test2nApp op=7
    But the component timed out while transitioning from Quiesced to Standby, and an SU failover was triggered.

    Mar 20 15:08:10 SYSTEST-PLD-1 osafamfnd[6835]: NO Performing failover of 'safSu=SU1,safSg=SG,safApp=test2nApp' (SU failover count: 5)
    Mar 20 15:08:10 SYSTEST-PLD-1 osafamfnd[6835]: NO 'safComp=COMP1,safSu=SU1,safSg=SG,safApp=test2nApp' recovery action escalated from 'componentFailover' to 'suFailover'
    Mar 20 15:08:10 SYSTEST-PLD-1 osafamfnd[6835]: NO 'safComp=COMP1,safSu=SU1,safSg=SG,safApp=test2nApp' faulted due to 'csiSetcallbackTimeout' : Recovery is 'suFailover'

    It is reproducible with the following steps:
    1. Configure SU failover for the AMF demo app and perform an SI swap after unlocking both SUs.
    2. Use gdb to make the component time out while it is going from quiesced to standby.
    3. Run immlist.
    saAmfSINumCurrStandbyAssignments is still 1, but it should be 0:

    saAmfSINumCurrStandbyAssignments SA_UINT32_T 1 (0x1)
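
The reproduction above can be illustrated with a toy model. This is only a sketch of why a counter can go stale and one way to avoid it; it is not the amfd code and not the change made in the changesets for this ticket. The ServiceInstance class and its methods are hypothetical. The model derives the counts from the assignment records themselves, so removing an assignment during SU failover can never leave a stale standby count behind.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceInstance:
    """Toy SI: maps SU name -> HA state ("active", "quiesced", "standby")."""

    assignments: dict = field(default_factory=dict)

    def set_ha_state(self, su, state):
        # Record the HA state the SU currently holds for this SI.
        self.assignments[su] = state

    def remove_assignment(self, su):
        # Called when the SU loses its assignment, e.g. on SU failover.
        self.assignments.pop(su, None)

    def count(self, state):
        # Counts are computed from the records, never stored separately.
        return sum(1 for s in self.assignments.values() if s == state)


si = ServiceInstance()

# Initial 2N state: one active SU and one standby SU.
si.set_ha_state("SU1", "active")
si.set_ha_state("SU2", "standby")

# SI swap: the active SU is quiesced, then the standby SU becomes active.
si.set_ha_state("SU1", "quiesced")
si.set_ha_state("SU2", "active")

# SU1 should now go from quiesced to standby, but the callback times out
# and SU failover removes SU1's assignment instead.
si.remove_assignment("SU1")

print("active:", si.count("active"), "standby:", si.count("standby"))
```

In this model the standby count after the failed swap is 0, which is the value the comment above says the attribute should have.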

     
  • Nagendra Kumar

    Nagendra Kumar - 2015-03-31
    • Milestone: 4.6.RC1 --> 4.7-Tentative
     
  • Praveen

    Praveen - 2015-03-31
    • Milestone: 4.7-Tentative --> 4.6.0
     
  • Nagendra Kumar

    Nagendra Kumar - 2015-04-07
    • status: assigned --> review
     
  • Nagendra Kumar

    Nagendra Kumar - 2015-04-07
    • Milestone: 4.6.0 --> 4.7-Tentative
     
  • Anders Bjornerstedt

    A defect cannot have a future release as its milestone.
    If the defect exists on the 4.6 branch, then it should be fixed on the 4.6
    branch and any later branches.

     
  • Nagendra Kumar

    Nagendra Kumar - 2015-05-22

    changeset: 6568:532573afb8da
    branch: opensaf-4.5.x
    parent: 6562:e0acc354bd06
    user: Nagendra Kumar <nagendra.k@oracle.com>
    date: Fri May 22 15:10:55 2015 +0530
    summary: amfd: adjust saAmfSINumCurrStandbyAssignments during HA state change [#1276]

    changeset: 6569:2b5372f7166b
    branch: opensaf-4.6.x
    parent: 6566:9bff9230b284
    user: Nagendra Kumar <nagendra.k@oracle.com>
    date: Fri May 22 15:11:39 2015 +0530
    summary: amfd: adjust saAmfSINumCurrStandbyAssignments during HA state change [#1276]

    changeset: 6570:5720e8f398e3
    tag: tip
    parent: 6567:89b2c6789acf
    user: Nagendra Kumar <nagendra.k@oracle.com>
    date: Fri May 22 15:11:47 2015 +0530
    summary: amfd: adjust saAmfSINumCurrStandbyAssignments during HA state change [#1276]

    [staging:532573]
    [staging:2b5372]
    [staging:5720e8]

     

    Related

    Tickets: #1276
    Commit: [2b5372]
    Commit: [532573]
    Commit: [5720e8]

  • Nagendra Kumar

    Nagendra Kumar - 2015-05-22
    • status: review --> fixed
    • Milestone: 4.7-Tentative --> 4.5.2
     
