#1384 amf-state prints an error if it is called during failover

4.5.2
fixed
None
defect
amf
tools
4.5.1
minor
2015-06-18
2015-06-09
Thuan Tran
No

When a failover is triggered, the immlist tool misbehaves for a short time (possibly because the implementer is disconnected). During that window immlist either cannot read attribute values or returns an empty value (e.g. saAmfCSICompHAState=<Empty>), and amf-state does not handle these empty values well.

/usr/bin/amf-state: line 62: [: <Empty>: integer expression expected
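
The message comes from the shell's `[` builtin: `-eq` needs an integer on both sides, and the literal string `<Empty>` is not one. A minimal sketch that reproduces it outside OpenSAF follows; the function name `check_active` is made up for illustration, and the value is passed in by hand rather than read with immlist.

```shell
# check_active VALUE: compare VALUE numerically with 1, the way amf-state
# compares state attributes. VALUE stands in for immlist output.
check_active() {
    if [ "$1" -eq 1 ]; then
        echo "active"
    else
        # Reached for other integers, and also when '[' rejects a non-integer.
        echo "not active (or not an integer)"
    fi
}

check_active 1
check_active "<Empty>"   # '[' also writes a complaint to stderr here
```

The second call takes the else branch because `[` returns a non-zero status, so the only visible symptom is the message on stderr.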

Reproduction

  • Run amf-state in a while loop on the standby SC.
  • Reboot the active SC.
  • The failover occurs and amf-state prints the error.
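
The loop in the first step could be sketched as below. The helper `poll_for_errors` and the `POLL_INTERVAL` variable are hypothetical names, not part of OpenSAF, and the ticket does not say how the original loop was written or which arguments amf-state was given.

```shell
# poll_for_errors CMD COUNT: run CMD COUNT times and print how many runs
# wrote "integer expression expected" to stderr.
poll_for_errors() {
    cmd=$1
    count=$2
    hits=0
    i=0
    while [ "$i" -lt "$count" ]; do
        # Send stderr into the pipe and discard stdout.
        if $cmd 2>&1 >/dev/null | grep -q "integer expression expected"; then
            hits=$((hits + 1))
        fi
        i=$((i + 1))
        sleep "${POLL_INTERVAL:-1}"   # pause between runs, 1 second by default
    done
    echo "$hits"
}
```

On the standby SC this might be started as `poll_for_errors amf-state 60` just before the active SC is rebooted; a non-zero result means the error was hit at least once.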

Related

Tickets: #1384
Wiki: ChangeLog-4.5.2
Wiki: ChangeLog-4.6.1

Discussion

  • Neelakanta Reddy

    From IMM perspective:

    When the implementer is detached or IMMND goes down, the cached attributes no longer have any significance, so when immlist is called the value is shown as empty.

    The same is present in the IMM spec:
    saImmOiClassImplementerRelease/saImmOiObjectImplementerRelease

    If the operation succeeds, the IMM Service removes the
    SA_IMM_ATTR_IMPLEMENTER_NAME attribute as well as all non-persistent cached
    runtime attributes from all objects of that class.

    In 4.6 an enhancement (#1156) was added:

    OM clients performing a read (iteration or accessor-get) that fetches a cached
    runtime attribute, will not immediately see the attribute as empty when/if the
    OI detaches. Instead, for a grace period of 6 seconds, the latest set value is shown. This allows for failover or switchover or process restart of OI to occur without OM clients seeing the "glitch" in the value of the cached runtime attribute.

     
  • Anders Bjornerstedt

    Good analysis by Neelakanta.

    So the component for this ticket should be AMF; the part is already set
    correctly to tools.

     
  • Nagendra Kumar

    Nagendra Kumar - 2015-06-10

    Hi Thuan,
    Please specify OpenSAF version.

    Thanks
    -Nagu

     
  • Nagendra Kumar

    Nagendra Kumar - 2015-06-10
    • Component: unknown --> amf
     
  • Thuan Tran

    Thuan Tran - 2015-06-10
    • Version: --> 4.5.1
     
  • Nagendra Kumar

    Nagendra Kumar - 2015-06-10

    Hi Thuan,
    What is the expectation from this ticket? Do you think amf-state should not throw the error (integer expression expected) and should just print NULL? Please clarify.

    As Neel explained, from the 4.6 release onward this problem will not occur (if the failover time is < 6 seconds).

    Thanks
    -Nagu

     
  • Thuan Tran

    Thuan Tran - 2015-06-10

    We expect no error printout.
    This problem still occurs on 4.6, maybe because the failover takes > 6s.

     
  • Nagendra Kumar

    Nagendra Kumar - 2015-06-10

    Hi Thuan,
    Thanks for the clarification, I need some more:
    saAmfCSICompHAState may have valid values such as SA_AMF_HA_ACTIVE = 1, SA_AMF_HA_STANDBY = 2, etc. During failover, its value may be returned as NULL/zero, which is not a valid value for saAmfCSICompHAState. What should be printed in that case (after we suppress the error, of course)? Do you think we should skip non-persistent cached attributes when their values are returned as NULL/zero and print only the other attributes?

    Thanks
    -Nagu
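
One possible answer to the question above, sketched in shell: map the numeric HA state to a name and fall back to a placeholder for anything else, including an empty value or `<Empty>`. The function name `ha_state_name` and the `UNKNOWN` placeholder are assumptions for illustration, not what the patch does; the numeric values are those of the AMF HA state enum.

```shell
# ha_state_name VALUE: print a readable name for a numeric AMF HA state.
ha_state_name() {
    case "$1" in
        1) echo "ACTIVE" ;;
        2) echo "STANDBY" ;;
        3) echo "QUIESCED" ;;
        4) echo "QUIESCING" ;;
        # Empty string, "<Empty>", 0, or anything else that is not a valid state.
        *) echo "UNKNOWN" ;;
    esac
}
```

Because `case` compares strings, no numeric test is involved and a non-integer value cannot trigger the "integer expression expected" message.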

     
  • Anders Bjornerstedt

    If you have a fail-over time longer than 6 seconds then you are likely going to
    get more serious problems than an ugly printout.

    To be precise, It is time to fail-over services plus time for the new active
    AMFD to attach. The registration of the new active AMFD as OI is visible in
    the syslogs.

    Could be that there is some deeper issue that needs resolving here.
    Perhaps the TRY_AGAIN loop in the AMFD for re-attaching as OI has a
    sleep that is too large or a back-off that is too quick to add time.

     
  • Nagendra Kumar

    Nagendra Kumar - 2015-06-15

    Update from Thuan:

    I think it should not print the headings (marked "<== Here") when no object satisfies the condition.

    amf_state()
    {
        echo
        echo "Service Instances UNLOCKED and UNASSIGNED:"   # <== Here
        si_dns=`immfind -c SaAmfSI`
        for si_dn in $si_dns; do
            adm_state_val=`immlist -a "saAmfSIAdminState" $si_dn | cut -d = -f2`
            ass_state_val=`immlist -a "saAmfSIAssignmentState" $si_dn | cut -d = -f2`

            if [ $adm_state_val -eq 1 ] && [ $ass_state_val -eq 1 ]; then   # <== Here
                echo "   $si_dn"
            fi
        done

        echo
        echo "Service Units UNLOCKED and DISABLED:"   # <== Here
        su_dns=`immfind -c SaAmfSU`
        for su_dn in $su_dns; do
            adm_state_val=`immlist -a "saAmfSUAdminState" $su_dn | cut -d = -f2`
            oper_state_val=`immlist -a "saAmfSUOperState" $su_dn | cut -d = -f2`

            if [ $adm_state_val -eq 1 ] && [ $oper_state_val -eq 2 ]; then   # <== Here
                echo "   $su_dn"
            fi
        done

        echo
        echo "Nodes UNLOCKED and DISABLED:"   # <== Here
        node_dns=`immfind -c SaAmfNode`
        for node_dn in $node_dns; do
            adm_state_val=`immlist -a "saAmfNodeAdminState" $node_dn | cut -d = -f2`
            oper_state_val=`immlist -a "saAmfNodeOperState" $node_dn | cut -d = -f2`

            if [ $adm_state_val -eq 1 ] && [ $oper_state_val -eq 2 ]; then   # <== Here
                echo "   $node_dn"
            fi
        done

        echo
    }

    Best regards,
    Thuan
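
The ticket does not show the patch itself. One way the comparisons in the function quoted above could be guarded is sketched here: validate each value before handing it to `-eq`, so `<Empty>` or a blank string is skipped silently. The helper names `is_uint` and `states_match` are hypothetical.

```shell
# is_uint VALUE: succeed only if VALUE is a non-empty string of digits.
is_uint() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# states_match VAL_A WANT_A VAL_B WANT_B: succeed only if both values are
# integers and each equals its expected number. The numeric tests are only
# reached once both values have passed is_uint.
states_match() {
    is_uint "$1" && is_uint "$3" && [ "$1" -eq "$2" ] && [ "$3" -eq "$4" ]
}

# Possible use inside the service-instance loop:
#   if states_match "$adm_state_val" 1 "$ass_state_val" 1; then
#       echo "   $si_dn"
#   fi
```

Quoting the variables also matters: if immlist prints nothing at all, an unquoted `$adm_state_val` vanishes from the command line and `[` fails with a different error.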

     
  • Nagendra Kumar

    Nagendra Kumar - 2015-06-15
    • status: unassigned --> assigned
    • assigned_to: Nagendra Kumar
     
  • Nagendra Kumar

    Nagendra Kumar - 2015-06-17
    • status: assigned --> accepted
     
  • Nagendra Kumar

    Nagendra Kumar - 2015-06-17
    • status: accepted --> review
     
  • Nagendra Kumar

    Nagendra Kumar - 2015-06-17

    With the patch, the following headings are still printed, but the error no longer appears:
    Service Instances UNLOCKED and UNASSIGNED:
    Service Units UNLOCKED and DISABLED:
    Nodes UNLOCKED and DISABLED:

     
  • Nagendra Kumar

    Nagendra Kumar - 2015-06-18
    • status: review --> fixed
     
