#2059 AMF: SU stuck in terminating state, for failure in component instantiation

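Milestone: 5.0.2
Status: fixed
Owner: Praveen
Labels: None
Type: defect
Component: amf
Part: nd
Version: 5.1.FC
Severity: minor
Updated: 2017-01-05
Created: 2016-09-22
Creator: Srikanth R
Private: No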

Setup :
Changeset : 7997 5.1.FC
Setup : 5 node setup with 3 payloads.
App : 2N PI Application with SUs hosted on PL-3,PL-4,SC-2.

Issue :

SU stuck in terminating state, for failure in component restart

Steps performed :

-> Initially brought up the attached AMF configuration; 2 SUs got assigned successfully.

-> Moved the instantiation script of SU1 away from its configured location, so that it is no longer available. Note that the termination script and the instantiation script are different files.

-> Killed the component of SU1. The component went for restart, the assignments were removed from SU1, and SU2 and SU3 became active and standby.

-> As the instantiation script is not available, the SU should be moved to the instantiation-failed state. Instead, the SU is stuck in the terminating state.

-> A lock operation on the SU succeeded, but the lock-in (LOCK_INSTANTIATION) operation on the SU failed with SG unstable, as the log below shows.

   175 12:28:31 09/22/2016 NO safApp=safAmfService "Admin op "LOCK" initiated for 'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN', invocation: 408021893121"
   176 12:28:31 09/22/2016 NO safApp=safAmfService "safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN AdmState UNLOCKED => LOCKED"
   177 12:28:31 09/22/2016 NO safApp=safAmfService "Admin op done for invocation: 408021893121, result 1"
   178 12:28:39 09/22/2016 NO safApp=safAmfService "Admin op "LOCK_INSTANTIATION" initiated for 'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN', invocation: 412316860417"
   179 12:28:39 09/22/2016 NO safApp=safAmfService "Admin op invocation: 412316860417, err: ''safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN' presence state is '4''"
   180 12:28:39 09/22/2016 NO safApp=safAmfService "Admin op done for invocation: 412316860417, result 6"
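
The numeric codes in the log above can be read with a small decoder. This is a minimal sketch, not part of OpenSAF: the function name `decode` and the regular expressions are illustrative assumptions, while the numeric values follow the SA Forum AIS definitions of SaAmfPresenceStateT and SaAisErrorT (only the two result codes seen in this log are listed).

```python
import re

# SaAmfPresenceStateT values as defined in the SA Forum AMF specification.
PRESENCE_STATES = {
    1: "UNINSTANTIATED",
    2: "INSTANTIATING",
    3: "INSTANTIATED",
    4: "TERMINATING",
    5: "RESTARTING",
    6: "INSTANTIATION_FAILED",
    7: "TERMINATION_FAILED",
}

# SaAisErrorT values; only the codes that appear in this ticket's log.
AIS_RESULTS = {1: "SA_AIS_OK", 6: "SA_AIS_ERR_TRY_AGAIN"}


def decode(line):
    """Return a readable note for a presence-state or result code, else None."""
    m = re.search(r"presence state is '(\d+)'", line)
    if m:
        return "presence state " + PRESENCE_STATES.get(int(m.group(1)), "UNKNOWN")
    m = re.search(r"result (\d+)", line)
    if m:
        return "result " + AIS_RESULTS.get(int(m.group(1)), "code " + m.group(1))
    return None
```

Read this way, the LOCK admin operation returned SA_AIS_OK, while LOCK_INSTANTIATION was answered with SA_AIS_ERR_TRY_AGAIN because the SU was still in the TERMINATING presence state.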
2 Attachments

Related

Tickets: #2059
Wiki: ChangeLog-5.0.2
Wiki: ChangeLog-5.1.1

Discussion

  • Praveen

    Praveen - 2016-09-23
    • status: unassigned --> assigned
    • assigned_to: Praveen
    • Milestone: 4.7.2 --> 5.0.2
     
  • Praveen

    Praveen - 2016-09-29
    • summary: AMF: SU stuck in terminating state, for failure in component restart --> AMF: SU stuck in terminating state, for failure in component instantiation
    • status: assigned --> accepted
    • Attachments has changed:

    Diff:

    --- old
    +++ new
    @@ -1 +1,2 @@
     2059.tgz (821.7 kB; application/x-compressed-tar)
    +AppConfig-2N.xml_on_SC-1_2comps (10.5 kB; application/octet-stream)
    
    • Part: - --> nd
     
  • Praveen

    Praveen - 2016-09-29

    Analysis:
    The comp faults with comp-failover recovery. After removal of assignments, AMFND tries to repair the component by trying to instantiate it. Since the instantiate script is not present, comp instantiation fails and AMFND cleans it up successfully. After this successful cleanup, AMFND neither tries to instantiate the comp again nor marks the comp and SU as INST_FAILED. AMFND should mark the comp and SU as INST_FAILED after attempting instantiation of the comp MAX_TRY times. If the value is not configured for the comp, then AMFND should honour the default value of saAmfNumMaxInstantiateWithoutDelay=2.

    Attached is the amf_demo based conf to reproduce the issue.
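
The expected behaviour described in the analysis can be sketched as a bounded retry loop. This is only an illustration, not the amfnd code from the fix: `repair_component`, `instantiate` and `cleanup` are hypothetical names, MAX_TRY is assumed to count total instantiation attempts, and the TERMINATION_FAILED branch for a failed cleanup is an assumption that the ticket does not discuss.

```python
# Default cited in the analysis for saAmfNumMaxInstantiateWithoutDelay.
DEFAULT_MAX_INSTANTIATE_WITHOUT_DELAY = 2


def repair_component(instantiate, cleanup, max_tries=None):
    """Try to instantiate a component during repair, with a bounded retry count.

    instantiate and cleanup are hypothetical callables returning True on
    success. Returns the resulting presence state as a string.
    """
    if max_tries is None:
        # Value not configured for the comp: fall back to the default.
        max_tries = DEFAULT_MAX_INSTANTIATE_WITHOUT_DELAY
    for _ in range(max_tries):
        if instantiate():
            return "INSTANTIATED"
        # Instantiation failed: clean up before the next attempt.
        if not cleanup():
            # Assumption: a failed cleanup ends the repair attempt.
            return "TERMINATION_FAILED"
    # All attempts used up: do not stay in TERMINATING, report the failure.
    return "INSTANTIATION_FAILED"
```

With a missing instantiate script, every attempt fails and each cleanup succeeds, so the sketch ends in INSTANTIATION_FAILED after two attempts instead of leaving the comp and SU in the terminating state.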
    
     
  • Praveen

    Praveen - 2016-09-29
    • status: accepted --> review
     
  • Praveen

    Praveen - 2017-01-05
    • status: review --> fixed
     
  • Praveen

    Praveen - 2017-01-05

    changeset: 8489:ecaf7bb15c89
    branch: opensaf-5.0.x
    parent: 8483:72c20943e09e
    user: Praveen Malviya praveen.malviya@oracle.com
    date: Thu Jan 05 16:42:01 2017 +0530
    summary: amfnd: honour max num of retries for comp instantiation during repair[#2059].

    changeset: 8490:13b03a80ae60
    branch: opensaf-5.1.x
    parent: 8484:84bd7db1ef39
    user: Praveen Malviya praveen.malviya@oracle.com
    date: Thu Jan 05 16:42:32 2017 +0530
    summary: amfnd: honour max num of retries for comp instantiation during repair[#2059].

    changeset: 8491:edfbbb945f87
    tag: tip
    parent: 8488:2846bb464d10
    user: Praveen Malviya praveen.malviya@oracle.com
    date: Thu Jan 05 16:43:07 2017 +0530
    summary: amfnd: honour max num of retries for comp instantiation during repair[#2059].

    [staging:ecaf7b]
    [staging:13b03a]
    [staging:edfbbb]

     

    Related

    Commit: [13b03a]
    Commit: [ecaf7b]
    Commit: [edfbbb]
    Tickets: #2059