Setup :
Changeset : 7997 5.1.FC
Setup : 5 node setup with 3 payloads.
App : 2N PI Application with SUs hosted on PL-3,PL-4,SC-2.
Issue :
SU stuck in TERMINATING state after a failed component restart
Steps performed :
-> Initially brought up the attached AMF configuration; 2 SUs got assigned successfully.
-> Then moved the instantiation script of SU1 away so that it is no longer available. Note that the termination script and the instantiation script are different.
-> Killed the component of SU1. The component went for restart, the assignments were removed from SU1, and SU2 and SU3 took over the active and standby assignments.
-> As the instantiation script is not available, the SU should move to the instantiation-failed state. Instead, the SU is stuck in the terminating state.
-> Lock operation on the SU succeeded, but the lock-instantiation operation on the SU resulted in SG unstable.
175 12:28:31 09/22/2016 NO safApp=safAmfService "Admin op "LOCK" initiated for 'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN', invocation: 408021893121"
176 12:28:31 09/22/2016 NO safApp=safAmfService "safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN AdmState UNLOCKED => LOCKED"
177 12:28:31 09/22/2016 NO safApp=safAmfService "Admin op done for invocation: 408021893121, result 1"
178 12:28:39 09/22/2016 NO safApp=safAmfService "Admin op "LOCK_INSTANTIATION" initiated for 'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN', invocation: 412316860417"
179 12:28:39 09/22/2016 NO safApp=safAmfService "Admin op invocation: 412316860417, err: ''safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN' presence state is '4''"
180 12:28:39 09/22/2016 NO safApp=safAmfService "Admin op done for invocation: 412316860417, result 6"
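The numeric codes in the log above can be decoded with a small helper. This is only a sketch for reading the log: the function names `decode_presence` and `decode_result` are hypothetical, and the tables assume the standard SAF AIS enum values (SaAmfPresenceStateT and SaAisErrorT), of which only the two result codes seen here are listed.

```python
import re

# Assumed values of SaAmfPresenceStateT from the SAF AMF specification.
PRESENCE_STATES = {
    1: "UNINSTANTIATED",
    2: "INSTANTIATING",
    3: "INSTANTIATED",
    4: "TERMINATING",
    5: "RESTARTING",
    6: "INSTANTIATION_FAILED",
    7: "TERMINATION_FAILED",
}

# Assumed values of SaAisErrorT; only the codes that occur in this log.
AIS_RESULTS = {
    1: "SA_AIS_OK",
    6: "SA_AIS_ERR_TRY_AGAIN",
}


def decode_presence(line):
    """Return the presence state name quoted in an admin-op error line, or None."""
    match = re.search(r"presence state is '(\d+)'", line)
    if match is None:
        return None
    code = int(match.group(1))
    return PRESENCE_STATES.get(code, f"UNKNOWN({code})")


def decode_result(line):
    """Return the result name from an 'Admin op done' line, or None."""
    match = re.search(r"result (\d+)", line)
    if match is None:
        return None
    code = int(match.group(1))
    return AIS_RESULTS.get(code, f"UNKNOWN({code})")
```

Read this way, the LOCK operation returned SA_AIS_OK, while LOCK_INSTANTIATION was rejected with SA_AIS_ERR_TRY_AGAIN because the SU was still in the TERMINATING presence state.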
Analysis:
Comp faults with comp-failover recovery. After removal of assignments, AMFND tries to repair the component by instantiating it. Since the instantiate script is not present, comp instantiation fails and AMFND cleans the comp up successfully. After this successful cleanup, AMFND neither tries to instantiate the comp again nor marks the comp and SU as INST_FAILED. AMFND should mark the comp and SU as INST_FAILED after attempting to instantiate the comp MAX_TRY times. If the value is not configured for the comp, then AMFND should honour the default value of saAmfNumMaxInstantiateWithoutDelay=2.
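The expected behaviour described in the analysis can be sketched roughly as follows. This is an illustrative model, not the AMFND code: the `Comp` class, `repair_component`, and the `instantiate`/`cleanup` callbacks are hypothetical names, and only the default of 2 for saAmfNumMaxInstantiateWithoutDelay is taken from the ticket.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Default quoted in the analysis for saAmfNumMaxInstantiateWithoutDelay.
DEFAULT_MAX_INSTANTIATE_WITHOUT_DELAY = 2


@dataclass
class Comp:
    """Hypothetical stand-in for a component under repair."""
    name: str
    max_instantiate_without_delay: Optional[int] = None  # None = not configured
    presence: str = "UNINSTANTIATED"
    attempts: int = 0


def repair_component(comp: Comp,
                     instantiate: Callable[[Comp], bool],
                     cleanup: Callable[[Comp], None]) -> str:
    """Retry instantiation up to the configured limit, then give up.

    Returns the final presence state of the comp; in the scenario of
    this ticket the SU would be moved to the same state.
    """
    max_tries = comp.max_instantiate_without_delay
    if max_tries is None:
        max_tries = DEFAULT_MAX_INSTANTIATE_WITHOUT_DELAY

    while comp.attempts < max_tries:
        comp.presence = "INSTANTIATING"
        comp.attempts += 1
        if instantiate(comp):
            comp.presence = "INSTANTIATED"
            return comp.presence
        # Instantiation failed: clean up before the next attempt.
        cleanup(comp)

    # All attempts used up: do not leave the comp in a transient state.
    comp.presence = "INSTANTIATION_FAILED"
    return comp.presence
```

With an `instantiate` callback that always fails (as when the script is missing) and no configured limit, the loop makes two attempts, cleans up after each one, and ends in INSTANTIATION_FAILED rather than staying in a transient state.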
changeset: 8489:ecaf7bb15c89
branch: opensaf-5.0.x
parent: 8483:72c20943e09e
user: Praveen Malviya praveen.malviya@oracle.com
date: Thu Jan 05 16:42:01 2017 +0530
summary: amfnd: honour max num of retries for comp instantiation during repair[#2059].
changeset: 8490:13b03a80ae60
branch: opensaf-5.1.x
parent: 8484:84bd7db1ef39
user: Praveen Malviya praveen.malviya@oracle.com
date: Thu Jan 05 16:42:32 2017 +0530
summary: amfnd: honour max num of retries for comp instantiation during repair[#2059].
changeset: 8491:edfbbb945f87
tag: tip
parent: 8488:2846bb464d10
user: Praveen Malviya praveen.malviya@oracle.com
date: Thu Jan 05 16:43:07 2017 +0530
summary: amfnd: honour max num of retries for comp instantiation during repair[#2059].
[staging:ecaf7b]
[staging:13b03a]
[staging:edfbbb]
Related
Commit: [13b03a]
Commit: [ecaf7b]
Commit: [edfbbb]
Tickets:
#2059