OS : Suse 64bit
Changeset : 7997 ( 5.1.FC)
Setup : 5 nodes ( 2 controllers and 3 payloads with headless feature enabled & no PBE )
AMF Application : 2N model with SUs mapped on PL-3,PL-4 ( si-si deps enabled)
Application SIs are moving to UNASSIGNED state after middleware failover.
-> Initially brought up AMF application (2n model) on two payloads.
-> All the SIs are fully assigned state and SUs are in INSERVICE state.
-> Performed middleware failover.
-> After standby became active controller, SIs moved to unassigned state. But 'amf-state siass' is showing proper output.
-> Application received CSI remove callbacks after locking the SUs
-> As no fault happened on the application, SIs should not move to UNASSIGNED state for middleware failover.
amfd traces on both the controllers
-> In addition to the steps mentioned in the ticket, for the below operations following message is printed in syslog.
Sep 8 12:06:29 CONTROLLER-1 osafamfd[2222]: ER exec: create FAILED 12
Sep 8 12:06:35 CONTROLLER-1 osafamfd[2222]: ER exec: create FAILED 12
Sep 8 12:06:45 CONTROLLER-1 osafamfd[2222]: ER exec: create FAILED 12
Sep 8 12:06:55 CONTROLLER-1 osafamfd[2222]: ER exec: create FAILED 12
Below are the steps.
-> Delete all the application objects.
-> Perform the middleware switchover / failover.
-> New active controller is trying to access the application SI object which is already deleted earlier.
Sep 8 12:08:36.647738 osafamfd [2222:main.cc:0810] << process_event
Sep 8 12:08:36.647743 osafamfd [2222:imm.cc:0396] >> execute
Sep 8 12:08:36.647748 osafamfd [2222:imm.cc:0142] >> exec: Create safCsi=CSI1,safSi=TestApp_SI4,safApp=TestApp_TwoN
Sep 8 12:08:36.647754 osafamfd [2222:imma_oi_api.c:2786] >> rt_object_create_common
Sep 8 12:08:36.647761 osafamfd [2222:imma_oi_api.c:2892] TR attr:safCSIComp
Sep 8 12:08:36.647768 osafamfd [2222:imma_oi_api.c:2892] TR attr:saAmfCSICompHAState
Sep 8 12:08:36.647795 osafamfd [2222:imma_oi_api.c:2892] TR attr:saAmfCSICompHAReadinessState
Sep 8 12:08:36.650289 osafamfd [2222:imma_oi_api.c:3063] << rt_object_create_common
Sep 8 12:08:36.650330 osafamfd [2222:imm.cc:0163] ER exec: create FAILED 12
A) Steps to reproduce the issue reported in "Discussion/Comment" section:
1)Bring two controllers up and bring up create_app.xml.
2)Lock and LOCK-in SU1.
3)Delete the only SI.
4)Stop active controller.
When standby becomes active controller, it tries to create SISU and CSICOMP in IMM and it fails:
Oct 18 17:07:11.872982 osafamfd [16148:imm.cc:0142] >> exec: Create safSi=AmfDemo,safApp=AmfDemo1
Oct 18 17:07:11.873007 osafamfd [16148:imma_oi_api.c:2786] >> rt_object_create_common
Oct 18 17:07:11.873019 osafamfd [16148:imma_oi_api.c:2892] TR attr:safSISU
Oct 18 17:07:11.873026 osafamfd [16148:imma_oi_api.c:2892] TR attr:saAmfSISUHAState
Oct 18 17:07:11.873033 osafamfd [16148:imma_oi_api.c:2892] TR attr:saAmfSISUHAReadinessState
Oct 18 17:07:11.873039 osafamfd [16148:imma_oi_api.c:2892] TR attr:osafAmfSISUFsmState
Oct 18 17:07:11.873232 osafamfd [16148:imma_oi_api.c:3063] << rt_object_create_common
Oct 18 17:07:11.873641 osafamfd [16148:imm.cc:0163] ER exec: create FAILED 12
Oct 18 17:07:19.892496 osafamfd [16148:imm.cc:0142] >> exec: Create safCsi=AmfDemo,safSi=AmfDemo,safApp=AmfDemo1
Oct 18 17:07:19.892514 osafamfd [16148:imma_oi_api.c:2786] >> rt_object_create_common
Oct 18 17:07:19.892536 osafamfd [16148:imma_oi_api.c:2892] TR attr:safCSIComp
Oct 18 17:07:19.892556 osafamfd [16148:imma_oi_api.c:2892] TR attr:saAmfCSICompHAState
Oct 18 17:07:19.892573 osafamfd [16148:imma_oi_api.c:2892] TR attr:saAmfCSICompHAReadinessState
Oct 18 17:07:19.893515 osafamfd [16148:imma_oi_api.c:3063] << rt_object_create_common
Oct 18 17:07:19.893611 osafamfd [16148:imm.cc:0163] ER exec: create FAILED 12
B) Steps to reproduce a related issue:
1)Bring up two controllers and bring up create_app.xml.
2)Lock and LOCK-in SU1.
3)Delete the app: immcfg -d safApp=AmfDemo1.
4) Now add configuration add_app.xml using "immcfg -f add_app.xml. Configuration add_app.xml creates the same app as in create_app.xml with only difference that SG is in lock-in state and SUs in unlocked state.
5)Stop active controller. When standby becomes active controller, amf-state su shows wrong admin state of SU1:
safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1
saAmfSUAdminState=LOCKED-INSTANTIATION(3)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=UNINSTANTIATED(1)
saAmfSUReadinessState=OUT-OF-SERVICE(1)
But state is correct in AMF database. After dumping state from AMF:
dn: safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1
saAmfSUPreInstantiable: 1
saAmfSUOperState: ENABLED
saAmfSUAdminState: UNLOCKED
saAmfSuReadinessState: OUT_OF_SERVICE
saAmfSUPresenceState: UNINSTANTIATED
saAmfSUHostedByNode: safAmfNode=SC-2,safAmfCluster=myAmfCluster
saAmfSUNumCurrActiveSIs: 0
saAmfSUNumCurrStandbySIs: 0
saAmfSURestartCount: 0
term_state: 0
su_switch: 0
assigned_SIs:
Analysis:
Standby controller maintains a job queue for SU, SiSU and COMPCSI class of size 200. In the job queue both old and new state of a object will be present. The issue is not observed normally because standby will update correct state eventually from old to new. In the above cases since object was deleted and added with new state the issue was observed. One more possiblity (although remote) is if standby after becoming active updates only older state and reboots. In the case new state will not get updated and IMM will show wrong state.
Diff:
Patch for 5.0.
changeset: 8311:9cee81b39c47
branch: opensaf-5.0.x
parent: 8306:45b922ca767a
user: Praveen Malviya praveen.malviya@oracle.com
date: Mon Nov 14 10:28:48 2016 +0530
summary: amfd: execute imm jobs based on latest status of attributes and objects[#2009] V2
changeset: 8312:c365dd1e0992
branch: opensaf-5.1.x
parent: 8309:3d25b22c5827
user: Praveen Malviya praveen.malviya@oracle.com
date: Mon Nov 14 10:29:43 2016 +0530
summary: amfd: execute imm jobs based on latest status of attributes and objects[#2009] V2
changeset: 8313:fa6b413693e3
tag: tip
parent: 8310:1df056ab67f3
user: Praveen Malviya praveen.malviya@oracle.com
date: Mon Nov 14 10:29:56 2016 +0530
summary: amfd: execute imm jobs based on latest status of attributes and objects[#2009] V2
[staging:9cee81]
[staging:c365dd]
[staging:fa6b41]
Related
Commit: [9cee81]
Commit: [c365dd]
Commit: [fa6b41]
Tickets:
#2009