Changeset : 6901
Applications : No Redundancy, 4 SUs (3 PI components and 1 NPI component) and 4 SIs.
Steps
1) Below are the initial assignments.
SI1->SU1, SI2->SU2, SI3->SU3, SI4->SU4
SI3 is assigned to NPI COMP3 in SU3.
2) Performed restart operation of SU3. Restart operation succeeded.
3) Later performed shutdown operation of SU3, after which the SG went to an unstable state and SI3 got stuck in the QUIESCING state.
Below is the saflog snippet from active controller.
715 14:15:13 10/23/2015 NO safApp=safAmfService "Admin op "RESTART" initiated for 'safSu=TestApp_SU3,safSg=TestApp_SG1,safApp=TestApp_NoRed', invocation: 1142461300737"
716 14:15:13 10/23/2015 NO safApp=safAmfService "safSu=TestApp_SU3,safSg=TestApp_SG1,safApp=TestApp_NoRed ReadinessState IN_SERVICE => OUT_OF_SERVICE"
717 14:15:13 10/23/2015 NO safApp=safAmfService "safSu=TestApp_SU3,safSg=TestApp_SG1,safApp=TestApp_NoRed PresenceState INSTANTIATING => INSTANTIATED"
718 14:15:13 10/23/2015 NO safApp=safAmfService "safSu=TestApp_SU3,safSg=TestApp_SG1,safApp=TestApp_NoRed ReadinessState OUT_OF_SERVICE => IN_SERVICE"
719 14:15:13 10/23/2015 NO safApp=safAmfService "Admin op done for invocation: 1142461300737, result 1"
720 14:15:23 10/23/2015 NO safApp=safAmfService "Admin op "SHUTDOWN" initiated for 'safSu=TestApp_SU3,safSg=TestApp_SG1,safApp=TestApp_NoRed', invocation: 1146756268033"
721 14:15:23 10/23/2015 NO safApp=safAmfService "safSu=TestApp_SU3,safSg=TestApp_SG1,safApp=TestApp_NoRed ReadinessState IN_SERVICE => OUT_OF_SERVICE"
722 14:15:23 10/23/2015 NO safApp=safAmfService "safSu=TestApp_SU3,safSg=TestApp_SG1,safApp=TestApp_NoRed AdmState UNLOCKED => SHUTTING_DOWN"
723 14:15:23 10/23/2015 NO safApp=safAmfService "HA State QUIESCING of safSu=TestApp_SU3,safSg=TestApp_SG1,safApp=TestApp_NoRed for safSi=TestApp_SI3,safApp=TestApp_NoRed"
724 14:15:23 10/23/2015 NO safApp=safAmfService "safSi=TestApp_SI3,safApp=TestApp_NoRed assigned to safSu=TestApp_SU3,safSg=TestApp_SG1,safApp=TestApp_NoRed HA State 'QUIESCING'"
725 14:23:41 10/23/2015 NO safApp=safAmfService "Admin op "UNLOCK" initiated for 'safSu=TestApp_SU3,safSg=TestApp_SG1,safApp=TestApp_NoRed', invocation: 1151051235329"
726 14:23:41 10/23/2015 NO safApp=safAmfService "Admin op invocation: 1151051235329, err: 'SG state is not stable'"
The configuration attached by the reporter sets an instantiation level for an NPI component, and the next instantiation level belongs to a PI component. Since an NPI component is instantiated only when it is assigned a workload, configuring an instantiation level for an NPI component serves no purpose. The spec does not say whether such a configuration is valid. To create a dependency on an NPI component, however, a CSI dependency must be used. The issue is reproducible when a configuration meets the following criteria:
- At least one non-restartable component is present in the configuration.
- All non-restartable components are unassigned.
- The NPI component is restartable.
- Instantiation levels are configured so that the last instantiation level belongs to a PI component that is non-restartable and unassigned.
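The criteria above can be written down as a small check. This is only a sketch for reasoning about a configuration: the `Comp` record and its field names are hypothetical and do not correspond to AMF or IMM attribute names, and "last instantiation level" is assumed to mean the highest level value (first match on ties).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Comp:
    # Hypothetical model of one component in the SU, not an OpenSAF type.
    name: str
    preinstantiable: bool  # True for a PI comp, False for an NPI comp
    restartable: bool      # False models a non-restartable comp
    assigned: bool         # True if the comp currently has an assignment
    inst_level: int        # configured instantiation level


def matches_repro_criteria(comps: List[Comp]) -> bool:
    """Return True if the SU model meets all four criteria listed above."""
    non_restartable = [c for c in comps if not c.restartable]
    npi = [c for c in comps if not c.preinstantiable]
    # At least one non-restartable comp, and an NPI comp to be skipped.
    if not non_restartable or not npi:
        return False
    # All non-restartable comps must be unassigned.
    if any(c.assigned for c in non_restartable):
        return False
    # The NPI comp(s) must be restartable.
    if not all(c.restartable for c in npi):
        return False
    # The last instantiation level must belong to an unassigned,
    # non-restartable PI comp.
    last = max(comps, key=lambda c: c.inst_level)
    return last.preinstantiable and not last.restartable and not last.assigned


# Hypothetical SU: PI at level 1, assigned NPI at level 2, PI at level 3.
su = [
    Comp("pi_first", True, False, False, 1),
    Comp("npi_mid", False, True, True, 2),
    Comp("pi_last", True, False, False, 3),
]
print(matches_repro_criteria(su))
```

Changing any single criterion in the model, for example making `npi_mid` non-restartable, makes the check return `False`.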
The issue occurs because, when the SU restart operation is invoked, AMFND picks the last component, which is an unassigned non-restartable PI component, and starts terminating it (avnd_su_pres_inst_surestart_hdler). After that termination succeeds, the next component is the NPI component, and it is not terminated: AMFND skips it and starts terminating the next PI component (avnd_su_pres_terming_compuninst_hdler).
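The walk described above can be illustrated with a toy model. It is not the AMFND code: `Comp`, `terminated_during_su_restart` and the `skip_npi` flag are hypothetical names, and the model only assumes that components are visited from the highest instantiation level down. `skip_npi=True` mimics the reported behaviour; `skip_npi=False` mimics what the fix summary suggests, namely that the NPI component is terminated as well.

```python
from typing import List, NamedTuple


class Comp(NamedTuple):
    # Hypothetical model of a component, not an OpenSAF type.
    name: str
    preinstantiable: bool  # True for a PI comp, False for an NPI comp
    inst_level: int        # configured instantiation level


def terminated_during_su_restart(comps: List[Comp], skip_npi: bool) -> List[str]:
    """Names of the comps terminated, in order, walking levels high to low."""
    order = sorted(comps, key=lambda c: c.inst_level, reverse=True)
    # With skip_npi set, NPI comps are passed over and stay as they are.
    return [c.name for c in order if c.preinstantiable or not skip_npi]


su = [
    Comp("pi_first", True, 1),
    Comp("npi_mid", False, 2),
    Comp("pi_last", True, 3),
]

reported = terminated_during_su_restart(su, skip_npi=True)
expected = terminated_during_su_restart(su, skip_npi=False)
print(reported)   # ['pi_last', 'pi_first']
print(expected)   # ['pi_last', 'npi_mid', 'pi_first']
print(sorted(set(expected) - set(reported)))  # comps left untouched: ['npi_mid']
```

In this model the skipped NPI component is the only one left untouched by the restart, which matches the point in the analysis where the later shutdown finds SI3 unable to leave QUIESCING.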
Attached is a simple configuration (1561.xml) that reproduces the problem with the same steps as given in the description.
changeset: 7199:7a132dc6a7e9
branch: opensaf-4.7.x
parent: 7196:bf265cb7441f
user: praveen.malviya@oracle.com
date: Wed Dec 30 14:46:36 2015 +0530
summary: amfnd: terminate npi comp in non-restartable SU during admin SU RESTART [#1561]
changeset: 7200:c33720429f69
tag: tip
parent: 7198:835719950bcb
user: praveen.malviya@oracle.com
date: Wed Dec 30 14:47:08 2015 +0530
summary: amfnd: terminate npi comp in non-restartable SU during admin SU RESTART [#1561]
After incorporating some minor comments.
https://sourceforge.net/p/opensaf/mailman/message/34592272/