Today, it is not possible to initiate the AMF node shutdown operation on multiple nodes at the same time.
This ticket proposes (Praveen's idea) introducing a new AMF admin operation on the nodegroup logical entity.
A cluster scale-down use case might impose limits on the total time spent downsizing an AMF/application cluster. This enhancement shall look into different ways of introducing at least some parallelism within AMF (wherever applicable) during the processing of this new 'node-group' shutdown operation.
At the end of the day, the total time taken will be the sum of
1) the time spent in the checks inside AMF while processing a parallel admin_op
+
2) how quickly applications respond to the CSI callbacks.
At least to start with, one redundancy model...
More later... Nagendra and Praveen will update with details on the scenarios in which AMF would still continue to do serial processing, i.e. where existing checks cannot be removed, etc.
(The bigger opportunity is also not to overly complicate AMF by introducing this admin op.)
The following is the scope and behaviour of this new feature that we have come up with after initial analysis. Comments are welcome.
1) Node group class "SaAmfNodeGroup" will be enhanced with a new attribute "saAmfNGAdminState".
2) Only the shutdown and unlock admin operations will be supported on a nodegroup.
3) The shutdown operation on a nodegroup will be supported only for the NoRed and Nway_active redundancy models.
4) The nodegroup admin state "saAmfNGAdminState" will show the admin operation status.
5) When a shutdown operation is initiated on the nodegroup, all the "application" SUs (not middleware SUs) deployed on the nodes of that nodegroup will be unassigned through the quiescing state. This is as if each node were undergoing a shutdown operation, but saAmfNodeAdminState will remain unlocked. Thus, during a nodegroup shutdown operation, assignments are only gracefully removed from the SUs; the SUs are not terminated on the nodes, i.e. there is no lock-in behaviour.
6) The shutdown operation on a nodegroup will be rejected in the following cases:
a) If any AMF entity is not stable on any node of the nodegroup.
b) If two or more SUs of the same SG are hosted on any node of the nodegroup.
c) If any node of the nodegroup is already undergoing any other admin operation.
d) If any SU on any node of the nodegroup is undergoing any admin op.
7) As per the AMF PR doc, AMF does not support SI dependencies in the NoRed and Nway_active models. So SI dependencies will not be honoured during the shutdown operation while assigning the quiescing state to SUs or removing assignments from SUs hosted on the nodes of the nodegroup.
Return values for the shutdown admin operation on a nodegroup:
1) SA_AIS_OK - The function completed successfully.
2) SA_AIS_ERR_TIMEOUT - An implementation-dependent timeout occurred before the call could complete. It is unspecified whether the call succeeded or whether it did not.
3) SA_AIS_ERR_TRY_AGAIN - This will be returned in the following cases:
a) If any AMF entity is not stable on any node of the nodegroup.
b) If any node of the nodegroup is already undergoing any other admin operation.
c) If any SU on any node of the nodegroup is undergoing any admin op.
4) SA_AIS_ERR_NOT_SUPPORTED - If an operation other than shutdown or unlock is invoked on the nodegroup, or if two or more SUs of the same SG are hosted on any node of the nodegroup.
5) SA_AIS_ERR_BAD_OPERATION - If the shutdown operation is initiated on a nodegroup which is already in the locked state.
6) SA_AIS_ERR_NO_OP - If another shutdown operation is initiated on a nodegroup which is already in the shutting-down state because of a previous shutdown operation.
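The return values for the shutdown operation can be read as a decision function. The sketch below only models that list in shell; it is not amfd code. The function name `ng_shutdown_rc`, its arguments, and the order in which the checks are evaluated are assumptions made for illustration (the ticket does not state a check order), and SA_AIS_ERR_TIMEOUT is left out because it depends on the implementation.

```shell
# Hypothetical model of the nodegroup shutdown return values; not amfd code.
# Args: $1 = saAmfNGAdminState (unlocked | locked | shutting-down)
#       $2 = "yes" if every AMF entity on the nodegroup's nodes is stable
#       $3 = "yes" if a node or SU in the nodegroup has another admin op in progress
#       $4 = "yes" if two or more SUs of the same SG share a node
# The order of the checks is an assumption.
ng_shutdown_rc() {
  admin_state=$1; stable=$2; other_op=$3; same_sg_su=$4
  case "$admin_state" in
    locked)        echo SA_AIS_ERR_BAD_OPERATION; return ;;
    shutting-down) echo SA_AIS_ERR_NO_OP; return ;;
  esac
  if [ "$same_sg_su" = yes ]; then
    echo SA_AIS_ERR_NOT_SUPPORTED
  elif [ "$stable" != yes ] || [ "$other_op" = yes ]; then
    echo SA_AIS_ERR_TRY_AGAIN
  else
    echo SA_AIS_OK
  fi
}

# Example: an unlocked, stable nodegroup with no other admin op in progress.
ng_shutdown_rc unlocked yes no no
```

Checking the admin state first means a repeated shutdown is reported as SA_AIS_ERR_NO_OP even while the cluster is still busy with the first one; whether amfd evaluates the checks in this order is not confirmed by the ticket.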
Return values for the unlock admin operation on a nodegroup:
1) SA_AIS_OK - The function completed successfully.
2) SA_AIS_ERR_TIMEOUT - An implementation-dependent timeout occurred before the call could complete. It is unspecified whether the call succeeded or whether it did not.
3) SA_AIS_ERR_TRY_AGAIN - If any AMF entity is not stable on any node of the nodegroup.
4) SA_AIS_ERR_NO_OP - The invocation of this administrative operation has no effect on the current state of the logical entity, as it is already in the unlocked state.
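The unlock return values can be modelled the same way. This is again only a sketch: `ng_unlock_rc` and its arguments are hypothetical, the check order is an assumption, and SA_AIS_ERR_TIMEOUT is omitted because it depends on the implementation.

```shell
# Hypothetical model of the nodegroup unlock return values; not amfd code.
# Args: $1 = saAmfNGAdminState (unlocked | locked | shutting-down)
#       $2 = "yes" if every AMF entity on the nodegroup's nodes is stable
# The order of the checks is an assumption.
ng_unlock_rc() {
  admin_state=$1; stable=$2
  if [ "$admin_state" = unlocked ]; then
    echo SA_AIS_ERR_NO_OP
  elif [ "$stable" != yes ]; then
    echo SA_AIS_ERR_TRY_AGAIN
  else
    echo SA_AIS_OK
  fi
}

# Example: unlocking a locked nodegroup whose nodes are stable.
ng_unlock_rc locked yes
```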
More updates on the implementation scope:
1) When a shutdown admin operation is in progress on a nodegroup, AMF will transition the nodes belonging to that nodegroup from SHUTTINGDOWN to LOCKED admin state. So after completion of the shutdown operation, the nodegroup and its nodes will be in the admin locked state.
2) Since the individual nodes of the nodegroup are also marked LOCKED as part of the shutdown operation, the user will be able to delete the nodegroup after the SHUTDOWN operation is complete.
3) After the nodegroup SHUTDOWN, the user can unlock either the individual nodes, by performing unlock on each node, or the whole nodegroup, by performing the unlock operation on that nodegroup.
If the unlock admin op is targeted on the nodegroup, all nodes of the nodegroup will be marked unlocked by AMF.
Note: When the unlock admin op is targeted on the nodegroup, SUs on individual nodes will be assigned based on the overall status and redundancy model of the containing service group and also on whether the redundancy requirements of the service instance are being met.
4) Besides the Nway_Active and NoRed models, the 2N model will also be supported.
Last edit: Praveen 2015-02-06
Patch floated for Version-1 with minimal testing [#1235].
Please find the test applications attached.
The test applications require a four-node cluster.
Steps to bring the applications up:
immcfg -f AppConfig_2N_5SUs_2SIs.xml
immcfg -f AppConfig-nored_5SUs_2SIs.xml
immcfg -f AppConfig-nwayactive_5SUs_2SIs.xml
amf-adm unlock-in safSu=SU1,safSg=2N,safApp=2N
amf-adm unlock-in safSu=SU2,safSg=2N,safApp=2N
amf-adm unlock-in safSu=SU3,safSg=2N,safApp=2N
amf-adm unlock-in safSu=SU4,safSg=2N,safApp=2N
amf-adm unlock-in safSu=SU1,safSg=NoRed,safApp=NoRed
amf-adm unlock-in safSu=SU2,safSg=NoRed,safApp=NoRed
amf-adm unlock-in safSu=SU3,safSg=NoRed,safApp=NoRed
amf-adm unlock-in safSu=SU4,safSg=NoRed,safApp=NoRed
amf-adm unlock-in safSu=SU5,safSg=NoRed,safApp=NoRed
amf-adm unlock-in safSu=SU1,safSg=NWay_Active,safApp=NWay_Active
amf-adm unlock-in safSu=SU2,safSg=NWay_Active,safApp=NWay_Active
amf-adm unlock-in safSu=SU3,safSg=NWay_Active,safApp=NWay_Active
amf-adm unlock-in safSu=SU4,safSg=NWay_Active,safApp=NWay_Active
amf-adm unlock-in safSu=SU5,safSg=NWay_Active,safApp=NWay_Active
amf-adm unlock safSu=SU1,safSg=2N,safApp=2N
amf-adm unlock safSu=SU2,safSg=2N,safApp=2N
amf-adm unlock safSu=SU3,safSg=2N,safApp=2N
amf-adm unlock safSu=SU4,safSg=2N,safApp=2N
amf-adm unlock safSu=SU1,safSg=NoRed,safApp=NoRed
amf-adm unlock safSu=SU2,safSg=NoRed,safApp=NoRed
amf-adm unlock safSu=SU3,safSg=NoRed,safApp=NoRed
amf-adm unlock safSu=SU4,safSg=NoRed,safApp=NoRed
amf-adm unlock safSu=SU5,safSg=NoRed,safApp=NoRed
amf-adm unlock safSu=SU1,safSg=NWay_Active,safApp=NWay_Active
amf-adm unlock safSu=SU2,safSg=NWay_Active,safApp=NWay_Active
amf-adm unlock safSu=SU3,safSg=NWay_Active,safApp=NWay_Active
amf-adm unlock safSu=SU4,safSg=NWay_Active,safApp=NWay_Active
amf-adm unlock safSu=SU5,safSg=NWay_Active,safApp=NWay_Active
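The unlock-in and unlock steps above follow a regular pattern, so they can be generated rather than typed. The helper below is a sketch, not part of the attached test applications: `su_cmds` is a hypothetical name, and it assumes that the SG and application share a name and that SUs are numbered SU1..SUn, as in the DNs listed above. It only prints the commands.

```shell
# Print (do not run) the amf-adm commands for SU1..SUn of one application.
# Assumes the SG and the application share a name, as in the DNs listed above.
su_cmds() {
  op=$1; name=$2; count=$3
  i=1
  while [ "$i" -le "$count" ]; do
    echo "amf-adm $op safSu=SU$i,safSg=$name,safApp=$name"
    i=$((i + 1))
  done
}

# Same order as the manual steps: unlock-in everything, then unlock everything.
for op in unlock-in unlock; do
  su_cmds "$op" 2N 4
  su_cmds "$op" NoRed 5
  su_cmds "$op" NWay_Active 5
done
```

Review the printed commands first; piping the output to `sh` on a controller node would then execute them.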
To create a nodegroup with the SC-1 and SC-2 nodes:
immcfg -c SaAmfNodeGroup safAmfNodeGroup=TestNG,safAmfCluster=myAmfCluster -a saAmfNGNodeList=safAmfNode=SC-1,safAmfCluster=myAmfCluster -a saAmfNGAdminState=2
immcfg safAmfNodeGroup=TestNG,safAmfCluster=myAmfCluster -a saAmfNGNodeList+=safAmfNode=SC-2,safAmfCluster=myAmfCluster
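Once the nodegroup exists, the new admin operations can be exercised on it. The following is a sketch that assumes the patch series from this ticket is applied, so that amf-adm accepts a nodegroup DN. The nodegroup above is created with saAmfNGAdminState=2, which is SA_AMF_ADMIN_LOCKED in the SaAmfAdminStateT enum, so the sketch unlocks it before shutting it down. The `run` wrapper and the `DRY_RUN` variable are hypothetical helpers; by default the commands are only printed.

```shell
NG_DN="safAmfNodeGroup=TestNG,safAmfCluster=myAmfCluster"

# Dry-run wrapper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = 0 ]; then "$@"; else echo "$*"; fi
}

run amf-adm unlock "$NG_DN"                 # nodegroup was created in the locked state
run amf-adm shutdown "$NG_DN"               # graceful removal of assignments on SC-1 and SC-2
run immlist -a saAmfNGAdminState "$NG_DN"   # inspect the resulting admin state
```

Setting `DRY_RUN=0` on a controller node runs the commands for real.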
changeset: 6366:0fd5f3b144d9
user: praveen.malviya@oracle.com
date: Mon Mar 16 15:21:21 2015 +0530
summary: amfd : support shutdown, lock and unlock for 2N, NoRed and NWay_Active on NG[#1235]
changeset: 6367:e9f28ee71043
user: praveen.malviya@oracle.com
date: Mon Mar 16 15:21:45 2015 +0530
summary: amfd: modify or remove assignments of 2N SU during admin op on NG [#1235]
changeset: 6368:d32b835cccaa
user: praveen.malviya@oracle.com
date: Mon Mar 16 15:22:03 2015 +0530
summary: amfd: modify assignments of NWay_Active SU during admin op on NG [#1235].
changeset: 6369:880a4713f7b2
user: praveen.malviya@oracle.com
date: Mon Mar 16 15:22:17 2015 +0530
summary: amfd: modify assignments of NoRed SU during admin op on NG [#1235].
changeset: 6370:15297644f224
user: praveen.malviya@oracle.com
date: Mon Mar 16 15:22:35 2015 +0530
summary: amfd : checkpoint saAmfNGAdminState of NG [#1235].
changeset: 6371:8a467b2164d8
user: praveen.malviya@oracle.com
date: Mon Mar 16 15:23:23 2015 +0530
summary: amfd: send state change notification for saAmfNGAdminState [#1235].
changeset: 6372:64aa11f960dd
tag: tip
user: praveen.malviya@oracle.com
date: Mon Mar 16 15:23:44 2015 +0530
summary: amfd: show nodegroups and their admin state in amf-state command [#1235].
AMF PR doc changes in changeset:
changeset: 149:7db7d054b191
tag: tip
parent: 146:f6b6282eb97e
user: praveen.malviya@oracle.com
date: Tue Apr 07 15:41:31 2015 +0530
summary: amf: update for 4.6 release [#1169, #1158, #964, #1153, #1235]