setup:
Version - OpenSAF 5.1.FC : changeset - 7997
4-Node cluster
2PBE enabled
Bring up OpenSAF on a single controller with 2PBE enabled. IMMND throws an error.
Attachments: syslog, amfd and immnd traces
Sep 2 16:54:13 SLOT1 osafimmpbed: WA Start prepare for ccb: 100000004/4294967300 towards slave PBE returned: '12' from Immsv
Sep 2 16:54:13 SLOT1 osafimmpbed: WA PBE-A failed to prepare PRTA update Ccb:100000004/4294967300 towards PBE-B
Sep 2 16:54:13 SLOT1 osafimmpbed: NO 2PBE Error (18) in PRTA update (ccbId:100000004)
Sep 2 16:54:13 SLOT1 osafimmnd[3632]: WA update of PERSISTENT runtime attributes in object 'safSi=NoRed3,safApp=OpenSAF' REVERTED. PBE rc:18
Sep 2 16:54:13 SLOT1 osafamfd[3698]: ER exec: update FAILED 18
Sep 2 16:54:14 SLOT1 osafimmnd[3632]: NO PBE-OI established on this SC. Dumping incrementally to file imm.db
Notes:
1. OpenSAF starts successfully.
2. The issue is not seen with 1PBE.
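The numeric codes in the syslog excerpt above can be decoded with a short script. This is only an illustrative sketch: the table is a small subset of the SaAisErrorT values from the SA Forum AIS headers, and the helper names (`decode_rc`, `split_ccb`) are hypothetical, not part of OpenSAF. It also checks that the two halves of the ccb field are the same id, printed once in hex and once in decimal.

```python
from typing import Tuple

# Subset of SaAisErrorT values (SA Forum AIS, saAis.h); not exhaustive.
SA_AIS_ERRORS = {
    1: "SA_AIS_OK",
    5: "SA_AIS_ERR_TIMEOUT",
    6: "SA_AIS_ERR_TRY_AGAIN",
    12: "SA_AIS_ERR_NOT_EXIST",
    18: "SA_AIS_ERR_NO_RESOURCES",
}


def decode_rc(rc: int) -> str:
    """Map a numeric return code from the logs to its SaAisErrorT name."""
    return SA_AIS_ERRORS.get(rc, f"unknown ({rc})")


def split_ccb(field: str) -> Tuple[int, int]:
    """Split a 'hex/decimal' ccb field (hypothetical helper) into two ints."""
    hex_part, dec_part = field.split("/")
    return int(hex_part, 16), int(dec_part)


hex_id, dec_id = split_ccb("100000004/4294967300")
print(hex_id == dec_id)
print(decode_rc(12), decode_rc(18))
```

Read this way, the slave PBE prepare returned SA_AIS_ERR_NOT_EXIST (12), and the PRTA update was reverted with SA_AIS_ERR_NO_RESOURCES (18).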
Once the controller is up, amf-state si gives:
safSi=SC-2N,safApp=OpenSAF
saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=PARTIALLY_ASSIGNED(3)
safSi=NoRed4,safApp=OpenSAF
saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=UNASSIGNED(1)
safSi=NoRed1,safApp=OpenSAF
saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=FULLY_ASSIGNED(2)
safSi=NoRed2,safApp=OpenSAF
saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=UNASSIGNED(1)
safSi=NoRed3,safApp=OpenSAF
saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=UNASSIGNED(1)
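To pick out the service instances that are not fully assigned from output like the above, a small parser is enough. This is a minimal sketch that assumes the three-line-per-SI layout shown here. The function names are hypothetical and the sample is a shortened copy of the output above.

```python
import re
from typing import Dict, List

# Shortened copy of the `amf-state si` output pasted in this ticket.
SAMPLE = """\
safSi=SC-2N,safApp=OpenSAF
saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=PARTIALLY_ASSIGNED(3)
safSi=NoRed3,safApp=OpenSAF
saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=UNASSIGNED(1)
safSi=NoRed1,safApp=OpenSAF
saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=FULLY_ASSIGNED(2)
"""


def parse_si_states(text: str) -> Dict[str, Dict[str, str]]:
    """Group attribute lines under the preceding SI DN line."""
    states: Dict[str, Dict[str, str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("safSi="):
            current = line
            states[current] = {}
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            # Drop the trailing numeric code, e.g. "UNASSIGNED(1)" -> "UNASSIGNED".
            states[current][key] = re.sub(r"\(\d+\)$", "", value)
    return states


def not_fully_assigned(states: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the DNs whose assignment state is anything but FULLY_ASSIGNED."""
    return [
        dn
        for dn, attrs in states.items()
        if attrs.get("saAmfSIAssignmentState") != "FULLY_ASSIGNED"
    ]


for dn in not_fully_assigned(parse_si_states(SAMPLE)):
    print(dn)
```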
In the 2PBE case, the PBEs on both controllers must be up. From the logs, only the PBE at slot 1 is up, and slot 2 has not yet joined the cluster. The RT update fails because the slot 2 PBE is not available.
From the AMF perspective, this needs to be analyzed, or the following error could be downgraded to a warning for RT updates:
Sep 2 16:54:13 SLOT1 osafamfd[3698]: ER exec: update FAILED 18
Hi,
From the AMF perspective, if IMM returns TRY_AGAIN, AMF retries the attribute update, as the amfd trace shows:
Sep 2 16:54:09.294319 osafamfd [3698:imm.cc:0396] >> execute
Sep 2 16:54:09.294319 osafamfd [3698:imm.cc:0212] >> exec: Update 'safSi=NoRed3,safApp=OpenSAF' saAmfUnassignedAlarmStatus
Sep 2 16:54:09.294319 osafamfd [3698:imma_oi_api.c:2446] >> rt_object_update_common
Sep 2 16:54:09.294569 osafamfd [3698:imma_oi_api.c:2719] << rt_object_update_common
Sep 2 16:54:09.294583 osafamfd [3698:imm.cc:0226] TR TRY-AGAIN
Sep 2 16:54:09.294589 osafamfd [3698:imm.cc:0241] << exec
Sep 2 16:54:09.294595 osafamfd [3698:imm.cc:0400] << execute: 2
However, after AMF had retried several times in this case, IMM returned 18 (SA_AIS_ERR_NO_RESOURCES), which AMF logs as an error, as expected.
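The retry behaviour described above can be illustrated roughly as follows. This is a simplified sketch, not the amfd implementation: `update_with_retry` and `make_fake_update` are hypothetical names, the retry limit and delay are made up, and the fake update only replays a fixed sequence of return codes (TRY_AGAIN twice, then NO_RESOURCES) to mimic what the trace and syslog show.

```python
import time
from typing import Callable, Iterable

# SaAisErrorT values from the SA Forum AIS headers.
SA_AIS_OK = 1
SA_AIS_ERR_TRY_AGAIN = 6
SA_AIS_ERR_NO_RESOURCES = 18


def update_with_retry(
    update: Callable[[], int], max_attempts: int = 5, delay_s: float = 0.0
) -> int:
    """Call `update` until it returns something other than TRY_AGAIN.

    Returns the last return code seen; TRY_AGAIN if attempts run out.
    """
    rc = SA_AIS_ERR_TRY_AGAIN
    for attempt in range(max_attempts):
        rc = update()
        if rc != SA_AIS_ERR_TRY_AGAIN:
            return rc
        if attempt + 1 < max_attempts:
            time.sleep(delay_s)  # back off before the next attempt
    return rc


def make_fake_update(responses: Iterable[int]) -> Callable[[], int]:
    """Build a stand-in for the IMM update call that replays fixed codes."""
    it = iter(responses)
    return lambda: next(it)


rc = update_with_retry(make_fake_update([6, 6, 18]))
if rc != SA_AIS_OK:
    print(f"update FAILED {rc}")
```

A caller that wanted the softer behaviour suggested earlier in this ticket could log at warning level when the final code is SA_AIS_ERR_NO_RESOURCES; that would be a policy choice and is not shown here.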
As I understand it, you started OpenSAF only on SC-1. This is not a recommended configuration when 2PBE is enabled, as noted by IMM: "With 2PBE, both PBEs must be available for the imm to be persistent-writable. If one or both PBEs are unavailable (or unresponsive) then persistent writes (CCBs, PRT operations, class changes) will fail."
If there are no further comments, I will set the ticket to "invalid" tomorrow.
Once the second controller joins, the object is updated. Hence closing.