Setup:
Changeset: 7613
OS: SUSE 11SP2 x86_64
Steps to reproduce:
Run /etc/init.d/opensafd restart on the standby controller.
The issue is reproducible most of the time.
The standby fails to rejoin the cluster, with the following errors in the log:
May 13 12:34:50 SLOT-2 osafimmnd[31358]: NO IMMD service is UP ... ScAbsenseAllowed?:0 introduced?:0
May 13 12:34:50 SLOT-2 osafclmna[31319]: NO safNode=SC-2,safCluster=myClmCluster Joined cluster, nodeid=2020f
May 13 12:34:50 SLOT-2 osafrded[31328]: NO Got peer info request from node 0x2010f with role ACTIVE
May 13 12:34:50 SLOT-2 osafrded[31328]: NO Got peer info response from node 0x2010f with role ACTIVE
May 13 12:34:50 SLOT-2 osafrded[31328]: NO RDE role set to QUIESCED
May 13 12:34:50 SLOT-2 osafrded[31328]: NO Giving up election against 0x2010f with role ACTIVE. My role is now QUIESCED
May 13 12:34:50 SLOT-2 osafimmnd[31358]: NO SERVER STATE: IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
May 13 12:34:50 SLOT-2 osafimmnd[31358]: NO Fevs count adjusted to 1936 preLoadPid: 0
May 13 12:34:50 SLOT-2 osafimmnd[31358]: NO SERVER STATE: IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
May 13 12:34:51 SLOT-2 osafimmnd[31358]: NO SERVER STATE: IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
May 13 12:34:51 SLOT-2 osafimmnd[31358]: NO NODE STATE-> IMM_NODE_ISOLATED
May 13 12:34:51 SLOT-2 osafimmnd[31358]: NO NODE STATE-> IMM_NODE_W_AVAILABLE
May 13 12:34:51 SLOT-2 osafimmnd[31358]: NO SERVER STATE: IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT
May 13 12:34:51 SLOT-2 osafimmnd[31358]: NO NODE STATE-> IMM_NODE_FULLY_AVAILABLE 2866
May 13 12:34:51 SLOT-2 osafimmnd[31358]: NO RepositoryInitModeT is SA_IMM_KEEP_REPOSITORY
May 13 12:34:51 SLOT-2 osafimmnd[31358]: WA IMM Access Control mode is DISABLED!
May 13 12:34:51 SLOT-2 osafimmnd[31358]: NO Epoch set to 14 in ImmModel
May 13 12:34:51 SLOT-2 osafimmnd[31358]: NO SERVER STATE: IMM_SERVER_SYNC_CLIENT --> IMM_SERVER_READY
May 13 12:34:51 SLOT-2 osafimmnd[31358]: NO ImmModel received scAbsenceAllowed 0
May 13 12:34:51 SLOT-2 osaflogd[31368]: Started
May 13 12:34:51 SLOT-2 osafntfd[31378]: Started
May 13 12:34:51 SLOT-2 osafclmd[31388]: Started
May 13 12:34:51 SLOT-2 osafamfd[31398]: Started
May 13 12:34:51 SLOT-2 osafamfd[31398]: WA saClmInitialize_4 returned 31
May 13 12:34:51 SLOT-2 osafamfd[31398]: WA saClmInitialize_4 returned 31
May 13 12:34:52 SLOT-2 osafamfd[31398]: WA saClmInitialize_4 returned 31
May 13 12:34:52 SLOT-2 osafamfd[31398]: WA saClmInitialize_4 returned 31
May 13 12:34:52 SLOT-2 osafamfd[31398]: WA saClmInitialize_4 returned 31
May 13 12:34:52 SLOT-2 osafamfd[31398]: WA saClmInitialize_4 returned 31
May 13 12:34:52 SLOT-2 osafamfd[31398]: WA saClmInitialize_4 returned 31
May 13 12:34:52 SLOT-2 osafamfd[31398]: WA saClmInitialize_4 returned 31
May 13 12:34:52 SLOT-2 osafamfd[31398]: WA saClmInitialize_4 returned 31
May 13 12:34:52 SLOT-2 osafamfd[31398]: WA saClmInitialize_4 returned 31
May 13 12:34:52 SLOT-2 osafamfd[31398]: WA saClmInitialize_4 returned 31
May 13 12:34:53 SLOT-2 osafamfd[31398]: WA saClmInitialize_4 returned 31
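For triage, the repeated warnings in an excerpt like the one above can be tallied per daemon. The following is only a minimal sketch, assuming the syslog layout shown in this ticket; the names LINE_RE, AIS_ERROR_NAMES, and summarize_warnings are hypothetical helpers and not part of OpenSAF. The only return code decoded is 31, which is SA_AIS_ERR_UNAVAILABLE in the SA Forum SaAisErrorT enumeration.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt in this ticket:
# "May 13 12:34:51 SLOT-2 osafamfd[31398]: WA saClmInitialize_4 returned 31"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<daemon>\w+)\[(?P<pid>\d+)\]:\s+(?P<sev>[A-Z]{2})\s+(?P<msg>.*)$"
)

# Only the code seen in this ticket is listed; extend as needed.
AIS_ERROR_NAMES = {31: "SA_AIS_ERR_UNAVAILABLE"}


def summarize_warnings(lines):
    """Count WA (warning) lines, keyed by (daemon, message)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        # Lines without a two-letter severity (e.g. "Started") do not match.
        if match and match.group("sev") == "WA":
            counts[(match.group("daemon"), match.group("msg"))] += 1
    return counts


def decode_return_code(message):
    """Return the symbolic name for a trailing 'returned <n>' code, if known."""
    match = re.search(r"returned (\d+)$", message)
    if not match:
        return None
    return AIS_ERROR_NAMES.get(int(match.group(1)))
```

Feeding the syslog lines from the standby into summarize_warnings shows at a glance how often osafamfd retried saClmInitialize_4 before the excerpt ends.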
I am not able to reproduce this on CS 7635:de79cf144234 with a regular configuration (no PBE, two controllers).
I also tried on CS #7316, but the issue was not reproducible there.
I am still observing the issue with the latest CS #7640. Traces are attached; please refer to them.
Hi Guys,
I tried to reproduce the issue on the latest OpenSAF release, but I did not succeed.
I did not find any traces covering the incident time (May 13 12:34:51) in the attached logs.tar file.
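A check like this can be scripted once the archive is unpacked. The sketch below is an assumption-laden example: it presumes each trace line begins with a syslog-style timestamp such as "May 13 12:34:51", and because that format carries no year, the year argument is an arbitrary placeholder. The names parse_ts and entries_near are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Assumed leading timestamp, e.g. "May 13 12:34:51" (no year in the source).
TS_RE = re.compile(r"^(\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})")


def parse_ts(line, year):
    """Parse the leading timestamp of a line, or return None if absent."""
    match = TS_RE.match(line)
    if not match:
        return None
    return datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")


def entries_near(lines, incident, slack_seconds=5, year=2000):
    """Return the lines whose timestamp is within slack_seconds of incident."""
    target = parse_ts(incident, year)
    if target is None:
        raise ValueError(f"unrecognized incident timestamp: {incident!r}")
    slack = timedelta(seconds=slack_seconds)
    hits = []
    for line in lines:
        ts = parse_ts(line, year)
        if ts is not None and abs(ts - target) <= slack:
            hits.append(line)
    return hits
```

An empty result for every file in the archive would support the observation that the attached traces do not cover the incident.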
So, can I close the ticket?
Thanks
Mohan (www.GetHighAvailability.com)