Cluster stuck in unhealthy state after brutal reboot of SCs
2020-11-26 06:58:45.011 SC-2 osafamfd[247]: NO Received node_up from 2010f: msg_id 1
2020-11-26 06:58:45.012 SC-2 osafamfd[247]: NO Node 'SC-1' joined the cluster
2020-11-26 06:58:48.240 SC-2 systemd-sysctl[35]: Couldn't write '4 4 1 7' to 'kernel/printk', ignoring: Read-only file system
2020-11-26 06:58:48.252 SC-2 systemd-sysctl[35]: Couldn't write '1' to 'kernel/kptr_restrict', ignoring: Read-only file system
2020-11-26 06:58:45.512 SC-1 osafamfnd[260]: NO Assigning 'safSi=NoRed1,safApp=OpenSAF' ACTIVE to 'safSu=SC-1,safSg=NoRed,safApp=OpenSAF'
2020-11-26 06:58:45.518 SC-1 osafamfnd[260]: NO Assigned 'safSi=NoRed1,safApp=OpenSAF' ACTIVE to 'safSu=SC-1,safSg=NoRed,safApp=OpenSAF'
2020-11-26 06:58:46.425 SC-1 osafdtmd[126]: NO Lost contact with 'SC-2'
2020-11-26 06:58:46.428 SC-1 osafamfnd[260]: WA AMF director unexpectedly crashed
2020-11-26 06:58:46.428 SC-1 osafamfnd[260]: NO Checking 'safSu=SC-1,safSg=2N,safApp=OpenSAF' for pending messages
2020-11-26 06:58:46.428 SC-1 osafamfnd[260]: NO Checking 'safSu=SC-1,safSg=NoRed,safApp=OpenSAF' for pending messages
2020-11-26 06:58:46.436 SC-1 osafamfnd[260]: NO 'safSu=SC-1,safSg=2N,safApp=OpenSAF' Presence State INSTANTIATING => INSTANTIATED
SC-2 was powered off when SC-1 had just come up (not yet standby).
SC-1 then entered headless state and promoted itself to active (like a roaming SC).
AMFND failed to add the SU-SI record because it already existed.
2020-11-26 06:58:49.365 SC-1 osafamfnd[260]: NO AVD NEW_ACTIVE, adest:1
2020-11-26 06:58:49.442 SC-1 osafamfnd[260]: NO saClmDispatch BAD_HANDLE
2020-11-26 06:58:49.442 SC-1 osafamfnd[260]: NO Sending node up due to NCSMDS_NEW_ACTIVE
2020-11-26 06:58:56.028 SC-1 osafamfnd[260]: CR SU-SI record addition failed, SU= safSu=SC-1,safSg=NoRed,safApp=OpenSAF : SI=safSi=NoRed1,safApp=OpenSAF
2020-11-26 06:58:56.038 SC-1 osafamfnd[260]: NO Assigning 'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-1,safSg=2N,safApp=OpenSAF'
2020-11-26 06:58:56.073 SC-1 osafamfnd[260]: NO Assigned 'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-1,safSg=2N,safApp=OpenSAF'
2020-11-26 06:58:56.700 SC-1 osafamfd[247]: NO Received node_up from 2020f: msg_id 1
2020-11-26 06:58:57.086 SC-1 osafamfd[247]: NO Received node_up from 2050f: msg_id 1
2020-11-26 06:58:57.087 SC-1 osafamfd[247]: NO Received node_up from 2030f: msg_id 1
2020-11-26 06:58:57.090 SC-1 osafamfd[247]: NO Received node_up from 2040f: msg_id 1
<143>1 2020-11-26T06:59:02.179518+01:00 SC-1 osafamfd 247 osafamfd [meta sequenceId="18992"] 247:amf/amfd/ndfsm.cc:373 TR invalid init state (2), node 2020f
<143>1 2020-11-26T06:59:02.579492+01:00 SC-1 osafamfd 247 osafamfd [meta sequenceId="19025"] 247:amf/amfd/ndfsm.cc:373 TR invalid init state (2), node 2030f
<143>1 2020-11-26T06:59:02.579642+01:00 SC-1 osafamfd 247 osafamfd [meta sequenceId="19040"] 247:amf/amfd/ndfsm.cc:373 TR invalid init state (2), node 2040f
<143>1 2020-11-26T06:59:02.579795+01:00 SC-1 osafamfd 247 osafamfd [meta sequenceId="19055"] 247:amf/amfd/ndfsm.cc:373 TR invalid init state (2), node 2050f
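The failure mode in the logs (the NoRed1 assignment is already recorded at 06:58:45.518, then the record addition fails at 06:58:56.028 once the new active director sends it again) can be illustrated with a minimal sketch. This is not OpenSAF code and does not claim to reflect the actual fix: `SuSiTable`, `add_strict`, and `add_or_update` are hypothetical names, and the table is assumed to be a plain map keyed by the (SU, SI) pair.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a node director's SU-SI assignment table.
// None of these names come from the OpenSAF sources.
enum class HaState { Active, Standby };

class SuSiTable {
 public:
  // Strict add: returns false when the (SU, SI) pair is already recorded,
  // which is the situation reported as "SU-SI record addition failed".
  bool add_strict(const std::string& su, const std::string& si, HaState s) {
    return records_.emplace(std::make_pair(su, si), s).second;
  }

  // Tolerant add: an existing record is treated as a repeated assignment
  // and simply refreshed, so a resend after headless does not fail.
  bool add_or_update(const std::string& su, const std::string& si, HaState s) {
    records_[std::make_pair(su, si)] = s;
    return true;
  }

  std::size_t size() const { return records_.size(); }

 private:
  std::map<std::pair<std::string, std::string>, HaState> records_;
};
```

With the strict variant, a second add of the same SU and SI fails while the first record stays in place. The tolerant variant accepts the repeat without creating a duplicate. Whether the real fix takes either approach is an assumption not confirmed by this ticket.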
commit 501241653d25bc2beffad7a25ea6a281d66c0c6f (HEAD -> develop, origin/develop)
Author: thuan.tran <thuan.tran@dektech.com.au>
Date: Tue Dec 1 17:06:31 2020 +0700