#1648 amf: amf director unexpectedly crashed upon ImplementerClear failed 5

future
unassigned
nobody
None
defect
amf
-
minor
2016-09-20
2015-12-18
No

I was running the following script, which alternates continuously between fail-over and switch-over,
and observed the issue below.

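#!/bin/bash
# Alternate between a fail-over (opensafd restart) and a switch-over (si-swap).
NUM=2
for (( i = 0; i <= 100; i++ ))
do
    if (( NUM % 2 == 0 )); then
        # Even iteration: restart OpenSAF on this node to force a fail-over.
        echo "Starting opensafd restart"
        /etc/init.d/opensafd restart
    else
        # Odd iteration: swap the 2N controller SI to force a switch-over.
        echo "Starting opensafd si-swap"
        amf-adm si-swap safSi=SC-2N,safApp=OpenSAF
    fi
    (( NUM += 1 ))
done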

============================================
Dec 18 10:43:30 SC-2 osafimmnd[27298]: NO Implementer disconnected 100 <0, 2010f> (safLogService)
Dec 18 10:43:30 SC-2 osafimmnd[27298]: NO Implementer disconnected 104 <0, 2010f> (safClmService)
Dec 18 10:43:30 SC-2 osafimmnd[27298]: NO Implementer disconnected 103 <0, 2010f> (safEvtService)
Dec 18 10:43:30 SC-2 osafimmnd[27298]: NO Implementer disconnected 101 <0, 2010f> (safCheckPointService)
Dec 18 10:43:30 SC-2 osafimmnd[27298]: NO Implementer disconnected 99 <0, 2010f> (safMsgGrpService)
Dec 18 10:43:30 SC-2 osafimmd[27287]: WA IMMD lost contact with peer IMMD (NCSMDS_RED_DOWN)
Dec 18 10:43:30 SC-2 osafimmnd[27298]: WA DISCARD DUPLICATE FEVS message:18220
Dec 18 10:43:30 SC-2 osafimmnd[27298]: WA Error code 2 returned for message type 82 - ignoring
Dec 18 10:43:30 SC-2 osafimmnd[27298]: WA DISCARD DUPLICATE FEVS message:18221
Dec 18 10:43:30 SC-2 osafimmnd[27298]: WA Error code 2 returned for message type 82 - ignoring
Dec 18 10:43:30 SC-2 osafimmd[27287]: WA IMMND DOWN on active controller f1 detected at standby immd!! f2. Possible failover
Dec 18 10:43:30 SC-2 osafimmd[27287]: NO Skipping re-send of fevs message 18220 since it has recently been resent.
Dec 18 10:43:30 SC-2 osafimmd[27287]: NO Skipping re-send of fevs message 18221 since it has recently been resent.
Dec 18 10:43:30 SC-2 osafimmnd[27298]: NO Global discard node received for nodeId:2010f pid:6737
Dec 18 10:43:30 SC-2 osafimmnd[27298]: NO Implementer disconnected 98 <0, 2010f(down)> (MsgQueueService131343)
Dec 18 10:43:30 SC-2 osafimmnd[27298]: NO Implementer disconnected 102 <0, 2010f(down)> (safLckService)
Dec 18 10:43:30 SC-2 osafimmnd[27298]: NO Implementer disconnected 97 <0, 2010f(down)> (@safAmfService2010f)
Dec 18 10:43:31 SC-2 kernel: [68517.295788] tipc: Resetting link <1.1.2:eth2-1.1.1:eth1>, changeover initiated by peer
Dec 18 10:43:31 SC-2 kernel: [68517.295794] tipc: Lost link <1.1.2:eth2-1.1.1:eth1> on network plane A
Dec 18 10:43:31 SC-2 kernel: [68517.354991] tipc: Duplicate <1.1.1> using eth(08:00:27:3b:a5:86) seen on <eth:eth2>
Dec 18 10:43:40 SC-2 osafamfd[27348]: ER FAILOVER Active --> Quiesced FAILED, ImplementerClear failed 5
Dec 18 10:43:40 SC-2 osafimmnd[27298]: WA ERR_BAD_HANDLE: Handle use is blocked by pending reply on syncronous call
Dec 18 10:43:40 SC-2 osafimmnd[27298]: NO Implementer locally disconnected. Marking it as doomed 90 <9, 2020f> (safAmfService)
Dec 18 10:43:40 SC-2 osafamfd[27348]: ER FAILOVER Active --> Quiesced FAILED, ImplementerClear failed 9
Dec 18 10:43:40 SC-2 osafamfd[27348]: NO Re-initializing with IMM
Dec 18 10:43:40 SC-2 osafimmnd[27298]: WA IMMND - Client Node Get Failed for cli_hdl 38654837263
Dec 18 10:43:50 SC-2 osafamfd[27348]: ER saImmOiImplementerSet failed 5
Dec 18 10:43:50 SC-2 osafamfd[27348]: ER exiting since avd_imm_applier_set failed
Dec 18 10:43:50 SC-2 osafamfnd[27362]: ER AMF director unexpectedly crashed
Dec 18 10:43:50 SC-2 osafamfnd[27362]: Rebooting OpenSAF NodeId = 131599 EE Name = , Reason: local AVD down(Adest) or both AVD down(Vdest) received, OwnNodeId = 131599, SupervisionTime = 60
Dec 18 10:43:50 SC-2 opensaf_reboot: Rebooting local node; timeout=60
Dec 18 10:49:17 SC-2 syslog-ng[1193]: syslog-ng starting up; version='2.0.9'

============================================

Discussion

  • Hans Nordebäck

    Hans Nordebäck - 2015-12-21

    This fault seems to be caused by amfd not handling SA_AIS_ERR_TIMEOUT in avd_imm_reinit_bg_thread. See also [#1607]. ERR_TIMEOUT can be handled the same way as TRY_AGAIN if the operation is idempotent, and saImmOiImplementerSet is idempotent.

     

    Related

    Tickets: #1607


    Last edit: Hans Nordebäck 2016-01-07
  • Mathi Naickan

    Mathi Naickan - 2016-05-04
    • Milestone: 4.6.2 --> 4.7.2
     
  • Anders Widell

    Anders Widell - 2016-09-20
    • Milestone: 4.7.2 --> 5.0.2
     
  • Anders Widell

    Anders Widell - 2017-04-03
    • Milestone: 5.0.2 --> future
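
The handling suggested in the first comment (treat SA_AIS_ERR_TIMEOUT like SA_AIS_ERR_TRY_AGAIN for an idempotent call) could look roughly like the sketch below. It is not the amfd code and does not link against OpenSAF: `retry_idempotent`, `struct fake_imm` and `fake_implementer_set` are hypothetical names, and the fake only simulates a call that times out a few times before succeeding. The enum values are the standard SaAisErrorT codes from the AIS headers, which also explains the log above: "failed 5" is SA_AIS_ERR_TIMEOUT and "failed 9" is SA_AIS_ERR_BAD_HANDLE.

```c
#include <stddef.h>

/* Subset of the SaAisErrorT codes defined by the SA Forum AIS headers. */
typedef enum {
    SA_AIS_OK = 1,
    SA_AIS_ERR_TIMEOUT = 5,    /* "failed 5" in the syslog excerpt */
    SA_AIS_ERR_TRY_AGAIN = 6,
    SA_AIS_ERR_BAD_HANDLE = 9  /* "failed 9" in the syslog excerpt */
} SaAisErrorT;

/* Hypothetical signature for any idempotent call, e.g. a wrapper
 * around saImmOiImplementerSet. */
typedef SaAisErrorT (*idempotent_call_fn)(void *ctx);

/* Hypothetical helper: retry while the call reports TRY_AGAIN or TIMEOUT.
 * This is only safe because the call is idempotent. Real code would
 * sleep between attempts; that is omitted to keep the sketch short. */
SaAisErrorT retry_idempotent(idempotent_call_fn call, void *ctx,
                             int max_attempts)
{
    SaAisErrorT rc = SA_AIS_ERR_TRY_AGAIN;
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        rc = call(ctx);
        if (rc != SA_AIS_ERR_TRY_AGAIN && rc != SA_AIS_ERR_TIMEOUT)
            break; /* success or a non-retryable error */
    }
    return rc;
}

/* Hypothetical fake standing in for the IMM service: it times out
 * `timeouts_left` times and then succeeds. */
struct fake_imm {
    int timeouts_left;
    int calls;
};

SaAisErrorT fake_implementer_set(void *ctx)
{
    struct fake_imm *imm = ctx;
    imm->calls++;
    if (imm->timeouts_left > 0) {
        imm->timeouts_left--;
        return SA_AIS_ERR_TIMEOUT;
    }
    return SA_AIS_OK;
}
```

For example, with a `struct fake_imm` initialised to two pending timeouts, `retry_idempotent(fake_implementer_set, &imm, 5)` makes three calls and returns SA_AIS_OK. With more pending timeouts than allowed attempts, it returns SA_AIS_ERR_TIMEOUT, so the caller can still decide whether to exit.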
     
