
#2493 amf: amfnd asserts while shutting down when active monitoring fails for NPI comp.

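Milestone: 5.17.11
Status: fixed
Owner: nobody
Labels: None
Type: defect
Component: amf
Part: nd
Severity: major
Blocker: True
Updated: 2017-10-30
Created: 2017-06-13
Creator: Praveen
Private: No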

Steps to reproduce:
1) Bring one controller up.
2) Add the attached configuration to the system.
3) Unlock-in and unlock su1.

The attached configuration uses the amfpm command to start active monitoring. If the user configures this command incorrectly, AMF reports a fault on the component and AMFND restarts it. Because the active monitoring command fails every time, the component is faulted continuously. When OpenSAF is finally stopped on the node, AMFND asserts:

syslog:
Jun 13 12:27:03 SC-1 osafamfnd[30287]: NO Removed 'safSi=AmfDemo,safApp=AmfDemo1' from 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
Jun 13 12:27:03 SC-1 osafamfnd[30287]: NO Removed assignments from AMF components
Jun 13 12:27:03 SC-1 osafamfnd[30287]: NO 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' Component or SU restart probation timer expired
Jun 13 12:27:03 SC-1 osafamfnd[30287]: NO Terminating all AMF components
Jun 13 12:27:03 SC-1 osafamfnd[30287]: NO 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' Presence State RESTARTING => TERMINATING
Jun 13 12:27:03 SC-1 osafamfnd[30287]: src/amf/amfnd/susm.cc:1886: avnd_su_pres_st_chng_prc: Assertion 'si' failed.
Jun 13 12:27:03 SC-1 osafclmd[30264]: AL AMF Node Director is down, terminate this process

bt:
#0 0x00007f662fbe8cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f662fbec0d8 in __GI_abort () at abort.c:89
#2 0x00007f66306dedbe in __osafassert_fail (__file=<optimized out>, __line=<optimized out>, __func=<optimized out>,
__assertion=<optimized out>) at src/base/sysf_def.c:286
#3 0x00007f66313fff3f in avnd_su_pres_st_chng_prc (final_st=SA_AMF_PRESENCE_TERMINATING,
prv_st=SA_AMF_PRESENCE_RESTARTING, su=0x7f66324d33c0, cb=0x7f663161f240 <_avnd_cb>) at src/amf/amfnd/susm.cc:1886
#4 avnd_su_pres_fsm_run (cb=cb@entry=0x7f663161f240 <_avnd_cb>, su=0x7f66324d33c0, comp=comp@entry=0x7f66324d46b0,
ev=<optimized out>) at src/amf/amfnd/susm.cc:1610
#5 0x00007f66313caf58 in avnd_comp_clc_st_chng_prc (cb=cb@entry=0x7f663161f240 <_avnd_cb>,
comp=comp@entry=0x7f66324d46b0, prv_st=prv_st@entry=SA_AMF_PRESENCE_RESTARTING,
final_st=final_st@entry=SA_AMF_PRESENCE_TERMINATING) at src/amf/amfnd/clc.cc:1501
#6 0x00007f66313cf127 in avnd_comp_clc_fsm_run (cb=0x7f663161f240 <_avnd_cb>, comp=comp@entry=0x7f66324d46b0,
ev=ev@entry=AVND_COMP_CLC_PRES_FSM_EV_CLEANUP) at src/amf/amfnd/clc.cc:892
#7 0x00007f66314067e8 in avnd_comp_cleanup_launch (comp=comp@entry=0x7f66324d46b0) at src/amf/amfnd/util.cc:178
#8 0x00007f6631405beb in avnd_last_step_clean (cb=cb@entry=0x7f663161f240 <_avnd_cb>) at src/amf/amfnd/term.cc:76
#9 0x00007f66313e13b9 in avnd_di_msg_ack_process (cb=cb@entry=0x7f663161f240 <_avnd_cb>, mid=<optimized out>)
at src/amf/amfnd/di.cc:1264
#10 0x00007f66313e1484 in avnd_evt_avd_ack_evh (cb=0x7f663161f240 <_avnd_cb>, evt=0x7f6628001010)
at src/amf/amfnd/di.cc:411
#11 0x00007f66313ec9df in avnd_evt_process (evt=0x7f6628001010) at src/amf/amfnd/main.cc:658
#12 avnd_main_process () at src/amf/amfnd/main.cc:610
#13 0x00007f66313c261f in main (argc=2, argv=0x7ffc47fa34f8) at src/amf/amfnd/main.cc:203
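
Read together, the syslog and frame #3 suggest an ordering problem: the SI assignment is removed from SU1 first, and the later RESTARTING => TERMINATING transition then asserts on 'si'. The sketch below is a minimal model of that ordering, not the amfnd code. The class and function names are hypothetical, and the guarded variant only illustrates how such a handler could tolerate an SU with no SI left; it is not a description of the committed fix.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Su:
    """Hypothetical stand-in for a service unit record."""
    name: str
    si_list: List[str] = field(default_factory=list)
    presence: str = "RESTARTING"


def first_si(su: Su) -> Optional[str]:
    """Return the first assigned SI, or None if nothing is assigned."""
    return su.si_list[0] if su.si_list else None


def on_terminating_strict(su: Su) -> str:
    """Models a handler that assumes an SI is still assigned."""
    si = first_si(su)
    if si is None:
        # Plays the role of the failed assertion on 'si'.
        raise AssertionError("si")
    su.presence = "TERMINATING"
    return f"terminated with {si}"


def on_terminating_guarded(su: Su) -> str:
    """Models a handler that tolerates an SU whose SI was already removed."""
    si = first_si(su)
    su.presence = "TERMINATING"
    if si is None:
        return "terminated without SI"
    return f"terminated with {si}"


# Replay the order seen in the syslog: assignment removed, then termination.
su = Su("safSu=SU1", ["safSi=AmfDemo"])
su.si_list.clear()

try:
    on_terminating_strict(su)
    strict_outcome = "ok"
except AssertionError:
    strict_outcome = "assert"

guarded_outcome = on_terminating_guarded(su)
```

In this model the strict handler fails exactly when the SI list is already empty, which matches the sequence in the log above; whether the real handler should skip the SI lookup or take a different path on shutdown is a design decision for the actual patch.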

3 Attachments

Related

Wiki: ChangeLog-5.17.11

Discussion

  • Praveen - 2017-06-13
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -1,4 +1,3 @@
    -
     steps to reproduce:
     1)Bring one controller up.
     2)Add attached configuration in the system.
    
    • status: unassigned --> assigned
    • assigned_to: Praveen
     
  • Anders Widell - 2017-07-01
    • Milestone: 5.17.06 --> 5.17.08
     
  • Anders Widell - 2017-07-28
    • Milestone: 5.17.07 --> 5.17.10
     
  • Praveen - 2017-08-29
    • status: assigned --> accepted
    • Blocker: False --> True
     
  • Praveen - 2017-08-30
    • status: accepted --> review
     
  • Praveen - 2017-08-30

    Escalation does not reach node failover in this issue because both the comp and SU restart probation timer values are very small (less than a nanosecond).
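
The effect of a very small probation timer can be illustrated with a simplified model: if the probation window expires before the next fault arrives, the restart counter starts again from zero and the threshold for escalation is never crossed. The function below is a sketch under that assumption only. Its name, parameters, and counting rule are hypothetical and much simpler than the real AMF escalation logic.

```python
def escalates(fault_times, probation, max_restarts):
    """Return True if restarts within one probation window exceed max_restarts.

    fault_times  : ascending timestamps (seconds) of component faults
    probation    : length of the probation window in seconds
    max_restarts : restarts tolerated inside one window before escalating
    """
    count = 0
    window_start = None
    for t in fault_times:
        if window_start is None or t - window_start >= probation:
            # First fault, or the probation timer already expired:
            # open a new window and forget earlier restarts.
            window_start = t
            count = 0
        count += 1
        if count > max_restarts:
            return True
    return False


faults = [0.0, 1.0, 2.0, 3.0, 4.0]  # one fault per second

# Probation shorter than a nanosecond: every fault opens a fresh window.
tiny_probation = escalates(faults, probation=1e-10, max_restarts=3)

# Probation of ten seconds: the fourth fault exceeds the limit.
long_probation = escalates(faults, probation=10.0, max_restarts=3)
```

With the sub-nanosecond window the counter never exceeds one, so the model keeps restarting the component indefinitely, which is consistent with the continuous faulting described in this ticket.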

     
  • Praveen - 2017-09-12
    • status: review --> fixed
    • assigned_to: Praveen --> nobody
     
  • Praveen - 2017-09-12

    develop:
    commit 126c7d9c59a41205ce16c2c9e8a7cae7457a0c2c
    Author: Praveen <praveen.malviya@oracle.com>
    Date: Tue Sep 12 17:08:11 2017 +0530

    amfnd: fix opensaf shutdown and active monitoring failure [#2493]
    

    commit 74476b88a30c80c788e56b6ede2baea040e22c18
    Author: Praveen <praveen.malviya@oracle.com>
    Date: Tue Sep 12 17:08:11 2017 +0530

    amfnd: fix opensaf shutdown and active monitoring failure [#2493]
    
     
