
#1361 AMF should not honor attributes saAmfSGMaxActiveSIsperSU / saAmfSGMaxStandbySIsperSU attributes for 2n

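Milestone: 4.5.2
Status: fixed
Labels: None
Type: defect
Component: amf
Part: d
Version: 4.6FC
Severity: major
Updated: 2015-05-27
Created: 2015-04-30
Creator: Srikanth R
Private: No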

Changeset : 6377

Issue: amfd crashed during the SG unlock operation after saAmfSGMaxStandbySIsperSU was set to 0.

Steps performed:

-> An AMF application is configured in the 2N redundancy model with 2 SUs and 5 SIs (SI-SI dependency configured with 1 sponsor SI and the others as dependent SIs).

-> Initially, the application is brought up by unlocking the SUs and the SG.

-> Locked the SG.

-> Ran the following command on the locked SG:

immcfg -a saAmfSGMaxStandbySIsperSU=0 safSg=SG,safApp=test2nApp
Apr 30 14:29:04 SYSTEST-CNTLR-1 osafimmnd[10286]: NO Ccb 95 COMMITTED (immcfg_SYSTEST-CNTLR-1_14095)

-> Finally, unlocked the SG, at which point amfd crashed with the following syslog:

SYSTEST-CNTLR-1:/opt/goahead/tetware/opensaffire/scripts # immadm -o 1 safSg=SG,safApp=test2nApp
Apr 30 14:29:14 SYSTEST-CNTLR-1 osafamfd[10349]: su.cc:1821: inc_curr_stdby_si: Assertion 'saAmfSUNumCurrStandbySIs <= sg_of_su->saAmfSGMaxStandbySIsperSU' failed.
Apr 30 14:29:14 SYSTEST-CNTLR-1 osafamfnd[10359]: ER AMF director unexpectedly crashed
Apr 30 14:29:14 SYSTEST-CNTLR-1 osafamfnd[10359]: Rebooting OpenSAF NodeId = 131343 EE Name = , Reason: local AVD down(Adest) or both AVD down(Vdest) received, OwnNodeId = 131343, SupervisionTime = 60
Apr 30 14:29:14 SYSTEST-CNTLR-1 opensaf_reboot: Rebooting local node; timeout=60
Apr 30 14:29:14 SYSTEST-CNTLR-1 osafimmnd[10286]: NO Implementer locally disconnected. Marking it as doomed 4 <19, 2010f> (safAmfService)
Apr 30 14:29:14 SYSTEST-CNTLR-1 osafimmnd[10286]: NO Implementer disconnected 4 <19, 2010f> (safAmfService)
error - saImmOmAdminOperationInvoke_2 FAILED: SA_AIS_ERR_TIMEOUT (5)
Apr 30 14:29:15 SYSTEST-CNTLR-1 osafimmnd[10286]: WA Timeout on syncronous admin operation 1

-> amfd on the standby controller also crashed on the same assertion, and the cluster rebooted.

Apr 30 14:29:34 SYSTEST-CNTLR-2 osafimmnd[20291]: NO Implementer disconnected 25 <339, 2020f> (MsgQueueService131343)
Apr 30 14:29:34 SYSTEST-CNTLR-2 osafamfd[20347]: su.cc:1821: inc_curr_stdby_si: Assertion 'saAmfSUNumCurrStandbySIs <= sg_of_su->saAmfSGMaxStandbySIsperSU' failed.
Apr 30 14:29:34 SYSTEST-CNTLR-2 osafimmnd[20291]: NO Implementer connected: 26 (safSmfService) <335, 2020f>
Apr 30 14:29:34 SYSTEST-CNTLR-2 osafimmpbed: NO Successfully opened pre-existing sqlite pbe file /home/immPBE/imm.db
Apr 30 14:29:34 SYSTEST-CNTLR-2 osafamfnd[20357]: ER AMF director unexpectedly crashed
Apr 30 14:29:34 SYSTEST-CNTLR-2 osafamfnd[20357]: Rebooting OpenSAF NodeId = 131599 EE Name = , Reason: local AVD down(Adest) or both AVD down(Vdest) received, OwnNodeId = 131599, SupervisionTime = 60

Below is the backtrace.

#0 0x00007fdf22d02b55 in raise () from /lib64/libc.so.6
#1 0x00007fdf22d04131 in abort () from /lib64/libc.so.6
#2 0x00007fdf24aeb37a in __osafassert_fail () from /usr/lib64/libopensaf_core.so.0
#3 0x0000000000477454 in AVD_SU::inc_curr_stdby_si() () at su.cc:1821
#4 0x000000000046b168 in avd_susi_update_assignment_counters(avd_su_si_rel_tag, AVSV_SUSI_ACT, SaAmfHAStateT, SaAmfHAStateT) () at siass.cc:624
#5 0x000000000046bbc5 in avd_susi_create(cl_cb_tag, AVD_SI, AVD_SU, SaAmfHAStateT, bool) () at siass.cc:255
#6 0x0000000000464dd8 in avd_new_assgn_susi(cl_cb_tag, AVD_SU, AVD_SI, SaAmfHAStateT, bool, avd_su_si_rel_tag) () at sgproc.cc:111
#7 0x00000000004453b1 in avd_sg_2n_su_chose_asgn(cl_cb_tag, AVD_SG) () at sg_2n_fsm.cc:700
#8 0x0000000000449068 in SG_2N::susi_success_sg_realign(AVD_SU, avd_su_si_rel_tag, AVSV_SUSI_ACT, SaAmfHAStateT) () at sg_2n_fsm.cc:1814
#9 0x000000000044a5bc in SG_2N::susi_success(cl_cb_tag, AVD_SU, avd_su_si_rel_tag, AVSV_SUSI_ACT, SaAmfHAStateT) () at sg_2n_fsm.cc:2379
#10 0x0000000000467278 in avd_su_si_assign_evh(cl_cb_tag, avd_evt_tag) () at sgproc.cc:1251
#11 0x00000000004332b6 in process_event(cl_cb_tag, avd_evt_tag) () at main.cc:775
#12 0x0000000000407b3c in main () at main.cc:395
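
The failing check can be illustrated with a minimal sketch. It is not the amfd source: `SgModel`, `SuModel`, and `first_standby_assignment_fails` are hypothetical names, and it assumes the counter is incremented before the invariant from the assertion message is checked. The sketch throws where amfd would abort.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for the SG object: only the limit named in the
// assertion message is modelled.
struct SgModel {
  std::uint32_t saAmfSGMaxStandbySIsperSU;
};

// Hypothetical stand-in for the SU object.
struct SuModel {
  explicit SuModel(const SgModel& sg) : sg_of_su(&sg) {}

  // Assumed behaviour: count the new standby assignment, then check the
  // invariant quoted in the syslog. Throws instead of aborting.
  void inc_curr_stdby_si() {
    ++saAmfSUNumCurrStandbySIs;
    if (!(saAmfSUNumCurrStandbySIs <= sg_of_su->saAmfSGMaxStandbySIsperSU)) {
      throw std::logic_error(
          "saAmfSUNumCurrStandbySIs <= sg_of_su->saAmfSGMaxStandbySIsperSU");
    }
  }

  const SgModel* sg_of_su;
  std::uint32_t saAmfSUNumCurrStandbySIs = 0;
};

// Returns true if the very first standby assignment violates the limit.
bool first_standby_assignment_fails(std::uint32_t max_standby_sis_per_su) {
  SgModel sg{max_standby_sis_per_su};
  SuModel su(sg);
  try {
    su.inc_curr_stdby_si();
  } catch (const std::logic_error&) {
    return true;
  }
  return false;
}
```

Under these assumptions, a limit of 0 makes the first standby assignment fail, which matches the crash seen as soon as the unlocked SG tries to assign standby.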

Related

Tickets: #1361
Wiki: ChangeLog-4.5.2
Wiki: ChangeLog-4.6.1

Discussion

  • Srikanth R

    Srikanth R - 2015-04-30

    While bringing up the configuration, the attributes saAmfSGMaxActiveSIsperSU and saAmfSGMaxStandbySIsperSU are ignored for the 2n model.

    Apr 30 14:39:17 SYSTEST-CNTLR-1 osafamfd[2346]: NO 'safSg=SG,safApp=test2nApp' attribute saAmfSGMaxActiveSIsperSU ignored, not valid for red model
    Apr 30 14:39:17 SYSTEST-CNTLR-1 osafamfd[2346]: NO 'safSg=SG,safApp=test2nApp' attribute saAmfSGMaxStandbySIsperSU ignored, not valid for red model

    Once the application is configured, however, a change to these attributes does take effect.

    The attributes should be either fully supported or fully ignored for the 2n model.

     
  • Srikanth R

    Srikanth R - 2015-04-30
    • summary: AMF: osafamfd crashed during SG unlock operation, after update of flag saAmfSGMaxStandbySIsperSU to 0 --> AMF should not honor attributes saAmfSGMaxStandbySIsperSU / saAmfSGMaxStandbySIsperSU attributes for 2n
     
  • Nagendra Kumar

    Nagendra Kumar - 2015-05-15
    • status: unassigned --> assigned
    • assigned_to: Nagendra Kumar
     
  • Nagendra Kumar

    Nagendra Kumar - 2015-05-18
    • status: assigned --> accepted
     
  • Nagendra Kumar

    Nagendra Kumar - 2015-05-19

    It is reproducible with the following steps, using the 2N redundancy model with 2 controllers and 1 payload:
    1. Configure SU1 on SC-1 as active.
    2. Configure SU2 on SC-2 as standby.
    3. Configure SU3 (unlocked) on PL-3.

    Lock the SG, set saAmfSGMaxStandbySIsperSU to zero, and then unlock the SG.
    The active amfd crashes, leading to an SC-1 node reboot.
    SC-2 becomes active, assigns active to SU2, and tries to assign standby to SU3; this amfd also crashes, leading to a cluster reboot.

    Thanks
    -Nagu

     
  • Srikanth R

    Srikanth R - 2015-05-21
    • summary: AMF should not honor attributes saAmfSGMaxStandbySIsperSU / saAmfSGMaxStandbySIsperSU attributes for 2n --> AMF should not honor attributes saAmfSGMaxActiveSIsperSU / saAmfSGMaxStandbySIsperSU attributes for 2n
     
  • Nagendra Kumar

    Nagendra Kumar - 2015-05-21
    • status: accepted --> review
    • Milestone: future --> 4.5.2
     
  • Nagendra Kumar

    Nagendra Kumar - 2015-05-27
    • status: review --> fixed
    • Part: - --> d
     
  • Nagendra Kumar

    Nagendra Kumar - 2015-05-27

    changeset: 6590:d775d8fb7951
    branch: opensaf-4.5.x
    parent: 6587:071c4ca7a679
    user: Nagendra Kumar <nagendra.k@oracle.com>
    date: Wed May 27 12:31:51 2015 +0530
    summary: amfd: ignore invalid modification of saAmfSGMaxActiveSIsperSU/saAmfSGMaxStandbySIsperSU [#1361]

    changeset: 6591:05d5ba64ae8a
    branch: opensaf-4.6.x
    parent: 6588:21730a950421
    user: Nagendra Kumar <nagendra.k@oracle.com>
    date: Wed May 27 12:32:04 2015 +0530
    summary: amfd: ignore invalid modification of saAmfSGMaxActiveSIsperSU/saAmfSGMaxStandbySIsperSU [#1361]

    changeset: 6592:17406d1e43d3
    tag: tip
    parent: 6589:d719ade2b028
    user: Nagendra Kumar <nagendra.k@oracle.com>
    date: Wed May 27 12:32:12 2015 +0530
    summary: amfd: ignore invalid modification of saAmfSGMaxActiveSIsperSU/saAmfSGMaxStandbySIsperSU [#1361]

    [staging:d775d8]
    [staging:05d5ba]
    [staging:17406d]

     
