Menu

#2469 amf: Stop tracking api returns NOT_EXIST

5.17.07
fixed
nobody
None
defect
amf
d
major
False
2017-07-27
2017-05-29
No

When performing switchover, AMFD fails to stop CLM track callback with error code 12 (NOT_EXIST)

syslog:

2017-05-26 10:19:02 SC-1 osafamfd[268]: NO Controller switch over initiated
2017-05-26 10:19:02 SC-1 osafamfd[268]: NO ROLE SWITCH Active --> Quiesced
2017-05-26 10:19:02 SC-1 osafimmnd[205]: NO Implementer (applier) connected: 40 (@OpenSafImmReplicatorB) <343, 2010f>
2017-05-26 10:19:02 SC-1 osafntfimcnd[626]: NO Started
2017-05-26 10:19:12 SC-1 osafamfd[268]: WA Failed to stop cluster tracking 5
2017-05-26 10:19:12 SC-1 osafimmnd[205]: NO Implementer disconnected 32 <27, 2010f> (safAmfService)
2017-05-26 10:19:12 SC-1 osafimmnd[205]: NO Implementer (applier) connected: 41 (@safAmfService2010f) <27, 2010f>
2017-05-26 10:19:12 SC-1 osafamfnd[283]: NO AVD NEW_ACTIVE, adest:1
2017-05-26 10:19:12 SC-1 osafimmnd[205]: NO Implementer disconnected 31 <0, 2020f> (@safAmfService2020f)
2017-05-26 10:19:12 SC-1 osafimmnd[205]: NO Implementer connected: 42 (safAmfService) <0, 2020f>
2017-05-26 10:19:12 SC-1 osafamfd[268]: NO Switching Quiesced --> StandBy
2017-05-26 10:19:13 SC-1 osafamfd[268]: ER Failed to stop cluster tracking 12
2017-05-26 10:19:13 SC-1 osafamfd[268]: ER Failed to stop cluster tracking after switch over
2017-05-26 10:19:13 SC-1 osafamfd[268]: NO Controller switch over done

CLM trace:

May 26 10:19:13.173369 osafclmd [240:240:src/clm/clmd/clms_evt.c:1347] >> proc_track_stop_msg
May 26 10:19:13.173374 osafclmd [240:240:src/clm/clmd/clms_util.c:0126] >> clms_node_get_by_id
May 26 10:19:13.173379 osafclmd [240:240:src/clm/clmd/clms_util.c:0137] TR Node found 131343
May 26 10:19:13.173383 osafclmd [240:240:src/clm/clmd/clms_util.c:0140] << clms_node_get_by_id
May 26 10:19:13.173388 osafclmd [240:240:src/clm/clmd/clms_evt.c:1350] TR Node id = 131343
May 26 10:19:13.173393 osafclmd [240:240:src/clm/clmd/clms_mds.c:1553] >> clms_mds_msg_send
May 26 10:19:13.173448 osafclmd [240:240:src/clm/clmd/clms_mds.c:1587] << clms_mds_msg_send
May 26 10:19:13.173457 osafclmd [240:240:src/clm/clmd/clms_mbcsv.c:0810] >> clms_send_async_update
May 26 10:19:13.173462 osafclmd [240:240:src/mbc/mbcsv_api.c:0798] >> mbcsv_process_snd_ckpt_request: Sending checkpoint data to all STANDBY peers, as per the send-type specified
May 26 10:19:13.173504 osafclmd [240:240:src/mbc/mbcsv_api.c:0830] TR svc_id:48, pwe_hdl:65552
May 26 10:19:13.173509 osafclmd [240:240:src/mbc/mbcsv_util.c:0363] >> mbcsv_send_ckpt_data_to_all_peers
May 26 10:19:13.173593 osafclmd [240:240:src/mbc/mbcsv_util.c:0411] TR dispatching FSM for NCSMBCSV_SEND_ASYNC_UPDATE
May 26 10:19:13.173599 osafclmd [240:240:src/mbc/mbcsv_act.c:0103] TR ASYNC update to be sent. role: 1, svc_id: 48, pwe_hdl: 65552
May 26 10:19:13.173604 osafclmd [240:240:src/mbc/mbcsv_util.c:0424] TR calling encode callback
May 26 10:19:13.173610 osafclmd [240:240:src/clm/clmd/clms_mbcsv.c:0740] >> mbcsv_callback
May 26 10:19:13.173615 osafclmd [240:240:src/clm/clmd/clms_mbcsv.c:0856] >> ckpt_encode_cbk_handler
May 26 10:19:13.173626 osafclmd [240:240:src/clm/clmd/clms_mbcsv.c:0867] TR cbk_arg->info.encode.io_msg_type type 1
May 26 10:19:13.173632 osafclmd [240:240:src/clm/clmd/clms_mbcsv.c:1307] >> ckpt_encode_async_update
May 26 10:19:13.173637 osafclmd [240:240:src/clm/clmd/clms_mbcsv.c:1324] TR data->header.type 3
May 26 10:19:13.173641 osafclmd [240:240:src/clm/clmd/clms_mbcsv.c:1362] TR Async update CLMS_CKPT_TRACK_START
May 26 10:19:13.173646 osafclmd [240:240:src/clm/clmd/clms_mbcsv.c:1701] >> enc_mbcsv_track_changes_msg
May 26 10:19:13.173650 osafclmd [240:240:src/clm/clmd/clms_mbcsv.c:1714] << enc_mbcsv_track_changes_msg
May 26 10:19:13.173654 osafclmd [240:240:src/clm/clmd/clms_mbcsv.c:1515] << ckpt_encode_async_update
May 26 10:19:13.173658 osafclmd [240:240:src/clm/clmd/clms_mbcsv.c:0910] << ckpt_encode_cbk_handler
May 26 10:19:13.173663 osafclmd [240:240:src/clm/clmd/clms_mbcsv.c:0780] << mbcsv_callback
May 26 10:19:13.173667 osafclmd [240:240:src/mbc/mbcsv_util.c:0469] TR send the encoded message to any other peer with same s/w version
May 26 10:19:13.173671 osafclmd [240:240:src/mbc/mbcsv_util.c:0472] TR dispatching FSM for NCSMBCSV_SEND_ASYNC_UPDATE
May 26 10:19:13.173675 osafclmd [240:240:src/mbc/mbcsv_act.c:0103] TR ASYNC update to be sent. role: 1, svc_id: 48, pwe_hdl: 65552
May 26 10:19:13.173680 osafclmd [240:240:src/mbc/mbcsv_mds.c:0197] >> mbcsv_mds_send_msg: sending to vdest:10
May 26 10:19:13.173684 osafclmd [240:240:src/mbc/mbcsv_mds.c:0218] TR send type MDS_SENDTYPE_REDRSP:
May 26 10:19:13.174115 osafclmd [240:243:src/mds/mds_dt_trans.c:0755] >> mdtm_process_poll_recv_data_tcp
May 26 10:19:13.174192 osafclmd [240:240:src/mbc/mbcsv_mds.c:0252] << mbcsv_mds_send_msg: success
May 26 10:19:13.174205 osafclmd [240:240:src/mbc/mbcsv_util.c:0526] << mbcsv_send_ckpt_data_to_all_peers
May 26 10:19:13.174211 osafclmd [240:240:src/mbc/mbcsv_api.c:0895] << mbcsv_process_snd_ckpt_request: retval: 1
May 26 10:19:13.174216 osafclmd [240:240:src/clm/clmd/clms_mbcsv.c:0834] << clms_send_async_update
May 26 10:19:13.174221 osafclmd [240:240:src/clm/clmd/clms_evt.c:1405] << proc_track_stop_msg

May 26 10:19:13.175405 osafclmd [240:240:src/clm/clmd/clms_evt.c:1347] >> proc_track_stop_msg
May 26 10:19:13.175409 osafclmd [240:240:src/clm/clmd/clms_util.c:0126] >> clms_node_get_by_id
May 26 10:19:13.175413 osafclmd [240:240:src/clm/clmd/clms_util.c:0137] TR Node found 131343
May 26 10:19:13.175418 osafclmd [240:240:src/clm/clmd/clms_util.c:0140] << clms_node_get_by_id
May 26 10:19:13.175422 osafclmd [240:240:src/clm/clmd/clms_evt.c:1350] TR Node id = 131343
May 26 10:19:13.175436 osafclmd [240:240:src/clm/clmd/clms_evt.c:1369] IN Client 11 didn't subscribe for track start
May 26 10:19:13.175441 osafclmd [240:240:src/clm/clmd/clms_mds.c:1553] >> clms_mds_msg_send
May 26 10:19:13.175493 osafclmd [240:240:src/clm/clmd/clms_mds.c:1587] << clms_mds_msg_send
May 26 10:19:13.175502 osafclmd [240:240:src/clm/clmd/clms_evt.c:1405] << proc_track_stop_msg

Related

Wiki: ChangeLog-5.17.07

Discussion

  • Praveen

    Praveen - 2017-05-29
    • status: unassigned --> assigned
    • assigned_to: Praveen
    • Blocker: False --> True
     
  • Praveen

    Praveen - 2017-06-07

    It seems TrackStop() request came to CLMS and it executed it. But the client which is AMFD received ERR_TIMEOUT:
    2017-05-26 10:19:12 SC-1 osafamfd[268]: WA Failed to stop cluster tracking 5

    Now same AMFD after becoming standby tries to do TrackStop() again. It will surely get ERR_NOT_EXIST as the tracking was stopped.
    AMFD should finalize the handle when its gets ERR_TIMEOUT.

     
  • Minh Hon Chau

    Minh Hon Chau - 2017-06-14

    Hi Praveen,

    In CLM spec, ERR_TIMEOUT is return from TrackStop() that means the call whether can succeed or not, so at this moment AMFD tries to stop again. Can AMFD treat ERR_NOT_EXIST as a successful TrackStop()? So AMFD can ignore the ERR_NOT_EXIST?

    thanks,
    Minh

     
  • Praveen

    Praveen - 2017-06-16

    Hi Minh,

    Generally for ERR_TIMEOUT case it is recommended to finalize that handle because if the API was called for creating some resource then user does not know whether resource was created or not.
    Since present case is for stoping the the tracking, it may work.

    Thanks
    Praveen

     
  • Praveen

    Praveen - 2017-06-19
    • summary: clm: Stop tracking api returns NOT_EXIST --> amf: Stop tracking api returns NOT_EXIST
    • status: assigned --> unassigned
    • Component: clm --> amf
    • Blocker: True --> False
     
  • Minh Hon Chau

    Minh Hon Chau - 2017-06-19
    • status: unassigned --> accepted
    • assigned_to: Praveen --> Minh Hon Chau
     
  • Minh Hon Chau

    Minh Hon Chau - 2017-06-26
    • status: accepted --> review
     
  • Minh Hon Chau

    Minh Hon Chau - 2017-06-28
    • status: review --> fixed
    • assigned_to: Minh Hon Chau --> nobody
     
  • Anders Widell

    Anders Widell - 2017-07-01
    • Milestone: 5.17.06 --> 5.17.08
     

Log in to post a comment.