#2400 AMFD: Cached node_up message causes amfnd reboot after node joins cluster

5.2.0
fixed
Gary Lee
None
defect
amf
d
major
2017-04-05
2017-03-29
No

With SC Absence enabled, both SCs are restarted. After every amfnd has sent node_up and joined the cluster, the cluster startup timer expires and amfd starts application assignments. At that point, a retransmitted node_up message that was still cached in the mailbox (or arrived late) causes amfd to order a node reboot:

Mar 20 15:04:46 SC-2 osafamfd[9576]: NO Receive message with event type:12, msg_type:31, from node:2040f, msg_id:0
Mar 20 15:04:46 SC-2 osafamfd[9576]: NO Receive message with event type:12, msg_type:31, from node:2030f, msg_id:0
Mar 20 15:04:46 SC-2 osafamfd[9576]: NO Receive message with event type:13, msg_type:32, from node:2040f, msg_id:0
Mar 20 15:04:46 SC-2 osafamfd[9576]: NO Receive message with event type:13, msg_type:32, from node:2030f, msg_id:0
Mar 20 15:04:46 SC-2 osafamfd[9576]: NO Received node_up_msg from all nodes
Mar 20 15:04:46 SC-2 osafamfd[9576]: NO Received node_up from 2030f: msg_id 1

Mar 20 15:04:46 SC-2 osafamfd[9576]: NO Enter restore headless cached RTAs from IMM
Mar 20 15:04:46 SC-2 osafamfd[9576]: NO Leave reading headless cached RTAs from IMM: SUCCESS
Mar 20 15:04:46 SC-2 osafamfd[9576]: NO Node 'SC-2' joined the cluster

Mar 20 15:04:49 SC-2 osafamfd[9576]: NO Received node_up from 2030f: msg_id 1
Mar 20 15:04:49 SC-2 osafamfd[9576]: NO Node 'PL-3' joined the cluster
Mar 20 15:04:49 SC-2 osafamfd[9576]: NO Received node_up from 2010f: msg_id 1
Mar 20 15:04:49 SC-2 osafamfd[9576]: NO Node 'SC-1' joined the cluster

Mar 20 15:05:00 SC-2 osafamfd[9576]: NO Cluster startup is done

Mar 20 15:05:18 SC-2 osafamfd[9576]: NO Received node_up from 2030f: msg_id 1
Mar 20 15:05:18 SC-2 osafamfd[9576]: WA Sending node reboot order to node:safAmfNode=PL-3,safAmfCluster=myAmfCluster, due to late node_up_msg after cluster startup timeout
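
The symptom in the log above is a node_up with the same node id and msg_id arriving again after "Cluster startup is done". A minimal sketch for spotting this pattern in a syslog excerpt follows. The function name and the regular expression are illustrative assumptions based only on the message format shown in this ticket; they are not part of OpenSAF.

```python
import re

# Matches lines such as "... NO Received node_up from 2030f: msg_id 1".
# The pattern is derived from the log excerpt in this ticket (assumption).
NODE_UP = re.compile(r"Received node_up from (\w+): msg_id (\d+)")
STARTUP_DONE = "Cluster startup is done"


def late_duplicate_node_ups(lines):
    """Return (node_id, msg_id) pairs that were already seen earlier and
    show up again after the cluster startup message."""
    seen = set()
    startup_done = False
    late = []
    for line in lines:
        if STARTUP_DONE in line:
            startup_done = True
            continue
        match = NODE_UP.search(line)
        if match is None:
            continue
        key = (match.group(1), int(match.group(2)))
        if startup_done and key in seen:
            late.append(key)
        seen.add(key)
    return late
```

Run against the excerpt above, this flags the 15:05:18 message from node 2030f, which is the one that precedes the reboot order.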

Related

Tickets: #2400
Wiki: ChangeLog-5.1.1

Discussion

  • Gary Lee

    Gary Lee - 2017-03-29
    • status: unassigned --> accepted
    • assigned_to: Gary Lee
     
  • Gary Lee

    Gary Lee - 2017-04-03
    • status: accepted --> review
     
  • Anders Widell

    Anders Widell - 2017-04-03
    • Milestone: 5.1.1 --> 5.2.0
     
  • Anders Widell

    Anders Widell - 2017-04-03
    • Milestone: 5.2.0 --> next
     
  • Anders Widell

    Anders Widell - 2017-04-03
    • Milestone: next --> 5.2.0
     
  • Gary Lee

    Gary Lee - 2017-04-05

    changeset: 8750:428cb7d8c3cd
    branch: opensaf-5.1.x
    tag: tip
    parent: 8737:f9a5a957c16a
    user: Gary Lee gary.lee@dektech.com.au
    date: Wed Apr 05 15:17:51 2017 +1000
    summary: amfd: ignore node_up if node state is not absent [#2400]

    changeset: 8749:cc3ae4601faf
    user: Gary Lee gary.lee@dektech.com.au
    date: Wed Apr 05 15:15:30 2017 +1000
    summary: amfd: ignore node_up if node state is not absent [#2400]
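
The changeset summary ("ignore node_up if node state is not absent") can be illustrated with a toy model. This is only a sketch under stated assumptions: the names `NodeState`, `Action` and `handle_node_up` are hypothetical and do not reflect the actual amfd code, and the reboot branch is modelled solely on the WA log line in the description.

```python
from enum import Enum, auto


class NodeState(Enum):
    # Hypothetical simplification of the node states tracked by amfd.
    ABSENT = auto()
    PRESENT = auto()


class Action(Enum):
    PROCESS = auto()  # handle node_up normally
    IGNORE = auto()   # drop the duplicate or late message
    REBOOT = auto()   # order a node reboot


def handle_node_up(node_state, cluster_startup_done, apply_fix=True):
    """Decide what to do with an incoming node_up message (toy model)."""
    # Behaviour described by the changeset summary: a node that is not
    # absent has already joined, so a repeated node_up is ignored.
    if apply_fix and node_state is not NodeState.ABSENT:
        return Action.IGNORE
    # Behaviour seen in the log before the fix: a node_up arriving after
    # the cluster startup timeout leads to a reboot order.
    if cluster_startup_done:
        return Action.REBOOT
    return Action.PROCESS
```

In this model, the retransmitted node_up from PL-3 (already joined, cluster startup done) yields `Action.IGNORE` with the fix and `Action.REBOOT` without it.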


  • Gary Lee

    Gary Lee - 2017-04-05
    • status: review --> fixed
     