
#3330 imm: immnd crashes due to NFS state unsynced

5.23.03
fixed
None
defect
imm
nd
5.23.03
minor
False
2023-02-21
2023-02-13
No

The CCB state on some nodes can become unsynced when nodes lose connection with each other. The root cause is that the notification of FS availability is not sent to all nodes while they are disconnected (e.g. split-brain, …). Because the nodes receive different information, CCB processing behaves differently on each node.
Steps to reproduce


  • Split cluster into partitions or power off some nodes
  • Use cmd: "immadm -o 400 safRdn=immManagement,safApp=safImmService" on a partition or alive nodes to notify IMM "FS unavailable". Only nodes maintaining connection with each other will receive this notification.
  • Merge network/Power on nodes.
  • Change an attribute's value on an object (the object must be associated with an implementer).
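
The steps above can be modeled with a small toy simulation. This is only a sketch, not OpenSAF code: the Node class, the node names (SC-1, SC-2, PL-3), and the rule that a node skips the CCB apply when it believes the FS is unavailable are assumptions made for illustration, the last one based on the wording of the fix commit for this ticket.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Toy stand-in for one IMMND instance (hypothetical, not the real data structures)."""
    name: str
    fs_available: bool = True   # this node's local view of FS availability
    partition: int = 0          # network partition the node currently belongs to

    def apply_ccb(self) -> str:
        # Assumption: a node that believes the FS is unavailable does not run the apply.
        return "applied" if self.fs_available else "skipped"


def notify_fs_unavailable(nodes, partition):
    """Deliver 'FS unavailable' only to nodes reachable inside the given partition."""
    for node in nodes:
        if node.partition == partition:
            node.fs_available = False


def merge(nodes):
    """Heal the network. No FS state is exchanged here, which models the defect."""
    for node in nodes:
        node.partition = 0


def ccb_results(nodes):
    """Collect each node's outcome for the same CCB apply."""
    return {node.name: node.apply_ccb() for node in nodes}


# Step 1: PL-3 is cut off from SC-1 and SC-2.
nodes = [Node("SC-1"), Node("SC-2"), Node("PL-3", partition=1)]
# Step 2: the notification reaches partition 0 only.
notify_fs_unavailable(nodes, partition=0)
# Step 3: the partitions are merged again.
merge(nodes)
# Step 4: a CCB is applied on every node.
results = ccb_results(nodes)
consistent = len(set(results.values())) == 1
print(results, "consistent:", consistent)
```

In this model the nodes disagree on the outcome of the same CCB after the merge, which corresponds to the mismatched state described in this ticket.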

Related

Wiki: ChangeLog-5.23.03

Discussion

  • PhanTranQuocDat

    PhanTranQuocDat - 2023-02-13
    • summary: imm: Ccb state unsynced after split-brain --> imm: immnd crashes due to NFS state unsynced
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -1,4 +1,4 @@
    -There's a case when ccb state from some nodes un-synced after split-brain. The root cause is due to notification of FS availability is not sent to all nodes during disconnection among nodes(e.g. Split-brain, …). This makes ccb processing in each nodes behave differently as they received different information.
    +There's a case when ccb state from some nodes un-synced during nodes losing connection with each other. The root cause is due to notification of FS availability is not sent to all nodes during disconnection among nodes(e.g. Split-brain, …). This makes ccb processing in each nodes behave differently as they received different information.
     Steps to reproduce
     ------------------
     - Split cluster into partitions or power off some nodes
    
     
  • PhanTranQuocDat

    PhanTranQuocDat - 2023-02-21
    • status: assigned --> fixed
     
  • PhanTranQuocDat

    PhanTranQuocDat - 2023-02-21
     
  • PhanTranQuocDat

    PhanTranQuocDat - 2023-02-21

    commit bb193b9154084ee57164e3783f7499fbc1443a9a (HEAD -> develop, origin/develop)
    Author: dat.tq.phan <dat.tq.phan@dektech.com.au>
    Date: Mon Feb 13 13:41:03 2023 +0700

    imm: Make NFS state consistent among nodes [#3330]
    
    There is a case that the ccb apply not run
    on some nodes due to NFS unavailable. Then
    it causes the mismatch state in complete ack.
    
    Solution is to make NFS state consistent among nodes
    Also this commit combines three similar function into one.
    
     
