
#2418 imm: Info of dead IMMND remains in standby IMMD

5.17.07
fixed
None
defect
imm
d
major
False
2017-07-27
2017-04-10
Hung Nguyen
No

When the Standby IMMD comes up at the same time as an IMMND is exiting, the info of that IMMND might not be removed from the immnd_tree of the Standby IMMD.

The details of the problem are explained in the attached sequence diagram.
[Image: sequence diagram]

SC-5 was Active, SC-2 was Standby, IMMND on SC-1 was exiting

18:35:03 SC-1 osafimmnd[441]: exiting for shutdown

18:35:03 SC-2 osafrded[413]: NO RDE role set to STANDBY
18:35:03 SC-2 osafimmd[430]: NO MDS event from svc_id 25 (change:3, dest:568511936070075)
18:35:03 SC-2 osafimmd[430]: NO MDS event from svc_id 25 (change:3, dest:567412424442298)
18:35:03 SC-2 osafimmd[430]: NO MDS event from svc_id 25 (change:3, dest:566312912814523)
18:35:03 SC-2 osafimmd[430]: NO MDS event from svc_id 25 (change:3, dest:565213401186744)

18:35:03 SC-5 osafimmd[433]: NO MDS event from svc_id 25 (change:4, dest:564113889558969)

Down event for IMMND@SC-1 was received on SC-5 but not on SC-2.
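The mismatch can be checked mechanically from the syslog excerpt above. The following is a minimal sketch, not part of OpenSAF: the helper names `events_by_node` and `missed_down_events` are hypothetical, and it assumes that `change:4` marks a down event (as the ticket states for SC-5) and that any other `change` value simply means the node has seen that destination.

```python
import re
from collections import defaultdict

# Log lines copied from the ticket description.
LOG = """\
18:35:03 SC-2 osafimmd[430]: NO MDS event from svc_id 25 (change:3, dest:568511936070075)
18:35:03 SC-2 osafimmd[430]: NO MDS event from svc_id 25 (change:3, dest:567412424442298)
18:35:03 SC-2 osafimmd[430]: NO MDS event from svc_id 25 (change:3, dest:566312912814523)
18:35:03 SC-2 osafimmd[430]: NO MDS event from svc_id 25 (change:3, dest:565213401186744)
18:35:03 SC-5 osafimmd[433]: NO MDS event from svc_id 25 (change:4, dest:564113889558969)
"""

EVENT = re.compile(
    r"^\S+ (?P<node>SC-\d+) osafimmd\[\d+\]: NO MDS event from svc_id 25 "
    r"\(change:(?P<change>\d+), dest:(?P<dest>\d+)\)$"
)


def events_by_node(log):
    """Map node name -> {dest: last change value seen for that dest}."""
    seen = defaultdict(dict)
    for line in log.splitlines():
        match = EVENT.match(line.strip())
        if match:
            seen[match["node"]][int(match["dest"])] = int(match["change"])
    return seen


def missed_down_events(log, active, standby, down_change=4):
    """Dests that went down on `active` but never appear in `standby`'s events.

    Assumption: `down_change` (4) is the value logged for a down event.
    """
    seen = events_by_node(log)
    downs = {d for d, c in seen[active].items() if c == down_change}
    return sorted(downs - set(seen[standby]))


if __name__ == "__main__":
    print(missed_down_events(LOG, active="SC-5", standby="SC-2"))
```

Run against the excerpt, the only destination reported is the one whose down event reached SC-5 alone, which matches the observation above.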



The symptoms:

  1. If the down IMMND is the coordinator, then when that Standby IMMD becomes Active, it fails to elect a new coordinator because there is already a coordinator in the immnd_tree.
18:35:11 SC-2 osafimmd[430]: WA IMMND coordinator at 2050f apparently crashed => electing new coord

No more logs about newly elected coordinator were printed out.


  2. When IMMND@SC-1 is up again, it will fail to introduce itself to the IMMD because the IMMD already has IMMND@SC-1 in immnd_tree with a wrong epoch.
18:35:29 SC-1 osafimmnd[441]: NO SERVER STATE: IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
18:35:29 SC-1 osafimmnd[441]: NO This IMMND is now the NEW Coord
18:35:29 SC-1 osafimmnd[441]: ER 3 > 0, exiting
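The first symptom can be illustrated with a toy model. This is only a sketch under stated assumptions, not the IMMD code: `ImmndInfo`, `elect_coordinator` and the `is_up` / `is_coord` fields are hypothetical names, and the election rule (do nothing if any entry is already flagged as coordinator) is a simplification of the behaviour described in the ticket.

```python
from dataclasses import dataclass


@dataclass
class ImmndInfo:
    """Hypothetical stand-in for one entry in immnd_tree."""
    node: str
    is_up: bool
    is_coord: bool = False


def elect_coordinator(immnd_tree):
    """Toy election: returns the newly elected node name, or None.

    Assumption: the election is skipped when the tree already holds an
    entry flagged as coordinator, whether or not that IMMND is alive.
    """
    for info in immnd_tree.values():
        if info.is_coord:
            return None  # a coordinator is believed to exist, nothing elected
    for info in immnd_tree.values():
        if info.is_up:
            info.is_coord = True
            return info.node
    return None


def tree_with_stale_coordinator():
    """immnd_tree as the Standby IMMD might hold it after the race."""
    return {
        "SC-1": ImmndInfo("SC-1", is_up=False, is_coord=True),  # dead, never removed
        "SC-2": ImmndInfo("SC-2", is_up=True),
    }


if __name__ == "__main__":
    stale = tree_with_stale_coordinator()
    print(elect_coordinator(stale))   # stale entry blocks the election

    clean = tree_with_stale_coordinator()
    del clean["SC-1"]                 # dead entry removed first
    print(elect_coordinator(clean))   # a live IMMND can now be elected
```

In this model the stale SC-1 entry is enough to make the election return nothing, which corresponds to the missing "newly elected coordinator" log lines.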
1 Attachments

Related

Tickets: #2418
Wiki: ChangeLog-5.17.07

Discussion

  • Zoran Milinkovic

    • status: accepted --> review
     
  • Anders Bjornerstedt

    If the defect only occurs in a headless system, then I think the ticket slogan, or at least the description, should say so.

     
  • Hung Nguyen

    Hung Nguyen - 2017-04-25
    • Blocker: --> False
    • Milestone: 5.0.2 --> 5.17.06
     
  • Hung Nguyen

    Hung Nguyen - 2017-04-25

    5.17.08 (develop) [code:85c90b]

    commit 85c90b4abead8bd66e1f20be3f84255645880597
    Author: Hung Nguyen <hung.d.nguyen@dektech.com.au>
    Date:   Tue Apr 25 13:24:29 2017 +0700
    
        imm: Ignore the sync'ed IMMND nodes that are not up [#2418]
    

    5.17.06 (release) [code:c1a37f]

    commit c1a37fb5032c0e63165bc36e79d5a79be3fd19dd
    Author: Hung Nguyen <hung.d.nguyen@dektech.com.au>
    Date:   Tue Apr 25 13:24:29 2017 +0700
    
        imm: Ignore the sync'ed IMMND nodes that are not up [#2418]
    

    default (mercurial) [staging:dc6067]

    changeset:   8777:dc60670bfd3b
    user:        Hung Nguyen <hung.d.nguyen@dektech.com.au>
    date:        Tue Apr 25 13:40:04 2017 +0700
    summary:     imm: Ignore the sync'ed IMMND nodes that are not up [#2418]
    
     

    Related

    Commit: [dc6067]
    Tickets: #2418
    Commit: [85c90b]
    Commit: [c1a37f]

  • Hung Nguyen

    Hung Nguyen - 2017-04-25
    • status: review --> fixed
     
  • Hung Nguyen

    Hung Nguyen - 2017-05-17
    • status: fixed --> review
     
  • Hung Nguyen

    Hung Nguyen - 2017-05-17

    Re-opening this ticket because the new active IMMD (after switching from the STANDBY role) has a problem with the dead IMMND in the immnd_tree. The dead IMMND should be cleaned up before switching to ACTIVE.
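A minimal sketch of the cleanup idea, assuming the standby keeps track of which IMMND destinations it has actually seen come up. The names `clear_dead_immnds` and `switch_to_active` and the dictionary layout are hypothetical and only illustrate the order of operations; the destination values are taken from the logs in the description.

```python
def clear_dead_immnds(immnd_tree, up_dests):
    """Remove entries whose destination is not known to be up.

    immnd_tree: dict mapping dest -> info (hypothetical layout)
    up_dests:   set of dests for which an up event was received locally
    Returns the list of removed dests.
    """
    dead = [dest for dest in immnd_tree if dest not in up_dests]
    for dest in dead:
        del immnd_tree[dest]
    return dead


def switch_to_active(immnd_tree, up_dests):
    """Toy role switch: purge dead entries first, then report the new role."""
    removed = clear_dead_immnds(immnd_tree, up_dests)
    return "ACTIVE", removed


if __name__ == "__main__":
    # Tree as received through cold sync: five entries, one of them dead.
    tree = {
        568511936070075: {"epoch": 3},
        567412424442298: {"epoch": 3},
        566312912814523: {"epoch": 3},
        565213401186744: {"epoch": 3},
        564113889558969: {"epoch": 3},  # IMMND@SC-1, already exited
    }
    # Dests for which the standby itself saw an up event.
    up = {568511936070075, 567412424442298, 566312912814523, 565213401186744}
    print(switch_to_active(tree, up))
```

The point of the ordering is that the election and introduction logic running in the ACTIVE role only ever sees entries for IMMNDs that are really up.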

     
  • Hung Nguyen

    Hung Nguyen - 2017-05-26
    • status: review --> fixed
     
  • Hung Nguyen

    Hung Nguyen - 2017-05-26

    5.17.08 (develop) [code:ff044b]

    commit ff044b93c3182997cbe9ab318245846c876ecd02
    Author: Hung Nguyen <hung.d.nguyen@dektech.com.au>
    Date:   Mon May 15 14:09:06 2017 +0700
    
        imm: Clear dead IMMND info before switching to ACTIVE role [#2418]
    
        During cold-sync, standby IMMD may receive info of dead IMMND.
        Before switching to active, the IMMD should clear those dead IMMND info.
    

    5.17.06 (release) [code:b6d724]

    commit b6d724a849988ef91dcfad4c0267df7a8ea96e4b
    Author: Hung Nguyen <hung.d.nguyen@dektech.com.au>
    Date:   Mon May 15 14:09:06 2017 +0700
    
        imm: Clear dead IMMND info before switching to ACTIVE role [#2418]
    
        During cold-sync, standby IMMD may receive info of dead IMMND.
        Before switching to active, the IMMD should clear those dead IMMND info.
    
     

    Related

    Commit: [b6d724]
    Commit: [ff044b]

  • Anders Widell

    Anders Widell - 2017-07-01
    • Milestone: 5.17.06 --> 5.17.08
     
