#2794 imm: immnd coredump during scale-in on large configuration

5.18.04
fixed
nobody
None
defect
imm
nd
major
False
2018-03-15
2018-03-06
No

IMMND coredump is generated (see the attached full bt) during execution of the following test scenario:

1) System is up on 2+73 configuration
2) Perform Scale-In of 10 nodes
3) During Scale-In operation perform reboot of cluster

Syslog:

Feb 22 05:50:11 PL-53 osafimmnd[15930]: ER MESSAGE:65229 OUT OF ORDER my highest processed:64720 - exiting
...
Feb 22 05:50:11 PL-53 osafimmnd[19962]: Started
Feb 22 05:50:11 PL-53 osafimmnd[19962]: NO Persistent Back-End capability configured, Pbe file:imm.db (suffix may get added)
Feb 22 05:50:11 PL-53 osafimmnd[19962]: NO IMMD service is UP ... ScAbsenseAllowed?:0 introduced?:0
Feb 22 05:50:11 PL-53 osafimmnd[19962]: NO Fevs count adjusted to 65246 preLoadPid: 0
Feb 22 05:50:11 PL-53 osafimmnd[19962]: NO SERVER STATE: IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
Feb 22 05:50:11 PL-53 osafimmnd[19962]: src/imm/immnd/immnd_evt.c:10731: immnd_evt_proc_fevs_rcv: Assertion '!reply_dest || (reply_dest == cb->immnd_mdest_id) || isObjSync' failed.
1 Attachments

Related

Wiki: ChangeLog-5.18.04

Discussion

  • Vu Minh Nguyen

    Vu Minh Nguyen - 2018-03-06

    When the IMMND process restarts after OUT OF ORDER detection, a new MDS destination is allocated (cb->immnd_mdest_id=621291006108467 [0x2350f95b8a333]), which differs from the destination of the previous, now-dead IMMND process, pid 15930 (reply_dest=621291133173878 [0x2350f9d4b8076]).

    IMMND should discard messages coming from the same node but from a different IMMND mdest.

    /Vu
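
    The check described above can be sketched in C as follows. This is only an illustration, not the code from the actual patch: `mds_dest_t`, `node_id_of`, `fevs_origin_t` and `classify_reply_dest` are hypothetical names, and the sketch assumes that the node id occupies the upper 32 bits of an MDS destination. That assumption is consistent with the two values quoted in this ticket, which share the prefix 0x2350f. The `isObjSync` exception from the failing assertion is left out for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for OpenSAF's 64-bit MDS destination type. */
typedef uint64_t mds_dest_t;

/* Assumption: the node id is carried in the upper 32 bits of the destination. */
static uint32_t node_id_of(mds_dest_t dest)
{
    return (uint32_t)(dest >> 32);
}

/* Hypothetical classification of a fevs message by its reply destination. */
typedef enum {
    ORIGIN_NONE,        /* reply_dest is 0: no reply destination set          */
    ORIGIN_SELF,        /* reply_dest is this IMMND instance                  */
    ORIGIN_STALE_LOCAL, /* same node, different mdest: a dead local IMMND     */
    ORIGIN_REMOTE       /* reply_dest belongs to an IMMND on another node     */
} fevs_origin_t;

static fevs_origin_t classify_reply_dest(mds_dest_t reply_dest,
                                         mds_dest_t my_dest)
{
    if (reply_dest == 0)
        return ORIGIN_NONE;
    if (reply_dest == my_dest)
        return ORIGIN_SELF;
    /* Same node but another mdest: the sender is a previous incarnation
     * of the local IMMND. Such a message should be logged and discarded
     * rather than tripping an assertion. */
    if (node_id_of(reply_dest) == node_id_of(my_dest))
        return ORIGIN_STALE_LOCAL;
    return ORIGIN_REMOTE;
}
```

    With the values from this ticket, `classify_reply_dest(0x2350f9d4b8076, 0x2350f95b8a333)` yields `ORIGIN_STALE_LOCAL`, the case in which the message would be dropped and reported to syslog instead of aborting the process.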

     
  • Vu Minh Nguyen

    Vu Minh Nguyen - 2018-03-12
    • status: assigned --> review
     
  • Vu Minh Nguyen

    Vu Minh Nguyen - 2018-03-15
    • status: review --> fixed
    • assigned_to: Vu Minh Nguyen --> nobody
     
  • Vu Minh Nguyen

    Vu Minh Nguyen - 2018-03-15

    commit aa492f8c2a014977c3b821ed4027c3cb20263c73 (HEAD, origin/develop, ticket-2794, develop)
    Author: Vu Minh Nguyen vu.m.nguyen@dektech.com.au
    Date: Thu Mar 15 13:32:07 2018 +0700

    imm: coredump during scale-in on large configuration [#2794]
    
    When IMMND restarts (e.g: OUT OF ORDER detection), it may get message
    from active IMMD which is originated from just-dead IMMND process.
    In such case, we are in confused situation - messages come from
    local IMMND, but not me (reply_dest != cb->immnd_mdest_id)!
    
    This patch discards such messages, notify the case to syslog
    instead of aborting the IMMND progress.
    

    commit 3bb09f47b2bfeb628cc2e53fa821ffbf69c864cc (HEAD, origin/release, release)
    Author: Vu Minh Nguyen vu.m.nguyen@dektech.com.au
    Date: Thu Mar 15 13:32:07 2018 +0700

    imm: coredump during scale-in on large configuration [#2794]
    
    When IMMND restarts (e.g: OUT OF ORDER detection), it may get message
    from active IMMD which is originated from just-dead IMMND process.
    In such case, we are in confused situation - messages come from
    local IMMND, but not me (reply_dest != cb->immnd_mdest_id)!
    
    This patch discards such messages, notify the case to syslog
    instead of aborting the IMMND progress.
    
     
