#2522 dtm: if TCP_USER_TIMEOUT closes socket, no attempt is made to reconnect

5.17.11
fixed
None
defect
dtm
d
5.1
major
False
2017-10-30
2017-07-06
Alex Jones
No

If TCP is used for transport and TCP_USER_TIMEOUT is also set, a node that leaves the cluster because of a brief network outage does not come back into the cluster automatically.

For example, if TCP_USER_TIMEOUT is set to 1500 ms and the network outage on the link lasts 2000 ms, the node never comes back into the cluster.
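
For context, TCP_USER_TIMEOUT is a Linux socket option, given in milliseconds, that limits how long transmitted data may remain unacknowledged before the kernel closes the connection. A minimal sketch of setting and reading it back, assuming a plain TCP socket; the helper names are hypothetical and this is not DTM's actual code:

```c
#include <netinet/in.h>
#include <netinet/tcp.h> /* TCP_USER_TIMEOUT (Linux-specific) */
#include <sys/socket.h>

/*
 * Hypothetical helper: set TCP_USER_TIMEOUT on a TCP socket.
 * timeout_ms is in milliseconds; returns 0 on success, -1 on error (errno set).
 */
int set_tcp_user_timeout(int fd, unsigned int timeout_ms)
{
	return setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
			  &timeout_ms, sizeof(timeout_ms));
}

/*
 * Hypothetical helper: read the current TCP_USER_TIMEOUT into *timeout_ms.
 * Returns 0 on success, -1 on error (errno set).
 */
int get_tcp_user_timeout(int fd, unsigned int *timeout_ms)
{
	socklen_t len = sizeof(*timeout_ms);

	return getsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT, timeout_ms, &len);
}
```

With a value of 1500, an outage of 2000 ms on a link with unacknowledged data is long enough for the kernel to drop the connection; the application then sees ETIMEDOUT on the socket and has to establish a new connection itself.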

Related

Tickets: #2551
Tickets: #2562
Wiki: ChangeLog-5.17.11

Discussion

  • Anders Widell - 2017-07-20

    I believe DTM sends broadcast (or multicast) messages on the network for a while after it has started, to discover other nodes on the network. It then stops doing so, and that is the reason why it fails to reconnect after a network disturbance.

    A solution could be:
    * The node with the lowest node_id will never stop broadcasting the discovery messages.
    * A node which is connected with another node with a lower node_id will never broadcast discovery messages.
    * The node with the lowest node_id will inform all the other connected nodes about the topology of the cluster - in particular, if a new node has appeared.
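
The first two rules of this proposal can be sketched as a single check. This is only an illustration of the idea, not DTM code; the function name and the array of connected peer IDs are hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical illustration of the proposed rule: a node keeps
 * broadcasting discovery messages only while none of the nodes it is
 * currently connected to has a lower node_id than its own.
 */
bool should_broadcast_discovery(uint32_t own_id, const uint32_t *peer_ids,
				size_t num_peers)
{
	for (size_t i = 0; i < num_peers; i++) {
		if (peer_ids[i] < own_id)
			return false; /* a lower node_id is connected */
	}
	return true; /* lowest node_id among the nodes it can see */
}
```

Under this rule, a node that loses all of its connections has no connected peers, so it resumes broadcasting and can be rediscovered once the network recovers.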

     
  • A V Mahesh (AVM)

    I don't think Alex is talking about the initial discovery process (topology node discovery),
    but in any case we can configure a very large value for DTM_INI_DIS_TIMEOUT_SECS in dtm.conf to verify.

     
  • Alex Jones - 2017-08-09

    If I set DTM_INI_DIS_TIMEOUT_SECS to 5000 seconds, the nodes do relearn each other and come back into the cluster.

     
  • Alex Jones - 2017-08-11
    • status: unassigned --> accepted
    • assigned_to: Alex Jones
    • Part: - --> d
     
  • Alex Jones - 2017-08-11
    • status: accepted --> review
     
  • Alex Jones - 2017-08-15
    • status: review --> fixed
     
  • Alex Jones - 2017-08-15

    commit 3ac6c452d30d2814f1704af578617f2a90f439b7
    Author: Alex Jones <alex.jones@genband.com>
    Date: Tue Aug 15 11:36:41 2017 -0400

     
