From: Jon M. <jon...@er...> - 2019-11-13 12:02:19
Hi Hoang,

This is good, but you have missed the point about the synchronization problem I have been talking about.

1) A new node comes up
2) The "bulk" binding table update is sent, as a series of packets over the new unicast link. This may take some time.
3) The owner of one of the bindings in the bulk (on this node) does unbind.
4) This is sent as broadcast withdraw to all nodes, and arrives before the last packets of the unicast bulk to the newly connected node.
5) Since there is no corresponding publication in the peer node's binding table yet, the withdraw is ignored.
6) The last bulk unicasts arrive at the new peer, and the now invalid publication is added to its binding table.
7) This publication will stay there forever.

We need to find a way to synchronize so that we know that all the bulk publications are in place in the binding table before any broadcast publications/withdraws can be accepted.
Obviously, we could create a backlog queue in the name table, but I hope we can find a simpler and neater solution.

Regards
///jon

> -----Original Message-----
> From: Hoang Le <hoa...@de...>
> Sent: 13-Nov-19 02:35
> To: Jon Maloy <jon...@er...>; ma...@do...; tip...@li...
> Subject: [net-next] tipc: update a binding service via broadcast
>
> Currently, updating binding table (add service binding to
> name table/withdraw a service binding) is being sent over replicast.
> However, if we are scaling up clusters to > 100 nodes/containers this
> method is less affection because of looping through nodes in a cluster one
> by one.
>
> It is worth to use broadcast to update a binding service. Then binding
> table updates in all nodes for one shot.
>
> The mechanism is backward compatible because of sending side changing.
>
> Signed-off-by: Hoang Le <hoa...@de...>
> ---
>  net/tipc/bcast.c      | 13 +++++++++++++
>  net/tipc/bcast.h      |  2 ++
>  net/tipc/name_table.c |  4 ++--
>  3 files changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
> index f41096a759fa..18431fa897ab 100644
> --- a/net/tipc/bcast.c
> +++ b/net/tipc/bcast.c
> @@ -843,3 +843,16 @@ void tipc_mcast_filter_msg(struct net *net, struct sk_buff_head *defq,
>  		__skb_queue_tail(inputq, _skb);
>  	}
>  }
> +
> +int tipc_bcast_named_publish(struct net *net, struct sk_buff *skb)
> +{
> +	struct sk_buff_head xmitq;
> +	u16 cong_link_cnt;
> +	int rc = 0;
> +
> +	__skb_queue_head_init(&xmitq);
> +	__skb_queue_tail(&xmitq, skb);
> +	rc = tipc_bcast_xmit(net, &xmitq, &cong_link_cnt);
> +	__skb_queue_purge(&xmitq);
> +	return rc;
> +}
> diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h
> index dadad953e2be..a100da3800fc 100644
> --- a/net/tipc/bcast.h
> +++ b/net/tipc/bcast.h
> @@ -101,6 +101,8 @@ int tipc_bclink_reset_stats(struct net *net);
>  u32 tipc_bcast_get_broadcast_mode(struct net *net);
>  u32 tipc_bcast_get_broadcast_ratio(struct net *net);
>
> +int tipc_bcast_named_publish(struct net *net, struct sk_buff *skb);
> +
>  void tipc_mcast_filter_msg(struct net *net, struct sk_buff_head *defq,
>  			   struct sk_buff_head *inputq);
>
> diff --git a/net/tipc/name_table.c b/net/tipc/name_table.c
> index 66a65c2cdb23..9e9c61f7c999 100644
> --- a/net/tipc/name_table.c
> +++ b/net/tipc/name_table.c
> @@ -633,7 +633,7 @@ struct publication *tipc_nametbl_publish(struct net *net, u32 type, u32 lower,
>  	spin_unlock_bh(&tn->nametbl_lock);
>
>  	if (skb)
> -		tipc_node_broadcast(net, skb);
> +		tipc_bcast_named_publish(net, skb);
>  	return p;
>  }
>
> @@ -664,7 +664,7 @@ int tipc_nametbl_withdraw(struct net *net, u32 type, u32 lower,
>  	spin_unlock_bh(&tn->nametbl_lock);
>
>  	if (skb) {
> -		tipc_node_broadcast(net, skb);
> +		tipc_bcast_named_publish(net, skb);
>  		return 1;
>  	}
>  	return 0;
> --
> 2.20.1
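
For reference, the race in steps 1) to 7) at the top of this mail can be reproduced with a small user-space model. This is only a sketch and not TIPC code: the message kinds, `struct peer`, the `BULK_DONE` end-of-bulk marker and `stale_after_race()` are all hypothetical names invented for the illustration, and the backlog variant is just the "obvious" queueing approach mentioned above, not a proposed final design.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ENTRIES 16

/* Hypothetical message kinds; BULK_DONE is an assumed end-of-bulk marker. */
enum msg_kind { BULK_PUBLISH, BULK_DONE, BCAST_PUBLISH, BCAST_WITHDRAW };

struct msg {
	enum msg_kind kind;
	unsigned int key;	/* stands in for a publication */
};

/* Toy model of the newly connected peer's binding table. */
struct peer {
	unsigned int table[MAX_ENTRIES];
	size_t n_table;
	struct msg backlog[MAX_ENTRIES];
	size_t n_backlog;
	bool bulk_done;
	bool use_backlog;
};

static bool table_has(const struct peer *p, unsigned int key)
{
	for (size_t i = 0; i < p->n_table; i++)
		if (p->table[i] == key)
			return true;
	return false;
}

/* Apply a publish or withdraw directly to the table. */
static void apply(struct peer *p, struct msg m)
{
	if (m.kind == BCAST_WITHDRAW) {
		for (size_t i = 0; i < p->n_table; i++) {
			if (p->table[i] == m.key) {
				p->table[i] = p->table[--p->n_table];
				return;
			}
		}
		return;	/* step 5: no matching publication, withdraw ignored */
	}
	if (!table_has(p, m.key)) {
		assert(p->n_table < MAX_ENTRIES);
		p->table[p->n_table++] = m.key;
	}
}

static void peer_rcv(struct peer *p, struct msg m)
{
	bool is_bcast = m.kind == BCAST_PUBLISH || m.kind == BCAST_WITHDRAW;

	if (m.kind == BULK_DONE) {
		/* Bulk is complete: replay deferred broadcasts in order. */
		p->bulk_done = true;
		for (size_t i = 0; i < p->n_backlog; i++)
			apply(p, p->backlog[i]);
		p->n_backlog = 0;
		return;
	}
	if (p->use_backlog && is_bcast && !p->bulk_done) {
		assert(p->n_backlog < MAX_ENTRIES);
		p->backlog[p->n_backlog++] = m;	/* defer until bulk is in place */
		return;
	}
	apply(p, m);
}

/* Returns true if the withdrawn publication (key 2) is still in the table. */
bool stale_after_race(bool use_backlog)
{
	struct peer p = { .use_backlog = use_backlog };
	const struct msg arrival[] = {
		{ BULK_PUBLISH, 1 },
		{ BCAST_WITHDRAW, 2 },	/* step 4: overtakes the bulk entry */
		{ BULK_PUBLISH, 2 },	/* step 6: invalid publication arrives */
		{ BULK_DONE, 0 },
	};

	for (size_t i = 0; i < sizeof(arrival) / sizeof(arrival[0]); i++)
		peer_rcv(&p, arrival[i]);
	return table_has(&p, 2);
}
```

In this model, `stale_after_race(false)` leaves the withdrawn publication in the table, while `stale_after_race(true)` removes it once the bulk has completed. The cost is the extra queue and the need for some reliable end-of-bulk indication, which is why a simpler solution would be preferable.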