Thread: [tipc-discussion] [PATCH net-next 0/3] fix broadcast sync issue

Cluster wide IPC providing datagram, connection, and bus messaging

Brought to you by: goranm, johnr64, jonmaloy, ldcanh, and 6 others

tipc-discussion

[tipc-discussion] [PATCH net-next 0/3] fix broadcast sync issue

From: Ying X. <yin...@wi...> - 2012-07-03 07:52:40

Currently there have two major known issues about broadcast link as
belows:

1. There has one risk that name table updates sent over the broadcast
link after the neighbor is discovered will arrive before the initial
transfer of name table entries over the unicast link has been completed.

2. There has another risk that the node may send a broadcast message
after the neighbor is discovered without being certain whether the
neighbor will acknowledge it or not.

To resolve above issues, we introduce a new BCAST_PROTOCOL, and send
that immediately after name table messages have been sent via reliable
unicast link. This message contains the sequence number of the most
recent broadcast message the sending node has sent, telling the
neighbor node when to start accepting broadcast messages and where to
start receiving and acknowledging broadcast messages.

In addition, to keep protocol compatibility backwards, we also involve
a flag bit to indicate whether a node supports the enhanced broadcast
synchronization mechanism or not.

Ying Xue (3):
  tipc: Add a new flag bit to indicate bclink sync protocol is
    supported
  tipc: Keep protocol compatability backwards
  tipc: Involve the enhanced broadcast synchronization mechanism

 net/tipc/bcast.c      |   23 ++++++++++++++++++++++
 net/tipc/bcast.h      |    1 +
 net/tipc/link.c       |   50 ++++++++++++++++++++++++++++++++++++++++++++++++-
 net/tipc/link.h       |    1 +
 net/tipc/msg.h        |   10 +++++++++
 net/tipc/name_distr.c |    1 +
 net/tipc/node.c       |    5 ++-
 net/tipc/node.h       |    2 +
 8 files changed, 90 insertions(+), 3 deletions(-)

[tipc-discussion] [PATCH net-next 2/3] tipc: Keep protocol compatibility backwards

From: Ying X. <yin...@wi...> - 2012-07-03 07:52:45

In previous patch a flag bit has been involved to indicate whether a
node supports the enhanced broadcast synchronization mechanism or not.
But to keep TIPC protocol interoperability between nodes which do not
support the new broadcast synchronization mechanism and nodes which do
support the new mechanism, there must have a mechanism which can ensure
that a node can simply fall back to using the existing broadcast
synchronization mechanism if it recognizes its peer node doesn't support
the new synchronization mechanism.

Signed-off-by: Ying Xue <yin...@wi...>
---
 net/tipc/link.c |    5 ++++-
 net/tipc/node.c |    3 ++-
 net/tipc/node.h |    2 ++
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/tipc/link.c b/net/tipc/link.c
index 45ae706..918c1b0 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1960,6 +1960,7 @@ void tipc_link_send_proto_msg(struct tipc_link *l_ptr, u32 msg_typ,
 	}
 
 	r_flag = (l_ptr->owner->working_links > tipc_link_is_up(l_ptr));
+	msg_set_bclink_sync(msg, 0);
 	msg_set_redundant_link(msg, r_flag);
 	msg_set_linkprio(msg, l_ptr->priority);
 	msg_set_size(msg, msg_size);
@@ -2058,9 +2059,11 @@ static void link_recv_proto_msg(struct tipc_link *l_ptr, struct sk_buff *buf)
 			l_ptr->max_pkt = l_ptr->max_pkt_target;
 		}
 		l_ptr->owner->bclink.supportable = (max_pkt_info != 0);
+		l_ptr->owner->bclink.sync = msg_bclink_sync(msg);
 
 		/* Synchronize broadcast link info, if not done previously */
-		if (!tipc_node_is_up(l_ptr->owner)) {
+		if (!tipc_node_is_up(l_ptr->owner) &&
+		    !l_ptr->owner->bclink.sync) {
 			l_ptr->owner->bclink.last_sent =
 				l_ptr->owner->bclink.last_in =
 				msg_last_bcast(msg);
diff --git a/net/tipc/node.c b/net/tipc/node.c
index e03fc45..37cf3e3 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -268,7 +268,8 @@ static void node_established_contact(struct tipc_node *n_ptr)
 	if (n_ptr->bclink.supportable) {
 		n_ptr->bclink.acked = tipc_bclink_get_last_sent();
 		tipc_bclink_add_node(n_ptr->addr);
-		n_ptr->bclink.supported = 1;
+		if (!n_ptr->bclink.sync)
+			n_ptr->bclink.supported = 1;
 	}
 }
 
diff --git a/net/tipc/node.h b/net/tipc/node.h
index cfcaf4d..75c582b 100644
--- a/net/tipc/node.h
+++ b/net/tipc/node.h
@@ -69,6 +69,7 @@
  * @bclink: broadcast-related info
  *    @supportable: non-zero if node supports TIPC b'cast link capability
  *    @supported: non-zero if node supports TIPC b'cast capability
+ *    @sync: non-zero if node supports new broadcast synchronziation mechanism
  *    @acked: sequence # of last outbound b'cast message acknowledged by node
  *    @last_in: sequence # of last in-sequence b'cast message received from node
  *    @last_sent: sequence # of last b'cast message sent by node
@@ -94,6 +95,7 @@ struct tipc_node {
 	struct {
 		u8 supportable;
 		u8 supported;
+		u32 sync;
 		u32 acked;
 		u32 last_in;
 		u32 last_sent;
-- 
1.7.1

[tipc-discussion] [PATCH net-next 1/3] tipc: Add a new flag bit to indicate bclink sync protocol is supported

From: Ying X. <yin...@wi...> - 2012-07-03 07:52:46

Use the bit 14 in word 5 of LINK_PROTOCOL message header to indicate
whether the new enhanced broadcast synchronization mechanism is
supported.

Signed-off-by: Ying Xue <yin...@wi...>
---
 net/tipc/msg.h |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/net/tipc/msg.h b/net/tipc/msg.h
index ba2a72b..c832ae5 100644
--- a/net/tipc/msg.h
+++ b/net/tipc/msg.h
@@ -670,6 +670,16 @@ static inline void msg_set_redundant_link(struct tipc_msg *m, u32 r)
 	msg_set_bits(m, 5, 12, 0x1, r);
 }
 
+static inline u32 msg_bclink_sync(struct tipc_msg *m)
+{
+	return msg_bits(m, 5, 14, 0x1);
+}
+
+static inline void msg_set_bclink_sync(struct tipc_msg *m, u32 n)
+{
+	msg_set_bits(m, 5, 14, 0x1, n);
+}
+
 static inline char *msg_media_addr(struct tipc_msg *m)
 {
 	return (char *)&m->hdr[TIPC_MEDIA_ADDR_OFFSET];
-- 
1.7.1

[tipc-discussion] [PATCH net-next 3/3] tipc: Involve the enhanced broadcast synchronization mechanism

From: Ying X. <yin...@wi...> - 2012-07-03 07:52:46

Currently there have two major known issues about broadcast link as
belows:

1. There has one risk that name table updates sent over the broadcast
link after the neighbor is discovered will arrive before the initial
transfer of name table entries over the unicast link has been completed.

2. There has another risk that the node may send a broadcast message
after the neighbor is discovered without being certain whether the
neighbor will acknowledge it or not.

At present, the sending node cannot assume that the neighbor's link
endpoint is in the WW state just because its own link endpoint is in
that state, which means that is it doesn't know if the neighbor will
process the broadcast message or ignore it. However, once all of the
name table messages are acknowledged the node can be certain that the
other end is in WW state and that broadcast message will be processed.

Therefor, when the neighbor node is added to the sending node's
broadcast link map, the sending node should send all name talbe entries
over the unicast link. After these name table messages have been sent,
it should immediately one explicit message that tells the neighbor node
to start accepting broadcast messages, which can resolve the first
problem.

Since the explicit message contains the sequence number of the most
recent broadcast message the sending node has sent when it adds the
neighbor node to its broadcast link map, it can tell the neighbor node
where to start receiving and acknowledging broadcast messages, which
can reslove the second problem.

Signed-off-by: Ying Xue <yin...@wi...>
---
 net/tipc/bcast.c      |   23 +++++++++++++++++++++++
 net/tipc/bcast.h      |    1 +
 net/tipc/link.c       |   47 ++++++++++++++++++++++++++++++++++++++++++++++-
 net/tipc/link.h       |    1 +
 net/tipc/name_distr.c |    1 +
 net/tipc/node.c       |    2 +-
 6 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 47b61d3..3e7688a 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -355,6 +355,29 @@ static void bclink_peek_nack(struct tipc_msg *msg)
 	tipc_node_unlock(n_ptr);
 }
 
+/**
+ * tipc_bclink_info_recv - synchronize broadcast link info
+ */
+void tipc_bclink_info_recv(struct sk_buff *buf)
+{
+	struct tipc_node *n_ptr;
+	struct tipc_msg *msg = buf_msg(buf);
+
+	n_ptr = tipc_node_find(msg_prevnode(msg));
+	if (unlikely(!n_ptr))
+		return;
+
+	tipc_node_lock(n_ptr);
+	if (!n_ptr->bclink.supported && n_ptr->bclink.supportable &&
+	    n_ptr->bclink.sync) {
+		n_ptr->bclink.supported = 1;
+		n_ptr->bclink.last_sent = n_ptr->bclink.last_in =
+					  msg_last_bcast(msg);
+		n_ptr->bclink.oos_state = 0;
+	}
+	tipc_node_unlock(n_ptr);
+}
+
 /*
  * tipc_bclink_send_msg - broadcast a packet to all nodes in cluster
  */
diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h
index a933065..c689aa2 100644
--- a/net/tipc/bcast.h
+++ b/net/tipc/bcast.h
@@ -89,6 +89,7 @@ void tipc_bclink_add_node(u32 addr);
 void tipc_bclink_remove_node(u32 addr);
 struct tipc_node *tipc_bclink_retransmit_to(void);
 void tipc_bclink_acknowledge(struct tipc_node *n_ptr, u32 acked);
+void tipc_bclink_info_recv(struct sk_buff *buf);
 int  tipc_bclink_send_msg(struct sk_buff *buf);
 void tipc_bclink_recv_pkt(struct sk_buff *buf);
 u32  tipc_bclink_get_last_sent(void);
diff --git a/net/tipc/link.c b/net/tipc/link.c
index 918c1b0..e6868e4 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -952,6 +952,47 @@ int tipc_link_send(struct sk_buff *buf, u32 dest, u32 selector)
 }
 
 /*
+ * tipc_link_send_bclink_info - send broadcast link info to new neighbor
+ *
+ * Send broadcast link info to open up for the new neighbor to accept broadcast
+ * message from the point contained in "last sent broadcast number" field and
+ * onwards. No link congestion checking is performed because the last sent
+ * broadcast number info *must* be delivered.
+ */
+void tipc_link_send_bclink_info(u32 dest)
+{
+	struct tipc_node *n_ptr;
+	struct tipc_link *l_ptr;
+	struct sk_buff *buf;
+
+	read_lock_bh(&tipc_net_lock);
+	n_ptr = tipc_node_find(dest);
+	if (n_ptr) {
+		tipc_node_lock(n_ptr);
+		buf = tipc_buf_acquire(INT_H_SIZE);
+		if (buf) {
+			struct tipc_msg *msg = buf_msg(buf);
+
+			tipc_msg_init(msg, BCAST_PROTOCOL, STATE_MSG,
+				      INT_H_SIZE, dest);
+			msg_set_non_seq(msg, 0);
+			msg_set_last_bcast(msg, n_ptr->bclink.acked);
+			msg_set_link_selector(msg, (dest & 1));
+
+			l_ptr = n_ptr->active_links[0];
+			if (l_ptr) {
+				link_add_chain_to_outqueue(l_ptr, buf, 0);
+				tipc_link_push_queue(l_ptr);
+			} else {
+				kfree_skb(buf);
+			}
+		}
+		tipc_node_unlock(n_ptr);
+	}
+	read_unlock_bh(&tipc_net_lock);
+}
+
+/*
  * tipc_link_send_names - send name table entries to new neighbor
  *
  * Send routine for bulk delivery of name table messages when contact
@@ -1730,6 +1771,10 @@ deliver:
 					tipc_node_unlock(n_ptr);
 					tipc_named_recv(buf);
 					continue;
+				case BCAST_PROTOCOL:
+					tipc_node_unlock(n_ptr);
+					tipc_bclink_info_recv(buf);
+					continue;
 				case CONN_MANAGER:
 					tipc_node_unlock(n_ptr);
 					tipc_port_recv_proto_msg(buf);
@@ -1960,7 +2005,7 @@ void tipc_link_send_proto_msg(struct tipc_link *l_ptr, u32 msg_typ,
 	}
 
 	r_flag = (l_ptr->owner->working_links > tipc_link_is_up(l_ptr));
-	msg_set_bclink_sync(msg, 0);
+	msg_set_bclink_sync(msg, 1);
 	msg_set_redundant_link(msg, r_flag);
 	msg_set_linkprio(msg, l_ptr->priority);
 	msg_set_size(msg, msg_size);
diff --git a/net/tipc/link.h b/net/tipc/link.h
index 3a045eb..a53f103 100644
--- a/net/tipc/link.h
+++ b/net/tipc/link.h
@@ -224,6 +224,7 @@ struct sk_buff *tipc_link_cmd_show_stats(const void *req_tlv_area, int req_tlv_s
 struct sk_buff *tipc_link_cmd_reset_stats(const void *req_tlv_area, int req_tlv_space);
 void tipc_link_reset(struct tipc_link *l_ptr);
 int tipc_link_send(struct sk_buff *buf, u32 dest, u32 selector);
+void tipc_link_send_bclink_info(u32 dest);
 void tipc_link_send_names(struct list_head *message_list, u32 dest);
 int tipc_link_send_buf(struct tipc_link *l_ptr, struct sk_buff *buf);
 u32 tipc_link_get_max_pkt(u32 dest, u32 selector);
diff --git a/net/tipc/name_distr.c b/net/tipc/name_distr.c
index 25b7b56..05a4303 100644
--- a/net/tipc/name_distr.c
+++ b/net/tipc/name_distr.c
@@ -263,6 +263,7 @@ void tipc_named_node_up(unsigned long nodearg)
 	read_unlock_bh(&tipc_nametbl_lock);
 
 	tipc_link_send_names(&message_list, (u32)node);
+	tipc_link_send_bclink_info(node);
 }
 
 /**
diff --git a/net/tipc/node.c b/net/tipc/node.c
index 37cf3e3..9e443d2 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -296,7 +296,7 @@ static void node_lost_contact(struct tipc_node *n_ptr)
 		tipc_addr_string_fill(addr_string, n_ptr->addr));
 
 	/* Flush broadcast link info associated with lost node */
-	if (n_ptr->bclink.supported) {
+	if (n_ptr->bclink.supportable) {
 		while (n_ptr->bclink.deferred_head) {
 			struct sk_buff *buf = n_ptr->bclink.deferred_head;
 			n_ptr->bclink.deferred_head = buf->next;
-- 
1.7.1

Re: [tipc-discussion] [PATCH net-next 3/3] tipc: Involve the enhanced broadcast synchronization mechanism

From: Erik H. <eri...@er...> - 2012-07-04 08:21:26

This comment is not directed to your patch, but more general.
What's the purpose of the supported/supportable fields in the node struct?
It looks like it's marking the node "bclink supportable" based on the 
msg_max_pkt field when it receives a link activate msg, and then 
immediately switches over to "bclink supported" when contact have been 
established to the remote node..
Why not use an enum for this?

Also, i dont quite understand what's the condition that decides if the 
bclink can be used to communicate with a node?

//E

On 2012-07-03 09:52, Ying Xue wrote:
> Currently there have two major known issues about broadcast link as
> belows:
>
> 1. There has one risk that name table updates sent over the broadcast
> link after the neighbor is discovered will arrive before the initial
> transfer of name table entries over the unicast link has been completed.
>
> 2. There has another risk that the node may send a broadcast message
> after the neighbor is discovered without being certain whether the
> neighbor will acknowledge it or not.
>
> At present, the sending node cannot assume that the neighbor's link
> endpoint is in the WW state just because its own link endpoint is in
> that state, which means that is it doesn't know if the neighbor will
> process the broadcast message or ignore it. However, once all of the
> name table messages are acknowledged the node can be certain that the
> other end is in WW state and that broadcast message will be processed.
>
> Therefor, when the neighbor node is added to the sending node's
> broadcast link map, the sending node should send all name talbe entries
> over the unicast link. After these name table messages have been sent,
> it should immediately one explicit message that tells the neighbor node
> to start accepting broadcast messages, which can resolve the first
> problem.
>
> Since the explicit message contains the sequence number of the most
> recent broadcast message the sending node has sent when it adds the
> neighbor node to its broadcast link map, it can tell the neighbor node
> where to start receiving and acknowledging broadcast messages, which
> can reslove the second problem.
>
> Signed-off-by: Ying Xue<yin...@wi...>
> ---
>   net/tipc/bcast.c      |   23 +++++++++++++++++++++++
>   net/tipc/bcast.h      |    1 +
>   net/tipc/link.c       |   47 ++++++++++++++++++++++++++++++++++++++++++++++-
>   net/tipc/link.h       |    1 +
>   net/tipc/name_distr.c |    1 +
>   net/tipc/node.c       |    2 +-
>   6 files changed, 73 insertions(+), 2 deletions(-)
>
> diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
> index 47b61d3..3e7688a 100644
> --- a/net/tipc/bcast.c
> +++ b/net/tipc/bcast.c
> @@ -355,6 +355,29 @@ static void bclink_peek_nack(struct tipc_msg *msg)
>   	tipc_node_unlock(n_ptr);
>   }
>
> +/**
> + * tipc_bclink_info_recv - synchronize broadcast link info
> + */
> +void tipc_bclink_info_recv(struct sk_buff *buf)
> +{
> +	struct tipc_node *n_ptr;
> +	struct tipc_msg *msg = buf_msg(buf);
> +
> +	n_ptr = tipc_node_find(msg_prevnode(msg));
> +	if (unlikely(!n_ptr))
> +		return;
> +
> +	tipc_node_lock(n_ptr);
> +	if (!n_ptr->bclink.supported&&  n_ptr->bclink.supportable&&
> +	    n_ptr->bclink.sync) {
> +		n_ptr->bclink.supported = 1;
> +		n_ptr->bclink.last_sent = n_ptr->bclink.last_in =
> +					  msg_last_bcast(msg);
> +		n_ptr->bclink.oos_state = 0;
> +	}
> +	tipc_node_unlock(n_ptr);
> +}
> +
>   /*
>    * tipc_bclink_send_msg - broadcast a packet to all nodes in cluster
>    */
> diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h
> index a933065..c689aa2 100644
> --- a/net/tipc/bcast.h
> +++ b/net/tipc/bcast.h
> @@ -89,6 +89,7 @@ void tipc_bclink_add_node(u32 addr);
>   void tipc_bclink_remove_node(u32 addr);
>   struct tipc_node *tipc_bclink_retransmit_to(void);
>   void tipc_bclink_acknowledge(struct tipc_node *n_ptr, u32 acked);
> +void tipc_bclink_info_recv(struct sk_buff *buf);
>   int  tipc_bclink_send_msg(struct sk_buff *buf);
>   void tipc_bclink_recv_pkt(struct sk_buff *buf);
>   u32  tipc_bclink_get_last_sent(void);
> diff --git a/net/tipc/link.c b/net/tipc/link.c
> index 918c1b0..e6868e4 100644
> --- a/net/tipc/link.c
> +++ b/net/tipc/link.c
> @@ -952,6 +952,47 @@ int tipc_link_send(struct sk_buff *buf, u32 dest, u32 selector)
>   }
>
>   /*
> + * tipc_link_send_bclink_info - send broadcast link info to new neighbor
> + *
> + * Send broadcast link info to open up for the new neighbor to accept broadcast
> + * message from the point contained in "last sent broadcast number" field and
> + * onwards. No link congestion checking is performed because the last sent
> + * broadcast number info *must* be delivered.
> + */
> +void tipc_link_send_bclink_info(u32 dest)
> +{
> +	struct tipc_node *n_ptr;
> +	struct tipc_link *l_ptr;
> +	struct sk_buff *buf;
> +
> +	read_lock_bh(&tipc_net_lock);
> +	n_ptr = tipc_node_find(dest);
> +	if (n_ptr) {
> +		tipc_node_lock(n_ptr);
> +		buf = tipc_buf_acquire(INT_H_SIZE);
> +		if (buf) {
> +			struct tipc_msg *msg = buf_msg(buf);
> +
> +			tipc_msg_init(msg, BCAST_PROTOCOL, STATE_MSG,
> +				      INT_H_SIZE, dest);
> +			msg_set_non_seq(msg, 0);
> +			msg_set_last_bcast(msg, n_ptr->bclink.acked);
> +			msg_set_link_selector(msg, (dest&  1));
> +
> +			l_ptr = n_ptr->active_links[0];
> +			if (l_ptr) {
> +				link_add_chain_to_outqueue(l_ptr, buf, 0);
> +				tipc_link_push_queue(l_ptr);
> +			} else {
> +				kfree_skb(buf);
> +			}
> +		}
> +		tipc_node_unlock(n_ptr);
> +	}
> +	read_unlock_bh(&tipc_net_lock);
> +}
> +
> +/*
>    * tipc_link_send_names - send name table entries to new neighbor
>    *
>    * Send routine for bulk delivery of name table messages when contact
> @@ -1730,6 +1771,10 @@ deliver:
>   					tipc_node_unlock(n_ptr);
>   					tipc_named_recv(buf);
>   					continue;
> +				case BCAST_PROTOCOL:
> +					tipc_node_unlock(n_ptr);
> +					tipc_bclink_info_recv(buf);
> +					continue;
>   				case CONN_MANAGER:
>   					tipc_node_unlock(n_ptr);
>   					tipc_port_recv_proto_msg(buf);
> @@ -1960,7 +2005,7 @@ void tipc_link_send_proto_msg(struct tipc_link *l_ptr, u32 msg_typ,
>   	}
>
>   	r_flag = (l_ptr->owner->working_links>  tipc_link_is_up(l_ptr));
> -	msg_set_bclink_sync(msg, 0);
> +	msg_set_bclink_sync(msg, 1);
>   	msg_set_redundant_link(msg, r_flag);
>   	msg_set_linkprio(msg, l_ptr->priority);
>   	msg_set_size(msg, msg_size);
> diff --git a/net/tipc/link.h b/net/tipc/link.h
> index 3a045eb..a53f103 100644
> --- a/net/tipc/link.h
> +++ b/net/tipc/link.h
> @@ -224,6 +224,7 @@ struct sk_buff *tipc_link_cmd_show_stats(const void *req_tlv_area, int req_tlv_s
>   struct sk_buff *tipc_link_cmd_reset_stats(const void *req_tlv_area, int req_tlv_space);
>   void tipc_link_reset(struct tipc_link *l_ptr);
>   int tipc_link_send(struct sk_buff *buf, u32 dest, u32 selector);
> +void tipc_link_send_bclink_info(u32 dest);
>   void tipc_link_send_names(struct list_head *message_list, u32 dest);
>   int tipc_link_send_buf(struct tipc_link *l_ptr, struct sk_buff *buf);
>   u32 tipc_link_get_max_pkt(u32 dest, u32 selector);
> diff --git a/net/tipc/name_distr.c b/net/tipc/name_distr.c
> index 25b7b56..05a4303 100644
> --- a/net/tipc/name_distr.c
> +++ b/net/tipc/name_distr.c
> @@ -263,6 +263,7 @@ void tipc_named_node_up(unsigned long nodearg)
>   	read_unlock_bh(&tipc_nametbl_lock);
>
>   	tipc_link_send_names(&message_list, (u32)node);
> +	tipc_link_send_bclink_info(node);
>   }
>
>   /**
> diff --git a/net/tipc/node.c b/net/tipc/node.c
> index 37cf3e3..9e443d2 100644
> --- a/net/tipc/node.c
> +++ b/net/tipc/node.c
> @@ -296,7 +296,7 @@ static void node_lost_contact(struct tipc_node *n_ptr)
>   		tipc_addr_string_fill(addr_string, n_ptr->addr));
>
>   	/* Flush broadcast link info associated with lost node */
> -	if (n_ptr->bclink.supported) {
> +	if (n_ptr->bclink.supportable) {
>   		while (n_ptr->bclink.deferred_head) {
>   			struct sk_buff *buf = n_ptr->bclink.deferred_head;
>   			n_ptr->bclink.deferred_head = buf->next;

Re: [tipc-discussion] [PATCH net-next 3/3] tipc: Involve the enhanced broadcast synchronization mechanism

From: Ying X. <yin...@wi...> - 2012-07-04 09:32:26

Erik Hugne wrote:
> This comment is not directed to your patch, but more general.
Thanks for your comments, maybe i add some extra information to describe 
how to implement the explicit sync message - BCAST_PROTOCOL, which may 
be more clear.

> What's the purpose of the supported/supportable fields in the node 
> struct?
By my understanding, supportable means MTU negotiation between nodes is 
done and node has broadcast capability; supported indicates we can 
receive broadcast messages.

> It looks like it's marking the node "bclink supportable" based on the 
> msg_max_pkt field when it receives a link activate msg, and then 
> immediately switches over to "bclink supported" when contact have been 
> established to the remote node..
> Why not use an enum for this?
Using enum is no problem, but I think its result is same, please note 
supportable/supported are defined to char rather than int.
>
> Also, i dont quite understand what's the condition that decides if the 
> bclink can be used to communicate with a node?
>
Once receiving an explicit message of broadcast sync, it can talk with 
other nodes by broadcast way.

To completely understand the solution, I think we should first correctly 
understand the two risks described in comments in the patch; secondly we 
also should understand why the two issues cannot be resolved by link 
state machine, but finally we have to involve an explicit message to 
synchronize broadcast info although it has an obvious side-effect of 
protocol compatibility.

Lastly, please notice when the sync message is sent and what it conveys. 
Actually it tells neighbor node when to start accepting broadcast 
messages and where to start receiving and acknowledging broadcast messages.

Regards,
Ying

> //E
>
> On 2012-07-03 09:52, Ying Xue wrote:
>> Currently there have two major known issues about broadcast link as
>> belows:
>>
>> 1. There has one risk that name table updates sent over the broadcast
>> link after the neighbor is discovered will arrive before the initial
>> transfer of name table entries over the unicast link has been completed.
>>
>> 2. There has another risk that the node may send a broadcast message
>> after the neighbor is discovered without being certain whether the
>> neighbor will acknowledge it or not.
>>
>> At present, the sending node cannot assume that the neighbor's link
>> endpoint is in the WW state just because its own link endpoint is in
>> that state, which means that is it doesn't know if the neighbor will
>> process the broadcast message or ignore it. However, once all of the
>> name table messages are acknowledged the node can be certain that the
>> other end is in WW state and that broadcast message will be processed.
>>
>> Therefor, when the neighbor node is added to the sending node's
>> broadcast link map, the sending node should send all name talbe entries
>> over the unicast link. After these name table messages have been sent,
>> it should immediately one explicit message that tells the neighbor node
>> to start accepting broadcast messages, which can resolve the first
>> problem.
>>
>> Since the explicit message contains the sequence number of the most
>> recent broadcast message the sending node has sent when it adds the
>> neighbor node to its broadcast link map, it can tell the neighbor node
>> where to start receiving and acknowledging broadcast messages, which
>> can reslove the second problem.
>>
>> Signed-off-by: Ying Xue<yin...@wi...>
>> ---
>>   net/tipc/bcast.c      |   23 +++++++++++++++++++++++
>>   net/tipc/bcast.h      |    1 +
>>   net/tipc/link.c       |   47 
>> ++++++++++++++++++++++++++++++++++++++++++++++-
>>   net/tipc/link.h       |    1 +
>>   net/tipc/name_distr.c |    1 +
>>   net/tipc/node.c       |    2 +-
>>   6 files changed, 73 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
>> index 47b61d3..3e7688a 100644
>> --- a/net/tipc/bcast.c
>> +++ b/net/tipc/bcast.c
>> @@ -355,6 +355,29 @@ static void bclink_peek_nack(struct tipc_msg *msg)
>>       tipc_node_unlock(n_ptr);
>>   }
>>
>> +/**
>> + * tipc_bclink_info_recv - synchronize broadcast link info
>> + */
>> +void tipc_bclink_info_recv(struct sk_buff *buf)
>> +{
>> +    struct tipc_node *n_ptr;
>> +    struct tipc_msg *msg = buf_msg(buf);
>> +
>> +    n_ptr = tipc_node_find(msg_prevnode(msg));
>> +    if (unlikely(!n_ptr))
>> +        return;
>> +
>> +    tipc_node_lock(n_ptr);
>> +    if (!n_ptr->bclink.supported&&  n_ptr->bclink.supportable&&
>> +        n_ptr->bclink.sync) {
>> +        n_ptr->bclink.supported = 1;
>> +        n_ptr->bclink.last_sent = n_ptr->bclink.last_in =
>> +                      msg_last_bcast(msg);
>> +        n_ptr->bclink.oos_state = 0;
>> +    }
>> +    tipc_node_unlock(n_ptr);
>> +}
>> +
>>   /*
>>    * tipc_bclink_send_msg - broadcast a packet to all nodes in cluster
>>    */
>> diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h
>> index a933065..c689aa2 100644
>> --- a/net/tipc/bcast.h
>> +++ b/net/tipc/bcast.h
>> @@ -89,6 +89,7 @@ void tipc_bclink_add_node(u32 addr);
>>   void tipc_bclink_remove_node(u32 addr);
>>   struct tipc_node *tipc_bclink_retransmit_to(void);
>>   void tipc_bclink_acknowledge(struct tipc_node *n_ptr, u32 acked);
>> +void tipc_bclink_info_recv(struct sk_buff *buf);
>>   int  tipc_bclink_send_msg(struct sk_buff *buf);
>>   void tipc_bclink_recv_pkt(struct sk_buff *buf);
>>   u32  tipc_bclink_get_last_sent(void);
>> diff --git a/net/tipc/link.c b/net/tipc/link.c
>> index 918c1b0..e6868e4 100644
>> --- a/net/tipc/link.c
>> +++ b/net/tipc/link.c
>> @@ -952,6 +952,47 @@ int tipc_link_send(struct sk_buff *buf, u32 
>> dest, u32 selector)
>>   }
>>
>>   /*
>> + * tipc_link_send_bclink_info - send broadcast link info to new 
>> neighbor
>> + *
>> + * Send broadcast link info to open up for the new neighbor to 
>> accept broadcast
>> + * message from the point contained in "last sent broadcast number" 
>> field and
>> + * onwards. No link congestion checking is performed because the 
>> last sent
>> + * broadcast number info *must* be delivered.
>> + */
>> +void tipc_link_send_bclink_info(u32 dest)
>> +{
>> +    struct tipc_node *n_ptr;
>> +    struct tipc_link *l_ptr;
>> +    struct sk_buff *buf;
>> +
>> +    read_lock_bh(&tipc_net_lock);
>> +    n_ptr = tipc_node_find(dest);
>> +    if (n_ptr) {
>> +        tipc_node_lock(n_ptr);
>> +        buf = tipc_buf_acquire(INT_H_SIZE);
>> +        if (buf) {
>> +            struct tipc_msg *msg = buf_msg(buf);
>> +
>> +            tipc_msg_init(msg, BCAST_PROTOCOL, STATE_MSG,
>> +                      INT_H_SIZE, dest);
>> +            msg_set_non_seq(msg, 0);
>> +            msg_set_last_bcast(msg, n_ptr->bclink.acked);
>> +            msg_set_link_selector(msg, (dest&  1));
>> +
>> +            l_ptr = n_ptr->active_links[0];
>> +            if (l_ptr) {
>> +                link_add_chain_to_outqueue(l_ptr, buf, 0);
>> +                tipc_link_push_queue(l_ptr);
>> +            } else {
>> +                kfree_skb(buf);
>> +            }
>> +        }
>> +        tipc_node_unlock(n_ptr);
>> +    }
>> +    read_unlock_bh(&tipc_net_lock);
>> +}
>> +
>> +/*
>>    * tipc_link_send_names - send name table entries to new neighbor
>>    *
>>    * Send routine for bulk delivery of name table messages when contact
>> @@ -1730,6 +1771,10 @@ deliver:
>>                       tipc_node_unlock(n_ptr);
>>                       tipc_named_recv(buf);
>>                       continue;
>> +                case BCAST_PROTOCOL:
>> +                    tipc_node_unlock(n_ptr);
>> +                    tipc_bclink_info_recv(buf);
>> +                    continue;
>>                   case CONN_MANAGER:
>>                       tipc_node_unlock(n_ptr);
>>                       tipc_port_recv_proto_msg(buf);
>> @@ -1960,7 +2005,7 @@ void tipc_link_send_proto_msg(struct tipc_link 
>> *l_ptr, u32 msg_typ,
>>       }
>>
>>       r_flag = (l_ptr->owner->working_links>  tipc_link_is_up(l_ptr));
>> -    msg_set_bclink_sync(msg, 0);
>> +    msg_set_bclink_sync(msg, 1);
>>       msg_set_redundant_link(msg, r_flag);
>>       msg_set_linkprio(msg, l_ptr->priority);
>>       msg_set_size(msg, msg_size);
>> diff --git a/net/tipc/link.h b/net/tipc/link.h
>> index 3a045eb..a53f103 100644
>> --- a/net/tipc/link.h
>> +++ b/net/tipc/link.h
>> @@ -224,6 +224,7 @@ struct sk_buff *tipc_link_cmd_show_stats(const 
>> void *req_tlv_area, int req_tlv_s
>>   struct sk_buff *tipc_link_cmd_reset_stats(const void *req_tlv_area, 
>> int req_tlv_space);
>>   void tipc_link_reset(struct tipc_link *l_ptr);
>>   int tipc_link_send(struct sk_buff *buf, u32 dest, u32 selector);
>> +void tipc_link_send_bclink_info(u32 dest);
>>   void tipc_link_send_names(struct list_head *message_list, u32 dest);
>>   int tipc_link_send_buf(struct tipc_link *l_ptr, struct sk_buff *buf);
>>   u32 tipc_link_get_max_pkt(u32 dest, u32 selector);
>> diff --git a/net/tipc/name_distr.c b/net/tipc/name_distr.c
>> index 25b7b56..05a4303 100644
>> --- a/net/tipc/name_distr.c
>> +++ b/net/tipc/name_distr.c
>> @@ -263,6 +263,7 @@ void tipc_named_node_up(unsigned long nodearg)
>>       read_unlock_bh(&tipc_nametbl_lock);
>>
>>       tipc_link_send_names(&message_list, (u32)node);
>> +    tipc_link_send_bclink_info(node);
>>   }
>>
>>   /**
>> diff --git a/net/tipc/node.c b/net/tipc/node.c
>> index 37cf3e3..9e443d2 100644
>> --- a/net/tipc/node.c
>> +++ b/net/tipc/node.c
>> @@ -296,7 +296,7 @@ static void node_lost_contact(struct tipc_node 
>> *n_ptr)
>>           tipc_addr_string_fill(addr_string, n_ptr->addr));
>>
>>       /* Flush broadcast link info associated with lost node */
>> -    if (n_ptr->bclink.supported) {
>> +    if (n_ptr->bclink.supportable) {
>>           while (n_ptr->bclink.deferred_head) {
>>               struct sk_buff *buf = n_ptr->bclink.deferred_head;
>>               n_ptr->bclink.deferred_head = buf->next;
>
>