|
From: Ying X. <yin...@wi...> - 2012-07-03 07:52:40
|
There are currently two major known issues with the broadcast link:
1. Name table updates sent over the broadcast link after the neighbor
is discovered may arrive before the initial transfer of name table
entries over the unicast link has completed.
2. A node may send a broadcast message after the neighbor is
discovered without being certain whether the neighbor will
acknowledge it.
To resolve these issues, we introduce a new BCAST_PROTOCOL message and
send it immediately after the name table messages have been sent over
the reliable unicast link. This message contains the sequence number of
the most recent broadcast message the sending node has sent, which tells
the neighbor node when to start accepting broadcast messages and where
to start receiving and acknowledging them.
In addition, to remain backward compatible at the protocol level, we
introduce a flag bit that indicates whether a node supports the enhanced
broadcast synchronization mechanism.
Ying Xue (3):
tipc: Add a new flag bit to indicate bclink sync protocol is
supported
tipc: Keep protocol compatibility backwards
tipc: Involve the enhanced broadcast synchronization mechanism
net/tipc/bcast.c | 23 ++++++++++++++++++++++
net/tipc/bcast.h | 1 +
net/tipc/link.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++-
net/tipc/link.h | 1 +
net/tipc/msg.h | 10 +++++++++
net/tipc/name_distr.c | 1 +
net/tipc/node.c | 5 ++-
net/tipc/node.h | 2 +
8 files changed, 90 insertions(+), 3 deletions(-)
|
|
From: Ying X. <yin...@wi...> - 2012-07-03 07:52:45
|
The previous patch introduced a flag bit that indicates whether a node
supports the enhanced broadcast synchronization mechanism. To keep TIPC
interoperable between nodes that support the new mechanism and nodes
that do not, a node must be able to fall back to the existing broadcast
synchronization mechanism when it detects that its peer does not
support the new one.
Signed-off-by: Ying Xue <yin...@wi...>
---
net/tipc/link.c | 5 ++++-
net/tipc/node.c | 3 ++-
net/tipc/node.h | 2 ++
3 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/net/tipc/link.c b/net/tipc/link.c
index 45ae706..918c1b0 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1960,6 +1960,7 @@ void tipc_link_send_proto_msg(struct tipc_link *l_ptr, u32 msg_typ,
}
r_flag = (l_ptr->owner->working_links > tipc_link_is_up(l_ptr));
+ msg_set_bclink_sync(msg, 0);
msg_set_redundant_link(msg, r_flag);
msg_set_linkprio(msg, l_ptr->priority);
msg_set_size(msg, msg_size);
@@ -2058,9 +2059,11 @@ static void link_recv_proto_msg(struct tipc_link *l_ptr, struct sk_buff *buf)
l_ptr->max_pkt = l_ptr->max_pkt_target;
}
l_ptr->owner->bclink.supportable = (max_pkt_info != 0);
+ l_ptr->owner->bclink.sync = msg_bclink_sync(msg);
/* Synchronize broadcast link info, if not done previously */
- if (!tipc_node_is_up(l_ptr->owner)) {
+ if (!tipc_node_is_up(l_ptr->owner) &&
+ !l_ptr->owner->bclink.sync) {
l_ptr->owner->bclink.last_sent =
l_ptr->owner->bclink.last_in =
msg_last_bcast(msg);
diff --git a/net/tipc/node.c b/net/tipc/node.c
index e03fc45..37cf3e3 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -268,7 +268,8 @@ static void node_established_contact(struct tipc_node *n_ptr)
if (n_ptr->bclink.supportable) {
n_ptr->bclink.acked = tipc_bclink_get_last_sent();
tipc_bclink_add_node(n_ptr->addr);
- n_ptr->bclink.supported = 1;
+ if (!n_ptr->bclink.sync)
+ n_ptr->bclink.supported = 1;
}
}
diff --git a/net/tipc/node.h b/net/tipc/node.h
index cfcaf4d..75c582b 100644
--- a/net/tipc/node.h
+++ b/net/tipc/node.h
@@ -69,6 +69,7 @@
* @bclink: broadcast-related info
* @supportable: non-zero if node supports TIPC b'cast link capability
* @supported: non-zero if node supports TIPC b'cast capability
+ * @sync: non-zero if node supports new broadcast synchronization mechanism
* @acked: sequence # of last outbound b'cast message acknowledged by node
* @last_in: sequence # of last in-sequence b'cast message received from node
* @last_sent: sequence # of last b'cast message sent by node
@@ -94,6 +95,7 @@ struct tipc_node {
struct {
u8 supportable;
u8 supported;
+ u32 sync;
u32 acked;
u32 last_in;
u32 last_sent;
--
1.7.1
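
The fallback rule in this patch can be sketched as a small standalone C
model. This is only an illustration under assumptions: struct demo_bclink
and the demo_* functions are hypothetical stand-ins for the bclink fields
of struct tipc_node and for the two hunks above, not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down copy of the bclink fields in struct tipc_node. */
struct demo_bclink {
	uint8_t supportable;
	uint8_t supported;
	uint32_t sync;
	uint32_t last_in;
	uint32_t last_sent;
};

/* Models the link_recv_proto_msg() hunk: a LINK_PROTOCOL message arrived. */
static void demo_on_link_proto(struct demo_bclink *b, bool node_is_up,
			       uint32_t max_pkt_info, uint32_t sync_bit,
			       uint32_t last_bcast)
{
	b->supportable = (max_pkt_info != 0);
	b->sync = sync_bit;

	/* Legacy peer: take the broadcast position from this message. */
	if (!node_is_up && !b->sync) {
		b->last_sent = last_bcast;
		b->last_in = last_bcast;
	}
}

/* Models the node_established_contact() hunk. */
static void demo_on_contact(struct demo_bclink *b)
{
	/* A peer that set the sync bit stays "unsupported" for now. */
	if (b->supportable && !b->sync)
		b->supported = 1;
}
```

With a legacy peer (sync bit 0) the node behaves as before; with a peer
that sets the bit, the position and the supported flag are left untouched
so that a later step can fill them in.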
|
|
From: Ying X. <yin...@wi...> - 2012-07-03 07:52:46
|
Use bit 14 in word 5 of the LINK_PROTOCOL message header to indicate
whether the new enhanced broadcast synchronization mechanism is
supported.
Signed-off-by: Ying Xue <yin...@wi...>
---
net/tipc/msg.h | 10 ++++++++++
1 files changed, 10 insertions(+), 0 deletions(-)
diff --git a/net/tipc/msg.h b/net/tipc/msg.h
index ba2a72b..c832ae5 100644
--- a/net/tipc/msg.h
+++ b/net/tipc/msg.h
@@ -670,6 +670,16 @@ static inline void msg_set_redundant_link(struct tipc_msg *m, u32 r)
msg_set_bits(m, 5, 12, 0x1, r);
}
+static inline u32 msg_bclink_sync(struct tipc_msg *m)
+{
+ return msg_bits(m, 5, 14, 0x1);
+}
+
+static inline void msg_set_bclink_sync(struct tipc_msg *m, u32 n)
+{
+ msg_set_bits(m, 5, 14, 0x1, n);
+}
+
static inline char *msg_media_addr(struct tipc_msg *m)
{
return (char *)&m->hdr[TIPC_MEDIA_ADDR_OFFSET];
--
1.7.1
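
For readers without the kernel tree at hand, the effect of the two
accessors can be sketched in userspace C. The sketch assumes that header
words are stored in network byte order; struct demo_msg and the demo_*
helpers are hypothetical stand-ins for struct tipc_msg, msg_bits() and
msg_set_bits(), written for illustration only.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical stand-in for struct tipc_msg: words in network byte order. */
struct demo_msg {
	uint32_t hdr[10];
};

/* Read the field in word w that starts at bit pos and is mask wide. */
static uint32_t demo_bits(const struct demo_msg *m, int w, int pos,
			  uint32_t mask)
{
	return (ntohl(m->hdr[w]) >> pos) & mask;
}

/* Write the same field without disturbing the other bits of the word. */
static void demo_set_bits(struct demo_msg *m, int w, int pos,
			  uint32_t mask, uint32_t val)
{
	uint32_t word = ntohl(m->hdr[w]);

	word &= ~(mask << pos);
	word |= (val & mask) << pos;
	m->hdr[w] = htonl(word);
}

/* Bit 14 of word 5: the bclink sync flag added by this patch. */
static uint32_t demo_bclink_sync(const struct demo_msg *m)
{
	return demo_bits(m, 5, 14, 0x1);
}

static void demo_set_bclink_sync(struct demo_msg *m, uint32_t n)
{
	demo_set_bits(m, 5, 14, 0x1, n);
}

/* Bit 12 of word 5: the neighbouring redundant-link flag. */
static uint32_t demo_redundant_link(const struct demo_msg *m)
{
	return demo_bits(m, 5, 12, 0x1);
}

static void demo_set_redundant_link(struct demo_msg *m, uint32_t r)
{
	demo_set_bits(m, 5, 12, 0x1, r);
}
```

Setting or clearing the sync flag leaves the redundant-link flag in the
same word unchanged, which is the property the new accessors rely on.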
|
|
From: Ying X. <yin...@wi...> - 2012-07-03 07:52:46
|
There are currently two major known issues with the broadcast link:
1. Name table updates sent over the broadcast link after the neighbor
is discovered may arrive before the initial transfer of name table
entries over the unicast link has completed.
2. A node may send a broadcast message after the neighbor is
discovered without being certain whether the neighbor will
acknowledge it.
At present, the sending node cannot assume that the neighbor's link
endpoint is in the WW state just because its own link endpoint is in
that state, which means it does not know whether the neighbor will
process a broadcast message or ignore it. However, once all of the
name table messages have been acknowledged, the node can be certain that
the other end is in the WW state and that broadcast messages will be
processed.
Therefore, when the neighbor node is added to the sending node's
broadcast link map, the sending node should send all name table entries
over the unicast link. After these name table messages have been sent,
it should immediately send one explicit message that tells the neighbor
node to start accepting broadcast messages, which resolves the first
problem.
Since the explicit message contains the sequence number of the most
recent broadcast message the sending node had sent when it added the
neighbor node to its broadcast link map, it tells the neighbor node
where to start receiving and acknowledging broadcast messages, which
resolves the second problem.
Signed-off-by: Ying Xue <yin...@wi...>
---
net/tipc/bcast.c | 23 +++++++++++++++++++++++
net/tipc/bcast.h | 1 +
net/tipc/link.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
net/tipc/link.h | 1 +
net/tipc/name_distr.c | 1 +
net/tipc/node.c | 2 +-
6 files changed, 73 insertions(+), 2 deletions(-)
diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 47b61d3..3e7688a 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -355,6 +355,29 @@ static void bclink_peek_nack(struct tipc_msg *msg)
tipc_node_unlock(n_ptr);
}
+/**
+ * tipc_bclink_info_recv - synchronize broadcast link info
+ */
+void tipc_bclink_info_recv(struct sk_buff *buf)
+{
+ struct tipc_node *n_ptr;
+ struct tipc_msg *msg = buf_msg(buf);
+
+ n_ptr = tipc_node_find(msg_prevnode(msg));
+ if (unlikely(!n_ptr))
+ return;
+
+ tipc_node_lock(n_ptr);
+ if (!n_ptr->bclink.supported && n_ptr->bclink.supportable &&
+ n_ptr->bclink.sync) {
+ n_ptr->bclink.supported = 1;
+ n_ptr->bclink.last_sent = n_ptr->bclink.last_in =
+ msg_last_bcast(msg);
+ n_ptr->bclink.oos_state = 0;
+ }
+ tipc_node_unlock(n_ptr);
+}
+
/*
* tipc_bclink_send_msg - broadcast a packet to all nodes in cluster
*/
diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h
index a933065..c689aa2 100644
--- a/net/tipc/bcast.h
+++ b/net/tipc/bcast.h
@@ -89,6 +89,7 @@ void tipc_bclink_add_node(u32 addr);
void tipc_bclink_remove_node(u32 addr);
struct tipc_node *tipc_bclink_retransmit_to(void);
void tipc_bclink_acknowledge(struct tipc_node *n_ptr, u32 acked);
+void tipc_bclink_info_recv(struct sk_buff *buf);
int tipc_bclink_send_msg(struct sk_buff *buf);
void tipc_bclink_recv_pkt(struct sk_buff *buf);
u32 tipc_bclink_get_last_sent(void);
diff --git a/net/tipc/link.c b/net/tipc/link.c
index 918c1b0..e6868e4 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -952,6 +952,47 @@ int tipc_link_send(struct sk_buff *buf, u32 dest, u32 selector)
}
/*
+ * tipc_link_send_bclink_info - send broadcast link info to new neighbor
+ *
+ * Send broadcast link info to open up for the new neighbor to accept broadcast
+ * message from the point contained in "last sent broadcast number" field and
+ * onwards. No link congestion checking is performed because the last sent
+ * broadcast number info *must* be delivered.
+ */
+void tipc_link_send_bclink_info(u32 dest)
+{
+ struct tipc_node *n_ptr;
+ struct tipc_link *l_ptr;
+ struct sk_buff *buf;
+
+ read_lock_bh(&tipc_net_lock);
+ n_ptr = tipc_node_find(dest);
+ if (n_ptr) {
+ tipc_node_lock(n_ptr);
+ buf = tipc_buf_acquire(INT_H_SIZE);
+ if (buf) {
+ struct tipc_msg *msg = buf_msg(buf);
+
+ tipc_msg_init(msg, BCAST_PROTOCOL, STATE_MSG,
+ INT_H_SIZE, dest);
+ msg_set_non_seq(msg, 0);
+ msg_set_last_bcast(msg, n_ptr->bclink.acked);
+ msg_set_link_selector(msg, (dest & 1));
+
+ l_ptr = n_ptr->active_links[0];
+ if (l_ptr) {
+ link_add_chain_to_outqueue(l_ptr, buf, 0);
+ tipc_link_push_queue(l_ptr);
+ } else {
+ kfree_skb(buf);
+ }
+ }
+ tipc_node_unlock(n_ptr);
+ }
+ read_unlock_bh(&tipc_net_lock);
+}
+
+/*
* tipc_link_send_names - send name table entries to new neighbor
*
* Send routine for bulk delivery of name table messages when contact
@@ -1730,6 +1771,10 @@ deliver:
tipc_node_unlock(n_ptr);
tipc_named_recv(buf);
continue;
+ case BCAST_PROTOCOL:
+ tipc_node_unlock(n_ptr);
+ tipc_bclink_info_recv(buf);
+ continue;
case CONN_MANAGER:
tipc_node_unlock(n_ptr);
tipc_port_recv_proto_msg(buf);
@@ -1960,7 +2005,7 @@ void tipc_link_send_proto_msg(struct tipc_link *l_ptr, u32 msg_typ,
}
r_flag = (l_ptr->owner->working_links > tipc_link_is_up(l_ptr));
- msg_set_bclink_sync(msg, 0);
+ msg_set_bclink_sync(msg, 1);
msg_set_redundant_link(msg, r_flag);
msg_set_linkprio(msg, l_ptr->priority);
msg_set_size(msg, msg_size);
diff --git a/net/tipc/link.h b/net/tipc/link.h
index 3a045eb..a53f103 100644
--- a/net/tipc/link.h
+++ b/net/tipc/link.h
@@ -224,6 +224,7 @@ struct sk_buff *tipc_link_cmd_show_stats(const void *req_tlv_area, int req_tlv_s
struct sk_buff *tipc_link_cmd_reset_stats(const void *req_tlv_area, int req_tlv_space);
void tipc_link_reset(struct tipc_link *l_ptr);
int tipc_link_send(struct sk_buff *buf, u32 dest, u32 selector);
+void tipc_link_send_bclink_info(u32 dest);
void tipc_link_send_names(struct list_head *message_list, u32 dest);
int tipc_link_send_buf(struct tipc_link *l_ptr, struct sk_buff *buf);
u32 tipc_link_get_max_pkt(u32 dest, u32 selector);
diff --git a/net/tipc/name_distr.c b/net/tipc/name_distr.c
index 25b7b56..05a4303 100644
--- a/net/tipc/name_distr.c
+++ b/net/tipc/name_distr.c
@@ -263,6 +263,7 @@ void tipc_named_node_up(unsigned long nodearg)
read_unlock_bh(&tipc_nametbl_lock);
tipc_link_send_names(&message_list, (u32)node);
+ tipc_link_send_bclink_info(node);
}
/**
diff --git a/net/tipc/node.c b/net/tipc/node.c
index 37cf3e3..9e443d2 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -296,7 +296,7 @@ static void node_lost_contact(struct tipc_node *n_ptr)
tipc_addr_string_fill(addr_string, n_ptr->addr));
/* Flush broadcast link info associated with lost node */
- if (n_ptr->bclink.supported) {
+ if (n_ptr->bclink.supportable) {
while (n_ptr->bclink.deferred_head) {
struct sk_buff *buf = n_ptr->bclink.deferred_head;
n_ptr->bclink.deferred_head = buf->next;
--
1.7.1
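
The receive side of the handshake described in this patch can be modelled
in a few lines of standalone C. This is a simplified sketch, not the
kernel code: struct demo_peer and the demo_* functions are hypothetical,
deferred-queue handling is left out, and the 16-bit sequence number wrap
is an assumption of the model.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of what a node tracks about one broadcast peer. */
struct demo_peer {
	uint8_t supportable;
	uint8_t supported;
	uint32_t sync;
	uint32_t last_in;
	uint32_t last_sent;
};

/* Models tipc_bclink_info_recv(): the explicit sync message arrived. */
static void demo_bclink_info_recv(struct demo_peer *p, uint32_t last_bcast)
{
	if (!p->supported && p->supportable && p->sync) {
		p->supported = 1;
		p->last_sent = last_bcast;
		p->last_in = last_bcast;
	}
}

/*
 * Simplified broadcast receive check: returns true if the packet is
 * accepted in sequence. Packets are ignored until the peer is marked
 * supported; the real code would defer out-of-sequence packets.
 */
static bool demo_bclink_recv_pkt(struct demo_peer *p, uint32_t seqno)
{
	if (!p->supported)
		return false;
	if (seqno != ((p->last_in + 1) & 0xffffu))
		return false;
	p->last_in = seqno;
	return true;
}
```

In this model, a broadcast packet that arrives before the sync message is
ignored, and the first packet accepted afterwards is the one that follows
the sequence number carried in the sync message.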
|
|
From: Erik H. <eri...@er...> - 2012-07-04 08:21:26
|
This comment is not directed at your patch; it is more general.
What's the purpose of the supported/supportable fields in the node struct?
It looks like the node is marked "bclink supportable" based on the
msg_max_pkt field when a link activate msg is received, and then
immediately switched over to "bclink supported" when contact has been
established with the remote node.
Why not use an enum for this?
Also, I don't quite understand which condition decides whether the
bclink can be used to communicate with a node.
//E
|
|
From: Ying X. <yin...@wi...> - 2012-07-04 09:32:26
|
Erik Hugne wrote:
> This comment is not directed to your patch, but more general.
Thanks for your comments. Maybe I should add some extra information
describing how the explicit sync message, BCAST_PROTOCOL, is
implemented, which may make things clearer.
> What's the purpose of the supported/supportable fields in the node
> struct?
As I understand it, supportable means that MTU negotiation between the
nodes is done and the node has broadcast capability; supported indicates
that we can receive broadcast messages.
> It looks like it's marking the node "bclink supportable" based on the
> msg_max_pkt field when it receives a link activate msg, and then
> immediately switches over to "bclink supported" when contact have been
> established to the remote node..
> Why not use an enum for this?
Using an enum is no problem, but I think the result is the same. Please
note that supportable/supported are defined as single-byte fields (u8)
rather than int.
>
> Also, i dont quite understand what's the condition that decides if the
> bclink can be used to communicate with a node?
>
Once a node receives the explicit broadcast sync message from a peer, it
can communicate with that peer over the broadcast link.
To fully understand the solution, I think we should first correctly
understand the two risks described in the patch's commit message. Second,
we should understand why the two issues cannot be resolved by the link
state machine, so that in the end we have to introduce an explicit
message to synchronize broadcast info, even though it has an obvious
side effect on protocol compatibility.
Lastly, please note when the sync message is sent and what it conveys:
it tells the neighbor node when to start accepting broadcast messages
and where to start receiving and acknowledging them.
Regards,
Ying
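
The enum discussed above could look roughly like the following. This is a
purely hypothetical sketch of the idea, not something the patch set
contains: the DEMO_* names and both helper functions are invented for
illustration, and the mapping assumes supported is only ever set on a
node that is also supportable.

```c
#include <stdbool.h>

/* Hypothetical single-field replacement for supportable/supported. */
enum demo_bclink_state {
	DEMO_BCLINK_NONE,	/* peer has no broadcast link capability */
	DEMO_BCLINK_CAPABLE,	/* "supportable": capable, not yet in use */
	DEMO_BCLINK_ACTIVE	/* "supported": broadcast messages accepted */
};

/* Map the two existing flags onto the single enum value. */
static enum demo_bclink_state demo_state_from_flags(bool supportable,
						    bool supported)
{
	if (!supportable)
		return DEMO_BCLINK_NONE;
	return supported ? DEMO_BCLINK_ACTIVE : DEMO_BCLINK_CAPABLE;
}

/* Broadcast messages from the peer are only processed when active. */
static bool demo_can_receive_bcast(enum demo_bclink_state s)
{
	return s == DEMO_BCLINK_ACTIVE;
}
```

Either representation carries the same information; the enum mainly makes
the invalid combination (supported without supportable) impossible to
express.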
|
|
From: Ying X. <yin...@wi...> - 2012-07-05 05:11:17
|
Good news!
I spent almost half a year resolving this issue for our customer on
TIPC 1.7.7. The customer ran many stress tests in their environment (the
problem is hard to reproduce in mine), and as of today their results
show that the 1.7.7 version of the patch set is very stable. Before
that, I had provided several earlier versions of the fix, but they all
failed one test case or another.
Thanks to Jon and Allan for their proposals.
Regards,
Ying
|
|
From: Jon P. M. <ma...@do...> - 2012-07-11 17:13:34
|
Well done, Ying.
This was one of the trickier bugs to find and resolve, and I think we
found a pretty good solution to it in the end. Now we must also get it
into the mainstream.
I know I have been slow to test and sign off on the patches you have
sent lately, but I really have my plate full these days. I'll try to
post some of them during the weekend.
///jon
________________________________
From: Ying Xue <yin...@wi...>
To: ma...@do...; jon...@er...; all...@wi...
Cc: Ying Xue <yin...@wi...>; eri...@er...; tip...@li...
Sent: Thursday, 5 July 2012, 1:11
Subject: Re: [PATCH net-next 0/3] fix broadcast sync issue
|
|
From: jason <huz...@gm...> - 2013-01-31 09:58:05
|
Hi Ying,
Sorry to dig up this old thread, but with tipc-1.7.7 we have several
times seen name tables that did not match between nodes. Can this be a
consequence of the first issue you listed?
Thank you!
|
|
From: Ying X. <yin...@wi...> - 2013-01-31 10:06:15
|
jason wrote:
> Hi Ying,
> Sorry to turn this old thread out. But with tipc-1.7.7 we encountered
> name table not matching between nodes several times. Can this be a
> consequence of the first issue you listed?

Yes, the first risk can leave the name tables out of sync.

Regards,
Ying

> Thank you!
|
|
From: jason <huz...@gm...> - 2013-02-04 09:45:37
|
Hi Ying and all,
Please help me understand this patch clearly. With the newly added
handshake flow, name table updates that are sent over the broadcast link
and arrive before the initial transfer of name table entries over the
unicast link has finished will be discarded by the receiving node. Then,
after the name table unicast transfer has finished, the receiving node
will ask the sending node to retransmit them. Is that right in theory?
Thanks a lot!
|
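The receive-side gating described in the cover letter can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch set: the struct, the function names and the 16-bit wrap-around sequence numbers are hypothetical. In this sketch, broadcasts up to the sequence number carried in the sync message are simply never accepted, and only a gap after that point would lead to a retransmission request. Whether the real patch behaves exactly this way is the question asked above and is not settled by this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-peer broadcast receive state (names are illustrative). */
struct bc_rx_state {
    bool     synced;   /* true once the sync message has arrived via unicast */
    uint16_t last_in;  /* last broadcast seqno accepted in order */
};

enum bc_rx_verdict { BC_DROP, BC_ACCEPT, BC_GAP };

/* True if a comes strictly before b, assuming 16-bit wrap-around seqnos. */
static bool seq_less(uint16_t a, uint16_t b)
{
    return a != b && (uint16_t)(b - a) < 0x8000u;
}

/* Sync message received: start accepting right after the sender's last_sent. */
static void bc_rx_sync(struct bc_rx_state *rx, uint16_t last_sent)
{
    rx->last_in = last_sent;
    rx->synced = true;
}

/* Decide what to do with an incoming broadcast packet. */
static enum bc_rx_verdict bc_rx_packet(struct bc_rx_state *rx, uint16_t seqno)
{
    uint16_t next;

    if (!rx->synced)
        return BC_DROP;            /* not opened for reception yet */

    next = (uint16_t)(rx->last_in + 1);
    if (seqno == next) {
        rx->last_in = seqno;       /* in order: deliver */
        return BC_ACCEPT;
    }
    if (seq_less(next, seqno))
        return BC_GAP;             /* something is missing: request retransmit */
    return BC_DROP;                /* old or duplicate packet */
}
```

With a sync value of 41, packet 41 is dropped, packet 42 is accepted, and packet 44 arriving next is reported as a gap.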
|
From: jason <huz...@gm...> - 2013-02-06 00:57:18
|
Hi Ying,
There seems to be a problem with this patch. If the peer loses contact
before sending the new bcast sync message to us, we will not be able to
call tipc_bclink_remove_node() in node_lost_contact(), because it is
called only if bclink.supported is set. That leaves a wrong node map
count, and the bclink will then stall.
|
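The bookkeeping problem reported above can be pictured with a small sketch. It assumes, as the report implies, that a peer is counted in the broadcast node map when contact is established but only marked as "supported" once its sync message arrives. All names here are hypothetical stand-ins, not the kernel's structures.

```c
#include <stdbool.h>

/* Hypothetical broadcast link: counts peers whose acks the sender waits for. */
struct bclink {
    int node_count;
};

/* Hypothetical per-peer flags. */
struct peer {
    bool in_bcast_map;      /* counted in the broadcast node map */
    bool bclink_supported;  /* set only after the peer's sync message arrives */
};

static void peer_established_contact(struct bclink *bcl, struct peer *p)
{
    p->in_bcast_map = true;
    bcl->node_count++;
}

static void peer_sync_received(struct peer *p)
{
    p->bclink_supported = true;
}

/*
 * guard_on_supported = true models the behaviour described in the report:
 * cleanup is skipped when the sync message never arrived.
 */
static void peer_lost_contact(struct bclink *bcl, struct peer *p,
                              bool guard_on_supported)
{
    if (guard_on_supported && !p->bclink_supported)
        return;                 /* peer stays counted: the map is now wrong */

    if (p->in_bcast_map) {
        p->in_bcast_map = false;
        bcl->node_count--;
    }
    p->bclink_supported = false;
}
```

If the peer disappears before `peer_sync_received()` is ever called, the guarded variant leaves `node_count` at 1, so a sender waiting for acks from every counted node would wait for a node that is gone.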
|
From: jason <huz...@gm...> - 2013-03-06 03:41:56
|
Hi All,
Let's say there are two nodes in the cluster, node A and node B. It is
possible that A has opened bcast receiving for B while B has not yet
opened bcast receiving for A. B has therefore not synced its last_in to
what A has sent, so every message sent by B will carry an invalid bcast
acked sequence number for A. Because A has opened bcast receiving for B,
it will process those invalid acks from B in tipc_bclink_acknowledge().
I think this may cause a problem. Please consider this.
|
|
From: Ying X. <yin...@wi...> - 2013-03-06 06:27:57
|
On 03/06/2013 11:41 AM, jason wrote:
> Hi All,
> Let's say there are to nodes in cluster ,nodeA and nodeB. There is
> possibility that A has opened bcast receiving for B while B hasen't
> opened bcast receiving for A. Therefore, B hasen't sync its last_in to
> what A has sent ,then every messages sent by B will carry a invalid
> bcast acked seq number for A. Because A has open bcast receiving for B,
> it will process those invalid acks from B in tipc_bclink_acknowlegde().
> It seems may cause problem I think. Please consider this.

No, it's possible that you do not completely understand the root cause.
Of course, I admit it is hard to know every detail clearly. At least,
after a while, I had almost forgotten why it happens and what the
reason is.

The key reason is that TIPC does not properly cope with the
synchronization problem between the unicast link and the multicast
link. Even if a unicast link is set up by sending link state messages
via the unicast channel, the link states on the two endpoints are not
synchronized immediately, because this is a distributed environment.

For example, suppose there are two nodes: one sender of multicast
messages and one receiver. Suddenly a new node joins the cluster as
another multicast receiver. Because the link state between the new
receiver and the sender is not synchronized in time, the sender may
still think there is only one receiver, although the new receiver has
actually started to receive the multicast messages sent by the sender.
That means that, while the link state is inconsistent, the sender can
release a message from its outbound queue as soon as it receives an ack
from just one of the two receivers. Normally this is not a big problem.
But if one receiver finds that a message is missing from a sequence of
received packets, it sends a retransmission request asking the sender to
send the missing packet again. The missing packet has already been
released by the sender, because the sender already received an ack for
it from the other receiver. The sender therefore can never send the
missing packet, yet the receiver must receive it. So a deadlock happens.
|
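The scenario above can be reproduced in a few lines. The sketch below is a toy model under stated assumptions, not TIPC code: the sender keeps each broadcast packet until it has seen as many acks as it believes it has receivers, and every name in it is hypothetical.

```c
#include <stdbool.h>

#define BC_QUEUE_LEN 8

/* Hypothetical sender-side state, not the kernel's data structures. */
struct bc_sender {
    int  known_receivers;             /* receivers the sender believes it has */
    bool buffered[BC_QUEUE_LEN];      /* packet still held for retransmission */
    int  acks_pending[BC_QUEUE_LEN];  /* acks still expected for that packet */
};

/* Queue a broadcast packet and remember how many acks it needs. */
static void bc_send(struct bc_sender *s, unsigned seqno)
{
    unsigned slot = seqno % BC_QUEUE_LEN;

    s->buffered[slot] = true;
    s->acks_pending[slot] = s->known_receivers;
}

/* One receiver acknowledged seqno; release the packet when none are pending. */
static void bc_ack(struct bc_sender *s, unsigned seqno)
{
    unsigned slot = seqno % BC_QUEUE_LEN;

    if (s->buffered[slot] && --s->acks_pending[slot] <= 0)
        s->buffered[slot] = false;
}

/* A retransmission request can only be served while the packet is buffered. */
static bool bc_can_retransmit(const struct bc_sender *s, unsigned seqno)
{
    return s->buffered[seqno % BC_QUEUE_LEN];
}
```

With `known_receivers` stuck at 1 while two nodes are actually listening, a single ack releases the packet and a later retransmission request from the second node cannot be served. With `known_receivers` at 2, the packet survives the first ack.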
|
From: jason <huz...@gm...> - 2013-03-06 06:48:20
|
Hi Ying,
I think I understand the root cause, and the example you give is clear
to me. But maybe I did not make it clear that my example is based on
your new bcast sync mechanism, and that the problem I found belongs to
your new bcast mechanism too.
|
|
From: jason <huz...@gm...> - 2013-03-06 07:13:13
|
Hi Ying,
Another question about your new bcast sync mechanism:
In the design of the new mechanism, n_ptr->bclink.supported = 0 only
prevents receiving bcast messages from the peer, so it has nothing to do
with bcast sending by the local node. I therefore think it should not
prevent us from calling tipc_bclink_acknowledge(). There are three
places in the tipc code that call tipc_bclink_acknowledge(), listed
below:
1) when tipc_bclink_recv_pkt() receives a nack.
2) in tipc_recv_msg().
3) in node_lost_contact()
I saw that your patch only removes the check of n_ptr->bclink.supported
in node_lost_contact(), so what about the other two cases?
|
|
From: Jon M. <jon...@er...> - 2013-03-07 22:22:11
|
On 03/06/2013 01:27 AM, Ying Xue wrote:
> On 03/06/2013 11:41 AM, jason wrote:
>> Hi All,
>> Let's say there are to nodes in cluster ,nodeA and nodeB. There is
>> possibility that A has opened bcast receiving for B while B hasen't
>> opened bcast receiving for A. Therefore, B hasen't sync its last_in to
>> what A has sent ,then every messages sent by B will carry a invalid
>> bcast acked seq number for A.

>> Because A has open bcast receiving for B,
>> it will process those invalid acks from B in tipc_bclink_acknowlegde().

No it won't.
The broadcasts themselves don't carry valid acknowledges, since there is
no single node to acknowledge.
Unicasts B -> A will carry acknowledges, but those will be ignored by A
because they will be lower than the lowest acknowledge value A can accept
from B. A knows that value; it is A's "next_out_no" value at the moment
it opened for reception (and sent its own BCAST_SYNC message).

Regards
///jon

>> It seems may cause problem I think. Please consider this.
|
|
From: jason <huz...@gm...> - 2013-03-07 23:27:28
|
Hi Jon,
Node A just updates B.acked to its next_out_no when it finds out that
node B is up. But nothing seems to guarantee that the acknowledge B
carries is lower than that. It may be a random number that is
occasionally greater.
|
|
From: jason <huz...@gm...> - 2013-03-08 01:03:56
|
Hi Jon,

Just a quick resend of my previous mail (to make my point clear).

Node A just updates B.acked to its next_out_no when it finds out that
node B is up (node A calls node_established_contact()). But nothing
seems to guarantee that the acknowledge B carries is lower than that. It
may be a random number that is occasionally greater, until B finally
initializes its last_in when it gets the BCAST_SYNC message from A (B
calls tipc_bclink_info_recv()).
|
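The two positions in this exchange can be compared with a small sketch. It assumes 16-bit wrap-around sequence numbers and two hypothetical acceptance checks. Neither function is taken from the TIPC sources; the sketch only shows that a pure lower-bound test lets through any value that happens to land "ahead" of the floor, while a test that also bounds the ack by what has actually been sent does not.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a comes before or equals b, assuming 16-bit wrap-around seqnos. */
static bool seq_less_eq(uint16_t a, uint16_t b)
{
    return (uint16_t)(b - a) < 0x8000u;
}

/* True if a comes strictly before b. */
static bool seq_less(uint16_t a, uint16_t b)
{
    return a != b && seq_less_eq(a, b);
}

/*
 * Lower-bound check only: an ack is taken seriously when it is ahead of
 * the floor recorded for this peer (for example next_out_no at the time
 * reception was opened).
 */
static bool ack_above_floor(uint16_t floor, uint16_t ack)
{
    return seq_less(floor, ack);
}

/* Stricter check: the ack must also not exceed the last seqno really sent. */
static bool ack_in_window(uint16_t floor, uint16_t last_sent, uint16_t ack)
{
    return seq_less(floor, ack) && seq_less_eq(ack, last_sent);
}
```

With a floor of 100, an ack of 90 is ignored by both checks. An uninitialized value such as 5000 passes `ack_above_floor()` but fails `ack_in_window()` when only packets up to 105 have been sent.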
|
From: jason <huz...@gm...> - 2013-03-08 04:16:06
|
Hi Jon,
And my solution to the "random acknowledge number" problem is simply
reverting the following piece of the patch for the new bcast sync mechanism:

@@ -2058,9 +2059,11 @@ static void link_recv_proto_msg(struct tipc_link *l_ptr, struct sk_buff *buf)
 		l_ptr->max_pkt = l_ptr->max_pkt_target;
 	}
 	l_ptr->owner->bclink.supportable = (max_pkt_info != 0);
+	l_ptr->owner->bclink.sync = msg_bclink_sync(msg);

 	/* Synchronize broadcast link info, if not done previously */
-	if (!tipc_node_is_up(l_ptr->owner)) {
+	if (!tipc_node_is_up(l_ptr->owner) &&
+	    !l_ptr->owner->bclink.sync) {
 		l_ptr->owner->bclink.last_sent =
 			l_ptr->owner->bclink.last_in = msg_last_bcast(msg);

That way, last_in is set to a valid value from the peer as early as
possible, so it never remains an invalid value.
|
|
From: Jon M. <jon...@er...> - 2013-03-08 15:14:18
|
On 03/07/2013 11:15 PM, jason wrote:
> Hi Jon,
>
> And my solution to the "random acknowledge number" problem is simply
> reverting the following piece of the patch for the new bcast sync mechanism:
>
> @@ -2058,9 +2059,11 @@ static void link_recv_proto_msg(struct tipc_link *l_ptr, struct sk_buff *buf)
>  		l_ptr->max_pkt = l_ptr->max_pkt_target;
>  	}
>  	l_ptr->owner->bclink.supportable = (max_pkt_info != 0);
> +	l_ptr->owner->bclink.sync = msg_bclink_sync(msg);
>
>  	/* Synchronize broadcast link info, if not done previously */
> -	if (!tipc_node_is_up(l_ptr->owner)) {
> +	if (!tipc_node_is_up(l_ptr->owner) &&
> +	    !l_ptr->owner->bclink.sync) {
>  		l_ptr->owner->bclink.last_sent =
>  			l_ptr->owner->bclink.last_in = msg_last_bcast(msg);

Hi jason,

You are clearly not looking at the current code version at netdev.
There are no "supported" or "sync" flags in that version.

The scenario you have in mind really goes like this:

0: Node A has next_out_no N, Node B has next_out_no M.

1: Node A sends an ACTIVATE message to B, where last_bcast = N - 1,
   effectively telling B to start acking broadcast packets from that
   sequence number.

2: Node B receives the ACTIVATE, activates its link endpoint, and
   sets next_in = N - 1.
   Note that B is not open for receiving broadcasts yet, and does not
   know where to start receiving. It only knows which bcast_ack value
   to send in its unicasts.

(2.5: Nodes A and B may now send out more broadcasts, but A doesn't
   expect any acknowledges from B, or vice versa, since neither node is
   in the peer's broadcast receiver list.
   Below, I'll assume that no broadcasts are sent, since this doesn't
   affect the reasoning.)

3: Node B sends a BCAST_SYNC as *first* unicast message to A. This
   message contains bcast_ack = N - 1 and last_bcast = M - 1,
   meaning that A should start receiving from M.

4: Node A receives the BCAST_SYNC message and opens up to start
   receiving from packet M from node B, setting last_in = M - 1.
   The bcast_ack value from B is ignored, since it is lower than N.

We are now in the situation you describe in your example.
As you see, all values are strictly defined and valid.

5: Node B goes on sending NAME_DISTR messages after the BCAST_SYNC
   was sent. Those will all contain bcast_ack = N - 1, since B cannot
   have received any broadcast messages until it has received a
   BCAST_SYNC from A. All those acks will be ignored by A, as described
   above.

6: Node B may now also send broadcasts M, M + 1 etc. to A, which A
   will receive and deliver, but A will send no acks before its own
   BCAST_SYNC has been sent out.

7: Node A sends a BCAST_SYNC as first unicast message to B. It carries
   bcast_ack = M - 1 (or M + x if some broadcasts were received).
   This is now a valid ack value. last_bcast = N - 1 (disregarding that
   A may already have sent N, N + 1 etc.)

8: Node B receives the BCAST_SYNC, finds it can (and must) start
   receiving from N, and opens up for reception. If A has already
   sent N + 1, N + 2 etc., Node B will either have them in its
   deferred queue, or it will have to ask for retransmission.

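
The ack filtering in steps 3 to 5 and the opening of reception in steps 4 and 8 can be sketched roughly as below. This is only an illustration under stated assumptions, not the TIPC source: the struct, the function names and the 16-bit wraparound comparison are all hypothetical, and the sketch assumes the peer's "acked" mark starts at the sender's own next_out_no - 1, so that an ack is accepted only if it is strictly newer than that mark.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-peer broadcast state; names are illustrative only. */
struct peer_state {
	uint16_t acked;          /* newest own broadcast this peer has acked */
	uint16_t last_in;        /* last broadcast accepted from this peer */
	bool     recv_permitted; /* true once the peer's BCAST_SYNC arrived */
};

/* True if sequence number a is strictly newer than b, modulo 2^16. */
static inline bool seq_after(uint16_t a, uint16_t b)
{
	uint16_t d = (uint16_t)(a - b);

	return d != 0 && d < 0x8000;
}

/* Link to the peer came up: nothing sent from now on can be acked yet. */
static inline void peer_contact_established(struct peer_state *p,
					     uint16_t own_next_out_no)
{
	p->acked = (uint16_t)(own_next_out_no - 1);
	p->recv_permitted = false;
}

/* Steps 4 and 5: an ack that is not newer than the mark is ignored. */
static inline bool peer_handle_ack(struct peer_state *p, uint16_t bcast_ack)
{
	if (!seq_after(bcast_ack, p->acked))
		return false;
	p->acked = bcast_ack;
	return true;
}

/* Steps 4 and 8: BCAST_SYNC says where to start receiving. */
static inline void peer_handle_bcast_sync(struct peer_state *p,
					   uint16_t last_bcast)
{
	p->last_in = last_bcast;
	p->recv_permitted = true;
}
```

With N = 100, node A would call peer_contact_established(&b, 100); a later bcast_ack of 99 from B is then dropped, while 100 is accepted. Note that a modulo comparison like this cannot reject a truly arbitrary ack value, which is why it matters that B's bcast_ack is well defined from step 2 onwards.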
I hope this answers your concern.
Regards
///jon
|
|
From: jason <huz...@gm...> - 2013-03-08 16:06:25
|
Hi Jon,

Thank you very much for the detailed explanation. It is step 2 that
answered my concern. In step 2 of your scenario, if node B did not set
next_in = N - 1, then B would have the invalid bcast_ack issue that I
described.

I am currently using tipc-1.7.7, and I assumed that Ying's patch at the
beginning of this thread is the final version for tipc-1.7.7 to introduce
this new bcast sync mechanism, so all my concerns are based on tipc-1.7.7
plus this patch. Do you plan to update 1.7.7 to include the final version
of this new bcast sync mechanism in the future? Or do I have to switch to
the netdev version to get it?
|
|
From: jason <huz...@gm...> - 2013-03-08 18:10:59
|
On 2013-3-8 at 11:14 PM, "Jon Maloy" <jon...@er...> wrote:
> 5: Node B goes on sending NAME_DISTR messages after the BCAST_SYNC
>    was sent. Those will all contain bcast_ack = N -1, since B cannot
>    have received any broadcast messages until it has received its own
>    BCAST_SYNC. All those acks will be ignored by A, as described above.
>
> 6: Node B may now also send broadcasts M, M + 1 etc. to A, which A
>    will receive and deliver, but A will send no acks before its own
>    BCAST_SYNC has been sent out.
>

Here I would like to ask a question. Assume node A spends a lot of time
before sending its own BCAST_SYNC out (not in tipc-2.0, but say in
tipc-1.7.7, which sends the name table over the unicast link before sending
BCAST_SYNC). There is then a possibility that node B's bcast out queue
window becomes full if B broadcasts many messages during that time, because
B ignores any bcast_ack from A until B has got the BCAST_SYNC from A. Right?

In my opinion, B should not ignore bcast_ack from A before B has got the
BCAST_SYNC from A. More generally, getting a BCAST_SYNC from the peer
(setting recv_permitted = true) is just a signal for opening up our
receiving; it should not prevent us from processing the peer's acknowledges,
which is a matter of our sending. This is not an issue for tipc-2.0, which
does not have name table unicasting between link up and sending BCAST_SYNC,
but it is much more important for tipc-1.7.7. Please consider it. Thank you!
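
The concern can be made concrete with a small model. The sketch below is an assumption-laden illustration, not the tipc-1.7.7 code: the struct, the function names, the single-peer simplification and the window size of 20 are all hypothetical. It only shows that if acks are dropped until recv_permitted is set, a sender stalls after one window of broadcasts, whereas processing acks independently of recv_permitted keeps the window moving.

```c
#include <stdbool.h>
#include <stdint.h>

#define BCAST_WINDOW 20 /* illustrative window size, not TIPC's value */

/* Hypothetical broadcast sender with a single peer. */
struct bcast_sender {
	uint16_t next_out_no;    /* next sequence number to send */
	uint16_t peer_acked;     /* newest sequence number the peer has acked */
	bool     recv_permitted; /* peer's BCAST_SYNC has been received */
	bool     gate_acks;      /* true: drop acks until recv_permitted */
};

/* Number of sent but not yet acknowledged broadcasts. */
static inline unsigned outstanding(const struct bcast_sender *s)
{
	return (uint16_t)(s->next_out_no - 1 - s->peer_acked);
}

/* Try to send one broadcast; fails when the send window is full. */
static inline bool bcast_send(struct bcast_sender *s)
{
	if (outstanding(s) >= BCAST_WINDOW)
		return false;
	s->next_out_no++;
	return true;
}

/* Process an ack from the peer, optionally gated on recv_permitted. */
static inline void bcast_ack(struct bcast_sender *s, uint16_t ack)
{
	uint16_t d;

	if (s->gate_acks && !s->recv_permitted)
		return; /* the behaviour questioned above */
	d = (uint16_t)(ack - s->peer_acked);
	if (d != 0 && d < 0x8000)
		s->peer_acked = ack;
}
```

In this model a gated sender that never receives BCAST_SYNC gets exactly BCAST_WINDOW broadcasts out before bcast_send() starts failing, even if the peer acknowledges every one of them. Whether the real code gates acks this way is exactly the question being asked here.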
|