From: Tuong T. L. <tuo...@de...> - 2020-04-06 09:03:08
|
Hi Ying,

Basically, when the dump is performed by the iproute2/tipc tool, the tool first asks the kernel to dump everything and then filters for specific links according to the command options. Please see the other patch on iproute2/tipc for more details:
[iproute2-next] tipc: enable printing of broadcast rcv link stats

So, for this patch, we just introduce a flag that lets the user dump the broadcast-receiver link stats (in addition to the traditional link stats) when needed. That is the 'TIPC_NL_LINK_GET'/'TIPC_NLA_LINK_BROADCAST' flag mentioned in the commit message. In iproute2/tipc, it will be:

> + /* Set the flag to dump all bc links */
> + attrs = mnl_attr_nest_start(nlh, TIPC_NLA_LINK);
> + mnl_attr_put(nlh, TIPC_NLA_LINK_BROADCAST, 0, NULL);
> + mnl_attr_nest_end(nlh, attrs);

BR/Tuong

-----Original Message-----
From: Xue, Ying <Yin...@wi...>
Sent: Monday, April 6, 2020 1:45 PM
To: Tuong Tong Lien <tuo...@de...>; jm...@re...; ma...@do...; tip...@li...
Cc: tipc-dek <tip...@de...>
Subject: RE: [PATCH RFC 4/4] tipc: add support for broadcast rcv stats dumping

Just a minor comment: Please define macros for the cases:
1. Dump broadcast-link & unicast links
2. Dump broadcast-receiver links

Thanks,
Ying

-----Original Message-----
From: Tuong Lien [mailto:tuo...@de...]
Sent: Saturday, March 28, 2020 12:03 PM
To: jm...@re...; ma...@do...; Xue, Ying; tip...@li...
Cc: tip...@de...
Subject: [PATCH RFC 4/4] tipc: add support for broadcast rcv stats dumping

This commit enables dumping the statistics of a broadcast-receiver link like the traditional 'broadcast-link' one (which is for the broadcast-sender). The link dumping can be triggered via netlink (e.g. the iproute2/tipc tool) with the link flag 'TIPC_NLA_LINK_BROADCAST' as the indicator. The name of a broadcast-receiver link of a specific peer will be in the format: 'broadcast-link:<peer-id>'.
For example: Link <broadcast-link:1001002> Window:50 packets RX packets:7841 fragments:2408/440 bundles:0/0 TX packets:0 fragments:0/0 bundles:0/0 RX naks:0 defs:124 dups:0 TX naks:21 acks:0 retrans:0 Congestion link:0 Send queue max:0 avg:0 In addition, the broadcast-receiver link statistics can be reset in the usual way via netlink by specifying that link name in command. Note: the 'tipc_link_name_ext()' is removed because the link name can now be retrieved simply via the 'l->name'. Signed-off-by: Tuong Lien <tuo...@de...> --- net/tipc/bcast.c | 6 ++--- net/tipc/bcast.h | 5 +++-- net/tipc/link.c | 65 +++++++++++++++++++++++++++--------------------------- net/tipc/link.h | 3 +-- net/tipc/msg.c | 9 ++++---- net/tipc/msg.h | 2 +- net/tipc/netlink.c | 2 +- net/tipc/node.c | 63 +++++++++++++++++++++++++++++++++++++++++++++------- net/tipc/trace.h | 4 ++-- 9 files changed, 103 insertions(+), 56 deletions(-) diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c index 50a16f8bebd9..383f87bc1061 100644 --- a/net/tipc/bcast.c +++ b/net/tipc/bcast.c @@ -563,10 +563,8 @@ void tipc_bcast_remove_peer(struct net *net, struct tipc_link *rcv_l) tipc_sk_rcv(net, inputq); } -int tipc_bclink_reset_stats(struct net *net) +int tipc_bclink_reset_stats(struct net *net, struct tipc_link *l) { - struct tipc_link *l = tipc_bc_sndlink(net); - if (!l) return -ENOPROTOOPT; @@ -694,7 +692,7 @@ int tipc_bcast_init(struct net *net) tn->bcbase = bb; spin_lock_init(&tipc_net(net)->bclock); - if (!tipc_link_bc_create(net, 0, 0, + if (!tipc_link_bc_create(net, 0, 0, NULL, FB_MTU, BCLINK_WIN_DEFAULT, BCLINK_WIN_DEFAULT, diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h index 97d3cf9d3e4d..4240c95188b1 100644 --- a/net/tipc/bcast.h +++ b/net/tipc/bcast.h @@ -96,9 +96,10 @@ void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l, int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l, struct tipc_msg *hdr, struct sk_buff_head *retrq); -int tipc_nl_add_bc_link(struct net *net, struct tipc_nl_msg 
*msg); +int tipc_nl_add_bc_link(struct net *net, struct tipc_nl_msg *msg, + struct tipc_link *bcl); int tipc_nl_bc_link_set(struct net *net, struct nlattr *attrs[]); -int tipc_bclink_reset_stats(struct net *net); +int tipc_bclink_reset_stats(struct net *net, struct tipc_link *l); u32 tipc_bcast_get_broadcast_mode(struct net *net); u32 tipc_bcast_get_broadcast_ratio(struct net *net); diff --git a/net/tipc/link.c b/net/tipc/link.c index 3071e46f029a..808d3a76c27f 100644 --- a/net/tipc/link.c +++ b/net/tipc/link.c @@ -539,7 +539,7 @@ bool tipc_link_create(struct net *net, char *if_name, int bearer_id, * * Returns true if link was created, otherwise false */ -bool tipc_link_bc_create(struct net *net, u32 ownnode, u32 peer, +bool tipc_link_bc_create(struct net *net, u32 ownnode, u32 peer, u8 *peer_id, int mtu, u32 min_win, u32 max_win, u16 peer_caps, struct sk_buff_head *inputq, struct sk_buff_head *namedq, @@ -554,7 +554,18 @@ bool tipc_link_bc_create(struct net *net, u32 ownnode, u32 peer, return false; l = *link; - strcpy(l->name, tipc_bclink_name); + if (peer_id) { + char peer_str[NODE_ID_STR_LEN] = {0,}; + + tipc_nodeid2string(peer_str, peer_id); + if (strlen(peer_str) > 16) + sprintf(peer_str, "%x", peer); + /* Broadcast receiver link name: "broadcast-link:<peer>" */ + snprintf(l->name, sizeof(l->name), "%s:%s", tipc_bclink_name, + peer_str); + } else { + strcpy(l->name, tipc_bclink_name); + } trace_tipc_link_reset(l, TIPC_DUMP_ALL, "bclink created!"); tipc_link_reset(l); l->state = LINK_RESET; @@ -1412,11 +1423,8 @@ static u8 __tipc_build_gap_ack_blks(struct tipc_gap_ack_blks *ga, gacks[n].ack = htons(expect - 1); gacks[n].gap = htons(seqno - expect); if (++n >= MAX_GAP_ACK_BLKS / 2) { - char buf[TIPC_MAX_LINK_NAME]; - pr_info_ratelimited("Gacks on %s: %d, ql: %d!\n", - tipc_link_name_ext(l, buf), - n, + l->name, n, skb_queue_len(&l->deferdq)); return n; } @@ -1600,6 +1608,8 @@ static int tipc_link_advance_transmq(struct tipc_link *l, struct tipc_link *r, 
_skb->priority = TC_PRIO_CONTROL; __skb_queue_tail(xmitq, _skb); l->stats.retransmitted++; + if (!is_uc) + r->stats.retransmitted++; *retransmitted = true; /* Increase actual retrans counter & mark first time */ if (!TIPC_SKB_CB(skb)->retr_cnt++) @@ -1766,7 +1776,8 @@ int tipc_link_rcv(struct tipc_link *l, struct sk_buff *skb, /* Defer delivery if sequence gap */ if (unlikely(seqno != rcv_nxt)) { - __tipc_skb_queue_sorted(defq, seqno, skb); + if (!__tipc_skb_queue_sorted(defq, seqno, skb)) + l->stats.duplicates++; rc |= tipc_link_build_nack_msg(l, xmitq); break; } @@ -1800,15 +1811,15 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe, int tolerance, int priority, struct sk_buff_head *xmitq) { + struct tipc_mon_state *mstate = &l->mon_state; + struct sk_buff_head *dfq = &l->deferdq; struct tipc_link *bcl = l->bc_rcvlink; - struct sk_buff *skb; struct tipc_msg *hdr; - struct sk_buff_head *dfq = &l->deferdq; + struct sk_buff *skb; bool node_up = link_is_up(bcl); - struct tipc_mon_state *mstate = &l->mon_state; + u16 glen = 0, bc_rcvgap = 0; int dlen = 0; void *data; - u16 glen = 0; /* Don't send protocol message during reset or link failover */ if (tipc_link_is_blocked(l)) @@ -1846,7 +1857,8 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe, if (l->peer_caps & TIPC_LINK_PROTO_SEQNO) msg_set_seqno(hdr, l->snd_nxt_state++); msg_set_seq_gap(hdr, rcvgap); - msg_set_bc_gap(hdr, link_bc_rcv_gap(bcl)); + bc_rcvgap = link_bc_rcv_gap(bcl); + msg_set_bc_gap(hdr, bc_rcvgap); msg_set_probe(hdr, probe); msg_set_is_keepalive(hdr, probe || probe_reply); if (l->peer_caps & TIPC_GAP_ACK_BLOCK) @@ -1871,6 +1883,8 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe, l->stats.sent_probes++; if (rcvgap) l->stats.sent_nacks++; + if (bc_rcvgap) + bcl->stats.sent_nacks++; skb->priority = TC_PRIO_CONTROL; __skb_queue_tail(xmitq, skb); trace_tipc_proto_build(skb, false, l->name); @@ -2371,8 +2385,6 
@@ int tipc_link_bc_sync_rcv(struct tipc_link *l, struct tipc_msg *hdr, if (!l->bc_peer_is_up) return rc; - l->stats.recv_nacks++; - /* Ignore if peers_snd_nxt goes beyond receive window */ if (more(peers_snd_nxt, l->rcv_nxt + l->window)) return rc; @@ -2423,6 +2435,11 @@ int tipc_link_bc_ack_rcv(struct tipc_link *r, u16 acked, u16 gap, if (!link_is_up(r) || !r->bc_peer_is_up) return 0; + if (gap) { + l->stats.recv_nacks++; + r->stats.recv_nacks++; + } + if (less(acked, r->acked) || (acked == r->acked && !gap && !ga)) return 0; @@ -2734,16 +2751,15 @@ static int __tipc_nl_add_bc_link_stat(struct sk_buff *skb, return -EMSGSIZE; } -int tipc_nl_add_bc_link(struct net *net, struct tipc_nl_msg *msg) +int tipc_nl_add_bc_link(struct net *net, struct tipc_nl_msg *msg, + struct tipc_link *bcl) { int err; void *hdr; struct nlattr *attrs; struct nlattr *prop; - struct tipc_net *tn = net_generic(net, tipc_net_id); u32 bc_mode = tipc_bcast_get_broadcast_mode(net); u32 bc_ratio = tipc_bcast_get_broadcast_ratio(net); - struct tipc_link *bcl = tn->bcl; if (!bcl) return 0; @@ -2830,21 +2846,6 @@ void tipc_link_set_abort_limit(struct tipc_link *l, u32 limit) l->abort_limit = limit; } -char *tipc_link_name_ext(struct tipc_link *l, char *buf) -{ - if (!l) - scnprintf(buf, TIPC_MAX_LINK_NAME, "null"); - else if (link_is_bc_sndlink(l)) - scnprintf(buf, TIPC_MAX_LINK_NAME, "broadcast-sender"); - else if (link_is_bc_rcvlink(l)) - scnprintf(buf, TIPC_MAX_LINK_NAME, - "broadcast-receiver, peer %x", l->addr); - else - memcpy(buf, l->name, TIPC_MAX_LINK_NAME); - - return buf; -} - /** * tipc_link_dump - dump TIPC link data * @l: tipc link to be dumped diff --git a/net/tipc/link.h b/net/tipc/link.h index 4d0768cf91d5..fc07232c9a12 100644 --- a/net/tipc/link.h +++ b/net/tipc/link.h @@ -80,7 +80,7 @@ bool tipc_link_create(struct net *net, char *if_name, int bearer_id, struct sk_buff_head *inputq, struct sk_buff_head *namedq, struct tipc_link **link); -bool tipc_link_bc_create(struct net *net, 
u32 ownnode, u32 peer, +bool tipc_link_bc_create(struct net *net, u32 ownnode, u32 peer, u8 *peer_id, int mtu, u32 min_win, u32 max_win, u16 peer_caps, struct sk_buff_head *inputq, struct sk_buff_head *namedq, @@ -111,7 +111,6 @@ u16 tipc_link_rcv_nxt(struct tipc_link *l); u16 tipc_link_acked(struct tipc_link *l); u32 tipc_link_id(struct tipc_link *l); char *tipc_link_name(struct tipc_link *l); -char *tipc_link_name_ext(struct tipc_link *l, char *buf); u32 tipc_link_state(struct tipc_link *l); char tipc_link_plane(struct tipc_link *l); int tipc_link_prio(struct tipc_link *l); diff --git a/net/tipc/msg.c b/net/tipc/msg.c index 0d515d20b056..69d68512300a 100644 --- a/net/tipc/msg.c +++ b/net/tipc/msg.c @@ -828,19 +828,19 @@ bool tipc_msg_pskb_copy(u32 dst, struct sk_buff_head *msg, * @seqno: sequence number of buffer to add * @skb: buffer to add */ -void __tipc_skb_queue_sorted(struct sk_buff_head *list, u16 seqno, +bool __tipc_skb_queue_sorted(struct sk_buff_head *list, u16 seqno, struct sk_buff *skb) { struct sk_buff *_skb, *tmp; if (skb_queue_empty(list) || less(seqno, buf_seqno(skb_peek(list)))) { __skb_queue_head(list, skb); - return; + return true; } if (more(seqno, buf_seqno(skb_peek_tail(list)))) { __skb_queue_tail(list, skb); - return; + return true; } skb_queue_walk_safe(list, _skb, tmp) { @@ -849,9 +849,10 @@ void __tipc_skb_queue_sorted(struct sk_buff_head *list, u16 seqno, if (seqno == buf_seqno(_skb)) break; __skb_queue_before(list, _skb, skb); - return; + return true; } kfree_skb(skb); + return false; } void tipc_skb_reject(struct net *net, int err, struct sk_buff *skb, diff --git a/net/tipc/msg.h b/net/tipc/msg.h index 9a38f9c9d6eb..87e2d472f75f 100644 --- a/net/tipc/msg.h +++ b/net/tipc/msg.h @@ -1127,7 +1127,7 @@ bool tipc_msg_assemble(struct sk_buff_head *list); bool tipc_msg_reassemble(struct sk_buff_head *list, struct sk_buff_head *rcvq); bool tipc_msg_pskb_copy(u32 dst, struct sk_buff_head *msg, struct sk_buff_head *cpy); -void 
__tipc_skb_queue_sorted(struct sk_buff_head *list, u16 seqno, +bool __tipc_skb_queue_sorted(struct sk_buff_head *list, u16 seqno, struct sk_buff *skb); bool tipc_msg_skb_clone(struct sk_buff_head *msg, struct sk_buff_head *cpy); diff --git a/net/tipc/netlink.c b/net/tipc/netlink.c index 7c35094c20b8..8dfad18330bc 100644 --- a/net/tipc/netlink.c +++ b/net/tipc/netlink.c @@ -187,7 +187,7 @@ static const struct genl_ops tipc_genl_v2_ops[] = { }, { .cmd = TIPC_NL_LINK_GET, - .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, + .validate = GENL_DONT_VALIDATE_STRICT, .doit = tipc_nl_node_get_link, .dumpit = tipc_nl_node_dump_link, }, diff --git a/net/tipc/node.c b/net/tipc/node.c index 917ad3920fac..373d07ae6730 100644 --- a/net/tipc/node.c +++ b/net/tipc/node.c @@ -1138,7 +1138,7 @@ void tipc_node_check_dest(struct net *net, u32 addr, if (unlikely(!n->bc_entry.link)) { snd_l = tipc_bc_sndlink(net); if (!tipc_link_bc_create(net, tipc_own_addr(net), - addr, U16_MAX, + addr, peer_id, U16_MAX, tipc_link_min_win(snd_l), tipc_link_max_win(snd_l), n->capabilities, @@ -2432,7 +2432,7 @@ int tipc_nl_node_get_link(struct sk_buff *skb, struct genl_info *info) return -ENOMEM; if (strcmp(name, tipc_bclink_name) == 0) { - err = tipc_nl_add_bc_link(net, &msg); + err = tipc_nl_add_bc_link(net, &msg, tipc_net(net)->bcl); if (err) goto err_free; } else { @@ -2476,6 +2476,7 @@ int tipc_nl_node_reset_link_stats(struct sk_buff *skb, struct genl_info *info) struct tipc_node *node; struct nlattr *attrs[TIPC_NLA_LINK_MAX + 1]; struct net *net = sock_net(skb->sk); + struct tipc_net *tn = tipc_net(net); struct tipc_link_entry *le; if (!info->attrs[TIPC_NLA_LINK]) @@ -2492,11 +2493,26 @@ int tipc_nl_node_reset_link_stats(struct sk_buff *skb, struct genl_info *info) link_name = nla_data(attrs[TIPC_NLA_LINK_NAME]); - if (strcmp(link_name, tipc_bclink_name) == 0) { - err = tipc_bclink_reset_stats(net); + err = -EINVAL; + if (!strcmp(link_name, tipc_bclink_name)) { + err = 
tipc_bclink_reset_stats(net, tipc_bc_sndlink(net)); if (err) return err; return 0; + } else if (strstr(link_name, tipc_bclink_name)) { + rcu_read_lock(); + list_for_each_entry_rcu(node, &tn->node_list, list) { + tipc_node_read_lock(node); + link = node->bc_entry.link; + if (link && !strcmp(link_name, tipc_link_name(link))) { + err = tipc_bclink_reset_stats(net, link); + tipc_node_read_unlock(node); + break; + } + tipc_node_read_unlock(node); + } + rcu_read_unlock(); + return err; } node = tipc_node_find_by_name(net, link_name, &bearer_id); @@ -2520,7 +2536,8 @@ int tipc_nl_node_reset_link_stats(struct sk_buff *skb, struct genl_info *info) /* Caller should hold node lock */ static int __tipc_nl_add_node_links(struct net *net, struct tipc_nl_msg *msg, - struct tipc_node *node, u32 *prev_link) + struct tipc_node *node, u32 *prev_link, + u32 type) { u32 i; int err; @@ -2536,6 +2553,14 @@ static int __tipc_nl_add_node_links(struct net *net, struct tipc_nl_msg *msg, if (err) return err; } + + if (type == 2) { + *prev_link = 3; + err = tipc_nl_add_bc_link(net, msg, node->bc_entry.link); + if (err) + return err; + } + *prev_link = 0; return 0; @@ -2544,17 +2569,38 @@ static int __tipc_nl_add_node_links(struct net *net, struct tipc_nl_msg *msg, int tipc_nl_node_dump_link(struct sk_buff *skb, struct netlink_callback *cb) { struct net *net = sock_net(skb->sk); + struct nlattr **attrs = genl_dumpit_info(cb)->attrs; + struct nlattr *link[TIPC_NLA_LINK_MAX + 1]; struct tipc_net *tn = net_generic(net, tipc_net_id); struct tipc_node *node; struct tipc_nl_msg msg; u32 prev_node = cb->args[0]; u32 prev_link = cb->args[1]; int done = cb->args[2]; + u32 type = cb->args[3]; int err; if (done) return 0; + if (!type) { + /* Dump broadcast-link & unicast links */ + type = 1; + if (attrs && attrs[TIPC_NLA_LINK]) { + err = nla_parse_nested_deprecated(link, + TIPC_NLA_LINK_MAX, + attrs[TIPC_NLA_LINK], + tipc_nl_link_policy, + NULL); + if (unlikely(err)) + return err; + if 
(unlikely(!link[TIPC_NLA_LINK_BROADCAST])) + return -EINVAL; + /* Dump broadcast-receiver links as well */ + type = 2; + } + } + msg.skb = skb; msg.portid = NETLINK_CB(cb->skb).portid; msg.seq = cb->nlh->nlmsg_seq; @@ -2578,7 +2624,7 @@ int tipc_nl_node_dump_link(struct sk_buff *skb, struct netlink_callback *cb) list) { tipc_node_read_lock(node); err = __tipc_nl_add_node_links(net, &msg, node, - &prev_link); + &prev_link, type); tipc_node_read_unlock(node); if (err) goto out; @@ -2586,14 +2632,14 @@ int tipc_nl_node_dump_link(struct sk_buff *skb, struct netlink_callback *cb) prev_node = node->addr; } } else { - err = tipc_nl_add_bc_link(net, &msg); + err = tipc_nl_add_bc_link(net, &msg, tn->bcl); if (err) goto out; list_for_each_entry_rcu(node, &tn->node_list, list) { tipc_node_read_lock(node); err = __tipc_nl_add_node_links(net, &msg, node, - &prev_link); + &prev_link, type); tipc_node_read_unlock(node); if (err) goto out; @@ -2608,6 +2654,7 @@ int tipc_nl_node_dump_link(struct sk_buff *skb, struct netlink_callback *cb) cb->args[0] = prev_node; cb->args[1] = prev_link; cb->args[2] = done; + cb->args[3] = type; return skb->len; } diff --git a/net/tipc/trace.h b/net/tipc/trace.h index e7535ab75255..04af83f0500c 100644 --- a/net/tipc/trace.h +++ b/net/tipc/trace.h @@ -255,7 +255,7 @@ DECLARE_EVENT_CLASS(tipc_link_class, TP_fast_assign( __assign_str(header, header); - tipc_link_name_ext(l, __entry->name); + memcpy(__entry->name, tipc_link_name(l), TIPC_MAX_LINK_NAME); tipc_link_dump(l, dqueues, __get_str(buf)); ), @@ -295,7 +295,7 @@ DECLARE_EVENT_CLASS(tipc_link_transmq_class, ), TP_fast_assign( - tipc_link_name_ext(r, __entry->name); + memcpy(__entry->name, tipc_link_name(r), TIPC_MAX_LINK_NAME); __entry->from = f; __entry->to = t; __entry->len = skb_queue_len(tq); -- 2.13.7 |
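The patch above changes `__tipc_skb_queue_sorted()` to return a bool so that the caller can count a rejected packet as a duplicate. The following is only a minimal userspace sketch of that idea, assuming a fixed-size array of sequence numbers in place of the kernel's `sk_buff` list. The names `struct defq`, `seq_less()`, `defq_insert_sorted()` and `defq_defer()` are hypothetical and are not part of TIPC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEFQ_CAP 64 /* arbitrary capacity chosen for this sketch */

/* Hypothetical stand-in for a deferred queue: sorted sequence numbers only. */
struct defq {
	uint16_t seq[DEFQ_CAP];
	size_t len;
	unsigned long duplicates; /* plays the role of stats.duplicates */
};

/* Wraparound-aware "a comes before b" for 16-bit sequence numbers. */
static bool seq_less(uint16_t a, uint16_t b)
{
	return a != b && (uint16_t)(b - a) < 0x8000u;
}

/* Insert seqno in sorted order; return false if it is already queued. */
static bool defq_insert_sorted(struct defq *q, uint16_t seqno)
{
	size_t i, j;

	assert(q != NULL && q->len < DEFQ_CAP);

	/* Skip every queued entry that comes before the new one. */
	for (i = 0; i < q->len && seq_less(q->seq[i], seqno); i++)
		;
	if (i < q->len && q->seq[i] == seqno)
		return false; /* duplicate: nothing is inserted */

	/* Shift the tail up by one slot and place the new entry. */
	for (j = q->len; j > i; j--)
		q->seq[j] = q->seq[j - 1];
	q->seq[i] = seqno;
	q->len++;
	return true;
}

/* Caller side: count a duplicate whenever the insert is rejected. */
static void defq_defer(struct defq *q, uint16_t seqno)
{
	if (!defq_insert_sorted(q, seqno))
		q->duplicates++;
}
```

Returning the outcome from the insert helper keeps the statistics update in the caller, which is the same split the patch makes between `msg.c` and `link.c`.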
From: Tuong T. L. <tuo...@de...> - 2020-04-06 08:55:01
|
Hi Ying,

No problem, I'm using Outlook too... Please see my answers below:

1. When you mention "this proposal", do you mean the single patch or the whole series? These features are actually separate and do not depend on each other. Anyway, there have been some issues with the lab here, so I have only tested this new feature on KVM/QEMU nodes using the virtio_net driver, with 4 vCPUs and only one TX/RX queue enabled. Also, the real-time kernel is not patched yet... If you have a better environment, may I ask you to help verify this?

Anyhow, if I understood your concerns from the last meeting correctly, they were mainly about the amount of packet retransmissions, which could panic the NIC or the kernel, so the approach would not really be scalable? If so, in theory it should not be a problem since we already have the following mechanisms to control it:
- Link window (e.g. max 50 outstanding/retransmitted packets);
- Retransmission restricting timer on individual packets (e.g. within 10ms, if a new retransmission request comes it will be ignored...);
- The priority queue for packet retransmissions (which is unlikely to be congested);

Or do you have any other concerns? If so, please clarify.

2. Yes, the commit message mentions the "bandwidth limit on broadcast", but it can be invisible to the user. One obvious sign is probably in the broadcast statistics (hence the need for the other patch for the broadcast rcv link stats): users can see the sender making a lot of (re-)transmissions that do not really work, while the receiver gets only a few... OK, I will make this clear by repeating some performance tests.

3. Hmm, this was totally my mistake... I removed it when merging/separating the patches for this series ☹.
In a premature patch, it was:

@@ -2425,7 +2426,7 @@ int tipc_link_bc_ack_rcv(struct tipc_link *r, u16 acked, u16 gap,
 		return 0;

 	trace_tipc_link_bc_ack(r, r->acked, acked, &l->transmq);
-	tipc_link_advance_transmq(l, r, acked, gap, ga, xmitq, &unused, &rc);
+	tipc_link_advance_transmq(l, r, acked, gap, ga, retrq, &unused, &rc);

 	tipc_link_advance_backlog(l, xmitq);
 	if (unlikely(!skb_queue_empty(&l->wakeupq)))

Thanks a lot for your finding, I will update this to the series!

BR/Tuong

-----Original Message-----
From: Xue, Ying <Yin...@wi...>
Sent: Monday, April 6, 2020 1:20 PM
To: Tuong Tong Lien <tuo...@de...>; jm...@re...; ma...@do...; tip...@li...
Cc: tipc-dek <tip...@de...>
Subject: RE: [PATCH RFC 3/4] tipc: enable broadcast retrans via unicast

Hi Tuong,

Sorry, I have to use outlook client to reply to your email, which will make the email messed a bit. Please see my following comments:

==
[Ying] 1. Did you ever conduct comprehensive verification about this proposal? What kinds of test environment did you use in your testing? For example, how many TIPC physical nodes were gotten involved into your testing? Did the NICs used during your testing support multiqueue feature? How many cores were there on one your used physical TIPC machine? In addition, if possible, I suggest you could try to enable RT_PREEMPT kernel to measure what throughput results we would get.
==

In some environment, broadcast traffic is suppressed at high rate (i.e. a kind of bandwidth limit setting). When it is applied, TIPC broadcast can still run successfully. However, when it comes to a high load, some packets will be dropped first and TIPC tries to retransmit them but the packet retransmission is intentionally broadcast too, so making things worse and not helpful at all.

This commit enables the broadcast retransmission via unicast which only retransmits packets to the specific peer that has really reported a gap i.e. not broadcasting to all nodes in the cluster, so will prevent from being suppressed, and also reduce some overheads on the other peers due to duplicates, finally improve the overall TIPC broadcast performance.

Note: the functionality can be turned on/off via the sysctl file:

echo 1 > /proc/sys/net/tipc/bc_retruni
echo 0 > /proc/sys/net/tipc/bc_retruni

Default is '0', i.e. the broadcast retransmission still works as usual.

==
[Ying] 2. Actually I had a similar idea before, so I also think the broadcast performance might be significantly improved through this proposal, but we act as TIPC developers, we should explicitly tell users what condition they should enable this option and what condition they should disable it, otherwise, users have no idea at all about when to enable this option or when to disable this option. So, please give more performance data obtained in different test conditions. If this patch can give broadcast performance a clear benefit under any test condition, ideally we completely remove this option. Otherwise, at least we can tell users when to enable this option.
==

Signed-off-by: Tuong Lien <tuo...@de...>

 int tipc_link_bc_ack_rcv(struct tipc_link *r, u16 acked, u16 gap,
 			 struct tipc_gap_ack_blks *ga,
-			 struct sk_buff_head *xmitq)
+			 struct sk_buff_head *xmitq,
+			 struct sk_buff_head *retrq)
 {
 	struct tipc_link *l = r->bc_sndlink;
 	bool unused = false;

==
3. [Ying] Sorry, I felt a bit confused. One new "retrq" parameter was introduced, but I didn't find where it was used in this function. Can you please explain how the new parameter works?
== Thanks, Ying @@ -2460,7 +2461,8 @@ int tipc_link_bc_nack_rcv(struct tipc_link *l, struct sk_buff *skb, return 0; if (dnode == tipc_own_addr(l->net)) { - rc = tipc_link_bc_ack_rcv(l, acked, to - acked, NULL, xmitq); + rc = tipc_link_bc_ack_rcv(l, acked, to - acked, NULL, xmitq, + xmitq); l->stats.recv_nacks++; return rc; } diff --git a/net/tipc/link.h b/net/tipc/link.h index 0a0fa7350722..4d0768cf91d5 100644 --- a/net/tipc/link.h +++ b/net/tipc/link.h @@ -147,7 +147,8 @@ u16 tipc_get_gap_ack_blks(struct tipc_gap_ack_blks **ga, struct tipc_link *l, struct tipc_msg *hdr, bool uc); int tipc_link_bc_ack_rcv(struct tipc_link *l, u16 acked, u16 gap, struct tipc_gap_ack_blks *ga, - struct sk_buff_head *xmitq); + struct sk_buff_head *xmitq, + struct sk_buff_head *retrq); void tipc_link_build_bc_sync_msg(struct tipc_link *l, struct sk_buff_head *xmitq); void tipc_link_bc_init_rcv(struct tipc_link *l, struct tipc_msg *hdr); diff --git a/net/tipc/node.c b/net/tipc/node.c index eb6b62de81a7..917ad3920fac 100644 --- a/net/tipc/node.c +++ b/net/tipc/node.c @@ -1771,7 +1771,7 @@ static void tipc_node_bc_sync_rcv(struct tipc_node *n, struct tipc_msg *hdr, struct tipc_link *ucl; int rc; - rc = tipc_bcast_sync_rcv(n->net, n->bc_entry.link, hdr); + rc = tipc_bcast_sync_rcv(n->net, n->bc_entry.link, hdr, xmitq); if (rc & TIPC_LINK_DOWN_EVT) { tipc_node_reset_links(n); diff --git a/net/tipc/sysctl.c b/net/tipc/sysctl.c index 58ab3d6dcdce..97a6264a2993 100644 --- a/net/tipc/sysctl.c +++ b/net/tipc/sysctl.c @@ -36,7 +36,7 @@ #include "core.h" #include "trace.h" #include "crypto.h" - +#include "bcast.h" #include <linux/sysctl.h> static struct ctl_table_header *tipc_ctl_hdr; @@ -75,6 +75,13 @@ static struct ctl_table tipc_table[] = { .extra1 = SYSCTL_ONE, }, #endif + { + .procname = "bc_retruni", + .data = &sysctl_tipc_bc_retruni, + .maxlen = sizeof(sysctl_tipc_bc_retruni), + .mode = 0644, + .proc_handler = proc_doulongvec_minmax, + }, {} }; -- 2.13.7 |
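The reply in this thread names a per-packet retransmission-restricting timer (with 10ms as an example) and the `bc_retruni` switch that sends broadcast retransmissions via unicast. The sketch below shows how those two checks could combine for a single packet. It is an illustration under stated assumptions, not the TIPC implementation: `struct tx_pkt`, `enum retr_target` and `retransmit_decision()` are hypothetical names, and time is passed in as plain milliseconds rather than read from a kernel clock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Where a retransmission should go, if it is allowed at all. */
enum retr_target { RETR_SKIP, RETR_BROADCAST, RETR_UNICAST };

/* Hypothetical per-packet retransmission state. */
struct tx_pkt {
	bool retr_seen;         /* has this packet been retransmitted before? */
	uint64_t retr_stamp_ms; /* time of the last retransmission */
};

/*
 * Decide what to do with one retransmission request.
 * - Requests arriving less than min_gap_ms after the previous
 *   retransmission of the same packet are ignored.
 * - Otherwise the packet goes to the requesting peer only (unicast)
 *   when bc_retruni is set, or to everyone (broadcast) when it is not.
 */
static enum retr_target retransmit_decision(struct tx_pkt *p, uint64_t now_ms,
					     uint64_t min_gap_ms,
					     bool bc_retruni)
{
	assert(p != NULL);

	if (p->retr_seen && now_ms - p->retr_stamp_ms < min_gap_ms)
		return RETR_SKIP;

	p->retr_seen = true;
	p->retr_stamp_ms = now_ms;
	return bc_retruni ? RETR_UNICAST : RETR_BROADCAST;
}
```

With a 10ms gap, a request at t=105 for a packet retransmitted at t=100 is skipped, while one at t=110 goes through; the link window mentioned in the reply would bound the number of such packets separately.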
From: Xue, Y. <Yin...@wi...> - 2020-04-06 07:19:07
|
       31                       16 15                        0
      +-------------+-------------+-------------+-------------+
      |  bgack_cnt  |  ugack_cnt  |            len            |
      +-------------+-------------+-------------+-------------+  -
      |            gap            |            ack            |   |
      +-------------+-------------+-------------+-------------+    > bc gacks
      :                           :                           :   |
      +-------------+-------------+-------------+-------------+  -
      |            gap            |            ack            |   |
      +-------------+-------------+-------------+-------------+    > uc gacks
      :                           :                           :   |
      +-------------+-------------+-------------+-------------+  -

which is "automatically" backward-compatible.

===
[Ying] In my opinion, this patch will cause the backward-compatible issue below:
1) On the TIPC node with the patch: When sending a 'PROTOCOL/STATE_MSG' message, its 'Gap ACK blocks' data field only contains bcl gap ack blocks, but no unicast link gap ack blocks.
2) On the TIPC node without the patch: Upon receiving the message sent by the node of case 1), this node will suppose that its 'Gap ACK blocks' data field holds unicast link gap ack blocks rather than broadcast link gap ack blocks.

[Tuong]: As you can see in the figure above, we have two different "b/ugack_cnt" fields which determine the number of broadcast/unicast gap ack blocks in the message. The "ugack_cnt" is fully identical to the "gack_cnt" in the old version (without the patch), i.e. it indicates the number of unicast gap ack blocks anyway, whereas the "bgack_cnt" was a reserved field. So, in your situation, the sending side will send the message with "ugack_cnt" = 0, and this is completely compatible with the old version: the receiving side will see no unicast gap ack blocks and just ignore the broadcast gap ack blocks (which it doesn't really know about). Actually, there is also a sanity check on the length in the old code that will simply ignore such a gap ack block report... So, we have no problem at all. That is why I've declared it backward compatible automatically.

>>[Ying]: Thanks for your clarification. Yes, you are right. Now it's really compatible between old and new versions.

So I wonder no backward-compatible issue will exist and everything will become pretty easy if we use LINK_PROTOCOL to only contain unicast gap ack blocks and use BCAST_PROTOCOL to convey broadcast gap ack blocks. In other words, we don't need to enlarge current gap ack block space, and we don't need to change the current code related to unicast gap ack blocks. Instead, we just need to add the support for broadcast gap ack blocks through BCAST_PROTOCOL rather than LINK_PROTOCOL.

[Tuong]: The BCAST_PROTOCOL is currently only used for broadcast initializing or synching when a new peer joins; the old mechanism of broadcast NACKs is deprecated... I suppose that using the LINK_PROTOCOL is much more convenient since the traditional ack/gap reports for the broadcast link are also made via that message, so we don't need to create a new code flow to handle the gap/ack blocks. Actually, the change in the current code related to unicast gap ack blocks is just to optimize the code, e.g. removing old functions, etc.; there is no impact on its functionality.

>>[Ying]: Sorry, I forgot this commit: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=02d11ca20091fcef904f05defda80c53e5b4e793. It made broadcast NACK delivered through link state. Okay, both unicast link and bcl gap ack blocks can be transferred in the same link state message.

>>[Ying]: By the way, I have another minor comment: As "bgack_cnt" is defined after "ugack_cnt" in struct tipc_gap_ack_blks, please reverse their order in this struct description section.

 /* struct tipc_gap_ack_blks
  * @len: actual length of the record
- * @gack_cnt: number of Gap ACK blocks in the record
+ * @bgack_cnt: number of Gap ACK blocks for broadcast in the record
+ * @ugack_cnt: number of Gap ACK blocks for unicast (following the broadcast
+ *             ones)
+ * @start_index: starting index for "valid" broadcast Gap ACK blocks
  * @gacks: array of Gap ACK blocks
  */
 struct tipc_gap_ack_blks {
 	__be16 len;
-	u8 gack_cnt;
-	u8 reserved;
+	union {
+		u8 ugack_cnt;
+		u8 start_index;
+	};
+	u8 bgack_cnt;
 	struct tipc_gap_ack gacks[];
 };
|
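The compatibility argument in this message can be made concrete with a small userspace sketch: the new `ugack_cnt` sits at the byte where the old format kept `gack_cnt`, and `bgack_cnt` takes the formerly reserved byte. The code assumes a 4-byte record header and 4-byte blocks as drawn in the figure. The old receiver's length check is modelled only from the description in the reply ("a sanity check on the length"), not from the actual kernel source, and the names `build_new_record()`, `old_receiver_unicast_blocks()` and `rec_size()` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REC_HDR_LEN 4u /* len (2 bytes) + two count bytes */
#define GACK_LEN    4u /* one block: 16-bit ack + 16-bit gap */

/* Size in bytes of a record carrying n Gap ACK blocks. */
static uint16_t rec_size(unsigned int n)
{
	return (uint16_t)(REC_HDR_LEN + n * GACK_LEN);
}

/*
 * New-format sender: write the record header for bgack_cnt broadcast
 * blocks followed by ugack_cnt unicast blocks. The blocks themselves
 * are left zeroed here. Returns the record length, or 0 if buf is
 * too small.
 */
static size_t build_new_record(uint8_t *buf, size_t cap,
			       uint8_t bgack_cnt, uint8_t ugack_cnt)
{
	uint16_t len = rec_size((unsigned int)bgack_cnt + ugack_cnt);

	assert(buf != NULL);
	if (cap < len)
		return 0;
	memset(buf, 0, len);
	buf[0] = (uint8_t)(len >> 8); /* len is big-endian on the wire */
	buf[1] = (uint8_t)(len & 0xff);
	buf[2] = ugack_cnt;           /* same offset as the old gack_cnt */
	buf[3] = bgack_cnt;           /* formerly the reserved byte */
	return len;
}

/*
 * Old-format receiver (assumed behaviour): read byte 2 as gack_cnt and
 * ignore the whole report when len does not match that count.
 * Returns the number of unicast blocks it would process.
 */
static unsigned int old_receiver_unicast_blocks(const uint8_t *buf)
{
	uint16_t len = (uint16_t)((buf[0] << 8) | buf[1]);
	uint8_t gack_cnt = buf[2];

	if (len != rec_size(gack_cnt))
		return 0; /* length mismatch: report is ignored */
	return gack_cnt;
}
```

Under these assumptions, a record holding only broadcast blocks shows up at an old receiver as zero unicast blocks, and a record that mixes both kinds fails the length check and is ignored, which matches the behaviour described in the reply.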
From: Xue, Y. <Yin...@wi...> - 2020-04-06 07:18:46
|
Hi Tuong,

Sorry, I have to use the Outlook client to reply to your email, which will mess up the formatting a bit. Please see my comments below:

==
[Ying] 1. Did you ever conduct a comprehensive verification of this proposal? What kind of test environment did you use? For example, how many physical TIPC nodes were involved in your testing? Did the NICs used in your testing support the multiqueue feature? How many cores did each physical TIPC machine have? In addition, if possible, I suggest you try enabling an RT_PREEMPT kernel to measure what throughput results we would get.
==

In some environments, broadcast traffic is suppressed at high rates (i.e. a kind of bandwidth limit setting). When this is applied, TIPC broadcast can still run successfully. However, under high load, some packets will be dropped first and TIPC tries to retransmit them, but the packet retransmission is intentionally broadcast too, which makes things worse and does not help at all.

This commit enables broadcast retransmission via unicast, which only retransmits packets to the specific peer that has actually reported a gap, i.e. not broadcasting to all nodes in the cluster. This prevents the retransmissions from being suppressed, reduces the overhead on the other peers due to duplicates, and ultimately improves the overall TIPC broadcast performance.

Note: the functionality can be turned on/off via the sysctl file:

echo 1 > /proc/sys/net/tipc/bc_retruni
echo 0 > /proc/sys/net/tipc/bc_retruni

Default is '0', i.e. the broadcast retransmission still works as usual.

==
[Ying] 2. Actually I had a similar idea before, so I also think the broadcast performance might be significantly improved by this proposal. But as TIPC developers, we should explicitly tell users under what conditions they should enable this option and under what conditions they should disable it; otherwise, users will have no idea when to enable or disable it.
So, please give more performance data obtained under different test conditions. If this patch gives broadcast performance a clear benefit under every test condition, ideally we should remove this option completely. Otherwise, at least we can tell users when to enable it.
==

Signed-off-by: Tuong Lien <tuo...@de...>

 int tipc_link_bc_ack_rcv(struct tipc_link *r, u16 acked, u16 gap,
 			 struct tipc_gap_ack_blks *ga,
-			 struct sk_buff_head *xmitq)
+			 struct sk_buff_head *xmitq,
+			 struct sk_buff_head *retrq)
 {
 	struct tipc_link *l = r->bc_sndlink;
 	bool unused = false;

==
3. [Ying] Sorry, I am a bit confused. A new "retrq" parameter was introduced, but I can't find where it is used in this function. Can you please explain how the new parameter works?
==

Thanks,
Ying

@@ -2460,7 +2461,8 @@ int tipc_link_bc_nack_rcv(struct tipc_link *l, struct sk_buff *skb,
 		return 0;
 
 	if (dnode == tipc_own_addr(l->net)) {
-		rc = tipc_link_bc_ack_rcv(l, acked, to - acked, NULL, xmitq);
+		rc = tipc_link_bc_ack_rcv(l, acked, to - acked, NULL, xmitq,
+					  xmitq);
 		l->stats.recv_nacks++;
 		return rc;
 	}
diff --git a/net/tipc/link.h b/net/tipc/link.h
index 0a0fa7350722..4d0768cf91d5 100644
--- a/net/tipc/link.h
+++ b/net/tipc/link.h
@@ -147,7 +147,8 @@ u16 tipc_get_gap_ack_blks(struct tipc_gap_ack_blks **ga, struct tipc_link *l,
 			  struct tipc_msg *hdr, bool uc);
 int tipc_link_bc_ack_rcv(struct tipc_link *l, u16 acked, u16 gap,
 			 struct tipc_gap_ack_blks *ga,
-			 struct sk_buff_head *xmitq);
+			 struct sk_buff_head *xmitq,
+			 struct sk_buff_head *retrq);
 void tipc_link_build_bc_sync_msg(struct tipc_link *l,
 				 struct sk_buff_head *xmitq);
 void tipc_link_bc_init_rcv(struct tipc_link *l, struct tipc_msg *hdr);
diff --git a/net/tipc/node.c b/net/tipc/node.c
index eb6b62de81a7..917ad3920fac 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -1771,7 +1771,7 @@ static void tipc_node_bc_sync_rcv(struct tipc_node *n, struct tipc_msg *hdr,
 	struct tipc_link *ucl;
 	int rc;
 
-	rc = tipc_bcast_sync_rcv(n->net, n->bc_entry.link, hdr);
+	rc = tipc_bcast_sync_rcv(n->net, n->bc_entry.link, hdr, xmitq);
 
 	if (rc & TIPC_LINK_DOWN_EVT) {
 		tipc_node_reset_links(n);
diff --git a/net/tipc/sysctl.c b/net/tipc/sysctl.c
index 58ab3d6dcdce..97a6264a2993 100644
--- a/net/tipc/sysctl.c
+++ b/net/tipc/sysctl.c
@@ -36,7 +36,7 @@
 #include "core.h"
 #include "trace.h"
 #include "crypto.h"
-
+#include "bcast.h"
 #include <linux/sysctl.h>
 
 static struct ctl_table_header *tipc_ctl_hdr;
@@ -75,6 +75,13 @@ static struct ctl_table tipc_table[] = {
 		.extra1 = SYSCTL_ONE,
 	},
 #endif
+	{
+		.procname = "bc_retruni",
+		.data = &sysctl_tipc_bc_retruni,
+		.maxlen = sizeof(sysctl_tipc_bc_retruni),
+		.mode = 0644,
+		.proc_handler = proc_doulongvec_minmax,
+	},
 	{}
 };
 
--
2.13.7
|
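The on/off switch described in this patch is a plain procfs file, so it can also be toggled from a program instead of with echo. A minimal user-space sketch follows, assuming the bc_retruni entry exists as proposed in this RFC (it may not be present on a given kernel). The helper name write_switch is made up for illustration, and the path is passed in so the function can be tried against any file.

```c
#include <assert.h>
#include <stdio.h>

/* Path proposed in the RFC patch; only present if the patch is applied */
#define BC_RETRUNI_PATH "/proc/sys/net/tipc/bc_retruni"

/* Write "0" or "1" to a sysctl-style file (hypothetical helper).
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
static inline int write_switch(const char *path, int on)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fputs(on ? "1\n" : "0\n", f) >= 0;
	/* fclose() flushes, so a write error may only show up here */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Calling write_switch(BC_RETRUNI_PATH, 1) would then correspond to the "echo 1" command quoted in the commit message, and needs the same privileges.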
From: Xue, Y. <Yin...@wi...> - 2020-04-06 07:16:07
|
Hi Tuong,

Please see my comments inline:

As achieved through commit 9195948fbf34 ("tipc: improve TIPC throughput by Gap ACK blocks"), we apply the same mechanism for the broadcast link as well. The 'Gap ACK blocks' data field in a 'PROTOCOL/STATE_MSG' will consist of two parts built for both the broadcast and unicast types:

 31                       16 15                        0
+-------------+-------------+-------------+-------------+
|  bgack_cnt  |  ugack_cnt  |            len            |
+-------------+-------------+-------------+-------------+  -
|            gap            |            ack            |   |
+-------------+-------------+-------------+-------------+    > bc gacks
:                           :                           :   |
+-------------+-------------+-------------+-------------+  -
|            gap            |            ack            |   |
+-------------+-------------+-------------+-------------+    > uc gacks
:                           :                           :   |
+-------------+-------------+-------------+-------------+  -

which is "automatically" backward-compatible.

===
[Ying] In my opinion, this patch will cause the following backward-compatibility issue:

1) On a TIPC node with the patch: when sending a 'PROTOCOL/STATE_MSG' message, its 'Gap ACK blocks' data field contains only bcl gap ack blocks and no unicast link gap ack blocks.

2) On a TIPC node without the patch: upon receiving the message sent by the node in case 1), this node will assume that the 'Gap ACK blocks' data field contains unicast link gap ack blocks rather than broadcast link gap ack blocks.

So I wonder whether the backward-compatibility issue would disappear, and everything would become much simpler, if we used LINK_PROTOCOL to carry only unicast gap ack blocks and BCAST_PROTOCOL to convey broadcast gap ack blocks. In other words, we would not need to enlarge the current gap ack block space or change the current code for unicast gap ack blocks. Instead, we would just add support for broadcast gap ack blocks through BCAST_PROTOCOL rather than LINK_PROTOCOL.
===

Thanks,
Ying

We also increase the max number of Gap ACK blocks to 128, allowing up to 64 blocks per type (total buffer size = 516 bytes).
Besides, the 'tipc_link_advance_transmq()' function is refactored so that it now applies to both the unicast and broadcast cases; some old functions can therefore be removed and the code is optimized.

With the patch, TIPC broadcast is more robust regardless of packet loss or disorder, latency, ... in the underlying network. Its performance is boosted significantly. For example, an experiment with a 5% packet loss rate gives:

$ time tipc-pipe --mc --rdm --data_size 123 --data_num 1500000
real 0m 42.46s
user 0m 1.16s
sys 0m 17.67s

Without the patch:

$ time tipc-pipe --mc --rdm --data_size 123 --data_num 1500000
real 8m 27.94s
user 0m 0.55s
sys 0m 2.38s

Acked-by: Jon Maloy <jm...@re...>
Signed-off-by: Tuong Lien <tuo...@de...>
---
 net/tipc/bcast.c | 9 +-
 net/tipc/link.c | 438 +++++++++++++++++++++++++++++++++----------------------
 net/tipc/link.h | 7 +-
 net/tipc/msg.h | 14 +-
 net/tipc/node.c | 10 +-
 5 files changed, 293 insertions(+), 185 deletions(-)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c index 4c20be08b9c4..3ce690a96ee9 100644 --- a/net/tipc/bcast.c +++ b/net/tipc/bcast.c @@ -474,7 +474,7 @@ void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l, __skb_queue_head_init(&xmitq); tipc_bcast_lock(net); - tipc_link_bc_ack_rcv(l, acked, &xmitq); + tipc_link_bc_ack_rcv(l, acked, 0, NULL, &xmitq); tipc_bcast_unlock(net); tipc_bcbase_xmit(net, &xmitq); @@ -492,6 +492,7 @@ int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l, struct tipc_msg *hdr) { struct sk_buff_head *inputq = &tipc_bc_base(net)->inputq; + struct tipc_gap_ack_blks *ga; struct sk_buff_head xmitq; int rc = 0; @@ -501,8 +502,10 @@ int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l, if (msg_type(hdr) != STATE_MSG) { tipc_link_bc_init_rcv(l, hdr); } else if (!msg_bc_ack_invalid(hdr)) { - tipc_link_bc_ack_rcv(l, msg_bcast_ack(hdr), &xmitq); - rc = tipc_link_bc_sync_rcv(l, hdr, &xmitq); + tipc_get_gap_ack_blks(&ga, l, hdr, false); + rc = tipc_link_bc_ack_rcv(l, msg_bcast_ack(hdr), +
msg_bc_gap(hdr), ga, &xmitq); + rc |= tipc_link_bc_sync_rcv(l, hdr, &xmitq); } tipc_bcast_unlock(net); diff --git a/net/tipc/link.c b/net/tipc/link.c index 467c53a1fb5c..1b60ba665504 100644 --- a/net/tipc/link.c +++ b/net/tipc/link.c @@ -188,6 +188,8 @@ struct tipc_link { /* Broadcast */ u16 ackers; u16 acked; + u16 last_gap; + struct tipc_gap_ack_blks *last_ga; struct tipc_link *bc_rcvlink; struct tipc_link *bc_sndlink; u8 nack_state; @@ -249,11 +251,14 @@ static int tipc_link_build_nack_msg(struct tipc_link *l, struct sk_buff_head *xmitq); static void tipc_link_build_bc_init_msg(struct tipc_link *l, struct sk_buff_head *xmitq); -static int tipc_link_release_pkts(struct tipc_link *l, u16 to); -static u16 tipc_build_gap_ack_blks(struct tipc_link *l, void *data, u16 gap); -static int tipc_link_advance_transmq(struct tipc_link *l, u16 acked, u16 gap, +static u8 __tipc_build_gap_ack_blks(struct tipc_gap_ack_blks *ga, + struct tipc_link *l, u8 start_index); +static u16 tipc_build_gap_ack_blks(struct tipc_link *l, struct tipc_msg *hdr); +static int tipc_link_advance_transmq(struct tipc_link *l, struct tipc_link *r, + u16 acked, u16 gap, struct tipc_gap_ack_blks *ga, - struct sk_buff_head *xmitq); + struct sk_buff_head *xmitq, + bool *retransmitted, int *rc); static void tipc_link_update_cwin(struct tipc_link *l, int released, bool retransmitted); /* @@ -370,7 +375,7 @@ void tipc_link_remove_bc_peer(struct tipc_link *snd_l, snd_l->ackers--; rcv_l->bc_peer_is_up = true; rcv_l->state = LINK_ESTABLISHED; - tipc_link_bc_ack_rcv(rcv_l, ack, xmitq); + tipc_link_bc_ack_rcv(rcv_l, ack, 0, NULL, xmitq); trace_tipc_link_reset(rcv_l, TIPC_DUMP_ALL, "bclink removed!"); tipc_link_reset(rcv_l); rcv_l->state = LINK_RESET; @@ -784,8 +789,6 @@ bool tipc_link_too_silent(struct tipc_link *l) return (l->silent_intv_cnt + 2 > l->abort_limit); } -static int tipc_link_bc_retrans(struct tipc_link *l, struct tipc_link *r, - u16 from, u16 to, struct sk_buff_head *xmitq); /* tipc_link_timeout - 
perform periodic task as instructed from node timeout */ int tipc_link_timeout(struct tipc_link *l, struct sk_buff_head *xmitq) @@ -948,6 +951,9 @@ void tipc_link_reset(struct tipc_link *l) l->snd_nxt_state = 1; l->rcv_nxt_state = 1; l->acked = 0; + l->last_gap = 0; + kfree(l->last_ga); + l->last_ga = NULL; l->silent_intv_cnt = 0; l->rst_cnt = 0; l->bc_peer_is_up = false; @@ -1183,68 +1189,14 @@ static bool link_retransmit_failure(struct tipc_link *l, struct tipc_link *r, if (link_is_bc_sndlink(l)) { r->state = LINK_RESET; - *rc = TIPC_LINK_DOWN_EVT; + *rc |= TIPC_LINK_DOWN_EVT; } else { - *rc = tipc_link_fsm_evt(l, LINK_FAILURE_EVT); + *rc |= tipc_link_fsm_evt(l, LINK_FAILURE_EVT); } return true; } -/* tipc_link_bc_retrans() - retransmit zero or more packets - * @l: the link to transmit on - * @r: the receiving link ordering the retransmit. Same as l if unicast - * @from: retransmit from (inclusive) this sequence number - * @to: retransmit to (inclusive) this sequence number - * xmitq: queue for accumulating the retransmitted packets - */ -static int tipc_link_bc_retrans(struct tipc_link *l, struct tipc_link *r, - u16 from, u16 to, struct sk_buff_head *xmitq) -{ - struct sk_buff *_skb, *skb = skb_peek(&l->transmq); - u16 bc_ack = l->bc_rcvlink->rcv_nxt - 1; - u16 ack = l->rcv_nxt - 1; - int retransmitted = 0; - struct tipc_msg *hdr; - int rc = 0; - - if (!skb) - return 0; - if (less(to, from)) - return 0; - - trace_tipc_link_retrans(r, from, to, &l->transmq); - - if (link_retransmit_failure(l, r, &rc)) - return rc; - - skb_queue_walk(&l->transmq, skb) { - hdr = buf_msg(skb); - if (less(msg_seqno(hdr), from)) - continue; - if (more(msg_seqno(hdr), to)) - break; - if (time_before(jiffies, TIPC_SKB_CB(skb)->nxt_retr)) - continue; - TIPC_SKB_CB(skb)->nxt_retr = TIPC_BC_RETR_LIM; - _skb = pskb_copy(skb, GFP_ATOMIC); - if (!_skb) - return 0; - hdr = buf_msg(_skb); - msg_set_ack(hdr, ack); - msg_set_bcast_ack(hdr, bc_ack); - _skb->priority = TC_PRIO_CONTROL; - 
__skb_queue_tail(xmitq, _skb); - l->stats.retransmitted++; - retransmitted++; - /* Increase actual retrans counter & mark first time */ - if (!TIPC_SKB_CB(skb)->retr_cnt++) - TIPC_SKB_CB(skb)->retr_stamp = jiffies; - } - tipc_link_update_cwin(l, 0, retransmitted); - return 0; -} - /* tipc_data_input - deliver data and name distr msgs to upper layer * * Consumes buffer if message is of right type @@ -1402,46 +1354,71 @@ static int tipc_link_tnl_rcv(struct tipc_link *l, struct sk_buff *skb, return rc; } -static int tipc_link_release_pkts(struct tipc_link *l, u16 acked) -{ - int released = 0; - struct sk_buff *skb, *tmp; - - skb_queue_walk_safe(&l->transmq, skb, tmp) { - if (more(buf_seqno(skb), acked)) - break; - __skb_unlink(skb, &l->transmq); - kfree_skb(skb); - released++; +/** + * tipc_get_gap_ack_blks - get Gap ACK blocks from PROTOCOL/STATE_MSG + * @ga: returned pointer to the Gap ACK blocks if any + * @l: the tipc link + * @hdr: the PROTOCOL/STATE_MSG header + * @uc: desired Gap ACK blocks type, i.e. unicast (= 1) or broadcast (= 0) + * + * Return: the total Gap ACK blocks size + */ +u16 tipc_get_gap_ack_blks(struct tipc_gap_ack_blks **ga, struct tipc_link *l, + struct tipc_msg *hdr, bool uc) +{ + struct tipc_gap_ack_blks *p; + u16 sz = 0; + + /* Does peer support the Gap ACK blocks feature? */ + if (l->peer_caps & TIPC_GAP_ACK_BLOCK) { + p = (struct tipc_gap_ack_blks *)msg_data(hdr); + sz = ntohs(p->len); + /* Sanity check */ + if (sz == tipc_gap_ack_blks_sz(p->ugack_cnt + p->bgack_cnt)) { + /* Good, check if the desired type exists */ + if ((uc && p->ugack_cnt) || (!uc && p->bgack_cnt)) + goto ok; + /* Backward compatible: peer might not support bc, but uc? */ + } else if (uc && sz == tipc_gap_ack_blks_sz(p->ugack_cnt)) { + if (p->ugack_cnt) { + p->bgack_cnt = 0; + goto ok; + } + } } - return released; + /* Other cases: ignore! 
*/ + p = NULL; + +ok: + *ga = p; + return sz; } -/* tipc_build_gap_ack_blks - build Gap ACK blocks - * @l: tipc link that data have come with gaps in sequence if any - * @data: data buffer to store the Gap ACK blocks after built - * - * returns the actual allocated memory size - */ -static u16 tipc_build_gap_ack_blks(struct tipc_link *l, void *data, u16 gap) +static u8 __tipc_build_gap_ack_blks(struct tipc_gap_ack_blks *ga, + struct tipc_link *l, u8 start_index) { + struct tipc_gap_ack *gacks = &ga->gacks[start_index]; struct sk_buff *skb = skb_peek(&l->deferdq); - struct tipc_gap_ack_blks *ga = data; - u16 len, expect, seqno = 0; + u16 expect, seqno = 0; u8 n = 0; - if (!skb || !gap) - goto exit; + if (!skb) + return 0; expect = buf_seqno(skb); skb_queue_walk(&l->deferdq, skb) { seqno = buf_seqno(skb); if (unlikely(more(seqno, expect))) { - ga->gacks[n].ack = htons(expect - 1); - ga->gacks[n].gap = htons(seqno - expect); - if (++n >= MAX_GAP_ACK_BLKS) { - pr_info_ratelimited("Too few Gap ACK blocks!\n"); - goto exit; + gacks[n].ack = htons(expect - 1); + gacks[n].gap = htons(seqno - expect); + if (++n >= MAX_GAP_ACK_BLKS / 2) { + char buf[TIPC_MAX_LINK_NAME]; + + pr_info_ratelimited("Gacks on %s: %d, ql: %d!\n", + tipc_link_name_ext(l, buf), + n, + skb_queue_len(&l->deferdq)); + return n; } } else if (unlikely(less(seqno, expect))) { pr_warn("Unexpected skb in deferdq!\n"); @@ -1451,14 +1428,57 @@ static u16 tipc_build_gap_ack_blks(struct tipc_link *l, void *data, u16 gap) } /* last block */ - ga->gacks[n].ack = htons(seqno); - ga->gacks[n].gap = 0; + gacks[n].ack = htons(seqno); + gacks[n].gap = 0; n++; + return n; +} -exit: - len = tipc_gap_ack_blks_sz(n); +/* tipc_build_gap_ack_blks - build Gap ACK blocks + * @l: tipc unicast link + * @hdr: the tipc message buffer to store the Gap ACK blocks after built + * + * The function builds Gap ACK blocks for both the unicast & broadcast receiver + * links of a certain peer, the buffer after built has the network data 
format + * as follows: + * 31 16 15 0 + * +-------------+-------------+-------------+-------------+ + * | bgack_cnt | ugack_cnt | len | + * +-------------+-------------+-------------+-------------+ - + * | gap | ack | | + * +-------------+-------------+-------------+-------------+ > bc gacks + * : : : | + * +-------------+-------------+-------------+-------------+ - + * | gap | ack | | + * +-------------+-------------+-------------+-------------+ > uc gacks + * : : : | + * +-------------+-------------+-------------+-------------+ - + * (See struct tipc_gap_ack_blks) + * + * returns the actual allocated memory size + */ +static u16 tipc_build_gap_ack_blks(struct tipc_link *l, struct tipc_msg *hdr) +{ + struct tipc_link *bcl = l->bc_rcvlink; + struct tipc_gap_ack_blks *ga; + u16 len; + + ga = (struct tipc_gap_ack_blks *)msg_data(hdr); + + /* Start with broadcast link first */ + tipc_bcast_lock(bcl->net); + msg_set_bcast_ack(hdr, bcl->rcv_nxt - 1); + msg_set_bc_gap(hdr, link_bc_rcv_gap(bcl)); + ga->bgack_cnt = __tipc_build_gap_ack_blks(ga, bcl, 0); + tipc_bcast_unlock(bcl->net); + + /* Now for unicast link, but an explicit NACK only (???) */ + ga->ugack_cnt = (msg_seq_gap(hdr)) ? + __tipc_build_gap_ack_blks(ga, l, ga->bgack_cnt) : 0; + + /* Total len */ + len = tipc_gap_ack_blks_sz(ga->bgack_cnt + ga->ugack_cnt); ga->len = htons(len); - ga->gack_cnt = n; return len; } @@ -1466,47 +1486,109 @@ static u16 tipc_build_gap_ack_blks(struct tipc_link *l, void *data, u16 gap) * acked packets, also doing retransmissions if * gaps found * @l: tipc link with transmq queue to be advanced + * @r: tipc link "receiver" i.e. in case of broadcast (= "l" if unicast) * @acked: seqno of last packet acked by peer without any gaps before * @gap: # of gap packets * @ga: buffer pointer to Gap ACK blocks from peer * @xmitq: queue for accumulating the retransmitted packets if any + * @retransmitted: returned boolean value if a retransmission is really issued + * @rc: returned code e.g. 
TIPC_LINK_DOWN_EVT if a repeated retransmit failures + * happens (- unlikely case) * - * In case of a repeated retransmit failures, the call will return shortly - * with a returned code (e.g. TIPC_LINK_DOWN_EVT) + * Return: the number of packets released from the link transmq */ -static int tipc_link_advance_transmq(struct tipc_link *l, u16 acked, u16 gap, +static int tipc_link_advance_transmq(struct tipc_link *l, struct tipc_link *r, + u16 acked, u16 gap, struct tipc_gap_ack_blks *ga, - struct sk_buff_head *xmitq) + struct sk_buff_head *xmitq, + bool *retransmitted, int *rc) { + struct tipc_gap_ack_blks *last_ga = r->last_ga, *this_ga = NULL; + struct tipc_gap_ack *gacks = NULL; struct sk_buff *skb, *_skb, *tmp; struct tipc_msg *hdr; + u32 qlen = skb_queue_len(&l->transmq); + u16 nacked = acked, ngap = gap, gack_cnt = 0; u16 bc_ack = l->bc_rcvlink->rcv_nxt - 1; - bool retransmitted = false; u16 ack = l->rcv_nxt - 1; - bool passed = false; - u16 released = 0; u16 seqno, n = 0; - int rc = 0; + u16 end = r->acked, start = end, offset = r->last_gap; + u16 si = (last_ga) ? 
last_ga->start_index : 0; + bool is_uc = !link_is_bc_sndlink(l); + bool bc_has_acked = false; + + /* Determine Gap ACK blocks if any for the particular link */ + if (ga && is_uc) { + /* Get the Gap ACKs, uc part */ + gack_cnt = ga->ugack_cnt; + gacks = &ga->gacks[ga->bgack_cnt]; + } else if (ga) { + /* Copy the Gap ACKs, bc part, for later renewal if needed */ + this_ga = kmemdup(ga, tipc_gap_ack_blks_sz(ga->bgack_cnt), + GFP_ATOMIC); + if (likely(this_ga)) { + this_ga->start_index = 0; + /* Start with the bc Gap ACKs */ + gack_cnt = this_ga->bgack_cnt; + gacks = &this_ga->gacks[0]; + } else { + /* Hmm, we can get in trouble..., simply ignore it */ + pr_warn_ratelimited("Ignoring bc Gap ACKs, no memory\n"); + } + } + /* Advance the link transmq */ skb_queue_walk_safe(&l->transmq, skb, tmp) { seqno = buf_seqno(skb); next_gap_ack: - if (less_eq(seqno, acked)) { + if (less_eq(seqno, nacked)) { + if (is_uc) + goto release; + /* Skip packets peer has already acked */ + if (!more(seqno, r->acked)) + continue; + /* Get the next of last Gap ACK blocks */ + while (more(seqno, end)) { + if (!last_ga || si >= last_ga->bgack_cnt) + break; + start = end + offset + 1; + end = ntohs(last_ga->gacks[si].ack); + offset = ntohs(last_ga->gacks[si].gap); + si++; + WARN_ONCE(more(start, end) || + (!offset && + si < last_ga->bgack_cnt) || + si > MAX_GAP_ACK_BLKS, + "Corrupted Gap ACK: %d %d %d %d %d\n", + start, end, offset, si, + last_ga->bgack_cnt); + } + /* Check against the last Gap ACK block */ + if (in_range(seqno, start, end)) + continue; + /* Update/release the packet peer is acking */ + bc_has_acked = true; + if (--TIPC_SKB_CB(skb)->ackers) + continue; +release: /* release skb */ __skb_unlink(skb, &l->transmq); kfree_skb(skb); - released++; - } else if (less_eq(seqno, acked + gap)) { - /* First, check if repeated retrans failures occurs? 
*/ - if (!passed && link_retransmit_failure(l, l, &rc)) - return rc; - passed = true; - + } else if (less_eq(seqno, nacked + ngap)) { + /* First gap: check if repeated retrans failures? */ + if (unlikely(seqno == acked + 1 && + link_retransmit_failure(l, r, rc))) { + /* Ignore this bc Gap ACKs if any */ + kfree(this_ga); + this_ga = NULL; + break; + } /* retransmit skb if unrestricted*/ if (time_before(jiffies, TIPC_SKB_CB(skb)->nxt_retr)) continue; - TIPC_SKB_CB(skb)->nxt_retr = TIPC_UC_RETR_TIME; + TIPC_SKB_CB(skb)->nxt_retr = (is_uc) ? + TIPC_UC_RETR_TIME : TIPC_BC_RETR_LIM; _skb = pskb_copy(skb, GFP_ATOMIC); if (!_skb) continue; @@ -1516,25 +1598,51 @@ static int tipc_link_advance_transmq(struct tipc_link *l, u16 acked, u16 gap, _skb->priority = TC_PRIO_CONTROL; __skb_queue_tail(xmitq, _skb); l->stats.retransmitted++; - retransmitted = true; + *retransmitted = true; /* Increase actual retrans counter & mark first time */ if (!TIPC_SKB_CB(skb)->retr_cnt++) TIPC_SKB_CB(skb)->retr_stamp = jiffies; } else { /* retry with Gap ACK blocks if any */ - if (!ga || n >= ga->gack_cnt) + if (n >= gack_cnt) break; - acked = ntohs(ga->gacks[n].ack); - gap = ntohs(ga->gacks[n].gap); + nacked = ntohs(gacks[n].ack); + ngap = ntohs(gacks[n].gap); n++; goto next_gap_ack; } } - if (released || retransmitted) - tipc_link_update_cwin(l, released, retransmitted); - if (released) - tipc_link_advance_backlog(l, xmitq); - return 0; + + /* Renew last Gap ACK blocks for bc if needed */ + if (bc_has_acked) { + if (this_ga) { + kfree(last_ga); + r->last_ga = this_ga; + r->last_gap = gap; + } else if (last_ga) { + if (less(acked, start)) { + si--; + offset = start - acked - 1; + } else if (less(acked, end)) { + acked = end; + } + if (si < last_ga->bgack_cnt) { + last_ga->start_index = si; + r->last_gap = offset; + } else { + kfree(last_ga); + r->last_ga = NULL; + r->last_gap = 0; + } + } else { + r->last_gap = 0; + } + r->acked = acked; + } else { + kfree(this_ga); + } + + return qlen - 
skb_queue_len(&l->transmq); } /* tipc_link_build_state_msg: prepare link state message for transmission @@ -1651,7 +1759,8 @@ int tipc_link_rcv(struct tipc_link *l, struct sk_buff *skb, kfree_skb(skb); break; } - released += tipc_link_release_pkts(l, msg_ack(hdr)); + released += tipc_link_advance_transmq(l, l, msg_ack(hdr), 0, + NULL, NULL, NULL, NULL); /* Defer delivery if sequence gap */ if (unlikely(seqno != rcv_nxt)) { @@ -1739,7 +1848,7 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe, msg_set_probe(hdr, probe); msg_set_is_keepalive(hdr, probe || probe_reply); if (l->peer_caps & TIPC_GAP_ACK_BLOCK) - glen = tipc_build_gap_ack_blks(l, data, rcvgap); + glen = tipc_build_gap_ack_blks(l, hdr); tipc_mon_prep(l->net, data + glen, &dlen, mstate, l->bearer_id); msg_set_size(hdr, INT_H_SIZE + glen + dlen); skb_trim(skb, INT_H_SIZE + glen + dlen); @@ -2027,20 +2136,19 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb, { struct tipc_msg *hdr = buf_msg(skb); struct tipc_gap_ack_blks *ga = NULL; - u16 rcvgap = 0; - u16 ack = msg_ack(hdr); - u16 gap = msg_seq_gap(hdr); + bool reply = msg_probe(hdr), retransmitted = false; + u16 dlen = msg_data_sz(hdr), glen = 0; u16 peers_snd_nxt = msg_next_sent(hdr); u16 peers_tol = msg_link_tolerance(hdr); u16 peers_prio = msg_linkprio(hdr); + u16 gap = msg_seq_gap(hdr); + u16 ack = msg_ack(hdr); u16 rcv_nxt = l->rcv_nxt; - u16 dlen = msg_data_sz(hdr); + u16 rcvgap = 0; int mtyp = msg_type(hdr); - bool reply = msg_probe(hdr); - u16 glen = 0; - void *data; + int rc = 0, released; char *if_name; - int rc = 0; + void *data; trace_tipc_proto_rcv(skb, false, l->name); if (tipc_link_is_blocked(l) || !xmitq) @@ -2137,13 +2245,7 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb, } /* Receive Gap ACK blocks from peer if any */ - if (l->peer_caps & TIPC_GAP_ACK_BLOCK) { - ga = (struct tipc_gap_ack_blks *)data; - glen = ntohs(ga->len); - /* sanity check: if failed, 
ignore Gap ACK blocks */ - if (glen != tipc_gap_ack_blks_sz(ga->gack_cnt)) - ga = NULL; - } + glen = tipc_get_gap_ack_blks(&ga, l, hdr, true); tipc_mon_rcv(l->net, data + glen, dlen - glen, l->addr, &l->mon_state, l->bearer_id); @@ -2158,9 +2260,14 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb, tipc_link_build_proto_msg(l, STATE_MSG, 0, reply, rcvgap, 0, 0, xmitq); - rc |= tipc_link_advance_transmq(l, ack, gap, ga, xmitq); + released = tipc_link_advance_transmq(l, l, ack, gap, ga, xmitq, + &retransmitted, &rc); if (gap) l->stats.recv_nacks++; + if (released || retransmitted) + tipc_link_update_cwin(l, released, retransmitted); + if (released) + tipc_link_advance_backlog(l, xmitq); if (unlikely(!skb_queue_empty(&l->wakeupq))) link_prepare_wakeup(l); } @@ -2246,10 +2353,7 @@ void tipc_link_bc_init_rcv(struct tipc_link *l, struct tipc_msg *hdr) int tipc_link_bc_sync_rcv(struct tipc_link *l, struct tipc_msg *hdr, struct sk_buff_head *xmitq) { - struct tipc_link *snd_l = l->bc_sndlink; u16 peers_snd_nxt = msg_bc_snd_nxt(hdr); - u16 from = msg_bcast_ack(hdr) + 1; - u16 to = from + msg_bc_gap(hdr) - 1; int rc = 0; if (!link_is_up(l)) @@ -2271,8 +2375,6 @@ int tipc_link_bc_sync_rcv(struct tipc_link *l, struct tipc_msg *hdr, if (more(peers_snd_nxt, l->rcv_nxt + l->window)) return rc; - rc = tipc_link_bc_retrans(snd_l, l, from, to, xmitq); - l->snd_nxt = peers_snd_nxt; if (link_bc_rcv_gap(l)) rc |= TIPC_LINK_SND_STATE; @@ -2307,38 +2409,27 @@ int tipc_link_bc_sync_rcv(struct tipc_link *l, struct tipc_msg *hdr, return 0; } -void tipc_link_bc_ack_rcv(struct tipc_link *l, u16 acked, - struct sk_buff_head *xmitq) +int tipc_link_bc_ack_rcv(struct tipc_link *r, u16 acked, u16 gap, + struct tipc_gap_ack_blks *ga, + struct sk_buff_head *xmitq) { - struct sk_buff *skb, *tmp; - struct tipc_link *snd_l = l->bc_sndlink; + struct tipc_link *l = r->bc_sndlink; + bool unused = false; + int rc = 0; - if (!link_is_up(l) || !l->bc_peer_is_up) - return; + if 
(!link_is_up(r) || !r->bc_peer_is_up) + return 0; - if (!more(acked, l->acked)) - return; + if (less(acked, r->acked) || (acked == r->acked && !gap && !ga)) + return 0; - trace_tipc_link_bc_ack(l, l->acked, acked, &snd_l->transmq); - /* Skip over packets peer has already acked */ - skb_queue_walk(&snd_l->transmq, skb) { - if (more(buf_seqno(skb), l->acked)) - break; - } + tipc_link_advance_transmq(l, r, acked, gap, ga, xmitq, &unused, &rc); - /* Update/release the packets peer is acking now */ - skb_queue_walk_from_safe(&snd_l->transmq, skb, tmp) { - if (more(buf_seqno(skb), acked)) - break; - if (!--TIPC_SKB_CB(skb)->ackers) { - __skb_unlink(skb, &snd_l->transmq); - kfree_skb(skb); - } - } - l->acked = acked; - tipc_link_advance_backlog(snd_l, xmitq); - if (unlikely(!skb_queue_empty(&snd_l->wakeupq))) - link_prepare_wakeup(snd_l); + tipc_link_advance_backlog(l, xmitq); + if (unlikely(!skb_queue_empty(&l->wakeupq))) + link_prepare_wakeup(l); + + return rc; } /* tipc_link_bc_nack_rcv(): receive broadcast nack message @@ -2366,8 +2457,7 @@ int tipc_link_bc_nack_rcv(struct tipc_link *l, struct sk_buff *skb, return 0; if (dnode == tipc_own_addr(l->net)) { - tipc_link_bc_ack_rcv(l, acked, xmitq); - rc = tipc_link_bc_retrans(l->bc_sndlink, l, from, to, xmitq); + rc = tipc_link_bc_ack_rcv(l, acked, to - acked, NULL, xmitq); l->stats.recv_nacks++; return rc; } diff --git a/net/tipc/link.h b/net/tipc/link.h index d3c1c3fc1659..0a0fa7350722 100644 --- a/net/tipc/link.h +++ b/net/tipc/link.h @@ -143,8 +143,11 @@ int tipc_link_bc_peers(struct tipc_link *l); void tipc_link_set_mtu(struct tipc_link *l, int mtu); int tipc_link_mtu(struct tipc_link *l); int tipc_link_mss(struct tipc_link *l); -void tipc_link_bc_ack_rcv(struct tipc_link *l, u16 acked, - struct sk_buff_head *xmitq); +u16 tipc_get_gap_ack_blks(struct tipc_gap_ack_blks **ga, struct tipc_link *l, + struct tipc_msg *hdr, bool uc); +int tipc_link_bc_ack_rcv(struct tipc_link *l, u16 acked, u16 gap, + struct 
tipc_gap_ack_blks *ga, + struct sk_buff_head *xmitq); void tipc_link_build_bc_sync_msg(struct tipc_link *l, struct sk_buff_head *xmitq); void tipc_link_bc_init_rcv(struct tipc_link *l, struct tipc_msg *hdr); diff --git a/net/tipc/msg.h b/net/tipc/msg.h index 6d466ebdb64f..9a38f9c9d6eb 100644 --- a/net/tipc/msg.h +++ b/net/tipc/msg.h @@ -160,20 +160,26 @@ struct tipc_gap_ack { /* struct tipc_gap_ack_blks * @len: actual length of the record - * @gack_cnt: number of Gap ACK blocks in the record + * @bgack_cnt: number of Gap ACK blocks for broadcast in the record [Ying] "bgack_cnt" is defined after " ugack_cnt " in struct tipc_gap_ack_blks. Please reverse their order. + * @ugack_cnt: number of Gap ACK blocks for unicast (following the broadcast + * ones) + * @start_index: starting index for "valid" broadcast Gap ACK blocks * @gacks: array of Gap ACK blocks */ struct tipc_gap_ack_blks { __be16 len; - u8 gack_cnt; - u8 reserved; + union { + u8 ugack_cnt; + u8 start_index; + }; + u8 bgack_cnt; struct tipc_gap_ack gacks[]; }; #define tipc_gap_ack_blks_sz(n) (sizeof(struct tipc_gap_ack_blks) + \ sizeof(struct tipc_gap_ack) * (n)) -#define MAX_GAP_ACK_BLKS 32 +#define MAX_GAP_ACK_BLKS 128 #define MAX_GAP_ACK_BLKS_SZ tipc_gap_ack_blks_sz(MAX_GAP_ACK_BLKS) static inline struct tipc_msg *buf_msg(struct sk_buff *skb) diff --git a/net/tipc/node.c b/net/tipc/node.c index 0c88778c88b5..eb6b62de81a7 100644 --- a/net/tipc/node.c +++ b/net/tipc/node.c @@ -2069,10 +2069,16 @@ void tipc_rcv(struct net *net, struct sk_buff *skb, struct tipc_bearer *b) le = &n->links[bearer_id]; /* Ensure broadcast reception is in synch with peer's send state */ - if (unlikely(usr == LINK_PROTOCOL)) + if (unlikely(usr == LINK_PROTOCOL)) { + if (unlikely(skb_linearize(skb))) { + tipc_node_put(n); + goto discard; + } + hdr = buf_msg(skb); tipc_node_bc_sync_rcv(n, hdr, bearer_id, &xmitq); - else if (unlikely(tipc_link_acked(n->bc_entry.link) != bc_ack)) + } else if (unlikely(tipc_link_acked(n->bc_entry.link) 
!= bc_ack)) { tipc_bcast_ack_rcv(net, n->bc_entry.link, hdr); + } /* Receive packet directly if conditions permit */ tipc_node_read_lock(n); -- 2.13.7 |
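The block-building loop in __tipc_build_gap_ack_blks() can be tried out in user space. The sketch below is a simplified model, not the kernel code: a sorted array of sequence numbers stands in for the deferred queue, values stay in host byte order, and the names seq_less_eq, struct gack and build_gacks are invented for illustration. The 16-bit wrap-around comparison is an assumption modelled on how TIPC compares sequence numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct gack {
	uint16_t ack;	/* last seqno of a contiguous received run */
	uint16_t gap;	/* number of missing packets after 'ack' */
};

/* Wrap-around safe "a <= b" for 16-bit sequence numbers */
static inline int seq_less_eq(uint16_t a, uint16_t b)
{
	return (uint16_t)(b - a) < 32768u;
}

/* Build Gap ACK blocks from 'cnt' deferred seqnos sorted in ascending order.
 * At most 'max' blocks are written to 'out'; returns the number written.
 */
static inline size_t build_gacks(const uint16_t *deferred, size_t cnt,
				 struct gack *out, size_t max)
{
	uint16_t expect, seqno = 0;
	size_t i, n = 0;

	if (!cnt || !max)
		return 0;

	expect = deferred[0];
	for (i = 0; i < cnt; i++) {
		seqno = deferred[i];
		if (!seq_less_eq(seqno, expect)) {
			/* Hole found: close the run ending at expect - 1 */
			out[n].ack = (uint16_t)(expect - 1);
			out[n].gap = (uint16_t)(seqno - expect);
			if (++n >= max)
				return n;
		} else if (seqno != expect) {
			/* Entry older than expected: skip it */
			continue;
		}
		expect = (uint16_t)(seqno + 1);
	}

	/* Last block: everything up to the final seqno, no gap after it */
	out[n].ack = seqno;
	out[n].gap = 0;
	return n + 1;
}
```

For deferred seqnos 5, 6, 9, 10, 11, 15 this yields three blocks: (ack 6, gap 2), (ack 11, gap 3) and (ack 15, gap 0). In the patch the same routine runs twice, first for the broadcast receive link and then for the unicast link, with the unicast blocks written after the broadcast ones.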
From: Xue, Y. <Yin...@wi...> - 2020-04-06 07:00:12
|
Just a minor comment: please define macros for the cases:

1. Dump broadcast-link & unicast links
2. Dump broadcast-receiver links

Thanks,
Ying

-----Original Message-----
From: Tuong Lien [mailto:tuo...@de...]
Sent: Saturday, March 28, 2020 12:03 PM
To: jm...@re...; ma...@do...; Xue, Ying; tip...@li...
Cc: tip...@de...
Subject: [PATCH RFC 4/4] tipc: add support for broadcast rcv stats dumping

This commit enables dumping the statistics of a broadcast-receiver link like the traditional 'broadcast-link' one (which is for the broadcast-sender). The link dumping can be triggered via netlink (e.g. the iproute2/tipc tool) by using the link flag 'TIPC_NLA_LINK_BROADCAST' as the indicator.

The name of a broadcast-receiver link of a specific peer will be in the format: 'broadcast-link:<peer-id>'.

For example:

Link <broadcast-link:1001002>
  Window:50 packets
  RX packets:7841 fragments:2408/440 bundles:0/0
  TX packets:0 fragments:0/0 bundles:0/0
  RX naks:0 defs:124 dups:0
  TX naks:21 acks:0 retrans:0
  Congestion link:0 Send queue max:0 avg:0

In addition, the broadcast-receiver link statistics can be reset in the usual way via netlink by specifying that link name in the command.

Note: 'tipc_link_name_ext()' is removed because the link name can now be retrieved simply via 'l->name'.
Signed-off-by: Tuong Lien <tuo...@de...> --- net/tipc/bcast.c | 6 ++--- net/tipc/bcast.h | 5 +++-- net/tipc/link.c | 65 +++++++++++++++++++++++++++--------------------------- net/tipc/link.h | 3 +-- net/tipc/msg.c | 9 ++++---- net/tipc/msg.h | 2 +- net/tipc/netlink.c | 2 +- net/tipc/node.c | 63 +++++++++++++++++++++++++++++++++++++++++++++------- net/tipc/trace.h | 4 ++-- 9 files changed, 103 insertions(+), 56 deletions(-) diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c index 50a16f8bebd9..383f87bc1061 100644 --- a/net/tipc/bcast.c +++ b/net/tipc/bcast.c @@ -563,10 +563,8 @@ void tipc_bcast_remove_peer(struct net *net, struct tipc_link *rcv_l) tipc_sk_rcv(net, inputq); } -int tipc_bclink_reset_stats(struct net *net) +int tipc_bclink_reset_stats(struct net *net, struct tipc_link *l) { - struct tipc_link *l = tipc_bc_sndlink(net); - if (!l) return -ENOPROTOOPT; @@ -694,7 +692,7 @@ int tipc_bcast_init(struct net *net) tn->bcbase = bb; spin_lock_init(&tipc_net(net)->bclock); - if (!tipc_link_bc_create(net, 0, 0, + if (!tipc_link_bc_create(net, 0, 0, NULL, FB_MTU, BCLINK_WIN_DEFAULT, BCLINK_WIN_DEFAULT, diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h index 97d3cf9d3e4d..4240c95188b1 100644 --- a/net/tipc/bcast.h +++ b/net/tipc/bcast.h @@ -96,9 +96,10 @@ void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l, int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l, struct tipc_msg *hdr, struct sk_buff_head *retrq); -int tipc_nl_add_bc_link(struct net *net, struct tipc_nl_msg *msg); +int tipc_nl_add_bc_link(struct net *net, struct tipc_nl_msg *msg, + struct tipc_link *bcl); int tipc_nl_bc_link_set(struct net *net, struct nlattr *attrs[]); -int tipc_bclink_reset_stats(struct net *net); +int tipc_bclink_reset_stats(struct net *net, struct tipc_link *l); u32 tipc_bcast_get_broadcast_mode(struct net *net); u32 tipc_bcast_get_broadcast_ratio(struct net *net); diff --git a/net/tipc/link.c b/net/tipc/link.c index 3071e46f029a..808d3a76c27f 100644 --- 
a/net/tipc/link.c +++ b/net/tipc/link.c @@ -539,7 +539,7 @@ bool tipc_link_create(struct net *net, char *if_name, int bearer_id, * * Returns true if link was created, otherwise false */ -bool tipc_link_bc_create(struct net *net, u32 ownnode, u32 peer, +bool tipc_link_bc_create(struct net *net, u32 ownnode, u32 peer, u8 *peer_id, int mtu, u32 min_win, u32 max_win, u16 peer_caps, struct sk_buff_head *inputq, struct sk_buff_head *namedq, @@ -554,7 +554,18 @@ bool tipc_link_bc_create(struct net *net, u32 ownnode, u32 peer, return false; l = *link; - strcpy(l->name, tipc_bclink_name); + if (peer_id) { + char peer_str[NODE_ID_STR_LEN] = {0,}; + + tipc_nodeid2string(peer_str, peer_id); + if (strlen(peer_str) > 16) + sprintf(peer_str, "%x", peer); + /* Broadcast receiver link name: "broadcast-link:<peer>" */ + snprintf(l->name, sizeof(l->name), "%s:%s", tipc_bclink_name, + peer_str); + } else { + strcpy(l->name, tipc_bclink_name); + } trace_tipc_link_reset(l, TIPC_DUMP_ALL, "bclink created!"); tipc_link_reset(l); l->state = LINK_RESET; @@ -1412,11 +1423,8 @@ static u8 __tipc_build_gap_ack_blks(struct tipc_gap_ack_blks *ga, gacks[n].ack = htons(expect - 1); gacks[n].gap = htons(seqno - expect); if (++n >= MAX_GAP_ACK_BLKS / 2) { - char buf[TIPC_MAX_LINK_NAME]; - pr_info_ratelimited("Gacks on %s: %d, ql: %d!\n", - tipc_link_name_ext(l, buf), - n, + l->name, n, skb_queue_len(&l->deferdq)); return n; } @@ -1600,6 +1608,8 @@ static int tipc_link_advance_transmq(struct tipc_link *l, struct tipc_link *r, _skb->priority = TC_PRIO_CONTROL; __skb_queue_tail(xmitq, _skb); l->stats.retransmitted++; + if (!is_uc) + r->stats.retransmitted++; *retransmitted = true; /* Increase actual retrans counter & mark first time */ if (!TIPC_SKB_CB(skb)->retr_cnt++) @@ -1766,7 +1776,8 @@ int tipc_link_rcv(struct tipc_link *l, struct sk_buff *skb, /* Defer delivery if sequence gap */ if (unlikely(seqno != rcv_nxt)) { - __tipc_skb_queue_sorted(defq, seqno, skb); + if (!__tipc_skb_queue_sorted(defq, 
seqno, skb)) + l->stats.duplicates++; rc |= tipc_link_build_nack_msg(l, xmitq); break; } @@ -1800,15 +1811,15 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe, int tolerance, int priority, struct sk_buff_head *xmitq) { + struct tipc_mon_state *mstate = &l->mon_state; + struct sk_buff_head *dfq = &l->deferdq; struct tipc_link *bcl = l->bc_rcvlink; - struct sk_buff *skb; struct tipc_msg *hdr; - struct sk_buff_head *dfq = &l->deferdq; + struct sk_buff *skb; bool node_up = link_is_up(bcl); - struct tipc_mon_state *mstate = &l->mon_state; + u16 glen = 0, bc_rcvgap = 0; int dlen = 0; void *data; - u16 glen = 0; /* Don't send protocol message during reset or link failover */ if (tipc_link_is_blocked(l)) @@ -1846,7 +1857,8 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe, if (l->peer_caps & TIPC_LINK_PROTO_SEQNO) msg_set_seqno(hdr, l->snd_nxt_state++); msg_set_seq_gap(hdr, rcvgap); - msg_set_bc_gap(hdr, link_bc_rcv_gap(bcl)); + bc_rcvgap = link_bc_rcv_gap(bcl); + msg_set_bc_gap(hdr, bc_rcvgap); msg_set_probe(hdr, probe); msg_set_is_keepalive(hdr, probe || probe_reply); if (l->peer_caps & TIPC_GAP_ACK_BLOCK) @@ -1871,6 +1883,8 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe, l->stats.sent_probes++; if (rcvgap) l->stats.sent_nacks++; + if (bc_rcvgap) + bcl->stats.sent_nacks++; skb->priority = TC_PRIO_CONTROL; __skb_queue_tail(xmitq, skb); trace_tipc_proto_build(skb, false, l->name); @@ -2371,8 +2385,6 @@ int tipc_link_bc_sync_rcv(struct tipc_link *l, struct tipc_msg *hdr, if (!l->bc_peer_is_up) return rc; - l->stats.recv_nacks++; - /* Ignore if peers_snd_nxt goes beyond receive window */ if (more(peers_snd_nxt, l->rcv_nxt + l->window)) return rc; @@ -2423,6 +2435,11 @@ int tipc_link_bc_ack_rcv(struct tipc_link *r, u16 acked, u16 gap, if (!link_is_up(r) || !r->bc_peer_is_up) return 0; + if (gap) { + l->stats.recv_nacks++; + r->stats.recv_nacks++; + } + if (less(acked, 
r->acked) || (acked == r->acked && !gap && !ga)) return 0; @@ -2734,16 +2751,15 @@ static int __tipc_nl_add_bc_link_stat(struct sk_buff *skb, return -EMSGSIZE; } -int tipc_nl_add_bc_link(struct net *net, struct tipc_nl_msg *msg) +int tipc_nl_add_bc_link(struct net *net, struct tipc_nl_msg *msg, + struct tipc_link *bcl) { int err; void *hdr; struct nlattr *attrs; struct nlattr *prop; - struct tipc_net *tn = net_generic(net, tipc_net_id); u32 bc_mode = tipc_bcast_get_broadcast_mode(net); u32 bc_ratio = tipc_bcast_get_broadcast_ratio(net); - struct tipc_link *bcl = tn->bcl; if (!bcl) return 0; @@ -2830,21 +2846,6 @@ void tipc_link_set_abort_limit(struct tipc_link *l, u32 limit) l->abort_limit = limit; } -char *tipc_link_name_ext(struct tipc_link *l, char *buf) -{ - if (!l) - scnprintf(buf, TIPC_MAX_LINK_NAME, "null"); - else if (link_is_bc_sndlink(l)) - scnprintf(buf, TIPC_MAX_LINK_NAME, "broadcast-sender"); - else if (link_is_bc_rcvlink(l)) - scnprintf(buf, TIPC_MAX_LINK_NAME, - "broadcast-receiver, peer %x", l->addr); - else - memcpy(buf, l->name, TIPC_MAX_LINK_NAME); - - return buf; -} - /** * tipc_link_dump - dump TIPC link data * @l: tipc link to be dumped diff --git a/net/tipc/link.h b/net/tipc/link.h index 4d0768cf91d5..fc07232c9a12 100644 --- a/net/tipc/link.h +++ b/net/tipc/link.h @@ -80,7 +80,7 @@ bool tipc_link_create(struct net *net, char *if_name, int bearer_id, struct sk_buff_head *inputq, struct sk_buff_head *namedq, struct tipc_link **link); -bool tipc_link_bc_create(struct net *net, u32 ownnode, u32 peer, +bool tipc_link_bc_create(struct net *net, u32 ownnode, u32 peer, u8 *peer_id, int mtu, u32 min_win, u32 max_win, u16 peer_caps, struct sk_buff_head *inputq, struct sk_buff_head *namedq, @@ -111,7 +111,6 @@ u16 tipc_link_rcv_nxt(struct tipc_link *l); u16 tipc_link_acked(struct tipc_link *l); u32 tipc_link_id(struct tipc_link *l); char *tipc_link_name(struct tipc_link *l); -char *tipc_link_name_ext(struct tipc_link *l, char *buf); u32 
tipc_link_state(struct tipc_link *l); char tipc_link_plane(struct tipc_link *l); int tipc_link_prio(struct tipc_link *l); diff --git a/net/tipc/msg.c b/net/tipc/msg.c index 0d515d20b056..69d68512300a 100644 --- a/net/tipc/msg.c +++ b/net/tipc/msg.c @@ -828,19 +828,19 @@ bool tipc_msg_pskb_copy(u32 dst, struct sk_buff_head *msg, * @seqno: sequence number of buffer to add * @skb: buffer to add */ -void __tipc_skb_queue_sorted(struct sk_buff_head *list, u16 seqno, +bool __tipc_skb_queue_sorted(struct sk_buff_head *list, u16 seqno, struct sk_buff *skb) { struct sk_buff *_skb, *tmp; if (skb_queue_empty(list) || less(seqno, buf_seqno(skb_peek(list)))) { __skb_queue_head(list, skb); - return; + return true; } if (more(seqno, buf_seqno(skb_peek_tail(list)))) { __skb_queue_tail(list, skb); - return; + return true; } skb_queue_walk_safe(list, _skb, tmp) { @@ -849,9 +849,10 @@ void __tipc_skb_queue_sorted(struct sk_buff_head *list, u16 seqno, if (seqno == buf_seqno(_skb)) break; __skb_queue_before(list, _skb, skb); - return; + return true; } kfree_skb(skb); + return false; } void tipc_skb_reject(struct net *net, int err, struct sk_buff *skb, diff --git a/net/tipc/msg.h b/net/tipc/msg.h index 9a38f9c9d6eb..87e2d472f75f 100644 --- a/net/tipc/msg.h +++ b/net/tipc/msg.h @@ -1127,7 +1127,7 @@ bool tipc_msg_assemble(struct sk_buff_head *list); bool tipc_msg_reassemble(struct sk_buff_head *list, struct sk_buff_head *rcvq); bool tipc_msg_pskb_copy(u32 dst, struct sk_buff_head *msg, struct sk_buff_head *cpy); -void __tipc_skb_queue_sorted(struct sk_buff_head *list, u16 seqno, +bool __tipc_skb_queue_sorted(struct sk_buff_head *list, u16 seqno, struct sk_buff *skb); bool tipc_msg_skb_clone(struct sk_buff_head *msg, struct sk_buff_head *cpy); diff --git a/net/tipc/netlink.c b/net/tipc/netlink.c index 7c35094c20b8..8dfad18330bc 100644 --- a/net/tipc/netlink.c +++ b/net/tipc/netlink.c @@ -187,7 +187,7 @@ static const struct genl_ops tipc_genl_v2_ops[] = { }, { .cmd = TIPC_NL_LINK_GET, - 
.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, + .validate = GENL_DONT_VALIDATE_STRICT, .doit = tipc_nl_node_get_link, .dumpit = tipc_nl_node_dump_link, }, diff --git a/net/tipc/node.c b/net/tipc/node.c index 917ad3920fac..373d07ae6730 100644 --- a/net/tipc/node.c +++ b/net/tipc/node.c @@ -1138,7 +1138,7 @@ void tipc_node_check_dest(struct net *net, u32 addr, if (unlikely(!n->bc_entry.link)) { snd_l = tipc_bc_sndlink(net); if (!tipc_link_bc_create(net, tipc_own_addr(net), - addr, U16_MAX, + addr, peer_id, U16_MAX, tipc_link_min_win(snd_l), tipc_link_max_win(snd_l), n->capabilities, @@ -2432,7 +2432,7 @@ int tipc_nl_node_get_link(struct sk_buff *skb, struct genl_info *info) return -ENOMEM; if (strcmp(name, tipc_bclink_name) == 0) { - err = tipc_nl_add_bc_link(net, &msg); + err = tipc_nl_add_bc_link(net, &msg, tipc_net(net)->bcl); if (err) goto err_free; } else { @@ -2476,6 +2476,7 @@ int tipc_nl_node_reset_link_stats(struct sk_buff *skb, struct genl_info *info) struct tipc_node *node; struct nlattr *attrs[TIPC_NLA_LINK_MAX + 1]; struct net *net = sock_net(skb->sk); + struct tipc_net *tn = tipc_net(net); struct tipc_link_entry *le; if (!info->attrs[TIPC_NLA_LINK]) @@ -2492,11 +2493,26 @@ int tipc_nl_node_reset_link_stats(struct sk_buff *skb, struct genl_info *info) link_name = nla_data(attrs[TIPC_NLA_LINK_NAME]); - if (strcmp(link_name, tipc_bclink_name) == 0) { - err = tipc_bclink_reset_stats(net); + err = -EINVAL; + if (!strcmp(link_name, tipc_bclink_name)) { + err = tipc_bclink_reset_stats(net, tipc_bc_sndlink(net)); if (err) return err; return 0; + } else if (strstr(link_name, tipc_bclink_name)) { + rcu_read_lock(); + list_for_each_entry_rcu(node, &tn->node_list, list) { + tipc_node_read_lock(node); + link = node->bc_entry.link; + if (link && !strcmp(link_name, tipc_link_name(link))) { + err = tipc_bclink_reset_stats(net, link); + tipc_node_read_unlock(node); + break; + } + tipc_node_read_unlock(node); + } + rcu_read_unlock(); + return err; } 
node = tipc_node_find_by_name(net, link_name, &bearer_id); @@ -2520,7 +2536,8 @@ int tipc_nl_node_reset_link_stats(struct sk_buff *skb, struct genl_info *info) /* Caller should hold node lock */ static int __tipc_nl_add_node_links(struct net *net, struct tipc_nl_msg *msg, - struct tipc_node *node, u32 *prev_link) + struct tipc_node *node, u32 *prev_link, + u32 type) { u32 i; int err; @@ -2536,6 +2553,14 @@ static int __tipc_nl_add_node_links(struct net *net, struct tipc_nl_msg *msg, if (err) return err; } + + if (type == 2) { + *prev_link = 3; + err = tipc_nl_add_bc_link(net, msg, node->bc_entry.link); + if (err) + return err; + } + *prev_link = 0; return 0; @@ -2544,17 +2569,38 @@ static int __tipc_nl_add_node_links(struct net *net, struct tipc_nl_msg *msg, int tipc_nl_node_dump_link(struct sk_buff *skb, struct netlink_callback *cb) { struct net *net = sock_net(skb->sk); + struct nlattr **attrs = genl_dumpit_info(cb)->attrs; + struct nlattr *link[TIPC_NLA_LINK_MAX + 1]; struct tipc_net *tn = net_generic(net, tipc_net_id); struct tipc_node *node; struct tipc_nl_msg msg; u32 prev_node = cb->args[0]; u32 prev_link = cb->args[1]; int done = cb->args[2]; + u32 type = cb->args[3]; int err; if (done) return 0; + if (!type) { + /* Dump broadcast-link & unicast links */ + type = 1; + if (attrs && attrs[TIPC_NLA_LINK]) { + err = nla_parse_nested_deprecated(link, + TIPC_NLA_LINK_MAX, + attrs[TIPC_NLA_LINK], + tipc_nl_link_policy, + NULL); + if (unlikely(err)) + return err; + if (unlikely(!link[TIPC_NLA_LINK_BROADCAST])) + return -EINVAL; + /* Dump broadcast-receiver links as well */ + type = 2; + } + } + msg.skb = skb; msg.portid = NETLINK_CB(cb->skb).portid; msg.seq = cb->nlh->nlmsg_seq; @@ -2578,7 +2624,7 @@ int tipc_nl_node_dump_link(struct sk_buff *skb, struct netlink_callback *cb) list) { tipc_node_read_lock(node); err = __tipc_nl_add_node_links(net, &msg, node, - &prev_link); + &prev_link, type); tipc_node_read_unlock(node); if (err) goto out; @@ -2586,14 +2632,14 @@ 
int tipc_nl_node_dump_link(struct sk_buff *skb, struct netlink_callback *cb) prev_node = node->addr; } } else { - err = tipc_nl_add_bc_link(net, &msg); + err = tipc_nl_add_bc_link(net, &msg, tn->bcl); if (err) goto out; list_for_each_entry_rcu(node, &tn->node_list, list) { tipc_node_read_lock(node); err = __tipc_nl_add_node_links(net, &msg, node, - &prev_link); + &prev_link, type); tipc_node_read_unlock(node); if (err) goto out; @@ -2608,6 +2654,7 @@ int tipc_nl_node_dump_link(struct sk_buff *skb, struct netlink_callback *cb) cb->args[0] = prev_node; cb->args[1] = prev_link; cb->args[2] = done; + cb->args[3] = type; return skb->len; } diff --git a/net/tipc/trace.h b/net/tipc/trace.h index e7535ab75255..04af83f0500c 100644 --- a/net/tipc/trace.h +++ b/net/tipc/trace.h @@ -255,7 +255,7 @@ DECLARE_EVENT_CLASS(tipc_link_class, TP_fast_assign( __assign_str(header, header); - tipc_link_name_ext(l, __entry->name); + memcpy(__entry->name, tipc_link_name(l), TIPC_MAX_LINK_NAME); tipc_link_dump(l, dqueues, __get_str(buf)); ), @@ -295,7 +295,7 @@ DECLARE_EVENT_CLASS(tipc_link_transmq_class, ), TP_fast_assign( - tipc_link_name_ext(r, __entry->name); + memcpy(__entry->name, tipc_link_name(r), TIPC_MAX_LINK_NAME); __entry->from = f; __entry->to = t; __entry->len = skb_queue_len(tq); -- 2.13.7 |
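Ying's request at the top of this message (named macros instead of the bare values 1 and 2 for the dump type) could look roughly like the sketch below. The identifiers `TIPC_LINK_DUMP_*` and the helper `link_dump_type()` are hypothetical and do not appear in the patch. The helper only mirrors the `if (!type)` branch of `tipc_nl_node_dump_link()`, with the netlink attribute parsing reduced to two booleans and the parse-failure path left out.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical names; the patch itself uses the bare values 1 and 2 */
enum tipc_link_dump_type {
	TIPC_LINK_DUMP_UNSET = 0,  /* first pass, nothing decided yet */
	TIPC_LINK_DUMP_BC_UC = 1,  /* broadcast-link & unicast links */
	TIPC_LINK_DUMP_BC_RCV = 2, /* broadcast-receiver links as well */
};

/* Returns the dump type to keep in cb->args[3], or a negative errno.
 * has_link_attr: the request carried a TIPC_NLA_LINK nest
 * has_bc_flag:   that nest carried TIPC_NLA_LINK_BROADCAST
 */
static int link_dump_type(int cur, bool has_link_attr, bool has_bc_flag)
{
	if (cur != TIPC_LINK_DUMP_UNSET)
		return cur; /* decided on an earlier dump pass */
	if (!has_link_attr)
		return TIPC_LINK_DUMP_BC_UC;
	if (!has_bc_flag)
		return -EINVAL;
	return TIPC_LINK_DUMP_BC_RCV;
}
```

With such names, the check in `__tipc_nl_add_node_links()` would read `type == TIPC_LINK_DUMP_BC_RCV` rather than `type == 2`, which is the readability gain the review comment seems to be after.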
From: Tuong T. L. <tuo...@de...> - 2020-04-06 05:16:09
|
Hi Ying,

Thanks for your comments, please see my feedback below.

BR/Tuong

-----Original Message-----
From: Xue, Ying <Yin...@wi...>
Sent: Monday, April 6, 2020 10:46 AM
To: Tuong Tong Lien <tuo...@de...>; jm...@re...; ma...@do...; tip...@li...
Cc: tipc-dek <tip...@de...>
Subject: RE: [PATCH RFC 1/4] tipc: introduce Gap ACK blocks for broadcast link

Hi Tuong,

Please see my comments inline:

As achieved through commit 9195948fbf34 ("tipc: improve TIPC throughput by
Gap ACK blocks"), we apply the same mechanism for the broadcast link as
well. The 'Gap ACK blocks' data field in a 'PROTOCOL/STATE_MSG' will
consist of two parts built for both the broadcast and unicast types:

       31                       16 15                        0
      +-------------+-------------+-------------+-------------+
      |  bgack_cnt  |  ugack_cnt  |            len            |
      +-------------+-------------+-------------+-------------+  -
      |            gap            |            ack            |   |
      +-------------+-------------+-------------+-------------+    > bc gacks
      :                           :                           :   |
      +-------------+-------------+-------------+-------------+  -
      |            gap            |            ack            |   |
      +-------------+-------------+-------------+-------------+    > uc gacks
      :                           :                           :   |
      +-------------+-------------+-------------+-------------+  -

which is "automatically" backward-compatible.

===
[Ying] In my opinion, this patch will cause the backward-compatibility
issue below:

1) On the TIPC node with the patch:
   When sending a 'PROTOCOL/STATE_MSG' message, its 'Gap ACK blocks' data
   field contains only bcl gap ack blocks, and no unicast link gap ack
   blocks.

2) On the TIPC node without the patch:
   Upon receiving the message sent by the node of case 1), this node will
   assume that its 'Gap ACK blocks' data field holds unicast link gap ack
   blocks rather than broadcast link gap ack blocks.

[Tuong]: As you can see in the figure above, we have two different
"b/ugack_cnt" fields which determine the number of broadcast/unicast gap
ack blocks in the message. The "ugack_cnt" is fully identical to the
"gack_cnt" in the old version (i.e. without the patch): it indicates the
number of unicast gap ack blocks anyway, whereas the "bgack_cnt" was a
reserved field. So, in your situation, the sending side will send the
message with "ugack_cnt" = 0, and this is completely compatible with the
old version: the receiving side will see no unicast gap ack blocks and
just ignore the broadcast gap ack blocks (which it doesn't really know
about). Actually, there is also a sanity check on the length in the old
code that will simply ignore such a gap ack block report... So, we have no
problem at all. That is why I've declared it backward compatible
automatically.

[Ying] So I wonder if no backward-compatibility issue would exist, and
everything would become pretty easy, if we use LINK_PROTOCOL to contain
only unicast gap ack blocks and use BCAST_PROTOCOL to convey broadcast gap
ack blocks. In other words, we don't need to enlarge the current gap ack
block space, and we don't need to change the current code related to
unicast gap ack blocks. Instead, we just need to add the support for
broadcast gap ack blocks through BCAST_PROTOCOL rather than LINK_PROTOCOL.

[Tuong]: The BCAST_PROTOCOL is currently only used for broadcast
initializing or synching when a new peer joins; the old mechanism of
broadcast NACKs is deprecated... I suppose that using the LINK_PROTOCOL is
much more convenient since the traditional ack/gap reports for the
broadcast link are also made via that message, so we don't need to create
a new code flow to handle the gap/ack blocks. Actually, the change in the
current code related to unicast gap ack blocks is just to optimize the
code, e.g. removing old functions, etc.; there is no impact on its
functionality.
===

Thanks,
Ying

We also increase the max number of Gap ACK blocks to 128, allowing up to 64
blocks per type (total buffer size = 516 bytes).

Besides, the 'tipc_link_advance_transmq()' function is refactored which
is applicable for both the unicast and broadcast cases now, so some old
functions can be removed and the code is optimized.

With the patch, TIPC broadcast is more robust regardless of packet loss or disorder, latency, ... in the underlying network. Its performance is boost up significantly. For example, experiment with a 5% packet loss rate results: $ time tipc-pipe --mc --rdm --data_size 123 --data_num 1500000 real 0m 42.46s user 0m 1.16s sys 0m 17.67s Without the patch: $ time tipc-pipe --mc --rdm --data_size 123 --data_num 1500000 real 8m 27.94s user 0m 0.55s sys 0m 2.38s Acked-by: Jon Maloy <jm...@re...> Signed-off-by: Tuong Lien <tuo...@de...> --- net/tipc/bcast.c | 9 +- net/tipc/link.c | 438 +++++++++++++++++++++++++++++++++---------------------- net/tipc/link.h | 7 +- net/tipc/msg.h | 14 +- net/tipc/node.c | 10 +- 5 files changed, 293 insertions(+), 185 deletions(-) diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c index 4c20be08b9c4..3ce690a96ee9 100644 --- a/net/tipc/bcast.c +++ b/net/tipc/bcast.c @@ -474,7 +474,7 @@ void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l, __skb_queue_head_init(&xmitq); tipc_bcast_lock(net); - tipc_link_bc_ack_rcv(l, acked, &xmitq); + tipc_link_bc_ack_rcv(l, acked, 0, NULL, &xmitq); tipc_bcast_unlock(net); tipc_bcbase_xmit(net, &xmitq); @@ -492,6 +492,7 @@ int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l, struct tipc_msg *hdr) { struct sk_buff_head *inputq = &tipc_bc_base(net)->inputq; + struct tipc_gap_ack_blks *ga; struct sk_buff_head xmitq; int rc = 0; @@ -501,8 +502,10 @@ int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l, if (msg_type(hdr) != STATE_MSG) { tipc_link_bc_init_rcv(l, hdr); } else if (!msg_bc_ack_invalid(hdr)) { - tipc_link_bc_ack_rcv(l, msg_bcast_ack(hdr), &xmitq); - rc = tipc_link_bc_sync_rcv(l, hdr, &xmitq); + tipc_get_gap_ack_blks(&ga, l, hdr, false); + rc = tipc_link_bc_ack_rcv(l, msg_bcast_ack(hdr), + msg_bc_gap(hdr), ga, &xmitq); + rc |= tipc_link_bc_sync_rcv(l, hdr, &xmitq); } tipc_bcast_unlock(net); diff --git a/net/tipc/link.c b/net/tipc/link.c index 467c53a1fb5c..1b60ba665504 100644 --- 
a/net/tipc/link.c +++ b/net/tipc/link.c @@ -188,6 +188,8 @@ struct tipc_link { /* Broadcast */ u16 ackers; u16 acked; + u16 last_gap; + struct tipc_gap_ack_blks *last_ga; struct tipc_link *bc_rcvlink; struct tipc_link *bc_sndlink; u8 nack_state; @@ -249,11 +251,14 @@ static int tipc_link_build_nack_msg(struct tipc_link *l, struct sk_buff_head *xmitq); static void tipc_link_build_bc_init_msg(struct tipc_link *l, struct sk_buff_head *xmitq); -static int tipc_link_release_pkts(struct tipc_link *l, u16 to); -static u16 tipc_build_gap_ack_blks(struct tipc_link *l, void *data, u16 gap); -static int tipc_link_advance_transmq(struct tipc_link *l, u16 acked, u16 gap, +static u8 __tipc_build_gap_ack_blks(struct tipc_gap_ack_blks *ga, + struct tipc_link *l, u8 start_index); +static u16 tipc_build_gap_ack_blks(struct tipc_link *l, struct tipc_msg *hdr); +static int tipc_link_advance_transmq(struct tipc_link *l, struct tipc_link *r, + u16 acked, u16 gap, struct tipc_gap_ack_blks *ga, - struct sk_buff_head *xmitq); + struct sk_buff_head *xmitq, + bool *retransmitted, int *rc); static void tipc_link_update_cwin(struct tipc_link *l, int released, bool retransmitted); /* @@ -370,7 +375,7 @@ void tipc_link_remove_bc_peer(struct tipc_link *snd_l, snd_l->ackers--; rcv_l->bc_peer_is_up = true; rcv_l->state = LINK_ESTABLISHED; - tipc_link_bc_ack_rcv(rcv_l, ack, xmitq); + tipc_link_bc_ack_rcv(rcv_l, ack, 0, NULL, xmitq); trace_tipc_link_reset(rcv_l, TIPC_DUMP_ALL, "bclink removed!"); tipc_link_reset(rcv_l); rcv_l->state = LINK_RESET; @@ -784,8 +789,6 @@ bool tipc_link_too_silent(struct tipc_link *l) return (l->silent_intv_cnt + 2 > l->abort_limit); } -static int tipc_link_bc_retrans(struct tipc_link *l, struct tipc_link *r, - u16 from, u16 to, struct sk_buff_head *xmitq); /* tipc_link_timeout - perform periodic task as instructed from node timeout */ int tipc_link_timeout(struct tipc_link *l, struct sk_buff_head *xmitq) @@ -948,6 +951,9 @@ void tipc_link_reset(struct tipc_link *l) 
l->snd_nxt_state = 1; l->rcv_nxt_state = 1; l->acked = 0; + l->last_gap = 0; + kfree(l->last_ga); + l->last_ga = NULL; l->silent_intv_cnt = 0; l->rst_cnt = 0; l->bc_peer_is_up = false; @@ -1183,68 +1189,14 @@ static bool link_retransmit_failure(struct tipc_link *l, struct tipc_link *r, if (link_is_bc_sndlink(l)) { r->state = LINK_RESET; - *rc = TIPC_LINK_DOWN_EVT; + *rc |= TIPC_LINK_DOWN_EVT; } else { - *rc = tipc_link_fsm_evt(l, LINK_FAILURE_EVT); + *rc |= tipc_link_fsm_evt(l, LINK_FAILURE_EVT); } return true; } -/* tipc_link_bc_retrans() - retransmit zero or more packets - * @l: the link to transmit on - * @r: the receiving link ordering the retransmit. Same as l if unicast - * @from: retransmit from (inclusive) this sequence number - * @to: retransmit to (inclusive) this sequence number - * xmitq: queue for accumulating the retransmitted packets - */ -static int tipc_link_bc_retrans(struct tipc_link *l, struct tipc_link *r, - u16 from, u16 to, struct sk_buff_head *xmitq) -{ - struct sk_buff *_skb, *skb = skb_peek(&l->transmq); - u16 bc_ack = l->bc_rcvlink->rcv_nxt - 1; - u16 ack = l->rcv_nxt - 1; - int retransmitted = 0; - struct tipc_msg *hdr; - int rc = 0; - - if (!skb) - return 0; - if (less(to, from)) - return 0; - - trace_tipc_link_retrans(r, from, to, &l->transmq); - - if (link_retransmit_failure(l, r, &rc)) - return rc; - - skb_queue_walk(&l->transmq, skb) { - hdr = buf_msg(skb); - if (less(msg_seqno(hdr), from)) - continue; - if (more(msg_seqno(hdr), to)) - break; - if (time_before(jiffies, TIPC_SKB_CB(skb)->nxt_retr)) - continue; - TIPC_SKB_CB(skb)->nxt_retr = TIPC_BC_RETR_LIM; - _skb = pskb_copy(skb, GFP_ATOMIC); - if (!_skb) - return 0; - hdr = buf_msg(_skb); - msg_set_ack(hdr, ack); - msg_set_bcast_ack(hdr, bc_ack); - _skb->priority = TC_PRIO_CONTROL; - __skb_queue_tail(xmitq, _skb); - l->stats.retransmitted++; - retransmitted++; - /* Increase actual retrans counter & mark first time */ - if (!TIPC_SKB_CB(skb)->retr_cnt++) - 
TIPC_SKB_CB(skb)->retr_stamp = jiffies; - } - tipc_link_update_cwin(l, 0, retransmitted); - return 0; -} - /* tipc_data_input - deliver data and name distr msgs to upper layer * * Consumes buffer if message is of right type @@ -1402,46 +1354,71 @@ static int tipc_link_tnl_rcv(struct tipc_link *l, struct sk_buff *skb, return rc; } -static int tipc_link_release_pkts(struct tipc_link *l, u16 acked) -{ - int released = 0; - struct sk_buff *skb, *tmp; - - skb_queue_walk_safe(&l->transmq, skb, tmp) { - if (more(buf_seqno(skb), acked)) - break; - __skb_unlink(skb, &l->transmq); - kfree_skb(skb); - released++; +/** + * tipc_get_gap_ack_blks - get Gap ACK blocks from PROTOCOL/STATE_MSG + * @ga: returned pointer to the Gap ACK blocks if any + * @l: the tipc link + * @hdr: the PROTOCOL/STATE_MSG header + * @uc: desired Gap ACK blocks type, i.e. unicast (= 1) or broadcast (= 0) + * + * Return: the total Gap ACK blocks size + */ +u16 tipc_get_gap_ack_blks(struct tipc_gap_ack_blks **ga, struct tipc_link *l, + struct tipc_msg *hdr, bool uc) +{ + struct tipc_gap_ack_blks *p; + u16 sz = 0; + + /* Does peer support the Gap ACK blocks feature? */ + if (l->peer_caps & TIPC_GAP_ACK_BLOCK) { + p = (struct tipc_gap_ack_blks *)msg_data(hdr); + sz = ntohs(p->len); + /* Sanity check */ + if (sz == tipc_gap_ack_blks_sz(p->ugack_cnt + p->bgack_cnt)) { + /* Good, check if the desired type exists */ + if ((uc && p->ugack_cnt) || (!uc && p->bgack_cnt)) + goto ok; + /* Backward compatible: peer might not support bc, but uc? */ + } else if (uc && sz == tipc_gap_ack_blks_sz(p->ugack_cnt)) { + if (p->ugack_cnt) { + p->bgack_cnt = 0; + goto ok; + } + } } - return released; + /* Other cases: ignore! 
*/ + p = NULL; + +ok: + *ga = p; + return sz; } -/* tipc_build_gap_ack_blks - build Gap ACK blocks - * @l: tipc link that data have come with gaps in sequence if any - * @data: data buffer to store the Gap ACK blocks after built - * - * returns the actual allocated memory size - */ -static u16 tipc_build_gap_ack_blks(struct tipc_link *l, void *data, u16 gap) +static u8 __tipc_build_gap_ack_blks(struct tipc_gap_ack_blks *ga, + struct tipc_link *l, u8 start_index) { + struct tipc_gap_ack *gacks = &ga->gacks[start_index]; struct sk_buff *skb = skb_peek(&l->deferdq); - struct tipc_gap_ack_blks *ga = data; - u16 len, expect, seqno = 0; + u16 expect, seqno = 0; u8 n = 0; - if (!skb || !gap) - goto exit; + if (!skb) + return 0; expect = buf_seqno(skb); skb_queue_walk(&l->deferdq, skb) { seqno = buf_seqno(skb); if (unlikely(more(seqno, expect))) { - ga->gacks[n].ack = htons(expect - 1); - ga->gacks[n].gap = htons(seqno - expect); - if (++n >= MAX_GAP_ACK_BLKS) { - pr_info_ratelimited("Too few Gap ACK blocks!\n"); - goto exit; + gacks[n].ack = htons(expect - 1); + gacks[n].gap = htons(seqno - expect); + if (++n >= MAX_GAP_ACK_BLKS / 2) { + char buf[TIPC_MAX_LINK_NAME]; + + pr_info_ratelimited("Gacks on %s: %d, ql: %d!\n", + tipc_link_name_ext(l, buf), + n, + skb_queue_len(&l->deferdq)); + return n; } } else if (unlikely(less(seqno, expect))) { pr_warn("Unexpected skb in deferdq!\n"); @@ -1451,14 +1428,57 @@ static u16 tipc_build_gap_ack_blks(struct tipc_link *l, void *data, u16 gap) } /* last block */ - ga->gacks[n].ack = htons(seqno); - ga->gacks[n].gap = 0; + gacks[n].ack = htons(seqno); + gacks[n].gap = 0; n++; + return n; +} -exit: - len = tipc_gap_ack_blks_sz(n); +/* tipc_build_gap_ack_blks - build Gap ACK blocks + * @l: tipc unicast link + * @hdr: the tipc message buffer to store the Gap ACK blocks after built + * + * The function builds Gap ACK blocks for both the unicast & broadcast receiver + * links of a certain peer, the buffer after built has the network data 
format + * as follows: + * 31 16 15 0 + * +-------------+-------------+-------------+-------------+ + * | bgack_cnt | ugack_cnt | len | + * +-------------+-------------+-------------+-------------+ - + * | gap | ack | | + * +-------------+-------------+-------------+-------------+ > bc gacks + * : : : | + * +-------------+-------------+-------------+-------------+ - + * | gap | ack | | + * +-------------+-------------+-------------+-------------+ > uc gacks + * : : : | + * +-------------+-------------+-------------+-------------+ - + * (See struct tipc_gap_ack_blks) + * + * returns the actual allocated memory size + */ +static u16 tipc_build_gap_ack_blks(struct tipc_link *l, struct tipc_msg *hdr) +{ + struct tipc_link *bcl = l->bc_rcvlink; + struct tipc_gap_ack_blks *ga; + u16 len; + + ga = (struct tipc_gap_ack_blks *)msg_data(hdr); + + /* Start with broadcast link first */ + tipc_bcast_lock(bcl->net); + msg_set_bcast_ack(hdr, bcl->rcv_nxt - 1); + msg_set_bc_gap(hdr, link_bc_rcv_gap(bcl)); + ga->bgack_cnt = __tipc_build_gap_ack_blks(ga, bcl, 0); + tipc_bcast_unlock(bcl->net); + + /* Now for unicast link, but an explicit NACK only (???) */ + ga->ugack_cnt = (msg_seq_gap(hdr)) ? + __tipc_build_gap_ack_blks(ga, l, ga->bgack_cnt) : 0; + + /* Total len */ + len = tipc_gap_ack_blks_sz(ga->bgack_cnt + ga->ugack_cnt); ga->len = htons(len); - ga->gack_cnt = n; return len; } @@ -1466,47 +1486,109 @@ static u16 tipc_build_gap_ack_blks(struct tipc_link *l, void *data, u16 gap) * acked packets, also doing retransmissions if * gaps found * @l: tipc link with transmq queue to be advanced + * @r: tipc link "receiver" i.e. in case of broadcast (= "l" if unicast) * @acked: seqno of last packet acked by peer without any gaps before * @gap: # of gap packets * @ga: buffer pointer to Gap ACK blocks from peer * @xmitq: queue for accumulating the retransmitted packets if any + * @retransmitted: returned boolean value if a retransmission is really issued + * @rc: returned code e.g. 
TIPC_LINK_DOWN_EVT if a repeated retransmit failures + * happens (- unlikely case) * - * In case of a repeated retransmit failures, the call will return shortly - * with a returned code (e.g. TIPC_LINK_DOWN_EVT) + * Return: the number of packets released from the link transmq */ -static int tipc_link_advance_transmq(struct tipc_link *l, u16 acked, u16 gap, +static int tipc_link_advance_transmq(struct tipc_link *l, struct tipc_link *r, + u16 acked, u16 gap, struct tipc_gap_ack_blks *ga, - struct sk_buff_head *xmitq) + struct sk_buff_head *xmitq, + bool *retransmitted, int *rc) { + struct tipc_gap_ack_blks *last_ga = r->last_ga, *this_ga = NULL; + struct tipc_gap_ack *gacks = NULL; struct sk_buff *skb, *_skb, *tmp; struct tipc_msg *hdr; + u32 qlen = skb_queue_len(&l->transmq); + u16 nacked = acked, ngap = gap, gack_cnt = 0; u16 bc_ack = l->bc_rcvlink->rcv_nxt - 1; - bool retransmitted = false; u16 ack = l->rcv_nxt - 1; - bool passed = false; - u16 released = 0; u16 seqno, n = 0; - int rc = 0; + u16 end = r->acked, start = end, offset = r->last_gap; + u16 si = (last_ga) ? 
last_ga->start_index : 0; + bool is_uc = !link_is_bc_sndlink(l); + bool bc_has_acked = false; + + /* Determine Gap ACK blocks if any for the particular link */ + if (ga && is_uc) { + /* Get the Gap ACKs, uc part */ + gack_cnt = ga->ugack_cnt; + gacks = &ga->gacks[ga->bgack_cnt]; + } else if (ga) { + /* Copy the Gap ACKs, bc part, for later renewal if needed */ + this_ga = kmemdup(ga, tipc_gap_ack_blks_sz(ga->bgack_cnt), + GFP_ATOMIC); + if (likely(this_ga)) { + this_ga->start_index = 0; + /* Start with the bc Gap ACKs */ + gack_cnt = this_ga->bgack_cnt; + gacks = &this_ga->gacks[0]; + } else { + /* Hmm, we can get in trouble..., simply ignore it */ + pr_warn_ratelimited("Ignoring bc Gap ACKs, no memory\n"); + } + } + /* Advance the link transmq */ skb_queue_walk_safe(&l->transmq, skb, tmp) { seqno = buf_seqno(skb); next_gap_ack: - if (less_eq(seqno, acked)) { + if (less_eq(seqno, nacked)) { + if (is_uc) + goto release; + /* Skip packets peer has already acked */ + if (!more(seqno, r->acked)) + continue; + /* Get the next of last Gap ACK blocks */ + while (more(seqno, end)) { + if (!last_ga || si >= last_ga->bgack_cnt) + break; + start = end + offset + 1; + end = ntohs(last_ga->gacks[si].ack); + offset = ntohs(last_ga->gacks[si].gap); + si++; + WARN_ONCE(more(start, end) || + (!offset && + si < last_ga->bgack_cnt) || + si > MAX_GAP_ACK_BLKS, + "Corrupted Gap ACK: %d %d %d %d %d\n", + start, end, offset, si, + last_ga->bgack_cnt); + } + /* Check against the last Gap ACK block */ + if (in_range(seqno, start, end)) + continue; + /* Update/release the packet peer is acking */ + bc_has_acked = true; + if (--TIPC_SKB_CB(skb)->ackers) + continue; +release: /* release skb */ __skb_unlink(skb, &l->transmq); kfree_skb(skb); - released++; - } else if (less_eq(seqno, acked + gap)) { - /* First, check if repeated retrans failures occurs? 
*/ - if (!passed && link_retransmit_failure(l, l, &rc)) - return rc; - passed = true; - + } else if (less_eq(seqno, nacked + ngap)) { + /* First gap: check if repeated retrans failures? */ + if (unlikely(seqno == acked + 1 && + link_retransmit_failure(l, r, rc))) { + /* Ignore this bc Gap ACKs if any */ + kfree(this_ga); + this_ga = NULL; + break; + } /* retransmit skb if unrestricted*/ if (time_before(jiffies, TIPC_SKB_CB(skb)->nxt_retr)) continue; - TIPC_SKB_CB(skb)->nxt_retr = TIPC_UC_RETR_TIME; + TIPC_SKB_CB(skb)->nxt_retr = (is_uc) ? + TIPC_UC_RETR_TIME : TIPC_BC_RETR_LIM; _skb = pskb_copy(skb, GFP_ATOMIC); if (!_skb) continue; @@ -1516,25 +1598,51 @@ static int tipc_link_advance_transmq(struct tipc_link *l, u16 acked, u16 gap, _skb->priority = TC_PRIO_CONTROL; __skb_queue_tail(xmitq, _skb); l->stats.retransmitted++; - retransmitted = true; + *retransmitted = true; /* Increase actual retrans counter & mark first time */ if (!TIPC_SKB_CB(skb)->retr_cnt++) TIPC_SKB_CB(skb)->retr_stamp = jiffies; } else { /* retry with Gap ACK blocks if any */ - if (!ga || n >= ga->gack_cnt) + if (n >= gack_cnt) break; - acked = ntohs(ga->gacks[n].ack); - gap = ntohs(ga->gacks[n].gap); + nacked = ntohs(gacks[n].ack); + ngap = ntohs(gacks[n].gap); n++; goto next_gap_ack; } } - if (released || retransmitted) - tipc_link_update_cwin(l, released, retransmitted); - if (released) - tipc_link_advance_backlog(l, xmitq); - return 0; + + /* Renew last Gap ACK blocks for bc if needed */ + if (bc_has_acked) { + if (this_ga) { + kfree(last_ga); + r->last_ga = this_ga; + r->last_gap = gap; + } else if (last_ga) { + if (less(acked, start)) { + si--; + offset = start - acked - 1; + } else if (less(acked, end)) { + acked = end; + } + if (si < last_ga->bgack_cnt) { + last_ga->start_index = si; + r->last_gap = offset; + } else { + kfree(last_ga); + r->last_ga = NULL; + r->last_gap = 0; + } + } else { + r->last_gap = 0; + } + r->acked = acked; + } else { + kfree(this_ga); + } + + return qlen - 
skb_queue_len(&l->transmq); } /* tipc_link_build_state_msg: prepare link state message for transmission @@ -1651,7 +1759,8 @@ int tipc_link_rcv(struct tipc_link *l, struct sk_buff *skb, kfree_skb(skb); break; } - released += tipc_link_release_pkts(l, msg_ack(hdr)); + released += tipc_link_advance_transmq(l, l, msg_ack(hdr), 0, + NULL, NULL, NULL, NULL); /* Defer delivery if sequence gap */ if (unlikely(seqno != rcv_nxt)) { @@ -1739,7 +1848,7 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe, msg_set_probe(hdr, probe); msg_set_is_keepalive(hdr, probe || probe_reply); if (l->peer_caps & TIPC_GAP_ACK_BLOCK) - glen = tipc_build_gap_ack_blks(l, data, rcvgap); + glen = tipc_build_gap_ack_blks(l, hdr); tipc_mon_prep(l->net, data + glen, &dlen, mstate, l->bearer_id); msg_set_size(hdr, INT_H_SIZE + glen + dlen); skb_trim(skb, INT_H_SIZE + glen + dlen); @@ -2027,20 +2136,19 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb, { struct tipc_msg *hdr = buf_msg(skb); struct tipc_gap_ack_blks *ga = NULL; - u16 rcvgap = 0; - u16 ack = msg_ack(hdr); - u16 gap = msg_seq_gap(hdr); + bool reply = msg_probe(hdr), retransmitted = false; + u16 dlen = msg_data_sz(hdr), glen = 0; u16 peers_snd_nxt = msg_next_sent(hdr); u16 peers_tol = msg_link_tolerance(hdr); u16 peers_prio = msg_linkprio(hdr); + u16 gap = msg_seq_gap(hdr); + u16 ack = msg_ack(hdr); u16 rcv_nxt = l->rcv_nxt; - u16 dlen = msg_data_sz(hdr); + u16 rcvgap = 0; int mtyp = msg_type(hdr); - bool reply = msg_probe(hdr); - u16 glen = 0; - void *data; + int rc = 0, released; char *if_name; - int rc = 0; + void *data; trace_tipc_proto_rcv(skb, false, l->name); if (tipc_link_is_blocked(l) || !xmitq) @@ -2137,13 +2245,7 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb, } /* Receive Gap ACK blocks from peer if any */ - if (l->peer_caps & TIPC_GAP_ACK_BLOCK) { - ga = (struct tipc_gap_ack_blks *)data; - glen = ntohs(ga->len); - /* sanity check: if failed, 
ignore Gap ACK blocks */ - if (glen != tipc_gap_ack_blks_sz(ga->gack_cnt)) - ga = NULL; - } + glen = tipc_get_gap_ack_blks(&ga, l, hdr, true); tipc_mon_rcv(l->net, data + glen, dlen - glen, l->addr, &l->mon_state, l->bearer_id); @@ -2158,9 +2260,14 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb, tipc_link_build_proto_msg(l, STATE_MSG, 0, reply, rcvgap, 0, 0, xmitq); - rc |= tipc_link_advance_transmq(l, ack, gap, ga, xmitq); + released = tipc_link_advance_transmq(l, l, ack, gap, ga, xmitq, + &retransmitted, &rc); if (gap) l->stats.recv_nacks++; + if (released || retransmitted) + tipc_link_update_cwin(l, released, retransmitted); + if (released) + tipc_link_advance_backlog(l, xmitq); if (unlikely(!skb_queue_empty(&l->wakeupq))) link_prepare_wakeup(l); } @@ -2246,10 +2353,7 @@ void tipc_link_bc_init_rcv(struct tipc_link *l, struct tipc_msg *hdr) int tipc_link_bc_sync_rcv(struct tipc_link *l, struct tipc_msg *hdr, struct sk_buff_head *xmitq) { - struct tipc_link *snd_l = l->bc_sndlink; u16 peers_snd_nxt = msg_bc_snd_nxt(hdr); - u16 from = msg_bcast_ack(hdr) + 1; - u16 to = from + msg_bc_gap(hdr) - 1; int rc = 0; if (!link_is_up(l)) @@ -2271,8 +2375,6 @@ int tipc_link_bc_sync_rcv(struct tipc_link *l, struct tipc_msg *hdr, if (more(peers_snd_nxt, l->rcv_nxt + l->window)) return rc; - rc = tipc_link_bc_retrans(snd_l, l, from, to, xmitq); - l->snd_nxt = peers_snd_nxt; if (link_bc_rcv_gap(l)) rc |= TIPC_LINK_SND_STATE; @@ -2307,38 +2409,27 @@ int tipc_link_bc_sync_rcv(struct tipc_link *l, struct tipc_msg *hdr, return 0; } -void tipc_link_bc_ack_rcv(struct tipc_link *l, u16 acked, - struct sk_buff_head *xmitq) +int tipc_link_bc_ack_rcv(struct tipc_link *r, u16 acked, u16 gap, + struct tipc_gap_ack_blks *ga, + struct sk_buff_head *xmitq) { - struct sk_buff *skb, *tmp; - struct tipc_link *snd_l = l->bc_sndlink; + struct tipc_link *l = r->bc_sndlink; + bool unused = false; + int rc = 0; - if (!link_is_up(l) || !l->bc_peer_is_up) - return; + if 
(!link_is_up(r) || !r->bc_peer_is_up) + return 0; - if (!more(acked, l->acked)) - return; + if (less(acked, r->acked) || (acked == r->acked && !gap && !ga)) + return 0; - trace_tipc_link_bc_ack(l, l->acked, acked, &snd_l->transmq); - /* Skip over packets peer has already acked */ - skb_queue_walk(&snd_l->transmq, skb) { - if (more(buf_seqno(skb), l->acked)) - break; - } + tipc_link_advance_transmq(l, r, acked, gap, ga, xmitq, &unused, &rc); - /* Update/release the packets peer is acking now */ - skb_queue_walk_from_safe(&snd_l->transmq, skb, tmp) { - if (more(buf_seqno(skb), acked)) - break; - if (!--TIPC_SKB_CB(skb)->ackers) { - __skb_unlink(skb, &snd_l->transmq); - kfree_skb(skb); - } - } - l->acked = acked; - tipc_link_advance_backlog(snd_l, xmitq); - if (unlikely(!skb_queue_empty(&snd_l->wakeupq))) - link_prepare_wakeup(snd_l); + tipc_link_advance_backlog(l, xmitq); + if (unlikely(!skb_queue_empty(&l->wakeupq))) + link_prepare_wakeup(l); + + return rc; } /* tipc_link_bc_nack_rcv(): receive broadcast nack message @@ -2366,8 +2457,7 @@ int tipc_link_bc_nack_rcv(struct tipc_link *l, struct sk_buff *skb, return 0; if (dnode == tipc_own_addr(l->net)) { - tipc_link_bc_ack_rcv(l, acked, xmitq); - rc = tipc_link_bc_retrans(l->bc_sndlink, l, from, to, xmitq); + rc = tipc_link_bc_ack_rcv(l, acked, to - acked, NULL, xmitq); l->stats.recv_nacks++; return rc; } diff --git a/net/tipc/link.h b/net/tipc/link.h index d3c1c3fc1659..0a0fa7350722 100644 --- a/net/tipc/link.h +++ b/net/tipc/link.h @@ -143,8 +143,11 @@ int tipc_link_bc_peers(struct tipc_link *l); void tipc_link_set_mtu(struct tipc_link *l, int mtu); int tipc_link_mtu(struct tipc_link *l); int tipc_link_mss(struct tipc_link *l); -void tipc_link_bc_ack_rcv(struct tipc_link *l, u16 acked, - struct sk_buff_head *xmitq); +u16 tipc_get_gap_ack_blks(struct tipc_gap_ack_blks **ga, struct tipc_link *l, + struct tipc_msg *hdr, bool uc); +int tipc_link_bc_ack_rcv(struct tipc_link *l, u16 acked, u16 gap, + struct 
tipc_gap_ack_blks *ga, + struct sk_buff_head *xmitq); void tipc_link_build_bc_sync_msg(struct tipc_link *l, struct sk_buff_head *xmitq); void tipc_link_bc_init_rcv(struct tipc_link *l, struct tipc_msg *hdr); diff --git a/net/tipc/msg.h b/net/tipc/msg.h index 6d466ebdb64f..9a38f9c9d6eb 100644 --- a/net/tipc/msg.h +++ b/net/tipc/msg.h @@ -160,20 +160,26 @@ struct tipc_gap_ack { /* struct tipc_gap_ack_blks * @len: actual length of the record - * @gack_cnt: number of Gap ACK blocks in the record + * @bgack_cnt: number of Gap ACK blocks for broadcast in the record [Ying] "bgack_cnt" is defined after " ugack_cnt " in struct tipc_gap_ack_blks. Please reverse their order. + * @ugack_cnt: number of Gap ACK blocks for unicast (following the broadcast + * ones) + * @start_index: starting index for "valid" broadcast Gap ACK blocks * @gacks: array of Gap ACK blocks */ struct tipc_gap_ack_blks { __be16 len; - u8 gack_cnt; - u8 reserved; + union { + u8 ugack_cnt; + u8 start_index; + }; + u8 bgack_cnt; struct tipc_gap_ack gacks[]; }; #define tipc_gap_ack_blks_sz(n) (sizeof(struct tipc_gap_ack_blks) + \ sizeof(struct tipc_gap_ack) * (n)) -#define MAX_GAP_ACK_BLKS 32 +#define MAX_GAP_ACK_BLKS 128 #define MAX_GAP_ACK_BLKS_SZ tipc_gap_ack_blks_sz(MAX_GAP_ACK_BLKS) static inline struct tipc_msg *buf_msg(struct sk_buff *skb) diff --git a/net/tipc/node.c b/net/tipc/node.c index 0c88778c88b5..eb6b62de81a7 100644 --- a/net/tipc/node.c +++ b/net/tipc/node.c @@ -2069,10 +2069,16 @@ void tipc_rcv(struct net *net, struct sk_buff *skb, struct tipc_bearer *b) le = &n->links[bearer_id]; /* Ensure broadcast reception is in synch with peer's send state */ - if (unlikely(usr == LINK_PROTOCOL)) + if (unlikely(usr == LINK_PROTOCOL)) { + if (unlikely(skb_linearize(skb))) { + tipc_node_put(n); + goto discard; + } + hdr = buf_msg(skb); tipc_node_bc_sync_rcv(n, hdr, bearer_id, &xmitq); - else if (unlikely(tipc_link_acked(n->bc_entry.link) != bc_ack)) + } else if (unlikely(tipc_link_acked(n->bc_entry.link) 
!= bc_ack)) { tipc_bcast_ack_rcv(net, n->bc_entry.link, hdr); + } /* Receive packet directly if conditions permit */ tipc_node_read_lock(n); -- 2.13.7 |
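For readers following the msg.h and link.c hunks above, here is a minimal, stand-alone sketch of two things the patch relies on: the size of a Gap ACK record, and how an acked/gap pair plus its Gap ACK blocks decide the fate of a queued packet in the unicast case. The names `gap_ack`, `gap_ack_blks`, `seq_less_eq`, `classify` and `pkt_fate` are made up for this illustration. The sketch uses host byte order (the real code converts with `ntohs()`), assumes a 16-bit wraparound comparison in the spirit of the `less_eq()` helper, and does not model the per-peer `ackers` bookkeeping of the broadcast path.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_GAP_ACK_BLKS 128

/* One Gap ACK block: packets up to 'ack' were received, and the next
 * 'gap' packets after it are missing (host byte order in this sketch).
 */
struct gap_ack {
	uint16_t ack;
	uint16_t gap;
};

/* In-memory layout modelled on the msg.h hunk (hypothetical type name) */
struct gap_ack_blks {
	uint16_t len;
	union {
		uint8_t ugack_cnt;	/* unicast block count */
		uint8_t start_index;	/* first "valid" bc block */
	};
	uint8_t bgack_cnt;		/* broadcast block count */
	struct gap_ack gacks[];
};

/* Same arithmetic as the tipc_gap_ack_blks_sz() macro */
static size_t gap_ack_blks_sz(size_t n)
{
	return sizeof(struct gap_ack_blks) + sizeof(struct gap_ack) * n;
}

/* 16-bit sequence comparison that tolerates wraparound (assumption) */
static int seq_less_eq(uint16_t left, uint16_t right)
{
	return (uint16_t)(right - left) < 32768u;
}

enum pkt_fate { PKT_RELEASE, PKT_RETRANSMIT, PKT_KEEP };

/* Decide what happens to 'seqno' given the base acked/gap pair and the
 * following Gap ACK blocks: acked -> release, inside a gap -> retransmit,
 * beyond the last block -> nothing known yet, keep it queued.
 */
static enum pkt_fate classify(uint16_t seqno, uint16_t acked, uint16_t gap,
			      const struct gap_ack *gacks, unsigned int cnt)
{
	unsigned int n = 0;

	for (;;) {
		if (seq_less_eq(seqno, acked))
			return PKT_RELEASE;
		if (seq_less_eq(seqno, (uint16_t)(acked + gap)))
			return PKT_RETRANSMIT;
		if (n >= cnt)
			return PKT_KEEP;
		/* Move on to the next Gap ACK block and retry */
		acked = gacks[n].ack;
		gap = gacks[n].gap;
		n++;
	}
}
```

With a 4-byte header and 4 bytes per block, `gap_ack_blks_sz(MAX_GAP_ACK_BLKS)` gives 4 + 128 * 4 = 516 bytes. With acked = 10, gap = 2 and one extra block {ack = 15, gap = 3}, packets 11-12 and 16-18 fall into gaps, 13-15 are acked, and 19 onwards stay queued.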
From: Jon M. <jm...@re...> - 2020-03-31 14:31:29
On 3/31/20 4:54 AM, Tuong Lien wrote: > This commit allows printing the statistics of a broadcast-receiver link > using the same tipc command, but with additional 'link' options: > > $ tipc link stat show --help > Usage: tipc link stat show [ link { LINK | SUBSTRING | all } ] > > With: > + 'LINK' : print the stats of the specific link 'LINK'; > + 'SUBSTRING' : print the stats of all the links having the 'SUBSTRING' > in name; > + 'all' : print all the links' stats incl. the broadcast-receiver > ones; > > Also, a link stats can be reset in the usual way by specifying the link > name in command. > > For example: > > $ tipc l st sh l br > Link <broadcast-link> > Window:50 packets > RX packets:0 fragments:0/0 bundles:0/0 > TX packets:5011125 fragments:4968774/149643 bundles:38402/307061 > RX naks:781484 defs:0 dups:0 > TX naks:0 acks:0 retrans:330259 > Congestion link:50657 Send queue max:0 avg:0 > > Link <broadcast-link:1001001> > Window:50 packets > RX packets:95146 fragments:95040/1980 bundles:1/2 > TX packets:0 fragments:0/0 bundles:0/0 > RX naks:380938 defs:83962 dups:403 > TX naks:8362 acks:0 retrans:170662 > Congestion link:0 Send queue max:0 avg:0 > > Link <broadcast-link:1001002> > Window:50 packets > RX packets:0 fragments:0/0 bundles:0/0 > TX packets:0 fragments:0/0 bundles:0/0 > RX naks:400546 defs:0 dups:0 > TX naks:0 acks:0 retrans:159597 > Congestion link:0 Send queue max:0 avg:0 > > $ tipc l st sh l 1001002 > Link <1001003:data0-1001002:data0> > ACTIVE MTU:1500 Priority:10 Tolerance:1500 ms Window:50 packets > RX packets:99546 fragments:0/0 bundles:33/877 > TX packets:629 fragments:0/0 bundles:35/828 > TX profile sample:8 packets average:390 octets > 0-64:75% -256:0% -1024:0% -4096:25% -16384:0% -32768:0% -66000:0% > RX states:488714 probes:7397 naks:0 defs:4 dups:5 > TX states:27734 probes:18016 naks:5 acks:2305 retrans:0 > Congestion link:0 Send queue max:0 avg:0 > > Link <broadcast-link:1001002> > Window:50 packets > RX packets:0 fragments:0/0 
bundles:0/0 > TX packets:0 fragments:0/0 bundles:0/0 > RX naks:400546 defs:0 dups:0 > TX naks:0 acks:0 retrans:159597 > Congestion link:0 Send queue max:0 avg:0 > > $ tipc l st re l broadcast-link:1001002 > > $ tipc l st sh l broadcast-link:1001002 > Link <broadcast-link:1001002> > Window:50 packets > RX packets:0 fragments:0/0 bundles:0/0 > TX packets:0 fragments:0/0 bundles:0/0 > RX naks:0 defs:0 dups:0 > TX naks:0 acks:0 retrans:0 > Congestion link:0 Send queue max:0 avg:0 > > Signed-off-by: Tuong Lien <tuo...@de...> > --- > tipc/link.c | 25 +++++++++++++++++-------- > 1 file changed, 17 insertions(+), 8 deletions(-) > > diff --git a/tipc/link.c b/tipc/link.c > index e123c186..ba77a201 100644 > --- a/tipc/link.c > +++ b/tipc/link.c > @@ -334,7 +334,7 @@ static int _show_link_stat(const char *name, struct nlattr *attrs[], > > open_json_object(NULL); > > - print_string(PRINT_ANY, "link", "\nLink <%s>\n", name); > + print_string(PRINT_ANY, "link", "Link <%s>\n", name); > print_string(PRINT_JSON, "state", "", NULL); > open_json_array(PRINT_JSON, NULL); > if (attrs[TIPC_NLA_LINK_ACTIVE]) > @@ -433,7 +433,7 @@ static int _show_link_stat(const char *name, struct nlattr *attrs[], > mnl_attr_get_u32(stats[TIPC_NLA_STATS_LINK_CONGS])); > print_uint(PRINT_ANY, "send queue max", " Send queue max:%u", > mnl_attr_get_u32(stats[TIPC_NLA_STATS_MAX_QUEUE])); > - print_uint(PRINT_ANY, "avg", " avg:%u\n", > + print_uint(PRINT_ANY, "avg", " avg:%u\n\n", > mnl_attr_get_u32(stats[TIPC_NLA_STATS_AVG_QUEUE])); > > close_json_object(); > @@ -496,7 +496,7 @@ static int _show_bc_link_stat(const char *name, struct nlattr *prop[], > mnl_attr_get_u32(stats[TIPC_NLA_STATS_LINK_CONGS])); > print_uint(PRINT_ANY, "send queue max", " Send queue max:%u", > mnl_attr_get_u32(stats[TIPC_NLA_STATS_MAX_QUEUE])); > - print_uint(PRINT_ANY, "avg", " avg:%u\n", > + print_uint(PRINT_ANY, "avg", " avg:%u\n\n", > mnl_attr_get_u32(stats[TIPC_NLA_STATS_AVG_QUEUE])); > close_json_object(); > > @@ -527,8 +527,10 
@@ static int link_stat_show_cb(const struct nlmsghdr *nlh, void *data) > > name = mnl_attr_get_str(attrs[TIPC_NLA_LINK_NAME]); > > - /* If a link is passed, skip all but that link */ > - if (link && (strcmp(name, link) != 0)) > + /* If a link is passed, skip all but that link. > + * Support a substring matching as well. > + */ > + if (link && !strstr(name, link)) > return MNL_CB_OK; > > if (attrs[TIPC_NLA_LINK_BROADCAST]) { > @@ -540,7 +542,7 @@ static int link_stat_show_cb(const struct nlmsghdr *nlh, void *data) > > static void cmd_link_stat_show_help(struct cmdl *cmdl) > { > - fprintf(stderr, "Usage: %s link stat show [ link LINK ]\n", > + fprintf(stderr, "Usage: %s link stat show [ link { LINK | SUBSTRING | all } ]\n", > cmdl->argv[0]); > } > > @@ -554,6 +556,7 @@ static int cmd_link_stat_show(struct nlmsghdr *nlh, const struct cmd *cmd, > { "link", OPT_KEYVAL, NULL }, > { NULL } > }; > + struct nlattr *attrs; > int err = 0; > > if (help_flag) { > @@ -571,8 +574,14 @@ static int cmd_link_stat_show(struct nlmsghdr *nlh, const struct cmd *cmd, > return -EINVAL; > > opt = get_opt(opts, "link"); > - if (opt) > - link = opt->val; > + if (opt) { > + if (strcmp(opt->val, "all")) > + link = opt->val; > + /* Set the flag to dump all bc links */ > + attrs = mnl_attr_nest_start(nlh, TIPC_NLA_LINK); > + mnl_attr_put(nlh, TIPC_NLA_LINK_BROADCAST, 0, NULL); > + mnl_attr_nest_end(nlh, attrs); > + } > > new_json_obj(json); > err = msg_dumpit(nlh, link_stat_show_cb, link); Acked-by: Jon Maloy <jm...@re...> |
From: Xue, Y. <Yin...@wi...> - 2020-03-31 11:00:46
Acked-by: Ying Xue <yin...@wi...>

-----Original Message-----
From: Hoang Le [mailto:hoa...@de...]
Sent: Wednesday, March 25, 2020 3:43 PM
To: tip...@de...; ma...@do...; tip...@li...
Subject: [tipc-discussion] [net-next] tipc: Add a missing case of TIPC_DIRECT_MSG type

In the commit f73b12812a3d ("tipc: improve throughput between nodes in
netns"), we missed a check for the TIPC_DIRECT_MSG type, so this message
type still uses the old sending mechanism. As a result, the throughput
improvement is not as significant as expected.

Besides that, when sending a large message of that type, we also handle
the wrong receiving queue: the message should be enqueued in the socket
receive queue rather than handled like multicast messages.

Fix this by adding the missing case for TIPC_DIRECT_MSG.

Fixes: f73b12812a3d ("tipc: improve throughput between nodes in netns")
Reported-by: Tuong Lien <tuo...@de...>
Signed-off-by: Hoang Le <hoa...@de...>
---
 net/tipc/msg.h    | 5 +++++
 net/tipc/node.c   | 3 ++-
 net/tipc/socket.c | 2 +-
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/tipc/msg.h b/net/tipc/msg.h
index 6d466ebdb64f..871feadbbc19 100644
--- a/net/tipc/msg.h
+++ b/net/tipc/msg.h
@@ -394,6 +394,11 @@ static inline u32 msg_connected(struct tipc_msg *m)
 	return msg_type(m) == TIPC_CONN_MSG;
 }
 
+static inline u32 msg_direct(struct tipc_msg *m)
+{
+	return msg_type(m) == TIPC_DIRECT_MSG;
+}
+
 static inline u32 msg_errcode(struct tipc_msg *m)
 {
 	return msg_bits(m, 1, 25, 0xf);
diff --git a/net/tipc/node.c b/net/tipc/node.c
index 0c88778c88b5..10292c942384 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -1586,7 +1586,8 @@ static void tipc_lxc_xmit(struct net *peer_net, struct sk_buff_head *list)
 	case TIPC_MEDIUM_IMPORTANCE:
 	case TIPC_HIGH_IMPORTANCE:
 	case TIPC_CRITICAL_IMPORTANCE:
-		if (msg_connected(hdr) || msg_named(hdr)) {
+		if (msg_connected(hdr) || msg_named(hdr) ||
+		    msg_direct(hdr)) {
 			tipc_loopback_trace(peer_net, list);
 			spin_lock_init(&list->lock);
 			tipc_sk_rcv(peer_net, list);
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 693e8902161e..87466607097f 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -1461,7 +1461,7 @@ static int __tipc_sendmsg(struct socket *sock, struct msghdr *m, size_t dlen)
 	}
 
 	__skb_queue_head_init(&pkts);
-	mtu = tipc_node_get_mtu(net, dnode, tsk->portid, false);
+	mtu = tipc_node_get_mtu(net, dnode, tsk->portid, true);
 	rc = tipc_msg_build(hdr, m, 0, dlen, mtu, &pkts);
 	if (unlikely(rc != dlen))
 		return rc;
-- 
2.20.1

_______________________________________________
tipc-discussion mailing list
tip...@li...
https://lists.sourceforge.net/lists/listinfo/tipc-discussion
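The node.c hunk above boils down to widening one predicate. A small sketch of that decision follows. The enum values and the function name `deliver_to_socket` are purely illustrative assumptions, not the kernel's definitions.

```c
#include <stdbool.h>

/* Illustrative message types; names and values are assumptions made
 * for this sketch only.
 */
enum msg_type { CONN_MSG, MCAST_MSG, NAMED_MSG, DIRECT_MSG };

/* After the fix, direct messages take the same socket-receive path as
 * connected and named messages; everything else keeps the old path.
 */
static bool deliver_to_socket(enum msg_type t)
{
	return t == CONN_MSG || t == NAMED_MSG || t == DIRECT_MSG;
}
```

Before the patch, the equivalent check would have returned false for `DIRECT_MSG`, which is the missing case the commit message describes.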
From: Tuong L. <tuo...@de...> - 2020-03-31 08:55:10
This commit allows printing the statistics of a broadcast-receiver link
using the same tipc command, but with additional 'link' options:

$ tipc link stat show --help
Usage: tipc link stat show [ link { LINK | SUBSTRING | all } ]

With:
+ 'LINK'      : print the stats of the specific link 'LINK';
+ 'SUBSTRING' : print the stats of all the links having the 'SUBSTRING'
                in name;
+ 'all'       : print all the links' stats incl. the broadcast-receiver
                ones;

Also, a link's stats can be reset in the usual way by specifying the link
name in the command.

For example:

$ tipc l st sh l br
Link <broadcast-link>
  Window:50 packets
  RX packets:0 fragments:0/0 bundles:0/0
  TX packets:5011125 fragments:4968774/149643 bundles:38402/307061
  RX naks:781484 defs:0 dups:0
  TX naks:0 acks:0 retrans:330259
  Congestion link:50657 Send queue max:0 avg:0

Link <broadcast-link:1001001>
  Window:50 packets
  RX packets:95146 fragments:95040/1980 bundles:1/2
  TX packets:0 fragments:0/0 bundles:0/0
  RX naks:380938 defs:83962 dups:403
  TX naks:8362 acks:0 retrans:170662
  Congestion link:0 Send queue max:0 avg:0

Link <broadcast-link:1001002>
  Window:50 packets
  RX packets:0 fragments:0/0 bundles:0/0
  TX packets:0 fragments:0/0 bundles:0/0
  RX naks:400546 defs:0 dups:0
  TX naks:0 acks:0 retrans:159597
  Congestion link:0 Send queue max:0 avg:0

$ tipc l st sh l 1001002
Link <1001003:data0-1001002:data0>
  ACTIVE MTU:1500 Priority:10 Tolerance:1500 ms Window:50 packets
  RX packets:99546 fragments:0/0 bundles:33/877
  TX packets:629 fragments:0/0 bundles:35/828
  TX profile sample:8 packets average:390 octets
  0-64:75% -256:0% -1024:0% -4096:25% -16384:0% -32768:0% -66000:0%
  RX states:488714 probes:7397 naks:0 defs:4 dups:5
  TX states:27734 probes:18016 naks:5 acks:2305 retrans:0
  Congestion link:0 Send queue max:0 avg:0

Link <broadcast-link:1001002>
  Window:50 packets
  RX packets:0 fragments:0/0 bundles:0/0
  TX packets:0 fragments:0/0 bundles:0/0
  RX naks:400546 defs:0 dups:0
  TX naks:0 acks:0 retrans:159597
  Congestion link:0 Send queue max:0 avg:0

$ tipc l st re l broadcast-link:1001002

$ tipc l st sh l broadcast-link:1001002
Link <broadcast-link:1001002>
  Window:50 packets
  RX packets:0 fragments:0/0 bundles:0/0
  TX packets:0 fragments:0/0 bundles:0/0
  RX naks:0 defs:0 dups:0
  TX naks:0 acks:0 retrans:0
  Congestion link:0 Send queue max:0 avg:0

Signed-off-by: Tuong Lien <tuo...@de...>
---
 tipc/link.c | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/tipc/link.c b/tipc/link.c
index e123c186..ba77a201 100644
--- a/tipc/link.c
+++ b/tipc/link.c
@@ -334,7 +334,7 @@ static int _show_link_stat(const char *name, struct nlattr *attrs[],
 
 	open_json_object(NULL);
 
-	print_string(PRINT_ANY, "link", "\nLink <%s>\n", name);
+	print_string(PRINT_ANY, "link", "Link <%s>\n", name);
 	print_string(PRINT_JSON, "state", "", NULL);
 	open_json_array(PRINT_JSON, NULL);
 	if (attrs[TIPC_NLA_LINK_ACTIVE])
@@ -433,7 +433,7 @@ static int _show_link_stat(const char *name, struct nlattr *attrs[],
 		   mnl_attr_get_u32(stats[TIPC_NLA_STATS_LINK_CONGS]));
 	print_uint(PRINT_ANY, "send queue max", " Send queue max:%u",
 		   mnl_attr_get_u32(stats[TIPC_NLA_STATS_MAX_QUEUE]));
-	print_uint(PRINT_ANY, "avg", " avg:%u\n",
+	print_uint(PRINT_ANY, "avg", " avg:%u\n\n",
 		   mnl_attr_get_u32(stats[TIPC_NLA_STATS_AVG_QUEUE]));
 
 	close_json_object();
@@ -496,7 +496,7 @@ static int _show_bc_link_stat(const char *name, struct nlattr *prop[],
 		   mnl_attr_get_u32(stats[TIPC_NLA_STATS_LINK_CONGS]));
 	print_uint(PRINT_ANY, "send queue max", " Send queue max:%u",
 		   mnl_attr_get_u32(stats[TIPC_NLA_STATS_MAX_QUEUE]));
-	print_uint(PRINT_ANY, "avg", " avg:%u\n",
+	print_uint(PRINT_ANY, "avg", " avg:%u\n\n",
 		   mnl_attr_get_u32(stats[TIPC_NLA_STATS_AVG_QUEUE]));
 	close_json_object();
 
@@ -527,8 +527,10 @@ static int link_stat_show_cb(const struct nlmsghdr *nlh, void *data)
 
 	name = mnl_attr_get_str(attrs[TIPC_NLA_LINK_NAME]);
 
-	/* If a link is passed, skip all but that link */
-	if (link && (strcmp(name, link) != 0))
+	/* If a link is passed, skip all but that link.
+	 * Support a substring matching as well.
+	 */
+	if (link && !strstr(name, link))
 		return MNL_CB_OK;
 
 	if (attrs[TIPC_NLA_LINK_BROADCAST]) {
@@ -540,7 +542,7 @@ static int link_stat_show_cb(const struct nlmsghdr *nlh, void *data)
 
 static void cmd_link_stat_show_help(struct cmdl *cmdl)
 {
-	fprintf(stderr, "Usage: %s link stat show [ link LINK ]\n",
+	fprintf(stderr, "Usage: %s link stat show [ link { LINK | SUBSTRING | all } ]\n",
 		cmdl->argv[0]);
 }
 
@@ -554,6 +556,7 @@ static int cmd_link_stat_show(struct nlmsghdr *nlh, const struct cmd *cmd,
 		{ "link", OPT_KEYVAL, NULL },
 		{ NULL }
 	};
+	struct nlattr *attrs;
 	int err = 0;
 
 	if (help_flag) {
@@ -571,8 +574,14 @@ static int cmd_link_stat_show(struct nlmsghdr *nlh, const struct cmd *cmd,
 		return -EINVAL;
 
 	opt = get_opt(opts, "link");
-	if (opt)
-		link = opt->val;
+	if (opt) {
+		if (strcmp(opt->val, "all"))
+			link = opt->val;
+		/* Set the flag to dump all bc links */
+		attrs = mnl_attr_nest_start(nlh, TIPC_NLA_LINK);
+		mnl_attr_put(nlh, TIPC_NLA_LINK_BROADCAST, 0, NULL);
+		mnl_attr_nest_end(nlh, attrs);
+	}
 
 	new_json_obj(json);
 	err = msg_dumpit(nlh, link_stat_show_cb, link);
-- 
2.13.7
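The filtering behaviour introduced by the patch above (exact name, substring, or 'all') can be sketched in isolation as follows. The helper names `link_filter` and `link_matches` are hypothetical, and the netlink side (setting TIPC_NLA_LINK_BROADCAST so that the kernel also dumps the broadcast-receiver links) is deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Turn the user's 'link' option into a filter string. No option, or
 * the keyword "all", means "do not filter" (NULL).
 */
static const char *link_filter(const char *opt_val)
{
	if (!opt_val || strcmp(opt_val, "all") == 0)
		return NULL;
	return opt_val;
}

/* A link is shown when there is no filter, or when the filter occurs
 * anywhere in the link name (so an exact name matches as well).
 */
static bool link_matches(const char *name, const char *filter)
{
	return !filter || strstr(name, filter) != NULL;
}
```

This is why `tipc l st sh l 1001002` in the example prints both the unicast link `1001003:data0-1001002:data0` and `broadcast-link:1001002`: the substring occurs in both names.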
From: Jon M. <jm...@re...> - 2020-03-30 15:47:41
On 3/28/20 12:03 AM, Tuong Lien wrote:
> Hi Jon, all,
>
> Please find the full series here,
> + For the 1st patch: it's really the last one I sent before, so you have
>   ack-ed already.
> + For the other ones, please help take a look. Also, I will send another
>   patch for iproute2/tipc which is user-space part of the last one in this
>   series i.e. broadcast rcv stats dumping.
>
> Thanks a lot!
>
> Tuong Lien (4):
>   tipc: introduce Gap ACK blocks for broadcast link
>   tipc: add back link trace events
>   tipc: enable broadcast retrans via unicast
>   tipc: add support for broadcast rcv stats dumping
>
>  net/tipc/bcast.c   |  22 ++-
>  net/tipc/bcast.h   |   9 +-
>  net/tipc/link.c    | 500 +++++++++++++++++++++++++++++++----------------------
>  net/tipc/link.h    |  11 +-
>  net/tipc/msg.c     |   9 +-
>  net/tipc/msg.h     |  16 +-
>  net/tipc/netlink.c |   2 +-
>  net/tipc/node.c    |  75 ++++++--
>  net/tipc/sysctl.c  |   9 +-
>  net/tipc/trace.h   |  17 +-
>  10 files changed, 424 insertions(+), 246 deletions(-)
>

Whole series:
Acked-by: Jon Maloy <jm...@re...>
From: Tuong L. <tuo...@de...> - 2020-03-28 04:03:45
In some environments, broadcast traffic is suppressed at high rates (i.e.
a kind of bandwidth limit setting). When such a limit is applied, TIPC
broadcast can still run successfully. However, under high load some
packets will be dropped first, and TIPC then tries to retransmit them.
Since the packet retransmission is intentionally broadcast too, this
makes things worse rather than helping.

This commit enables broadcast retransmission via unicast, which
retransmits packets only to the specific peer that has actually reported
a gap, i.e. it does not broadcast to all nodes in the cluster. This
prevents the retransmissions from being suppressed, reduces the overhead
of duplicates on the other peers, and so improves the overall TIPC
broadcast performance.

Note: the functionality can be turned on/off via the sysctl file:

echo 1 > /proc/sys/net/tipc/bc_retruni
echo 0 > /proc/sys/net/tipc/bc_retruni

Default is '0', i.e. the broadcast retransmission still works as usual.

Signed-off-by: Tuong Lien <tuo...@de...>
---
 net/tipc/bcast.c  | 11 ++++++++---
 net/tipc/bcast.h  |  4 +++-
 net/tipc/link.c   |  8 +++++---
 net/tipc/link.h   |  3 ++-
 net/tipc/node.c   |  2 +-
 net/tipc/sysctl.c |  9 ++++++++-
 6 files changed, 27 insertions(+), 10 deletions(-)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 3ce690a96ee9..50a16f8bebd9 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -46,6 +46,7 @@
 #define BCLINK_WIN_MIN 32 /* bcast minimum link window size */
 
 const char tipc_bclink_name[] = "broadcast-link";
+unsigned long sysctl_tipc_bc_retruni __read_mostly;
 
 /**
  * struct tipc_bc_base - base structure for keeping broadcast send state
@@ -474,7 +475,7 @@ void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l,
 	__skb_queue_head_init(&xmitq);
 
 	tipc_bcast_lock(net);
-	tipc_link_bc_ack_rcv(l, acked, 0, NULL, &xmitq);
+	tipc_link_bc_ack_rcv(l, acked, 0, NULL, &xmitq, NULL);
 	tipc_bcast_unlock(net);
 
 	tipc_bcbase_xmit(net, &xmitq);
@@ -489,7 +490,8 @@ void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l,
  * RCU is locked, no other locks set
  */
 int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l,
-			struct tipc_msg *hdr)
+			struct tipc_msg *hdr,
+			struct sk_buff_head *retrq)
 {
 	struct sk_buff_head *inputq = &tipc_bc_base(net)->inputq;
 	struct tipc_gap_ack_blks *ga;
@@ -503,8 +505,11 @@ int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l,
 		tipc_link_bc_init_rcv(l, hdr);
 	} else if (!msg_bc_ack_invalid(hdr)) {
 		tipc_get_gap_ack_blks(&ga, l, hdr, false);
+		if (!sysctl_tipc_bc_retruni)
+			retrq = &xmitq;
 		rc = tipc_link_bc_ack_rcv(l, msg_bcast_ack(hdr),
-					  msg_bc_gap(hdr), ga, &xmitq);
+					  msg_bc_gap(hdr), ga, &xmitq,
+					  retrq);
 		rc |= tipc_link_bc_sync_rcv(l, hdr, &xmitq);
 	}
 	tipc_bcast_unlock(net);
diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h
index 9e847d9617d3..97d3cf9d3e4d 100644
--- a/net/tipc/bcast.h
+++ b/net/tipc/bcast.h
@@ -45,6 +45,7 @@ struct tipc_nl_msg;
 struct tipc_nlist;
 struct tipc_nitem;
 extern const char tipc_bclink_name[];
+extern unsigned long sysctl_tipc_bc_retruni;
 
 #define TIPC_METHOD_EXPIRE msecs_to_jiffies(5000)
 
@@ -93,7 +94,8 @@ int tipc_bcast_rcv(struct net *net, struct tipc_link *l, struct sk_buff *skb);
 void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l,
 			struct tipc_msg *hdr);
 int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l,
-			struct tipc_msg *hdr);
+			struct tipc_msg *hdr,
+			struct sk_buff_head *retrq);
 int tipc_nl_add_bc_link(struct net *net, struct tipc_nl_msg *msg);
 int tipc_nl_bc_link_set(struct net *net, struct nlattr *attrs[]);
 int tipc_bclink_reset_stats(struct net *net);
diff --git a/net/tipc/link.c b/net/tipc/link.c
index 405ccf597e59..3071e46f029a 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -375,7 +375,7 @@ void tipc_link_remove_bc_peer(struct tipc_link *snd_l,
 	snd_l->ackers--;
 	rcv_l->bc_peer_is_up = true;
 	rcv_l->state = LINK_ESTABLISHED;
-	tipc_link_bc_ack_rcv(rcv_l, ack, 0, NULL, xmitq);
+	tipc_link_bc_ack_rcv(rcv_l, ack, 0, NULL, xmitq, NULL);
 	trace_tipc_link_reset(rcv_l, TIPC_DUMP_ALL, "bclink removed!");
 	tipc_link_reset(rcv_l);
 	rcv_l->state = LINK_RESET;
@@ -2413,7 +2413,8 @@ int tipc_link_bc_sync_rcv(struct tipc_link *l, struct tipc_msg *hdr,
 
 int tipc_link_bc_ack_rcv(struct tipc_link *r, u16 acked, u16 gap,
 			 struct tipc_gap_ack_blks *ga,
-			 struct sk_buff_head *xmitq)
+			 struct sk_buff_head *xmitq,
+			 struct sk_buff_head *retrq)
 {
 	struct tipc_link *l = r->bc_sndlink;
 	bool unused = false;
@@ -2460,7 +2461,8 @@ int tipc_link_bc_nack_rcv(struct tipc_link *l, struct sk_buff *skb,
 		return 0;
 
 	if (dnode == tipc_own_addr(l->net)) {
-		rc = tipc_link_bc_ack_rcv(l, acked, to - acked, NULL, xmitq);
+		rc = tipc_link_bc_ack_rcv(l, acked, to - acked, NULL, xmitq,
+					  xmitq);
 		l->stats.recv_nacks++;
 		return rc;
 	}
diff --git a/net/tipc/link.h b/net/tipc/link.h
index 0a0fa7350722..4d0768cf91d5 100644
--- a/net/tipc/link.h
+++ b/net/tipc/link.h
@@ -147,7 +147,8 @@ u16 tipc_get_gap_ack_blks(struct tipc_gap_ack_blks **ga, struct tipc_link *l,
 			  struct tipc_msg *hdr, bool uc);
 int tipc_link_bc_ack_rcv(struct tipc_link *l, u16 acked, u16 gap,
 			 struct tipc_gap_ack_blks *ga,
-			 struct sk_buff_head *xmitq);
+			 struct sk_buff_head *xmitq,
+			 struct sk_buff_head *retrq);
 void tipc_link_build_bc_sync_msg(struct tipc_link *l,
 				 struct sk_buff_head *xmitq);
 void tipc_link_bc_init_rcv(struct tipc_link *l, struct tipc_msg *hdr);
diff --git a/net/tipc/node.c b/net/tipc/node.c
index eb6b62de81a7..917ad3920fac 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -1771,7 +1771,7 @@ static void tipc_node_bc_sync_rcv(struct tipc_node *n, struct tipc_msg *hdr,
 	struct tipc_link *ucl;
 	int rc;
 
-	rc = tipc_bcast_sync_rcv(n->net, n->bc_entry.link, hdr);
+	rc = tipc_bcast_sync_rcv(n->net, n->bc_entry.link, hdr, xmitq);
 
 	if (rc & TIPC_LINK_DOWN_EVT) {
 		tipc_node_reset_links(n);
diff --git a/net/tipc/sysctl.c b/net/tipc/sysctl.c
index 58ab3d6dcdce..97a6264a2993 100644
--- a/net/tipc/sysctl.c
+++ b/net/tipc/sysctl.c
@@ -36,7 +36,7 @@
 #include "core.h"
 #include "trace.h"
 #include "crypto.h"
-
+#include "bcast.h"
 #include <linux/sysctl.h>
 
 static struct ctl_table_header *tipc_ctl_hdr;
@@ -75,6 +75,13 @@ static struct ctl_table tipc_table[] = {
 		.extra1 = SYSCTL_ONE,
 	},
 #endif
+	{
+		.procname = "bc_retruni",
+		.data = &sysctl_tipc_bc_retruni,
+		.maxlen = sizeof(sysctl_tipc_bc_retruni),
+		.mode = 0644,
+		.proc_handler = proc_doulongvec_minmax,
+	},
 	{}
 };
 
-- 
2.13.7
From: Tuong L. <tuo...@de...> - 2020-03-28 04:03:42
As achieved through commit 9195948fbf34 ("tipc: improve TIPC throughput
by Gap ACK blocks"), we apply the same mechanism for the broadcast link
as well. The 'Gap ACK blocks' data field in a 'PROTOCOL/STATE_MSG' will
consist of two parts built for both the broadcast and unicast types:

 31                       16 15                        0
+-------------+-------------+-------------+-------------+
|  bgack_cnt  |  ugack_cnt  |            len            |
+-------------+-------------+-------------+-------------+  -
|            gap            |            ack            |   |
+-------------+-------------+-------------+-------------+    > bc gacks
:                           :                           :   |
+-------------+-------------+-------------+-------------+  -
|            gap            |            ack            |   |
+-------------+-------------+-------------+-------------+    > uc gacks
:                           :                           :   |
+-------------+-------------+-------------+-------------+  -

which is "automatically" backward-compatible.

We also increase the max number of Gap ACK blocks to 128, allowing up to
64 blocks per type (total buffer size = 516 bytes).

Besides, the 'tipc_link_advance_transmq()' function is refactored so that
it is now applicable to both the unicast and broadcast cases, so some old
functions can be removed and the code is optimized.

With the patch, TIPC broadcast is more robust regardless of packet loss
or disorder, latency, ... in the underlying network. Its performance is
boosted significantly.
For example, an experiment with a 5% packet loss rate gives:

$ time tipc-pipe --mc --rdm --data_size 123 --data_num 1500000
real    0m 42.46s
user    0m 1.16s
sys     0m 17.67s

Without the patch:

$ time tipc-pipe --mc --rdm --data_size 123 --data_num 1500000
real    8m 27.94s
user    0m 0.55s
sys     0m 2.38s

Acked-by: Jon Maloy <jm...@re...>
Signed-off-by: Tuong Lien <tuo...@de...>
---
 net/tipc/bcast.c |   9 +-
 net/tipc/link.c  | 438 +++++++++++++++++++++++++++++++++----------------------
 net/tipc/link.h  |   7 +-
 net/tipc/msg.h   |  14 +-
 net/tipc/node.c  |  10 +-
 5 files changed, 293 insertions(+), 185 deletions(-)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 4c20be08b9c4..3ce690a96ee9 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -474,7 +474,7 @@ void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l,
 	__skb_queue_head_init(&xmitq);
 
 	tipc_bcast_lock(net);
-	tipc_link_bc_ack_rcv(l, acked, &xmitq);
+	tipc_link_bc_ack_rcv(l, acked, 0, NULL, &xmitq);
 	tipc_bcast_unlock(net);
 
 	tipc_bcbase_xmit(net, &xmitq);
@@ -492,6 +492,7 @@ int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l,
 			struct tipc_msg *hdr)
 {
 	struct sk_buff_head *inputq = &tipc_bc_base(net)->inputq;
+	struct tipc_gap_ack_blks *ga;
 	struct sk_buff_head xmitq;
 	int rc = 0;
 
@@ -501,8 +502,10 @@ int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l,
 	if (msg_type(hdr) != STATE_MSG) {
 		tipc_link_bc_init_rcv(l, hdr);
 	} else if (!msg_bc_ack_invalid(hdr)) {
-		tipc_link_bc_ack_rcv(l, msg_bcast_ack(hdr), &xmitq);
-		rc = tipc_link_bc_sync_rcv(l, hdr, &xmitq);
+		tipc_get_gap_ack_blks(&ga, l, hdr, false);
+		rc = tipc_link_bc_ack_rcv(l, msg_bcast_ack(hdr),
+					  msg_bc_gap(hdr), ga, &xmitq);
+		rc |= tipc_link_bc_sync_rcv(l, hdr, &xmitq);
 	}
 	tipc_bcast_unlock(net);
 
diff --git a/net/tipc/link.c b/net/tipc/link.c
index 467c53a1fb5c..1b60ba665504 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -188,6 +188,8 @@ struct tipc_link {
 	/* Broadcast */
 	u16 ackers;
 	u16 acked;
+	u16 last_gap;
+	struct tipc_gap_ack_blks *last_ga;
struct tipc_link *bc_rcvlink; struct tipc_link *bc_sndlink; u8 nack_state; @@ -249,11 +251,14 @@ static int tipc_link_build_nack_msg(struct tipc_link *l, struct sk_buff_head *xmitq); static void tipc_link_build_bc_init_msg(struct tipc_link *l, struct sk_buff_head *xmitq); -static int tipc_link_release_pkts(struct tipc_link *l, u16 to); -static u16 tipc_build_gap_ack_blks(struct tipc_link *l, void *data, u16 gap); -static int tipc_link_advance_transmq(struct tipc_link *l, u16 acked, u16 gap, +static u8 __tipc_build_gap_ack_blks(struct tipc_gap_ack_blks *ga, + struct tipc_link *l, u8 start_index); +static u16 tipc_build_gap_ack_blks(struct tipc_link *l, struct tipc_msg *hdr); +static int tipc_link_advance_transmq(struct tipc_link *l, struct tipc_link *r, + u16 acked, u16 gap, struct tipc_gap_ack_blks *ga, - struct sk_buff_head *xmitq); + struct sk_buff_head *xmitq, + bool *retransmitted, int *rc); static void tipc_link_update_cwin(struct tipc_link *l, int released, bool retransmitted); /* @@ -370,7 +375,7 @@ void tipc_link_remove_bc_peer(struct tipc_link *snd_l, snd_l->ackers--; rcv_l->bc_peer_is_up = true; rcv_l->state = LINK_ESTABLISHED; - tipc_link_bc_ack_rcv(rcv_l, ack, xmitq); + tipc_link_bc_ack_rcv(rcv_l, ack, 0, NULL, xmitq); trace_tipc_link_reset(rcv_l, TIPC_DUMP_ALL, "bclink removed!"); tipc_link_reset(rcv_l); rcv_l->state = LINK_RESET; @@ -784,8 +789,6 @@ bool tipc_link_too_silent(struct tipc_link *l) return (l->silent_intv_cnt + 2 > l->abort_limit); } -static int tipc_link_bc_retrans(struct tipc_link *l, struct tipc_link *r, - u16 from, u16 to, struct sk_buff_head *xmitq); /* tipc_link_timeout - perform periodic task as instructed from node timeout */ int tipc_link_timeout(struct tipc_link *l, struct sk_buff_head *xmitq) @@ -948,6 +951,9 @@ void tipc_link_reset(struct tipc_link *l) l->snd_nxt_state = 1; l->rcv_nxt_state = 1; l->acked = 0; + l->last_gap = 0; + kfree(l->last_ga); + l->last_ga = NULL; l->silent_intv_cnt = 0; l->rst_cnt = 0; l->bc_peer_is_up = 
false; @@ -1183,68 +1189,14 @@ static bool link_retransmit_failure(struct tipc_link *l, struct tipc_link *r, if (link_is_bc_sndlink(l)) { r->state = LINK_RESET; - *rc = TIPC_LINK_DOWN_EVT; + *rc |= TIPC_LINK_DOWN_EVT; } else { - *rc = tipc_link_fsm_evt(l, LINK_FAILURE_EVT); + *rc |= tipc_link_fsm_evt(l, LINK_FAILURE_EVT); } return true; } -/* tipc_link_bc_retrans() - retransmit zero or more packets - * @l: the link to transmit on - * @r: the receiving link ordering the retransmit. Same as l if unicast - * @from: retransmit from (inclusive) this sequence number - * @to: retransmit to (inclusive) this sequence number - * xmitq: queue for accumulating the retransmitted packets - */ -static int tipc_link_bc_retrans(struct tipc_link *l, struct tipc_link *r, - u16 from, u16 to, struct sk_buff_head *xmitq) -{ - struct sk_buff *_skb, *skb = skb_peek(&l->transmq); - u16 bc_ack = l->bc_rcvlink->rcv_nxt - 1; - u16 ack = l->rcv_nxt - 1; - int retransmitted = 0; - struct tipc_msg *hdr; - int rc = 0; - - if (!skb) - return 0; - if (less(to, from)) - return 0; - - trace_tipc_link_retrans(r, from, to, &l->transmq); - - if (link_retransmit_failure(l, r, &rc)) - return rc; - - skb_queue_walk(&l->transmq, skb) { - hdr = buf_msg(skb); - if (less(msg_seqno(hdr), from)) - continue; - if (more(msg_seqno(hdr), to)) - break; - if (time_before(jiffies, TIPC_SKB_CB(skb)->nxt_retr)) - continue; - TIPC_SKB_CB(skb)->nxt_retr = TIPC_BC_RETR_LIM; - _skb = pskb_copy(skb, GFP_ATOMIC); - if (!_skb) - return 0; - hdr = buf_msg(_skb); - msg_set_ack(hdr, ack); - msg_set_bcast_ack(hdr, bc_ack); - _skb->priority = TC_PRIO_CONTROL; - __skb_queue_tail(xmitq, _skb); - l->stats.retransmitted++; - retransmitted++; - /* Increase actual retrans counter & mark first time */ - if (!TIPC_SKB_CB(skb)->retr_cnt++) - TIPC_SKB_CB(skb)->retr_stamp = jiffies; - } - tipc_link_update_cwin(l, 0, retransmitted); - return 0; -} - /* tipc_data_input - deliver data and name distr msgs to upper layer * * Consumes buffer if 
message is of right type @@ -1402,46 +1354,71 @@ static int tipc_link_tnl_rcv(struct tipc_link *l, struct sk_buff *skb, return rc; } -static int tipc_link_release_pkts(struct tipc_link *l, u16 acked) -{ - int released = 0; - struct sk_buff *skb, *tmp; - - skb_queue_walk_safe(&l->transmq, skb, tmp) { - if (more(buf_seqno(skb), acked)) - break; - __skb_unlink(skb, &l->transmq); - kfree_skb(skb); - released++; +/** + * tipc_get_gap_ack_blks - get Gap ACK blocks from PROTOCOL/STATE_MSG + * @ga: returned pointer to the Gap ACK blocks if any + * @l: the tipc link + * @hdr: the PROTOCOL/STATE_MSG header + * @uc: desired Gap ACK blocks type, i.e. unicast (= 1) or broadcast (= 0) + * + * Return: the total Gap ACK blocks size + */ +u16 tipc_get_gap_ack_blks(struct tipc_gap_ack_blks **ga, struct tipc_link *l, + struct tipc_msg *hdr, bool uc) +{ + struct tipc_gap_ack_blks *p; + u16 sz = 0; + + /* Does peer support the Gap ACK blocks feature? */ + if (l->peer_caps & TIPC_GAP_ACK_BLOCK) { + p = (struct tipc_gap_ack_blks *)msg_data(hdr); + sz = ntohs(p->len); + /* Sanity check */ + if (sz == tipc_gap_ack_blks_sz(p->ugack_cnt + p->bgack_cnt)) { + /* Good, check if the desired type exists */ + if ((uc && p->ugack_cnt) || (!uc && p->bgack_cnt)) + goto ok; + /* Backward compatible: peer might not support bc, but uc? */ + } else if (uc && sz == tipc_gap_ack_blks_sz(p->ugack_cnt)) { + if (p->ugack_cnt) { + p->bgack_cnt = 0; + goto ok; + } + } } - return released; + /* Other cases: ignore! 
*/ + p = NULL; + +ok: + *ga = p; + return sz; } -/* tipc_build_gap_ack_blks - build Gap ACK blocks - * @l: tipc link that data have come with gaps in sequence if any - * @data: data buffer to store the Gap ACK blocks after built - * - * returns the actual allocated memory size - */ -static u16 tipc_build_gap_ack_blks(struct tipc_link *l, void *data, u16 gap) +static u8 __tipc_build_gap_ack_blks(struct tipc_gap_ack_blks *ga, + struct tipc_link *l, u8 start_index) { + struct tipc_gap_ack *gacks = &ga->gacks[start_index]; struct sk_buff *skb = skb_peek(&l->deferdq); - struct tipc_gap_ack_blks *ga = data; - u16 len, expect, seqno = 0; + u16 expect, seqno = 0; u8 n = 0; - if (!skb || !gap) - goto exit; + if (!skb) + return 0; expect = buf_seqno(skb); skb_queue_walk(&l->deferdq, skb) { seqno = buf_seqno(skb); if (unlikely(more(seqno, expect))) { - ga->gacks[n].ack = htons(expect - 1); - ga->gacks[n].gap = htons(seqno - expect); - if (++n >= MAX_GAP_ACK_BLKS) { - pr_info_ratelimited("Too few Gap ACK blocks!\n"); - goto exit; + gacks[n].ack = htons(expect - 1); + gacks[n].gap = htons(seqno - expect); + if (++n >= MAX_GAP_ACK_BLKS / 2) { + char buf[TIPC_MAX_LINK_NAME]; + + pr_info_ratelimited("Gacks on %s: %d, ql: %d!\n", + tipc_link_name_ext(l, buf), + n, + skb_queue_len(&l->deferdq)); + return n; } } else if (unlikely(less(seqno, expect))) { pr_warn("Unexpected skb in deferdq!\n"); @@ -1451,14 +1428,57 @@ static u16 tipc_build_gap_ack_blks(struct tipc_link *l, void *data, u16 gap) } /* last block */ - ga->gacks[n].ack = htons(seqno); - ga->gacks[n].gap = 0; + gacks[n].ack = htons(seqno); + gacks[n].gap = 0; n++; + return n; +} -exit: - len = tipc_gap_ack_blks_sz(n); +/* tipc_build_gap_ack_blks - build Gap ACK blocks + * @l: tipc unicast link + * @hdr: the tipc message buffer to store the Gap ACK blocks after built + * + * The function builds Gap ACK blocks for both the unicast & broadcast receiver + * links of a certain peer, the buffer after built has the network data 
format + * as follows: + * 31 16 15 0 + * +-------------+-------------+-------------+-------------+ + * | bgack_cnt | ugack_cnt | len | + * +-------------+-------------+-------------+-------------+ - + * | gap | ack | | + * +-------------+-------------+-------------+-------------+ > bc gacks + * : : : | + * +-------------+-------------+-------------+-------------+ - + * | gap | ack | | + * +-------------+-------------+-------------+-------------+ > uc gacks + * : : : | + * +-------------+-------------+-------------+-------------+ - + * (See struct tipc_gap_ack_blks) + * + * returns the actual allocated memory size + */ +static u16 tipc_build_gap_ack_blks(struct tipc_link *l, struct tipc_msg *hdr) +{ + struct tipc_link *bcl = l->bc_rcvlink; + struct tipc_gap_ack_blks *ga; + u16 len; + + ga = (struct tipc_gap_ack_blks *)msg_data(hdr); + + /* Start with broadcast link first */ + tipc_bcast_lock(bcl->net); + msg_set_bcast_ack(hdr, bcl->rcv_nxt - 1); + msg_set_bc_gap(hdr, link_bc_rcv_gap(bcl)); + ga->bgack_cnt = __tipc_build_gap_ack_blks(ga, bcl, 0); + tipc_bcast_unlock(bcl->net); + + /* Now for unicast link, but an explicit NACK only (???) */ + ga->ugack_cnt = (msg_seq_gap(hdr)) ? + __tipc_build_gap_ack_blks(ga, l, ga->bgack_cnt) : 0; + + /* Total len */ + len = tipc_gap_ack_blks_sz(ga->bgack_cnt + ga->ugack_cnt); ga->len = htons(len); - ga->gack_cnt = n; return len; } @@ -1466,47 +1486,109 @@ static u16 tipc_build_gap_ack_blks(struct tipc_link *l, void *data, u16 gap) * acked packets, also doing retransmissions if * gaps found * @l: tipc link with transmq queue to be advanced + * @r: tipc link "receiver" i.e. in case of broadcast (= "l" if unicast) * @acked: seqno of last packet acked by peer without any gaps before * @gap: # of gap packets * @ga: buffer pointer to Gap ACK blocks from peer * @xmitq: queue for accumulating the retransmitted packets if any + * @retransmitted: returned boolean value if a retransmission is really issued + * @rc: returned code e.g. 
TIPC_LINK_DOWN_EVT if a repeated retransmit failures + * happens (- unlikely case) * - * In case of a repeated retransmit failures, the call will return shortly - * with a returned code (e.g. TIPC_LINK_DOWN_EVT) + * Return: the number of packets released from the link transmq */ -static int tipc_link_advance_transmq(struct tipc_link *l, u16 acked, u16 gap, +static int tipc_link_advance_transmq(struct tipc_link *l, struct tipc_link *r, + u16 acked, u16 gap, struct tipc_gap_ack_blks *ga, - struct sk_buff_head *xmitq) + struct sk_buff_head *xmitq, + bool *retransmitted, int *rc) { + struct tipc_gap_ack_blks *last_ga = r->last_ga, *this_ga = NULL; + struct tipc_gap_ack *gacks = NULL; struct sk_buff *skb, *_skb, *tmp; struct tipc_msg *hdr; + u32 qlen = skb_queue_len(&l->transmq); + u16 nacked = acked, ngap = gap, gack_cnt = 0; u16 bc_ack = l->bc_rcvlink->rcv_nxt - 1; - bool retransmitted = false; u16 ack = l->rcv_nxt - 1; - bool passed = false; - u16 released = 0; u16 seqno, n = 0; - int rc = 0; + u16 end = r->acked, start = end, offset = r->last_gap; + u16 si = (last_ga) ? 
last_ga->start_index : 0; + bool is_uc = !link_is_bc_sndlink(l); + bool bc_has_acked = false; + + /* Determine Gap ACK blocks if any for the particular link */ + if (ga && is_uc) { + /* Get the Gap ACKs, uc part */ + gack_cnt = ga->ugack_cnt; + gacks = &ga->gacks[ga->bgack_cnt]; + } else if (ga) { + /* Copy the Gap ACKs, bc part, for later renewal if needed */ + this_ga = kmemdup(ga, tipc_gap_ack_blks_sz(ga->bgack_cnt), + GFP_ATOMIC); + if (likely(this_ga)) { + this_ga->start_index = 0; + /* Start with the bc Gap ACKs */ + gack_cnt = this_ga->bgack_cnt; + gacks = &this_ga->gacks[0]; + } else { + /* Hmm, we can get in trouble..., simply ignore it */ + pr_warn_ratelimited("Ignoring bc Gap ACKs, no memory\n"); + } + } + /* Advance the link transmq */ skb_queue_walk_safe(&l->transmq, skb, tmp) { seqno = buf_seqno(skb); next_gap_ack: - if (less_eq(seqno, acked)) { + if (less_eq(seqno, nacked)) { + if (is_uc) + goto release; + /* Skip packets peer has already acked */ + if (!more(seqno, r->acked)) + continue; + /* Get the next of last Gap ACK blocks */ + while (more(seqno, end)) { + if (!last_ga || si >= last_ga->bgack_cnt) + break; + start = end + offset + 1; + end = ntohs(last_ga->gacks[si].ack); + offset = ntohs(last_ga->gacks[si].gap); + si++; + WARN_ONCE(more(start, end) || + (!offset && + si < last_ga->bgack_cnt) || + si > MAX_GAP_ACK_BLKS, + "Corrupted Gap ACK: %d %d %d %d %d\n", + start, end, offset, si, + last_ga->bgack_cnt); + } + /* Check against the last Gap ACK block */ + if (in_range(seqno, start, end)) + continue; + /* Update/release the packet peer is acking */ + bc_has_acked = true; + if (--TIPC_SKB_CB(skb)->ackers) + continue; +release: /* release skb */ __skb_unlink(skb, &l->transmq); kfree_skb(skb); - released++; - } else if (less_eq(seqno, acked + gap)) { - /* First, check if repeated retrans failures occurs? 
*/ - if (!passed && link_retransmit_failure(l, l, &rc)) - return rc; - passed = true; - + } else if (less_eq(seqno, nacked + ngap)) { + /* First gap: check if repeated retrans failures? */ + if (unlikely(seqno == acked + 1 && + link_retransmit_failure(l, r, rc))) { + /* Ignore this bc Gap ACKs if any */ + kfree(this_ga); + this_ga = NULL; + break; + } /* retransmit skb if unrestricted*/ if (time_before(jiffies, TIPC_SKB_CB(skb)->nxt_retr)) continue; - TIPC_SKB_CB(skb)->nxt_retr = TIPC_UC_RETR_TIME; + TIPC_SKB_CB(skb)->nxt_retr = (is_uc) ? + TIPC_UC_RETR_TIME : TIPC_BC_RETR_LIM; _skb = pskb_copy(skb, GFP_ATOMIC); if (!_skb) continue; @@ -1516,25 +1598,51 @@ static int tipc_link_advance_transmq(struct tipc_link *l, u16 acked, u16 gap, _skb->priority = TC_PRIO_CONTROL; __skb_queue_tail(xmitq, _skb); l->stats.retransmitted++; - retransmitted = true; + *retransmitted = true; /* Increase actual retrans counter & mark first time */ if (!TIPC_SKB_CB(skb)->retr_cnt++) TIPC_SKB_CB(skb)->retr_stamp = jiffies; } else { /* retry with Gap ACK blocks if any */ - if (!ga || n >= ga->gack_cnt) + if (n >= gack_cnt) break; - acked = ntohs(ga->gacks[n].ack); - gap = ntohs(ga->gacks[n].gap); + nacked = ntohs(gacks[n].ack); + ngap = ntohs(gacks[n].gap); n++; goto next_gap_ack; } } - if (released || retransmitted) - tipc_link_update_cwin(l, released, retransmitted); - if (released) - tipc_link_advance_backlog(l, xmitq); - return 0; + + /* Renew last Gap ACK blocks for bc if needed */ + if (bc_has_acked) { + if (this_ga) { + kfree(last_ga); + r->last_ga = this_ga; + r->last_gap = gap; + } else if (last_ga) { + if (less(acked, start)) { + si--; + offset = start - acked - 1; + } else if (less(acked, end)) { + acked = end; + } + if (si < last_ga->bgack_cnt) { + last_ga->start_index = si; + r->last_gap = offset; + } else { + kfree(last_ga); + r->last_ga = NULL; + r->last_gap = 0; + } + } else { + r->last_gap = 0; + } + r->acked = acked; + } else { + kfree(this_ga); + } + + return qlen - 
skb_queue_len(&l->transmq); } /* tipc_link_build_state_msg: prepare link state message for transmission @@ -1651,7 +1759,8 @@ int tipc_link_rcv(struct tipc_link *l, struct sk_buff *skb, kfree_skb(skb); break; } - released += tipc_link_release_pkts(l, msg_ack(hdr)); + released += tipc_link_advance_transmq(l, l, msg_ack(hdr), 0, + NULL, NULL, NULL, NULL); /* Defer delivery if sequence gap */ if (unlikely(seqno != rcv_nxt)) { @@ -1739,7 +1848,7 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe, msg_set_probe(hdr, probe); msg_set_is_keepalive(hdr, probe || probe_reply); if (l->peer_caps & TIPC_GAP_ACK_BLOCK) - glen = tipc_build_gap_ack_blks(l, data, rcvgap); + glen = tipc_build_gap_ack_blks(l, hdr); tipc_mon_prep(l->net, data + glen, &dlen, mstate, l->bearer_id); msg_set_size(hdr, INT_H_SIZE + glen + dlen); skb_trim(skb, INT_H_SIZE + glen + dlen); @@ -2027,20 +2136,19 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb, { struct tipc_msg *hdr = buf_msg(skb); struct tipc_gap_ack_blks *ga = NULL; - u16 rcvgap = 0; - u16 ack = msg_ack(hdr); - u16 gap = msg_seq_gap(hdr); + bool reply = msg_probe(hdr), retransmitted = false; + u16 dlen = msg_data_sz(hdr), glen = 0; u16 peers_snd_nxt = msg_next_sent(hdr); u16 peers_tol = msg_link_tolerance(hdr); u16 peers_prio = msg_linkprio(hdr); + u16 gap = msg_seq_gap(hdr); + u16 ack = msg_ack(hdr); u16 rcv_nxt = l->rcv_nxt; - u16 dlen = msg_data_sz(hdr); + u16 rcvgap = 0; int mtyp = msg_type(hdr); - bool reply = msg_probe(hdr); - u16 glen = 0; - void *data; + int rc = 0, released; char *if_name; - int rc = 0; + void *data; trace_tipc_proto_rcv(skb, false, l->name); if (tipc_link_is_blocked(l) || !xmitq) @@ -2137,13 +2245,7 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb, } /* Receive Gap ACK blocks from peer if any */ - if (l->peer_caps & TIPC_GAP_ACK_BLOCK) { - ga = (struct tipc_gap_ack_blks *)data; - glen = ntohs(ga->len); - /* sanity check: if failed, 
ignore Gap ACK blocks */ - if (glen != tipc_gap_ack_blks_sz(ga->gack_cnt)) - ga = NULL; - } + glen = tipc_get_gap_ack_blks(&ga, l, hdr, true); tipc_mon_rcv(l->net, data + glen, dlen - glen, l->addr, &l->mon_state, l->bearer_id); @@ -2158,9 +2260,14 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb, tipc_link_build_proto_msg(l, STATE_MSG, 0, reply, rcvgap, 0, 0, xmitq); - rc |= tipc_link_advance_transmq(l, ack, gap, ga, xmitq); + released = tipc_link_advance_transmq(l, l, ack, gap, ga, xmitq, + &retransmitted, &rc); if (gap) l->stats.recv_nacks++; + if (released || retransmitted) + tipc_link_update_cwin(l, released, retransmitted); + if (released) + tipc_link_advance_backlog(l, xmitq); if (unlikely(!skb_queue_empty(&l->wakeupq))) link_prepare_wakeup(l); } @@ -2246,10 +2353,7 @@ void tipc_link_bc_init_rcv(struct tipc_link *l, struct tipc_msg *hdr) int tipc_link_bc_sync_rcv(struct tipc_link *l, struct tipc_msg *hdr, struct sk_buff_head *xmitq) { - struct tipc_link *snd_l = l->bc_sndlink; u16 peers_snd_nxt = msg_bc_snd_nxt(hdr); - u16 from = msg_bcast_ack(hdr) + 1; - u16 to = from + msg_bc_gap(hdr) - 1; int rc = 0; if (!link_is_up(l)) @@ -2271,8 +2375,6 @@ int tipc_link_bc_sync_rcv(struct tipc_link *l, struct tipc_msg *hdr, if (more(peers_snd_nxt, l->rcv_nxt + l->window)) return rc; - rc = tipc_link_bc_retrans(snd_l, l, from, to, xmitq); - l->snd_nxt = peers_snd_nxt; if (link_bc_rcv_gap(l)) rc |= TIPC_LINK_SND_STATE; @@ -2307,38 +2409,27 @@ int tipc_link_bc_sync_rcv(struct tipc_link *l, struct tipc_msg *hdr, return 0; } -void tipc_link_bc_ack_rcv(struct tipc_link *l, u16 acked, - struct sk_buff_head *xmitq) +int tipc_link_bc_ack_rcv(struct tipc_link *r, u16 acked, u16 gap, + struct tipc_gap_ack_blks *ga, + struct sk_buff_head *xmitq) { - struct sk_buff *skb, *tmp; - struct tipc_link *snd_l = l->bc_sndlink; + struct tipc_link *l = r->bc_sndlink; + bool unused = false; + int rc = 0; - if (!link_is_up(l) || !l->bc_peer_is_up) - return; + if 
(!link_is_up(r) || !r->bc_peer_is_up) + return 0; - if (!more(acked, l->acked)) - return; + if (less(acked, r->acked) || (acked == r->acked && !gap && !ga)) + return 0; - trace_tipc_link_bc_ack(l, l->acked, acked, &snd_l->transmq); - /* Skip over packets peer has already acked */ - skb_queue_walk(&snd_l->transmq, skb) { - if (more(buf_seqno(skb), l->acked)) - break; - } + tipc_link_advance_transmq(l, r, acked, gap, ga, xmitq, &unused, &rc); - /* Update/release the packets peer is acking now */ - skb_queue_walk_from_safe(&snd_l->transmq, skb, tmp) { - if (more(buf_seqno(skb), acked)) - break; - if (!--TIPC_SKB_CB(skb)->ackers) { - __skb_unlink(skb, &snd_l->transmq); - kfree_skb(skb); - } - } - l->acked = acked; - tipc_link_advance_backlog(snd_l, xmitq); - if (unlikely(!skb_queue_empty(&snd_l->wakeupq))) - link_prepare_wakeup(snd_l); + tipc_link_advance_backlog(l, xmitq); + if (unlikely(!skb_queue_empty(&l->wakeupq))) + link_prepare_wakeup(l); + + return rc; } /* tipc_link_bc_nack_rcv(): receive broadcast nack message @@ -2366,8 +2457,7 @@ int tipc_link_bc_nack_rcv(struct tipc_link *l, struct sk_buff *skb, return 0; if (dnode == tipc_own_addr(l->net)) { - tipc_link_bc_ack_rcv(l, acked, xmitq); - rc = tipc_link_bc_retrans(l->bc_sndlink, l, from, to, xmitq); + rc = tipc_link_bc_ack_rcv(l, acked, to - acked, NULL, xmitq); l->stats.recv_nacks++; return rc; } diff --git a/net/tipc/link.h b/net/tipc/link.h index d3c1c3fc1659..0a0fa7350722 100644 --- a/net/tipc/link.h +++ b/net/tipc/link.h @@ -143,8 +143,11 @@ int tipc_link_bc_peers(struct tipc_link *l); void tipc_link_set_mtu(struct tipc_link *l, int mtu); int tipc_link_mtu(struct tipc_link *l); int tipc_link_mss(struct tipc_link *l); -void tipc_link_bc_ack_rcv(struct tipc_link *l, u16 acked, - struct sk_buff_head *xmitq); +u16 tipc_get_gap_ack_blks(struct tipc_gap_ack_blks **ga, struct tipc_link *l, + struct tipc_msg *hdr, bool uc); +int tipc_link_bc_ack_rcv(struct tipc_link *l, u16 acked, u16 gap, + struct 
tipc_gap_ack_blks *ga, + struct sk_buff_head *xmitq); void tipc_link_build_bc_sync_msg(struct tipc_link *l, struct sk_buff_head *xmitq); void tipc_link_bc_init_rcv(struct tipc_link *l, struct tipc_msg *hdr); diff --git a/net/tipc/msg.h b/net/tipc/msg.h index 6d466ebdb64f..9a38f9c9d6eb 100644 --- a/net/tipc/msg.h +++ b/net/tipc/msg.h @@ -160,20 +160,26 @@ struct tipc_gap_ack { /* struct tipc_gap_ack_blks * @len: actual length of the record - * @gack_cnt: number of Gap ACK blocks in the record + * @bgack_cnt: number of Gap ACK blocks for broadcast in the record + * @ugack_cnt: number of Gap ACK blocks for unicast (following the broadcast + * ones) + * @start_index: starting index for "valid" broadcast Gap ACK blocks * @gacks: array of Gap ACK blocks */ struct tipc_gap_ack_blks { __be16 len; - u8 gack_cnt; - u8 reserved; + union { + u8 ugack_cnt; + u8 start_index; + }; + u8 bgack_cnt; struct tipc_gap_ack gacks[]; }; #define tipc_gap_ack_blks_sz(n) (sizeof(struct tipc_gap_ack_blks) + \ sizeof(struct tipc_gap_ack) * (n)) -#define MAX_GAP_ACK_BLKS 32 +#define MAX_GAP_ACK_BLKS 128 #define MAX_GAP_ACK_BLKS_SZ tipc_gap_ack_blks_sz(MAX_GAP_ACK_BLKS) static inline struct tipc_msg *buf_msg(struct sk_buff *skb) diff --git a/net/tipc/node.c b/net/tipc/node.c index 0c88778c88b5..eb6b62de81a7 100644 --- a/net/tipc/node.c +++ b/net/tipc/node.c @@ -2069,10 +2069,16 @@ void tipc_rcv(struct net *net, struct sk_buff *skb, struct tipc_bearer *b) le = &n->links[bearer_id]; /* Ensure broadcast reception is in synch with peer's send state */ - if (unlikely(usr == LINK_PROTOCOL)) + if (unlikely(usr == LINK_PROTOCOL)) { + if (unlikely(skb_linearize(skb))) { + tipc_node_put(n); + goto discard; + } + hdr = buf_msg(skb); tipc_node_bc_sync_rcv(n, hdr, bearer_id, &xmitq); - else if (unlikely(tipc_link_acked(n->bc_entry.link) != bc_ack)) + } else if (unlikely(tipc_link_acked(n->bc_entry.link) != bc_ack)) { tipc_bcast_ack_rcv(net, n->bc_entry.link, hdr); + } /* Receive packet directly if conditions 
permit */ tipc_node_read_lock(n); -- 2.13.7 |
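The record layout and the receive-side sanity check introduced by the
patch above can be tried out in userspace. What follows is only a minimal
sketch under stated assumptions, not the kernel code: the demo_* names are
hypothetical, plain C integer types stand in for the kernel's __be16/u8,
and demo_check() only reports whether usable blocks of the wanted type are
present instead of returning the record size and a pointer as
tipc_get_gap_ack_blks() does. The field order follows the msg.h hunk, where
ugack_cnt takes the byte of the old gack_cnt and bgack_cnt the byte of the
old reserved field.

```c
/* Userspace illustration only; demo_* names are hypothetical. */
#include <arpa/inet.h>	/* htons(), ntohs() */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_GAP_ACK_BLKS 128

/* One Gap ACK block: both fields are kept in network byte order */
struct demo_gap_ack {
	uint16_t ack;
	uint16_t gap;
};

/* Record header followed by the bc blocks, then the uc blocks */
struct demo_gap_ack_blks {
	uint16_t len;		/* total record size, network byte order */
	uint8_t ugack_cnt;	/* same byte as the old 'gack_cnt' */
	uint8_t bgack_cnt;	/* same byte as the old 'reserved' */
	struct demo_gap_ack gacks[];
};

/* Size in bytes of a record carrying n blocks */
static size_t demo_blks_sz(unsigned int n)
{
	return sizeof(struct demo_gap_ack_blks) +
	       sizeof(struct demo_gap_ack) * n;
}

/* Simplified form of the sanity check: returns 1 if the record holds
 * usable blocks of the wanted type (uc != 0: unicast, uc == 0: broadcast),
 * 0 if the record must be ignored.
 */
static int demo_check(struct demo_gap_ack_blks *p, int uc)
{
	size_t sz;

	assert(p != NULL);
	sz = ntohs(p->len);

	/* New format: len covers both the bc and the uc blocks */
	if (sz == demo_blks_sz(p->ugack_cnt + p->bgack_cnt))
		return (uc && p->ugack_cnt) || (!uc && p->bgack_cnt);

	/* Old format: len covers the uc blocks only, so whatever sits in
	 * the former 'reserved' byte is not a bc count and is cleared
	 */
	if (uc && p->ugack_cnt && sz == demo_blks_sz(p->ugack_cnt)) {
		p->bgack_cnt = 0;
		return 1;
	}
	return 0;
}
```

With these definitions, demo_blks_sz(DEMO_MAX_GAP_ACK_BLKS) evaluates to
516, i.e. a 4-byte header plus 128 blocks of 4 bytes each, which matches
the buffer size quoted in the commit message.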
From: Tuong L. <tuo...@de...> - 2020-03-28 04:03:41
This commit enables dumping the statistics of a broadcast-receiver link
in the same way as the traditional 'broadcast-link' one (which is for the
broadcast-sender). The link dump can be triggered via netlink (e.g. with
the iproute2/tipc tool), using the link flag 'TIPC_NLA_LINK_BROADCAST' as
the indicator.

The name of a broadcast-receiver link of a specific peer will be in the
format: 'broadcast-link:<peer-id>'.

For example:

Link <broadcast-link:1001002>
  Window:50 packets
  RX packets:7841 fragments:2408/440 bundles:0/0
  TX packets:0 fragments:0/0 bundles:0/0
  RX naks:0 defs:124 dups:0
  TX naks:21 acks:0 retrans:0
  Congestion link:0  Send queue max:0 avg:0

In addition, the broadcast-receiver link statistics can be reset in the
usual way via netlink by specifying that link name in the command.

Note: 'tipc_link_name_ext()' is removed because the link name can now
be retrieved simply via 'l->name'.

Signed-off-by: Tuong Lien <tuo...@de...>
---
 net/tipc/bcast.c   |  6 ++---
 net/tipc/bcast.h   |  5 +++--
 net/tipc/link.c    | 65 +++++++++++++++++++++++++++---------------------------
 net/tipc/link.h    |  3 +--
 net/tipc/msg.c     |  9 ++++----
 net/tipc/msg.h     |  2 +-
 net/tipc/netlink.c |  2 +-
 net/tipc/node.c    | 63 +++++++++++++++++++++++++++++++++++++++++++++-------
 net/tipc/trace.h   |  4 ++--
 9 files changed, 103 insertions(+), 56 deletions(-)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 50a16f8bebd9..383f87bc1061 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -563,10 +563,8 @@ void tipc_bcast_remove_peer(struct net *net, struct tipc_link *rcv_l)
 	tipc_sk_rcv(net, inputq);
 }
 
-int tipc_bclink_reset_stats(struct net *net)
+int tipc_bclink_reset_stats(struct net *net, struct tipc_link *l)
 {
-	struct tipc_link *l = tipc_bc_sndlink(net);
-
 	if (!l)
 		return -ENOPROTOOPT;
 
@@ -694,7 +692,7 @@ int tipc_bcast_init(struct net *net)
 	tn->bcbase = bb;
 	spin_lock_init(&tipc_net(net)->bclock);
 
-	if (!tipc_link_bc_create(net, 0, 0,
+	if (!tipc_link_bc_create(net, 0, 0, NULL,
 				 FB_MTU,
 				 BCLINK_WIN_DEFAULT,
BCLINK_WIN_DEFAULT, diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h index 97d3cf9d3e4d..4240c95188b1 100644 --- a/net/tipc/bcast.h +++ b/net/tipc/bcast.h @@ -96,9 +96,10 @@ void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l, int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l, struct tipc_msg *hdr, struct sk_buff_head *retrq); -int tipc_nl_add_bc_link(struct net *net, struct tipc_nl_msg *msg); +int tipc_nl_add_bc_link(struct net *net, struct tipc_nl_msg *msg, + struct tipc_link *bcl); int tipc_nl_bc_link_set(struct net *net, struct nlattr *attrs[]); -int tipc_bclink_reset_stats(struct net *net); +int tipc_bclink_reset_stats(struct net *net, struct tipc_link *l); u32 tipc_bcast_get_broadcast_mode(struct net *net); u32 tipc_bcast_get_broadcast_ratio(struct net *net); diff --git a/net/tipc/link.c b/net/tipc/link.c index 3071e46f029a..808d3a76c27f 100644 --- a/net/tipc/link.c +++ b/net/tipc/link.c @@ -539,7 +539,7 @@ bool tipc_link_create(struct net *net, char *if_name, int bearer_id, * * Returns true if link was created, otherwise false */ -bool tipc_link_bc_create(struct net *net, u32 ownnode, u32 peer, +bool tipc_link_bc_create(struct net *net, u32 ownnode, u32 peer, u8 *peer_id, int mtu, u32 min_win, u32 max_win, u16 peer_caps, struct sk_buff_head *inputq, struct sk_buff_head *namedq, @@ -554,7 +554,18 @@ bool tipc_link_bc_create(struct net *net, u32 ownnode, u32 peer, return false; l = *link; - strcpy(l->name, tipc_bclink_name); + if (peer_id) { + char peer_str[NODE_ID_STR_LEN] = {0,}; + + tipc_nodeid2string(peer_str, peer_id); + if (strlen(peer_str) > 16) + sprintf(peer_str, "%x", peer); + /* Broadcast receiver link name: "broadcast-link:<peer>" */ + snprintf(l->name, sizeof(l->name), "%s:%s", tipc_bclink_name, + peer_str); + } else { + strcpy(l->name, tipc_bclink_name); + } trace_tipc_link_reset(l, TIPC_DUMP_ALL, "bclink created!"); tipc_link_reset(l); l->state = LINK_RESET; @@ -1412,11 +1423,8 @@ static u8 __tipc_build_gap_ack_blks(struct 
tipc_gap_ack_blks *ga, gacks[n].ack = htons(expect - 1); gacks[n].gap = htons(seqno - expect); if (++n >= MAX_GAP_ACK_BLKS / 2) { - char buf[TIPC_MAX_LINK_NAME]; - pr_info_ratelimited("Gacks on %s: %d, ql: %d!\n", - tipc_link_name_ext(l, buf), - n, + l->name, n, skb_queue_len(&l->deferdq)); return n; } @@ -1600,6 +1608,8 @@ static int tipc_link_advance_transmq(struct tipc_link *l, struct tipc_link *r, _skb->priority = TC_PRIO_CONTROL; __skb_queue_tail(xmitq, _skb); l->stats.retransmitted++; + if (!is_uc) + r->stats.retransmitted++; *retransmitted = true; /* Increase actual retrans counter & mark first time */ if (!TIPC_SKB_CB(skb)->retr_cnt++) @@ -1766,7 +1776,8 @@ int tipc_link_rcv(struct tipc_link *l, struct sk_buff *skb, /* Defer delivery if sequence gap */ if (unlikely(seqno != rcv_nxt)) { - __tipc_skb_queue_sorted(defq, seqno, skb); + if (!__tipc_skb_queue_sorted(defq, seqno, skb)) + l->stats.duplicates++; rc |= tipc_link_build_nack_msg(l, xmitq); break; } @@ -1800,15 +1811,15 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe, int tolerance, int priority, struct sk_buff_head *xmitq) { + struct tipc_mon_state *mstate = &l->mon_state; + struct sk_buff_head *dfq = &l->deferdq; struct tipc_link *bcl = l->bc_rcvlink; - struct sk_buff *skb; struct tipc_msg *hdr; - struct sk_buff_head *dfq = &l->deferdq; + struct sk_buff *skb; bool node_up = link_is_up(bcl); - struct tipc_mon_state *mstate = &l->mon_state; + u16 glen = 0, bc_rcvgap = 0; int dlen = 0; void *data; - u16 glen = 0; /* Don't send protocol message during reset or link failover */ if (tipc_link_is_blocked(l)) @@ -1846,7 +1857,8 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe, if (l->peer_caps & TIPC_LINK_PROTO_SEQNO) msg_set_seqno(hdr, l->snd_nxt_state++); msg_set_seq_gap(hdr, rcvgap); - msg_set_bc_gap(hdr, link_bc_rcv_gap(bcl)); + bc_rcvgap = link_bc_rcv_gap(bcl); + msg_set_bc_gap(hdr, bc_rcvgap); msg_set_probe(hdr, probe); 
msg_set_is_keepalive(hdr, probe || probe_reply); if (l->peer_caps & TIPC_GAP_ACK_BLOCK) @@ -1871,6 +1883,8 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe, l->stats.sent_probes++; if (rcvgap) l->stats.sent_nacks++; + if (bc_rcvgap) + bcl->stats.sent_nacks++; skb->priority = TC_PRIO_CONTROL; __skb_queue_tail(xmitq, skb); trace_tipc_proto_build(skb, false, l->name); @@ -2371,8 +2385,6 @@ int tipc_link_bc_sync_rcv(struct tipc_link *l, struct tipc_msg *hdr, if (!l->bc_peer_is_up) return rc; - l->stats.recv_nacks++; - /* Ignore if peers_snd_nxt goes beyond receive window */ if (more(peers_snd_nxt, l->rcv_nxt + l->window)) return rc; @@ -2423,6 +2435,11 @@ int tipc_link_bc_ack_rcv(struct tipc_link *r, u16 acked, u16 gap, if (!link_is_up(r) || !r->bc_peer_is_up) return 0; + if (gap) { + l->stats.recv_nacks++; + r->stats.recv_nacks++; + } + if (less(acked, r->acked) || (acked == r->acked && !gap && !ga)) return 0; @@ -2734,16 +2751,15 @@ static int __tipc_nl_add_bc_link_stat(struct sk_buff *skb, return -EMSGSIZE; } -int tipc_nl_add_bc_link(struct net *net, struct tipc_nl_msg *msg) +int tipc_nl_add_bc_link(struct net *net, struct tipc_nl_msg *msg, + struct tipc_link *bcl) { int err; void *hdr; struct nlattr *attrs; struct nlattr *prop; - struct tipc_net *tn = net_generic(net, tipc_net_id); u32 bc_mode = tipc_bcast_get_broadcast_mode(net); u32 bc_ratio = tipc_bcast_get_broadcast_ratio(net); - struct tipc_link *bcl = tn->bcl; if (!bcl) return 0; @@ -2830,21 +2846,6 @@ void tipc_link_set_abort_limit(struct tipc_link *l, u32 limit) l->abort_limit = limit; } -char *tipc_link_name_ext(struct tipc_link *l, char *buf) -{ - if (!l) - scnprintf(buf, TIPC_MAX_LINK_NAME, "null"); - else if (link_is_bc_sndlink(l)) - scnprintf(buf, TIPC_MAX_LINK_NAME, "broadcast-sender"); - else if (link_is_bc_rcvlink(l)) - scnprintf(buf, TIPC_MAX_LINK_NAME, - "broadcast-receiver, peer %x", l->addr); - else - memcpy(buf, l->name, TIPC_MAX_LINK_NAME); - - return buf; -} 
- /** * tipc_link_dump - dump TIPC link data * @l: tipc link to be dumped diff --git a/net/tipc/link.h b/net/tipc/link.h index 4d0768cf91d5..fc07232c9a12 100644 --- a/net/tipc/link.h +++ b/net/tipc/link.h @@ -80,7 +80,7 @@ bool tipc_link_create(struct net *net, char *if_name, int bearer_id, struct sk_buff_head *inputq, struct sk_buff_head *namedq, struct tipc_link **link); -bool tipc_link_bc_create(struct net *net, u32 ownnode, u32 peer, +bool tipc_link_bc_create(struct net *net, u32 ownnode, u32 peer, u8 *peer_id, int mtu, u32 min_win, u32 max_win, u16 peer_caps, struct sk_buff_head *inputq, struct sk_buff_head *namedq, @@ -111,7 +111,6 @@ u16 tipc_link_rcv_nxt(struct tipc_link *l); u16 tipc_link_acked(struct tipc_link *l); u32 tipc_link_id(struct tipc_link *l); char *tipc_link_name(struct tipc_link *l); -char *tipc_link_name_ext(struct tipc_link *l, char *buf); u32 tipc_link_state(struct tipc_link *l); char tipc_link_plane(struct tipc_link *l); int tipc_link_prio(struct tipc_link *l); diff --git a/net/tipc/msg.c b/net/tipc/msg.c index 0d515d20b056..69d68512300a 100644 --- a/net/tipc/msg.c +++ b/net/tipc/msg.c @@ -828,19 +828,19 @@ bool tipc_msg_pskb_copy(u32 dst, struct sk_buff_head *msg, * @seqno: sequence number of buffer to add * @skb: buffer to add */ -void __tipc_skb_queue_sorted(struct sk_buff_head *list, u16 seqno, +bool __tipc_skb_queue_sorted(struct sk_buff_head *list, u16 seqno, struct sk_buff *skb) { struct sk_buff *_skb, *tmp; if (skb_queue_empty(list) || less(seqno, buf_seqno(skb_peek(list)))) { __skb_queue_head(list, skb); - return; + return true; } if (more(seqno, buf_seqno(skb_peek_tail(list)))) { __skb_queue_tail(list, skb); - return; + return true; } skb_queue_walk_safe(list, _skb, tmp) { @@ -849,9 +849,10 @@ void __tipc_skb_queue_sorted(struct sk_buff_head *list, u16 seqno, if (seqno == buf_seqno(_skb)) break; __skb_queue_before(list, _skb, skb); - return; + return true; } kfree_skb(skb); + return false; } void tipc_skb_reject(struct net *net, 
int err, struct sk_buff *skb, diff --git a/net/tipc/msg.h b/net/tipc/msg.h index 9a38f9c9d6eb..87e2d472f75f 100644 --- a/net/tipc/msg.h +++ b/net/tipc/msg.h @@ -1127,7 +1127,7 @@ bool tipc_msg_assemble(struct sk_buff_head *list); bool tipc_msg_reassemble(struct sk_buff_head *list, struct sk_buff_head *rcvq); bool tipc_msg_pskb_copy(u32 dst, struct sk_buff_head *msg, struct sk_buff_head *cpy); -void __tipc_skb_queue_sorted(struct sk_buff_head *list, u16 seqno, +bool __tipc_skb_queue_sorted(struct sk_buff_head *list, u16 seqno, struct sk_buff *skb); bool tipc_msg_skb_clone(struct sk_buff_head *msg, struct sk_buff_head *cpy); diff --git a/net/tipc/netlink.c b/net/tipc/netlink.c index 7c35094c20b8..8dfad18330bc 100644 --- a/net/tipc/netlink.c +++ b/net/tipc/netlink.c @@ -187,7 +187,7 @@ static const struct genl_ops tipc_genl_v2_ops[] = { }, { .cmd = TIPC_NL_LINK_GET, - .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, + .validate = GENL_DONT_VALIDATE_STRICT, .doit = tipc_nl_node_get_link, .dumpit = tipc_nl_node_dump_link, }, diff --git a/net/tipc/node.c b/net/tipc/node.c index 917ad3920fac..373d07ae6730 100644 --- a/net/tipc/node.c +++ b/net/tipc/node.c @@ -1138,7 +1138,7 @@ void tipc_node_check_dest(struct net *net, u32 addr, if (unlikely(!n->bc_entry.link)) { snd_l = tipc_bc_sndlink(net); if (!tipc_link_bc_create(net, tipc_own_addr(net), - addr, U16_MAX, + addr, peer_id, U16_MAX, tipc_link_min_win(snd_l), tipc_link_max_win(snd_l), n->capabilities, @@ -2432,7 +2432,7 @@ int tipc_nl_node_get_link(struct sk_buff *skb, struct genl_info *info) return -ENOMEM; if (strcmp(name, tipc_bclink_name) == 0) { - err = tipc_nl_add_bc_link(net, &msg); + err = tipc_nl_add_bc_link(net, &msg, tipc_net(net)->bcl); if (err) goto err_free; } else { @@ -2476,6 +2476,7 @@ int tipc_nl_node_reset_link_stats(struct sk_buff *skb, struct genl_info *info) struct tipc_node *node; struct nlattr *attrs[TIPC_NLA_LINK_MAX + 1]; struct net *net = sock_net(skb->sk); + struct tipc_net *tn 
= tipc_net(net); struct tipc_link_entry *le; if (!info->attrs[TIPC_NLA_LINK]) @@ -2492,11 +2493,26 @@ int tipc_nl_node_reset_link_stats(struct sk_buff *skb, struct genl_info *info) link_name = nla_data(attrs[TIPC_NLA_LINK_NAME]); - if (strcmp(link_name, tipc_bclink_name) == 0) { - err = tipc_bclink_reset_stats(net); + err = -EINVAL; + if (!strcmp(link_name, tipc_bclink_name)) { + err = tipc_bclink_reset_stats(net, tipc_bc_sndlink(net)); if (err) return err; return 0; + } else if (strstr(link_name, tipc_bclink_name)) { + rcu_read_lock(); + list_for_each_entry_rcu(node, &tn->node_list, list) { + tipc_node_read_lock(node); + link = node->bc_entry.link; + if (link && !strcmp(link_name, tipc_link_name(link))) { + err = tipc_bclink_reset_stats(net, link); + tipc_node_read_unlock(node); + break; + } + tipc_node_read_unlock(node); + } + rcu_read_unlock(); + return err; } node = tipc_node_find_by_name(net, link_name, &bearer_id); @@ -2520,7 +2536,8 @@ int tipc_nl_node_reset_link_stats(struct sk_buff *skb, struct genl_info *info) /* Caller should hold node lock */ static int __tipc_nl_add_node_links(struct net *net, struct tipc_nl_msg *msg, - struct tipc_node *node, u32 *prev_link) + struct tipc_node *node, u32 *prev_link, + u32 type) { u32 i; int err; @@ -2536,6 +2553,14 @@ static int __tipc_nl_add_node_links(struct net *net, struct tipc_nl_msg *msg, if (err) return err; } + + if (type == 2) { + *prev_link = 3; + err = tipc_nl_add_bc_link(net, msg, node->bc_entry.link); + if (err) + return err; + } + *prev_link = 0; return 0; @@ -2544,17 +2569,38 @@ static int __tipc_nl_add_node_links(struct net *net, struct tipc_nl_msg *msg, int tipc_nl_node_dump_link(struct sk_buff *skb, struct netlink_callback *cb) { struct net *net = sock_net(skb->sk); + struct nlattr **attrs = genl_dumpit_info(cb)->attrs; + struct nlattr *link[TIPC_NLA_LINK_MAX + 1]; struct tipc_net *tn = net_generic(net, tipc_net_id); struct tipc_node *node; struct tipc_nl_msg msg; u32 prev_node = cb->args[0]; u32 
prev_link = cb->args[1]; int done = cb->args[2]; + u32 type = cb->args[3]; int err; if (done) return 0; + if (!type) { + /* Dump broadcast-link & unicast links */ + type = 1; + if (attrs && attrs[TIPC_NLA_LINK]) { + err = nla_parse_nested_deprecated(link, + TIPC_NLA_LINK_MAX, + attrs[TIPC_NLA_LINK], + tipc_nl_link_policy, + NULL); + if (unlikely(err)) + return err; + if (unlikely(!link[TIPC_NLA_LINK_BROADCAST])) + return -EINVAL; + /* Dump broadcast-receiver links as well */ + type = 2; + } + } + msg.skb = skb; msg.portid = NETLINK_CB(cb->skb).portid; msg.seq = cb->nlh->nlmsg_seq; @@ -2578,7 +2624,7 @@ int tipc_nl_node_dump_link(struct sk_buff *skb, struct netlink_callback *cb) list) { tipc_node_read_lock(node); err = __tipc_nl_add_node_links(net, &msg, node, - &prev_link); + &prev_link, type); tipc_node_read_unlock(node); if (err) goto out; @@ -2586,14 +2632,14 @@ int tipc_nl_node_dump_link(struct sk_buff *skb, struct netlink_callback *cb) prev_node = node->addr; } } else { - err = tipc_nl_add_bc_link(net, &msg); + err = tipc_nl_add_bc_link(net, &msg, tn->bcl); if (err) goto out; list_for_each_entry_rcu(node, &tn->node_list, list) { tipc_node_read_lock(node); err = __tipc_nl_add_node_links(net, &msg, node, - &prev_link); + &prev_link, type); tipc_node_read_unlock(node); if (err) goto out; @@ -2608,6 +2654,7 @@ int tipc_nl_node_dump_link(struct sk_buff *skb, struct netlink_callback *cb) cb->args[0] = prev_node; cb->args[1] = prev_link; cb->args[2] = done; + cb->args[3] = type; return skb->len; } diff --git a/net/tipc/trace.h b/net/tipc/trace.h index e7535ab75255..04af83f0500c 100644 --- a/net/tipc/trace.h +++ b/net/tipc/trace.h @@ -255,7 +255,7 @@ DECLARE_EVENT_CLASS(tipc_link_class, TP_fast_assign( __assign_str(header, header); - tipc_link_name_ext(l, __entry->name); + memcpy(__entry->name, tipc_link_name(l), TIPC_MAX_LINK_NAME); tipc_link_dump(l, dqueues, __get_str(buf)); ), @@ -295,7 +295,7 @@ DECLARE_EVENT_CLASS(tipc_link_transmq_class, ), TP_fast_assign( - 
tipc_link_name_ext(r, __entry->name); + memcpy(__entry->name, tipc_link_name(r), TIPC_MAX_LINK_NAME); __entry->from = f; __entry->to = t; __entry->len = skb_queue_len(tq); -- 2.13.7 |
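The msg.c hunk above changes `__tipc_skb_queue_sorted()` to return a bool so that the caller in `tipc_link_rcv()` can count a rejected insert as a duplicate. Below is a minimal user-space sketch of that idea, not the kernel code: it assumes a small fixed-size array in place of the `sk_buff` list, and the names `defq`, `defq_insert_sorted`, `rcv_stats` and `defer_packet` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEFQ_MAX 64	/* hypothetical capacity, the kernel list has none */

struct defq {
	uint16_t seqno[DEFQ_MAX];	/* kept sorted in sequence order */
	size_t len;
};

struct rcv_stats {
	unsigned int duplicates;
};

/* Wrap-aware "a strictly before b" for 16-bit sequence numbers */
static bool seq_less(uint16_t a, uint16_t b)
{
	return a != b && (uint16_t)(b - a) < 32768u;
}

/* Insert seqno at its sorted position. Returns true if it was queued,
 * false if an entry with the same seqno is already present (or, in this
 * fixed-size sketch only, if the array is full).
 */
static bool defq_insert_sorted(struct defq *q, uint16_t seqno)
{
	size_t i, pos = q->len;

	for (i = 0; i < q->len; i++) {
		if (q->seqno[i] == seqno)
			return false;		/* duplicate, drop it */
		if (seq_less(seqno, q->seqno[i])) {
			pos = i;		/* first entry that follows seqno */
			break;
		}
	}
	if (q->len == DEFQ_MAX)
		return false;

	/* Shift the tail up by one slot and place the new entry */
	for (i = q->len; i > pos; i--)
		q->seqno[i] = q->seqno[i - 1];
	q->seqno[pos] = seqno;
	q->len++;
	return true;
}

/* Caller side: a failed insert is accounted as a duplicate */
static void defer_packet(struct defq *q, struct rcv_stats *st, uint16_t seqno)
{
	if (!defq_insert_sorted(q, seqno))
		st->duplicates++;
}
```

Deferring 12, 10, 11 and then 11 again leaves the queue as 10, 11, 12 with the duplicates counter at 1.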
From: Tuong L. <tuo...@de...> - 2020-03-28 04:03:40
Hi Jon, all,

Please find the full series here:
+ For the 1st patch: it is really the last one I sent before, so you have
  already acked it.
+ For the other ones, please help take a look.

Also, I will send another patch for iproute2/tipc, which is the user-space
part of the last patch in this series, i.e. broadcast rcv stats dumping.

Thanks a lot!

Tuong Lien (4):
  tipc: introduce Gap ACK blocks for broadcast link
  tipc: add back link trace events
  tipc: enable broadcast retrans via unicast
  tipc: add support for broadcast rcv stats dumping

 net/tipc/bcast.c   |  22 ++-
 net/tipc/bcast.h   |   9 +-
 net/tipc/link.c    | 500 +++++++++++++++++++++++++++++++----------------------
 net/tipc/link.h    |  11 +-
 net/tipc/msg.c     |   9 +-
 net/tipc/msg.h     |  16 +-
 net/tipc/netlink.c |   2 +-
 net/tipc/node.c    |  75 ++++++--
 net/tipc/sysctl.c  |   9 +-
 net/tipc/trace.h   |  17 +-
 10 files changed, 424 insertions(+), 246 deletions(-)

-- 
2.13.7
From: Tuong L. <tuo...@de...> - 2020-03-28 04:03:39
In the previous commit ("tipc: add Gap ACK blocks support for broadcast
link"), we have removed the following link trace events due to the code
changes:

- tipc_link_bc_ack
- tipc_link_retrans

This commit adds them back along with some minor changes to adapt to
the new code.

Signed-off-by: Tuong Lien <tuo...@de...>
---
 net/tipc/link.c  |  3 +++
 net/tipc/trace.h | 13 ++++++++-----
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/net/tipc/link.c b/net/tipc/link.c
index 1b60ba665504..405ccf597e59 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1517,6 +1517,8 @@ static int tipc_link_advance_transmq(struct tipc_link *l, struct tipc_link *r,
 	bool is_uc = !link_is_bc_sndlink(l);
 	bool bc_has_acked = false;
 
+	trace_tipc_link_retrans(r, acked + 1, acked + gap, &l->transmq);
+
 	/* Determine Gap ACK blocks if any for the particular link */
 	if (ga && is_uc) {
 		/* Get the Gap ACKs, uc part */
@@ -2423,6 +2425,7 @@ int tipc_link_bc_ack_rcv(struct tipc_link *r, u16 acked, u16 gap,
 	if (less(acked, r->acked) || (acked == r->acked && !gap && !ga))
 		return 0;
 
+	trace_tipc_link_bc_ack(r, acked, gap, &l->transmq);
 	tipc_link_advance_transmq(l, r, acked, gap, ga, xmitq, &unused, &rc);
 
 	tipc_link_advance_backlog(l, xmitq);
diff --git a/net/tipc/trace.h b/net/tipc/trace.h
index 4d8e00483afc..e7535ab75255 100644
--- a/net/tipc/trace.h
+++ b/net/tipc/trace.h
@@ -299,8 +299,10 @@ DECLARE_EVENT_CLASS(tipc_link_transmq_class,
 		__entry->from = f;
 		__entry->to = t;
 		__entry->len = skb_queue_len(tq);
-		__entry->fseqno = msg_seqno(buf_msg(skb_peek(tq)));
-		__entry->lseqno = msg_seqno(buf_msg(skb_peek_tail(tq)));
+		__entry->fseqno = __entry->len ?
+				  msg_seqno(buf_msg(skb_peek(tq))) : 0;
+		__entry->lseqno = __entry->len ?
+				  msg_seqno(buf_msg(skb_peek_tail(tq))) : 0;
 	),
 
 	TP_printk("<%s> retrans req: [%u-%u] transmq: %u [%u-%u]\n",
@@ -308,15 +310,16 @@ DECLARE_EVENT_CLASS(tipc_link_transmq_class,
 		  __entry->len, __entry->fseqno, __entry->lseqno)
 );
 
-DEFINE_EVENT(tipc_link_transmq_class, tipc_link_retrans,
+DEFINE_EVENT_CONDITION(tipc_link_transmq_class, tipc_link_retrans,
 	TP_PROTO(struct tipc_link *r, u16 f, u16 t, struct sk_buff_head *tq),
-	TP_ARGS(r, f, t, tq)
+	TP_ARGS(r, f, t, tq),
+	TP_CONDITION(less_eq(f, t))
 );
 
 DEFINE_EVENT_PRINT(tipc_link_transmq_class, tipc_link_bc_ack,
 	TP_PROTO(struct tipc_link *r, u16 f, u16 t, struct sk_buff_head *tq),
 	TP_ARGS(r, f, t, tq),
-	TP_printk("<%s> acked: [%u-%u] transmq: %u [%u-%u]\n",
+	TP_printk("<%s> acked: %u gap: %u transmq: %u [%u-%u]\n",
 		  __entry->name, __entry->from, __entry->to,
 		  __entry->len, __entry->fseqno, __entry->lseqno)
 );
-- 
2.13.7
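The `TP_CONDITION(less_eq(f, t))` guard in the patch above makes the retrans event fire only when the requested range `[acked + 1, acked + gap]` is non-empty. A minimal user-space sketch of why that works with 16-bit sequence numbers follows. It assumes `less_eq()` is the usual wrap-aware comparison (right is less than half the sequence space ahead of left); `seq_less_eq` and `retrans_range_is_valid` are hypothetical names, not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wrap-aware "left <= right" for 16-bit sequence numbers: true when
 * right is at most 32767 steps ahead of left, modulo 65536.
 * Assumption: this mirrors what the kernel's less_eq() helper does.
 */
static bool seq_less_eq(uint16_t left, uint16_t right)
{
	return (uint16_t)(right - left) < 32768u;
}

/* Hypothetical helper mirroring the trace call site: the requested
 * range is [acked + 1, acked + gap]. With gap == 0 the upper bound is
 * one step behind the lower bound, so the comparison fails and a
 * conditional event would stay silent.
 */
static bool retrans_range_is_valid(uint16_t acked, uint16_t gap)
{
	uint16_t from = (uint16_t)(acked + 1);
	uint16_t to = (uint16_t)(acked + gap);

	return seq_less_eq(from, to);
}
```

With `acked = 100` the range is rejected for `gap = 0` and accepted for `gap = 1`; a range that crosses the 65535 to 0 wrap, such as `acked = 65534, gap = 3`, is still accepted.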
From: Hoang H. Le <hoa...@de...> - 2020-03-27 14:10:32
Yes, I got the same results.

Hoang

-----Original Message-----
From: Jon Maloy <jm...@re...>
Sent: Friday, March 27, 2020 8:43 PM
To: Tuong Tong Lien <tuo...@de...>; ma...@do...; yin...@wi...; tip...@li...
Cc: tip...@de...
Subject: Re: [tipc-discussion] [PATCH RFC 2/4] tipc: add back link trace events

I received [2/4], [3/4] and [4/4] of this series but no [0/4] and [1/4].
This is the case both for my redhat account and my private account, so I
assume the problem is at your end.
Please re-post.

///jon
From: Jon M. <jm...@re...> - 2020-03-27 13:43:32
I received [2/4], [3/4] and [4/4] of this series but no [0/4] and [1/4].
This is the case both for my redhat account and my private account, so I
assume the problem is at your end.
Please re-post.

///jon

On 3/27/20 7:56 AM, Tuong Lien wrote:
> In the previous commit ("tipc: add Gap ACK blocks support for broadcast
> link"), we have removed the following link trace events due to the code
> changes:
>
> - tipc_link_bc_ack
> - tipc_link_retrans
>
> This commit adds them back along with some minor changes to adapt to
> the new code.
>
> Signed-off-by: Tuong Lien <tuo...@de...>
> ---
>  net/tipc/link.c  |  3 +++
>  net/tipc/trace.h | 13 ++++++++-----
>  2 files changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/net/tipc/link.c b/net/tipc/link.c
> index 1b60ba665504..405ccf597e59 100644
> --- a/net/tipc/link.c
> +++ b/net/tipc/link.c
> @@ -1517,6 +1517,8 @@ static int tipc_link_advance_transmq(struct tipc_link *l, struct tipc_link *r,
>  	bool is_uc = !link_is_bc_sndlink(l);
>  	bool bc_has_acked = false;
>
> +	trace_tipc_link_retrans(r, acked + 1, acked + gap, &l->transmq);
> +
>  	/* Determine Gap ACK blocks if any for the particular link */
>  	if (ga && is_uc) {
>  		/* Get the Gap ACKs, uc part */
> @@ -2423,6 +2425,7 @@ int tipc_link_bc_ack_rcv(struct tipc_link *r, u16 acked, u16 gap,
>  	if (less(acked, r->acked) || (acked == r->acked && !gap && !ga))
>  		return 0;
>
> +	trace_tipc_link_bc_ack(r, acked, gap, &l->transmq);
>  	tipc_link_advance_transmq(l, r, acked, gap, ga, xmitq, &unused, &rc);
>
>  	tipc_link_advance_backlog(l, xmitq);
> diff --git a/net/tipc/trace.h b/net/tipc/trace.h
> index 4d8e00483afc..e7535ab75255 100644
> --- a/net/tipc/trace.h
> +++ b/net/tipc/trace.h
> @@ -299,8 +299,10 @@ DECLARE_EVENT_CLASS(tipc_link_transmq_class,
>  		__entry->from = f;
>  		__entry->to = t;
>  		__entry->len = skb_queue_len(tq);
> -		__entry->fseqno = msg_seqno(buf_msg(skb_peek(tq)));
> -		__entry->lseqno = msg_seqno(buf_msg(skb_peek_tail(tq)));
> +		__entry->fseqno = __entry->len ?
> +				  msg_seqno(buf_msg(skb_peek(tq))) : 0;
> +		__entry->lseqno = __entry->len ?
> +				  msg_seqno(buf_msg(skb_peek_tail(tq))) : 0;
>  	),
>
>  	TP_printk("<%s> retrans req: [%u-%u] transmq: %u [%u-%u]\n",
> @@ -308,15 +310,16 @@ DECLARE_EVENT_CLASS(tipc_link_transmq_class,
>  		  __entry->len, __entry->fseqno, __entry->lseqno)
>  );
>
> -DEFINE_EVENT(tipc_link_transmq_class, tipc_link_retrans,
> +DEFINE_EVENT_CONDITION(tipc_link_transmq_class, tipc_link_retrans,
>  	TP_PROTO(struct tipc_link *r, u16 f, u16 t, struct sk_buff_head *tq),
> -	TP_ARGS(r, f, t, tq)
> +	TP_ARGS(r, f, t, tq),
> +	TP_CONDITION(less_eq(f, t))
>  );
>
>  DEFINE_EVENT_PRINT(tipc_link_transmq_class, tipc_link_bc_ack,
>  	TP_PROTO(struct tipc_link *r, u16 f, u16 t, struct sk_buff_head *tq),
>  	TP_ARGS(r, f, t, tq),
> -	TP_printk("<%s> acked: [%u-%u] transmq: %u [%u-%u]\n",
> +	TP_printk("<%s> acked: %u gap: %u transmq: %u [%u-%u]\n",
>  		  __entry->name, __entry->from, __entry->to,
>  		  __entry->len, __entry->fseqno, __entry->lseqno)
>  );
From: Tuong L. <tuo...@de...> - 2020-03-27 12:31:31
In commit 16ad3f4022bb ("tipc: introduce variable window congestion
control"), we allow the link window to change with the congestion
avoidance algorithm. However, there is a bug: if packet retransmission
occurs during slow start, the link enters the fast-recovery phase and
sets its window to the 'ssthresh', which is never less than 300, so the
link window suddenly increases to that limit instead of decreasing.

Consequently, two issues have been observed:

- For broadcast-link: it can leave a gap between the link queues, so
that a new packet is inserted and sent before the previous ones, i.e.
out of order.

- For unicast: the algorithm does not work as expected; the link window
jumps to the slow-start threshold even though packet retransmission has
occurred.

This commit fixes the issues by avoiding such a link window increase,
while still decreasing the window if the 'ssthresh' is lowered.

Signed-off-by: Tuong Lien <tuo...@de...>
---
 net/tipc/link.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/tipc/link.c b/net/tipc/link.c
index 467c53a1fb5c..d4675e922a8f 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1065,7 +1065,7 @@ static void tipc_link_update_cwin(struct tipc_link *l, int released,
 	/* Enter fast recovery */
 	if (unlikely(retransmitted)) {
 		l->ssthresh = max_t(u16, l->window / 2, 300);
-		l->window = l->ssthresh;
+		l->window = min_t(u16, l->ssthresh, l->window);
 		return;
 	}
 	/* Enter slow start */
-- 
2.13.7
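The effect of the one-line change can be checked with plain arithmetic. The sketch below is user-space C, not the kernel function: it reproduces only the fast-recovery branch from the diff, and the helper names `window_after_retrans_old` and `window_after_retrans_new` are hypothetical.

```c
#include <stdint.h>

#define SSTHRESH_FLOOR 300u	/* lower bound for ssthresh, taken from the diff */

static uint16_t max_u16(uint16_t a, uint16_t b) { return a > b ? a : b; }
static uint16_t min_u16(uint16_t a, uint16_t b) { return a < b ? a : b; }

/* Before the fix: the window is set to ssthresh unconditionally, so a
 * window below 300 grows when a retransmission is seen.
 */
static uint16_t window_after_retrans_old(uint16_t window)
{
	uint16_t ssthresh = max_u16((uint16_t)(window / 2), SSTHRESH_FLOOR);

	return ssthresh;
}

/* After the fix: the window is clamped to ssthresh but never raised */
static uint16_t window_after_retrans_new(uint16_t window)
{
	uint16_t ssthresh = max_u16((uint16_t)(window / 2), SSTHRESH_FLOOR);

	return min_u16(ssthresh, window);
}
```

For a link still in slow start with a window of 50, the old code yields 300 while the new code keeps 50. For a window of 800, both yield 400, so the intended decrease is preserved.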
From: Tuong L. <tuo...@de...> - 2020-03-27 11:57:01
This commit enables dumping the statistics of a broadcast-receiver link
like the traditional 'broadcast-link' one (which is for the
broadcast-sender). The link dumping can be triggered via netlink (e.g.
the iproute2/tipc tool), using the link flag 'TIPC_NLA_LINK_BROADCAST'
as the indicator.

The name of a broadcast-receiver link of a specific peer will be in the
format: 'broadcast-link:<peer-id>'.

For example:

Link <broadcast-link:1001002>
  Window:50 packets
  RX packets:7841 fragments:2408/440 bundles:0/0
  TX packets:0 fragments:0/0 bundles:0/0
  RX naks:0 defs:124 dups:0
  TX naks:21 acks:0 retrans:0
  Congestion link:0  Send queue max:0 avg:0

In addition, the broadcast-receiver link statistics can be reset in the
usual way via netlink by specifying that link name in the command.

Note: 'tipc_link_name_ext()' is removed because the link name can now
be retrieved simply via 'l->name'.

Signed-off-by: Tuong Lien <tuo...@de...>
---
 net/tipc/bcast.c   |  6 ++---
 net/tipc/bcast.h   |  5 +++--
 net/tipc/link.c    | 65 +++++++++++++++++++++++++++---------------------------
 net/tipc/link.h    |  3 +--
 net/tipc/msg.c     |  9 ++++----
 net/tipc/msg.h     |  2 +-
 net/tipc/netlink.c |  2 +-
 net/tipc/node.c    | 63 +++++++++++++++++++++++++++++++++++++++++++++-------
 net/tipc/trace.h   |  4 ++--
 9 files changed, 103 insertions(+), 56 deletions(-)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 50a16f8bebd9..383f87bc1061 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -563,10 +563,8 @@ void tipc_bcast_remove_peer(struct net *net, struct tipc_link *rcv_l)
 	tipc_sk_rcv(net, inputq);
 }
 
-int tipc_bclink_reset_stats(struct net *net)
+int tipc_bclink_reset_stats(struct net *net, struct tipc_link *l)
 {
-	struct tipc_link *l = tipc_bc_sndlink(net);
-
 	if (!l)
 		return -ENOPROTOOPT;
 
@@ -694,7 +692,7 @@ int tipc_bcast_init(struct net *net)
 	tn->bcbase = bb;
 	spin_lock_init(&tipc_net(net)->bclock);
 
-	if (!tipc_link_bc_create(net, 0, 0,
+	if (!tipc_link_bc_create(net, 0, 0, NULL,
 				 FB_MTU,
 				 BCLINK_WIN_DEFAULT,
BCLINK_WIN_DEFAULT, diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h index 97d3cf9d3e4d..4240c95188b1 100644 --- a/net/tipc/bcast.h +++ b/net/tipc/bcast.h @@ -96,9 +96,10 @@ void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l, int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l, struct tipc_msg *hdr, struct sk_buff_head *retrq); -int tipc_nl_add_bc_link(struct net *net, struct tipc_nl_msg *msg); +int tipc_nl_add_bc_link(struct net *net, struct tipc_nl_msg *msg, + struct tipc_link *bcl); int tipc_nl_bc_link_set(struct net *net, struct nlattr *attrs[]); -int tipc_bclink_reset_stats(struct net *net); +int tipc_bclink_reset_stats(struct net *net, struct tipc_link *l); u32 tipc_bcast_get_broadcast_mode(struct net *net); u32 tipc_bcast_get_broadcast_ratio(struct net *net); diff --git a/net/tipc/link.c b/net/tipc/link.c index 3071e46f029a..808d3a76c27f 100644 --- a/net/tipc/link.c +++ b/net/tipc/link.c @@ -539,7 +539,7 @@ bool tipc_link_create(struct net *net, char *if_name, int bearer_id, * * Returns true if link was created, otherwise false */ -bool tipc_link_bc_create(struct net *net, u32 ownnode, u32 peer, +bool tipc_link_bc_create(struct net *net, u32 ownnode, u32 peer, u8 *peer_id, int mtu, u32 min_win, u32 max_win, u16 peer_caps, struct sk_buff_head *inputq, struct sk_buff_head *namedq, @@ -554,7 +554,18 @@ bool tipc_link_bc_create(struct net *net, u32 ownnode, u32 peer, return false; l = *link; - strcpy(l->name, tipc_bclink_name); + if (peer_id) { + char peer_str[NODE_ID_STR_LEN] = {0,}; + + tipc_nodeid2string(peer_str, peer_id); + if (strlen(peer_str) > 16) + sprintf(peer_str, "%x", peer); + /* Broadcast receiver link name: "broadcast-link:<peer>" */ + snprintf(l->name, sizeof(l->name), "%s:%s", tipc_bclink_name, + peer_str); + } else { + strcpy(l->name, tipc_bclink_name); + } trace_tipc_link_reset(l, TIPC_DUMP_ALL, "bclink created!"); tipc_link_reset(l); l->state = LINK_RESET; @@ -1412,11 +1423,8 @@ static u8 __tipc_build_gap_ack_blks(struct 
tipc_gap_ack_blks *ga, gacks[n].ack = htons(expect - 1); gacks[n].gap = htons(seqno - expect); if (++n >= MAX_GAP_ACK_BLKS / 2) { - char buf[TIPC_MAX_LINK_NAME]; - pr_info_ratelimited("Gacks on %s: %d, ql: %d!\n", - tipc_link_name_ext(l, buf), - n, + l->name, n, skb_queue_len(&l->deferdq)); return n; } @@ -1600,6 +1608,8 @@ static int tipc_link_advance_transmq(struct tipc_link *l, struct tipc_link *r, _skb->priority = TC_PRIO_CONTROL; __skb_queue_tail(xmitq, _skb); l->stats.retransmitted++; + if (!is_uc) + r->stats.retransmitted++; *retransmitted = true; /* Increase actual retrans counter & mark first time */ if (!TIPC_SKB_CB(skb)->retr_cnt++) @@ -1766,7 +1776,8 @@ int tipc_link_rcv(struct tipc_link *l, struct sk_buff *skb, /* Defer delivery if sequence gap */ if (unlikely(seqno != rcv_nxt)) { - __tipc_skb_queue_sorted(defq, seqno, skb); + if (!__tipc_skb_queue_sorted(defq, seqno, skb)) + l->stats.duplicates++; rc |= tipc_link_build_nack_msg(l, xmitq); break; } @@ -1800,15 +1811,15 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe, int tolerance, int priority, struct sk_buff_head *xmitq) { + struct tipc_mon_state *mstate = &l->mon_state; + struct sk_buff_head *dfq = &l->deferdq; struct tipc_link *bcl = l->bc_rcvlink; - struct sk_buff *skb; struct tipc_msg *hdr; - struct sk_buff_head *dfq = &l->deferdq; + struct sk_buff *skb; bool node_up = link_is_up(bcl); - struct tipc_mon_state *mstate = &l->mon_state; + u16 glen = 0, bc_rcvgap = 0; int dlen = 0; void *data; - u16 glen = 0; /* Don't send protocol message during reset or link failover */ if (tipc_link_is_blocked(l)) @@ -1846,7 +1857,8 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe, if (l->peer_caps & TIPC_LINK_PROTO_SEQNO) msg_set_seqno(hdr, l->snd_nxt_state++); msg_set_seq_gap(hdr, rcvgap); - msg_set_bc_gap(hdr, link_bc_rcv_gap(bcl)); + bc_rcvgap = link_bc_rcv_gap(bcl); + msg_set_bc_gap(hdr, bc_rcvgap); msg_set_probe(hdr, probe); 
msg_set_is_keepalive(hdr, probe || probe_reply); if (l->peer_caps & TIPC_GAP_ACK_BLOCK) @@ -1871,6 +1883,8 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe, l->stats.sent_probes++; if (rcvgap) l->stats.sent_nacks++; + if (bc_rcvgap) + bcl->stats.sent_nacks++; skb->priority = TC_PRIO_CONTROL; __skb_queue_tail(xmitq, skb); trace_tipc_proto_build(skb, false, l->name); @@ -2371,8 +2385,6 @@ int tipc_link_bc_sync_rcv(struct tipc_link *l, struct tipc_msg *hdr, if (!l->bc_peer_is_up) return rc; - l->stats.recv_nacks++; - /* Ignore if peers_snd_nxt goes beyond receive window */ if (more(peers_snd_nxt, l->rcv_nxt + l->window)) return rc; @@ -2423,6 +2435,11 @@ int tipc_link_bc_ack_rcv(struct tipc_link *r, u16 acked, u16 gap, if (!link_is_up(r) || !r->bc_peer_is_up) return 0; + if (gap) { + l->stats.recv_nacks++; + r->stats.recv_nacks++; + } + if (less(acked, r->acked) || (acked == r->acked && !gap && !ga)) return 0; @@ -2734,16 +2751,15 @@ static int __tipc_nl_add_bc_link_stat(struct sk_buff *skb, return -EMSGSIZE; } -int tipc_nl_add_bc_link(struct net *net, struct tipc_nl_msg *msg) +int tipc_nl_add_bc_link(struct net *net, struct tipc_nl_msg *msg, + struct tipc_link *bcl) { int err; void *hdr; struct nlattr *attrs; struct nlattr *prop; - struct tipc_net *tn = net_generic(net, tipc_net_id); u32 bc_mode = tipc_bcast_get_broadcast_mode(net); u32 bc_ratio = tipc_bcast_get_broadcast_ratio(net); - struct tipc_link *bcl = tn->bcl; if (!bcl) return 0; @@ -2830,21 +2846,6 @@ void tipc_link_set_abort_limit(struct tipc_link *l, u32 limit) l->abort_limit = limit; } -char *tipc_link_name_ext(struct tipc_link *l, char *buf) -{ - if (!l) - scnprintf(buf, TIPC_MAX_LINK_NAME, "null"); - else if (link_is_bc_sndlink(l)) - scnprintf(buf, TIPC_MAX_LINK_NAME, "broadcast-sender"); - else if (link_is_bc_rcvlink(l)) - scnprintf(buf, TIPC_MAX_LINK_NAME, - "broadcast-receiver, peer %x", l->addr); - else - memcpy(buf, l->name, TIPC_MAX_LINK_NAME); - - return buf; -} 
- /** * tipc_link_dump - dump TIPC link data * @l: tipc link to be dumped diff --git a/net/tipc/link.h b/net/tipc/link.h index 4d0768cf91d5..fc07232c9a12 100644 --- a/net/tipc/link.h +++ b/net/tipc/link.h @@ -80,7 +80,7 @@ bool tipc_link_create(struct net *net, char *if_name, int bearer_id, struct sk_buff_head *inputq, struct sk_buff_head *namedq, struct tipc_link **link); -bool tipc_link_bc_create(struct net *net, u32 ownnode, u32 peer, +bool tipc_link_bc_create(struct net *net, u32 ownnode, u32 peer, u8 *peer_id, int mtu, u32 min_win, u32 max_win, u16 peer_caps, struct sk_buff_head *inputq, struct sk_buff_head *namedq, @@ -111,7 +111,6 @@ u16 tipc_link_rcv_nxt(struct tipc_link *l); u16 tipc_link_acked(struct tipc_link *l); u32 tipc_link_id(struct tipc_link *l); char *tipc_link_name(struct tipc_link *l); -char *tipc_link_name_ext(struct tipc_link *l, char *buf); u32 tipc_link_state(struct tipc_link *l); char tipc_link_plane(struct tipc_link *l); int tipc_link_prio(struct tipc_link *l); diff --git a/net/tipc/msg.c b/net/tipc/msg.c index 0d515d20b056..69d68512300a 100644 --- a/net/tipc/msg.c +++ b/net/tipc/msg.c @@ -828,19 +828,19 @@ bool tipc_msg_pskb_copy(u32 dst, struct sk_buff_head *msg, * @seqno: sequence number of buffer to add * @skb: buffer to add */ -void __tipc_skb_queue_sorted(struct sk_buff_head *list, u16 seqno, +bool __tipc_skb_queue_sorted(struct sk_buff_head *list, u16 seqno, struct sk_buff *skb) { struct sk_buff *_skb, *tmp; if (skb_queue_empty(list) || less(seqno, buf_seqno(skb_peek(list)))) { __skb_queue_head(list, skb); - return; + return true; } if (more(seqno, buf_seqno(skb_peek_tail(list)))) { __skb_queue_tail(list, skb); - return; + return true; } skb_queue_walk_safe(list, _skb, tmp) { @@ -849,9 +849,10 @@ void __tipc_skb_queue_sorted(struct sk_buff_head *list, u16 seqno, if (seqno == buf_seqno(_skb)) break; __skb_queue_before(list, _skb, skb); - return; + return true; } kfree_skb(skb); + return false; } void tipc_skb_reject(struct net *net, 
int err, struct sk_buff *skb, diff --git a/net/tipc/msg.h b/net/tipc/msg.h index 9a38f9c9d6eb..87e2d472f75f 100644 --- a/net/tipc/msg.h +++ b/net/tipc/msg.h @@ -1127,7 +1127,7 @@ bool tipc_msg_assemble(struct sk_buff_head *list); bool tipc_msg_reassemble(struct sk_buff_head *list, struct sk_buff_head *rcvq); bool tipc_msg_pskb_copy(u32 dst, struct sk_buff_head *msg, struct sk_buff_head *cpy); -void __tipc_skb_queue_sorted(struct sk_buff_head *list, u16 seqno, +bool __tipc_skb_queue_sorted(struct sk_buff_head *list, u16 seqno, struct sk_buff *skb); bool tipc_msg_skb_clone(struct sk_buff_head *msg, struct sk_buff_head *cpy); diff --git a/net/tipc/netlink.c b/net/tipc/netlink.c index 7c35094c20b8..8dfad18330bc 100644 --- a/net/tipc/netlink.c +++ b/net/tipc/netlink.c @@ -187,7 +187,7 @@ static const struct genl_ops tipc_genl_v2_ops[] = { }, { .cmd = TIPC_NL_LINK_GET, - .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, + .validate = GENL_DONT_VALIDATE_STRICT, .doit = tipc_nl_node_get_link, .dumpit = tipc_nl_node_dump_link, }, diff --git a/net/tipc/node.c b/net/tipc/node.c index 917ad3920fac..373d07ae6730 100644 --- a/net/tipc/node.c +++ b/net/tipc/node.c @@ -1138,7 +1138,7 @@ void tipc_node_check_dest(struct net *net, u32 addr, if (unlikely(!n->bc_entry.link)) { snd_l = tipc_bc_sndlink(net); if (!tipc_link_bc_create(net, tipc_own_addr(net), - addr, U16_MAX, + addr, peer_id, U16_MAX, tipc_link_min_win(snd_l), tipc_link_max_win(snd_l), n->capabilities, @@ -2432,7 +2432,7 @@ int tipc_nl_node_get_link(struct sk_buff *skb, struct genl_info *info) return -ENOMEM; if (strcmp(name, tipc_bclink_name) == 0) { - err = tipc_nl_add_bc_link(net, &msg); + err = tipc_nl_add_bc_link(net, &msg, tipc_net(net)->bcl); if (err) goto err_free; } else { @@ -2476,6 +2476,7 @@ int tipc_nl_node_reset_link_stats(struct sk_buff *skb, struct genl_info *info) struct tipc_node *node; struct nlattr *attrs[TIPC_NLA_LINK_MAX + 1]; struct net *net = sock_net(skb->sk); + struct tipc_net *tn 
= tipc_net(net); struct tipc_link_entry *le; if (!info->attrs[TIPC_NLA_LINK]) @@ -2492,11 +2493,26 @@ int tipc_nl_node_reset_link_stats(struct sk_buff *skb, struct genl_info *info) link_name = nla_data(attrs[TIPC_NLA_LINK_NAME]); - if (strcmp(link_name, tipc_bclink_name) == 0) { - err = tipc_bclink_reset_stats(net); + err = -EINVAL; + if (!strcmp(link_name, tipc_bclink_name)) { + err = tipc_bclink_reset_stats(net, tipc_bc_sndlink(net)); if (err) return err; return 0; + } else if (strstr(link_name, tipc_bclink_name)) { + rcu_read_lock(); + list_for_each_entry_rcu(node, &tn->node_list, list) { + tipc_node_read_lock(node); + link = node->bc_entry.link; + if (link && !strcmp(link_name, tipc_link_name(link))) { + err = tipc_bclink_reset_stats(net, link); + tipc_node_read_unlock(node); + break; + } + tipc_node_read_unlock(node); + } + rcu_read_unlock(); + return err; } node = tipc_node_find_by_name(net, link_name, &bearer_id); @@ -2520,7 +2536,8 @@ int tipc_nl_node_reset_link_stats(struct sk_buff *skb, struct genl_info *info) /* Caller should hold node lock */ static int __tipc_nl_add_node_links(struct net *net, struct tipc_nl_msg *msg, - struct tipc_node *node, u32 *prev_link) + struct tipc_node *node, u32 *prev_link, + u32 type) { u32 i; int err; @@ -2536,6 +2553,14 @@ static int __tipc_nl_add_node_links(struct net *net, struct tipc_nl_msg *msg, if (err) return err; } + + if (type == 2) { + *prev_link = 3; + err = tipc_nl_add_bc_link(net, msg, node->bc_entry.link); + if (err) + return err; + } + *prev_link = 0; return 0; @@ -2544,17 +2569,38 @@ static int __tipc_nl_add_node_links(struct net *net, struct tipc_nl_msg *msg, int tipc_nl_node_dump_link(struct sk_buff *skb, struct netlink_callback *cb) { struct net *net = sock_net(skb->sk); + struct nlattr **attrs = genl_dumpit_info(cb)->attrs; + struct nlattr *link[TIPC_NLA_LINK_MAX + 1]; struct tipc_net *tn = net_generic(net, tipc_net_id); struct tipc_node *node; struct tipc_nl_msg msg; u32 prev_node = cb->args[0]; u32 
prev_link = cb->args[1]; int done = cb->args[2]; + u32 type = cb->args[3]; int err; if (done) return 0; + if (!type) { + /* Dump broadcast-link & unicast links */ + type = 1; + if (attrs && attrs[TIPC_NLA_LINK]) { + err = nla_parse_nested_deprecated(link, + TIPC_NLA_LINK_MAX, + attrs[TIPC_NLA_LINK], + tipc_nl_link_policy, + NULL); + if (unlikely(err)) + return err; + if (unlikely(!link[TIPC_NLA_LINK_BROADCAST])) + return -EINVAL; + /* Dump broadcast-receiver links as well */ + type = 2; + } + } + msg.skb = skb; msg.portid = NETLINK_CB(cb->skb).portid; msg.seq = cb->nlh->nlmsg_seq; @@ -2578,7 +2624,7 @@ int tipc_nl_node_dump_link(struct sk_buff *skb, struct netlink_callback *cb) list) { tipc_node_read_lock(node); err = __tipc_nl_add_node_links(net, &msg, node, - &prev_link); + &prev_link, type); tipc_node_read_unlock(node); if (err) goto out; @@ -2586,14 +2632,14 @@ int tipc_nl_node_dump_link(struct sk_buff *skb, struct netlink_callback *cb) prev_node = node->addr; } } else { - err = tipc_nl_add_bc_link(net, &msg); + err = tipc_nl_add_bc_link(net, &msg, tn->bcl); if (err) goto out; list_for_each_entry_rcu(node, &tn->node_list, list) { tipc_node_read_lock(node); err = __tipc_nl_add_node_links(net, &msg, node, - &prev_link); + &prev_link, type); tipc_node_read_unlock(node); if (err) goto out; @@ -2608,6 +2654,7 @@ int tipc_nl_node_dump_link(struct sk_buff *skb, struct netlink_callback *cb) cb->args[0] = prev_node; cb->args[1] = prev_link; cb->args[2] = done; + cb->args[3] = type; return skb->len; } diff --git a/net/tipc/trace.h b/net/tipc/trace.h index e7535ab75255..04af83f0500c 100644 --- a/net/tipc/trace.h +++ b/net/tipc/trace.h @@ -255,7 +255,7 @@ DECLARE_EVENT_CLASS(tipc_link_class, TP_fast_assign( __assign_str(header, header); - tipc_link_name_ext(l, __entry->name); + memcpy(__entry->name, tipc_link_name(l), TIPC_MAX_LINK_NAME); tipc_link_dump(l, dqueues, __get_str(buf)); ), @@ -295,7 +295,7 @@ DECLARE_EVENT_CLASS(tipc_link_transmq_class, ), TP_fast_assign( - 
tipc_link_name_ext(r, __entry->name); + memcpy(__entry->name, tipc_link_name(r), TIPC_MAX_LINK_NAME); __entry->from = f; __entry->to = t; __entry->len = skb_queue_len(tq); -- 2.13.7 |
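The link-name matching that the patch above adds to tipc_nl_node_reset_link_stats() can be sketched in user space as follows. This is only an illustration under assumptions: classify_link_name() and enum link_kind are hypothetical names, not kernel symbols, and the sketch covers only the strcmp()/strstr() decision, not the node-list walk or the locking.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same literal as tipc_bclink_name in net/tipc/bcast.c */
static const char bclink_name[] = "broadcast-link";

/* Hypothetical classification, for illustration only */
enum link_kind {
	LINK_BC_SENDER,		/* exact match: the broadcast sender link */
	LINK_BC_RECEIVER,	/* name merely contains "broadcast-link" */
	LINK_UNICAST		/* anything else: handled as a unicast link */
};

static enum link_kind classify_link_name(const char *name)
{
	assert(name != NULL);

	/* The exact match must be tested first: strstr() matches it too */
	if (!strcmp(name, bclink_name))
		return LINK_BC_SENDER;
	if (strstr(name, bclink_name))
		return LINK_BC_RECEIVER;
	return LINK_UNICAST;
}
```

With this sketch, "broadcast-link" maps to LINK_BC_SENDER, while a name such as "broadcast-link:1001002" (a made-up receiver-link name) maps to LINK_BC_RECEIVER. The order of the two tests matters, which is presumably why the patch keeps the strcmp() branch ahead of the strstr() one.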
From: Tuong L. <tuo...@de...> - 2020-03-27 11:56:51
The previous commit ("tipc: add Gap ACK blocks support for broadcast
link") removed the following link trace events because of the code
changes:
 - tipc_link_bc_ack
 - tipc_link_retrans

This commit adds them back, with some minor changes to adapt them to the
new code.

Signed-off-by: Tuong Lien <tuo...@de...>
---
 net/tipc/link.c  |  3 +++
 net/tipc/trace.h | 13 ++++++++-----
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/net/tipc/link.c b/net/tipc/link.c
index 1b60ba665504..405ccf597e59 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1517,6 +1517,8 @@ static int tipc_link_advance_transmq(struct tipc_link *l, struct tipc_link *r,
 	bool is_uc = !link_is_bc_sndlink(l);
 	bool bc_has_acked = false;
 
+	trace_tipc_link_retrans(r, acked + 1, acked + gap, &l->transmq);
+
 	/* Determine Gap ACK blocks if any for the particular link */
 	if (ga && is_uc) {
 		/* Get the Gap ACKs, uc part */
@@ -2423,6 +2425,7 @@ int tipc_link_bc_ack_rcv(struct tipc_link *r, u16 acked, u16 gap,
 	if (less(acked, r->acked) || (acked == r->acked && !gap && !ga))
 		return 0;
 
+	trace_tipc_link_bc_ack(r, acked, gap, &l->transmq);
 	tipc_link_advance_transmq(l, r, acked, gap, ga, xmitq, &unused, &rc);
 
 	tipc_link_advance_backlog(l, xmitq);
diff --git a/net/tipc/trace.h b/net/tipc/trace.h
index 4d8e00483afc..e7535ab75255 100644
--- a/net/tipc/trace.h
+++ b/net/tipc/trace.h
@@ -299,8 +299,10 @@ DECLARE_EVENT_CLASS(tipc_link_transmq_class,
 		__entry->from = f;
 		__entry->to = t;
 		__entry->len = skb_queue_len(tq);
-		__entry->fseqno = msg_seqno(buf_msg(skb_peek(tq)));
-		__entry->lseqno = msg_seqno(buf_msg(skb_peek_tail(tq)));
+		__entry->fseqno = __entry->len ?
+				  msg_seqno(buf_msg(skb_peek(tq))) : 0;
+		__entry->lseqno = __entry->len ?
+				  msg_seqno(buf_msg(skb_peek_tail(tq))) : 0;
 	),
 
 	TP_printk("<%s> retrans req: [%u-%u] transmq: %u [%u-%u]\n",
@@ -308,15 +310,16 @@ DECLARE_EVENT_CLASS(tipc_link_transmq_class,
 		  __entry->len, __entry->fseqno, __entry->lseqno)
 );
 
-DEFINE_EVENT(tipc_link_transmq_class, tipc_link_retrans,
+DEFINE_EVENT_CONDITION(tipc_link_transmq_class, tipc_link_retrans,
 	TP_PROTO(struct tipc_link *r, u16 f, u16 t, struct sk_buff_head *tq),
-	TP_ARGS(r, f, t, tq)
+	TP_ARGS(r, f, t, tq),
+	TP_CONDITION(less_eq(f, t))
 );
 
 DEFINE_EVENT_PRINT(tipc_link_transmq_class, tipc_link_bc_ack,
 	TP_PROTO(struct tipc_link *r, u16 f, u16 t, struct sk_buff_head *tq),
 	TP_ARGS(r, f, t, tq),
-	TP_printk("<%s> acked: [%u-%u] transmq: %u [%u-%u]\n",
+	TP_printk("<%s> acked: %u gap: %u transmq: %u [%u-%u]\n",
 		  __entry->name, __entry->from, __entry->to,
 		  __entry->len, __entry->fseqno, __entry->lseqno)
 );
-- 
2.13.7
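To see why TP_CONDITION(less_eq(f, t)) silences the retransmit event when there is no gap, the comparison can be modelled in user space as below. This is a sketch under assumptions: seq_less_eq() is written here as the usual wraparound-aware 16-bit comparison, which is how I understand TIPC's less_eq() to behave, and retrans_event_fires() and seq_range_len() are hypothetical helpers that do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-aware "left <= right" for 16-bit sequence numbers
 * (assumed to mirror the semantics of TIPC's less_eq()).
 */
static bool seq_less_eq(uint16_t left, uint16_t right)
{
	return (uint16_t)(right - left) < 32768u;
}

/* Hypothetical helper: would the tipc_link_retrans event be emitted?
 * The call site passes from = acked + 1 and to = acked + gap.
 */
static bool retrans_event_fires(uint16_t acked, uint16_t gap)
{
	uint16_t from = (uint16_t)(acked + 1);
	uint16_t to = (uint16_t)(acked + gap);

	return seq_less_eq(from, to);
}

/* Hypothetical helper: number of packets in the inclusive range [from, to] */
static uint16_t seq_range_len(uint16_t from, uint16_t to)
{
	assert(seq_less_eq(from, to));
	return (uint16_t)(to - from + 1);
}
```

With gap == 0 the requested range is [acked + 1, acked], which is empty, so the condition is false and nothing is traced. A range that crosses the 16-bit wrap, for example acked = 65535 and gap = 2, still satisfies the condition.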
From: Tuong L. <tuo...@de...> - 2020-03-27 11:56:51
In some environments, broadcast traffic is suppressed at high rates (i.e.
a kind of bandwidth limit setting). With such a limit applied, TIPC
broadcast can still run successfully. However, under high load, some
packets are dropped first and TIPC tries to retransmit them, but the
retransmissions are themselves broadcast, which makes things worse and
does not help at all.

This commit enables broadcast retransmission via unicast, which
retransmits packets only to the specific peer that has actually reported
a gap, i.e. it does not broadcast to all nodes in the cluster. This keeps
the retransmissions from being suppressed, reduces the overhead that
duplicates cause on the other peers, and finally improves the overall
TIPC broadcast performance.

Note: the functionality can be turned on/off via the sysctl file:

    echo 1 > /proc/sys/net/tipc/bc_retruni
    echo 0 > /proc/sys/net/tipc/bc_retruni

The default is '0', i.e. the broadcast retransmission still works as
usual.

Signed-off-by: Tuong Lien <tuo...@de...>
---
 net/tipc/bcast.c  | 11 ++++++++---
 net/tipc/bcast.h  |  4 +++-
 net/tipc/link.c   |  8 +++++---
 net/tipc/link.h   |  3 ++-
 net/tipc/node.c   |  2 +-
 net/tipc/sysctl.c |  9 ++++++++-
 6 files changed, 27 insertions(+), 10 deletions(-)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 3ce690a96ee9..50a16f8bebd9 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -46,6 +46,7 @@
 #define BCLINK_WIN_MIN 32 /* bcast minimum link window size */
 
 const char tipc_bclink_name[] = "broadcast-link";
+unsigned long sysctl_tipc_bc_retruni __read_mostly;
 
 /**
  * struct tipc_bc_base - base structure for keeping broadcast send state
@@ -474,7 +475,7 @@ void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l,
 	__skb_queue_head_init(&xmitq);
 
 	tipc_bcast_lock(net);
-	tipc_link_bc_ack_rcv(l, acked, 0, NULL, &xmitq);
+	tipc_link_bc_ack_rcv(l, acked, 0, NULL, &xmitq, NULL);
 	tipc_bcast_unlock(net);
 
 	tipc_bcbase_xmit(net, &xmitq);
@@ -489,7 +490,8 @@ void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l,
  * RCU is locked, no other locks set
  */
 int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l,
-			struct tipc_msg *hdr)
+			struct tipc_msg *hdr,
+			struct sk_buff_head *retrq)
 {
 	struct sk_buff_head *inputq = &tipc_bc_base(net)->inputq;
 	struct tipc_gap_ack_blks *ga;
@@ -503,8 +505,11 @@ int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l,
 		tipc_link_bc_init_rcv(l, hdr);
 	} else if (!msg_bc_ack_invalid(hdr)) {
 		tipc_get_gap_ack_blks(&ga, l, hdr, false);
+		if (!sysctl_tipc_bc_retruni)
+			retrq = &xmitq;
 		rc = tipc_link_bc_ack_rcv(l, msg_bcast_ack(hdr),
-					  msg_bc_gap(hdr), ga, &xmitq);
+					  msg_bc_gap(hdr), ga, &xmitq,
+					  retrq);
 		rc |= tipc_link_bc_sync_rcv(l, hdr, &xmitq);
 	}
 	tipc_bcast_unlock(net);
diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h
index 9e847d9617d3..97d3cf9d3e4d 100644
--- a/net/tipc/bcast.h
+++ b/net/tipc/bcast.h
@@ -45,6 +45,7 @@ struct tipc_nl_msg;
 struct tipc_nlist;
 struct tipc_nitem;
 extern const char tipc_bclink_name[];
+extern unsigned long sysctl_tipc_bc_retruni;
 
 #define TIPC_METHOD_EXPIRE msecs_to_jiffies(5000)
 
@@ -93,7 +94,8 @@ int tipc_bcast_rcv(struct net *net, struct tipc_link *l, struct sk_buff *skb);
 void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l,
 			struct tipc_msg *hdr);
 int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l,
-			struct tipc_msg *hdr);
+			struct tipc_msg *hdr,
+			struct sk_buff_head *retrq);
 int tipc_nl_add_bc_link(struct net *net, struct tipc_nl_msg *msg);
 int tipc_nl_bc_link_set(struct net *net, struct nlattr *attrs[]);
 int tipc_bclink_reset_stats(struct net *net);
diff --git a/net/tipc/link.c b/net/tipc/link.c
index 405ccf597e59..3071e46f029a 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -375,7 +375,7 @@ void tipc_link_remove_bc_peer(struct tipc_link *snd_l,
 	snd_l->ackers--;
 	rcv_l->bc_peer_is_up = true;
 	rcv_l->state = LINK_ESTABLISHED;
-	tipc_link_bc_ack_rcv(rcv_l, ack, 0, NULL, xmitq);
+	tipc_link_bc_ack_rcv(rcv_l, ack, 0, NULL, xmitq, NULL);
 	trace_tipc_link_reset(rcv_l, TIPC_DUMP_ALL, "bclink removed!");
 	tipc_link_reset(rcv_l);
 	rcv_l->state = LINK_RESET;
@@ -2413,7 +2413,8 @@ int tipc_link_bc_sync_rcv(struct tipc_link *l, struct tipc_msg *hdr,
 
 int tipc_link_bc_ack_rcv(struct tipc_link *r, u16 acked, u16 gap,
 			 struct tipc_gap_ack_blks *ga,
-			 struct sk_buff_head *xmitq)
+			 struct sk_buff_head *xmitq,
+			 struct sk_buff_head *retrq)
 {
 	struct tipc_link *l = r->bc_sndlink;
 	bool unused = false;
@@ -2460,7 +2461,8 @@ int tipc_link_bc_nack_rcv(struct tipc_link *l, struct sk_buff *skb,
 		return 0;
 
 	if (dnode == tipc_own_addr(l->net)) {
-		rc = tipc_link_bc_ack_rcv(l, acked, to - acked, NULL, xmitq);
+		rc = tipc_link_bc_ack_rcv(l, acked, to - acked, NULL, xmitq,
+					  xmitq);
 		l->stats.recv_nacks++;
 		return rc;
 	}
diff --git a/net/tipc/link.h b/net/tipc/link.h
index 0a0fa7350722..4d0768cf91d5 100644
--- a/net/tipc/link.h
+++ b/net/tipc/link.h
@@ -147,7 +147,8 @@ u16 tipc_get_gap_ack_blks(struct tipc_gap_ack_blks **ga, struct tipc_link *l,
 			  struct tipc_msg *hdr, bool uc);
 int tipc_link_bc_ack_rcv(struct tipc_link *l, u16 acked, u16 gap,
 			 struct tipc_gap_ack_blks *ga,
-			 struct sk_buff_head *xmitq);
+			 struct sk_buff_head *xmitq,
+			 struct sk_buff_head *retrq);
 void tipc_link_build_bc_sync_msg(struct tipc_link *l,
 				 struct sk_buff_head *xmitq);
 void tipc_link_bc_init_rcv(struct tipc_link *l, struct tipc_msg *hdr);
diff --git a/net/tipc/node.c b/net/tipc/node.c
index eb6b62de81a7..917ad3920fac 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -1771,7 +1771,7 @@ static void tipc_node_bc_sync_rcv(struct tipc_node *n, struct tipc_msg *hdr,
 	struct tipc_link *ucl;
 	int rc;
 
-	rc = tipc_bcast_sync_rcv(n->net, n->bc_entry.link, hdr);
+	rc = tipc_bcast_sync_rcv(n->net, n->bc_entry.link, hdr, xmitq);
 
 	if (rc & TIPC_LINK_DOWN_EVT) {
 		tipc_node_reset_links(n);
diff --git a/net/tipc/sysctl.c b/net/tipc/sysctl.c
index 58ab3d6dcdce..97a6264a2993 100644
--- a/net/tipc/sysctl.c
+++ b/net/tipc/sysctl.c
@@ -36,7 +36,7 @@
 #include "core.h"
 #include "trace.h"
 #include "crypto.h"
-
+#include "bcast.h"
 #include <linux/sysctl.h>
 
 static struct ctl_table_header *tipc_ctl_hdr;
@@ -75,6 +75,13 @@ static struct ctl_table tipc_table[] = {
 		.extra1 = SYSCTL_ONE,
 	},
 #endif
+	{
+		.procname = "bc_retruni",
+		.data = &sysctl_tipc_bc_retruni,
+		.maxlen = sizeof(sysctl_tipc_bc_retruni),
+		.mode = 0644,
+		.proc_handler = proc_doulongvec_minmax,
+	},
 	{}
 };
 
-- 
2.13.7
From: Jon M. <jm...@re...> - 2020-03-25 16:37:53
On 3/25/20 3:43 AM, Hoang Le wrote:
> Commit f73b12812a3d ("tipc: improve throughput between nodes in netns")
> is missing a check for the TIPC_DIRECT_MSG type, so messages of this
> type still use the old sending mechanism. As a result, the throughput
> improvement is not as significant as expected.
>
> Besides that, when sending a large message of this type, we also handle
> the wrong receive queue: the message should be enqueued in the socket
> receive queue instead of the multicast one.
>
> Fix this by adding the missing case for TIPC_DIRECT_MSG.
>
> Fixes: f73b12812a3d ("tipc: improve throughput between nodes in netns")
> Reported-by: Tuong Lien <tuo...@de...>
> Signed-off-by: Hoang Le <hoa...@de...>
> ---
>  net/tipc/msg.h    | 5 +++++
>  net/tipc/node.c   | 3 ++-
>  net/tipc/socket.c | 2 +-
>  3 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/net/tipc/msg.h b/net/tipc/msg.h
> index 6d466ebdb64f..871feadbbc19 100644
> --- a/net/tipc/msg.h
> +++ b/net/tipc/msg.h
> @@ -394,6 +394,11 @@ static inline u32 msg_connected(struct tipc_msg *m)
>  	return msg_type(m) == TIPC_CONN_MSG;
>  }
>  
> +static inline u32 msg_direct(struct tipc_msg *m)
> +{
> +	return msg_type(m) == TIPC_DIRECT_MSG;
> +}
> +
>  static inline u32 msg_errcode(struct tipc_msg *m)
>  {
>  	return msg_bits(m, 1, 25, 0xf);
> diff --git a/net/tipc/node.c b/net/tipc/node.c
> index 0c88778c88b5..10292c942384 100644
> --- a/net/tipc/node.c
> +++ b/net/tipc/node.c
> @@ -1586,7 +1586,8 @@ static void tipc_lxc_xmit(struct net *peer_net, struct sk_buff_head *list)
>  	case TIPC_MEDIUM_IMPORTANCE:
>  	case TIPC_HIGH_IMPORTANCE:
>  	case TIPC_CRITICAL_IMPORTANCE:
> -		if (msg_connected(hdr) || msg_named(hdr)) {
> +		if (msg_connected(hdr) || msg_named(hdr) ||
> +		    msg_direct(hdr)) {
>  			tipc_loopback_trace(peer_net, list);
>  			spin_lock_init(&list->lock);
>  			tipc_sk_rcv(peer_net, list);
> diff --git a/net/tipc/socket.c b/net/tipc/socket.c
> index 693e8902161e..87466607097f 100644
> --- a/net/tipc/socket.c
> +++ b/net/tipc/socket.c
> @@ -1461,7 +1461,7 @@ static int __tipc_sendmsg(struct socket *sock, struct msghdr *m, size_t dlen)
>  	}
>  
>  	__skb_queue_head_init(&pkts);
> -	mtu = tipc_node_get_mtu(net, dnode, tsk->portid, false);
> +	mtu = tipc_node_get_mtu(net, dnode, tsk->portid, true);
>  	rc = tipc_msg_build(hdr, m, 0, dlen, mtu, &pkts);
>  	if (unlikely(rc != dlen))
>  		return rc;

Acked-by: Jon Maloy <jm...@re...>
From: Hoang Le <hoa...@de...> - 2020-03-25 07:44:01
Commit f73b12812a3d ("tipc: improve throughput between nodes in netns")
is missing a check for the TIPC_DIRECT_MSG type, so messages of this
type still use the old sending mechanism. As a result, the throughput
improvement is not as significant as expected.

Besides that, when sending a large message of this type, we also handle
the wrong receive queue: the message should be enqueued in the socket
receive queue instead of the multicast one.

Fix this by adding the missing case for TIPC_DIRECT_MSG.

Fixes: f73b12812a3d ("tipc: improve throughput between nodes in netns")
Reported-by: Tuong Lien <tuo...@de...>
Signed-off-by: Hoang Le <hoa...@de...>
---
 net/tipc/msg.h    | 5 +++++
 net/tipc/node.c   | 3 ++-
 net/tipc/socket.c | 2 +-
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/tipc/msg.h b/net/tipc/msg.h
index 6d466ebdb64f..871feadbbc19 100644
--- a/net/tipc/msg.h
+++ b/net/tipc/msg.h
@@ -394,6 +394,11 @@ static inline u32 msg_connected(struct tipc_msg *m)
 	return msg_type(m) == TIPC_CONN_MSG;
 }
 
+static inline u32 msg_direct(struct tipc_msg *m)
+{
+	return msg_type(m) == TIPC_DIRECT_MSG;
+}
+
 static inline u32 msg_errcode(struct tipc_msg *m)
 {
 	return msg_bits(m, 1, 25, 0xf);
diff --git a/net/tipc/node.c b/net/tipc/node.c
index 0c88778c88b5..10292c942384 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -1586,7 +1586,8 @@ static void tipc_lxc_xmit(struct net *peer_net, struct sk_buff_head *list)
 	case TIPC_MEDIUM_IMPORTANCE:
 	case TIPC_HIGH_IMPORTANCE:
 	case TIPC_CRITICAL_IMPORTANCE:
-		if (msg_connected(hdr) || msg_named(hdr)) {
+		if (msg_connected(hdr) || msg_named(hdr) ||
+		    msg_direct(hdr)) {
 			tipc_loopback_trace(peer_net, list);
 			spin_lock_init(&list->lock);
 			tipc_sk_rcv(peer_net, list);
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 693e8902161e..87466607097f 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -1461,7 +1461,7 @@ static int __tipc_sendmsg(struct socket *sock, struct msghdr *m, size_t dlen)
 	}
 
 	__skb_queue_head_init(&pkts);
-	mtu = tipc_node_get_mtu(net, dnode, tsk->portid, false);
+	mtu = tipc_node_get_mtu(net, dnode, tsk->portid, true);
 	rc = tipc_msg_build(hdr, m, 0, dlen, mtu, &pkts);
 	if (unlikely(rc != dlen))
 		return rc;
-- 
2.20.1
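The effect of the new msg_direct() test in tipc_lxc_xmit() can be modelled in user space as below. This is a sketch under assumptions: the enumerators are illustrative stand-ins for the TIPC message types and their numeric values are not meant to match the kernel's, and takes_socket_rcv_path() is a hypothetical name for the condition that guards the tipc_sk_rcv() call. What happens to messages that fail the test is not part of the hunk, so the sketch only reports the outcome of the test.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the TIPC data message types */
enum msg_type {
	CONN_MSG,
	MCAST_MSG,
	NAMED_MSG,
	DIRECT_MSG,
	MSG_TYPE_COUNT
};

/* Models "msg_connected(hdr) || msg_named(hdr) || msg_direct(hdr)".
 * Before the fix the DIRECT_MSG term was missing, so directly
 * addressed messages failed this test.
 */
static bool takes_socket_rcv_path(enum msg_type type)
{
	assert(type < MSG_TYPE_COUNT);

	return type == CONN_MSG || type == NAMED_MSG || type == DIRECT_MSG;
}
```

With the extra term, a directly addressed message now passes the test in the same way as connection-oriented and name-addressed ones, while a multicast message still does not.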
From: Jon M. <jm...@re...> - 2020-03-18 14:53:22
On 3/13/20 6:47 AM, Tuong Lien wrote:
> As achieved through commit 9195948fbf34 ("tipc: improve TIPC throughput
> by Gap ACK blocks"), we apply the same mechanism for the broadcast link
> as well. The 'Gap ACK blocks' data field in a 'PROTOCOL/STATE_MSG' will
> consist of two parts built for both the broadcast and unicast types:
>
>  31                       16 15                        0
> +-------------+-------------+-------------+-------------+
> |  bgack_cnt  |  ugack_cnt  |            len            |
> +-------------+-------------+-------------+-------------+  -
> |            gap            |            ack            |   |
> +-------------+-------------+-------------+-------------+    > bc gacks
> :                           :                           :   |
> +-------------+-------------+-------------+-------------+  -
> |            gap            |            ack            |   |
> +-------------+-------------+-------------+-------------+    > uc gacks
> :                           :                           :   |
> +-------------+-------------+-------------+-------------+  -
>
> which is "automatically" backward-compatible.
>
> We also increase the max number of Gap ACK blocks to 128, allowing up to
> 64 blocks per type (total buffer size = 516 bytes).
>
> Besides, the 'tipc_link_advance_transmq()' function is refactored so that
> it now applies to both the unicast and broadcast cases, so some old
> functions can be removed and the code is optimized.
>
> With the patch, TIPC broadcast is more robust regardless of packet loss
> or disorder, latency, ... in the underlying network. Its performance is
> boosted significantly.
> For example, an experiment with a 5% packet loss rate gives:
>
> $ time tipc-pipe --mc --rdm --data_size 123 --data_num 1500000
> real 0m 42.46s
> user 0m 1.16s
> sys 0m 17.67s
>
> Without the patch:
>
> $ time tipc-pipe --mc --rdm --data_size 123 --data_num 1500000
> real 5m 28.80s
> user 0m 0.85s
> sys 0m 3.62s
>
> Signed-off-by: Tuong Lien <tuo...@de...>
> ---
>  net/tipc/bcast.c |   9 +-
>  net/tipc/link.c  | 440 +++++++++++++++++++++++++++++++++----------------------
>  net/tipc/link.h  |   7 +-
>  net/tipc/msg.h   |  14 +-
>  net/tipc/node.c  |  10 +-
>  5 files changed, 295 insertions(+), 185 deletions(-)
>
> diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
> index 4c20be08b9c4..3ce690a96ee9 100644
> --- a/net/tipc/bcast.c
> +++ b/net/tipc/bcast.c
> @@ -474,7 +474,7 @@ void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l,
>  	__skb_queue_head_init(&xmitq);
>  
>  	tipc_bcast_lock(net);
> -	tipc_link_bc_ack_rcv(l, acked, &xmitq);
> +	tipc_link_bc_ack_rcv(l, acked, 0, NULL, &xmitq);
>  	tipc_bcast_unlock(net);
>  
>  	tipc_bcbase_xmit(net, &xmitq);
> @@ -492,6 +492,7 @@ int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l,
>  			struct tipc_msg *hdr)
>  {
>  	struct sk_buff_head *inputq = &tipc_bc_base(net)->inputq;
> +	struct tipc_gap_ack_blks *ga;
>  	struct sk_buff_head xmitq;
>  	int rc = 0;
>  
> @@ -501,8 +502,10 @@ int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l,
>  	if (msg_type(hdr) != STATE_MSG) {
>  		tipc_link_bc_init_rcv(l, hdr);
>  	} else if (!msg_bc_ack_invalid(hdr)) {
> -		tipc_link_bc_ack_rcv(l, msg_bcast_ack(hdr), &xmitq);
> -		rc = tipc_link_bc_sync_rcv(l, hdr, &xmitq);
> +		tipc_get_gap_ack_blks(&ga, l, hdr, false);
> +		rc = tipc_link_bc_ack_rcv(l, msg_bcast_ack(hdr),
> +					  msg_bc_gap(hdr), ga, &xmitq);
> +		rc |= tipc_link_bc_sync_rcv(l, hdr, &xmitq);
>  	}
>  	tipc_bcast_unlock(net);
>  
> diff --git a/net/tipc/link.c b/net/tipc/link.c
> index 467c53a1fb5c..6198b6d89a69 100644
> --- a/net/tipc/link.c
> +++ b/net/tipc/link.c
> @@ -188,6 +188,8 @@ struct tipc_link {
/* Broadcast */ > u16 ackers; > u16 acked; > + u16 last_gap; > + struct tipc_gap_ack_blks *last_ga; > struct tipc_link *bc_rcvlink; > struct tipc_link *bc_sndlink; > u8 nack_state; > @@ -249,11 +251,14 @@ static int tipc_link_build_nack_msg(struct tipc_link *l, > struct sk_buff_head *xmitq); > static void tipc_link_build_bc_init_msg(struct tipc_link *l, > struct sk_buff_head *xmitq); > -static int tipc_link_release_pkts(struct tipc_link *l, u16 to); > -static u16 tipc_build_gap_ack_blks(struct tipc_link *l, void *data, u16 gap); > -static int tipc_link_advance_transmq(struct tipc_link *l, u16 acked, u16 gap, > +static u8 __tipc_build_gap_ack_blks(struct tipc_gap_ack_blks *ga, > + struct tipc_link *l, u8 start_index); > +static u16 tipc_build_gap_ack_blks(struct tipc_link *l, struct tipc_msg *hdr); > +static int tipc_link_advance_transmq(struct tipc_link *l, struct tipc_link *r, > + u16 acked, u16 gap, > struct tipc_gap_ack_blks *ga, > - struct sk_buff_head *xmitq); > + struct sk_buff_head *xmitq, > + bool *retransmitted, int *rc); > static void tipc_link_update_cwin(struct tipc_link *l, int released, > bool retransmitted); > /* > @@ -370,7 +375,7 @@ void tipc_link_remove_bc_peer(struct tipc_link *snd_l, > snd_l->ackers--; > rcv_l->bc_peer_is_up = true; > rcv_l->state = LINK_ESTABLISHED; > - tipc_link_bc_ack_rcv(rcv_l, ack, xmitq); > + tipc_link_bc_ack_rcv(rcv_l, ack, 0, NULL, xmitq); > trace_tipc_link_reset(rcv_l, TIPC_DUMP_ALL, "bclink removed!"); > tipc_link_reset(rcv_l); > rcv_l->state = LINK_RESET; > @@ -784,8 +789,6 @@ bool tipc_link_too_silent(struct tipc_link *l) > return (l->silent_intv_cnt + 2 > l->abort_limit); > } > > -static int tipc_link_bc_retrans(struct tipc_link *l, struct tipc_link *r, > - u16 from, u16 to, struct sk_buff_head *xmitq); > /* tipc_link_timeout - perform periodic task as instructed from node timeout > */ > int tipc_link_timeout(struct tipc_link *l, struct sk_buff_head *xmitq) > @@ -948,6 +951,9 @@ void tipc_link_reset(struct tipc_link 
*l) > l->snd_nxt_state = 1; > l->rcv_nxt_state = 1; > l->acked = 0; > + l->last_gap = 0; > + kfree(l->last_ga); > + l->last_ga = NULL; > l->silent_intv_cnt = 0; > l->rst_cnt = 0; > l->bc_peer_is_up = false; > @@ -1183,68 +1189,14 @@ static bool link_retransmit_failure(struct tipc_link *l, struct tipc_link *r, > > if (link_is_bc_sndlink(l)) { > r->state = LINK_RESET; > - *rc = TIPC_LINK_DOWN_EVT; > + *rc |= TIPC_LINK_DOWN_EVT; > } else { > - *rc = tipc_link_fsm_evt(l, LINK_FAILURE_EVT); > + *rc |= tipc_link_fsm_evt(l, LINK_FAILURE_EVT); > } > > return true; > } > > -/* tipc_link_bc_retrans() - retransmit zero or more packets > - * @l: the link to transmit on > - * @r: the receiving link ordering the retransmit. Same as l if unicast > - * @from: retransmit from (inclusive) this sequence number > - * @to: retransmit to (inclusive) this sequence number > - * xmitq: queue for accumulating the retransmitted packets > - */ > -static int tipc_link_bc_retrans(struct tipc_link *l, struct tipc_link *r, > - u16 from, u16 to, struct sk_buff_head *xmitq) > -{ > - struct sk_buff *_skb, *skb = skb_peek(&l->transmq); > - u16 bc_ack = l->bc_rcvlink->rcv_nxt - 1; > - u16 ack = l->rcv_nxt - 1; > - int retransmitted = 0; > - struct tipc_msg *hdr; > - int rc = 0; > - > - if (!skb) > - return 0; > - if (less(to, from)) > - return 0; > - > - trace_tipc_link_retrans(r, from, to, &l->transmq); > - > - if (link_retransmit_failure(l, r, &rc)) > - return rc; > - > - skb_queue_walk(&l->transmq, skb) { > - hdr = buf_msg(skb); > - if (less(msg_seqno(hdr), from)) > - continue; > - if (more(msg_seqno(hdr), to)) > - break; > - if (time_before(jiffies, TIPC_SKB_CB(skb)->nxt_retr)) > - continue; > - TIPC_SKB_CB(skb)->nxt_retr = TIPC_BC_RETR_LIM; > - _skb = pskb_copy(skb, GFP_ATOMIC); > - if (!_skb) > - return 0; > - hdr = buf_msg(_skb); > - msg_set_ack(hdr, ack); > - msg_set_bcast_ack(hdr, bc_ack); > - _skb->priority = TC_PRIO_CONTROL; > - __skb_queue_tail(xmitq, _skb); > - l->stats.retransmitted++; > 
- retransmitted++; > - /* Increase actual retrans counter & mark first time */ > - if (!TIPC_SKB_CB(skb)->retr_cnt++) > - TIPC_SKB_CB(skb)->retr_stamp = jiffies; > - } > - tipc_link_update_cwin(l, 0, retransmitted); > - return 0; > -} > - > /* tipc_data_input - deliver data and name distr msgs to upper layer > * > * Consumes buffer if message is of right type > @@ -1402,46 +1354,71 @@ static int tipc_link_tnl_rcv(struct tipc_link *l, struct sk_buff *skb, > return rc; > } > > -static int tipc_link_release_pkts(struct tipc_link *l, u16 acked) > -{ > - int released = 0; > - struct sk_buff *skb, *tmp; > - > - skb_queue_walk_safe(&l->transmq, skb, tmp) { > - if (more(buf_seqno(skb), acked)) > - break; > - __skb_unlink(skb, &l->transmq); > - kfree_skb(skb); > - released++; > +/** > + * tipc_get_gap_ack_blks - get Gap ACK blocks from PROTOCOL/STATE_MSG > + * @ga: returned pointer to the Gap ACK blocks if any > + * @l: the tipc link > + * @hdr: the PROTOCOL/STATE_MSG header > + * @uc: desired Gap ACK blocks type, i.e. unicast (= 1) or broadcast (= 0) > + * > + * Return: the total Gap ACK blocks size > + */ > +u16 tipc_get_gap_ack_blks(struct tipc_gap_ack_blks **ga, struct tipc_link *l, > + struct tipc_msg *hdr, bool uc) > +{ > + struct tipc_gap_ack_blks *p; > + u16 sz = 0; > + > + /* Does peer support the Gap ACK blocks feature? */ > + if (l->peer_caps & TIPC_GAP_ACK_BLOCK) { > + p = (struct tipc_gap_ack_blks *)msg_data(hdr); > + sz = ntohs(p->len); > + /* Sanity check */ > + if (sz == tipc_gap_ack_blks_sz(p->ugack_cnt + p->bgack_cnt)) { > + /* Good, check if the desired type exists */ > + if ((uc && p->ugack_cnt) || (!uc && p->bgack_cnt)) > + goto ok; > + /* Backward compatible: peer might not support bc, but uc? */ > + } else if (uc && sz == tipc_gap_ack_blks_sz(p->ugack_cnt)) { > + if (p->ugack_cnt) { > + p->bgack_cnt = 0; > + goto ok; > + } > + } > } > - return released; > + /* Other cases: ignore! 
*/ > + p = NULL; > + > +ok: > + *ga = p; > + return sz; > } > > -/* tipc_build_gap_ack_blks - build Gap ACK blocks > - * @l: tipc link that data have come with gaps in sequence if any > - * @data: data buffer to store the Gap ACK blocks after built > - * > - * returns the actual allocated memory size > - */ > -static u16 tipc_build_gap_ack_blks(struct tipc_link *l, void *data, u16 gap) > +static u8 __tipc_build_gap_ack_blks(struct tipc_gap_ack_blks *ga, > + struct tipc_link *l, u8 start_index) > { > + struct tipc_gap_ack *gacks = &ga->gacks[start_index]; > struct sk_buff *skb = skb_peek(&l->deferdq); > - struct tipc_gap_ack_blks *ga = data; > - u16 len, expect, seqno = 0; > + u16 expect, seqno = 0; > u8 n = 0; > > - if (!skb || !gap) > - goto exit; > + if (!skb) > + return 0; > > expect = buf_seqno(skb); > skb_queue_walk(&l->deferdq, skb) { > seqno = buf_seqno(skb); > if (unlikely(more(seqno, expect))) { > - ga->gacks[n].ack = htons(expect - 1); > - ga->gacks[n].gap = htons(seqno - expect); > - if (++n >= MAX_GAP_ACK_BLKS) { > - pr_info_ratelimited("Too few Gap ACK blocks!\n"); > - goto exit; > + gacks[n].ack = htons(expect - 1); > + gacks[n].gap = htons(seqno - expect); > + if (++n >= MAX_GAP_ACK_BLKS / 2) { > + char buf[TIPC_MAX_LINK_NAME]; > + > + pr_info_ratelimited("Gacks on %s: %d, ql: %d!\n", > + tipc_link_name_ext(l, buf), > + n, > + skb_queue_len(&l->deferdq)); > + return n; > } > } else if (unlikely(less(seqno, expect))) { > pr_warn("Unexpected skb in deferdq!\n"); > @@ -1451,14 +1428,57 @@ static u16 tipc_build_gap_ack_blks(struct tipc_link *l, void *data, u16 gap) > } > > /* last block */ > - ga->gacks[n].ack = htons(seqno); > - ga->gacks[n].gap = 0; > + gacks[n].ack = htons(seqno); > + gacks[n].gap = 0; > n++; > + return n; > +} > > -exit: > - len = tipc_gap_ack_blks_sz(n); > +/* tipc_build_gap_ack_blks - build Gap ACK blocks > + * @l: tipc unicast link > + * @hdr: the tipc message buffer to store the Gap ACK blocks after built > + * > + * The function 
builds Gap ACK blocks for both the unicast & broadcast receiver > + * links of a certain peer, the buffer after built has the network data format > + * as follows: > + * 31 16 15 0 > + * +-------------+-------------+-------------+-------------+ > + * | bgack_cnt | ugack_cnt | len | > + * +-------------+-------------+-------------+-------------+ - > + * | gap | ack | | > + * +-------------+-------------+-------------+-------------+ > bc gacks > + * : : : | > + * +-------------+-------------+-------------+-------------+ - > + * | gap | ack | | > + * +-------------+-------------+-------------+-------------+ > uc gacks > + * : : : | > + * +-------------+-------------+-------------+-------------+ - > + * (See struct tipc_gap_ack_blks) > + * > + * returns the actual allocated memory size > + */ > +static u16 tipc_build_gap_ack_blks(struct tipc_link *l, struct tipc_msg *hdr) > +{ > + struct tipc_link *bcl = l->bc_rcvlink; > + struct tipc_gap_ack_blks *ga; > + u16 len; > + > + ga = (struct tipc_gap_ack_blks *)msg_data(hdr); > + > + /* Start with broadcast link first */ > + tipc_bcast_lock(bcl->net); > + msg_set_bcast_ack(hdr, bcl->rcv_nxt - 1); > + msg_set_bc_gap(hdr, link_bc_rcv_gap(bcl)); > + ga->bgack_cnt = __tipc_build_gap_ack_blks(ga, bcl, 0); > + tipc_bcast_unlock(bcl->net); > + > + /* Now for unicast link, but an explicit NACK only (???) */ > + ga->ugack_cnt = (msg_seq_gap(hdr)) ? > + __tipc_build_gap_ack_blks(ga, l, ga->bgack_cnt) : 0; > + > + /* Total len */ > + len = tipc_gap_ack_blks_sz(ga->bgack_cnt + ga->ugack_cnt); > ga->len = htons(len); > - ga->gack_cnt = n; > return len; > } > > @@ -1466,47 +1486,111 @@ static u16 tipc_build_gap_ack_blks(struct tipc_link *l, void *data, u16 gap) > * acked packets, also doing retransmissions if > * gaps found > * @l: tipc link with transmq queue to be advanced > + * @r: tipc link "receiver" i.e. 
in case of broadcast (= "l" if unicast) > * @acked: seqno of last packet acked by peer without any gaps before > * @gap: # of gap packets > * @ga: buffer pointer to Gap ACK blocks from peer > * @xmitq: queue for accumulating the retransmitted packets if any > + * @retransmitted: returned boolean value if a retransmission is really issued > + * @rc: returned code e.g. TIPC_LINK_DOWN_EVT if a repeated retransmit failures > + * happens (- unlikely case) > * > - * In case of a repeated retransmit failures, the call will return shortly > - * with a returned code (e.g. TIPC_LINK_DOWN_EVT) > + * Return: the number of packets released from the link transmq > */ > -static int tipc_link_advance_transmq(struct tipc_link *l, u16 acked, u16 gap, > +static int tipc_link_advance_transmq(struct tipc_link *l, struct tipc_link *r, > + u16 acked, u16 gap, > struct tipc_gap_ack_blks *ga, > - struct sk_buff_head *xmitq) > + struct sk_buff_head *xmitq, > + bool *retransmitted, int *rc) > { > + struct tipc_gap_ack_blks *last_ga = r->last_ga, *this_ga = NULL; > + struct tipc_gap_ack *gacks = NULL; > struct sk_buff *skb, *_skb, *tmp; > struct tipc_msg *hdr; > + u32 qlen = skb_queue_len(&l->transmq); > + u16 nacked = acked, ngap = gap, gack_cnt = 0; > u16 bc_ack = l->bc_rcvlink->rcv_nxt - 1; > - bool retransmitted = false; > u16 ack = l->rcv_nxt - 1; > - bool passed = false; > - u16 released = 0; > u16 seqno, n = 0; > - int rc = 0; > + u16 end = r->acked, start = end, offset = r->last_gap; > + u16 si = (last_ga) ? 
last_ga->start_index : 0;
> +	bool is_uc = !link_is_bc_sndlink(l);
> +	bool bc_has_acked = false;
> +
> +	trace_tipc_link_retrans(r, acked + 1, acked + gap, &l->transmq);
> +
> +	/* Determine Gap ACK blocks if any for the particular link */
> +	if (ga && is_uc) {
> +		/* Get the Gap ACKs, uc part */
> +		gack_cnt = ga->ugack_cnt;
> +		gacks = &ga->gacks[ga->bgack_cnt];
> +	} else if (ga) {
> +		/* Copy the Gap ACKs, bc part, for later renewal if needed */
> +		this_ga = kmemdup(ga, tipc_gap_ack_blks_sz(ga->bgack_cnt),
> +				  GFP_ATOMIC);
> +		if (likely(this_ga)) {
> +			this_ga->start_index = 0;
> +			/* Start with the bc Gap ACKs */
> +			gack_cnt = this_ga->bgack_cnt;
> +			gacks = &this_ga->gacks[0];
> +		} else {
> +			/* Hmm, we can get in trouble..., simply ignore it */
> +			pr_warn_ratelimited("Ignoring bc Gap ACKs, no memory\n");
> +		}
> +	}
>
> +	/* Advance the link transmq */
>  	skb_queue_walk_safe(&l->transmq, skb, tmp) {
>  		seqno = buf_seqno(skb);
>
>  next_gap_ack:
> -		if (less_eq(seqno, acked)) {
> +		if (less_eq(seqno, nacked)) {
> +			if (is_uc)
> +				goto release;
> +			/* Skip packets peer has already acked */
> +			if (!more(seqno, r->acked))
> +				continue;
> +			/* Get the next of last Gap ACK blocks */
> +			while (more(seqno, end)) {
> +				if (!last_ga || si >= last_ga->bgack_cnt)
> +					break;
> +				start = end + offset + 1;
> +				end = ntohs(last_ga->gacks[si].ack);
> +				offset = ntohs(last_ga->gacks[si].gap);
> +				si++;
> +				WARN_ONCE(more(start, end) ||
> +					  (!offset &&
> +					   si < last_ga->bgack_cnt) ||
> +					  si > MAX_GAP_ACK_BLKS,
> +					  "Corrupted Gap ACK: %d %d %d %d %d\n",
> +					  start, end, offset, si,
> +					  last_ga->bgack_cnt);
> +			}
> +			/* Check against the last Gap ACK block */
> +			if (in_range(seqno, start, end))
> +				continue;
> +			/* Update/release the packet peer is acking */
> +			bc_has_acked = true;
> +			if (--TIPC_SKB_CB(skb)->ackers)
> +				continue;
> +release:
>  			/* release skb */
>  			__skb_unlink(skb, &l->transmq);
>  			kfree_skb(skb);
> -			released++;
> -		} else if (less_eq(seqno, acked + gap)) {
> -			/* First, check if repeated retrans failures occurs? */
> -			if (!passed && link_retransmit_failure(l, l, &rc))
> -				return rc;
> -			passed = true;
> -
> +		} else if (less_eq(seqno, nacked + ngap)) {
> +			/* First gap: check if repeated retrans failures? */
> +			if (unlikely(seqno == acked + 1 &&
> +				     link_retransmit_failure(l, r, rc))) {
> +				/* Ignore this bc Gap ACKs if any */
> +				kfree(this_ga);
> +				this_ga = NULL;
> +				break;
> +			}
>  			/* retransmit skb if unrestricted*/
>  			if (time_before(jiffies, TIPC_SKB_CB(skb)->nxt_retr))
>  				continue;
> -			TIPC_SKB_CB(skb)->nxt_retr = TIPC_UC_RETR_TIME;
> +			TIPC_SKB_CB(skb)->nxt_retr = (is_uc) ?
> +					TIPC_UC_RETR_TIME : TIPC_BC_RETR_LIM;
>  			_skb = pskb_copy(skb, GFP_ATOMIC);
>  			if (!_skb)
>  				continue;
> @@ -1516,25 +1600,50 @@ static int tipc_link_advance_transmq(struct tipc_link *l, u16 acked, u16 gap,
>  			_skb->priority = TC_PRIO_CONTROL;
>  			__skb_queue_tail(xmitq, _skb);
>  			l->stats.retransmitted++;
> -			retransmitted = true;
> +			*retransmitted = true;
>  			/* Increase actual retrans counter & mark first time */
>  			if (!TIPC_SKB_CB(skb)->retr_cnt++)
>  				TIPC_SKB_CB(skb)->retr_stamp = jiffies;
>  		} else {
>  			/* retry with Gap ACK blocks if any */
> -			if (!ga || n >= ga->gack_cnt)
> +			if (n >= gack_cnt)
>  				break;
> -			acked = ntohs(ga->gacks[n].ack);
> -			gap = ntohs(ga->gacks[n].gap);
> +			nacked = ntohs(gacks[n].ack);
> +			ngap = ntohs(gacks[n].gap);
>  			n++;
>  			goto next_gap_ack;
>  		}
>  	}
> -	if (released || retransmitted)
> -		tipc_link_update_cwin(l, released, retransmitted);
> -	if (released)
> -		tipc_link_advance_backlog(l, xmitq);
> -	return 0;
> +
> +	/* Renew last Gap ACK blocks for bc if needed */
> +	if (bc_has_acked) {
> +		if (this_ga) {
> +			kfree(last_ga);
> +			r->last_ga = this_ga;
> +			r->last_gap = gap;
> +		} else if (last_ga) {
> +			if (less(acked, start)) {
> +				si--;
> +				offset = start - acked - 1;
> +			} else if (less(acked, end)) {
> +				acked = end;
> +			}
> +			if (si < last_ga->bgack_cnt) {
> +				last_ga->start_index = si;
> +				r->last_gap = offset;
> +			} else {
> +				kfree(last_ga);
> +				r->last_ga = NULL;
> +				r->last_gap = 0;
> +			}
> +		} else {
> +			r->last_gap = 0;
> +		}
> +		r->acked = acked;
> +	} else {
> +		kfree(this_ga);
> +	}
> +	return skb_queue_len(&l->transmq) - qlen;
>  }
>
>  /* tipc_link_build_state_msg: prepare link state message for transmission
> @@ -1651,7 +1760,8 @@ int tipc_link_rcv(struct tipc_link *l, struct sk_buff *skb,
>  			kfree_skb(skb);
>  			break;
>  		}
> -		released += tipc_link_release_pkts(l, msg_ack(hdr));
> +		released += tipc_link_advance_transmq(l, l, msg_ack(hdr), 0,
> +						      NULL, NULL, NULL, NULL);
>
>  		/* Defer delivery if sequence gap */
>  		if (unlikely(seqno != rcv_nxt)) {
> @@ -1739,7 +1849,7 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe,
>  	msg_set_probe(hdr, probe);
>  	msg_set_is_keepalive(hdr, probe || probe_reply);
>  	if (l->peer_caps & TIPC_GAP_ACK_BLOCK)
> -		glen = tipc_build_gap_ack_blks(l, data, rcvgap);
> +		glen = tipc_build_gap_ack_blks(l, hdr);
>  	tipc_mon_prep(l->net, data + glen, &dlen, mstate, l->bearer_id);
>  	msg_set_size(hdr, INT_H_SIZE + glen + dlen);
>  	skb_trim(skb, INT_H_SIZE + glen + dlen);
> @@ -2027,20 +2137,19 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb,
>  {
>  	struct tipc_msg *hdr = buf_msg(skb);
>  	struct tipc_gap_ack_blks *ga = NULL;
> -	u16 rcvgap = 0;
> -	u16 ack = msg_ack(hdr);
> -	u16 gap = msg_seq_gap(hdr);
> +	bool reply = msg_probe(hdr), retransmitted = false;
> +	u16 dlen = msg_data_sz(hdr), glen = 0;
>  	u16 peers_snd_nxt = msg_next_sent(hdr);
>  	u16 peers_tol = msg_link_tolerance(hdr);
>  	u16 peers_prio = msg_linkprio(hdr);
> +	u16 gap = msg_seq_gap(hdr);
> +	u16 ack = msg_ack(hdr);
>  	u16 rcv_nxt = l->rcv_nxt;
> -	u16 dlen = msg_data_sz(hdr);
> +	u16 rcvgap = 0;
>  	int mtyp = msg_type(hdr);
> -	bool reply = msg_probe(hdr);
> -	u16 glen = 0;
> -	void *data;
> +	int rc = 0, released;
>  	char *if_name;
> -	int rc = 0;
> +	void *data;
>
>  	trace_tipc_proto_rcv(skb, false, l->name);
>  	if (tipc_link_is_blocked(l) || !xmitq)
> @@ -2137,13 +2246,7 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb,
>  		}
>
>  		/* Receive Gap ACK blocks from peer if any */
> -		if (l->peer_caps & TIPC_GAP_ACK_BLOCK) {
> -			ga = (struct tipc_gap_ack_blks *)data;
> -			glen = ntohs(ga->len);
> -			/* sanity check: if failed, ignore Gap ACK blocks */
> -			if (glen != tipc_gap_ack_blks_sz(ga->gack_cnt))
> -				ga = NULL;
> -		}
> +		glen = tipc_get_gap_ack_blks(&ga, l, hdr, true);
>
>  		tipc_mon_rcv(l->net, data + glen, dlen - glen, l->addr,
>  			     &l->mon_state, l->bearer_id);
> @@ -2158,9 +2261,14 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb,
>  			tipc_link_build_proto_msg(l, STATE_MSG, 0, reply,
>  						  rcvgap, 0, 0, xmitq);
>
> -		rc |= tipc_link_advance_transmq(l, ack, gap, ga, xmitq);
> +		released = tipc_link_advance_transmq(l, l, ack, gap, ga, xmitq,
> +						     &retransmitted, &rc);
>  		if (gap)
>  			l->stats.recv_nacks++;
> +		if (released || retransmitted)
> +			tipc_link_update_cwin(l, released, retransmitted);
> +		if (released)
> +			tipc_link_advance_backlog(l, xmitq);
>  		if (unlikely(!skb_queue_empty(&l->wakeupq)))
>  			link_prepare_wakeup(l);
>  	}
> @@ -2246,10 +2354,7 @@ void tipc_link_bc_init_rcv(struct tipc_link *l, struct tipc_msg *hdr)
>  int tipc_link_bc_sync_rcv(struct tipc_link *l, struct tipc_msg *hdr,
>  			  struct sk_buff_head *xmitq)
>  {
> -	struct tipc_link *snd_l = l->bc_sndlink;
>  	u16 peers_snd_nxt = msg_bc_snd_nxt(hdr);
> -	u16 from = msg_bcast_ack(hdr) + 1;
> -	u16 to = from + msg_bc_gap(hdr) - 1;
>  	int rc = 0;
>
>  	if (!link_is_up(l))
> @@ -2271,8 +2376,6 @@ int tipc_link_bc_sync_rcv(struct tipc_link *l, struct tipc_msg *hdr,
>  	if (more(peers_snd_nxt, l->rcv_nxt + l->window))
>  		return rc;
>
> -	rc = tipc_link_bc_retrans(snd_l, l, from, to, xmitq);
> -
>  	l->snd_nxt = peers_snd_nxt;
>  	if (link_bc_rcv_gap(l))
>  		rc |= TIPC_LINK_SND_STATE;
> @@ -2307,38 +2410,28 @@ int tipc_link_bc_sync_rcv(struct tipc_link *l, struct tipc_msg *hdr,
>  	return 0;
>  }
>
> -void tipc_link_bc_ack_rcv(struct tipc_link *l, u16 acked,
> -			  struct sk_buff_head *xmitq)
> +int tipc_link_bc_ack_rcv(struct tipc_link *r, u16 acked, u16 gap,
> +			 struct tipc_gap_ack_blks *ga,
> +			 struct sk_buff_head *xmitq)
>  {
> -	struct sk_buff *skb, *tmp;
> -	struct tipc_link *snd_l = l->bc_sndlink;
> +	struct tipc_link *l = r->bc_sndlink;
> +	bool unused = false;
> +	int rc = 0;
>
> -	if (!link_is_up(l) || !l->bc_peer_is_up)
> -		return;
> +	if (!link_is_up(r) || !r->bc_peer_is_up)
> +		return 0;
>
> -	if (!more(acked, l->acked))
> -		return;
> +	if (less(acked, r->acked) || (acked == r->acked && !gap && !ga))
> +		return 0;
>
> -	trace_tipc_link_bc_ack(l, l->acked, acked, &snd_l->transmq);
> -	/* Skip over packets peer has already acked */
> -	skb_queue_walk(&snd_l->transmq, skb) {
> -		if (more(buf_seqno(skb), l->acked))
> -			break;
> -	}
> +	trace_tipc_link_bc_ack(r, r->acked, acked, &l->transmq);
> +	tipc_link_advance_transmq(l, r, acked, gap, ga, xmitq, &unused, &rc);
>
> -	/* Update/release the packets peer is acking now */
> -	skb_queue_walk_from_safe(&snd_l->transmq, skb, tmp) {
> -		if (more(buf_seqno(skb), acked))
> -			break;
> -		if (!--TIPC_SKB_CB(skb)->ackers) {
> -			__skb_unlink(skb, &snd_l->transmq);
> -			kfree_skb(skb);
> -		}
> -	}
> -	l->acked = acked;
> -	tipc_link_advance_backlog(snd_l, xmitq);
> -	if (unlikely(!skb_queue_empty(&snd_l->wakeupq)))
> -		link_prepare_wakeup(snd_l);
> +	tipc_link_advance_backlog(l, xmitq);
> +	if (unlikely(!skb_queue_empty(&l->wakeupq)))
> +		link_prepare_wakeup(l);
> +
> +	return rc;
>  }
>
>  /* tipc_link_bc_nack_rcv(): receive broadcast nack message
> @@ -2366,8 +2459,7 @@ int tipc_link_bc_nack_rcv(struct tipc_link *l, struct sk_buff *skb,
>  		return 0;
>
>  	if (dnode == tipc_own_addr(l->net)) {
> -		tipc_link_bc_ack_rcv(l, acked, xmitq);
> -		rc = tipc_link_bc_retrans(l->bc_sndlink, l, from, to, xmitq);
> +		rc = tipc_link_bc_ack_rcv(l, acked, to - acked, NULL, xmitq);
>  		l->stats.recv_nacks++;
>  		return rc;
>  	}
> diff --git a/net/tipc/link.h b/net/tipc/link.h
> index d3c1c3fc1659..0a0fa7350722 100644
> --- a/net/tipc/link.h
> +++ b/net/tipc/link.h
> @@ -143,8 +143,11 @@ int tipc_link_bc_peers(struct tipc_link *l);
>  void tipc_link_set_mtu(struct tipc_link *l, int mtu);
>  int tipc_link_mtu(struct tipc_link *l);
>  int tipc_link_mss(struct tipc_link *l);
> -void tipc_link_bc_ack_rcv(struct tipc_link *l, u16 acked,
> -			  struct sk_buff_head *xmitq);
> +u16 tipc_get_gap_ack_blks(struct tipc_gap_ack_blks **ga, struct tipc_link *l,
> +			  struct tipc_msg *hdr, bool uc);
> +int tipc_link_bc_ack_rcv(struct tipc_link *l, u16 acked, u16 gap,
> +			 struct tipc_gap_ack_blks *ga,
> +			 struct sk_buff_head *xmitq);
>  void tipc_link_build_bc_sync_msg(struct tipc_link *l,
>  				 struct sk_buff_head *xmitq);
>  void tipc_link_bc_init_rcv(struct tipc_link *l, struct tipc_msg *hdr);
> diff --git a/net/tipc/msg.h b/net/tipc/msg.h
> index 6d466ebdb64f..9a38f9c9d6eb 100644
> --- a/net/tipc/msg.h
> +++ b/net/tipc/msg.h
> @@ -160,20 +160,26 @@ struct tipc_gap_ack {
>
>  /* struct tipc_gap_ack_blks
>   * @len: actual length of the record
> - * @gack_cnt: number of Gap ACK blocks in the record
> + * @bgack_cnt: number of Gap ACK blocks for broadcast in the record
> + * @ugack_cnt: number of Gap ACK blocks for unicast (following the broadcast
> + *             ones)
> + * @start_index: starting index for "valid" broadcast Gap ACK blocks
>   * @gacks: array of Gap ACK blocks
>   */
>  struct tipc_gap_ack_blks {
>  	__be16 len;
> -	u8 gack_cnt;
> -	u8 reserved;
> +	union {
> +		u8 ugack_cnt;
> +		u8 start_index;
> +	};
> +	u8 bgack_cnt;
>  	struct tipc_gap_ack gacks[];
>  };
>
>  #define tipc_gap_ack_blks_sz(n) (sizeof(struct tipc_gap_ack_blks) + \
>  				 sizeof(struct tipc_gap_ack) * (n))
>
> -#define MAX_GAP_ACK_BLKS	32
> +#define MAX_GAP_ACK_BLKS	128
>  #define MAX_GAP_ACK_BLKS_SZ	tipc_gap_ack_blks_sz(MAX_GAP_ACK_BLKS)
>
>  static inline struct tipc_msg *buf_msg(struct sk_buff *skb)
> diff --git a/net/tipc/node.c b/net/tipc/node.c
> index 0c88778c88b5..eb6b62de81a7 100644
> --- a/net/tipc/node.c
> +++ b/net/tipc/node.c
> @@ -2069,10 +2069,16 @@ void tipc_rcv(struct net *net, struct sk_buff *skb, struct tipc_bearer *b)
>  	le = &n->links[bearer_id];
>
>  	/* Ensure broadcast reception is in synch with peer's send state */
> -	if (unlikely(usr == LINK_PROTOCOL))
> +	if (unlikely(usr == LINK_PROTOCOL)) {
> +		if (unlikely(skb_linearize(skb))) {
> +			tipc_node_put(n);
> +			goto discard;
> +		}
> +		hdr = buf_msg(skb);
>  		tipc_node_bc_sync_rcv(n, hdr, bearer_id, &xmitq);
> -	else if (unlikely(tipc_link_acked(n->bc_entry.link) != bc_ack))
> +	} else if (unlikely(tipc_link_acked(n->bc_entry.link) != bc_ack)) {
>  		tipc_bcast_ack_rcv(net, n->bc_entry.link, hdr);
> +	}
>
>  	/* Receive packet directly if conditions permit */
>  	tipc_node_read_lock(n);

Nice job!

Acked-by: Jon Maloy <jm...@re...>
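
For readers following the record layout change in msg.h and the unicast path of the transmq walk quoted above, here is a minimal userspace sketch. It is not the kernel code: the names gap_ack, gap_ack_blks, seq_less_eq and classify are hypothetical, uint16_t in host byte order stands in for __be16 (so there is no ntohs()), and the 16-bit wrap-around comparison is an assumption about how less_eq() behaves. It only shows how an (acked, gap) pair followed by Gap ACK blocks splits sequence numbers into "release", "retransmit" and "keep".

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the structs in the patch (host byte order). */
struct gap_ack {
	uint16_t ack;	/* last seqno acked by this block */
	uint16_t gap;	/* number of missing packets after 'ack' */
};

struct gap_ack_blks {
	uint16_t len;
	union {
		uint8_t ugack_cnt;	/* on the wire: unicast block count */
		uint8_t start_index;	/* on a kept copy: first valid bc block */
	};
	uint8_t bgack_cnt;		/* broadcast block count */
	struct gap_ack gacks[];		/* bc blocks first, then uc blocks */
};

#define gap_ack_blks_sz(n) (sizeof(struct gap_ack_blks) + \
			    sizeof(struct gap_ack) * (n))

/* Assumed wrap-around "left <= right" on 16-bit sequence numbers. */
static int seq_less_eq(uint16_t left, uint16_t right)
{
	return (uint16_t)(right - left) < 32768u;
}

enum pkt_fate { PKT_RELEASE, PKT_RETRANSMIT, PKT_KEEP };

/* Decide the fate of one queued packet, unicast case only. */
static enum pkt_fate classify(uint16_t seqno, uint16_t acked, uint16_t gap,
			      const struct gap_ack *gacks, unsigned int cnt)
{
	uint16_t nacked = acked, ngap = gap;
	unsigned int n = 0;

	for (;;) {
		if (seq_less_eq(seqno, nacked))
			return PKT_RELEASE;	/* covered by an ack */
		if (seq_less_eq(seqno, (uint16_t)(nacked + ngap)))
			return PKT_RETRANSMIT;	/* falls inside a gap */
		if (n >= cnt)
			return PKT_KEEP;	/* beyond all blocks */
		nacked = gacks[n].ack;		/* retry with next block */
		ngap = gacks[n].gap;
		n++;
	}
}
```

With acked = 10, gap = 2 and the blocks {15, 1} and {20, 0}, packets 11, 12 and 16 are retransmit candidates, packets up to 10 and 13..15 and 17..20 are released, and 21 onward stays queued. The broadcast path in the patch additionally tracks per-packet ackers and the previously received blocks (last_ga), which this sketch leaves out.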