From: Hoang Le <hoa...@de...> - 2019-04-03 06:05:28
An skb was somehow dequeued from the inputq before processing, which
causes a NULL pointer dereference and a kernel crash.

Check that the skb is valid before using it.

Fixes: c55c8edafa9 ("tipc: smooth change between replicast and broadcast")
Reported-by: Tuong Lien Tong <tuo...@de...>
Acked-by: Ying Xue <yin...@wi...>
Signed-off-by: Hoang Le <hoa...@de...>
---
 net/tipc/bcast.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 76e14dc08bb9..6c997d4a6218 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -769,6 +769,9 @@ void tipc_mcast_filter_msg(struct net *net, struct sk_buff_head *defq,
 	u32 node, port;
 
 	skb = skb_peek(inputq);
+	if (!skb)
+		return;
+
 	hdr = buf_msg(skb);
 
 	if (likely(!msg_is_syn(hdr) && skb_queue_empty(defq)))
-- 
2.17.1
From: Ying X. <yin...@wi...> - 2019-04-03 05:31:46
On 4/2/19 8:40 PM, Hoang Le wrote:
> An skb was somehow dequeued from the inputq before processing, which
> causes a NULL pointer dereference and a kernel crash.
>
> Check that the skb is valid before using it.
>
> Fixes: c55c8edafa9 ("tipc: smooth change between replicast and broadcast")
> Reported-by: Tuong Lien Tong <tuo...@de...>
> Signed-off-by: Hoang Le <hoa...@de...>

Acked-by: Ying Xue <yin...@wi...>

> ---
>  net/tipc/bcast.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
> index 76e14dc08bb9..6c997d4a6218 100644
> --- a/net/tipc/bcast.c
> +++ b/net/tipc/bcast.c
> @@ -769,6 +769,9 @@ void tipc_mcast_filter_msg(struct net *net, struct sk_buff_head *defq,
>  	u32 node, port;
>
>  	skb = skb_peek(inputq);
> +	if (!skb)
> +		return;
> +
>  	hdr = buf_msg(skb);
>
>  	if (likely(!msg_is_syn(hdr) && skb_queue_empty(defq)))
From: Hoang Le <hoa...@de...> - 2019-04-02 12:40:51
An skb was somehow dequeued from the inputq before processing, which
causes a NULL pointer dereference and a kernel crash.

Check that the skb is valid before using it.

Fixes: c55c8edafa9 ("tipc: smooth change between replicast and broadcast")
Reported-by: Tuong Lien Tong <tuo...@de...>
Signed-off-by: Hoang Le <hoa...@de...>
---
 net/tipc/bcast.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 76e14dc08bb9..6c997d4a6218 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -769,6 +769,9 @@ void tipc_mcast_filter_msg(struct net *net, struct sk_buff_head *defq,
 	u32 node, port;
 
 	skb = skb_peek(inputq);
+	if (!skb)
+		return;
+
 	hdr = buf_msg(skb);
 
 	if (likely(!msg_is_syn(hdr) && skb_queue_empty(defq)))
-- 
2.17.1
From: Jon M. <jon...@er...> - 2019-04-02 02:11:03
Hi Tung,
I am afraid we will have to define a new socket option for this.
Although the risk probably is very low, we don't want to break any
possible users who were designed around the assumption that this
option returns messages, not bytes.

BR
///jon

> -----Original Message-----
> From: Tung Nguyen <tun...@de...>
> Sent: 1-Apr-19 04:47
> To: tip...@li...; Jon Maloy <jon...@er...>; ma...@do...; yin...@wi...
> Subject: [tipc-discussion][PATCH v1 1/1] tipc: return allocated buffer when using TIPC_SOCK_RECVQ_DEPTH
>
> When using TIPC_SOCK_RECVQ_DEPTH with getsockopt(), it returns the
> number of buffer elements in the receive socket buffer, which is not
> very helpful for user applications.
>
> By returning the current number of allocated bytes of the receive
> socket buffer instead, applications can dimension their buffer usage
> to avoid buffer overload issues.
>
> Signed-off-by: Tung Nguyen <tun...@de...>
> ---
>  net/tipc/socket.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/tipc/socket.c b/net/tipc/socket.c
> index b542f14ed444..43a60a808493 100644
> --- a/net/tipc/socket.c
> +++ b/net/tipc/socket.c
> @@ -3062,7 +3062,7 @@ static int tipc_getsockopt(struct socket *sock, int lvl, int opt,
>  		value = 0; /* was tipc_queue_size, now obsolete */
>  		break;
>  	case TIPC_SOCK_RECVQ_DEPTH:
> -		value = skb_queue_len(&sk->sk_receive_queue);
> +		value = sk_rmem_alloc_get(sk);
>  		break;
>  	case TIPC_GROUP_JOIN:
>  		seq.type = 0;
> --
> 2.17.1
From: Tung N. <tun...@de...> - 2019-04-01 08:47:44
Return the number of allocated bytes of the receive socket buffer when
using getsockopt() with option TIPC_SOCK_RECVQ_DEPTH.

Tung Nguyen (1):
  tipc: return allocated buffer when using TIPC_SOCK_RECVQ_DEPTH

 net/tipc/socket.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
2.17.1
From: Tung N. <tun...@de...> - 2019-04-01 08:47:44
When using TIPC_SOCK_RECVQ_DEPTH with getsockopt(), it returns the
number of buffer elements in the receive socket buffer, which is not
very helpful for user applications.

By returning the current number of allocated bytes of the receive
socket buffer instead, applications can dimension their buffer usage
to avoid buffer overload issues.

Signed-off-by: Tung Nguyen <tun...@de...>
---
 net/tipc/socket.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index b542f14ed444..43a60a808493 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -3062,7 +3062,7 @@ static int tipc_getsockopt(struct socket *sock, int lvl, int opt,
 		value = 0; /* was tipc_queue_size, now obsolete */
 		break;
 	case TIPC_SOCK_RECVQ_DEPTH:
-		value = skb_queue_len(&sk->sk_receive_queue);
+		value = sk_rmem_alloc_get(sk);
 		break;
 	case TIPC_GROUP_JOIN:
 		seq.type = 0;
-- 
2.17.1
From: David M. <da...@da...> - 2019-03-31 23:46:32
From: Jon Maloy <jon...@er...>
Date: Sun, 31 Mar 2019 22:27:19 +0000

> All three acked by me.

Series applied.

Jon, please provide a formal, full:

	Acked-by: David S. Miller <da...@da...>

next time. Thank you.
From: Jon M. <jon...@er...> - 2019-03-31 22:27:33
All three acked by me.

///jon

> -----Original Message-----
> From: Xin Long <luc...@gm...>
> Sent: 31-Mar-19 10:50
> To: network dev <ne...@vg...>
> Cc: da...@da...; Jon Maloy <jon...@er...>; Ying Xue <yin...@wi...>; tip...@li...; syz...@go...
> Subject: [PATCH net 0/3] tipc: a batch of uninit-value fixes for netlink_compat
>
> These issues were all reported by syzbot, and have existed since the
> very beginning. See the details in each patch.
>
> Xin Long (3):
>   tipc: check bearer name with right length in tipc_nl_compat_bearer_enable
>   tipc: check link name with right length in tipc_nl_compat_link_set
>   tipc: handle the err returned from cmd header function
>
>  net/tipc/netlink_compat.c | 24 ++++++++++++++++++++----
>  1 file changed, 20 insertions(+), 4 deletions(-)
>
> --
> 2.1.0
From: Xin L. <luc...@gm...> - 2019-03-31 14:50:51
Syzbot found a crash:

  BUG: KMSAN: uninit-value in tipc_nl_compat_name_table_dump+0x54f/0xcd0 net/tipc/netlink_compat.c:872
  Call Trace:
   tipc_nl_compat_name_table_dump+0x54f/0xcd0 net/tipc/netlink_compat.c:872
   __tipc_nl_compat_dumpit+0x59e/0xda0 net/tipc/netlink_compat.c:215
   tipc_nl_compat_dumpit+0x63a/0x820 net/tipc/netlink_compat.c:280
   tipc_nl_compat_handle net/tipc/netlink_compat.c:1226 [inline]
   tipc_nl_compat_recv+0x1b5f/0x2750 net/tipc/netlink_compat.c:1265
   genl_family_rcv_msg net/netlink/genetlink.c:601 [inline]
   genl_rcv_msg+0x185f/0x1a60 net/netlink/genetlink.c:626
   netlink_rcv_skb+0x431/0x620 net/netlink/af_netlink.c:2477
   genl_rcv+0x63/0x80 net/netlink/genetlink.c:637
   netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
   netlink_unicast+0xf3e/0x1020 net/netlink/af_netlink.c:1336
   netlink_sendmsg+0x127f/0x1300 net/netlink/af_netlink.c:1917
   sock_sendmsg_nosec net/socket.c:622 [inline]
   sock_sendmsg net/socket.c:632 [inline]

  Uninit was created at:
   __alloc_skb+0x309/0xa20 net/core/skbuff.c:208
   alloc_skb include/linux/skbuff.h:1012 [inline]
   netlink_alloc_large_skb net/netlink/af_netlink.c:1182 [inline]
   netlink_sendmsg+0xb82/0x1300 net/netlink/af_netlink.c:1892
   sock_sendmsg_nosec net/socket.c:622 [inline]
   sock_sendmsg net/socket.c:632 [inline]

This was supposed to be fixed by commit 974cb0e3e7c9 ("tipc: fix
uninit-value in tipc_nl_compat_name_table_dump") by checking
TLV_GET_DATA_LEN(msg->req) in cmd->header(), i.e. in
tipc_nl_compat_name_table_dump_header(), which is called ahead of
tipc_nl_compat_name_table_dump().

However, tipc_nl_compat_dumpit() doesn't handle the error returned
from the cmd header function. This means that even when the check
added in that fix fails, it won't stop calling
tipc_nl_compat_name_table_dump(), and the issue will be triggered
again.

This patch adds handling of the error returned from the cmd header
function in tipc_nl_compat_dumpit().

Reported-by: syz...@sy...
Signed-off-by: Xin Long <luc...@gm...>
---
 net/tipc/netlink_compat.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/tipc/netlink_compat.c b/net/tipc/netlink_compat.c
index 0bfd03d6..340a6e7 100644
--- a/net/tipc/netlink_compat.c
+++ b/net/tipc/netlink_compat.c
@@ -267,8 +267,14 @@ static int tipc_nl_compat_dumpit(struct tipc_nl_compat_cmd_dump *cmd,
 	if (msg->rep_type)
 		tipc_tlv_init(msg->rep, msg->rep_type);
 
-	if (cmd->header)
-		(*cmd->header)(msg);
+	if (cmd->header) {
+		err = (*cmd->header)(msg);
+		if (err) {
+			kfree_skb(msg->rep);
+			msg->rep = NULL;
+			return err;
+		}
+	}
 
 	arg = nlmsg_new(0, GFP_KERNEL);
 	if (!arg) {
-- 
2.1.0
From: Xin L. <luc...@gm...> - 2019-03-31 14:50:42
A similar issue to the one fixed by the patch "tipc: check bearer name
with right length in tipc_nl_compat_bearer_enable" was also found by
syzbot in tipc_nl_compat_link_set().

The length to check with should be 'TLV_GET_DATA_LEN(msg->req) -
offsetof(struct tipc_link_config, name)'.

Reported-by: syz...@sy...
Signed-off-by: Xin Long <luc...@gm...>
---
 net/tipc/netlink_compat.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/tipc/netlink_compat.c b/net/tipc/netlink_compat.c
index 5f8e53c..0bfd03d6 100644
--- a/net/tipc/netlink_compat.c
+++ b/net/tipc/netlink_compat.c
@@ -771,7 +771,12 @@ static int tipc_nl_compat_link_set(struct tipc_nl_compat_cmd_doit *cmd,
 
 	lc = (struct tipc_link_config *)TLV_DATA(msg->req);
 
-	len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_LINK_NAME);
+	len = TLV_GET_DATA_LEN(msg->req);
+	len -= offsetof(struct tipc_link_config, name);
+	if (len <= 0)
+		return -EINVAL;
+
+	len = min_t(int, len, TIPC_MAX_LINK_NAME);
 	if (!string_is_valid(lc->name, len))
 		return -EINVAL;
 
-- 
2.1.0
From: Xin L. <luc...@gm...> - 2019-03-31 14:50:35
Syzbot reported the following crash:

  BUG: KMSAN: uninit-value in memchr+0xce/0x110 lib/string.c:961
   memchr+0xce/0x110 lib/string.c:961
   string_is_valid net/tipc/netlink_compat.c:176 [inline]
   tipc_nl_compat_bearer_enable+0x2c4/0x910 net/tipc/netlink_compat.c:401
   __tipc_nl_compat_doit net/tipc/netlink_compat.c:321 [inline]
   tipc_nl_compat_doit+0x3aa/0xaf0 net/tipc/netlink_compat.c:354
   tipc_nl_compat_handle net/tipc/netlink_compat.c:1162 [inline]
   tipc_nl_compat_recv+0x1ae7/0x2750 net/tipc/netlink_compat.c:1265
   genl_family_rcv_msg net/netlink/genetlink.c:601 [inline]
   genl_rcv_msg+0x185f/0x1a60 net/netlink/genetlink.c:626
   netlink_rcv_skb+0x431/0x620 net/netlink/af_netlink.c:2477
   genl_rcv+0x63/0x80 net/netlink/genetlink.c:637
   netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
   netlink_unicast+0xf3e/0x1020 net/netlink/af_netlink.c:1336
   netlink_sendmsg+0x127f/0x1300 net/netlink/af_netlink.c:1917
   sock_sendmsg_nosec net/socket.c:622 [inline]
   sock_sendmsg net/socket.c:632 [inline]

  Uninit was created at:
   __alloc_skb+0x309/0xa20 net/core/skbuff.c:208
   alloc_skb include/linux/skbuff.h:1012 [inline]
   netlink_alloc_large_skb net/netlink/af_netlink.c:1182 [inline]
   netlink_sendmsg+0xb82/0x1300 net/netlink/af_netlink.c:1892
   sock_sendmsg_nosec net/socket.c:622 [inline]
   sock_sendmsg net/socket.c:632 [inline]

It was triggered when the bearer name size was less than
TIPC_MAX_BEARER_NAME: the name was checked with a wrong length,
TLV_GET_DATA_LEN(msg->req), which also includes the lengths of the
priority and disc_domain fields.

This patch fixes it by checking the name with the right length:
'TLV_GET_DATA_LEN(msg->req) - offsetof(struct tipc_bearer_config, name)'.

Reported-by: syz...@sy...
Signed-off-by: Xin Long <luc...@gm...>
---
 net/tipc/netlink_compat.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/tipc/netlink_compat.c b/net/tipc/netlink_compat.c
index 4ad3586..5f8e53c 100644
--- a/net/tipc/netlink_compat.c
+++ b/net/tipc/netlink_compat.c
@@ -397,7 +397,12 @@ static int tipc_nl_compat_bearer_enable(struct tipc_nl_compat_cmd_doit *cmd,
 	if (!bearer)
 		return -EMSGSIZE;
 
-	len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_BEARER_NAME);
+	len = TLV_GET_DATA_LEN(msg->req);
+	len -= offsetof(struct tipc_bearer_config, name);
+	if (len <= 0)
+		return -EINVAL;
+
+	len = min_t(int, len, TIPC_MAX_BEARER_NAME);
 	if (!string_is_valid(b->name, len))
 		return -EINVAL;
 
-- 
2.1.0
From: Xin L. <luc...@gm...> - 2019-03-31 14:50:26
These issues were all reported by syzbot, and have existed since the
very beginning. See the details in each patch.

Xin Long (3):
  tipc: check bearer name with right length in tipc_nl_compat_bearer_enable
  tipc: check link name with right length in tipc_nl_compat_link_set
  tipc: handle the err returned from cmd header function

 net/tipc/netlink_compat.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

-- 
2.1.0
From: Xin L. <luc...@gm...> - 2019-03-30 16:11:33
On Thu, Mar 28, 2019 at 1:54 AM syzbot <syz...@sy...> wrote:
>
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:    c530a275 kmsan: call vmap hooks from vmalloc and ioremap f..
> git tree:       kmsan
> console output: https://syzkaller.appspot.com/x/log.txt?x=13bd4733200000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=a5675814e8eae69e
> dashboard link: https://syzkaller.appspot.com/bug?extid=de00a87b8644a582ae79
> compiler:       clang version 8.0.0 (trunk 350509)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12c534d7200000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10f9e733200000
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syz...@sy...
>
> ==================================================================
> BUG: KMSAN: uninit-value in memchr+0xce/0x110 lib/string.c:961
> CPU: 1 PID: 10538 Comm: syz-executor101 Not tainted 5.0.0+ #12
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x173/0x1d0 lib/dump_stack.c:113
>  kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:600
>  __msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:313
>  memchr+0xce/0x110 lib/string.c:961
>  string_is_valid net/tipc/netlink_compat.c:176 [inline]
>  tipc_nl_compat_link_set+0x121/0x1550 net/tipc/netlink_compat.c:770

@@ -766,8 +775,10 @@ static int tipc_nl_compat_link_set(struct tipc_nl_compat_cmd_doit *cmd,

 	lc = (struct tipc_link_config *)TLV_DATA(msg->req);

-	len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_LINK_NAME);
-	if (!string_is_valid(lc->name, len))
+	len = TLV_GET_DATA_LEN(msg->req) -
+	      offsetof(struct tipc_link_config, name);
+	if (len <= 0 ||
+	    !string_is_valid(lc->name, min_t(int, len, TIPC_MAX_BEARER_NAME)))
 		return -EINVAL;

>  __tipc_nl_compat_doit net/tipc/netlink_compat.c:321 [inline]
>  tipc_nl_compat_doit+0x3aa/0xaf0 net/tipc/netlink_compat.c:354
>  tipc_nl_compat_handle net/tipc/netlink_compat.c:1162 [inline]
>  tipc_nl_compat_recv+0x1ae7/0x2750 net/tipc/netlink_compat.c:1265
>  genl_family_rcv_msg net/netlink/genetlink.c:601 [inline]
>  genl_rcv_msg+0x185f/0x1a60 net/netlink/genetlink.c:626
>  netlink_rcv_skb+0x431/0x620 net/netlink/af_netlink.c:2477
>  genl_rcv+0x63/0x80 net/netlink/genetlink.c:637
>  netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
>  netlink_unicast+0xf3e/0x1020 net/netlink/af_netlink.c:1336
>  netlink_sendmsg+0x127f/0x1300 net/netlink/af_netlink.c:1917
>  sock_sendmsg_nosec net/socket.c:622 [inline]
>  sock_sendmsg net/socket.c:632 [inline]
>  ___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2115
>  __sys_sendmsg net/socket.c:2153 [inline]
>  __do_sys_sendmsg net/socket.c:2162 [inline]
>  __se_sys_sendmsg+0x305/0x460 net/socket.c:2160
>  __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2160
>  do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
>  entry_SYSCALL_64_after_hwframe+0x63/0xe7
> RIP: 0033:0x440259
> Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:00007ffcb435a248 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440259
> RDX: 0000000000000000 RSI: 00000000200000c0 RDI: 0000000000000003
> RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000401ae0
> R13: 0000000000401b70 R14: 0000000000000000 R15: 0000000000000000
>
> Uninit was created at:
>  kmsan_save_stack_with_flags mm/kmsan/kmsan.c:205 [inline]
>  kmsan_internal_poison_shadow+0x92/0x150 mm/kmsan/kmsan.c:159
>  kmsan_kmalloc+0xa6/0x130 mm/kmsan/kmsan_hooks.c:176
>  kmsan_slab_alloc+0xe/0x10 mm/kmsan/kmsan_hooks.c:185
>  slab_post_alloc_hook mm/slab.h:445 [inline]
>  slab_alloc_node mm/slub.c:2773 [inline]
>  __kmalloc_node_track_caller+0xe9e/0xff0 mm/slub.c:4398
>  __kmalloc_reserve net/core/skbuff.c:140 [inline]
>  __alloc_skb+0x309/0xa20 net/core/skbuff.c:208
>  alloc_skb include/linux/skbuff.h:1012 [inline]
>  netlink_alloc_large_skb net/netlink/af_netlink.c:1182 [inline]
>  netlink_sendmsg+0xb82/0x1300 net/netlink/af_netlink.c:1892
>  sock_sendmsg_nosec net/socket.c:622 [inline]
>  sock_sendmsg net/socket.c:632 [inline]
>  ___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2115
>  __sys_sendmsg net/socket.c:2153 [inline]
>  __do_sys_sendmsg net/socket.c:2162 [inline]
>  __se_sys_sendmsg+0x305/0x460 net/socket.c:2160
>  __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2160
>  do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
>  entry_SYSCALL_64_after_hwframe+0x63/0xe7
> ==================================================================
>
>
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syz...@go....
>
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> syzbot can test patches for this bug, for details see:
> https://goo.gl/tpsmEJ#testing-patches
From: Xin L. <luc...@gm...> - 2019-03-30 15:35:31
On Fri, Mar 29, 2019 at 12:26 AM syzbot <syz...@sy...> wrote:
>
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:    a695dc5e kmsan: fixup mm/sl[au]b.[ch] after rebase
> git tree:       kmsan
> console output: https://syzkaller.appspot.com/x/log.txt?x=1683e04d200000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=a5675814e8eae69e
> dashboard link: https://syzkaller.appspot.com/bug?extid=3ce8520484b0d4e260a5
> compiler:       clang version 8.0.0 (trunk 350509)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=15bdf95f200000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11b5a4cf200000
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syz...@sy...
>
> sshd (11234) used greatest stack depth: 54160 bytes left
> ==================================================================
> BUG: KMSAN: uninit-value in __arch_swab32 arch/x86/include/uapi/asm/swab.h:10 [inline]
> BUG: KMSAN: uninit-value in __fswab32 include/uapi/linux/swab.h:59 [inline]
> BUG: KMSAN: uninit-value in tipc_nl_compat_name_table_dump+0x54f/0xcd0 net/tipc/netlink_compat.c:872
> CPU: 1 PID: 11248 Comm: syz-executor646 Not tainted 5.0.0+ #11
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x173/0x1d0 lib/dump_stack.c:113
>  kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:600
>  __msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:313
>  __arch_swab32 arch/x86/include/uapi/asm/swab.h:10 [inline]
>  __fswab32 include/uapi/linux/swab.h:59 [inline]
>  tipc_nl_compat_name_table_dump+0x54f/0xcd0 net/tipc/netlink_compat.c:872

commit 974cb0e3e7c963ced06c4e32c5b2884173fa5e01
Author: Ying Xue <yin...@wi...>
Date:   Mon Jan 14 17:22:28 2019 +0800

    tipc: fix uninit-value in tipc_nl_compat_name_table_dump

This patch tried to fix it in tipc_nl_compat_name_table_dump_header().
But it forgot to handle the err returned from cmd->header():

diff --git a/net/tipc/netlink_compat.c b/net/tipc/netlink_compat.c
index 4ad3586..25bc39b 100644
--- a/net/tipc/netlink_compat.c
+++ b/net/tipc/netlink_compat.c
@@ -267,8 +267,14 @@ static int tipc_nl_compat_dumpit(struct tipc_nl_compat_cmd_dump *cmd,
 	if (msg->rep_type)
 		tipc_tlv_init(msg->rep, msg->rep_type);
 
-	if (cmd->header)
-		(*cmd->header)(msg);
+	if (cmd->header) {
+		err = (*cmd->header)(msg);
+		if (err) {
+			kfree_skb(msg->rep);
+			msg->rep = NULL;
+			return err;
+		}
+	}

>  __tipc_nl_compat_dumpit+0x59e/0xda0 net/tipc/netlink_compat.c:215
>  tipc_nl_compat_dumpit+0x63a/0x820 net/tipc/netlink_compat.c:280
>  tipc_nl_compat_handle net/tipc/netlink_compat.c:1226 [inline]
>  tipc_nl_compat_recv+0x1b5f/0x2750 net/tipc/netlink_compat.c:1265
>  genl_family_rcv_msg net/netlink/genetlink.c:601 [inline]
>  genl_rcv_msg+0x185f/0x1a60 net/netlink/genetlink.c:626
>  netlink_rcv_skb+0x431/0x620 net/netlink/af_netlink.c:2477
>  genl_rcv+0x63/0x80 net/netlink/genetlink.c:637
>  netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
>  netlink_unicast+0xf3e/0x1020 net/netlink/af_netlink.c:1336
>  netlink_sendmsg+0x127f/0x1300 net/netlink/af_netlink.c:1917
>  sock_sendmsg_nosec net/socket.c:622 [inline]
>  sock_sendmsg net/socket.c:632 [inline]
>  ___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2115
>  __sys_sendmsg net/socket.c:2153 [inline]
>  __do_sys_sendmsg net/socket.c:2162 [inline]
>  __se_sys_sendmsg+0x305/0x460 net/socket.c:2160
>  __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2160
>  do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
>  entry_SYSCALL_64_after_hwframe+0x63/0xe7
> RIP: 0033:0x444069
> Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 1b d8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:00007ffda17b3718 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: 00000000004002e0 RCX: 0000000000444069
> RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000003
> RBP: 00000000006ce018 R08: 0000000000000000 R09: 00000000004002e0
> R10: 0000000000001900 R11: 0000000000000246 R12: 0000000000401d10
> R13: 0000000000401da0 R14: 0000000000000000 R15: 0000000000000000
>
> Uninit was created at:
>  kmsan_save_stack_with_flags mm/kmsan/kmsan.c:205 [inline]
>  kmsan_internal_poison_shadow+0x92/0x150 mm/kmsan/kmsan.c:159
>  kmsan_kmalloc+0xa6/0x130 mm/kmsan/kmsan_hooks.c:176
>  kmsan_slab_alloc+0xe/0x10 mm/kmsan/kmsan_hooks.c:185
>  slab_post_alloc_hook mm/slab.h:445 [inline]
>  slab_alloc_node mm/slub.c:2773 [inline]
>  __kmalloc_node_track_caller+0xe9e/0xff0 mm/slub.c:4398
>  __kmalloc_reserve net/core/skbuff.c:140 [inline]
>  __alloc_skb+0x309/0xa20 net/core/skbuff.c:208
>  alloc_skb include/linux/skbuff.h:1012 [inline]
>  netlink_alloc_large_skb net/netlink/af_netlink.c:1182 [inline]
>  netlink_sendmsg+0xb82/0x1300 net/netlink/af_netlink.c:1892
>  sock_sendmsg_nosec net/socket.c:622 [inline]
>  sock_sendmsg net/socket.c:632 [inline]
>  ___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2115
>  __sys_sendmsg net/socket.c:2153 [inline]
>  __do_sys_sendmsg net/socket.c:2162 [inline]
>  __se_sys_sendmsg+0x305/0x460 net/socket.c:2160
>  __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2160
>  do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
>  entry_SYSCALL_64_after_hwframe+0x63/0xe7
> ==================================================================
>
>
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syz...@go....
>
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> syzbot can test patches for this bug, for details see:
> https://goo.gl/tpsmEJ#testing-patches
From: Xin L. <luc...@gm...> - 2019-03-30 14:18:51
On Thu, Mar 28, 2019 at 1:55 AM syzbot <syz...@sy...> wrote:
>
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:    9536b452 kmsan: uaccess.h: fix variable name conflicts
> git tree:       kmsan
> console output: https://syzkaller.appspot.com/x/log.txt?x=15f5d583200000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=a5675814e8eae69e
> dashboard link: https://syzkaller.appspot.com/bug?extid=8b707430713eb46e1e45
> compiler:       clang version 8.0.0 (trunk 350509)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16438e1b200000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16898cd7200000
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syz...@sy...
>
> ==================================================================
> BUG: KMSAN: uninit-value in memchr+0xce/0x110 lib/string.c:961
> CPU: 0 PID: 10526 Comm: syz-executor961 Not tainted 5.0.0+ #13
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x173/0x1d0 lib/dump_stack.c:113
>  kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:600
>  __msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:313
>  memchr+0xce/0x110 lib/string.c:961
>  string_is_valid net/tipc/netlink_compat.c:176 [inline]
>  tipc_nl_compat_bearer_enable+0x2c4/0x910 net/tipc/netlink_compat.c:401

It's using the wrong length to check whether b->name is valid when the
b->name size is less than TIPC_MAX_BEARER_NAME. The right count should
start from tipc_bearer_config->name:

diff --git a/net/tipc/netlink_compat.c b/net/tipc/netlink_compat.c
index 4ad3586..895f368 100644
--- a/net/tipc/netlink_compat.c
+++ b/net/tipc/netlink_compat.c
@@ -397,7 +397,9 @@ static int tipc_nl_compat_bearer_enable(struct tipc_nl_compat_cmd_doit *cmd,
 	if (!bearer)
 		return -EMSGSIZE;
 
-	len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_BEARER_NAME);
+	len = TLV_GET_DATA_LEN(msg->req);
+	len -= offsetof(struct tipc_bearer_config, name);
+	len = min_t(int, len, TIPC_MAX_BEARER_NAME);
 	if (!string_is_valid(b->name, len))
 		return -EINVAL;

A similar thing should be done in:

  tipc_nl_compat_media_set()
  tipc_nl_compat_bearer_set()
  tipc_nl_compat_link_set()

>  __tipc_nl_compat_doit net/tipc/netlink_compat.c:321 [inline]
>  tipc_nl_compat_doit+0x3aa/0xaf0 net/tipc/netlink_compat.c:354
>  tipc_nl_compat_handle net/tipc/netlink_compat.c:1162 [inline]
>  tipc_nl_compat_recv+0x1ae7/0x2750 net/tipc/netlink_compat.c:1265
>  genl_family_rcv_msg net/netlink/genetlink.c:601 [inline]
>  genl_rcv_msg+0x185f/0x1a60 net/netlink/genetlink.c:626
>  netlink_rcv_skb+0x431/0x620 net/netlink/af_netlink.c:2477
>  genl_rcv+0x63/0x80 net/netlink/genetlink.c:637
>  netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
>  netlink_unicast+0xf3e/0x1020 net/netlink/af_netlink.c:1336
>  netlink_sendmsg+0x127f/0x1300 net/netlink/af_netlink.c:1917
>  sock_sendmsg_nosec net/socket.c:622 [inline]
>  sock_sendmsg net/socket.c:632 [inline]
>  ___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2115
>  __sys_sendmsg net/socket.c:2153 [inline]
>  __do_sys_sendmsg net/socket.c:2162 [inline]
>  __se_sys_sendmsg+0x305/0x460 net/socket.c:2160
>  __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2160
>  do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
>  entry_SYSCALL_64_after_hwframe+0x63/0xe7
> RIP: 0033:0x440209
> Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:00007ffc77bfd0a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440209
> RDX: 0000000000000000 RSI: 0000000020000140 RDI: 0000000000000003
> RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000401a90
> R13: 0000000000401b20 R14: 0000000000000000 R15: 0000000000000000
>
> Uninit was created at:
>  kmsan_save_stack_with_flags mm/kmsan/kmsan.c:205 [inline]
>  kmsan_internal_poison_shadow+0x92/0x150 mm/kmsan/kmsan.c:159
>  kmsan_kmalloc+0xa6/0x130 mm/kmsan/kmsan_hooks.c:176
>  kmsan_slab_alloc+0xe/0x10 mm/kmsan/kmsan_hooks.c:185
>  slab_post_alloc_hook mm/slab.h:445 [inline]
>  slab_alloc_node mm/slub.c:2773 [inline]
>  __kmalloc_node_track_caller+0xe9e/0xff0 mm/slub.c:4398
>  __kmalloc_reserve net/core/skbuff.c:140 [inline]
>  __alloc_skb+0x309/0xa20 net/core/skbuff.c:208
>  alloc_skb include/linux/skbuff.h:1012 [inline]
>  netlink_alloc_large_skb net/netlink/af_netlink.c:1182 [inline]
>  netlink_sendmsg+0xb82/0x1300 net/netlink/af_netlink.c:1892
>  sock_sendmsg_nosec net/socket.c:622 [inline]
>  sock_sendmsg net/socket.c:632 [inline]
>  ___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2115
>  __sys_sendmsg net/socket.c:2153 [inline]
>  __do_sys_sendmsg net/socket.c:2162 [inline]
>  __se_sys_sendmsg+0x305/0x460 net/socket.c:2160
>  __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2160
>  do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
>  entry_SYSCALL_64_after_hwframe+0x63/0xe7
> ==================================================================
>
>
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syz...@go....
>
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> syzbot can test patches for this bug, for details see:
> https://goo.gl/tpsmEJ#testing-patches
From: Jon M. <jon...@er...> - 2019-03-27 12:58:32
Yes, maybe I am just too conservative here. After all, the node struct is not huge, only a few hundred bytes. But I see a problem with the fact that the link spinlocks are aggregated into the node struct. If two threads are seeing two different copies of the spinlock, there is no mutual exclusion. Somehow we need to break out the parts of struct node that really are updated on the write side (link pointers, node state) and make only those subject to the RCU mechanism. Or we add a reference counter to the link object and move the spinlocks back to struct tipc_link…

///jon

From: Erik Hugne <eri...@gm...>
Sent: 27-Mar-19 02:19
To: Jon Maloy <jon...@er...>
Cc: tip...@de...; tip...@li...
Subject: Re: [tipc-discussion] FW: [PATCH 1/1] tipc: eliminate node rwlock

Hi. I remember this 😊
The node struct size could potentially be a problem on preempt kernels if we need to do synchronize_rcu in a hot path (tipc_rcv). But I don't think that's necessary, and the RCU approach is preferable.

//E

On Tue, 26 Mar 2019, 22:03 Jon Maloy, <jon...@er...> wrote:

Hi Hoang,
This is what I did back in 2015. It worked fully with hundreds of failover tests, and I was ready to deliver it. Ying didn't like it though, and since it didn't give any visible performance improvements, I abandoned it for the time being.

Back then I believed we could replace the rwlock with an RCU lock, as even suggested by David Miller, but I still cannot see how that would be possible. RCU requires that a copy of the item in question is updated, and then linked in with an atomic operation at the end of the grace period. The node object is too big to just copy, update and re-link in, in my view, but it might be that there is still something I don't understand regarding this.

I suggest we try with this again, and then ask for feedback from the community. Either they must convince us that this is a bad approach, or show that it is possible to do by using RCU locks instead.
Note that the code has changed a bit since this was developed, so the patch won't apply, but it should give you an idea about how to do it.

///jon

-----Original Message-----
From: Jon Maloy
Sent: 26-Mar-19 15:14
To: Jon Maloy <jon...@er...>; Jon Maloy <ma...@do...>
Cc: Tung Quang Nguyen <tun...@de...>; Hoang Huu Le <hoa...@de...>; tuo...@de...; tip...@de...
Subject: [PATCH 1/1] tipc: eliminate node rwlock

Thorough analysis of the code has shown that the node rwlock is not strictly needed. Instead, it can be replaced with proper usage of the link locks.

The read mode of this lock is almost always followed by grabbing a link lock, because it is used for either sending, receiving or running a timeout handler on a particular link.

The write mode means that we are either changing the working state of one or more links, or that we are redistributing the link slots in the node struct itself. These actions can be protected by grabbing the link lock of all the involved links, instead of using any separate lock for this.

We now change the node locking policy as follows:

- We grab a "write mode" lock for the node by just taking all (currently
  three) link spinlocks pertaining to that node.

- We "release" the write mode lock by releasing all the spinlocks again,
  in the opposite order.

- As an effect of the above, grabbing any one of the link locks not
  only protects the link in question, but now effectively serves as
  a "read mode" lock for the whole node structure.

Since the ratio read_mode/write_mode is extremely low, we can safely make this change without worrying about the extra cost of grabbing multiple spinlocks.

In a few locations, where the current read mode lock is not followed by grabbing a link lock, we change the locking into "write mode" as described above. This only affects a couple of management calls, and will have no impact on overall performance.

Signed-off-by: Jon Maloy <jon...@er...>
---
 net/tipc/node.c | 94 +++++++++++++++++++++++++--------------------------------
 1 file changed, 41 insertions(+), 53 deletions(-)

diff --git a/net/tipc/node.c b/net/tipc/node.c
index 3f7a4ed..d3b01f6 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -264,19 +264,12 @@ static struct tipc_node *tipc_node_find(struct net *net, u32 addr)
 	return NULL;
 }
 
-static void tipc_node_read_lock(struct tipc_node *n)
-{
-	read_lock_bh(&n->lock);
-}
-
-static void tipc_node_read_unlock(struct tipc_node *n)
-{
-	read_unlock_bh(&n->lock);
-}
-
 static void tipc_node_write_lock(struct tipc_node *n)
 {
-	write_lock_bh(&n->lock);
+	int i = 0;
+
+	for (; i < MAX_BEARERS; i++)
+		spin_lock_bh(&n->links[i].lock);
 }
 
 static void tipc_node_write_unlock(struct tipc_node *n)
@@ -285,13 +278,9 @@ static void tipc_node_write_unlock(struct tipc_node *n)
 	u32 addr = 0;
 	u32 flags = n->action_flags;
 	u32 link_id = 0;
+	int i = 0;
 	struct list_head *publ_list;
 
-	if (likely(!flags)) {
-		write_unlock_bh(&n->lock);
-		return;
-	}
-
 	addr = n->addr;
 	link_id = n->link_id;
 	publ_list = &n->publ_list;
@@ -299,7 +288,11 @@ static void tipc_node_write_unlock(struct tipc_node *n)
 	n->action_flags &= ~(TIPC_NOTIFY_NODE_DOWN | TIPC_NOTIFY_NODE_UP |
 			     TIPC_NOTIFY_LINK_DOWN | TIPC_NOTIFY_LINK_UP);
 
-	write_unlock_bh(&n->lock);
+	for (i = MAX_BEARERS - 1; i >= 0; i--)
+		spin_unlock_bh(&n->links[i].lock);
+
+	if (!flags)
+		return;
 
 	if (flags & TIPC_NOTIFY_NODE_DOWN)
 		tipc_publ_notify(net, publ_list, addr);
@@ -516,7 +509,6 @@ static void tipc_node_timeout(unsigned long data)
 	__skb_queue_head_init(&xmitq);
 
 	for (bearer_id = 0; bearer_id < MAX_BEARERS; bearer_id++) {
-		tipc_node_read_lock(n);
 		le = &n->links[bearer_id];
 		spin_lock_bh(&le->lock);
 		if (le->link) {
@@ -525,7 +517,6 @@ static void tipc_node_timeout(unsigned long data)
 			rc = tipc_link_timeout(le->link, &xmitq);
 		}
 		spin_unlock_bh(&le->lock);
-		tipc_node_read_unlock(n);
 		tipc_bearer_xmit(n->net, bearer_id, &xmitq, &le->maddr);
 		if (rc & TIPC_LINK_DOWN_EVT)
 			tipc_node_link_down(n, bearer_id, false);
@@ -1113,14 +1104,14 @@ int tipc_node_get_linkname(struct net *net, u32 bearer_id, u32 addr,
 	if (bearer_id >= MAX_BEARERS)
 		goto exit;
 
-	tipc_node_read_lock(node);
+	spin_lock_bh(&node->links[bearer_id].lock);
 	link = node->links[bearer_id].link;
 	if (link) {
 		strncpy(linkname, tipc_link_name(link), len);
 		err = 0;
 	}
 exit:
-	tipc_node_read_unlock(node);
+	spin_unlock_bh(&node->links[bearer_id].lock);
 	tipc_node_put(node);
 	return err;
 }
@@ -1174,21 +1165,25 @@ int tipc_node_xmit(struct net *net, struct sk_buff_head *list,
 	struct tipc_link_entry *le = NULL;
 	struct tipc_node *n;
 	struct sk_buff_head xmitq;
-	int bearer_id = -1;
+	int bearer_id = -1, prel_id;
+	int slc = selector & 1;
 	int rc = -EHOSTUNREACH;
 
 	__skb_queue_head_init(&xmitq);
 	n = tipc_node_find(net, dnode);
 	if (likely(n)) {
-		tipc_node_read_lock(n);
-		bearer_id = n->active_links[selector & 1];
-		if (bearer_id >= 0) {
-			le = &n->links[bearer_id];
+again:
+		prel_id = n->active_links[slc];
+		if (prel_id >= 0) {
+			le = &n->links[prel_id];
 			spin_lock_bh(&le->lock);
-			rc = tipc_link_xmit(le->link, list, &xmitq);
+			bearer_id = n->active_links[slc];
+			if (bearer_id == prel_id)
+				rc = tipc_link_xmit(le->link, list, &xmitq);
 			spin_unlock_bh(&le->lock);
+			if (unlikely(bearer_id != prel_id))
+				goto again;
 		}
-		tipc_node_read_unlock(n);
 		if (likely(!skb_queue_empty(&xmitq))) {
 			tipc_bearer_xmit(net, bearer_id, &xmitq, &le->maddr);
 			return 0;
@@ -1292,9 +1287,9 @@ static void tipc_node_bc_rcv(struct net *net, struct sk_buff *skb, int bearer_id
 
 	/* Broadcast ACKs are sent on a unicast link */
 	if (rc & TIPC_LINK_SND_BC_ACK) {
-		tipc_node_read_lock(n);
+		spin_lock_bh(&le->lock);
 		tipc_link_build_ack_msg(le->link, &xmitq);
-		tipc_node_read_unlock(n);
+		spin_unlock_bh(&le->lock);
 	}
 
 	if (!skb_queue_empty(&xmitq))
@@ -1487,16 +1482,14 @@ void tipc_rcv(struct net *net, struct sk_buff *skb, struct tipc_bearer *b)
 		tipc_bcast_ack_rcv(net, n->bc_entry.link, bc_ack);
 
 	/* Receive packet directly if conditions permit */
-	tipc_node_read_lock(n);
+	spin_lock_bh(&le->lock);
 	if (likely((n->state == SELF_UP_PEER_UP) && (usr != TUNNEL_PROTOCOL))) {
-		spin_lock_bh(&le->lock);
 		if (le->link) {
 			rc = tipc_link_rcv(le->link, skb, &xmitq);
 			skb = NULL;
 		}
-		spin_unlock_bh(&le->lock);
 	}
-	tipc_node_read_unlock(n);
+	spin_unlock_bh(&le->lock);
 
 	/* Check/update node state before receiving */
 	if (unlikely(skb)) {
@@ -1573,15 +1566,15 @@ int tipc_nl_node_dump(struct sk_buff *skb, struct netlink_callback *cb)
 			continue;
 		}
 
-		tipc_node_read_lock(node);
+		tipc_node_write_lock(node);
 		err = __tipc_nl_add_node(&msg, node);
 		if (err) {
 			last_addr = node->addr;
-			tipc_node_read_unlock(node);
+			tipc_node_write_unlock(node);
 			goto out;
 		}
 
-		tipc_node_read_unlock(node);
+		tipc_node_write_unlock(node);
 	}
 	done = 1;
 out:
@@ -1612,7 +1605,7 @@ static struct tipc_node *tipc_node_find_by_name(struct net *net,
 	*bearer_id = 0;
 	rcu_read_lock();
 	list_for_each_entry_rcu(n, &tn->node_list, list) {
-		tipc_node_read_lock(n);
+		tipc_node_write_lock(n);
 		for (i = 0; i < MAX_BEARERS; i++) {
 			l = n->links[i].link;
 			if (l && !strcmp(tipc_link_name(l), link_name)) {
@@ -1621,7 +1614,7 @@ static struct tipc_node *tipc_node_find_by_name(struct net *net,
 				break;
 			}
 		}
-		tipc_node_read_unlock(n);
+		tipc_node_write_unlock(n);
 		if (found_node)
 			break;
 	}
@@ -1662,8 +1655,7 @@ int tipc_nl_node_set_link(struct sk_buff *skb, struct genl_info *info)
 	if (!node)
 		return -EINVAL;
 
-	tipc_node_read_lock(node);
-
+	spin_lock_bh(&node->links[bearer_id].lock);
 	link = node->links[bearer_id].link;
 	if (!link) {
 		res = -EINVAL;
@@ -1701,7 +1693,7 @@ int tipc_nl_node_set_link(struct sk_buff *skb, struct genl_info *info)
 	}
 
 out:
-	tipc_node_read_unlock(node);
+	spin_unlock_bh(&node->links[bearer_id].lock);
 
 	return res;
 }
@@ -1738,17 +1730,16 @@ int tipc_nl_node_get_link(struct sk_buff *skb, struct genl_info *info)
 	node = tipc_node_find_by_name(net, name, &bearer_id);
 	if (!node)
 		return -EINVAL;
-
-	tipc_node_read_lock(node);
+	spin_lock_bh(&node->links[bearer_id].lock);
 	link = node->links[bearer_id].link;
 	if (!link) {
-		tipc_node_read_unlock(node);
+		spin_unlock_bh(&node->links[bearer_id].lock);
 		nlmsg_free(msg.skb);
 		return -EINVAL;
 	}
 
 	err = __tipc_nl_add_link(net, &msg, link, 0);
-	tipc_node_read_unlock(node);
+	spin_unlock_bh(&node->links[bearer_id].lock);
 	if (err) {
 		nlmsg_free(msg.skb);
 		return err;
@@ -1795,17 +1786,14 @@ int tipc_nl_node_reset_link_stats(struct sk_buff *skb, struct genl_info *info)
 		return -EINVAL;
 
 	le = &node->links[bearer_id];
-	tipc_node_read_lock(node);
 	spin_lock_bh(&le->lock);
 	link = node->links[bearer_id].link;
 	if (!link) {
 		spin_unlock_bh(&le->lock);
-		tipc_node_read_unlock(node);
 		return -EINVAL;
 	}
 	tipc_link_reset_stats(link);
 	spin_unlock_bh(&le->lock);
-	tipc_node_read_unlock(node);
 	return 0;
 }
 
@@ -1867,10 +1855,10 @@ int tipc_nl_node_dump_link(struct sk_buff *skb, struct netlink_callback *cb)
 
 		list_for_each_entry_continue_rcu(node, &tn->node_list, list) {
-			tipc_node_read_lock(node);
+			tipc_node_write_lock(node);
 			err = __tipc_nl_add_node_links(net, &msg, node,
 						       &prev_link);
-			tipc_node_read_unlock(node);
+			tipc_node_write_unlock(node);
 			if (err)
 				goto out;
 
@@ -1882,10 +1870,10 @@ int tipc_nl_node_dump_link(struct sk_buff *skb, struct netlink_callback *cb)
 			goto out;
 
 		list_for_each_entry_rcu(node, &tn->node_list, list) {
-			tipc_node_read_lock(node);
+			tipc_node_write_lock(node);
 			err = __tipc_nl_add_node_links(net, &msg, node,
 						       &prev_link);
-			tipc_node_read_unlock(node);
+			tipc_node_write_unlock(node);
 			if (err)
 				goto out;
 
-- 
2.1.4

_______________________________________________
tipc-discussion mailing list
tip...@li...
https://lists.sourceforge.net/lists/listinfo/tipc-discussion
From: Erik H. <eri...@gm...> - 2019-03-27 06:19:33
Hi. I remember this 😊
The node struct size could potentially be a problem on preempt kernels if we need to do synchronize_rcu in a hot path (tipc_rcv). But I don't think that's necessary, and the RCU approach is preferable.

//E

On Tue, 26 Mar 2019, 22:03 Jon Maloy, <jon...@er...> wrote:

> Hi Hoang,
> This is what I did back in 2015. It worked fully with hundreds of failover
> tests, and I was ready to deliver it.
> Ying didn't like it though, and since it didn't give any visible
> performance improvements, I abandoned it for the time being.
>
> Back then I believed we could replace the rwlock with an RCU lock, as
> even suggested by David Miller, but I still cannot see how that would be
> possible. RCU requires that a copy of the item in question is updated,
> and then linked in with an atomic operation at the end of the grace
> period. The node object is too big to just copy, update and re-link in,
> in my view, but it might be that there is still something I don't
> understand regarding this.
>
> I suggest we try with this again, and then ask for feedback from the
> community. Either they must convince us that this is a bad approach, or
> show that it is possible to do by using RCU locks instead.
>
> Note that the code has changed a bit since this was developed, so the
> patch won't apply, but it should give you an idea about how to do it.
>
> ///jon
>
> -----Original Message-----
> From: Jon Maloy
> Sent: 26-Mar-19 15:14
> To: Jon Maloy <jon...@er...>; Jon Maloy <ma...@do...>
> Cc: Tung Quang Nguyen <tun...@de...>; Hoang Huu Le <hoa...@de...>;
> tuo...@de...; tip...@de...
> Subject: [PATCH 1/1] tipc: eliminate node rwlock
>
> Thorough analysis of the code has shown that the node rwlock is not
> strictly needed. Instead, it can be replaced with proper usage of the
> link locks.
>
> The read mode of this lock is almost always followed by grabbing a link
> lock, because it is used for either sending, receiving or running a
> timeout handler on a particular link.
>
> The write mode means that we are either changing the working state of
> one or more links, or that we are redistributing the link slots in the
> node struct itself. These actions can be protected by grabbing the link
> lock of all the involved links, instead of using any separate lock for
> this.
>
> We now change the node locking policy as follows:
>
> - We grab a "write mode" lock for the node by just taking all (currently
>   three) link spinlocks pertaining to that node.
>
> - We "release" the write mode lock by releasing all the spinlocks again,
>   in the opposite order.
>
> - As an effect of the above, grabbing any one of the link locks not
>   only protects the link in question, but now effectively serves as
>   a "read mode" lock for the whole node structure.
>
> Since the ratio read_mode/write_mode is extremely low, we can safely
> make this change without worrying about the extra cost of grabbing
> multiple spinlocks.
>
> In a few locations, where the current read mode lock is not followed by
> grabbing a link lock, we change the locking into "write mode" as
> described above. This only affects a couple of management calls, and
> will have no impact on overall performance.
> Signed-off-by: Jon Maloy <jon...@er...>
> ---
>  net/tipc/node.c | 94 +++++++++++++++++++++++++--------------------------------
>  1 file changed, 41 insertions(+), 53 deletions(-)
>
> [...]
>
> --
> 2.1.4
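The tipc_node_xmit() change in the patch uses a common optimistic pattern: read the active-link index without any lock, take that link's lock, then re-read the index and retry if a concurrent "write mode" operation moved the active link in between. A hedged userspace model of that retry loop follows; the names and the pthread/stdatomic machinery are invented for illustration and are not the kernel code:

```c
#include <pthread.h>
#include <stdatomic.h>

#define MAX_BEARERS 3

struct xmit_node {
	_Atomic int active_link;             /* models n->active_links[slc] */
	pthread_mutex_t lock[MAX_BEARERS];   /* models the per-link locks   */
};

/* Returns the bearer id the message would be sent on, or -1. */
static int node_xmit(struct xmit_node *n)
{
	int prel_id, bearer_id;

again:
	prel_id = atomic_load(&n->active_link);   /* unlocked peek */
	if (prel_id < 0)
		return -1;                        /* no active link */
	pthread_mutex_lock(&n->lock[prel_id]);
	bearer_id = atomic_load(&n->active_link); /* re-check under lock */
	if (bearer_id == prel_id) {
		/* Safe: "write mode" holds all link locks, so while we hold
		 * this one the link cannot be re-slotted under our feet. */
		pthread_mutex_unlock(&n->lock[prel_id]);
		return bearer_id;
	}
	pthread_mutex_unlock(&n->lock[prel_id]);
	goto again;                               /* link moved: retry */
}

static struct xmit_node xmit_demo = {
	.active_link = 1,
	.lock = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
		  PTHREAD_MUTEX_INITIALIZER },
};
```

The retry is the price for dropping the node-wide read lock: the unlocked peek can race with a failover, but the re-check under the link lock makes the race harmless.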
From: Jon M. <jon...@er...> - 2019-03-26 21:02:50
Hi Hoang,
This is what I did back in 2015. It worked fully with hundreds of failover tests, and I was ready to deliver it. Ying didn't like it though, and since it didn't give any visible performance improvements, I abandoned it for the time being.

Back then I believed we could replace the rwlock with an RCU lock, as even suggested by David Miller, but I still cannot see how that would be possible. RCU requires that a copy of the item in question is updated, and then linked in with an atomic operation at the end of the grace period. The node object is too big to just copy, update and re-link in, in my view, but it might be that there is still something I don't understand regarding this.

I suggest we try with this again, and then ask for feedback from the community. Either they must convince us that this is a bad approach, or show that it is possible to do by using RCU locks instead.

Note that the code has changed a bit since this was developed, so the patch won't apply, but it should give you an idea about how to do it.

///jon

-----Original Message-----
From: Jon Maloy
Sent: 26-Mar-19 15:14
To: Jon Maloy <jon...@er...>; Jon Maloy <ma...@do...>
Cc: Tung Quang Nguyen <tun...@de...>; Hoang Huu Le <hoa...@de...>; tuo...@de...; tip...@de...
Subject: [PATCH 1/1] tipc: eliminate node rwlock

Thorough analysis of the code has shown that the node rwlock is not strictly needed. Instead, it can be replaced with proper usage of the link locks.

The read mode of this lock is almost always followed by grabbing a link lock, because it is used for either sending, receiving or running a timeout handler on a particular link.

The write mode means that we are either changing the working state of one or more links, or that we are redistributing the link slots in the node struct itself. These actions can be protected by grabbing the link lock of all the involved links, instead of using any separate lock for this.

We now change the node locking policy as follows:

- We grab a "write mode" lock for the node by just taking all (currently
  three) link spinlocks pertaining to that node.

- We "release" the write mode lock by releasing all the spinlocks again,
  in the opposite order.

- As an effect of the above, grabbing any one of the link locks not
  only protects the link in question, but now effectively serves as
  a "read mode" lock for the whole node structure.

Since the ratio read_mode/write_mode is extremely low, we can safely make this change without worrying about the extra cost of grabbing multiple spinlocks.

In a few locations, where the current read mode lock is not followed by grabbing a link lock, we change the locking into "write mode" as described above. This only affects a couple of management calls, and will have no impact on overall performance.

Signed-off-by: Jon Maloy <jon...@er...>
---
 net/tipc/node.c | 94 +++++++++++++++++++++++++--------------------------------
 1 file changed, 41 insertions(+), 53 deletions(-)

[...]

--
2.1.4
From: Erik H. <eri...@gm...> - 2019-03-26 20:02:41
|
I'll see to it. You may have seen that I sent in the /dev/tipc redirection patch to the GNU/bash list about a week ago, but it seems it got a cold reception, no response yet. //E On Tue, 26 Mar 2019, 20:46 Jon Maloy, <jon...@er...> wrote: > So I guess the subscriber module should be added to tipc_utils as a > separate module, or possibly as part of libtipc, which is already there. > > You are still registered as developer at SF/tipc, so you are free to do > it. Otherwise I can fix it, but I am pretty busy these days. > > > > BR > > ///jon > > > > > > *From:* Erik Hugne <eri...@gm...> > *Sent:* 20-Mar-19 15:32 > *To:* Jon Maloy <jon...@er...> > *Cc:* tip...@li... > *Subject:* Re: Topology subscriptions in bash > > > > I only planned to send the /dev/tipc network redirection stuff there. I > don't think that the topology module is relevant enough to be a default bash > builtin. > > > > //E > > > > On Wed, 20 Mar 2019, 11:09 Jon Maloy, <jon...@er...> wrote: > > Nice. Is this all for the bash project, or do we need to add something to > tipc_utils? > > ///jon > > > > -----Original Message----- > > From: Erik Hugne <eri...@gm...> > > Sent: 19-Mar-19 22:33 > > To: tip...@li...; Jon Maloy > > <jon...@er...> > > Subject: Topology subscriptions in bash > > > > A small tool to get shell integrations to the TIPC topology server, maybe > > someone will find it useful https://github.com/Hugne/tipc.sh > > > > The subscriber examples use part of the functionality that I plan to > submit to > > the bash project for network redirection support, e.g: > > > > [root@kdev ~] exec 5</dev/tipc/1000/1 > > [root@kdev ~] echo hello world >/dev/tipc/1000/1 [root@kdev ~] cat <&5 > > hello world ^C [root@kdev ~] > > > > //E > |
From: Jon M. <jon...@er...> - 2019-03-26 20:01:50
|
So I guess the subscriber module should be added to tipc_utils as a separate module, or possibly as part of libtipc, which is already there. You are still registered as developer at SF/tipc, so you are free to do it. Otherwise I can fix it, but I am pretty busy these days. BR ///jon From: Erik Hugne <eri...@gm...> Sent: 20-Mar-19 15:32 To: Jon Maloy <jon...@er...> Cc: tip...@li... Subject: Re: Topology subscriptions in bash I only planned to send the /dev/tipc network redirection stuff there. I don't think that the topology module is relevant enough to be a default bash builtin. //E On Wed, 20 Mar 2019, 11:09 Jon Maloy, <jon...@er...> wrote: Nice. Is this all for the bash project, or do we need to add something to tipc_utils? ///jon > -----Original Message----- > From: Erik Hugne <eri...@gm...> > Sent: 19-Mar-19 22:33 > To: tip...@li...; Jon Maloy > <jon...@er...> > Subject: Topology subscriptions in bash > > A small tool to get shell integrations to the TIPC topology server, maybe > someone will find it useful https://github.com/Hugne/tipc.sh > > The subscriber examples use part of the functionality that I plan to submit to > the bash project for network redirection support, e.g: > > [root@kdev ~] exec 5</dev/tipc/1000/1 > [root@kdev ~] echo hello world >/dev/tipc/1000/1 [root@kdev ~] cat <&5 > hello world ^C [root@kdev ~] > > //E |
From: David M. <da...@da...> - 2019-03-26 18:40:06
|
From: Wei Yongjun <wei...@hu...> Date: Mon, 25 Mar 2019 06:31:09 +0000 > Fix the return value check which was testing the wrong variable > in tipc_mcast_send_sync(). > > Fixes: c55c8edafa91 ("tipc: smooth change between replicast and broadcast") > Reported-by: Hulk Robot <hu...@hu...> > Signed-off-by: Wei Yongjun <wei...@hu...> > --- > v1 -> v2: add reported-by Applied. |
From: David M. <da...@da...> - 2019-03-26 18:22:25
|
From: Xin Long <luc...@gm...> Date: Sun, 24 Mar 2019 00:48:22 +0800 > When running a syz script, a panic occurred: ... > It was caused by the netns freed without deleting the discoverer timer, > while later on the netns would be accessed in the timer handler. > > The timer should have been deleted by tipc_net_stop() when cleaning up a > netns. However, tipc has been able to enable a bearer and start d->timer > without the local node_addr set since Commit 52dfae5c85a4 ("tipc: obtain > node identity from interface by default"), which caused the timer not to > be deleted in tipc_net_stop() then. > > So fix it in tipc_net_stop() by changing to check local node_id instead > of local node_addr, as Jon suggested. > > While at it, remove the calling of tipc_nametbl_withdraw() there, since > tipc_nametbl_stop() will take care of the nametbl's freeing after. > > Fixes: 52dfae5c85a4 ("tipc: obtain node identity from interface by default") > Reported-by: syz...@sy... > Signed-off-by: Xin Long <luc...@gm...> Applied and queued up for -stable, thanks Xin! |
From: Xue, Y. <Yin...@wi...> - 2019-03-26 15:13:23
|
Hi Tuong, Thank you for your explanation below. It's pretty clear to me. Also, you can add my Acked-by tag in the series if you want. Thanks, Ying -----Original Message----- From: Tuong Lien Tong [mailto:tuo...@de...] Sent: Monday, March 25, 2019 3:33 PM To: Xue, Ying; tip...@li...; jon...@er...; ma...@do... Subject: RE: [net-next 1/3] tipc: improve TIPC throughput by Gap ACK blocks Hi Ying! Correct, the idea was inspired by SACK in the SCTP protocol, but with simplicity. Also, I see your suggestions about the "duplicated packets"... which is connected to the SCTP "delayed SACK" algorithm (i.e. in the case of no payload message loss). In TIPC, as I understand so far, we already have such a delay on acknowledgements via the link "rcv_unacked" (- Jon may correct me?) (but we don't implement a timer for the SACK delay timeout, i.e. the 200ms in the SCTP RFC). However, that duplicated TSN concept is based on DATA chunks _and_ "with no new DATA chunk(s)", which usually happens in case of SACK loss when the sender has tried to retransmit the same DATA chunks after its RTO timer expired..., obviously an immediate SACK is needed in this situation. In TIPC, we might not face this situation because we do not have a retransmission timer at the sender side; duplicates occur mostly due to the overactive NACK sending and should be covered by the 2nd patch of the series. For me, in the case of packet loss, an immediate retransmission is important, otherwise it can reduce the performance. However, because we never know if a packet is really lost or just delayed, we have to apply the "1ms restriction" to reduce duplicates (- as you can also see in the 2nd patch). Fast retransmission was also tried; Jon and I had some discussions before... but the fact is, in TIPC, the sender is passive (due to no retransmission timer) and we could be in trouble if trying to wait for the 2nd or 3rd indications... 
Instead, the NACK sending criteria have been changed by the 2nd patch to reduce duplicates while trying to keep the performance... Actually, in SCTP, the situation is a bit different, as they play with "chunks" and "multi-streaming" rather than individual packets like ours at the link layer, and many chunks can be optionally bundled in a single packet because of the slow-start or Nagle's algorithm... Anyway, if you have any ideas to improve TIPC performance further, I will try to see what happens. Thanks a lot! BR/Tuong -----Original Message----- From: Ying Xue <yin...@wi...> Sent: Friday, March 22, 2019 7:52 PM To: Tuong Lien <tuo...@de...>; tip...@li...; jon...@er...; ma...@do... Subject: Re: [net-next 1/3] tipc: improve TIPC throughput by Gap ACK blocks Hi Tuong, Great job! It's a very nice enhancement, and we should do the improvement early. On 3/20/19 11:28 AM, Tuong Lien wrote: > During unicast link transmission, it's observed very often that because > of one or a few lost/dis-ordered packets, the sending side will fastly > reach the send window limit and must wait for the packets to be arrived > at the receiving side or in the worst case, a retransmission must be > done first. The sending side cannot release a lot of subsequent packets > in its transmq even though all of them might have already been received > by the receiving side. > That is, one or two packets dis-ordered/lost and dozens of packets have > to wait, this obviously reduces the overall throughput! > > This commit introduces an algorithm to overcome this by using "Gap ACK 
> blocks". Basically, a Gap ACK block will consist of <ack, gap> numbers
> that describes the link deferdq where packets have been got by the
> receiving side but with gaps, for example:
>
> link deferdq: [1 2 3 4 10 11 13 14 15 20]
> --> Gap ACK blocks: <4, 5>, <11, 1>, <15, 4>, <20, 0>

This idea is exactly the same as the SACK of SCTP:
https://tools.ietf.org/html/rfc4960#section-3.3.4

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Type = 3    |Chunk  Flags   |      Chunk Length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Cumulative TSN Ack                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Advertised Receiver Window Credit (a_rwnd)           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Number of Gap Ack Blocks = N  | Number of Duplicate TSNs = X  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Gap Ack Block #1 Start       |   Gap Ack Block #1 End        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                                                               /
\                              ...                              \
/                                                               /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Gap Ack Block #N Start      |  Gap Ack Block #N End         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Duplicate TSN 1                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                                                               /
\                              ...                              \
/                                                               /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Duplicate TSN X                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

For example, assume that the receiver has the following DATA chunks
newly arrived at the time when it decides to send a Selective ACK,

    ----------
    | TSN=17 |
    ----------
    |        | <- still missing
    ----------
    | TSN=15 |
    ----------
    | TSN=14 |
    ----------
    |        | <- still missing
    ----------
    | TSN=12 |
    ----------
    | TSN=11 |
    ----------
    | TSN=10 |
    ----------

then the parameter part of the SACK MUST be constructed as follows
(assuming the new a_rwnd is set to 4660 by the sender):

    +--------------------------------+
    |  Cumulative TSN Ack = 12       |
    +--------------------------------+
    |  a_rwnd = 4660                 |
    +----------------+---------------+
    | num of block=2 | num of dup=0  |
    +----------------+---------------+
    |block #1 strt=2 |block #1 end=3 |
    +----------------+---------------+
    |block #2 strt=5 |block #2 end=5 |
    +----------------+---------------+

It seems easier to understand than our solution. Of course, I don't intend to disagree with your proposal. In addition, we can also consider marking duplicated packets as SCTP does. For duplicated packets, SCTP adopts the strategy: when a packet arrives with duplicate DATA chunk(s) and with no new DATA chunk(s), the endpoint MUST _immediately_ send a SACK with no delay. If a packet arrives with duplicate DATA chunk(s) bundled with new DATA chunks, the endpoint MAY immediately send a SACK. In the current solution, when a sender detects a gap in tipc_link_advance_transmq(), it will _immediately_ retransmit missing messages. However, we can probably borrow some ideas from the SCTP Fast Retransmit algorithm, which might help us further improve throughput performance: https://tools.ietf.org/html/rfc4960#section-7.2.4 In the absence of data loss, an endpoint performs delayed acknowledgement. 
However, whenever an endpoint notices a hole in the arriving TSN sequence, it SHOULD start sending a SACK back every time a packet arrives carrying data until the hole is filled. Whenever an endpoint receives a SACK that indicates that some TSNs are missing, it SHOULD wait for _two further miss indications_ (via subsequent SACKs for a total of three missing reports) on the same TSNs before taking action with regard to Fast Retransmit. In SCTP, whenever a sender endpoint detects a gap, it doesn't immediately retransmit the missing packet. Instead, it will wait for two further miss indications. Of course, SCTP's flow control management algorithm is not exactly the same as TIPC's. In particular, each SCTP endpoint has a retransmission timer, slow-start, etc. Maybe not all of its algorithms are useful for the TIPC scenario, but I think we can try to use the two ideas I mentioned above based on the current solution to check whether they can further improve TIPC transmission performance. Thanks, Ying > > The Gap ACK blocks will be sent to the sending side along with the > traditional ACK or NACK message. Immediately when receiving the message > the sending side will now not only release from its transmq the packets > ack-ed by the ACK but also by the Gap ACK blocks! So, more packets can > be enqueued and transmitted. > In addition, the sending side can now do "multi-retransmissions" > according to the Gaps reported in the Gap ACK blocks. > > The new algorithm as verified helps greatly improve the TIPC throughput > especially under packet loss condition. > > So far, a maximum of 32 blocks is quite enough without any "Too few Gap > ACK blocks" reports with a 5.0% packet loss rate, however this number > can be increased in the future if needed. > > Also, the patch is backward compatible. 
> > Acked-by: Jon Maloy <jon...@er...> > Signed-off-by: Tuong Lien <tuo...@de...> > --- > net/tipc/link.c | 134 +++++++++++++++++++++++++++++++++++++++++++++++++++----- > net/tipc/msg.h | 30 +++++++++++++ > net/tipc/node.h | 6 ++- > 3 files changed, 158 insertions(+), 12 deletions(-) > > diff --git a/net/tipc/link.c b/net/tipc/link.c > index 52d23b3ffaf5..5aee1ed23ba9 100644 > --- a/net/tipc/link.c > +++ b/net/tipc/link.c > @@ -246,6 +246,10 @@ static int tipc_link_build_nack_msg(struct tipc_link *l, > static void tipc_link_build_bc_init_msg(struct tipc_link *l, > struct sk_buff_head *xmitq); > static bool tipc_link_release_pkts(struct tipc_link *l, u16 to); > +static u16 tipc_build_gap_ack_blks(struct tipc_link *l, void *data); > +static void tipc_link_advance_transmq(struct tipc_link *l, u16 acked, u16 gap, > + struct tipc_gap_ack_blks *ga, > + struct sk_buff_head *xmitq); > > /* > * Simple non-static link routines (i.e. referenced outside this file) > @@ -1226,6 +1230,102 @@ static bool tipc_link_release_pkts(struct tipc_link *l, u16 acked) > return released; > } > > +/* tipc_build_gap_ack_blks - build Gap ACK blocks > + * @l: tipc link that data have come with gaps in sequence if any > + * @data: data buffer to store the Gap ACK blocks after built > + * > + * returns the actual allocated memory size > + */ > +static u16 tipc_build_gap_ack_blks(struct tipc_link *l, void *data) > +{ > + struct sk_buff *skb = skb_peek(&l->deferdq); > + struct tipc_gap_ack_blks *ga = data; > + u16 len, expect, seqno = 0; > + u8 n = 0; > + > + if (!skb) > + goto exit; > + > + expect = buf_seqno(skb); > + skb_queue_walk(&l->deferdq, skb) { > + seqno = buf_seqno(skb); > + if (unlikely(more(seqno, expect))) { > + ga->gacks[n].ack = htons(expect - 1); > + ga->gacks[n].gap = htons(seqno - expect); > + if (++n >= MAX_GAP_ACK_BLKS) { > + pr_info_ratelimited("Too few Gap ACK blocks!\n"); > + goto exit; > + } > + } else if (unlikely(less(seqno, expect))) { > + pr_warn("Unexpected skb in 
deferdq!\n"); > + continue; > + } > + expect = seqno + 1; > + } > + > + /* last block */ > + ga->gacks[n].ack = htons(seqno); > + ga->gacks[n].gap = 0; > + n++; > + > +exit: > + len = tipc_gap_ack_blks_sz(n); > + ga->len = htons(len); > + ga->gack_cnt = n; > + return len; > +} > + > +/* tipc_link_advance_transmq - advance TIPC link transmq queue by releasing > + * acked packets, also doing retransmissions if > + * gaps found > + * @l: tipc link with transmq queue to be advanced > + * @acked: seqno of last packet acked by peer without any gaps before > + * @gap: # of gap packets > + * @ga: buffer pointer to Gap ACK blocks from peer > + * @xmitq: queue for accumulating the retransmitted packets if any > + */ > +static void tipc_link_advance_transmq(struct tipc_link *l, u16 acked, u16 gap, > + struct tipc_gap_ack_blks *ga, > + struct sk_buff_head *xmitq) > +{ > + struct sk_buff *skb, *_skb, *tmp; > + struct tipc_msg *hdr; > + u16 bc_ack = l->bc_rcvlink->rcv_nxt - 1; > + u16 ack = l->rcv_nxt - 1; > + u16 seqno; > + u16 n = 0; > + > + skb_queue_walk_safe(&l->transmq, skb, tmp) { > + seqno = buf_seqno(skb); > + > +next_gap_ack: > + if (less_eq(seqno, acked)) { > + /* release skb */ > + __skb_unlink(skb, &l->transmq); > + kfree_skb(skb); > + } else if (less_eq(seqno, acked + gap)) { > + /* retransmit skb */ > + _skb = __pskb_copy(skb, MIN_H_SIZE, GFP_ATOMIC); > + if (!_skb) > + continue; > + hdr = buf_msg(_skb); > + msg_set_ack(hdr, ack); > + msg_set_bcast_ack(hdr, bc_ack); > + _skb->priority = TC_PRIO_CONTROL; > + __skb_queue_tail(xmitq, _skb); > + l->stats.retransmitted++; > + } else { > + /* retry with Gap ACK blocks if any */ > + if (!ga || n >= ga->gack_cnt) > + break; > + acked = ntohs(ga->gacks[n].ack); > + gap = ntohs(ga->gacks[n].gap); > + n++; > + goto next_gap_ack; > + } > + } > +} > + > /* tipc_link_build_state_msg: prepare link state message for transmission > * > * Note that sending of broadcast ack is coordinated among nodes, to reduce > @@ -1378,6 +1478,7 
@@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe, > struct tipc_mon_state *mstate = &l->mon_state; > int dlen = 0; > void *data; > + u16 glen = 0; > > /* Don't send protocol message during reset or link failover */ > if (tipc_link_is_blocked(l)) > @@ -1390,8 +1491,8 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe, > rcvgap = buf_seqno(skb_peek(dfq)) - l->rcv_nxt; > > skb = tipc_msg_create(LINK_PROTOCOL, mtyp, INT_H_SIZE, > - tipc_max_domain_size, l->addr, > - tipc_own_addr(l->net), 0, 0, 0); > + tipc_max_domain_size + MAX_GAP_ACK_BLKS_SZ, > + l->addr, tipc_own_addr(l->net), 0, 0, 0); > if (!skb) > return; > > @@ -1418,9 +1519,11 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe, > msg_set_bc_gap(hdr, link_bc_rcv_gap(bcl)); > msg_set_probe(hdr, probe); > msg_set_is_keepalive(hdr, probe || probe_reply); > - tipc_mon_prep(l->net, data, &dlen, mstate, l->bearer_id); > - msg_set_size(hdr, INT_H_SIZE + dlen); > - skb_trim(skb, INT_H_SIZE + dlen); > + if (l->peer_caps & TIPC_GAP_ACK_BLOCK) > + glen = tipc_build_gap_ack_blks(l, data); > + tipc_mon_prep(l->net, data + glen, &dlen, mstate, l->bearer_id); > + msg_set_size(hdr, INT_H_SIZE + glen + dlen); > + skb_trim(skb, INT_H_SIZE + glen + dlen); > l->stats.sent_states++; > l->rcv_unacked = 0; > } else { > @@ -1590,6 +1693,7 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb, > struct sk_buff_head *xmitq) > { > struct tipc_msg *hdr = buf_msg(skb); > + struct tipc_gap_ack_blks *ga = NULL; > u16 rcvgap = 0; > u16 ack = msg_ack(hdr); > u16 gap = msg_seq_gap(hdr); > @@ -1600,6 +1704,7 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb, > u16 dlen = msg_data_sz(hdr); > int mtyp = msg_type(hdr); > bool reply = msg_probe(hdr); > + u16 glen = 0; > void *data; > char *if_name; > int rc = 0; > @@ -1697,7 +1802,17 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb, 
> rc = TIPC_LINK_UP_EVT; > break; > } > - tipc_mon_rcv(l->net, data, dlen, l->addr, > + > + /* Receive Gap ACK blocks from peer if any */ > + if (l->peer_caps & TIPC_GAP_ACK_BLOCK) { > + ga = (struct tipc_gap_ack_blks *)data; > + glen = ntohs(ga->len); > + /* sanity check: if failed, ignore Gap ACK blocks */ > + if (glen != tipc_gap_ack_blks_sz(ga->gack_cnt)) > + ga = NULL; > + } > + > + tipc_mon_rcv(l->net, data + glen, dlen - glen, l->addr, > &l->mon_state, l->bearer_id); > > /* Send NACK if peer has sent pkts we haven't received yet */ > @@ -1706,13 +1821,12 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb, > if (rcvgap || reply) > tipc_link_build_proto_msg(l, STATE_MSG, 0, reply, > rcvgap, 0, 0, xmitq); > - tipc_link_release_pkts(l, ack); > + > + tipc_link_advance_transmq(l, ack, gap, ga, xmitq); > > /* If NACK, retransmit will now start at right position */ > - if (gap) { > - rc = tipc_link_retrans(l, l, ack + 1, ack + gap, xmitq); > + if (gap) > l->stats.recv_nacks++; > - } > > tipc_link_advance_backlog(l, xmitq); > if (unlikely(!skb_queue_empty(&l->wakeupq))) > diff --git a/net/tipc/msg.h b/net/tipc/msg.h > index 528ba9241acc..dcb73421c19a 100644 > --- a/net/tipc/msg.h > +++ b/net/tipc/msg.h > @@ -117,6 +117,36 @@ struct tipc_msg { > __be32 hdr[15]; > }; > > +/* struct tipc_gap_ack - TIPC Gap ACK block > + * @ack: seqno of the last consecutive packet in link deferdq > + * @gap: number of gap packets since the last ack > + * > + * E.g: > + * link deferdq: 1 2 3 4 10 11 13 14 15 20 > + * --> Gap ACK blocks: <4, 5>, <11, 1>, <15, 4>, <20, 0> > + */ > +struct tipc_gap_ack { > + __be16 ack; > + __be16 gap; > +}; > + > +/* struct tipc_gap_ack_blks > + * @len: actual length of the record > + * @gack_cnt: number of Gap ACK blocks in the record > + * @gacks: array of Gap ACK blocks > + */ > +struct tipc_gap_ack_blks { > + __be16 len; > + u8 gack_cnt; > + struct tipc_gap_ack gacks[]; > +}; > + > +#define tipc_gap_ack_blks_sz(n) (sizeof(struct 
tipc_gap_ack_blks) + \ > + sizeof(struct tipc_gap_ack) * (n)) > + > +#define MAX_GAP_ACK_BLKS 32 > +#define MAX_GAP_ACK_BLKS_SZ tipc_gap_ack_blks_sz(MAX_GAP_ACK_BLKS) > + > static inline struct tipc_msg *buf_msg(struct sk_buff *skb) > { > return (struct tipc_msg *)skb->data; > diff --git a/net/tipc/node.h b/net/tipc/node.h > index 2404225c5d58..c0bf49ea3de4 100644 > --- a/net/tipc/node.h > +++ b/net/tipc/node.h > @@ -52,7 +52,8 @@ enum { > TIPC_BCAST_RCAST = (1 << 4), > TIPC_NODE_ID128 = (1 << 5), > TIPC_LINK_PROTO_SEQNO = (1 << 6), > - TIPC_MCAST_RBCTL = (1 << 7) > + TIPC_MCAST_RBCTL = (1 << 7), > + TIPC_GAP_ACK_BLOCK = (1 << 8) > }; > > #define TIPC_NODE_CAPABILITIES (TIPC_SYN_BIT | \ > @@ -62,7 +63,8 @@ enum { > TIPC_BLOCK_FLOWCTL | \ > TIPC_NODE_ID128 | \ > TIPC_LINK_PROTO_SEQNO | \ > - TIPC_MCAST_RBCTL) > + TIPC_MCAST_RBCTL | \ > + TIPC_GAP_ACK_BLOCK) > #define INVALID_BEARER_ID -1 > > void tipc_node_stop(struct net *net); > |
From: Jon M. <jon...@er...> - 2019-03-25 19:06:43
|
Yet another duplicate of syz...@sy... A fix has been posted. ///jon > -----Original Message----- > From: net...@vg... <net...@vg...> > On Behalf Of syzbot > Sent: 23-Mar-19 19:03 > To: da...@da...; Jon Maloy <jon...@er...>; > ku...@ms...; lin...@vg...; > ne...@vg...; syz...@go...; tipc- > dis...@li...; yin...@wi...; yoshfuji@linux- > ipv6.org > Subject: Re: WARNING: locking bug in icmp_send > > syzbot has bisected this bug to: > > commit 52dfae5c85a4c1078e9f1d5e8947d4a25f73dd81 > Author: Jon Maloy <jon...@er...> > Date: Thu Mar 22 19:42:52 2018 +0000 > > tipc: obtain node identity from interface by default > > bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=11b6dc5d200000 > start commit: b5372fe5 exec: load_script: Do not exec truncated interpre.. > git tree: upstream > final crash: https://syzkaller.appspot.com/x/report.txt?x=13b6dc5d200000 > console output: https://syzkaller.appspot.com/x/log.txt?x=15b6dc5d200000 > kernel config: https://syzkaller.appspot.com/x/.config?x=7132344728e7ec3f > dashboard link: > https://syzkaller.appspot.com/bug?extid=ba21d65f55b562432c58 > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=14c90fa7400000 > > Reported-by: syz...@sy... > Fixes: 52dfae5c85a4 ("tipc: obtain node identity from interface by default") > > For information about bisection process see: > https://goo.gl/tpsmEJ#bisection |
From: Jon M. <jon...@er...> - 2019-03-25 17:03:36
|
Acked-by: Jon Maloy <jon...@er...> > -----Original Message----- > From: net...@vg... <net...@vg...> > On Behalf Of Wei Yongjun > Sent: 25-Mar-19 07:31 > To: Jon Maloy <jon...@er...>; Ying Xue > <yin...@wi...>; Hoang Huu Le <hoa...@de...>; > Eric Dumazet <eri...@gm...> > Cc: Wei Yongjun <wei...@hu...>; ne...@vg...; > tip...@li...; ker...@vg...; Hulk > Robot <hu...@hu...> > Subject: [PATCH net-next v2] tipc: fix return value check in > tipc_mcast_send_sync() > > Fix the return value check which testing the wrong variable in > tipc_mcast_send_sync(). > > Fixes: c55c8edafa91 ("tipc: smooth change between replicast and broadcast") > Reported-by: Hulk Robot <hu...@hu...> > Signed-off-by: Wei Yongjun <wei...@hu...> > --- > v1 -> v2: add reported-by > --- > net/tipc/bcast.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c index > 88edfb358ae7..76e14dc08bb9 100644 > --- a/net/tipc/bcast.c > +++ b/net/tipc/bcast.c > @@ -329,7 +329,7 @@ static int tipc_mcast_send_sync(struct net *net, > struct sk_buff *skb, > > /* Allocate dummy message */ > _skb = tipc_buf_acquire(MCAST_H_SIZE, GFP_KERNEL); > - if (!skb) > + if (!_skb) > return -ENOMEM; > > /* Preparing for 'synching' header */ > > |