You can subscribe to this list here.
2003 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
|
Sep
|
Oct
|
Nov
|
Dec
(6) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
2004 |
Jan
(9) |
Feb
(11) |
Mar
(22) |
Apr
(73) |
May
(78) |
Jun
(146) |
Jul
(80) |
Aug
(27) |
Sep
(5) |
Oct
(14) |
Nov
(18) |
Dec
(27) |
2005 |
Jan
(20) |
Feb
(30) |
Mar
(19) |
Apr
(28) |
May
(50) |
Jun
(31) |
Jul
(32) |
Aug
(14) |
Sep
(36) |
Oct
(43) |
Nov
(74) |
Dec
(63) |
2006 |
Jan
(34) |
Feb
(32) |
Mar
(21) |
Apr
(76) |
May
(106) |
Jun
(72) |
Jul
(70) |
Aug
(175) |
Sep
(130) |
Oct
(39) |
Nov
(81) |
Dec
(43) |
2007 |
Jan
(81) |
Feb
(36) |
Mar
(20) |
Apr
(43) |
May
(54) |
Jun
(34) |
Jul
(44) |
Aug
(55) |
Sep
(44) |
Oct
(54) |
Nov
(43) |
Dec
(41) |
2008 |
Jan
(42) |
Feb
(84) |
Mar
(73) |
Apr
(30) |
May
(119) |
Jun
(54) |
Jul
(54) |
Aug
(93) |
Sep
(173) |
Oct
(130) |
Nov
(145) |
Dec
(153) |
2009 |
Jan
(59) |
Feb
(12) |
Mar
(28) |
Apr
(18) |
May
(56) |
Jun
(9) |
Jul
(28) |
Aug
(62) |
Sep
(16) |
Oct
(19) |
Nov
(15) |
Dec
(17) |
2010 |
Jan
(14) |
Feb
(36) |
Mar
(37) |
Apr
(30) |
May
(33) |
Jun
(53) |
Jul
(42) |
Aug
(50) |
Sep
(67) |
Oct
(66) |
Nov
(69) |
Dec
(36) |
2011 |
Jan
(52) |
Feb
(45) |
Mar
(49) |
Apr
(21) |
May
(34) |
Jun
(13) |
Jul
(19) |
Aug
(37) |
Sep
(43) |
Oct
(10) |
Nov
(23) |
Dec
(30) |
2012 |
Jan
(42) |
Feb
(36) |
Mar
(46) |
Apr
(25) |
May
(96) |
Jun
(146) |
Jul
(40) |
Aug
(28) |
Sep
(61) |
Oct
(45) |
Nov
(100) |
Dec
(53) |
2013 |
Jan
(79) |
Feb
(24) |
Mar
(134) |
Apr
(156) |
May
(118) |
Jun
(75) |
Jul
(278) |
Aug
(145) |
Sep
(136) |
Oct
(168) |
Nov
(137) |
Dec
(439) |
2014 |
Jan
(284) |
Feb
(158) |
Mar
(231) |
Apr
(275) |
May
(259) |
Jun
(91) |
Jul
(222) |
Aug
(215) |
Sep
(165) |
Oct
(166) |
Nov
(211) |
Dec
(150) |
2015 |
Jan
(164) |
Feb
(324) |
Mar
(299) |
Apr
(214) |
May
(111) |
Jun
(109) |
Jul
(105) |
Aug
(36) |
Sep
(58) |
Oct
(131) |
Nov
(68) |
Dec
(30) |
2016 |
Jan
(46) |
Feb
(87) |
Mar
(135) |
Apr
(174) |
May
(132) |
Jun
(135) |
Jul
(149) |
Aug
(125) |
Sep
(79) |
Oct
(49) |
Nov
(95) |
Dec
(102) |
2017 |
Jan
(104) |
Feb
(75) |
Mar
(72) |
Apr
(53) |
May
(18) |
Jun
(5) |
Jul
(14) |
Aug
(19) |
Sep
(2) |
Oct
(13) |
Nov
(21) |
Dec
(67) |
2018 |
Jan
(56) |
Feb
(50) |
Mar
(148) |
Apr
(41) |
May
(37) |
Jun
(34) |
Jul
(34) |
Aug
(11) |
Sep
(52) |
Oct
(48) |
Nov
(28) |
Dec
(46) |
2019 |
Jan
(29) |
Feb
(63) |
Mar
(95) |
Apr
(54) |
May
(14) |
Jun
(71) |
Jul
(60) |
Aug
(49) |
Sep
(3) |
Oct
(64) |
Nov
(115) |
Dec
(57) |
2020 |
Jan
(15) |
Feb
(9) |
Mar
(38) |
Apr
(27) |
May
(60) |
Jun
(53) |
Jul
(35) |
Aug
(46) |
Sep
(37) |
Oct
(64) |
Nov
(20) |
Dec
(25) |
2021 |
Jan
(20) |
Feb
(31) |
Mar
(27) |
Apr
(23) |
May
(21) |
Jun
(30) |
Jul
(30) |
Aug
(7) |
Sep
(18) |
Oct
|
Nov
(15) |
Dec
(4) |
2022 |
Jan
(3) |
Feb
(1) |
Mar
(10) |
Apr
|
May
(2) |
Jun
(26) |
Jul
(5) |
Aug
|
Sep
(1) |
Oct
(2) |
Nov
(9) |
Dec
(2) |
2023 |
Jan
(4) |
Feb
(4) |
Mar
(5) |
Apr
(10) |
May
(29) |
Jun
(17) |
Jul
|
Aug
|
Sep
(1) |
Oct
(1) |
Nov
(2) |
Dec
|
2024 |
Jan
|
Feb
(6) |
Mar
|
Apr
(1) |
May
(6) |
Jun
|
Jul
(5) |
Aug
|
Sep
(3) |
Oct
|
Nov
|
Dec
|
2025 |
Jan
|
Feb
(3) |
Mar
|
Apr
|
May
|
Jun
|
Jul
(6) |
Aug
|
Sep
|
Oct
|
Nov
|
Dec
|
From: Jon M. <jon...@er...> - 2019-03-25 16:47:09
|
Just another manifestation of syz...@sy... We have just posted a patch fixing this. ///jon > Signed-off-by: Xin Long <luc...@gm...> > -----Original Message----- > From: net...@vg... <net...@vg...> > On Behalf Of syzbot > Sent: 24-Mar-19 14:38 > To: da...@da...; Jon Maloy <jon...@er...>; > ku...@ms...; lin...@vg...; > ne...@vg...; syz...@go...; tipc- > dis...@li...; yin...@wi...; yoshfuji@linux- > ipv6.org > Subject: Re: inconsistent lock state in icmp_send > > syzbot has bisected this bug to: > > commit 52dfae5c85a4c1078e9f1d5e8947d4a25f73dd81 > Author: Jon Maloy <jon...@er...> > Date: Thu Mar 22 19:42:52 2018 +0000 > > tipc: obtain node identity from interface by default > > bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=12619917200000 > start commit: f5d58277 Merge branch 'for-linus' of git://git.kernel.org/.. > git tree: upstream > final crash: https://syzkaller.appspot.com/x/report.txt?x=11619917200000 > console output: https://syzkaller.appspot.com/x/log.txt?x=16619917200000 > kernel config: https://syzkaller.appspot.com/x/.config?x=c8970c89a0efbb23 > dashboard link: > https://syzkaller.appspot.com/bug?extid=251ec6887ada6eac4921 > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=10ab6ba3400000 > > Reported-by: syz...@sy... > Fixes: 52dfae5c85a4 ("tipc: obtain node identity from interface by default") > > For information about bisection process see: > https://goo.gl/tpsmEJ#bisection |
From: Jon M. <jon...@er...> - 2019-03-25 16:40:02
|
Acked-by: Jon Maloy <jon...@er...> > -----Original Message----- > From: Xin Long <luc...@gm...> > Sent: 23-Mar-19 17:48 > To: network dev <ne...@vg...> > Cc: da...@da...; Jon Maloy <jon...@er...>; Ying Xue > <yin...@wi...>; tip...@li...; > syz...@go... > Subject: [PATCH net] tipc: change to check tipc_own_id to return in > tipc_net_stop > > When running a syz script, a panic occurred: > > [ 156.088228] BUG: KASAN: use-after-free in tipc_disc_timeout+0x9c9/0xb20 > [tipc] [ 156.094315] Call Trace: > [ 156.094844] <IRQ> > [ 156.095306] dump_stack+0x7c/0xc0 > [ 156.097346] print_address_description+0x65/0x22e > [ 156.100445] kasan_report.cold.3+0x37/0x7a [ 156.102402] > tipc_disc_timeout+0x9c9/0xb20 [tipc] [ 156.106517] > call_timer_fn+0x19a/0x610 [ 156.112749] run_timer_softirq+0xb51/0x1090 > > It was caused by the netns freed without deleting the discoverer timer, while > later on the netns would be accessed in the timer handler. > > The timer should have been deleted by tipc_net_stop() when cleaning up a > netns. However, tipc has been able to enable a bearer and start d->timer > without the local node_addr set since Commit 52dfae5c85a4 ("tipc: obtain > node identity from interface by default"), which caused the timer not to be > deleted in tipc_net_stop() then. > > So fix it in tipc_net_stop() by changing to check local node_id instead of local > node_addr, as Jon suggested. > > While at it, remove the calling of tipc_nametbl_withdraw() there, since > tipc_nametbl_stop() will take of the nametbl's freeing after. > > Fixes: 52dfae5c85a4 ("tipc: obtain node identity from interface by default") > Reported-by: syz...@sy... > Signed-off-by: Xin Long <luc...@gm...> > --- > net/tipc/net.c | 5 +---- > 1 file changed, 1 insertion(+), 4 deletions(-) > > diff --git a/net/tipc/net.c b/net/tipc/net.c index f076edb..7ce1e86 100644 > --- a/net/tipc/net.c > +++ b/net/tipc/net.c > @@ -163,12 +163,9 @@ void tipc_sched_net_finalize(struct net *net, u32 > addr) > > void tipc_net_stop(struct net *net) > { > - u32 self = tipc_own_addr(net); > - > - if (!self) > + if (!tipc_own_id(net)) > return; > > - tipc_nametbl_withdraw(net, TIPC_CFG_SRV, self, self, self); > rtnl_lock(); > tipc_bearer_stop(net); > tipc_node_stop(net); > -- > 2.1.0 |
From: Tuong L. T. <tuo...@de...> - 2019-03-25 07:33:13
|
Hi Ying! Correct, the idea was inspired from SACK in SCTP protocol but with simplicity. Also, I see your suggestions about the "duplicated packets"... which is connected to the SCTP "delayed SACK" algorithm (i.e. in the case of no payload message loss). In TIPC, as I understand so far, we already have such a delay on acknowledgements by the link "rcv_unacked" (- Jon may correct me?) (but we don’t implement a timer for the SACK delay timeout i.e. the 200ms in the SCTP RFC). However, that duplicated TSN concept is based on DATA chunks _and_ "with no new DATA chunk(s)" which usually happens in case of SACK loss and the sender has tried to retransmit the same DATA chunks when its RTO timer expired..., obviously an immediate SACK is needed in this situation. In TIPC, we might not face with this situation because we do not have a retransmission timer at sender side, duplicates can occur almost due to the overactive NACK sending and should be covered by the 2nd patch of the series. For me, in the case of packet loss, an immediate retransmission is important, otherwise it can reduce the performance. However, because we never know if the packet is really lost or just delayed, we have to apply the "1ms restriction" to reduce duplicates (- as you can also see in the 2nd patch). Fast retransmission was also tried, Jon and I had some discussions before... but the fact is, in TIPC, the sender is passive (due to no retransmission timer) and we could be in trouble if trying to wait for the 2nd or 3rd indications... Instead, the NACK sending criteria has been changed by the 2nd patch to both reduce duplicates but try to keep the performance... Actually, in SCTP, the situation is a bit difference as they play with "chunks" and "multi-streaming" than individual packets like ours at the link layer, and many chunks can be optionally bundled in a single packet because of the slow-start or Nagle's algorithm... Anyway, if you have any ideas to improve TIPC performance more, I will try to see what happens. Thanks a lot! BR/Tuong -----Original Message----- From: Ying Xue <yin...@wi...> Sent: Friday, March 22, 2019 7:52 PM To: Tuong Lien <tuo...@de...>; tip...@li...; jon...@er...; ma...@do... Subject: Re: [net-next 1/3] tipc: improve TIPC throughput by Gap ACK blocks Hi Tuong, Great job! It's a very nice enhancement, and we should do the improvement early. On 3/20/19 11:28 AM, Tuong Lien wrote: > During unicast link transmission, it's observed very often that because > of one or a few lost/dis-ordered packets, the sending side will fastly > reach the send window limit and must wait for the packets to be arrived > at the receiving side or in the worst case, a retransmission must be > done first. The sending side cannot release a lot of subsequent packets > in its transmq even though all of them might have already been received > by the receiving side. > That is, one or two packets dis-ordered/lost and dozens of packets have > to wait, this obviously reduces the overall throughput! > > This commit introduces an algorithm to overcome this by using "Gap ACK > blocks". Basically, a Gap ACK block will consist of <ack, gap> numbers > that describes the link deferdq where packets have been got by the > receiving side but with gaps, for example: > > link deferdq: [1 2 3 4 10 11 13 14 15 20] > --> Gap ACK blocks: <4, 5>, <11, 1>, <15, 4>, <20, 0> This idea is the exactly same as SACK of SCTP: https://tools.ietf.org/html/rfc4960#section-3.3.4 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type = 3 |Chunk Flags | Chunk Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Cumulative TSN Ack | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Advertised Receiver Window Credit (a_rwnd) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Number of Gap Ack Blocks = N | Number of Duplicate TSNs = X | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Gap Ack Block #1 Start | Gap Ack Block #1 End | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ / / \ ... \ / / +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Gap Ack Block #N Start | Gap Ack Block #N End | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Duplicate TSN 1 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ / / \ ... \ / / +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Duplicate TSN X | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ For example, assume that the receiver has the following DATA chunks newly arrived at the time when it decides to send a Selective ACK, ---------- | TSN=17 | ---------- | | <- still missing ---------- | TSN=15 | ---------- | TSN=14 | ---------- | | <- still missing ---------- | TSN=12 | ---------- | TSN=11 | ---------- | TSN=10 | ---------- then the parameter part of the SACK MUST be constructed as follows (assuming the new a_rwnd is set to 4660 by the sender): +--------------------------------+ | Cumulative TSN Ack = 12 | +--------------------------------+ | a_rwnd = 4660 | +----------------+---------------+ | num of block=2 | num of dup=0 | +----------------+---------------+ |block #1 strt=2 |block #1 end=3 | +----------------+---------------+ |block #2 strt=5 |block #2 end=5 | +----------------+---------------+ It seems more easier to understand than our solution. Of course, I don't intend to disagree to your proposal. In addition, we also can consider to mark duplicated packet as SCTP is doing. For duplicated packets, SCTP adapt the strategy: When a packet arrives with duplicate DATA chunk(s) and with no new DATA chunk(s), the endpoint MUST _immediately_ send a SACK with no delay. If a packet arrives with duplicate DATA chunk(s) bundled with new DATA chunks, the endpoint MAY immediately send a SACK. In the current solution, when a sender detects a gap in tipc_link_advance_transmq(), it will _immediately_ retransmit missing messages. However, probably we can borrow some idea from SCTP Fast Retransmit algorithm, which might help us further improve throughput performance: https://tools.ietf.org/html/rfc4960#section-7.2.4 In the absence of data loss, an endpoint performs delayed acknowledgement. However, whenever an endpoint notices a hole in the arriving TSN sequence, it SHOULD start sending a SACK back every time a packet arrives carrying data until the hole is filled. Whenever an endpoint receives a SACK that indicates that some TSNs are missing, it SHOULD wait for _two further miss indications_ (via subsequent SACKs for a total of three missing reports) on the same TSNs before taking action with regard to Fast Retransmit. In SCTP, whenever sender endpoint detects a gap, it doesn't immediately retransmit the missing packet. Instead, it will wait for two further miss indications. Of course, SCTP flow control management algorithm is not the exactly same with TIPC. Especially, each SCTP endpoint has a retransmission timer, slow-start etc. Maybe its algorithms are all useful for TIPC scenario, but I think we can try to use the two ideas I mentioned above based on the current solution to check whether they can further improve TIPC transmission performance. Thanks, Ying > > The Gap ACK blocks will be sent to the sending side along with the > traditional ACK or NACK message. Immediately when receiving the message > the sending side will now not only release from its transmq the packets > ack-ed by the ACK but also by the Gap ACK blocks! So, more packets can > be enqueued and transmitted. > In addition, the sending side can now do "multi-retransmissions" > according to the Gaps reported in the Gap ACK blocks. > > The new algorithm as verified helps greatly improve the TIPC throughput > especially under packet loss condition. > > So far, a maximum of 32 blocks is quite enough without any "Too few Gap > ACK blocks" reports with a 5.0% packet loss rate, however this number > can be increased in the furture if needed. > > Also, the patch is backward compatible. > > Acked-by: Jon Maloy <jon...@er...> > Signed-off-by: Tuong Lien <tuo...@de...> > --- > net/tipc/link.c | 134 +++++++++++++++++++++++++++++++++++++++++++++++++++----- > net/tipc/msg.h | 30 +++++++++++++ > net/tipc/node.h | 6 ++- > 3 files changed, 158 insertions(+), 12 deletions(-) > > diff --git a/net/tipc/link.c b/net/tipc/link.c > index 52d23b3ffaf5..5aee1ed23ba9 100644 > --- a/net/tipc/link.c > +++ b/net/tipc/link.c > @@ -246,6 +246,10 @@ static int tipc_link_build_nack_msg(struct tipc_link *l, > static void tipc_link_build_bc_init_msg(struct tipc_link *l, > struct sk_buff_head *xmitq); > static bool tipc_link_release_pkts(struct tipc_link *l, u16 to); > +static u16 tipc_build_gap_ack_blks(struct tipc_link *l, void *data); > +static void tipc_link_advance_transmq(struct tipc_link *l, u16 acked, u16 gap, > + struct tipc_gap_ack_blks *ga, > + struct sk_buff_head *xmitq); > > /* > * Simple non-static link routines (i.e. referenced outside this file) > @@ -1226,6 +1230,102 @@ static bool tipc_link_release_pkts(struct tipc_link *l, u16 acked) > return released; > } > > +/* tipc_build_gap_ack_blks - build Gap ACK blocks > + * @l: tipc link that data have come with gaps in sequence if any > + * @data: data buffer to store the Gap ACK blocks after built > + * > + * returns the actual allocated memory size > + */ > +static u16 tipc_build_gap_ack_blks(struct tipc_link *l, void *data) > +{ > + struct sk_buff *skb = skb_peek(&l->deferdq); > + struct tipc_gap_ack_blks *ga = data; > + u16 len, expect, seqno = 0; > + u8 n = 0; > + > + if (!skb) > + goto exit; > + > + expect = buf_seqno(skb); > + skb_queue_walk(&l->deferdq, skb) { > + seqno = buf_seqno(skb); > + if (unlikely(more(seqno, expect))) { > + ga->gacks[n].ack = htons(expect - 1); > + ga->gacks[n].gap = htons(seqno - expect); > + if (++n >= MAX_GAP_ACK_BLKS) { > + pr_info_ratelimited("Too few Gap ACK blocks!\n"); > + goto exit; > + } > + } else if (unlikely(less(seqno, expect))) { > + pr_warn("Unexpected skb in deferdq!\n"); > + continue; > + } > + expect = seqno + 1; > + } > + > + /* last block */ > + ga->gacks[n].ack = htons(seqno); > + ga->gacks[n].gap = 0; > + n++; > + > +exit: > + len = tipc_gap_ack_blks_sz(n); > + ga->len = htons(len); > + ga->gack_cnt = n; > + return len; > +} > + > +/* tipc_link_advance_transmq - advance TIPC link transmq queue by releasing > + * acked packets, also doing retransmissions if > + * gaps found > + * @l: tipc link with transmq queue to be advanced > + * @acked: seqno of last packet acked by peer without any gaps before > + * @gap: # of gap packets > + * @ga: buffer pointer to Gap ACK blocks from peer > + * @xmitq: queue for accumulating the retransmitted packets if any > + */ > +static void tipc_link_advance_transmq(struct tipc_link *l, u16 acked, u16 gap, > + struct tipc_gap_ack_blks *ga, > + struct sk_buff_head *xmitq) > +{ > + struct sk_buff *skb, *_skb, *tmp; > + struct tipc_msg *hdr; > + u16 bc_ack = l->bc_rcvlink->rcv_nxt - 1; > + u16 ack = l->rcv_nxt - 1; > + u16 seqno; > + u16 n = 0; > + > + skb_queue_walk_safe(&l->transmq, skb, tmp) { > + seqno = buf_seqno(skb); > + > +next_gap_ack: > + if (less_eq(seqno, acked)) { > + /* release skb */ > + __skb_unlink(skb, &l->transmq); > + kfree_skb(skb); > + } else if (less_eq(seqno, acked + gap)) { > + /* retransmit skb */ > + _skb = __pskb_copy(skb, MIN_H_SIZE, GFP_ATOMIC); > + if (!_skb) > + continue; > + hdr = buf_msg(_skb); > + msg_set_ack(hdr, ack); > + msg_set_bcast_ack(hdr, bc_ack); > + _skb->priority = TC_PRIO_CONTROL; > + __skb_queue_tail(xmitq, _skb); > + l->stats.retransmitted++; > + } else { > + /* retry with Gap ACK blocks if any */ > + if (!ga || n >= ga->gack_cnt) > + break; > + acked = ntohs(ga->gacks[n].ack); > + gap = ntohs(ga->gacks[n].gap); > + n++; > + goto next_gap_ack; > + } > + } > +} > + > /* tipc_link_build_state_msg: prepare link state message for transmission > * > * Note that sending of broadcast ack is coordinated among nodes, to reduce > @@ -1378,6 +1478,7 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe, > struct tipc_mon_state *mstate = &l->mon_state; > int dlen = 0; > void *data; > + u16 glen = 0; > > /* Don't send protocol message during reset or link failover */ > if (tipc_link_is_blocked(l)) > @@ -1390,8 +1491,8 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe, > rcvgap = buf_seqno(skb_peek(dfq)) - l->rcv_nxt; > > skb = tipc_msg_create(LINK_PROTOCOL, mtyp, INT_H_SIZE, > - tipc_max_domain_size, l->addr, > - tipc_own_addr(l->net), 0, 0, 0); > + tipc_max_domain_size + MAX_GAP_ACK_BLKS_SZ, > + l->addr, tipc_own_addr(l->net), 0, 0, 0); > if (!skb) > return; > > @@ -1418,9 +1519,11 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe, > msg_set_bc_gap(hdr, link_bc_rcv_gap(bcl)); > msg_set_probe(hdr, probe); > msg_set_is_keepalive(hdr, probe || probe_reply); > - tipc_mon_prep(l->net, data, &dlen, mstate, l->bearer_id); > - msg_set_size(hdr, INT_H_SIZE + dlen); > - skb_trim(skb, INT_H_SIZE + dlen); > + if (l->peer_caps & TIPC_GAP_ACK_BLOCK) > + glen = tipc_build_gap_ack_blks(l, data); > + tipc_mon_prep(l->net, data + glen, &dlen, mstate, l->bearer_id); > + msg_set_size(hdr, INT_H_SIZE + glen + dlen); > + skb_trim(skb, INT_H_SIZE + glen + dlen); > l->stats.sent_states++; > l->rcv_unacked = 0; > } else { > @@ -1590,6 +1693,7 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb, > struct sk_buff_head *xmitq) > { > struct tipc_msg *hdr = buf_msg(skb); > + struct tipc_gap_ack_blks *ga = NULL; > u16 rcvgap = 0; > u16 ack = msg_ack(hdr); > u16 gap = msg_seq_gap(hdr); > @@ -1600,6 +1704,7 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb, > u16 dlen = msg_data_sz(hdr); > int mtyp = msg_type(hdr); > bool reply = msg_probe(hdr); > + u16 glen = 0; > void *data; > char *if_name; > int rc = 0; > @@ -1697,7 +1802,17 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb, > rc = TIPC_LINK_UP_EVT; > break; > } > - tipc_mon_rcv(l->net, data, dlen, l->addr, > + > + /* Receive Gap ACK blocks from peer if any */ > + if (l->peer_caps & TIPC_GAP_ACK_BLOCK) { > + ga = (struct tipc_gap_ack_blks *)data; > + glen = ntohs(ga->len); > + /* sanity check: if failed, ignore Gap ACK blocks */ > + if (glen != tipc_gap_ack_blks_sz(ga->gack_cnt)) > + ga = NULL; > + } > + > + tipc_mon_rcv(l->net, data + glen, dlen - glen, l->addr, > &l->mon_state, l->bearer_id); > > /* Send NACK if peer has sent pkts we haven't received yet */ > @@ -1706,13 +1821,12 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb, > if (rcvgap || reply) > tipc_link_build_proto_msg(l, STATE_MSG, 0, reply, > rcvgap, 0, 0, xmitq); > - tipc_link_release_pkts(l, ack); > + > + tipc_link_advance_transmq(l, ack, gap, ga, xmitq); > > /* If NACK, retransmit will now start at right position */ > - if (gap) { > - rc = tipc_link_retrans(l, l, ack + 1, ack + gap, xmitq); > + if (gap) > l->stats.recv_nacks++; > - } > > tipc_link_advance_backlog(l, xmitq); > if (unlikely(!skb_queue_empty(&l->wakeupq))) > diff --git a/net/tipc/msg.h b/net/tipc/msg.h > index 528ba9241acc..dcb73421c19a 100644 > --- a/net/tipc/msg.h > +++ b/net/tipc/msg.h > @@ -117,6 +117,36 @@ struct tipc_msg { > __be32 hdr[15]; > }; > > +/* struct tipc_gap_ack - TIPC Gap ACK block > + * @ack: seqno of the last consecutive packet in link deferdq > + * @gap: number of gap packets since the last ack > + * > + * E.g: > + * link deferdq: 1 2 3 4 10 11 13 14 15 20 > + * --> Gap ACK blocks: <4, 5>, <11, 1>, <15, 4>, <20, 0> > + */ > +struct tipc_gap_ack { > + __be16 ack; > + __be16 gap; > +}; > + > +/* struct tipc_gap_ack_blks > + * @len: actual length of the record > + * @gack_cnt: number of Gap ACK blocks in the record > + * @gacks: array of Gap ACK blocks > + */ > +struct tipc_gap_ack_blks { > + __be16 len; > + u8 gack_cnt; > + struct tipc_gap_ack gacks[]; > +}; > + > +#define tipc_gap_ack_blks_sz(n) (sizeof(struct tipc_gap_ack_blks) + \ > + sizeof(struct tipc_gap_ack) * (n)) > + > +#define MAX_GAP_ACK_BLKS 32 > +#define MAX_GAP_ACK_BLKS_SZ tipc_gap_ack_blks_sz(MAX_GAP_ACK_BLKS) > + > static inline struct tipc_msg *buf_msg(struct sk_buff *skb) > { > return (struct tipc_msg *)skb->data; > diff --git a/net/tipc/node.h b/net/tipc/node.h > index 2404225c5d58..c0bf49ea3de4 100644 > --- a/net/tipc/node.h > +++ b/net/tipc/node.h > @@ -52,7 +52,8 @@ enum { > TIPC_BCAST_RCAST = (1 << 4), > TIPC_NODE_ID128 = (1 << 5), > TIPC_LINK_PROTO_SEQNO = (1 << 6), > - TIPC_MCAST_RBCTL = (1 << 7) > + TIPC_MCAST_RBCTL = (1 << 7), > + TIPC_GAP_ACK_BLOCK = (1 << 8) > }; > > #define TIPC_NODE_CAPABILITIES (TIPC_SYN_BIT | \ > @@ -62,7 +63,8 @@ enum { > TIPC_BLOCK_FLOWCTL | \ > TIPC_NODE_ID128 | \ > TIPC_LINK_PROTO_SEQNO | \ > - TIPC_MCAST_RBCTL) > + TIPC_MCAST_RBCTL | \ > + TIPC_GAP_ACK_BLOCK) > #define INVALID_BEARER_ID -1 > > void tipc_node_stop(struct net *net); > |
From: Ying X. <yin...@wi...> - 2019-03-24 07:42:01
|
On 3/24/19 12:48 AM, Xin Long wrote: > When running a syz script, a panic occurred: > > [ 156.088228] BUG: KASAN: use-after-free in tipc_disc_timeout+0x9c9/0xb20 [tipc] > [ 156.094315] Call Trace: > [ 156.094844] <IRQ> > [ 156.095306] dump_stack+0x7c/0xc0 > [ 156.097346] print_address_description+0x65/0x22e > [ 156.100445] kasan_report.cold.3+0x37/0x7a > [ 156.102402] tipc_disc_timeout+0x9c9/0xb20 [tipc] > [ 156.106517] call_timer_fn+0x19a/0x610 > [ 156.112749] run_timer_softirq+0xb51/0x1090 > > It was caused by the netns freed without deleting the discoverer timer, > while later on the netns would be accessed in the timer handler. > > The timer should have been deleted by tipc_net_stop() when cleaning up a > netns. However, tipc has been able to enable a bearer and start d->timer > without the local node_addr set since Commit 52dfae5c85a4 ("tipc: obtain > node identity from interface by default"), which caused the timer not to > be deleted in tipc_net_stop() then. > > So fix it in tipc_net_stop() by changing to check local node_id instead > of local node_addr, as Jon suggested. > > While at it, remove the calling of tipc_nametbl_withdraw() there, since > tipc_nametbl_stop() will take of the nametbl's freeing after. > > Fixes: 52dfae5c85a4 ("tipc: obtain node identity from interface by default") > Reported-by: syz...@sy... > Signed-off-by: Xin Long <luc...@gm...> Acked-by: Ying Xue <yin...@wi...> > --- > net/tipc/net.c | 5 +---- > 1 file changed, 1 insertion(+), 4 deletions(-) > > diff --git a/net/tipc/net.c b/net/tipc/net.c > index f076edb..7ce1e86 100644 > --- a/net/tipc/net.c > +++ b/net/tipc/net.c > @@ -163,12 +163,9 @@ void tipc_sched_net_finalize(struct net *net, u32 addr) > > void tipc_net_stop(struct net *net) > { > - u32 self = tipc_own_addr(net); > - > - if (!self) > + if (!tipc_own_id(net)) > return; > > - tipc_nametbl_withdraw(net, TIPC_CFG_SRV, self, self, self); > rtnl_lock(); > tipc_bearer_stop(net); > tipc_node_stop(net); > |
From: David M. <da...@da...> - 2019-03-24 01:46:29
|
From: Jon Maloy <jon...@er...> Date: Fri, 22 Mar 2019 15:03:51 +0100 > When checking the code with clang -Wsometimes-uninitialized we get the > following warning: > > if (!tipc_link_is_establishing(l)) { > ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > net/tipc/node.c:847:46: note: uninitialized use occurs here > tipc_bearer_xmit(n->net, bearer_id, &xmitq, maddr); > > net/tipc/node.c:831:2: note: remove the 'if' if its condition is always > true > if (!tipc_link_is_establishing(l)) { > ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > net/tipc/node.c:821:31: note: initialize the variable 'maddr' to silence > this warning > struct tipc_media_addr *maddr; > > We fix this by initializing 'maddr' to NULL. For the matter of clarity, > we also test if 'xmitq' is non-empty before we use it and 'maddr' > further down in the function. It will never happen that 'xmitq' is non- > empty at the same time as 'maddr' is NULL, so this is a sufficient test. > > Fixes: 598411d70f85 ("tipc: make resetting of links non-atomic") > Reported-by: Nathan Chancellor <nat...@gm...> > Signed-off-by: Jon Maloy <jon...@er...> Applied. |
From: Xin L. <luc...@gm...> - 2019-03-23 16:48:38
|
When running a syz script, a panic occurred: [ 156.088228] BUG: KASAN: use-after-free in tipc_disc_timeout+0x9c9/0xb20 [tipc] [ 156.094315] Call Trace: [ 156.094844] <IRQ> [ 156.095306] dump_stack+0x7c/0xc0 [ 156.097346] print_address_description+0x65/0x22e [ 156.100445] kasan_report.cold.3+0x37/0x7a [ 156.102402] tipc_disc_timeout+0x9c9/0xb20 [tipc] [ 156.106517] call_timer_fn+0x19a/0x610 [ 156.112749] run_timer_softirq+0xb51/0x1090 It was caused by the netns freed without deleting the discoverer timer, while later on the netns would be accessed in the timer handler. The timer should have been deleted by tipc_net_stop() when cleaning up a netns. However, tipc has been able to enable a bearer and start d->timer without the local node_addr set since Commit 52dfae5c85a4 ("tipc: obtain node identity from interface by default"), which caused the timer not to be deleted in tipc_net_stop() then. So fix it in tipc_net_stop() by changing to check local node_id instead of local node_addr, as Jon suggested. While at it, remove the calling of tipc_nametbl_withdraw() there, since tipc_nametbl_stop() will take of the nametbl's freeing after. Fixes: 52dfae5c85a4 ("tipc: obtain node identity from interface by default") Reported-by: syz...@sy... Signed-off-by: Xin Long <luc...@gm...> --- net/tipc/net.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/net/tipc/net.c b/net/tipc/net.c index f076edb..7ce1e86 100644 --- a/net/tipc/net.c +++ b/net/tipc/net.c @@ -163,12 +163,9 @@ void tipc_sched_net_finalize(struct net *net, u32 addr) void tipc_net_stop(struct net *net) { - u32 self = tipc_own_addr(net); - - if (!self) + if (!tipc_own_id(net)) return; - tipc_nametbl_withdraw(net, TIPC_CFG_SRV, self, self, self); rtnl_lock(); tipc_bearer_stop(net); tipc_node_stop(net); -- 2.1.0 |
From: Ying X. <yin...@wi...> - 2019-03-22 15:45:23
|
Hi Tuong, Great job! It's a very nice enhancement, and we should do the improvement early. On 3/20/19 11:28 AM, Tuong Lien wrote: > During unicast link transmission, it's observed very often that because > of one or a few lost/dis-ordered packets, the sending side will fastly > reach the send window limit and must wait for the packets to be arrived > at the receiving side or in the worst case, a retransmission must be > done first. The sending side cannot release a lot of subsequent packets > in its transmq even though all of them might have already been received > by the receiving side. > That is, one or two packets dis-ordered/lost and dozens of packets have > to wait, this obviously reduces the overall throughput! > > This commit introduces an algorithm to overcome this by using "Gap ACK > blocks". Basically, a Gap ACK block will consist of <ack, gap> numbers > that describes the link deferdq where packets have been got by the > receiving side but with gaps, for example: > > link deferdq: [1 2 3 4 10 11 13 14 15 20] > --> Gap ACK blocks: <4, 5>, <11, 1>, <15, 4>, <20, 0> This idea is the exactly same as SACK of SCTP: https://tools.ietf.org/html/rfc4960#section-3.3.4 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type = 3 |Chunk Flags | Chunk Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Cumulative TSN Ack | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Advertised Receiver Window Credit (a_rwnd) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Number of Gap Ack Blocks = N | Number of Duplicate TSNs = X | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Gap Ack Block #1 Start | Gap Ack Block #1 End | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ / / \ ... \ / / +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Gap Ack Block #N Start | Gap Ack Block #N End | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Duplicate TSN 1 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ / / \ ... \ / / +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Duplicate TSN X | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ For example, assume that the receiver has the following DATA chunks newly arrived at the time when it decides to send a Selective ACK, ---------- | TSN=17 | ---------- | | <- still missing ---------- | TSN=15 | ---------- | TSN=14 | ---------- | | <- still missing ---------- | TSN=12 | ---------- | TSN=11 | ---------- | TSN=10 | ---------- then the parameter part of the SACK MUST be constructed as follows (assuming the new a_rwnd is set to 4660 by the sender): +--------------------------------+ | Cumulative TSN Ack = 12 | +--------------------------------+ | a_rwnd = 4660 | +----------------+---------------+ | num of block=2 | num of dup=0 | +----------------+---------------+ |block #1 strt=2 |block #1 end=3 | +----------------+---------------+ |block #2 strt=5 |block #2 end=5 | +----------------+---------------+ It seems more easier to understand than our solution. Of course, I don't intend to disagree to your proposal. In addition, we also can consider to mark duplicated packet as SCTP is doing. For duplicated packets, SCTP adapt the strategy: When a packet arrives with duplicate DATA chunk(s) and with no new DATA chunk(s), the endpoint MUST _immediately_ send a SACK with no delay. If a packet arrives with duplicate DATA chunk(s) bundled with new DATA chunks, the endpoint MAY immediately send a SACK. In the current solution, when a sender detects a gap in tipc_link_advance_transmq(), it will _immediately_ retransmit missing messages. However, probably we can borrow some idea from SCTP Fast Retransmit algorithm, which might help us further improve throughput performance: https://tools.ietf.org/html/rfc4960#section-7.2.4 In the absence of data loss, an endpoint performs delayed acknowledgement. However, whenever an endpoint notices a hole in the arriving TSN sequence, it SHOULD start sending a SACK back every time a packet arrives carrying data until the hole is filled. Whenever an endpoint receives a SACK that indicates that some TSNs are missing, it SHOULD wait for _two further miss indications_ (via subsequent SACKs for a total of three missing reports) on the same TSNs before taking action with regard to Fast Retransmit. In SCTP, whenever sender endpoint detects a gap, it doesn't immediately retransmit the missing packet. Instead, it will wait for two further miss indications. Of course, SCTP flow control management algorithm is not the exactly same with TIPC. Especially, each SCTP endpoint has a retransmission timer, slow-start etc. Maybe its algorithms are all useful for TIPC scenario, but I think we can try to use the two ideas I mentioned above based on the current solution to check whether they can further improve TIPC transmission performance. Thanks, Ying > > The Gap ACK blocks will be sent to the sending side along with the > traditional ACK or NACK message. Immediately when receiving the message > the sending side will now not only release from its transmq the packets > ack-ed by the ACK but also by the Gap ACK blocks! So, more packets can > be enqueued and transmitted. > In addition, the sending side can now do "multi-retransmissions" > according to the Gaps reported in the Gap ACK blocks. > > The new algorithm as verified helps greatly improve the TIPC throughput > especially under packet loss condition. > > So far, a maximum of 32 blocks is quite enough without any "Too few Gap > ACK blocks" reports with a 5.0% packet loss rate, however this number > can be increased in the furture if needed. > > Also, the patch is backward compatible. > > Acked-by: Jon Maloy <jon...@er...> > Signed-off-by: Tuong Lien <tuo...@de...> > --- > net/tipc/link.c | 134 +++++++++++++++++++++++++++++++++++++++++++++++++++----- > net/tipc/msg.h | 30 +++++++++++++ > net/tipc/node.h | 6 ++- > 3 files changed, 158 insertions(+), 12 deletions(-) > > diff --git a/net/tipc/link.c b/net/tipc/link.c > index 52d23b3ffaf5..5aee1ed23ba9 100644 > --- a/net/tipc/link.c > +++ b/net/tipc/link.c > @@ -246,6 +246,10 @@ static int tipc_link_build_nack_msg(struct tipc_link *l, > static void tipc_link_build_bc_init_msg(struct tipc_link *l, > struct sk_buff_head *xmitq); > static bool tipc_link_release_pkts(struct tipc_link *l, u16 to); > +static u16 tipc_build_gap_ack_blks(struct tipc_link *l, void *data); > +static void tipc_link_advance_transmq(struct tipc_link *l, u16 acked, u16 gap, > + struct tipc_gap_ack_blks *ga, > + struct sk_buff_head *xmitq); > > /* > * Simple non-static link routines (i.e. referenced outside this file) > @@ -1226,6 +1230,102 @@ static bool tipc_link_release_pkts(struct tipc_link *l, u16 acked) > return released; > } > > +/* tipc_build_gap_ack_blks - build Gap ACK blocks > + * @l: tipc link that data have come with gaps in sequence if any > + * @data: data buffer to store the Gap ACK blocks after built > + * > + * returns the actual allocated memory size > + */ > +static u16 tipc_build_gap_ack_blks(struct tipc_link *l, void *data) > +{ > + struct sk_buff *skb = skb_peek(&l->deferdq); > + struct tipc_gap_ack_blks *ga = data; > + u16 len, expect, seqno = 0; > + u8 n = 0; > + > + if (!skb) > + goto exit; > + > + expect = buf_seqno(skb); > + skb_queue_walk(&l->deferdq, skb) { > + seqno = buf_seqno(skb); > + if (unlikely(more(seqno, expect))) { > + ga->gacks[n].ack = htons(expect - 1); > + ga->gacks[n].gap = htons(seqno - expect); > + if (++n >= MAX_GAP_ACK_BLKS) { > + pr_info_ratelimited("Too few Gap ACK blocks!\n"); > + goto exit; > + } > + } else if (unlikely(less(seqno, expect))) { > + pr_warn("Unexpected skb in deferdq!\n"); > + continue; > + } > + expect = seqno + 1; > + } > + > + /* last block */ > + ga->gacks[n].ack = htons(seqno); > + ga->gacks[n].gap = 0; > + n++; > + > +exit: > + len = tipc_gap_ack_blks_sz(n); > + ga->len = htons(len); > + ga->gack_cnt = n; > + return len; > +} > + > +/* tipc_link_advance_transmq - advance TIPC link transmq queue by releasing > + * acked packets, also doing retransmissions if > + * gaps found > + * @l: tipc link with transmq queue to be advanced > + * @acked: seqno of last packet acked by peer without any gaps before > + * @gap: # of gap packets > + * @ga: buffer pointer to Gap ACK blocks from peer > + * @xmitq: queue for accumulating the retransmitted packets if any > + */ > +static void tipc_link_advance_transmq(struct tipc_link *l, u16 acked, u16 gap, > + struct tipc_gap_ack_blks *ga, > + struct sk_buff_head *xmitq) > +{ > + struct sk_buff *skb, *_skb, *tmp; > + struct tipc_msg *hdr; > + u16 bc_ack = l->bc_rcvlink->rcv_nxt - 1; > + u16 ack = l->rcv_nxt - 1; > + u16 seqno; > + u16 n = 0; > + > + skb_queue_walk_safe(&l->transmq, skb, tmp) { > + seqno = buf_seqno(skb); > + > +next_gap_ack: > + if (less_eq(seqno, acked)) { > + /* release skb */ > + __skb_unlink(skb, &l->transmq); > + kfree_skb(skb); > + } else if (less_eq(seqno, acked + gap)) { > + /* retransmit skb */ > + _skb = __pskb_copy(skb, MIN_H_SIZE, GFP_ATOMIC); > + if (!_skb) > + continue; > + hdr = buf_msg(_skb); > + msg_set_ack(hdr, ack); > + msg_set_bcast_ack(hdr, bc_ack); > + _skb->priority = TC_PRIO_CONTROL; > + __skb_queue_tail(xmitq, _skb); > + l->stats.retransmitted++; > + } else { > + /* retry with Gap ACK blocks if any */ > + if (!ga || n >= ga->gack_cnt) > + break; > + acked = ntohs(ga->gacks[n].ack); > + gap = ntohs(ga->gacks[n].gap); > + n++; > + goto next_gap_ack; > + } > + } > +} > + > /* tipc_link_build_state_msg: prepare link state message for transmission > * > * Note that sending of broadcast ack is coordinated among nodes, to reduce > @@ -1378,6 +1478,7 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe, > struct tipc_mon_state *mstate = &l->mon_state; > int dlen = 0; > void *data; > + u16 glen = 0; > > /* Don't send protocol message during reset or link failover */ > if (tipc_link_is_blocked(l)) > @@ -1390,8 +1491,8 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe, > rcvgap = buf_seqno(skb_peek(dfq)) - l->rcv_nxt; > > skb = tipc_msg_create(LINK_PROTOCOL, mtyp, INT_H_SIZE, > - tipc_max_domain_size, l->addr, > - tipc_own_addr(l->net), 0, 0, 0); > + tipc_max_domain_size + MAX_GAP_ACK_BLKS_SZ, > + l->addr, tipc_own_addr(l->net), 0, 0, 0); > if (!skb) > return; > > @@ -1418,9 +1519,11 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe, > msg_set_bc_gap(hdr, link_bc_rcv_gap(bcl)); > msg_set_probe(hdr, probe); > msg_set_is_keepalive(hdr, probe || probe_reply); > - tipc_mon_prep(l->net, data, &dlen, mstate, l->bearer_id); > - msg_set_size(hdr, INT_H_SIZE + dlen); > - skb_trim(skb, INT_H_SIZE + dlen); > + if (l->peer_caps & TIPC_GAP_ACK_BLOCK) > + glen = tipc_build_gap_ack_blks(l, data); > + tipc_mon_prep(l->net, data + glen, &dlen, mstate, l->bearer_id); > + msg_set_size(hdr, INT_H_SIZE + glen + dlen); > + skb_trim(skb, INT_H_SIZE + glen + dlen); > l->stats.sent_states++; > l->rcv_unacked = 0; > } else { > @@ -1590,6 +1693,7 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb, > struct sk_buff_head *xmitq) > { > struct tipc_msg *hdr = buf_msg(skb); > + struct tipc_gap_ack_blks *ga = NULL; > u16 rcvgap = 0; > u16 ack = msg_ack(hdr); > u16 gap = msg_seq_gap(hdr); > @@ -1600,6 +1704,7 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb, > u16 dlen = msg_data_sz(hdr); > int mtyp = msg_type(hdr); > bool reply = msg_probe(hdr); > + u16 glen = 0; > void *data; > char *if_name; > int rc = 0; > @@ -1697,7 +1802,17 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb, > rc = TIPC_LINK_UP_EVT; > break; > } > - tipc_mon_rcv(l->net, data, dlen, l->addr, > + > + /* Receive Gap ACK blocks from peer if any */ > + if (l->peer_caps & TIPC_GAP_ACK_BLOCK) { > + ga = (struct tipc_gap_ack_blks *)data; > + glen = ntohs(ga->len); > + /* sanity check: if failed, ignore Gap ACK blocks */ > + if (glen != tipc_gap_ack_blks_sz(ga->gack_cnt)) > + ga = NULL; > + } > + > + tipc_mon_rcv(l->net, data + glen, dlen - glen, l->addr, > &l->mon_state, l->bearer_id); > > /* Send NACK if peer has sent pkts we haven't received yet */ > @@ -1706,13 +1821,12 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb, > if (rcvgap || reply) > tipc_link_build_proto_msg(l, STATE_MSG, 0, reply, > rcvgap, 0, 0, xmitq); > - tipc_link_release_pkts(l, ack); > + > + tipc_link_advance_transmq(l, ack, gap, ga, xmitq); > > /* If NACK, retransmit will now start at right position */ > - if (gap) { > - rc = tipc_link_retrans(l, l, ack + 1, ack + gap, xmitq); > + if (gap) > l->stats.recv_nacks++; > - } > > tipc_link_advance_backlog(l, xmitq); > if (unlikely(!skb_queue_empty(&l->wakeupq))) > diff --git a/net/tipc/msg.h b/net/tipc/msg.h > index 528ba9241acc..dcb73421c19a 100644 > --- a/net/tipc/msg.h > +++ b/net/tipc/msg.h > @@ -117,6 +117,36 @@ struct tipc_msg { > __be32 hdr[15]; > }; > > +/* struct tipc_gap_ack - TIPC Gap ACK block > + * @ack: seqno of the last consecutive packet in link deferdq > + * @gap: number of gap packets since the last ack > + * > + * E.g: > + * link deferdq: 1 2 3 4 10 11 13 14 15 20 > + * --> Gap ACK blocks: <4, 5>, <11, 1>, <15, 4>, <20, 0> > + */ > +struct tipc_gap_ack { > + __be16 ack; > + __be16 gap; > +}; > + > +/* struct tipc_gap_ack_blks > + * @len: actual length of the record > + * @gack_cnt: number of Gap ACK blocks in the record > + * @gacks: array of Gap ACK blocks > + */ > +struct tipc_gap_ack_blks { > + __be16 len; > + u8 gack_cnt; > + struct tipc_gap_ack gacks[]; > +}; > + > +#define tipc_gap_ack_blks_sz(n) (sizeof(struct tipc_gap_ack_blks) + \ > + sizeof(struct tipc_gap_ack) * (n)) > + > +#define MAX_GAP_ACK_BLKS 32 > +#define MAX_GAP_ACK_BLKS_SZ tipc_gap_ack_blks_sz(MAX_GAP_ACK_BLKS) > + > static inline struct tipc_msg *buf_msg(struct sk_buff *skb) > { > return (struct tipc_msg *)skb->data; > diff --git a/net/tipc/node.h b/net/tipc/node.h > index 2404225c5d58..c0bf49ea3de4 100644 > --- a/net/tipc/node.h > +++ b/net/tipc/node.h > @@ -52,7 +52,8 @@ enum { > TIPC_BCAST_RCAST = (1 << 4), > TIPC_NODE_ID128 = (1 << 5), > TIPC_LINK_PROTO_SEQNO = (1 << 6), > - TIPC_MCAST_RBCTL = (1 << 7) > + TIPC_MCAST_RBCTL = (1 << 7), > + TIPC_GAP_ACK_BLOCK = (1 << 8) > }; > > #define TIPC_NODE_CAPABILITIES (TIPC_SYN_BIT | \ > @@ -62,7 +63,8 @@ enum { > TIPC_BLOCK_FLOWCTL | \ > TIPC_NODE_ID128 | \ > TIPC_LINK_PROTO_SEQNO | \ > - TIPC_MCAST_RBCTL) > + TIPC_MCAST_RBCTL | \ > + TIPC_GAP_ACK_BLOCK) > #define INVALID_BEARER_ID -1 > > void tipc_node_stop(struct net *net); > |
From: Jon M. <jon...@er...> - 2019-03-22 14:04:02
|
When checking the code with clang -Wsometimes-uninitialized we get the following warning: if (!tipc_link_is_establishing(l)) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ net/tipc/node.c:847:46: note: uninitialized use occurs here tipc_bearer_xmit(n->net, bearer_id, &xmitq, maddr); net/tipc/node.c:831:2: note: remove the 'if' if its condition is always true if (!tipc_link_is_establishing(l)) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ net/tipc/node.c:821:31: note: initialize the variable 'maddr' to silence this warning struct tipc_media_addr *maddr; We fix this by initializing 'maddr' to NULL. For the matter of clarity, we also test if 'xmitq' is non-empty before we use it and 'maddr' further down in the function. It will never happen that 'xmitq' is non- empty at the same time as 'maddr' is NULL, so this is a sufficient test. Fixes: 598411d70f85 ("tipc: make resetting of links non-atomic") Reported-by: Nathan Chancellor <nat...@gm...> Signed-off-by: Jon Maloy <jon...@er...> --- net/tipc/node.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/net/tipc/node.c b/net/tipc/node.c index 2dc4919..dd3b6dc 100644 --- a/net/tipc/node.c +++ b/net/tipc/node.c @@ -817,10 +817,10 @@ static void __tipc_node_link_down(struct tipc_node *n, int *bearer_id, static void tipc_node_link_down(struct tipc_node *n, int bearer_id, bool delete) { struct tipc_link_entry *le = &n->links[bearer_id]; + struct tipc_media_addr *maddr = NULL; struct tipc_link *l = le->link; - struct tipc_media_addr *maddr; - struct sk_buff_head xmitq; int old_bearer_id = bearer_id; + struct sk_buff_head xmitq; if (!l) return; @@ -844,7 +844,8 @@ static void tipc_node_link_down(struct tipc_node *n, int bearer_id, bool delete) tipc_node_write_unlock(n); if (delete) tipc_mon_remove_peer(n->net, n->addr, old_bearer_id); - tipc_bearer_xmit(n->net, bearer_id, &xmitq, maddr); + if (!skb_queue_empty(&xmitq)) + tipc_bearer_xmit(n->net, bearer_id, &xmitq, maddr); tipc_sk_rcv(n->net, &le->inputq); } -- 2.1.4 |
From: Hoang Le <hoa...@de...> - 2019-03-22 08:47:58
|
The command prints the actually method that multicast is running in the system. Also 'ratio' value for AUTOSELECT method. A sample usage is shown below: $tipc link get broadcast BROADCAST $tipc link get broadcast AUTOSELECT ratio:30% $tipc link get broadcast -j -p [ { "method": "AUTOSELECT" },{ "ratio": 30 } ] Acked-by: Jon Maloy <jon...@er...> Signed-off-by: Hoang Le <hoa...@de...> --- tipc/link.c | 85 ++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 84 insertions(+), 1 deletion(-) diff --git a/tipc/link.c b/tipc/link.c index e3b10bb7b3d4..e123c1863575 100644 --- a/tipc/link.c +++ b/tipc/link.c @@ -175,10 +175,92 @@ static void cmd_link_get_help(struct cmdl *cmdl) "PROPERTIES\n" " tolerance - Get link tolerance\n" " priority - Get link priority\n" - " window - Get link window\n", + " window - Get link window\n" + " broadcast - Get link broadcast\n", cmdl->argv[0]); } +static int cmd_link_get_bcast_cb(const struct nlmsghdr *nlh, void *data) +{ + int *prop = data; + int prop_ratio = TIPC_NLA_PROP_BROADCAST_RATIO; + struct genlmsghdr *genl = mnl_nlmsg_get_payload(nlh); + struct nlattr *info[TIPC_NLA_MAX + 1] = {}; + struct nlattr *attrs[TIPC_NLA_LINK_MAX + 1] = {}; + struct nlattr *props[TIPC_NLA_PROP_MAX + 1] = {}; + int bc_mode; + + mnl_attr_parse(nlh, sizeof(*genl), parse_attrs, info); + if (!info[TIPC_NLA_LINK]) + return MNL_CB_ERROR; + + mnl_attr_parse_nested(info[TIPC_NLA_LINK], parse_attrs, attrs); + if (!attrs[TIPC_NLA_LINK_PROP]) + return MNL_CB_ERROR; + + mnl_attr_parse_nested(attrs[TIPC_NLA_LINK_PROP], parse_attrs, props); + if (!props[*prop]) + return MNL_CB_ERROR; + + bc_mode = mnl_attr_get_u32(props[*prop]); + + new_json_obj(json); + open_json_object(NULL); + switch (bc_mode) { + case 0x1: + print_string(PRINT_ANY, "method", "%s\n", "BROADCAST"); + break; + case 0x2: + print_string(PRINT_ANY, "method", "%s\n", "REPLICAST"); + break; + case 0x4: + print_string(PRINT_ANY, "method", "%s", "AUTOSELECT"); + close_json_object(); + open_json_object(NULL); + print_uint(PRINT_ANY, "ratio", " ratio:%u%\n", + mnl_attr_get_u32(props[prop_ratio])); + break; + default: + print_string(PRINT_ANY, NULL, "UNKNOWN\n", NULL); + break; + } + close_json_object(); + delete_json_obj(); + return MNL_CB_OK; +} + +static void cmd_link_get_bcast_help(struct cmdl *cmdl) +{ + fprintf(stderr, "Usage: %s link get PPROPERTY\n\n" + "PROPERTIES\n" + " broadcast - Get link broadcast\n", + cmdl->argv[0]); +} + +static int cmd_link_get_bcast(struct nlmsghdr *nlh, const struct cmd *cmd, + struct cmdl *cmdl, void *data) +{ + int prop = TIPC_NLA_PROP_BROADCAST; + char buf[MNL_SOCKET_BUFFER_SIZE]; + struct nlattr *attrs; + + if (help_flag) { + (cmd->help)(cmdl); + return -EINVAL; + } + + nlh = msg_init(buf, TIPC_NL_LINK_GET); + if (!nlh) { + fprintf(stderr, "error, message initialisation failed\n"); + return -1; + } + attrs = mnl_attr_nest_start(nlh, TIPC_NLA_LINK); + /* Direct to broadcast-link setting */ + mnl_attr_put_strz(nlh, TIPC_NLA_LINK_NAME, tipc_bclink_name); + mnl_attr_nest_end(nlh, attrs); + return msg_doit(nlh, cmd_link_get_bcast_cb, &prop); +} + static int cmd_link_get(struct nlmsghdr *nlh, const struct cmd *cmd, struct cmdl *cmdl, void *data) { @@ -186,6 +268,7 @@ static int cmd_link_get(struct nlmsghdr *nlh, const struct cmd *cmd, { PRIORITY_STR, cmd_link_get_prop, cmd_link_get_help }, { TOLERANCE_STR, cmd_link_get_prop, cmd_link_get_help }, { WINDOW_STR, cmd_link_get_prop, cmd_link_get_help }, + { BROADCAST_STR, cmd_link_get_bcast, cmd_link_get_bcast_help }, { NULL } }; -- 2.17.1 |
From: Hoang Le <hoa...@de...> - 2019-03-22 08:47:58
|
Add a man page describing tipc link broadcast command get and set Signed-off-by: Hoang Le <hoa...@de...> --- man/man8/tipc-link.8 | 53 +++++++++++++++++++++++++++++++++++++++----- 1 file changed, 48 insertions(+), 5 deletions(-) diff --git a/man/man8/tipc-link.8 b/man/man8/tipc-link.8 index 01afa1c3ad9f..47dae25d3626 100644 --- a/man/man8/tipc-link.8 +++ b/man/man8/tipc-link.8 @@ -1,4 +1,4 @@ -.TH TIPC-LINK 8 "02 Jun 2015" "iproute2" "Linux" +.TH TIPC-LINK 8 "22 Mar 2019" "iproute2" "Linux" .\" For consistency, please keep padding right aligned. .\" For example '.B "foo " bar' and not '.B foo " bar"' @@ -14,18 +14,36 @@ tipc-link \- show links or modify link properties .ti -8 .B tipc link set -.RB "{ " "priority " +.br +.RB "[ " "{ " "priority " .IR PRIORITY .RB "| " tolerance .IR TOLERANCE .RB "| " window .IR "WINDOW " } -.BI "link " LINK +.BI "link " LINK " ]" +.RB "|" +.br +.RB "[ " +.RB "{ " broadcast " [ " +.IR BROADCAST +.RB " | " +.IR REPLICAST +.RB " | " +.IR AUTOSELECT +.RB "[ " ratio +.IR SIZE +.RB "] " ] " } " "]" .ti -8 .B tipc link get -.RB "{ " "priority" " | " tolerance " | " window " } " link -.I LINK +.br +.RB "[ " "{ " "priority" " | " tolerance " | " window " } " link +.IR LINK " ] " +.RB "|" +.br +.RB "[ " { " broadcast " } " ]" +.br .ti -8 .B tipc link statistics @@ -306,6 +324,31 @@ They are usually transient and occur during the cluster startup phase or network reconfiguration. Possible status are: U or D. The status U implies up and D down. +.SS Broadcast properties +.TP +.B BROADCAST +.br +Forces all multicast traffic to be transmitted via broadcast only, +irrespective of cluster size and number of destinations. + +.TP +.B REPLICAST +.br +Forces all multicast traffic to be transmitted via replicast only, +irrespective of cluster size and number of destinations. + +.TP +.B AUTOSELECT +.br +Auto switching to broadcast or replicast depending on cluster size and +destination node number. + +.TP +.B ratio SIZE +.br +Set the AUTOSELECT criteria, percentage of destination nodes vs cluster +size. + .SH EXAMPLES .PP tipc link monitor list -- 2.17.1 |
From: Hoang Le <hoa...@de...> - 2019-03-22 08:47:57
|
The command added here makes it possible to forcibly configure the broadcast link to use either broadcast or replicast, in addition to the already existing auto selection algorithm. A sample usage is shown below: $tipc link set broadcast BROADCAST $tipc link set broadcast AUTOSELECT ratio 25 $tipc link set broadcast -h Usage: tipc link set broadcast PROPERTY PROPERTIES BROADCAST - Forces all multicast traffic to be transmitted via broadcast only, irrespective of cluster size and number of destinations REPLICAST - Forces all multicast traffic to be transmitted via replicast only, irrespective of cluster size and number of destinations AUTOSELECT - Auto switching to broadcast or replicast depending on cluster size and destination node number ratio SIZE - Set the AUTOSELECT criteria, percentage of destination nodes vs cluster size Acked-by: Jon Maloy <jon...@er...> Signed-off-by: Hoang Le <hoa...@de...> --- include/uapi/linux/tipc_netlink.h | 2 + tipc/link.c | 96 ++++++++++++++++++++++++++++++- 2 files changed, 97 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/tipc_netlink.h b/include/uapi/linux/tipc_netlink.h index 0ebe02ef1a86..efb958fd167d 100644 --- a/include/uapi/linux/tipc_netlink.h +++ b/include/uapi/linux/tipc_netlink.h @@ -281,6 +281,8 @@ enum { TIPC_NLA_PROP_TOL, /* u32 */ TIPC_NLA_PROP_WIN, /* u32 */ TIPC_NLA_PROP_MTU, /* u32 */ + TIPC_NLA_PROP_BROADCAST, /* u32 */ + TIPC_NLA_PROP_BROADCAST_RATIO, /* u32 */ __TIPC_NLA_PROP_MAX, TIPC_NLA_PROP_MAX = __TIPC_NLA_PROP_MAX - 1 diff --git a/tipc/link.c b/tipc/link.c index 43e26da3fa6b..e3b10bb7b3d4 100644 --- a/tipc/link.c +++ b/tipc/link.c @@ -28,6 +28,9 @@ #define PRIORITY_STR "priority" #define TOLERANCE_STR "tolerance" #define WINDOW_STR "window" +#define BROADCAST_STR "broadcast" + +static const char tipc_bclink_name[] = "broadcast-link"; static int link_list_cb(const struct nlmsghdr *nlh, void *data) { @@ -521,7 +524,8 @@ static void cmd_link_set_help(struct cmdl *cmdl) "PROPERTIES\n" " tolerance TOLERANCE - Set link tolerance\n" " priority PRIORITY - Set link priority\n" - " window WINDOW - Set link window\n", + " window WINDOW - Set link window\n" + " broadcast BROADCAST - Set link broadcast\n", cmdl->argv[0]); } @@ -585,6 +589,95 @@ static int cmd_link_set_prop(struct nlmsghdr *nlh, const struct cmd *cmd, return msg_doit(nlh, link_get_cb, &prop); } +static void cmd_link_set_bcast_help(struct cmdl *cmdl) +{ + fprintf(stderr, "Usage: %s link set broadcast PROPERTY\n\n" + "PROPERTIES\n" + " BROADCAST - Forces all multicast traffic to be\n" + " transmitted via broadcast only,\n" + " irrespective of cluster size and number\n" + " of destinations\n\n" + " REPLICAST - Forces all multicast traffic to be\n" + " transmitted via replicast only,\n" + " irrespective of cluster size and number\n" + " of destinations\n\n" + " AUTOSELECT - Auto switching to broadcast or replicast\n" + " depending on cluster size and destination\n" + " node number\n\n" + " ratio SIZE - Set the AUTOSELECT criteria, percentage of\n" + " destination nodes vs cluster size\n\n", + cmdl->argv[0]); +} + +static int cmd_link_set_bcast(struct nlmsghdr *nlh, const struct cmd *cmd, + struct cmdl *cmdl, void *data) +{ + char buf[MNL_SOCKET_BUFFER_SIZE]; + struct nlattr *props; + struct nlattr *attrs; + struct opt *opt; + struct opt opts[] = { + { "BROADCAST", OPT_KEY, NULL }, + { "REPLICAST", OPT_KEY, NULL }, + { "AUTOSELECT", OPT_KEY, NULL }, + { "ratio", OPT_KEYVAL, NULL }, + { NULL } + }; + int method = 0; + + if (help_flag) { + (cmd->help)(cmdl); + return -EINVAL; + } + + if (parse_opts(opts, cmdl) < 0) + return -EINVAL; + + for (opt = opts; opt->key; opt++) + if (opt->val) + break; + + if (!opt || !opt->key) { + (cmd->help)(cmdl); + return -EINVAL; + } + + nlh = msg_init(buf, TIPC_NL_LINK_SET); + if (!nlh) { + fprintf(stderr, "error, message initialisation failed\n"); + return -1; + } + + attrs = mnl_attr_nest_start(nlh, TIPC_NLA_LINK); + /* Direct to broadcast-link setting */ + mnl_attr_put_strz(nlh, TIPC_NLA_LINK_NAME, tipc_bclink_name); + props = mnl_attr_nest_start(nlh, TIPC_NLA_LINK_PROP); + + if (get_opt(opts, "BROADCAST")) + method = 0x1; + else if (get_opt(opts, "REPLICAST")) + method = 0x2; + else if (get_opt(opts, "AUTOSELECT")) + method = 0x4; + + opt = get_opt(opts, "ratio"); + if (!method && !opt) { + (cmd->help)(cmdl); + return -EINVAL; + } + + if (method) + mnl_attr_put_u32(nlh, TIPC_NLA_PROP_BROADCAST, method); + + if (opt) + mnl_attr_put_u32(nlh, TIPC_NLA_PROP_BROADCAST_RATIO, + atoi(opt->val)); + + mnl_attr_nest_end(nlh, props); + mnl_attr_nest_end(nlh, attrs); + return msg_doit(nlh, NULL, NULL); +} + static int cmd_link_set(struct nlmsghdr *nlh, const struct cmd *cmd, struct cmdl *cmdl, void *data) { @@ -592,6 +685,7 @@ static int cmd_link_set(struct nlmsghdr *nlh, const struct cmd *cmd, { PRIORITY_STR, cmd_link_set_prop, cmd_link_set_help }, { TOLERANCE_STR, cmd_link_set_prop, cmd_link_set_help }, { WINDOW_STR, cmd_link_set_prop, cmd_link_set_help }, + { BROADCAST_STR, cmd_link_set_bcast, cmd_link_set_bcast_help }, { NULL } }; -- 2.17.1 |
From: Jon M. <jon...@er...> - 2019-03-21 19:50:09
|
> -----Original Message----- > From: net...@vg... <net...@vg...> > On Behalf Of Arnd Bergmann > Sent: 21-Mar-19 19:23 > To: Jon Maloy <jon...@er...> > Cc: Nick Desaulniers <nde...@go...>; Nathan Chancellor > <nat...@gm...>; Ying Xue <yin...@wi...>; David S. > Miller <da...@da...>; tip...@li...; > Networking <ne...@vg...>; LKML <linux- > ke...@vg...>; cla...@go... > Subject: Re: -Wsometimes-uninitialized Clang warning in net/tipc/node.c > > On Thu, Mar 21, 2019 at 4:25 PM Arnd Bergmann <ar...@ar...> wrote: > > > > On Thu, Mar 21, 2019 at 3:57 PM Jon Maloy <jon...@er...> > wrote: > > > > > > > > > diff --git a/net/tipc/node.c b/net/tipc/node.c index > > > > 2dc4919ab23c..147786795e48 100644 > > > > --- a/net/tipc/node.c > > > > +++ b/net/tipc/node.c > > > > @@ -844,7 +844,8 @@ static void tipc_node_link_down(struct > > > > tipc_node *n, int bearer_id, bool delete) > > > > tipc_node_write_unlock(n); > > > > if (delete) > > > > tipc_mon_remove_peer(n->net, n->addr, old_bearer_id); > > > > - tipc_bearer_xmit(n->net, bearer_id, &xmitq, maddr); > > > > + if (skb_queue_empty(xmitq)) > > > > + tipc_bearer_xmit(n->net, bearer_id, &xmitq, > > > > + maddr); > > > > tipc_sk_rcv(n->net, &le->inputq); } > > > > > > > > This duplicates the check inside of skb_queue_empty(), but I don't > > > > know if the compiler can see through the logic behind the inlined > function calls. > > > > > > Probably not, but this is not in any way time critical. > > > > I meant it's unclear whether compilers should be expected to see that > > skb_queue_empty() is true after the call to __skb_queue_head_init() > > initializes it. > > I reproduced the warning now, and verified that my change eliminates the > warning. I still think that is the more logical solution here. Ok. No problems with that. ///jon > > Arnd |
From: David M. <da...@da...> - 2019-03-21 16:57:36
|
From: Hoang Le <hoa...@de...> Date: Thu, 21 Mar 2019 17:25:18 +0700 > In commit c55c8edafa91 ("tipc: smooth change between replicast and > broadcast") we introduced new method to eliminate the risk of message > reordering that happen in between different nodes. > Unfortunately, we forgot checking at receiving side to ignore intra node. > > We fix this by checking and returning if arrived message from intra node. > > syzbot report: ... > Reported-by: syz...@sy... > Fixes: c55c8eda ("tipc: smooth change between replicast and broadcast") > Acked-by: Jon Maloy <jon...@er...> > Signed-off-by: Hoang Le <hoa...@de...> Applied. |
From: David M. <da...@da...> - 2019-03-21 16:57:24
|
From: Hoang Le <hoa...@de...> Date: Thu, 21 Mar 2019 17:25:17 +0700 > skb free-ed in: > 1/ condition 1: tipc_sk_filter_rcv -> tipc_sk_proto_rcv > 2/ condition 2: tipc_sk_filter_rcv -> tipc_group_filter_msg > This leads to a "use-after-free" access in the next condition. > > We fix this by intializing the variable at declaration, then it is safe > to check this variable to continue processing if condition matches. > > syzbot report: ... > Reported-by: syz...@sy... > Fixes: c55c8eda ("tipc: smooth change between replicast and broadcast") > Acked-by: Jon Maloy <jon...@er...> > Signed-off-by: Hoang Le <hoa...@de...> Applied. |
From: David M. <da...@da...> - 2019-03-21 16:10:21
|
From: Jon Maloy <jon...@er...> Date: Thu, 21 Mar 2019 09:11:59 +0100 > From: Erik Hugne <eri...@gm...> > > When cancelling a subscription, we have to clear the cancel bit in the > request before iterating over any established subscriptions with memcmp. > Otherwise no subscription will ever be found, and it will not be > possible to explicitly unsubscribe individual subscriptions. > > Fixes: 8985ecc7c1e0 ("tipc: simplify endianness handling in topology subscriber") > Signed-off-by: Erik Hugne <eri...@gm...> > Signed-off-by: Jon Maloy <jon...@er...> > > --- > v2: clear bit in a more sensible way.. (comment from davem) Applied and queued up for -stable. Thanks for following up. |
From: Jon M. <jon...@er...> - 2019-03-21 14:57:34
|
> -----Original Message----- > From: Arnd Bergmann <ar...@ar...> > Sent: 21-Mar-19 12:45 > To: Nick Desaulniers <nde...@go...> > Cc: Nathan Chancellor <nat...@gm...>; Jon Maloy > <jon...@er...>; Ying Xue <yin...@wi...>; David S. > Miller <da...@da...>; tip...@li...; > Networking <ne...@vg...>; LKML <linux- > ke...@vg...>; cla...@go... > Subject: Re: -Wsometimes-uninitialized Clang warning in net/tipc/node.c > > On Wed, Mar 20, 2019 at 9:51 PM 'Nick Desaulniers' via Clang Built Linux > <cla...@go...> wrote: > > On Wed, Mar 20, 2019 at 12:07 PM Nathan Chancellor > > <nat...@gm...> wrote: > > > > The use in tipc_bearer_xmit() isn't even guaranteed to set the in/out > > parameter, so even if the if is taken doesn't guarantee that maddr is > > always initialized before calling tipc_bearer_xmit(). > > Right, it is only initialized in certain states. It was always initialized until > commit 598411d70f85 ("tipc: make resetting of links non-atomic"), > afterwards only if the link was not reset, and as of commit 73f646cec354 > ("tipc: delay ESTABLISH state event when link is established") only if it's not > 'establishing' or 'reset'. > > > At the minimum, we should initialize maddr to NULL. I think we'd > > prefer to risk the possibility of a null pointer dereference to the > > possibility of working with uninitialized memory. To be clear, both > > are bad, but one is easier to spot/debug later than the other. > > I disagree with setting it to NULL, given that it is still an obviously incorrect > value. We could add a if(maddr) check before calling tipc_bearer_xmit(), but > I think it would be clearer to check > skb_queue_empty(xmitq)) if that avoids the warning: I may be wrong, but I would be surprised if that would eliminate the warning. To me, setting maddr to NULL and then testing for it looks both more comprehensible and is guaranteed to fix the warning. > > diff --git a/net/tipc/node.c b/net/tipc/node.c index > 2dc4919ab23c..147786795e48 100644 > --- a/net/tipc/node.c > +++ b/net/tipc/node.c > @@ -844,7 +844,8 @@ static void tipc_node_link_down(struct tipc_node *n, > int bearer_id, bool delete) > tipc_node_write_unlock(n); > if (delete) > tipc_mon_remove_peer(n->net, n->addr, old_bearer_id); > - tipc_bearer_xmit(n->net, bearer_id, &xmitq, maddr); > + if (skb_queue_empty(xmitq)) > + tipc_bearer_xmit(n->net, bearer_id, &xmitq, maddr); > tipc_sk_rcv(n->net, &le->inputq); } > > This duplicates the check inside of skb_queue_empty(), but I don't know if > the compiler can see through the logic behind the inlined function calls. Probably not, but this is not in any way time critical. ///jon > > Arnd |
From: Hoang Le <hoa...@de...> - 2019-03-21 10:25:40
|
skb free-ed in: 1/ condition 1: tipc_sk_filter_rcv -> tipc_sk_proto_rcv 2/ condition 2: tipc_sk_filter_rcv -> tipc_group_filter_msg This leads to a "use-after-free" access in the next condition. We fix this by intializing the variable at declaration, then it is safe to check this variable to continue processing if condition matches. syzbot report: ================================================================== BUG: KASAN: use-after-free in tipc_sk_filter_rcv+0x2166/0x34f0 net/tipc/socket.c:2167 Read of size 4 at addr ffff88808ea58534 by task kworker/u4:0/7 CPU: 0 PID: 7 Comm: kworker/u4:0 Not tainted 5.0.0+ #61 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: tipc_send tipc_conn_send_work Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187 kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317 __asan_report_load4_noabort+0x14/0x20 mm/kasan/generic_report.c:131 tipc_sk_filter_rcv+0x2166/0x34f0 net/tipc/socket.c:2167 tipc_sk_enqueue net/tipc/socket.c:2254 [inline] tipc_sk_rcv+0xc45/0x25a0 net/tipc/socket.c:2305 tipc_topsrv_kern_evt+0x3b7/0x580 net/tipc/topsrv.c:610 tipc_conn_send_to_sock+0x43e/0x5f0 net/tipc/topsrv.c:283 tipc_conn_send_work+0x65/0x80 net/tipc/topsrv.c:303 process_one_work+0x98e/0x1790 kernel/workqueue.c:2269 worker_thread+0x98/0xe40 kernel/workqueue.c:2415 kthread+0x357/0x430 kernel/kthread.c:253 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352 Reported-by: syz...@sy... Fixes: c55c8eda ("tipc: smooth change between replicast and broadcast") Acked-by: Jon Maloy <jon...@er...> Signed-off-by: Hoang Le <hoa...@de...> --- net/tipc/socket.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/tipc/socket.c b/net/tipc/socket.c index 922b75ff56d3..a7b3e1a070e4 100644 --- a/net/tipc/socket.c +++ b/net/tipc/socket.c @@ -2151,6 +2151,7 @@ static void tipc_sk_filter_rcv(struct sock *sk, struct sk_buff *skb, struct tipc_msg *hdr = buf_msg(skb); struct net *net = sock_net(sk); struct sk_buff_head inputq; + int mtyp = msg_type(hdr); int limit, err = TIPC_OK; trace_tipc_sk_filter_rcv(sk, skb, TIPC_DUMP_ALL, " "); @@ -2164,7 +2165,7 @@ static void tipc_sk_filter_rcv(struct sock *sk, struct sk_buff *skb, if (unlikely(grp)) tipc_group_filter_msg(grp, &inputq, xmitq); - if (msg_type(hdr) == TIPC_MCAST_MSG) + if (unlikely(!grp) && mtyp == TIPC_MCAST_MSG) tipc_mcast_filter_msg(&tsk->mc_method.deferredq, &inputq); /* Validate and add to receive buffer if there is space */ -- 2.17.1 |
From: Hoang Le <hoa...@de...> - 2019-03-21 10:25:40
|
In commit c55c8edafa91 ("tipc: smooth change between replicast and broadcast") we introduced new method to eliminate the risk of message reordering that happen in between different nodes. Unfortunately, we forgot checking at receiving side to ignore intra node. We fix this by checking and returning if arrived message from intra node. syzbot report: ================================================================== kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 7820 Comm: syz-executor418 Not tainted 5.0.0+ #61 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:tipc_mcast_filter_msg+0x21b/0x13d0 net/tipc/bcast.c:782 Code: 45 c0 0f 84 39 06 00 00 48 89 5d 98 e8 ce ab a5 fa 49 8d bc 24 c8 00 00 00 48 b9 00 00 00 00 00 fc ff df 48 89 f8 48 c1 e8 03 <80> 3c 08 00 0f 85 9a 0e 00 00 49 8b 9c 24 c8 00 00 00 48 be 00 00 RSP: 0018:ffff8880959defc8 EFLAGS: 00010202 RAX: 0000000000000019 RBX: ffff888081258a48 RCX: dffffc0000000000 RDX: 0000000000000000 RSI: ffffffff86cab862 RDI: 00000000000000c8 RBP: ffff8880959df030 R08: ffff8880813d0200 R09: ffffed1015d05bc8 R10: ffffed1015d05bc7 R11: ffff8880ae82de3b R12: 0000000000000000 R13: 000000000000002c R14: 0000000000000000 R15: ffff888081258a48 FS: 000000000106a880(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020001cc0 CR3: 0000000094a20000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: tipc_sk_filter_rcv+0x182d/0x34f0 net/tipc/socket.c:2168 tipc_sk_enqueue net/tipc/socket.c:2254 [inline] tipc_sk_rcv+0xc45/0x25a0 net/tipc/socket.c:2305 tipc_sk_mcast_rcv+0x724/0x1020 net/tipc/socket.c:1209 tipc_mcast_xmit+0x7fe/0x1200 net/tipc/bcast.c:410 tipc_sendmcast+0xb36/0xfc0 net/tipc/socket.c:820 __tipc_sendmsg+0x10df/0x18d0 net/tipc/socket.c:1358 tipc_sendmsg+0x53/0x80 net/tipc/socket.c:1291 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg+0xdd/0x130 net/socket.c:661 ___sys_sendmsg+0x806/0x930 net/socket.c:2260 __sys_sendmsg+0x105/0x1d0 net/socket.c:2298 __do_sys_sendmsg net/socket.c:2307 [inline] __se_sys_sendmsg net/socket.c:2305 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2305 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x4401c9 Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007ffd887fa9d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004401c9 RDX: 0000000000000000 RSI: 0000000020002140 RDI: 0000000000000003 RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000401a50 R13: 0000000000401ae0 R14: 0000000000000000 R15: 0000000000000000 Modules linked in: ---[ end trace ba79875754e1708f ]--- Reported-by: syz...@sy... Fixes: c55c8eda ("tipc: smooth change between replicast and broadcast") Acked-by: Jon Maloy <jon...@er...> Signed-off-by: Hoang Le <hoa...@de...> --- net/tipc/bcast.c | 5 ++++- net/tipc/bcast.h | 2 +- net/tipc/socket.c | 2 +- 3 files changed, 6 insertions(+), 3 deletions(-) diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c index 5264a8ff6e01..88edfb358ae7 100644 --- a/net/tipc/bcast.c +++ b/net/tipc/bcast.c @@ -760,7 +760,7 @@ u32 tipc_bcast_get_broadcast_ratio(struct net *net) return bb->rc_ratio; } -void tipc_mcast_filter_msg(struct sk_buff_head *defq, +void tipc_mcast_filter_msg(struct net *net, struct sk_buff_head *defq, struct sk_buff_head *inputq) { struct sk_buff *skb, *_skb, *tmp; @@ -775,6 +775,9 @@ void tipc_mcast_filter_msg(struct sk_buff_head *defq, return; node = msg_orignode(hdr); + if (node == tipc_own_addr(net)) + return; + port = msg_origport(hdr); /* Has the twin SYN message already arrived ? */ diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h index 484bde289d3a..dadad953e2be 100644 --- a/net/tipc/bcast.h +++ b/net/tipc/bcast.h @@ -101,7 +101,7 @@ int tipc_bclink_reset_stats(struct net *net); u32 tipc_bcast_get_broadcast_mode(struct net *net); u32 tipc_bcast_get_broadcast_ratio(struct net *net); -void tipc_mcast_filter_msg(struct sk_buff_head *defq, +void tipc_mcast_filter_msg(struct net *net, struct sk_buff_head *defq, struct sk_buff_head *inputq); static inline void tipc_bcast_lock(struct net *net) diff --git a/net/tipc/socket.c b/net/tipc/socket.c index a7b3e1a070e4..8ac8ddf1e324 100644 --- a/net/tipc/socket.c +++ b/net/tipc/socket.c @@ -2166,7 +2166,7 @@ static void tipc_sk_filter_rcv(struct sock *sk, struct sk_buff *skb, tipc_group_filter_msg(grp, &inputq, xmitq); if (unlikely(!grp) && mtyp == TIPC_MCAST_MSG) - tipc_mcast_filter_msg(&tsk->mc_method.deferredq, &inputq); + tipc_mcast_filter_msg(net, &tsk->mc_method.deferredq, &inputq); /* Validate and add to receive buffer if there is space */ while ((skb = __skb_dequeue(&inputq))) { -- 2.17.1 |
From: Jon M. <jon...@er...> - 2019-03-21 10:11:33
|
No need to declare a separate stack variable if you access it only once. Better with: if (node == tipc_own_addr(net)) {...} Otherwise, Acked-by: jon > -----Original Message----- > From: Hoang Le <hoa...@de...> > Sent: 21-Mar-19 09:29 > To: Jon Maloy <jon...@er...>; ma...@do...; > yin...@wi...; tip...@li... > Subject: [PATCH 2/2] tipc: fix a null pointer deref > > In commit c55c8edafa91 ("tipc: smooth change between replicast and > broadcast") we introduced new method to eliminate the risk of message > reordering that happen in between different nodes. > Unfortunately, we forgot checking at receiving side to ignore intra node. > > We fix this by checking and returning if arrived message from intra node. > > syzbot report: > ========================================================== > ======== > kasan: CONFIG_KASAN_INLINE enabled > kasan: GPF could be caused by NULL-ptr deref or user memory access > general protection fault: 0000 [#1] PREEMPT SMP KASAN > CPU: 0 PID: 7820 Comm: syz-executor418 Not tainted 5.0.0+ #61 Hardware > name: Google Google Compute Engine/Google Compute Engine, BIOS > Google 01/01/2011 > RIP: 0010:tipc_mcast_filter_msg+0x21b/0x13d0 net/tipc/bcast.c:782 > Code: 45 c0 0f 84 39 06 00 00 48 89 5d 98 e8 ce ab a5 fa 49 8d bc > 24 c8 00 00 00 48 b9 00 00 00 00 00 fc ff df 48 89 f8 48 c1 e8 03 <80> 3c 08 00 0f > 85 9a 0e 00 00 49 8b 9c 24 c8 00 00 00 48 be 00 00 > RSP: 0018:ffff8880959defc8 EFLAGS: 00010202 > RAX: 0000000000000019 RBX: ffff888081258a48 RCX: dffffc0000000000 > RDX: 0000000000000000 RSI: ffffffff86cab862 RDI: 00000000000000c8 > RBP: ffff8880959df030 R08: ffff8880813d0200 R09: ffffed1015d05bc8 > R10: ffffed1015d05bc7 R11: ffff8880ae82de3b R12: 0000000000000000 > R13: 000000000000002c R14: 0000000000000000 R15: ffff888081258a48 > FS: 000000000106a880(0000) GS:ffff8880ae800000(0000) > knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 0000000020001cc0 CR3: 0000000094a20000 CR4: 00000000001406f0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call > Trace: > tipc_sk_filter_rcv+0x182d/0x34f0 net/tipc/socket.c:2168 tipc_sk_enqueue > net/tipc/socket.c:2254 [inline] > tipc_sk_rcv+0xc45/0x25a0 net/tipc/socket.c:2305 > tipc_sk_mcast_rcv+0x724/0x1020 net/tipc/socket.c:1209 > tipc_mcast_xmit+0x7fe/0x1200 net/tipc/bcast.c:410 > tipc_sendmcast+0xb36/0xfc0 net/tipc/socket.c:820 > __tipc_sendmsg+0x10df/0x18d0 net/tipc/socket.c:1358 > tipc_sendmsg+0x53/0x80 net/tipc/socket.c:1291 sock_sendmsg_nosec > net/socket.c:651 [inline] > sock_sendmsg+0xdd/0x130 net/socket.c:661 > ___sys_sendmsg+0x806/0x930 net/socket.c:2260 > __sys_sendmsg+0x105/0x1d0 net/socket.c:2298 __do_sys_sendmsg > net/socket.c:2307 [inline] __se_sys_sendmsg net/socket.c:2305 [inline] > __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2305 > do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 > entry_SYSCALL_64_after_hwframe+0x49/0xbe > RIP: 0033:0x4401c9 > Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 > 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff > ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00 > RSP: 002b:00007ffd887fa9d8 EFLAGS: 00000246 ORIG_RAX: > 000000000000002e > RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004401c9 > RDX: 0000000000000000 RSI: 0000000020002140 RDI: 0000000000000003 > RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8 > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000401a50 > R13: 0000000000401ae0 R14: 0000000000000000 R15: 0000000000000000 > Modules linked in: > ---[ end trace ba79875754e1708f ]--- > > Reported-by: syz...@sy... > Fixes: c55c8eda ("tipc: smooth change between replicast and broadcast") > Signed-off-by: Hoang Le <hoa...@de...> > --- > net/tipc/bcast.c | 6 +++++- > net/tipc/bcast.h | 2 +- > net/tipc/socket.c | 2 +- > 3 files changed, 7 insertions(+), 3 deletions(-) > > diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c index > 5264a8ff6e01..b3e6b4892425 100644 > --- a/net/tipc/bcast.c > +++ b/net/tipc/bcast.c > @@ -760,10 +760,11 @@ u32 tipc_bcast_get_broadcast_ratio(struct net > *net) > return bb->rc_ratio; > } > > -void tipc_mcast_filter_msg(struct sk_buff_head *defq, > +void tipc_mcast_filter_msg(struct net *net, struct sk_buff_head *defq, > struct sk_buff_head *inputq) > { > struct sk_buff *skb, *_skb, *tmp; > + u32 self = tipc_own_addr(net); > struct tipc_msg *hdr, *_hdr; > bool match = false; > u32 node, port; > @@ -775,6 +776,9 @@ void tipc_mcast_filter_msg(struct sk_buff_head > *defq, > return; > > node = msg_orignode(hdr); > + if (node == self) > + return; > + > port = msg_origport(hdr); > > /* Has the twin SYN message already arrived ? */ diff --git > a/net/tipc/bcast.h b/net/tipc/bcast.h index 484bde289d3a..dadad953e2be > 100644 > --- a/net/tipc/bcast.h > +++ b/net/tipc/bcast.h > @@ -101,7 +101,7 @@ int tipc_bclink_reset_stats(struct net *net); > u32 tipc_bcast_get_broadcast_mode(struct net *net); > u32 tipc_bcast_get_broadcast_ratio(struct net *net); > > -void tipc_mcast_filter_msg(struct sk_buff_head *defq, > +void tipc_mcast_filter_msg(struct net *net, struct sk_buff_head *defq, > struct sk_buff_head *inputq); > > static inline void tipc_bcast_lock(struct net *net) diff --git > a/net/tipc/socket.c b/net/tipc/socket.c index a7b3e1a070e4..8ac8ddf1e324 > 100644 > --- a/net/tipc/socket.c > +++ b/net/tipc/socket.c > @@ -2166,7 +2166,7 @@ static void tipc_sk_filter_rcv(struct sock *sk, struct > sk_buff *skb, > tipc_group_filter_msg(grp, &inputq, xmitq); > > if (unlikely(!grp) && mtyp == TIPC_MCAST_MSG) > - tipc_mcast_filter_msg(&tsk->mc_method.deferredq, > &inputq); > + tipc_mcast_filter_msg(net, &tsk->mc_method.deferredq, > &inputq); > > /* Validate and add to receive buffer if there is space */ > while ((skb = __skb_dequeue(&inputq))) { > -- > 2.1.4 |
From: Jon M. <jon...@er...> - 2019-03-21 10:07:23
|
Acked-by: jon > -----Original Message----- > From: Hoang Le <hoa...@de...> > Sent: 21-Mar-19 09:29 > To: Jon Maloy <jon...@er...>; ma...@do...; > yin...@wi...; tip...@li... > Subject: [PATCH 1/2] tipc: fix use-after-free tipc_sk_filter_rcv > > skb free-ed in: > 1/ condition 1: tipc_sk_filter_rcv -> tipc_sk_proto_rcv > 2/ condition 2: tipc_sk_filter_rcv -> tipc_group_filter_msg This leads to a > "use-after-free" access in the next condition. > > We fix this by intializing the variable at declaration, then it is safe to check this > variable to continue processing if condition matches. > > syzbot report: > ========================================================== > ======== > BUG: KASAN: use-after-free in tipc_sk_filter_rcv+0x2166/0x34f0 > net/tipc/socket.c:2167 > Read of size 4 at addr ffff88808ea58534 by task kworker/u4:0/7 > > CPU: 0 PID: 7 Comm: kworker/u4:0 Not tainted 5.0.0+ #61 Hardware name: > Google Google Compute Engine/Google Compute Engine, BIOS Google > 01/01/2011 > Workqueue: tipc_send tipc_conn_send_work Call Trace: > __dump_stack lib/dump_stack.c:77 [inline] > dump_stack+0x172/0x1f0 lib/dump_stack.c:113 > print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187 > kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317 > __asan_report_load4_noabort+0x14/0x20 mm/kasan/generic_report.c:131 > tipc_sk_filter_rcv+0x2166/0x34f0 net/tipc/socket.c:2167 tipc_sk_enqueue > net/tipc/socket.c:2254 [inline] > tipc_sk_rcv+0xc45/0x25a0 net/tipc/socket.c:2305 > tipc_topsrv_kern_evt+0x3b7/0x580 net/tipc/topsrv.c:610 > tipc_conn_send_to_sock+0x43e/0x5f0 net/tipc/topsrv.c:283 > tipc_conn_send_work+0x65/0x80 net/tipc/topsrv.c:303 > process_one_work+0x98e/0x1790 kernel/workqueue.c:2269 > worker_thread+0x98/0xe40 kernel/workqueue.c:2415 > kthread+0x357/0x430 kernel/kthread.c:253 > ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352 > > Reported-by: syz...@sy... > Fixes: c55c8eda ("tipc: smooth change between replicast and broadcast") > Signed-off-by: Hoang Le <hoa...@de...> > --- > net/tipc/socket.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/net/tipc/socket.c b/net/tipc/socket.c index > 922b75ff56d3..a7b3e1a070e4 100644 > --- a/net/tipc/socket.c > +++ b/net/tipc/socket.c > @@ -2151,6 +2151,7 @@ static void tipc_sk_filter_rcv(struct sock *sk, struct > sk_buff *skb, > struct tipc_msg *hdr = buf_msg(skb); > struct net *net = sock_net(sk); > struct sk_buff_head inputq; > + int mtyp = msg_type(hdr); > int limit, err = TIPC_OK; > > trace_tipc_sk_filter_rcv(sk, skb, TIPC_DUMP_ALL, " "); @@ -2164,7 > +2165,7 @@ static void tipc_sk_filter_rcv(struct sock *sk, struct sk_buff > *skb, > if (unlikely(grp)) > tipc_group_filter_msg(grp, &inputq, xmitq); > > - if (msg_type(hdr) == TIPC_MCAST_MSG) > + if (unlikely(!grp) && mtyp == TIPC_MCAST_MSG) > tipc_mcast_filter_msg(&tsk->mc_method.deferredq, > &inputq); > > /* Validate and add to receive buffer if there is space */ > -- > 2.1.4 |
From: Hoang Le <hoa...@de...> - 2019-03-21 08:29:25
|
skb free-ed in: 1/ condition 1: tipc_sk_filter_rcv -> tipc_sk_proto_rcv 2/ condition 2: tipc_sk_filter_rcv -> tipc_group_filter_msg This leads to a "use-after-free" access in the next condition. We fix this by intializing the variable at declaration, then it is safe to check this variable to continue processing if condition matches. syzbot report: ================================================================== BUG: KASAN: use-after-free in tipc_sk_filter_rcv+0x2166/0x34f0 net/tipc/socket.c:2167 Read of size 4 at addr ffff88808ea58534 by task kworker/u4:0/7 CPU: 0 PID: 7 Comm: kworker/u4:0 Not tainted 5.0.0+ #61 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: tipc_send tipc_conn_send_work Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187 kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317 __asan_report_load4_noabort+0x14/0x20 mm/kasan/generic_report.c:131 tipc_sk_filter_rcv+0x2166/0x34f0 net/tipc/socket.c:2167 tipc_sk_enqueue net/tipc/socket.c:2254 [inline] tipc_sk_rcv+0xc45/0x25a0 net/tipc/socket.c:2305 tipc_topsrv_kern_evt+0x3b7/0x580 net/tipc/topsrv.c:610 tipc_conn_send_to_sock+0x43e/0x5f0 net/tipc/topsrv.c:283 tipc_conn_send_work+0x65/0x80 net/tipc/topsrv.c:303 process_one_work+0x98e/0x1790 kernel/workqueue.c:2269 worker_thread+0x98/0xe40 kernel/workqueue.c:2415 kthread+0x357/0x430 kernel/kthread.c:253 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352 Reported-by: syz...@sy... Fixes: c55c8eda ("tipc: smooth change between replicast and broadcast") Signed-off-by: Hoang Le <hoa...@de...> --- net/tipc/socket.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/tipc/socket.c b/net/tipc/socket.c index 922b75ff56d3..a7b3e1a070e4 100644 --- a/net/tipc/socket.c +++ b/net/tipc/socket.c @@ -2151,6 +2151,7 @@ static void tipc_sk_filter_rcv(struct sock *sk, struct sk_buff *skb, struct tipc_msg *hdr = buf_msg(skb); struct net *net = sock_net(sk); struct sk_buff_head inputq; + int mtyp = msg_type(hdr); int limit, err = TIPC_OK; trace_tipc_sk_filter_rcv(sk, skb, TIPC_DUMP_ALL, " "); @@ -2164,7 +2165,7 @@ static void tipc_sk_filter_rcv(struct sock *sk, struct sk_buff *skb, if (unlikely(grp)) tipc_group_filter_msg(grp, &inputq, xmitq); - if (msg_type(hdr) == TIPC_MCAST_MSG) + if (unlikely(!grp) && mtyp == TIPC_MCAST_MSG) tipc_mcast_filter_msg(&tsk->mc_method.deferredq, &inputq); /* Validate and add to receive buffer if there is space */ -- 2.1.4 |
From: Hoang Le <hoa...@de...> - 2019-03-21 08:29:24
|
In commit c55c8edafa91 ("tipc: smooth change between replicast and broadcast") we introduced new method to eliminate the risk of message reordering that happen in between different nodes. Unfortunately, we forgot checking at receiving side to ignore intra node. We fix this by checking and returning if arrived message from intra node. syzbot report: ================================================================== kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 7820 Comm: syz-executor418 Not tainted 5.0.0+ #61 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:tipc_mcast_filter_msg+0x21b/0x13d0 net/tipc/bcast.c:782 Code: 45 c0 0f 84 39 06 00 00 48 89 5d 98 e8 ce ab a5 fa 49 8d bc 24 c8 00 00 00 48 b9 00 00 00 00 00 fc ff df 48 89 f8 48 c1 e8 03 <80> 3c 08 00 0f 85 9a 0e 00 00 49 8b 9c 24 c8 00 00 00 48 be 00 00 RSP: 0018:ffff8880959defc8 EFLAGS: 00010202 RAX: 0000000000000019 RBX: ffff888081258a48 RCX: dffffc0000000000 RDX: 0000000000000000 RSI: ffffffff86cab862 RDI: 00000000000000c8 RBP: ffff8880959df030 R08: ffff8880813d0200 R09: ffffed1015d05bc8 R10: ffffed1015d05bc7 R11: ffff8880ae82de3b R12: 0000000000000000 R13: 000000000000002c R14: 0000000000000000 R15: ffff888081258a48 FS: 000000000106a880(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020001cc0 CR3: 0000000094a20000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: tipc_sk_filter_rcv+0x182d/0x34f0 net/tipc/socket.c:2168 tipc_sk_enqueue net/tipc/socket.c:2254 [inline] tipc_sk_rcv+0xc45/0x25a0 net/tipc/socket.c:2305 tipc_sk_mcast_rcv+0x724/0x1020 net/tipc/socket.c:1209 tipc_mcast_xmit+0x7fe/0x1200 net/tipc/bcast.c:410 tipc_sendmcast+0xb36/0xfc0 net/tipc/socket.c:820 __tipc_sendmsg+0x10df/0x18d0 net/tipc/socket.c:1358 tipc_sendmsg+0x53/0x80 net/tipc/socket.c:1291 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg+0xdd/0x130 net/socket.c:661 ___sys_sendmsg+0x806/0x930 net/socket.c:2260 __sys_sendmsg+0x105/0x1d0 net/socket.c:2298 __do_sys_sendmsg net/socket.c:2307 [inline] __se_sys_sendmsg net/socket.c:2305 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2305 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x4401c9 Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007ffd887fa9d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004401c9 RDX: 0000000000000000 RSI: 0000000020002140 RDI: 0000000000000003 RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000401a50 R13: 0000000000401ae0 R14: 0000000000000000 R15: 0000000000000000 Modules linked in: ---[ end trace ba79875754e1708f ]--- Reported-by: syz...@sy... Fixes: c55c8eda ("tipc: smooth change between replicast and broadcast") Signed-off-by: Hoang Le <hoa...@de...> --- net/tipc/bcast.c | 6 +++++- net/tipc/bcast.h | 2 +- net/tipc/socket.c | 2 +- 3 files changed, 7 insertions(+), 3 deletions(-) diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c index 5264a8ff6e01..b3e6b4892425 100644 --- a/net/tipc/bcast.c +++ b/net/tipc/bcast.c @@ -760,10 +760,11 @@ u32 tipc_bcast_get_broadcast_ratio(struct net *net) return bb->rc_ratio; } -void tipc_mcast_filter_msg(struct sk_buff_head *defq, +void tipc_mcast_filter_msg(struct net *net, struct sk_buff_head *defq, struct sk_buff_head *inputq) { struct sk_buff *skb, *_skb, *tmp; + u32 self = tipc_own_addr(net); struct tipc_msg *hdr, *_hdr; bool match = false; u32 node, port; @@ -775,6 +776,9 @@ void tipc_mcast_filter_msg(struct sk_buff_head *defq, return; node = msg_orignode(hdr); + if (node == self) + return; + port = msg_origport(hdr); /* Has the twin SYN message already arrived ? */ diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h index 484bde289d3a..dadad953e2be 100644 --- a/net/tipc/bcast.h +++ b/net/tipc/bcast.h @@ -101,7 +101,7 @@ int tipc_bclink_reset_stats(struct net *net); u32 tipc_bcast_get_broadcast_mode(struct net *net); u32 tipc_bcast_get_broadcast_ratio(struct net *net); -void tipc_mcast_filter_msg(struct sk_buff_head *defq, +void tipc_mcast_filter_msg(struct net *net, struct sk_buff_head *defq, struct sk_buff_head *inputq); static inline void tipc_bcast_lock(struct net *net) diff --git a/net/tipc/socket.c b/net/tipc/socket.c index a7b3e1a070e4..8ac8ddf1e324 100644 --- a/net/tipc/socket.c +++ b/net/tipc/socket.c @@ -2166,7 +2166,7 @@ static void tipc_sk_filter_rcv(struct sock *sk, struct sk_buff *skb, tipc_group_filter_msg(grp, &inputq, xmitq); if (unlikely(!grp) && mtyp == TIPC_MCAST_MSG) - tipc_mcast_filter_msg(&tsk->mc_method.deferredq, &inputq); + tipc_mcast_filter_msg(net, &tsk->mc_method.deferredq, &inputq); /* Validate and add to receive buffer if there is space */ while ((skb = __skb_dequeue(&inputq))) { -- 2.1.4 |
From: Jon M. <jon...@er...> - 2019-03-21 08:12:22
|
From: Erik Hugne <eri...@gm...> When cancelling a subscription, we have to clear the cancel bit in the request before iterating over any established subscriptions with memcmp. Otherwise no subscription will ever be found, and it will not be possible to explicitly unsubscribe individual subscriptions. Fixes: 8985ecc7c1e0 ("tipc: simplify endianness handling in topology subscriber") Signed-off-by: Erik Hugne <eri...@gm...> Signed-off-by: Jon Maloy <jon...@er...> --- v2: clear bit in a more sensible way.. (comment from davem) --- net/tipc/topsrv.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/tipc/topsrv.c b/net/tipc/topsrv.c index 4a708a4..b45932d7 100644 --- a/net/tipc/topsrv.c +++ b/net/tipc/topsrv.c @@ -363,6 +363,7 @@ static int tipc_conn_rcv_sub(struct tipc_topsrv *srv, struct tipc_subscription *sub; if (tipc_sub_read(s, filter) & TIPC_SUB_CANCEL) { + s->filter &= __constant_ntohl(~TIPC_SUB_CANCEL); tipc_conn_delete_sub(con, s); return 0; } -- 2.1.4 |
From: <eri...@gm...> - 2019-03-20 20:58:18
|
From: Erik Hugne <eri...@gm...> When cancelling a subscription, we have to clear the cancel bit in the request before iterating over any established subscriptions with memcmp. Otherwise no subscription will ever be found, and it will not be possible to unsubscribe. Signed-off-by: Erik Hugne <eri...@gm...> --- v2: clear bit in a more sensible way.. (comment from davem) net/tipc/topsrv.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/tipc/topsrv.c b/net/tipc/topsrv.c index 4a708a4e8583..b45932d78004 100644 --- a/net/tipc/topsrv.c +++ b/net/tipc/topsrv.c @@ -363,6 +363,7 @@ static int tipc_conn_rcv_sub(struct tipc_topsrv *srv, struct tipc_subscription *sub; if (tipc_sub_read(s, filter) & TIPC_SUB_CANCEL) { + s->filter &= __constant_ntohl(~TIPC_SUB_CANCEL); tipc_conn_delete_sub(con, s); return 0; } -- 2.14.1 |
From: Erik H. <eri...@gm...> - 2019-03-20 19:32:41
|
I only planned to send the /dev/tipc network redirection stuff there. I dont think that the topology module is relevant enough to be a default bash builtin. //E On Wed, 20 Mar 2019, 11:09 Jon Maloy, <jon...@er...> wrote: > Nice. Is this all for the bash project, or do we need to add something to > tipc_utils? > > ///jon > > > > -----Original Message----- > > From: Erik Hugne <eri...@gm...> > > Sent: 19-Mar-19 22:33 > > To: tip...@li...; Jon Maloy > > <jon...@er...> > > Subject: Topology subscriptions in bash > > > > A small tool to get shell integrations to the TIPC topology server, maybe > > someone will find it useful https://github.com/Hugne/tipc.sh > > > > The subscriber examples use part of the functionality that i plan to > submit to > > the bash project for network redirection support, e.g: > > > > [root@kdev ~] exec 5</dev/tipc/1000/1 > > [root@kdev ~] echo hello world >/dev/tipc/1000/1 [root@kdev ~] cat <&5 > > hello world ^C [root@kdev ~] > > > > //E > |