You can subscribe to this list here.
2003 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
|
Sep
|
Oct
|
Nov
|
Dec
(6) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
2004 |
Jan
(9) |
Feb
(11) |
Mar
(22) |
Apr
(73) |
May
(78) |
Jun
(146) |
Jul
(80) |
Aug
(27) |
Sep
(5) |
Oct
(14) |
Nov
(18) |
Dec
(27) |
2005 |
Jan
(20) |
Feb
(30) |
Mar
(19) |
Apr
(28) |
May
(50) |
Jun
(31) |
Jul
(32) |
Aug
(14) |
Sep
(36) |
Oct
(43) |
Nov
(74) |
Dec
(63) |
2006 |
Jan
(34) |
Feb
(32) |
Mar
(21) |
Apr
(76) |
May
(106) |
Jun
(72) |
Jul
(70) |
Aug
(175) |
Sep
(130) |
Oct
(39) |
Nov
(81) |
Dec
(43) |
2007 |
Jan
(81) |
Feb
(36) |
Mar
(20) |
Apr
(43) |
May
(54) |
Jun
(34) |
Jul
(44) |
Aug
(55) |
Sep
(44) |
Oct
(54) |
Nov
(43) |
Dec
(41) |
2008 |
Jan
(42) |
Feb
(84) |
Mar
(73) |
Apr
(30) |
May
(119) |
Jun
(54) |
Jul
(54) |
Aug
(93) |
Sep
(173) |
Oct
(130) |
Nov
(145) |
Dec
(153) |
2009 |
Jan
(59) |
Feb
(12) |
Mar
(28) |
Apr
(18) |
May
(56) |
Jun
(9) |
Jul
(28) |
Aug
(62) |
Sep
(16) |
Oct
(19) |
Nov
(15) |
Dec
(17) |
2010 |
Jan
(14) |
Feb
(36) |
Mar
(37) |
Apr
(30) |
May
(33) |
Jun
(53) |
Jul
(42) |
Aug
(50) |
Sep
(67) |
Oct
(66) |
Nov
(69) |
Dec
(36) |
2011 |
Jan
(52) |
Feb
(45) |
Mar
(49) |
Apr
(21) |
May
(34) |
Jun
(13) |
Jul
(19) |
Aug
(37) |
Sep
(43) |
Oct
(10) |
Nov
(23) |
Dec
(30) |
2012 |
Jan
(42) |
Feb
(36) |
Mar
(46) |
Apr
(25) |
May
(96) |
Jun
(146) |
Jul
(40) |
Aug
(28) |
Sep
(61) |
Oct
(45) |
Nov
(100) |
Dec
(53) |
2013 |
Jan
(79) |
Feb
(24) |
Mar
(134) |
Apr
(156) |
May
(118) |
Jun
(75) |
Jul
(278) |
Aug
(145) |
Sep
(136) |
Oct
(168) |
Nov
(137) |
Dec
(439) |
2014 |
Jan
(284) |
Feb
(158) |
Mar
(231) |
Apr
(275) |
May
(259) |
Jun
(91) |
Jul
(222) |
Aug
(215) |
Sep
(165) |
Oct
(166) |
Nov
(211) |
Dec
(150) |
2015 |
Jan
(164) |
Feb
(324) |
Mar
(299) |
Apr
(214) |
May
(111) |
Jun
(109) |
Jul
(105) |
Aug
(36) |
Sep
(58) |
Oct
(131) |
Nov
(68) |
Dec
(30) |
2016 |
Jan
(46) |
Feb
(87) |
Mar
(135) |
Apr
(174) |
May
(132) |
Jun
(135) |
Jul
(149) |
Aug
(125) |
Sep
(79) |
Oct
(49) |
Nov
(95) |
Dec
(102) |
2017 |
Jan
(104) |
Feb
(75) |
Mar
(72) |
Apr
(53) |
May
(18) |
Jun
(5) |
Jul
(14) |
Aug
(19) |
Sep
(2) |
Oct
(13) |
Nov
(21) |
Dec
(67) |
2018 |
Jan
(56) |
Feb
(50) |
Mar
(148) |
Apr
(41) |
May
(37) |
Jun
(34) |
Jul
(34) |
Aug
(11) |
Sep
(52) |
Oct
(48) |
Nov
(28) |
Dec
(46) |
2019 |
Jan
(29) |
Feb
(63) |
Mar
(95) |
Apr
(54) |
May
(14) |
Jun
(71) |
Jul
(60) |
Aug
(49) |
Sep
(3) |
Oct
(64) |
Nov
(115) |
Dec
(57) |
2020 |
Jan
(15) |
Feb
(9) |
Mar
(38) |
Apr
(27) |
May
(60) |
Jun
(53) |
Jul
(35) |
Aug
(46) |
Sep
(37) |
Oct
(64) |
Nov
(20) |
Dec
(25) |
2021 |
Jan
(20) |
Feb
(31) |
Mar
(27) |
Apr
(23) |
May
(21) |
Jun
(30) |
Jul
(30) |
Aug
(7) |
Sep
(18) |
Oct
|
Nov
(15) |
Dec
(4) |
2022 |
Jan
(3) |
Feb
(1) |
Mar
(10) |
Apr
|
May
(2) |
Jun
(26) |
Jul
(5) |
Aug
|
Sep
(1) |
Oct
(2) |
Nov
(9) |
Dec
(2) |
2023 |
Jan
(4) |
Feb
(4) |
Mar
(5) |
Apr
(10) |
May
(29) |
Jun
(17) |
Jul
|
Aug
|
Sep
(1) |
Oct
(1) |
Nov
(2) |
Dec
|
2024 |
Jan
|
Feb
(6) |
Mar
|
Apr
(1) |
May
(6) |
Jun
|
Jul
(5) |
Aug
|
Sep
(3) |
Oct
|
Nov
|
Dec
|
2025 |
Jan
|
Feb
(3) |
Mar
|
Apr
|
May
|
Jun
|
Jul
(6) |
Aug
|
Sep
|
Oct
|
Nov
|
Dec
|
From: Jon M. <jm...@re...> - 2021-06-10 12:44:41
|
> -----Original Message----- > From: Jon Maloy <jm...@re...> > Sent: Thursday, June 10, 2021 4:12 PM > To: Menglong Dong <men...@gm...> > Cc: yin...@wi...; tip...@li...; Xin Long <lx...@re...>; tipc-dek <tip...@de...> > Subject: Re: [PATCH v2 net-next 0/2] net: tipc: fix FB_MTU eat two pages and do some code cleanup > > > On 6/9/21 8:56 AM, Menglong Dong wrote: >> On Wed, Jun 9, 2021 at 6:47 PM Jon Maloy <jm...@re...> wrote: >>> >>> On 6/9/21 6:32 AM, men...@gm... wrote: >>>> From: Menglong Dong <don...@zt... [...] >> No, no, I didn't miss your mail. I think it can make us clear about what and how >> to do by sending the V2 patches. >> >> So we can define two versions 'FB_MTU' for bcast.c and msg.c, such as CRYPTO_MTU >> and NON_CRYPTO_MTU. And within tipc_buf_acquire(), we decide which version >> BUF_HEADROOM to use by the data size? Such as: >> >> int buf_size; >> if (IS_ENABLED(CONFIG_TIPC_CRYPTO) && size == NON_CRYPTO_MTU) { >> buf_size = size + BUF_HEADROOM_non-crypto + BUF_TAILROOM_non-crypto; >> } else { >> buf_size = size + BUF_HEADROOM_crypto + BUF_TAILROOM_crypto; >> } >> >> Is this feeling? >> (It's a little weird to check whether the data should be crypto by data size). >> >> Thanks! >> Menglong Dong > (Removed netdev, David Miller etc from thread) > > I think our main mistake is that we are trying to "compensate" for a > behavior in tipc_buf_acquire() that doens't fit our purposes. > > What if we do the following: > 1) We define BUF_HEADROOM and BUF_TAILROOM as we do now, plus a FB_MTU > as you suggest, all inside msg.c. > 2) We create a new inline function tipc_alloc_skb(int headroom, int > tailroom, int size) in msg.c > This function does the job that tipc_buf_acqire() does now. > 3) We let tipc_buf_acquire() call this function as > tipc_alloc_skb(BUF_HEADROOM, BUF_TAILROOM, size); > Alternatively, we skip the BUF_HEADROOM/BUF_TAILROOM macros > altogether andmake the code in tipc_buf_acquire() conditional as you > did, but without testing for size. Actually, I think I like this better. Maybe even better: we keep BUF_HEADROOM, but only define it as LL_MAX_HEADER + 48, i.e. the non-crypto value. We still keep the conditional call in tipc_buf_acquire(), and add the crypto-header extras explictly when needed. That way, we only need to change in one place if we decide to expand the headroom. ///jon > 4) We let the fallback function in msg.c call it as > tipc_alloc_skb(LL_MAX_HEADER + 48, 0, ONEPAGE_SKB); > 5) We let the user in bcast.c get acces to the calculated value of > FB_MTU as a function or global value, like you did in v2. > > I suggest you send v3 to tipc-discussion (if you are a member) and the > recipients of this mail first, so we don't spam the netdev list with > intermediate versions of this series. > > ///jon > > |
From: Tung Q. N. <tun...@de...> - 2021-06-10 09:31:36
|
Remove. -----Original Message----- From: Tung Quang Nguyen Sent: Thursday, June 10, 2021 4:16 PM To: Jon Maloy <jm...@re...>; Menglong Dong <men...@gm...> Cc: yin...@wi...; tip...@li...; Xin Long <lx...@re...>; tipc-dek <tip...@de...> Subject: RE: [PATCH v2 net-next 0/2] net: tipc: fix FB_MTU eat two pages and do some code cleanup Remove tipc-dek mail list to avoid bothering other people. Best regards, Tung Nguyen -----Original Message----- From: Jon Maloy <jm...@re...> Sent: Thursday, June 10, 2021 4:12 PM To: Menglong Dong <men...@gm...> Cc: yin...@wi...; tip...@li...; Xin Long <lx...@re...>; tipc-dek <tip...@de...> Subject: Re: [PATCH v2 net-next 0/2] net: tipc: fix FB_MTU eat two pages and do some code cleanup On 6/9/21 8:56 AM, Menglong Dong wrote: > On Wed, Jun 9, 2021 at 6:47 PM Jon Maloy <jm...@re...> wrote: >> >> >> On 6/9/21 6:32 AM, men...@gm... wrote: >>> From: Menglong Dong <don...@zt...> >>> >>> In the first patch, FB_MTU is redefined to make sure data size will not >>> exceed PAGE_SIZE. Besides, I removed the alignment for buf_size in >>> tipc_buf_acquire, because skb_alloc_fclone will do the alignment job. >>> >>> In the second patch, I removed align() in msg.c and replace it with >>> ALIGN(). >>> >>> >>> >>> >>> Menglong Dong (2): >>> net: tipc: fix FB_MTU eat two pages >>> net: tipc: replace align() with ALIGN in msg.c >>> >>> net/tipc/bcast.c | 2 +- >>> net/tipc/msg.c | 31 ++++++++++++++----------------- >>> net/tipc/msg.h | 3 ++- >>> 3 files changed, 17 insertions(+), 19 deletions(-) >>> >> NACK. >> You must have missed my last mail before you sent out this. We have to >> define a separate macro for bcast.c, since those buffers sometimes will >> need encryption. >> Sorry for the confusion. > No, no, I didn't miss your mail. I think it can make us clear about what and how > to do by sending the V2 patches. > > So we can define two versions 'FB_MTU' for bcast.c and msg.c, such as CRYPTO_MTU > and NON_CRYPTO_MTU. And within tipc_buf_acquire(), we decide which version > BUF_HEADROOM to use by the data size? Such as: > > int buf_size; > if (IS_ENABLED(CONFIG_TIPC_CRYPTO) && size == NON_CRYPTO_MTU) { > buf_size = size + BUF_HEADROOM_non-crypto + BUF_TAILROOM_non-crypto; > } else { > buf_size = size + BUF_HEADROOM_crypto + BUF_TAILROOM_crypto; > } > > Is this feeling? > (It's a little weird to check whether the data should be crypto by data size). > > Thanks! > Menglong Dong (Removed netdev, David Miller etc from thread) I think our main mistake is that we are trying to "compensate" for a behavior in tipc_buf_acquire() that doens't fit our purposes. What if we do the following: 1) We define BUF_HEADROOM and BUF_TAILROOM as we do now, plus a FB_MTU as you suggest, all inside msg.c. 2) We create a new inline function tipc_alloc_skb(int headroom, int tailroom, int size) in msg.c This function does the job that tipc_buf_acqire() does now. 3) We let tipc_buf_acquire() call this function as tipc_alloc_skb(BUF_HEADROOM, BUF_TAILROOM, size); Alternatively, we skip the BUF_HEADROOM/BUF_TAILROOM macros altogether andmake the code in tipc_buf_acquire() conditional as you did, but without testing for size. Actually, I think I like this better. 4) We let the fallback function in msg.c call it as tipc_alloc_skb(LL_MAX_HEADER + 48, 0, ONEPAGE_SKB); 5) We let the user in bcast.c get acces to the calculated value of FB_MTU as a function or global value, like you did in v2. I suggest you send v3 to tipc-discussion (if you are a member) and the recipients of this mail first, so we don't spam the netdev list with intermediate versions of this series. ///jon |
From: Tung Q. N. <tun...@de...> - 2021-06-10 09:16:21
|
Remove tipc-dek mail list to avoid bothering other people. Best regards, Tung Nguyen -----Original Message----- From: Jon Maloy <jm...@re...> Sent: Thursday, June 10, 2021 4:12 PM To: Menglong Dong <men...@gm...> Cc: yin...@wi...; tip...@li...; Xin Long <lx...@re...>; tipc-dek <tip...@de...> Subject: Re: [PATCH v2 net-next 0/2] net: tipc: fix FB_MTU eat two pages and do some code cleanup On 6/9/21 8:56 AM, Menglong Dong wrote: > On Wed, Jun 9, 2021 at 6:47 PM Jon Maloy <jm...@re...> wrote: >> >> >> On 6/9/21 6:32 AM, men...@gm... wrote: >>> From: Menglong Dong <don...@zt...> >>> >>> In the first patch, FB_MTU is redefined to make sure data size will not >>> exceed PAGE_SIZE. Besides, I removed the alignment for buf_size in >>> tipc_buf_acquire, because skb_alloc_fclone will do the alignment job. >>> >>> In the second patch, I removed align() in msg.c and replace it with >>> ALIGN(). >>> >>> >>> >>> >>> Menglong Dong (2): >>> net: tipc: fix FB_MTU eat two pages >>> net: tipc: replace align() with ALIGN in msg.c >>> >>> net/tipc/bcast.c | 2 +- >>> net/tipc/msg.c | 31 ++++++++++++++----------------- >>> net/tipc/msg.h | 3 ++- >>> 3 files changed, 17 insertions(+), 19 deletions(-) >>> >> NACK. >> You must have missed my last mail before you sent out this. We have to >> define a separate macro for bcast.c, since those buffers sometimes will >> need encryption. >> Sorry for the confusion. > No, no, I didn't miss your mail. I think it can make us clear about what and how > to do by sending the V2 patches. > > So we can define two versions 'FB_MTU' for bcast.c and msg.c, such as CRYPTO_MTU > and NON_CRYPTO_MTU. And within tipc_buf_acquire(), we decide which version > BUF_HEADROOM to use by the data size? Such as: > > int buf_size; > if (IS_ENABLED(CONFIG_TIPC_CRYPTO) && size == NON_CRYPTO_MTU) { > buf_size = size + BUF_HEADROOM_non-crypto + BUF_TAILROOM_non-crypto; > } else { > buf_size = size + BUF_HEADROOM_crypto + BUF_TAILROOM_crypto; > } > > Is this feeling? > (It's a little weird to check whether the data should be crypto by data size). > > Thanks! > Menglong Dong (Removed netdev, David Miller etc from thread) I think our main mistake is that we are trying to "compensate" for a behavior in tipc_buf_acquire() that doens't fit our purposes. What if we do the following: 1) We define BUF_HEADROOM and BUF_TAILROOM as we do now, plus a FB_MTU as you suggest, all inside msg.c. 2) We create a new inline function tipc_alloc_skb(int headroom, int tailroom, int size) in msg.c This function does the job that tipc_buf_acqire() does now. 3) We let tipc_buf_acquire() call this function as tipc_alloc_skb(BUF_HEADROOM, BUF_TAILROOM, size); Alternatively, we skip the BUF_HEADROOM/BUF_TAILROOM macros altogether andmake the code in tipc_buf_acquire() conditional as you did, but without testing for size. Actually, I think I like this better. 4) We let the fallback function in msg.c call it as tipc_alloc_skb(LL_MAX_HEADER + 48, 0, ONEPAGE_SKB); 5) We let the user in bcast.c get acces to the calculated value of FB_MTU as a function or global value, like you did in v2. I suggest you send v3 to tipc-discussion (if you are a member) and the recipients of this mail first, so we don't spam the netdev list with intermediate versions of this series. ///jon |
From: Jon M. <jm...@re...> - 2021-06-10 09:12:14
|
On 6/9/21 8:56 AM, Menglong Dong wrote: > On Wed, Jun 9, 2021 at 6:47 PM Jon Maloy <jm...@re...> wrote: >> >> >> On 6/9/21 6:32 AM, men...@gm... wrote: >>> From: Menglong Dong <don...@zt...> >>> >>> In the first patch, FB_MTU is redefined to make sure data size will not >>> exceed PAGE_SIZE. Besides, I removed the alignment for buf_size in >>> tipc_buf_acquire, because skb_alloc_fclone will do the alignment job. >>> >>> In the second patch, I removed align() in msg.c and replace it with >>> ALIGN(). >>> >>> >>> >>> >>> Menglong Dong (2): >>> net: tipc: fix FB_MTU eat two pages >>> net: tipc: replace align() with ALIGN in msg.c >>> >>> net/tipc/bcast.c | 2 +- >>> net/tipc/msg.c | 31 ++++++++++++++----------------- >>> net/tipc/msg.h | 3 ++- >>> 3 files changed, 17 insertions(+), 19 deletions(-) >>> >> NACK. >> You must have missed my last mail before you sent out this. We have to >> define a separate macro for bcast.c, since those buffers sometimes will >> need encryption. >> Sorry for the confusion. > No, no, I didn't miss your mail. I think it can make us clear about what and how > to do by sending the V2 patches. > > So we can define two versions 'FB_MTU' for bcast.c and msg.c, such as CRYPTO_MTU > and NON_CRYPTO_MTU. And within tipc_buf_acquire(), we decide which version > BUF_HEADROOM to use by the data size? Such as: > > int buf_size; > if (IS_ENABLED(CONFIG_TIPC_CRYPTO) && size == NON_CRYPTO_MTU) { > buf_size = size + BUF_HEADROOM_non-crypto + BUF_TAILROOM_non-crypto; > } else { > buf_size = size + BUF_HEADROOM_crypto + BUF_TAILROOM_crypto; > } > > Is this feeling? > (It's a little weird to check whether the data should be crypto by data size). > > Thanks! > Menglong Dong (Removed netdev, David Miller etc from thread) I think our main mistake is that we are trying to "compensate" for a behavior in tipc_buf_acquire() that doens't fit our purposes. What if we do the following: 1) We define BUF_HEADROOM and BUF_TAILROOM as we do now, plus a FB_MTU as you suggest, all inside msg.c. 2) We create a new inline function tipc_alloc_skb(int headroom, int tailroom, int size) in msg.c This function does the job that tipc_buf_acqire() does now. 3) We let tipc_buf_acquire() call this function as tipc_alloc_skb(BUF_HEADROOM, BUF_TAILROOM, size); Alternatively, we skip the BUF_HEADROOM/BUF_TAILROOM macros altogether andmake the code in tipc_buf_acquire() conditional as you did, but without testing for size. Actually, I think I like this better. 4) We let the fallback function in msg.c call it as tipc_alloc_skb(LL_MAX_HEADER + 48, 0, ONEPAGE_SKB); 5) We let the user in bcast.c get acces to the calculated value of FB_MTU as a function or global value, like you did in v2. I suggest you send v3 to tipc-discussion (if you are a member) and the recipients of this mail first, so we don't spam the netdev list with intermediate versions of this series. ///jon |
From: Jon M. <jm...@re...> - 2021-06-09 18:11:37
|
On 6/9/21 1:07 PM, Duzan, Gary D wrote: > For the record, as I got some of my application issues out of the > way and actually started using the STREAM socket, I did encounter some > odd behavior (connection timeout) which disappeared when I switched to > using a different service type. So the presence of a listening STREAM > socket didn't cause trouble for the DGRAM activity (except on > kernel-4.19), but something about the DGRAM activity appeared to cause > trouble for the STREAMs. We are still in the process of evaluating this, and what is a reasonable solution. We want to be as permissive as possible, but of course it has to work "as expected" without surprises. > > As an aside, at least some of our systems threw an EMSGSIZE when I > tried to sendmsg() on a STREAM socket with an oversized message, which > was at least a little surprising. The first message cannot be oversized, as it (in reality a data-carrying SYN) is routed as a regular SEQPACKET/DGRAM message to the server socket. Not sure if this is well documented, though. > Using explicit connect()/writev() worked, but having the implicit > connect work Would Be Nice. This is another case where knowing what is > expected to work would help, as I could always try the sendmsg() first > and fall back to connect()/writev() on EMSGSIZE, but if it is never > expected to work, then there would be no point. It is supposed to work, but I haven't heard about anybody actually using this feature in a product, and it is not part of our regular test suite. So, testing this has probably slipped though our filter at some moment. We will have to look into this too, and add it to our test suite. Regards ///jon > > Thanks. > > Gary Duzan > FIS - GT.M Core > > > ------------------------------------------------------------------------ > *From:* Duzan, Gary D via tipc-discussion > <tip...@li...> > *Sent:* Thursday, June 3, 2021 10:57 AM > *To:* Jon Maloy <jm...@re...>; Xin Long <luc...@gm...> > *Cc:* tip...@li... > <tip...@li...> > *Subject:* Re: [tipc-discussion] EXTERNAL: Re: DGRAM/STREAM Crossover > on Debian? > > I can frankly admit that this is a problem we never came across > before, and that we haven´t really spent much time considering this > kind of issues. > That's why you need crazy developers trying to do weird stuff with > it. 🙂 > > You are right that this needs to be at least clarified in the > documentation. > But we should also run some internal tests to identify what is working > and what is not, and then decide what is the correct approach. > Thank you for making us aware of this. > Thanks for your efforts in making TIPC a great tool for datacenter > communication. > > Gary Duzan > FIS - GT.M Core > > > ///jon > > > > > ________________________________ > From: Jon Maloy <jm...@re...> > Sent: Thursday, June 3, 2021 10:19 AM > To: Duzan, Gary D <Gar...@fi...>; Xin Long > <luc...@gm...> > Cc: tip...@li... > <tip...@li...> > Subject: Re: EXTERNAL: Re: [tipc-discussion] DGRAM/STREAM Crossover on > Debian? > > > > On 6/3/21 9:55 AM, Duzan, Gary D wrote: > > Contrary to UDP vs TCP, TIPC is only one protocol, so you > cannot bind the same service type/instance to different socket types > without risking problems. > The SYN bit will prevent a connection from being established with a > socket of the wrong type, but it will not stop the binding table lookup > from selecting such a socket, since it knows nothing about socket types. > I am actually surprised that this works even on the non-Debian machines. > Maybe the secondary lookup mechanism is saving the day. > > This could of course be fixed without too much effort, but the question > is if that is the right way to go. At least we would have to carefully > consider possible compatibility issues. > > Would it be a problem for you to just choose different service > types/ranges? > > ///jon > > Interesting. So, accidents of implementation aside, you would > expect DGRAM, STREAM, and SEQPACKET sockets to be able to communicate > with each other? I'm not using SEQPACKET (yet), but it sounds like one > might be able to connect() between STREAM and SEQPACKET sockets today, > though presumably the way that send()s on the STREAM side get > "packetized" on the SEQPACKET end is not clearly defined. If this is > what is intended, I think some clarification in the documentation > would be helpful, as would testing to ensure expected behavior across > socket types. It is true that bringing TCP/UDP thinking to TIPC is > going to lead to errors, but it isn't going to be rare. If it is > straightforward to keep communication between different socket types > separate, then I suspect that would produce the least surprise. > > As for my own project, logically I'm dealing with the same service > either way; it is only the DGRAM message size limit that is pushing > the STREAM fallback option. I could require a separate service type > for the STREAM implementation, but that is another piece of > configuration which I'd rather avoid, if practical. I expect I will > keep it the way I have it while I'm in experimental mode, as it seems > to be working as I expect on the target kernels, but I'll need to be > sure I'm on solid ground if we ever manage to get the thing to production. > > Thanks. > > Gary Duzan > FIS - GT.M Core > > I can frankly admit that this is a problem we never came across > before, and that we haven´t really spent much time considering this > kind of issues. > You are right that this needs to be at least clarified in the > documentation. > But we should also run some internal tests to identify what is working > and what is not, and then decide what is the correct approach. > Thank you for making us aware of this. > > ///jon > > > ________________________________ > From: Jon Maloy <jm...@re...><mailto:jm...@re...> > Sent: Wednesday, June 2, 2021 8:56 PM > To: Xin Long <luc...@gm...><mailto:luc...@gm...>; > Duzan, Gary D <Gar...@fi...><mailto:Gar...@fi...> > Cc: > tip...@li...<mailto:tip...@li...> > <tip...@li...><mailto:tip...@li...> > Subject: EXTERNAL: Re: [tipc-discussion] DGRAM/STREAM Crossover on Debian? > > CAUTION: This email originated from outside of the company. Do not > click links or open attachments unless you recognize the sender and > know the content is safe. > > > > On 6/2/21 4:14 PM, Xin Long wrote: > > On Wed, May 26, 2021 at 11:38 AM Duzan, Gary D via tipc-discussion > > > <tip...@li...><mailto:tip...@li...> > wrote: > >> I'm in the process of enhancing a TIPC DGRAM-based RPC-ish > service to include TIPC STREAM transport for larger messages. To > simplify configuration, I have the server process(es) bind() the same > type/range for both DGRAM and STREAM sockets (poll()ing to see which > have incoming requests), then choose which to use on the client. This > seems to work on most of my Linux systems (RHEL-8, Ubuntu 20.04/21.04, > Fedora 34, Debian 11), but on my Debian 10 system (4.19.181-1 kernel) > I am seeing messages from a DGRAM client appearing on an accept()ed > STREAM socket on the server. I have confirmed that the client is not > sending anything on a STREAM socket, and the message received by the > server is formatted as a DGRAM message (without the message framing > header). > > When you start two scoket on the server: DGRAM and STREAM, in the > > client's nametable there will be 2 sockets with different portids: > > # tipc nametable show > > Type Lower Upper Scope Port Node > > 18888 17 17 cluster 4063960415 > > 18888 17 17 cluster 1106254118 > > > > When the client calls sendmsg()/connect() to send msg to the server, > > it will choose one of them by the rule of "local, closest-first or > > round-robin". > > The client doesn't know if the peer is a DGRAM socket or STREAM > > socket. In your case, it should go round-robin. > > > > Without this commit: > > > > commit 25b9221b959483f17c2964d0922869e16caa86b5 > > Author: Jon Maloy > <jon...@er...><mailto:jon...@er...> > > Date: Fri Sep 28 20:23:21 2018 +0200 > > > > tipc: add SYN bit to connection setup messages > > > > The SYN msg for STREAM is no different from the DATA msg for DGRAM. > > that's what you're seeing in kernel-4.19 > > > >> Debian isn't a target platform for production, so I don't need > a specific fix, but it is still surprising and a bit disturbing. Was > this a known problem with the 4.19 kernel? Are there particular > reasons why using this pattern is a bad idea? > > I think it may not work as expected if you create 2 different types of > > TIPC sockets binding to the same address. > > At least on the latest kernel, once the DGRAM client chooses the > > STREAM socket, the DATA msg will be dropped. > > > > Thanks. > Exactly. Contrary to UDP vs TCP, TIPC is only one protocol, so you > cannot bind the same service type/instance to different socket types > without risking problems. > The SYN bit will prevent a connection from being established with a > socket of the wrong type, but it will not stop the binding table lookup > from selecting such a socket, since it knows nothing about socket types. > I am actually surprised that this works even on the non-Debian machines. > Maybe the secondary lookup mechanism is saving the day. > > This could of course be fixed without too much effort, but the question > is if that is the right way to go. At least we would have to carefully > consider possible compatibility issues. > > Would it be a problem for you to just choose different service > types/ranges? > > ///jon > > > > >> Thanks. > >> > >> Gary Duzan > >> FIS - GT.M Core > >> > >> The information contained in this message is proprietary and/or > confidential. If you are not the intended recipient, please: (i) > delete the message and all copies; (ii) do not disclose, distribute or > use the message in any manner; and (iii) notify the sender > immediately. In addition, please be aware that any message addressed > to our domain is subject to archiving and review by persons other than > the intended recipient. Thank you. > >> > >> _______________________________________________ > >> tipc-discussion mailing list > >> > tip...@li...<mailto:tip...@li...> > >> > https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.sourceforge.net%2Flists%2Flistinfo%2Ftipc-discussion&data=04%7C01%7Cgary.duzan%40fisglobal.com%7Cc5cc8db882484ba033b008d926a229c6%7Ce3ff91d834c84b15a0b418910a6ac575%7C0%7C0%7C637583300194004465%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=0qukqL1DPfuLrcwjZJe3%2BkyQQhVbpO9KSeKv81dkYqc%3D&reserved=0<https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.sourceforge.net%2Flists%2Flistinfo%2Ftipc-discussion&data=04%7C01%7Cgary.duzan%40fisglobal.com%7Cc5cc8db882484ba033b008d926a229c6%7Ce3ff91d834c84b15a0b418910a6ac575%7C0%7C0%7C637583300194004465%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=0qukqL1DPfuLrcwjZJe3%2BkyQQhVbpO9KSeKv81dkYqc%3D&reserved=0 > <https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.sourceforge.net%2Flists%2Flistinfo%2Ftipc-discussion&data=04%7C01%7Cgary.duzan%40fisglobal.com%7Cc5cc8db882484ba033b008d926a229c6%7Ce3ff91d834c84b15a0b418910a6ac575%7C0%7C0%7C637583300194004465%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=0qukqL1DPfuLrcwjZJe3%2BkyQQhVbpO9KSeKv81dkYqc%3D&reserved=0<https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.sourceforge.net%2Flists%2Flistinfo%2Ftipc-discussion&data=04%7C01%7Cgary.duzan%40fisglobal.com%7Cc5cc8db882484ba033b008d926a229c6%7Ce3ff91d834c84b15a0b418910a6ac575%7C0%7C0%7C637583300194004465%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=0qukqL1DPfuLrcwjZJe3%2BkyQQhVbpO9KSeKv81dkYqc%3D&reserved=0>> > > The information contained in this message is proprietary and/or > confidential. If you are not the intended recipient, please: (i) > delete the message and all copies; (ii) do not disclose, distribute or > use the message in any manner; and (iii) notify the sender > immediately. In addition, please be aware that any message addressed > to our domain is subject to archiving and review by persons other than > the intended recipient. Thank you. > > The information contained in this message is proprietary and/or > confidential. If you are not the intended recipient, please: (i) > delete the message and all copies; (ii) do not disclose, distribute or > use the message in any manner; and (iii) notify the sender > immediately. In addition, please be aware that any message addressed > to our domain is subject to archiving and review by persons other than > the intended recipient. Thank you. > > _______________________________________________ > tipc-discussion mailing list > tip...@li... > https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.sourceforge.net%2Flists%2Flistinfo%2Ftipc-discussion&data=04%7C01%7Cgary.duzan%40fisglobal.com%7Cc5cc8db882484ba033b008d926a229c6%7Ce3ff91d834c84b15a0b418910a6ac575%7C0%7C0%7C637583300194004465%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=0qukqL1DPfuLrcwjZJe3%2BkyQQhVbpO9KSeKv81dkYqc%3D&reserved=0 > <https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.sourceforge.net%2Flists%2Flistinfo%2Ftipc-discussion&data=04%7C01%7Cgary.duzan%40fisglobal.com%7Cc5cc8db882484ba033b008d926a229c6%7Ce3ff91d834c84b15a0b418910a6ac575%7C0%7C0%7C637583300194004465%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=0qukqL1DPfuLrcwjZJe3%2BkyQQhVbpO9KSeKv81dkYqc%3D&reserved=0> > The information contained in this message is proprietary and/or > confidential. If you are not the intended recipient, please: (i) > delete the message and all copies; (ii) do not disclose, distribute or > use the message in any manner; and (iii) notify the sender > immediately. In addition, please be aware that any message addressed > to our domain is subject to archiving and review by persons other than > the intended recipient. Thank you. |
From: Duzan, G. D <Gar...@fi...> - 2021-06-09 17:23:43
|
For the record, as I got some of my application issues out of the way and actually started using the STREAM socket, I did encounter some odd behavior (connection timeout) which disappeared when I switched to using a different service type. So the presence of a listening STREAM socket didn't cause trouble for the DGRAM activity (except on kernel-4.19), but something about the DGRAM activity appeared to cause trouble for the STREAMs. As an aside, at least some of our systems threw an EMSGSIZE when I tried to sendmsg() on a STREAM socket with an oversized message, which was at least a little surprising. Using explicit connect()/writev() worked, but having the implicit connect work Would Be Nice. This is another case where knowing what is expected to work would help, as I could always try the sendmsg() first and fall back to connect()/writev() on EMSGSIZE, but if it is never expected to work, then there would be no point. Thanks. Gary Duzan FIS - GT.M Core ________________________________ From: Duzan, Gary D via tipc-discussion <tip...@li...> Sent: Thursday, June 3, 2021 10:57 AM To: Jon Maloy <jm...@re...>; Xin Long <luc...@gm...> Cc: tip...@li... <tip...@li...> Subject: Re: [tipc-discussion] EXTERNAL: Re: DGRAM/STREAM Crossover on Debian? I can frankly admit that this is a problem we never came across before, and that we haven´t really spent much time considering this kind of issues. That's why you need crazy developers trying to do weird stuff with it. 🙂 You are right that this needs to be at least clarified in the documentation. But we should also run some internal tests to identify what is working and what is not, and then decide what is the correct approach. Thank you for making us aware of this. Thanks for your efforts in making TIPC a great tool for datacenter communication. Gary Duzan FIS - GT.M Core ///jon ________________________________ From: Jon Maloy <jm...@re...> Sent: Thursday, June 3, 2021 10:19 AM To: Duzan, Gary D <Gar...@fi...>; Xin Long <luc...@gm...> Cc: tip...@li... <tip...@li...> Subject: Re: EXTERNAL: Re: [tipc-discussion] DGRAM/STREAM Crossover on Debian? On 6/3/21 9:55 AM, Duzan, Gary D wrote: Contrary to UDP vs TCP, TIPC is only one protocol, so you cannot bind the same service type/instance to different socket types without risking problems. The SYN bit will prevent a connection from being established with a socket of the wrong type, but it will not stop the binding table lookup from selecting such a socket, since it knows nothing about socket types. I am actually surprised that this works even on the non-Debian machines. Maybe the secondary lookup mechanism is saving the day. This could of course be fixed without too much effort, but the question is if that is the right way to go. At least we would have to carefully consider possible compatibility issues. Would it be a problem for you to just choose different service types/ranges? ///jon Interesting. So, accidents of implementation aside, you would expect DGRAM, STREAM, and SEQPACKET sockets to be able to communicate with each other? I'm not using SEQPACKET (yet), but it sounds like one might be able to connect() between STREAM and SEQPACKET sockets today, though presumably the way that send()s on the STREAM side get "packetized" on the SEQPACKET end is not clearly defined. If this is what is intended, I think some clarification in the documentation would be helpful, as would testing to ensure expected behavior across socket types. It is true that bringing TCP/UDP thinking to TIPC is going to lead to errors, but it isn't going to be rare. If it is straightforward to keep communication between different socket types separate, then I suspect that would produce the least surprise. As for my own project, logically I'm dealing with the same service either way; it is only the DGRAM message size limit that is pushing the STREAM fallback option. I could require a separate service type for the STREAM implementation, but that is another piece of configuration which I'd rather avoid, if practical. I expect I will keep it the way I have it while I'm in experimental mode, as it seems to be working as I expect on the target kernels, but I'll need to be sure I'm on solid ground if we ever manage to get the thing to production. Thanks. Gary Duzan FIS - GT.M Core I can frankly admit that this is a problem we never came across before, and that we haven´t really spent much time considering this kind of issues. You are right that this needs to be at least clarified in the documentation. But we should also run some internal tests to identify what is working and what is not, and then decide what is the correct approach. Thank you for making us aware of this. ///jon ________________________________ From: Jon Maloy <jm...@re...><mailto:jm...@re...> Sent: Wednesday, June 2, 2021 8:56 PM To: Xin Long <luc...@gm...><mailto:luc...@gm...>; Duzan, Gary D <Gar...@fi...><mailto:Gar...@fi...> Cc: tip...@li...<mailto:tip...@li...> <tip...@li...><mailto:tip...@li...> Subject: EXTERNAL: Re: [tipc-discussion] DGRAM/STREAM Crossover on Debian? CAUTION: This email originated from outside of the company. Do not click links or open attachments unless you recognize the sender and know the content is safe. On 6/2/21 4:14 PM, Xin Long wrote: > On Wed, May 26, 2021 at 11:38 AM Duzan, Gary D via tipc-discussion > <tip...@li...><mailto:tip...@li...> wrote: >> I'm in the process of enhancing a TIPC DGRAM-based RPC-ish service to include TIPC STREAM transport for larger messages. To simplify configuration, I have the server process(es) bind() the same type/range for both DGRAM and STREAM sockets (poll()ing to see which have incoming requests), then choose which to use on the client. This seems to work on most of my Linux systems (RHEL-8, Ubuntu 20.04/21.04, Fedora 34, Debian 11), but on my Debian 10 system (4.19.181-1 kernel) I am seeing messages from a DGRAM client appearing on an accept()ed STREAM socket on the server. I have confirmed that the client is not sending anything on a STREAM socket, and the message received by the server is formatted as a DGRAM message (without the message framing header). > When you start two scoket on the server: DGRAM and STREAM, in the > client's nametable there will be 2 sockets with different portids: > # tipc nametable show > Type Lower Upper Scope Port Node > 18888 17 17 cluster 4063960415 > 18888 17 17 cluster 1106254118 > > When the client calls sendmsg()/connect() to send msg to the server, > it will choose one of them by the rule of "local, closest-first or > round-robin". > The client doesn't know if the peer is a DGRAM socket or STREAM > socket. In your case, it should go round-robin. > > Without this commit: > > commit 25b9221b959483f17c2964d0922869e16caa86b5 > Author: Jon Maloy <jon...@er...><mailto:jon...@er...> > Date: Fri Sep 28 20:23:21 2018 +0200 > > tipc: add SYN bit to connection setup messages > > The SYN msg for STREAM is no different from the DATA msg for DGRAM. > that's what you're seeing in kernel-4.19 > >> Debian isn't a target platform for production, so I don't need a specific fix, but it is still surprising and a bit disturbing. Was this a known problem with the 4.19 kernel? Are there particular reasons why using this pattern is a bad idea? > I think it may not work as expected if you create 2 different types of > TIPC sockets binding to the same address. > At least on the latest kernel, once the DGRAM client chooses the > STREAM socket, the DATA msg will be dropped. > > Thanks. Exactly. Contrary to UDP vs TCP, TIPC is only one protocol, so you cannot bind the same service type/instance to different socket types without risking problems. The SYN bit will prevent a connection from being established with a socket of the wrong type, but it will not stop the binding table lookup from selecting such a socket, since it knows nothing about socket types. I am actually surprised that this works even on the non-Debian machines. Maybe the secondary lookup mechanism is saving the day. This could of course be fixed without too much effort, but the question is if that is the right way to go. At least we would have to carefully consider possible compatibility issues. Would it be a problem for you to just choose different service types/ranges? ///jon > >> Thanks. >> >> Gary Duzan >> FIS - GT.M Core >> >> The information contained in this message is proprietary and/or confidential. If you are not the intended recipient, please: (i) delete the message and all copies; (ii) do not disclose, distribute or use the message in any manner; and (iii) notify the sender immediately. In addition, please be aware that any message addressed to our domain is subject to archiving and review by persons other than the intended recipient. Thank you. >> >> _______________________________________________ >> tipc-discussion mailing list >> tip...@li...<mailto:tip...@li...> >> https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.sourceforge.net%2Flists%2Flistinfo%2Ftipc-discussion&data=04%7C01%7Cgary.duzan%40fisglobal.com%7Cc5cc8db882484ba033b008d926a229c6%7Ce3ff91d834c84b15a0b418910a6ac575%7C0%7C0%7C637583300194004465%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=0qukqL1DPfuLrcwjZJe3%2BkyQQhVbpO9KSeKv81dkYqc%3D&reserved=0<https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.sourceforge.net%2Flists%2Flistinfo%2Ftipc-discussion&data=04%7C01%7Cgary.duzan%40fisglobal.com%7Cc5cc8db882484ba033b008d926a229c6%7Ce3ff91d834c84b15a0b418910a6ac575%7C0%7C0%7C637583300194004465%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=0qukqL1DPfuLrcwjZJe3%2BkyQQhVbpO9KSeKv81dkYqc%3D&reserved=0> The information contained in this message is proprietary and/or confidential. If you are not the intended recipient, please: (i) delete the message and all copies; (ii) do not disclose, distribute or use the message in any manner; and (iii) notify the sender immediately. In addition, please be aware that any message addressed to our domain is subject to archiving and review by persons other than the intended recipient. Thank you. The information contained in this message is proprietary and/or confidential. If you are not the intended recipient, please: (i) delete the message and all copies; (ii) do not disclose, distribute or use the message in any manner; and (iii) notify the sender immediately. In addition, please be aware that any message addressed to our domain is subject to archiving and review by persons other than the intended recipient. Thank you. _______________________________________________ tipc-discussion mailing list tip...@li... https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.sourceforge.net%2Flists%2Flistinfo%2Ftipc-discussion&data=04%7C01%7Cgary.duzan%40fisglobal.com%7Cc5cc8db882484ba033b008d926a229c6%7Ce3ff91d834c84b15a0b418910a6ac575%7C0%7C0%7C637583300194004465%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=0qukqL1DPfuLrcwjZJe3%2BkyQQhVbpO9KSeKv81dkYqc%3D&reserved=0 The information contained in this message is proprietary and/or confidential. If you are not the intended recipient, please: (i) delete the message and all copies; (ii) do not disclose, distribute or use the message in any manner; and (iii) notify the sender immediately. In addition, please be aware that any message addressed to our domain is subject to archiving and review by persons other than the intended recipient. Thank you. |
From: Jon M. <jm...@re...> - 2021-06-09 10:47:56
|
On 6/9/21 6:32 AM, men...@gm... wrote: > From: Menglong Dong <don...@zt...> > > In the first patch, FB_MTU is redefined to make sure data size will not > exceed PAGE_SIZE. Besides, I removed the alignment for buf_size in > tipc_buf_acquire, because skb_alloc_fclone will do the alignment job. > > In the second patch, I removed align() in msg.c and replace it with > ALIGN(). > > > > > Menglong Dong (2): > net: tipc: fix FB_MTU eat two pages > net: tipc: replace align() with ALIGN in msg.c > > net/tipc/bcast.c | 2 +- > net/tipc/msg.c | 31 ++++++++++++++----------------- > net/tipc/msg.h | 3 ++- > 3 files changed, 17 insertions(+), 19 deletions(-) > NACK. You must have missed my last mail before you sent out this. We have to define a separate macro for bcast.c, since those buffers sometimes will need encryption. Sorry for the confusion. ///jon |
From: Jon M. <jm...@re...> - 2021-06-09 07:34:50
|
On 6/8/21 10:54 PM, Menglong Dong wrote: > On Tue, Jun 08, 2021 at 06:37:38PM -0400, Jon Maloy wrote: >> > [...] >> I spent a little more time looking into this. I think the best we can do is >> to keep FB_MTU internal to msg.c, and then add an outline function to msg.c >> that can be used by bcast.c. The way it is used is never time critical. >> >> I also see that we could need a little cleanup around this. There is a >> redundant align() function that should be removed and replaced with the >> global ALIGN() macro. >> Even tipc_buf_acquire() should use this macro instead of the explicit method >> that is used now. >> In general, I stongly dislike conditional code, and it is not necessary in >> this function. If we redefine the non-crypto BUF_TAILROOM to 0 instead of 16 >> (it is not used anywhere else) we could get rid of this too. >> >> But I leave that to you. If you only fix the FB_MTU macro I am content. >> > Yeah, I think I can handle it, just leave it to me. > > (finger heart :/) > Menglong DongI It seems like I have been misleading you. It turns out that these messages *will* be sent out over the nework in some cases, i.e. at multicast/broadcast over an UDP bearer. So, what we need is two macros, one with the conditional crypto head/tailroom defined as you first suggested, and one that only use the non-crypto head/tailroom as we have been discussing now. The first one can be defined inside bcast.c, the latter inside msg.c. It might also be a good idea to give the macros more descriptive names, such as ONEPAGE_MTU in the broadcast version, and ONEPAGE_SKB in the node local version. Does that make sense? ///jon > |
From: Jon M. <jm...@re...> - 2021-06-08 22:37:56
|
On 6/7/21 8:51 AM, Menglong Dong wrote: > On Sat, Jun 05, 2021 at 10:25:53AM -0400, Jon Maloy wrote: >> >> On 6/4/21 9:28 PM, Menglong Dong wrote: >>> Hello Maloy, >>> >>> On Sat, Jun 5, 2021 at 3:20 AM Jon Maloy <jm...@re...> wrote: >>>> [...] >>> >>> So if I use the non-crypto version, the size allocated will be: >>> >>> PAGE_SIZE - BUF_HEADROOM_non-crypto - BUF_TAILROOM_non-crypt + >>> BUF_HEADROOM_crypto + BUF_TAILROOM_crypto >>> >>> which is larger than PAGE_SIZE. >>> >>> So, I think the simple way is to define FB_MTU in 'crypto.h'. Is this >>> acceptable? >>> >>> Thanks! >>> Menglong Dong >>> I spent a little more time looking into this. I think the best we can do is to keep FB_MTU internal to msg.c, and then add an outline function to msg.c that can be used by bcast.c. The way it is used is never time critical. I also see that we could need a little cleanup around this. There is a redundant align() function that should be removed and replaced with the global ALIGN() macro. Even tipc_buf_acquire() should use this macro instead of the explicit method that is used now. In general, I stongly dislike conditional code, and it is not necessary in this function. If we redefine the non-crypto BUF_TAILROOM to 0 instead of 16 (it is not used anywhere else) we could get rid of this too. But I leave that to you. If you only fix the FB_MTU macro I am content. ///jon |
From: Jon M. <jm...@re...> - 2021-06-07 02:41:08
|
On 6/4/21 9:28 PM, Menglong Dong wrote: > Hello Maloy, > > On Sat, Jun 5, 2021 at 3:20 AM Jon Maloy <jm...@re...> wrote: >> > [...] >> Please don't add any extra file just for this little fix. We have enough >> files. >> Keep the macros in msg.h/c where they used to be. You can still add >> your copyright line to those files. >> Regarding the macros kept inside msg.c, they are there because we design >> by the principle of minimal exposure, even among our module internal files. >> Otherwise it is ok. >> > I don't want to add a new file too, but I found it's hard to define FB_MTU. I > tried to define it in msg.h, and 'crypto.h' should be included, which > 'BUF_HEADROOM' is defined in. However, 'msg.h' is already included in > 'crypto.h', so it doesn't work. > > I tried to define FB_MTU in 'crypto.h', but it feels weird to define > it here. And > FB_MTU is also used in 'bcast.c', so it can't be defined in 'msg.c'. > > I will see if there is a better solution. I think we can leverage the fact that this by definition is a node local message, and those are never encrypted. So, if you base FB_MTU on the non-crypto versions of BUF_HEADROOM and BUF_TAILROOM we should be safe. That will even give us better utilization of the space available. ///jon > > Thanks! > Menglong Dong > >>> @@ -0,0 +1,55 @@ >>> +/* SPDX-License-Identifier: GPL-2.0-only */ >>> +/* >>> + * Copyright 2021 ZTE Corporation. >>> + * All rights reserved. >>> + * >>> + * Redistribution and use in source and binary forms, with or without >>> + * modification, are permitted provided that the following conditions are met: >>> + * >>> + * 1. Redistributions of source code must retain the above copyright >>> + * notice, this list of conditions and the following disclaimer. >>> + * 2. Redistributions in binary form must reproduce the above copyright >>> + * notice, this list of conditions and the following disclaimer in the >>> + * documentation and/or other materials provided with the distribution. >>> + * 3. Neither the names of the copyright holders nor the names of its >>> + * contributors may be used to endorse or promote products derived from >>> + * this software without specific prior written permission. >>> + * >>> + * Alternatively, this software may be distributed under the terms of the >>> + * GNU General Public License ("GPL") version 2 as published by the Free >>> + * Software Foundation. >>> + * >>> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" >>> + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE >>> + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE >>> + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE >>> + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR >>> + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF >>> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS >>> + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN >>> + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) >>> + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE >>> + * POSSIBILITY OF SUCH DAMAGE. >>> + */ >>> + >>> +#ifndef _TIPC_MTU_H >>> +#define _TIPC_MTU_H >>> + >>> +#include <linux/tipc.h> >>> +#include "crypto.h" >>> + >>> +#ifdef CONFIG_TIPC_CRYPTO >>> +#define BUF_HEADROOM ALIGN(((LL_MAX_HEADER + 48) + EHDR_MAX_SIZE), 16) >>> +#define BUF_TAILROOM (TIPC_AES_GCM_TAG_SIZE) >>> +#define FB_MTU (PAGE_SIZE - \ >>> + SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) - \ >>> + SKB_DATA_ALIGN(BUF_HEADROOM + BUF_TAILROOM + 3)) >>> +#else >>> +#define BUF_HEADROOM (LL_MAX_HEADER + 48) >>> +#define BUF_TAILROOM 16 >>> +#define FB_MTU (PAGE_SIZE - \ >>> + SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) - \ >>> + SKB_DATA_ALIGN(BUF_HEADROOM + 3)) >>> +#endif >>> + >>> +#endif |
From: Jon M. <jm...@re...> - 2021-06-04 19:20:45
|
On 6/4/21 3:44 AM, men...@gm... wrote: > From: Menglong Dong <don...@zt...> > > FB_MTU is used in 'tipc_msg_build()' to alloc smaller skb when memory > allocation fails, which can avoid unnecessary sending failures. > > The value of FB_MTU now is 3744, and the data size will be: > > (3744 + SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) + \ > SKB_DATA_ALIGN(BUF_HEADROOM + BUF_TAILROOM + 3)) > > which is larger than one page(4096), and two pages will be allocated. > > To avoid it, replace '3744' with a calculation: > > FB_MTU = (PAGE_SIZE - SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) > - SKB_DATA_ALIGN(BUF_HEADROOM + BUF_TAILROOM + 3)) > > Fixes: 4c94cc2d3d57 ("tipc: fall back to smaller MTU if allocation of local send skb fails") > > Reported-by: Zeal Robot <ze...@zt...> > Signed-off-by: Menglong Dong <don...@zt...> > --- > net/tipc/bcast.c | 1 + > net/tipc/msg.c | 8 +------ > net/tipc/msg.h | 1 - > net/tipc/mtu.h | 55 ++++++++++++++++++++++++++++++++++++++++++++++++ > 4 files changed, 57 insertions(+), 8 deletions(-) > create mode 100644 net/tipc/mtu.h > > diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c > index d4beca895992..c641b68e0812 100644 > --- a/net/tipc/bcast.c > +++ b/net/tipc/bcast.c > @@ -41,6 +41,7 @@ > #include "bcast.h" > #include "link.h" > #include "name_table.h" > +#include "mtu.h" > > #define BCLINK_WIN_DEFAULT 50 /* bcast link window size (default) */ > #define BCLINK_WIN_MIN 32 /* bcast minimum link window size */ > diff --git a/net/tipc/msg.c b/net/tipc/msg.c > index ce6ab54822d8..ec70d271c2da 100644 > --- a/net/tipc/msg.c > +++ b/net/tipc/msg.c > @@ -40,15 +40,9 @@ > #include "addr.h" > #include "name_table.h" > #include "crypto.h" > +#include "mtu.h" > > #define MAX_FORWARD_SIZE 1024 > -#ifdef CONFIG_TIPC_CRYPTO > -#define BUF_HEADROOM ALIGN(((LL_MAX_HEADER + 48) + EHDR_MAX_SIZE), 16) > -#define BUF_TAILROOM (TIPC_AES_GCM_TAG_SIZE) > -#else > -#define BUF_HEADROOM (LL_MAX_HEADER + 48) > -#define BUF_TAILROOM 16 > -#endif > > static unsigned int align(unsigned int i) > { > diff --git a/net/tipc/msg.h b/net/tipc/msg.h > index 5d64596ba987..e83689d0f0f6 100644 > --- a/net/tipc/msg.h > +++ b/net/tipc/msg.h > @@ -99,7 +99,6 @@ struct plist; > #define MAX_H_SIZE 60 /* Largest possible TIPC header size */ > > #define MAX_MSG_SIZE (MAX_H_SIZE + TIPC_MAX_USER_MSG_SIZE) > -#define FB_MTU 3744 > #define TIPC_MEDIA_INFO_OFFSET 5 > > struct tipc_skb_cb { > diff --git a/net/tipc/mtu.h b/net/tipc/mtu.h > new file mode 100644 > index 000000000000..033f0b178f9d > --- /dev/null > +++ b/net/tipc/mtu.h Please don't add any extra file just for this little fix. We have enough files. Keep the macros in msg.h/c where they used to be. You can still add your copyright line to those files. Regarding the macros kept inside msg.c, they are there because we design by the principle of minimal exposure, even among our module internal files. Otherwise it is ok. Thanks ///jon > @@ -0,0 +1,55 @@ > +/* SPDX-License-Identifier: GPL-2.0-only */ > +/* > + * Copyright 2021 ZTE Corporation. > + * All rights reserved. > + * > + * Redistribution and use in source and binary forms, with or without > + * modification, are permitted provided that the following conditions are met: > + * > + * 1. Redistributions of source code must retain the above copyright > + * notice, this list of conditions and the following disclaimer. > + * 2. Redistributions in binary form must reproduce the above copyright > + * notice, this list of conditions and the following disclaimer in the > + * documentation and/or other materials provided with the distribution. > + * 3. Neither the names of the copyright holders nor the names of its > + * contributors may be used to endorse or promote products derived from > + * this software without specific prior written permission. > + * > + * Alternatively, this software may be distributed under the terms of the > + * GNU General Public License ("GPL") version 2 as published by the Free > + * Software Foundation. > + * > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" > + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE > + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE > + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE > + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR > + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF > + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS > + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN > + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) > + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE > + * POSSIBILITY OF SUCH DAMAGE. > + */ > + > +#ifndef _TIPC_MTU_H > +#define _TIPC_MTU_H > + > +#include <linux/tipc.h> > +#include "crypto.h" > + > +#ifdef CONFIG_TIPC_CRYPTO > +#define BUF_HEADROOM ALIGN(((LL_MAX_HEADER + 48) + EHDR_MAX_SIZE), 16) > +#define BUF_TAILROOM (TIPC_AES_GCM_TAG_SIZE) > +#define FB_MTU (PAGE_SIZE - \ > + SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) - \ > + SKB_DATA_ALIGN(BUF_HEADROOM + BUF_TAILROOM + 3)) > +#else > +#define BUF_HEADROOM (LL_MAX_HEADER + 48) > +#define BUF_TAILROOM 16 > +#define FB_MTU (PAGE_SIZE - \ > + SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) - \ > + SKB_DATA_ALIGN(BUF_HEADROOM + 3)) > +#endif > + > +#endif |
From: Duzan, G. D <Gar...@fi...> - 2021-06-03 16:27:29
|
Contrary to UDP vs TCP, TIPC is only one protocol, so you cannot bind the same service type/instance to different socket types without risking problems. The SYN bit will prevent a connection from being established with a socket of the wrong type, but it will not stop the binding table lookup from selecting such a socket, since it knows nothing about socket types. I am actually surprised that this works even on the non-Debian machines. Maybe the secondary lookup mechanism is saving the day. This could of course be fixed without too much effort, but the question is if that is the right way to go. At least we would have to carefully consider possible compatibility issues. Would it be a problem for you to just choose different service types/ranges? ///jon Interesting. So, accidents of implementation aside, you would expect DGRAM, STREAM, and SEQPACKET sockets to be able to communicate with each other? I'm not using SEQPACKET (yet), but it sounds like one might be able to connect() between STREAM and SEQPACKET sockets today, though presumably the way that send()s on the STREAM side get "packetized" on the SEQPACKET end is not clearly defined. If this is what is intended, I think some clarification in the documentation would be helpful, as would testing to ensure expected behavior across socket types. It is true that bringing TCP/UDP thinking to TIPC is going to lead to errors, but it isn't going to be rare. If it is straightforward to keep communication between different socket types separate, then I suspect that would produce the least surprise. As for my own project, logically I'm dealing with the same service either way; it is only the DGRAM message size limit that is pushing the STREAM fallback option. I could require a separate service type for the STREAM implementation, but that is another piece of configuration which I'd rather avoid, if practical. I expect I will keep it the way I have it while I'm in experimental mode, as it seems to be working as I expect on the target kernels, but I'll need to be sure I'm on solid ground if we ever manage to get the thing to production. Thanks. Gary Duzan FIS - GT.M Core ________________________________ From: Jon Maloy <jm...@re...> Sent: Wednesday, June 2, 2021 8:56 PM To: Xin Long <luc...@gm...>; Duzan, Gary D <Gar...@fi...> Cc: tip...@li... <tip...@li...> Subject: EXTERNAL: Re: [tipc-discussion] DGRAM/STREAM Crossover on Debian? CAUTION: This email originated from outside of the company. Do not click links or open attachments unless you recognize the sender and know the content is safe. On 6/2/21 4:14 PM, Xin Long wrote: > On Wed, May 26, 2021 at 11:38 AM Duzan, Gary D via tipc-discussion > <tip...@li...> wrote: >> I'm in the process of enhancing a TIPC DGRAM-based RPC-ish service to include TIPC STREAM transport for larger messages. To simplify configuration, I have the server process(es) bind() the same type/range for both DGRAM and STREAM sockets (poll()ing to see which have incoming requests), then choose which to use on the client. This seems to work on most of my Linux systems (RHEL-8, Ubuntu 20.04/21.04, Fedora 34, Debian 11), but on my Debian 10 system (4.19.181-1 kernel) I am seeing messages from a DGRAM client appearing on an accept()ed STREAM socket on the server. I have confirmed that the client is not sending anything on a STREAM socket, and the message received by the server is formatted as a DGRAM message (without the message framing header). > When you start two scoket on the server: DGRAM and STREAM, in the > client's nametable there will be 2 sockets with different portids: > # tipc nametable show > Type Lower Upper Scope Port Node > 18888 17 17 cluster 4063960415 > 18888 17 17 cluster 1106254118 > > When the client calls sendmsg()/connect() to send msg to the server, > it will choose one of them by the rule of "local, closest-first or > round-robin". > The client doesn't know if the peer is a DGRAM socket or STREAM > socket. In your case, it should go round-robin. > > Without this commit: > > commit 25b9221b959483f17c2964d0922869e16caa86b5 > Author: Jon Maloy <jon...@er...> > Date: Fri Sep 28 20:23:21 2018 +0200 > > tipc: add SYN bit to connection setup messages > > The SYN msg for STREAM is no different from the DATA msg for DGRAM. > that's what you're seeing in kernel-4.19 > >> Debian isn't a target platform for production, so I don't need a specific fix, but it is still surprising and a bit disturbing. Was this a known problem with the 4.19 kernel? Are there particular reasons why using this pattern is a bad idea? > I think it may not work as expected if you create 2 different types of > TIPC sockets binding to the same address. > At least on the latest kernel, once the DGRAM client chooses the > STREAM socket, the DATA msg will be dropped. > > Thanks. Exactly. Contrary to UDP vs TCP, TIPC is only one protocol, so you cannot bind the same service type/instance to different socket types without risking problems. The SYN bit will prevent a connection from being established with a socket of the wrong type, but it will not stop the binding table lookup from selecting such a socket, since it knows nothing about socket types. I am actually surprised that this works even on the non-Debian machines. Maybe the secondary lookup mechanism is saving the day. This could of course be fixed without too much effort, but the question is if that is the right way to go. At least we would have to carefully consider possible compatibility issues. Would it be a problem for you to just choose different service types/ranges? ///jon > >> Thanks. >> >> Gary Duzan >> FIS - GT.M Core >> >> The information contained in this message is proprietary and/or confidential. If you are not the intended recipient, please: (i) delete the message and all copies; (ii) do not disclose, distribute or use the message in any manner; and (iii) notify the sender immediately. In addition, please be aware that any message addressed to our domain is subject to archiving and review by persons other than the intended recipient. Thank you. >> >> _______________________________________________ >> tipc-discussion mailing list >> tip...@li... >> https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.sourceforge.net%2Flists%2Flistinfo%2Ftipc-discussion&data=04%7C01%7Cgary.duzan%40fisglobal.com%7C68e57e4f23b548f804b008d9262a7be5%7Ce3ff91d834c84b15a0b418910a6ac575%7C0%7C0%7C637582786180071891%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=VNEWWSJ4euG3QutIj3MXmPpOI56vPasJtcdFYM61ToU%3D&reserved=0 The information contained in this message is proprietary and/or confidential. If you are not the intended recipient, please: (i) delete the message and all copies; (ii) do not disclose, distribute or use the message in any manner; and (iii) notify the sender immediately. In addition, please be aware that any message addressed to our domain is subject to archiving and review by persons other than the intended recipient. Thank you. |
From: Duzan, G. D <Gar...@fi...> - 2021-06-03 15:13:14
|
I can frankly admit that this is a problem we never came across before, and that we haven´t really spent much time considering this kind of issues. That's why you need crazy developers trying to do weird stuff with it. 🙂 You are right that this needs to be at least clarified in the documentation. But we should also run some internal tests to identify what is working and what is not, and then decide what is the correct approach. Thank you for making us aware of this. Thanks for your efforts in making TIPC a great tool for datacenter communication. Gary Duzan FIS - GT.M Core ///jon ________________________________ From: Jon Maloy <jm...@re...> Sent: Thursday, June 3, 2021 10:19 AM To: Duzan, Gary D <Gar...@fi...>; Xin Long <luc...@gm...> Cc: tip...@li... <tip...@li...> Subject: Re: EXTERNAL: Re: [tipc-discussion] DGRAM/STREAM Crossover on Debian? On 6/3/21 9:55 AM, Duzan, Gary D wrote: Contrary to UDP vs TCP, TIPC is only one protocol, so you cannot bind the same service type/instance to different socket types without risking problems. The SYN bit will prevent a connection from being established with a socket of the wrong type, but it will not stop the binding table lookup from selecting such a socket, since it knows nothing about socket types. I am actually surprised that this works even on the non-Debian machines. Maybe the secondary lookup mechanism is saving the day. This could of course be fixed without too much effort, but the question is if that is the right way to go. At least we would have to carefully consider possible compatibility issues. Would it be a problem for you to just choose different service types/ranges? ///jon Interesting. So, accidents of implementation aside, you would expect DGRAM, STREAM, and SEQPACKET sockets to be able to communicate with each other? I'm not using SEQPACKET (yet), but it sounds like one might be able to connect() between STREAM and SEQPACKET sockets today, though presumably the way that send()s on the STREAM side get "packetized" on the SEQPACKET end is not clearly defined. If this is what is intended, I think some clarification in the documentation would be helpful, as would testing to ensure expected behavior across socket types. It is true that bringing TCP/UDP thinking to TIPC is going to lead to errors, but it isn't going to be rare. If it is straightforward to keep communication between different socket types separate, then I suspect that would produce the least surprise. As for my own project, logically I'm dealing with the same service either way; it is only the DGRAM message size limit that is pushing the STREAM fallback option. I could require a separate service type for the STREAM implementation, but that is another piece of configuration which I'd rather avoid, if practical. I expect I will keep it the way I have it while I'm in experimental mode, as it seems to be working as I expect on the target kernels, but I'll need to be sure I'm on solid ground if we ever manage to get the thing to production. Thanks. Gary Duzan FIS - GT.M Core I can frankly admit that this is a problem we never came across before, and that we haven´t really spent much time considering this kind of issues. You are right that this needs to be at least clarified in the documentation. But we should also run some internal tests to identify what is working and what is not, and then decide what is the correct approach. Thank you for making us aware of this. ///jon ________________________________ From: Jon Maloy <jm...@re...><mailto:jm...@re...> Sent: Wednesday, June 2, 2021 8:56 PM To: Xin Long <luc...@gm...><mailto:luc...@gm...>; Duzan, Gary D <Gar...@fi...><mailto:Gar...@fi...> Cc: tip...@li...<mailto:tip...@li...> <tip...@li...><mailto:tip...@li...> Subject: EXTERNAL: Re: [tipc-discussion] DGRAM/STREAM Crossover on Debian? CAUTION: This email originated from outside of the company. Do not click links or open attachments unless you recognize the sender and know the content is safe. On 6/2/21 4:14 PM, Xin Long wrote: > On Wed, May 26, 2021 at 11:38 AM Duzan, Gary D via tipc-discussion > <tip...@li...><mailto:tip...@li...> wrote: >> I'm in the process of enhancing a TIPC DGRAM-based RPC-ish service to include TIPC STREAM transport for larger messages. To simplify configuration, I have the server process(es) bind() the same type/range for both DGRAM and STREAM sockets (poll()ing to see which have incoming requests), then choose which to use on the client. This seems to work on most of my Linux systems (RHEL-8, Ubuntu 20.04/21.04, Fedora 34, Debian 11), but on my Debian 10 system (4.19.181-1 kernel) I am seeing messages from a DGRAM client appearing on an accept()ed STREAM socket on the server. I have confirmed that the client is not sending anything on a STREAM socket, and the message received by the server is formatted as a DGRAM message (without the message framing header). > When you start two scoket on the server: DGRAM and STREAM, in the > client's nametable there will be 2 sockets with different portids: > # tipc nametable show > Type Lower Upper Scope Port Node > 18888 17 17 cluster 4063960415 > 18888 17 17 cluster 1106254118 > > When the client calls sendmsg()/connect() to send msg to the server, > it will choose one of them by the rule of "local, closest-first or > round-robin". > The client doesn't know if the peer is a DGRAM socket or STREAM > socket. In your case, it should go round-robin. > > Without this commit: > > commit 25b9221b959483f17c2964d0922869e16caa86b5 > Author: Jon Maloy <jon...@er...><mailto:jon...@er...> > Date: Fri Sep 28 20:23:21 2018 +0200 > > tipc: add SYN bit to connection setup messages > > The SYN msg for STREAM is no different from the DATA msg for DGRAM. > that's what you're seeing in kernel-4.19 > >> Debian isn't a target platform for production, so I don't need a specific fix, but it is still surprising and a bit disturbing. Was this a known problem with the 4.19 kernel? Are there particular reasons why using this pattern is a bad idea? > I think it may not work as expected if you create 2 different types of > TIPC sockets binding to the same address. > At least on the latest kernel, once the DGRAM client chooses the > STREAM socket, the DATA msg will be dropped. > > Thanks. Exactly. Contrary to UDP vs TCP, TIPC is only one protocol, so you cannot bind the same service type/instance to different socket types without risking problems. The SYN bit will prevent a connection from being established with a socket of the wrong type, but it will not stop the binding table lookup from selecting such a socket, since it knows nothing about socket types. I am actually surprised that this works even on the non-Debian machines. Maybe the secondary lookup mechanism is saving the day. This could of course be fixed without too much effort, but the question is if that is the right way to go. At least we would have to carefully consider possible compatibility issues. Would it be a problem for you to just choose different service types/ranges? ///jon > >> Thanks. >> >> Gary Duzan >> FIS - GT.M Core >> >> The information contained in this message is proprietary and/or confidential. If you are not the intended recipient, please: (i) delete the message and all copies; (ii) do not disclose, distribute or use the message in any manner; and (iii) notify the sender immediately. In addition, please be aware that any message addressed to our domain is subject to archiving and review by persons other than the intended recipient. Thank you. >> >> _______________________________________________ >> tipc-discussion mailing list >> tip...@li...<mailto:tip...@li...> >> https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.sourceforge.net%2Flists%2Flistinfo%2Ftipc-discussion&data=04%7C01%7Cgary.duzan%40fisglobal.com%7C68e57e4f23b548f804b008d9262a7be5%7Ce3ff91d834c84b15a0b418910a6ac575%7C0%7C0%7C637582786180071891%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=VNEWWSJ4euG3QutIj3MXmPpOI56vPasJtcdFYM61ToU%3D&reserved=0<https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.sourceforge.net%2Flists%2Flistinfo%2Ftipc-discussion&data=04%7C01%7Cgary.duzan%40fisglobal.com%7C657296242b444228769108d9269ab00d%7Ce3ff91d834c84b15a0b418910a6ac575%7C0%7C0%7C637583268094936158%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=clQl34v632%2BOtKfgEbLh4H6xQFAxctI%2FS9GyO12Jync%3D&reserved=0> The information contained in this message is proprietary and/or confidential. If you are not the intended recipient, please: (i) delete the message and all copies; (ii) do not disclose, distribute or use the message in any manner; and (iii) notify the sender immediately. In addition, please be aware that any message addressed to our domain is subject to archiving and review by persons other than the intended recipient. Thank you. The information contained in this message is proprietary and/or confidential. If you are not the intended recipient, please: (i) delete the message and all copies; (ii) do not disclose, distribute or use the message in any manner; and (iii) notify the sender immediately. In addition, please be aware that any message addressed to our domain is subject to archiving and review by persons other than the intended recipient. Thank you. |
From: Jon M. <jm...@re...> - 2021-06-03 14:20:11
|
On 6/3/21 9:55 AM, Duzan, Gary D wrote: > > Contrary to UDP vs TCP, TIPC is only one protocol, so you > cannot bind the same service type/instance to different socket types > without risking problems. > The SYN bit will prevent a connection from being established with a > socket of the wrong type, but it will not stop the binding table > lookup > from selecting such a socket, since it knows nothing about socket > types. > I am actually surprised that this works even on the non-Debian > machines. > Maybe the secondary lookup mechanism is saving the day. > > This could of course be fixed without too much effort, but the > question > is if that is the right way to go. At least we would have to carefully > consider possible compatibility issues. > > Would it be a problem for you to just choose different service > types/ranges? > > ///jon > > > Interesting. So, accidents of implementation aside, you would > expect DGRAM, STREAM, and SEQPACKET sockets to be able to communicate > with each other? I'm not using SEQPACKET (yet), but it sounds like one > might be able to connect() between STREAM and SEQPACKET sockets today, > though presumably the way that send()s on the STREAM side get > "packetized" on the SEQPACKET end is not clearly defined. If this is > what is intended, I think some clarification in the documentation > would be helpful, as would testing to ensure expected behavior across > socket types. It is true that bringing TCP/UDP thinking to TIPC is > going to lead to errors, but it isn't going to be rare. If it is > straightforward to keep communication between different socket types > separate, then I suspect that would produce the least surprise. > > As for my own project, logically I'm dealing with the same service > either way; it is only the DGRAM message size limit that is pushing > the STREAM fallback option. I could require a separate service type > for the STREAM implementation, but that is another piece of > configuration which I'd rather avoid, if practical. I expect I will > keep it the way I have it while I'm in experimental mode, as it seems > to be working as I expect on the target kernels, but I'll need to be > sure I'm on solid ground if we ever manage to get the thing to production. > > Thanks. > > Gary Duzan > FIS - GT.M Core > I can frankly admit that this is a problem we never came across before, and that we haven´t really spent much time considering this kind of issues. You are right that this needs to be at least clarified in the documentation. But we should also run some internal tests to identify what is working and what is not, and then decide what is the correct approach. Thank you for making us aware of this. ///jon > > ------------------------------------------------------------------------ > *From:* Jon Maloy <jm...@re...> > *Sent:* Wednesday, June 2, 2021 8:56 PM > *To:* Xin Long <luc...@gm...>; Duzan, Gary D > <Gar...@fi...> > *Cc:* tip...@li... > <tip...@li...> > *Subject:* EXTERNAL: Re: [tipc-discussion] DGRAM/STREAM Crossover on > Debian? > CAUTION: This email originated from outside of the company. Do not > click links or open attachments unless you recognize the sender and > know the content is safe. > > > > On 6/2/21 4:14 PM, Xin Long wrote: > > On Wed, May 26, 2021 at 11:38 AM Duzan, Gary D via tipc-discussion > > <tip...@li...> wrote: > >> I'm in the process of enhancing a TIPC DGRAM-based RPC-ish > service to include TIPC STREAM transport for larger messages. To > simplify configuration, I have the server process(es) bind() the same > type/range for both DGRAM and STREAM sockets (poll()ing to see which > have incoming requests), then choose which to use on the client. This > seems to work on most of my Linux systems (RHEL-8, Ubuntu 20.04/21.04, > Fedora 34, Debian 11), but on my Debian 10 system (4.19.181-1 kernel) > I am seeing messages from a DGRAM client appearing on an accept()ed > STREAM socket on the server. I have confirmed that the client is not > sending anything on a STREAM socket, and the message received by the > server is formatted as a DGRAM message (without the message framing > header). > > When you start two scoket on the server: DGRAM and STREAM, in the > > client's nametable there will be 2 sockets with different portids: > > # tipc nametable show > > Type Lower Upper Scope Port Node > > 18888 17 17 cluster 4063960415 > > 18888 17 17 cluster 1106254118 > > > > When the client calls sendmsg()/connect() to send msg to the server, > > it will choose one of them by the rule of "local, closest-first or > > round-robin". > > The client doesn't know if the peer is a DGRAM socket or STREAM > > socket. In your case, it should go round-robin. > > > > Without this commit: > > > > commit 25b9221b959483f17c2964d0922869e16caa86b5 > > Author: Jon Maloy <jon...@er...> > > Date: Fri Sep 28 20:23:21 2018 +0200 > > > > tipc: add SYN bit to connection setup messages > > > > The SYN msg for STREAM is no different from the DATA msg for DGRAM. > > that's what you're seeing in kernel-4.19 > > > >> Debian isn't a target platform for production, so I don't need > a specific fix, but it is still surprising and a bit disturbing. Was > this a known problem with the 4.19 kernel? Are there particular > reasons why using this pattern is a bad idea? > > I think it may not work as expected if you create 2 different types of > > TIPC sockets binding to the same address. > > At least on the latest kernel, once the DGRAM client chooses the > > STREAM socket, the DATA msg will be dropped. > > > > Thanks. > Exactly. Contrary to UDP vs TCP, TIPC is only one protocol, so you > cannot bind the same service type/instance to different socket types > without risking problems. > The SYN bit will prevent a connection from being established with a > socket of the wrong type, but it will not stop the binding table lookup > from selecting such a socket, since it knows nothing about socket types. > I am actually surprised that this works even on the non-Debian machines. > Maybe the secondary lookup mechanism is saving the day. > > This could of course be fixed without too much effort, but the question > is if that is the right way to go. At least we would have to carefully > consider possible compatibility issues. > > Would it be a problem for you to just choose different service > types/ranges? > > ///jon > > > > >> Thanks. > >> > >> Gary Duzan > >> FIS - GT.M Core > >> > >> The information contained in this message is proprietary and/or > confidential. If you are not the intended recipient, please: (i) > delete the message and all copies; (ii) do not disclose, distribute or > use the message in any manner; and (iii) notify the sender > immediately. In addition, please be aware that any message addressed > to our domain is subject to archiving and review by persons other than > the intended recipient. Thank you. > >> > >> _______________________________________________ > >> tipc-discussion mailing list > >> tip...@li... > >> > https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.sourceforge.net%2Flists%2Flistinfo%2Ftipc-discussion&data=04%7C01%7Cgary.duzan%40fisglobal.com%7C68e57e4f23b548f804b008d9262a7be5%7Ce3ff91d834c84b15a0b418910a6ac575%7C0%7C0%7C637582786180071891%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=VNEWWSJ4euG3QutIj3MXmPpOI56vPasJtcdFYM61ToU%3D&reserved=0 > <https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.sourceforge.net%2Flists%2Flistinfo%2Ftipc-discussion&data=04%7C01%7Cgary.duzan%40fisglobal.com%7C68e57e4f23b548f804b008d9262a7be5%7Ce3ff91d834c84b15a0b418910a6ac575%7C0%7C0%7C637582786180071891%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=VNEWWSJ4euG3QutIj3MXmPpOI56vPasJtcdFYM61ToU%3D&reserved=0> > > The information contained in this message is proprietary and/or > confidential. If you are not the intended recipient, please: (i) > delete the message and all copies; (ii) do not disclose, distribute or > use the message in any manner; and (iii) notify the sender > immediately. In addition, please be aware that any message addressed > to our domain is subject to archiving and review by persons other than > the intended recipient. Thank you. |
From: Jon M. <jm...@re...> - 2021-06-03 13:08:14
|
On 6/2/21 10:26 PM, Menglong Dong wrote: > Hello Maloy, > > On Thu, Jun 3, 2021 at 3:50 AM Jon Maloy <jm...@re...> wrote: > > [...] >> Hi Dong, >> The value is based on empiric knowledge. >> When I determined it I made a small loop in a kernel driver where I >> allocated skbs (using tipc_buf_acquire) with an increasing size >> (incremented with 1 each iteration), and then printed out the >> corresponding truesize. >> >> That gave the value we are using now. >> >> Now, when re-running the test I get a different value, so something has >> obviously changed since then. >> >> [ 1622.158586] skb(513) =>> truesize 2304, prev skb(512) => prev >> truesize 1280 >> [ 1622.162074] skb(1537) =>> truesize 4352, prev skb(1536) => prev >> truesize 2304 >> [ 1622.165984] skb(3585) =>> truesize 8448, prev skb(3584) => prev >> truesize 4352 >> >> As you can see, the optimal value now, for an x86_64 machine compiled >> with gcc, is 3584 bytes, not 3744. > I'm not sure if this is a perfect way to determine the value of FB_MTU. > If 'struct skb_shared_info' changes, this value seems should change, > too. > > How about we make it this: > > #define FB_MTU (PAGE_SIZE - \ > SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) - \ > SKB_DATA_ALIGN(BUF_HEADROOM + BUF_TAILROOM + 3 + \ > MAX_H_SIZ)) > > The value 'BUF_HEADROOM + BUF_TAILROOM + 3' come from 'tipc_buf_acquire()': > > #ifdef CONFIG_TIPC_CRYPTO > unsigned int buf_size = (BUF_HEADROOM + size + BUF_TAILROOM + 3) & ~3u; > #else > unsigned int buf_size = (BUF_HEADROOM + size + 3) & ~3u; > #endif > > Is it a good idea? Yes, I think that makes sense. I was always aware of the "fragility" of my approach, -this one looks more future safe. ///jon > > Thanks > Menglong Dong > |
From: Jon M. <jm...@re...> - 2021-06-03 00:57:09
|
On 6/2/21 4:14 PM, Xin Long wrote: > On Wed, May 26, 2021 at 11:38 AM Duzan, Gary D via tipc-discussion > <tip...@li...> wrote: >> I'm in the process of enhancing a TIPC DGRAM-based RPC-ish service to include TIPC STREAM transport for larger messages. To simplify configuration, I have the server process(es) bind() the same type/range for both DGRAM and STREAM sockets (poll()ing to see which have incoming requests), then choose which to use on the client. This seems to work on most of my Linux systems (RHEL-8, Ubuntu 20.04/21.04, Fedora 34, Debian 11), but on my Debian 10 system (4.19.181-1 kernel) I am seeing messages from a DGRAM client appearing on an accept()ed STREAM socket on the server. I have confirmed that the client is not sending anything on a STREAM socket, and the message received by the server is formatted as a DGRAM message (without the message framing header). > When you start two scoket on the server: DGRAM and STREAM, in the > client's nametable there will be 2 sockets with different portids: > # tipc nametable show > Type Lower Upper Scope Port Node > 18888 17 17 cluster 4063960415 > 18888 17 17 cluster 1106254118 > > When the client calls sendmsg()/connect() to send msg to the server, > it will choose one of them by the rule of "local, closest-first or > round-robin". > The client doesn't know if the peer is a DGRAM socket or STREAM > socket. In your case, it should go round-robin. > > Without this commit: > > commit 25b9221b959483f17c2964d0922869e16caa86b5 > Author: Jon Maloy <jon...@er...> > Date: Fri Sep 28 20:23:21 2018 +0200 > > tipc: add SYN bit to connection setup messages > > The SYN msg for STREAM is no different from the DATA msg for DGRAM. > that's what you're seeing in kernel-4.19 > >> Debian isn't a target platform for production, so I don't need a specific fix, but it is still surprising and a bit disturbing. Was this a known problem with the 4.19 kernel? Are there particular reasons why using this pattern is a bad idea? > I think it may not work as expected if you create 2 different types of > TIPC sockets binding to the same address. > At least on the latest kernel, once the DGRAM client chooses the > STREAM socket, the DATA msg will be dropped. > > Thanks. Exactly. Contrary to UDP vs TCP, TIPC is only one protocol, so you cannot bind the same service type/instance to different socket types without risking problems. The SYN bit will prevent a connection from being established with a socket of the wrong type, but it will not stop the binding table lookup from selecting such a socket, since it knows nothing about socket types. I am actually surprised that this works even on the non-Debian machines. Maybe the secondary lookup mechanism is saving the day. This could of course be fixed without too much effort, but the question is if that is the right way to go. At least we would have to carefully consider possible compatibility issues. Would it be a problem for you to just choose different service types/ranges? ///jon > >> Thanks. >> >> Gary Duzan >> FIS - GT.M Core >> >> The information contained in this message is proprietary and/or confidential. If you are not the intended recipient, please: (i) delete the message and all copies; (ii) do not disclose, distribute or use the message in any manner; and (iii) notify the sender immediately. In addition, please be aware that any message addressed to our domain is subject to archiving and review by persons other than the intended recipient. Thank you. >> >> _______________________________________________ >> tipc-discussion mailing list >> tip...@li... >> https://lists.sourceforge.net/lists/listinfo/tipc-discussion |
From: Xin L. <luc...@gm...> - 2021-06-02 20:15:13
|
On Wed, May 26, 2021 at 11:38 AM Duzan, Gary D via tipc-discussion <tip...@li...> wrote: > > I'm in the process of enhancing a TIPC DGRAM-based RPC-ish service to include TIPC STREAM transport for larger messages. To simplify configuration, I have the server process(es) bind() the same type/range for both DGRAM and STREAM sockets (poll()ing to see which have incoming requests), then choose which to use on the client. This seems to work on most of my Linux systems (RHEL-8, Ubuntu 20.04/21.04, Fedora 34, Debian 11), but on my Debian 10 system (4.19.181-1 kernel) I am seeing messages from a DGRAM client appearing on an accept()ed STREAM socket on the server. I have confirmed that the client is not sending anything on a STREAM socket, and the message received by the server is formatted as a DGRAM message (without the message framing header). When you start two scoket on the server: DGRAM and STREAM, in the client's nametable there will be 2 sockets with different portids: # tipc nametable show Type Lower Upper Scope Port Node 18888 17 17 cluster 4063960415 18888 17 17 cluster 1106254118 When the client calls sendmsg()/connect() to send msg to the server, it will choose one of them by the rule of "local, closest-first or round-robin". The client doesn't know if the peer is a DGRAM socket or STREAM socket. In your case, it should go round-robin. Without this commit: commit 25b9221b959483f17c2964d0922869e16caa86b5 Author: Jon Maloy <jon...@er...> Date: Fri Sep 28 20:23:21 2018 +0200 tipc: add SYN bit to connection setup messages The SYN msg for STREAM is no different from the DATA msg for DGRAM. that's what you're seeing in kernel-4.19 > > Debian isn't a target platform for production, so I don't need a specific fix, but it is still surprising and a bit disturbing. Was this a known problem with the 4.19 kernel? Are there particular reasons why using this pattern is a bad idea? I think it may not work as expected if you create 2 different types of TIPC sockets binding to the same address. At least on the latest kernel, once the DGRAM client chooses the STREAM socket, the DATA msg will be dropped. Thanks. > > Thanks. > > Gary Duzan > FIS - GT.M Core > > The information contained in this message is proprietary and/or confidential. If you are not the intended recipient, please: (i) delete the message and all copies; (ii) do not disclose, distribute or use the message in any manner; and (iii) notify the sender immediately. In addition, please be aware that any message addressed to our domain is subject to archiving and review by persons other than the intended recipient. Thank you. > > _______________________________________________ > tipc-discussion mailing list > tip...@li... > https://lists.sourceforge.net/lists/listinfo/tipc-discussion |
From: Jon M. <jm...@re...> - 2021-06-02 19:50:48
|
On 6/1/21 10:18 AM, Menglong Dong wrote: > Hello! > > I have a question about the value of FB_MTU in tipc, how does the '3744' form? > I notice that it is used in 'tipc_msg_build()' when memory allocation > fails, and it > tries to fall back to a smaller MTU to avoid unnecessary sending failures. > > However, the size of the data allocated will be more than 4096 when FB_MTU > is 3744. I did a rough calculation, the size of data will more than 4200: > > (FB_MTU + TIPCHDR + BUF_HEADROOM + sizeof(struct skb_shared_info)) > > Therefore, 8192 will be allocated from slab, and about 4000 of it will > not be used. > > FB_MTU is used for low memory, and I think eating two pages will make it worse. > Do I miss something? > > Thanks! > Menglong Dong > Hi Dong, The value is based on empiric knowledge. When I determined it I made a small loop in a kernel driver where I allocated skbs (using tipc_buf_acquire) with an increasing size (incremented with 1 each iteration), and then printed out the corresponding truesize. That gave the value we are using now. Now, when re-running the test I get a different value, so something has obviously changed since then. [ 1622.158586] skb(513) =>> truesize 2304, prev skb(512) => prev truesize 1280 [ 1622.162074] skb(1537) =>> truesize 4352, prev skb(1536) => prev truesize 2304 [ 1622.165984] skb(3585) =>> truesize 8448, prev skb(3584) => prev truesize 4352 As you can see, the optimal value now, for an x86_64 machine compiled with gcc, is 3584 bytes, not 3744. Feel free to post a patch for this if you want to. Thanks ///Jon Maloy |
From: Duzan, G. D <Gar...@fi...> - 2021-05-26 15:37:42
|
I'm in the process of enhancing a TIPC DGRAM-based RPC-ish service to include TIPC STREAM transport for larger messages. To simplify configuration, I have the server process(es) bind() the same type/range for both DGRAM and STREAM sockets (poll()ing to see which have incoming requests), then choose which to use on the client. This seems to work on most of my Linux systems (RHEL-8, Ubuntu 20.04/21.04, Fedora 34, Debian 11), but on my Debian 10 system (4.19.181-1 kernel) I am seeing messages from a DGRAM client appearing on an accept()ed STREAM socket on the server. I have confirmed that the client is not sending anything on a STREAM socket, and the message received by the server is formatted as a DGRAM message (without the message framing header). Debian isn't a target platform for production, so I don't need a specific fix, but it is still surprising and a bit disturbing. Was this a known problem with the 4.19 kernel? Are there particular reasons why using this pattern is a bad idea? Thanks. Gary Duzan FIS - GT.M Core The information contained in this message is proprietary and/or confidential. If you are not the intended recipient, please: (i) delete the message and all copies; (ii) do not disclose, distribute or use the message in any manner; and (iii) notify the sender immediately. In addition, please be aware that any message addressed to our domain is subject to archiving and review by persons other than the intended recipient. Thank you. |
From: Xin L. <luc...@gm...> - 2021-05-18 02:09:23
|
This patch is to use "struct work_struct" for the finalize work queue instead of "struct tipc_net_work", as it can get the "net" and "addr" from tipc_net's other members and there is no need to add extra net and addr in tipc_net by defining "struct tipc_net_work". Note that it's safe to get net from tn->bcl as bcl is always released after the finalize work queue is done. Signed-off-by: Xin Long <luc...@gm...> Acked-by: Jon Maloy <jm...@re...> --- net/tipc/core.c | 4 ++-- net/tipc/core.h | 8 +------- net/tipc/discover.c | 4 ++-- net/tipc/link.c | 5 +++++ net/tipc/link.h | 1 + net/tipc/net.c | 15 +++------------ 6 files changed, 14 insertions(+), 23 deletions(-) diff --git a/net/tipc/core.c b/net/tipc/core.c index 72f3ac7..3f4542e 100644 --- a/net/tipc/core.c +++ b/net/tipc/core.c @@ -60,7 +60,7 @@ static int __net_init tipc_init_net(struct net *net) tn->trial_addr = 0; tn->addr_trial_end = 0; tn->capabilities = TIPC_NODE_CAPABILITIES; - INIT_WORK(&tn->final_work.work, tipc_net_finalize_work); + INIT_WORK(&tn->work, tipc_net_finalize_work); memset(tn->node_id, 0, sizeof(tn->node_id)); memset(tn->node_id_string, 0, sizeof(tn->node_id_string)); tn->mon_threshold = TIPC_DEF_MON_THRESHOLD; @@ -110,7 +110,7 @@ static void __net_exit tipc_exit_net(struct net *net) tipc_detach_loopback(net); /* Make sure the tipc_net_finalize_work() finished */ - cancel_work_sync(&tn->final_work.work); + cancel_work_sync(&tn->work); tipc_net_stop(net); tipc_bcast_stop(net); diff --git a/net/tipc/core.h b/net/tipc/core.h index 5741ae4..0a3f7a7 100644 --- a/net/tipc/core.h +++ b/net/tipc/core.h @@ -91,12 +91,6 @@ extern unsigned int tipc_net_id __read_mostly; extern int sysctl_tipc_rmem[3] __read_mostly; extern int sysctl_tipc_named_timeout __read_mostly; -struct tipc_net_work { - struct work_struct work; - struct net *net; - u32 addr; -}; - struct tipc_net { u8 node_id[NODE_ID_LEN]; u32 node_addr; @@ -148,7 +142,7 @@ struct tipc_net { struct tipc_crypto *crypto_tx; #endif /* Work item for net finalize */ - struct tipc_net_work final_work; + struct work_struct work; /* The numbers of work queues in schedule */ atomic_t wq_count; }; diff --git a/net/tipc/discover.c b/net/tipc/discover.c index 5380f60..da69e1a 100644 --- a/net/tipc/discover.c +++ b/net/tipc/discover.c @@ -168,7 +168,7 @@ static bool tipc_disc_addr_trial_msg(struct tipc_discoverer *d, /* Apply trial address if we just left trial period */ if (!trial && !self) { - tipc_sched_net_finalize(net, tn->trial_addr); + schedule_work(&tn->work); msg_set_prevnode(buf_msg(d->skb), tn->trial_addr); msg_set_type(buf_msg(d->skb), DSC_REQ_MSG); } @@ -308,7 +308,7 @@ static void tipc_disc_timeout(struct timer_list *t) if (!time_before(jiffies, tn->addr_trial_end) && !tipc_own_addr(net)) { mod_timer(&d->timer, jiffies + TIPC_DISC_INIT); spin_unlock_bh(&d->lock); - tipc_sched_net_finalize(net, tn->trial_addr); + schedule_work(&tn->work); return; } diff --git a/net/tipc/link.c b/net/tipc/link.c index 1151092..c44b4bf 100644 --- a/net/tipc/link.c +++ b/net/tipc/link.c @@ -372,6 +372,11 @@ char tipc_link_plane(struct tipc_link *l) return l->net_plane; } +struct net *tipc_link_net(struct tipc_link *l) +{ + return l->net; +} + void tipc_link_update_caps(struct tipc_link *l, u16 capabilities) { l->peer_caps = capabilities; diff --git a/net/tipc/link.h b/net/tipc/link.h index fc07232..a16f401 100644 --- a/net/tipc/link.h +++ b/net/tipc/link.h @@ -156,4 +156,5 @@ int tipc_link_bc_sync_rcv(struct tipc_link *l, struct tipc_msg *hdr, int tipc_link_bc_nack_rcv(struct tipc_link *l, struct sk_buff *skb, struct sk_buff_head *xmitq); bool tipc_link_too_silent(struct tipc_link *l); +struct net *tipc_link_net(struct tipc_link *l); #endif diff --git a/net/tipc/net.c b/net/tipc/net.c index a130195..0e95572 100644 --- a/net/tipc/net.c +++ b/net/tipc/net.c @@ -41,6 +41,7 @@ #include "socket.h" #include "node.h" #include "bcast.h" +#include "link.h" #include "netlink.h" #include "monitor.h" @@ -142,19 +143,9 @@ static void tipc_net_finalize(struct net *net, u32 addr) void tipc_net_finalize_work(struct work_struct *work) { - struct tipc_net_work *fwork; + struct tipc_net *tn = container_of(work, struct tipc_net, work); - fwork = container_of(work, struct tipc_net_work, work); - tipc_net_finalize(fwork->net, fwork->addr); -} - -void tipc_sched_net_finalize(struct net *net, u32 addr) -{ - struct tipc_net *tn = tipc_net(net); - - tn->final_work.net = net; - tn->final_work.addr = addr; - schedule_work(&tn->final_work.work); + tipc_net_finalize(tipc_link_net(tn->bcl), tn->trial_addr); } void tipc_net_stop(struct net *net) -- 2.1.0 |
From: Xin L. <luc...@gm...> - 2021-05-16 18:29:15
|
On some host, a crash could be triggered simply by repeating these commands several times: # modprobe tipc # tipc bearer enable media udp name UDP1 localip 127.0.0.1 # rmmod tipc [] BUG: unable to handle kernel paging request at ffffffffc096bb00 [] Workqueue: events 0xffffffffc096bb00 [] Call Trace: [] ? process_one_work+0x1a7/0x360 [] ? worker_thread+0x30/0x390 [] ? create_worker+0x1a0/0x1a0 [] ? kthread+0x116/0x130 [] ? kthread_flush_work_fn+0x10/0x10 [] ? ret_from_fork+0x35/0x40 When removing the TIPC module, the UDP tunnel sock will be delayed to release in a work queue as sock_release() can't be done in rtnl_lock(). If the work queue is schedule to run after the TIPC module is removed, kernel will crash as the work queue function cleanup_beareri() code no longer exists when trying to invoke it. To fix it, this patch introduce a member wq_count in tipc_net to track the numbers of work queues in schedule, and wait and exit until all work queues are done in tipc_exit_net(). Fixes: d0f91938bede ("tipc: add ip/udp media type") Reported-by: Shuang Li <sh...@re...> Signed-off-by: Xin Long <luc...@gm...> Acked-by: Jon Maloy <jm...@re...> --- net/tipc/core.c | 2 ++ net/tipc/core.h | 2 ++ net/tipc/udp_media.c | 2 ++ 3 files changed, 6 insertions(+) diff --git a/net/tipc/core.c b/net/tipc/core.c index 5cc1f03..72f3ac7 100644 --- a/net/tipc/core.c +++ b/net/tipc/core.c @@ -119,6 +119,8 @@ static void __net_exit tipc_exit_net(struct net *net) #ifdef CONFIG_TIPC_CRYPTO tipc_crypto_stop(&tipc_net(net)->crypto_tx); #endif + while (atomic_read(&tn->wq_count)) + cond_resched(); } static void __net_exit tipc_pernet_pre_exit(struct net *net) diff --git a/net/tipc/core.h b/net/tipc/core.h index 03de7b2..5741ae4 100644 --- a/net/tipc/core.h +++ b/net/tipc/core.h @@ -149,6 +149,8 @@ struct tipc_net { #endif /* Work item for net finalize */ struct tipc_net_work final_work; + /* The numbers of work queues in schedule */ + atomic_t wq_count; }; static inline struct tipc_net *tipc_net(struct net *net) diff --git a/net/tipc/udp_media.c b/net/tipc/udp_media.c index e556d2c..c2bb818 100644 --- a/net/tipc/udp_media.c +++ b/net/tipc/udp_media.c @@ -814,6 +814,7 @@ static void cleanup_bearer(struct work_struct *work) kfree_rcu(rcast, rcu); } + atomic_dec(&tipc_net(sock_net(ub->ubsock->sk))->wq_count); dst_cache_destroy(&ub->rcast.dst_cache); udp_tunnel_sock_release(ub->ubsock); synchronize_net(); @@ -834,6 +835,7 @@ static void tipc_udp_disable(struct tipc_bearer *b) RCU_INIT_POINTER(ub->bearer, NULL); /* sock_release need to be done outside of rtnl lock */ + atomic_inc(&tipc_net(sock_net(ub->ubsock->sk))->wq_count); INIT_WORK(&ub->work, cleanup_bearer); schedule_work(&ub->work); } -- 2.1.0 |
From: Jon M. <jm...@re...> - 2021-05-15 22:58:10
|
On 5/7/21 3:57 PM, Xin Long wrote: > It's not a good idea to append the frag skb to a skb's frag_list if > the frag_list already has skbs from elsewhere, such as this skb was > created by pskb_copy() where the frag_list was cloned (all the skbs > in it were skb_get'ed) and shared by multiple skbs. > > However, the new appended frag skb should have been only seen by the > current skb. Otherwise, it will cause use after free crashes as this > appended frag skb are seen by multiple skbs but it only got skb_get > called once. > > The same thing happens with a skb updated by pskb_may_pull() with a > skb_cloned skb. Li Shuang has reported quite a few crashes caused > by this when doing testing over macvlan devices: > > [] kernel BUG at net/core/skbuff.c:1970! > [] Call Trace: > [] skb_clone+0x4d/0xb0 > [] macvlan_broadcast+0xd8/0x160 [macvlan] > [] macvlan_process_broadcast+0x148/0x150 [macvlan] > [] process_one_work+0x1a7/0x360 > [] worker_thread+0x30/0x390 > > [] kernel BUG at mm/usercopy.c:102! > [] Call Trace: > [] __check_heap_object+0xd3/0x100 > [] __check_object_size+0xff/0x16b > [] simple_copy_to_iter+0x1c/0x30 > [] __skb_datagram_iter+0x7d/0x310 > [] __skb_datagram_iter+0x2a5/0x310 > [] skb_copy_datagram_iter+0x3b/0x90 > [] tipc_recvmsg+0x14a/0x3a0 [tipc] > [] ____sys_recvmsg+0x91/0x150 > [] ___sys_recvmsg+0x7b/0xc0 > > [] kernel BUG at mm/slub.c:305! > [] Call Trace: > [] <IRQ> > [] kmem_cache_free+0x3ff/0x400 > [] __netif_receive_skb_core+0x12c/0xc40 > [] ? kmem_cache_alloc+0x12e/0x270 > [] netif_receive_skb_internal+0x3d/0xb0 > [] ? get_rx_page_info+0x8e/0xa0 [be2net] > [] be_poll+0x6ef/0xd00 [be2net] > [] ? irq_exit+0x4f/0x100 > [] net_rx_action+0x149/0x3b0 > > ... > > This patch is to fix it by linearizing the head skb if it has frag_list > set in tipc_buf_append(). Note that we choose to do this before calling > skb_unshare(), as __skb_linearize() will avoid skb_copy(). Also, we can > not just drop the frag_list either as the early time. > > Fixes: 45c8b7b175ce ("tipc: allow non-linear first fragment buffer") > Reported-by: Li Shuang <sh...@re...> > Signed-off-by: Xin Long <luc...@gm...> > --- > net/tipc/msg.c | 9 ++------- > 1 file changed, 2 insertions(+), 7 deletions(-) > > diff --git a/net/tipc/msg.c b/net/tipc/msg.c > index 3f0a253..ce6ab54 100644 > --- a/net/tipc/msg.c > +++ b/net/tipc/msg.c > @@ -149,18 +149,13 @@ int tipc_buf_append(struct sk_buff **headbuf, struct sk_buff **buf) > if (unlikely(head)) > goto err; > *buf = NULL; > + if (skb_has_frag_list(frag) && __skb_linearize(frag)) > + goto err; > frag = skb_unshare(frag, GFP_ATOMIC); > if (unlikely(!frag)) > goto err; > head = *headbuf = frag; > TIPC_SKB_CB(head)->tail = NULL; > - if (skb_is_nonlinear(head)) { > - skb_walk_frags(head, tail) { > - TIPC_SKB_CB(head)->tail = tail; > - } > - } else { > - skb_frag_list_init(head); > - } > return 0; > } > Acked-by: Jon Maloy <jm...@re...> |
From: Jon M. <jm...@re...> - 2021-05-15 22:52:52
|
On 5/14/21 2:54 PM, Xin Long wrote: > This patch is to use "struct work_struct" for the finalize work queue > instead of "struct tipc_net_work", as it can get the "net" and "addr" > from tipc_net's other members and there is no need to add extra net > and addr in tipc_net by defining "struct tipc_net_work". > > Note that it's safe to get net from tn->bcl as bcl is always released > after the finalize work queue is done. > > Signed-off-by: Xin Long <luc...@gm...> > --- > net/tipc/core.c | 4 ++-- > net/tipc/core.h | 8 +------- > net/tipc/discover.c | 4 ++-- > net/tipc/link.c | 5 +++++ > net/tipc/link.h | 1 + > net/tipc/net.c | 15 +++------------ > 6 files changed, 14 insertions(+), 23 deletions(-) > > diff --git a/net/tipc/core.c b/net/tipc/core.c > index 72f3ac7..3f4542e 100644 > --- a/net/tipc/core.c > +++ b/net/tipc/core.c > @@ -60,7 +60,7 @@ static int __net_init tipc_init_net(struct net *net) > tn->trial_addr = 0; > tn->addr_trial_end = 0; > tn->capabilities = TIPC_NODE_CAPABILITIES; > - INIT_WORK(&tn->final_work.work, tipc_net_finalize_work); > + INIT_WORK(&tn->work, tipc_net_finalize_work); > memset(tn->node_id, 0, sizeof(tn->node_id)); > memset(tn->node_id_string, 0, sizeof(tn->node_id_string)); > tn->mon_threshold = TIPC_DEF_MON_THRESHOLD; > @@ -110,7 +110,7 @@ static void __net_exit tipc_exit_net(struct net *net) > > tipc_detach_loopback(net); > /* Make sure the tipc_net_finalize_work() finished */ > - cancel_work_sync(&tn->final_work.work); > + cancel_work_sync(&tn->work); > tipc_net_stop(net); > > tipc_bcast_stop(net); > diff --git a/net/tipc/core.h b/net/tipc/core.h > index 5741ae4..0a3f7a7 100644 > --- a/net/tipc/core.h > +++ b/net/tipc/core.h > @@ -91,12 +91,6 @@ extern unsigned int tipc_net_id __read_mostly; > extern int sysctl_tipc_rmem[3] __read_mostly; > extern int sysctl_tipc_named_timeout __read_mostly; > > -struct tipc_net_work { > - struct work_struct work; > - struct net *net; > - u32 addr; > -}; > - > struct tipc_net { > u8 node_id[NODE_ID_LEN]; > u32 node_addr; > @@ -148,7 +142,7 @@ struct tipc_net { > struct tipc_crypto *crypto_tx; > #endif > /* Work item for net finalize */ > - struct tipc_net_work final_work; > + struct work_struct work; > /* The numbers of work queues in schedule */ > atomic_t wq_count; > }; > diff --git a/net/tipc/discover.c b/net/tipc/discover.c > index 5380f60..da69e1a 100644 > --- a/net/tipc/discover.c > +++ b/net/tipc/discover.c > @@ -168,7 +168,7 @@ static bool tipc_disc_addr_trial_msg(struct tipc_discoverer *d, > > /* Apply trial address if we just left trial period */ > if (!trial && !self) { > - tipc_sched_net_finalize(net, tn->trial_addr); > + schedule_work(&tn->work); > msg_set_prevnode(buf_msg(d->skb), tn->trial_addr); > msg_set_type(buf_msg(d->skb), DSC_REQ_MSG); > } > @@ -308,7 +308,7 @@ static void tipc_disc_timeout(struct timer_list *t) > if (!time_before(jiffies, tn->addr_trial_end) && !tipc_own_addr(net)) { > mod_timer(&d->timer, jiffies + TIPC_DISC_INIT); > spin_unlock_bh(&d->lock); > - tipc_sched_net_finalize(net, tn->trial_addr); > + schedule_work(&tn->work); > return; > } > > diff --git a/net/tipc/link.c b/net/tipc/link.c > index 1151092..c44b4bf 100644 > --- a/net/tipc/link.c > +++ b/net/tipc/link.c > @@ -372,6 +372,11 @@ char tipc_link_plane(struct tipc_link *l) > return l->net_plane; > } > > +struct net *tipc_link_net(struct tipc_link *l) > +{ > + return l->net; > +} > + > void tipc_link_update_caps(struct tipc_link *l, u16 capabilities) > { > l->peer_caps = capabilities; > diff --git a/net/tipc/link.h b/net/tipc/link.h > index fc07232..a16f401 100644 > --- a/net/tipc/link.h > +++ b/net/tipc/link.h > @@ -156,4 +156,5 @@ int tipc_link_bc_sync_rcv(struct tipc_link *l, struct tipc_msg *hdr, > int tipc_link_bc_nack_rcv(struct tipc_link *l, struct sk_buff *skb, > struct sk_buff_head *xmitq); > bool tipc_link_too_silent(struct tipc_link *l); > +struct net *tipc_link_net(struct tipc_link *l); > #endif > diff --git a/net/tipc/net.c b/net/tipc/net.c > index a130195..0e95572 100644 > --- a/net/tipc/net.c > +++ b/net/tipc/net.c > @@ -41,6 +41,7 @@ > #include "socket.h" > #include "node.h" > #include "bcast.h" > +#include "link.h" > #include "netlink.h" > #include "monitor.h" > > @@ -142,19 +143,9 @@ static void tipc_net_finalize(struct net *net, u32 addr) > > void tipc_net_finalize_work(struct work_struct *work) > { > - struct tipc_net_work *fwork; > + struct tipc_net *tn = container_of(work, struct tipc_net, work); > > - fwork = container_of(work, struct tipc_net_work, work); > - tipc_net_finalize(fwork->net, fwork->addr); > -} > - > -void tipc_sched_net_finalize(struct net *net, u32 addr) > -{ > - struct tipc_net *tn = tipc_net(net); > - > - tn->final_work.net = net; > - tn->final_work.addr = addr; > - schedule_work(&tn->final_work.work); > + tipc_net_finalize(tipc_link_net(tn->bcl), tn->trial_addr); > } > > void tipc_net_stop(struct net *net) Acked-by: Jon Maloy <jm...@re...> |
From: Jon M. <jm...@re...> - 2021-05-15 22:51:36
|
On 5/14/21 2:40 PM, Xin Long wrote: > On some host, a crash could be triggered simply by repeating these > commands several times: > > # modprobe tipc > # tipc bearer enable media udp name UDP1 localip 127.0.0.1 > # rmmod tipc > > [] BUG: unable to handle kernel paging request at ffffffffc096bb00 > [] Workqueue: events 0xffffffffc096bb00 > [] Call Trace: > [] ? process_one_work+0x1a7/0x360 > [] ? worker_thread+0x30/0x390 > [] ? create_worker+0x1a0/0x1a0 > [] ? kthread+0x116/0x130 > [] ? kthread_flush_work_fn+0x10/0x10 > [] ? ret_from_fork+0x35/0x40 > > When removing the TIPC module, the UDP tunnel sock will be delayed to > release in a work queue as sock_release() can't be done in rtnl_lock(). > If the work queue is schedule to run after the TIPC module is removed, > kernel will crash as the work queue function cleanup_beareri() code no > longer exists when trying to invoke it. > > To fix it, this patch introduce a member wq_count in tipc_net to track > the numbers of work queues in schedule, and wait and exit until all > work queues are done in tipc_exit_net(). > > Reported-by: Shuang Li <sh...@re...> > Signed-off-by: Xin Long <luc...@gm...> > --- > net/tipc/core.c | 2 ++ > net/tipc/core.h | 2 ++ > net/tipc/udp_media.c | 2 ++ > 3 files changed, 6 insertions(+) > > diff --git a/net/tipc/core.c b/net/tipc/core.c > index 5cc1f03..72f3ac7 100644 > --- a/net/tipc/core.c > +++ b/net/tipc/core.c > @@ -119,6 +119,8 @@ static void __net_exit tipc_exit_net(struct net *net) > #ifdef CONFIG_TIPC_CRYPTO > tipc_crypto_stop(&tipc_net(net)->crypto_tx); > #endif > + while (atomic_read(&tn->wq_count)) > + cond_resched(); > } > > static void __net_exit tipc_pernet_pre_exit(struct net *net) > diff --git a/net/tipc/core.h b/net/tipc/core.h > index 03de7b2..5741ae4 100644 > --- a/net/tipc/core.h > +++ b/net/tipc/core.h > @@ -149,6 +149,8 @@ struct tipc_net { > #endif > /* Work item for net finalize */ > struct tipc_net_work final_work; > + /* The numbers of work queues in schedule */ > + atomic_t wq_count; > }; > > static inline struct tipc_net *tipc_net(struct net *net) > diff --git a/net/tipc/udp_media.c b/net/tipc/udp_media.c > index e556d2c..c2bb818 100644 > --- a/net/tipc/udp_media.c > +++ b/net/tipc/udp_media.c > @@ -814,6 +814,7 @@ static void cleanup_bearer(struct work_struct *work) > kfree_rcu(rcast, rcu); > } > > + atomic_dec(&tipc_net(sock_net(ub->ubsock->sk))->wq_count); > dst_cache_destroy(&ub->rcast.dst_cache); > udp_tunnel_sock_release(ub->ubsock); > synchronize_net(); > @@ -834,6 +835,7 @@ static void tipc_udp_disable(struct tipc_bearer *b) > RCU_INIT_POINTER(ub->bearer, NULL); > > /* sock_release need to be done outside of rtnl lock */ > + atomic_inc(&tipc_net(sock_net(ub->ubsock->sk))->wq_count); > INIT_WORK(&ub->work, cleanup_bearer); > schedule_work(&ub->work); > } Acked-by: Jon Maloy <jm...@re...> |
From: Jon M. <jm...@re...> - 2021-05-15 17:07:34
|
On 5/15/21 11:58 AM, Xin Long wrote: > On Fri, May 14, 2021 at 7:18 PM Jon Maloy <jm...@re...> wrote: >> >> >> On 5/14/21 6:10 PM, pat...@ke... wrote: >>> Hello: >>> >>> This patch was applied to netdev/net.git (refs/heads/master): >>> >>> On Fri, 14 May 2021 08:23:03 +0700 you wrote: >>>> This reverts commit 6bf24dc0cc0cc43b29ba344b66d78590e687e046. >>>> Above fix is not correct and caused memory leak issue. >> I just convinced Xin (and myself) that the crash (double free) he was >> observing, and which he wanted to fix with the "tipc: fix a race in >> tipc_sk_mcast_rcv" patch was due to this bug. >> Now, realizing that this is causing a memory leak and not a double free >> I suspect there might still be an issue. >> Does anybody have a theory? > Hi Jon, I think the double free issue was due to the one I fixed in the patch > I posted: > > [PATCH net] tipc: skb_linearize the head skb when reassembling msgs > > see the changelog. That makes sense. So, just backport Hoang's patch and re-run the tests, and we'll have that one confirmed. ///jon >> ///jon >> >>>> Fixes: 6bf24dc0cc0c ("net:tipc: Fix a double free in tipc_sk_mcast_rcv") >>>> Acked-by: Jon Maloy <jm...@re...> >>>> Acked-by: Tung Nguyen <tun...@de...> >>>> Signed-off-by: Hoang Le <hoa...@de...> >>>> >>>> [...] >>> Here is the summary with links: >>> - [net] Revert "net:tipc: Fix a double free in tipc_sk_mcast_rcv" >>> https://git.kernel.org/netdev/net/c/75016891357a >>> >>> You are awesome, thank you! >>> -- >>> Deet-doot-dot, I am a bot. >>> https://korg.docs.kernel.org/patchwork/pwbot.html >>> >>> |