From: David M. <da...@da...> - 2019-05-04 04:45:08
|
From: Chris Packham <chr...@al...>
Date: Thu, 2 May 2019 15:10:04 +1200

> TLV_SET is called with a data pointer and a len parameter that tells us
> how many bytes are pointed to by data. When invoking memcpy() we need
> to be careful to only copy len bytes.
>
> Previously we would copy TLV_LENGTH(len) bytes which would copy an extra
> 4 bytes past the end of the data pointer which newer GCC versions
> complain about.
>
> In file included from test.c:17:
> In function 'TLV_SET',
> inlined from 'test' at test.c:186:5:
> /usr/include/linux/tipc_config.h:317:3:
> warning: 'memcpy' forming offset [33, 36] is out of the bounds [0, 32]
> of object 'bearer_name' with type 'char[32]' [-Warray-bounds]
> memcpy(TLV_DATA(tlv_ptr), data, tlv_len);
> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> test.c: In function 'test':
> test.c::161:10: note:
> 'bearer_name' declared here
> char bearer_name[TIPC_MAX_BEARER_NAME];
> ^~~~~~~~~~~
>
> Signed-off-by: Chris Packham <chr...@al...>

But now the pad bytes at the end are uninitialized.

The whole idea is that the encapsulating TLV object has to be rounded up
in size based upon the given 'len' for the data.
|
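The objection above can be illustrated with a small user-space sketch: copy exactly `len` payload bytes, then zero the alignment padding so that the rounded-up TLV contains no uninitialized bytes. Everything named `sketch_*` or `SKETCH_*` is a hypothetical stand-in and not the definition from `linux/tipc_config.h`; the 4-byte header and 4-byte alignment are assumptions inferred from the "extra 4 bytes" mentioned in the patch description.

```c
#include <arpa/inet.h>   /* htons */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical TLV header (assumed layout: 4-byte header, then the payload,
 * padded up to a 4-byte boundary). */
struct sketch_tlv {
    uint16_t tlv_len;   /* header + payload length, network byte order */
    uint16_t tlv_type;  /* network byte order */
};

#define SKETCH_TLV_ALIGN(n)  (((n) + 3u) & ~(size_t)3u)
#define SKETCH_TLV_LENGTH(n) (sizeof(struct sketch_tlv) + (n))
#define SKETCH_TLV_SPACE(n)  SKETCH_TLV_ALIGN(SKETCH_TLV_LENGTH(n))

/* Writes one TLV into buf and returns the padded size it occupies.
 * buf must have room for SKETCH_TLV_SPACE(len) bytes. */
static size_t sketch_tlv_set(void *buf, uint16_t type,
                             const void *data, uint16_t len)
{
    struct sketch_tlv hdr;
    unsigned char *payload;
    size_t tlv_len = SKETCH_TLV_LENGTH(len);
    size_t space = SKETCH_TLV_SPACE(len);

    assert(buf != NULL);
    payload = (unsigned char *)buf + sizeof(hdr);

    hdr.tlv_type = htons(type);
    hdr.tlv_len = htons((uint16_t)tlv_len);
    memcpy(buf, &hdr, sizeof(hdr));

    if (len && data) {
        memcpy(payload, data, len);                /* read only len bytes of data */
        memset(payload + len, 0, space - tlv_len); /* zero the pad bytes */
    }
    return space;
}
```

With a 5-byte payload the TLV occupies 12 bytes: 4 for the header, 5 for the data, and 3 zeroed pad bytes. This addresses both the compiler warning and the uninitialized-padding concern in one place.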
From: Tuong L. <tuo...@de...> - 2019-05-02 10:23:42
|
A TIPC link can temporarily fall into a "half-established" state in which only one of the link endpoints is ESTABLISHED and starts to send traffic and PROTOCOL messages, whereas the other link endpoint is not up (e.g. the network interface goes down immediately after the endpoint receives ACTIVATE_MSG).

This is a normal situation and will settle, because the link endpoint will eventually be brought down after the link tolerance time.

However, the situation becomes worse when the second link is established before the first link endpoint goes down. For example (in <1A-2A>, the digit is the node number and the letter is the interface name):

1. Both links <1A-2A>, <1B-2B> down
2. Link endpoint 2A up, but 1A still down (e.g. due to network disturbance, wrong session, etc.)
3. Link <1B-2B> up
4. Link endpoint 2A down (e.g. due to link tolerance timeout)
5. Node 2 starts failover onto link <1B-2B>

==> Node 1 never starts link failover.

When the "half-failover" situation happens, two consequences have been observed:

a) The peer link/node gets stuck in FAILINGOVER state;
b) Traffic or user messages that the peer node is trying to fail over onto the second link can be partially or completely dropped by this node.

Consequence a) was solved by commit c140eb166d68 ("tipc: fix failover problem"), but that commit did not cover b). The reason is that the tunnel link endpoint has never been prepared for a failover, so 'l->drop_point' (and the other data) is not set correctly. When a TUNNEL_MSG from the peer node arrives on the link, the message is either dropped (treated as a duplicate) or processed, depending on the inner message's seqno and the current 'l->drop_point' value. At this early stage, the traffic messages from the peer are likely to be NAME_DISTRIBUTORs, which means some name table entries will be missing on this node forever!

This commit resolves the issue by starting the FAILOVER process on this node as well. Another benefit of this solution is that it ensures the link will not be re-established until the failover ends.

Acked-by: Jon Maloy <jon...@er...>
Signed-off-by: Tuong Lien <tuo...@de...>
---
 net/tipc/link.c | 35 +++++++++++++++++++++++++++++++++++
 net/tipc/link.h |  2 ++
 net/tipc/node.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++-------
 3 files changed, 84 insertions(+), 7 deletions(-)

diff --git a/net/tipc/link.c b/net/tipc/link.c
index 1c514b64a0a9..f5cd986e1e50 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1705,6 +1705,41 @@ void tipc_link_tnl_prepare(struct tipc_link *l, struct tipc_link *tnl,
 	}
 }
 
+/**
+ * tipc_link_failover_prepare() - prepare tnl for link failover
+ *
+ * This is a special version of the precursor - tipc_link_tnl_prepare(),
+ * see the tipc_node_link_failover() for details
+ *
+ * @l: failover link
+ * @tnl: tunnel link
+ * @xmitq: queue for messages to be xmited
+ */
+void tipc_link_failover_prepare(struct tipc_link *l, struct tipc_link *tnl,
+				struct sk_buff_head *xmitq)
+{
+	struct sk_buff_head *fdefq = &tnl->failover_deferdq;
+
+	tipc_link_create_dummy_tnl_msg(tnl, xmitq);
+
+	/* This failover link enpoint was never established before,
+	 * so it has not received anything from peer.
+	 * Otherwise, it must be a normal failover situation or the
+	 * node has entered SELF_DOWN_PEER_LEAVING and both peer nodes
+	 * would have to start over from scratch instead.
+	 */
+	WARN_ON(l && tipc_link_is_up(l));
+	tnl->drop_point = 1;
+	tnl->failover_reasm_skb = NULL;
+
+	/* Initiate the link's failover deferdq */
+	if (unlikely(!skb_queue_empty(fdefq))) {
+		pr_warn("Link failover deferdq not empty: %d!\n",
+			skb_queue_len(fdefq));
+		__skb_queue_purge(fdefq);
+	}
+}
+
 /* tipc_link_validate_msg(): validate message against current link state
  * Returns true if message should be accepted, otherwise false
  */
diff --git a/net/tipc/link.h b/net/tipc/link.h
index 8439e0ee53a8..adcad65e761c 100644
--- a/net/tipc/link.h
+++ b/net/tipc/link.h
@@ -90,6 +90,8 @@ void tipc_link_tnl_prepare(struct tipc_link *l, struct tipc_link *tnl,
 			   int mtyp, struct sk_buff_head *xmitq);
 void tipc_link_create_dummy_tnl_msg(struct tipc_link *tnl,
 				    struct sk_buff_head *xmitq);
+void tipc_link_failover_prepare(struct tipc_link *l, struct tipc_link *tnl,
+				struct sk_buff_head *xmitq);
 void tipc_link_build_reset_msg(struct tipc_link *l, struct sk_buff_head *xmitq);
 int tipc_link_fsm_evt(struct tipc_link *l, int evt);
 bool tipc_link_is_up(struct tipc_link *l);
diff --git a/net/tipc/node.c b/net/tipc/node.c
index 0eb1bf850219..9e106d3ed187 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -714,7 +714,6 @@ static void __tipc_node_link_up(struct tipc_node *n, int bearer_id,
 		*slot0 = bearer_id;
 		*slot1 = bearer_id;
 		tipc_node_fsm_evt(n, SELF_ESTABL_CONTACT_EVT);
-		n->failover_sent = false;
 		n->action_flags |= TIPC_NOTIFY_NODE_UP;
 		tipc_link_set_active(nl, true);
 		tipc_bcast_add_peer(n->net, nl, xmitq);
@@ -757,6 +756,45 @@ static void tipc_node_link_up(struct tipc_node *n, int bearer_id,
 }
 
 /**
+ * tipc_node_link_failover() - start failover in case "half-failover"
+ *
+ * This function is only called in a very special situation where link
+ * failover can be already started on peer node but not on this node.
+ * This can happen when e.g.
+ *   1. Both links <1A-2A>, <1B-2B> down
+ *   2. Link endpoint 2A up, but 1A still down (e.g. due to network
+ *      disturbance, wrong session, etc.)
+ *   3. Link <1B-2B> up
+ *   4. Link endpoint 2A down (e.g. due to link tolerance timeout)
+ *   5. Node B starts failover onto link <1B-2B>
+ *
+ *   ==> Node A does never start link/node failover!
+ *
+ * @n: tipc node structure
+ * @l: link peer endpoint failingover (- can be NULL)
+ * @tnl: tunnel link
+ * @xmitq: queue for messages to be xmited on tnl link later
+ */
+static void tipc_node_link_failover(struct tipc_node *n, struct tipc_link *l,
+				    struct tipc_link *tnl,
+				    struct sk_buff_head *xmitq)
+{
+	/* Avoid to be "self-failover" that can never end */
+	if (!tipc_link_is_up(tnl))
+		return;
+
+	tipc_link_fsm_evt(tnl, LINK_SYNCH_END_EVT);
+	tipc_node_fsm_evt(n, NODE_SYNCH_END_EVT);
+
+	n->sync_point = tipc_link_rcv_nxt(tnl) + (U16_MAX / 2 - 1);
+	tipc_link_failover_prepare(l, tnl, xmitq);
+
+	if (l)
+		tipc_link_fsm_evt(l, LINK_FAILOVER_BEGIN_EVT);
+	tipc_node_fsm_evt(n, NODE_FAILOVER_BEGIN_EVT);
+}
+
+/**
  * __tipc_node_link_down - handle loss of link
  */
 static void __tipc_node_link_down(struct tipc_node *n, int *bearer_id,
@@ -1675,14 +1713,16 @@ static bool tipc_node_check_state(struct tipc_node *n, struct sk_buff *skb,
 			tipc_skb_queue_splice_tail_init(tipc_link_inputq(pl),
 							tipc_link_inputq(l));
 		}
+
 		/* If parallel link was already down, and this happened before
-		 * the tunnel link came up, FAILOVER was never sent. Ensure that
-		 * FAILOVER is sent to get peer out of NODE_FAILINGOVER state.
+		 * the tunnel link came up, node failover was never started.
+		 * Ensure that a FAILOVER_MSG is sent to get peer out of
+		 * NODE_FAILINGOVER state, also this node must accept
+		 * TUNNEL_MSGs from peer.
 		 */
-		if (n->state != NODE_FAILINGOVER && !n->failover_sent) {
-			tipc_link_create_dummy_tnl_msg(l, xmitq);
-			n->failover_sent = true;
-		}
+		if (n->state != NODE_FAILINGOVER)
+			tipc_node_link_failover(n, pl, l, xmitq);
+
 		/* If pkts arrive out of order, use lowest calculated syncpt */
 		if (less(syncpt, n->sync_point))
 			n->sync_point = syncpt;
-- 
2.13.7
|
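The patch sets `n->sync_point` to `tipc_link_rcv_nxt(tnl) + (U16_MAX / 2 - 1)` and later lowers it with `less(syncpt, n->sync_point)`. A minimal user-space sketch of wrap-around comparison on 16-bit sequence numbers suggests why that offset behaves as a "far ahead" provisional value. The helpers `seq_less_eq()`, `seq_less()`, and `provisional_sync_point()` are hypothetical; they only assume that sequence numbers are compared modulo 2^16 with a half-range window, and the kernel's own `less()` may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: 'left' is at or before 'right' if the forward distance
 * from left to right, taken modulo 2^16, is less than half the number space. */
static bool seq_less_eq(uint16_t left, uint16_t right)
{
    return (uint16_t)(right - left) < 32768u;
}

static bool seq_less(uint16_t left, uint16_t right)
{
    return left != right && seq_less_eq(left, right);
}

/* Mirrors the expression in the patch: a value close to the farthest point
 * that still compares as "after" the next expected sequence number. */
static uint16_t provisional_sync_point(uint16_t rcv_nxt)
{
    return (uint16_t)(rcv_nxt + (UINT16_MAX / 2 - 1));
}

/* Usage example: properties that follow from the definitions above. */
static void seq_selfcheck(void)
{
    uint16_t rcv_nxt = 65000u;
    uint16_t sp = provisional_sync_point(rcv_nxt);

    assert(seq_less(65535u, 2u));   /* wrap-around: 65535 precedes 2 */
    assert(seq_less(rcv_nxt, sp));  /* the provisional point lies ahead */
    /* A sync point computed shortly after rcv_nxt still compares lower,
     * so a later "use lowest calculated syncpt" step would replace sp. */
    assert(seq_less((uint16_t)(rcv_nxt + 100u), sp));
}
```

Under these assumptions, any sync point calculated from real traffic shortly after the failover starts compares as lower than the provisional one, which matches the "use lowest calculated syncpt" comment in the last hunk.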
From: Erik H. <eri...@gm...> - 2019-04-30 21:22:07
|
Check out: http://hintjens.com/blog:70

//E

On Tue, 30 Apr 2019, 10:59 Jon Maloy, <jon...@er...> wrote:

> Hi Ivan,
>
> As a matter of fact, TIPC is already supported by ZeroMQ. This was done by
> Erik Hugne when he was working at Ericsson a few years back.
>
> As I understand it, the support is somewhat limited and does not comprise
> all features of TIPC, but I don’t know the details.
>
> I cc Erik, and trust he can give you more relevant information regarding
> this.
>
> BR
>
> ///jon
>
> *From:* Ivan Serdyuk <loc...@gm...>
> *Sent:* 28-Apr-19 16:28
> *To:* Jon Maloy <jon...@er...>
> *Subject:* Using TIPC protocol for ZeroMQ bindings, in Clojure CLR
>
> Jon, can you assist (and find any motivated colleagues, at Ericsson and/or
> 3rd party end-user company) with implementing support for the protocol for
> the ZeroMQ binding?
>
> You have been mentioned here
> https://www.landley.net/kdocs/ols/2004/ols2004v2-pages-61-70.pdf -
> so I thought that I should ping the original authors of the spec and the
> implementations. I am unsure what was the background for the ZeroMQ
> project. Only least of the bindings are supported (there are various ones,
> for various programming languages) and they are mostly covering TCP or IPC
> transports.
>
> https://github.com/clojure/clojure-clr
> - the compiler is based on the DLR. It is one of two language
> implementations (the other one is IronPython) which enforce an improvement
> of the scripting language runtime, for the CLR. Even PowerShell does not
> rely on the bleeding edge version. Currently looking forward to move to
> .NET Core 3.0 and work on the Mono's vector. David Miller is the architect.
>
> So to make the language usable for various developers and commercial
> domains we require a good ecosystem with libs and frameworks. Plus there
> are various design approaches/architectures like monolith, micro-services
> and serverless. So for integrating monolith, interaction between
> micro-services (and hybrid integrations, of some kind, via some middleware)
> - we would require ZeroMQ. At the same time - it is the only low-latency
> option for interaction of virtualized apps and/or services (whether that is
> a process level virt./containers or OS level virt.). So that would allow us to
> improve the compiler of the programming language itself and define a
> roadmap, so the project would have a future.
>
> Ivan
|
From: Jon M. <jon...@er...> - 2019-04-30 15:31:50
|
Hi Ivan,

As a matter of fact, TIPC is already supported by ZeroMQ. This was done by Erik Hugne when he was working at Ericsson a few years back.

As I understand it, the support is somewhat limited and does not comprise all features of TIPC, but I don’t know the details.

I cc Erik, and trust he can give you more relevant information regarding this.

BR
///jon

From: Ivan Serdyuk <loc...@gm...>
Sent: 28-Apr-19 16:28
To: Jon Maloy <jon...@er...>
Subject: Using TIPC protocol for ZeroMQ bindings, in Clojure CLR

Jon, can you assist (and find any motivated colleagues, at Ericsson and/or 3rd party end-user company) with implementing support for the protocol for the ZeroMQ binding?

You have been mentioned here https://www.landley.net/kdocs/ols/2004/ols2004v2-pages-61-70.pdf - so I thought that I should ping the original authors of the spec and the implementations. I am unsure what was the background for the ZeroMQ project. Only least of the bindings are supported (there are various ones, for various programming languages) and they are mostly covering TCP or IPC transports.

https://github.com/clojure/clojure-clr - the compiler is based on the DLR. It is one of two language implementations (the other one is IronPython) which enforce an improvement of the scripting language runtime, for the CLR. Even PowerShell does not rely on the bleeding edge version. Currently looking forward to move to .NET Core 3.0 and work on the Mono's vector. David Miller is the architect.

So to make the language usable for various developers and commercial domains we require a good ecosystem with libs and frameworks. Plus there are various design approaches/architectures like monolith, micro-services and serverless. So for integrating monolith, interaction between micro-services (and hybrid integrations, of some kind, via some middleware) - we would require ZeroMQ. At the same time - it is the only low-latency option for interaction of virtualized apps and/or services (whether that is a process level virt./containers or OS level virt.). So that would allow us to improve the compiler of the programming language itself and define a roadmap, so the project would have a future.

Ivan
|
From: Erik H. <eri...@gm...> - 2019-04-26 16:22:41
|
You could probably cook up an eBPF program to do this. Have a look at bpftrace; it doesn't give the full solution, but it points the right way.

//E

On Fri, 26 Apr 2019, 11:50 Peter Fröhlich, <pet...@gm...> wrote:

> I expressed myself unclearly in that second email, sorry for that.
> What I am looking for is "subscribe to changes in the service binding
> table / name table" which is probably some kind of "meta subscription"
> compared to the existing ones that fire when a specific service type
> comes or goes. The goal is to avoid having to subscribe to lots and
> lots of service types explicitly just in order to produce a log of the
> cluster state over time. (That said, I have not noticed any problems
> with subscribing to hundreds of service types, so maybe it's fine to
> just do that instead.)
>
> On Thu, Apr 25, 2019 at 2:31 PM Jon Maloy <jon...@er...> wrote:
> >
> > > -----Original Message-----
> > > From: Peter Fröhlich <pet...@gm...>
> > > Sent: 25-Apr-19 08:17
> > > To: Jon Maloy <jon...@er...>
> > > Cc: tip...@li...
> > > Subject: Re: [tipc-discussion] Subscribing for "all" service changes?
> > >
> > > On Thu, Apr 25, 2019 at 12:58 PM Jon Maloy <jon...@er...> wrote:
> > > > No, we don't have any wildcard type for the service type itself,
> > > > only for changes for a given service type.
> > > > I honestly have never thought about that, nor had any requirements for it.
> > > > But I'll give it a thought.
> > >
> > > Thank you! I don't know if it's an undue overhead to support a wildcard, but
> > > if it's straightforward to add it would certainly be convenient. As of right now
> > > we just have a few services so I can add the IDs manually to configure
> > > logging. But as we add more services it'll get easier and easier to forget to
> > > keep that in sync. Also it seems that for links and nodes I already get the
> > > desired behavior, but maybe there's something about the implementation
> > > that makes this easier for those two than for all services.
> >
> > Link and node subscriptions are just another two service types, and
> > behave exactly like the rest.
> > So, I don't understand your comment. Have I misunderstood your question?
> >
> > ///jon
>
> --
> Peter H. Fröhlich | Senior Code Monkey | https://phf.github.io/
>
> _______________________________________________
> tipc-discussion mailing list
> tip...@li...
> https://lists.sourceforge.net/lists/listinfo/tipc-discussion
|
From: Jon M. <jon...@er...> - 2019-04-26 13:50:14
|
We are routinely testing with 5000 subscriptions, and it works just fine. So, a few hundred should be no problem.

///jon

> -----Original Message-----
> From: Peter Fröhlich <pet...@gm...>
> Sent: 26-Apr-19 05:39
> To: Jon Maloy <jon...@er...>
> Cc: tip...@li...
> Subject: Re: [tipc-discussion] Subscribing for "all" service changes?
>
> I expressed myself unclearly in that second email, sorry for that.
> What I am looking for is "subscribe to changes in the service binding table /
> name table" which is probably some kind of "meta subscription"
> compared to the existing ones that fire when a specific service type comes or
> goes. The goal is to avoid having to subscribe to lots and lots of service types
> explicitly just in order to produce a log of the cluster state over time. (That
> said, I have not noticed any problems with subscribing to hundreds of service
> types, so maybe it's fine to just do that instead.)
>
> On Thu, Apr 25, 2019 at 2:31 PM Jon Maloy <jon...@er...> wrote:
> >
> > > -----Original Message-----
> > > From: Peter Fröhlich <pet...@gm...>
> > > Sent: 25-Apr-19 08:17
> > > To: Jon Maloy <jon...@er...>
> > > Cc: tip...@li...
> > > Subject: Re: [tipc-discussion] Subscribing for "all" service changes?
> > >
> > > On Thu, Apr 25, 2019 at 12:58 PM Jon Maloy <jon...@er...> wrote:
> > > > No, we don't have any wildcard type for the service type itself,
> > > > only for changes for a given service type.
> > > > I honestly have never thought about that, nor had any requirements for it.
> > > > But I'll give it a thought.
> > >
> > > Thank you! I don't know if it's an undue overhead to support a
> > > wildcard, but if it's straightforward to add it would certainly be
> > > convenient. As of right now we just have a few services so I can add
> > > the IDs manually to configure logging. But as we add more services
> > > it'll get easier and easier to forget to keep that in sync. Also it
> > > seems that for links and nodes I already get the desired behavior,
> > > but maybe there's something about the implementation that makes this
> > > easier for those two than for all services.
> >
> > Link and node subscriptions are just another two service types, and behave
> > exactly like the rest.
> > So, I don't understand your comment. Have I misunderstood your question?
> >
> > ///jon
>
> --
> Peter H. Fröhlich | Senior Code Monkey | https://phf.github.io/
|
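The workaround discussed in this thread, one subscription per service type on a single topology-service connection, could be sketched roughly as follows. This is an illustration under assumptions, not a tested recipe: it relies on the `struct tipc_subscr`, `TIPC_TOP_SRV`, `TIPC_SUB_PORTS`, and `TIPC_WAIT_FOREVER` definitions from `<linux/tipc.h>`, while the helper names `make_service_subscr()`, `topsrv_connect()`, and `subscribe_all()` and the use of the array index as a user handle are hypothetical choices made for the example.

```c
#include <assert.h>
#include <linux/tipc.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: a subscription covering every instance of one
 * service type, never timing out. 'tag' is stored in the user handle so
 * that events can be matched back to the subscription. */
static struct tipc_subscr make_service_subscr(uint32_t type, uint32_t tag)
{
    struct tipc_subscr s;

    memset(&s, 0, sizeof(s));
    s.seq.type = type;
    s.seq.lower = 0;
    s.seq.upper = ~0u;
    s.timeout = TIPC_WAIT_FOREVER;
    s.filter = TIPC_SUB_PORTS;      /* one event per matching binding */
    memcpy(s.usr_handle, &tag, sizeof(tag));
    return s;
}

/* Hypothetical helper: connect to the node's topology service.
 * Returns a socket descriptor, or -1 on failure. */
static int topsrv_connect(void)
{
    struct sockaddr_tipc srv;
    int sd = socket(AF_TIPC, SOCK_SEQPACKET, 0);

    if (sd < 0)
        return -1;
    memset(&srv, 0, sizeof(srv));
    srv.family = AF_TIPC;
    srv.addrtype = TIPC_ADDR_NAME;
    srv.addr.name.name.type = TIPC_TOP_SRV;
    srv.addr.name.name.instance = TIPC_TOP_SRV;
    if (connect(sd, (struct sockaddr *)&srv, sizeof(srv)) < 0) {
        close(sd);
        return -1;
    }
    return sd;
}

/* Hypothetical helper: send one subscription per service type in 'types'.
 * Returns 0 on success, -1 if any send fails. */
static int subscribe_all(int sd, const uint32_t *types, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        struct tipc_subscr s = make_service_subscr(types[i], (uint32_t)i);

        if (send(sd, &s, sizeof(s), 0) != (ssize_t)sizeof(s))
            return -1;
    }
    return 0;
}
```

A logger would then call `topsrv_connect()` once, pass its list of service types to `subscribe_all()`, and read `struct tipc_event` messages from the same socket. Keeping the type list in one table is one way to limit the "forgetting to keep it in sync" risk raised earlier in the thread, although it does not replace a true wildcard.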
From: Peter F. <pet...@gm...> - 2019-04-26 09:39:15
|
I expressed myself unclearly in that second email, sorry for that.
What I am looking for is "subscribe to changes in the service binding
table / name table" which is probably some kind of "meta subscription"
compared to the existing ones that fire when a specific service type
comes or goes. The goal is to avoid having to subscribe to lots and
lots of service types explicitly just in order to produce a log of the
cluster state over time. (That said, I have not noticed any problems
with subscribing to hundreds of service types, so maybe it's fine to
just do that instead.)

On Thu, Apr 25, 2019 at 2:31 PM Jon Maloy <jon...@er...> wrote:
>
> > -----Original Message-----
> > From: Peter Fröhlich <pet...@gm...>
> > Sent: 25-Apr-19 08:17
> > To: Jon Maloy <jon...@er...>
> > Cc: tip...@li...
> > Subject: Re: [tipc-discussion] Subscribing for "all" service changes?
> >
> > On Thu, Apr 25, 2019 at 12:58 PM Jon Maloy <jon...@er...> wrote:
> > > No, we don't have any wildcard type for the service type itself,
> > > only for changes for a given service type.
> > > I honestly have never thought about that, nor had any requirements for it.
> > > But I'll give it a thought.
> >
> > Thank you! I don't know if it's an undue overhead to support a wildcard, but
> > if it's straightforward to add it would certainly be convenient. As of right now
> > we just have a few services so I can add the IDs manually to configure
> > logging. But as we add more services it'll get easier and easier to forget to
> > keep that in sync. Also it seems that for links and nodes I already get the
> > desired behavior, but maybe there's something about the implementation
> > that makes this easier for those two than for all services.
>
> Link and node subscriptions are just another two service types, and behave
> exactly like the rest.
> So, I don't understand your comment. Have I misunderstood your question?
>
> ///jon

--
Peter H. Fröhlich | Senior Code Monkey | https://phf.github.io/
|
From: Jon M. <jon...@er...> - 2019-04-25 12:31:32
|
> -----Original Message-----
> From: Peter Fröhlich <pet...@gm...>
> Sent: 25-Apr-19 08:17
> To: Jon Maloy <jon...@er...>
> Cc: tip...@li...
> Subject: Re: [tipc-discussion] Subscribing for "all" service changes?
>
> On Thu, Apr 25, 2019 at 12:58 PM Jon Maloy <jon...@er...> wrote:
> > No, we don't have any wildcard type for the service type itself,
> > only for changes for a given service type.
> > I honestly have never thought about that, nor had any requirements for it.
> > But I'll give it a thought.
>
> Thank you! I don't know if it's an undue overhead to support a wildcard, but
> if it's straightforward to add it would certainly be convenient. As of right now
> we just have a few services so I can add the IDs manually to configure
> logging. But as we add more services it'll get easier and easier to forget to
> keep that in sync. Also it seems that for links and nodes I already get the
> desired behavior, but maybe there's something about the implementation
> that makes this easier for those two than for all services.

Link and node subscriptions are just another two service types, and behave exactly like the rest.
So, I don't understand your comment. Have I misunderstood your question?

///jon
|
From: Peter F. <pet...@gm...> - 2019-04-25 12:17:23
|
On Thu, Apr 25, 2019 at 12:58 PM Jon Maloy <jon...@er...> wrote:
> No, we don't have any wildcard type for the service type itself, only for changes for a given service type.
> I honestly have never thought about that, nor had any requirements for it.
> But I'll give it a thought.

Thank you! I don't know if it's an undue overhead to support a wildcard, but
if it's straightforward to add it would certainly be convenient. As of right now
we just have a few services so I can add the IDs manually to configure
logging. But as we add more services it'll get easier and easier to forget to
keep that in sync. Also it seems that for links and nodes I already get the
desired behavior, but maybe there's something about the implementation
that makes this easier for those two than for all services.
|
From: Jon M. <jon...@er...> - 2019-04-25 10:58:43
|
No, we don't have any wildcard type for the service type itself, only for changes for a given service type.
I honestly have never thought about that, nor had any requirements for it.
But I'll give it a thought.

BR
///jon

> -----Original Message-----
> From: Peter Fröhlich <pet...@gm...>
> Sent: 25-Apr-19 05:06
> To: tip...@li...
> Subject: [tipc-discussion] Subscribing for "all" service changes?
>
> Dear all,
>
> I am able to subscribe to node and link changes just fine. I was trying to do
> the same for "service changes" but it seems that there's no "wildcard" for
> the service type that would mean "any service changes whatsoever please".
> Things work fine for specific services, but I had hoped to avoid having to
> subscribe many times in order to track the various services that might come
> and go in our system. As you have no doubt guessed, this is for logging
> what's going on in the cluster. Any hints?
>
> Best,
> Peter
> --
> Peter H. Fröhlich | Senior Code Monkey | https://phf.github.io/
|
From: Peter F. <pet...@gm...> - 2019-04-25 09:06:09
|
Dear all,

I am able to subscribe to node and link changes just fine. I was trying to do the same for "service changes" but it seems that there's no "wildcard" for the service type that would mean "any service changes whatsoever please". Things work fine for specific services, but I had hoped to avoid having to subscribe many times in order to track the various services that might come and go in our system. As you have no doubt guessed, this is for logging what's going on in the cluster. Any hints?

Best,
Peter
--
Peter H. Fröhlich | Senior Code Monkey | https://phf.github.io/
|
From: Tuong L. T. <tuo...@de...> - 2019-04-25 02:38:34
|
Thanks a lot, Jon! Please see my answers inline... I will send the patch to net-next then.

BR/Tuong

-----Original Message-----
From: Jon Maloy <jon...@er...>
Sent: Wednesday, April 24, 2019 11:41 PM
To: Tuong Tong Lien <tuo...@de...>; ma...@do...; yin...@wi...; tip...@li...
Subject: RE: [PATCH RFC] tipc: fix link "half-failover" issue

Acked-by: Jon Maloy <jon...@er...>

Also see my comments below.

///jon

> -----Original Message-----
> From: Tuong Lien <tuo...@de...>
> Sent: 24-Apr-19 00:41
> To: Jon Maloy <jon...@er...>; ma...@do...;
> yin...@wi...; tip...@li...
> Subject: [PATCH RFC] tipc: fix link "half-failover" issue
>
> TIPC link can temporarily fall into "half-establish" that only one of the link
> endpoints is ESTABLISHED and starts to send traffic, PROTOCOL messages,
> whereas the other link endpoint is not up (e.g. immediately when the
> endpoint receives ACTIVATE_MSG, the network interface goes down...).
>
> This is a normal situation and will be settled because the link endpoint will be
> eventually brought down after the link tolerance time.
>
> However, the situation will become worse when the second link is
> established before the first link endpoint goes down, For example:
>
> 1. Both links <1A-1B>, <2A-2B> down

Confusing terminology here. How can there be a link <1A-1B>? I think you want to say "Both link endpoints" 1A,1B down etc.

[Tuong]: I didn't realize this confusion until you said it here 😊! It should be "Both links <1A-2A>, <1B-2B> down" where the numeric part represents the node number and the character part is the interface name.

> 2. Link endpoint 1B up, but 1A still down (e.g. due to network
> disturbance, wrong session, etc.)
> 3. Link <2A-2B> up

Same here.

[Tuong]: Same as above, it should be "Link <1B-2B> up"

> 4. Link endpoint 1B down (e.g. due to link tolerance timeout)
> 5. Node B starts failover onto link <2A-2B>
>
> ==> Node A does never start link failover.
>
> When the "half-failover" situation happens, two consequences have been
> observed:
>
> a) Peer link/node gets stuck in FAILINGOVER state;
> b) Traffic or user messages that peer node is trying to failover onto the
> second link can be partially or completely dropped by this node.
>
> The consequence a) was actually solved by commit c140eb166d68 ("tipc:
> fix failover problem"), but that commit didn't cover the b). It's due to the fact
> that the tunnel link endpoint has never been prepared for a failover, so the
> 'l->drop_point' (and the other data...) is not set correctly. When a
> TUNNLE_MSG

s/TUNNLE_MSG/TUNNEL_MSG

[Tuong]: yep, a typo!

> from peer node arrives on the link, depending on the inner
> message's seqno and the current 'l->drop_point'
> value, the message can be dropped (- treated as a duplicate message) or
> processed.
> At this early stage, the traffic messages from peer are likely to be
> NAME_DISTRIBUTORs, this means some name table entries will be missed on
> the node forever!
>
> The commit resolves the issue by starting the FAILOVER process on this node
> as well. Another benefit from this solution is that we ensure the link will not
> be re-established until the failover ends.
>
> Signed-off-by: Tuong Lien <tuo...@de...>
> ---
>  net/tipc/link.c | 35 +++++++++++++++++++++++++++++++++++
>  net/tipc/link.h |  2 ++
>  net/tipc/node.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++-------
>  3 files changed, 84 insertions(+), 7 deletions(-)
>
> diff --git a/net/tipc/link.c b/net/tipc/link.c
> index 6053489c8063..fa639054329d 100644
> --- a/net/tipc/link.c
> +++ b/net/tipc/link.c
> @@ -1705,6 +1705,41 @@ void tipc_link_tnl_prepare(struct tipc_link *l, struct tipc_link *tnl,
>  	}
>  }
>
> +/**
> + * tipc_link_failover_prepare() - prepare tnl for link failover
> + *
> + * This is a special version of the precursor - tipc_link_tnl_prepare(),
> + * see the __tipc_node_link_failover() for details
> + *
> + * @l: failover link
> + * @tnl: tunnel link
> + * @xmitq: queue for messages to be xmited
> + */
> +void tipc_link_failover_prepare(struct tipc_link *l, struct tipc_link *tnl,
> +				struct sk_buff_head *xmitq)
> +{
> +	struct sk_buff_head *fdefq = &tnl->failover_deferdq;
> +
> +	tipc_link_create_dummy_tnl_msg(tnl, xmitq);
> +
> +	/* This failover link enpoint should never be established

s/should never be established before, so did not../was never established, so it has not ...

[Tuong]: ok, will update it.

> +	 * before, so did not receive anything from peer.
> +	 * Otherwise, it must be a normal failover situation or the
> +	 * node has entered SELF_DOWN_PEER_LEAVING and both peer nodes
> +	 * would have to start over from scratch instead.
> +	 */
> +	WARN_ON(l && tipc_link_is_up(l));
> +	tnl->drop_point = 1;
> +	tnl->failover_reasm_skb = NULL;
> +
> +	/* Initiate the link's failover deferdq */
> +	if (unlikely(!skb_queue_empty(fdefq))) {
> +		pr_warn("Link failover deferdq not empty: %d!\n",
> +			skb_queue_len(fdefq));
> +		__skb_queue_purge(fdefq);
> +	}
> +}
> +
>  /* tipc_link_validate_msg(): validate message against current link state
>   * Returns true if message should be accepted, otherwise false
>   */
> diff --git a/net/tipc/link.h b/net/tipc/link.h
> index 8439e0ee53a8..adcad65e761c 100644
> --- a/net/tipc/link.h
> +++ b/net/tipc/link.h
> @@ -90,6 +90,8 @@ void tipc_link_tnl_prepare(struct tipc_link *l, struct tipc_link *tnl,
>  			   int mtyp, struct sk_buff_head *xmitq);
>  void tipc_link_create_dummy_tnl_msg(struct tipc_link *tnl,
>  				    struct sk_buff_head *xmitq);
> +void tipc_link_failover_prepare(struct tipc_link *l, struct tipc_link *tnl,
> +				struct sk_buff_head *xmitq);
>  void tipc_link_build_reset_msg(struct tipc_link *l, struct sk_buff_head *xmitq);
>  int tipc_link_fsm_evt(struct tipc_link *l, int evt);
>  bool tipc_link_is_up(struct tipc_link *l);
> diff --git a/net/tipc/node.c b/net/tipc/node.c
> index 7478e2d4ec02..be07cf327d2d 100644
> --- a/net/tipc/node.c
> +++ b/net/tipc/node.c
> @@ -714,7 +714,6 @@ static void __tipc_node_link_up(struct tipc_node *n, int bearer_id,
>  		*slot0 = bearer_id;
>  		*slot1 = bearer_id;
>  		tipc_node_fsm_evt(n, SELF_ESTABL_CONTACT_EVT);
> -		n->failover_sent = false;
>  		n->action_flags |= TIPC_NOTIFY_NODE_UP;
>  		tipc_link_set_active(nl, true);
>  		tipc_bcast_add_peer(n->net, nl, xmitq);
> @@ -757,6 +756,45 @@ static void tipc_node_link_up(struct tipc_node *n, int bearer_id,
>  }
>
>  /**
> + * __tipc_node_link_failover() - start failover in case "half-failover"

Don't see any reason for the "__" prefix. This is normally used when we need two functions with the same name, one locked and one unlocked.

[Tuong]: Actually, I just wanted to emphasize that this function should be called with care, i.e. the caller needs to hold the node lock or perform some checks (e.g. on the node state) beforehand; it is not that there is another function with the same name. If you think the prefix is unnecessary, I will remove it.

> + *
> + * This function is only called in a very special situation where link
> + * failover can be already started on peer node but not on this node.
> + * This can happen when e.g.
> + *   1. Both links <1A-1B>, <2A-2B> down
> + *   2. Link endpoint 1B up, but 1A still down (e.g. due to network
> + *      disturbance, wrong session, etc.)
> + *   3. Link <2A-2B> up
> + *   4. Link endpoint 1B down (e.g. due to link tolerance timeout)
> + *   5. Node B starts failover onto link <2A-2B>
> + *
> + *   ==> Node A does never start link/node failover!
> + *
> + * @n: tipc node structure
> + * @l: link peer endpoint failingover (- can be NULL)
> + * @tnl: tunnel link
> + * @xmitq: queue for messages to be xmited on tnl link later
> + */
> +static void __tipc_node_link_failover(struct tipc_node *n, struct tipc_link *l,
> +				      struct tipc_link *tnl,
> +				      struct sk_buff_head *xmitq)
> +{
> +	/* Avoid to be "self-failover" that can never end */
> +	if (!tipc_link_is_up(tnl))
> +		return;
> +
> +	tipc_link_fsm_evt(tnl, LINK_SYNCH_END_EVT);
> +	tipc_node_fsm_evt(n, NODE_SYNCH_END_EVT);
> +
> +	n->sync_point = tipc_link_rcv_nxt(tnl) + (U16_MAX / 2 - 1);
> +	tipc_link_failover_prepare(l, tnl, xmitq);
> +
> +	if (l)
> +		tipc_link_fsm_evt(l, LINK_FAILOVER_BEGIN_EVT);
> +	tipc_node_fsm_evt(n, NODE_FAILOVER_BEGIN_EVT);
> +}
> +
> +/**
>   * __tipc_node_link_down - handle loss of link
>   */
>  static void __tipc_node_link_down(struct tipc_node *n, int *bearer_id,
> @@ -1675,14 +1713,16 @@ static bool tipc_node_check_state(struct tipc_node *n, struct sk_buff *skb,
>  			tipc_skb_queue_splice_tail_init(tipc_link_inputq(pl),
>  							tipc_link_inputq(l));
>  		}
> +
>  		/* If parallel link was already down, and this happened
>
before > - * the tunnel link came up, FAILOVER was never sent. Ensure > that > - * FAILOVER is sent to get peer out of NODE_FAILINGOVER > state. > + * the tunnel link came up, node failover was never started. > + * Ensure that a FAILOVER_MSG is sent to get peer out of > + * NODE_FAILINGOVER state, also this node must accept > + * TUNNLE_MSGs from peer. s/TUNNLE_MSG/TUNEL_MSG > */ > - if (n->state != NODE_FAILINGOVER && !n->failover_sent) { > - tipc_link_create_dummy_tnl_msg(l, xmitq); > - n->failover_sent = true; > - } > + if (n->state != NODE_FAILINGOVER) > + __tipc_node_link_failover(n, pl, l, xmitq); > + > /* If pkts arrive out of order, use lowest calculated syncpt */ > if (less(syncpt, n->sync_point)) > n->sync_point = syncpt; > -- > 2.13.7 |
From: Jon M. <jon...@er...> - 2019-04-24 19:12:35
|
Acked-by: Jon Maloy <jon...@er...>

Also see my comments below.

///jon

> -----Original Message-----
> From: Tuong Lien <tuo...@de...>
> Sent: 24-Apr-19 00:41
> To: Jon Maloy <jon...@er...>; ma...@do...;
> yin...@wi...; tip...@li...
> Subject: [PATCH RFC] tipc: fix link "half-failover" issue
>
> TIPC link can temporarily fall into "half-establish" that only one of the link
> endpoints is ESTABLISHED and starts to send traffic, PROTOCOL messages,
> whereas the other link endpoint is not up (e.g. immediately when the
> endpoint receives ACTIVATE_MSG, the network interface goes down...).
>
> This is a normal situation and will be settled because the link endpoint will be
> eventually brought down after the link tolerance time.
>
> However, the situation will become worse when the second link is
> established before the first link endpoint goes down, For example:
>
> 1. Both links <1A-1B>, <2A-2B> down

Confusing terminology here. How can there be a link <1A-1B>? I think you want to say "Both link endpoints" 1A,1B down etc.

> 2. Link endpoint 1B up, but 1A still down (e.g. due to network
> disturbance, wrong session, etc.)
> 3. Link <2A-2B> up

Same here.

> 4. Link endpoint 1B down (e.g. due to link tolerance timeout)
> 5. Node B starts failover onto link <2A-2B>
>
> ==> Node A does never start link failover.
>
> When the "half-failover" situation happens, two consequences have been
> observed:
>
> a) Peer link/node gets stuck in FAILINGOVER state;
> b) Traffic or user messages that peer node is trying to failover onto the
> second link can be partially or completely dropped by this node.
>
> The consequence a) was actually solved by commit c140eb166d68 ("tipc:
> fix failover problem"), but that commit didn't cover the b). It's due to the fact
> that the tunnel link endpoint has never been prepared for a failover, so the
> 'l->drop_point' (and the other data...) is not set correctly. When a
> TUNNLE_MSG

s/TUNNLE_MSG/TUNNEL_MSG

> from peer node arrives on the link, depending on the inner
> message's seqno and the current 'l->drop_point'
> value, the message can be dropped (- treated as a duplicate message) or
> processed.
> At this early stage, the traffic messages from peer are likely to be
> NAME_DISTRIBUTORs, this means some name table entries will be missed on
> the node forever!
>
> The commit resolves the issue by starting the FAILOVER process on this node
> as well. Another benefit from this solution is that we ensure the link will not
> be re-established until the failover ends.
>
> Signed-off-by: Tuong Lien <tuo...@de...>
> ---
> net/tipc/link.c | 35 +++++++++++++++++++++++++++++++++++
> net/tipc/link.h | 2 ++
> net/tipc/node.c | 54
> +++++++++++++++++++++++++++++++++++++++++++++++-------
> 3 files changed, 84 insertions(+), 7 deletions(-)
>
> diff --git a/net/tipc/link.c b/net/tipc/link.c index 6053489c8063..fa639054329d
> 100644
> --- a/net/tipc/link.c
> +++ b/net/tipc/link.c
> @@ -1705,6 +1705,41 @@ void tipc_link_tnl_prepare(struct tipc_link *l,
> struct tipc_link *tnl,
> }
> }
>
> +/**
> + * tipc_link_failover_prepare() - prepare tnl for link failover
> + *
> + * This is a special version of the precursor -
> +tipc_link_tnl_prepare(),
> + * see the __tipc_node_link_failover() for details
> + *
> + * @l: failover link
> + * @tnl: tunnel link
> + * @xmitq: queue for messages to be xmited */ void
> +tipc_link_failover_prepare(struct tipc_link *l, struct tipc_link *tnl,
> + struct sk_buff_head *xmitq)
> +{
> + struct sk_buff_head *fdefq = &tnl->failover_deferdq;
> +
> + tipc_link_create_dummy_tnl_msg(tnl, xmitq);
> +
> + /* This failover link enpoint should never be established

s/should never be established before, so did not../was never established, so it has not ...

> + * before, so did not receive anything from peer.
> + * Otherwise, it must be a normal failover situation or the
> + * node has entered SELF_DOWN_PEER_LEAVING and both peer
> nodes
> + * would have to start over from scratch instead.
> + */
> + WARN_ON(l && tipc_link_is_up(l));
> + tnl->drop_point = 1;
> + tnl->failover_reasm_skb = NULL;
> +
> + /* Initiate the link's failover deferdq */
> + if (unlikely(!skb_queue_empty(fdefq))) {
> + pr_warn("Link failover deferdq not empty: %d!\n",
> + skb_queue_len(fdefq));
> + __skb_queue_purge(fdefq);
> + }
> +}
> +
> /* tipc_link_validate_msg(): validate message against current link state
> * Returns true if message should be accepted, otherwise false
> */
> diff --git a/net/tipc/link.h b/net/tipc/link.h index
> 8439e0ee53a8..adcad65e761c 100644
> --- a/net/tipc/link.h
> +++ b/net/tipc/link.h
> @@ -90,6 +90,8 @@ void tipc_link_tnl_prepare(struct tipc_link *l, struct
> tipc_link *tnl,
> int mtyp, struct sk_buff_head *xmitq); void
> tipc_link_create_dummy_tnl_msg(struct tipc_link *tnl,
> struct sk_buff_head *xmitq);
> +void tipc_link_failover_prepare(struct tipc_link *l, struct tipc_link *tnl,
> + struct sk_buff_head *xmitq);
> void tipc_link_build_reset_msg(struct tipc_link *l, struct sk_buff_head
> *xmitq); int tipc_link_fsm_evt(struct tipc_link *l, int evt); bool
> tipc_link_is_up(struct tipc_link *l); diff --git a/net/tipc/node.c
> b/net/tipc/node.c index 7478e2d4ec02..be07cf327d2d 100644
> --- a/net/tipc/node.c
> +++ b/net/tipc/node.c
> @@ -714,7 +714,6 @@ static void __tipc_node_link_up(struct tipc_node *n,
> int bearer_id,
> *slot0 = bearer_id;
> *slot1 = bearer_id;
> tipc_node_fsm_evt(n, SELF_ESTABL_CONTACT_EVT);
> - n->failover_sent = false;
> n->action_flags |= TIPC_NOTIFY_NODE_UP;
> tipc_link_set_active(nl, true);
> tipc_bcast_add_peer(n->net, nl, xmitq); @@ -757,6 +756,45
> @@ static void tipc_node_link_up(struct tipc_node *n, int bearer_id, }
>
> /**
> + * __tipc_node_link_failover() - start failover in case "half-failover"

Don't see any reason for the "__" prefix. This is normally used when we need two functions with the same name, one locked and one unlocked.

> + *
> + * This function is only called in a very special situation where link
> + * failover can be already started on peer node but not on this node.
> + * This can happen when e.g.
> + * 1. Both links <1A-1B>, <2A-2B> down
> + * 2. Link endpoint 1B up, but 1A still down (e.g. due to network
> + * disturbance, wrong session, etc.)
> + * 3. Link <2A-2B> up
> + * 4. Link endpoint 1B down (e.g. due to link tolerance timeout)
> + * 5. Node B starts failover onto link <2A-2B>
> + *
> + * ==> Node A does never start link/node failover!
> + *
> + * @n: tipc node structure
> + * @l: link peer endpoint failingover (- can be NULL)
> + * @tnl: tunnel link
> + * @xmitq: queue for messages to be xmited on tnl link later */ static
> +void __tipc_node_link_failover(struct tipc_node *n, struct tipc_link *l,
> + struct tipc_link *tnl,
> + struct sk_buff_head *xmitq)
> +{
> + /* Avoid to be "self-failover" that can never end */
> + if (!tipc_link_is_up(tnl))
> + return;
> +
> + tipc_link_fsm_evt(tnl, LINK_SYNCH_END_EVT);
> + tipc_node_fsm_evt(n, NODE_SYNCH_END_EVT);
> +
> + n->sync_point = tipc_link_rcv_nxt(tnl) + (U16_MAX / 2 - 1);
> + tipc_link_failover_prepare(l, tnl, xmitq);
> +
> + if (l)
> + tipc_link_fsm_evt(l, LINK_FAILOVER_BEGIN_EVT);
> + tipc_node_fsm_evt(n, NODE_FAILOVER_BEGIN_EVT); }
> +
> +/**
> * __tipc_node_link_down - handle loss of link
> */
> static void __tipc_node_link_down(struct tipc_node *n, int *bearer_id, @@
> -1675,14 +1713,16 @@ static bool tipc_node_check_state(struct tipc_node
> *n, struct sk_buff *skb,
> tipc_skb_queue_splice_tail_init(tipc_link_inputq(pl),
> tipc_link_inputq(l));
> }
> +
> /* If parallel link was already down, and this happened
> before
> - * the tunnel link came up, FAILOVER was never sent. Ensure
> that
> - * FAILOVER is sent to get peer out of NODE_FAILINGOVER
> state.
> + * the tunnel link came up, node failover was never started.
> + * Ensure that a FAILOVER_MSG is sent to get peer out of
> + * NODE_FAILINGOVER state, also this node must accept
> + * TUNNLE_MSGs from peer.

s/TUNNLE_MSG/TUNNEL_MSG

> */
> - if (n->state != NODE_FAILINGOVER && !n->failover_sent) {
> - tipc_link_create_dummy_tnl_msg(l, xmitq);
> - n->failover_sent = true;
> - }
> + if (n->state != NODE_FAILINGOVER)
> + __tipc_node_link_failover(n, pl, l, xmitq);
> +
> /* If pkts arrive out of order, use lowest calculated syncpt */
> if (less(syncpt, n->sync_point))
> n->sync_point = syncpt;
> --
> 2.13.7 |
From: Tuong L. <tuo...@de...> - 2019-04-24 04:41:43
|
TIPC link can temporarily fall into "half-establish" that only one of
the link endpoints is ESTABLISHED and starts to send traffic, PROTOCOL
messages, whereas the other link endpoint is not up (e.g. immediately
when the endpoint receives ACTIVATE_MSG, the network interface goes
down...).

This is a normal situation and will be settled because the link
endpoint will be eventually brought down after the link tolerance time.

However, the situation will become worse when the second link is
established before the first link endpoint goes down,
For example:

   1. Both links <1A-1B>, <2A-2B> down
   2. Link endpoint 1B up, but 1A still down (e.g. due to network
      disturbance, wrong session, etc.)
   3. Link <2A-2B> up
   4. Link endpoint 1B down (e.g. due to link tolerance timeout)
   5. Node B starts failover onto link <2A-2B>

   ==> Node A does never start link failover.

When the "half-failover" situation happens, two consequences have been
observed:

a) Peer link/node gets stuck in FAILINGOVER state;
b) Traffic or user messages that peer node is trying to failover onto
the second link can be partially or completely dropped by this node.

The consequence a) was actually solved by commit c140eb166d68 ("tipc:
fix failover problem"), but that commit didn't cover the b). It's due
to the fact that the tunnel link endpoint has never been prepared for a
failover, so the 'l->drop_point' (and the other data...) is not set
correctly. When a TUNNLE_MSG from peer node arrives on the link,
depending on the inner message's seqno and the current 'l->drop_point'
value, the message can be dropped (- treated as a duplicate message) or
processed.
At this early stage, the traffic messages from peer are likely to be
NAME_DISTRIBUTORs, this means some name table entries will be missed on
the node forever!

The commit resolves the issue by starting the FAILOVER process on this
node as well. Another benefit from this solution is that we ensure the
link will not be re-established until the failover ends.

Signed-off-by: Tuong Lien <tuo...@de...>
---
 net/tipc/link.c | 35 +++++++++++++++++++++++++++++++++++
 net/tipc/link.h |  2 ++
 net/tipc/node.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++-------
 3 files changed, 84 insertions(+), 7 deletions(-)

diff --git a/net/tipc/link.c b/net/tipc/link.c
index 6053489c8063..fa639054329d 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1705,6 +1705,41 @@ void tipc_link_tnl_prepare(struct tipc_link *l, struct tipc_link *tnl,
 	}
 }
 
+/**
+ * tipc_link_failover_prepare() - prepare tnl for link failover
+ *
+ * This is a special version of the precursor - tipc_link_tnl_prepare(),
+ * see the __tipc_node_link_failover() for details
+ *
+ * @l: failover link
+ * @tnl: tunnel link
+ * @xmitq: queue for messages to be xmited
+ */
+void tipc_link_failover_prepare(struct tipc_link *l, struct tipc_link *tnl,
+				struct sk_buff_head *xmitq)
+{
+	struct sk_buff_head *fdefq = &tnl->failover_deferdq;
+
+	tipc_link_create_dummy_tnl_msg(tnl, xmitq);
+
+	/* This failover link enpoint should never be established
+	 * before, so did not receive anything from peer.
+	 * Otherwise, it must be a normal failover situation or the
+	 * node has entered SELF_DOWN_PEER_LEAVING and both peer nodes
+	 * would have to start over from scratch instead.
+	 */
+	WARN_ON(l && tipc_link_is_up(l));
+	tnl->drop_point = 1;
+	tnl->failover_reasm_skb = NULL;
+
+	/* Initiate the link's failover deferdq */
+	if (unlikely(!skb_queue_empty(fdefq))) {
+		pr_warn("Link failover deferdq not empty: %d!\n",
+			skb_queue_len(fdefq));
+		__skb_queue_purge(fdefq);
+	}
+}
+
 /* tipc_link_validate_msg(): validate message against current link state
  * Returns true if message should be accepted, otherwise false
  */
diff --git a/net/tipc/link.h b/net/tipc/link.h
index 8439e0ee53a8..adcad65e761c 100644
--- a/net/tipc/link.h
+++ b/net/tipc/link.h
@@ -90,6 +90,8 @@ void tipc_link_tnl_prepare(struct tipc_link *l, struct tipc_link *tnl,
 			   int mtyp, struct sk_buff_head *xmitq);
 void tipc_link_create_dummy_tnl_msg(struct tipc_link *tnl,
 				    struct sk_buff_head *xmitq);
+void tipc_link_failover_prepare(struct tipc_link *l, struct tipc_link *tnl,
+				struct sk_buff_head *xmitq);
 void tipc_link_build_reset_msg(struct tipc_link *l, struct sk_buff_head *xmitq);
 int tipc_link_fsm_evt(struct tipc_link *l, int evt);
 bool tipc_link_is_up(struct tipc_link *l);
diff --git a/net/tipc/node.c b/net/tipc/node.c
index 7478e2d4ec02..be07cf327d2d 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -714,7 +714,6 @@ static void __tipc_node_link_up(struct tipc_node *n, int bearer_id,
 		*slot0 = bearer_id;
 		*slot1 = bearer_id;
 		tipc_node_fsm_evt(n, SELF_ESTABL_CONTACT_EVT);
-		n->failover_sent = false;
 		n->action_flags |= TIPC_NOTIFY_NODE_UP;
 		tipc_link_set_active(nl, true);
 		tipc_bcast_add_peer(n->net, nl, xmitq);
@@ -757,6 +756,45 @@ static void tipc_node_link_up(struct tipc_node *n, int bearer_id,
 }
 
 /**
+ * __tipc_node_link_failover() - start failover in case "half-failover"
+ *
+ * This function is only called in a very special situation where link
+ * failover can be already started on peer node but not on this node.
+ * This can happen when e.g.
+ *	1. Both links <1A-1B>, <2A-2B> down
+ *	2. Link endpoint 1B up, but 1A still down (e.g. due to network
+ *	   disturbance, wrong session, etc.)
+ *	3. Link <2A-2B> up
+ *	4. Link endpoint 1B down (e.g. due to link tolerance timeout)
+ *	5. Node B starts failover onto link <2A-2B>
+ *
+ *	==> Node A does never start link/node failover!
+ *
+ * @n: tipc node structure
+ * @l: link peer endpoint failingover (- can be NULL)
+ * @tnl: tunnel link
+ * @xmitq: queue for messages to be xmited on tnl link later
+ */
+static void __tipc_node_link_failover(struct tipc_node *n, struct tipc_link *l,
+				      struct tipc_link *tnl,
+				      struct sk_buff_head *xmitq)
+{
+	/* Avoid to be "self-failover" that can never end */
+	if (!tipc_link_is_up(tnl))
+		return;
+
+	tipc_link_fsm_evt(tnl, LINK_SYNCH_END_EVT);
+	tipc_node_fsm_evt(n, NODE_SYNCH_END_EVT);
+
+	n->sync_point = tipc_link_rcv_nxt(tnl) + (U16_MAX / 2 - 1);
+	tipc_link_failover_prepare(l, tnl, xmitq);
+
+	if (l)
+		tipc_link_fsm_evt(l, LINK_FAILOVER_BEGIN_EVT);
+	tipc_node_fsm_evt(n, NODE_FAILOVER_BEGIN_EVT);
+}
+
+/**
  * __tipc_node_link_down - handle loss of link
  */
 static void __tipc_node_link_down(struct tipc_node *n, int *bearer_id,
@@ -1675,14 +1713,16 @@ static bool tipc_node_check_state(struct tipc_node *n, struct sk_buff *skb,
 			tipc_skb_queue_splice_tail_init(tipc_link_inputq(pl),
 							tipc_link_inputq(l));
 		}
+
 		/* If parallel link was already down, and this happened before
-		 * the tunnel link came up, FAILOVER was never sent. Ensure that
-		 * FAILOVER is sent to get peer out of NODE_FAILINGOVER state.
+		 * the tunnel link came up, node failover was never started.
+		 * Ensure that a FAILOVER_MSG is sent to get peer out of
+		 * NODE_FAILINGOVER state, also this node must accept
+		 * TUNNLE_MSGs from peer.
 		 */
-		if (n->state != NODE_FAILINGOVER && !n->failover_sent) {
-			tipc_link_create_dummy_tnl_msg(l, xmitq);
-			n->failover_sent = true;
-		}
+		if (n->state != NODE_FAILINGOVER)
+			__tipc_node_link_failover(n, pl, l, xmitq);
+
 		/* If pkts arrive out of order, use lowest calculated syncpt */
 		if (less(syncpt, n->sync_point))
 			n->sync_point = syncpt;
--
2.13.7 |
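A minimal user-space sketch of the sequence-number arithmetic behind the `n->sync_point = tipc_link_rcv_nxt(tnl) + (U16_MAX / 2 - 1)` line in the patch above. It assumes that link sequence numbers are 16-bit and compared modulo 2^16, so that a placeholder set almost half a window ahead still compares as "later" than the current receive point, and the existing `less(syncpt, n->sync_point)` check can replace it with any sync point computed from a real packet. The helper names `seq_less` and `placeholder_sync_point` are hypothetical and only model the arithmetic; they are not the kernel's implementation.

```c
#include <stdint.h>

/* Hypothetical helper: wraparound-aware "a comes before b" for 16-bit
 * sequence numbers. b counts as later when it is 1..32767 steps ahead
 * of a, modulo 65536.
 */
static int seq_less(uint16_t a, uint16_t b)
{
	return a != b && (uint16_t)(b - a) < 0x8000u;
}

/* Placeholder sync point as computed in the patch:
 * rcv_nxt + (U16_MAX / 2 - 1), truncated to 16 bits.
 */
static uint16_t placeholder_sync_point(uint16_t rcv_nxt)
{
	return (uint16_t)(rcv_nxt + (UINT16_MAX / 2 - 1));
}
```

With a receive point of 65000, for example, the placeholder wraps around to 32230, and `seq_less(65000, 32230)` still holds because the distance is taken modulo 65536.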
From: Jon M. <jon...@er...> - 2019-04-19 22:25:45
|
> -----Original Message-----
> From: net...@vg... <net...@vg...>
> On Behalf Of David Miller
> Sent: 19-Apr-19 17:41
> To: Tung Quang Nguyen <tun...@de...>
> Cc: ne...@vg...; tip...@li...
> Subject: Re: [tipc-discussion][net-next v1] tipc: introduce new socket option
> TIPC_SOCK_RECVQ_USED
>
> From: Tung Nguyen <tun...@de...>
> Date: Thu, 18 Apr 2019 21:02:19 +0700
>
> > When using TIPC_SOCK_RECVQ_DEPTH for getsockopt(), it returns the
> > number of buffers in receive socket buffer which is not so helpful for
> > user space applications.
> >
> > This commit introduces the new option TIPC_SOCK_RECVQ_USED which
> > returns the current allocated bytes of the receive socket buffer.
> > This helps user space applications dimension its buffer usage to avoid
> > buffer overload issue.
> >
> > Signed-off-by: Tung Nguyen <tun...@de...>
>
> TIPC folks, please review.

Acked-by: Jon Maloy <jon...@er...>

It would of course be nicer if we could recycle TIPC_SOCK_RECVQ_DEPTH for this purpose, but that would mean altering the current ABI and incurring a (probably very low) risk of breaking existing applications. I am not particularly happy with this, but we do have users who claim it would be useful for them.

///jon |
From: David M. <da...@da...> - 2019-04-19 21:59:50
|
From: Jon Maloy <jon...@er...>
Date: Fri, 19 Apr 2019 21:51:31 +0000

>
>
>> -----Original Message-----
>> From: net...@vg... <net...@vg...>
>> On Behalf Of David Miller
>> Sent: 19-Apr-19 17:41
>> To: Tung Quang Nguyen <tun...@de...>
>> Cc: ne...@vg...; tip...@li...
>> Subject: Re: [tipc-discussion][net-next v1] tipc: introduce new socket option
>> TIPC_SOCK_RECVQ_USED
>>
>> From: Tung Nguyen <tun...@de...>
>> Date: Thu, 18 Apr 2019 21:02:19 +0700
>>
>> > When using TIPC_SOCK_RECVQ_DEPTH for getsockopt(), it returns the
>> > number of buffers in receive socket buffer which is not so helpful for
>> > user space applications.
>> >
>> > This commit introduces the new option TIPC_SOCK_RECVQ_USED which
>> > returns the current allocated bytes of the receive socket buffer.
>> > This helps user space applications dimension its buffer usage to avoid
>> > buffer overload issue.
>> >
>> > Signed-off-by: Tung Nguyen <tun...@de...>
>>
>> TIPC folks, please review.
>
> Acked-by: Jon Maloy <jon...@er...>

Applied, thanks for reviewing.

> It would of course be nicer if we could recycle
> TIPC_SOCK_RECVQ_DEPTH for this purpose, but that would mean
> altering the current ABI and incurring a (probably very low) risk of
> breaking existing applications. I am not particularly happy with
> this, but we do have users who claim it would be useful for them.

Better safe than sorry when it comes to user facing ABIs. |
From: David M. <da...@da...> - 2019-04-19 21:40:55
|
From: Tung Nguyen <tun...@de...>
Date: Thu, 18 Apr 2019 21:02:19 +0700

> When using TIPC_SOCK_RECVQ_DEPTH for getsockopt(), it returns the
> number of buffers in receive socket buffer which is not so helpful
> for user space applications.
>
> This commit introduces the new option TIPC_SOCK_RECVQ_USED which
> returns the current allocated bytes of the receive socket buffer.
> This helps user space applications dimension its buffer usage to
> avoid buffer overload issue.
>
> Signed-off-by: Tung Nguyen <tun...@de...>

TIPC folks, please review. |
From: Tung N. <tun...@de...> - 2019-04-19 02:59:49
|
When using TIPC_SOCK_RECVQ_DEPTH for getsockopt(), it returns the
number of buffers in receive socket buffer which is not so helpful
for user space applications.

This commit introduces the new option TIPC_SOCK_RECVQ_USED which
returns the current allocated bytes of the receive socket buffer.
This helps user space applications dimension its buffer usage to
avoid buffer overload issue.

Signed-off-by: Tung Nguyen <tun...@de...>
---
 include/uapi/linux/tipc.h | 1 +
 net/tipc/socket.c         | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/include/uapi/linux/tipc.h b/include/uapi/linux/tipc.h
index 6b2fd4d9655f..7df026ea6aff 100644
--- a/include/uapi/linux/tipc.h
+++ b/include/uapi/linux/tipc.h
@@ -190,6 +190,7 @@ struct sockaddr_tipc {
 #define TIPC_MCAST_REPLICAST    134     /* Default: TIPC selects. No arg */
 #define TIPC_GROUP_JOIN         135     /* Takes struct tipc_group_req* */
 #define TIPC_GROUP_LEAVE        136     /* No argument */
+#define TIPC_SOCK_RECVQ_USED    137     /* Default: none (read only) */
 
 /*
  * Flag values
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 8ac8ddf1e324..1385207a301f 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -3070,6 +3070,9 @@ static int tipc_getsockopt(struct socket *sock, int lvl, int opt,
 	case TIPC_SOCK_RECVQ_DEPTH:
 		value = skb_queue_len(&sk->sk_receive_queue);
 		break;
+	case TIPC_SOCK_RECVQ_USED:
+		value = sk_rmem_alloc_get(sk);
+		break;
 	case TIPC_GROUP_JOIN:
 		seq.type = 0;
 		if (tsk->group)
--
2.17.1 |
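A minimal sketch of how a user space application might read the new option described in the patch above. It assumes a kernel that carries the patch; the fallback define uses the value 137 from the patch for older headers, and the helper name `tipc_recvq_used` is hypothetical, not part of any TIPC library.

```c
#include <sys/socket.h>
#include <linux/tipc.h>

#ifndef TIPC_SOCK_RECVQ_USED
#define TIPC_SOCK_RECVQ_USED 137	/* value taken from the patch above */
#endif

/* Hypothetical helper: returns the number of bytes currently allocated
 * in the receive buffer of TIPC socket sd, or -1 with errno set if the
 * option cannot be read (bad descriptor, non-TIPC socket, older kernel).
 */
static int tipc_recvq_used(int sd)
{
	int used = 0;
	socklen_t len = sizeof(used);

	if (getsockopt(sd, SOL_TIPC, TIPC_SOCK_RECVQ_USED, &used, &len) < 0)
		return -1;
	return used;
}
```

An application could compare the returned value against its configured SO_RCVBUF size to decide when to throttle senders; that policy is outside the scope of the patch.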
From: Rune T. <ru...@in...> - 2019-04-18 16:28:45
|
Ok, I am pretty sure we don't need the domain setting, based on earlier comments.

I will test our setup using the newer tipc commands, and hopefully I don't have any issues. The -b command is nice but not required in our case (especially as I can use "tipc bearer list" instead).

-----Original Message-----
From: Jon Maloy <jon...@er...>
Sent: Wednesday, April 17, 2019 15:20
To: Rune Torgersen <ru...@in...>; tip...@li...
Subject: RE: help with tipc command

Hi Rune,
A little more testing, but still on Linux 5.0, reveals the following:

- tipc-config -b is gone, but tipc-config -be=... is still working as intended. The change seems to have happened when we moved from the old (tipc-config) netlink format internally to the new (tipc) netlink format. After this move, tipc-config is in reality using a wrapper around the tipc netlink code, and something has happened at that moment.
- tipc bearer enable domain 1.1.0 is accepted, but the domain is internally set to an invalid value, so the links will never come up.
- The same happens with tipc bearer enable domain 1001001
- tipc bearer enable domain 0x1001000 or any other string will set the domain internally to 0, so the links will come up, but of course, since 0 is the default value, the command is meaningless.

I will try to fix this, but I don't see it as an urgent matter unless you feel you are stuck.

BR
///jon

> -----Original Message-----
> From: Rune Torgersen <ru...@in...>
> Sent: 17-Apr-19 11:20
> To: Jon Maloy <jon...@er...>; tipc-
> dis...@li...
> Subject: RE: help with tipc command
>
> We use the netid param to separate different clusters on the same network,
> so I don't think I need the domain param anymore. I will attempt using it
> without the domain param and see if everything looks to work.
> Just a leftover from 15 years of TIPC usage...
>
> Tipc-config seems to work for all bearer ops, except for -b
>
> Not working:
> root@test10020u2:~# tipc-config -b
> Bearers:
> operation not supported
> root@test10020u2:~# dmesg -c
> root@test10020u2:~# uname -r
> 4.4.0-145-generic
> root@test10020u2:~#
>
> Working:
> root@testrec198:~# tipc-config -b
> Bearers:
> eth:lo
> eth:eth0
> root@testrec198:~# uname -r
> 4.4.0-142-generic
> root@testrec198:~#
>
> root@testrec198:~# tipc-config -V
> TIPC configuration tool version 2.1.1
>
> So somewhere between 4.4.0-142 (I think it worked in -144 also) and -145 it
> stopped working.
>
> -----Original Message-----
> From: Jon Maloy <jon...@er...>
> Sent: Tuesday, April 16, 2019 15:02
> To: Rune Torgersen <ru...@in...>; tipc-
> dis...@li...
> Subject: RE: help with tipc command
>
> Hi,
> I tried it in my version (5.0) , and the parameter was accepted both as 1.1.0
> and as 1001000, but the links didn't come up.
> It is possible that we have broken something, as this parameter is regarded
> as obsolete, and not really anything we pay attention to during testing.
> But of course it should work, just as tipc-config should continue working
> without any limitations.
> I'll look into this later today, when I find some time.
>
> Still, my first question is if you really need this setting. It is only needed when
> you are planning to have nodes interconnected on the same LAN/VLAN, but
> you still want to keep them isolated. Is that so?
> And, if you need this isolation, you could use the network id (nowadays
> called cluster id) instead.
> Just give it a try, while I start some troubleshooting on my part.
>
> BR
> ///jon
>
>
>
> > -----Original Message-----
> > From: Rune Torgersen <ru...@in...>
> > Sent: 16-Apr-19 12:39
> > To: Jon Maloy <jon...@er...>; tipc-
> > dis...@li...
> > Subject: RE: help with tipc command
> >
> > Did find that site after a while.
> >
> > However it only shows enable without the domain param.
> >
> > I got "tipc bearer enable priority 5 media eth device eth2" to work
> > but still get that error on " tipc bearer enable domain 1.1.0 priority
> > 5 media eth device eth2"
> >
> > Nowhere in the documentation does it show me the correct format for
> > the domain param.
> >
> > -----Original Message-----
> > From: Jon Maloy <jon...@er...>
> > Sent: Tuesday, April 16, 2019 11:16
> > To: Rune Torgersen <ru...@in...>; tipc-
> > dis...@li...
> > Subject: RE: help with tipc command
> >
> > Hi Rune,
> > Have a look here:
> >
> > https://protect2.fireeye.com/url?k=fa65b58c-a6ec1a45-fa65f517-0cc47ad93e1c-ca8f6bf35a68f41e&u=http://tipc.io/getting_started.html
> >
> > Hilsen
> > ///jon
> >
> >
> > > -----Original Message-----
> > > From: Rune Torgersen <ru...@in...>
> > > Sent: 16-Apr-19 11:41
> > > To: tip...@li...
> > > Subject: [tipc-discussion] help with tipc command
> > >
> > > I am trying to convert some scripts from using tipc-config to newer
> > > tipc command (no, not "ip tipc" yet as it is not available on Ubuntu 16.04).
> > >
> > > I was not planning on doing this until we moved to Ubuntu 18, but I
> > > just noticed that with the latest ubuntu 16.04 kernel (4.4.0-145)
> > > tipc-config -b no longer works (worked on 4.4.0-143).
> > >
> > > I managed to figure out syntax for bearer disable, but have problems
> > > with bearer enable.
> > >
> > >
> > > Old command was
> > > tipc-config -be=eth:eth2/1.1.0/5
> > >
> > > New is
> > >
> > > But gets an error
> > > root@test10020u3:~# tipc bearer enable domain 1.1.0 priority 5 media
> > > eth device eth2
> > > error: Invalid argument
> > > root@test10020u3:~# dmesg -c
> > > [ 2896.786648] Bearer <eth:eth2> rejected, illegal discovery domain
> > >
> > >
> > > Also, what would be the equivalents of:
> > > tipc-config -netid=$netid
> > > tipc-config -a=1.1.${unit}
> > >
> > >
> > >
> > >
> > > _______________________________________________
> > > tipc-discussion mailing list
> > > tip...@li...
> > > https://lists.sourceforge.net/lists/listinfo/tipc-discussion |
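The thread above never settles which spelling of the domain argument is valid. As a hedged aid for reading the values being tried, the sketch below assumes the legacy TIPC <Z.C.N> layout from the UAPI header (8-bit zone, 12-bit cluster, 12-bit node), under which 1.1.0 corresponds to 0x1001000. The function names `zcn_to_u32` and `parse_zcn` are hypothetical; this is not how the tipc tool parses its arguments.

```c
#include <stdint.h>
#include <stdio.h>

/* Pack a legacy <Z.C.N> address: zone in bits 31..24, cluster in bits
 * 23..12, node in bits 11..0 (layout assumed from the UAPI header).
 */
static uint32_t zcn_to_u32(unsigned zone, unsigned cluster, unsigned node)
{
	return ((uint32_t)zone << 24) | ((uint32_t)cluster << 12) | node;
}

/* Parse a dotted "Z.C.N" string. Returns 0 on success and stores the
 * packed address in *out; returns -1 for malformed or out-of-range input.
 */
static int parse_zcn(const char *s, uint32_t *out)
{
	unsigned z, c, n;
	char tail;

	/* Exactly three numeric fields, with nothing after the last one */
	if (sscanf(s, "%u.%u.%u%c", &z, &c, &n, &tail) != 3)
		return -1;
	if (z > 0xff || c > 0xfff || n > 0xfff)
		return -1;
	*out = zcn_to_u32(z, c, n);
	return 0;
}
```

Under this layout, strings such as "1001001" or "0x1001000" are not dotted addresses at all, so a parser of this shape rejects them rather than silently mapping them to some other value.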
From: Jon M. <jon...@er...> - 2019-04-18 14:02:44
|
Acked-by: Jon Maloy <jon...@er...>

Just make sure you don't send a banner patch ([PATCH 0/1]) when you send this to netdev. For single patches that isn't necessary. And remember to tag it [net-next].

///jon

> -----Original Message-----
> From: Tung Nguyen <tun...@de...>
> Sent: 18-Apr-19 07:22
> To: tip...@li...; Jon Maloy
> <jon...@er...>; ma...@do...; yin...@wi...
> Subject: [PATCH v2 1/1] tipc: introduce new socket option
> TIPC_SOCK_RECVQ_USED
>
> When using TIPC_SOCK_RECVQ_DEPTH for getsockopt(), it returns the
> number of buffers in receive socket buffer which is not so helpful for user
> space applications.
>
> This commit introduces the new option TIPC_SOCK_RECVQ_USED which
> returns the current allocated bytes of the receive socket buffer.
> This helps user space applications dimension its buffer usage to avoid buffer
> overload issue.
>
> Signed-off-by: Tung Nguyen <tun...@de...>
> ---
> include/uapi/linux/tipc.h | 1 +
> net/tipc/socket.c | 3 +++
> 2 files changed, 4 insertions(+)
>
> diff --git a/include/uapi/linux/tipc.h b/include/uapi/linux/tipc.h index
> 6b2fd4d9655f..7df026ea6aff 100644
> --- a/include/uapi/linux/tipc.h
> +++ b/include/uapi/linux/tipc.h
> @@ -190,6 +190,7 @@ struct sockaddr_tipc {
> #define TIPC_MCAST_REPLICAST 134 /* Default: TIPC selects. No arg */
> #define TIPC_GROUP_JOIN 135 /* Takes struct tipc_group_req* */
> #define TIPC_GROUP_LEAVE 136 /* No argument */
> +#define TIPC_SOCK_RECVQ_USED 137 /* Default: none (read only) */
>
> /*
> * Flag values
> diff --git a/net/tipc/socket.c b/net/tipc/socket.c index
> 8ac8ddf1e324..1385207a301f 100644
> --- a/net/tipc/socket.c
> +++ b/net/tipc/socket.c
> @@ -3070,6 +3070,9 @@ static int tipc_getsockopt(struct socket *sock, int
> lvl, int opt,
> case TIPC_SOCK_RECVQ_DEPTH:
> value = skb_queue_len(&sk->sk_receive_queue);
> break;
> + case TIPC_SOCK_RECVQ_USED:
> + value = sk_rmem_alloc_get(sk);
> + break;
> case TIPC_GROUP_JOIN:
> seq.type = 0;
> if (tsk->group)
> --
> 2.17.1 |
From: Tung N. <tun...@de...> - 2019-04-18 11:22:00
|
Returning the number of allocated bytes of the receive socket buffer when using getsockopt() with option TIPC_SOCK_RECVQ_USED

Tung Nguyen (1):
  tipc: introduce new socket option TIPC_SOCK_RECVQ_USED

 include/uapi/linux/tipc.h | 1 +
 net/tipc/socket.c         | 3 +++
 2 files changed, 4 insertions(+)

--
2.17.1 |
From: Tung N. <tun...@de...> - 2019-04-18 11:22:00
|
When using TIPC_SOCK_RECVQ_DEPTH for getsockopt(), it returns the
number of buffers in receive socket buffer which is not so helpful for user
space applications.

This commit introduces the new option TIPC_SOCK_RECVQ_USED which
returns the current allocated bytes of the receive socket buffer.
This helps user space applications dimension its buffer usage to avoid buffer
overload issue.

Signed-off-by: Tung Nguyen <tun...@de...>
---
 include/uapi/linux/tipc.h | 1 +
 net/tipc/socket.c         | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/include/uapi/linux/tipc.h b/include/uapi/linux/tipc.h
index 6b2fd4d9655f..7df026ea6aff 100644
--- a/include/uapi/linux/tipc.h
+++ b/include/uapi/linux/tipc.h
@@ -190,6 +190,7 @@ struct sockaddr_tipc {
 #define TIPC_MCAST_REPLICAST    134     /* Default: TIPC selects. No arg */
 #define TIPC_GROUP_JOIN         135     /* Takes struct tipc_group_req* */
 #define TIPC_GROUP_LEAVE        136     /* No argument */
+#define TIPC_SOCK_RECVQ_USED    137     /* Default: none (read only) */

 /*
  * Flag values
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 8ac8ddf1e324..1385207a301f 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -3070,6 +3070,9 @@ static int tipc_getsockopt(struct socket *sock, int lvl, int opt,
         case TIPC_SOCK_RECVQ_DEPTH:
                 value = skb_queue_len(&sk->sk_receive_queue);
                 break;
+        case TIPC_SOCK_RECVQ_USED:
+                value = sk_rmem_alloc_get(sk);
+                break;
         case TIPC_GROUP_JOIN:
                 seq.type = 0;
                 if (tsk->group)
--
2.17.1 |
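For readers who want to try the option from user space, a minimal sketch of how the new getsockopt() call might be used is given here. It is not part of the patch. The helper name tipc_recvq_used() is hypothetical, and the fallback defines are assumptions for systems whose headers predate this change: 137 is taken from the patch, and 271 is the SOL_TIPC value from linux/tipc.h.

```c
#include <sys/socket.h>

/* Fallbacks for older headers (assumption: values match linux/tipc.h). */
#ifndef SOL_TIPC
#define SOL_TIPC 271
#endif
#ifndef TIPC_SOCK_RECVQ_USED
#define TIPC_SOCK_RECVQ_USED 137 /* value introduced by the patch */
#endif

/*
 * Hypothetical helper: return the number of bytes currently allocated
 * in the receive buffer of TIPC socket 'sd', or -1 on error (errno is
 * set by getsockopt()). The option is read only.
 */
static int tipc_recvq_used(int sd)
{
    int used = 0;
    socklen_t len = sizeof(used);

    if (getsockopt(sd, SOL_TIPC, TIPC_SOCK_RECVQ_USED, &used, &len) < 0)
        return -1;
    return used;
}
```

An application could call tipc_recvq_used() on an open AF_TIPC socket and compare the result against its own threshold before asking peers to slow down. On kernels without this patch the call is expected to fail, so the -1 path should be handled.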
From: Jon M. <jon...@er...> - 2019-04-17 20:20:30
|
Hi Rune,
A little more testing, but still on Linux 5.0, reveals the following:

- tipc-config -b is gone, but tipc-config -be=... is still working as intended. The change seems to have happened when we moved from the old (tipc-config) netlink format internally to the new (tipc) netlink format. After this move, tipc-config is in reality using a wrapper around the tipc netlink code, and something must have broken at that point.
- tipc bearer enable domain 1.1.0 is accepted, but the domain is internally set to an invalid value, so the links will never come up.
- The same happens with tipc bearer enable domain 1001001
- tipc bearer enable domain 0x1001000 or any other string will set the domain internally to 0, so the links will come up, but of course, since 0 is the default value, the command is meaningless.

I will try to fix this, but I don't see it as an urgent matter unless you feel you are stuck.

BR
///jon

> -----Original Message-----
> From: Rune Torgersen <ru...@in...>
> Sent: 17-Apr-19 11:20
> To: Jon Maloy <jon...@er...>; tipc-dis...@li...
> Subject: RE: help with tipc command
>
> We use the netid param to separate different clusters on the same network,
> so I don't think I need the domain param anymore. I will attempt using it
> without the domain param and see if everything appears to work.
> Just a leftover from 15 years of TIPC usage...
>
> tipc-config seems to work for all bearer ops, except for -b
>
> Not working:
> root@test10020u2:~# tipc-config -b
> Bearers:
> operation not supported
> root@test10020u2:~# dmesg -c
> root@test10020u2:~# uname -r
> 4.4.0-145-generic
> root@test10020u2:~#
>
> Working:
> root@testrec198:~# tipc-config -b
> Bearers:
> eth:lo
> eth:eth0
> root@testrec198:~# uname -r
> 4.4.0-142-generic
> root@testrec198:~#
>
> root@testrec198:~# tipc-config -V
> TIPC configuration tool version 2.1.1
>
> So somewhere between 4.4.0-142 (I think it worked in -144 also) and -145 it
> stopped working.
>
> -----Original Message-----
> From: Jon Maloy <jon...@er...>
> Sent: Tuesday, April 16, 2019 15:02
> To: Rune Torgersen <ru...@in...>; tipc-dis...@li...
> Subject: RE: help with tipc command
>
> Hi,
> I tried it in my version (5.0), and the parameter was accepted both as 1.1.0
> and as 1001000, but the links didn't come up.
> It is possible that we have broken something, as this parameter is regarded
> as obsolete, and not really anything we pay attention to during testing.
> But of course it should work, just as tipc-config should continue working
> without any limitations.
> I'll look into this later today, when I find some time.
>
> Still, my first question is if you really need this setting. It is only needed when
> you are planning to have nodes interconnected on the same LAN/VLAN, but
> you still want to keep them isolated. Is that so?
> And, if you need this isolation, you could use the network id (nowadays
> called cluster id) instead.
> Just give it a try, while I start some troubleshooting on my part.
>
> BR
> ///jon
>
>
> > -----Original Message-----
> > From: Rune Torgersen <ru...@in...>
> > Sent: 16-Apr-19 12:39
> > To: Jon Maloy <jon...@er...>; tipc-dis...@li...
> > Subject: RE: help with tipc command
> >
> > Did find that site after a while.
> >
> > However it only shows enable without the domain param.
> >
> > I got "tipc bearer enable priority 5 media eth device eth2" to work
> > but still get that error on "tipc bearer enable domain 1.1.0 priority
> > 5 media eth device eth2"
> >
> > Nowhere in the documentation does it show me the correct format for
> > the domain param.
> >
> > -----Original Message-----
> > From: Jon Maloy <jon...@er...>
> > Sent: Tuesday, April 16, 2019 11:16
> > To: Rune Torgersen <ru...@in...>; tipc-dis...@li...
> > Subject: RE: help with tipc command
> >
> > Hi Rune,
> > Have a look here:
> >
> > http://tipc.io/getting_started.html
> >
> > Hilsen
> > ///jon
> >
> >
> > > -----Original Message-----
> > > From: Rune Torgersen <ru...@in...>
> > > Sent: 16-Apr-19 11:41
> > > To: tip...@li...
> > > Subject: [tipc-discussion] help with tipc command
> > >
> > > I am trying to convert some scripts from using tipc-config to newer
> > > tipc command (no, not "ip tipc" yet as it is not available on Ubuntu 16.04).
> > >
> > > I was not planning on doing this until we moved to Ubuntu 18, but I
> > > just noticed that with the latest Ubuntu 16.04 kernel (4.4.0-145)
> > > tipc-config -b no longer works (worked on 4.4.0-143).
> > >
> > > I managed to figure out syntax for bearer disable, but have problems
> > > with bearer enable.
> > >
> > >
> > > Old command was
> > > tipc-config -be=eth:eth2/1.1.0/5
> > >
> > > New is
> > >
> > > But gets an error
> > > root@test10020u3:~# tipc bearer enable domain 1.1.0 priority 5 media eth device eth2
> > > error: Invalid argument
> > > root@test10020u3:~# dmesg -c
> > > [ 2896.786648] Bearer <eth:eth2> rejected, illegal discovery domain
> > >
> > >
> > > Also, what would be the equivalents of:
> > > tipc-config -netid=$netid
> > > tipc-config -a=1.1.${unit}
> > >
> > >
> > > _______________________________________________
> > > tipc-discussion mailing list
> > > tip...@li...
> > > https://lists.sourceforge.net/lists/listinfo/tipc-discussion |
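As background to the domain values discussed in this thread, the sketch below shows how a legacy <Z.C.N> string such as 1.1.0 corresponds to a 32-bit TIPC address. It assumes the legacy layout used by tipc_addr() in linux/tipc.h (8-bit zone, 12-bit cluster, 12-bit node). The helper parse_zcn() is hypothetical and is not how the tipc tool or the kernel parses the parameter; it only illustrates the arithmetic, under which 1.1.0 becomes 0x01001000.

```c
#include <stdint.h>
#include <stdio.h>

/* Legacy TIPC address layout (assumed, as in linux/tipc.h): Z=8, C=12, N=12 bits. */
#define ZONE_SHIFT    24
#define CLUSTER_SHIFT 12

/*
 * Hypothetical helper: parse a "Z.C.N" string into a 32-bit address.
 * Returns 0 on success, -1 if the string is not in Z.C.N form or a
 * field is out of range.
 */
static int parse_zcn(const char *str, uint32_t *out)
{
    unsigned int z, c, n;
    char extra;

    /* Exactly three dotted fields, with no trailing characters. */
    if (sscanf(str, "%u.%u.%u%c", &z, &c, &n, &extra) != 3)
        return -1;
    if (z > 255 || c > 4095 || n > 4095)
        return -1;

    *out = ((uint32_t)z << ZONE_SHIFT) | ((uint32_t)c << CLUSTER_SHIFT) | n;
    return 0;
}
```

With this helper, parse_zcn("1.1.0", &addr) succeeds and leaves addr equal to 0x01001000, while strings such as "0x1001000" or "1001001" are rejected because they are not in dotted form.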
From: Rune T. <ru...@in...> - 2019-04-17 15:20:28
|
We use the netid param to separate different clusters on the same network, so I don't think I need the domain param anymore. I will attempt using it without the domain param and see if everything appears to work.
Just a leftover from 15 years of TIPC usage...

tipc-config seems to work for all bearer ops, except for -b

Not working:
root@test10020u2:~# tipc-config -b
Bearers:
operation not supported
root@test10020u2:~# dmesg -c
root@test10020u2:~# uname -r
4.4.0-145-generic
root@test10020u2:~#

Working:
root@testrec198:~# tipc-config -b
Bearers:
eth:lo
eth:eth0
root@testrec198:~# uname -r
4.4.0-142-generic
root@testrec198:~#

root@testrec198:~# tipc-config -V
TIPC configuration tool version 2.1.1

So somewhere between 4.4.0-142 (I think it worked in -144 also) and -145 it stopped working.

-----Original Message-----
From: Jon Maloy <jon...@er...>
Sent: Tuesday, April 16, 2019 15:02
To: Rune Torgersen <ru...@in...>; tip...@li...
Subject: RE: help with tipc command

Hi,
I tried it in my version (5.0), and the parameter was accepted both as 1.1.0 and as 1001000, but the links didn't come up.
It is possible that we have broken something, as this parameter is regarded as obsolete, and not really anything we pay attention to during testing.
But of course it should work, just as tipc-config should continue working without any limitations.
I'll look into this later today, when I find some time.

Still, my first question is if you really need this setting. It is only needed when you are planning to have nodes interconnected on the same LAN/VLAN, but you still want to keep them isolated. Is that so?
And, if you need this isolation, you could use the network id (nowadays called cluster id) instead.
Just give it a try, while I start some troubleshooting on my part.

BR
///jon

> -----Original Message-----
> From: Rune Torgersen <ru...@in...>
> Sent: 16-Apr-19 12:39
> To: Jon Maloy <jon...@er...>; tipc-dis...@li...
> Subject: RE: help with tipc command
>
> Did find that site after a while.
>
> However it only shows enable without the domain param.
>
> I got "tipc bearer enable priority 5 media eth device eth2" to work but still get
> that error on "tipc bearer enable domain 1.1.0 priority 5 media eth device
> eth2"
>
> Nowhere in the documentation does it show me the correct format for the
> domain param.
>
> -----Original Message-----
> From: Jon Maloy <jon...@er...>
> Sent: Tuesday, April 16, 2019 11:16
> To: Rune Torgersen <ru...@in...>; tipc-dis...@li...
> Subject: RE: help with tipc command
>
> Hi Rune,
> Have a look here:
>
> http://tipc.io/getting_started.html
>
> Hilsen
> ///jon
>
>
> > -----Original Message-----
> > From: Rune Torgersen <ru...@in...>
> > Sent: 16-Apr-19 11:41
> > To: tip...@li...
> > Subject: [tipc-discussion] help with tipc command
> >
> > I am trying to convert some scripts from using tipc-config to newer
> > tipc command (no, not "ip tipc" yet as it is not available on Ubuntu 16.04).
> >
> > I was not planning on doing this until we moved to Ubuntu 18, but I
> > just noticed that with the latest Ubuntu 16.04 kernel (4.4.0-145)
> > tipc-config -b no longer works (worked on 4.4.0-143).
> >
> > I managed to figure out syntax for bearer disable, but have problems
> > with bearer enable.
> >
> >
> > Old command was
> > tipc-config -be=eth:eth2/1.1.0/5
> >
> > New is
> >
> > But gets an error
> > root@test10020u3:~# tipc bearer enable domain 1.1.0 priority 5 media eth device eth2
> > error: Invalid argument
> > root@test10020u3:~# dmesg -c
> > [ 2896.786648] Bearer <eth:eth2> rejected, illegal discovery domain
> >
> >
> > Also, what would be the equivalents of:
> > tipc-config -netid=$netid
> > tipc-config -a=1.1.${unit}
> >
> >
> > _______________________________________________
> > tipc-discussion mailing list
> > tip...@li...
> > https://lists.sourceforge.net/lists/listinfo/tipc-discussion |
From: David M. <da...@da...> - 2019-04-17 04:32:34
|
From: Zhiqiang Liu <liu...@hu...>
Date: Tue, 16 Apr 2019 13:10:09 +0800

> From: Jie Liu <liu...@hu...>
>
> We find that sysctl_tipc_rmem and named_timeout do not have the right minimum
> setting. sysctl_tipc_rmem should be larger than zero, like sysctl_tcp_rmem.
> And named_timeout as a timeout setting should be not less than zero.
>
> Fixes: cc79dd1ba9c10 ("tipc: change socket buffer overflow control to respect sk_rcvbuf")
> Fixes: a5325ae5b8bff ("tipc: add name distributor resiliency queue")
> Signed-off-by: Jie Liu <liu...@hu...>
> Reported-by: Qiang Ning <nin...@hu...>
> Reviewed-by: Zhiqiang Liu <liu...@hu...>
> Reviewed-by: Miaohe Lin <lin...@hu...>

Applied. |