From: Jon M. <jon...@er...> - 2019-10-17 19:35:05
Hi Rune,

Do you see any signs of a general memory leak ("free") on your node?

In any case, there can be no doubt that this happens because the big buffer pool is running empty. We fixed that in commit 4c94cc2d3d57 ("tipc: fall back to smaller MTU if allocation of local send skb fails"), which was delivered in Linux 4.16.

Do you have any opportunity to apply that patch and try it?

BR
///jon

> -----Original Message-----
> From: Rune Torgersen <ru...@in...>
> Sent: 17-Oct-19 12:38
> To: 'tip...@li...' <tipc-dis...@li...>
> Subject: [tipc-discussion] Error allocating memeory error when sending RDM message
>
> Hi.
>
> I am running into an issue when sending SOCK_RDM or SOCK_DGRAM
> messages. On a system that has been up for a while (120+ days in this case), I
> cannot send any RDM/DGRAM type TIPC messages that are larger than about
> 16000 bytes (16033+ fails, 15100 and smaller still works).
> Any larger message fails with error code 12: "Cannot allocate memory".
>
> The really odd thing is that it only happens on some connections and not others
> on the same system (for example, sending to TIPC node 103:1003 gets no error,
> while sending to 103:3 gets the error).
> When it gets into this state, it seems to happen forever on the same
> destination address, and not on others, until the system is rebooted (restarting
> the server-side application makes no difference).
> The sends are done on the same node as the receiver is on.
>
> The kernel is Ubuntu 16.04 LTS 4.4.0-150 in this case; also seen on 161.
>
> Nametable for 103:
> 103 2 2 <1.1.1:2328193343> 2328193344 cluster
> 103 3 3 <1.1.2:3153441800> 3153441801 cluster
> 103 5 5 <1.1.4:269294867> 269294868 cluster
> 103 1002 1002 <1.1.1:490133365> 490133366 cluster
> 103 1003 1003 <1.1.2:2552019732> 2552019733 cluster
> 103 1005 1005 <1.1.4:625110186> 625110187 cluster
>
> _______________________________________________
> tipc-discussion mailing list
> tip...@li...
> https://lists.sourceforge.net/lists/listinfo/tipc-discussion
From: Rune T. <ru...@in...> - 2019-10-17 17:05:15
Hi.

I am running into an issue when sending SOCK_RDM or SOCK_DGRAM messages. On a system that has been up for a while (120+ days in this case), I cannot send any RDM/DGRAM type TIPC messages that are larger than about 16000 bytes (16033+ fails, 15100 and smaller still works). Any larger message fails with error code 12: "Cannot allocate memory".

The really odd thing is that it only happens on some connections and not others on the same system (for example, sending to TIPC node 103:1003 gets no error, while sending to 103:3 gets the error). When it gets into this state, it seems to happen forever on the same destination address, and not on others, until the system is rebooted (restarting the server-side application makes no difference). The sends are done on the same node as the receiver is on.

The kernel is Ubuntu 16.04 LTS 4.4.0-150 in this case; also seen on 161.

Nametable for 103:
103 2 2 <1.1.1:2328193343> 2328193344 cluster
103 3 3 <1.1.2:3153441800> 3153441801 cluster
103 5 5 <1.1.4:269294867> 269294868 cluster
103 1002 1002 <1.1.1:490133365> 490133366 cluster
103 1003 1003 <1.1.2:2552019732> 2552019733 cluster
103 1005 1005 <1.1.4:625110186> 625110187 cluster
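To narrow down the size threshold reported above (somewhere between 15100 and 16033 bytes), a small probe can bisect the largest SOCK_RDM payload that still sends to a given service address. The following is only a sketch under stated assumptions: all function and struct names are hypothetical, the search assumes that every size below the threshold succeeds and every size above it fails, and each probe really delivers a zero-filled message to the receiver, so it should only be pointed at a test service.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/tipc.h>

/* Callback: returns 0 if a message of `size` bytes was sent, else -errno. */
typedef int (*try_send_fn)(size_t size, void *ctx);

/* Binary search for the largest size in [lo, hi] that try_send accepts.
 * Returns 0 if no size in the range succeeds. */
size_t largest_ok_size(size_t lo, size_t hi, try_send_fn try_send, void *ctx)
{
    size_t best = 0;

    assert(try_send != NULL);
    while (lo <= hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (try_send(mid, ctx) == 0) {
            best = mid;
            lo = mid + 1;
        } else {
            if (mid == 0)
                break;          /* avoid wrapping below zero */
            hi = mid - 1;
        }
    }
    return best;
}

/* Hypothetical probe state: an open SOCK_RDM socket plus a destination. */
struct rdm_probe {
    int fd;
    struct sockaddr_tipc dst;
};

/* Fill in a service address such as 103:3 (type 103, instance 3). */
void rdm_probe_set_dst(struct rdm_probe *p, unsigned int type,
                       unsigned int instance)
{
    memset(&p->dst, 0, sizeof(p->dst));
    p->dst.family = AF_TIPC;
    p->dst.addrtype = TIPC_ADDR_NAME;
    p->dst.addr.name.name.type = type;
    p->dst.addr.name.name.instance = instance;
    p->dst.addr.name.domain = 0;
}

/* Real probe: sends `size` zero bytes; -ENOMEM corresponds to error code 12. */
int rdm_try_send(size_t size, void *ctx)
{
    struct rdm_probe *p = ctx;
    char *buf = calloc(1, size ? size : 1);
    int ret = 0;

    if (!buf)
        return -ENOMEM;
    if (sendto(p->fd, buf, size, 0,
               (struct sockaddr *)&p->dst, sizeof(p->dst)) < 0)
        ret = -errno;
    free(buf);
    return ret;
}

/* Dry-run stand-in: pretends every size above *ctx fails with ENOMEM. */
int fake_limit_send(size_t size, void *ctx)
{
    size_t limit = *(size_t *)ctx;

    return size <= limit ? 0 : -ENOMEM;
}
```

In a real run, `p.fd` would come from `socket(AF_TIPC, SOCK_RDM, 0)`, and `largest_ok_size(1, TIPC_MAX_USER_MSG_SIZE, rdm_try_send, &p)` could be called once for 103:3 and once for 103:1003 to compare the two destinations.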
From: Jon M. <jon...@er...> - 2019-10-16 15:23:47
> -----Original Message-----
> From: Ying Xue <yin...@wi...>
> Sent: 16-Oct-19 08:30
> To: Tuong Tong Lien <tuo...@de...>; tipc-dis...@li...; Jon Maloy <jon...@er...>; ma...@do...
> Subject: Re: [iproute2] tipc: add new commands to set TIPC AEAD key
>
> It looks like we will use the "tipc node" command to configure a static key in
> the TIPC module, right?

The key is static in the sense that TIPC itself cannot change the key. But the protocol ensures that keys can be replaced without any traffic disturbances.

> Do we plan to support dynamic key setting? If yes, what kind of key exchange
> protocol would we use? For example, IPSEC uses IKEv2 as its key
> exchange protocol.

At the moment we assume there is an external user land framework where node authentication is done and where keys are generated and distributed (via TLS) to the nodes. When we want to replace a key (probably at fixed, pre-defined intervals), the framework has to generate new keys and distribute/inject those into TIPC.

> Will the key expire after a specific lifetime? For instance,
> IPSEC/Raccoon2 or strongswan use a rekey feature to provide this
> function and make the security association safer.

We are considering this, so that the external framework can be kept simpler or even be eliminated. That would be the next step, once this series is applied.

Regards
///jon

> On 10/14/19 7:36 PM, Tuong Lien wrote:
> > Two new commands are added as part of 'tipc node' command:
> >
> > $tipc node set key KEY [algname ALGNAME] [nodeid NODEID]
> > $tipc node flush key
> >
> > which enable user to set and remove AEAD keys in kernel TIPC.
> >
> > For the 'set key' command, the given 'nodeid' parameter decides the
> > mode to be applied to the key, particularly:
> >
> > - If NODEID is empty, the key is a 'cluster' key which will be used
> > for all message encryption/decryption from/to the node (i.e. both TX & RX).
> > The same key needs to be set in the other nodes i.e. the 'cluster key'
> > mode.
> >
> > - If NODEID is own node, the key is used for message encryption (TX)
> > from the node. Whereas, if NODEID is a peer node, the key is for
> > message decryption (RX) from that peer node.
> > This is the 'per-node-key' mode that each nodes in the cluster has its
> > specific (TX) key.
> >
> > Signed-off-by: Tuong Lien <tuo...@de...>
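The three cases in the quoted 'set key' description (no NODEID, own NODEID, peer NODEID) can be summarised as a small classification function. This is a minimal sketch of the rule as the description states it, not the kernel's actual logic; the enum, the function name and the `NODEID_LEN` constant are hypothetical, with the length assumed to mirror `TIPC_NODEID_LEN`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NODEID_LEN 16   /* assumption: same size as TIPC_NODEID_LEN */

/* How a key is used, following the quoted 'set key' description. */
enum key_mode {
    KEY_CLUSTER,    /* no NODEID given: same key for TX and RX on all nodes */
    KEY_NODE_TX,    /* NODEID is the own node: key encrypts outgoing messages */
    KEY_NODE_RX     /* NODEID is a peer: key decrypts messages from that peer */
};

/* Classify a key from the own node identity and the optional NODEID
 * argument; pass NULL for nodeid when it was not given on the command line. */
enum key_mode key_mode_for(const unsigned char *own_id,
                           const unsigned char *nodeid)
{
    assert(own_id != NULL);
    if (nodeid == NULL)
        return KEY_CLUSTER;
    if (memcmp(own_id, nodeid, NODEID_LEN) == 0)
        return KEY_NODE_TX;
    return KEY_NODE_RX;
}
```

Under this reading, a cluster in per-node-key mode needs one `tipc node set key ... nodeid OWN` on each node plus one `set key` per peer whose traffic it should decrypt.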
From: Jon M. <jon...@er...> - 2019-10-16 14:41:43
> -----Original Message-----
> From: Ying Xue <yin...@wi...>
> Sent: 16-Oct-19 08:13
> To: Tuong Tong Lien <tuo...@de...>; tipc-dis...@li...; Jon Maloy <jon...@er...>; ma...@do...
> Subject: Re: [PATCH RFC 0/5] TIPC encryption
>
> Looks like this is an amazing proposal!
>
> I had the idea a long time ago, but at that moment I didn't think encrypting TIPC
> messages was meaningful because TIPC was mostly used within an internal
> network. After the UDP bearer was supported and one TIPC node was capable of
> communicating with its peers across IP, it seemed the encryption feature
> became useful. But if needed, we could enable IPSEC in this situation.
>
> At present, the only useful scenario that I can imagine is that TIPC will be used as
> low level communication infrastructure in a Docker or k8s environment. Are there
> other cases?

The main driver for this has been that Ericsson customers want a fully encrypted "backplane" even for TIPC traffic that doesn't use UDP. We have considered MACsec, but that is not always desirable for our customers, just as they are not always happy with IPsec. So the solution was to make TIPC "self sufficient" regarding encryption. Now we can also benefit from the fact that we can encrypt true multicast, something nobody else is doing.

> Sorry, I am pretty busy this week, and significant changes are made in the
> series. I will need a bit more time to review the series.
> Please wait for a while.

We are looking forward to your feedback.

BR
///jon

> On 10/14/19 7:07 PM, Tuong Lien wrote:
> > This series provides TIPC encryption feature, kernel part. There will
> > be another one in the 'iproute2/tipc' for user space to set key.
> >
> > Tuong Lien (5):
> >   tipc: add reference counter to bearer
> >   tipc: enable creating a "preliminary" node
> >   tipc: add new AEAD key structure for user API
> >   tipc: introduce TIPC encryption & authentication
> >   tipc: add support for AEAD key setting via netlink
> >
> >  include/uapi/linux/tipc.h         |   21 +
> >  include/uapi/linux/tipc_netlink.h |    4 +
> >  net/tipc/Makefile                 |    2 +-
> >  net/tipc/bcast.c                  |    2 +-
> >  net/tipc/bearer.c                 |   52 +-
> >  net/tipc/bearer.h                 |    6 +-
> >  net/tipc/core.c                   |   10 +
> >  net/tipc/core.h                   |    4 +
> >  net/tipc/crypto.c                 | 1986 +++++++++++++++++++++++++++++++++++++
> >  net/tipc/crypto.h                 |  166 ++++
> >  net/tipc/link.c                   |   16 +-
> >  net/tipc/link.h                   |    1 +
> >  net/tipc/msg.c                    |   24 +-
> >  net/tipc/msg.h                    |   44 +-
> >  net/tipc/netlink.c                |   16 +-
> >  net/tipc/node.c                   |  314 +++++-
> >  net/tipc/node.h                   |   10 +
> >  net/tipc/sysctl.c                 |    9 +
> >  net/tipc/udp_media.c              |    1 +
> >  19 files changed, 2604 insertions(+), 84 deletions(-)
> >  create mode 100644 net/tipc/crypto.c
> >  create mode 100644 net/tipc/crypto.h
From: Ying X. <yin...@wi...> - 2019-10-16 12:43:23
It looks like we will use the "tipc node" command to configure a static key in the TIPC module, right?

Do we plan to support dynamic key setting? If yes, what kind of key exchange protocol would we use? For example, IPSEC uses IKEv2 as its key exchange protocol.

Will the key expire after a specific lifetime? For instance, IPSEC/Raccoon2 or strongswan use a rekey feature to provide this function and make the security association safer.

On 10/14/19 7:36 PM, Tuong Lien wrote:
> Two new commands are added as part of 'tipc node' command:
>
> $tipc node set key KEY [algname ALGNAME] [nodeid NODEID]
> $tipc node flush key
>
> which enable user to set and remove AEAD keys in kernel TIPC.
>
> For the 'set key' command, the given 'nodeid' parameter decides the
> mode to be applied to the key, particularly:
>
> - If NODEID is empty, the key is a 'cluster' key which will be used for
> all message encryption/decryption from/to the node (i.e. both TX & RX).
> The same key needs to be set in the other nodes i.e. the 'cluster key'
> mode.
>
> - If NODEID is own node, the key is used for message encryption (TX)
> from the node. Whereas, if NODEID is a peer node, the key is for
> message decryption (RX) from that peer node.
> This is the 'per-node-key' mode that each nodes in the cluster has its
> specific (TX) key.
>
> Signed-off-by: Tuong Lien <tuo...@de...>
> ---
>  include/uapi/linux/tipc.h         |  21 ++++++
>  include/uapi/linux/tipc_netlink.h |   4 ++
>  tipc/misc.c                       |  38 +++++++++++
>  tipc/misc.h                       |   1 +
>  tipc/node.c                       | 133 +++++++++++++++++++++++++++++++++++++-
>  5 files changed, 195 insertions(+), 2 deletions(-)
>
> diff --git a/include/uapi/linux/tipc.h b/include/uapi/linux/tipc.h
> index e16cb4e2..b118ce9b 100644
> --- a/include/uapi/linux/tipc.h
> +++ b/include/uapi/linux/tipc.h
> @@ -232,6 +232,27 @@ struct tipc_sioc_nodeid_req {
>  	char node_id[TIPC_NODEID_LEN];
>  };
>
> +/*
> + * TIPC Crypto, AEAD mode
> + */
> +#define TIPC_AEAD_MAX_ALG_NAME (32)
> +#define TIPC_AEAD_MIN_KEYLEN (16 + 4)
> +#define TIPC_AEAD_MAX_KEYLEN (32 + 4)
> +
> +struct tipc_aead_key {
> +	char alg_name[TIPC_AEAD_MAX_ALG_NAME];
> +	unsigned int keylen; /* in bytes */
> +	char key[];
> +};
> +
> +#define TIPC_AEAD_KEY_MAX_SIZE (sizeof(struct tipc_aead_key) + \
> +				TIPC_AEAD_MAX_KEYLEN)
> +
> +static inline int tipc_aead_key_size(struct tipc_aead_key *key)
> +{
> +	return sizeof(*key) + key->keylen;
> +}
> +
>  /* The macros and functions below are deprecated:
>   */
>
> diff --git a/include/uapi/linux/tipc_netlink.h b/include/uapi/linux/tipc_netlink.h
> index efb958fd..6c2194ab 100644
> --- a/include/uapi/linux/tipc_netlink.h
> +++ b/include/uapi/linux/tipc_netlink.h
> @@ -63,6 +63,8 @@ enum {
>  	TIPC_NL_PEER_REMOVE,
>  	TIPC_NL_BEARER_ADD,
>  	TIPC_NL_UDP_GET_REMOTEIP,
> +	TIPC_NL_KEY_SET,
> +	TIPC_NL_KEY_FLUSH,
>
>  	__TIPC_NL_CMD_MAX,
>  	TIPC_NL_CMD_MAX = __TIPC_NL_CMD_MAX - 1
> @@ -160,6 +162,8 @@ enum {
>  	TIPC_NLA_NODE_UNSPEC,
>  	TIPC_NLA_NODE_ADDR, /* u32 */
>  	TIPC_NLA_NODE_UP, /* flag */
> +	TIPC_NLA_NODE_ID, /* data */
> +	TIPC_NLA_NODE_KEY, /* data */
>
>  	__TIPC_NLA_NODE_MAX,
>  	TIPC_NLA_NODE_MAX = __TIPC_NLA_NODE_MAX - 1
> diff --git a/tipc/misc.c b/tipc/misc.c
> index e4b1cd0c..1daf3072 100644
> --- a/tipc/misc.c
> +++ b/tipc/misc.c
> @@ -98,6 +98,44 @@ int str2nodeid(char *str, uint8_t *id)
>  	return 0;
>  }
>
> +int str2key(char *str, struct tipc_aead_key *key)
> +{
> +	int len = strlen(str);
> +	int ishex = 0;
> +	int i;
> +
> +	/* Check if the input is a hex string (i.e. 0x...) */
> +	if (len > 2 && strncmp(str, "0x", 2) == 0) {
> +		ishex = is_hex(str + 2, len - 2 - 1);
> +		if (ishex) {
> +			len -= 2;
> +			str += 2;
> +		}
> +	}
> +
> +	/* Obtain key: */
> +	if (!ishex) {
> +		key->keylen = len;
> +		memcpy(key->key, str, len);
> +	} else {
> +		/* Convert hex string to key */
> +		key->keylen = (len + 1) / 2;
> +		for (i = 0; i < key->keylen; i++) {
> +			if (i == 0 && len % 2 != 0) {
> +				if (sscanf(str, "%1hhx", &key->key[0]) != 1)
> +					return -1;
> +				str += 1;
> +				continue;
> +			}
> +			if (sscanf(str, "%2hhx", &key->key[i]) != 1)
> +				return -1;
> +			str += 2;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
>  void nodeid2str(uint8_t *id, char *str)
>  {
>  	int i;
> diff --git a/tipc/misc.h b/tipc/misc.h
> index ff2f31f1..59309f68 100644
> --- a/tipc/misc.h
> +++ b/tipc/misc.h
> @@ -18,5 +18,6 @@ uint32_t str2addr(char *str);
>  int str2nodeid(char *str, uint8_t *id);
>  void nodeid2str(uint8_t *id, char *str);
>  void hash2nodestr(uint32_t hash, char *str);
> +int str2key(char *str, struct tipc_aead_key *key);
>
>  #endif
> diff --git a/tipc/node.c b/tipc/node.c
> index 2fec6753..fc81bd30 100644
> --- a/tipc/node.c
> +++ b/tipc/node.c
> @@ -157,6 +157,111 @@ static int cmd_node_set_nodeid(struct nlmsghdr *nlh, const struct cmd *cmd,
>  	return msg_doit(nlh, NULL, NULL);
>  }
>
> +static void cmd_node_set_key_help(struct cmdl *cmdl)
> +{
> +	fprintf(stderr,
> +		"Usage: %s node set key KEY [algname ALGNAME] [nodeid NODEID]\n\n"
> +		"PROPERTIES\n"
> +		" KEY - Symmetric KEY & SALT as a normal or hex string\n"
> +		" that consists of two parts:\n"
> +		" [KEY: 16, 24 or 32 octets][SALT: 4 octets]\n\n"
> +		" algname ALGNAME - Default: \"gcm(aes)\"\n\n"
> +		" nodeid NODEID - Own or peer node identity to which the key will\n"
> +		" be attached. If not present, the key is a cluster\n"
> +		" key!\n\n"
> +		"EXAMPLES\n"
> +		" %s node set key this_is_a_key16_salt algname \"gcm(aes)\" nodeid node1\n"
> +		" %s node set key 0x746869735F69735F615F6B657931365F73616C74 nodeid node2\n\n",
> +		cmdl->argv[0], cmdl->argv[0], cmdl->argv[0]);
> +}
> +
> +static int cmd_node_set_key(struct nlmsghdr *nlh, const struct cmd *cmd,
> +			    struct cmdl *cmdl, void *data)
> +{
> +	struct {
> +		struct tipc_aead_key key;
> +		char mem[TIPC_AEAD_MAX_KEYLEN + 1];
> +	} input = {};
> +	struct opt opts[] = {
> +		{ "algname", OPT_KEYVAL, NULL },
> +		{ "nodeid", OPT_KEYVAL, NULL },
> +		{ NULL }
> +	};
> +	struct nlattr *nest;
> +	struct opt *opt_algname, *opt_nodeid;
> +	char buf[MNL_SOCKET_BUFFER_SIZE];
> +	uint8_t id[TIPC_NODEID_LEN] = {0,};
> +	int keysize;
> +	char *str;
> +
> +	if (help_flag) {
> +		(cmd->help)(cmdl);
> +		return -EINVAL;
> +	}
> +
> +	if (cmdl->optind >= cmdl->argc) {
> +		fprintf(stderr, "error, missing key\n");
> +		return -EINVAL;
> +	}
> +
> +	/* Get user key */
> +	str = shift_cmdl(cmdl);
> +	if (str2key(str, &input.key)) {
> +		fprintf(stderr, "error, invalid key input\n");
> +		return -EINVAL;
> +	}
> +
> +	if (parse_opts(opts, cmdl) < 0)
> +		return -EINVAL;
> +
> +	/* Get algorithm name, default: "gcm(aes)" */
> +	opt_algname = get_opt(opts, "algname");
> +	if (!opt_algname)
> +		strcpy(input.key.alg_name, "gcm(aes)");
> +	else
> +		strcpy(input.key.alg_name, opt_algname->val);
> +
> +	/* Get node identity */
> +	opt_nodeid = get_opt(opts, "nodeid");
> +	if (opt_nodeid && str2nodeid(opt_nodeid->val, id)) {
> +		fprintf(stderr, "error, invalid node identity\n");
> +		return -EINVAL;
> +	}
> +
> +	/* Init & do the command */
> +	nlh = msg_init(buf, TIPC_NL_KEY_SET);
> +	if (!nlh) {
> +		fprintf(stderr, "error, message initialisation failed\n");
> +		return -1;
> +	}
> +	nest = mnl_attr_nest_start(nlh, TIPC_NLA_NODE);
> +	keysize = tipc_aead_key_size(&input.key);
> +	mnl_attr_put(nlh, TIPC_NLA_NODE_KEY, keysize, &input.key);
> +	if (opt_nodeid)
> +		mnl_attr_put(nlh, TIPC_NLA_NODE_ID, TIPC_NODEID_LEN, id);
> +	mnl_attr_nest_end(nlh, nest);
> +	return msg_doit(nlh, NULL, NULL);
> +}
> +
> +static int cmd_node_flush_key(struct nlmsghdr *nlh, const struct cmd *cmd,
> +			      struct cmdl *cmdl, void *data)
> +{
> +	char buf[MNL_SOCKET_BUFFER_SIZE];
> +
> +	if (help_flag) {
> +		(cmd->help)(cmdl);
> +		return -EINVAL;
> +	}
> +
> +	/* Init & do the command */
> +	nlh = msg_init(buf, TIPC_NL_KEY_FLUSH);
> +	if (!nlh) {
> +		fprintf(stderr, "error, message initialisation failed\n");
> +		return -1;
> +	}
> +	return msg_doit(nlh, NULL, NULL);
> +}
> +
>  static int nodeid_get_cb(const struct nlmsghdr *nlh, void *data)
>  {
>  	struct nlattr *info[TIPC_NLA_MAX + 1] = {};
> @@ -270,13 +375,34 @@ static int cmd_node_set_netid(struct nlmsghdr *nlh, const struct cmd *cmd,
>  	return msg_doit(nlh, NULL, NULL);
>  }
>
> +static void cmd_node_flush_help(struct cmdl *cmdl)
> +{
> +	fprintf(stderr,
> +		"Usage: %s node flush PROPERTY\n\n"
> +		"PROPERTIES\n"
> +		" key - Flush all symmetric-keys\n",
> +		cmdl->argv[0]);
> +}
> +
> +static int cmd_node_flush(struct nlmsghdr *nlh, const struct cmd *cmd,
> +			  struct cmdl *cmdl, void *data)
> +{
> +	const struct cmd cmds[] = {
> +		{ "key", cmd_node_flush_key, NULL },
> +		{ NULL }
> +	};
> +
> +	return run_cmd(nlh, cmd, cmds, cmdl, NULL);
> +}
> +
>  static void cmd_node_set_help(struct cmdl *cmdl)
>  {
>  	fprintf(stderr,
>  		"Usage: %s node set PROPERTY\n\n"
>  		"PROPERTIES\n"
>  		" identity NODEID - Set node identity\n"
> -		" clusterid CLUSTERID - Set local cluster id\n",
> +		" clusterid CLUSTERID - Set local cluster id\n"
> +		" key PROPERTY - Set symmetric-key\n",
>  		cmdl->argv[0]);
>  }
>
> @@ -288,6 +414,7 @@ static int cmd_node_set(struct nlmsghdr *nlh, const struct cmd *cmd,
>  		{ "identity", cmd_node_set_nodeid, NULL },
>  		{ "netid", cmd_node_set_netid, NULL },
>  		{ "clusterid", cmd_node_set_netid, NULL },
> +		{ "key", cmd_node_set_key, cmd_node_set_key_help },
>  		{ NULL }
>  	};
>
> @@ -325,7 +452,8 @@ void cmd_node_help(struct cmdl *cmdl)
>  		"COMMANDS\n"
>  		" list - List remote nodes\n"
>  		" get - Get local node parameters\n"
> -		" set - Set local node parameters\n",
> +		" set - Set local node parameters\n"
> +		" flush - Flush local node parameters\n",
>  		cmdl->argv[0]);
>  }
>
> @@ -336,6 +464,7 @@ int cmd_node(struct nlmsghdr *nlh, const struct cmd *cmd, struct cmdl *cmdl,
>  		{ "list", cmd_node_list, NULL },
>  		{ "get", cmd_node_get, cmd_node_get_help },
>  		{ "set", cmd_node_set, cmd_node_set_help },
> +		{ "flush", cmd_node_flush, cmd_node_flush_help},
>  		{ NULL }
>  	};
>
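The KEY argument in the quoted help text is either a plain string or a "0x"-prefixed hex string, made up of a 16, 24 or 32 octet key followed by a 4 octet salt. The standalone sketch below illustrates that format. It is not the patch's `str2key`: the function name, the local length constants and the explicit bounds and length checks are assumptions added for illustration, and the quoted `str2key` does not perform those checks itself.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define KEY_MIN_LEN (16 + 4)    /* mirrors TIPC_AEAD_MIN_KEYLEN in the patch */
#define KEY_MAX_LEN (32 + 4)    /* mirrors TIPC_AEAD_MAX_KEYLEN in the patch */

/* Value of one hex digit; the caller has already checked isxdigit(). */
static int hex_val(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return tolower((unsigned char)c) - 'a' + 10;
}

/* Return 1 if all n characters of s are hex digits, else 0. */
static int all_hex(const char *s, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (!isxdigit((unsigned char)s[i]))
            return 0;
    return 1;
}

/* Parse KEY into out[0..cap). Returns the key length in bytes (key plus
 * 4-octet salt), or -1 if the input does not fit or has an invalid length. */
int parse_aead_key(const char *str, unsigned char *out, size_t cap)
{
    size_t len, i = 0, o = 0;

    assert(str != NULL && out != NULL);
    len = strlen(str);

    if (len > 2 && strncmp(str, "0x", 2) == 0 && all_hex(str + 2, len - 2)) {
        const char *hex = str + 2;
        size_t hlen = len - 2;

        if ((hlen + 1) / 2 > cap)
            return -1;
        if (hlen % 2)                       /* odd length: lone first digit */
            out[o++] = (unsigned char)hex_val(hex[i++]);
        for (; i < hlen; i += 2)
            out[o++] = (unsigned char)(hex_val(hex[i]) << 4 |
                                       hex_val(hex[i + 1]));
    } else {
        if (len > cap)
            return -1;
        memcpy(out, str, len);              /* plain string used as-is */
        o = len;
    }

    /* Accept only [KEY: 16, 24 or 32 octets][SALT: 4 octets]. */
    if (o != 16 + 4 && o != 24 + 4 && o != 32 + 4)
        return -1;
    return (int)o;
}
```

Both examples from the help text, `this_is_a_key16_salt` and `0x746869735F69735F615F6B657931365F73616C74`, describe the same 20 bytes: a 16-octet key followed by a 4-octet salt.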
From: Ying X. <yin...@wi...> - 2019-10-16 12:25:26
Looks like this is an amazing proposal!

I had the idea a long time ago, but at that moment I didn't think encrypting TIPC messages was meaningful because TIPC was mostly used within an internal network. After the UDP bearer was supported and one TIPC node was capable of communicating with its peers across IP, it seemed the encryption feature became useful. But if needed, we could enable IPSEC in this situation.

At present, the only useful scenario that I can imagine is that TIPC will be used as low level communication infrastructure in a Docker or k8s environment. Are there other cases?

Sorry, I am pretty busy this week, and significant changes are made in the series. I will need a bit more time to review the series. Please wait for a while.

On 10/14/19 7:07 PM, Tuong Lien wrote:
> This series provides TIPC encryption feature, kernel part. There will be
> another one in the 'iproute2/tipc' for user space to set key.
>
> Tuong Lien (5):
>   tipc: add reference counter to bearer
>   tipc: enable creating a "preliminary" node
>   tipc: add new AEAD key structure for user API
>   tipc: introduce TIPC encryption & authentication
>   tipc: add support for AEAD key setting via netlink
>
>  include/uapi/linux/tipc.h         |   21 +
>  include/uapi/linux/tipc_netlink.h |    4 +
>  net/tipc/Makefile                 |    2 +-
>  net/tipc/bcast.c                  |    2 +-
>  net/tipc/bearer.c                 |   52 +-
>  net/tipc/bearer.h                 |    6 +-
>  net/tipc/core.c                   |   10 +
>  net/tipc/core.h                   |    4 +
>  net/tipc/crypto.c                 | 1986 +++++++++++++++++++++++++++++++++++++
>  net/tipc/crypto.h                 |  166 ++++
>  net/tipc/link.c                   |   16 +-
>  net/tipc/link.h                   |    1 +
>  net/tipc/msg.c                    |   24 +-
>  net/tipc/msg.h                    |   44 +-
>  net/tipc/netlink.c                |   16 +-
>  net/tipc/node.c                   |  314 +++++-
>  net/tipc/node.h                   |   10 +
>  net/tipc/sysctl.c                 |    9 +
>  net/tipc/udp_media.c              |    1 +
>  19 files changed, 2604 insertions(+), 84 deletions(-)
>  create mode 100644 net/tipc/crypto.c
>  create mode 100644 net/tipc/crypto.h
From: Ying X. <yin...@wi...> - 2019-10-16 12:02:52
|
On 10/15/19 7:46 PM, Jon Maloy wrote:
> You must have forgot that since commit 6c9081a3915d ("add loopback device tracing") this is no problem any more.
> Of course we do the same in this case, so a trouble shooter only needs to do tcpdump on the sender's loopback interface.

Oh, then the biggest inconvenience is gone. Please go ahead.
|
From: Ying X. <yin...@wi...> - 2019-10-16 11:58:10
|
On 10/16/19 4:41 AM, Mohamed Hamed El-Gamal wrote:
> Hello,
>
> I would like to ask you regarding the optimum way to run TIPC over
> containerized workload
> Will it be using MACVLAN interfaces for docker and K8s ?
>

Interfaces attached to different containers are connected to a virtual bridge on the host, and each of the containers serves as a TIPC node. That setup definitely works.

I think a macvlan interface used as the TIPC bearer should be the most efficient way for Docker or k8s. Note, however, that the macvlan interface must be created in "bridge" mode; otherwise TIPC links between different containers cannot be established.

I have not run any experiment to verify that configuring a macvlan interface as a TIPC bearer works. If you would like to try, please share your test results.

> Note: We are using also multi-networking
>
> It would be great if there is any supportive documentations
>
> Thanks a lot
> Best Regards
>
> _______________________________________________
> tipc-discussion mailing list
> tip...@li...
> https://lists.sourceforge.net/lists/listinfo/tipc-discussion
>
|
From: Mohamed H. El-G. <mhm...@gm...> - 2019-10-15 20:41:52
|
Hello,

I would like to ask about the best way to run TIPC over containerized workloads.
Will it use MACVLAN interfaces for Docker and K8s?

Note: we are also using multi-networking.

It would be great if there is any supporting documentation.

Thanks a lot
Best Regards
|
From: Jon M. <jon...@er...> - 2019-10-15 13:22:32
|
> -----Original Message----- > From: Xue, Ying <Yin...@wi...> > Sent: 14-Oct-19 11:55 > To: Jon Maloy <jon...@er...>; Xin Long <lx...@re...> > Cc: tip...@li... > Subject: RE: [net-next] tipc: improve throughput between nodes in netns > > Hi Jon, > > Please see my comment inline. > > At netdev 0x13 in Prague last July there was presented a related proposal > https://netdevconf.info/0x13/session.html?talk-AF_GRAFT. > I was there, and I cannot say there was any overwhelming approval of this > proposal, but neither was it rejected out of hand. > > [Ying] The idea of AF_GRAFT socket is exactly the same as this patch. If it can > be recognized, it's definitely worth trying to submit this patch to upstream. But > after my checking, the weird thing is that AF_GRAFT is not supported by latest > kernel and I don't find its author ever attempted to submit its patch to > upstream. > > First, I see TIPC as an IPC, not a network protocol, and anybody using TIPC > inside a cluster has per definition been authenticated to start a node and > connect to the cluster. Here, there is no change from current policies. > Once a node has been accepted in a cluster, possibly via encrypted discovery > messages which have been passing all policies checks, and we are 100% > certain it is legitimate and located in the same kernel (as we are trying to > ensure in this patch), I cannot see any reason why we should not be allowed to > short-cut the stack the way we do. Security checks have already been done. > Are we circumventing any other policies by doing this that must not be done? > > [Ying] If we treat TIPC as IPC channel, bypassing its lower level interface is > acceptable.
Besides AF_GRAFT socket, in fact AF_UNIX socket provides an > interconnection mechanism between different processes on socket level, and > there are several options available for us to configure policies against socket, > such as, SO_ATTACH_FILTER, SO_ATTACH_BPF, > SO_ATTACH_REUSEPORT_EBPF etc. If we bypass TIPC bearer, the most > inconvenient thing is that it's hard for us to monitor traffic between netns > with tcpdump. You must have forgotten that since commit 6c9081a3915d ("add loopback device tracing") this is no problem any more. Of course we do the same in this case, so a troubleshooter only needs to run tcpdump on the sender's loopback interface. ///jon Of course, as Xin mentioned previously, we could not use > traditional tools to control/shape TIPC traffic across netns. > > Unless you strongly object I would suggest we send this to netdev as an RFC > and observe the reactions. If David or Eric or any of the other heavyweights say > flatly no there is nothing we can do. But it might be worth a try. > > [Ying] No, I don't strongly object to this proposal. We can try to submit it to the > net-next mailing list. > > Thanks, > Ying > > > -----Original Message----- > > From: Xue, Ying <Yin...@wi...> > > Sent: 11-Oct-19 07:58 > > To: Jon Maloy <jon...@er...>; Xin Long <lx...@re...> > > Subject: RE: [net-next] tipc: improve throughput between nodes in > > netns > > > > Exactly. I agree with Xin. The major purpose of namespace is mainly to > > provide an isolated environment. But as this patch almost completely > > bypasses security check points of networking stack, the traffic > > between namespaces will be out of control. So I don't think this is a good > idea. > > > > Thanks, > > Ying > > > > -----Original Message----- > > From: Jon Maloy [mailto:jon...@er...] > > Sent: Friday, October 11, 2019 2:14 AM > > To: Xin Long > > Cc: Xue, Ying > > Subject: RE: [net-next] tipc: improve throughput between nodes in > > netns > > > > Hi Xin, > > I am not surprised by your answer.
Apart from concerns about security, > > this is the same objection I have heard from others when presenting > > this idea, and I suspect that this would also be the reaction if we try to > deliver this to David. > > If we can achieve anything close to this by adding GSO to the veth > > interface I think that would be a safer approach. > > So, I suggest we put this one to rest for now, and I'll try to go > > ahead with the GSO approach instead. > > > > Sorry Hoang for making you waste your time. > > > > BR > > ///jon > > > > > -----Original Message----- > > > From: Xin Long <lx...@re...> > > > Sent: 10-Oct-19 07:14 > > > To: Jon Maloy <jon...@er...> > > > Cc: Ying Xue <yin...@wi...> > > > Subject: Re: [net-next] tipc: improve throughput between nodes in > > > netns > > > > > > > > > > > > ----- Original Message ----- > > > > Ying and Xin, > > > > This is the "wormhole" functionality I have been suggesting a > > > > since while back. > > > > Basically, we send messages directly socket to socket between name > > > > spaces on the same host, not only between sockets within the same > > > > name > > > space. > > > > As you will understand this might have a huge positive impact on > > > > performance between e.g., docker containers or containers inside > > > Kubernetes pods. > > > > > > > > Please spend some time reviewing this, as it might be a > > > > controversial feature. It is imperative that we get security right here. > > > > > > > If I understand it right: > > > > > > With this patch, TIPC packets will skip all lower layers protocol > > > stack, like IP (udp media), ether link layer, which means all rules > > > of like tc, ovs, netfiler/br_netfilter will be skipped. > > > > > > I don't think this could be endured, especially when it comes to a > > > cloud environment where many rules are configured on those virtual > > > NICs. Unless we have some special needs, I'm not sure if this > > > performance improvement is worth a big protocol stack skip. 
> > > > > > Thanks. > > > > > > > BR > > > > ///jon > > > > > > > > > > > > -----Original Message----- > > > > From: Hoang Le <hoa...@de...> > > > > Sent: 2-Oct-19 06:26 > > > > To: Jon Maloy <jon...@er...>; ma...@do...; > > > > tip...@li... > > > > Subject: [net-next] tipc: improve throughput between nodes in > > > > netns > > > > > > > > Introduce traffic cross namespaces transmission as local node. > > > > By this way, throughput between nodes in namespace as fast as local. > > > > > > > > Testcase: > > > > $ip netns exec 1 benchmark_client -c 100 $ip netns exec 2 > > > > benchmark_server > > > > > > > > Before: > > > > +----------------------------------------------------------------- > > > > +---------------------------- > > + > > > > | Msg Size | # | # Msgs/ | Elapsed | Throughput > > > > | | > > > > | [octets] | Conns | Conn | [ms] > > > > | +------------------------------------------------+ > > > > | | | | | Total [Msg/s] | Total [Mb/s] | > > > > | | | | | Per Conn [Mb/s] | > > > > +----------------------------------------------------------------- > > > > +---------------------------- > > + > > > > | 64 | 100 | 64000 | 13005 | 492103 | 251 | > > > > | 2 | > > > > +----------------------------------------------------------------- > > > > +---------------------------- > > + > > > > | 256 | 100 | 32000 | 4964 | 644627 | 1320 | > > > > | 13 | > > > > +----------------------------------------------------------------- > > > > +---------------------------- > > + > > > > | 1024 | 100 | 16000 | 4524 | 353612 | 2896 | > > > > | 28 | > > > > +----------------------------------------------------------------- > > > > +---------------------------- > > + > > > > | 4096 | 100 | 8000 | 3675 | 217644 | 7131 | > > > > | 71 | > > > > +----------------------------------------------------------------- > > > > +---------------------------- > > + > > > > | 16384 | 100 | 4000 | 7914 | 50540 | 6624 | > > > > | 66 | > > > > 
+----------------------------------------------------------------- > > > > +---------------------------- > > + > > > > | 65536 | 100 | 2000 | 13000 | 15384 | 8065 | > > > > | 80 | > > > > +----------------------------------------------------------------- > > > > +---------------------------- > > + > > > > > > > > After: > > > > +----------------------------------------------------------------- > > > > +---------------------------- > > + > > > > | Msg Size | # | # Msgs/ | Elapsed | Throughput > > > > | | > > > > | [octets] | Conns | Conn | [ms] > > > > | +------------------------------------------------+ > > > > | | | | | Total [Msg/s] | Total [Mb/s] | > > > > | | | | | Per Conn [Mb/s] | > > > > +----------------------------------------------------------------- > > > > +---------------------------- > > + > > > > | 64 | 100 | 64000 | 7842 | 816090 | 417 | > > > > | 4 | > > > > +----------------------------------------------------------------- > > > > +---------------------------- > > + > > > > | 256 | 100 | 32000 | 3593 | 890469 | 1823 | > > > > | 18 | > > > > +----------------------------------------------------------------- > > > > +---------------------------- > > + > > > > | 1024 | 100 | 16000 | 1835 | 871828 | 7142 | > > > > | 71 | > > > > +----------------------------------------------------------------- > > > > +---------------------------- > > + > > > > | 4096 | 100 | 8000 | 1134 | 704904 | 23098 | > > > > | 230 | > > > > +----------------------------------------------------------------- > > > > +---------------------------- > > + > > > > | 16384 | 100 | 4000 | 878 | 455295 | 59676 | > > > > | 596 | > > > > +----------------------------------------------------------------- > > > > +---------------------------- > > + > > > > | 65536 | 100 | 2000 | 1007 | 198487 | 104064 | > > > > | 1040 | > > > > +----------------------------------------------------------------- > > > > +---------------------------- > > + > > > > > > > > Signed-off-by: Hoang Le <hoa...@de...> 
> > > > --- > > > > net/tipc/discover.c | 6 ++- > > > > net/tipc/msg.h | 10 +++++ > > > > net/tipc/name_distr.c | 2 +- > > > > net/tipc/node.c | 94 > > > +++++++++++++++++++++++++++++++++++++++++-- > > > > net/tipc/node.h | 4 +- > > > > net/tipc/socket.c | 6 +-- > > > > 6 files changed, 111 insertions(+), 11 deletions(-) > > > > > > > > diff --git a/net/tipc/discover.c b/net/tipc/discover.c index > > > > c138d68e8a69..98d4eea97eb7 100644 > > > > --- a/net/tipc/discover.c > > > > +++ b/net/tipc/discover.c > > > > @@ -38,6 +38,8 @@ > > > > #include "node.h" > > > > #include "discover.h" > > > > > > > > +#include <net/netns/hash.h> > > > > + > > > > /* min delay during bearer start up */ > > > > #define TIPC_DISC_INIT msecs_to_jiffies(125) > > > > /* max delay if bearer has no links */ @@ -94,6 +96,7 @@ static > > > > void tipc_disc_init_msg(struct net *net, struct sk_buff *skb, > > > > msg_set_dest_domain(hdr, dest_domain); > > > > msg_set_bc_netid(hdr, tn->net_id); > > > > b->media->addr2msg(msg_media_addr(hdr), &b->addr); > > > > + msg_set_peer_net_hash(hdr, net_hash_mix(net)); > > > > msg_set_node_id(hdr, tipc_own_id(net)); } > > > > > > > > @@ -200,6 +203,7 @@ void tipc_disc_rcv(struct net *net, struct > > > > sk_buff > > > *skb, > > > > u8 peer_id[NODE_ID_LEN] = {0,}; > > > > u32 dst = msg_dest_domain(hdr); > > > > u32 net_id = msg_bc_netid(hdr); > > > > + u32 pnet_hash = msg_peer_net_hash(hdr); > > > > struct tipc_media_addr maddr; > > > > u32 src = msg_prevnode(hdr); > > > > u32 mtyp = msg_type(hdr); > > > > @@ -242,7 +246,7 @@ void tipc_disc_rcv(struct net *net, struct > > > > sk_buff > > > *skb, > > > > if (!tipc_in_scope(legacy, b->domain, src)) > > > > return; > > > > tipc_node_check_dest(net, src, peer_id, b, caps, signature, > > > > - &maddr, &respond, &dupl_addr); > > > > + pnet_hash, &maddr, &respond, &dupl_addr); > > > > if (dupl_addr) > > > > disc_dupl_alert(b, src, &maddr); > > > > if (!respond) > > > > diff --git a/net/tipc/msg.h b/net/tipc/msg.h 
index > > > > 0daa6f04ca81..a8d0f28094f2 > > > > 100644 > > > > --- a/net/tipc/msg.h > > > > +++ b/net/tipc/msg.h > > > > @@ -973,6 +973,16 @@ static inline void > > > > msg_set_grp_remitted(struct tipc_msg *m, u16 n) > > > > msg_set_bits(m, 9, 16, 0xffff, n); } > > > > > > > > +static inline void msg_set_peer_net_hash(struct tipc_msg *m, u32 n) { > > > > + msg_set_word(m, 9, n); > > > > +} > > > > + > > > > +static inline u32 msg_peer_net_hash(struct tipc_msg *m) { > > > > + return msg_word(m, 9); > > > > +} > > > > + > > > > /* Word 10 > > > > */ > > > > static inline u16 msg_grp_evt(struct tipc_msg *m) diff --git > > > > a/net/tipc/name_distr.c b/net/tipc/name_distr.c index > > > > 836e629e8f4a..5feaf3b67380 100644 > > > > --- a/net/tipc/name_distr.c > > > > +++ b/net/tipc/name_distr.c > > > > @@ -146,7 +146,7 @@ static void named_distribute(struct net *net, > > > > struct sk_buff_head *list, > > > > struct publication *publ; > > > > struct sk_buff *skb = NULL; > > > > struct distr_item *item = NULL; > > > > - u32 msg_dsz = ((tipc_node_get_mtu(net, dnode, 0) - > INT_H_SIZE) / > > > > + u32 msg_dsz = ((tipc_node_get_mtu(net, dnode, 0, false) - > > > > +INT_H_SIZE) / > > > > ITEM_SIZE) * ITEM_SIZE; > > > > u32 msg_rem = msg_dsz; > > > > > > > > diff --git a/net/tipc/node.c b/net/tipc/node.c index > > > > c8f6177dd5a2..9a4ffd647701 100644 > > > > --- a/net/tipc/node.c > > > > +++ b/net/tipc/node.c > > > > @@ -45,6 +45,8 @@ > > > > #include "netlink.h" > > > > #include "trace.h" > > > > > > > > +#include <net/netns/hash.h> > > > > + > > > > #define INVALID_NODE_SIG 0x10000 > > > > #define NODE_CLEANUP_AFTER 300000 > > > > > > > > @@ -126,6 +128,7 @@ struct tipc_node { > > > > struct timer_list timer; > > > > struct rcu_head rcu; > > > > unsigned long delete_at; > > > > + struct net *pnet; > > > > }; > > > > > > > > /* Node FSM states and events: > > > > @@ -184,7 +187,7 @@ static struct tipc_link > > > > *node_active_link(struct tipc_node *n, int sel) > > > > return 
n->links[bearer_id].link; } > > > > > > > > -int tipc_node_get_mtu(struct net *net, u32 addr, u32 sel) > > > > +int tipc_node_get_mtu(struct net *net, u32 addr, u32 sel, bool > > > > +connected) > > > > { > > > > struct tipc_node *n; > > > > int bearer_id; > > > > @@ -194,6 +197,14 @@ int tipc_node_get_mtu(struct net *net, u32 > > > > addr, > > > > u32 > > > > sel) > > > > if (unlikely(!n)) > > > > return mtu; > > > > > > > > + /* Allow MAX_MSG_SIZE when building connection oriented > message > > > > + * if they are in the same core network > > > > + */ > > > > + if (n->pnet && connected) { > > > > + tipc_node_put(n); > > > > + return mtu; > > > > + } > > > > + > > > > bearer_id = n->active_links[sel & 1]; > > > > if (likely(bearer_id != INVALID_BEARER_ID)) > > > > mtu = n->links[bearer_id].mtu; > > > > @@ -361,11 +372,14 @@ static void tipc_node_write_unlock(struct > > > > tipc_node *n) } > > > > > > > > static struct tipc_node *tipc_node_create(struct net *net, u32 addr, > > > > - u8 *peer_id, u16 capabilities) > > > > + u8 *peer_id, u16 capabilities, > > > > + u32 signature, u32 pnet_hash) > > > > { > > > > struct tipc_net *tn = net_generic(net, tipc_net_id); > > > > struct tipc_node *n, *temp_node; > > > > + struct tipc_net *tn_peer; > > > > struct tipc_link *l; > > > > + struct net *tmp; > > > > int bearer_id; > > > > int i; > > > > > > > > @@ -400,6 +414,23 @@ static struct tipc_node > > > > *tipc_node_create(struct net *net, u32 addr, > > > > memcpy(&n->peer_id, peer_id, 16); > > > > n->net = net; > > > > n->capabilities = capabilities; > > > > + n->pnet = NULL; > > > > + for_each_net_rcu(tmp) { > > > > + /* Integrity checking whether node exists in namespace or > not */ > > > > + if (net_hash_mix(tmp) != pnet_hash) > > > > + continue; > > > > + tn_peer = net_generic(tmp, tipc_net_id); > > > > + if (!tn_peer) > > > > + continue; > > > > + > > > > + if ((tn_peer->random & 0x7fff) != (signature & 0x7fff)) > > > > + continue; > > > > + > > > > + if 
(!memcmp(n->peer_id, tn_peer->node_id, > NODE_ID_LEN)) { > > > > + n->pnet = tmp; > > > > + break; > > > > + } > > > > + } > > > > kref_init(&n->kref); > > > > rwlock_init(&n->lock); > > > > INIT_HLIST_NODE(&n->hash); > > > > @@ -979,7 +1010,7 @@ u32 tipc_node_try_addr(struct net *net, u8 > > > > *id, > > > > u32 > > > > addr) > > > > > > > > void tipc_node_check_dest(struct net *net, u32 addr, > > > > u8 *peer_id, struct tipc_bearer *b, > > > > - u16 capabilities, u32 signature, > > > > + u16 capabilities, u32 signature, u32 pnet_hash, > > > > struct tipc_media_addr *maddr, > > > > bool *respond, bool *dupl_addr) { @@ -998,7 +1029,8 > > @@ void > > > > tipc_node_check_dest(struct net *net, u32 > > > addr, > > > > *dupl_addr = false; > > > > *respond = false; > > > > > > > > - n = tipc_node_create(net, addr, peer_id, capabilities); > > > > + n = tipc_node_create(net, addr, peer_id, capabilities, signature, > > > > + pnet_hash); > > > > if (!n) > > > > return; > > > > > > > > @@ -1424,6 +1456,49 @@ static int __tipc_nl_add_node(struct > > > > tipc_nl_msg *msg, struct tipc_node *node) > > > > return -EMSGSIZE; > > > > } > > > > > > > > +static void tipc_lxc_xmit(struct net *pnet, struct sk_buff_head > > > > +*list) { > > > > + struct tipc_msg *hdr = buf_msg(skb_peek(list)); > > > > + struct sk_buff_head inputq; > > > > + > > > > + switch (msg_user(hdr)) { > > > > + case TIPC_LOW_IMPORTANCE: > > > > + case TIPC_MEDIUM_IMPORTANCE: > > > > + case TIPC_HIGH_IMPORTANCE: > > > > + case TIPC_CRITICAL_IMPORTANCE: > > > > + if (msg_connected(hdr) || msg_named(hdr)) { > > > > + spin_lock_init(&list->lock); > > > > + tipc_sk_rcv(pnet, list); > > > > + return; > > > > + } > > > > + if (msg_mcast(hdr)) { > > > > + skb_queue_head_init(&inputq); > > > > + tipc_sk_mcast_rcv(pnet, list, &inputq); > > > > + __skb_queue_purge(list); > > > > + skb_queue_purge(&inputq); > > > > + return; > > > > + } > > > > + return; > > > > + case MSG_FRAGMENTER: > > > > + if (tipc_msg_assemble(list)) { 
> > > > + skb_queue_head_init(&inputq); > > > > + tipc_sk_mcast_rcv(pnet, list, &inputq); > > > > + __skb_queue_purge(list); > > > > + skb_queue_purge(&inputq); > > > > + } > > > > + return; > > > > + case LINK_PROTOCOL: > > > > + case NAME_DISTRIBUTOR: > > > > + case GROUP_PROTOCOL: > > > > + case CONN_MANAGER: > > > > + case TUNNEL_PROTOCOL: > > > > + case BCAST_PROTOCOL: > > > > + return; > > > > + default: > > > > + return; > > > > + }; > > > > +} > > > > + > > > > /** > > > > * tipc_node_xmit() is the general link level function for message sending > > > > * @net: the applicable net namespace @@ -1439,6 +1514,7 @@ int > > > > tipc_node_xmit(struct net *net, struct sk_buff_head *list, > > > > struct tipc_link_entry *le = NULL; > > > > struct tipc_node *n; > > > > struct sk_buff_head xmitq; > > > > + bool node_up = false; > > > > int bearer_id; > > > > int rc; > > > > > > > > @@ -1455,6 +1531,16 @@ int tipc_node_xmit(struct net *net, struct > > > > sk_buff_head *list, > > > > return -EHOSTUNREACH; > > > > } > > > > > > > > + node_up = node_is_up(n); > > > > + if (node_up && n->pnet && check_net(n->pnet)) { > > > > + /* xmit inner linux container */ > > > > + tipc_lxc_xmit(n->pnet, list); > > > > + if (likely(skb_queue_empty(list))) { > > > > + tipc_node_put(n); > > > > + return 0; > > > > + } > > > > + } > > > > + > > > > tipc_node_read_lock(n); > > > > bearer_id = n->active_links[selector & 1]; > > > > if (unlikely(bearer_id == INVALID_BEARER_ID)) { diff --git > > > > a/net/tipc/node.h b/net/tipc/node.h index > > > 291d0ecd4101..11eb95ce358b > > > > 100644 > > > > --- a/net/tipc/node.h > > > > +++ b/net/tipc/node.h > > > > @@ -75,7 +75,7 @@ u32 tipc_node_get_addr(struct tipc_node *node); > > > > u32 tipc_node_try_addr(struct net *net, u8 *id, u32 addr); void > > > > tipc_node_check_dest(struct net *net, u32 onode, u8 *peer_id128, > > > > struct tipc_bearer *bearer, > > > > - u16 capabilities, u32 signature, > > > > + u16 capabilities, u32 signature, u32 
pnet_hash, > > > > struct tipc_media_addr *maddr, > > > > bool *respond, bool *dupl_addr); void > > > > tipc_node_delete_links(struct net *net, int bearer_id); @@ -92,7 > > > > +92,7 @@ void tipc_node_unsubscribe(struct net *net, struct > > > > list_head *subscr, > > > > u32 addr); void tipc_node_broadcast(struct net *net, struct > > > > sk_buff *skb); int tipc_node_add_conn(struct net *net, u32 dnode, > > > > u32 port, > > > > u32 peer_port); void tipc_node_remove_conn(struct net *net, u32 > > > > dnode, u32 port); -int tipc_node_get_mtu(struct net *net, u32 > > > > addr, > > > > u32 sel); > > > > +int tipc_node_get_mtu(struct net *net, u32 addr, u32 sel, bool > > > > +connected); > > > > bool tipc_node_is_up(struct net *net, u32 addr); > > > > u16 tipc_node_get_capabilities(struct net *net, u32 addr); int > > > > tipc_nl_node_dump(struct sk_buff *skb, struct netlink_callback > > > > *cb); diff --git a/net/tipc/socket.c b/net/tipc/socket.c index > > > > 3b9f8cc328f5..fb24df03da6c 100644 > > > > --- a/net/tipc/socket.c > > > > +++ b/net/tipc/socket.c > > > > @@ -854,7 +854,7 @@ static int tipc_send_group_msg(struct net > > > > *net, struct tipc_sock *tsk, > > > > > > > > /* Build message as chain of buffers */ > > > > __skb_queue_head_init(&pkts); > > > > - mtu = tipc_node_get_mtu(net, dnode, tsk->portid); > > > > + mtu = tipc_node_get_mtu(net, dnode, tsk->portid, false); > > > > rc = tipc_msg_build(hdr, m, 0, dlen, mtu, &pkts); > > > > if (unlikely(rc != dlen)) > > > > return rc; > > > > @@ -1388,7 +1388,7 @@ static int __tipc_sendmsg(struct socket > > > > *sock, struct msghdr *m, size_t dlen) > > > > return rc; > > > > > > > > __skb_queue_head_init(&pkts); > > > > - mtu = tipc_node_get_mtu(net, dnode, tsk->portid); > > > > + mtu = tipc_node_get_mtu(net, dnode, tsk->portid, false); > > > > rc = tipc_msg_build(hdr, m, 0, dlen, mtu, &pkts); > > > > if (unlikely(rc != dlen)) > > > > return rc; > > > > @@ -1526,7 +1526,7 @@ static void tipc_sk_finish_conn(struct > > > 
> tipc_sock *tsk, > > > > u32 peer_port, > > > > sk_reset_timer(sk, &sk->sk_timer, jiffies + CONN_PROBING_INTV); > > > > tipc_set_sk_state(sk, TIPC_ESTABLISHED); > > > > tipc_node_add_conn(net, peer_node, tsk->portid, peer_port); > > > > - tsk->max_pkt = tipc_node_get_mtu(net, peer_node, tsk- > >portid); > > > > + tsk->max_pkt = tipc_node_get_mtu(net, peer_node, tsk->portid, > > > > +true); > > > > tsk->peer_caps = tipc_node_get_capabilities(net, peer_node); > > > > __skb_queue_purge(&sk->sk_write_queue); > > > > if (tsk->peer_caps & TIPC_BLOCK_FLOWCTL) > > > > -- > > > > 2.20.1 > > > > > > > > |
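Editorial note: the peer lookup that the patch above adds to tipc_node_create() can be sketched in user space as follows. This is only an illustration under stated assumptions, not the posted kernel code: `struct ns_entry` and `find_local_peer()` are hypothetical names, and the table stands in for the `for_each_net_rcu()` walk. The three checks (namespace hash, low 15 bits of the signature, 16-byte node id) follow the order used in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NODE_ID_LEN 16

/* Hypothetical stand-in for the per-namespace state the patch consults. */
struct ns_entry {
	uint32_t hash_mix;            /* plays the role of net_hash_mix(net) */
	uint32_t random;              /* plays the role of tipc_net->random  */
	uint8_t node_id[NODE_ID_LEN]; /* plays the role of tipc_net->node_id */
};

/*
 * Return the entry that describes a peer living on the same host, or NULL
 * when no namespace matches and the peer must be reached through a bearer.
 */
static const struct ns_entry *find_local_peer(const struct ns_entry *tbl,
					      size_t n, uint32_t pnet_hash,
					      uint32_t signature,
					      const uint8_t *peer_id)
{
	assert(tbl != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		/* 1. The hash advertised in the discovery message must match. */
		if (tbl[i].hash_mix != pnet_hash)
			continue;
		/* 2. Only the low 15 bits of the signature are compared. */
		if ((tbl[i].random & 0x7fff) != (signature & 0x7fff))
			continue;
		/* 3. Finally the full node identity must be identical. */
		if (memcmp(tbl[i].node_id, peer_id, NODE_ID_LEN) == 0)
			return &tbl[i];
	}
	return NULL;
}
```

A non-NULL result corresponds to `n->pnet` being set in the patch, which is what later lets tipc_node_xmit() take the direct socket-to-socket path.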
From: Tuong L. <tuo...@de...> - 2019-10-15 04:59:17
|
As mentioned in commit e95584a889e1 ("tipc: fix unlimited bundling of small messages"), the current message bundling algorithm is inefficient: it can generate bundles containing only one payload message, which causes unnecessary overhead for both the sender and the receiver.

This commit redesigns the 'tipc_msg_make_bundle()' function (now named 'tipc_msg_try_bundle()'), so that when the first message arrives, we just check whether it is suitable for bundling and, if so, keep a reference to it. The message buffer is put into the link backlog queue and processed as normal. Later on, when another message arrives, we make a bundle with the first message if possible, and so on. This way, a bundle, if really needed, will always consist of at least two payload messages. Otherwise, we let the first buffer go its way without any bundling, which reduces the overhead to zero.

Moreover, since we now have both messages in hand, we can even optimize the 'tipc_msg_bundle()' function and bundle a very large message (size ~ MSS) with a small one, which is not possible with the current algorithm, e.g. [1400-byte message] + [10-byte message] (MTU = 1500).

Signed-off-by: Tuong Lien <tuo...@de...> --- net/tipc/link.c | 60 +++++++++++----------- net/tipc/msg.c | 153 +++++++++++++++++++++++++++++--------------------------- net/tipc/msg.h | 5 +- 3 files changed, 114 insertions(+), 104 deletions(-) diff --git a/net/tipc/link.c b/net/tipc/link.c index 999eab592de8..3bd60bdbf56c 100644 --- a/net/tipc/link.c +++ b/net/tipc/link.c @@ -940,16 +940,17 @@ int tipc_link_xmit(struct tipc_link *l, struct sk_buff_head *list, struct sk_buff_head *xmitq) { struct tipc_msg *hdr = buf_msg(skb_peek(list)); - unsigned int maxwin = l->window; - int imp = msg_importance(hdr); - unsigned int mtu = l->mtu; + struct sk_buff_head *backlogq = &l->backlogq; + struct sk_buff_head *transmq = &l->transmq; + struct sk_buff *skb, *_skb; + u16 bc_ack = l->bc_rcvlink->rcv_nxt - 1; u16 ack = l->rcv_nxt - 1; u16 seqno = l->snd_nxt; - u16 bc_ack = l->bc_rcvlink->rcv_nxt - 1; - struct sk_buff_head *transmq = &l->transmq; - struct sk_buff_head *backlogq = &l->backlogq; - struct sk_buff *skb, *_skb, **tskb; int pkt_cnt = skb_queue_len(list); + int imp = msg_importance(hdr); + unsigned int maxwin = l->window; + unsigned int mtu = l->mtu; + bool new_bundle; int rc = 0; if (unlikely(msg_size(hdr) > mtu)) { @@ -975,20 +976,18 @@ int tipc_link_xmit(struct tipc_link *l, struct sk_buff_head *list, } /* Prepare each packet for sending, and add to relevant queue: */ - while (skb_queue_len(list)) { - skb = skb_peek(list); - hdr = buf_msg(skb); - msg_set_seqno(hdr, seqno); - msg_set_ack(hdr, ack); - msg_set_bcast_ack(hdr, bc_ack); - + while ((skb = __skb_dequeue(list))) { if (likely(skb_queue_len(transmq) < maxwin)) { + hdr = buf_msg(skb); + msg_set_seqno(hdr, seqno); + msg_set_ack(hdr, ack); + msg_set_bcast_ack(hdr, bc_ack); _skb = skb_clone(skb, GFP_ATOMIC); if (!_skb) { + kfree_skb(skb); __skb_queue_purge(list); return -ENOBUFS; } - __skb_dequeue(list); __skb_queue_tail(transmq, skb); /* next retransmit attempt */ if (link_is_bc_sndlink(l)) @@ -1000,22 +999,27 @@ int 
tipc_link_xmit(struct tipc_link *l, struct sk_buff_head *list, seqno++; continue; } - tskb = &l->backlog[imp].target_bskb; - if (tipc_msg_bundle(*tskb, hdr, mtu)) { - kfree_skb(__skb_dequeue(list)); - l->stats.sent_bundled++; - continue; - } - if (tipc_msg_make_bundle(tskb, hdr, mtu, l->addr)) { - kfree_skb(__skb_dequeue(list)); - __skb_queue_tail(backlogq, *tskb); - l->backlog[imp].len++; - l->stats.sent_bundled++; - l->stats.sent_bundles++; + if (tipc_msg_try_bundle(l->backlog[imp].target_bskb, &skb, + mtu - INT_H_SIZE, l->addr, + &new_bundle)) { + if (skb) { + /* Keep a ref. to the skb for next try */ + l->backlog[imp].target_bskb = skb; + l->backlog[imp].len++; + __skb_queue_tail(backlogq, skb); + } else { + if (new_bundle) { + l->stats.sent_bundles++; + l->stats.sent_bundled++; + } + l->stats.sent_bundled++; + } continue; } + /* Let's the last skb (if any) go its way! */ l->backlog[imp].target_bskb = NULL; - l->backlog[imp].len += skb_queue_len(list); + l->backlog[imp].len += (1 + skb_queue_len(list)); + __skb_queue_tail(backlogq, skb); skb_queue_splice_tail_init(list, backlogq); } l->snd_nxt = seqno; diff --git a/net/tipc/msg.c b/net/tipc/msg.c index 922d262e153f..a2f3582cf9fe 100644 --- a/net/tipc/msg.c +++ b/net/tipc/msg.c @@ -419,48 +419,98 @@ int tipc_msg_build(struct tipc_msg *mhdr, struct msghdr *m, int offset, } /** - * tipc_msg_bundle(): Append contents of a buffer to tail of an existing one - * @skb: the buffer to append to ("bundle") - * @msg: message to be appended - * @mtu: max allowable size for the bundle buffer - * Consumes buffer if successful - * Returns true if bundling could be performed, otherwise false + * tipc_msg_bundle - Append contents of a buffer to tail of an existing one + * @bskb: the bundle buffer to append to + * @msg: message to be appended + * @max: max allowable size for the bundle buffer + * + * Returns "true" if bundling has been performed, otherwise "false" */ -bool tipc_msg_bundle(struct sk_buff *skb, struct tipc_msg 
*msg, u32 mtu) +static bool tipc_msg_bundle(struct sk_buff *bskb, struct tipc_msg *msg, + u32 max) { - struct tipc_msg *bmsg; - unsigned int bsz; - unsigned int msz = msg_size(msg); - u32 start, pad; - u32 max = mtu - INT_H_SIZE; + struct tipc_msg *bmsg = buf_msg(bskb); + u32 msz, bsz, offset, pad; - if (likely(msg_user(msg) == MSG_FRAGMENTER)) - return false; - if (!skb) - return false; - bmsg = buf_msg(skb); + msz = msg_size(msg); bsz = msg_size(bmsg); - start = align(bsz); - pad = start - bsz; + offset = align(bsz); + pad = offset - bsz; - if (unlikely(msg_user(msg) == TUNNEL_PROTOCOL)) + if (unlikely(skb_tailroom(bskb) < (pad + msz))) return false; - if (unlikely(msg_user(msg) == BCAST_PROTOCOL)) + if (unlikely(max < (offset + msz))) return false; - if (unlikely(msg_user(bmsg) != MSG_BUNDLER)) + + skb_put(bskb, pad + msz); + skb_copy_to_linear_data_offset(bskb, offset, msg, msz); + msg_set_size(bmsg, offset + msz); + msg_set_msgcnt(bmsg, msg_msgcnt(bmsg) + 1); + return true; +} + +/** + * tipc_msg_try_bundle - Try to bundle a new message to the last one + * @tskb: the last/target message to which the new one will be appended + * @skb: the new message skb pointer + * @mss: max message size (header inclusive) + * @dnode: destination node for the message + * @new_bundle: if this call made a new bundle or not + * + * Return: "true" if the new message skb is potential for bundling this time or + * later, in the case a bundling has been done this time, the skb is consumed + * (the skb pointer = NULL). + * Otherwise, "false" if the skb cannot be bundled at all. 
+ */
+bool tipc_msg_try_bundle(struct sk_buff *tskb, struct sk_buff **skb, u32 mss,
+			 u32 dnode, bool *new_bundle)
+{
+	struct tipc_msg *msg, *inner, *outer;
+	u32 bsz;
+
+	/* First, check if the new buffer is suitable for bundling */
+	msg = buf_msg(*skb);
+	if (msg_user(msg) == MSG_FRAGMENTER)
 		return false;
-	if (unlikely(skb_tailroom(skb) < (pad + msz)))
+	if (msg_user(msg) == TUNNEL_PROTOCOL)
 		return false;
-	if (unlikely(max < (start + msz)))
+	if (msg_user(msg) == BCAST_PROTOCOL)
 		return false;
-	if ((msg_importance(msg) < TIPC_SYSTEM_IMPORTANCE) &&
-	    (msg_importance(bmsg) == TIPC_SYSTEM_IMPORTANCE))
+	if (mss <= INT_H_SIZE + msg_size(msg))
 		return false;
-	skb_put(skb, pad + msz);
-	skb_copy_to_linear_data_offset(skb, start, msg, msz);
-	msg_set_size(bmsg, start + msz);
-	msg_set_msgcnt(bmsg, msg_msgcnt(bmsg) + 1);
+	/* Ok, but the last/target buffer can be empty? */
+	if (unlikely(!tskb))
+		return true;
+
+	/* Is it a bundle already? Try to bundle the new message to it */
+	if (msg_user(buf_msg(tskb)) == MSG_BUNDLER) {
+		*new_bundle = false;
+		goto bundle;
+	}
+
+	/* Make a new bundle of the two messages if possible */
+	bsz = msg_size(buf_msg(tskb));
+	if (unlikely(mss < align(INT_H_SIZE + bsz) + msg_size(msg)))
+		return true;
+	if (unlikely(pskb_expand_head(tskb, INT_H_SIZE, mss - bsz,
+				      GFP_ATOMIC)))
+		return true;
+	inner = buf_msg(tskb);
+	skb_push(tskb, INT_H_SIZE);
+	outer = buf_msg(tskb);
+	tipc_msg_init(msg_prevnode(inner), outer, MSG_BUNDLER, 0, INT_H_SIZE,
+		      dnode);
+	msg_set_importance(outer, msg_importance(inner));
+	msg_set_size(outer, INT_H_SIZE + bsz);
+	msg_set_msgcnt(outer, 1);
+	*new_bundle = true;
+
+bundle:
+	if (likely(tipc_msg_bundle(tskb, msg, mss))) {
+		consume_skb(*skb);
+		*skb = NULL;
+	}
 	return true;
 }
 
@@ -510,49 +560,6 @@ bool tipc_msg_extract(struct sk_buff *skb, struct sk_buff **iskb, int *pos)
 }
 
 /**
- * tipc_msg_make_bundle(): Create bundle buf and append message to its tail
- * @list: the buffer chain, where head is the buffer to replace/append
- * @skb: buffer to be created, appended to and returned in case of success
- * @msg: message to be appended
- * @mtu: max allowable size for the bundle buffer, inclusive header
- * @dnode: destination node for message. (Not always present in header)
- * Returns true if success, otherwise false
- */
-bool tipc_msg_make_bundle(struct sk_buff **skb, struct tipc_msg *msg,
-			  u32 mtu, u32 dnode)
-{
-	struct sk_buff *_skb;
-	struct tipc_msg *bmsg;
-	u32 msz = msg_size(msg);
-	u32 max = mtu - INT_H_SIZE;
-
-	if (msg_user(msg) == MSG_FRAGMENTER)
-		return false;
-	if (msg_user(msg) == TUNNEL_PROTOCOL)
-		return false;
-	if (msg_user(msg) == BCAST_PROTOCOL)
-		return false;
-	if (msz > (max / 2))
-		return false;
-
-	_skb = tipc_buf_acquire(max, GFP_ATOMIC);
-	if (!_skb)
-		return false;
-
-	skb_trim(_skb, INT_H_SIZE);
-	bmsg = buf_msg(_skb);
-	tipc_msg_init(msg_prevnode(msg), bmsg, MSG_BUNDLER, 0,
-		      INT_H_SIZE, dnode);
-	msg_set_importance(bmsg, msg_importance(msg));
-	msg_set_seqno(bmsg, msg_seqno(msg));
-	msg_set_ack(bmsg, msg_ack(msg));
-	msg_set_bcast_ack(bmsg, msg_bcast_ack(msg));
-	tipc_msg_bundle(_skb, msg, mtu);
-	*skb = _skb;
-	return true;
-}
-
-/**
  * tipc_msg_reverse(): swap source and destination addresses and add error code
  * @own_node: originating node id for reversed message
  * @skb: buffer containing message to be reversed; will be consumed
diff --git a/net/tipc/msg.h b/net/tipc/msg.h
index 0daa6f04ca81..4d4ed4bd058a 100644
--- a/net/tipc/msg.h
+++ b/net/tipc/msg.h
@@ -1057,9 +1057,8 @@ struct sk_buff *tipc_msg_create(uint user, uint type, uint hdr_sz,
 				uint data_sz, u32 dnode, u32 onode,
 				u32 dport, u32 oport, int errcode);
 int tipc_buf_append(struct sk_buff **headbuf, struct sk_buff **buf);
-bool tipc_msg_bundle(struct sk_buff *skb, struct tipc_msg *msg, u32 mtu);
-bool tipc_msg_make_bundle(struct sk_buff **skb, struct tipc_msg *msg,
-			  u32 mtu, u32 dnode);
+bool tipc_msg_try_bundle(struct sk_buff *tskb, struct sk_buff **skb, u32 mss,
+			 u32 dnode, bool *new_bundle);
 bool tipc_msg_extract(struct sk_buff *skb, struct sk_buff **iskb, int *pos);
 int tipc_msg_fragment(struct sk_buff *skb, const struct tipc_msg *hdr,
 		      int pktmax, struct sk_buff_head *frags);
--
2.13.7
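For readers following the size checks in tipc_msg_try_bundle(), here is a minimal stand-alone sketch of the two comparisons. It is not kernel code: INT_H_SIZE is assumed to be 40 bytes and align() is assumed to round up to a multiple of 4, as they are believed to be defined in net/tipc/msg.h, and the helper names are made up for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: internal/bundle header size in bytes (check net/tipc/msg.h) */
#define INT_H_SIZE 40u

/* Assumption: TIPC's align() rounds up to the next 4-byte boundary */
static uint32_t align4(uint32_t n)
{
	assert(n <= UINT32_MAX - 3u);	/* guard against wrap-around */
	return (n + 3u) & ~3u;
}

/*
 * First size check in the patch: a message is only a bundling
 * candidate if it still fits behind a bundle header within mss.
 */
static bool msg_is_bundle_candidate(uint32_t msg_sz, uint32_t mss)
{
	return mss > INT_H_SIZE + msg_sz;
}

/*
 * Second size check: can the current target buffer (target_sz bytes)
 * and the new message share one freshly created bundle?
 */
static bool two_msgs_fit_new_bundle(uint32_t target_sz, uint32_t msg_sz,
				    uint32_t mss)
{
	return mss >= align4(INT_H_SIZE + target_sz) + msg_sz;
}
```

With mss = 1460, for example, a 101-byte target and a 100-byte message pass both checks (144 + 100 bytes), while a 1300-byte target and a 200-byte message fail the second one, so the new message would stay unbundled.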
From: Tuong L. T. <tuo...@de...> - 2019-10-15 04:49:55
Hi Ying,

Agree, it's hard to trace... I've changed the approach and will post it as a new patch; please take a look from there!
Thanks a lot!

BR/Tuong

-----Original Message-----
From: Xue, Ying <Yin...@wi...>
Sent: Friday, October 11, 2019 9:52 PM
To: Tuong Lien <tuo...@de...>; tip...@li...; jon...@er...; ma...@do...
Subject: RE: [PATCH RFC 2/2] tipc: improve message bundling algorithm

I can see this is a good improvement, except that the following switch cases on the return value of tipc_msg_try_bundle() are not very friendly to the code reader. Although I do understand their real meanings, I have to spend time checking the context back and forth. At least we should avoid the meaningless hard-coded case numbers, or we should try to change the return values of tipc_msg_try_bundle().

+		n = tipc_msg_try_bundle(&l->backlog[imp].target_bskb, skb,
+					mtu - INT_H_SIZE,
+					l->addr);
+		switch (n) {
+		case 0:
+			break;
+		case 1:
+			__skb_queue_tail(backlogq, skb);
 			l->backlog[imp].len++;
-			l->stats.sent_bundled++;
+			continue;
+		case 2:
 			l->stats.sent_bundles++;
+			l->stats.sent_bundled++;
+		default:
+			kfree_skb(skb);
+			l->stats.sent_bundled++;
 			continue;
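One way to address the readability concern about bare case numbers is to give the return values names. The sketch below is only an illustration under stated assumptions: the enum and struct names are hypothetical, the meaning attached to each value is a reading of the quoted switch statement rather than something the patch documents, and the buffer handling (__skb_queue_tail(), kfree_skb()) is left out so that the counter logic can be compiled on its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the quoted patch returns the bare integers instead */
enum bundle_result {
	BUNDLE_SKIP = 0,	/* case 0: leave the switch, normal path */
	BUNDLE_QUEUE_ALONE,	/* case 1: buffer is queued unbundled */
	BUNDLE_CREATED,		/* case 2: a new bundle was created */
	BUNDLE_APPENDED,	/* default: added to an existing bundle */
};

/* Stand-in for the link counters touched by the quoted snippet */
struct bundle_stats {
	unsigned int backlog_len;
	unsigned int sent_bundles;
	unsigned int sent_bundled;
};

/*
 * Mirror the accounting of the quoted switch statement.
 * Returns true where the original code would 'continue'.
 */
static bool account_bundle_result(enum bundle_result res,
				  struct bundle_stats *st)
{
	assert(st);
	switch (res) {
	case BUNDLE_SKIP:
		return false;
	case BUNDLE_QUEUE_ALONE:
		st->backlog_len++;
		return true;
	case BUNDLE_CREATED:
		st->sent_bundles++;
		/* case 2 falls through to default in the original,
		 * so sent_bundled is incremented twice
		 */
		st->sent_bundled += 2;
		return true;
	case BUNDLE_APPENDED:
	default:
		st->sent_bundled++;
		return true;
	}
}
```

Written this way, the fall-through from case 2 into default becomes an explicit "+= 2", which is arguably easier to verify than the implicit double increment.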
From: Xue, Y. <Yin...@wi...> - 2019-10-14 19:19:28
Hi Jon,

Please see my comments inline.

At netdev 0x13 in Prague last July there was presented a related proposal https://netdevconf.info/0x13/session.html?talk-AF_GRAFT. I was there, and I cannot say there was any overwhelming approval of this proposal, but neither was it rejected out of hand.

[Ying] The idea of the AF_GRAFT socket is exactly the same as this patch. If it can be recognized, it's definitely worth trying to submit this patch upstream. But after checking, the weird thing is that AF_GRAFT is not supported by the latest kernel, and I cannot find that its author ever attempted to submit the patch upstream.

First, I see TIPC as an IPC, not a network protocol, and anybody using TIPC inside a cluster has by definition been authenticated to start a node and connect to the cluster. Here, there is no change from current policies. Once a node has been accepted in a cluster, possibly via encrypted discovery messages which have passed all policy checks, and we are 100% certain it is legitimate and located in the same kernel (as we are trying to ensure in this patch), I cannot see any reason why we should not be allowed to short-cut the stack the way we do. Security checks have already been done. Are we circumventing any other policies by doing this that must not be done?

[Ying] If we treat TIPC as an IPC channel, bypassing its lower-level interface is acceptable. Besides the AF_GRAFT socket, the AF_UNIX socket in fact provides an interconnection mechanism between different processes at socket level, and there are several options available for us to configure policies against a socket, such as SO_ATTACH_FILTER, SO_ATTACH_BPF, SO_ATTACH_REUSEPORT_EBPF etc. If we bypass the TIPC bearer, the most inconvenient thing is that it becomes hard to monitor traffic between netns with tcpdump. Of course, as Xin mentioned previously, we could not use traditional tools to control/shape TIPC traffic across netns.

Unless you strongly object I would suggest we send this to netdev as an RFC and observe the reactions. If David or Eric or any of the other heavyweights say flatly no, there is nothing we can do. But it might be worth a try.

[Ying] No, I don't strongly object to this proposal. We can try to submit it to the net-next mailing list.

Thanks,
Ying

> -----Original Message-----
> From: Xue, Ying <Yin...@wi...>
> Sent: 11-Oct-19 07:58
> To: Jon Maloy <jon...@er...>; Xin Long <lx...@re...>
> Subject: RE: [net-next] tipc: improve throughput between nodes in netns
>
> Exactly. I agree with Xin. The major purpose of a namespace is mainly to
> provide an isolated environment. But as this patch almost completely bypasses
> the security check points of the networking stack, the traffic between
> namespaces will be out of control. So I don't think this is a good idea.
>
> Thanks,
> Ying
>
> -----Original Message-----
> From: Jon Maloy [mailto:jon...@er...]
> Sent: Friday, October 11, 2019 2:14 AM
> To: Xin Long
> Cc: Xue, Ying
> Subject: RE: [net-next] tipc: improve throughput between nodes in netns
>
> Hi Xin,
> I am not surprised by your answer. Apart from concerns about security, this is
> the same objection I have heard from others when presenting this idea, and I
> suspect that this would also be the reaction if we try to deliver this to David.
> If we can achieve anything close to this by adding GSO to the veth interface I
> think that would be a safer approach.
> So, I suggest we put this one to rest for now, and I'll try to go ahead with the
> GSO approach instead.
>
> Sorry Hoang for making you waste your time.
>
> BR
> ///jon
>
> > -----Original Message-----
> > From: Xin Long <lx...@re...>
> > Sent: 10-Oct-19 07:14
> > To: Jon Maloy <jon...@er...>
> > Cc: Ying Xue <yin...@wi...>
> > Subject: Re: [net-next] tipc: improve throughput between nodes in netns
> >
> > ----- Original Message -----
> > > Ying and Xin,
> > > This is the "wormhole" functionality I have been suggesting since a
> > > while back.
> > > Basically, we send messages directly socket to socket between name
> > > spaces on the same host, not only between sockets within the same
> > > name space.
> > > As you will understand this might have a huge positive impact on
> > > performance between e.g., docker containers or containers inside
> > > Kubernetes pods.
> > >
> > > Please spend some time reviewing this, as it might be a
> > > controversial feature. It is imperative that we get security right here.
> > >
> > If I understand it right:
> >
> > With this patch, TIPC packets will skip all lower layers of the protocol
> > stack, like IP (udp media) and the ether link layer, which means all rules
> > of tc, ovs, netfilter/br_netfilter and the like will be skipped.
> >
> > I don't think this could be tolerated, especially when it comes to a
> > cloud environment where many rules are configured on those virtual
> > NICs. Unless we have some special needs, I'm not sure if this
> > performance improvement is worth a big protocol stack skip.
> >
> > Thanks.
> >
> > > BR
> > > ///jon
> > >
> > >
> > > -----Original Message-----
> > > From: Hoang Le <hoa...@de...>
> > > Sent: 2-Oct-19 06:26
> > > To: Jon Maloy <jon...@er...>; ma...@do...;
> > > tip...@li...
> > > Subject: [net-next] tipc: improve throughput between nodes in netns
> > >
> > > Introduce cross-namespace traffic transmission, handled as for a local node.
> > > This way, throughput between nodes in different namespaces is as fast as local.
> > >
> > > Testcase:
> > > $ip netns exec 1 benchmark_client -c 100
> > > $ip netns exec 2 benchmark_server
> > >
> > > Before:
> > > +----------+-------+---------+---------+------------------------------------------------+
> > > | Msg Size |   #   | # Msgs/ | Elapsed |                   Throughput                   |
> > > | [octets] | Conns |  Conn   |  [ms]   +---------------+--------------+-----------------+
> > > |          |       |         |         | Total [Msg/s] | Total [Mb/s] | Per Conn [Mb/s] |
> > > +----------+-------+---------+---------+---------------+--------------+-----------------+
> > > |       64 |   100 |   64000 |   13005 |        492103 |          251 |               2 |
> > > |      256 |   100 |   32000 |    4964 |        644627 |         1320 |              13 |
> > > |     1024 |   100 |   16000 |    4524 |        353612 |         2896 |              28 |
> > > |     4096 |   100 |    8000 |    3675 |        217644 |         7131 |              71 |
> > > |    16384 |   100 |    4000 |    7914 |         50540 |         6624 |              66 |
> > > |    65536 |   100 |    2000 |   13000 |         15384 |         8065 |              80 |
> > > +----------+-------+---------+---------+---------------+--------------+-----------------+
> > >
> > > After:
> > > +----------+-------+---------+---------+------------------------------------------------+
> > > | Msg Size |   #   | # Msgs/ | Elapsed |                   Throughput                   |
> > > | [octets] | Conns |  Conn   |  [ms]   +---------------+--------------+-----------------+
> > > |          |       |         |         | Total [Msg/s] | Total [Mb/s] | Per Conn [Mb/s] |
> > > +----------+-------+---------+---------+---------------+--------------+-----------------+
> > > |       64 |   100 |   64000 |    7842 |        816090 |          417 |               4 |
> > > |      256 |   100 |   32000 |    3593 |        890469 |         1823 |              18 |
> > > |     1024 |   100 |   16000 |    1835 |        871828 |         7142 |              71 |
> > > |     4096 |   100 |    8000 |    1134 |        704904 |        23098 |             230 |
> > > |    16384 |   100 |    4000 |     878 |        455295 |        59676 |             596 |
> > > |    65536 |   100 |    2000 |    1007 |        198487 |       104064 |            1040 |
> > > +----------+-------+---------+---------+---------------+--------------+-----------------+
> > >
> > > Signed-off-by: Hoang Le <hoa...@de...>
> > > ---
> > >  net/tipc/discover.c   |  6 ++-
> > >  net/tipc/msg.h        | 10 +++++
> > >  net/tipc/name_distr.c |  2 +-
> > >  net/tipc/node.c       | 94 +++++++++++++++++++++++++++++++++++++++++--
> > >  net/tipc/node.h       |  4 +-
> > >  net/tipc/socket.c     |  6 +--
> > >  6 files changed, 111 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/net/tipc/discover.c b/net/tipc/discover.c
> > > index c138d68e8a69..98d4eea97eb7 100644
> > > --- a/net/tipc/discover.c
> > > +++ b/net/tipc/discover.c
> > > @@ -38,6 +38,8 @@
> > >  #include "node.h"
> > >  #include "discover.h"
> > >
> > > +#include <net/netns/hash.h>
> > > +
> > >  /* min delay during bearer start up */
> > >  #define TIPC_DISC_INIT msecs_to_jiffies(125)
> > >  /* max delay if bearer has no links */
> > > @@ -94,6 +96,7 @@ static void tipc_disc_init_msg(struct net *net, struct
sk_buff *skb, > > > msg_set_dest_domain(hdr, dest_domain); > > > msg_set_bc_netid(hdr, tn->net_id); > > > b->media->addr2msg(msg_media_addr(hdr), &b->addr); > > > + msg_set_peer_net_hash(hdr, net_hash_mix(net)); > > > msg_set_node_id(hdr, tipc_own_id(net)); } > > > > > > @@ -200,6 +203,7 @@ void tipc_disc_rcv(struct net *net, struct > > > sk_buff > > *skb, > > > u8 peer_id[NODE_ID_LEN] = {0,}; > > > u32 dst = msg_dest_domain(hdr); > > > u32 net_id = msg_bc_netid(hdr); > > > + u32 pnet_hash = msg_peer_net_hash(hdr); > > > struct tipc_media_addr maddr; > > > u32 src = msg_prevnode(hdr); > > > u32 mtyp = msg_type(hdr); > > > @@ -242,7 +246,7 @@ void tipc_disc_rcv(struct net *net, struct > > > sk_buff > > *skb, > > > if (!tipc_in_scope(legacy, b->domain, src)) > > > return; > > > tipc_node_check_dest(net, src, peer_id, b, caps, signature, > > > - &maddr, &respond, &dupl_addr); > > > + pnet_hash, &maddr, &respond, &dupl_addr); > > > if (dupl_addr) > > > disc_dupl_alert(b, src, &maddr); > > > if (!respond) > > > diff --git a/net/tipc/msg.h b/net/tipc/msg.h index > > > 0daa6f04ca81..a8d0f28094f2 > > > 100644 > > > --- a/net/tipc/msg.h > > > +++ b/net/tipc/msg.h > > > @@ -973,6 +973,16 @@ static inline void msg_set_grp_remitted(struct > > > tipc_msg *m, u16 n) > > > msg_set_bits(m, 9, 16, 0xffff, n); } > > > > > > +static inline void msg_set_peer_net_hash(struct tipc_msg *m, u32 n) { > > > + msg_set_word(m, 9, n); > > > +} > > > + > > > +static inline u32 msg_peer_net_hash(struct tipc_msg *m) { > > > + return msg_word(m, 9); > > > +} > > > + > > > /* Word 10 > > > */ > > > static inline u16 msg_grp_evt(struct tipc_msg *m) diff --git > > > a/net/tipc/name_distr.c b/net/tipc/name_distr.c index > > > 836e629e8f4a..5feaf3b67380 100644 > > > --- a/net/tipc/name_distr.c > > > +++ b/net/tipc/name_distr.c > > > @@ -146,7 +146,7 @@ static void named_distribute(struct net *net, > > > struct sk_buff_head *list, > > > struct publication *publ; > > > struct sk_buff *skb = NULL; > > > 
struct distr_item *item = NULL; > > > - u32 msg_dsz = ((tipc_node_get_mtu(net, dnode, 0) - INT_H_SIZE) / > > > + u32 msg_dsz = ((tipc_node_get_mtu(net, dnode, 0, false) - > > > +INT_H_SIZE) / > > > ITEM_SIZE) * ITEM_SIZE; > > > u32 msg_rem = msg_dsz; > > > > > > diff --git a/net/tipc/node.c b/net/tipc/node.c index > > > c8f6177dd5a2..9a4ffd647701 100644 > > > --- a/net/tipc/node.c > > > +++ b/net/tipc/node.c > > > @@ -45,6 +45,8 @@ > > > #include "netlink.h" > > > #include "trace.h" > > > > > > +#include <net/netns/hash.h> > > > + > > > #define INVALID_NODE_SIG 0x10000 > > > #define NODE_CLEANUP_AFTER 300000 > > > > > > @@ -126,6 +128,7 @@ struct tipc_node { > > > struct timer_list timer; > > > struct rcu_head rcu; > > > unsigned long delete_at; > > > + struct net *pnet; > > > }; > > > > > > /* Node FSM states and events: > > > @@ -184,7 +187,7 @@ static struct tipc_link *node_active_link(struct > > > tipc_node *n, int sel) > > > return n->links[bearer_id].link; > > > } > > > > > > -int tipc_node_get_mtu(struct net *net, u32 addr, u32 sel) > > > +int tipc_node_get_mtu(struct net *net, u32 addr, u32 sel, bool > > > +connected) > > > { > > > struct tipc_node *n; > > > int bearer_id; > > > @@ -194,6 +197,14 @@ int tipc_node_get_mtu(struct net *net, u32 > > > addr, > > > u32 > > > sel) > > > if (unlikely(!n)) > > > return mtu; > > > > > > + /* Allow MAX_MSG_SIZE when building connection oriented message > > > + * if they are in the same core network > > > + */ > > > + if (n->pnet && connected) { > > > + tipc_node_put(n); > > > + return mtu; > > > + } > > > + > > > bearer_id = n->active_links[sel & 1]; > > > if (likely(bearer_id != INVALID_BEARER_ID)) > > > mtu = n->links[bearer_id].mtu; > > > @@ -361,11 +372,14 @@ static void tipc_node_write_unlock(struct > > > tipc_node *n) } > > > > > > static struct tipc_node *tipc_node_create(struct net *net, u32 addr, > > > - u8 *peer_id, u16 capabilities) > > > + u8 *peer_id, u16 capabilities, > > > + u32 signature, u32 
pnet_hash) > > > { > > > struct tipc_net *tn = net_generic(net, tipc_net_id); > > > struct tipc_node *n, *temp_node; > > > + struct tipc_net *tn_peer; > > > struct tipc_link *l; > > > + struct net *tmp; > > > int bearer_id; > > > int i; > > > > > > @@ -400,6 +414,23 @@ static struct tipc_node > > > *tipc_node_create(struct net *net, u32 addr, > > > memcpy(&n->peer_id, peer_id, 16); > > > n->net = net; > > > n->capabilities = capabilities; > > > + n->pnet = NULL; > > > + for_each_net_rcu(tmp) { > > > + /* Integrity checking whether node exists in namespace or not */ > > > + if (net_hash_mix(tmp) != pnet_hash) > > > + continue; > > > + tn_peer = net_generic(tmp, tipc_net_id); > > > + if (!tn_peer) > > > + continue; > > > + > > > + if ((tn_peer->random & 0x7fff) != (signature & 0x7fff)) > > > + continue; > > > + > > > + if (!memcmp(n->peer_id, tn_peer->node_id, NODE_ID_LEN)) { > > > + n->pnet = tmp; > > > + break; > > > + } > > > + } > > > kref_init(&n->kref); > > > rwlock_init(&n->lock); > > > INIT_HLIST_NODE(&n->hash); > > > @@ -979,7 +1010,7 @@ u32 tipc_node_try_addr(struct net *net, u8 *id, > > > u32 > > > addr) > > > > > > void tipc_node_check_dest(struct net *net, u32 addr, > > > u8 *peer_id, struct tipc_bearer *b, > > > - u16 capabilities, u32 signature, > > > + u16 capabilities, u32 signature, u32 pnet_hash, > > > struct tipc_media_addr *maddr, > > > bool *respond, bool *dupl_addr) { @@ -998,7 +1029,8 > @@ void > > > tipc_node_check_dest(struct net *net, u32 > > addr, > > > *dupl_addr = false; > > > *respond = false; > > > > > > - n = tipc_node_create(net, addr, peer_id, capabilities); > > > + n = tipc_node_create(net, addr, peer_id, capabilities, signature, > > > + pnet_hash); > > > if (!n) > > > return; > > > > > > @@ -1424,6 +1456,49 @@ static int __tipc_nl_add_node(struct > > > tipc_nl_msg *msg, struct tipc_node *node) > > > return -EMSGSIZE; > > > } > > > > > > +static void tipc_lxc_xmit(struct net *pnet, struct sk_buff_head > > > +*list) { > > > + struct 
tipc_msg *hdr = buf_msg(skb_peek(list)); > > > + struct sk_buff_head inputq; > > > + > > > + switch (msg_user(hdr)) { > > > + case TIPC_LOW_IMPORTANCE: > > > + case TIPC_MEDIUM_IMPORTANCE: > > > + case TIPC_HIGH_IMPORTANCE: > > > + case TIPC_CRITICAL_IMPORTANCE: > > > + if (msg_connected(hdr) || msg_named(hdr)) { > > > + spin_lock_init(&list->lock); > > > + tipc_sk_rcv(pnet, list); > > > + return; > > > + } > > > + if (msg_mcast(hdr)) { > > > + skb_queue_head_init(&inputq); > > > + tipc_sk_mcast_rcv(pnet, list, &inputq); > > > + __skb_queue_purge(list); > > > + skb_queue_purge(&inputq); > > > + return; > > > + } > > > + return; > > > + case MSG_FRAGMENTER: > > > + if (tipc_msg_assemble(list)) { > > > + skb_queue_head_init(&inputq); > > > + tipc_sk_mcast_rcv(pnet, list, &inputq); > > > + __skb_queue_purge(list); > > > + skb_queue_purge(&inputq); > > > + } > > > + return; > > > + case LINK_PROTOCOL: > > > + case NAME_DISTRIBUTOR: > > > + case GROUP_PROTOCOL: > > > + case CONN_MANAGER: > > > + case TUNNEL_PROTOCOL: > > > + case BCAST_PROTOCOL: > > > + return; > > > + default: > > > + return; > > > + }; > > > +} > > > + > > > /** > > > * tipc_node_xmit() is the general link level function for message sending > > > * @net: the applicable net namespace @@ -1439,6 +1514,7 @@ int > > > tipc_node_xmit(struct net *net, struct sk_buff_head *list, > > > struct tipc_link_entry *le = NULL; > > > struct tipc_node *n; > > > struct sk_buff_head xmitq; > > > + bool node_up = false; > > > int bearer_id; > > > int rc; > > > > > > @@ -1455,6 +1531,16 @@ int tipc_node_xmit(struct net *net, struct > > > sk_buff_head *list, > > > return -EHOSTUNREACH; > > > } > > > > > > + node_up = node_is_up(n); > > > + if (node_up && n->pnet && check_net(n->pnet)) { > > > + /* xmit inner linux container */ > > > + tipc_lxc_xmit(n->pnet, list); > > > + if (likely(skb_queue_empty(list))) { > > > + tipc_node_put(n); > > > + return 0; > > > + } > > > + } > > > + > > > tipc_node_read_lock(n); > > > 
bearer_id = n->active_links[selector & 1]; > > > if (unlikely(bearer_id == INVALID_BEARER_ID)) { diff --git > > > a/net/tipc/node.h b/net/tipc/node.h index > > 291d0ecd4101..11eb95ce358b > > > 100644 > > > --- a/net/tipc/node.h > > > +++ b/net/tipc/node.h > > > @@ -75,7 +75,7 @@ u32 tipc_node_get_addr(struct tipc_node *node); > > > u32 tipc_node_try_addr(struct net *net, u8 *id, u32 addr); void > > > tipc_node_check_dest(struct net *net, u32 onode, u8 *peer_id128, > > > struct tipc_bearer *bearer, > > > - u16 capabilities, u32 signature, > > > + u16 capabilities, u32 signature, u32 pnet_hash, > > > struct tipc_media_addr *maddr, > > > bool *respond, bool *dupl_addr); void > > > tipc_node_delete_links(struct net *net, int bearer_id); @@ -92,7 > > > +92,7 @@ void tipc_node_unsubscribe(struct net *net, struct > > > list_head *subscr, > > > u32 addr); void tipc_node_broadcast(struct net *net, struct > > > sk_buff *skb); int tipc_node_add_conn(struct net *net, u32 dnode, > > > u32 port, > > > u32 peer_port); void tipc_node_remove_conn(struct net *net, u32 > > > dnode, u32 port); -int tipc_node_get_mtu(struct net *net, u32 addr, > > > u32 sel); > > > +int tipc_node_get_mtu(struct net *net, u32 addr, u32 sel, bool > > > +connected); > > > bool tipc_node_is_up(struct net *net, u32 addr); > > > u16 tipc_node_get_capabilities(struct net *net, u32 addr); int > > > tipc_nl_node_dump(struct sk_buff *skb, struct netlink_callback *cb); > > > diff --git a/net/tipc/socket.c b/net/tipc/socket.c index > > > 3b9f8cc328f5..fb24df03da6c 100644 > > > --- a/net/tipc/socket.c > > > +++ b/net/tipc/socket.c > > > @@ -854,7 +854,7 @@ static int tipc_send_group_msg(struct net *net, > > > struct tipc_sock *tsk, > > > > > > /* Build message as chain of buffers */ > > > __skb_queue_head_init(&pkts); > > > - mtu = tipc_node_get_mtu(net, dnode, tsk->portid); > > > + mtu = tipc_node_get_mtu(net, dnode, tsk->portid, false); > > > rc = tipc_msg_build(hdr, m, 0, dlen, mtu, &pkts); > > > if (unlikely(rc 
!= dlen)) > > > return rc; > > > @@ -1388,7 +1388,7 @@ static int __tipc_sendmsg(struct socket *sock, > > > struct msghdr *m, size_t dlen) > > > return rc; > > > > > > __skb_queue_head_init(&pkts); > > > - mtu = tipc_node_get_mtu(net, dnode, tsk->portid); > > > + mtu = tipc_node_get_mtu(net, dnode, tsk->portid, false); > > > rc = tipc_msg_build(hdr, m, 0, dlen, mtu, &pkts); > > > if (unlikely(rc != dlen)) > > > return rc; > > > @@ -1526,7 +1526,7 @@ static void tipc_sk_finish_conn(struct > > > tipc_sock *tsk, > > > u32 peer_port, > > > sk_reset_timer(sk, &sk->sk_timer, jiffies + CONN_PROBING_INTV); > > > tipc_set_sk_state(sk, TIPC_ESTABLISHED); > > > tipc_node_add_conn(net, peer_node, tsk->portid, peer_port); > > > - tsk->max_pkt = tipc_node_get_mtu(net, peer_node, tsk->portid); > > > + tsk->max_pkt = tipc_node_get_mtu(net, peer_node, tsk->portid, > > > +true); > > > tsk->peer_caps = tipc_node_get_capabilities(net, peer_node); > > > __skb_queue_purge(&sk->sk_write_queue); > > > if (tsk->peer_caps & TIPC_BLOCK_FLOWCTL) > > > -- > > > 2.20.1 > > > > > > |
From: Tuong L. <tuo...@de...> - 2019-10-14 11:37:17
Two new commands are added as part of the 'tipc node' command:

 $tipc node set key KEY [algname ALGNAME] [nodeid NODEID]
 $tipc node flush key

which enable the user to set and remove AEAD keys in kernel TIPC.

For the 'set key' command, the given 'nodeid' parameter decides the mode to be applied to the key, particularly:

- If NODEID is empty, the key is a 'cluster' key which will be used for all message encryption/decryption from/to the node (i.e. both TX & RX). The same key needs to be set in the other nodes, i.e. the 'cluster key' mode.

- If NODEID is own node, the key is used for message encryption (TX) from the node. Whereas, if NODEID is a peer node, the key is for message decryption (RX) from that peer node. This is the 'per-node-key' mode, in which each node in the cluster has its specific (TX) key.

Signed-off-by: Tuong Lien <tuo...@de...>
---
 include/uapi/linux/tipc.h         |  21 ++++++
 include/uapi/linux/tipc_netlink.h |   4 ++
 tipc/misc.c                       |  38 +++++++++++
 tipc/misc.h                       |   1 +
 tipc/node.c                       | 133 +++++++++++++++++++++++++++++++++++++-
 5 files changed, 195 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/tipc.h b/include/uapi/linux/tipc.h
index e16cb4e2..b118ce9b 100644
--- a/include/uapi/linux/tipc.h
+++ b/include/uapi/linux/tipc.h
@@ -232,6 +232,27 @@ struct tipc_sioc_nodeid_req {
 	char node_id[TIPC_NODEID_LEN];
 };
 
+/*
+ * TIPC Crypto, AEAD mode
+ */
+#define TIPC_AEAD_MAX_ALG_NAME	(32)
+#define TIPC_AEAD_MIN_KEYLEN	(16 + 4)
+#define TIPC_AEAD_MAX_KEYLEN	(32 + 4)
+
+struct tipc_aead_key {
+	char alg_name[TIPC_AEAD_MAX_ALG_NAME];
+	unsigned int keylen;	/* in bytes */
+	char key[];
+};
+
+#define TIPC_AEAD_KEY_MAX_SIZE	(sizeof(struct tipc_aead_key) + \
+				 TIPC_AEAD_MAX_KEYLEN)
+
+static inline int tipc_aead_key_size(struct tipc_aead_key *key)
+{
+	return sizeof(*key) + key->keylen;
+}
+
 /* The macros and functions below are deprecated:
  */
 
diff --git a/include/uapi/linux/tipc_netlink.h b/include/uapi/linux/tipc_netlink.h
index efb958fd..6c2194ab 100644
--- a/include/uapi/linux/tipc_netlink.h
+++ b/include/uapi/linux/tipc_netlink.h
@@ -63,6 +63,8 @@ enum {
 	TIPC_NL_PEER_REMOVE,
 	TIPC_NL_BEARER_ADD,
 	TIPC_NL_UDP_GET_REMOTEIP,
+	TIPC_NL_KEY_SET,
+	TIPC_NL_KEY_FLUSH,
 
 	__TIPC_NL_CMD_MAX,
 	TIPC_NL_CMD_MAX = __TIPC_NL_CMD_MAX - 1
@@ -160,6 +162,8 @@ enum {
 	TIPC_NLA_NODE_UNSPEC,
 	TIPC_NLA_NODE_ADDR,		/* u32 */
 	TIPC_NLA_NODE_UP,		/* flag */
+	TIPC_NLA_NODE_ID,		/* data */
+	TIPC_NLA_NODE_KEY,		/* data */
 
 	__TIPC_NLA_NODE_MAX,
 	TIPC_NLA_NODE_MAX = __TIPC_NLA_NODE_MAX - 1
diff --git a/tipc/misc.c b/tipc/misc.c
index e4b1cd0c..1daf3072 100644
--- a/tipc/misc.c
+++ b/tipc/misc.c
@@ -98,6 +98,44 @@ int str2nodeid(char *str, uint8_t *id)
 	return 0;
 }
 
+int str2key(char *str, struct tipc_aead_key *key)
+{
+	int len = strlen(str);
+	int ishex = 0;
+	int i;
+
+	/* Check if the input is a hex string (i.e. 0x...) */
+	if (len > 2 && strncmp(str, "0x", 2) == 0) {
+		ishex = is_hex(str + 2, len - 2 - 1);
+		if (ishex) {
+			len -= 2;
+			str += 2;
+		}
+	}
+
+	/* Obtain key: */
+	if (!ishex) {
+		key->keylen = len;
+		memcpy(key->key, str, len);
+	} else {
+		/* Convert hex string to key */
+		key->keylen = (len + 1) / 2;
+		for (i = 0; i < key->keylen; i++) {
+			if (i == 0 && len % 2 != 0) {
+				if (sscanf(str, "%1hhx", &key->key[0]) != 1)
+					return -1;
+				str += 1;
+				continue;
+			}
+			if (sscanf(str, "%2hhx", &key->key[i]) != 1)
+				return -1;
+			str += 2;
+		}
+	}
+
+	return 0;
+}
+
 void nodeid2str(uint8_t *id, char *str)
 {
 	int i;
diff --git a/tipc/misc.h b/tipc/misc.h
index ff2f31f1..59309f68 100644
--- a/tipc/misc.h
+++ b/tipc/misc.h
@@ -18,5 +18,6 @@ uint32_t str2addr(char *str);
 int str2nodeid(char *str, uint8_t *id);
 void nodeid2str(uint8_t *id, char *str);
 void hash2nodestr(uint32_t hash, char *str);
+int str2key(char *str, struct tipc_aead_key *key);
 
 #endif
diff --git a/tipc/node.c b/tipc/node.c
index 2fec6753..fc81bd30 100644
--- a/tipc/node.c
+++ b/tipc/node.c
@@ -157,6 +157,111 @@ static int cmd_node_set_nodeid(struct nlmsghdr *nlh, const struct cmd *cmd,
 	return msg_doit(nlh, NULL, NULL);
 }
 
+static void cmd_node_set_key_help(struct cmdl *cmdl)
+{
+	fprintf(stderr,
+		"Usage: %s node set key KEY [algname ALGNAME] [nodeid NODEID]\n\n"
+		"PROPERTIES\n"
+		" KEY                   - Symmetric KEY & SALT as a normal or hex string\n"
+		"                         that consists of two parts:\n"
+		"                         [KEY: 16, 24 or 32 octets][SALT: 4 octets]\n\n"
+		" algname ALGNAME       - Default: \"gcm(aes)\"\n\n"
+		" nodeid NODEID         - Own or peer node identity to which the key will\n"
+		"                         be attached. If not present, the key is a cluster\n"
+		"                         key!\n\n"
+		"EXAMPLES\n"
+		"  %s node set key this_is_a_key16_salt algname \"gcm(aes)\" nodeid node1\n"
+		"  %s node set key 0x746869735F69735F615F6B657931365F73616C74 nodeid node2\n\n",
+		cmdl->argv[0], cmdl->argv[0], cmdl->argv[0]);
+}
+
+static int cmd_node_set_key(struct nlmsghdr *nlh, const struct cmd *cmd,
+			    struct cmdl *cmdl, void *data)
+{
+	struct {
+		struct tipc_aead_key key;
+		char mem[TIPC_AEAD_MAX_KEYLEN + 1];
+	} input = {};
+	struct opt opts[] = {
+		{ "algname",	OPT_KEYVAL,	NULL },
+		{ "nodeid",	OPT_KEYVAL,	NULL },
+		{ NULL }
+	};
+	struct nlattr *nest;
+	struct opt *opt_algname, *opt_nodeid;
+	char buf[MNL_SOCKET_BUFFER_SIZE];
+	uint8_t id[TIPC_NODEID_LEN] = {0,};
+	int keysize;
+	char *str;
+
+	if (help_flag) {
+		(cmd->help)(cmdl);
+		return -EINVAL;
+	}
+
+	if (cmdl->optind >= cmdl->argc) {
+		fprintf(stderr, "error, missing key\n");
+		return -EINVAL;
+	}
+
+	/* Get user key */
+	str = shift_cmdl(cmdl);
+	if (str2key(str, &input.key)) {
+		fprintf(stderr, "error, invalid key input\n");
+		return -EINVAL;
+	}
+
+	if (parse_opts(opts, cmdl) < 0)
+		return -EINVAL;
+
+	/* Get algorithm name, default: "gcm(aes)" */
+	opt_algname = get_opt(opts, "algname");
+	if (!opt_algname)
+		strcpy(input.key.alg_name, "gcm(aes)");
+	else
+		strcpy(input.key.alg_name, opt_algname->val);
+
+	/* Get node identity */
+	opt_nodeid = get_opt(opts, "nodeid");
+	if (opt_nodeid && str2nodeid(opt_nodeid->val, id)) {
+		fprintf(stderr, "error, invalid node identity\n");
+		return -EINVAL;
+	}
+
+	/* Init & do the command */
+	nlh = msg_init(buf, TIPC_NL_KEY_SET);
+	if (!nlh) {
+		fprintf(stderr, "error, message initialisation failed\n");
+		return -1;
+	}
+	nest = mnl_attr_nest_start(nlh, TIPC_NLA_NODE);
+	keysize = tipc_aead_key_size(&input.key);
+	mnl_attr_put(nlh, TIPC_NLA_NODE_KEY, keysize, &input.key);
+	if (opt_nodeid)
+		mnl_attr_put(nlh, TIPC_NLA_NODE_ID, TIPC_NODEID_LEN, id);
+	mnl_attr_nest_end(nlh, nest);
+	return msg_doit(nlh, NULL, NULL);
+}
+
+static int cmd_node_flush_key(struct nlmsghdr *nlh, const struct cmd *cmd,
+			      struct cmdl *cmdl, void *data)
+{
+	char buf[MNL_SOCKET_BUFFER_SIZE];
+
+	if (help_flag) {
+		(cmd->help)(cmdl);
+		return -EINVAL;
+	}
+
+	/* Init & do the command */
+	nlh = msg_init(buf, TIPC_NL_KEY_FLUSH);
+	if (!nlh) {
+		fprintf(stderr, "error, message initialisation failed\n");
+		return -1;
+	}
+	return msg_doit(nlh, NULL, NULL);
+}
+
 static int nodeid_get_cb(const struct nlmsghdr *nlh, void *data)
 {
 	struct nlattr *info[TIPC_NLA_MAX + 1] = {};
@@ -270,13 +375,34 @@ static int cmd_node_set_netid(struct nlmsghdr *nlh, const struct cmd *cmd,
 	return msg_doit(nlh, NULL, NULL);
 }
 
+static void cmd_node_flush_help(struct cmdl *cmdl)
+{
+	fprintf(stderr,
+		"Usage: %s node flush PROPERTY\n\n"
+		"PROPERTIES\n"
+		" key                   - Flush all symmetric-keys\n",
+		cmdl->argv[0]);
+}
+
+static int cmd_node_flush(struct nlmsghdr *nlh, const struct cmd *cmd,
+			  struct cmdl *cmdl, void *data)
+{
+	const struct cmd cmds[] = {
+		{ "key",	cmd_node_flush_key,	NULL },
+		{ NULL }
+	};
+
+	return run_cmd(nlh, cmd, cmds, cmdl, NULL);
+}
+
 static void cmd_node_set_help(struct cmdl *cmdl)
 {
 	fprintf(stderr,
 		"Usage: %s node set PROPERTY\n\n"
 		"PROPERTIES\n"
 		" identity NODEID       - Set node identity\n"
-		" clusterid CLUSTERID   - Set local cluster id\n",
+		" clusterid CLUSTERID   - Set local cluster id\n"
+		" key PROPERTY          - Set symmetric-key\n",
 		cmdl->argv[0]);
 }
 
@@ -288,6 +414,7 @@ static int cmd_node_set(struct nlmsghdr *nlh, const struct cmd *cmd,
 		{ "identity",	cmd_node_set_nodeid,	NULL },
 		{ "netid",	cmd_node_set_netid,	NULL },
 		{ "clusterid",	cmd_node_set_netid,	NULL },
+		{ "key",	cmd_node_set_key,	cmd_node_set_key_help },
 		{ NULL }
 	};
 
@@ -325,7 +452,8 @@ void cmd_node_help(struct cmdl *cmdl)
 		"COMMANDS\n"
 		" list                  - List remote nodes\n"
 		" get                   - Get local node parameters\n"
-		" set                   - Set local node parameters\n",
+		" set                   - Set local node parameters\n"
+		" flush                 - Flush local node parameters\n",
 		cmdl->argv[0]);
 }
 
@@ -336,6 +464,7 @@ int cmd_node(struct nlmsghdr *nlh, const struct cmd *cmd, struct cmdl *cmdl,
 		{ "list",	cmd_node_list,	NULL },
 		{ "get",	cmd_node_get,	cmd_node_get_help },
 		{ "set",	cmd_node_set,	cmd_node_set_help },
+		{ "flush",	cmd_node_flush,	cmd_node_flush_help},
 		{ NULL }
 	};
 
--
2.13.7
From: Tuong L. <tuo...@de...> - 2019-10-14 11:07:57
|
This commit adds two netlink commands to TIPC so that a user can set or
remove AEAD keys:
- TIPC_NL_KEY_SET
- TIPC_NL_KEY_FLUSH

When the 'KEY_SET' command is given along with the key data, the key will be
initiated and attached to TIPC crypto. On the other hand, the 'KEY_FLUSH'
command will remove all existing keys, if any.

Signed-off-by: Tuong Lien <tuo...@de...>
---
 include/uapi/linux/tipc_netlink.h |   4 ++
 net/tipc/netlink.c                |  16 ++++-
 net/tipc/node.c                   | 133 ++++++++++++++++++++++++++++++++++++++
 net/tipc/node.h                   |   2 +
 4 files changed, 154 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/tipc_netlink.h b/include/uapi/linux/tipc_netlink.h index efb958fd167d..6c2194ab745b 100644 --- a/include/uapi/linux/tipc_netlink.h +++ b/include/uapi/linux/tipc_netlink.h @@ -63,6 +63,8 @@ enum { TIPC_NL_PEER_REMOVE, TIPC_NL_BEARER_ADD, TIPC_NL_UDP_GET_REMOTEIP, + TIPC_NL_KEY_SET, + TIPC_NL_KEY_FLUSH, __TIPC_NL_CMD_MAX, TIPC_NL_CMD_MAX = __TIPC_NL_CMD_MAX - 1 @@ -160,6 +162,8 @@ enum { TIPC_NLA_NODE_UNSPEC, TIPC_NLA_NODE_ADDR, /* u32 */ TIPC_NLA_NODE_UP, /* flag */ + TIPC_NLA_NODE_ID, /* data */ + TIPC_NLA_NODE_KEY, /* data */ __TIPC_NLA_NODE_MAX, TIPC_NLA_NODE_MAX = __TIPC_NLA_NODE_MAX - 1 diff --git a/net/tipc/netlink.c b/net/tipc/netlink.c index d32bbd0f5e46..f118cc9d0885 100644 --- a/net/tipc/netlink.c +++ b/net/tipc/netlink.c @@ -102,7 +102,11 @@ const struct nla_policy tipc_nl_link_policy[TIPC_NLA_LINK_MAX + 1] = { const struct nla_policy tipc_nl_node_policy[TIPC_NLA_NODE_MAX + 1] = { [TIPC_NLA_NODE_UNSPEC] = { .type = NLA_UNSPEC }, [TIPC_NLA_NODE_ADDR] = { .type = NLA_U32 }, - [TIPC_NLA_NODE_UP] = { .type = NLA_FLAG } + [TIPC_NLA_NODE_UP] = { .type = NLA_FLAG }, + [TIPC_NLA_NODE_ID] = { .type = NLA_BINARY, + .len = TIPC_NODEID_LEN}, + [TIPC_NLA_NODE_KEY] = { .type = NLA_BINARY, + .len = TIPC_AEAD_KEY_SIZE_MAX}, }; /* Properties valid for media, bearer and link */ @@ -257,6 +261,16 @@ static const struct genl_ops tipc_genl_v2_ops[] = { .dumpit = 
tipc_udp_nl_dump_remoteip, }, #endif + { + .cmd = TIPC_NL_KEY_SET, + .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, + .doit = tipc_nl_node_set_key, + }, + { + .cmd = TIPC_NL_KEY_FLUSH, + .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, + .doit = tipc_nl_node_flush_key, + }, }; struct genl_family tipc_genl_family __ro_after_init = { diff --git a/net/tipc/node.c b/net/tipc/node.c index e6e0c5bee4bc..6621091f22d1 100644 --- a/net/tipc/node.c +++ b/net/tipc/node.c @@ -2628,6 +2628,139 @@ int tipc_nl_node_dump_monitor_peer(struct sk_buff *skb, return skb->len; } +static int tipc_nl_retrieve_key(struct nlattr **attrs, + struct tipc_aead_key **key) +{ + struct nlattr *attr = attrs[TIPC_NLA_NODE_KEY]; + + if (!attr) + return -ENODATA; + + *key = (struct tipc_aead_key *)nla_data(attr); + if (nla_len(attr) < tipc_aead_key_size(*key)) + return -EINVAL; + + return 0; +} + +static int tipc_nl_retrieve_nodeid(struct nlattr **attrs, u8 **node_id) +{ + struct nlattr *attr = attrs[TIPC_NLA_NODE_ID]; + + if (!attr) + return -ENODATA; + + if (nla_len(attr) < TIPC_NODEID_LEN) + return -EINVAL; + + *node_id = (u8 *)nla_data(attr); + return 0; +} + +int __tipc_nl_node_set_key(struct sk_buff *skb, struct genl_info *info) +{ + struct nlattr *attrs[TIPC_NLA_NODE_MAX + 1]; + struct net *net = sock_net(skb->sk); + struct tipc_net *tn = tipc_net(net); + struct tipc_node *n = NULL; + struct tipc_aead_key *ukey; + struct tipc_crypto *c; + u8 *id, *own_id; + int rc = 0; + + if (!info->attrs[TIPC_NLA_NODE]) + return -EINVAL; + + rc = nla_parse_nested(attrs, TIPC_NLA_NODE_MAX, + info->attrs[TIPC_NLA_NODE], + tipc_nl_node_policy, info->extack); + if (rc) + goto exit; + + own_id = tipc_own_id(net); + if (!own_id) { + rc = -EPERM; + goto exit; + } + + rc = tipc_nl_retrieve_key(attrs, &ukey); + if (rc) + goto exit; + + rc = tipc_aead_key_validate(ukey); + if (rc) + goto exit; + + rc = tipc_nl_retrieve_nodeid(attrs, &id); + switch (rc) { + case -ENODATA: + /* Cluster 
key mode */ + rc = tipc_crypto_key_init(tn->crypto_tx, ukey, CLUSTER_KEY); + break; + case 0: + /* Per-node key mode */ + if (!memcmp(id, own_id, NODE_ID_LEN)) { + c = tn->crypto_tx; + } else { + n = tipc_node_find_by_id(net, id) ?: + tipc_node_create(net, 0, id, 0xffffu, true); + if (unlikely(!n)) { + rc = -ENOMEM; + break; + } + c = n->crypto_rx; + } + + rc = tipc_crypto_key_init(c, ukey, PER_NODE_KEY); + if (n) + tipc_node_put(n); + break; + default: + break; + } + +exit: + return (rc < 0) ? rc : 0; +} + +int tipc_nl_node_set_key(struct sk_buff *skb, struct genl_info *info) +{ + int err; + + rtnl_lock(); + err = __tipc_nl_node_set_key(skb, info); + rtnl_unlock(); + + return err; +} + +int __tipc_nl_node_flush_key(struct sk_buff *skb, struct genl_info *info) +{ + struct net *net = sock_net(skb->sk); + struct tipc_net *tn = tipc_net(net); + struct tipc_node *n; + + tipc_crypto_key_flush(tn->crypto_tx); + rcu_read_lock(); + list_for_each_entry_rcu(n, &tn->node_list, list) + tipc_crypto_key_flush(n->crypto_rx); + rcu_read_unlock(); + + pr_info("All keys are flushed!\n"); + return 0; +} + +int tipc_nl_node_flush_key(struct sk_buff *skb, struct genl_info *info) +{ + int err; + + rtnl_lock(); + err = __tipc_nl_node_flush_key(skb, info); + rtnl_unlock(); + + return err; +} + /** * tipc_node_dump - dump TIPC node data * @n: tipc node to be dumped diff --git a/net/tipc/node.h b/net/tipc/node.h index ec44ab454aa9..a0c59893fbd6 100644 --- a/net/tipc/node.h +++ b/net/tipc/node.h @@ -115,4 +115,6 @@ int tipc_nl_node_get_monitor(struct sk_buff *skb, struct genl_info *info); int tipc_nl_node_dump_monitor(struct sk_buff *skb, struct netlink_callback *cb); int tipc_nl_node_dump_monitor_peer(struct sk_buff *skb, struct netlink_callback *cb); +int tipc_nl_node_set_key(struct sk_buff *skb, struct genl_info *info); +int tipc_nl_node_flush_key(struct sk_buff *skb, struct genl_info *info); #endif -- 2.13.7 |
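The `switch` in `__tipc_nl_node_set_key()` above picks where a key is attached from the optional node id: no id means a cluster key on the local TX crypto, the node's own id means a per-node key on the local TX crypto, and any other id means a per-node key on that peer's RX crypto. The decision can be sketched in user-space C as follows. This is an illustration only: all `demo_*`/`DEMO_*` names are hypothetical, and the 16-byte id length is an assumption meant to mirror `TIPC_NODEID_LEN`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: mirrors TIPC_NODEID_LEN; not taken from this patch */
#define DEMO_NODE_ID_LEN 16

/* Hypothetical labels for the three outcomes of the kernel switch */
enum demo_key_target {
	DEMO_CLUSTER_TX,	/* no node id: cluster key, own TX crypto */
	DEMO_PER_NODE_TX,	/* id == own id: per-node key, own TX crypto */
	DEMO_PER_NODE_RX	/* peer id: per-node key, that peer's RX crypto */
};

/* Sample ids used only for illustration */
static const unsigned char demo_own_id[DEMO_NODE_ID_LEN] = { 0x01 };
static const unsigned char demo_peer_id[DEMO_NODE_ID_LEN] = { 0x02 };

/* Mirror the decision order of the kernel code: a missing id
 * (-ENODATA there, NULL here) selects cluster-key mode; otherwise
 * compare the id against our own to choose between TX and RX.
 */
static enum demo_key_target demo_pick_key_target(const unsigned char *id,
						 const unsigned char *own_id)
{
	assert(own_id != NULL);

	if (!id)
		return DEMO_CLUSTER_TX;
	if (memcmp(id, own_id, DEMO_NODE_ID_LEN) == 0)
		return DEMO_PER_NODE_TX;
	return DEMO_PER_NODE_RX;
}
```

In the kernel code the RX case additionally looks up (or creates) the peer node before attaching the key; the sketch leaves that step out.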
From: Tuong L. <tuo...@de...> - 2019-10-14 11:07:55
|
This commit offers an option to encrypt and authenticate all messaging,
including the neighbor discovery messages. The currently most advanced
algorithm supported is the AEAD AES-GCM (like IPSec or TLS). All
encryption/decryption is done at the bearer layer, just before leaving or
after entering TIPC.

Supported features:
- Encryption & authentication of all TIPC messages (header + data);
- Two symmetric-key modes: Cluster and Per-node;
- Automatic key switching;
- Key-expired revoking (sequence number wrapped);
- Lock-free encryption/decryption (RCU);
- Asynchronous crypto, Intel AES-NI supported;
- Multiple cipher transforms;
- Logs & statistics;

Two key modes:
- Cluster key mode: One single key is used for both TX & RX in all nodes in
  the cluster.
- Per-node key mode: Each node in the cluster has one specific TX key. For
  RX, a node requires its peers' TX key to be able to decrypt the messages
  from those peers.

Key setting from user-space is performed via netlink by a user program (e.g.
the iproute2 'tipc' tool).

Internal key state machine:

                                 Attach    Align(RX)
                                     +-+   +-+
                                     | V   | V
        +---------+      Attach     +---------+
        |  IDLE   |---------------->| PENDING |(user = 0)
        +---------+                 +---------+
           A   A                   Switch|  A
           |   |                         |  |
           |   | Free(switch/revoked)    |  |
     (Free)|   +----------------------+  |  |Timeout
           |              (TX)        |  |  |(RX)
           |                          |  |  |
           |                          |  v  |
        +---------+      Switch     +---------+
        | PASSIVE |<----------------| ACTIVE  |
        +---------+       (RX)      +---------+
        (user = 1)                  (user >= 1)

The number of TFMs is 10 by default and can be changed via the procfs
'net/tipc/max_tfms'. At this moment, for simplicity, this file is also used
to print the crypto statistics at runtime:

    echo 0xfff1 > /proc/sys/net/tipc/max_tfms

The patch defines a new TIPC version (v7) for the encryption message
(backward compatibility as well).
The message is basically encapsulated as follows:

   +----------------------------------------------------------+
   | TIPCv7 encryption  | Original TIPCv2    | Authentication |
   | header             | packet (encrypted) | Tag            |
   +----------------------------------------------------------+

The throughput is ~40% for small messages (compared with non-encryption)
and ~9% for large messages. With the support from hardware crypto, i.e. the
Intel AES-NI CPU instructions, the throughput increases up to ~85% for small
messages and ~55% for large messages.

MAINTAINERS | add two new files 'crypto.h' & 'crypto.c' in tipc

Signed-off-by: Tuong Lien <tuo...@de...>
---
 net/tipc/Makefile    |    2 +-
 net/tipc/bcast.c     |    2 +-
 net/tipc/bearer.c    |   29 +-
 net/tipc/bearer.h    |    3 +-
 net/tipc/core.c      |   10 +
 net/tipc/core.h      |    4 +
 net/tipc/crypto.c    | 1986 ++++++++++++++++++++++++++++++++++++++++++++++++++
 net/tipc/crypto.h    |  166 +++++
 net/tipc/link.c      |   16 +-
 net/tipc/link.h      |    1 +
 net/tipc/msg.c       |   24 +-
 net/tipc/msg.h       |   44 +-
 net/tipc/node.c      |   84 ++-
 net/tipc/node.h      |    5 +
 net/tipc/sysctl.c    |    9 +
 net/tipc/udp_media.c |    1 +
 16 files changed, 2330 insertions(+), 56 deletions(-)
 create mode 100644 net/tipc/crypto.c
 create mode 100644 net/tipc/crypto.h

diff --git a/net/tipc/Makefile b/net/tipc/Makefile index c86aba0282af..8953a9c9427d 100644 --- a/net/tipc/Makefile +++ b/net/tipc/Makefile @@ -9,7 +9,7 @@ tipc-y += addr.o bcast.o bearer.o \ core.o link.o discover.o msg.o \ name_distr.o subscr.o monitor.o name_table.o net.o \ netlink.o netlink_compat.o node.o socket.o eth_media.o \ - topsrv.o socket.o group.o trace.o + topsrv.o socket.o group.o trace.o crypto.o CFLAGS_trace.o += -I$(src) diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c index 6ef1abdd525f..f41096a759fa 100644 --- a/net/tipc/bcast.c +++ b/net/tipc/bcast.c @@ -84,7 +84,7 @@ static struct tipc_bc_base *tipc_bc_base(struct net *net) */ int tipc_bcast_get_mtu(struct net *net) { - return tipc_link_mtu(tipc_bc_sndlink(net)) - INT_H_SIZE; + return 
tipc_link_mss(tipc_bc_sndlink(net)); } void tipc_bcast_disable_rcast(struct net *net) diff --git a/net/tipc/bearer.c b/net/tipc/bearer.c index 6e0962e0f759..b1e32f9fb12a 100644 --- a/net/tipc/bearer.c +++ b/net/tipc/bearer.c @@ -44,6 +44,7 @@ #include "netlink.h" #include "udp_media.h" #include "trace.h" +#include "crypto.h" #define MAX_ADDR_STR 60 @@ -525,10 +526,13 @@ void tipc_bearer_xmit_skb(struct net *net, u32 bearer_id, rcu_read_lock(); b = bearer_get(net, bearer_id); - if (likely(b && (test_bit(0, &b->up) || msg_is_reset(hdr)))) - b->media->send_msg(net, skb, b, dest); - else + if (likely(b && (test_bit(0, &b->up) || msg_is_reset(hdr)))) { + tipc_crypto_xmit(net, &skb, b, dest, NULL); + if (skb) + b->media->send_msg(net, skb, b, dest); + } else { kfree_skb(skb); + } rcu_read_unlock(); } @@ -536,7 +540,8 @@ void tipc_bearer_xmit_skb(struct net *net, u32 bearer_id, */ void tipc_bearer_xmit(struct net *net, u32 bearer_id, struct sk_buff_head *xmitq, - struct tipc_media_addr *dst) + struct tipc_media_addr *dst, + struct tipc_node *__dnode) { struct tipc_bearer *b; struct sk_buff *skb, *tmp; @@ -550,10 +555,13 @@ void tipc_bearer_xmit(struct net *net, u32 bearer_id, __skb_queue_purge(xmitq); skb_queue_walk_safe(xmitq, skb, tmp) { __skb_dequeue(xmitq); - if (likely(test_bit(0, &b->up) || msg_is_reset(buf_msg(skb)))) - b->media->send_msg(net, skb, b, dst); - else + if (likely(test_bit(0, &b->up) || msg_is_reset(buf_msg(skb)))) { + tipc_crypto_xmit(net, &skb, b, dst, __dnode); + if (skb) + b->media->send_msg(net, skb, b, dst); + } else { kfree_skb(skb); + } } rcu_read_unlock(); } @@ -564,6 +572,7 @@ void tipc_bearer_bc_xmit(struct net *net, u32 bearer_id, struct sk_buff_head *xmitq) { struct tipc_net *tn = tipc_net(net); + struct tipc_media_addr *dst; int net_id = tn->net_id; struct tipc_bearer *b; struct sk_buff *skb, *tmp; @@ -578,7 +587,10 @@ void tipc_bearer_bc_xmit(struct net *net, u32 bearer_id, msg_set_non_seq(hdr, 1); msg_set_mc_netid(hdr, net_id); 
__skb_dequeue(xmitq); - b->media->send_msg(net, skb, b, &b->bcast_addr); + dst = &b->bcast_addr; + tipc_crypto_xmit(net, &skb, b, dst, NULL); + if (skb) + b->media->send_msg(net, skb, b, dst); } rcu_read_unlock(); } @@ -605,6 +617,7 @@ static int tipc_l2_rcv_msg(struct sk_buff *skb, struct net_device *dev, if (likely(b && test_bit(0, &b->up) && (skb->pkt_type <= PACKET_MULTICAST))) { skb_mark_not_on_list(skb); + TIPC_SKB_CB(skb)->flags = 0; tipc_rcv(dev_net(b->pt.dev), skb, b); rcu_read_unlock(); return NET_RX_SUCCESS; diff --git a/net/tipc/bearer.h b/net/tipc/bearer.h index faca696d422f..d0c79cc6c0c2 100644 --- a/net/tipc/bearer.h +++ b/net/tipc/bearer.h @@ -232,7 +232,8 @@ void tipc_bearer_xmit_skb(struct net *net, u32 bearer_id, struct tipc_media_addr *dest); void tipc_bearer_xmit(struct net *net, u32 bearer_id, struct sk_buff_head *xmitq, - struct tipc_media_addr *dst); + struct tipc_media_addr *dst, + struct tipc_node *__dnode); void tipc_bearer_bc_xmit(struct net *net, u32 bearer_id, struct sk_buff_head *xmitq); void tipc_clone_to_loopback(struct net *net, struct sk_buff_head *pkts); diff --git a/net/tipc/core.c b/net/tipc/core.c index 23cb379a93d6..1e3981c9680b 100644 --- a/net/tipc/core.c +++ b/net/tipc/core.c @@ -44,6 +44,7 @@ #include "socket.h" #include "bcast.h" #include "node.h" +#include "crypto.h" #include <linux/module.h> @@ -68,6 +69,10 @@ static int __net_init tipc_init_net(struct net *net) INIT_LIST_HEAD(&tn->node_list); spin_lock_init(&tn->node_list_lock); + err = tipc_crypto_start(&tn->crypto_tx, net, NULL); + if (err) + goto out_crypto; + err = tipc_sk_rht_init(net); if (err) goto out_sk_rht; @@ -93,16 +98,21 @@ static int __net_init tipc_init_net(struct net *net) out_nametbl: tipc_sk_rht_destroy(net); out_sk_rht: + tipc_crypto_stop(&tn->crypto_tx); +out_crypto: return err; } static void __net_exit tipc_exit_net(struct net *net) { + struct tipc_net *tn = tipc_net(net); + tipc_detach_loopback(net); tipc_net_stop(net); tipc_bcast_stop(net); 
tipc_nametbl_stop(net); tipc_sk_rht_destroy(net); + tipc_crypto_stop(&tn->crypto_tx); } static struct pernet_operations tipc_net_ops = { diff --git a/net/tipc/core.h b/net/tipc/core.h index 60d829581068..0ee2aa1bba2f 100644 --- a/net/tipc/core.h +++ b/net/tipc/core.h @@ -67,6 +67,7 @@ struct tipc_link; struct tipc_name_table; struct tipc_topsrv; struct tipc_monitor; +struct tipc_crypto; #define TIPC_MOD_VER "2.0.0" @@ -128,6 +129,9 @@ struct tipc_net { /* Tracing of node internal messages */ struct packet_type loopback_pt; + + /* TX crypto handler */ + struct tipc_crypto *crypto_tx; }; static inline struct tipc_net *tipc_net(struct net *net) diff --git a/net/tipc/crypto.c b/net/tipc/crypto.c new file mode 100644 index 000000000000..101a812a0f20 --- /dev/null +++ b/net/tipc/crypto.c @@ -0,0 +1,1986 @@ +// SPDX-License-Identifier: GPL-2.0 +/** + * net/tipc/crypto.c: TIPC crypto for key handling & packet en/decryption + * + * Copyright (c) 2019, Ericsson AB + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the names of the copyright holders nor the names of its + * contributors may be used to endorse or promote products derived from + * this software without specific prior written permission. + * + * Alternatively, this software may be distributed under the terms of the + * GNU General Public License ("GPL") version 2 as published by the Free + * Software Foundation. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + */ + +#include <crypto/aead.h> +#include <crypto/aes.h> +#include "crypto.h" + +#define TIPC_TX_PROBE_LIM msecs_to_jiffies(1000) /* > 1s */ +#define TIPC_TX_LASTING_LIM msecs_to_jiffies(120000) /* 2 mins */ +#define TIPC_RX_ACTIVE_LIM msecs_to_jiffies(3000) /* 3s */ +#define TIPC_RX_PASSIVE_LIM msecs_to_jiffies(180000) /* 3 mins */ +#define TIPC_MAX_TFMS_DEF 10 +#define TIPC_MAX_TFMS_LIM 1000 + +/** + * TIPC Key ids + */ +enum { + KEY_UNUSED = 0, + KEY_MIN, + KEY_1 = KEY_MIN, + KEY_2, + KEY_3, + KEY_MAX = KEY_3, +}; + +/** + * TIPC Crypto statistics + */ +enum { + STAT_OK, + STAT_NOK, + STAT_ASYNC, + STAT_ASYNC_OK, + STAT_ASYNC_NOK, + STAT_BADKEYS, /* tx only */ + STAT_BADMSGS = STAT_BADKEYS, /* rx only */ + STAT_NOKEYS, + STAT_SWITCHES, + + MAX_STATS, +}; + +/* TIPC crypto statistics' header */ +static const char *hstats[MAX_STATS] = {"ok", "nok", "async", "async_ok", + "async_nok", "badmsgs", "nokeys", + "switches"}; + +/* Max TFMs number per key */ +int sysctl_tipc_max_tfms __read_mostly = TIPC_MAX_TFMS_DEF; + +/** + * struct tipc_key - TIPC keys' status indicator + * + * 7 6 5 4 3 2 1 0 + * +-----+-----+-----+-----+-----+-----+-----+-----+ + * key: | (reserved)|passive 
idx| active idx|pending idx| + * +-----+-----+-----+-----+-----+-----+-----+-----+ + */ +struct tipc_key { +#define KEY_BITS (2) +#define KEY_MASK ((1 << KEY_BITS) - 1) + union { + struct { +#if defined(__LITTLE_ENDIAN_BITFIELD) + u8 pending:2, + active:2, + passive:2, /* rx only */ + reserved:2; +#elif defined(__BIG_ENDIAN_BITFIELD) + u8 reserved:2, + passive:2, /* rx only */ + active:2, + pending:2; +#else +#error "Please fix <asm/byteorder.h>" +#endif + } __packed; + u8 keys; + }; +}; + +/** + * struct tipc_tfm - TIPC TFM structure to form a list of TFMs + */ +struct tipc_tfm { + struct crypto_aead *tfm; + struct list_head list; +}; + +/** + * struct tipc_aead - TIPC AEAD key structure + * @tfm_entry: per-cpu pointer to one entry in TFM list + * @crypto: TIPC crypto owns this key + * @cloned: reference to the source key in case cloning + * @users: the number of the key users (TX/RX) + * @salt: the key's SALT value + * @authsize: authentication tag size (max = 16) + * @mode: crypto mode is applied to the key + * @hint[]: a hint for user key + * @rcu: struct rcu_head + * @seqno: the key seqno (cluster scope) + * @refcnt: the key reference counter + */ +struct tipc_aead { +#define TIPC_AEAD_HINT_LEN (5) + struct tipc_tfm * __percpu *tfm_entry; + struct tipc_crypto *crypto; + struct tipc_aead *cloned; + atomic_t users; + u32 salt; + u8 authsize; + u8 mode; + char hint[TIPC_AEAD_HINT_LEN + 1]; + struct rcu_head rcu; + + atomic64_t seqno ____cacheline_aligned; + refcount_t refcnt ____cacheline_aligned; + +} ____cacheline_aligned; + +/** + * struct tipc_crypto_stats - TIPC Crypto statistics + */ +struct tipc_crypto_stats { + unsigned int stat[MAX_STATS]; +}; + +/** + * struct tipc_crypto - TIPC TX/RX crypto structure + * @net: struct net + * @node: TIPC node (RX) + * @aead: array of pointers to AEAD keys for encryption/decryption + * @peer_rx_active: replicated peer RX active key index + * @key: the key states + * @working: the crypto is working or not + * @stats: the 
crypto statistics + * @sndnxt: the per-peer sndnxt (TX) + * @timer1: general timer 1 (jiffies) + * @timer2: general timer 1 (jiffies) + * @lock: tipc_key lock + */ +struct tipc_crypto { + struct net *net; + struct tipc_node *node; + struct tipc_aead __rcu *aead[KEY_MAX + 1]; /* key[0] is UNUSED */ + atomic_t peer_rx_active; + struct tipc_key key; + u8 working:1; + struct tipc_crypto_stats __percpu *stats; + + atomic64_t sndnxt ____cacheline_aligned; + unsigned long timer1; + unsigned long timer2; + spinlock_t lock; /* crypto lock */ + +} ____cacheline_aligned; + +/* struct tipc_crypto_tx_ctx - TX context for callbacks */ +struct tipc_crypto_tx_ctx { + struct tipc_aead *aead; + struct tipc_bearer *bearer; + struct tipc_media_addr dst; +}; + +/* struct tipc_crypto_rx_ctx - RX context for callbacks */ +struct tipc_crypto_rx_ctx { + struct tipc_aead *aead; + struct tipc_bearer *bearer; +}; + +static struct tipc_aead *tipc_aead_get(struct tipc_aead __rcu *aead); +static inline void tipc_aead_put(struct tipc_aead *aead); +static void tipc_aead_free(struct rcu_head *rp); +static int tipc_aead_users(struct tipc_aead __rcu *aead); +static void tipc_aead_users_inc(struct tipc_aead __rcu *aead, int lim); +static void tipc_aead_users_dec(struct tipc_aead __rcu *aead, int lim); +static void tipc_aead_users_set(struct tipc_aead __rcu *aead, int val); +static struct crypto_aead *tipc_aead_tfm_next(struct tipc_aead *aead); +static int tipc_aead_init(struct tipc_aead **aead, struct tipc_aead_key *ukey, + u8 mode); +static int tipc_aead_clone(struct tipc_aead **dst, struct tipc_aead *src); +static void *tipc_aead_mem_alloc(struct crypto_aead *tfm, + unsigned int crypto_ctx_size, + u8 **iv, struct aead_request **req, + struct scatterlist **sg, int nsg); +static int tipc_aead_encrypt(struct tipc_aead *aead, struct sk_buff *skb, + struct tipc_bearer *b, + struct tipc_media_addr *dst, + struct tipc_node *__dnode); +static void tipc_aead_encrypt_done(struct crypto_async_request *base, 
int err); +static int tipc_aead_decrypt(struct net *net, struct tipc_aead *aead, + struct sk_buff *skb, struct tipc_bearer *b); +static void tipc_aead_decrypt_done(struct crypto_async_request *base, int err); +static inline int tipc_ehdr_size(struct tipc_ehdr *ehdr); +static int tipc_ehdr_build(struct net *net, struct tipc_aead *aead, + u8 tx_key, struct sk_buff *skb, + struct tipc_crypto *__rx); +static inline void tipc_crypto_key_set_state(struct tipc_crypto *c, + u8 new_passive, + u8 new_active, + u8 new_pending); +static int tipc_crypto_key_attach(struct tipc_crypto *c, + struct tipc_aead *aead, u8 pos); +static bool tipc_crypto_key_try_align(struct tipc_crypto *rx, u8 new_pending); +static struct tipc_aead *tipc_crypto_key_pick_tx(struct tipc_crypto *tx, + struct tipc_crypto *rx, + struct sk_buff *skb); +static void tipc_crypto_key_synch(struct tipc_crypto *rx, u8 new_rx_active, + struct tipc_msg *hdr); +static int tipc_crypto_key_revoke(struct net *net, u8 tx_key); +static void tipc_crypto_rcv_complete(struct net *net, struct tipc_aead *aead, + struct tipc_bearer *b, + struct sk_buff **skb, int err); +static void tipc_crypto_do_cmd(struct net *net, int cmd); +static char *tipc_crypto_key_dump(struct tipc_crypto *c, char *buf); +#ifdef TIPC_CRYPTO_DEBUG +static char *tipc_key_change_dump(struct tipc_key old, struct tipc_key new, + char *buf); +#endif + +#define key_next(cur) ((cur) % KEY_MAX + 1) + +#define tipc_aead_rcu_ptr(rcu_ptr, lock) \ + rcu_dereference_protected((rcu_ptr), lockdep_is_held(lock)) + +#define tipc_aead_rcu_swap(rcu_ptr, ptr, lock) \ + rcu_swap_protected((rcu_ptr), (ptr), lockdep_is_held(lock)) + +#define tipc_aead_rcu_replace(rcu_ptr, ptr, lock) \ +do { \ + typeof(rcu_ptr) __tmp = rcu_dereference_protected((rcu_ptr), \ + lockdep_is_held(lock)); \ + rcu_assign_pointer((rcu_ptr), (ptr)); \ + tipc_aead_put(__tmp); \ +} while (0) + +#define tipc_crypto_key_detach(rcu_ptr, lock) \ + tipc_aead_rcu_replace((rcu_ptr), NULL, lock) + +/** + * 
tipc_aead_key_validate - Validate a AEAD user key + */ +int tipc_aead_key_validate(struct tipc_aead_key *ukey) +{ + int keylen; + + /* Check if algorithm exists */ + if (unlikely(!crypto_has_alg(ukey->alg_name, 0, 0))) { + pr_info("Not found cipher: \"%s\"!\n", ukey->alg_name); + return -ENODEV; + } + + /* Currently, we only support the "gcm(aes)" cipher algorithm (or its + * templates but with 12-byte IV length) + */ + if (strcmp(ukey->alg_name, "gcm(aes)")) + return -ENOTSUPP; + + /* Check if key size is correct */ + keylen = ukey->keylen - TIPC_AES_GCM_SALT_SIZE; + if (unlikely(keylen != TIPC_AES_GCM_KEY_SIZE_128 && + keylen != TIPC_AES_GCM_KEY_SIZE_192 && + keylen != TIPC_AES_GCM_KEY_SIZE_256)) + return -EINVAL; + + return 0; +} + +static struct tipc_aead *tipc_aead_get(struct tipc_aead __rcu *aead) +{ + struct tipc_aead *tmp; + + rcu_read_lock(); + tmp = rcu_dereference(aead); + if (unlikely(!tmp || !refcount_inc_not_zero(&tmp->refcnt))) + tmp = NULL; + rcu_read_unlock(); + + return tmp; +} + +static inline void tipc_aead_put(struct tipc_aead *aead) +{ + if (aead && refcount_dec_and_test(&aead->refcnt)) + call_rcu(&aead->rcu, tipc_aead_free); +} + +/** + * tipc_aead_free - Release AEAD key incl. 
all the TFMs in the list + * @rp: rcu head pointer + */ +static void tipc_aead_free(struct rcu_head *rp) +{ + struct tipc_aead *aead = container_of(rp, struct tipc_aead, rcu); + struct tipc_tfm *tfm_entry, *head, *tmp; + + if (aead->cloned) { + tipc_aead_put(aead->cloned); + } else { + head = *this_cpu_ptr(aead->tfm_entry); + list_for_each_entry_safe(tfm_entry, tmp, &head->list, list) { + crypto_free_aead(tfm_entry->tfm); + list_del(&tfm_entry->list); + kfree(tfm_entry); + } + /* Free the head */ + crypto_free_aead(head->tfm); + list_del(&head->list); + kfree(head); + } + free_percpu(aead->tfm_entry); + kfree(aead); +} + +static int tipc_aead_users(struct tipc_aead __rcu *aead) +{ + struct tipc_aead *tmp; + int users = 0; + + rcu_read_lock(); + tmp = rcu_dereference(aead); + if (tmp) + users = atomic_read(&tmp->users); + rcu_read_unlock(); + + return users; +} + +static void tipc_aead_users_inc(struct tipc_aead __rcu *aead, int lim) +{ + struct tipc_aead *tmp; + + rcu_read_lock(); + tmp = rcu_dereference(aead); + if (tmp) + atomic_add_unless(&tmp->users, 1, lim); + rcu_read_unlock(); +} + +static void tipc_aead_users_dec(struct tipc_aead __rcu *aead, int lim) +{ + struct tipc_aead *tmp; + + rcu_read_lock(); + tmp = rcu_dereference(aead); + if (tmp) + atomic_add_unless(&rcu_dereference(aead)->users, -1, lim); + rcu_read_unlock(); +} + +static void tipc_aead_users_set(struct tipc_aead __rcu *aead, int val) +{ + struct tipc_aead *tmp; + int cur; + + rcu_read_lock(); + tmp = rcu_dereference(aead); + if (tmp) { + do { + cur = atomic_read(&tmp->users); + if (cur == val) + break; + } while (atomic_cmpxchg(&tmp->users, cur, val) != cur); + } + rcu_read_unlock(); +} + +/** + * tipc_aead_tfm_next - Move TFM entry to the next one in list and return it + */ +static struct crypto_aead *tipc_aead_tfm_next(struct tipc_aead *aead) +{ + struct tipc_tfm **tfm_entry = this_cpu_ptr(aead->tfm_entry); + + *tfm_entry = list_next_entry(*tfm_entry, list); + return (*tfm_entry)->tfm; +} + 
+/** + * tipc_aead_init - Initiate TIPC AEAD + * @aead: returned new TIPC AEAD key handle pointer + * @ukey: pointer to user key data + * @mode: the key mode + * + * Allocate a (list of) new cipher transformation (TFM) with the specific user + * key data if valid. The number of the allocated TFMs can be set via the sysfs + * "net/tipc/max_tfms" first. + * Also, all the other AEAD data are also initialized. + * + * Return: 0 if the initiation is successful, otherwise: < 0 + */ +static int tipc_aead_init(struct tipc_aead **aead, struct tipc_aead_key *ukey, + u8 mode) +{ + struct tipc_tfm *tfm_entry, *head; + struct crypto_aead *tfm; + struct tipc_aead *tmp; + int keylen, err, cpu; + int tfm_cnt = 0; + + if (unlikely(*aead)) + return -EEXIST; + + /* Allocate a new AEAD */ + tmp = kzalloc(sizeof(*tmp), GFP_ATOMIC); + if (unlikely(!tmp)) + return -ENOMEM; + + /* The key consists of two parts: [AES-KEY][SALT] */ + keylen = ukey->keylen - TIPC_AES_GCM_SALT_SIZE; + + /* Allocate per-cpu TFM entry pointer */ + tmp->tfm_entry = alloc_percpu(struct tipc_tfm *); + if (!tmp->tfm_entry) { + kzfree(tmp); + return -ENOMEM; + } + + /* Make a list of TFMs with the user key data */ + do { + tfm = crypto_alloc_aead(ukey->alg_name, 0, 0); + if (IS_ERR(tfm)) { + err = PTR_ERR(tfm); + break; + } + + if (unlikely(!tfm_cnt && + crypto_aead_ivsize(tfm) != TIPC_AES_GCM_IV_SIZE)) { + crypto_free_aead(tfm); + err = -ENOTSUPP; + break; + } + err |= crypto_aead_setauthsize(tfm, TIPC_AES_GCM_TAG_SIZE); + err |= crypto_aead_setkey(tfm, ukey->key, keylen); + if (unlikely(err)) { + crypto_free_aead(tfm); + break; + } + + tfm_entry = kmalloc(sizeof(*tfm_entry), GFP_KERNEL); + if (unlikely(!tfm_entry)) { + crypto_free_aead(tfm); + err = -ENOMEM; + break; + } + INIT_LIST_HEAD(&tfm_entry->list); + tfm_entry->tfm = tfm; + + /* First entry? 
*/ + if (!tfm_cnt) { + head = tfm_entry; + for_each_possible_cpu(cpu) { + *per_cpu_ptr(tmp->tfm_entry, cpu) = head; + } + } else { + list_add_tail(&tfm_entry->list, &head->list); + } + + } while (++tfm_cnt < sysctl_tipc_max_tfms); + + /* Not any TFM is allocated? */ + if (!tfm_cnt) { + free_percpu(tmp->tfm_entry); + kzfree(tmp); + return err; + } + + /* Copy some chars from the user key as a hint */ + memcpy(tmp->hint, ukey->key, TIPC_AEAD_HINT_LEN); + tmp->hint[TIPC_AEAD_HINT_LEN] = '\0'; + + /* Initialize the other data */ + tmp->mode = mode; + tmp->cloned = NULL; + tmp->authsize = TIPC_AES_GCM_TAG_SIZE; + memcpy(&tmp->salt, ukey->key + keylen, TIPC_AES_GCM_SALT_SIZE); + atomic_set(&tmp->users, 0); + atomic64_set(&tmp->seqno, 0); + refcount_set(&tmp->refcnt, 1); + + *aead = tmp; + return 0; +} + +/** + * tipc_aead_clone - Clone a TIPC AEAD key + * @dst: dest key for the cloning + * @src: source key to clone from + * + * Make a "copy" of the source AEAD key data to the dest, the TFMs list is + * common for the keys. + * A reference to the source is hold in the "cloned" pointer for the later + * freeing purposes. + * + * Note: this must be done in cluster-key mode only! 
+ * Return: 0 in case of success, otherwise < 0 + */ +static int tipc_aead_clone(struct tipc_aead **dst, struct tipc_aead *src) +{ + struct tipc_aead *aead; + int cpu; + + if (!src) + return -ENOKEY; + + if (src->mode != CLUSTER_KEY) + return -EINVAL; + + if (unlikely(*dst)) + return -EEXIST; + + aead = kzalloc(sizeof(*aead), GFP_ATOMIC); + if (unlikely(!aead)) + return -ENOMEM; + + aead->tfm_entry = alloc_percpu_gfp(struct tipc_tfm *, GFP_ATOMIC); + if (unlikely(!aead->tfm_entry)) { + kzfree(aead); + return -ENOMEM; + } + + for_each_possible_cpu(cpu) { + *per_cpu_ptr(aead->tfm_entry, cpu) = + *per_cpu_ptr(src->tfm_entry, cpu); + } + + memcpy(aead->hint, src->hint, sizeof(src->hint)); + aead->mode = src->mode; + aead->salt = src->salt; + aead->authsize = src->authsize; + atomic_set(&aead->users, 0); + atomic64_set(&aead->seqno, 0); + refcount_set(&aead->refcnt, 1); + + WARN_ON(!refcount_inc_not_zero(&src->refcnt)); + aead->cloned = src; + + *dst = aead; + return 0; +} + +/** + * tipc_aead_mem_alloc - Allocate memory for AEAD request operations + * @tfm: cipher handle to be registered with the request + * @crypto_ctx_size: size of crypto context for callback + * @iv: returned pointer to IV data + * @req: returned pointer to AEAD request data + * @sg: returned pointer to SG lists + * @nsg: number of SG lists to be allocated + * + * Allocate memory to store the crypto context data, AEAD request, IV and SG + * lists, the memory layout is as follows: + * crypto_ctx || iv || aead_req || sg[] + * + * Return: the pointer to the memory areas in case of success, otherwise NULL + */ +static void *tipc_aead_mem_alloc(struct crypto_aead *tfm, + unsigned int crypto_ctx_size, + u8 **iv, struct aead_request **req, + struct scatterlist **sg, int nsg) +{ + unsigned int iv_size, req_size; + unsigned int len; + u8 *mem; + + iv_size = crypto_aead_ivsize(tfm); + req_size = sizeof(**req) + crypto_aead_reqsize(tfm); + + len = crypto_ctx_size; + len += iv_size; + len += 
crypto_aead_alignmask(tfm) & ~(crypto_tfm_ctx_alignment() - 1); + len = ALIGN(len, crypto_tfm_ctx_alignment()); + len += req_size; + len = ALIGN(len, __alignof__(struct scatterlist)); + len += nsg * sizeof(**sg); + + mem = kmalloc(len, GFP_ATOMIC); + if (!mem) + return NULL; + + *iv = (u8 *)PTR_ALIGN(mem + crypto_ctx_size, + crypto_aead_alignmask(tfm) + 1); + *req = (struct aead_request *)PTR_ALIGN(*iv + iv_size, + crypto_tfm_ctx_alignment()); + *sg = (struct scatterlist *)PTR_ALIGN((u8 *)*req + req_size, + __alignof__(struct scatterlist)); + + return (void *)mem; +} + +/** + * tipc_aead_encrypt - Encrypt a message + * @aead: TIPC AEAD key for the message encryption + * @skb: the input/output skb + * @b: TIPC bearer where the message will be delivered after the encryption + * @dst: the destination media address + * @__dnode: TIPC dest node if "known" + * + * Return: + * 0 : if the encryption has completed + * -EINPROGRESS/-EBUSY : if a callback will be performed + * < 0 : the encryption has failed + */ +static int tipc_aead_encrypt(struct tipc_aead *aead, struct sk_buff *skb, + struct tipc_bearer *b, + struct tipc_media_addr *dst, + struct tipc_node *__dnode) +{ + struct crypto_aead *tfm = tipc_aead_tfm_next(aead); + struct tipc_crypto_tx_ctx *tx_ctx; + struct aead_request *req; + struct sk_buff *trailer; + struct scatterlist *sg; + struct tipc_ehdr *ehdr; + int ehsz, len, tailen, nsg, rc; + void *ctx; + u32 salt; + u8 *iv; + + /* Make sure message len at least 4-byte aligned */ + len = ALIGN(skb->len, 4); + tailen = len - skb->len + aead->authsize; + + /* Expand skb tail for authentication tag: + * As for simplicity, we'd have made sure skb having enough tailroom + * for authentication tag @skb allocation. Even when skb is nonlinear + * but there is no frag_list, it should be still fine! + * Otherwise, we must cow it to be a writable buffer with the tailroom. 
+ */ +#ifdef TIPC_CRYPTO_DEBUG + SKB_LINEAR_ASSERT(skb); + if (tailen > skb_tailroom(skb)) { + pr_warn("TX: skb tailroom is not enough: %d, requires: %d\n", + skb_tailroom(skb), tailen); + } +#endif + + if (unlikely(!skb_cloned(skb) && tailen <= skb_tailroom(skb))) { + nsg = 1; + trailer = skb; + } else { + /* TODO: We could avoid skb_cow_data() if skb has no frag_list + * e.g. by skb_fill_page_desc() to add another page to the skb + * with the wanted tailen... However, page skbs look not often, + * so take it easy now! + * Cloned skbs e.g. from link_xmit() seems no choice though :( + */ + nsg = skb_cow_data(skb, tailen, &trailer); + if (unlikely(nsg < 0)) { + pr_err("TX: skb_cow_data() returned %d\n", nsg); + return nsg; + } + } + + pskb_put(skb, trailer, tailen); + + /* Allocate memory for the AEAD operation */ + ctx = tipc_aead_mem_alloc(tfm, sizeof(*tx_ctx), &iv, &req, &sg, nsg); + if (unlikely(!ctx)) + return -ENOMEM; + TIPC_SKB_CB(skb)->crypto_ctx = ctx; + + /* Map skb to the sg lists */ + sg_init_table(sg, nsg); + rc = skb_to_sgvec(skb, sg, 0, skb->len); + if (unlikely(rc < 0)) { + pr_err("TX: skb_to_sgvec() returned %d, nsg %d!\n", rc, nsg); + goto exit; + } + + /* Prepare IV: [SALT (4 octets)][SEQNO (8 octets)] + * In case we're in cluster-key mode, SALT is varied by xor-ing with + * the source address (or w0 of id), otherwise with the dest address + * if dest is known. 
+ */ + ehdr = (struct tipc_ehdr *)skb->data; + salt = aead->salt; + if (aead->mode == CLUSTER_KEY) + salt ^= ehdr->addr; /* __be32 */ + else if (__dnode) + salt ^= tipc_node_get_addr(__dnode); + memcpy(iv, &salt, 4); + memcpy(iv + 4, (u8 *)&ehdr->seqno, 8); + + /* Prepare request */ + ehsz = tipc_ehdr_size(ehdr); + aead_request_set_tfm(req, tfm); + aead_request_set_ad(req, ehsz); + aead_request_set_crypt(req, sg, sg, len - ehsz, iv); + + /* Set callback function & data */ + aead_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG, + tipc_aead_encrypt_done, skb); + tx_ctx = (struct tipc_crypto_tx_ctx *)ctx; + tx_ctx->aead = aead; + tx_ctx->bearer = b; + memcpy(&tx_ctx->dst, dst, sizeof(*dst)); + + /* Hold bearer */ + if (unlikely(!tipc_bearer_hold(b))) { + rc = -ENODEV; + goto exit; + } + + /* Now, do encrypt */ + rc = crypto_aead_encrypt(req); + if (rc == -EINPROGRESS || rc == -EBUSY) + return rc; + + tipc_bearer_put(b); + +exit: + kfree(ctx); + TIPC_SKB_CB(skb)->crypto_ctx = NULL; + return rc; +} + +static void tipc_aead_encrypt_done(struct crypto_async_request *base, int err) +{ + struct sk_buff *skb = base->data; + struct tipc_crypto_tx_ctx *tx_ctx = TIPC_SKB_CB(skb)->crypto_ctx; + struct tipc_bearer *b = tx_ctx->bearer; + struct tipc_aead *aead = tx_ctx->aead; + struct tipc_crypto *tx = aead->crypto; + struct net *net = tx->net; + + switch (err) { + case 0: + this_cpu_inc(tx->stats->stat[STAT_ASYNC_OK]); + if (likely(test_bit(0, &b->up))) + b->media->send_msg(net, skb, b, &tx_ctx->dst); + else + kfree_skb(skb); + break; + case -EINPROGRESS: + return; + default: + this_cpu_inc(tx->stats->stat[STAT_ASYNC_NOK]); + kfree_skb(skb); + break; + } + + kfree(tx_ctx); + tipc_bearer_put(b); + tipc_aead_put(aead); +} + +/** + * tipc_aead_decrypt - Decrypt an encrypted message + * @net: struct net + * @aead: TIPC AEAD for the message decryption + * @skb: the input/output skb + * @b: TIPC bearer where the message has been received + * + * Return: + * 0 : if the decryption 
has completed + * -EINPROGRESS/-EBUSY : if a callback will be performed + * < 0 : the decryption has failed + */ +static int tipc_aead_decrypt(struct net *net, struct tipc_aead *aead, + struct sk_buff *skb, struct tipc_bearer *b) +{ + struct tipc_crypto_rx_ctx *rx_ctx; + struct aead_request *req; + struct crypto_aead *tfm; + struct sk_buff *unused; + struct scatterlist *sg; + struct tipc_ehdr *ehdr; + int ehsz, nsg, rc; + void *ctx; + u32 salt; + u8 *iv; + + if (unlikely(!aead)) + return -ENOKEY; + + /* Cow skb data if needed */ + if (likely(!skb_cloned(skb) && + (!skb_is_nonlinear(skb) || !skb_has_frag_list(skb)))) { + nsg = 1 + skb_shinfo(skb)->nr_frags; + } else { + nsg = skb_cow_data(skb, 0, &unused); + if (unlikely(nsg < 0)) { + pr_err("RX: skb_cow_data() returned %d\n", nsg); + return nsg; + } + } + + /* Allocate memory for the AEAD operation */ + tfm = tipc_aead_tfm_next(aead); + ctx = tipc_aead_mem_alloc(tfm, sizeof(*rx_ctx), &iv, &req, &sg, nsg); + if (unlikely(!ctx)) + return -ENOMEM; + TIPC_SKB_CB(skb)->crypto_ctx = ctx; + + /* Map skb to the sg lists */ + sg_init_table(sg, nsg); + rc = skb_to_sgvec(skb, sg, 0, skb->len); + if (unlikely(rc < 0)) { + pr_err("RX: skb_to_sgvec() returned %d, nsg %d\n", rc, nsg); + goto exit; + } + + /* Reconstruct IV: */ + ehdr = (struct tipc_ehdr *)skb->data; + salt = aead->salt; + if (aead->mode == CLUSTER_KEY) + salt ^= ehdr->addr; /* __be32 */ + else if (ehdr->destined) + salt ^= tipc_own_addr(net); + memcpy(iv, &salt, 4); + memcpy(iv + 4, (u8 *)&ehdr->seqno, 8); + + /* Prepare request */ + ehsz = tipc_ehdr_size(ehdr); + aead_request_set_tfm(req, tfm); + aead_request_set_ad(req, ehsz); + aead_request_set_crypt(req, sg, sg, skb->len - ehsz, iv); + + /* Set callback function & data */ + aead_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG, + tipc_aead_decrypt_done, skb); + rx_ctx = (struct tipc_crypto_rx_ctx *)ctx; + rx_ctx->aead = aead; + rx_ctx->bearer = b; + + /* Hold bearer */ + if 
(unlikely(!tipc_bearer_hold(b))) { + rc = -ENODEV; + goto exit; + } + + /* Now, do decrypt */ + rc = crypto_aead_decrypt(req); + if (rc == -EINPROGRESS || rc == -EBUSY) + return rc; + + tipc_bearer_put(b); + +exit: + kfree(ctx); + TIPC_SKB_CB(skb)->crypto_ctx = NULL; + return rc; +} + +static void tipc_aead_decrypt_done(struct crypto_async_request *base, int err) +{ + struct sk_buff *skb = base->data; + struct tipc_crypto_rx_ctx *rx_ctx = TIPC_SKB_CB(skb)->crypto_ctx; + struct tipc_bearer *b = rx_ctx->bearer; + struct tipc_aead *aead = rx_ctx->aead; + struct tipc_crypto_stats __percpu *stats = aead->crypto->stats; + struct net *net = aead->crypto->net; + + switch (err) { + case 0: + this_cpu_inc(stats->stat[STAT_ASYNC_OK]); + break; + case -EINPROGRESS: + return; + default: + this_cpu_inc(stats->stat[STAT_ASYNC_NOK]); + break; + } + + kfree(rx_ctx); + tipc_crypto_rcv_complete(net, aead, b, &skb, err); + if (likely(skb)) { + if (likely(test_bit(0, &b->up))) + tipc_rcv(net, skb, b); + else + kfree_skb(skb); + } + + tipc_bearer_put(b); +} + +static inline int tipc_ehdr_size(struct tipc_ehdr *ehdr) +{ + return (ehdr->user != LINK_CONFIG) ? 
EHDR_SIZE : EHDR_CFG_SIZE; +} + +/** + * tipc_ehdr_validate - Validate an encryption message + * @skb: the message buffer + * + * Returns "true" if this is a valid encryption message, otherwise "false" + */ +bool tipc_ehdr_validate(struct sk_buff *skb) +{ + struct tipc_ehdr *ehdr; + int ehsz; + + if (unlikely(!pskb_may_pull(skb, EHDR_MIN_SIZE))) + return false; + + ehdr = (struct tipc_ehdr *)skb->data; + if (unlikely(ehdr->version != TIPC_EVERSION)) + return false; + ehsz = tipc_ehdr_size(ehdr); + if (unlikely(!pskb_may_pull(skb, ehsz))) + return false; + if (unlikely(skb->len <= ehsz + TIPC_AES_GCM_TAG_SIZE)) + return false; + if (unlikely(!ehdr->tx_key)) + return false; + + return true; +} + +/** + * tipc_ehdr_build - Build TIPC encryption message header + * @net: struct net + * @aead: TX AEAD key to be used for the message encryption + * @tx_key: key id used for the message encryption + * @skb: input/output message skb + * @__rx: RX crypto handle if dest is "known" + * + * Return: the header size if the building is successful, otherwise < 0 + */ +static int tipc_ehdr_build(struct net *net, struct tipc_aead *aead, + u8 tx_key, struct sk_buff *skb, + struct tipc_crypto *__rx) +{ + struct tipc_msg *hdr = buf_msg(skb); + struct tipc_ehdr *ehdr; + u32 user = msg_user(hdr); + u64 seqno; + int ehsz; + + /* Make room for encryption header */ + ehsz = (user != LINK_CONFIG) ? EHDR_SIZE : EHDR_CFG_SIZE; + WARN_ON(skb_headroom(skb) < ehsz); + ehdr = (struct tipc_ehdr *)skb_push(skb, ehsz); + + /* Obtain a seqno first: + * Use the key seqno (= cluster wise) if dest is unknown or we're in + * cluster key mode, otherwise it's better for a per-peer seqno! 
+ */ + if (!__rx || aead->mode == CLUSTER_KEY) + seqno = atomic64_inc_return(&aead->seqno); + else + seqno = atomic64_inc_return(&__rx->sndnxt); + + /* Revoke the key if seqno is wrapped around */ + if (unlikely(!seqno)) + return tipc_crypto_key_revoke(net, tx_key); + + /* Word 1-2 */ + ehdr->seqno = cpu_to_be64(seqno); + + /* Words 0, 3- */ + ehdr->version = TIPC_EVERSION; + ehdr->user = 0; + ehdr->keepalive = 0; + ehdr->tx_key = tx_key; + ehdr->destined = (__rx) ? 1 : 0; + ehdr->rx_key_active = (__rx) ? __rx->key.active : 0; + ehdr->reserved_1 = 0; + ehdr->reserved_2 = 0; + + switch (user) { + case LINK_CONFIG: + ehdr->user = LINK_CONFIG; + memcpy(ehdr->id, tipc_own_id(net), NODE_ID_LEN); + break; + default: + if (user == LINK_PROTOCOL && msg_type(hdr) == STATE_MSG) { + ehdr->user = LINK_PROTOCOL; + ehdr->keepalive = msg_is_keepalive(hdr); + } + ehdr->addr = hdr->hdr[3]; + break; + } + + return ehsz; +} + +static inline void tipc_crypto_key_set_state(struct tipc_crypto *c, + u8 new_passive, + u8 new_active, + u8 new_pending) +{ +#ifdef TIPC_CRYPTO_DEBUG + struct tipc_key old = c->key; + char buf[32]; +#endif + + c->key.keys = ((new_passive & KEY_MASK) << (KEY_BITS * 2)) | + ((new_active & KEY_MASK) << (KEY_BITS)) | + ((new_pending & KEY_MASK)); + +#ifdef TIPC_CRYPTO_DEBUG + pr_info("%s(%s): key changing %s ::%pS\n", + (c->node) ? "RX" : "TX", + (c->node) ? tipc_node_get_id_str(c->node) : + tipc_own_id_string(c->net), + tipc_key_change_dump(old, c->key, buf), + __builtin_return_address(0)); +#endif +} + +/** + * tipc_crypto_key_init - Initiate a new user / AEAD key + * @c: TIPC crypto to which new key is attached + * @ukey: the user key + * @mode: the key mode (CLUSTER_KEY or PER_NODE_KEY) + * + * A new TIPC AEAD key will be allocated and initiated with the specified user + * key, then attached to the TIPC crypto. 
+ * + * Return: new key id in case of success, otherwise: < 0 + */ +int tipc_crypto_key_init(struct tipc_crypto *c, struct tipc_aead_key *ukey, + u8 mode) +{ + struct tipc_aead *aead = NULL; + int rc = 0; + + /* Initiate with the new user key */ + rc = tipc_aead_init(&aead, ukey, mode); + + /* Attach it to the crypto */ + if (likely(!rc)) { + rc = tipc_crypto_key_attach(c, aead, 0); + if (rc < 0) + tipc_aead_free(&aead->rcu); + } + + pr_info("%s(%s): key initiating, rc %d!\n", + (c->node) ? "RX" : "TX", + (c->node) ? tipc_node_get_id_str(c->node) : + tipc_own_id_string(c->net), + rc); + + return rc; +} + +/** + * tipc_crypto_key_attach - Attach a new AEAD key to TIPC crypto + * @c: TIPC crypto to which the new AEAD key is attached + * @aead: the new AEAD key pointer + * @pos: desired slot in the crypto key array, = 0 if any! + * + * Return: new key id in case of success, otherwise: -EBUSY + */ +static int tipc_crypto_key_attach(struct tipc_crypto *c, + struct tipc_aead *aead, u8 pos) +{ + u8 new_pending, new_passive, new_key; + struct tipc_key key; + int rc = -EBUSY; + + spin_lock_bh(&c->lock); + key = c->key; + if (key.active && key.passive) + goto exit; + if (key.passive && !tipc_aead_users(c->aead[key.passive])) + goto exit; + if (key.pending) { + if (pos) + goto exit; + if (tipc_aead_users(c->aead[key.pending]) > 0) + goto exit; + /* Replace it */ + new_pending = key.pending; + new_passive = key.passive; + new_key = new_pending; + } else { + if (pos) { + if (key.active && pos != key_next(key.active)) { + new_pending = key.pending; + new_passive = pos; + new_key = new_passive; + goto attach; + } else if (!key.active && !key.passive) { + new_pending = pos; + new_passive = key.passive; + new_key = new_pending; + goto attach; + } + } + new_pending = key_next(key.active ?: key.passive); + new_passive = key.passive; + new_key = new_pending; + } + +attach: + aead->crypto = c; + tipc_crypto_key_set_state(c, new_passive, key.active, new_pending); + 
tipc_aead_rcu_replace(c->aead[new_key], aead, &c->lock); + + c->working = 1; + c->timer1 = jiffies; + c->timer2 = jiffies; + rc = new_key; + +exit: + spin_unlock_bh(&c->lock); + return rc; +} + +void tipc_crypto_key_flush(struct tipc_crypto *c) +{ + int k; + + spin_lock_bh(&c->lock); + c->working = 0; + tipc_crypto_key_set_state(c, 0, 0, 0); + for (k = KEY_MIN; k <= KEY_MAX; k++) + tipc_crypto_key_detach(c->aead[k], &c->lock); + atomic_set(&c->peer_rx_active, 0); + atomic64_set(&c->sndnxt, 0); + spin_unlock_bh(&c->lock); +} + +/** + * tipc_crypto_key_try_align - Align RX keys if possible + * @rx: RX crypto handle + * @new_pending: new pending slot if aligned (= TX key from peer) + * + * Peer has used an unknown key slot, this only happens when peer has left and + * rejoined, or we are a newcomer. + * That means, there must be no active key but a pending key at unaligned slot. + * If so, we try to move the pending key to the new slot. + * Note: A potential passive key can exist, it will be shifted correspondingly! + * + * Return: "true" if key is successfully aligned, otherwise "false" + */ +static bool tipc_crypto_key_try_align(struct tipc_crypto *rx, u8 new_pending) +{ + struct tipc_aead *tmp1, *tmp2 = NULL; + struct tipc_key key; + bool aligned = false; + u8 new_passive = 0; + int x; + + spin_lock(&rx->lock); + key = rx->key; + if (key.pending == new_pending) { + aligned = true; + goto exit; + } + if (key.active) + goto exit; + if (!key.pending) + goto exit; + if (tipc_aead_users(rx->aead[key.pending]) > 0) + goto exit; + + /* Try to "isolate" this pending key first */ + tmp1 = tipc_aead_rcu_ptr(rx->aead[key.pending], &rx->lock); + if (!refcount_dec_if_one(&tmp1->refcnt)) + goto exit; + rcu_assign_pointer(rx->aead[key.pending], NULL); + + /* Move passive key if any */ + if (key.passive) { + tipc_aead_rcu_swap(rx->aead[key.passive], tmp2, &rx->lock); + x = (key.passive - key.pending + new_pending) % KEY_MAX; + new_passive = (x <= 0) ? 
x + KEY_MAX : x; + } + + /* Re-allocate the key(s) */ + tipc_crypto_key_set_state(rx, new_passive, 0, new_pending); + rcu_assign_pointer(rx->aead[new_pending], tmp1); + if (new_passive) + rcu_assign_pointer(rx->aead[new_passive], tmp2); + refcount_set(&tmp1->refcnt, 1); + aligned = true; + pr_info("RX(%s): key is aligned!\n", tipc_node_get_id_str(rx->node)); + +exit: + spin_unlock(&rx->lock); + return aligned; +} + +/** + * tipc_crypto_key_pick_tx - Pick one TX key for message decryption + * @tx: TX crypto handle + * @rx: RX crypto handle (can be NULL) + * @skb: the message skb which will be decrypted later + * + * This function looks up the existing TX keys and pick one which is suitable + * for the message decryption, that must be a cluster key and not used before + * on the same message (i.e. recursive). + * + * Return: the TX AEAD key handle in case of success, otherwise NULL + */ +static struct tipc_aead *tipc_crypto_key_pick_tx(struct tipc_crypto *tx, + struct tipc_crypto *rx, + struct sk_buff *skb) +{ + struct tipc_skb_cb *skb_cb = TIPC_SKB_CB(skb); + struct tipc_aead *aead = NULL; + struct tipc_key key = tx->key; + u8 k, i = 0; + + /* Initialize data if not yet */ + if (!skb_cb->tx_clone_deferred) { + skb_cb->tx_clone_deferred = 1; + memset(&skb_cb->tx_clone_ctx, 0, sizeof(skb_cb->tx_clone_ctx)); + } + + skb_cb->tx_clone_ctx.rx = rx; + if (++skb_cb->tx_clone_ctx.recurs > 2) + return NULL; + + /* Pick one TX key */ + spin_lock(&tx->lock); + do { + k = (i == 0) ? key.pending : + ((i == 1) ? 
key.active : key.passive); + if (!k) + continue; + aead = tipc_aead_rcu_ptr(tx->aead[k], &tx->lock); + if (!aead) + continue; + if (aead->mode != CLUSTER_KEY || + aead == skb_cb->tx_clone_ctx.last) { + aead = NULL; + continue; + } + /* Ok, found one cluster key */ + skb_cb->tx_clone_ctx.last = aead; + WARN_ON(skb->next); + skb->next = skb_clone(skb, GFP_ATOMIC); + if (unlikely(!skb->next)) + pr_warn("Failed to clone skb for next round if any\n"); + WARN_ON(!refcount_inc_not_zero(&aead->refcnt)); + break; + } while (++i < 3); + spin_unlock(&tx->lock); + + return aead; +} + +/** + * tipc_crypto_key_synch: Synch own key data according to peer key status + * @rx: RX crypto handle + * @new_rx_active: latest RX active key from peer + * @hdr: TIPCv2 message + * + * This function updates the peer node related data as the peer RX active key + * has changed, so the number of TX keys' users on this node are increased and + * decreased correspondingly. + * + * The "per-peer" sndnxt is also reset when the peer key has switched. 
+ */ +static void tipc_crypto_key_synch(struct tipc_crypto *rx, u8 new_rx_active, + struct tipc_msg *hdr) +{ + struct net *net = rx->net; + struct tipc_crypto *tx = tipc_net(net)->crypto_tx; + u8 cur_rx_active; + + /* TX might be even not ready yet */ + if (unlikely(!tx->key.active && !tx->key.pending)) + return; + + cur_rx_active = atomic_read(&rx->peer_rx_active); + if (likely(cur_rx_active == new_rx_active)) + return; + + /* Make sure this message destined for this node */ + if (unlikely(msg_short(hdr) || + msg_destnode(hdr) != tipc_own_addr(net))) + return; + + /* Peer RX active key has changed, try to update owns' & TX users */ + if (atomic_cmpxchg(&rx->peer_rx_active, + cur_rx_active, + new_rx_active) == cur_rx_active) { + if (new_rx_active) + tipc_aead_users_inc(tx->aead[new_rx_active], INT_MAX); + if (cur_rx_active) + tipc_aead_users_dec(tx->aead[cur_rx_active], 0); + + atomic64_set(&rx->sndnxt, 0); + /* Mark the point TX key users changed */ + tx->timer1 = jiffies; + +#ifdef TIPC_CRYPTO_DEBUG + pr_info("TX(%s): key users changed %d-- %d++, peer RX(%s)\n", + tipc_own_id_string(net), cur_rx_active, + new_rx_active, tipc_node_get_id_str(rx->node)); +#endif + } +} + +static int tipc_crypto_key_revoke(struct net *net, u8 tx_key) +{ + struct tipc_crypto *tx = tipc_net(net)->crypto_tx; + struct tipc_key key; + + spin_lock(&tx->lock); + key = tx->key; + WARN_ON(!key.active || tx_key != key.active); + + /* Free the active key */ + tipc_crypto_key_set_state(tx, key.passive, 0, key.pending); + tipc_crypto_key_detach(tx->aead[key.active], &tx->lock); + spin_unlock(&tx->lock); + + pr_warn("TX(%s): key is revoked!\n", tipc_own_id_string(net)); + return -EKEYREVOKED; +} + +int tipc_crypto_start(struct tipc_crypto **crypto, struct net *net, + struct tipc_node *node) +{ + struct tipc_crypto *c; + + if (*crypto) + return -EEXIST; + + /* Allocate crypto */ + c = kzalloc(sizeof(*c), GFP_ATOMIC); + if (!c) + return -ENOMEM; + + /* Allocate statistic structure */ + c->stats = 
alloc_percpu_gfp(struct tipc_crypto_stats, GFP_ATOMIC); + if (!c->stats) { + kzfree(c); + return -ENOMEM; + } + + c->working = 0; + c->net = net; + c->node = node; + tipc_crypto_key_set_state(c, 0, 0, 0); + atomic_set(&c->peer_rx_active, 0); + atomic64_set(&c->sndnxt, 0); + c->timer1 = jiffies; + c->timer2 = jiffies; + spin_lock_init(&c->lock); + *crypto = c; + + return 0; +} + +void tipc_crypto_stop(struct tipc_crypto **crypto) +{ + struct tipc_crypto *c, *tx, *rx; + bool is_rx; + u8 k; + + if (!*crypto) + return; + + rcu_read_lock(); + /* RX stopping? => decrease TX key users if any */ + is_rx = !!((*crypto)->node); + if (is_rx) { + rx = *crypto; + tx = tipc_net(rx->net)->crypto_tx; + k = atomic_read(&rx->peer_rx_active); + if (k) { + tipc_aead_users_dec(tx->aead[k], 0); + /* Mark the point TX key users changed */ + tx->timer1 = jiffies; + } + } + + /* Release AEAD keys */ + c = *crypto; + for (k = KEY_MIN; k <= KEY_MAX; k++) + tipc_aead_put(rcu_dereference(c->aead[k])); + rcu_read_unlock(); + + pr_warn("%s(%s) has been purged, node left!\n", + (is_rx) ? "RX" : "TX", + (is_rx) ? 
tipc_node_get_id_str((*crypto)->node) : + tipc_own_id_string((*crypto)->net)); + + /* Free this crypto statistics */ + free_percpu(c->stats); + + *crypto = NULL; + kzfree(c); +} + +void tipc_crypto_timeout(struct tipc_crypto *rx) +{ + struct tipc_net *tn = tipc_net(rx->net); + struct tipc_crypto *tx = tn->crypto_tx; + struct tipc_key key; + u8 new_pending, new_passive; + int cmd; + + /* TX key activating: + * The pending key (users > 0) -> active + * The active key if any (users == 0) -> free + */ + spin_lock(&tx->lock); + key = tx->key; + if (key.active && tipc_aead_users(tx->aead[key.active]) > 0) + goto s1; + if (!key.pending || tipc_aead_users(tx->aead[key.pending]) <= 0) + goto s1; + if (time_before(jiffies, tx->timer1 + TIPC_TX_LASTING_LIM)) + goto s1; + + tipc_crypto_key_set_state(tx, key.passive, key.pending, 0); + if (key.active) + tipc_crypto_key_detach(tx->aead[key.active], &tx->lock); + this_cpu_inc(tx->stats->stat[STAT_SWITCHES]); + pr_info("TX(%s): key %d is activated!\n", tipc_own_id_string(tx->net), + key.pending); + +s1: + spin_unlock(&tx->lock); + + /* RX key activating: + * The pending key (users > 0) -> active + * The active key if any -> passive, freed later + */ + spin_lock(&rx->lock); + key = rx->key; + if (!key.pending || tipc_aead_users(rx->aead[key.pending]) <= 0) + goto s2; + + new_pending = (key.passive && + !tipc_aead_users(rx->aead[key.passive])) ? + key.passive : 0; + new_passive = (key.active) ?: ((new_pending) ? 0 : key.passive); + tipc_crypto_key_set_state(rx, new_passive, key.pending, new_pending); + this_cpu_inc(rx->stats->stat[STAT_SWITCHES]); + pr_info("RX(%s): key %d is activated!\n", + tipc_node_get_id_str(rx->node), key.pending); + goto s5; + +s2: + /* RX key "faulty" switching: + * The faulty pending key (users < -30) -> passive + * The passive key (users = 0) -> pending + * Note: This only happens after RX deactivated - s3! 
+ */ + key = rx->key; + if (!key.pending || tipc_aead_users(rx->aead[key.pending]) > -30) + goto s3; + if (!key.passive || tipc_aead_users(rx->aead[key.passive]) != 0) + goto s3; + + new_pending = key.passive; + new_passive = key.pending; + tipc_crypto_key_set_state(rx, new_passive, key.active, new_pending); + goto s5; + +s3: + /* RX key deactivating: + * The passive key if any -> pending + * The active key -> passive (users = 0) / pending + * The pending key if any -> passive (users = 0) + */ + key = rx->key; + if (!key.active) + goto s4; + if (time_before(jiffies, rx->timer1 + TIPC_RX_ACTIVE_LIM)) + goto s4; + + new_pending = (key.passive) ?: key.active; + new_passive = (key.passive) ? key.active : key.pending; + tipc_aead_users_set(rx->aead[new_pending], 0); + if (new_passive) + tipc_aead_users_set(rx->aead[new_passive], 0); + tipc_crypto_key_set_state(rx, new_passive, 0, new_pending); + pr_info("RX(%s): key %d is deactivated!\n", + tipc_node_get_id_str(rx->node), key.active); + goto s5; + +s4: + /* RX key passive -> freed: */ + key = rx->key; + if (!key.passive || !tipc_aead_users(rx->aead[key.passive])) + goto s5; + if (time_before(jiffies, rx->timer2 + TIPC_RX_PASSIVE_LIM)) + goto s5; + + tipc_crypto_key_set_state(rx, 0, key.active, key.pending); + tipc_crypto_key_detach(rx->aead[key.passive], &rx->lock); + pr_info("RX(%s): key %d is freed!\n", tipc_node_get_id_str(rx->node), + key.passive); + +s5: + spin_unlock(&rx->lock); + + /* Limit max_tfms & do debug commands if needed */ + if (likely(sysctl_tipc_max_tfms <= TIPC_MAX_TFMS_LIM)) + return; + + cmd = sysctl_tipc_max_tfms; + sysctl_tipc_max_tfms = TIPC_MAX_TFMS_DEF; + tipc_crypto_do_cmd(rx->net, cmd); +} + +/** + * tipc_crypto_xmit - Build & encrypt TIPC message for xmit + * @net: struct net + * @skb: input/output message skb pointer + * @b: bearer used for xmit later + * @dst: destination media address + * @__dnode: destination node for reference if any + * + * First, build an encryption message header on 
the top of the message, then + * encrypt the original TIPC message by using the active or pending TX key. + * If the encryption is successful, the encrypted skb is returned directly or + * via the callback. + * Otherwise, the skb is freed! + * + * Return: + * 0 : the encryption has succeeded (or no encryption) + * -EINPROGRESS/-EBUSY : the encryption is ongoing, a callback will be made + * -ENOKEY : the encryption has failed due to no key + * -EKEYREVOKED : the encryption has failed due to key revoked + * -ENOMEM : the encryption has failed due to no memory + * < 0 : the encryption has failed due to other reasons + */ +int tipc_crypto_xmit(struct net *net, struct sk_buff **skb, + struct tipc_bearer *b, struct tipc_media_addr *dst, + struct tipc_node *__dnode) +{ + struct tipc_crypto *__rx = tipc_node_crypto_rx(__dnode); + struct tipc_crypto *tx = tipc_net(net)->crypto_tx; + struct tipc_crypto_stats __percpu *stats = tx->stats; + struct tipc_key key = tx->key; + struct tipc_aead *aead = NULL; + struct sk_buff *probe; + int rc = -ENOKEY; + u8 tx_key; + + /* No encryption? */ + if (!tx->working) + return 0; + + /* Try with the pending key if available and: + * 1) This is the only choice (i.e. 
no active key) or; + * 2) Peer has switched to this key (unicast only) or; + * 3) It is time to do a pending key probe; + */ + if (unlikely(key.pending)) { + tx_key = key.pending; + if (!key.active) + goto encrypt; + if (__rx && atomic_read(&__rx->peer_rx_active) == tx_key) + goto encrypt; + if (TIPC_SKB_CB(*skb)->probe) + goto encrypt; + if (!__rx && + time_after(jiffies, tx->timer2 + TIPC_TX_PROBE_LIM)) { + tx->timer2 = jiffies; + probe = skb_clone(*skb, GFP_ATOMIC); + if (probe) { + TIPC_SKB_CB(probe)->probe = 1; + tipc_crypto_xmit(net, &probe, b, dst, __dnode); + if (probe) + b->media->send_msg(net, probe, b, dst); + } + } + } + /* Else, use the active key if any */ + if (likely(key.active)) { + tx_key = key.active; + goto encrypt; + } + goto exit; + +encrypt: + aead = tipc_aead_get(tx->aead[tx_key]); + if (unlikely(!aead)) + goto exit; + rc = tipc_ehdr_build(net, aead, tx_key, *skb, __rx); + if (likely(rc > 0)) + rc = tipc_aead_encrypt(aead, *skb, b, dst, __dnode); + +exit: + switch (rc) { + case 0: + this_cpu_inc(stats->stat[STAT_OK]); + break; + case -EINPROGRESS: + case -EBUSY: + this_cpu_inc(stats->stat[STAT_ASYNC]); + *skb = NULL; + return rc; + default: + this_cpu_inc(stats->stat[STAT_NOK]); + if (rc == -ENOKEY) + this_cpu_inc(stats->stat[STAT_NOKEYS]); + else if (rc == -EKEYREVOKED) + this_cpu_inc(stats->stat[STAT_BADKEYS]); + kfree_skb(*skb); + *skb = NULL; + break; + } + + tipc_aead_put(aead); + return rc; +} + +/** + * tipc_crypto_rcv - Decrypt an encrypted TIPC message from peer + * @net: struct net + * @rx: RX crypto handle + * @skb: input/output message skb pointer + * @b: bearer where the message has been received + * + * If the decryption is successful, the decrypted skb is returned directly or + * as the callback, the encryption header and auth tag will be trimmed out + * before forwarding to tipc_rcv() via the tipc_crypto_rcv_complete(). + * Otherwise, the skb will be freed! 
+ * Note: RX key(s) can be re-aligned, or in case of no key suitable, TX + * cluster key(s) can be taken for decryption (- recursive). + * + * Return: + * 0 : the decryption has successfully completed + * -EINPROGRESS/-EBUSY : the decryption is ongoing, a callback will be made + * -ENOKEY : the decryption has failed due to no key + * -EBADMSG : the decryption has failed due to bad message + * -ENOMEM : the decryption has failed due to no memory + * < 0 : the decryption has failed due to other reasons + */ +int tipc_crypto_rcv(struct net *net, struct tipc_crypto *rx, + struct sk_buff **skb, struct tipc_bearer *b) +{ + struct tipc_crypto *tx = tipc_net(net)->crypto_tx; + struct tipc_crypto_stats __percpu *stats; + struct tipc_aead *aead = NULL; + struct tipc_key key; + int rc = -ENOKEY; + u8 tx_key = 0; + + /* New peer? + * Let's try with TX key (i.e. cluster mode) & verify the skb first! + */ + if (unlikely(!rx)) + goto pick_tx; + + /* Pick RX key according to TX key, three cases are possible: + * 1) The current active key (likely) or; + * 2) The pending (new or deactivated) key (if any) or; + * 3) The passive or old active key (i.e. users > 0); + */ + tx_key = ((struct tipc_ehdr *)(*skb)->data)->tx_key; + key = rx->key; + if (likely(tx_key == key.active)) + goto decrypt; + if (tx_key == key.pending) + goto decrypt; + if (tx_key == key.passive) { + rx->timer2 = jiffies; + if (tipc_aead_users(rx->aead[key.passive]) > 0) + goto decrypt; + } + + /* Unknown key, let's try to align RX key(s) */ + if (tipc_crypto_key_try_align(rx, tx_key)) + goto decrypt; + +pick_tx: + /* No key suitable? Try to pick one from TX... 
*/ + aead = tipc_crypto_key_pick_tx(tx, rx, *skb); + if (aead) + goto decrypt; + goto exit; + +decrypt: + rcu_read_lock(); + if (!aead) + aead = tipc_aead_get(rx->aead[tx_key]); + rc = tipc_aead_decrypt(net, aead, *skb, b); + rcu_read_unlock(); + +exit: + stats = ((rx) ?: tx)->stats; + switch (rc) { + case 0: + this_cpu_inc(stats->stat[STAT_OK]); + break; + case -EINPROGRESS: + case -EBUSY: + this_cpu_inc(stats->stat[STAT_ASYNC]); + *skb = NULL; + return rc; + default: + this_cpu_inc(stats->stat[STAT_NOK]); + if (rc == -ENOKEY) { + kfree_skb(*skb); + *skb = NULL; + if (rx) + tipc_node_put(rx->node); + this_cpu_inc(stats->stat[STAT_NOKEYS]); + return rc; + } else if (rc == -EBADMSG) { + this_cpu_inc(stats->stat[STAT_BADMSGS]); + } + break; + } + + tipc_crypto_rcv_complete(net, aead, b, skb, rc); + return rc; +} + +static void tipc_crypto_rcv_complete(struct net *net, struct tipc_aead *aead, + struct tipc_bearer *b, + struct sk_buff **skb, int err) +{ + struct tipc_skb_cb *skb_cb = TIPC_SKB_CB(*skb); + struct tipc_crypto *rx = aead->crypto; + struct tipc_aead *tmp = NULL; + struct tipc_ehdr *ehdr; + struct tipc_node *n; + u8 rx_key_active; + bool destined; + + /* Is this completed by TX? */ + if (unlikely(!rx->node)) { + rx = skb_cb->tx_clone_ctx.rx; +#ifdef TIPC_CRYPTO_DEBUG + pr_info("TX->RX(%s): err %d, aead %p, skb->next %p, flags %x\n", + (rx) ? 
tipc_node_get_id_str(rx->node) : "-", err, aead, + (*skb)->next, skb_cb->flags); + pr_info("skb_cb [recurs %d, last %p], tx->aead [%p %p %p]\n", + skb_cb->tx_clone_ctx.recurs, skb_cb->tx_clone_ctx.last, + aead->crypto->aead[1], aead->crypto->aead[2], + aead->crypto->aead[3]); +#endif + if (unlikely(err)) { + if (err == -EBADMSG && (*skb)->next) + tipc_rcv(net, (*skb)->next, b); + goto free_skb; + } + + if (likely((*skb)->next)) { + kfree_skb((*skb)->next); + (*skb)->next = NULL; + } + ehdr = (struct tipc_ehdr *)(*skb)->data; + if (!rx) { + WARN_ON(ehdr->user != LINK_CONFIG); + n = tipc_node_create(net, 0, ehdr->id, 0xffffu, true); + rx = tipc_node_crypto_rx(n); + if (unlikely(!rx)) + goto free_skb; + } + + /* Skip cloning this time as we had a RX pending key */ + if (rx->key.pending) + goto rcv; + if (tipc_aead_clone(&tmp, aead) < 0) + goto rcv; + if (tipc_crypto_key_attach(rx, tmp, ehdr->tx_key) < 0) { + tipc_aead_free(&tmp->rcu); + goto rcv; + } + tipc_aead_put(aead); + aead = tipc_aead_get(tmp); + } + + if (unlikely(err)) { + tipc_aead_users_dec(aead, INT_MIN); + goto free_skb; + } + + /* Set the... [truncated message content] |
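The IV layout that tipc_aead_encrypt() and tipc_aead_decrypt() in the patch above describe, [SALT (4 octets)][SEQNO (8 octets)] with the salt XOR-ed with a node address, can be sketched in user space as follows. This is only an illustration under stated assumptions: sketch_build_iv() and the SKETCH_* constants are hypothetical names that do not appear in the patch, kernel types are replaced by their <stdint.h> equivalents, and passing addr = 0 stands in for the case where no address is mixed in.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical constants mirroring the layout named in the patch comment. */
#define SKETCH_IV_SALT_LEN  4	/* SALT: 4 octets */
#define SKETCH_IV_SEQNO_LEN 8	/* SEQNO: 8 octets */
#define SKETCH_IV_LEN       (SKETCH_IV_SALT_LEN + SKETCH_IV_SEQNO_LEN)

/*
 * Build a 12-octet IV as [salt ^ addr][seqno].
 *
 * salt and addr are raw 32-bit words and are copied with memcpy(), the same
 * way the patch copies its salt. seqno_be holds the 8 sequence number octets
 * exactly as they sit in the encryption header (big-endian on the wire).
 * addr == 0 leaves the salt unchanged, since x ^ 0 == x.
 */
static void sketch_build_iv(uint8_t iv[SKETCH_IV_LEN], uint32_t salt,
			    uint32_t addr,
			    const uint8_t seqno_be[SKETCH_IV_SEQNO_LEN])
{
	salt ^= addr;
	memcpy(iv, &salt, SKETCH_IV_SALT_LEN);
	memcpy(iv + SKETCH_IV_SALT_LEN, seqno_be, SKETCH_IV_SEQNO_LEN);
}
```

Both sides must feed the same three inputs to arrive at the same IV, which is why the receive path in the patch rebuilds the salt from the header address (cluster-key mode) or from its own address (when the message is marked as destined). Two senders sharing one cluster key and one sequence number still get different IVs as long as their addresses differ.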
From: Tuong L. <tuo...@de...> - 2019-10-14 11:07:51
|
When a user sets an RX key for a peer that does not yet exist on the own node, a new node entry is needed to attach the RX key to. At that moment, however, only the node ID is known; the peer node address (and capabilities) are not. This commit therefore allows a node to be created from that limited data alone, which we call a "preliminary" node.

A preliminary node is not returned by tipc_node_find(), only by tipc_node_find_by_id(). Once the first message (i.e. LINK_CONFIG) arrives from that peer and is successfully decrypted by the own node, the actual peer node data is updated and the node functions as usual.

In addition, the node timer now always starts when a node object is created, so a preliminary node that is never used will be cleaned up.

The later encryption functions will also use the node timer and will be able to create a preliminary node automatically when needed.

Signed-off-by: Tuong Lien <tuo...@de...>
---
 net/tipc/node.c | 103 ++++++++++++++++++++++++++++++++++++++++----------------
 net/tipc/node.h |   3 ++
 2 files changed, 77 insertions(+), 29 deletions(-)

diff --git a/net/tipc/node.c b/net/tipc/node.c index f2e3cf70c922..2eeffc380e8f 100644 --- a/net/tipc/node.c +++ b/net/tipc/node.c @@ -79,6 +79,7 @@ struct tipc_bclink_entry { /** * struct tipc_node - TIPC node structure * @addr: network address of node + * @preliminary: a preliminary node or not * @ref: reference counter to node object * @lock: rwlock governing access to structure * @net: the applicable net namespace @@ -102,6 +103,7 @@ struct tipc_bclink_entry { */ struct tipc_node { u32 addr; + bool preliminary; struct kref kref; rwlock_t lock; struct net *net; @@ -120,6 +122,7 @@ struct tipc_node { u32 signature; u32 link_id; u8 peer_id[16]; + char peer_id_string[NODE_ID_STR_LEN]; struct list_head publ_list; struct list_head conn_sks; unsigned long keepalive_intv; @@ -235,6 +238,16 @@ u16 tipc_node_get_capabilities(struct net *net, u32 addr) return caps; } +u32 tipc_node_get_addr(struct tipc_node 
*node) +{ + return (node) ? node->addr : 0; +} + +char *tipc_node_get_id_str(struct tipc_node *node) +{ + return node->peer_id_string; +} + static void tipc_node_kref_release(struct kref *kref) { struct tipc_node *n = container_of(kref, struct tipc_node, kref); @@ -264,7 +277,7 @@ static struct tipc_node *tipc_node_find(struct net *net, u32 addr) rcu_read_lock(); hlist_for_each_entry_rcu(node, &tn->node_htable[thash], hash) { - if (node->addr != addr) + if (node->addr != addr || node->preliminary) continue; if (!kref_get_unless_zero(&node->kref)) node = NULL; @@ -360,22 +373,44 @@ static void tipc_node_write_unlock(struct tipc_node *n) } } -static struct tipc_node *tipc_node_create(struct net *net, u32 addr, - u8 *peer_id, u16 capabilities) +struct tipc_node *tipc_node_create(struct net *net, u32 addr, u8 *peer_id, + u16 capabilities, bool preliminary) { struct tipc_net *tn = net_generic(net, tipc_net_id); struct tipc_node *n, *temp_node; struct tipc_link *l; + unsigned long intv; int bearer_id; int i; spin_lock_bh(&tn->node_list_lock); - n = tipc_node_find(net, addr); + n = tipc_node_find(net, addr) ?: + tipc_node_find_by_id(net, peer_id); if (n) { - if (n->capabilities == capabilities) + if (!n->preliminary) { + if (n->capabilities == capabilities) + goto exit; + tipc_node_write_lock(n); + goto update_caps; + } + if (preliminary) goto exit; - /* Same node may come back with new capabilities */ + /* Update preliminary node data & make it "real" */ tipc_node_write_lock(n); + n->preliminary = false; + n->addr = addr; + hlist_del_rcu(&n->hash); + hlist_add_head_rcu(&n->hash, + &tn->node_htable[tipc_hashfn(addr)]); + list_del_rcu(&n->list); + list_for_each_entry_rcu(temp_node, &tn->node_list, list) { + if (n->addr < temp_node->addr) + break; + } + list_add_tail_rcu(&n->list, &temp_node->list); + +update_caps: + /* Same node may come back with new capabilities */ n->capabilities = capabilities; for (bearer_id = 0; bearer_id < MAX_BEARERS; bearer_id++) { l = 
n->links[bearer_id].link; @@ -396,7 +431,9 @@ static struct tipc_node *tipc_node_create(struct net *net, u32 addr, pr_warn("Node creation failed, no memory\n"); goto exit; } + tipc_nodeid2string(n->peer_id_string, peer_id); n->addr = addr; + n->preliminary = preliminary; memcpy(&n->peer_id, peer_id, 16); n->net = net; n->capabilities = capabilities; @@ -417,22 +454,14 @@ static struct tipc_node *tipc_node_create(struct net *net, u32 addr, n->signature = INVALID_NODE_SIG; n->active_links[0] = INVALID_BEARER_ID; n->active_links[1] = INVALID_BEARER_ID; - if (!tipc_link_bc_create(net, tipc_own_addr(net), - addr, U16_MAX, - tipc_link_window(tipc_bc_sndlink(net)), - n->capabilities, - &n->bc_entry.inputq1, - &n->bc_entry.namedq, - tipc_bc_sndlink(net), - &n->bc_entry.link)) { - pr_warn("Broadcast rcv link creation failed, no memory\n"); - kfree(n); - n = NULL; - goto exit; - } + n->bc_entry.link = NULL; tipc_node_get(n); timer_setup(&n->timer, tipc_node_timeout, 0); - n->keepalive_intv = U32_MAX; + /* Start a slow timer anyway, crypto needs it */ + n->keepalive_intv = 10000; + intv = jiffies + msecs_to_jiffies(n->keepalive_intv); + if (!mod_timer(&n->timer, intv)) + tipc_node_get(n); hlist_add_head_rcu(&n->hash, &tn->node_htable[tipc_hashfn(addr)]); list_for_each_entry_rcu(temp_node, &tn->node_list, list) { if (n->addr < temp_node->addr) @@ -950,6 +979,8 @@ u32 tipc_node_try_addr(struct net *net, u8 *id, u32 addr) { struct tipc_net *tn = tipc_net(net); struct tipc_node *n; + bool preliminary; + u32 sugg_addr; /* Suggest new address if some other peer is using this one */ n = tipc_node_find(net, addr); @@ -965,9 +996,11 @@ u32 tipc_node_try_addr(struct net *net, u8 *id, u32 addr) /* Suggest previously used address if peer is known */ n = tipc_node_find_by_id(net, id); if (n) { - addr = n->addr; + sugg_addr = n->addr; + preliminary = n->preliminary; tipc_node_put(n); - return addr; + if (!preliminary) + return sugg_addr; } /* Even this node may be in conflict */ @@ -984,7 
+1017,7 @@ void tipc_node_check_dest(struct net *net, u32 addr, bool *respond, bool *dupl_addr) { struct tipc_node *n; - struct tipc_link *l; + struct tipc_link *l, *snd_l; struct tipc_link_entry *le; bool addr_match = false; bool sign_match = false; @@ -998,11 +1031,26 @@ void tipc_node_check_dest(struct net *net, u32 addr, *dupl_addr = false; *respond = false; - n = tipc_node_create(net, addr, peer_id, capabilities); + n = tipc_node_create(net, addr, peer_id, capabilities, false); if (!n) return; tipc_node_write_lock(n); + if (unlikely(!n->bc_entry.link)) { + snd_l = tipc_bc_sndlink(net); + if (!tipc_link_bc_create(net, tipc_own_addr(net), + addr, U16_MAX, + tipc_link_window(snd_l), + n->capabilities, + &n->bc_entry.inputq1, + &n->bc_entry.namedq, snd_l, + &n->bc_entry.link)) { + pr_warn("Broadcast rcv link creation failed, no mem\n"); + tipc_node_write_unlock_fast(n); + tipc_node_put(n); + return; + } + } le = &n->links[b->identity]; @@ -2011,6 +2059,8 @@ int tipc_nl_node_dump(struct sk_buff *skb, struct netlink_callback *cb) } list_for_each_entry_rcu(node, &tn->node_list, list) { + if (node->preliminary) + continue; if (last_addr) { if (node->addr == last_addr) last_addr = 0; @@ -2526,11 +2576,6 @@ int tipc_nl_node_dump_monitor_peer(struct sk_buff *skb, return skb->len; } -u32 tipc_node_get_addr(struct tipc_node *node) -{ - return (node) ? 
node->addr : 0; -} - /** * tipc_node_dump - dump TIPC node data * @n: tipc node to be dumped diff --git a/net/tipc/node.h b/net/tipc/node.h index 291d0ecd4101..d128c0de3288 100644 --- a/net/tipc/node.h +++ b/net/tipc/node.h @@ -72,12 +72,15 @@ enum { void tipc_node_stop(struct net *net); bool tipc_node_get_id(struct net *net, u32 addr, u8 *id); u32 tipc_node_get_addr(struct tipc_node *node); +char *tipc_node_get_id_str(struct tipc_node *node); u32 tipc_node_try_addr(struct net *net, u8 *id, u32 addr); void tipc_node_check_dest(struct net *net, u32 onode, u8 *peer_id128, struct tipc_bearer *bearer, u16 capabilities, u32 signature, struct tipc_media_addr *maddr, bool *respond, bool *dupl_addr); +struct tipc_node *tipc_node_create(struct net *net, u32 addr, u8 *peer_id, + u16 capabilities, bool preliminary); void tipc_node_delete_links(struct net *net, int bearer_id); void tipc_node_apply_property(struct net *net, struct tipc_bearer *b, int prop); int tipc_node_get_linkname(struct net *net, u32 bearer_id, u32 node, -- 2.13.7 |
From: Tuong L. <tuo...@de...> - 2019-10-14 11:07:50
|
The new structure 'tipc_aead_key' is added to the 'tipc.h' for user to be able to transfer a key to TIPC in kernel. Netlink will be used for this purpose in the later commits. Signed-off-by: Tuong Lien <tuo...@de...> --- include/uapi/linux/tipc.h | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/include/uapi/linux/tipc.h b/include/uapi/linux/tipc.h index 7df026ea6aff..a1b64a916797 100644 --- a/include/uapi/linux/tipc.h +++ b/include/uapi/linux/tipc.h @@ -232,6 +232,27 @@ struct tipc_sioc_nodeid_req { char node_id[TIPC_NODEID_LEN]; }; +/* + * TIPC Crypto, AEAD + */ +#define TIPC_AEAD_ALG_NAME (32) + +struct tipc_aead_key { + char alg_name[TIPC_AEAD_ALG_NAME]; + unsigned int keylen; /* in bytes */ + char key[]; +}; + +#define TIPC_AEAD_KEYLEN_MIN (16 + 4) +#define TIPC_AEAD_KEYLEN_MAX (32 + 4) +#define TIPC_AEAD_KEY_SIZE_MAX (sizeof(struct tipc_aead_key) + \ + TIPC_AEAD_KEYLEN_MAX) + +static inline int tipc_aead_key_size(struct tipc_aead_key *key) +{ + return sizeof(*key) + key->keylen; +} + /* The macros and functions below are deprecated: */ -- 2.13.7 |
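As a rough illustration of how user space might fill in the variable-length 'tipc_aead_key' object before handing it to the kernel, here is a minimal sketch. It mirrors the definitions from the patch locally so that it compiles without the patched header; 'example_key_new()' is a hypothetical helper and not part of the series, and the algorithm name is whatever string the caller passes in.

```c
#include <stdlib.h>
#include <string.h>

/* Local mirror of the definitions added by the patch. A real program would
 * include <linux/tipc.h> from a kernel tree carrying this change instead. */
#define TIPC_AEAD_ALG_NAME	(32)
#define TIPC_AEAD_KEYLEN_MIN	(16 + 4)
#define TIPC_AEAD_KEYLEN_MAX	(32 + 4)

struct tipc_aead_key {
	char alg_name[TIPC_AEAD_ALG_NAME];
	unsigned int keylen;	/* in bytes */
	char key[];		/* flexible array: keylen bytes follow */
};

static inline int tipc_aead_key_size(struct tipc_aead_key *key)
{
	return sizeof(*key) + key->keylen;
}

/* Hypothetical helper (not in the patch): allocate and fill a key object.
 * Returns NULL if the arguments violate the limits defined above. */
struct tipc_aead_key *example_key_new(const char *alg, const void *material,
				      unsigned int keylen)
{
	struct tipc_aead_key *k;

	if (!alg || !material)
		return NULL;
	if (keylen < TIPC_AEAD_KEYLEN_MIN || keylen > TIPC_AEAD_KEYLEN_MAX)
		return NULL;
	if (strlen(alg) >= TIPC_AEAD_ALG_NAME)
		return NULL;

	/* Header and key material live in one contiguous allocation */
	k = calloc(1, sizeof(*k) + keylen);
	if (!k)
		return NULL;

	strcpy(k->alg_name, alg);	/* length checked above */
	k->keylen = keylen;
	memcpy(k->key, material, keylen);
	return k;
}
```

The caller would pass 'tipc_aead_key_size(k)' as the payload length when sending the object, and release it with 'free()' afterwards.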
From: Tuong L. <tuo...@de...> - 2019-10-14 11:07:49
|
To support the asynchronous crypto operations in the later commits, we add a 'refcnt' to the bearer object in addition to the current RCU mechanism for the bearer pointer. A bearer can then be held via 'tipc_bearer_hold()' without being freed, even if the bearer or interface is disabled in the meantime. If that happens, the bearer is released once the crypto operation has completed and 'tipc_bearer_put()' is called. Signed-off-by: Tuong Lien <tuo...@de...> --- net/tipc/bearer.c | 23 ++++++++++++++++++++++- net/tipc/bearer.h | 3 +++ 2 files changed, 25 insertions(+), 1 deletion(-) diff --git a/net/tipc/bearer.c b/net/tipc/bearer.c index 0214aa1c4427..6e0962e0f759 100644 --- a/net/tipc/bearer.c +++ b/net/tipc/bearer.c @@ -315,6 +315,7 @@ static int tipc_enable_bearer(struct net *net, const char *name, b->net_plane = bearer_id + 'A'; b->priority = prio; test_and_set_bit_lock(0, &b->up); + refcount_set(&b->refcnt, 1); res = tipc_disc_create(net, b, &b->bcast_addr, &skb); if (res) { @@ -351,6 +352,26 @@ static int tipc_reset_bearer(struct net *net, struct tipc_bearer *b) return 0; } +bool tipc_bearer_hold(struct tipc_bearer *b) +{ + if (unlikely(!b)) + return false; + + if (unlikely(!refcount_inc_not_zero(&b->refcnt))) + return false; + + return true; +} + +void tipc_bearer_put(struct tipc_bearer *b) +{ + if (unlikely(!b)) + return; + + if (refcount_dec_and_test(&b->refcnt)) + kfree_rcu(b, rcu); +} + /** * bearer_disable * @@ -369,7 +390,7 @@ static void bearer_disable(struct net *net, struct tipc_bearer *b) if (b->disc) tipc_disc_delete(b->disc); RCU_INIT_POINTER(tn->bearer_list[bearer_id], NULL); - kfree_rcu(b, rcu); + tipc_bearer_put(b); tipc_mon_delete(net, bearer_id); } diff --git a/net/tipc/bearer.h b/net/tipc/bearer.h index ea0f3c49cbed..faca696d422f 100644 --- a/net/tipc/bearer.h +++ b/net/tipc/bearer.h @@ -165,6 +165,7 @@ struct tipc_bearer { struct tipc_discoverer *disc; char net_plane; unsigned long up; + refcount_t refcnt; };
struct tipc_bearer_names { @@ -210,6 +211,8 @@ int tipc_media_set_window(const char *name, u32 new_value); int tipc_media_addr_printf(char *buf, int len, struct tipc_media_addr *a); int tipc_enable_l2_media(struct net *net, struct tipc_bearer *b, struct nlattr *attrs[]); +bool tipc_bearer_hold(struct tipc_bearer *b); +void tipc_bearer_put(struct tipc_bearer *b); void tipc_disable_l2_media(struct tipc_bearer *b); int tipc_l2_send_msg(struct net *net, struct sk_buff *buf, struct tipc_bearer *b, struct tipc_media_addr *dest); -- 2.13.7 |
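The hold/put lifetime rule introduced by this patch can be modelled in user space, which may help when reasoning about the ordering of "disable" and "async completion". The sketch below is only an analogy: it uses C11 atomics in place of the kernel's 'refcount_t', plain 'free()' in place of 'kfree_rcu()', and every 'example_*' name is hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct tipc_bearer; only the counter matters */
struct example_bearer {
	atomic_uint refcnt;
};

/* Counts how many objects have been released, so callers can observe it */
int example_released_count;

struct example_bearer *example_bearer_new(void)
{
	struct example_bearer *b = malloc(sizeof(*b));

	if (b)
		atomic_init(&b->refcnt, 1);	/* reference owned by "enable" */
	return b;
}

/* Take a reference unless the count has already dropped to zero */
bool example_bearer_hold(struct example_bearer *b)
{
	unsigned int old;

	if (!b)
		return false;

	old = atomic_load(&b->refcnt);
	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value */
		if (atomic_compare_exchange_weak(&b->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; whoever drops the last one frees the object */
void example_bearer_put(struct example_bearer *b)
{
	if (!b)
		return;

	if (atomic_fetch_sub(&b->refcnt, 1) == 1) {
		free(b);
		example_released_count++;
	}
}
```

In this model, "disable" corresponds to a put of the initial reference. If an asynchronous operation took its own reference beforehand, the object survives until that operation calls put as well.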
From: Tuong L. <tuo...@de...> - 2019-10-14 11:07:49
|
This series provides the kernel part of the TIPC encryption feature. A separate series for 'iproute2/tipc' will let user space set the key.

Tuong Lien (5):
  tipc: add reference counter to bearer
  tipc: enable creating a "preliminary" node
  tipc: add new AEAD key structure for user API
  tipc: introduce TIPC encryption & authentication
  tipc: add support for AEAD key setting via netlink

 include/uapi/linux/tipc.h         |   21 +
 include/uapi/linux/tipc_netlink.h |    4 +
 net/tipc/Makefile                 |    2 +-
 net/tipc/bcast.c                  |    2 +-
 net/tipc/bearer.c                 |   52 +-
 net/tipc/bearer.h                 |    6 +-
 net/tipc/core.c                   |   10 +
 net/tipc/core.h                   |    4 +
 net/tipc/crypto.c                 | 1986 +++++++++++++++++++++++++++++++++++++
 net/tipc/crypto.h                 |  166 ++++
 net/tipc/link.c                   |   16 +-
 net/tipc/link.h                   |    1 +
 net/tipc/msg.c                    |   24 +-
 net/tipc/msg.h                    |   44 +-
 net/tipc/netlink.c                |   16 +-
 net/tipc/node.c                   |  314 +++++-
 net/tipc/node.h                   |   10 +
 net/tipc/sysctl.c                 |    9 +
 net/tipc/udp_media.c              |    1 +
 19 files changed, 2604 insertions(+), 84 deletions(-)
 create mode 100644 net/tipc/crypto.c
 create mode 100644 net/tipc/crypto.h

-- 
2.13.7 |
From: Hoang Le <hoa...@de...> - 2019-10-14 04:55:53
|
We add the support to remove a specific node down with 128bit node identifier, as an alternative to legacy 32-bit node address. v2: improve usage for 'tipc peer remove' command Signed-off-by: Hoang Le <hoa...@de...> --- tipc/peer.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 52 insertions(+), 1 deletion(-) diff --git a/tipc/peer.c b/tipc/peer.c index f6380777033d..f14ec35e6f71 100644 --- a/tipc/peer.c +++ b/tipc/peer.c @@ -59,17 +59,68 @@ static int cmd_peer_rm_addr(struct nlmsghdr *nlh, const struct cmd *cmd, return msg_doit(nlh, NULL, NULL); } +static int cmd_peer_rm_nodeid(struct nlmsghdr *nlh, const struct cmd *cmd, + struct cmdl *cmdl, void *data) +{ + char buf[MNL_SOCKET_BUFFER_SIZE]; + __u8 id[16] = {0,}; + __u64 *w0 = (__u64 *)&id[0]; + __u64 *w1 = (__u64 *)&id[8]; + struct nlattr *nest; + char *str; + + if (cmdl->argc != cmdl->optind + 1) { + fprintf(stderr, "Usage: %s peer remove identity NODEID\n", + cmdl->argv[0]); + return -EINVAL; + } + + str = shift_cmdl(cmdl); + if (str2nodeid(str, id)) { + fprintf(stderr, "Invalid node identity\n"); + return -EINVAL; + } + + nlh = msg_init(buf, TIPC_NL_PEER_REMOVE); + if (!nlh) { + fprintf(stderr, "error, message initialisation failed\n"); + return -1; + } + + nest = mnl_attr_nest_start(nlh, TIPC_NLA_NET); + mnl_attr_put_u64(nlh, TIPC_NLA_NET_NODEID, *w0); + mnl_attr_put_u64(nlh, TIPC_NLA_NET_NODEID_W1, *w1); + mnl_attr_nest_end(nlh, nest); + + return msg_doit(nlh, NULL, NULL); +} + static void cmd_peer_rm_help(struct cmdl *cmdl) +{ + fprintf(stderr, "Usage: %s peer remove PROPERTY\n\n" + "PROPERTIES\n" + " identity NODEID - Remove peer node identity\n", + cmdl->argv[0]); +} + +static void cmd_peer_rm_addr_help(struct cmdl *cmdl) { fprintf(stderr, "Usage: %s peer remove address ADDRESS\n", cmdl->argv[0]); } +static void cmd_peer_rm_nodeid_help(struct cmdl *cmdl) +{ + fprintf(stderr, "Usage: %s peer remove identity NODEID\n", + cmdl->argv[0]); +} + static int cmd_peer_rm(struct 
nlmsghdr *nlh, const struct cmd *cmd, struct cmdl *cmdl, void *data) { const struct cmd cmds[] = { - { "address", cmd_peer_rm_addr, cmd_peer_rm_help }, + { "address", cmd_peer_rm_addr, cmd_peer_rm_addr_help }, + { "identity", cmd_peer_rm_nodeid, cmd_peer_rm_nodeid_help }, { NULL } }; -- 2.20.1 |
From: Hoang Le <hoa...@de...> - 2019-10-14 04:04:56
|
We add the support to remove a specific node down with 128bit node identifier, as an alternative to legacy 32-bit node address. v2: improve usage for 'tipc peer remove' command Signed-off-by: Hoang Le <hoa...@de...> --- tipc/peer.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 54 insertions(+), 1 deletion(-) diff --git a/tipc/peer.c b/tipc/peer.c index f6380777033d..e1517743f80f 100644 --- a/tipc/peer.c +++ b/tipc/peer.c @@ -59,17 +59,70 @@ static int cmd_peer_rm_addr(struct nlmsghdr *nlh, const struct cmd *cmd, return msg_doit(nlh, NULL, NULL); } +static int cmd_peer_rm_nodeid(struct nlmsghdr *nlh, const struct cmd *cmd, + struct cmdl *cmdl, void *data) +{ + char buf[MNL_SOCKET_BUFFER_SIZE]; + __u8 id[16] = {0,}; + __u64 *w0 = (__u64 *)&id[0]; + __u64 *w1 = (__u64 *)&id[8]; + struct nlattr *nest; + char *str; + + if (cmdl->argc != cmdl->optind + 1) { + fprintf(stderr, "Usage: %s peer remove nodeid NODEID\n", + cmdl->argv[0]); + return -EINVAL; + } + + str = shift_cmdl(cmdl); + if (str2nodeid(str, id)) { + fprintf(stderr, "Invalid node identity\n"); + return -EINVAL; + } + + nlh = msg_init(buf, TIPC_NL_PEER_REMOVE); + if (!nlh) { + fprintf(stderr, "error, message initialisation failed\n"); + return -1; + } + + nest = mnl_attr_nest_start(nlh, TIPC_NLA_NET); + mnl_attr_put_u64(nlh, TIPC_NLA_NET_NODEID, *w0); + mnl_attr_put_u64(nlh, TIPC_NLA_NET_NODEID_W1, *w1); + mnl_attr_nest_end(nlh, nest); + + return msg_doit(nlh, NULL, NULL); +} + static void cmd_peer_rm_help(struct cmdl *cmdl) +{ + fprintf(stderr, + "Usage: %s peer remove PROPERTY\n\n" + "PROPERTIES\n" + " address - Remove peer node address\n" + " nodeid - Remove peer node identity\n", + cmdl->argv[0]); +} + +static void cmd_peer_rm_addr_help(struct cmdl *cmdl) { fprintf(stderr, "Usage: %s peer remove address ADDRESS\n", cmdl->argv[0]); } +static void cmd_peer_rm_nodeid_help(struct cmdl *cmdl) +{ + fprintf(stderr, "Usage: %s peer remove nodeid NODEID\n", + cmdl->argv[0]); +} + static 
int cmd_peer_rm(struct nlmsghdr *nlh, const struct cmd *cmd, struct cmdl *cmdl, void *data) { const struct cmd cmds[] = { - { "address", cmd_peer_rm_addr, cmd_peer_rm_help }, + { "address", cmd_peer_rm_addr, cmd_peer_rm_addr_help }, + { "nodeid", cmd_peer_rm_nodeid, cmd_peer_rm_nodeid_help }, { NULL } }; -- 2.20.1 |
From: Hoang Le <hoa...@de...> - 2019-10-14 03:27:07
|
We add the support to remove a specific node down with 128bit node identifier, as an alternative to legacy 32-bit node address. Signed-off-by: Hoang Le <hoa...@de...> --- tipc/peer.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 53 insertions(+), 1 deletion(-) diff --git a/tipc/peer.c b/tipc/peer.c index f6380777033d..9f116b257fda 100644 --- a/tipc/peer.c +++ b/tipc/peer.c @@ -59,17 +59,69 @@ static int cmd_peer_rm_addr(struct nlmsghdr *nlh, const struct cmd *cmd, return msg_doit(nlh, NULL, NULL); } +static int cmd_peer_rm_nodeid(struct nlmsghdr *nlh, const struct cmd *cmd, + struct cmdl *cmdl, void *data) +{ + char buf[MNL_SOCKET_BUFFER_SIZE]; + __u8 id[16] = {0,}; + __u64 *w0 = (__u64 *)&id[0]; + __u64 *w1 = (__u64 *)&id[8]; + struct nlattr *nest; + char *str; + + if (cmdl->argc != cmdl->optind + 1) { + fprintf(stderr, "Usage: %s peer remove nodeid NODEID\n", + cmdl->argv[0]); + return -EINVAL; + } + + str = shift_cmdl(cmdl); + if (str2nodeid(str, id)) { + fprintf(stderr, "Invalid node identity\n"); + return -EINVAL; + } + + nlh = msg_init(buf, TIPC_NL_PEER_REMOVE); + if (!nlh) { + fprintf(stderr, "error, message initialisation failed\n"); + return -1; + } + + nest = mnl_attr_nest_start(nlh, TIPC_NLA_NET); + mnl_attr_put_u64(nlh, TIPC_NLA_NET_NODEID, *w0); + mnl_attr_put_u64(nlh, TIPC_NLA_NET_NODEID_W1, *w1); + mnl_attr_nest_end(nlh, nest); + + return msg_doit(nlh, NULL, NULL); +} + static void cmd_peer_rm_help(struct cmdl *cmdl) +{ + fprintf(stderr, "Usage: %s peer remove PROPERTY\n\n", + "PROPERTIES\n" + " address - Remove peer node address\n" + " nodeid - Remove peer node identity\n", + cmdl->argv[0]); +} + +static void cmd_peer_rm_addr_help(struct cmdl *cmdl) { fprintf(stderr, "Usage: %s peer remove address ADDRESS\n", cmdl->argv[0]); } +static void cmd_peer_rm_nodeid_help(struct cmdl *cmdl) +{ + fprintf(stderr, "Usage: %s peer remove nodeid NODEID\n", + cmdl->argv[0]); +} + static int cmd_peer_rm(struct nlmsghdr *nlh, const 
struct cmd *cmd, struct cmdl *cmdl, void *data) { const struct cmd cmds[] = { - { "address", cmd_peer_rm_addr, cmd_peer_rm_help }, + { "address", cmd_peer_rm_addr, cmd_peer_rm_addr_help }, + { "nodeid", cmd_peer_rm_nodeid, cmd_peer_rm_nodeid_help }, { NULL } }; -- 2.20.1 |
From: Hoang Le <hoa...@de...> - 2019-10-14 03:26:54
|
We add the support to remove a specific node down with 128bit node identifier, as an alternative to legacy 32-bit node address. Signed-off-by: Hoang Le <hoa...@de...> --- net/tipc/node.c | 29 +++++++++++++++++++++-------- 1 file changed, 21 insertions(+), 8 deletions(-) diff --git a/net/tipc/node.c b/net/tipc/node.c index c8f6177dd5a2..152b98b2e8f5 100644 --- a/net/tipc/node.c +++ b/net/tipc/node.c @@ -1926,8 +1926,11 @@ int tipc_nl_peer_rm(struct sk_buff *skb, struct genl_info *info) struct net *net = sock_net(skb->sk); struct tipc_net *tn = net_generic(net, tipc_net_id); struct nlattr *attrs[TIPC_NLA_NET_MAX + 1]; + u8 node_id[NODE_ID_LEN]; + u64 *w0 = (u64 *)&node_id[0]; + u64 *w1 = (u64 *)&node_id[8]; struct tipc_node *peer; - u32 addr; + u32 addr = 0; int err; /* We identify the peer by its net */ @@ -1940,16 +1943,26 @@ int tipc_nl_peer_rm(struct sk_buff *skb, struct genl_info *info) if (err) return err; - if (!attrs[TIPC_NLA_NET_ADDR]) - return -EINVAL; - - addr = nla_get_u32(attrs[TIPC_NLA_NET_ADDR]); + if (attrs[TIPC_NLA_NET_ADDR]) { + addr = nla_get_u32(attrs[TIPC_NLA_NET_ADDR]); + if (!addr) + return -EINVAL; + if (in_own_node(net, addr)) + return -ENOTSUPP; + } - if (in_own_node(net, addr)) - return -ENOTSUPP; + if (attrs[TIPC_NLA_NET_NODEID]) { + if (!attrs[TIPC_NLA_NET_NODEID_W1]) + return -EINVAL; + *w0 = nla_get_u64(attrs[TIPC_NLA_NET_NODEID]); + *w1 = nla_get_u64(attrs[TIPC_NLA_NET_NODEID_W1]); + } spin_lock_bh(&tn->node_list_lock); - peer = tipc_node_find(net, addr); + if (!addr) + peer = tipc_node_find_by_id(net, node_id); + else + peer = tipc_node_find(net, addr); if (!peer) { spin_unlock_bh(&tn->node_list_lock); return -ENXIO; -- 2.20.1 |
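The node identity in these patches travels over netlink as two 64-bit attributes (TIPC_NLA_NET_NODEID and TIPC_NLA_NET_NODEID_W1). The sketch below shows that split and the matching reassembly in plain C. It uses 'memcpy()' rather than the pointer casts seen in the patches, which is a substitution made here for portability, and all 'example_*' names are hypothetical. Because netlink stays on the local host, both sides see the same byte order and the 16 bytes survive the round trip unchanged.

```c
#include <stdint.h>
#include <string.h>

#define EXAMPLE_NODE_ID_LEN 16

/* Split a 128-bit node identity into the two words sent as attributes */
void example_id_to_words(const uint8_t id[EXAMPLE_NODE_ID_LEN],
			 uint64_t *w0, uint64_t *w1)
{
	memcpy(w0, &id[0], sizeof(*w0));
	memcpy(w1, &id[8], sizeof(*w1));
}

/* Rebuild the identity from the two received words */
void example_words_to_id(uint64_t w0, uint64_t w1,
			 uint8_t id[EXAMPLE_NODE_ID_LEN])
{
	memcpy(&id[0], &w0, sizeof(w0));
	memcpy(&id[8], &w1, sizeof(w1));
}

enum example_lookup {
	EXAMPLE_BY_ADDR,	/* legacy 32-bit node address */
	EXAMPLE_BY_ID,		/* 128-bit node identity */
};

/* Mirrors the selection in the patch: a non-zero address wins,
 * otherwise the peer is looked up by identity. */
enum example_lookup example_pick_lookup(uint32_t addr)
{
	return addr ? EXAMPLE_BY_ADDR : EXAMPLE_BY_ID;
}
```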
From: Jon M. <jon...@er...> - 2019-10-13 17:21:44
|
Hi Kumar,

Which kernel version are you using? Also, is it possible for you to narrow down the problem by using a smaller program that displays the same behavior? If so, I would like to use that program for troubleshooting.
How repeatable is your problem? Every time, often, very seldom?

BR
///jon

From: Mahesh Kumar <lm...@ya...>
Sent: 11-Oct-19 18:12
To: Jon Maloy <jon...@er...>
Cc: Ying Xue <yin...@wi...>; Mahesh Kumar via Tipc-discussion <tip...@li...>
Subject: TIPC / client gets disconnected

Hi Jon,

I am observing a strange behavior where the client connection gets dropped once it sends out a message to the server. (The server does open a TIPC socket, listen, accept, read.) The connection remains healthy (connected with the server) if no transaction is made.
I have changed the server listen socket scope from zone to cluster, since zone scope is deprecated. Is there anything more I need to change to keep the connection healthy? Please advise.

thanks
Mahesh kumar.L

On Tuesday, 16 July, 2019, 05:24:53 am GMT-7, Jon Maloy <jon...@er...> wrote:

Hi Kumar,

First of all, I would recommend you not to use <1.88.88> as listener address, since the service types [0,64] are reserved for internal use by TIPC itself.
I am also a little confused about which kernel you are using. Are you now using 4.19.44, and not 4.4.180, or was this just a test?
BTW, since 4.17, the use of “zone” scope is deprecated, and translated internally to “cluster”. There was never any functional difference between them anyway.
Regarding your system, a good start would be to issue the following commands, both on the host and in the container:

$ ip addr
$ tipc bearer list
$ tipc link list
$ tipc nametable show

BR
///jon

From: Mahesh Kumar <lm...@ya...>
Sent: 15-Jul-19 16:21
To: Ying Xue <yin...@wi...>; Jon Maloy <jon...@er...>
Cc: tip...@li...
Subject: Re: [tipc-discussion] TIPC ; config trouble ; help request

Hi Jon,

Thanks a lot for checking this and providing feedback. A brief background of the system: in the host system, upon bootup, node address 1.1.1 is configured. I added a listener at 1.88.88.

tams_srv_addr.family = AF_TIPC;
tams_srv_addr.addrtype = TIPC_ADDR_NAMESEQ;
tams_srv_addr.addr.nameseq.type = TIPC_TOP_SRV;
tams_srv_addr.addr.nameseq.lower = TAMS_TIPC_LISTEN_PORT; <<88
tams_srv_addr.addr.nameseq.upper = TAMS_TIPC_LISTEN_PORT;
tams_srv_addr.scope = TIPC_ZONE_SCOPE;

Now from a container, I am trying to access the host service (88) by setting the container's node address to 1.1.100 using "tipc node set addr 1.1.100". This used to work fine, but it bails out with the 4.4.180 kernel. Another change I noticed after changing my kernel to 4.19.44 is that all of the listeners are now in cluster scope instead of zone scope. I haven't had a chance to check the results from the container with the new kernel.
>>from host
>>>tipc-config -nt
Type       Lower      Upper      Port Identity              Publication Scope
0          16781313   16781313   <1.1.1:0>                  16781313    cluster
1          1          1          <1.1.1:2399481494>         2399481495  node
1          88         88         <1.1.1:2536304640>         2536304641  cluster
15003      5          5          <1.1.1:3973838764>         3973838765  cluster

[Switch_1_RP_0:/]$ uname -a
Linux Switch_1_RP_0 4.19.44 #1 SMP Wed May 22 13:50:02 PDT 2019 x86_64 x86_64 x86_64 GNU/Linux

Please let me know which outputs I need to share from the 'host' and 'container' side.

thanks & regards
Mahesh kumar.L

On Monday, 15 July, 2019, 12:36:09 pm GMT-7, Jon Maloy <jon...@er...> wrote:

Hi Kumar,

Your binding table listing reveals that your node already has an address <1.1.1>, which explains why your address setting fails. You should check if you have any script that sets the address by default at module loading, or maybe you just set it manually and then forgot...

Furthermore, it seems you have published a service <1,88,88>, which means you are illegally using the reserved service type <1>. The latter isn't your fault, but due to a bug in TIPC that wrongly allows users to publish such service types, in the function tipc_bind(). I discovered this bug a couple of months ago, but haven't fixed it yet, and I am not quite sure how to do it without breaking any BPI. This may cause you surprises, but I cannot see why it would cause the bearer enabling to fail.

If this problem persists, you should post some more system info about your interfaces, which tipc links you have etc.

BR
///jon

> -----Original Message-----
> From: Mahesh Kumar via tipc-discussion <tipc-dis...@li...>
> Sent: 15-Jul-19 11:49
> To: tip...@li...; Ying Xue <yin...@wi...>
> Subject: Re: [tipc-discussion] TIPC ; config trouble ; help request
>
> Hi Ying,
> Thank you very much for letting me know. Do we suspect any related ioctl() patches?
> Could you please point me to a link where we can review the TIPC patches that went into the kernel?
> Much appreciate the help.
> thanks & regards
> Mahesh kumar.L
>
> On Monday, 15 July, 2019, 02:56:32 am GMT-7, Ying Xue <yin...@wi...> wrote:
>
> On 7/13/19 11:58 AM, Mahesh Kumar via tipc-discussion wrote:
> > Tipc Team,
> >
> > Greetings!
> > I have been using TIPC for about a year without any issue. However, the TIPC tool is bailing out when I try to set the address and bearer:
> >
> > / # tipc node set addr 1.1.100
> > error: Operation not permitted
> > / # tipc bearer enable media eth dev ieobc
> > error: Invalid argument
> > / #
> >
> > I am using the new kernel now:
> > uname -a
> > Linux 2c3f0b001900_1_RP_0 4.4.180 #1 SMP Tue Jun 25 15:36:10 PDT 2019 x86_64 x86_64 x86_64 GNU/Linux
> > dmesg output; grep -i TIPC d.txt
> > [ 29.436599] tipc: Activated (version 2.0.0)
> > [ 29.436768] tipc: Started in single node mode
>
> I suspect that some TIPC patches integrated in the 4.4.180 release introduced a regression. The simplest method to identify the issue is to revert some TIPC patches to identify which ones caused the regression.
>
> >
> > [2c3f0b001900_1_RP_0:/] $ tipc-config -nt
> > Type       Lower      Upper      Port Identity              Publication Scope
> > 0          16781313   16781313   <1.1.1:0>                  16781313    zone
> > 1          1          1          <1.1.1:483390874>          483390875   node
> > 1          88         88         <1.1.1:2870943326>         2870943327  zone
> > 15003      5          5          <1.1.1:3556781096>         3556781097  zone
> > [2c3f0b001900_1_RP_0:/]$
> >
> > Please let me know if there is any issue.
> > thanks & regards
> > Mahesh kumar.L
> >
> > _______________________________________________
> > tipc-discussion mailing list
> > tip...@li...
> > https://lists.sourceforge.net/lists/listinfo/tipc-discussion |