From: Jon M. <jm...@re...> - 2021-06-29 19:57:37
|
On 28/06/2021 15:16, Xin Long wrote: > On Mon, Jun 28, 2021 at 3:03 PM Xin Long <luc...@gm...> wrote: >> On Sun, Jun 27, 2021 at 3:44 PM Erin Shepherd <eri...@e4...> wrote: >>> Xin Long <luc...@gm...> writes: >>> >>>> Currently, when userspace reads a datagram with a buffer that is >>>> smaller than this datagram, the data will be truncated and only >>>> part of it can be received by users. It doesn't seem right that >>>> users don't know the datagram size and have to use a huge buffer >>>> to read it to avoid the truncation. >>>> >>>> This patch to fix it by keeping the skb in rcv queue until the >>>> whole data is read by users. Only the last msg of the datagram >>>> will be marked with MSG_EOR, just as TCP/SCTP does. Makes sense to me. >>> I agree that the current behavior is suboptimal, but: >>> >>> * Isn't this the same behavior that other datagram socket types >>> exhibit? It seems like this would make TIPC behave inconsistently >>> compared to other transports >> Yes, SCTP. >> Do you see any reliable datagram transports not doing this? >> >>> * Isn't this a compatibility break with existing software? Particularly >>> existing software will not expect to receive trailers of overlong >>> datagrams >> I talked to Jon about this, he seems okay with this. >> >>> It feels like this behavior should be activated either with a >>> setsockopt(2) call or a new MSG_* flag passed to recv >> Anyway, It may not be worth a new sockopt. >> I'm thinking to pass MSG_EOR into sendmsg: >> sendmsg(MSG_EOR). > sorry, I meant recvmsg(); Still not sure I understand what you are suggesting here. Do you mean that if we add MSG_EOR as a flag to recvmsg() that means we *don't* want the remainder of the message, i.e., it is ok to truncate it? Or do you mean the opposite? In the first case, we don't solve any compatibility issue, if that is the purpose. The programmer still has to add code to get the current behavior. In the latter case we would be on the 100% safe side, although I have a real hard time to see that this could be a real issue. Why would anybody deliberately design an application for having messages truncated. ///jon >> to indicate we don't want the truncating msg. >> >> When the msg flag returns with no MSG_EOR, it means there's more data to read. >> >> Thanks. >>> - Erin >>> >>>> Signed-off-by: Xin Long <luc...@gm...> >>>> --- >>>> net/tipc/socket.c | 30 +++++++++++++++++++++--------- >>>> 1 file changed, 21 insertions(+), 9 deletions(-) >>>> >>>> diff --git a/net/tipc/socket.c b/net/tipc/socket.c >>>> index 34a97ea36cc8..504e59838b8b 100644 >>>> --- a/net/tipc/socket.c >>>> +++ b/net/tipc/socket.c >>>> @@ -1880,6 +1880,7 @@ static int tipc_recvmsg(struct socket *sock, struct msghdr *m, >>>> bool connected = !tipc_sk_type_connectionless(sk); >>>> struct tipc_sock *tsk = tipc_sk(sk); >>>> int rc, err, hlen, dlen, copy; >>>> + struct tipc_skb_cb *skb_cb; >>>> struct sk_buff_head xmitq; >>>> struct tipc_msg *hdr; >>>> struct sk_buff *skb; >>>> @@ -1903,6 +1904,7 @@ static int tipc_recvmsg(struct socket *sock, struct msghdr *m, >>>> if (unlikely(rc)) >>>> goto exit; >>>> skb = skb_peek(&sk->sk_receive_queue); >>>> + skb_cb = TIPC_SKB_CB(skb); >>>> hdr = buf_msg(skb); >>>> dlen = msg_data_sz(hdr); >>>> hlen = msg_hdr_sz(hdr); >>>> @@ -1922,18 +1924,27 @@ static int tipc_recvmsg(struct socket *sock, struct msghdr *m, >>>> >>>> /* Capture data if non-error msg, otherwise just set return value */ >>>> if (likely(!err)) { >>>> - copy = min_t(int, dlen, buflen); >>>> - if (unlikely(copy != dlen)) >>>> - m->msg_flags |= MSG_TRUNC; >>>> - rc = skb_copy_datagram_msg(skb, hlen, m, copy); >>>> + int offset = skb_cb->bytes_read; >>>> + >>>> + copy = min_t(int, dlen - offset, buflen); >>>> + rc = skb_copy_datagram_msg(skb, hlen + offset, m, copy); >>>> + if (unlikely(rc)) >>>> + goto exit; >>>> + if (unlikely(offset + copy < dlen)) { >>>> + if (!(flags & MSG_PEEK)) >>>> + skb_cb->bytes_read = offset + copy; >>>> + } else { >>>> + m->msg_flags |= MSG_EOR; >>>> + skb_cb->bytes_read = 0; >>>> + } >>>> } else { >>>> copy = 0; >>>> rc = 0; >>>> - if (err != TIPC_CONN_SHUTDOWN && connected && !m->msg_control) >>>> + if (err != TIPC_CONN_SHUTDOWN && connected && !m->msg_control) { >>>> rc = -ECONNRESET; >>>> + goto exit; >>>> + } >>>> } >>>> - if (unlikely(rc)) >>>> - goto exit; >>>> >>>> /* Mark message as group event if applicable */ >>>> if (unlikely(grp_evt)) { >>>> @@ -1956,9 +1967,10 @@ static int tipc_recvmsg(struct socket *sock, struct msghdr *m, >>>> tipc_node_distr_xmit(sock_net(sk), &xmitq); >>>> } >>>> >>>> - tsk_advance_rx_queue(sk); >>>> + if (!skb_cb->bytes_read) >>>> + tsk_advance_rx_queue(sk); >>>> >>>> - if (likely(!connected)) >>>> + if (likely(!connected) || skb_cb->bytes_read) >>>> goto exit; >>>> >>>> /* Send connection flow control advertisement when applicable */ >>>> -- >>>> 2.27.0 >>>> >>>> >>>> >>>> _______________________________________________ >>>> tipc-discussion mailing list >>>> tip...@li... >>>> https://lists.sourceforge.net/lists/listinfo/tipc-discussion > > _______________________________________________ > tipc-discussion mailing list > tip...@li... > https://lists.sourceforge.net/lists/listinfo/tipc-discussion > |