From: Jon M. <jm...@re...> - 2020-01-21 12:11:32
|
Hi Ying, Do you have any ideas regarding this? My best proposal is to att a time stamp to the new server socket at creation, and then delay a 'close' if the socket is < 5ms old. But I am not quite happy with this solution, so any better idea would be welcome. ///jon > > > ----- Forwarded Message ----- From: Tuong Lien Tong > <tuo...@de...>To: 'Ying Xue' <yin...@wi...>; > "jm...@re..." <jm...@re...>; "ma...@do..." > <ma...@do...>Cc: "tip...@de..." > <tip...@de...>Sent: Monday, January 13, 2020, 04:57:58 AM > GMT-5Subject: FW: [net v2] tipc: fix wrong connect() return code > > Hi Ying, Jon, > > > > The last change has exposed a different issue with our “lightweight” > connection setup mechanism, > > Please see the scenario below: > > > > > > server process client process > > ----------------------- > ------------------------------ > > create()| | > > listen()*tskL | > > bind()| |create() > > +- accept()* tskC* > > | | |connect() ~~~~~~~~~~~~~+ > > wait_for_rcv()| |<---------[SYN]-----------| | > > | | |//tskC: CONNECTING > | | | *wait_for_conn() > > +--------->*tskS | > |->add to wait queue > > |//tskS: ESTABLISHED | | > > |----------[ACK]---------->|//tskC: ESTABLISHED > | ->wakeup(): SLEEPING -> run queue > > send()|----------[DATA]--------->|\ > ->wakeup(): already woken > > send()|----------[DATA]--------->| | > ->wakeup(): already woken > > . . . . |-> receive queue . > > . . . . | . > > send()|----------[DATA]--------->|/ > ->wakeup(): already woken > > close()|----------[FIN]---------->|//tskC: DISCONNECTING | > > | | | > > | ~~~~~~~~~~~~~~~~~~~~~~~~~~>schedule(): > | run queue -> RUNNING > > |->add > |back > |to > |wait > |queue > |(state > |!= > |ESTABLISHED) > > . > > . > > | > > *timed > out > > > > > > When the server side receives ‘SYN’ from client, the ‘accept()’ call ends up > with the new socket in the ‘ESTABLISHED’ state. Shortly thereafter, the > server starts sending a number of messages before calling the socket > ‘close()’ without waiting for any response from the client! > > In this scenario, it may happen on the other side that the client process > waiting for the connection to be ‘ESTABLISHED’ cannot be awakened in time. > Although the ‘wakeup()’ call is still done when the ‘ACK’ from the server > arrives, but it just moves the process into the run queue (i.e. runnable) to > be scheduled to run later. The kernel (i.e. RX process) is not preemptable > when it processes the messages from the server, including the ‘ACK ‘, > ‘DATA’, … and finally the ‘FIN’ which will terminate the connection. When > the wait process is really run back, the client socket state is now > ‘DISCONNECTING’ (i.e. != ‘ESTABLISHED’), so it continues to wait until the > timer expires. The client will finally get ‘-ETIMEDOUT’ and forced to > release the socket whereas there remains the messages from the server in the > socket’s receive queue. > > > > Notes: > > - The issue is found by the tipcutils/ptts tool. > - This is a timing issue, sometimes it happens sometimes not, depending on > the small enough number of the data messages (e.g. < 100) from the server > before it sends the ‘FIN’, as well as the network traffic, RX process, > scheduler, etc. on the client side. > - Without the previous change, the issue is _not detected_ since the > ‘connect()’ call will return 0 when the wait process is woken & running > back but the socket state is ‘DISCONNECTING’, there is no socket error > (sk->sk_err == 0) in this case… However, this is the original issue and > unexpected because if the client tries to send something after that, it > will get ‘-EPIPE’ immediately. > > > > We need to find a solution to this problem, > > - It seems to me that in this mechanism the server is “too hasty” to set > the new socket to ‘ESTABLISHED’ at the ‘accept()’ when it has just > received a ‘SYN’ from client. That allows it to go ahead with sending the > messages & closing the socket in one step… Should we consider to include > something like the TCP ‘3-way handshake’ here, so the server needs to get > a ‘ACK’ from the client before the connection is fully established? I > guess ‘no’ since it should be kept lightweight…? > - An alternative is we can introduce a bit delay e.g. 2 or 5ms to the > ‘accept()’ or ‘close()’ before completing it, which will give a chance > for its client to be really established first? This solution looks simple > and should have less impact to user space… > - Or …? > > > > What is your opinion, please share? > > Many thanks! > > > > BR/Tuong > > > > -----Original Message----- > From: Xue, Ying <Yin...@wi...> > Sent: Thursday, January 2, 2020 4:10 PM > To: Tuong Lien <tuo...@de...>; > tip...@li...; jm...@re...; ma...@do... > Subject: RE: [net v2] tipc: fix wrong connect() return code > > > > Great! > > > > Acked-by: Ying Xue <yin...@wi...> > > > > -----Original Message----- > > From: Tuong Lien [mailto:tuo...@de...] > > Sent: Monday, December 30, 2019 4:56 PM > > To: tip...@li...; jm...@re...; > ma...@do...; Xue, Ying > > Subject: [net v2] tipc: fix wrong connect() return code > > > > The current 'tipc_wait_for_connect()' function makes a loop and waits > > for the condition 'sk->sk_state != TIPC_CONNECTING' to conclude if the > > connecting has done. However, when the condition is met, it always > > returns '0' even in the case the connecting was actually failed (e.g. > > refused because the server socket has closed...) and the socket state > > was set to 'TIPC_DISCONNECTING'. > > This results in a wrong return code for the 'connect()' call from user, > > making it believe that the connection is established and goes ahead > > with more actions e.g. building & sending a message but then finally > > gets an unexpected result (e.g. '-EPIPE'). > > > > This commit fixes the issue by instead setting the wait condition to > > 'tipc_sk_connected(sk)', so that the function will return '0' only when > > the connection is really established. Otherwise, either the socket > > error code if any or '-ETIMEDOUT'/'-EINTR' will be returned > > correspondingly. > > > > --------- > > v2: changed after discussing with Ying > > --------- > > > > Signed-off-by: Tuong Lien <tuo...@de...> > > --- > > net/tipc/socket.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/net/tipc/socket.c b/net/tipc/socket.c > > index 6552f986774c..2f5679f84060 100644 > > --- a/net/tipc/socket.c > > +++ b/net/tipc/socket.c > > @@ -2432,8 +2432,8 @@ static int tipc_wait_for_connect(struct socket *sock, > long *timeo_p) > > return > sock_intr_errno(*timeo_p); > > > > add_wait_queue(sk_sleep(sk), &wait); > > - done = sk_wait_event(sk, timeo_p, > > - > sk->sk_state > != TIPC_CONNECTING, &wait); > > + done = sk_wait_event(sk, timeo_p, > tipc_sk_connected(sk), > > + &wait); > > remove_wait_queue(sk_sleep(sk), &wait); > > } while (!done); > > return 0; > > -- > > 2.13.7 > > > |