From: Samuel O. <sam...@so...> - 2013-09-03 09:14:58
Hi everyone,

As part of a recent project, we used libtirpc to implement a multi-threaded RPC server. For this project we wrote custom versions of some of libtirpc's entry functions (svc_run(), svc_getreq_set(), etc.). The solution was based in part on an earlier post by Ian Kent on this mailing list. We've ported this solution to the latest version of libtirpc and created a patch, but it should be noted that there are some issues with it. We figured we'd submit it here anyway, just in case.

The issues we've had to deal with are:

1) EBADF in main loop

select() returns with a read event on a client socket. It turns out the read event was the client closing its connection. A child thread deals with the cleanup. Before the child thread has closed the client connection, the main thread makes it back into the next iteration of select(), so the client fd is still in the fd_set. When the child thread then closes the fd, select() returns -1 and sets errno to EBADF in the main thread. Perhaps not a huge issue, but it's something that needs to be dealt with in the main loop.

2) Busy-wait select()

Because the implementation spawns a child thread for each read event returned by select(), a kind of race condition exists where select() keeps returning immediately until the child thread has pulled the read-event data off the socket. Another way to solve this might have been to FD_CLR the fd from svc_fdset and hand it off to a child thread permanently, though this would effectively disable __svc_clean_idle(), force the child thread to handle socket timeout issues, and basically hog a thread slot for as long as the client stays connected. We've not done anything about this busy-wait issue, as it does not appear to affect much in testing.

3) Max threads

If running with a limit on the maximum number of threads, the current solution can at times have a few more threads running than the limit allows. The extra threads should already have done all of their processing and be just about to terminate, though. We initially placed the sem_post() call later in svc_getreq_thread(), but this caused deadlocks.

That should cover it. We understand if these issues are considered too severe to include our patch in libtirpc, but figured that perhaps the patch can serve as a starting point to be improved upon.

Thanks,
Samuel & Mattias
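
For issue 1, one way a main loop could recover from EBADF is sketched below. This is only an illustration under assumptions, not the code from the patch: prune_closed_fds() and select_skip_closed() are hypothetical helpers, and the sketch works on a plain caller-owned fd_set rather than on libtirpc's svc_fdset, which is normally updated through the transport register/unregister path. The idea is simply to treat EBADF as "a descriptor was closed behind our back", drop the stale descriptor, and retry.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: remove every descriptor from `set` that is no
 * longer open. fcntl(fd, F_GETFD) fails with EBADF for a closed fd.
 * Returns the number of descriptors removed. */
static int prune_closed_fds(fd_set *set, int maxfd)
{
    int removed = 0;

    for (int fd = 0; fd <= maxfd; fd++) {
        if (FD_ISSET(fd, set) && fcntl(fd, F_GETFD) == -1 && errno == EBADF) {
            FD_CLR(fd, set);
            removed++;
        }
    }
    return removed;
}

/* Hypothetical wrapper: select() for readability on a copy of `master`.
 * On EBADF, prune stale descriptors from `master` and retry; on EINTR,
 * just retry. Returns the select() result, or -1 on any other error. */
static int select_skip_closed(fd_set *master, int maxfd, fd_set *ready,
                              const struct timeval *timeout)
{
    for (;;) {
        struct timeval tv;
        struct timeval *tvp = NULL;

        /* select() may modify the timeout, so work on a copy. */
        if (timeout != NULL) {
            tv = *timeout;
            tvp = &tv;
        }

        *ready = *master;
        int n = select(maxfd + 1, ready, NULL, NULL, tvp);
        if (n >= 0)
            return n;
        if (errno == EINTR)
            continue;
        /* Retry only if something was actually pruned, to avoid spinning. */
        if (errno == EBADF && prune_closed_fds(master, maxfd) > 0)
            continue;
        return -1;
    }
}
```

In a real multi-threaded server the descriptor set would also need to be protected against concurrent modification by the child threads, which this sketch does not attempt.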