Hello,
We are trying to send commands to WinLIRC from a .NET application and we are experiencing a timing issue. If we connect to the server through a socket and then immediately send a command to it, the server does not process the request. Only if a (very short) delay is injected between the connection and the write does the command successfully get processed. I have verified using Wireshark that even when the command doesn't get processed, the socket still ACKs the write. This is what led me to believe there is a bug in the server logic.
Looking in the ThreadProc method of server.cpp, I see the following code (with edits and comments for clarity):
else if(res==(WAIT_OBJECT_0+1)) //The serverEvent is in the signaled state due to an incoming connection
{
//
//Logic to set i to the index of the next available socket in m_clients
//
m_clients[i] = accept(m_server,NULL,NULL); //Permit the incoming connection on the socket
if(m_clients[i]==INVALID_SOCKET)
{
WL_DEBUG("accept() failed\n");
continue;
}
WSAEventSelect(m_clients[i],clientEvent[i],FD_CLOSE|FD_READ); //Signal clientEvent[i] on close and read events for m_clients[i]
clientEvent[i].ResetEvent(); // ****This may cause an issue****
clientData[i][0]='\0';
WL_DEBUG("Client connection %d accepted\n",i);
}
The starred comment highlights the potential issue. Based on my understanding (please correct me if I am wrong), if the network event for m_clients[i] has already been triggered by the time WSAEventSelect is called, that function will immediately signal the provided event (in this case clientEvent[i]). However, because the following line manually calls ResetEvent on that event, it will no longer be in the signaled state by the time WaitForMultipleObjects is called on the next iteration of the loop, leaving it stuck state where it won't respond to subsequent commands. The only way I have found to get it to respond again is to disconnect, reconnect, and delay before trying to write.
Based on my testing (albeit black box testing), I believe moving the clientEvent[i].ResetEvent() call to (potentially anytime) before WSAEventSelect is called would solve the race condition we are experiencing. What do you think?
Thanks a lot
I'll see if i can debug this. It's been quite sometime since I looked at this code. But patches are welcome.
Okay I just had a look at this .. CEvent in MFC has a default constructor which has BOOL bManualReset = FALSE. So I think the manual reset isn't needed at all. Try simply removing clientEvent[i].ResetEvent();
Hmm, I believe the manual reset is still needed. Here's why:
Take a look at Microsoft's documentation for WSAEventSelect:
https://docs.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-wsaeventselect#return-value
It states:
"Having successfully recorded the occurrence of the network event (by setting the corresponding bit in the internal network event record) and signaled the associated event object, no further actions are taken for that network event until the application makes the function call that implicitly reenables the setting of that network event and signaling of the associated event object."
It then lists the functions that will implicitly "reenable" (reset) an event object after an associated network event. For the FD_READ network event, the list of functions includes the recv function. This is good: whenever ThreadProc responds to a signaled clientEvent[i], it calls recv, so it doesn't have to worry about resetting the clientEvent[i] to respond to future network events. However, for FD_CLOSE, it says that there are no re-enabling functions, to me implying that an event object associated with FD_CLOSE will indefinitely remain signaled after the event has occurred (and even after closesocket is called on the socket). This is initially not an issue. However, because ThreadProc reuses a given CEvent in clientEvent for subsequent socket connections, unless ThreadProc manually resets it after a socket gets closed, it will already be in the signaled state for a new socket even before accept is called. Therefore, I think the manual reset is still necessary; I just believe it should occur before WSAEventSelect to handle rapid FD_READ events immediately after accept. But let me know your thoughts on this. I really appreciate you looking into it.
On a side-note, you mentioned I should try changing the code myself. However, unfortunately my machine isn't in a state to do local builds for a program like WinLIRC, and even if it was I wouldn't trust myself to build it correctly. Is there a way that I could commit a change or even if you could make the edit a kick a build for me? I've never done any development on SourceForge before.
Thanks again!
The WaitForMultipleObjects method will automatically reset the handles for events which have an autoreset state.
Ah, I didn't realize that. Thanks for pointing that out.
Actually, looking at the documentation for WaitForMultipleObjects, something else concerns me about that function:
"The function modifies the state of some types of synchronization objects. Modification occurs only for the object or objects whose signaled state caused the function to return."
Does that mean if 2 sockets got FD_READ events at the exact same time that both of them will automatically get reset but the return value will only indicate 1 of them? Or am I reading that wrong?
Only the event which caused WaitForMultipleObjects to return is reset. Unless waitforall is set, then they all are.
Okay, perfect. That's great to know. I have one last scenario to run by you if you don't mind.
Say an FD_READ occurs on a socket. This will cause WaitForMultipleObjects to unblock after automatically resetting the event object, and ThreadProc will start processing the event. Then an FD_CLOSE event occurs concurrently, re-signalling the event object. But let's say that while processing the FD_READ, the recv call returns SOCKET_ERROR, causing this logic to run:
This sets m_clients[i] to INVALID_SOCKET, which means on the next iteration, the event object won't be included in the events array passed to WaitForMultipleObjects:
Would that result in the event object remaining set after the socket is closed or is this scenario not possible?
Last edit: wizardbrony 2020-06-22