I have dug into the debug output from the server code (and added more to see what the select call is doing) and can see no indication of the failing connections being attempted. I have looked at all of the socket handling code in the server and can't see anything obviously wrong but I am not a socket expert.
My test simply involves running the miniweb server to serve the contents of this zip (just a web page using 6 images):
...and then browsing to http://localhost:8000/index.htm and refreshing it until it fails (one or more images will fail to load and the browser will just sit waiting). I have tried it on quite a few machines and some fail quite often while others hardly ever fail (my i7-3770 Win8-64 machine fails about 20% of the time when using the IE control but only about 5% of the time with a standalone browser, whereas an old T2600 Core Duo laptop with Win7-32 fails much less often).
Does anyone have any ideas of what might be causing this? This looks to me like a race condition in the socket handling somewhere. I find it hard to believe this sort of problem would have existed in Windows going back several versions but I can't see anything wrong with the Miniweb socket code. The listening socket should be accept'ed when select indicates and Winsock is supposed to queue up other connections to the socket (to handle multiple simultaneous connections while the server code is processing the results of the previous select). Is there an easy way to query the number of queued connections on the listen socket?
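For anyone who wants to check the "select says readable, then accept" behaviour in isolation, here is a minimal sketch that drains every connection queued on a non-blocking listening socket and counts them. It is written against POSIX sockets so it can be built outside Windows; on Winsock I would expect to swap in ioctlsocket(FIONBIO), closesocket and a WSAGetLastError() == WSAEWOULDBLOCK check. The demo_* names are made up for this example and are not part of miniweb.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a non-blocking listener on 127.0.0.1 using an OS-chosen port. */
int demo_listen(unsigned short *port_out, int backlog) {
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, backlog) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0 ||
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }
    *port_out = ntohs(addr.sin_port);
    return fd;
}

/* Open a blocking client connection to the listener on the given port. */
int demo_connect(unsigned short port) {
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Wait up to timeout_ms for the listener to become readable, then accept
 * (and immediately close) every queued connection.
 * Returns the number accepted, 0 on timeout, -1 on error. */
int demo_accept_pending(int listen_fd, int timeout_ms) {
    fd_set readable;
    struct timeval tv;
    int accepted = 0;
    int ready;

    FD_ZERO(&readable);
    FD_SET(listen_fd, &readable);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    ready = select(listen_fd + 1, &readable, NULL, NULL, &tv);
    if (ready <= 0) return ready;

    for (;;) {
        int conn = accept(listen_fd, NULL, NULL);
        if (conn < 0) {
            if (errno == EAGAIN || errno == EWOULDBLOCK) break; /* queue empty */
            if (errno == EINTR) continue;
            return -1;
        }
        close(conn);
        accepted++;
    }
    return accepted;
}
```

Opening a few clients with demo_connect before calling demo_accept_pending shows whether the stack really queued them all. If a server only accepts one connection per select pass, the rest should still be waiting on the next pass rather than being lost.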
This issue appears to be something to do with keep-alive. After investigating with a TCP dump program, the browser appears to be trying to reuse the first connection and miniweb has either closed it (I presume not or the browser would get an error and would retry) or is no longer monitoring it (e.g. if it stops including it in the select call then it won't work) despite having returned the keep-alive header.
Simply adding a keepalive=FALSE; to the _mwBuildHttpHeader function stops the problem from occurring.
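To show what that workaround changes on the wire, here is a rough sketch of a response header builder. It is not miniweb's _mwBuildHttpHeader; demo_build_header and its parameters are invented for illustration. With keepalive forced to 0 the response carries "Connection: close", so the browser knows not to reuse the connection and opens a new one for the next image.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical header builder, NOT miniweb's real function.
 * Writes a minimal response header into buf and returns the value
 * snprintf returns (the length that was needed). */
int demo_build_header(char *buf, size_t size, long content_length, int keepalive) {
    return snprintf(buf, size,
                    "HTTP/1.1 200 OK\r\n"
                    "Content-Length: %ld\r\n"
                    "Connection: %s\r\n"
                    "\r\n",
                    content_length,
                    keepalive ? "keep-alive" : "close");
}
```

The cost of the workaround is one TCP connection per request, which is fine for a page with 6 images but is only hiding the underlying bug.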
I will have a look to see if I can track down why it is happening so the code can be fixed in the official source as I don't really want to have to ship a modified version with the LGPL implications that would have...
It appears that when _mwSendFileChunk has finished sending the file, it closes the file descriptor and then returns 1 but the calling code in mwHttpLoop sees the non-zero return value and sets FLAG_CONN_CLOSE before calling _mwCloseSocket. This causes the socket to be closed rather than reset into the reading state. Fixing this properly without introducing other odd behaviour is going to require more detailed understanding of the code than I have at present...
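Here is a toy model of what I think is going on, plus one possible way the loop could treat "file finished" differently. All the types and demo_* functions are hypothetical stand-ins, not miniweb's code, and the second variant is only a sketch of the idea rather than a tested fix.

```c
/* Hypothetical connection states; not miniweb's actual types. */
typedef enum { DEMO_READING, DEMO_SENDING, DEMO_CLOSED } DemoState;

typedef struct {
    DemoState state;
    int keep_alive;   /* non-zero if the response promised keep-alive */
    long bytes_left;  /* bytes of the file still to send */
} DemoConn;

/* Send one chunk. Returns 1 once the file is fully sent, 0 otherwise. */
int demo_send_chunk(DemoConn *c, long chunk_size) {
    if (c->bytes_left > chunk_size) {
        c->bytes_left -= chunk_size;
        return 0;
    }
    c->bytes_left = 0;
    return 1;
}

/* Behaviour as I read it: any non-zero result closes the socket,
 * even when keep-alive was promised. */
void demo_loop_step_observed(DemoConn *c, long chunk_size) {
    if (demo_send_chunk(c, chunk_size) != 0)
        c->state = DEMO_CLOSED;
}

/* One possible alternative: on completion, go back to reading the next
 * request if keep-alive was promised, and close only otherwise. */
void demo_loop_step_keepalive(DemoConn *c, long chunk_size) {
    if (demo_send_chunk(c, chunk_size) != 0)
        c->state = c->keep_alive ? DEMO_READING : DEMO_CLOSED;
}
```

The real code would also need to tell "finished normally" apart from "send failed", which is part of why I don't want to change it blindly.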
I am also a little concerned that a couple of bits of code in mwHttpLoop set FLAG_CONN_CLOSE by doing an explicit phsSocketCur->flags=FLAG_CONN_CLOSE; rather than using the SETFLAG macro. It looks like this will cause a memory leak if FLAG_TO_FREE was set (unless that never happens in these two cases).
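A small illustration of why the plain assignment worries me. The flag values and the SETFLAG definition below are assumptions made for the example (miniweb's real values and macro may differ). The point is only that = overwrites every other bit while |= preserves them.

```c
/* Assumed values for illustration; miniweb's real definitions may differ. */
#define FLAG_TO_FREE    0x01u
#define FLAG_CONN_CLOSE 0x02u

/* Assumed shape of the macro: OR the bit into the existing flags. */
#define SETFLAG(obj, bit) ((obj)->flags |= (bit))

typedef struct {
    unsigned int flags;
} DemoSocket;

/* Mimics phsSocketCur->flags=FLAG_CONN_CLOSE; and returns the result. */
unsigned int demo_close_by_assignment(unsigned int initial_flags) {
    DemoSocket s = { initial_flags };
    s.flags = FLAG_CONN_CLOSE;      /* every other bit is discarded */
    return s.flags;
}

/* Mimics SETFLAG(phsSocketCur, FLAG_CONN_CLOSE) and returns the result. */
unsigned int demo_close_by_setflag(unsigned int initial_flags) {
    DemoSocket s = { initial_flags };
    SETFLAG(&s, FLAG_CONN_CLOSE);   /* existing bits are kept */
    return s.flags;
}
```

Starting from FLAG_TO_FREE, the assignment version ends up with only FLAG_CONN_CLOSE set, so any cleanup that checks FLAG_TO_FREE later would skip the free. The SETFLAG version keeps both bits.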