I have a simple UDP packet reception server written with the CommonC++ libraries that binds a socket, listens, and displays whatever is sent to that socket. I am using UDPSocket. My primary machine is SuSE Linux 6.3 with kernel 2.2.13; the machines with the problem are SuSE 7.1 (kernel 2.4.0-4GB) and Debian (kernel 2.4.4).
The 2.2.x kernel box works fine. However, when I try to compile/execute the code on the 2.4.x kernel boxes I get this behavior:
UDPSocket is created and binds correctly (I have it nested in a try block to make certain I catch any exceptions). Then, when I move on to UDPSocket.isPending( SOCKET_PENDING_INPUT, 5000 );, it simply returns true even though no packet has been received by the socket. The same behavior occurs for SOCKET_PENDING_ERROR, etc. It just returns true instantly without waiting.
SuSE 6.3 has gcc 2.95.3; the SuSE 7.1 has gcc 2.95.2; and the Debian has gcc 2.95.2.
I didn't notice that before; however, I doubt that the difference between gcc 2.95.2 and gcc 2.95.3 is causing this behavior.
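One way to take the library out of the picture would be to poll a freshly bound, unconnected UDP socket with plain POSIX calls and look at the raw revents. The following is only a sketch: pollFreshUdpSocket is a made-up helper name, not anything from CommonC++, and it binds to INADDR_ANY on a kernel-chosen port purely for illustration.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstring>

// Hypothetical test helper (not part of CommonC++): bind an unconnected UDP
// socket to an ephemeral port, poll it for input, and return the raw revents
// value, or -1 if any system call fails.
int pollFreshUdpSocket(int timeoutMs)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(0);            // let the kernel pick a free port

    if (bind(fd, reinterpret_cast<struct sockaddr *>(&addr), sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }

    struct pollfd pfd;
    pfd.fd = fd;
    pfd.events = POLLIN;                 // ask only for "data to read"
    pfd.revents = 0;

    int rc = poll(&pfd, 1, timeoutMs);   // nobody sends anything to this port
    close(fd);

    if (rc < 0)
        return -1;
    return pfd.revents;                  // 0 means poll() simply timed out
}
```

If this returns 0 after roughly timeoutMs milliseconds, poll() waited as expected. A non-zero value that comes back at once would suggest that the behavior comes from poll() itself rather than from UDPSocket or the compiler.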
If anyone can help me I'd greatly appreciate it.
Amerist.
[Q: Is there a better place for me to post this sort of thing?]
Well, I spent a little bit of time running gdb on my code trying to figure out what was going on... I'm not experienced with debugging this way but I think that I did find something.
When the isPending(SOCKET_PENDING_INPUT...) call entered the isPending function, pfd.revents became set to 17 (I believe that's POLLIN (1) and POLLHUP (16)) ... if I'm right about what 16 is, I'm wondering how the socket can be "hung up" and have data ready to read at the same time, especially when no data had been sent to the socket yet!
To make certain that I knew what was going on, I edited socket.cpp to add a debug print of these values each time isPending was called; same results.
Another thing that I don't understand is that when SOCKET_PENDING_ERROR is used, revents contains only 16 (even when called immediately after the SOCKET_PENDING_INPUT call that returned 17).
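For reading values like 17 and 16 without looking up the bit masks by hand, a small decoder along these lines might help. It is only an illustrative sketch: describeRevents is a made-up name, not something from socket.cpp, and it uses the symbolic constants from <poll.h> rather than hard-coded numbers. On Linux, POLLIN is 1 and POLLHUP is 16, which matches the reading of 17 above.

```cpp
#include <poll.h>
#include <string>

// Hypothetical debugging helper: turn a poll() revents bitmask into a
// readable list of flag names, e.g. "POLLIN|POLLHUP".
std::string describeRevents(short revents)
{
    struct Flag { short bit; const char *name; };
    static const Flag flags[] = {
        { POLLIN,   "POLLIN"   },
        { POLLPRI,  "POLLPRI"  },
        { POLLOUT,  "POLLOUT"  },
        { POLLERR,  "POLLERR"  },
        { POLLHUP,  "POLLHUP"  },
        { POLLNVAL, "POLLNVAL" },
    };

    std::string out;
    for (const Flag &f : flags) {
        if (revents & f.bit) {
            if (!out.empty())
                out += "|";
            out += f.name;
        }
    }
    return out.empty() ? "0" : out;      // "0" when no flag is set
}
```

Printing describeRevents(pfd.revents) from the debug line in isPending would show the flag names directly instead of the raw number.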
I'm at a loss going much further with this examination, I'm afraid.
Amerist.
It's me again.
I've been scanning the Internet (wonderful thing, this organism)--Linux Google mostly--for an explanation of what is going on, and I've found several postings from people who have noticed the same behavior. Apparently poll() was changed in the 2.4.x kernels to return HUP if the socket is not connected to anything! That would always be the case for a UDP socket, because they're not usually used to form a direct connection to another socket.
Here's the first article that I found in my search explaining the situation:
http://www.uwsg.indiana.edu/hypermail/linux/kernel/0103.1/0711.html
Just follow the thread to see what's what.
My apologies if someone has already found this one and is correcting it in the UDPSocket or something. I don't know if I have the time to see if I can produce a patch, but I'll try just in case it'd be helpful to the project.
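In the meantime, one possible stopgap in application code is to stop trusting revents on its own and confirm with a non-blocking peek that a datagram is really queued. This is a sketch only, not a patch to UDPSocket: datagramReady is a hypothetical helper that works on the raw file descriptor, and the 10 ms back-off is an arbitrary choice to keep a spurious wakeup from spinning the CPU.

```cpp
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>
#include <chrono>

// Hypothetical workaround helper (not a CommonC++ API): return true only if
// a datagram is really queued on fd before timeoutMs expires.
bool datagramReady(int fd, int timeoutMs)
{
    using Clock = std::chrono::steady_clock;
    const Clock::time_point deadline =
        Clock::now() + std::chrono::milliseconds(timeoutMs);

    for (;;) {
        long remaining = static_cast<long>(
            std::chrono::duration_cast<std::chrono::milliseconds>(
                deadline - Clock::now()).count());
        if (remaining < 0)
            remaining = 0;

        struct pollfd pfd;
        pfd.fd = fd;
        pfd.events = POLLIN;
        pfd.revents = 0;

        int rc = poll(&pfd, 1, static_cast<int>(remaining));
        if (rc < 0)
            return false;                // poll() itself failed

        if (rc > 0 && (pfd.revents & POLLIN)) {
            // Peek without blocking and without consuming the datagram.
            char probe;
            ssize_t n = recv(fd, &probe, 1, MSG_PEEK | MSG_DONTWAIT);
            if (n >= 0)
                return true;             // something (maybe empty) is queued
            // n < 0 (e.g. EAGAIN): the wakeup was spurious, keep waiting
        }

        if (Clock::now() >= deadline)
            return false;                // timed out with nothing to read

        usleep(10 * 1000);               // back off briefly before retrying
    }
}
```

Because of the back-off, the call can overshoot the requested timeout by up to about 10 ms. A bare POLLHUP with no POLLIN is treated here as "nothing to read yet" rather than as an error.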
This is interesting and problematic...it is also not the only odd 2.4 behavior compared to 2.2.