Re: [Linuxptp-users] Fw: “Resource temporarily unavailable” errors during flood ping test
PTP IEEE 1588 stack for Linux
From: Jacob K. <jac...@in...> - 2012-10-30 16:54:44
On 10/30/2012 08:47 AM, Mario Molitor wrote:
>
> Hello Richard,
>
> I observed the following error messages, along with a new GM clock search, during my flood ping test:
>
> ptp4l[23768.970]: recvmsg tx timestamp failed: Resource temporarily unavailable
> ptp4l[23768.975]: port 1: send delay request failed
> ptp4l[23768.975]: port 1: SLAVE to FAULTY on FAULT_DETECTED
> ptp4l[23784.059]: port 1: FAULTY to LISTENING on FAULT_CLEARED
> ptp4l[23784.414]: port 1: new foreign master 0050c2.fffe.c2dfc3-1
> ptp4l[23788.424]: selected best master clock 0050c2.fffe.c2dfc3
> ptp4l[23788.425]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
> ptp4l[23790.092]: port 1: minimum delay request interval 2^3
> ptp4l[23790.688]: master offset -329 s2 adj -13761 path delay 1626
> ptp4l[23790.708]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
>
> Flood ping test script:
> while true; do sudo ping -f -c 1000 -s $RANDOM <IP of PTP Module> ; done
>
> I instrumented the ptp4l code and could see that part of the problem was incorrect error handling in the function sk_receive(). recvmsg() sometimes returns EAGAIN, and the try_again variable was not incremented.
> After I changed this, the error message and the GM clock search no longer appear during my flood ping test, and everything works very well.
>
> My code changes:
> --- a/sk.c
> +++ b/sk.c
>
>          }
>          if (errno == EINTR) {
>                  try_again++;
> -        } else if (errno == EAGAIN) {
> +        } else if ((errno == EAGAIN ) || (errno == EWOULDBLOCK)) {
>                  usleep(1);
> +                try_again++;
>          } else {
>                  break;
>          }
>
> Do you have an idea why these EAGAIN errors occur? I could not find a reason for non-blocking.
>
> Best regards,
> Mario

That loop is there because of the way hardware timestamps are returned from the network stack to ptp4l. They are looped back on the socket error queue and then picked up by ptp4l. The current design doesn't want to wait indefinitely, because a timestamp can be missed.

try_again isn't incremented on purpose, as that would cause it to loop forever whenever the timestamp never shows up.

You could try increasing the tx_timestamp_retry value in the config file and see if this fixes the issue. I believe we should have a higher default for this field, because the drivers I've tested all have trouble returning the timestamp within the very short time (2 nanoseconds).

If it's a problem with the regular receive, that might be an entirely different issue.

I believe the truly correct answer is to completely re-architect the tx_hwtstamp handling to be asynchronous, so that it just waits until it receives the timestamp for a complete sequence of events. That design is significantly more difficult to write, though.

- Jake
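
For illustration, here is a minimal sketch of the bounded-retry idea described above. It is not the actual sk.c code: recv_fn, fake_queue, fake_recv and receive_bounded are hypothetical names, and the fake queue stands in for recvmsg() on the socket error queue so the behaviour can be checked without hardware. The point it tries to show is that EINTR may refund an attempt safely, while EAGAIN has to consume one, otherwise nothing bounds the loop when the timestamp never arrives. Note also that usleep(1) in the diff only requests a sleep of about one microsecond, so a small retry count leaves the driver very little time.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for recvmsg(fd, &msg, MSG_ERRQUEUE):
 * returns >= 0 on success, or -1 with errno set. */
typedef int (*recv_fn)(void *ctx);

/* Fake error queue: reports EAGAIN until 'ready_after' polls have been made. */
struct fake_queue {
	int polls;        /* calls made so far */
	int ready_after;  /* number of EAGAIN results before the timestamp "arrives" */
};

static int fake_recv(void *ctx)
{
	struct fake_queue *q = ctx;

	if (q->polls++ < q->ready_after) {
		errno = EAGAIN;
		return -1;
	}
	return 0;
}

/*
 * Bounded retry: EINTR does not use up an attempt, EAGAIN does.
 * Returns the last receive result; *attempts is set to the number of calls.
 */
static int receive_bounded(recv_fn fn, void *ctx, int max_tries, int *attempts)
{
	int cnt = -1;
	int try_again;

	assert(max_tries > 0);
	*attempts = 0;

	for (try_again = max_tries; try_again; try_again--) {
		cnt = fn(ctx);
		(*attempts)++;
		if (cnt >= 0)
			break;          /* got the timestamp */
		if (errno == EINTR) {
			try_again++;    /* interrupted: refund this attempt */
		} else if (errno == EAGAIN || errno == EWOULDBLOCK) {
			/* Not ready yet. The diff above calls usleep(1) here.
			 * Adding try_again++ as well would cancel the loop's
			 * decrement and remove the bound entirely. */
			continue;
		} else {
			break;          /* hard error: give up */
		}
	}
	return cnt;
}
```

With a budget of 2 attempts and a timestamp that needs 5 polls, receive_bounded() gives up with EAGAIN after 2 calls, which mirrors the failure in the log. Raising the budget to 10 lets the same fake queue succeed on the 6th call, which is the effect a larger tx_timestamp_retry is meant to have.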