Re: [Linuxptp-users] One-step sync and P2P mode issue
PTP IEEE 1588 stack for Linux
Brought to you by:
rcochran
From: Keller, J. E <jac...@in...> - 2014-08-12 20:05:40
|
On Tue, 2014-08-12 at 06:46 +0000, Sriharsha Basavapatna wrote: > Hi folks, > > > > I'm running into this problem in P2P mode with our PTP capable NIC. > I'd > > appreciate if you could take a look at the details below and confirm > if this is > > an issue in ptp4l. The linuxptp version is: 1.4-00060-g93b7807. > > > > After starting ptp4l, it fails to send sync after a few iterations > right at > > the beginning. The error is "No message of desired type" (ENOMSG - > 42). After > > a timeout of 16 seconds (fault clear timeout) it restarts and > encounters > > the same error and this repeats. > > > > Here's the ptp4l command/args used: > > ptp4l -f ptp2.cfg -HPm2 -l7 -i ptp2 > > > > Note that I've configured one-step sync that seems to be triggering > this error > > condition. > > > > ptp4l[352396.491]: selected /dev/ptp4 as PTP clock > > ptp4l[352396.493]: PI servo: sync interval 1.000 kp 0.700 ki 0.300000 > > ptp4l[352396.514]: port 1: INITIALIZING to LISTENING on INITIALIZE > > ptp4l[352397.514]: port 1: delay timeout > > ptp4l[352397.514]: port 1: sending pdelay req seq(0) > > > > ptp4l[352398.514]: port 1: delay timeout > > ptp4l[352398.514]: port 1: sending pdelay req seq(1) > > > > ptp4l[352398.544]: port 1: setting asCapable > > ptp4l[352399.514]: port 1: delay timeout > > ptp4l[352399.514]: port 1: sending pdelay req seq(2) > > > > ptp4l[352400.514]: port 1: delay timeout > > ptp4l[352400.514]: port 1: sending pdelay req seq(3) > > > > ptp4l[352401.514]: port 1: delay timeout > > ptp4l[352401.514]: port 1: sending pdelay req seq(4) > > > > ptp4l[352402.514]: port 1: delay timeout > > ptp4l[352402.515]: port 1: sending pdelay req seq(5) > > > > ptp4l[352403.160]: port 1: announce timeout > > ptp4l[352403.160]: port 1: LISTENING to MASTER on > ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES > > ptp4l[352403.161]: selected best master clock 0090fa.fffe.6c024e > > ptp4l[352403.161]: assuming the grand master role > > ptp4l[352403.162]: port 1: master tx announce timeout > > ptp4l[352403.515]: port 1: delay timeout > > ptp4l[352403.515]: port 1: sending pdelay req seq(6) > > > > ptp4l[352404.161]: port 1: master sync timeout > > ptp4l[352404.161]: p->timestamping (3) > > > > ptp4l[352404.515]: port 1: delay timeout > > ptp4l[352404.515]: port 1: sending pdelay req seq(7) > > > > ptp4l[352405.161]: port 1: master sync timeout > > ptp4l[352405.161]: p->timestamping (3) > > > > ptp4l[352405.162]: port 1: master tx announce timeout > > ptp4l[352405.515]: port 1: delay timeout > > ptp4l[352405.515]: port 1: sending pdelay req seq(8) > > > > ptp4l[352406.161]: port 1: master sync timeout > > ptp4l[352406.161]: send failed: 42 No message of desired type > > ptp4l[352406.161]: port 1: send sync failed > > ptp4l[352406.161]: port 1: MASTER to FAULTY on FAULT_DETECTED > (FT_UNSPECIFIED) > > ptp4l[352406.183]: waiting 2^{4} seconds to clear fault on port 1 > > ptp4l[352422.183]: clearing fault on port 1 > > > > Based on further debugging, the sequence of events leading to the > error is > > shown below: > > > > - Driver/HW receives pdelay request and sync msgs alternately like > this: > > - pdelay-req > > - sync > > - pdelay-req > > - sync <----- hits ENOMSG error > > - Messages are sent and tx timestamps generated for each (all are > event msgs). > > - Driver makes a copy of skb with timestamp info into socket error > queue. > I don't think we should even put a message into the socket error queue in the first place if the transmit type was ONESTEP. That might be a driver bug? The driver should be modifying the PTP packet and inserting the timestamp payload for us, without then further notifying the stack. If we are in ONESTEP mode, and the driver supports it, then the driver should no longer be sending the timestamp clone back up the stack. Thanks, Jake > - The first sync message in the error queue is not consumed by ptp4l. > > - This is because port_tx_sync() calls transport_send() with event set > to > > TRANS_ONESTEP. The transport send routines check the event type to > know > > whether to call sk_receive() after sending the message to get a > copy. They do > > sk_receive() only if event == TRANS_EVENT. > > - In this case since event is TRANS_ONESTEP, there's no call to > sk_receive(). > > - This leaves the sync msg in socket error queue. > > - Next pdelay-req is sent; driver clones this skb into error queue. > > - ptp4l invokes sk_receive() to get a copy of this pdelay-req. > > - The kernel function sock_recv_errqueue() or ip_recv_error() is > invoked. > > - There's a check in those functions to see if there's more skbs in > err queue. > > - The kernel function marks the socket in error(sets sk->sk_err) > before return. > > - The next sync message is sent down by ptp4l. > > - Stack calls sock_alloc_send_pskb() to get a skb. > > - There, it checks if the socket is in error and returns failure. > > - The second sync send fails. > > > > Note that the driver decides whether to make a copy into the error > queue based > > on SKBTX_HW_TSTAMP. This gets set for packets that are sent down the > stack on > > the event socket; i.e, the socket on which setsockopt(SO_TIMESTAMPING) > has been > > done. So the driver seems to be correctly handling the packets. > > > > It looks like a change in ptp4l might be needed to address this. The > transport > > send functions (raw_send(), udp_send()...) should be modified to > consume the > > error queue skb also for the case when event is TRANS_ONESTEP. They > might just > > have to throw away the skb if the timestamp is not of interest since > it is > > one-step. > > > > Thanks, > > -Harsha > > > ------------------------------------------------------------------------------ > _______________________________________________ > Linuxptp-users mailing list > Lin...@li... > https://lists.sourceforge.net/lists/listinfo/linuxptp-users |