Re: [Linuxptp-users] One-step sync and P2P mode issue
PTP IEEE 1588 stack for Linux
Brought to you by:
rcochran
From: Sriharsha B. <Sri...@Em...> - 2014-08-14 06:08:03
|
May be I didn't explain it clearly earlier. I understand one-step mode is where HW support is needed and our hardware does insert the timestamp. But because the protocol stack sets the SKBTX_HW_TSTAMP bit in the skb, the driver (which is supposed to return the tx timestamp back up the stack based on this flag) can't distinguish this as a special case for which though the flag is set, it should *not* put a copy of the pkt back on the error queue. Here's the sequence: 1. ptp4l sends down 1-Step sync packet on the event socket (i.e, one for which SO_TIMESTAMPING socket option is set). 2. Network stack sets the SKBTX_HW_TSTAMP in skb since socket is marked for SO_TIMESTAMPING. 3. Stack sends down the skb to driver with SKBTX_HW_TSTAMP set. 4. Driver sends down the packet to HW. 5. HW inserts timestamp (yes, it is HW that's inserting, not driver !!). 6. HW transmits the packet and generates a completion to driver. 7. The driver looks up the corresponding skb and since SKBTX_HW_TSTAMP is set, makes a copy of skb into error queue. It is the SKBTX_HW_TSTAMP flag that tells the driver whether it should or shouldn't pass back a copy of pkt into error queue. In this case, the HW is working correctly, both inserting the timestamp and providing that inserted timestamp back to driver. But the driver's decision to pass it back up, is based on the SKBTX_HW_TSTAMP flag which is driven by ptp4l and the stack. The reason this flag is set is because ptp4l sent down the skb on a socket that is marked for HW timestamp insertion/generation. We could change the driver to make an exception and ignore the flag SKBTX_HW_TSTAMP and not put a skb back into the error queue, if it's one-step sync packet that got transmitted. But that doesn't seem like the right approach as the driver ends up not honoring what the stack is asking it via the flag. The driver essentially violates the driver "interface" with the stack. This is why I think that the right way to fix this would be: 1. SKBTX_HW_TSTAMP should not be set on the skb if a copy of skb with tx-timestamp is not needed by the consumer. HW would still insert the timestamp in the packet while transmitting. But the driver won't copy into error queue since SKBTX_HW_TSTAMP is not set in skb. OR 2. If the consumer sets the flag SKBTX_HW_TSTAMP, then the consumer should be prepared to get a copy of the packet and read it. I feel that approach 2 is better since SKBTX_HW_TSTAMP implies both inserting a timestamp and getting back a copy with the timestamp in it. But the setting of the flag on the skb/pkt is driven by ptp4l. Once that's set, driver is supposed to take expected action. Like I already said in my previous email this is also as per the kernel documentation. Let me know if you still think it's something the driver should workaround. Thanks, -Harsha -----Original Message----- From: Keller, Jacob E [mailto:jac...@in...] Sent: Wednesday, August 13, 2014 11:27 PM To: Sriharsha Basavapatna Cc: lin...@li... Subject: Re: [Linuxptp-users] One-step sync and P2P mode issue On Wed, 2014-08-13 at 04:21 +0000, Sriharsha Basavapatna wrote: > > -----Original Message----- > From: Keller, Jacob E [mailto:jac...@in...] > Sent: Wednesday, August 13, 2014 1:35 AM > To: lin...@li... > Subject: Re: [Linuxptp-users] One-step sync and P2P mode issue > > On Tue, 2014-08-12 at 06:46 +0000, Sriharsha Basavapatna wrote: > > Hi folks, > > > > > > > > I'm running into this problem in P2P mode with our PTP capable NIC. > > I'd > > > > appreciate if you could take a look at the details below and confirm > > if this is > > > > an issue in ptp4l. The linuxptp version is: 1.4-00060-g93b7807. > > > > > > > > After starting ptp4l, it fails to send sync after a few iterations > > right at > > > > the beginning. The error is "No message of desired type" (ENOMSG - > > 42). After > > > > a timeout of 16 seconds (fault clear timeout) it restarts and > > encounters > > > > the same error and this repeats. > > > > > > > > Here's the ptp4l command/args used: > > > > ptp4l -f ptp2.cfg -HPm2 -l7 -i ptp2 > > > > > > > > Note that I've configured one-step sync that seems to be triggering > > this error > > > > condition. > > > > > > > > ptp4l[352396.491]: selected /dev/ptp4 as PTP clock > > > > ptp4l[352396.493]: PI servo: sync interval 1.000 kp 0.700 ki > > 0.300000 > > > > ptp4l[352396.514]: port 1: INITIALIZING to LISTENING on INITIALIZE > > > > ptp4l[352397.514]: port 1: delay timeout > > > > ptp4l[352397.514]: port 1: sending pdelay req seq(0) > > > > > > > > ptp4l[352398.514]: port 1: delay timeout > > > > ptp4l[352398.514]: port 1: sending pdelay req seq(1) > > > > > > > > ptp4l[352398.544]: port 1: setting asCapable > > > > ptp4l[352399.514]: port 1: delay timeout > > > > ptp4l[352399.514]: port 1: sending pdelay req seq(2) > > > > > > > > ptp4l[352400.514]: port 1: delay timeout > > > > ptp4l[352400.514]: port 1: sending pdelay req seq(3) > > > > > > > > ptp4l[352401.514]: port 1: delay timeout > > > > ptp4l[352401.514]: port 1: sending pdelay req seq(4) > > > > > > > > ptp4l[352402.514]: port 1: delay timeout > > > > ptp4l[352402.515]: port 1: sending pdelay req seq(5) > > > > > > > > ptp4l[352403.160]: port 1: announce timeout > > > > ptp4l[352403.160]: port 1: LISTENING to MASTER on > > ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES > > > > ptp4l[352403.161]: selected best master clock 0090fa.fffe.6c024e > > > > ptp4l[352403.161]: assuming the grand master role > > > > ptp4l[352403.162]: port 1: master tx announce timeout > > > > ptp4l[352403.515]: port 1: delay timeout > > > > ptp4l[352403.515]: port 1: sending pdelay req seq(6) > > > > > > > > ptp4l[352404.161]: port 1: master sync timeout > > > > ptp4l[352404.161]: p->timestamping (3) > > > > > > > > ptp4l[352404.515]: port 1: delay timeout > > > > ptp4l[352404.515]: port 1: sending pdelay req seq(7) > > > > > > > > ptp4l[352405.161]: port 1: master sync timeout > > > > ptp4l[352405.161]: p->timestamping (3) > > > > > > > > ptp4l[352405.162]: port 1: master tx announce timeout > > > > ptp4l[352405.515]: port 1: delay timeout > > > > ptp4l[352405.515]: port 1: sending pdelay req seq(8) > > > > > > > > ptp4l[352406.161]: port 1: master sync timeout > > > > ptp4l[352406.161]: send failed: 42 No message of desired type > > > > ptp4l[352406.161]: port 1: send sync failed > > > > ptp4l[352406.161]: port 1: MASTER to FAULTY on FAULT_DETECTED > > (FT_UNSPECIFIED) > > > > ptp4l[352406.183]: waiting 2^{4} seconds to clear fault on port 1 > > > > ptp4l[352422.183]: clearing fault on port 1 > > > > > > > > Based on further debugging, the sequence of events leading to the > > error is > > > > shown below: > > > > > > > > - Driver/HW receives pdelay request and sync msgs alternately like > > this: > > > > - pdelay-req > > > > - sync > > > > - pdelay-req > > > > - sync <----- hits ENOMSG error > > > > - Messages are sent and tx timestamps generated for each (all are > > event msgs). > > > > - Driver makes a copy of skb with timestamp info into socket error > > queue. > > > > I don't think we should even put a message into the socket error queue in the first place if the transmit type was ONESTEP. That might be a driver bug? The driver should be modifying the PTP packet and inserting the timestamp payload for us, without then further notifying the stack. > > If we are in ONESTEP mode, and the driver supports it, then the driver should no longer be sending the timestamp clone back up the stack. > > Thanks, > Jake > > > This would mean that the driver needs to inspect packets in the > transmit path to determine if it's 1) a PTP pkt, 2) Sync pkt and 3) > One-step flag is set. I feel this is more work in the driver and also > I don't see this specified as a requirement for driver support, in the > kernel documentation for PTP (Documentation/ > networking/timestamping.txt). The only information that the > kernel/stack provides to the driver is whether timestamp should be > generated for a packet, through SKBTX_HW_TSTAMP in skb_shinfo(skb)->tx_flags. This is the only check that the driver is expected to do. > > I looked at a couple of other drivers and they don't seem to have any > such extra checking to filter out error queue insertion of one-step > sync. The problem might show up with other drivers. So I think it'd be > better to filter it out in ptp4l by just reading back the packet in > error queue and dropping/ignoring it. > > Thanks, > -Harsha That is exactly what it means. You cannot support one-step mode unless hardware supports it. One-step mode is where the hardware inserts the hardware timestamp into the packet just before sending it with as close accuracy as possible. There is nothing to filter, or deal with because they simply do not call skb_tx_tstamp. In the case of one-step there is no returned timestamp. If you use the ONESTEP mode of the hwtstamp ioctl, that is where you inform that all timestamped packets will be onestep mode. Do any of those drivers you looked at actually support Onestep mode? One-step mode is a specialized mode which requires hardware support. It is not better to filter the packet out because it is a bug. The hardware/driver in question needs to be inserting the timestamp directly into the payload of the Sync packet it is timestamping. (else it's not "one step" mode). If ptp4l silently ignored this packet it would make it harder to find the bug. In addition, if the hardware isn't inserting the timestamp directly into the packet and instead is passing it up the stack it won't work anyways! Thanks, Jake |