Thread: [Linuxptp-users] One-step sync and P2P mode issue
PTP IEEE 1588 stack for Linux
Brought to you by:
rcochran
From: Sriharsha B. <Sri...@Em...> - 2014-08-12 06:47:17
|
Hi folks, I'm running into this problem in P2P mode with our PTP capable NIC. I'd appreciate if you could take a look at the details below and confirm if this is an issue in ptp4l. The linuxptp version is: 1.4-00060-g93b7807. After starting ptp4l, it fails to send sync after a few iterations right at the beginning. The error is "No message of desired type" (ENOMSG - 42). After a timeout of 16 seconds (fault clear timeout) it restarts and encounters the same error and this repeats. Here's the ptp4l command/args used: ptp4l -f ptp2.cfg -HPm2 -l7 -i ptp2 Note that I've configured one-step sync that seems to be triggering this error condition. ptp4l[352396.491]: selected /dev/ptp4 as PTP clock ptp4l[352396.493]: PI servo: sync interval 1.000 kp 0.700 ki 0.300000 ptp4l[352396.514]: port 1: INITIALIZING to LISTENING on INITIALIZE ptp4l[352397.514]: port 1: delay timeout ptp4l[352397.514]: port 1: sending pdelay req seq(0) ptp4l[352398.514]: port 1: delay timeout ptp4l[352398.514]: port 1: sending pdelay req seq(1) ptp4l[352398.544]: port 1: setting asCapable ptp4l[352399.514]: port 1: delay timeout ptp4l[352399.514]: port 1: sending pdelay req seq(2) ptp4l[352400.514]: port 1: delay timeout ptp4l[352400.514]: port 1: sending pdelay req seq(3) ptp4l[352401.514]: port 1: delay timeout ptp4l[352401.514]: port 1: sending pdelay req seq(4) ptp4l[352402.514]: port 1: delay timeout ptp4l[352402.515]: port 1: sending pdelay req seq(5) ptp4l[352403.160]: port 1: announce timeout ptp4l[352403.160]: port 1: LISTENING to MASTER on ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES ptp4l[352403.161]: selected best master clock 0090fa.fffe.6c024e ptp4l[352403.161]: assuming the grand master role ptp4l[352403.162]: port 1: master tx announce timeout ptp4l[352403.515]: port 1: delay timeout ptp4l[352403.515]: port 1: sending pdelay req seq(6) ptp4l[352404.161]: port 1: master sync timeout ptp4l[352404.161]: p->timestamping (3) ptp4l[352404.515]: port 1: delay timeout ptp4l[352404.515]: port 1: sending pdelay req seq(7) ptp4l[352405.161]: port 1: master sync timeout ptp4l[352405.161]: p->timestamping (3) ptp4l[352405.162]: port 1: master tx announce timeout ptp4l[352405.515]: port 1: delay timeout ptp4l[352405.515]: port 1: sending pdelay req seq(8) ptp4l[352406.161]: port 1: master sync timeout ptp4l[352406.161]: send failed: 42 No message of desired type ptp4l[352406.161]: port 1: send sync failed ptp4l[352406.161]: port 1: MASTER to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED) ptp4l[352406.183]: waiting 2^{4} seconds to clear fault on port 1 ptp4l[352422.183]: clearing fault on port 1 Based on further debugging, the sequence of events leading to the error is shown below: - Driver/HW receives pdelay request and sync msgs alternately like this: - pdelay-req - sync - pdelay-req - sync <----- hits ENOMSG error - Messages are sent and tx timestamps generated for each (all are event msgs). - Driver makes a copy of skb with timestamp info into socket error queue. - The first sync message in the error queue is not consumed by ptp4l. - This is because port_tx_sync() calls transport_send() with event set to TRANS_ONESTEP. The transport send routines check the event type to know whether to call sk_receive() after sending the message to get a copy. They do sk_receive() only if event == TRANS_EVENT. - In this case since event is TRANS_ONESTEP, there's no call to sk_receive(). - This leaves the sync msg in socket error queue. - Next pdelay-req is sent; driver clones this skb into error queue. - ptp4l invokes sk_receive() to get a copy of this pdelay-req. - The kernel function sock_recv_errqueue() or ip_recv_error() is invoked. - There's a check in those functions to see if there's more skbs in err queue. - The kernel function marks the socket in error(sets sk->sk_err) before return. - The next sync message is sent down by ptp4l. - Stack calls sock_alloc_send_pskb() to get a skb. - There, it checks if the socket is in error and returns failure. - The second sync send fails. Note that the driver decides whether to make a copy into the error queue based on SKBTX_HW_TSTAMP. This gets set for packets that are sent down the stack on the event socket; i.e, the socket on which setsockopt(SO_TIMESTAMPING) has been done. So the driver seems to be correctly handling the packets. It looks like a change in ptp4l might be needed to address this. The transport send functions (raw_send(), udp_send()...) should be modified to consume the error queue skb also for the case when event is TRANS_ONESTEP. They might just have to throw away the skb if the timestamp is not of interest since it is one-step. Thanks, -Harsha |
From: Keller, J. E <jac...@in...> - 2014-08-12 20:05:40
|
On Tue, 2014-08-12 at 06:46 +0000, Sriharsha Basavapatna wrote: > Hi folks, > > > > I'm running into this problem in P2P mode with our PTP capable NIC. > I'd > > appreciate if you could take a look at the details below and confirm > if this is > > an issue in ptp4l. The linuxptp version is: 1.4-00060-g93b7807. > > > > After starting ptp4l, it fails to send sync after a few iterations > right at > > the beginning. The error is "No message of desired type" (ENOMSG - > 42). After > > a timeout of 16 seconds (fault clear timeout) it restarts and > encounters > > the same error and this repeats. > > > > Here's the ptp4l command/args used: > > ptp4l -f ptp2.cfg -HPm2 -l7 -i ptp2 > > > > Note that I've configured one-step sync that seems to be triggering > this error > > condition. > > > > ptp4l[352396.491]: selected /dev/ptp4 as PTP clock > > ptp4l[352396.493]: PI servo: sync interval 1.000 kp 0.700 ki 0.300000 > > ptp4l[352396.514]: port 1: INITIALIZING to LISTENING on INITIALIZE > > ptp4l[352397.514]: port 1: delay timeout > > ptp4l[352397.514]: port 1: sending pdelay req seq(0) > > > > ptp4l[352398.514]: port 1: delay timeout > > ptp4l[352398.514]: port 1: sending pdelay req seq(1) > > > > ptp4l[352398.544]: port 1: setting asCapable > > ptp4l[352399.514]: port 1: delay timeout > > ptp4l[352399.514]: port 1: sending pdelay req seq(2) > > > > ptp4l[352400.514]: port 1: delay timeout > > ptp4l[352400.514]: port 1: sending pdelay req seq(3) > > > > ptp4l[352401.514]: port 1: delay timeout > > ptp4l[352401.514]: port 1: sending pdelay req seq(4) > > > > ptp4l[352402.514]: port 1: delay timeout > > ptp4l[352402.515]: port 1: sending pdelay req seq(5) > > > > ptp4l[352403.160]: port 1: announce timeout > > ptp4l[352403.160]: port 1: LISTENING to MASTER on > ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES > > ptp4l[352403.161]: selected best master clock 0090fa.fffe.6c024e > > ptp4l[352403.161]: assuming the grand master role > > ptp4l[352403.162]: port 1: master tx announce timeout > > ptp4l[352403.515]: port 1: delay timeout > > ptp4l[352403.515]: port 1: sending pdelay req seq(6) > > > > ptp4l[352404.161]: port 1: master sync timeout > > ptp4l[352404.161]: p->timestamping (3) > > > > ptp4l[352404.515]: port 1: delay timeout > > ptp4l[352404.515]: port 1: sending pdelay req seq(7) > > > > ptp4l[352405.161]: port 1: master sync timeout > > ptp4l[352405.161]: p->timestamping (3) > > > > ptp4l[352405.162]: port 1: master tx announce timeout > > ptp4l[352405.515]: port 1: delay timeout > > ptp4l[352405.515]: port 1: sending pdelay req seq(8) > > > > ptp4l[352406.161]: port 1: master sync timeout > > ptp4l[352406.161]: send failed: 42 No message of desired type > > ptp4l[352406.161]: port 1: send sync failed > > ptp4l[352406.161]: port 1: MASTER to FAULTY on FAULT_DETECTED > (FT_UNSPECIFIED) > > ptp4l[352406.183]: waiting 2^{4} seconds to clear fault on port 1 > > ptp4l[352422.183]: clearing fault on port 1 > > > > Based on further debugging, the sequence of events leading to the > error is > > shown below: > > > > - Driver/HW receives pdelay request and sync msgs alternately like > this: > > - pdelay-req > > - sync > > - pdelay-req > > - sync <----- hits ENOMSG error > > - Messages are sent and tx timestamps generated for each (all are > event msgs). > > - Driver makes a copy of skb with timestamp info into socket error > queue. > I don't think we should even put a message into the socket error queue in the first place if the transmit type was ONESTEP. That might be a driver bug? The driver should be modifying the PTP packet and inserting the timestamp payload for us, without then further notifying the stack. If we are in ONESTEP mode, and the driver supports it, then the driver should no longer be sending the timestamp clone back up the stack. Thanks, Jake > - The first sync message in the error queue is not consumed by ptp4l. > > - This is because port_tx_sync() calls transport_send() with event set > to > > TRANS_ONESTEP. The transport send routines check the event type to > know > > whether to call sk_receive() after sending the message to get a > copy. They do > > sk_receive() only if event == TRANS_EVENT. > > - In this case since event is TRANS_ONESTEP, there's no call to > sk_receive(). > > - This leaves the sync msg in socket error queue. > > - Next pdelay-req is sent; driver clones this skb into error queue. > > - ptp4l invokes sk_receive() to get a copy of this pdelay-req. > > - The kernel function sock_recv_errqueue() or ip_recv_error() is > invoked. > > - There's a check in those functions to see if there's more skbs in > err queue. > > - The kernel function marks the socket in error(sets sk->sk_err) > before return. > > - The next sync message is sent down by ptp4l. > > - Stack calls sock_alloc_send_pskb() to get a skb. > > - There, it checks if the socket is in error and returns failure. > > - The second sync send fails. > > > > Note that the driver decides whether to make a copy into the error > queue based > > on SKBTX_HW_TSTAMP. This gets set for packets that are sent down the > stack on > > the event socket; i.e, the socket on which setsockopt(SO_TIMESTAMPING) > has been > > done. So the driver seems to be correctly handling the packets. > > > > It looks like a change in ptp4l might be needed to address this. The > transport > > send functions (raw_send(), udp_send()...) should be modified to > consume the > > error queue skb also for the case when event is TRANS_ONESTEP. They > might just > > have to throw away the skb if the timestamp is not of interest since > it is > > one-step. > > > > Thanks, > > -Harsha > > > ------------------------------------------------------------------------------ > _______________________________________________ > Linuxptp-users mailing list > Lin...@li... > https://lists.sourceforge.net/lists/listinfo/linuxptp-users |
From: Sriharsha B. <Sri...@Em...> - 2014-08-13 04:21:29
|
-----Original Message----- From: Keller, Jacob E [mailto:jac...@in...] Sent: Wednesday, August 13, 2014 1:35 AM To: lin...@li... Subject: Re: [Linuxptp-users] One-step sync and P2P mode issue On Tue, 2014-08-12 at 06:46 +0000, Sriharsha Basavapatna wrote: > Hi folks, > > > > I'm running into this problem in P2P mode with our PTP capable NIC. > I'd > > appreciate if you could take a look at the details below and confirm > if this is > > an issue in ptp4l. The linuxptp version is: 1.4-00060-g93b7807. > > > > After starting ptp4l, it fails to send sync after a few iterations > right at > > the beginning. The error is "No message of desired type" (ENOMSG - > 42). After > > a timeout of 16 seconds (fault clear timeout) it restarts and > encounters > > the same error and this repeats. > > > > Here's the ptp4l command/args used: > > ptp4l -f ptp2.cfg -HPm2 -l7 -i ptp2 > > > > Note that I've configured one-step sync that seems to be triggering > this error > > condition. > > > > ptp4l[352396.491]: selected /dev/ptp4 as PTP clock > > ptp4l[352396.493]: PI servo: sync interval 1.000 kp 0.700 ki 0.300000 > > ptp4l[352396.514]: port 1: INITIALIZING to LISTENING on INITIALIZE > > ptp4l[352397.514]: port 1: delay timeout > > ptp4l[352397.514]: port 1: sending pdelay req seq(0) > > > > ptp4l[352398.514]: port 1: delay timeout > > ptp4l[352398.514]: port 1: sending pdelay req seq(1) > > > > ptp4l[352398.544]: port 1: setting asCapable > > ptp4l[352399.514]: port 1: delay timeout > > ptp4l[352399.514]: port 1: sending pdelay req seq(2) > > > > ptp4l[352400.514]: port 1: delay timeout > > ptp4l[352400.514]: port 1: sending pdelay req seq(3) > > > > ptp4l[352401.514]: port 1: delay timeout > > ptp4l[352401.514]: port 1: sending pdelay req seq(4) > > > > ptp4l[352402.514]: port 1: delay timeout > > ptp4l[352402.515]: port 1: sending pdelay req seq(5) > > > > ptp4l[352403.160]: port 1: announce timeout > > ptp4l[352403.160]: port 1: LISTENING to MASTER on > ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES > > ptp4l[352403.161]: selected best master clock 0090fa.fffe.6c024e > > ptp4l[352403.161]: assuming the grand master role > > ptp4l[352403.162]: port 1: master tx announce timeout > > ptp4l[352403.515]: port 1: delay timeout > > ptp4l[352403.515]: port 1: sending pdelay req seq(6) > > > > ptp4l[352404.161]: port 1: master sync timeout > > ptp4l[352404.161]: p->timestamping (3) > > > > ptp4l[352404.515]: port 1: delay timeout > > ptp4l[352404.515]: port 1: sending pdelay req seq(7) > > > > ptp4l[352405.161]: port 1: master sync timeout > > ptp4l[352405.161]: p->timestamping (3) > > > > ptp4l[352405.162]: port 1: master tx announce timeout > > ptp4l[352405.515]: port 1: delay timeout > > ptp4l[352405.515]: port 1: sending pdelay req seq(8) > > > > ptp4l[352406.161]: port 1: master sync timeout > > ptp4l[352406.161]: send failed: 42 No message of desired type > > ptp4l[352406.161]: port 1: send sync failed > > ptp4l[352406.161]: port 1: MASTER to FAULTY on FAULT_DETECTED > (FT_UNSPECIFIED) > > ptp4l[352406.183]: waiting 2^{4} seconds to clear fault on port 1 > > ptp4l[352422.183]: clearing fault on port 1 > > > > Based on further debugging, the sequence of events leading to the > error is > > shown below: > > > > - Driver/HW receives pdelay request and sync msgs alternately like > this: > > - pdelay-req > > - sync > > - pdelay-req > > - sync <----- hits ENOMSG error > > - Messages are sent and tx timestamps generated for each (all are > event msgs). > > - Driver makes a copy of skb with timestamp info into socket error > queue. > I don't think we should even put a message into the socket error queue in the first place if the transmit type was ONESTEP. That might be a driver bug? The driver should be modifying the PTP packet and inserting the timestamp payload for us, without then further notifying the stack. If we are in ONESTEP mode, and the driver supports it, then the driver should no longer be sending the timestamp clone back up the stack. Thanks, Jake This would mean that the driver needs to inspect packets in the transmit path to determine if it's 1) a PTP pkt, 2) Sync pkt and 3) One-step flag is set. I feel this is more work in the driver and also I don't see this specified as a requirement for driver support, in the kernel documentation for PTP (Documentation/ networking/timestamping.txt). The only information that the kernel/stack provides to the driver is whether timestamp should be generated for a packet, through SKBTX_HW_TSTAMP in skb_shinfo(skb)->tx_flags. This is the only check that the driver is expected to do. I looked at a couple of other drivers and they don't seem to have any such extra checking to filter out error queue insertion of one-step sync. The problem might show up with other drivers. So I think it'd be better to filter it out in ptp4l by just reading back the packet in error queue and dropping/ignoring it. Thanks, -Harsha > - The first sync message in the error queue is not consumed by ptp4l. > > - This is because port_tx_sync() calls transport_send() with event set > to > > TRANS_ONESTEP. The transport send routines check the event type to > know > > whether to call sk_receive() after sending the message to get a > copy. They do > > sk_receive() only if event == TRANS_EVENT. > > - In this case since event is TRANS_ONESTEP, there's no call to > sk_receive(). > > - This leaves the sync msg in socket error queue. > > - Next pdelay-req is sent; driver clones this skb into error queue. > > - ptp4l invokes sk_receive() to get a copy of this pdelay-req. > > - The kernel function sock_recv_errqueue() or ip_recv_error() is > invoked. > > - There's a check in those functions to see if there's more skbs in > err queue. > > - The kernel function marks the socket in error(sets sk->sk_err) > before return. > > - The next sync message is sent down by ptp4l. > > - Stack calls sock_alloc_send_pskb() to get a skb. > > - There, it checks if the socket is in error and returns failure. > > - The second sync send fails. > > > > Note that the driver decides whether to make a copy into the error > queue based > > on SKBTX_HW_TSTAMP. This gets set for packets that are sent down the > stack on > > the event socket; i.e, the socket on which setsockopt(SO_TIMESTAMPING) > has been > > done. So the driver seems to be correctly handling the packets. > > > > It looks like a change in ptp4l might be needed to address this. The > transport > > send functions (raw_send(), udp_send()...) should be modified to > consume the > > error queue skb also for the case when event is TRANS_ONESTEP. They > might just > > have to throw away the skb if the timestamp is not of interest since > it is > > one-step. > > > > Thanks, > > -Harsha > > > ---------------------------------------------------------------------- > -------- _______________________________________________ > Linuxptp-users mailing list > Lin...@li... > https://lists.sourceforge.net/lists/listinfo/linuxptp-users ------------------------------------------------------------------------------ _______________________________________________ Linuxptp-users mailing list Lin...@li... https://lists.sourceforge.net/lists/listinfo/linuxptp-users |
From: Keller, J. E <jac...@in...> - 2014-08-13 17:57:34
|
On Wed, 2014-08-13 at 04:21 +0000, Sriharsha Basavapatna wrote: > > -----Original Message----- > From: Keller, Jacob E [mailto:jac...@in...] > Sent: Wednesday, August 13, 2014 1:35 AM > To: lin...@li... > Subject: Re: [Linuxptp-users] One-step sync and P2P mode issue > > On Tue, 2014-08-12 at 06:46 +0000, Sriharsha Basavapatna wrote: > > Hi folks, > > > > > > > > I'm running into this problem in P2P mode with our PTP capable NIC. > > I'd > > > > appreciate if you could take a look at the details below and confirm > > if this is > > > > an issue in ptp4l. The linuxptp version is: 1.4-00060-g93b7807. > > > > > > > > After starting ptp4l, it fails to send sync after a few iterations > > right at > > > > the beginning. The error is "No message of desired type" (ENOMSG - > > 42). After > > > > a timeout of 16 seconds (fault clear timeout) it restarts and > > encounters > > > > the same error and this repeats. > > > > > > > > Here's the ptp4l command/args used: > > > > ptp4l -f ptp2.cfg -HPm2 -l7 -i ptp2 > > > > > > > > Note that I've configured one-step sync that seems to be triggering > > this error > > > > condition. > > > > > > > > ptp4l[352396.491]: selected /dev/ptp4 as PTP clock > > > > ptp4l[352396.493]: PI servo: sync interval 1.000 kp 0.700 ki 0.300000 > > > > ptp4l[352396.514]: port 1: INITIALIZING to LISTENING on INITIALIZE > > > > ptp4l[352397.514]: port 1: delay timeout > > > > ptp4l[352397.514]: port 1: sending pdelay req seq(0) > > > > > > > > ptp4l[352398.514]: port 1: delay timeout > > > > ptp4l[352398.514]: port 1: sending pdelay req seq(1) > > > > > > > > ptp4l[352398.544]: port 1: setting asCapable > > > > ptp4l[352399.514]: port 1: delay timeout > > > > ptp4l[352399.514]: port 1: sending pdelay req seq(2) > > > > > > > > ptp4l[352400.514]: port 1: delay timeout > > > > ptp4l[352400.514]: port 1: sending pdelay req seq(3) > > > > > > > > ptp4l[352401.514]: port 1: delay timeout > > > > ptp4l[352401.514]: port 1: sending pdelay req seq(4) > > > > > > > > ptp4l[352402.514]: port 1: delay timeout > > > > ptp4l[352402.515]: port 1: sending pdelay req seq(5) > > > > > > > > ptp4l[352403.160]: port 1: announce timeout > > > > ptp4l[352403.160]: port 1: LISTENING to MASTER on > > ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES > > > > ptp4l[352403.161]: selected best master clock 0090fa.fffe.6c024e > > > > ptp4l[352403.161]: assuming the grand master role > > > > ptp4l[352403.162]: port 1: master tx announce timeout > > > > ptp4l[352403.515]: port 1: delay timeout > > > > ptp4l[352403.515]: port 1: sending pdelay req seq(6) > > > > > > > > ptp4l[352404.161]: port 1: master sync timeout > > > > ptp4l[352404.161]: p->timestamping (3) > > > > > > > > ptp4l[352404.515]: port 1: delay timeout > > > > ptp4l[352404.515]: port 1: sending pdelay req seq(7) > > > > > > > > ptp4l[352405.161]: port 1: master sync timeout > > > > ptp4l[352405.161]: p->timestamping (3) > > > > > > > > ptp4l[352405.162]: port 1: master tx announce timeout > > > > ptp4l[352405.515]: port 1: delay timeout > > > > ptp4l[352405.515]: port 1: sending pdelay req seq(8) > > > > > > > > ptp4l[352406.161]: port 1: master sync timeout > > > > ptp4l[352406.161]: send failed: 42 No message of desired type > > > > ptp4l[352406.161]: port 1: send sync failed > > > > ptp4l[352406.161]: port 1: MASTER to FAULTY on FAULT_DETECTED > > (FT_UNSPECIFIED) > > > > ptp4l[352406.183]: waiting 2^{4} seconds to clear fault on port 1 > > > > ptp4l[352422.183]: clearing fault on port 1 > > > > > > > > Based on further debugging, the sequence of events leading to the > > error is > > > > shown below: > > > > > > > > - Driver/HW receives pdelay request and sync msgs alternately like > > this: > > > > - pdelay-req > > > > - sync > > > > - pdelay-req > > > > - sync <----- hits ENOMSG error > > > > - Messages are sent and tx timestamps generated for each (all are > > event msgs). > > > > - Driver makes a copy of skb with timestamp info into socket error > > queue. > > > > I don't think we should even put a message into the socket error queue in the first place if the transmit type was ONESTEP. That might be a driver bug? The driver should be modifying the PTP packet and inserting the timestamp payload for us, without then further notifying the stack. > > If we are in ONESTEP mode, and the driver supports it, then the driver should no longer be sending the timestamp clone back up the stack. > > Thanks, > Jake > > > This would mean that the driver needs to inspect packets in the transmit path > to determine if it's 1) a PTP pkt, 2) Sync pkt and 3) One-step flag is set. I > feel this is more work in the driver and also I don't see this specified as a > requirement for driver support, in the kernel documentation for PTP > (Documentation/ networking/timestamping.txt). The only information that the > kernel/stack provides to the driver is whether timestamp should be generated > for a packet, through SKBTX_HW_TSTAMP in skb_shinfo(skb)->tx_flags. This is the > only check that the driver is expected to do. > > I looked at a couple of other drivers and they don't seem to have any such > extra checking to filter out error queue insertion of one-step sync. The > problem might show up with other drivers. So I think it'd be better to filter > it out in ptp4l by just reading back the packet in error queue and > dropping/ignoring it. > > Thanks, > -Harsha That is exactly what it means. You cannot support one-step mode unless hardware supports it. One-step mode is where the hardware inserts the hardware timestamp into the packet just before sending it with as close accuracy as possible. There is nothing to filter, or deal with because they simply do not call skb_tx_tstamp. In the case of one-step there is no returned timestamp. If you use the ONESTEP mode of the hwtstamp ioctl, that is where you inform that all timestamped packets will be onestep mode. Do any of those drivers you looked at actually support Onestep mode? One-step mode is a specialized mode which requires hardware support. It is not better to filter the packet out because it is a bug. The hardware/driver in question needs to be inserting the timestamp directly into the payload of the Sync packet it is timestamping. (else it's not "one step" mode). If ptp4l silently ignored this packet it would make it harder to find the bug. In addition, if the hardware isn't inserting the timestamp directly into the packet and instead is passing it up the stack it won't work anyways! Thanks, Jake |
From: Sriharsha B. <Sri...@Em...> - 2014-08-14 06:08:03
|
May be I didn't explain it clearly earlier. I understand one-step mode is where HW support is needed and our hardware does insert the timestamp. But because the protocol stack sets the SKBTX_HW_TSTAMP bit in the skb, the driver (which is supposed to return the tx timestamp back up the stack based on this flag) can't distinguish this as a special case for which though the flag is set, it should *not* put a copy of the pkt back on the error queue. Here's the sequence: 1. ptp4l sends down 1-Step sync packet on the event socket (i.e, one for which SO_TIMESTAMPING socket option is set). 2. Network stack sets the SKBTX_HW_TSTAMP in skb since socket is marked for SO_TIMESTAMPING. 3. Stack sends down the skb to driver with SKBTX_HW_TSTAMP set. 4. Driver sends down the packet to HW. 5. HW inserts timestamp (yes, it is HW that's inserting, not driver !!). 6. HW transmits the packet and generates a completion to driver. 7. The driver looks up the corresponding skb and since SKBTX_HW_TSTAMP is set, makes a copy of skb into error queue. It is the SKBTX_HW_TSTAMP flag that tells the driver whether it should or shouldn't pass back a copy of pkt into error queue. In this case, the HW is working correctly, both inserting the timestamp and providing that inserted timestamp back to driver. But the driver's decision to pass it back up, is based on the SKBTX_HW_TSTAMP flag which is driven by ptp4l and the stack. The reason this flag is set is because ptp4l sent down the skb on a socket that is marked for HW timestamp insertion/generation. We could change the driver to make an exception and ignore the flag SKBTX_HW_TSTAMP and not put a skb back into the error queue, if it's one-step sync packet that got transmitted. But that doesn't seem like the right approach as the driver ends up not honoring what the stack is asking it via the flag. The driver essentially violates the driver "interface" with the stack. This is why I think that the right way to fix this would be: 1. SKBTX_HW_TSTAMP should not be set on the skb if a copy of skb with tx-timestamp is not needed by the consumer. HW would still insert the timestamp in the packet while transmitting. But the driver won't copy into error queue since SKBTX_HW_TSTAMP is not set in skb. OR 2. If the consumer sets the flag SKBTX_HW_TSTAMP, then the consumer should be prepared to get a copy of the packet and read it. I feel that approach 2 is better since SKBTX_HW_TSTAMP implies both inserting a timestamp and getting back a copy with the timestamp in it. But the setting of the flag on the skb/pkt is driven by ptp4l. Once that's set, driver is supposed to take expected action. Like I already said in my previous email this is also as per the kernel documentation. Let me know if you still think it's something the driver should workaround. Thanks, -Harsha -----Original Message----- From: Keller, Jacob E [mailto:jac...@in...] Sent: Wednesday, August 13, 2014 11:27 PM To: Sriharsha Basavapatna Cc: lin...@li... Subject: Re: [Linuxptp-users] One-step sync and P2P mode issue On Wed, 2014-08-13 at 04:21 +0000, Sriharsha Basavapatna wrote: > > -----Original Message----- > From: Keller, Jacob E [mailto:jac...@in...] > Sent: Wednesday, August 13, 2014 1:35 AM > To: lin...@li... > Subject: Re: [Linuxptp-users] One-step sync and P2P mode issue > > On Tue, 2014-08-12 at 06:46 +0000, Sriharsha Basavapatna wrote: > > Hi folks, > > > > > > > > I'm running into this problem in P2P mode with our PTP capable NIC. > > I'd > > > > appreciate if you could take a look at the details below and confirm > > if this is > > > > an issue in ptp4l. The linuxptp version is: 1.4-00060-g93b7807. > > > > > > > > After starting ptp4l, it fails to send sync after a few iterations > > right at > > > > the beginning. The error is "No message of desired type" (ENOMSG - > > 42). After > > > > a timeout of 16 seconds (fault clear timeout) it restarts and > > encounters > > > > the same error and this repeats. > > > > > > > > Here's the ptp4l command/args used: > > > > ptp4l -f ptp2.cfg -HPm2 -l7 -i ptp2 > > > > > > > > Note that I've configured one-step sync that seems to be triggering > > this error > > > > condition. > > > > > > > > ptp4l[352396.491]: selected /dev/ptp4 as PTP clock > > > > ptp4l[352396.493]: PI servo: sync interval 1.000 kp 0.700 ki > > 0.300000 > > > > ptp4l[352396.514]: port 1: INITIALIZING to LISTENING on INITIALIZE > > > > ptp4l[352397.514]: port 1: delay timeout > > > > ptp4l[352397.514]: port 1: sending pdelay req seq(0) > > > > > > > > ptp4l[352398.514]: port 1: delay timeout > > > > ptp4l[352398.514]: port 1: sending pdelay req seq(1) > > > > > > > > ptp4l[352398.544]: port 1: setting asCapable > > > > ptp4l[352399.514]: port 1: delay timeout > > > > ptp4l[352399.514]: port 1: sending pdelay req seq(2) > > > > > > > > ptp4l[352400.514]: port 1: delay timeout > > > > ptp4l[352400.514]: port 1: sending pdelay req seq(3) > > > > > > > > ptp4l[352401.514]: port 1: delay timeout > > > > ptp4l[352401.514]: port 1: sending pdelay req seq(4) > > > > > > > > ptp4l[352402.514]: port 1: delay timeout > > > > ptp4l[352402.515]: port 1: sending pdelay req seq(5) > > > > > > > > ptp4l[352403.160]: port 1: announce timeout > > > > ptp4l[352403.160]: port 1: LISTENING to MASTER on > > ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES > > > > ptp4l[352403.161]: selected best master clock 0090fa.fffe.6c024e > > > > ptp4l[352403.161]: assuming the grand master role > > > > ptp4l[352403.162]: port 1: master tx announce timeout > > > > ptp4l[352403.515]: port 1: delay timeout > > > > ptp4l[352403.515]: port 1: sending pdelay req seq(6) > > > > > > > > ptp4l[352404.161]: port 1: master sync timeout > > > > ptp4l[352404.161]: p->timestamping (3) > > > > > > > > ptp4l[352404.515]: port 1: delay timeout > > > > ptp4l[352404.515]: port 1: sending pdelay req seq(7) > > > > > > > > ptp4l[352405.161]: port 1: master sync timeout > > > > ptp4l[352405.161]: p->timestamping (3) > > > > > > > > ptp4l[352405.162]: port 1: master tx announce timeout > > > > ptp4l[352405.515]: port 1: delay timeout > > > > ptp4l[352405.515]: port 1: sending pdelay req seq(8) > > > > > > > > ptp4l[352406.161]: port 1: master sync timeout > > > > ptp4l[352406.161]: send failed: 42 No message of desired type > > > > ptp4l[352406.161]: port 1: send sync failed > > > > ptp4l[352406.161]: port 1: MASTER to FAULTY on FAULT_DETECTED > > (FT_UNSPECIFIED) > > > > ptp4l[352406.183]: waiting 2^{4} seconds to clear fault on port 1 > > > > ptp4l[352422.183]: clearing fault on port 1 > > > > > > > > Based on further debugging, the sequence of events leading to the > > error is > > > > shown below: > > > > > > > > - Driver/HW receives pdelay request and sync msgs alternately like > > this: > > > > - pdelay-req > > > > - sync > > > > - pdelay-req > > > > - sync <----- hits ENOMSG error > > > > - Messages are sent and tx timestamps generated for each (all are > > event msgs). > > > > - Driver makes a copy of skb with timestamp info into socket error > > queue. > > > > I don't think we should even put a message into the socket error queue in the first place if the transmit type was ONESTEP. That might be a driver bug? The driver should be modifying the PTP packet and inserting the timestamp payload for us, without then further notifying the stack. > > If we are in ONESTEP mode, and the driver supports it, then the driver should no longer be sending the timestamp clone back up the stack. > > Thanks, > Jake > > > This would mean that the driver needs to inspect packets in the > transmit path to determine if it's 1) a PTP pkt, 2) Sync pkt and 3) > One-step flag is set. I feel this is more work in the driver and also > I don't see this specified as a requirement for driver support, in the > kernel documentation for PTP (Documentation/ > networking/timestamping.txt). The only information that the > kernel/stack provides to the driver is whether timestamp should be > generated for a packet, through SKBTX_HW_TSTAMP in skb_shinfo(skb)->tx_flags. This is the only check that the driver is expected to do. > > I looked at a couple of other drivers and they don't seem to have any > such extra checking to filter out error queue insertion of one-step > sync. The problem might show up with other drivers. So I think it'd be > better to filter it out in ptp4l by just reading back the packet in > error queue and dropping/ignoring it. > > Thanks, > -Harsha That is exactly what it means. You cannot support one-step mode unless hardware supports it. One-step mode is where the hardware inserts the hardware timestamp into the packet just before sending it with as close accuracy as possible. There is nothing to filter, or deal with because they simply do not call skb_tx_tstamp. In the case of one-step there is no returned timestamp. If you use the ONESTEP mode of the hwtstamp ioctl, that is where you inform that all timestamped packets will be onestep mode. Do any of those drivers you looked at actually support Onestep mode? One-step mode is a specialized mode which requires hardware support. It is not better to filter the packet out because it is a bug. The hardware/driver in question needs to be inserting the timestamp directly into the payload of the Sync packet it is timestamping. (else it's not "one step" mode). If ptp4l silently ignored this packet it would make it harder to find the bug. In addition, if the hardware isn't inserting the timestamp directly into the packet and instead is passing it up the stack it won't work anyways! Thanks, Jake |
From: Richard C. <ric...@gm...> - 2014-08-14 08:02:01
|
On Thu, Aug 14, 2014 at 06:07:29AM +0000, Sriharsha Basavapatna wrote: > May be I didn't explain it clearly earlier. > > I understand one-step mode is where HW support is needed and our hardware does > insert the timestamp. But because the protocol stack sets the SKBTX_HW_TSTAMP > bit in the skb, the driver (which is supposed to return the tx timestamp back > up the stack based on this flag) can't distinguish this as a special case for Yes, it can. /* * Enables time stamping for outgoing packets just as * HWTSTAMP_TX_ON does, but also enables time stamp insertion * directly into Sync packets. In this case, transmitted Sync * packets will not received a time stamp via the socket error * queue. */ HWTSTAMP_TX_ONESTEP_SYNC, Thanks, Richard |
From: Sriharsha B. <Sri...@Em...> - 2014-09-02 10:00:35
|
Thanks for the pointer to the flag. But this is still an issue for the driver, since it'd have to parse the packet to find that it's a Sync pkt; please correct me if I'm missing something here. The driver saves this flag when it gets the ioctl request. Then at transmit completion time the driver would still have to check that the flag HWTSTAMP_TX_ONESTEP_SYNC has been set and it should not queue the packet in the error queue if it's a Sync pkt. For other packets say Delay Request it should still continue to queue the pkt in error queue. For example, here's the pseudo code. Before checking the msg type below, the driver would have to parse the packet to find that it's a PTP Sync msg(in a raw Ethernet frame or UDP/IPv4/IPV6 pkt). if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP && (ptp_msg->type != MSG_SYNC || (dev-inst->ptp_flags & HWTSTAMP_TX_ONESTEP_SYNC) == 0)) { /* add skb to error queue */ } Thanks, -Harsha -----Original Message----- From: Richard Cochran [mailto:ric...@gm...] Sent: Thursday, August 14, 2014 1:32 PM To: Sriharsha Basavapatna Cc: Keller, Jacob E; lin...@li... Subject: Re: [Linuxptp-users] One-step sync and P2P mode issue On Thu, Aug 14, 2014 at 06:07:29AM +0000, Sriharsha Basavapatna wrote: > May be I didn't explain it clearly earlier. > > I understand one-step mode is where HW support is needed and our > hardware does insert the timestamp. But because the protocol stack > sets the SKBTX_HW_TSTAMP bit in the skb, the driver (which is supposed > to return the tx timestamp back up the stack based on this flag) can't > distinguish this as a special case for Yes, it can. /* * Enables time stamping for outgoing packets just as * HWTSTAMP_TX_ON does, but also enables time stamp insertion * directly into Sync packets. In this case, transmitted Sync * packets will not received a time stamp via the socket error * queue. */ HWTSTAMP_TX_ONESTEP_SYNC, Thanks, Richard |
From: Richard C. <ric...@gm...> - 2014-09-04 21:02:01
|
On Tue, Sep 02, 2014 at 10:00:13AM +0000, Sriharsha Basavapatna wrote: > Thanks for the pointer to the flag. But this is still an issue for the driver, > since it'd have to parse the packet to find that it's a Sync pkt; please > correct me if I'm missing something here. Yes, the driver must parse the packets. This is not an "issue" for driver, but rather simply something that you have to implement. It isn't hard. See dp83640.c in the Linux kernel sources. Thanks, Richard |
From: Keller, J. E <jac...@in...> - 2014-09-03 00:29:14
|
Hi, > -----Original Message----- > From: Sriharsha Basavapatna > [mailto:Sri...@Em...] > Sent: Tuesday, September 02, 2014 3:00 AM > To: Richard Cochran > Cc: Keller, Jacob E; lin...@li... > Subject: RE: [Linuxptp-users] One-step sync and P2P mode issue > > Thanks for the pointer to the flag. But this is still an issue for the driver, > since it'd have to parse the packet to find that it's a Sync pkt; please > correct me if I'm missing something here. > > The driver saves this flag when it gets the ioctl request. Then at transmit > completion time the driver would still have to check that the flag > HWTSTAMP_TX_ONESTEP_SYNC has been set and it should not queue the > packet in > the error queue if it's a Sync pkt. For other packets say Delay Request it > should still continue to queue the pkt in error queue. > > For example, here's the pseudo code. Before checking the msg type > below, the > driver would have to parse the packet to find that it's a PTP Sync msg(in a > raw > Ethernet frame or UDP/IPv4/IPV6 pkt). > I do not know if one-step mode is specific to SYNC packets or should handle any transmit timestamps. Richard, do you know? You may have to check these. At any rate, I believe we already implement packet filters which will do this for you very efficiently. In the one-step case, you will configure hardware to directly insert the timestamp into the packet buffer as it transmits it In the other case, you configure hardware to notify driver of the timestamp, and then return it via skb_tstamp_tx I do actually believe that one-step works for any PTP transmitted timestamp, and if you enable TX_ONESTEP_SYNC that you actually have to do this for all packets, but I honestly cannot remember. Regards, Jake > if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP && > (ptp_msg->type != MSG_SYNC || > (dev-inst->ptp_flags & HWTSTAMP_TX_ONESTEP_SYNC) == 0)) { > > /* add skb to error queue */ > > } > > Thanks, > -Harsha |
From: Sriharsha B. <Sri...@Em...> - 2014-09-03 17:28:50
|
-----Original Message----- From: Keller, Jacob E [mailto:jac...@in...] Sent: Wednesday, September 03, 2014 5:59 AM To: Sriharsha Basavapatna; Richard Cochran Cc: lin...@li... Subject: RE: [Linuxptp-users] One-step sync and P2P mode issue Hi, > -----Original Message----- > From: Sriharsha Basavapatna > [mailto:Sri...@Em...] > Sent: Tuesday, September 02, 2014 3:00 AM > To: Richard Cochran > Cc: Keller, Jacob E; lin...@li...<mailto:lin...@li...> > Subject: RE: [Linuxptp-users] One-step sync and P2P mode issue > > Thanks for the pointer to the flag. But this is still an issue for the > driver, since it'd have to parse the packet to find that it's a Sync > pkt; please correct me if I'm missing something here. > > The driver saves this flag when it gets the ioctl request. Then at > transmit completion time the driver would still have to check that the > flag HWTSTAMP_TX_ONESTEP_SYNC has been set and it should not queue the > packet in the error queue if it's a Sync pkt. For other packets say > Delay Request it should still continue to queue the pkt in error > queue. > > For example, here's the pseudo code. Before checking the msg type > below, the driver would have to parse the packet to find that it's a > PTP Sync msg(in a raw Ethernet frame or UDP/IPv4/IPV6 pkt). > I do not know if one-step mode is specific to SYNC packets or should handle any transmit timestamps. Richard, do you know? >> I believe it is specific to SYNC packets only in this context; at least >> w.r.t the flag HWTSTAMP_TX_ONESTEP_SYNC. You may have to check these. At any rate, I believe we already implement packet filters which will do this for you very efficiently. >> It'd be better if we can avoid this condition itself, which requires packet >> filtering. In the one-step case, you will configure hardware to directly insert the timestamp into the packet buffer as it transmits it >> HW inserts the timestamp (see my earlier email with details). And provides a >> copy of timestamp to driver. The driver has been asked by the stack to give >> a copy of the transmitted packet via the error queue, as clearly indicated >> by skb->tx_flags: SKBTX_HW_TSTAMP. But the driver has to make an exception >> just for the 1-step Sync packet and to know that it's a 1-step Sync, it has >> to parse the packet. Thanks, -Harsha In the other case, you configure hardware to notify driver of the timestamp, and then return it via skb_tstamp_tx I do actually believe that one-step works for any PTP transmitted timestamp, and if you enable TX_ONESTEP_SYNC that you actually have to do this for all packets, but I honestly cannot remember. Regards, Jake > if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP && > (ptp_msg->type != MSG_SYNC || > (dev-inst->ptp_flags & HWTSTAMP_TX_ONESTEP_SYNC) == 0)) { > > /* add skb to error queue */ > > } > > Thanks, > -Harsha |
From: Richard C. <ric...@gm...> - 2014-09-04 20:57:09
|
On Wed, Sep 03, 2014 at 12:29:05AM +0000, Keller, Jacob E wrote: > > I do not know if one-step mode is specific to SYNC packets or should handle any transmit timestamps. Richard, do you know? One-step is for Sync messages only. The standard also allows one-step on Pdelay_Resp messages, but this is not implemented in the Linux kernel, and I don't know of any hardware implementations either. > You may have to check these. At any rate, I believe we already implement packet filters which will do this for you very efficiently. Right, the BPF can classify packets, but identifying Sync packets requires a bit more code. See the helper function is_sync() in drivers/net/phy/dp83640.c in the Linux kernel sources. Thanks, Richard |