Thread: [Linuxptp-users] Intermittent send delay request failure (tx_timestamp_timeout) w/dp83640
From: Dan G. <gee...@gm...> - 2021-12-06 12:28:11

I would like to start with my thanks for the support of the Linux PTP Project. The tools are a great success, and keep getting better!

I'm developing a custom SBC that works perfectly approximately 90% of the time. However, when the SBC starts or restarts there are cases immediately after initialization where ptp4l goes into a repeating cycle of 1) syncing to a PTP Grand Master, 2) timing out on the port_delay_request, and 3) resetting the failed port.

My SBC is based on an MPC8360E (NXP PowerQUICC® II Pro Processor with ucc_geth MACs) connected to DP83640 PHYTERs. I'm building the entire distro using yoctoproject's honister release, and Linux 5.10.73 with the CONFIG_PREEMPT_RT patch. I recently upgraded to linuxptp 3.1.1. ptp4l is configured for hardware timestamps and twoStepFlag = 0.

It appears that the device drivers support PTP since the time sync works extremely well most of the time. I'm in the process of trying to isolate the cause of the failure, but when everything works most of the time, this can be a big challenge.

I understand the statement that this is "likely a driver bug", and would like to know if anyone can point me in the right direction. Which driver should I focus on? Is this more likely in the dp83640 PHY layer, or the ucc_geth MAC layer? And, can you please offer any suggestions on how I might isolate the cause of the repeating poll timeout? (I might need to provide the fix to the maintainer)

Thank you in advance for your help!

Dan

From: Richard C. <ric...@gm...> - 2021-12-07 01:34:09

On Mon, Dec 06, 2021 at 07:27:54AM -0500, Dan Geer wrote:
> I would like to start with my thanks for the support of the Linux PTP
> Project. The tools are a great success, and keep getting better!
>
> I'm developing a custom SBC that works perfectly approximately 90% of the
> time. However, when the SBC starts or restarts there are cases immediately
> after initialization where ptp4l goes into a repeating cycle of 1) syncing
> to a PTP Grand Master, 2) timing out on the port_delay_request, and 3)
> resetting the failed port.
>
> My SBC is based on an MPC8360E (NXP PowerQUICC® II Pro Processor with
> ucc_geth MACs) connected to DP83640 PHYTERs.

You want to use the PTP features in the phyters. First step is to make sure the MAC driver isn't also doing time stamping.

I don't see PTP support in mainline ucc_geth, but maybe you have a vendor kernel with time stamping patched in?

Thanks,
Richard

From: Dan G. <gee...@gm...> - 2021-12-07 11:30:52

On Dec 6, 2021, at 8:34 PM, Richard Cochran <ric...@gm...> wrote:
> On Mon, Dec 06, 2021 at 07:27:54AM -0500, Dan Geer wrote:
>> I would like to start with my thanks for the support of the Linux PTP
>> Project. The tools are a great success, and keep getting better!
>>
>> I'm developing a custom SBC that works perfectly approximately 90% of the
>> time. However, when the SBC starts or restarts there are cases immediately
>> after initialization where ptp4l goes into a repeating cycle of 1) syncing
>> to a PTP Grand Master, 2) timing out on the port_delay_request, and 3)
>> resetting the failed port.
>>
>> My SBC is based on an MPC8360E (NXP PowerQUICC® II Pro Processor with
>> ucc_geth MACs) connected to DP83640 PHYTERs.
>
> You want to use the PTP features in the phyters. First step is to make
> sure the MAC driver isn't also doing time stamping.
>
> I don't see PTP support in mainline ucc_geth, but maybe you have a
> vendor kernel with time stamping patched in?
>
> Thanks,
> Richard

Richard,

Thanks for the quick response!

No vendor kernel involved. I have the mainline ucc_geth. On the transmit side, the ucc_geth_start_xmit function calls skb_tx_timestamp(skb), which hooks into the mii_ts->txtstamp() function pointing to dp83640.c. The phyter is doing the timestamp via the PHY MII timestamp interface.

I've been going through the driver code, and all of the components seem to be in there. I'm using Documentation/networking/timestamping.rst as my reference.

So, I'm into the phyter device driver... What's the next step?

Thanks again,
Dan

From: Richard C. <ric...@gm...> - 2021-12-07 16:27:31

On Tue, Dec 07, 2021 at 06:30:32AM -0500, Dan Geer wrote:
> So, I'm into the phyter device driver... What's the next step?

The phyter generates special frames that carry the time stamps. These frames are delivered to the host. Because the frames appear to be PTP frames, the networking stack will deliver them to the phy driver via

    skb_defer_rx_timestamp()

So the flow is:

1. host sends a PTP event message
2. phyter time stamps it and generates a frame back to the host
3. frame is delivered to the phy driver via mii_ts->rxtstamp()
4. driver decodes frame and delivers cmsg to user space

The default Tx timestamp timeout is 1 ms. Maybe your NW stack is just a bit too slow. Try increasing the timeout to 10 ms.

Another thing to consider about the phyter: It generates these special frames, but only when there is some bandwidth left to send them. It will not preempt normal NW traffic. That means you cannot use the 100 Mbit completely.

HTH,
Richard

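[Editor's note] The timeout suggested above is set with the tx_timestamp_timeout option (in milliseconds) in the ptp4l configuration file. A minimal sketch, assuming the hardware time stamping and twoStepFlag settings described earlier in the thread; any other options a given setup needs are omitted:

```text
[global]
time_stamping           hardware
twoStepFlag             0
tx_timestamp_timeout    10
```

This could be run as, for example, `ptp4l -i eth0 -f ptp4l.conf -m`, where the interface name eth0 and the file name ptp4l.conf are placeholders and not taken from the thread.
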
From: Dan G. <gee...@gm...> - 2021-12-07 23:19:37

This is a huge help! Thank you very much!

I set the Tx timestamp timeout up to 50 ms to ensure that it wasn't creating confusion. So far I've found that the PSF_TX status frames (TX timestamps) are not coming back to the dp83640 driver when the problem shows up.

I think I'm on the right track. And, I should be able to squash the bug from here.

Thanks again!
-Dan

On Tue, Dec 7, 2021 at 11:27 AM Richard Cochran <ric...@gm...> wrote:
> On Tue, Dec 07, 2021 at 06:30:32AM -0500, Dan Geer wrote:
> > So, I'm into the phyter device driver... What's the next step?
>
> The phyter generates special frames that carry the time stamps. These
> frames are delivered to the host. Because the frames appear to be PTP
> frames, the networking stack will deliver them to the phy driver via
>
>     skb_defer_rx_timestamp()
>
> So the flow is:
>
> 1. host sends a PTP event message
> 2. phyter time stamps it and generates a frame back to the host
> 3. frame is delivered to the phy driver via mii_ts->rxtstamp()
> 4. driver decodes frame and delivers cmsg to user space
>
> The default Tx timestamp timeout is 1 ms. Maybe your NW stack is just
> a bit too slow. Try increasing the timeout to 10 ms.
>
> Another thing to consider about the phyter: It generates these
> special frames, but only when there is some bandwidth left to send
> them. It will not preempt normal NW traffic. That means you cannot
> use the 100 Mbit completely.
>
> HTH,
> Richard
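
[Editor's note] The reasoning behind raising the timeout to 50 ms can be sketched as a toy model: a longer timeout rescues a status frame that arrives late, but not one that never arrives. This is only an illustration, not code from ptp4l or the dp83640 driver; `StatusFrame`, `wait_for_tx_timestamp`, and all the numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StatusFrame:
    """Hypothetical stand-in for a TX status frame sent back by the PHY."""
    sequence_id: int         # PTP sequence ID of the time-stamped message
    timestamp_ns: int        # time stamp carried by the status frame
    arrival_delay_ms: float  # delay between the send and the frame's arrival


def wait_for_tx_timestamp(sequence_id: int,
                          frames: List[StatusFrame],
                          timeout_ms: float) -> Optional[int]:
    """Return the time stamp for sequence_id, or None on timeout.

    A frame only counts if it matches the sequence ID and arrives
    before the timeout expires.
    """
    for frame in frames:
        if (frame.sequence_id == sequence_id
                and frame.arrival_delay_ms <= timeout_ms):
            return frame.timestamp_ns
    return None


# Case 1: the frame arrives, but later than a 1 ms timeout allows.
late = [StatusFrame(sequence_id=7, timestamp_ns=1_000, arrival_delay_ms=3.0)]
print(wait_for_tx_timestamp(7, late, timeout_ms=1))   # timeout
print(wait_for_tx_timestamp(7, late, timeout_ms=10))  # time stamp found

# Case 2: the frame never arrives, so no timeout value helps.
print(wait_for_tx_timestamp(7, [], timeout_ms=50))    # timeout
```

In this model, a failure that persists at 50 ms corresponds to the second case, which matches the observation above that the status frames are not reaching the driver at all.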