linuxptp-users Mailing List for linuxptp (Page 39)
PTP IEEE 1588 stack for Linux
Brought to you by:
rcochran
From: Vladimir O. <ol...@gm...> - 2021-11-11 22:34:15
On Thu, Nov 11, 2021 at 03:55:29PM -0600, Ed Branch wrote:
> As of commit 380d023abb1fdce0dba9d58ca1abaf2e2de5488f PHC device nodes or
> symlinks named as "/dev/ptp*" where "*" is not a number cause phc2sys to
> fail with "failed to parse PHC index". Note however, if the filename does
> not match this pattern, it happily continues with no PHC index set.

That would be me and my ts2phc patches. Sorry for that.

Prior to that change, posix_clock_open() used to simply not populate
*phc_index when passed a path to a PHC. I don't think there's any kernel
API to deduce the PHC number using ioctls on the char device itself, so
for my use case it would be pretty odd to have a PTP device named
differently than /dev/ptpN.

Nonetheless, phc2sys does not need to know the exact number of the PTP
clock, and therefore, what I can do is move the error one level up, to
the caller. I'll try to come up with a patch tomorrow.

From: Ed B. <br...@ar...> - 2021-11-11 22:11:10
As of commit 380d023abb1fdce0dba9d58ca1abaf2e2de5488f PHC device nodes or
symlinks named as "/dev/ptp*" where "*" is not a number cause phc2sys to
fail with "failed to parse PHC index". Note however, if the filename does
not match this pattern, it happily continues with no PHC index set.

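Editor's sketch of the behaviour under discussion: a small parser that extracts N from a path of the form /dev/ptpN and reports "no index" for anything else, leaving it to the caller to decide whether a missing index is fatal. The function name phc_index_from_path is hypothetical and this is not the linuxptp implementation; it only illustrates the idea of moving the error to the caller.

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper (not linuxptp code): return N for "/dev/ptpN",
 * or -1 when the path does not have exactly that form.
 */
static int phc_index_from_path(const char *path)
{
    static const char prefix[] = "/dev/ptp";
    const char *p;
    int index = 0;

    if (strncmp(path, prefix, sizeof(prefix) - 1) != 0)
        return -1;
    p = path + sizeof(prefix) - 1;
    if (*p == '\0')
        return -1;              /* "/dev/ptp" with no digits */
    for (; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p))
            return -1;          /* e.g. "/dev/ptp_nic" */
        index = index * 10 + (*p - '0');
        if (index > 1000000)
            return -1;          /* arbitrary sanity limit, avoids overflow */
    }
    return index;
}
```

A tool that needs the index (for example, to match a clock against a network interface) can treat -1 as an error, while a tool that only needs an open clock can ignore it.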
From: ramesh t <ram...@ya...> - 2021-11-11 16:54:06
hi Richard,

Verified on the nodes where the problem re-occurred after almost a week
using phc_ctl. The NIC PHC time was fine; only the system time had changed.

Also, as suggested, I was running a script to capture phc_ctl output
periodically, but the issue didn't happen on those nodes. Even nodes
running ptp2.0 didn't report the problem.

Nov 10 23:19:13 phc2sys: [43031.919] CLOCK_REALTIME phc offset -62 s2 freq +4363 delay 2331
Nov 10 23:19:14 ptp4l: [43032.308] rms 3 max 6 freq -834 +/- 4 delay 83 +/- 1
Nov 10 02:38:29 phc2sys: [43032.919] clockcheck: clock jumped backward or running slower than expected!
Nov 10 02:38:29 phc2sys: [43032.919] CLOCK_REALTIME phc offset -74444707194042 s0 freq +4363 delay 2459
Nov 10 17:47:19 ptp4l: [43033.302] rms 3 max 7 freq -837 +/- 5 delay 83 +/- 1

Nov 10 22:36:11 phc2sys: [93661.492] CLOCK_REALTIME phc offset 34 s2 freq +9568 delay 2352
Nov 10 22:36:12 ptp4l: [93662.280] rms 3 max 5 freq +3086 +/- 4 delay 85 +/- 0
Nov 4 12:50:40 phc2sys: [93662.493] clockcheck: clock jumped backward or running slower than expected!
Nov 4 12:50:40 phc2sys: [93662.493] CLOCK_REALTIME phc offset -553532738053986 s0 freq +9568 delay 2455
Nov 10 15:54:49 ptp4l: [93663.272] rms 2 max 5 freq +3084 +/- 3 delay 86 +/- 1

I guess there are two possibilities.
1) Some process is changing the time? Is there a way to monitor which
   processes modify the system time?
2) phc2sys code has issues in the 3.0 version?

Please suggest.

regards,
Ramesh

On Tuesday, November 2, 2021, 10:02:01 PM GMT+5:30, Richard Cochran <ric...@gm...> wrote:

On Tue, Nov 02, 2021 at 03:32:04PM +0000, ramesh t wrote:
> hi,
>
> Any suggestion for the below mentioned issue? Please let me know.

Looks like a HW and/or driver issue. I would start by validating the
HW/driver. For example, a long term test of

    phc_ctl eth0 get

HTH,
Richard

From: Miroslav L. <mli...@re...> - 2021-11-10 08:13:28
On Tue, Nov 09, 2021 at 05:13:49PM +0000, Brooks, Jason wrote:
> So I followed your suggestion for a dummy entry. I made it in the form:
> [ptp_domain 0]
> interfaces enp0s25
>
> and my main domain entry is
> [ptp_domain 44]
> interfaces enp0s25
> delay 10e-6
> ptp4l_option logging_level 7
>
> observations:
>
> ptp4l on domain 0 always chooses the local clock as its best master

That's expected.

> ptp4l on domain 44 seems to cycle between
> "UNCALIBRATED to LISTENING on "ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES
> and
> "LISTENING TO UNCALIBRATED on RS_SLAVE"

Hm, that suggests there are two different issues. Check the firewall
configuration and tcpdump/wireshark output to see if the master
configuration is ok.

Try to get ptp4l working with SW timestamping first. Then you can
enable HW timestamping and only when that works, you should switch to
timemaster.

--
Miroslav Lichvar

From: Brooks, J. <Jas...@Al...> - 2021-11-09 17:14:01
Hello Miroslav,

This is Centos 7.9 with linuxptp-2.0-2.el7_9.1.x86_64 installed, running
kernel version 3.10.0-1160.41.1.el7.x86_64. It is running on a Lenovo
thinkpad t440, with a built-in intel I218-LM ethernet controller, that
uses the e1000e kernel driver.

The output of ethtool -T enp0s25 is:

ethtool -T enp0s25
Time stamping parameters for enp0s25:
Capabilities:
    hardware-transmit     (SOF_TIMESTAMPING_TX_HARDWARE)
    software-transmit     (SOF_TIMESTAMPING_TX_SOFTWARE)
    hardware-receive      (SOF_TIMESTAMPING_RX_HARDWARE)
    software-receive      (SOF_TIMESTAMPING_RX_SOFTWARE)
    software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
    hardware-raw-clock    (SOF_TIMESTAMPING_RAW_HARDWARE)
PTP Hardware Clock: 0
Hardware Transmit Timestamp Modes:
    off                   (HWTSTAMP_TX_OFF)
    on                    (HWTSTAMP_TX_ON)
Hardware Receive Filter Modes:
    none                  (HWTSTAMP_FILTER_NONE)
    all                   (HWTSTAMP_FILTER_ALL)
    ptpv1-l4-sync         (HWTSTAMP_FILTER_PTP_V1_L4_SYNC)
    ptpv1-l4-delay-req    (HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ)
    ptpv2-l4-sync         (HWTSTAMP_FILTER_PTP_V2_L4_SYNC)
    ptpv2-l4-delay-req    (HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ)
    ptpv2-l2-sync         (HWTSTAMP_FILTER_PTP_V2_L2_SYNC)
    ptpv2-l2-delay-req    (HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ)
    ptpv2-event           (HWTSTAMP_FILTER_PTP_V2_EVENT)
    ptpv2-sync            (HWTSTAMP_FILTER_PTP_V2_SYNC)
    ptpv2-delay-req       (HWTSTAMP_FILTER_PTP_V2_DELAY_REQ)

So I followed your suggestion for a dummy entry. I made it in the form:

[ptp_domain 0]
interfaces enp0s25

and my main domain entry is

[ptp_domain 44]
interfaces enp0s25
delay 10e-6
ptp4l_option logging_level 7

observations:

ptp4l on domain 0 always chooses the local clock as its best master

ptp4l on domain 44 seems to cycle between
"UNCALIBRATED to LISTENING on "ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES
and
"LISTENING TO UNCALIBRATED on RS_SLAVE"

phc2sys is only listening to the ptp clock for domain 0 which will not
get synced since there is no domain 0 ptp clock.

And chronyc sources now shows two ptp clocks that are not usable.

chronyc sources
210 Number of sources = 5
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#? PTP0                          0   2     0     -     +0ns[   +0ns] +/-    0ns
#? PTP1                          0   2     0     -     +0ns[   +0ns] +/-    0ns
^+ india.colorado.edu            1   4   203     4  +7589us[+7589us] +/-   29ms
^+ uslax1-ntp-002.aaplimg.c>     1   4   377     5  -1059us[-1059us] +/-   14ms
^* clock.fmt.he.net              1   4   377     6  -2296us[-2251us] +/-   13ms

So it would seem that while that dummy entry stops the "port 1: received
SYNC without timestamp" messages, it doesn't do much good afterwards.

-----Original Message-----
From: Miroslav Lichvar <mli...@re...>
Sent: Monday, November 8, 2021 01:09
To: Brooks, Jason <Jas...@Al...>
Cc: lin...@li...
Subject: [External]Re: [Linuxptp-users] Timemaster question with ptp4l and chrony

On Thu, Nov 04, 2021 at 10:55:49PM +0000, Brooks, Jason wrote:
> 1. Chronyc sources shows
> * 210 Number of sources = 5
> * MS Name/IP address Stratum Poll Reach LastRx Last sample
> * ===============================================================================
> * #? PTP0 0 2 0 - +0ns[ +0ns] +/- 0ns
> * ^+ time-c-b.nist.gov 1 4 377 5 +7083us[+7083us] +/- 28ms
> * ^+ time2.google.com 1 4 377 7 +333us[ +333us] +/- 28ms
> * ^* usscz2-ntp-001.aaplimg.c> 1 4 377 5 -780us[ -780us] +/- 11ms
> * ^+ clock.fmt.he.net 1 4 377 6 -2245us[-2245us] +/- 14ms
>
> Why is chrony showing PTP0 as a problem?

The refclock (i.e. ptp4l+phc2sys) didn't provide any samples.

FYI, Google NTP servers use a nonstandard timescale (leap smear) and
must not be combined with standard NTP servers.

> 1. And when using this system as an ntp source on another system it shows as a stratum 2 system. Shouldn't this be a stratum 1, given the ptp0 clock?
> 2. A lot of messages in /var/log/messages:
> * [44:enp0s25] port 1: received SYNC without timestamp

What OS and version is it? Does it work with SW timestamping?

If you add a dummy PTP domain to timemaster.conf before the 44, it
should force 44 to use SW timestamping.

--
Miroslav Lichvar

From: Vladimir O. <ol...@gm...> - 2021-11-08 22:08:22
On Mon, Nov 08, 2021 at 02:03:56PM -0800, Christopher Wingert wrote:
>
> On 11/8/2021 1:43 PM, Vladimir Oltean wrote:
> > On Mon, Nov 08, 2021 at 01:28:03PM -0800, Christopher Wingert wrote:
> > >
> > > On 11/8/2021 12:38 PM, Vladimir Oltean wrote:
> > > > On Mon, Nov 08, 2021 at 12:11:11PM -0800, Christopher Wingert wrote:
> > > > > Hi,
> > > > >
> > > > > I am working with an Aquantia AQC 107 ethernet interface. After the announce
> > > > > message is sent on FD_GENERAL, a poll() of the FD_GENERAL descriptor
> > > > > generates a POLLERR. I see 3 delay messages go out the interface on
> > > > > FD_EVENT (previous to the announce message) without issue (no socket error
> > > > > on read on the FD_EVENT descriptor).
> > > > >
> > > > > The only difference I see between the two sockets is how the sock_filter is
> > > > > setup.
> > > > >
> > > > > I am thinking this is an issue with the Aquantia driver, as the same command
> > > > > on a Mellanox Connect X5 works fine.
> > > > >
> > > > > Has anyone seen this issue or have a clue as to where I should start?
> > > > >
> > > > > Thanks!
> > > > > Chris
> > > > >
> > > > > ptp4l command line : ptp4l -i els1 -H -P -2 -m
> > > > > Kernel is 4.18
> > > > > I downloaded the latest Atlantic driver from the Marvell website 2.4.14.0
> > > > > I have upgraded the AQC 107 firmware to 3.1.121
> > > >
> > > > I've no experience with this driver whatsoever, but generally, what
> > > > ptp4l receives on the error queue of a socket is a TX timestamp. What is
> > > > surprising is that there's a TX timestamp for a general (not event)
> > > > message, because ptp4l does not ask these to be timestamped.
> > > >
> > > > Apart from the error messages, does the system otherwise behave ok?
> > > >
> > > > You can try to read from the general message socket into a packet buffer
> > > > and hexdump it, put it in tcpdump and see what it is. Then the next step
> > > > might be to process its control messages (cmsg), although my first guess
> > > > would be that TX timestamping is what's going on.
> > > >
> > > > There are plenty of things that could go wrong in a driver (especially
> > > > in one you downloaded from the vendor's website and not from kernel.org).
> > > > If you're handy with the source code, you can check what is the
> > > > condition based on which this driver offers hardware TX timestamps to
> > > > the stack. It should be if skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP
> > > > is set for that packet, AND hardware TX timestamping was requested
> > > > through HWTSTAMP_TX_ON.
> > >
> > > Thank you for the quick response!
> > >
> > > This is what the current version from git looks like on the 107 without any
> > > code changes (3 delay requests, 1 announce), this loops indefinitely and
> > > MASTER never gets enabled.
> > > ptp4l[506134.862]: selected /dev/ptp11 as PTP clock
> > > ptp4l[506134.889]: port 1 (els1): INITIALIZING to LISTENING on INIT_COMPLETE
> > > ptp4l[506134.889]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on INIT_COMPLETE
> > > ptp4l[506134.889]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on INIT_COMPLETE
> > > ptp4l[506141.948]: port 1 (els1): LISTENING to MASTER on ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES
> > > ptp4l[506141.948]: selected local clock ac1f6b.fffe.dce92d as best master
> > > ptp4l[506141.948]: port 1 (els1): assuming the grand master role
> > > ptp4l[506141.950]: port 1 (els1): unexpected socket error
> > > ptp4l[506141.950]: port 1 (els1): MASTER to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
> > >
> > > I changed raw.c function raw_send() to the below code to get the timestamp
> > > on both sockets.
> > > /*
> > >  * Get the time stamp right away.
> > >  */
> > > // return event == TRANS_EVENT ? sk_receive(fd, pkt, len, NULL, hwts, MSG_ERRQUEUE) : cnt;
> > > if ( event == TRANS_EVENT ) return sk_receive(fd, pkt, len, NULL, hwts, MSG_ERRQUEUE);
> > > if ( event == TRANS_GENERAL ) return sk_receive(fd, pkt, len, NULL, hwts, MSG_ERRQUEUE);
> > > return cnt;
> > >
> > > This is the result.
> > > ptp4l[506201.215]: selected /dev/ptp11 as PTP clock
> > > ptp4l[506201.245]: port 1 (els1): INITIALIZING to LISTENING on INIT_COMPLETE
> > > ptp4l[506201.245]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on INIT_COMPLETE
> > > ptp4l[506201.245]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on INIT_COMPLETE
> > > ptp4l[506208.757]: port 1 (els1): LISTENING to MASTER on ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES
> > > ptp4l[506208.757]: selected local clock ac1f6b.fffe.dce92d as best master
> > > ptp4l[506208.757]: port 1 (els1): assuming the grand master role
> > > ptp4l[506208.759]: poll for tx timestamp woke up on non ERR event
> > > ptp4l[506208.759]: port 1 (els1): send announce failed
> > > ptp4l[506208.759]: port 1 (els1): MASTER to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
> > >
> > > Unless there is something wrong in my code change, it doesn't seem to be a
> > > timestamp.
> > >
> > > Are you saying that every POLLERR should be combined with a message in the
> > > Error Queue?
> >
> > It's still implausible that it's not a timestamp (and I don't know what
> > it can be if that's not it). "man poll" only says:
> >
> >     POLLERR
> >         Error condition (only returned in revents; ignored in
> >         events). This bit is also set for a file descriptor
> >         referring to the write end of a pipe when the read end has
> >         been closed.
> >
> > and since ptp4l does not open connection-oriented sockets for general
> > PTP messages, I don't think it can detect that the read end has been
> > closed.
> >
> > What seems to be more likely to be going on is that you haven't made all
> > changes necessary for reading TX timestamps from the error queue of the
> > general socket. Have you called sk_timestamping_init?
> >
> >     flags = 1;
> >     if (setsockopt(fd, SOL_SOCKET, SO_SELECT_ERR_QUEUE,
> >                    &flags, sizeof(flags)) < 0) {
> >         pr_warning("%s: SO_SELECT_ERR_QUEUE: %m", device);
> >         sk_events = 0;
> >         sk_revents = POLLERR;
> >     }
> >
> > introduced by this kernel commit:
> >
> > commit 7d4c04fc170087119727119074e72445f2bb192b
> > Author: Keller, Jacob E <jac...@in...>
> > Date: Thu Mar 28 11:19:25 2013 +0000
> >
> >     net: add option to enable error queue packets waking select
> >
> >     Currently, when a socket receives something on the error queue it only wakes up
> >     the socket on select if it is in the "read" list, that is the socket has
> >     something to read. It is useful also to wake the socket if it is in the error
> >     list, which would enable software to wait on error queue packets without waking
> >     up for regular data on the socket. The main use case is for receiving
> >     timestamped transmit packets which return the timestamp to the socket via the
> >     error queue. This enables an application to select on the socket for the error
> >     queue only instead of for the regular traffic.
> >
> >     -v2-
> >     * Added the SO_SELECT_ERR_QUEUE socket option to every architechture specific file
> >     * Modified every socket poll function that checks error queue
> >
> >     Signed-off-by: Jacob Keller <jac...@in...>
> >     Cc: Jeffrey Kirsher <jef...@in...>
> >     Cc: Richard Cochran <ric...@gm...>
> >     Cc: Matthew Vick <mat...@in...>
> >     Signed-off-by: David S. Miller <da...@da...>
> >
> > So you effectively cannot call poll() or select() on the error queue of
> > a socket without enabling this option. Also, I think the sk_receive()
> > function messes up quite badly, because of this inconsistent mode in
> > which it's operating. See, it looks at this global variable called
> > sk_revents to figure out which events is poll() supposed to return. But
> > the code was written assuming that there's a single socket on which you
> > will poll for TX timestamps. And you have two, and configured
> > differently, at that: on one you call sk_timestamping_init() and on the
> > other you don't (or at least you don't mention that you do).
>
> Again, thank you for the quick response. I owe you a beer.
>
> Adding the call to sk_timestamping_init() in raw_open() for FD_GENERAL
> allowed ptp to come online to a steady state and assume a MASTER role.
>
> I am guessing the AQC 107 driver is getting confused on which socket
> actually wants timestamps.

Please read carefully my second reply. The test is invalid as it stands.
Either call only the setsockopt(SO_SELECT_ERR_QUEUE) portion instead of
the whole sk_timestamping_init() for the general message socket, or
remove the revents check from sk_receive().

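Editor's sketch of the first alternative suggested above: enabling only SO_SELECT_ERR_QUEUE on a socket, without requesting timestamps, and then waiting for error-queue data with poll(). The function names are hypothetical and this is not linuxptp code. The fallback value 45 for SO_SELECT_ERR_QUEUE is an assumption that holds on most Linux architectures but not all; current system headers normally define the constant already.

```c
#include <poll.h>
#include <sys/socket.h>

#ifndef SO_SELECT_ERR_QUEUE
#define SO_SELECT_ERR_QUEUE 45 /* assumed value; some architectures differ */
#endif

/*
 * Ask the kernel to report pending error-queue data as POLLPRI.
 * Returns 0 on success, -1 on failure (errno set by setsockopt).
 */
static int enable_errqueue_wakeup(int fd)
{
    int on = 1;

    return setsockopt(fd, SOL_SOCKET, SO_SELECT_ERR_QUEUE, &on, sizeof(on));
}

/*
 * Wait up to timeout_ms for something to appear on the error queue.
 * Returns 1 if POLLPRI was reported, 0 on timeout, -1 on a poll()
 * failure or on a wakeup for some other reason.
 */
static int wait_for_errqueue(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLPRI, .revents = 0 };
    int res = poll(&pfd, 1, timeout_ms);

    if (res < 0)
        return -1;
    if (res == 0)
        return 0;
    return (pfd.revents & POLLPRI) ? 1 : -1;
}
```

Because no timestamping is requested here, a wakeup on the general socket under this configuration would indicate that something in the stack or driver is queueing error-queue data that the application never asked for, which is the question the thread is trying to answer.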
From: Christopher W. <us...@wi...> - 2021-11-08 22:04:28
On 11/8/2021 1:43 PM, Vladimir Oltean wrote:
> [...]
>
> What seems to be more likely to be going on is that you haven't made all
> changes necessary for reading TX timestamps from the error queue of the
> general socket. Have you called sk_timestamping_init?
>
> [...]
>
> So you effectively cannot call poll() or select() on the error queue of
> a socket without enabling this option. Also, I think the sk_receive()
> function messes up quite badly, because of this inconsistent mode in
> which it's operating. See, it looks at this global variable called
> sk_revents to figure out which events is poll() supposed to return. But
> the code was written assuming that there's a single socket on which you
> will poll for TX timestamps. And you have two, and configured
> differently, at that: on one you call sk_timestamping_init() and on the
> other you don't (or at least you don't mention that you do).

Again, thank you for the quick response. I owe you a beer.

Adding the call to sk_timestamping_init() in raw_open() for FD_GENERAL
allowed ptp to come online to a steady state and assume a MASTER role.

I am guessing the AQC 107 driver is getting confused on which socket
actually wants timestamps.

Thanks again!
Chris

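Editor's sketch related to the earlier advice in this thread to read the general socket's error queue and inspect its control messages (cmsg). The function below reads one error-queue message without blocking and looks for a SOL_SOCKET/SO_TIMESTAMPING control message, which carries three timespec slots. The function name and buffer sizes are hypothetical, and the code assumes a 64-bit Linux system where the kernel's timestamp layout matches struct timespec.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <time.h>

/*
 * Read one message from the socket's error queue without blocking.
 * Returns  1 if a timestamping control message was found; *hwts then
 *            holds its hardware slot (index 2), which may be all zero,
 *          0 if a message was read but had no timestamping cmsg,
 *         -1 if the queue was empty or recvmsg() failed (see errno).
 */
static int read_tx_timestamp(int fd, struct timespec *hwts)
{
    unsigned char pkt[1600];
    union {
        char buf[256];
        struct cmsghdr align;   /* forces suitable alignment */
    } control;
    struct iovec iov = { .iov_base = pkt, .iov_len = sizeof(pkt) };
    struct msghdr msg;
    struct cmsghdr *cm;
    struct timespec ts[3];

    memset(&msg, 0, sizeof(msg));
    memset(&control, 0, sizeof(control));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    if (recvmsg(fd, &msg, MSG_ERRQUEUE | MSG_DONTWAIT) < 0)
        return -1;

    for (cm = CMSG_FIRSTHDR(&msg); cm != NULL; cm = CMSG_NXTHDR(&msg, cm)) {
        if (cm->cmsg_level != SOL_SOCKET || cm->cmsg_type != SO_TIMESTAMPING)
            continue;
        if (cm->cmsg_len < CMSG_LEN(sizeof(ts)))
            continue;
        memcpy(ts, CMSG_DATA(cm), sizeof(ts));
        *hwts = ts[2];          /* slot 0 is the software timestamp */
        return 1;
    }
    return 0;
}
```

A return value of 1 on the general socket would support the hypothesis that the driver is delivering a TX timestamp for a message that did not request one; a return of 0 would mean the error queue held something else.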
From: Vladimir O. <ol...@gm...> - 2021-11-08 21:51:12
|
On Mon, Nov 08, 2021 at 11:43:55PM +0200, Vladimir Oltean wrote: > On Mon, Nov 08, 2021 at 01:28:03PM -0800, Christopher Wingert wrote: > > > > > > On 11/8/2021 12:38 PM, Vladimir Oltean wrote: > > > On Mon, Nov 08, 2021 at 12:11:11PM -0800, Christopher Wingert wrote: > > > > Hi, > > > > > > > > I am working with a Aquantia AQC 107 ethernet interface. After the announce > > > > message is sent on FD_GENERAL, a poll() of the the FD_GENERAL descriptor > > > > generates a POLLERR. I see 3 delay messages go out the interface on > > > > FD_EVENT (previous to the announce message) without issue (no socket error > > > > on read on the FD_EVENT descriptor). > > > > > > > > The only difference i see between the two sockets is how the sock_filter is > > > > setup. > > > > > > > > I am thinking this is an issue with the Aquantia driver, as the same command > > > > on a Mellanox Connect X5 works fine. > > > > > > > > Has anyone seen this issue or have a clue as to where I should start? > > > > > > > > Thanks! > > > > Chris > > > > > > > > > > > > ptp4l command line : ptp4l -i els1 -H -P -2 -m > > > > Kernel is 4.18 > > > > I downloaded the latest Atlantic driver from the Marvell website 2.4.14.0 > > > > I have upgraded the AQC 107 firmware to 3.1.121 > > > I've no experience with this driver whatsoever, but generally, what > > > ptp4l receives on the error queue of a socket is a TX timestamp. What is > > > surprising is that there's a TX timestamp for a general (not event) > > > message, because ptp4l does not ask these to be timestamped. > > > > > > Apart from the error messages, does the system otherwise behave ok? > > > > > > You can try to read from the general message socket into a packet buffer > > > and hexdump it, put it in tcpdump and see what it is. Then the next step > > > might be to process its control messages (cmsg), although my first guess > > > would be that TX timestamping is what's going on. 
> > > > > > There are plenty of things that could go wrong in a driver (especially > > > in one you downloaded from the vendor's website and not from kernel.org). > > > If you're handy with the source code, you can check what is the > > > condition based on which this driver offers hardware TX timestamps to > > > the stack. It should be if skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP > > > is set for that packet, AND hardware TX timestamping was requested > > > through HWTSTAMP_TX_ON. > > > > Thank you for the quick response! > > > > This is what the current version from git looks like on the 107 without any > > code changes (3 delay requests, 1 announce), this loops indefinitely and > > MASTER never gets enabled. > > ptp4l[506134.862]: selected /dev/ptp11 as PTP clock > > ptp4l[506134.889]: port 1 (els1): INITIALIZING to LISTENING on INIT_COMPLETE > > ptp4l[506134.889]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on > > INIT_COMPLETE > > ptp4l[506134.889]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on > > INIT_COMPLETE > > ptp4l[506141.948]: port 1 (els1): LISTENING to MASTER on > > ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES > > ptp4l[506141.948]: selected local clock ac1f6b.fffe.dce92d as best master > > ptp4l[506141.948]: port 1 (els1): assuming the grand master role > > ptp4l[506141.950]: port 1 (els1): unexpected socket error > > ptp4l[506141.950]: port 1 (els1): MASTER to FAULTY on FAULT_DETECTED > > (FT_UNSPECIFIED) > > > > > > I changed raw.c function raw_send() to the below code to get the timestamp > > on both sockets. > > /* > > * Get the time stamp right away. > > */ > > // return event == TRANS_EVENT ? sk_receive(fd, pkt, len, NULL, hwts, MSG_ERRQUEUE) : cnt; > > if ( event == TRANS_EVENT ) return sk_receive(fd, pkt, len, NULL, hwts, MSG_ERRQUEUE); > > if ( event == TRANS_GENERAL ) return sk_receive(fd, pkt, len, NULL, hwts, MSG_ERRQUEUE); > > return cnt; > > > > This is the result. 
> > ptp4l[506201.215]: selected /dev/ptp11 as PTP clock > > ptp4l[506201.245]: port 1 (els1): INITIALIZING to LISTENING on INIT_COMPLETE > > ptp4l[506201.245]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on INIT_COMPLETE > > ptp4l[506201.245]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on INIT_COMPLETE > > ptp4l[506208.757]: port 1 (els1): LISTENING to MASTER on ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES > > ptp4l[506208.757]: selected local clock ac1f6b.fffe.dce92d as best master > > ptp4l[506208.757]: port 1 (els1): assuming the grand master role > > ptp4l[506208.759]: poll for tx timestamp woke up on non ERR event > > ptp4l[506208.759]: port 1 (els1): send announce failed > > ptp4l[506208.759]: port 1 (els1): MASTER to FAULTY on FAULT_DETECTED > > (FT_UNSPECIFIED) > > > > Unless there is something wrong in my code change, it doesn't seem to be a > > timestamp. > > > > Are you saying that every POLLERR should be combined with a message in the > > Error Queue? > > It's still implausible that it's not a timestamp (and I don't know what > it can be if that's not it). "man poll" only says: > > POLLERR > Error condition (only returned in revents; ignored in > events). This bit is also set for a file descriptor > referring to the write end of a pipe when the read end has > been closed. > > and since ptp4l does not open connection-oriented sockets for general > PTP messages, I don't think it can detect that the read end has been > closed. > > What seems to be more likely to be going on is that you haven't made all > changes necessary for reading TX timestamps from the error queue of the > general socket. Have you called sk_timestamping_init? 
> > flags = 1; > if (setsockopt(fd, SOL_SOCKET, SO_SELECT_ERR_QUEUE, > &flags, sizeof(flags)) < 0) { > pr_warning("%s: SO_SELECT_ERR_QUEUE: %m", device); > sk_events = 0; > sk_revents = POLLERR; > } > > introduced by this kernel commit: > > commit 7d4c04fc170087119727119074e72445f2bb192b > Author: Keller, Jacob E <jac...@in...> > Date: Thu Mar 28 11:19:25 2013 +0000 > > net: add option to enable error queue packets waking select > > Currently, when a socket receives something on the error queue it only wakes up > the socket on select if it is in the "read" list, that is the socket has > something to read. It is useful also to wake the socket if it is in the error > list, which would enable software to wait on error queue packets without waking > up for regular data on the socket. The main use case is for receiving > timestamped transmit packets which return the timestamp to the socket via the > error queue. This enables an application to select on the socket for the error > queue only instead of for the regular traffic. > > -v2- > * Added the SO_SELECT_ERR_QUEUE socket option to every architechture specific file > * Modified every socket poll function that checks error queue > > Signed-off-by: Jacob Keller <jac...@in...> > Cc: Jeffrey Kirsher <jef...@in...> > Cc: Richard Cochran <ric...@gm...> > Cc: Matthew Vick <mat...@in...> > Signed-off-by: David S. Miller <da...@da...> > > So you effectively cannot call poll() or select() on the error queue of > a socket without enabling this option. Also, I think the sk_receive() > function messes up quite badly, because of this incosistent mode in > which it's operating. See, it looks at this global variable called > sk_revents to figure out which events is poll() supposed to return. But > the code was written assuming that there's a single socket on which you > will poll for TX timestamps. 
And you have two, and configured
> differently, at that: on one you call sk_timestamping_init() and on the
> other you don't (or at least you don't mention that you do).

I think we are running around in circles. If you call sk_timestamping_init()
you are effectively requesting TX timestamps for general messages, therefore
changing the premise of the issue.

Can you instead remove that check for !(pfd.revents & sk_revents)?
|
From: Vladimir O. <ol...@gm...> - 2021-11-08 21:44:03
|
On Mon, Nov 08, 2021 at 01:28:03PM -0800, Christopher Wingert wrote: > > > On 11/8/2021 12:38 PM, Vladimir Oltean wrote: > > On Mon, Nov 08, 2021 at 12:11:11PM -0800, Christopher Wingert wrote: > > > Hi, > > > > > > I am working with a Aquantia AQC 107 ethernet interface. After the announce > > > message is sent on FD_GENERAL, a poll() of the the FD_GENERAL descriptor > > > generates a POLLERR. I see 3 delay messages go out the interface on > > > FD_EVENT (previous to the announce message) without issue (no socket error > > > on read on the FD_EVENT descriptor). > > > > > > The only difference i see between the two sockets is how the sock_filter is > > > setup. > > > > > > I am thinking this is an issue with the Aquantia driver, as the same command > > > on a Mellanox Connect X5 works fine. > > > > > > Has anyone seen this issue or have a clue as to where I should start? > > > > > > Thanks! > > > Chris > > > > > > > > > ptp4l command line : ptp4l -i els1 -H -P -2 -m > > > Kernel is 4.18 > > > I downloaded the latest Atlantic driver from the Marvell website 2.4.14.0 > > > I have upgraded the AQC 107 firmware to 3.1.121 > > I've no experience with this driver whatsoever, but generally, what > > ptp4l receives on the error queue of a socket is a TX timestamp. What is > > surprising is that there's a TX timestamp for a general (not event) > > message, because ptp4l does not ask these to be timestamped. > > > > Apart from the error messages, does the system otherwise behave ok? > > > > You can try to read from the general message socket into a packet buffer > > and hexdump it, put it in tcpdump and see what it is. Then the next step > > might be to process its control messages (cmsg), although my first guess > > would be that TX timestamping is what's going on. > > > > There are plenty of things that could go wrong in a driver (especially > > in one you downloaded from the vendor's website and not from kernel.org). 
> > If you're handy with the source code, you can check what is the > > condition based on which this driver offers hardware TX timestamps to > > the stack. It should be if skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP > > is set for that packet, AND hardware TX timestamping was requested > > through HWTSTAMP_TX_ON. > > Thank you for the quick response! > > This is what the current version from git looks like on the 107 without any > code changes (3 delay requests, 1 announce), this loops indefinitely and > MASTER never gets enabled. > ptp4l[506134.862]: selected /dev/ptp11 as PTP clock > ptp4l[506134.889]: port 1 (els1): INITIALIZING to LISTENING on INIT_COMPLETE > ptp4l[506134.889]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on > INIT_COMPLETE > ptp4l[506134.889]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on > INIT_COMPLETE > ptp4l[506141.948]: port 1 (els1): LISTENING to MASTER on > ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES > ptp4l[506141.948]: selected local clock ac1f6b.fffe.dce92d as best master > ptp4l[506141.948]: port 1 (els1): assuming the grand master role > ptp4l[506141.950]: port 1 (els1): unexpected socket error > ptp4l[506141.950]: port 1 (els1): MASTER to FAULTY on FAULT_DETECTED > (FT_UNSPECIFIED) > > > I changed raw.c function raw_send() to the below code to get the timestamp > on both sockets. > /* > * Get the time stamp right away. > */ > // return event == TRANS_EVENT ? sk_receive(fd, pkt, len, NULL, hwts, MSG_ERRQUEUE) : cnt; > if ( event == TRANS_EVENT ) return sk_receive(fd, pkt, len, NULL, hwts, MSG_ERRQUEUE); > if ( event == TRANS_GENERAL ) return sk_receive(fd, pkt, len, NULL, hwts, MSG_ERRQUEUE); > return cnt; > > This is the result. 
> ptp4l[506201.215]: selected /dev/ptp11 as PTP clock > ptp4l[506201.245]: port 1 (els1): INITIALIZING to LISTENING on INIT_COMPLETE > ptp4l[506201.245]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on INIT_COMPLETE > ptp4l[506201.245]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on INIT_COMPLETE > ptp4l[506208.757]: port 1 (els1): LISTENING to MASTER on ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES > ptp4l[506208.757]: selected local clock ac1f6b.fffe.dce92d as best master > ptp4l[506208.757]: port 1 (els1): assuming the grand master role > ptp4l[506208.759]: poll for tx timestamp woke up on non ERR event > ptp4l[506208.759]: port 1 (els1): send announce failed > ptp4l[506208.759]: port 1 (els1): MASTER to FAULTY on FAULT_DETECTED > (FT_UNSPECIFIED) > > Unless there is something wrong in my code change, it doesn't seem to be a > timestamp. > > Are you saying that every POLLERR should be combined with a message in the > Error Queue? It's still implausible that it's not a timestamp (and I don't know what it can be if that's not it). "man poll" only says: POLLERR Error condition (only returned in revents; ignored in events). This bit is also set for a file descriptor referring to the write end of a pipe when the read end has been closed. and since ptp4l does not open connection-oriented sockets for general PTP messages, I don't think it can detect that the read end has been closed. What seems to be more likely to be going on is that you haven't made all changes necessary for reading TX timestamps from the error queue of the general socket. Have you called sk_timestamping_init? 
	flags = 1;
	if (setsockopt(fd, SOL_SOCKET, SO_SELECT_ERR_QUEUE,
		       &flags, sizeof(flags)) < 0) {
		pr_warning("%s: SO_SELECT_ERR_QUEUE: %m", device);
		sk_events = 0;
		sk_revents = POLLERR;
	}

introduced by this kernel commit:

commit 7d4c04fc170087119727119074e72445f2bb192b
Author: Keller, Jacob E <jac...@in...>
Date:   Thu Mar 28 11:19:25 2013 +0000

    net: add option to enable error queue packets waking select

    Currently, when a socket receives something on the error queue it only wakes up
    the socket on select if it is in the "read" list, that is the socket has
    something to read. It is useful also to wake the socket if it is in the error
    list, which would enable software to wait on error queue packets without waking
    up for regular data on the socket. The main use case is for receiving
    timestamped transmit packets which return the timestamp to the socket via the
    error queue. This enables an application to select on the socket for the error
    queue only instead of for the regular traffic.

    -v2-
    * Added the SO_SELECT_ERR_QUEUE socket option to every architechture specific file
    * Modified every socket poll function that checks error queue

    Signed-off-by: Jacob Keller <jac...@in...>
    Cc: Jeffrey Kirsher <jef...@in...>
    Cc: Richard Cochran <ric...@gm...>
    Cc: Matthew Vick <mat...@in...>
    Signed-off-by: David S. Miller <da...@da...>

So you effectively cannot call poll() or select() on the error queue of
a socket without enabling this option. Also, I think the sk_receive()
function messes up quite badly, because of this inconsistent mode in
which it's operating. See, it looks at this global variable called
sk_revents to figure out which events poll() is supposed to return. But
the code was written assuming that there's a single socket on which you
will poll for TX timestamps. And you have two, and configured
differently, at that: on one you call sk_timestamping_init() and on the
other you don't (or at least you don't mention that you do).
|
From: Christopher W. <us...@wi...> - 2021-11-08 21:28:25
|
On 11/8/2021 12:38 PM, Vladimir Oltean wrote: > On Mon, Nov 08, 2021 at 12:11:11PM -0800, Christopher Wingert wrote: >> Hi, >> >> I am working with a Aquantia AQC 107 ethernet interface. After the announce >> message is sent on FD_GENERAL, a poll() of the the FD_GENERAL descriptor >> generates a POLLERR. I see 3 delay messages go out the interface on >> FD_EVENT (previous to the announce message) without issue (no socket error >> on read on the FD_EVENT descriptor). >> >> The only difference i see between the two sockets is how the sock_filter is >> setup. >> >> I am thinking this is an issue with the Aquantia driver, as the same command >> on a Mellanox Connect X5 works fine. >> >> Has anyone seen this issue or have a clue as to where I should start? >> >> Thanks! >> Chris >> >> >> ptp4l command line : ptp4l -i els1 -H -P -2 -m >> Kernel is 4.18 >> I downloaded the latest Atlantic driver from the Marvell website 2.4.14.0 >> I have upgraded the AQC 107 firmware to 3.1.121 > I've no experience with this driver whatsoever, but generally, what > ptp4l receives on the error queue of a socket is a TX timestamp. What is > surprising is that there's a TX timestamp for a general (not event) > message, because ptp4l does not ask these to be timestamped. > > Apart from the error messages, does the system otherwise behave ok? > > You can try to read from the general message socket into a packet buffer > and hexdump it, put it in tcpdump and see what it is. Then the next step > might be to process its control messages (cmsg), although my first guess > would be that TX timestamping is what's going on. > > There are plenty of things that could go wrong in a driver (especially > in one you downloaded from the vendor's website and not from kernel.org). > If you're handy with the source code, you can check what is the > condition based on which this driver offers hardware TX timestamps to > the stack. 
It should be if skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP
> is set for that packet, AND hardware TX timestamping was requested
> through HWTSTAMP_TX_ON.

Thank you for the quick response!

This is what the current version from git looks like on the 107 without any
code changes (3 delay requests, 1 announce). It loops indefinitely and
MASTER never gets enabled.

ptp4l[506134.862]: selected /dev/ptp11 as PTP clock
ptp4l[506134.889]: port 1 (els1): INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[506134.889]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[506134.889]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[506141.948]: port 1 (els1): LISTENING to MASTER on ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES
ptp4l[506141.948]: selected local clock ac1f6b.fffe.dce92d as best master
ptp4l[506141.948]: port 1 (els1): assuming the grand master role
ptp4l[506141.950]: port 1 (els1): unexpected socket error
ptp4l[506141.950]: port 1 (els1): MASTER to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)

I changed the raw_send() function in raw.c to the code below to get the
timestamp on both sockets.

	/*
	 * Get the time stamp right away.
	 */
	// return event == TRANS_EVENT ? sk_receive(fd, pkt, len, NULL, hwts, MSG_ERRQUEUE) : cnt;
	if ( event == TRANS_EVENT ) return sk_receive(fd, pkt, len, NULL, hwts, MSG_ERRQUEUE);
	if ( event == TRANS_GENERAL ) return sk_receive(fd, pkt, len, NULL, hwts, MSG_ERRQUEUE);
	return cnt;

This is the result.

ptp4l[506201.215]: selected /dev/ptp11 as PTP clock
ptp4l[506201.245]: port 1 (els1): INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[506201.245]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[506201.245]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[506208.757]: port 1 (els1): LISTENING to MASTER on ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES
ptp4l[506208.757]: selected local clock ac1f6b.fffe.dce92d as best master
ptp4l[506208.757]: port 1 (els1): assuming the grand master role
ptp4l[506208.759]: poll for tx timestamp woke up on non ERR event
ptp4l[506208.759]: port 1 (els1): send announce failed
ptp4l[506208.759]: port 1 (els1): MASTER to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)

Unless there is something wrong in my code change, it doesn't seem to be a
timestamp.

Are you saying that every POLLERR should be combined with a message in the
Error Queue?
|
From: Vladimir O. <ol...@gm...> - 2021-11-08 20:38:45
|
On Mon, Nov 08, 2021 at 12:11:11PM -0800, Christopher Wingert wrote:
> Hi,
>
> I am working with a Aquantia AQC 107 ethernet interface. After the announce
> message is sent on FD_GENERAL, a poll() of the the FD_GENERAL descriptor
> generates a POLLERR. I see 3 delay messages go out the interface on
> FD_EVENT (previous to the announce message) without issue (no socket error
> on read on the FD_EVENT descriptor).
>
> The only difference i see between the two sockets is how the sock_filter is
> setup.
>
> I am thinking this is an issue with the Aquantia driver, as the same command
> on a Mellanox Connect X5 works fine.
>
> Has anyone seen this issue or have a clue as to where I should start?
>
> Thanks!
> Chris
>
>
> ptp4l command line : ptp4l -i els1 -H -P -2 -m
> Kernel is 4.18
> I downloaded the latest Atlantic driver from the Marvell website 2.4.14.0
> I have upgraded the AQC 107 firmware to 3.1.121

I've no experience with this driver whatsoever, but generally, what
ptp4l receives on the error queue of a socket is a TX timestamp. What is
surprising is that there's a TX timestamp for a general (not event)
message, because ptp4l does not ask these to be timestamped.

Apart from the error messages, does the system otherwise behave ok?

You can try to read from the general message socket into a packet buffer
and hexdump it, put it in tcpdump and see what it is. Then the next step
might be to process its control messages (cmsg), although my first guess
would be that TX timestamping is what's going on.

There are plenty of things that could go wrong in a driver (especially
in one you downloaded from the vendor's website and not from kernel.org).
If you're handy with the source code, you can check what is the
condition based on which this driver offers hardware TX timestamps to
the stack. It should be if skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP
is set for that packet, AND hardware TX timestamping was requested
through HWTSTAMP_TX_ON.
|
From: Christopher W. <us...@wi...> - 2021-11-08 20:28:04
|
Hi,

I am working with an Aquantia AQC 107 Ethernet interface. After the announce
message is sent on FD_GENERAL, a poll() of the FD_GENERAL descriptor
generates a POLLERR. I see 3 delay messages go out the interface on
FD_EVENT (before the announce message) without issue (no socket error
on read on the FD_EVENT descriptor).

The only difference I see between the two sockets is how the sock_filter is
set up.

I am thinking this is an issue with the Aquantia driver, as the same command
on a Mellanox Connect X5 works fine.

Has anyone seen this issue or have a clue as to where I should start?

Thanks!
Chris

ptp4l command line: ptp4l -i els1 -H -P -2 -m
Kernel is 4.18
I downloaded the latest Atlantic driver from the Marvell website (2.4.14.0)
I have upgraded the AQC 107 firmware to 3.1.121
|
From: Federico M. G. <fed...@gm...> - 2021-11-08 17:02:58
|
Hey everyone,

I am quite new to PTP deployment and got stuck while setting up my PTP client
on an Ubuntu laptop.

I have a device where Ubuntu 18.04 is running and a GPS signal is available.
This node is acting as a Grandmaster clock.

On the other side of the Ethernet connection I have my Ubuntu 20.04 laptop,
where I am running the PTP client. I can see that it receives some PTP
messages (I saw the packets in Wireshark to confirm this) and that the client
perceives a Master in the LAN.

Everything looks good, but now I can't set the client to use this clock as its
main clock. I would like to change the system clock in my laptop to
synchronize with the master. I used phc2sys as follows, but I get this output
and it gets stuck there repeating the "waiting for ptp4l" message:

fcs@fcs-ThinkPad-T470P:~$ sudo phc2sys -a -r -m
phc2sys[682997.385]: Waiting for ptp4l...
phc2sys[682998.387]: Waiting for ptp4l...
phc2sys[682999.388]: Waiting for ptp4l...

Any help would be very appreciated. Thanks in advance!

This is the output of the services where ptp4l is running (I received some
scripts from the GPS device vendor with these services, master and slave):

MASTER:

root@MK6C:~# systemctl status ptp_master
● ptp_master.service - Linux PTP Time Service
   Loaded: loaded (/etc/systemd/system/ptp_master.service; disabled; vendor preset: enabled)
   Active: active (running) since Mon 2021-11-08 16:36:06 UTC; 6s ago
 Main PID: 18357 (ptp_start_maste)
    Tasks: 2 (limit: 2128)
   CGroup: /system.slice/ptp_master.service
           ├─18357 /bin/bash /mnt/rw/ptp/ptp_start_master
           └─18359 ptp4l -E -S -l 5 -f /mnt/rw/ptp/ptp4l_master.conf -m

Nov 08 16:36:06 MK6C ptp_start_master[18357]: + '[' -f /mnt/rw/ptp/ptp4l_master.conf ']'
Nov 08 16:36:06 MK6C ptp_start_master[18357]: + killall -9 ptp4l
Nov 08 16:36:07 MK6C ptp_start_master[18357]: ptp4l: no process found
Nov 08 16:36:07 MK6C ptp_start_master[18357]: + nice -n 10 ptp4l -E -S -l 5 -f /mnt/rw/ptp/ptp4l_master.conf -m
Nov 08 16:36:07 MK6C ptp_start_master[18357]: ptp4l[20525.075]: port 1: INITIALIZING to LISTENING on INITIALIZE
Nov 08 16:36:07 MK6C ptp_start_master[18357]: ptp4l[20525.075]: port 0: INITIALIZING to LISTENING on INITIALIZE
Nov 08 16:36:07 MK6C ptp_start_master[18357]: ptp4l[20525.076]: port 1: link up
Nov 08 16:36:13 MK6C ptp_start_master[18357]: ptp4l[20531.263]: port 1: LISTENING to MASTER on ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES
Nov 08 16:36:13 MK6C ptp_start_master[18357]: ptp4l[20531.263]: selected best master clock 04e548.fffe.230108
Nov 08 16:36:13 MK6C ptp_start_master[18357]: ptp4l[20531.263]: assuming the grand master role

SLAVE:

(base) fcs@fcs-ThinkPad-T470P:~$ sudo systemctl status ptp_slave.service
● ptp_slave.service - Linux PTP Time Service
   Loaded: loaded (/etc/systemd/system/ptp_slave.service; disabled; vendor preset: enabled)
   Active: active (running) since Mon 2021-11-08 16:36:54 GMT; 7s ago
 Main PID: 594428 (ptp_start_slave)
    Tasks: 2 (limit: 18781)
   Memory: 664.0K
   CGroup: /system.slice/ptp_slave.service
           ├─594428 /bin/bash /mnt/rw/ptp/ptp_start_slave
           └─594430 ptp4l -E -S -l 5 -f /mnt/rw/ptp/ptp4l_slave.conf -m

Nov 08 16:36:55 fcs-ThinkPad-T470P ptp_start_slave[594430]: ptp4l[681570.370]: port 1 (enp0s31f6): new foreign master 04e548.fffe.230108-1
Nov 08 16:36:55 fcs-ThinkPad-T470P ptp4l[594430]: [681570.370] port 1 (enp0s31f6): new foreign master 04e548.fffe.230108-1
Nov 08 16:36:59 fcs-ThinkPad-T470P ptp_start_slave[594430]: ptp4l[681574.371]: selected best master clock 04e548.fffe.230108
Nov 08 16:36:59 fcs-ThinkPad-T470P ptp_start_slave[594430]: ptp4l[681574.371]: foreign master not using PTP timescale
Nov 08 16:36:59 fcs-ThinkPad-T470P ptp_start_slave[594430]: ptp4l[681574.371]: running in a temporal vortex
Nov 08 16:36:59 fcs-ThinkPad-T470P ptp4l[594430]: [681574.371] selected best master clock 04e548.fffe.230108
Nov 08 16:36:59 fcs-ThinkPad-T470P ptp_start_slave[594430]: ptp4l[681574.371]: port 1 (enp0s31f6): LISTENING to UNCALIBRATED on RS_SLAVE
Nov 08 16:36:59 fcs-ThinkPad-T470P ptp4l[594430]: [681574.371] foreign master not using PTP timescale
Nov 08 16:36:59 fcs-ThinkPad-T470P ptp4l[594430]: [681574.371] running in a temporal vortex
Nov 08 16:36:59 fcs-ThinkPad-T470P ptp4l[594430]: [681574.371] port 1 (enp0s31f6): LISTENING to UNCALIBRATED on RS_SLAVE

--
Best regards,
Federico
|
From: Vladimir O. <ol...@gm...> - 2021-11-08 15:39:58
|
On Mon, Nov 08, 2021 at 06:44:28AM -0800, Richard Cochran wrote: > On Sun, Nov 07, 2021 at 04:22:52PM +0200, Vladimir Oltean wrote: > > On Sun, Nov 07, 2021 at 04:19:55PM +0200, Vladimir Oltean wrote: > > > On Sun, Nov 07, 2021 at 05:55:43AM -0800, Richard Cochran wrote: > > > > On Sun, Nov 07, 2021 at 01:32:59PM +0200, Vladimir Oltean wrote: > > > > > 1. ptp4l in the role of a GM appears to listen for GRANDMASTER_SETTINGS_NP > > > > > PTP management messages on the local r/w UDS socket and that's how it > > > > > updates its ANNOUNCE message contents. Who is supposed to construct and > > > > > send these PTP management messages to the ptp4l GM in a "normal" system? > > > > > > > > This role must be taken by an outside program. For example, I wrote a > > > > shell script to do this for a GM that always has the correct UTC offset. > > > > > > What is the role that this outside program supposed to have, apart from > > > establishing that the UTC time is traceable? Trying to figure out > > > > Sorry, unfinished phrase. Trying to figure out what it would take for > > that program to be written and be usable in a more general sense. > > I've no problem in setting up my application to set the CLOCK_TAI offset > > by itself when it detects that phc2sys and ptp4l didn't commit what > > they're working with to the kernel. But I'm not sure whether that may > > clash with what other parts of the system may have in mind. > > The only role of this program would be to configure ptp4l with the > correct values found in: > > * GRANDMASTER_SETTINGS_NP > # values used when nodes becomes the GM > # SET action useful for GPS time server > clockClass 248 > clockAccuracy 0xfe > offsetScaledLogVariance 0xffff > currentUtcOffset 35 > leap61 0 > leap59 0 > currentUtcOffsetValid 0 > ptpTimescale 0 > timeTraceable 0 > frequencyTraceable 0 > timeSource 0xa0 > > These are used by ptp4l when becoming the GM. 
If the values are > static and never change, then the task is simple and can be done with > a shell script that invokes pmc. > > - You can set currentUtcOffsetValid, ptpTimescale, timeTraceable, and > frequencyTraceable all to "true". > > - If you don't care about a globally correct currentUtcOffset, then > you can simply set it to the latest value from the leapseconds file > (or anything at all, really). Use the same value to set the local > kernel UTC offset on boot. Unfortunately no standard utility does > this, but I have adapted one to allow this: > > https://github.com/richardcochran/ntpclient-2015 > > FWIW, the only other program I know of that sets the kernel TAI offset > is ntpd (not sure about chrony and systemd). ntpd takes a long, LONG > time to actually set the offset after cold boot. Thanks, I have some logic in my application too (this is a consumer of CLOCK_TAI timers), that queries ptp4l's TIME_PROPERTIES_DATA_SET member currentUtcOffset, and reads the kernel's CLOCK_TAI offset, and in case they are different, it fixes up the kernel's offset. This logic can't be too wrong, I figure. > > > I'm asking about linuxptp being used to synchronize the CLOCK_TAI on two > > > back-to-back systems. I couldn't care less about traceability to UTC, or > > > about who is GM for that matter. I just want that when I read CLOCK_TAI > > > on a system where phc2sys synchronizes CLOCK_REALTIME to a PHC, or the > > > other way around, the offset between CLOCK_TAI and the PHC to converge > > > to zero. > > In this use case, you need only set GRANDMASTER_SETTINGS_NP once after > ptp4l start up. Actually no. 
As mentioned, if I set GRANDMASTER_SETTINGS_NP member
currentUtcOffsetValid=1 via management:

(a) I take the responsibility for claiming it is valid, which is
    neither something that I need nor want
(b) I only solve the CLOCK_TAI offset problem for the slaves, not for the GM
    system itself, since phc2sys updates the CLOCK_TAI offset only if
    CLOCK_REALTIME is a slave clock.

So the only realistic option I see is to do what I ended up doing, as
long as there is no standard program that sets and forgets this value.

I am not actually sure whether phc2sys can/should set the CLOCK_TAI
offset on a system where ptp4l is a GM. That should perhaps be ptp4l
itself, but then again, if ptp4l doesn't want to claim responsibility
either, and expects somebody else to set it up, then it is what it is.
|
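The fix-up logic described in this message (compare the offset reported by
ptp4l with the kernel's CLOCK_TAI offset and correct the kernel if they
differ) could be sketched on the kernel side as below. This is an assumption
about how such an application might do it, not the poster's code: the three
function names are hypothetical, and obtaining currentUtcOffset from ptp4l's
TIME_PROPERTIES_DATA_SET is left to the caller.

```c
#include <sys/timex.h>

/* Read the kernel's current TAI-UTC offset in seconds, or -1 on error. */
static int kernel_tai_offset_get(void)
{
	struct timex tx = { 0 };	/* modes == 0: query only */

	if (adjtimex(&tx) < 0)
		return -1;
	return tx.tai;
}

/* Set the kernel's TAI-UTC offset; needs CAP_SYS_TIME. 0 on success. */
static int kernel_tai_offset_set(int offset)
{
	struct timex tx = { 0 };

	tx.modes = ADJ_TAI;
	tx.constant = offset;	/* ADJ_TAI takes the offset in 'constant' */
	return adjtimex(&tx) < 0 ? -1 : 0;
}

/*
 * Make the kernel offset match 'wanted' (for example the currentUtcOffset
 * value obtained from ptp4l).
 * Returns 0 if it already matched, 1 if it was changed, -1 on error.
 */
static int kernel_tai_offset_sync(int wanted)
{
	int cur = kernel_tai_offset_get();

	if (cur < 0)
		return -1;
	if (cur == wanted)
		return 0;
	return kernel_tai_offset_set(wanted) < 0 ? -1 : 1;
}
```

Reading the offset works without privileges; only the set path needs
CAP_SYS_TIME, so an unprivileged consumer of CLOCK_TAI timers can at least
detect a mismatch.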
From: Richard C. <ric...@gm...> - 2021-11-08 14:44:36
|
On Sun, Nov 07, 2021 at 04:22:52PM +0200, Vladimir Oltean wrote:
> On Sun, Nov 07, 2021 at 04:19:55PM +0200, Vladimir Oltean wrote:
> > On Sun, Nov 07, 2021 at 05:55:43AM -0800, Richard Cochran wrote:
> > > On Sun, Nov 07, 2021 at 01:32:59PM +0200, Vladimir Oltean wrote:
> > > > 1. ptp4l in the role of a GM appears to listen for GRANDMASTER_SETTINGS_NP
> > > > PTP management messages on the local r/w UDS socket and that's how it
> > > > updates its ANNOUNCE message contents. Who is supposed to construct and
> > > > send these PTP management messages to the ptp4l GM in a "normal" system?
> > >
> > > This role must be taken by an outside program. For example, I wrote a
> > > shell script to do this for a GM that always has the correct UTC offset.
> >
> > What is the role that this outside program supposed to have, apart from
> > establishing that the UTC time is traceable? Trying to figure out
>
> Sorry, unfinished phrase. Trying to figure out what it would take for
> that program to be written and be usable in a more general sense.
> I've no problem in setting up my application to set the CLOCK_TAI offset
> by itself when it detects that phc2sys and ptp4l didn't commit what
> they're working with to the kernel. But I'm not sure whether that may
> clash with what other parts of the system may have in mind.

The only role of this program would be to configure ptp4l with the
correct values found in:

* GRANDMASTER_SETTINGS_NP
  # values used when nodes becomes the GM
  # SET action useful for GPS time server
	clockClass              248
	clockAccuracy           0xfe
	offsetScaledLogVariance 0xffff
	currentUtcOffset        35
	leap61                  0
	leap59                  0
	currentUtcOffsetValid   0
	ptpTimescale            0
	timeTraceable           0
	frequencyTraceable      0
	timeSource              0xa0

These are used by ptp4l when becoming the GM. If the values are
static and never change, then the task is simple and can be done with
a shell script that invokes pmc.

- You can set currentUtcOffsetValid, ptpTimescale, timeTraceable, and
  frequencyTraceable all to "true".

- If you don't care about a globally correct currentUtcOffset, then
  you can simply set it to the latest value from the leapseconds file
  (or anything at all, really). Use the same value to set the local
  kernel UTC offset on boot. Unfortunately no standard utility does
  this, but I have adapted one to allow this:

  https://github.com/richardcochran/ntpclient-2015

FWIW, the only other program I know of that sets the kernel TAI offset
is ntpd (not sure about chrony and systemd). ntpd takes a long, LONG
time to actually set the offset after cold boot.

> > I'm asking about linuxptp being used to synchronize the CLOCK_TAI on two
> > back-to-back systems. I couldn't care less about traceability to UTC, or
> > about who is GM for that matter. I just want that when I read CLOCK_TAI
> > on a system where phc2sys synchronizes CLOCK_REALTIME to a PHC, or the
> > other way around, the offset between CLOCK_TAI and the PHC to converge
> > to zero.

In this use case, you need only set GRANDMASTER_SETTINGS_NP once after
ptp4l start up.

HTH,
Richard
|
From: Richard C. <ric...@gm...> - 2021-11-08 14:27:45
|
On Mon, Nov 08, 2021 at 12:34:08PM +0100, Luigi 'Comio' Mantellini wrote: > (my message was rejected... why?) Looks like both messages were in fact published on the list. Sorry about SF, but I have very little control over the mailing list. I plan to move the project off of SF as soon as I can. Thanks, Richard |
From: ramesh t <ram...@ya...> - 2021-11-08 11:04:06
|
Hi,

The rms value of the PTP slave/support is taking a long time (1-2 hours) to
come back to a single- or two-digit value.

Oct 29 01:21:37 ptp4l_slave: [2115639.889] handle_state_decision_event PS_LISTENING
Oct 29 01:21:37 ptp4l_slave: [2115640.178] selected best master clock 28affd.fffe.e5de3f
Oct 29 01:21:37 ptp4l_slave: [2115640.178] handle_state_decision_event PS_SLAVE
Oct 29 01:21:37 ptp4l_slave: [2115640.178] port 1: LISTENING to UNCALIBRATED on RS_SLAVE
Oct 29 01:21:37 ptp4l_slave: [2115640.178] PS_SLAVE: port_e2e_transition
Oct 29 01:21:37 ptp4l_slave: [2115640.242] port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
Oct 29 01:21:37 ptp4l_slave: [2115640.242] PS_SLAVE: port_e2e_transition
Oct 29 01:21:38 ptp4l_slave: [2115640.370] rms 10301244690663 max 15341881743901 freq -900000000 +/- 0 delay -16376519 +/- 2081539
Oct 29 01:21:39 ptp4l_slave: [2115641.370] rms 8782943571542 max 8783365601487 freq -900000000 +/- 0 delay -13254922 +/- 1790307
Oct 29 01:21:39 ptp4l_slave: [2115641.434] selected best master clock 28affd.fffe.e5de3f
Oct 29 01:21:39 ptp4l_slave: [2115641.435] handle_state_decision_event PS_SLAVE
Oct 29 01:21:40 ptp4l_slave: [2115642.374] rms 8782045654609 max 8782467416190 freq -900000000 +/- 0 delay -10871120 +/- 1851783
Oct 29 03:25:03 ptp4l_slave: [2123045.451] rms 2119240296684 max 2119664072935 freq -900000000 +/- 0 delay -11533365 +/- 1437648

We are using the values below in the ptp config file.

step_threshold 0.0
first_step_threshold 0.00002
max_frequency 900000000
clock_servo pi
sanity_freq_limit 200000000

Can you please suggest what the proper configuration should be?

Regards,
Ramesh
|
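A note on the numbers in this log, offered as a reading of the data rather
than a diagnosis: freq stays pinned at -900000000, which equals the configured
max_frequency, and between 01:21:39 and 03:25:03 (7404 s) the rms value drops
by about 6663.7 s, which is 0.9 s per second. That is what one would expect
if the offset is being slewed away at the maximum rate instead of being
stepped. The small sketch below just does that arithmetic; slew_seconds() is
a hypothetical helper, not part of linuxptp.

```c
/*
 * Time needed to remove an offset purely by slewing at a fixed rate.
 * 1 ppb is 1 ns of correction per second, so the result in seconds is
 * simply offset_ns / rate_ppb.
 */
static double slew_seconds(double offset_ns, double rate_ppb)
{
	return offset_ns / rate_ppb;
}

/* Same result expressed in hours, for easier reading. */
static double slew_hours(double offset_ns, double rate_ppb)
{
	return slew_seconds(offset_ns, rate_ppb) / 3600.0;
}
```

With the values from the log, slew_seconds(8782943571542.0, 900000000.0)
comes to roughly 9759 s, or about 2.7 hours, which is in the same range as
the recovery time reported above.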
From: Miroslav L. <mli...@re...> - 2021-11-08 09:09:31
|
On Thu, Nov 04, 2021 at 10:55:49PM +0000, Brooks, Jason wrote:
> 1. Chronyc sources shows
> * 210 Number of sources = 5
> * MS Name/IP address Stratum Poll Reach LastRx Last sample
> * ===============================================================================
> * #? PTP0 0 2 0 - +0ns[ +0ns] +/- 0nsq
> * ^+ time-c-b.nist.gov 1 4 377 5 +7083us[+7083us] +/- 28ms
> * ^+ time2.google.com 1 4 377 7 +333us[ +333us] +/- 28ms
> * ^* usscz2-ntp-001.aaplimg.c> 1 4 377 5 -780us[ -780us] +/- 11ms
> * ^+ clock.fmt.he.net 1 4 377 6 -2245us[-2245us] +/- 14ms
>
> Why is chrony showing PTP0 as a problem?

The refclock (i.e. ptp4l+phc2sys) didn't provide any samples.

FYI, Google NTP servers use a nonstandard timescale (leap smear) and
must not be combined with standard NTP servers.

> 1. And when using this system as an ntp source on another system it shows as a stratum 2 system. Shouldn't this be a stratum 1, given the ptp0 clock?
> 2. A lot of messages in /var/log/messages:
> * [44:enp0s25] port 1: received SYNC without timestamp

What OS and version is it? Does it work with SW timestamping? If you
add a dummy PTP domain to timemaster.conf before the 44, it should
force 44 to use SW timestamping.

--
Miroslav Lichvar
|
From: Vladimir O. <ol...@gm...> - 2021-11-07 14:23:00
|
On Sun, Nov 07, 2021 at 04:19:55PM +0200, Vladimir Oltean wrote:
> On Sun, Nov 07, 2021 at 05:55:43AM -0800, Richard Cochran wrote:
> > On Sun, Nov 07, 2021 at 01:32:59PM +0200, Vladimir Oltean wrote:
> > > 1. ptp4l in the role of a GM appears to listen for GRANDMASTER_SETTINGS_NP
> > > PTP management messages on the local r/w UDS socket and that's how it
> > > updates its ANNOUNCE message contents. Who is supposed to construct and
> > > send these PTP management messages to the ptp4l GM in a "normal" system?
> >
> > This role must be taken by an outside program. For example, I wrote a
> > shell script to do this for a GM that always has the correct UTC offset.
>
> What is the role that this outside program supposed to have, apart from
> establishing that the UTC time is traceable? Trying to figure out

Sorry, unfinished phrase. Trying to figure out what it would take for that program to be written and be usable in a more general sense. I've no problem in setting up my application to set the CLOCK_TAI offset by itself when it detects that phc2sys and ptp4l didn't commit what they're working with to the kernel. But I'm not sure whether that may clash with what other parts of the system may have in mind.

> >
> > In general, no Linux distro provides a sure way to determine the
> > correct UTC offset. In fact, this is not possible without consulting
> > the bulletin! So the responsibility of claiming correctness falls to
> > the designer of the GM.
> >
> > > 2. In the case of a slave clock, phc2sys detects that the UTC offset of
> > > the GM is traceable, and updates the CLOCK_TAI offset in the kernel.
> >
> > Are you asking about the slave ...
> >
> > > But for the GM system, who is supposed to update the CLOCK_TAI offset?
> >
> > or the GM ???
>
> I'm asking about linuxptp being used to synchronize the CLOCK_TAI on two
> back-to-back systems. I couldn't care less about traceability to UTC, or
> about who is GM for that matter. I just want that when I read CLOCK_TAI
> on a system where phc2sys synchronizes CLOCK_REALTIME to a PHC, or the
> other way around, the offset between CLOCK_TAI and the PHC to converge
> to zero.
>
> > > There is some logic in clock.c, namely clock_utc_correct() called
> > > from clock_synchronize():
> > >
> > > /* Update TAI-UTC offset of the system clock if valid and traceable. */
> > > if (c->tds.flags & UTC_OFF_VALID && c->tds.flags & TIME_TRACEABLE &&
> > >     c->utc_offset_set != utc_offset && c->clkid == CLOCK_REALTIME) {
> > >         sysclk_set_tai_offset(utc_offset);
> > >         c->utc_offset_set = utc_offset;
> > > }
> > >
> > > but mind you, c->clkid is only set to CLOCK_REALTIME if we are
> > > performing software timestamping.
> >
> > Right, because with SW time stamping ptp4l is responsible for the
> > Linux system time.
> >
> > > So in the general case, ptp4l as GM
> > > does not update the CLOCK_TAI offset, and phc2sys does not, either.
> >
> > phc2sys does indeed set the offset.
> >
> > update_clock()
> >     clock_handle_leap()
> >         sysclk_set_tai_offset()
> >
> > > 3. Finally, why update the kernel's CLOCK_TAI offset only if the UTC
> > > offset is traceable?
> >
> > Because the GM tells us whether the offset is valid or not.
> >
> > > I mean, phc2sys in automatic mode sets up the
> > > CLOCK_REALTIME apart by 37 seconds from the PHC anyway, regardless of
> > > whether the UTC offset is traceable or not. Would it not make sense
> > > to set the kernel's TAI offset regardless?
> >
> > If that makes sense to you, then by all means, do it. You need only
> > use the pmc to read the "not-valid" UTC offset from ptp4l.
>
> Do what, patch phc2sys to set the CLOCK_TAI offset regardless of
> traceability of UTC? Would you accept that change?
>
> >
> > Thanks,
> > Richard |
From: Vladimir O. <ol...@gm...> - 2021-11-07 14:20:04
|
On Sun, Nov 07, 2021 at 05:55:43AM -0800, Richard Cochran wrote:
> On Sun, Nov 07, 2021 at 01:32:59PM +0200, Vladimir Oltean wrote:
> > 1. ptp4l in the role of a GM appears to listen for GRANDMASTER_SETTINGS_NP
> > PTP management messages on the local r/w UDS socket and that's how it
> > updates its ANNOUNCE message contents. Who is supposed to construct and
> > send these PTP management messages to the ptp4l GM in a "normal" system?
>
> This role must be taken by an outside program. For example, I wrote a
> shell script to do this for a GM that always has the correct UTC offset.

What is the role that this outside program is supposed to have, apart from establishing that the UTC time is traceable? Trying to figure out

> In general, no Linux distro provides a sure way to determine the
> correct UTC offset. In fact, this is not possible without consulting
> the bulletin! So the responsibility of claiming correctness falls to
> the designer of the GM.
>
> > 2. In the case of a slave clock, phc2sys detects that the UTC offset of
> > the GM is traceable, and updates the CLOCK_TAI offset in the kernel.
>
> Are you asking about the slave ...
>
> > But for the GM system, who is supposed to update the CLOCK_TAI offset?
>
> or the GM ???

I'm asking about linuxptp being used to synchronize the CLOCK_TAI on two back-to-back systems. I couldn't care less about traceability to UTC, or about who is GM for that matter. I just want that when I read CLOCK_TAI on a system where phc2sys synchronizes CLOCK_REALTIME to a PHC, or the other way around, the offset between CLOCK_TAI and the PHC converges to zero.

> > There is some logic in clock.c, namely clock_utc_correct() called
> > from clock_synchronize():
> >
> > /* Update TAI-UTC offset of the system clock if valid and traceable. */
> > if (c->tds.flags & UTC_OFF_VALID && c->tds.flags & TIME_TRACEABLE &&
> >     c->utc_offset_set != utc_offset && c->clkid == CLOCK_REALTIME) {
> >         sysclk_set_tai_offset(utc_offset);
> >         c->utc_offset_set = utc_offset;
> > }
> >
> > but mind you, c->clkid is only set to CLOCK_REALTIME if we are
> > performing software timestamping.
>
> Right, because with SW time stamping ptp4l is responsible for the
> Linux system time.
>
> > So in the general case, ptp4l as GM
> > does not update the CLOCK_TAI offset, and phc2sys does not, either.
>
> phc2sys does indeed set the offset.
>
> update_clock()
>     clock_handle_leap()
>         sysclk_set_tai_offset()
>
> > 3. Finally, why update the kernel's CLOCK_TAI offset only if the UTC
> > offset is traceable?
>
> Because the GM tells us whether the offset is valid or not.
>
> > I mean, phc2sys in automatic mode sets up the
> > CLOCK_REALTIME apart by 37 seconds from the PHC anyway, regardless of
> > whether the UTC offset is traceable or not. Would it not make sense
> > to set the kernel's TAI offset regardless?
>
> If that makes sense to you, then by all means, do it. You need only
> use the pmc to read the "not-valid" UTC offset from ptp4l.

Do what, patch phc2sys to set the CLOCK_TAI offset regardless of traceability of UTC? Would you accept that change?

>
> Thanks,
> Richard |
From: Richard C. <ric...@gm...> - 2021-11-07 13:55:54
|
On Sun, Nov 07, 2021 at 01:32:59PM +0200, Vladimir Oltean wrote:
> 1. ptp4l in the role of a GM appears to listen for GRANDMASTER_SETTINGS_NP
> PTP management messages on the local r/w UDS socket and that's how it
> updates its ANNOUNCE message contents. Who is supposed to construct and
> send these PTP management messages to the ptp4l GM in a "normal" system?

This role must be taken by an outside program. For example, I wrote a shell script to do this for a GM that always has the correct UTC offset.

In general, no Linux distro provides a sure way to determine the correct UTC offset. In fact, this is not possible without consulting the bulletin! So the responsibility of claiming correctness falls to the designer of the GM.

> 2. In the case of a slave clock, phc2sys detects that the UTC offset of
> the GM is traceable, and updates the CLOCK_TAI offset in the kernel.

Are you asking about the slave ...

> But for the GM system, who is supposed to update the CLOCK_TAI offset?

or the GM ???

> There is some logic in clock.c, namely clock_utc_correct() called
> from clock_synchronize():
>
> /* Update TAI-UTC offset of the system clock if valid and traceable. */
> if (c->tds.flags & UTC_OFF_VALID && c->tds.flags & TIME_TRACEABLE &&
>     c->utc_offset_set != utc_offset && c->clkid == CLOCK_REALTIME) {
>         sysclk_set_tai_offset(utc_offset);
>         c->utc_offset_set = utc_offset;
> }
>
> but mind you, c->clkid is only set to CLOCK_REALTIME if we are
> performing software timestamping.

Right, because with SW time stamping ptp4l is responsible for the Linux system time.

> So in the general case, ptp4l as GM
> does not update the CLOCK_TAI offset, and phc2sys does not, either.

phc2sys does indeed set the offset.

    update_clock()
        clock_handle_leap()
            sysclk_set_tai_offset()

> 3. Finally, why update the kernel's CLOCK_TAI offset only if the UTC
> offset is traceable?

Because the GM tells us whether the offset is valid or not.

> I mean, phc2sys in automatic mode sets up the
> CLOCK_REALTIME apart by 37 seconds from the PHC anyway, regardless of
> whether the UTC offset is traceable or not. Would it not make sense
> to set the kernel's TAI offset regardless?

If that makes sense to you, then by all means, do it. You need only use the pmc to read the "not-valid" UTC offset from ptp4l.

Thanks,
Richard |
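The condition quoted from clock_utc_correct() can be restated as a small predicate to make the software-timestamping dependency explicit. This is only an illustrative sketch, not linuxptp code: the flag bit positions are assumed to follow the IEEE 1588 flagField layout, and the function name, the CLOCK_REALTIME value of 0 (the Linux clock id) and the stand-in PHC clock id are filled in for the example.

```python
# Assumed bit positions (IEEE 1588 flagField, second octet).
UTC_OFF_VALID = 1 << 2
TIME_TRACEABLE = 1 << 4

CLOCK_REALTIME = 0  # Linux clock id; an assumption of this sketch


def should_set_tai_offset(flags, utc_offset, utc_offset_set, clkid):
    """Mirror of the quoted condition: all four parts must hold."""
    return bool(
        flags & UTC_OFF_VALID
        and flags & TIME_TRACEABLE
        and utc_offset_set != utc_offset
        and clkid == CLOCK_REALTIME
    )


PHC_CLKID = 12345  # hypothetical stand-in for a PHC clock id (HW timestamping)

# SW timestamping, offset valid and traceable: the kernel offset gets set.
print(should_set_tai_offset(UTC_OFF_VALID | TIME_TRACEABLE, 37, 0, CLOCK_REALTIME))
# Same flags, but ptp4l is disciplining a PHC: nothing is set by ptp4l.
print(should_set_tai_offset(UTC_OFF_VALID | TIME_TRACEABLE, 37, 0, PHC_CLKID))
```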
From: Vladimir O. <ol...@gm...> - 2021-11-07 11:33:08
|
phc2sys sets the CLOCK_TAI offset of the kernel since commit

commit fefd5b4b05039ea0a0770291b12b0eb931079970
Author: Miroslav Lichvar <mli...@re...>
Date:   Wed Jun 18 15:44:49 2014 +0200

    Set TAI offset of system clock.

    When synchronizing the system clock and the PTP UTC offset is valid and
    traceable, set the TAI offset of the clock to have correct CLOCK_TAI
    (which is implemented in the kernel as CLOCK_REALTIME + TAI offset).

    Signed-off-by: Miroslav Lichvar <mli...@re...>

What I'm missing is:

1. ptp4l in the role of a GM appears to listen for GRANDMASTER_SETTINGS_NP PTP management messages on the local r/w UDS socket and that's how it updates its ANNOUNCE message contents. Who is supposed to construct and send these PTP management messages to the ptp4l GM in a "normal" system?

2. In the case of a slave clock, phc2sys detects that the UTC offset of the GM is traceable, and updates the CLOCK_TAI offset in the kernel. But for the GM system, who is supposed to update the CLOCK_TAI offset? There is some logic in clock.c, namely clock_utc_correct() called from clock_synchronize():

	/* Update TAI-UTC offset of the system clock if valid and traceable. */
	if (c->tds.flags & UTC_OFF_VALID && c->tds.flags & TIME_TRACEABLE &&
	    c->utc_offset_set != utc_offset && c->clkid == CLOCK_REALTIME) {
		sysclk_set_tai_offset(utc_offset);
		c->utc_offset_set = utc_offset;
	}

but mind you, c->clkid is only set to CLOCK_REALTIME if we are performing software timestamping. So in the general case, ptp4l as GM does not update the CLOCK_TAI offset, and phc2sys does not, either.

3. Finally, why update the kernel's CLOCK_TAI offset only if the UTC offset is traceable? I mean, phc2sys in automatic mode sets up the CLOCK_REALTIME apart by 37 seconds from the PHC anyway, regardless of whether the UTC offset is traceable or not. Would it not make sense to set the kernel's TAI offset regardless?

As things stand, I think this behavior is just highly inconsistent. CLOCK_REALTIME certainly has an offset from the TAI timescale, as set by phc2sys, but applications cannot detect this. |
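One way for an application to see what the kernel currently believes, sketched under the assumption of Linux and Python 3.7 or later: CLOCK_TAI is CLOCK_REALTIME plus the kernel's TAI offset, so reading both and subtracting gives that offset (0 if nothing has set it). The function name is made up for this sketch, and 11 is the Linux clock id used as a fallback where time.CLOCK_TAI is not defined.

```python
import time

# time.CLOCK_TAI exists on Linux from Python 3.9; 11 is the Linux clock id.
CLOCK_TAI = getattr(time, "CLOCK_TAI", 11)


def kernel_tai_offset_seconds(read_ns=time.clock_gettime_ns):
    """Return CLOCK_TAI minus CLOCK_REALTIME, rounded to whole seconds.

    read_ns is injectable so the arithmetic can be checked without a kernel.
    """
    realtime_ns = read_ns(time.CLOCK_REALTIME)
    tai_ns = read_ns(CLOCK_TAI)
    # The two reads happen a few microseconds apart; rounding hides that skew.
    return round((tai_ns - realtime_ns) / 1e9)


if __name__ == "__main__":
    try:
        print("kernel TAI offset:", kernel_tai_offset_seconds(), "s")
    except OSError as err:
        print("CLOCK_TAI is not readable here:", err)
```

A result of 0 on a machine whose CLOCK_REALTIME is being kept 37 seconds away from the PHC would be the inconsistency described above: the offset exists, but the kernel was never told about it.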
From: Brooks, J. <Jas...@Al...> - 2021-11-05 14:48:12
|
Hello,

I looked into this, and as far as I can tell, this is for redhat-based kvm, not vmware. Am I missing something?

Here is what I get when I try and load ptp_kvm:

modprobe ptp_kvm
modprobe: ERROR: could not insert 'ptp_kvm': No such device

From: Hussamuddin Nasir <na...@ne...>
Sent: Friday, November 5, 2021 05:53
To: Brooks, Jason <Jas...@Al...>; lin...@li...
Subject: [External]Re: [Linuxptp-users] running ptp4l under vmware

You have to use the ptp_kvm linux module for this. See
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/virtualization_deployment_and_administration_guide/chap-kvm_guest_timing_management#sect-KVM_guest_timing_management-Host-guest-time-sync

-- 
cheers,
Hussam (Hussamuddin Nasir)
Netlab & GENI Operations Team
-------------------------------------------------------------------
Laboratory for Adv. Networking    Phone  : (859)218-0059
James F Hardymon Building         Fax    : (859)323-3740
301 Rose Street, Rm 237           E-mail : na...@ne...
Lexington, KY 40506-0495          Web    : http://www.netlab.uky.edu
University of Kentucky
-------------------------------------------------------------------

On 11/4/21 15:47, Brooks, Jason wrote:

Hello,

I am trying to build a proof-of-concept system that will listen for ptp and then serve ntp, but I seem to have a problem: the clock on the system does not seem to get set by ptp, and therefore the clock drifts. I have a test system that uses multiple internet ntp sources and this one: this one's offset keeps growing.

I don't see any sign in /var/log/messages that the clock is trying to be slewed or stepped.

This is a centos 7 system running under vmware 6.7 (no precision clock available). It is using the ethernet device "e1000" to allow the ethernet software timestamping for both transmit and receive. There is no /dev/ptp device so phc2sys is not running.

There are two grandmaster ptp feeds coming into this system.

Upgrading to vmware 7 is not in the cards at the moment, but I might get to play with sr-iov.

I am running ptp4l as: "/usr/sbin/ptp4l -f /etc/ptp4l.conf -i ens34 -l 5 -S"

Ptp4l is configured with the following values altered in the /etc/ptp4l.conf:

domainNumber == 44
slaveOnly == 1

ntpd is running with a minimal config:

server 127.127.1.0
fudge 127.127.1.0 stratum 0

pmc "GET time_status_np" shows that ptp4l is synced with the grand master

sending: GET TIME_STATUS_NP
    005056.fffe.be1c0f-0 seq 0 RESPONSE MANAGEMENT TIME_STATUS_NP
        master_offset              0
        ingress_time               1636054977549001043
        cumulativeScaledRateOffset +0.000000000
        scaledLastGmPhaseChange    0
        gmTimeBaseIndicator        0
        lastGmPhaseChange          0x0000'0000000000000000.0000
        gmPresent                  true
        gmIdentity                 0080ea.fffe.842b60

pmc -u -b 0 -f /etc/ptp4l.conf "GET current_data_set"

sending: GET CURRENT_DATA_SET
    005056.fffe.be1c0f-0 seq 0 RESPONSE MANAGEMENT CURRENT_DATA_SET
        stepsRemoved     1
        offsetFromMaster 0.0
        meanPathDelay    0.0

Jason Brooks
Senior Cloud Infrastructure Engineer
Infrastructure and Engineering Services
Allstream

NOTICE - CONFIDENTIAL INFORMATION This communication is the property of Allstream and may contain confidential or privileged information. If you have received this communication in error, please promptly notify the sender by reply e-mail, do not disseminate, distribute, copy or use the information contained in this communication, and destroy all copies of the communication and any attachments.

_______________________________________________
Linuxptp-users mailing list
Lin...@li...
https://lists.sourceforge.net/lists/listinfo/linuxptp-users |
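For scripting this kind of check, a pmc reply can be reduced to a dictionary. A minimal sketch, assuming pmc prints one field per line as a name followed by its value; the function name is hypothetical and the sample text is the TIME_STATUS_NP reply from the message above.

```python
SAMPLE = """\
sending: GET TIME_STATUS_NP
    005056.fffe.be1c0f-0 seq 0 RESPONSE MANAGEMENT TIME_STATUS_NP
        master_offset              0
        ingress_time               1636054977549001043
        cumulativeScaledRateOffset +0.000000000
        scaledLastGmPhaseChange    0
        gmTimeBaseIndicator        0
        lastGmPhaseChange          0x0000'0000000000000000.0000
        gmPresent                  true
        gmIdentity                 0080ea.fffe.842b60
"""


def parse_pmc_fields(text):
    """Collect 'name value' lines from a pmc reply into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # blank line or a lone token
        name, value = parts
        # Skip the "sending: ..." echo and the RESPONSE header line.
        if name.endswith(":") or "RESPONSE" in line:
            continue
        fields[name] = value.strip()
    return fields


status = parse_pmc_fields(SAMPLE)
print("grandmaster present:", status["gmPresent"])
print("master offset (ns):", int(status["master_offset"]))
```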
From: Hussamuddin N. <na...@ne...> - 2021-11-05 13:10:13
|
You have to use the ptp_kvm linux module for this. See
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/virtualization_deployment_and_administration_guide/chap-kvm_guest_timing_management#sect-KVM_guest_timing_management-Host-guest-time-sync

-- 
cheers,
Hussam (Hussamuddin Nasir)
Netlab & GENI Operations Team
-------------------------------------------------------------------
Laboratory for Adv. Networking    Phone  : (859)218-0059
James F Hardymon Building         Fax    : (859)323-3740
301 Rose Street, Rm 237           E-mail : na...@ne...
Lexington, KY 40506-0495          Web    : http://www.netlab.uky.edu
University of Kentucky
-------------------------------------------------------------------

On 11/4/21 15:47, Brooks, Jason wrote:
>
> Hello,
>
> I am trying to build a proof-of-concept system that will listen for
> ptp and then serve ntp, but I seem to have a problem: the clock on the
> system does not seem to get set by ptp, and therefore the clock
> drifts. I have a test system that uses multiple internet ntp sources
> and this one: this one’s offset keeps growing.
>
> I don’t see any sign in /var/log/messages that the clock is trying to
> be slewed or stepped.
>
> This is a centos 7 system running under vmware 6.7 (no precision clock
> available). It is using the ethernet device “e1000” to allow the
> ethernet software timestamping for both transmit and receive. There
> is no /dev/ptp device so phc2sys is not running.
>
> There are two grandmaster ptp feeds coming into this system.
>
> Upgrading to vmware 7 is not in the cards at the moment, but I might
> get to play with sr-iov.
>
> I am running ptp4l as: “/usr/sbin/ptp4l -f /etc/ptp4l.conf -i ens34 -l
> 5 -S”
>
> Ptp4l is configured with the following values altered in the
> /etc/ptp4l.conf:
>
> domainNumber == 44
>
> slaveOnly == 1
>
> ntpd is running with a minimal config:
>
> server 127.127.1.0
>
> fudge 127.127.1.0 stratum 0
>
> pmc "GET time_status_np" shows that ptp4l is synced with the grand master
>
> sending: GET TIME_STATUS_NP
>
> 005056.fffe.be1c0f-0 seq 0 RESPONSE MANAGEMENT TIME_STATUS_NP
>
> master_offset 0
>
> ingress_time 1636054977549001043
>
> cumulativeScaledRateOffset +0.000000000
>
> scaledLastGmPhaseChange 0
>
> gmTimeBaseIndicator 0
>
> lastGmPhaseChange 0x0000'0000000000000000.0000
>
> gmPresent true
>
> gmIdentity 0080ea.fffe.842b60
>
> pmc -u -b 0 -f /etc/ptp4l.conf "GET current_data_set"
>
> sending: GET CURRENT_DATA_SET
>
> 005056.fffe.be1c0f-0 seq 0 RESPONSE MANAGEMENT CURRENT_DATA_SET
>
> stepsRemoved 1
>
> offsetFromMaster 0.0
>
> meanPathDelay 0.0
>
> *Jason Brooks*
>
> Senior Cloud Infrastructure Engineer
>
> Infrastructure and Engineering Services
>
> Allstream |
From: Brooks, J. <Jas...@Al...> - 2021-11-04 23:11:42
|
Hello,

I have a laptop configured as a proof of concept system. It's an older Lenovo laptop with an Intel Corporation Ethernet Connection I218-LM (rev 04), configured as an e1000 device. "ethtool -T enp0s25" shows both software and hardware timestamping available.

Feeding into this system are two ptp clock streams in domain 44.

I am running timemaster with ptp4l, and chrony. My desire is to have ptp (since it's more accurate) set the local clock and fall back to ntp when a failure occurs. The system would then serve ntp.

Timemaster, chronyd, ptp4l, and phc2sys are all running.

21680 ? Ss 0:00 /usr/sbin/timemaster -f /etc/timemaster.conf
21681 ? S  0:00  \_ /usr/sbin/chronyd -u chrony -n -f /var/run/timemaster/chrony.conf
21682 ? S  0:09  \_ /usr/sbin/ptp4l -l 5 -f /var/run/timemaster/ptp4l.0.conf -H -i enp0s25
21683 ? S  0:00  \_ /usr/sbin/phc2sys -l 5 -a -r -R 1.00 -z /var/run/timemaster/ptp4l.0.socket -t [44:enp0s25 ] -n 44 -E ntpshm -M 0

I noticed three things:

1. Chronyc sources shows

   210 Number of sources = 5
   MS Name/IP address         Stratum Poll Reach LastRx Last sample
   ===============================================================================
   #? PTP0                          0   2     0     -     +0ns[   +0ns] +/-    0nsq
   ^+ time-c-b.nist.gov             1   4   377     5  +7083us[+7083us] +/-   28ms
   ^+ time2.google.com              1   4   377     7   +333us[ +333us] +/-   28ms
   ^* usscz2-ntp-001.aaplimg.c>     1   4   377     5   -780us[ -780us] +/-   11ms
   ^+ clock.fmt.he.net              1   4   377     6  -2245us[-2245us] +/-   14ms

   Why is chrony showing PTP0 as a problem?

2. And when using this system as an ntp source on another system it shows as a stratum 2 system. Shouldn't this be a stratum 1, given the ptp0 clock?

3. A lot of messages in /var/log/messages:

   [44:enp0s25] port 1: received SYNC without timestamp

Thank you for your time!

Jason Brooks
Senior Cloud Infrastructure Engineer
Infrastructure and Engineering Services
Allstream
www.allstream.com |