Re: [Linuxptp-users] SLAVE to UNCALIBRATED on SYNCHRONIZATION_FAULT
From: Daniel Le <dan...@ex...> - 2015-04-02 15:38:46
Please see inline.

-----Original Message-----
From: Richard Cochran [mailto:ric...@gm...]
Sent: Thursday, April 02, 2015 2:32 AM
To: Daniel Le
Cc: lin...@li...
Subject: Re: [Linuxptp-users] SLAVE to UNCALIBRATED on SYNCHRONIZATION_FAULT

On Wed, Apr 01, 2015 at 10:18:27PM +0000, Daniel Le wrote:
> My PTP slave clock appears to lose sync with a grandmaster clock when
> under heavy load and worse it can't recover. The sync is good when
> there is low or no other traffic. This slave clock uses software
> timestamping to adjust the host system time. The PTP transmit and
> receive packets are time stamped by a non-1588 aware NIC's FPGA clock
> which is sync'd to the host system clock, i.e. the NIC regularly gets
> host system time to step/slew to it.

This sounds fishy to me. You say your slave uses SW time stamping, but
that the FPGA provides time stamps. That is HW time stamping!

Also, since the Linux system time is purely software, how do you get
its time into the FPGA? By using phc2sys?

[DL] The FPGA has its own clock and a proprietary slewing mechanism to
sync to a time source. It does not use phc2sys because my embedded
system doesn't have a 3.x Linux kernel.

[DL] In the case of a PTP time source, the FPGA engine on the NIC
periodically reads the kernel system time (do_gettimeofday) in order to
step/slew to the system time, which is synchronized to PTP grandmaster
time.

[DL] The ptp4l program is run with the -S option. However, when
sending/receiving packets (for example via IPv4 transport in udp_send()
and udp_recv()), a timestamping pipe is used to get the FPGA hardware
timestamps of the packets, instead of the functions sendto() and
sk_receive().

> The log shows:
> - port <port#>: SLAVE to UNCALIBRATED on SYNCHRONIZATION_FAULT
>
> and the following repetitive messages:
> - clockcheck: clock jumped forward or running faster than expected!
> - clockcheck: clock jumped backward or running slower than expected!
>
> I would appreciate information to debug this, as well an explanation
> of what may be happening.

That message comes from the function, clockcheck_sample(), in
clockcheck.c. It does the following:

    /* Check the sanity of the synchronized clock by comparing its
       uncorrected frequency with the system monotonic clock. If
       the synchronized clock is the system clock, the measured
       frequency offset will be the current frequency correction of
       the system clock. */

This is sanity check against CLOCK_MONOTONIC. Probably there is a bug
in your custom HW design or in the system/fpga synchronization method.

[DL] Could you further elaborate on this clockcheck_sample
functionality (such as the uncorrected frequency)? Is my understanding
of the following correct?
- The synchronized clock is the PTP clock and is maintained by PTP
  packet TX/RX timestamps per the 1588 standard.
- The system monotonic clock (CLOCK_MONOTONIC) is the Linux kernel
  system clock.

[DL] What is the threshold to determine that the clock jumped
forward/backward too much?

[DL] Upon a system boot-up or restart, how does the PTP slave clock set
the system clock initially? Is CLOCK_REALTIME involved?

Thank you.
Daniel
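
For readers following the thread, the kind of check described in the
quoted comment can be sketched roughly as below. This is a minimal
illustration under stated assumptions, not the code in clockcheck.c:
the names check_result, uncorrected_offset_ppb(), sanity_check() and
the limit_ppb parameter are hypothetical, and the caller is assumed to
already have the two elapsed intervals and the frequency correction
that was in effect during them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch (not linuxptp's API). */
enum check_result {
	CHECK_TOO_SLOW = -1,
	CHECK_OK = 0,
	CHECK_TOO_FAST = 1
};

/*
 * Estimate the uncorrected frequency offset, in parts per billion, of
 * a synchronized clock relative to a monotonic reference.
 *
 * sync_interval_ns: how far the synchronized clock advanced
 * mono_interval_ns: how far the monotonic reference advanced (> 0)
 * applied_freq_ppb: frequency correction applied to the synchronized
 *                   clock during the interval; dividing it out gives
 *                   an estimate of how the clock would have run
 *                   without that correction
 */
double uncorrected_offset_ppb(int64_t sync_interval_ns,
			      int64_t mono_interval_ns,
			      double applied_freq_ppb)
{
	assert(mono_interval_ns > 0);

	double ratio = (double)sync_interval_ns /
		       (1.0 + applied_freq_ppb / 1e9) /
		       (double)mono_interval_ns;

	return 1e9 * (ratio - 1.0);
}

/*
 * Flag the clock when the estimated offset exceeds +/- limit_ppb.
 * limit_ppb is an assumed, caller-supplied threshold.
 */
enum check_result sanity_check(int64_t sync_interval_ns,
			       int64_t mono_interval_ns,
			       double applied_freq_ppb,
			       double limit_ppb)
{
	double offset = uncorrected_offset_ppb(sync_interval_ns,
					       mono_interval_ns,
					       applied_freq_ppb);

	if (offset > limit_ppb)
		return CHECK_TOO_FAST;	/* jumped forward or running fast */
	if (offset < -limit_ppb)
		return CHECK_TOO_SLOW;	/* jumped backward or running slow */
	return CHECK_OK;
}
```

For example, if the synchronized clock advanced 1.5 s while the
monotonic reference advanced 1.0 s with no correction applied, the
estimated offset is 5e8 ppb, so with an illustrative limit of 2e8 ppb
the sketch returns CHECK_TOO_FAST. A clock that advanced 1.1 s over the
same 1.0 s while a +1e8 ppb correction was applied comes out close to
zero and passes, because the correction is divided out first.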