Thread: [Linuxptp-users] Failed synchronization
PTP IEEE 1588 stack for Linux
Brought to you by:
rcochran
From: Daniel Le <dan...@ex...> - 2016-08-04 20:01:30
|
Hello, I have frequent synchronization issue on a PTP ordinary clock which has Linux kernel 3.18.12 and LinuxPTP v1.6. It runs in slave mode with software timestamping. To force the problem to occur every time, I change the system clock to be approximately 5 minutes ahead of PTP Grandmaster clock. In the following, the (enhanced) log messages show ptp4l correctly steps the system clock with the observed offset, then transitions into SERVO_LOCKED state, however the master offset as seen by ptp4l keeps increasing afterwards. The slew and PI reset operations seem not to be able to keep up. Any help is appreciated. Daniel ptp4l[137.667]: port 1: INITIALIZING to LISTENING on INITIALIZE ptp4l[137.667]: port 0: INITIALIZING to LISTENING on INITIALIZE ptp4l[137.864]: port 1: new foreign master 00b0ae.fffe.02d104-1 ptp4l[141.864]: selected best master clock 00b0ae.fffe.02d104 ptp4l[141.864]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE ptp4l[143.090]: port 1: minimum delay request interval 2^-7 ptp4l[143.801]: master offset 292990930284 s0 freq +0 path delay 255286 .................................... ptp4l[158.800]: master offset 292990192008 s0 freq +0 path delay 2970 ptp4l[159.799]: master offset 292989435868 s1 freq -93410 path delay 255430 ptp4l[159.800]: ***Before step*** system time = Thu Aug 4 18:59:45 2016 839875 micros ptp4l[159.800]: clockadj_step clkid=0 offset=-292989435868 ptp4l[159.800]: ***After step*** system time = Thu Aug 4 18:54:51 2016 850548 micros ptp4l[160.799]: master offset 292989436892 s2 freq +100000000 path delay 255430 <<<========= ptp4l[160.799]: clockadj_set_freq ptp4l[160.800]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED ptp4l[160.945]: clockcheck: clock jumped forward or running faster than expected! ptp4l[160.945]: pi_reset .................................... ptp4l[161.700]: master offset 292989691148 s0 freq +100000000 path delay 2778 ptp4l[161.700]: port 1: SLAVE to UNCALIBRATED on SYNCHRONIZATION_FAULT ptp4l[161.792]: clockcheck: clock jumped forward or running faster than expected! ptp4l[161.792]: pi_reset .................................... ptp4l[162.600]: master offset 292989692174 s0 freq +100000000 path delay 2904 ptp4l[162.650]: clockcheck: clock jumped forward or running faster than expected! ptp4l[162.650]: pi_reset .................................... ptp4l[163.500]: master offset 292989693962 s0 freq +100000000 path delay 2748 ptp4l[163.586]: clockcheck: clock jumped forward or running faster than expected! ptp4l[163.586]: pi_reset .................................... ptp4l[164.400]: master offset 292989694906 s0 freq +100000000 path delay 2888 ptp4l[164.431]: clockcheck: clock jumped forward or running faster than expected! ptp4l[164.431]: pi_reset .................................... ptp4l[165.300]: master offset -147127914636 s0 freq +100000000 path delay 146705873134 ptp4l[165.382]: clockcheck: clock jumped forward or running faster than expected! ptp4l[165.382]: pi_reset .................................... ptp4l[166.200]: master offset -422042768 s0 freq +100000000 path delay 2870 ptp4l[166.231]: clockcheck: clock jumped forward or running faster than expected! ptp4l[166.231]: pi_reset .................................... ptp4l[167.100]: master offset -422041762 s0 freq +100000000 path delay 2980 ptp4l[167.180]: clockcheck: clock jumped forward or running faster than expected! ptp4l[167.180]: pi_reset .................................... ptp4l[167.999]: master offset -422040068 s0 freq +100000000 path delay 2886 ptp4l[168.031]: clockcheck: clock jumped forward or running faster than expected! ptp4l[168.031]: pi_reset .................................... ptp4l[168.899]: master offset -422039004 s0 freq +100000000 path delay 2910 ptp4l[168.966]: clockcheck: clock jumped forward or running faster than expected! ptp4l[168.966]: pi_reset .................................... ptp4l[169.799]: master offset -422037260 s0 freq +100000000 path delay 2770 ptp4l[170.001]: clockcheck: clock jumped forward or running faster than expected! ptp4l[170.001]: pi_reset ptp4l[170.699]: master offset -585402570 s0 freq +100000000 path delay 51353036 ptp4l[171.599]: master offset -746676648 s0 freq +100000000 path delay 69771338 ptp4l[172.499]: master offset -887764240 s0 freq +100000000 path delay 68002674 .................................... |
From: Richard C. <ric...@gm...> - 2016-08-04 20:34:11
|
On Thu, Aug 04, 2016 at 07:36:45PM +0000, Daniel Le wrote: > I have frequent synchronization issue on a PTP ordinary clock which > has Linux kernel 3.18.12 and LinuxPTP v1.6. It runs in slave mode > with software timestamping. You didn't tell us what hardware your are using: platform, NIC, driver? > To force the problem to occur every time, I change the system clock > to be approximately 5 minutes ahead of PTP Grandmaster clock. In the > following, the (enhanced) log messages show ptp4l correctly steps > the system clock with the observed offset, then transitions into > SERVO_LOCKED state, however the master offset as seen by ptp4l keeps > increasing afterwards. The slew and PI reset operations seem not to > be able to keep up. After stepping the clock, the offset remains the same. Your system clock or the MAC driver is somehow broken. Also, the frequency estimation that occurs between UNCALIBRATED and SLAVE runs off the scale. Is your local oscillator really that bad? No idea how your computer could be so messed up. Sorry, Richard |
From: Daniel Le <dan...@ex...> - 2016-08-04 22:13:42
|
Hi Richard, please see follow-up comments and questions inline. -----Original Message----- From: Richard Cochran [mailto:ric...@gm...] Sent: Thursday, August 04, 2016 4:34 PM To: Daniel Le <dan...@ex...> Cc: lin...@li... Subject: Re: [Linuxptp-users] Failed synchronization On Thu, Aug 04, 2016 at 07:36:45PM +0000, Daniel Le wrote: > I have frequent synchronization issue on a PTP ordinary clock which > has Linux kernel 3.18.12 and LinuxPTP v1.6. It runs in slave mode with > software timestamping. You didn't tell us what hardware your are using: platform, NIC, driver? [DL] It's an embedded system with proprietary FPGA based NIC and driver. > To force the problem to occur every time, I change the system clock to > be approximately 5 minutes ahead of PTP Grandmaster clock. In the > following, the (enhanced) log messages show ptp4l correctly steps the > system clock with the observed offset, then transitions into > SERVO_LOCKED state, however the master offset as seen by ptp4l keeps > increasing afterwards. The slew and PI reset operations seem not to be > able to keep up. After stepping the clock, the offset remains the same. Your system clock or the MAC driver is somehow broken. [DL] Is tsproc_update_offset function a good place where to check t1, t2, t3, t4 values in order to see which timestamp(s) may cause the offset to remain the same? [DL] In this function, are master timestamps (t1, t4) TAI and slave timestamps (t2, t3) UTC? Also, the frequency estimation that occurs between UNCALIBRATED and SLAVE runs off the scale. Is your local oscillator really that bad? [DL] I can't explain why... but if the system is rebooted then synchronization works fine, while a restart of ptp4l doesn't make a difference. No idea how your computer could be so messed up. Sorry, Richard |
From: Richard C. <ric...@gm...> - 2016-08-05 07:40:53
|
On Thu, Aug 04, 2016 at 10:13:35PM +0000, Daniel Le wrote: > You didn't tell us what hardware your are using: platform, NIC, driver? > [DL] It's an embedded system with proprietary FPGA based NIC and driver. Sigh. This is the same system you were asking us about back in April 2015. If you have HW time stamping, then you should implement a proper PHC. The fact that you are using SW time stamping in the application is a sure sign that you have it all wrong. > After stepping the clock, the offset remains the same. Your system clock or the MAC driver is somehow broken. > [DL] Is tsproc_update_offset function a good place where to check t1, t2, t3, t4 values > in order to see which timestamp(s) may cause the offset to remain the same? Try looking in your driver. > [DL] In this function, are master timestamps (t1, t4) TAI and slave timestamps (t2, t3) UTC? Just how are we supposed to know? You have some kind of weird custom system in which you abuse the whole PHC subsystem. We can't help you with that. Sorry, Richard |
From: Daniel Le <dan...@ex...> - 2016-08-10 20:49:06
|
Hi Richard, It's a new hardware platform and newer 3.x Linux kernel, however the integration approach is same. I changed to rely on software timestamps instead and synchronization works fine. When our FPGA team will make our NIC 1588 aware and implement a PHC, I will use hardware timestamping. Regards, Daniel -----Original Message----- From: Richard Cochran [mailto:ric...@gm...] Sent: Friday, August 05, 2016 3:41 AM To: Daniel Le <dan...@ex...> Cc: lin...@li... Subject: Re: [Linuxptp-users] Failed synchronization On Thu, Aug 04, 2016 at 10:13:35PM +0000, Daniel Le wrote: > You didn't tell us what hardware your are using: platform, NIC, driver? > [DL] It's an embedded system with proprietary FPGA based NIC and driver. Sigh. This is the same system you were asking us about back in April 2015. If you have HW time stamping, then you should implement a proper PHC. The fact that you are using SW time stamping in the application is a sure sign that you have it all wrong. > After stepping the clock, the offset remains the same. Your system clock or the MAC driver is somehow broken. > [DL] Is tsproc_update_offset function a good place where to check t1, t2, t3, t4 values > in order to see which timestamp(s) may cause the offset to remain the same? Try looking in your driver. > [DL] In this function, are master timestamps (t1, t4) TAI and slave timestamps (t2, t3) UTC? Just how are we supposed to know? You have some kind of weird custom system in which you abuse the whole PHC subsystem. We can't help you with that. Sorry, Richard |