Thread: [Linuxptp-users] Master offsets don't converge
PTP IEEE 1588 stack for Linux
From: Daniel Le <dan...@ex...> - 2016-01-15 16:46:12
Hello,

My ptp4l version 1.4 in software timestamping mode works fine with a Linux kernel 2.6.35, however when I switch to the kernel 3.18.12 (and new Ethernet driver), I see the master offsets are huge and never converge. Any pointer to debug this is much appreciated.

/ #ptp4l -f /etc/ptp4l.conf
ptp4l[250704.924]: port 1: INITIALIZING to LISTENING on INITIALIZE
ptp4l[250704.924]: port 0: INITIALIZING to LISTENING on INITIALIZE
ptp4l[250705.355]: port 1: new foreign master 00b0ae.fffe.02d103-1
ptp4l[250708.955]: selected best master clock 00b0ae.fffe.02d103
ptp4l[250708.955]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[250709.856]: port 1: minimum delay request interval 2^-7
ptp4l[250710.698]: master offset1 -6601404463576 s0 freq +100000000 path delay 220834
ptp4l[250711.598]: master offset1 -6601404940762 s0 freq +100000000 path delay 224676
ptp4l[250712.498]: master offset1 -6601405412898 s0 freq +100000000 path delay 223500
ptp4l[250713.398]: master offset1 -6601405890510 s0 freq +100000000 path delay 227796
ptp4l[250714.298]: master offset1 -6601406361480 s0 freq +100000000 path delay 225458
ptp4l[250715.198]: master offset1 -6601406835542 s0 freq +100000000 path delay 226236
ptp4l[250716.098]: master offset1 -6601407311244 s0 freq +100000000 path delay 228594
ptp4l[250716.998]: master offset1 -6601407784176 s0 freq +100000000 path delay 228218
ptp4l[250717.898]: master offset1 -6601408255930 s0 freq +100000000 path delay 226660
ptp4l[250718.798]: master offset1 -6601408732050 s0 freq +100000000 path delay 229468
ptp4l[250719.698]: master offset1 -6601409205854 s0 freq +100000000 path delay 229956
ptp4l[250720.598]: master offset1 -6601409673392 s0 freq +100000000 path delay 224186
ptp4l[250721.497]: master offset1 -6601410151822 s0 freq +100000000 path delay 229300
ptp4l[250722.397]: master offset1 -6601410625760 s0 freq +100000000 path delay 229898
ptp4l[250723.297]: master offset1 -6601411100194 s0 freq +100000000 path delay 231052
ptp4l[250724.197]: master offset1 -6601411564234 s0 freq +100000000 path delay 221780
ptp4l[250725.097]: master offset1 -6601412044146 s1 freq +99573391 path delay 228348
ptp4l[250725.208]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[250725.319]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[250725.426]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[250725.536]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[250725.638]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[250725.741]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[250725.853]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[250725.965]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[250725.998]: master offset1 -2711305334 s0 freq +99573391 path delay 228348
ptp4l[250726.068]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[250726.176]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[250726.280]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[250726.387]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[250726.491]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[250726.605]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[250726.713]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[250726.814]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[250726.898]: master offset1 -2711772472 s0 freq +99573391 path delay 222142
ptp4l[250726.916]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[250727.020]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[250727.120]: clockcheck: clock jumped forward or running faster than expected!

Thanks,
Daniel
From: Richard C. <ric...@gm...> - 2016-01-15 18:03:53
On Fri, Jan 15, 2016 at 04:19:23PM +0000, Daniel Le wrote:
> My ptp4l version 1.4 in software timestamping mode works fine with a
> Linux kernel 2.6.35, however when I switch to the kernel 3.18.12
> (and new Ethernet driver), I see the master offsets are huge and
> never converge. Any pointer to debug this is much appreciated.

1. Start with vanilla 1.4 and verify correct operation.
2. Add your first (next) minimal change.
3. Correct operation? If yes, goto step 2.
4. You found the bug.

If you have lots of changes, then use git bisect.

HTH,
Richard
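Richard's steps 1 to 4 are a linear search through the local changes, and git bisect does the same job with a binary search. The idea can be sketched as follows, assuming the changes are ordered, the vanilla state works, the fully modified state fails, and the failure persists once introduced. The `is_good` callback is hypothetical: it stands for "build and run with the first n changes applied and check whether the offsets converge".

```python
from typing import Callable


def first_bad_change(num_changes: int, is_good: Callable[[int], bool]) -> int:
    """Return the 1-based index of the first change that breaks operation.

    is_good(n) is a hypothetical callback that tests the stack with the
    first n changes applied. Assumes is_good(0) is True (vanilla works)
    and is_good(num_changes) is False (the full set of changes fails).
    """
    good, bad = 0, num_changes  # invariant: is_good(good) and not is_good(bad)
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_good(mid):
            good = mid  # the bug was introduced after mid
        else:
            bad = mid   # the bug was introduced at or before mid
    return bad


if __name__ == "__main__":
    # Illustration only: pretend change number 7 of 10 introduced the bug.
    print(first_bad_change(10, lambda n: n < 7))
```

With ten changes this needs three or four test runs instead of up to ten, which matters when each run means rebuilding and rebooting a kernel.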
From: Keller, J. E <jac...@in...> - 2016-01-15 18:27:53
On Fri, 2016-01-15 at 16:19 +0000, Daniel Le wrote:
> Hello,
>
> My ptp4l version 1.4 in software timestamping mode works fine with a
> Linux kernel 2.6.35, however when I switch to the kernel 3.18.12 (and
> new Ethernet driver), I see the master offsets are huge and never
> converge. Any pointer to debug this is much appreciated.
>

You say this is software timestamping? What's your configuration? I would suspect such a large kernel change to possibly be the result of a driver bug, but this wouldn't be the case if you're using pure software timestamping. Can you copy your ptp4l.conf file?

Are you using only unmodified upstream versions? If you're using any modifications, I would bisect through those, confirming that the vanilla versions work just fine.

Regards,
Jake

> / #ptp4l -f /etc/ptp4l.conf
> ptp4l[250704.924]: port 1: INITIALIZING to LISTENING on INITIALIZE
> ptp4l[250704.924]: port 0: INITIALIZING to LISTENING on INITIALIZE
> ptp4l[250705.355]: port 1: new foreign master 00b0ae.fffe.02d103-1
> ptp4l[250708.955]: selected best master clock 00b0ae.fffe.02d103
> ptp4l[250708.955]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
> ptp4l[250709.856]: port 1: minimum delay request interval 2^-7
> ptp4l[250710.698]: master offset1 -6601404463576 s0 freq +100000000 path delay 220834
> ptp4l[250711.598]: master offset1 -6601404940762 s0 freq +100000000 path delay 224676
> ptp4l[250712.498]: master offset1 -6601405412898 s0 freq +100000000 path delay 223500

This smells of a driver bug. Notice how the frequency shift is maxed, and yet the clock is still drifting farther apart. This either means that the real clock drift is *over* 10% (which is very unlikely), or there is a bug in the frequency tuning. But if you really are using software timestamps, this doesn't make sense.

Again, if you're not using vanilla LinuxPTP 1.4, I would retry with that and confirm the behavior. If you are using vanilla LinuxPTP, I would confirm that you are in fact actually using software-only timestamping.

Regards,
Jake
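Jake's observation can be checked against the log itself. The sketch below is not part of ptp4l. Its regular expression is an assumption that matches the line format quoted in this thread (including the `offset1` spelling), and it treats the bracketed number as seconds, the offset as nanoseconds, and the frequency as parts per billion. It estimates how fast the offset keeps changing while the frequency adjustment sits at its +100000000 ppb limit.

```python
import re

# First three "master offset" lines from the log posted in this thread.
LOG = """\
ptp4l[250710.698]: master offset1 -6601404463576 s0 freq +100000000 path delay 220834
ptp4l[250711.598]: master offset1 -6601404940762 s0 freq +100000000 path delay 224676
ptp4l[250712.498]: master offset1 -6601405412898 s0 freq +100000000 path delay 223500
"""

# Assumed pattern for the lines above, not an official ptp4l log grammar.
LINE = re.compile(
    r"ptp4l\[(?P<t>\d+\.\d+)\]: master offset\d* +(?P<offset>-?\d+) "
    r"s\d freq +(?P<freq>[+-]?\d+) path delay +(?P<delay>-?\d+)"
)


def parse(log: str):
    """Return a list of (time_s, offset_ns, freq_ppb) tuples."""
    samples = []
    for line in log.splitlines():
        m = LINE.search(line)
        if m:
            samples.append((float(m["t"]), int(m["offset"]), int(m["freq"])))
    return samples


def drift_ppm(samples) -> float:
    """Average rate of change of the offset between first and last sample.

    1 ppm corresponds to 1000 ns of offset change per second.
    """
    (t0, off0, _), (t1, off1, _) = samples[0], samples[-1]
    return (off1 - off0) / (t1 - t0) / 1000.0


if __name__ == "__main__":
    samples = parse(LOG)
    print(f"initial offset: {samples[0][1] / 1e9:.1f} s")
    print(f"offset drift:   {drift_ppm(samples):.0f} ppm")
```

On these three lines the offset starts at about -6601 seconds and keeps growing by roughly half a millisecond every second, even though the servo is already requesting its maximum correction. That matches the "maxed, and yet still drifting farther apart" pattern described above.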
From: Daniel Le <dan...@ex...> - 2016-01-15 18:56:13
Below is my PTP configuration. It doesn't run in 'pure' software timestamping, i.e. although ptp4l is configured for software timestamping, the packet timestamps are provided by the FPGA hardware on a NIC, which gets the host system time every 1 second and steps/slews to it. There may be a synchronization issue between the system clock that is maintained by ptp4l and the FPGA-based hardware clock. I am guessing that the large offsets are due to wrong timestamps, and I am not sure how best to debug it...

In the 2.6.35 kernel, clock_adjtime() is defined as adjtimex() by #ifndef HAVE_CLOCK_ADJTIME, and in 3.18.12 clock_adjtime() is used as is, but that seems not to be the issue.

Thanks.

/ #cat /etc/ptp4l.conf
[global]
domainNumber 0
slaveOnly 1
priority1 128
priority2 128
clockClass 248
clockAccuracy 254
offsetScaledLogVariance 65535
freq_est_interval 1
time_stamping software
tx_timestamp_timeout 1
logging_level 6
verbose 1
use_syslog 0
summary_interval 0
[eth1]
delay_mechanism E2E
network_transport UDPv4
delayAsymmetry 0
logAnnounceInterval 1
logSyncInterval 0
logMinDelayReqInterval 0
logMinPdelayReqInterval 0
announceReceiptTimeout 3
syncReceiptTimeout 0
delay_filter moving_average
delay_filter_length 10
path_trace_enabled 0
fault_reset_interval 4
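For anyone wanting to check mechanically which timestamping mode a configuration file requests, here is a minimal sketch. It assumes the simple `[section]` plus `option value` layout shown in this message and is not linuxptp's own parser. It only reports what the file asks for. As this thread shows, the driver may still take its timestamps somewhere else.

```python
def parse_ptp4l_conf(text: str) -> dict:
    """Parse '[section]' headers and 'option value' lines into nested dicts."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, {})
            continue
        if current is None:
            continue  # ignore options that appear before any section header
        parts = line.split(None, 1)
        sections[current][parts[0]] = parts[1] if len(parts) > 1 else ""
    return sections


# Excerpt of the configuration posted in this thread.
EXAMPLE = """\
[global]
slaveOnly 1
time_stamping software
tx_timestamp_timeout 1
[eth1]
delay_mechanism E2E
network_transport UDPv4
"""

if __name__ == "__main__":
    conf = parse_ptp4l_conf(EXAMPLE)
    print("requested timestamping:", conf["global"]["time_stamping"])
```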
From: Keller, J. E <jac...@in...> - 2016-01-15 19:29:32
It is almost certainly a result of the driver doing the mixed hardware/software timestamps. I suspect that the software clock is being slewed, but somehow your timestamps are not being updated fast enough so these hardware timestamps are no longer matching against the system clock.

Out of curiosity, why not expose the hardware clock directly as a PHC?

Regards,
Jake
From: Daniel Le <dan...@ex...> - 2016-01-15 19:40:36
Hi Jake,

Because my hardware NIC is not 1588 capable, and exposing its clock as a PHC would require an FPGA change. However, I'm hoping to get better timestamp accuracy from the hardware clock that is tuned to the host system clock in software timestamping mode (which I understand is the reverse of the LinuxPTP hardware timestamping configuration, where the system clock synchronizes to the PHC clock instead).

Daniel
From: Keller, J. E <jac...@in...> - 2016-01-15 23:57:37
On Fri, 2016-01-15 at 19:40 +0000, Daniel Le wrote:
> Hi Jake,
>
> Because my hardware NIC is not 1588 capable and that would require
> FPGA change, however I'm hoping to get better timestamp accuracy from
> the hardware clock that is tuned to the host system clock in software
> timestamping mode (which I understand it's in the reverse direction
> of LinuxPTP hardware timestamping configuration where the system
> clock synchronizes to the PHC clock instead).
>
> Daniel

If you have a hardware clock which you can slew, and the ability to take hardware timestamps, I am failing to see how you are unable to implement the PHC subsystem calls.

I suspect, however, that the way you are converting the timestamps taken by hardware is incorrect, and we can't help you with that easily. This is the first place I would look for an issue, especially if you can confirm that your software timestamps without the special Ethernet driver work as expected.

Regards,
Jake