Thread: [Linuxptp-users] clockcheck - need to filter large spurious phase jumps?
PTP IEEE 1588 stack for Linux
Brought to you by:
rcochran
From: Rich S. <sch...@gm...> - 2013-11-05 21:26:29
|
This is Rich Schmidt, linuxptp newbie. I am testing linuxptp on this system at the US Naval Observatory: Supermicro SYS-5015A-EHF-D525 (Atom) Intel 82547L NICs driver: e1000e version: 2.5.4-NAPI firmware-version: 1.9-0 Debian with kernel 3.12.0-rc Running: Sync PHC to USNO Master Clock via Zyfer Gsync PTP GrandMaster: ptp4l -i eth1 -l 7 -s -p /dev/ptp1 Sync CLOCK_REALTIME to PHC: phc2sys -s /dev/ptp1 -L 100000000 -l 7 -R 0.25 -O 0 Things seem to work fine for a while, then I get a single large phase offset detected by ptp4l. The -L freq limit was an attempt to control these offsets, but did not help. Are these large phase jumps filtered out by ptp4l? It seems not, because phc2sys sees them. Or is this some unreliability in the Intel 82547L NICs? Is the PHC read failing? Thank you for your thoughts. Here is a sample. The clock is not being steered by NTP or any other program. Nov 5 18:12:27 pluto ptp4l: [354666.428] master offset 57 s2 freq +34356 path delay 6086 Nov 5 18:12:29 pluto ptp4l: [354668.428] master offset -139 s2 freq +34266 path delay 6092 Nov 5 18:12:30 pluto phc2sys: [354669.993] phc offset 4529 s2 freq +8805 delay 4715 Nov 5 18:12:31 pluto ptp4l: [354670.428] master offset -32 s2 freq +34299 path delay 6092 Nov 5 18:12:33 pluto ptp4l: [354672.428] master offset 20 s2 freq +34320 path delay 6092 Nov 5 18:12:34 pluto phc2sys: [354673.993] phc offset 470 s2 freq +7931 delay 4705 Nov 5 18:12:35 pluto ptp4l: [354674.428] master offset 54 s2 freq +34340 path delay 6095 Nov 5 18:12:37 pluto ptp4l: [354676.428] master offset -15 s2 freq +34314 path delay 6095 Nov 5 18:12:38 pluto phc2sys: [354677.993] phc offset -6992 s2 freq +3968 delay 4870 Nov 5 18:12:39 pluto ptp4l: [354678.428] master offset -19 s2 freq +34309 path delay 6096 Nov 5 18:12:41 pluto ptp4l: [354680.429] master offset 55 s2 freq +34344 path delay 6096 Nov 5 18:12:42 pluto phc2sys: [354681.994] phc offset 11326 s2 freq +11945 delay 4715 Nov 5 18:12:43 pluto ptp4l: [354682.428] master offset -90 s2 freq +34279 path delay 6096 Nov 5 18:12:45 pluto ptp4l: [354684.429] master offset -49 s2 freq +34286 path delay 6096 Nov 5 18:12:46 pluto phc2sys: [354685.994] phc offset -70368744182111 s2 freq -500000 delay 4715 *Nov 5 18:12:47 pluto ptp4l: [354686.428] clockcheck: clock jumped forward or running faster than expected!Nov 5 18:12:47 pluto ptp4l: [354686.429] master offset 70368744177715 s0 freq +0 path delay 6095* Nov 5 18:12:47 pluto ptp4l: [354686.430] port 1: SLAVE to UNCALIBRATED on SYNCHRONIZATION_FAULT Nov 5 18:12:49 pluto ptp4l: [354688.430] master offset 70368744177775 s1 freq +34334 path delay 6095 Nov 5 18:12:50 pluto phc2sys: [354689.994] phc offset 2028439 s2 freq +500000 delay 4717 Nov 5 18:12:51 pluto ptp4l: [354690.431] master offset -4473 s2 freq +32097 path delay 6095 Nov 5 18:12:51 pluto ptp4l: [354690.431] port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED Nov 5 18:12:53 pluto ptp4l: [354692.430] master offset -48 s2 freq +33639 path delay 6095 Nov 5 18:12:54 pluto phc2sys: [354693.994] phc offset 52807 s2 freq +38158 delay 4712 Nov 5 18:12:55 pluto ptp4l: [354694.429] master offset 1367 s2 freq +34339 path delay 6095 Nov 5 18:12:57 pluto ptp4l: [354696.429] master offset 1179 s2 freq +34450 path delay 6095 Nov 5 18:12:58 pluto phc2sys: [354697.995] phc offset -65029 s2 freq -10810 delay 4715 Nov 5 18:12:59 pluto ptp4l: [354698.429] master offset 866 s2 freq +34470 path delay 6104 ... Nov 5 18:19:02 pluto phc2sys: [355062.012] phc offset 6991 s2 freq +10913 delay 4704 Nov 5 18:19:03 pluto ptp4l: [355062.442] master offset -8 s2 freq +34342 path delay 6109 Nov 5 18:19:05 pluto ptp4l: [355064.442] master offset -160 s2 freq +34265 path delay 6109 Nov 5 18:19:06 pluto phc2sys: [355066.012] phc offset -11133 s2 freq +3042 delay 4715 Nov 5 18:19:07 pluto ptp4l: [355066.442] master offset -73 s2 freq +34285 path delay 6102 Nov 5 18:19:09 pluto ptp4l: [355068.442] master offset 8 s2 freq +34314 path delay 6102 Nov 5 18:19:10 pluto phc2sys: [355070.012] phc offset 6061 s2 freq +9159 delay 4715 Nov 5 18:19:11 pluto ptp4l: [355070.442] master offset 89 s2 freq +34356 path delay 6098 Nov 5 18:19:13 pluto ptp4l: [355072.442] master offset 20 s2 freq +34335 path delay 6098 *Nov 5 18:19:14 pluto phc2sys: [355074.012] phc offset -70368744173547 s2 freq -500000 delay 4715Nov 5 18:19:15 pluto ptp4l: [355074.441] clockcheck: clock jumped forward or running faster than expected!* Nov 5 18:19:15 pluto ptp4l: [355074.442] master offset 70368744177526 s0 freq +0 path delay 6103 Nov 5 18:19:15 pluto ptp4l: [355074.443] port 1: SLAVE to UNCALIBRATED on SYNCHRONIZATION_FAULT Nov 5 18:19:17 pluto ptp4l: [355076.444] master offset 70368744177457 s1 freq +34293 path delay 6103 Nov 5 18:19:18 pluto phc2sys: [355078.013] phc offset 2034086 s2 freq +500000 delay 4762 Nov 5 18:19:19 pluto ptp4l: [355078.444] master offset -4252 s2 freq +32167 path delay 6103 Nov 5 18:19:19 pluto ptp4l: [355078.445] port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED Nov 5 18:19:21 pluto ptp4l: [355080.443] master offset 71 s2 freq +33691 path delay 6103 Nov 5 18:19:22 pluto phc2sys: [355082.013] phc offset 54724 s2 freq +37742 delay 4713 Nov 5 18:19:23 pluto ptp4l: [355082.443] master offset 1255 s2 freq +34294 path delay 6103 Nov 5 18:19:25 pluto ptp4l: [355084.443] master offset 1298 s2 freq +34503 path delay 6102 Nov 5 18:19:26 pluto phc2sys: [355086.013] phc offset -71161 s2 freq -15078 delay 4706 Nov 5 18:19:27 pluto ptp4l: [355086.443] master offset 1034 s2 freq +34566 path delay 6102 Nov 5 18:19:29 pluto ptp4l: [355088.443] master offset 540 s2 freq +34474 path delay 6106 ... Nov 5 19:16:34 pluto phc2sys: [358514.174] phc offset -691 s2 freq +10120 delay 4714 Nov 5 19:16:35 pluto ptp4l: [358514.563] master offset 56 s2 freq +34310 path delay 6097 Nov 5 19:16:37 pluto ptp4l: [358516.563] master offset 98 s2 freq +34340 path delay 6097 Nov 5 19:16:38 pluto phc2sys: [358518.174] phc offset -37068 s2 freq -10095 delay 4716 *Nov 5 19:16:39 pluto ptp4l: [358518.562] clockcheck: clock jumped forward or running faster than expected!Nov 5 19:16:39 pluto ptp4l: [358518.563] master offset 70368744177566 s0 freq +0 path delay 6100* Nov 5 19:16:39 pluto ptp4l: [358518.563] port 1: SLAVE to UNCALIBRATED on SYNCHRONIZATION_FAULT Nov 5 19:16:41 pluto ptp4l: [358520.563] master offset 70368744177456 s1 freq +34250 path delay 6100 Nov 5 19:16:42 pluto phc2sys: [358522.175] phc offset 59749 s2 freq +32034 delay 4714 Nov 5 19:16:43 pluto ptp4l: [358522.563] master offset -4442 s2 freq +32029 path delay 6100 Nov 5 19:16:43 pluto ptp4l: [358522.563] port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED Nov 5 19:16:45 pluto ptp4l: [358524.564] master offset 126 s2 freq +33647 path delay 6100 Nov 5 19:16:46 pluto phc2sys: [358526.175] phc offset -32436 s2 freq -743 delay 4933 Nov 5 19:16:47 pluto ptp4l: [358526.563] master offset 1510 s2 freq +34358 path delay 6100 Nov 5 19:16:49 pluto ptp4l: [358528.563] master offset 1356 s2 freq +34508 path delay 6106 Nov 5 19:16:50 pluto phc2sys: [358530.175] phc offset -19275 s2 freq -3235 delay 4716 ... |
From: Keller, J. E <jac...@in...> - 2013-11-05 22:02:32
|
Hi Rich, On Tue, 2013-11-05 at 16:26 -0500, Rich Schmidt wrote: > This is Rich Schmidt, linuxptp newbie. > > I am testing linuxptp on this system at the US Naval Observatory: > > Supermicro SYS-5015A-EHF-D525 (Atom) > > Intel 82547L NICs driver: e1000e version: 2.5.4-NAPI > firmware-version: 1.9-0 > Debian with kernel 3.12.0-rc > > > Running: > Sync PHC to USNO Master Clock via Zyfer Gsync PTP GrandMaster: > ptp4l -i eth1 -l 7 -s -p /dev/ptp1 > > Sync CLOCK_REALTIME to PHC: > phc2sys -s /dev/ptp1 -L 100000000 -l 7 -R 0.25 -O 0 > > > > Things seem to work fine for a while, then I get a single large phase > offset detected by ptp4l. The -L freq limit was an attempt to > control these offsets, but did not help. > > > Are these large phase jumps filtered out by ptp4l? It seems not, > because phc2sys sees them. Or is this some unreliability in the Intel > > 82547L NICs? Is the PHC read failing? Thank you for your thoughts. > > > > Here is a sample. The clock is not being steered by NTP or any other > program. > Are you sure? I can't think of anything else controlling the clock, but something is obviously controlling it as seen in the logs. > Nov 5 18:12:27 pluto ptp4l: [354666.428] master offset 57 s2 > freq +34356 path delay 6086 > Nov 5 18:12:29 pluto ptp4l: [354668.428] master offset -139 s2 > freq +34266 path delay 6092 > Nov 5 18:12:30 pluto phc2sys: [354669.993] phc offset 4529 s2 > freq +8805 delay 4715 > Nov 5 18:12:31 pluto ptp4l: [354670.428] master offset -32 s2 > freq +34299 path delay 6092 > Nov 5 18:12:33 pluto ptp4l: [354672.428] master offset 20 s2 > freq +34320 path delay 6092 > Nov 5 18:12:34 pluto phc2sys: [354673.993] phc offset 470 s2 > freq +7931 delay 4705 > Nov 5 18:12:35 pluto ptp4l: [354674.428] master offset 54 s2 > freq +34340 path delay 6095 > Nov 5 18:12:37 pluto ptp4l: [354676.428] master offset -15 s2 > freq +34314 path delay 6095 > Nov 5 18:12:38 pluto phc2sys: [354677.993] phc offset -6992 s2 > freq +3968 delay 4870 > Nov 5 18:12:39 pluto ptp4l: [354678.428] master offset -19 s2 > freq +34309 path delay 6096 > Nov 5 18:12:41 pluto ptp4l: [354680.429] master offset 55 s2 > freq +34344 path delay 6096 > Nov 5 18:12:42 pluto phc2sys: [354681.994] phc offset 11326 s2 > freq +11945 delay 4715 > Nov 5 18:12:43 pluto ptp4l: [354682.428] master offset -90 s2 > freq +34279 path delay 6096 > Nov 5 18:12:45 pluto ptp4l: [354684.429] master offset -49 s2 > freq +34286 path delay 6096 > Nov 5 18:12:46 pluto phc2sys: [354685.994] phc offset -70368744182111 > s2 freq -500000 delay 4715 > Nov 5 18:12:47 pluto ptp4l: [354686.428] clockcheck: clock jumped > forward or running faster than expected! This should pretty much be caused by something managing the clock causing a jump. Possibly your grand master on the other end is doing something? I can't think of any other reason this would occur... Do you have the ability to monitor the grand master state and see if it was jumped? Since you're doing hardware timestamping, nothing would control the clock on the device except ptp4l.. so even NTP running shouldn't cause an issue (other than phc2sys trying to interfere with it... but that wouldn't be in the ptp4l logs) My gut says the driver is resetting the clock to 0 somehow on accident... What about the driver, what version are you using? The debian in-kernel e1000e driver? Could you try this against the one available on sourceforge.net from our e1000 project? This could theoretically be caused by a bug in the driver.. Since I am not part of the e1000e team, I don't know the specifics for that driver... maybe they have some logic that is resetting the register values incorrectly.. You could also check the output of the clock directly by using the ptp test program provided in the Documentation folder in the kernel source.. you might be able to kill ptp4l in time and check to see what the value of the ptp device clock says it is at that point... Could you show us some of the dmesg output as well? Maybe that might indicate some other issue occurring.. I'm not really sure.. Regards, Jake |
From: Rich S. <sch...@gm...> - 2013-11-05 23:50:42
|
Thanks, Jake, Well, it is a jump of 19.55 hours in the last example that you note. We would notice that on the GrandMaster. These jumps happen four or 5 times per hour at random times. I am using a newer e1000e driver than the sourceforge driver (oh, oh). From the Intel site, 2.5.4-NAPI. I will try the sourceforge version next. # testptp -c -d /dev/ptp1 capabilities: 599999999 maximum frequency adjustment (ppb) 0 programmable alarms 0 external time stamp channels 0 programmable periodic signals 0 pulse per second # testptp -g -d /dev/ptp1;date clock time: 1383695290.199580197 or Tue Nov 5 23:48:10 2013 Tue Nov 5 23:48:10 UTC 2013 # ethtool -T eth1 Time stamping parameters for eth1: Capabilities: hardware-transmit (SOF_TIMESTAMPING_TX_HARDWARE) software-transmit (SOF_TIMESTAMPING_TX_SOFTWARE) hardware-receive (SOF_TIMESTAMPING_RX_HARDWARE) software-receive (SOF_TIMESTAMPING_RX_SOFTWARE) software-system-clock (SOF_TIMESTAMPING_SOFTWARE) hardware-raw-clock (SOF_TIMESTAMPING_RAW_HARDWARE) PTP Hardware Clock: 1 Hardware Transmit Timestamp Modes: off (HWTSTAMP_TX_OFF) on (HWTSTAMP_TX_ON) Hardware Receive Filter Modes: none (HWTSTAMP_FILTER_NONE) all (HWTSTAMP_FILTER_ALL) ptpv1-l4-sync (HWTSTAMP_FILTER_PTP_V1_L4_SYNC) ptpv1-l4-delay-req (HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ) ptpv2-l4-sync (HWTSTAMP_FILTER_PTP_V2_L4_SYNC) ptpv2-l4-delay-req (HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ) ptpv2-l2-sync (HWTSTAMP_FILTER_PTP_V2_L2_SYNC) ptpv2-l2-delay-req (HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ) ptpv2-event (HWTSTAMP_FILTER_PTP_V2_EVENT) ptpv2-sync (HWTSTAMP_FILTER_PTP_V2_SYNC) ptpv2-delay-req (HWTSTAMP_FILTER_PTP_V2_DELAY_REQ) My real question is can the clock_sanity check in linuxptp filter out crazy big offsets that are say, greater than 3 s.d. from the mean? On Tue, Nov 5, 2013 at 5:02 PM, Keller, Jacob E <jac...@in...>wrote: > Hi Rich, > > On Tue, 2013-11-05 at 16:26 -0500, Rich Schmidt wrote: > > This is Rich Schmidt, linuxptp newbie. > > > > I am testing linuxptp on this system at the US Naval Observatory: > > > > Supermicro SYS-5015A-EHF-D525 (Atom) > > > > Intel 82547L NICs driver: e1000e version: 2.5.4-NAPI > > firmware-version: 1.9-0 > > Debian with kernel 3.12.0-rc > > > > > > Running: > > Sync PHC to USNO Master Clock via Zyfer Gsync PTP GrandMaster: > > ptp4l -i eth1 -l 7 -s -p /dev/ptp1 > > > > Sync CLOCK_REALTIME to PHC: > > phc2sys -s /dev/ptp1 -L 100000000 -l 7 -R 0.25 -O 0 > > > > > > > > Things seem to work fine for a while, then I get a single large phase > > offset detected by ptp4l. The -L freq limit was an attempt to > > control these offsets, but did not help. > > > > > > Are these large phase jumps filtered out by ptp4l? It seems not, > > because phc2sys sees them. Or is this some unreliability in the Intel > > > > 82547L NICs? Is the PHC read failing? Thank you for your thoughts. > > > > > > > > Here is a sample. The clock is not being steered by NTP or any other > > program. > > > > Are you sure? I can't think of anything else controlling the clock, but > something is obviously controlling it as seen in the logs. > > > Nov 5 18:12:27 pluto ptp4l: [354666.428] master offset 57 s2 > > freq +34356 path delay 6086 > > Nov 5 18:12:29 pluto ptp4l: [354668.428] master offset -139 s2 > > freq +34266 path delay 6092 > > Nov 5 18:12:30 pluto phc2sys: [354669.993] phc offset 4529 s2 > > freq +8805 delay 4715 > > Nov 5 18:12:31 pluto ptp4l: [354670.428] master offset -32 s2 > > freq +34299 path delay 6092 > > Nov 5 18:12:33 pluto ptp4l: [354672.428] master offset 20 s2 > > freq +34320 path delay 6092 > > Nov 5 18:12:34 pluto phc2sys: [354673.993] phc offset 470 s2 > > freq +7931 delay 4705 > > Nov 5 18:12:35 pluto ptp4l: [354674.428] master offset 54 s2 > > freq +34340 path delay 6095 > > Nov 5 18:12:37 pluto ptp4l: [354676.428] master offset -15 s2 > > freq +34314 path delay 6095 > > Nov 5 18:12:38 pluto phc2sys: [354677.993] phc offset -6992 s2 > > freq +3968 delay 4870 > > Nov 5 18:12:39 pluto ptp4l: [354678.428] master offset -19 s2 > > freq +34309 path delay 6096 > > Nov 5 18:12:41 pluto ptp4l: [354680.429] master offset 55 s2 > > freq +34344 path delay 6096 > > Nov 5 18:12:42 pluto phc2sys: [354681.994] phc offset 11326 s2 > > freq +11945 delay 4715 > > Nov 5 18:12:43 pluto ptp4l: [354682.428] master offset -90 s2 > > freq +34279 path delay 6096 > > Nov 5 18:12:45 pluto ptp4l: [354684.429] master offset -49 s2 > > freq +34286 path delay 6096 > > Nov 5 18:12:46 pluto phc2sys: [354685.994] phc offset -70368744182111 > > s2 freq -500000 delay 4715 > > Nov 5 18:12:47 pluto ptp4l: [354686.428] clockcheck: clock jumped > > forward or running faster than expected! > > This should pretty much be caused by something managing the clock > causing a jump. Possibly your grand master on the other end is doing > something? I can't think of any other reason this would occur... Do you > have the ability to monitor the grand master state and see if it was > jumped? > > Since you're doing hardware timestamping, nothing would control the > clock on the device except ptp4l.. so even NTP running shouldn't cause > an issue (other than phc2sys trying to interfere with it... but that > wouldn't be in the ptp4l logs) > > My gut says the driver is resetting the clock to 0 somehow on > accident... > > What about the driver, what version are you using? The debian in-kernel > e1000e driver? Could you try this against the one available on > sourceforge.net from our e1000 project? This could theoretically be > caused by a bug in the driver.. > > Since I am not part of the e1000e team, I don't know the specifics for > that driver... maybe they have some logic that is resetting the register > values incorrectly.. > > You could also check the output of the clock directly by using the ptp > test program provided in the Documentation folder in the kernel source.. > you might be able to kill ptp4l in time and check to see what the value > of the ptp device clock says it is at that point... > > Could you show us some of the dmesg output as well? Maybe that might > indicate some other issue occurring.. I'm not really sure.. > > Regards, > Jake > > > |
From: Richard C. <ric...@gm...> - 2013-11-06 10:13:55
|
On Tue, Nov 05, 2013 at 06:50:33PM -0500, Rich Schmidt wrote: > > My real question is can the clock_sanity check in linuxptp filter out crazy > big offsets that are say, greater than 3 s.d. from the mean? The check code does not filter out any samples. It only prints a warning and resets the servo. Thanks, Richard |
From: Keller, J. E <jac...@in...> - 2013-11-06 00:14:07
|
On Tue, 2013-11-05 at 18:50 -0500, Rich Schmidt wrote: > Thanks, Jake, > Well, it is a jump of 19.55 hours in the last example that you note. > We would notice that on the GrandMaster. These jumps happen four or > 5 times per hour at random times. I am using a newer e1000e driver > than the sourceforge driver (oh, oh). From the Intel site, > 2.5.4-NAPI. I will try the sourceforge version next. > > # testptp -c -d /dev/ptp1 > capabilities: > 599999999 maximum frequency adjustment (ppb) > 0 programmable alarms > 0 external time stamp channels > 0 programmable periodic signals > 0 pulse per second > > # testptp -g -d /dev/ptp1;date > clock time: 1383695290.199580197 or Tue Nov 5 23:48:10 2013 > Tue Nov 5 23:48:10 UTC 2013 > # ethtool -T eth1 > > Time stamping parameters for eth1: > Capabilities: > hardware-transmit (SOF_TIMESTAMPING_TX_HARDWARE) > software-transmit (SOF_TIMESTAMPING_TX_SOFTWARE) > hardware-receive (SOF_TIMESTAMPING_RX_HARDWARE) > software-receive (SOF_TIMESTAMPING_RX_SOFTWARE) > software-system-clock (SOF_TIMESTAMPING_SOFTWARE) > hardware-raw-clock (SOF_TIMESTAMPING_RAW_HARDWARE) > PTP Hardware Clock: 1 > Hardware Transmit Timestamp Modes: > off (HWTSTAMP_TX_OFF) > on (HWTSTAMP_TX_ON) > Hardware Receive Filter Modes: > none (HWTSTAMP_FILTER_NONE) > all (HWTSTAMP_FILTER_ALL) > ptpv1-l4-sync (HWTSTAMP_FILTER_PTP_V1_L4_SYNC) > ptpv1-l4-delay-req (HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ) > ptpv2-l4-sync (HWTSTAMP_FILTER_PTP_V2_L4_SYNC) > ptpv2-l4-delay-req (HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ) > ptpv2-l2-sync (HWTSTAMP_FILTER_PTP_V2_L2_SYNC) > ptpv2-l2-delay-req (HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ) > ptpv2-event (HWTSTAMP_FILTER_PTP_V2_EVENT) > ptpv2-sync (HWTSTAMP_FILTER_PTP_V2_SYNC) > ptpv2-delay-req (HWTSTAMP_FILTER_PTP_V2_DELAY_REQ) > > > My real question is can the clock_sanity check in linuxptp filter out > crazy big offsets that are say, greater than 3 s.d. from the mean? > > If the Grand Master is having these jumps, there's no way to filter them.. that is the remote clock simply being jumped, and we can't "filter" it, because what we do is reset the servo, and then relock on the new target. It looks like a "you should filter this out" but in reality the values you see are the time offset from the master, not the current time. That is, the ptp4l log output doesn't really do a good job of showing the current time on the clock, so it isn't obvious whether you are actually going forward or not. Does this make sense? If the Grand Master is *not* jumping and you are just seeing your local clock reset, that's a bug. Thanks, Jake > > On Tue, Nov 5, 2013 at 5:02 PM, Keller, Jacob E > <jac...@in...> wrote: > Hi Rich, > > On Tue, 2013-11-05 at 16:26 -0500, Rich Schmidt wrote: > > This is Rich Schmidt, linuxptp newbie. > > > > I am testing linuxptp on this system at the US Naval > Observatory: > > > > Supermicro SYS-5015A-EHF-D525 (Atom) > > > > Intel 82547L NICs driver: e1000e version: 2.5.4-NAPI > > firmware-version: 1.9-0 > > Debian with kernel 3.12.0-rc > > > > > > Running: > > Sync PHC to USNO Master Clock via Zyfer Gsync PTP > GrandMaster: > > ptp4l -i eth1 -l 7 -s -p /dev/ptp1 > > > > Sync CLOCK_REALTIME to PHC: > > phc2sys -s /dev/ptp1 -L 100000000 -l 7 -R 0.25 -O 0 > > > > > > > > Things seem to work fine for a while, then I get a single > large phase > > offset detected by ptp4l. The -L freq limit was an attempt > to > > control these offsets, but did not help. > > > > > > Are these large phase jumps filtered out by ptp4l? It seems > not, > > because phc2sys sees them. Or is this some unreliability in > the Intel > > > > 82547L NICs? Is the PHC read failing? Thank you for your > thoughts. > > > > > > > > Here is a sample. The clock is not being steered by NTP or > any other > > program. > > > > > Are you sure? I can't think of anything else controlling the > clock, but > something is obviously controlling it as seen in the logs. > > > Nov 5 18:12:27 pluto ptp4l: [354666.428] master offset > 57 s2 > > freq +34356 path delay 6086 > > Nov 5 18:12:29 pluto ptp4l: [354668.428] master offset > -139 s2 > > freq +34266 path delay 6092 > > Nov 5 18:12:30 pluto phc2sys: [354669.993] phc offset > 4529 s2 > > freq +8805 delay 4715 > > Nov 5 18:12:31 pluto ptp4l: [354670.428] master offset > -32 s2 > > freq +34299 path delay 6092 > > Nov 5 18:12:33 pluto ptp4l: [354672.428] master offset > 20 s2 > > freq +34320 path delay 6092 > > Nov 5 18:12:34 pluto phc2sys: [354673.993] phc offset > 470 s2 > > freq +7931 delay 4705 > > Nov 5 18:12:35 pluto ptp4l: [354674.428] master offset > 54 s2 > > freq +34340 path delay 6095 > > Nov 5 18:12:37 pluto ptp4l: [354676.428] master offset > -15 s2 > > freq +34314 path delay 6095 > > Nov 5 18:12:38 pluto phc2sys: [354677.993] phc offset > -6992 s2 > > freq +3968 delay 4870 > > Nov 5 18:12:39 pluto ptp4l: [354678.428] master offset > -19 s2 > > freq +34309 path delay 6096 > > Nov 5 18:12:41 pluto ptp4l: [354680.429] master offset > 55 s2 > > freq +34344 path delay 6096 > > Nov 5 18:12:42 pluto phc2sys: [354681.994] phc offset > 11326 s2 > > freq +11945 delay 4715 > > Nov 5 18:12:43 pluto ptp4l: [354682.428] master offset > -90 s2 > > freq +34279 path delay 6096 > > Nov 5 18:12:45 pluto ptp4l: [354684.429] master offset > -49 s2 > > freq +34286 path delay 6096 > > Nov 5 18:12:46 pluto phc2sys: [354685.994] phc offset > -70368744182111 > > s2 freq -500000 delay 4715 > > Nov 5 18:12:47 pluto ptp4l: [354686.428] clockcheck: clock > jumped > > forward or running faster than expected! > > > This should pretty much be caused by something managing the > clock > causing a jump. Possibly your grand master on the other end is > doing > something? I can't think of any other reason this would > occur... Do you > have the ability to monitor the grand master state and see if > it was > jumped? > > Since you're doing hardware timestamping, nothing would > control the > clock on the device except ptp4l.. so even NTP running > shouldn't cause > an issue (other than phc2sys trying to interfere with it... > but that > wouldn't be in the ptp4l logs) > > My gut says the driver is resetting the clock to 0 somehow on > accident... > > What about the driver, what version are you using? The debian > in-kernel > e1000e driver? Could you try this against the one available on > sourceforge.net from our e1000 project? This could > theoretically be > caused by a bug in the driver.. > > Since I am not part of the e1000e team, I don't know the > specifics for > that driver... maybe they have some logic that is resetting > the register > values incorrectly.. > > You could also check the output of the clock directly by using > the ptp > test program provided in the Documentation folder in the > kernel source.. > you might be able to kill ptp4l in time and check to see what > the value > of the ptp device clock says it is at that point... > > Could you show us some of the dmesg output as well? Maybe that > might > indicate some other issue occurring.. I'm not really sure.. > > Regards, > Jake > > > > > ------------------------------------------------------------------------------ > November Webinars for C, C++, Fortran Developers > Accelerate application performance with scalable programming models. Explore > techniques for threading, error checking, porting, and tuning. Get the most > from the latest Intel processors and coprocessors. See abstracts and register > http://pubads.g.doubleclick.net/gampad/clk?id=60136231&iu=/4140/ostg.clktrk > _______________________________________________ > Linuxptp-users mailing list > Lin...@li... > https://lists.sourceforge.net/lists/listinfo/linuxptp-users |
From: Miroslav L. <mli...@re...> - 2013-11-06 08:03:25
|
On Wed, Nov 06, 2013 at 12:13:57AM +0000, Keller, Jacob E wrote: > On Tue, 2013-11-05 at 18:50 -0500, Rich Schmidt wrote: > > Well, it is a jump of 19.55 hours in the last example that you note. > > We would notice that on the GrandMaster. These jumps happen four or > > 5 times per hour at random times. I am using a newer e1000e driver > > than the sourceforge driver (oh, oh). From the Intel site, > > 2.5.4-NAPI. I will try the sourceforge version next. > If the Grand Master is having these jumps, there's no way to filter > them.. that is the remote clock simply being jumped, and we can't > "filter" it, because what we do is reset the servo, and then relock on > the new target. It looks like a "you should filter this out" but in > reality the values you see are the time offset from the master, not the > current time. It's not the Grand Master jumping as the sanity check includes the corrections made by the local servo. If there is a warning it means something else is touching the synchronized clock or the clock/driver is broken. -- Miroslav Lichvar |
From: Richard C. <ric...@gm...> - 2013-11-06 10:22:32
|
On Tue, Nov 05, 2013 at 04:26:21PM -0500, Rich Schmidt wrote: > This is Rich Schmidt, linuxptp newbie. > > I am testing linuxptp on this system at the US Naval Observatory: > Supermicro SYS-5015A-EHF-D525 (Atom) > Intel 82547L NICs driver: e1000e version: 2.5.4-NAPI firmware-version: > 1.9-0 > Debian with kernel 3.12.0-rc > > Running: > Sync PHC to USNO Master Clock via Zyfer Gsync PTP GrandMaster: > ptp4l -i eth1 -l 7 -s -p /dev/ptp1 > > Sync CLOCK_REALTIME to PHC: > phc2sys -s /dev/ptp1 -L 100000000 -l 7 -R 0.25 -O 0 Can you try this again running ptp4l alone without phc2sys? If no funny offsets appear, this would indicate a certain kind of driver problem, when both programs access the PHC. Thanks, Richard |
From: Rich S. <sch...@gm...> - 2013-11-06 17:18:23
|
*1. Last night ptp4 and phc2sys got into a loop of sorts:* Nov 6 01:34:47 pluto ptp4l: [381207.181] master offset -3312879535876 s2 freq -599999999 path delay -332476904 Nov 6 01:34:47 pluto phc2sys: [381207.216] phc offset 3313004697655 s2 freq +500000 delay 4863 Nov 6 01:34:49 pluto ptp4l: [381209.180] master offset -3311679354743 s2 freq -599999999 path delay -332476904 Nov 6 01:34:51 pluto ptp4l: [381211.178] master offset -3310479396557 s2 freq -599999999 path delay -332476904 Nov 6 01:34:51 pluto phc2sys: [381211.216] phc offset 3310601232712 s2 freq +500000 delay 4713 Nov 6 01:34:52 pluto ptp4l: [381211.853] negative path delay -192163390 Nov 6 01:34:52 pluto ptp4l: [381211.853] path_delay = (t2 - t3) + (t4 - t1) Nov 6 01:34:52 pluto ptp4l: [381211.853] t2 - t3 = -1024845438 Nov 6 01:34:52 pluto ptp4l: [381211.853] t4 - t1 = +640518657 Nov 6 01:34:52 pluto ptp4l: [381211.853] c1 0 Nov 6 01:34:52 pluto ptp4l: [381211.853] c2 0 Nov 6 01:34:52 pluto ptp4l: [381211.854] c3 0 Nov 6 01:34:53 pluto ptp4l: [381213.178] master offset -3309260051493 s2 freq -599999999 path delay -351664636 Nov 6 01:34:55 pluto ptp4l: [381215.177] master offset -3308059890747 s2 freq -599999999 path delay -351664636 Nov 6 01:35:11 pluto phc2sys: [381231.217] phc offset -67053297050824 s2 freq -500000 delay 4718 Nov 6 01:35:13 pluto ptp4l: [381233.183] master offset 67052031445161 s2 freq +599999999 path delay -97064991 ... during the night a number of jumps are detected by phc2sys (lines deleted)... Nov 6 05:14:00 pluto ptp4l: [394359.794] master offset 66882744861346 s2 freq +599999999 path delay -32382521 Nov 6 05:14:00 pluto phc2sys: [394359.859] phc offset -66882716164068 s2 freq -500000 delay 4717 Nov 6 05:14:02 pluto ptp4l: [394361.796] master offset 66881544843520 s2 freq +599999999 path delay -32382521 Nov 6 05:14:04 pluto ptp4l: [394363.797] master offset 66880344836655 s2 freq +599999999 path delay -32382521 Nov 6 05:14:04 pluto phc2sys: [394363.859] phc offset -66880315280640 s2 freq -500000 delay 4717 Nov 6 05:14:06 pluto ptp4l: [394365.798] master offset 66879099025472 s2 freq +599999999 path delay 13411172 Nov 6 05:14:08 pluto ptp4l: [394367.799] master offset 66877899010215 s2 freq +599999999 path delay 13411172 Nov 6 05:14:08 pluto phc2sys: [394367.859] phc offset -66877914394903 s2 freq -500000 delay 4727 Nov 6 05:14:10 pluto ptp4l: [394369.800] master offset 66876589756605 s2 freq +599999999 path delay 122641620 ...lines deleted... Nov 6 06:40:17 pluto ptp4l: [399536.604] master offset 63777989946575 s2 freq +599999999 path delay 283612433 Nov 6 06:40:19 pluto ptp4l: [399538.605] master offset 63776799780537 s2 freq +599999999 path delay 273757038 Nov 6 06:40:20 pluto phc2sys: [399540.146] phc offset -134142308172546 s2 freq -500000 delay 4717 Nov 6 06:40:21 pluto ptp4l: [399540.605] clockcheck: clock jumped forward or running faster than expected! Nov 6 06:40:21 pluto ptp4l: [399540.606] master offset 134144343946703 s0 freq +0 path delay 273757038 ... lines deleted... Nov 6 06:41:11 pluto ptp4l: [399591.102] negative path delay -66539 Nov 6 06:41:11 pluto ptp4l: [399591.102] path_delay = (t2 - t3) + (t4 - t1) Nov 6 06:41:11 pluto ptp4l: [399591.102] t2 - t3 = -484054584 Nov 6 06:41:11 pluto ptp4l: [399591.102] t4 - t1 = +483921505 Nov 6 06:41:11 pluto ptp4l: [399591.102] c1 0 Nov 6 06:41:11 pluto ptp4l: [399591.102] c2 0 Nov 6 06:41:11 pluto ptp4l: [399591.102] c3 0 Nov 6 06:41:12 pluto phc2sys: [399592.148] phc offset 2561210209 s2 freq +500000 delay 4713 Nov 6 06:41:13 pluto ptp4l: [399592.586] master offset 31642 s2 freq -72514 path delay 1932384 Nov 6 06:41:15 pluto ptp4l: [399594.585] master offset 155956 s2 freq -5611 path delay 2021789 Nov 6 06:41:16 pluto phc2sys: [399596.149] phc offset 2558838385 s2 freq +500000 delay 4712 Nov 6 06:41:17 pluto ptp4l: [399596.584] master offset 235565 s2 freq +57587 path delay 2021789 Nov 6 06:41:19 pluto ptp4l: [399598.583] master offset 189166 s2 freq +69722 path delay 2021789 ... *2. This morning ptp4 reports that clock time is 17.1 hours offset:* Nov 6 16:14:27 pluto ptp4l: [433991.680] master offset 61591498021410 s2 freq +599999999 path delay 383379265 Nov 6 16:14:29 pluto ptp4l: [433993.680] master offset 61590330241412 s2 freq +599999999 path delay 351150078 Nov 6 16:14:31 pluto ptp4l: [433995.680] master offset 61589133175281 s2 freq +599999999 path delay 348198056 Nov 6 16:14:33 pluto ptp4l: [433997.680] master offset 61587955406085 s2 freq +599999999 path delay 325950874 Nov 6 16:14:35 pluto ptp4l: [433999.680] master offset 61586755389932 s2 freq +599999999 path delay 325950874 Nov 6 16:14:37 pluto ptp4l: [434001.680] master offset 61585555381755 s2 freq +599999999 path delay 325950874 Nov 6 16:14:39 pluto ptp4l: [434003.681] master offset 61584332985592 s2 freq +599999999 path delay 348330452 Nov 6 16:14:41 pluto ptp4l: [434005.681] master offset 61583132971605 s2 freq +599999999 path delay 348330452 Nov 6 16:14:43 pluto ptp4l: [434007.681] master offset 61581932953300 s2 freq +599999999 path delay 348330452 Nov 6 16:14:45 pluto ptp4l: [434009.681] master offset 61580761060120 s2 freq +599999999 path delay 320211807 Nov 6 16:14:47 pluto ptp4l: [434011.681] master offset 61579561037423 s2 freq +599999999 path delay 320211807 Nov 6 16:14:49 pluto ptp4l: [434013.681] master offset 61578361028013 s2 freq +599999999 path delay 320211807 Nov 6 16:14:51 pluto ptp4l: [434015.681] master offset 61577183886161 s2 freq +599999999 path delay 29733548217.1 Actually it is (24 - 17.1) = 6h 53m behind: root@pluto:/tmp# testptp -g -d /dev/ptp1;date clock time: 1383816008.279232658 or Thu Nov 7 09:20:08 2013 Wed Nov 6 16:12:23 UTC 2013 *3. I set the correct time on the clock using ntpdate, and sync the PHC:root@pluto:/tmp# testptp -s -d /dev/ptp1set time okayroot@pluto:/tmp# testptp -g -d /dev/ptp1;dateclock time: 1383754501.245498330 or Wed Nov 6 16:15:01 2013Wed Nov 6 16:14:59 UTC 20134. ptp4l responds:* Nov 6 16:14:53 pluto ptp4l: [434017.681] master offset 61575983886672 s2 freq +599999999 path delay 297335482 Nov 6 16:14:55 pluto ptp4l: [434019.680] clockcheck: clock jumped backward or running slower than expected! Nov 6 16:14:55 pluto ptp4l: [434019.681] master offset -360063735 s0 freq +0 path delay 277518883 Nov 6 16:14:55 pluto ptp4l: [434019.681] port 1: SLAVE to UNCALIBRATED on SYNCHRONIZATION_FAULT Nov 6 16:14:57 pluto ptp4l: [434021.681] master offset -1560079696 s1 freq -599999999 path delay 277518883 Nov 6 16:14:59 pluto ptp4l: [434023.681] master offset 1198240376 s2 freq -879811 path delay 277518883 Nov 6 16:14:59 pluto ptp4l: [434023.681] port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED Nov 6 16:15:01 pluto ptp4l: [434025.681] master offset 1201022077 s2 freq +180247096 path delay 277518883 Nov 6 16:15:03 pluto ptp4l: [434027.682] master offset 840864807 s2 freq +180321772 path delay 277518883 Nov 6 16:15:05 pluto ptp4l: [434029.682] master offset 480264272 s2 freq +126151226 path delay 277518883 Nov 6 16:15:07 pluto ptp4l: [434031.682] master offset 277988358 s2 freq +97052910 path delay 227454323 ...lines deleted... *# testptp -g -d /dev/ptp1;dateclock time: 1383754506.107142710 or Wed Nov 6 16:15:06 2013Wed Nov 6 16:15:05 UTC 2013# testptp -s -d /dev/ptp1set time okay# testptp -g -d /dev/ptp1;dateclock time: 1383754512.793827793 or Wed Nov 6 16:15:12 2013Wed Nov 6 16:15:12 UTC 2013# testptp -g -d /dev/ptp1;dateclock time: 1383754517.291075454 or Wed Nov 6 16:15:17 2013Wed Nov 6 16:15:17 UTC 2013*... Now I'm running ptp4l without phc2sys and will watch for any clock jumps today. On Wed, Nov 6, 2013 at 5:22 AM, Richard Cochran <ric...@gm...>wrote: > On Tue, Nov 05, 2013 at 04:26:21PM -0500, Rich Schmidt wrote: > > This is Rich Schmidt, linuxptp newbie. > > > > I am testing linuxptp on this system at the US Naval Observatory: > > Supermicro SYS-5015A-EHF-D525 (Atom) > > Intel 82547L NICs driver: e1000e version: 2.5.4-NAPI firmware-version: > > 1.9-0 > > Debian with kernel 3.12.0-rc > > > > Running: > > Sync PHC to USNO Master Clock via Zyfer Gsync PTP GrandMaster: > > ptp4l -i eth1 -l 7 -s -p /dev/ptp1 > > > > Sync CLOCK_REALTIME to PHC: > > phc2sys -s /dev/ptp1 -L 100000000 -l 7 -R 0.25 -O 0 > > Can you try this again running ptp4l alone without phc2sys? > > If no funny offsets appear, this would indicate a certain kind of > driver problem, when both programs access the PHC. > > Thanks, > Richard > |
From: Richard C. <ric...@gm...> - 2013-11-06 19:23:40
|
A couple of things to try, below... On Tue, Nov 05, 2013 at 04:26:21PM -0500, Rich Schmidt wrote: > Intel 82547L NICs driver: e1000e version: 2.5.4-NAPI firmware-version: 1.9-0 > Debian with kernel 3.12.0-rc 1. This driver also supports software time stamping. So please try 'ptp4l -S' and no phc2sys. (In this case, ptp4l will adjust the Linux system clock.) If that works without the jumps, then the issue is most likely a driver or kernel issue. 2. You gave the e1000e version as 2.5.4, but in the kernel source drivers/net/ethernet/intel/e1000e/netdev.c I see version 2.3.2, so you must using the SF drivers. You might also try the driver from the kernel. 3. Are you using your own kernel.org kernel build from source? If so, that is good, but you should know that v3.12 has been released, and it is better to drop the -rc kernel. Thanks, Richard |
From: Richard C. <ric...@gm...> - 2013-11-06 19:26:43
|
On Tue, Nov 05, 2013 at 04:26:21PM -0500, Rich Schmidt wrote: > > Things seem to work fine for a while, then I get a single large phase > offset detected by ptp4l. The -L freq limit was an attempt to control > these offsets, but did not help. How long before the first jump appears after a cold start? I have a 82547 PCIe card, and I will see if I can reproduce this effect. Thanks, Richard |
From: Richard C. <ric...@gm...> - 2013-11-09 14:38:04
|
On Wed, Nov 06, 2013 at 08:26:26PM +0100, Richard Cochran wrote: > > I have a 82547 PCIe card, and I will see if I can reproduce this > effect. I ran my 82547 card in my old rack mounted Pentium dual core, with debian wheezy but with a custom, vanilla 3.11+ kernel. I couldn't see anything going wrong at all. Regarding the phc2sys servo performance (Rick mentioned off-list that he wanted one sample every 32 seconds and that the frequency offset looked too jittery), I think it is only a matter of choosing the right PI constants. The defaults of kp=0.7 and ki=0.3 are not good, I think, and I will post a patch for this. In the mean time, you can set the constants manually, for example: -P 0.022 -I 0.004 I appended a bit of the phc2sys log to show how it *should* be working. Thanks, Richard $ sudo ./phc2sys -s eth4 -O0 -m -q -l7 -R .03125 -P 0.022 -I 0.004 phc2sys[7543.139]: PI servo: sync interval 32.000 kp 0.022 ki 0.004000 phc2sys[7575.139]: sys offset 309 s0 freq +0 delay 4427 phc2sys[7607.140]: sys offset 7595 s1 freq -121370 delay 4421 phc2sys[7639.140]: sys offset -6397 s2 freq -121537 delay 4422 phc2sys[7671.140]: sys offset 3198 s2 freq -121313 delay 4428 phc2sys[7703.140]: sys offset -14547 s2 freq -121761 delay 4427 phc2sys[7735.140]: sys offset -2589 s2 freq -121509 delay 4427 phc2sys[7767.141]: sys offset -4859 s2 freq -121578 delay 4430 phc2sys[7799.141]: sys offset -580 s2 freq -121486 delay 4434 phc2sys[7831.141]: sys offset 9944 s2 freq -121215 delay 4421 phc2sys[7863.141]: sys offset -10701 s2 freq -121712 delay 4424 phc2sys[7895.141]: sys offset 878 s2 freq -121454 delay 4421 phc2sys[7927.142]: sys offset -197 s2 freq -121478 delay 4428 phc2sys[7959.142]: sys offset 5006 s2 freq -121344 delay 4421 phc2sys[7991.142]: sys offset -11600 s2 freq -121755 delay 4432 phc2sys[8023.142]: sys offset 4294 s2 freq -121388 delay 4427 phc2sys[8055.142]: sys offset 5382 s2 freq -121343 delay 4421 phc2sys[8087.142]: sys offset -9560 s2 freq -121710 delay 4429 phc2sys[8119.142]: sys offset -5780 s2 freq -121650 delay 4444 phc2sys[8151.142]: sys offset -6365 s2 freq -121688 delay 4421 phc2sys[8183.143]: sys offset 5942 s2 freq -121394 delay 4424 phc2sys[8215.143]: sys offset 3935 s2 freq -121422 delay 4434 phc2sys[8247.143]: sys offset -9148 s2 freq -121747 delay 4427 phc2sys[8279.143]: sys offset 5052 s2 freq -121414 delay 4429 phc2sys[8311.143]: sys offset 2501 s2 freq -121460 delay 4446 phc2sys[8343.143]: sys offset 4341 s2 freq -121402 delay 4422 phc2sys[8375.143]: sys offset 119 s2 freq -121495 delay 4429 phc2sys[8407.143]: sys offset -7465 s2 freq -121691 delay 4424 phc2sys[8439.143]: sys offset 12213 s2 freq -121210 delay 4426 phc2sys[8471.144]: sys offset -1892 s2 freq -121527 delay 4421 phc2sys[8503.144]: sys offset -6495 s2 freq -121655 delay 4421 phc2sys[8535.144]: sys offset -11090 s2 freq -121800 delay 4422 |
From: Richard C. <ric...@gm...> - 2013-11-14 09:17:15
|
On Thu, Nov 14, 2013 at 10:01:17AM +0100, Richard Cochran wrote: > On Wed, Nov 13, 2013 at 01:41:37PM +0100, Miroslav Lichvar wrote: > > > > I tried to run the test with 32s interval on my test machine. When the > > CPU is idle: > > > > phc2sys[773832.254]: PI servo: sync interval 32.000 kp 0.022 ki 0.009375 > > phc2sys[773864.254]: sys offset -58 s0 freq +0 delay 4929 > > Your results look way better than mine. I guess that your machine is a > 64 bit OS with VDSO, right? > > Mine is a 32 bit Pentium D, and Rick's is an atom, IIRC. > > It looks to me like the VDSO is giving much different results. But we > want to have phc2sys working well in both cases. I take it back. My setup is using the PTP_SYS_OFFSET ioctl. But still, I see much more jitter in the time error... Thanks, Richard |
From: Miroslav L. <mli...@re...> - 2013-11-14 09:38:56
|
On Thu, Nov 14, 2013 at 10:16:57AM +0100, Richard Cochran wrote: > On Thu, Nov 14, 2013 at 10:01:17AM +0100, Richard Cochran wrote: > > Mine is a 32 bit Pentium D, and Rick's is an atom, IIRC. > > > > It looks to me like the VDSO is giving much different results. But we > > want to have phc2sys working well in both cases. > > I take it back. My setup is using the PTP_SYS_OFFSET ioctl. > > But still, I see much more jitter in the time error... Any chance your kernel is running with nohz? You might get better results with nohz=off. https://lkml.org/lkml/2012/10/11/343 I made some progress in investigation of the bug and I'll try to post the results/patch soon. -- Miroslav Lichvar |
From: Richard C. <ric...@gm...> - 2013-11-14 11:50:25
|
On Thu, Nov 14, 2013 at 10:38:46AM +0100, Miroslav Lichvar wrote: > > Any chance your kernel is running with nohz? You might get better > results with nohz=off. Yes, I just noticed that my test machine has CONFIG_NO_HZ_IDLE. I suspect that the NTP time accounting is broken in this case, when the machine load is light. With a heavy load (like compiling kernel with -j2, dual core) then the time jitter stays within 100 ns or so using: phc2sys -R 1 -P.1 -I.001 -N 25 I am right now recompiling with CONFIG_HZ_PERIODIC CONFIG_PREEMPT_NONE CONFIG_HZ_1000 and then I'll rerun the test. Thanks, Richard |
From: Richard C. <ric...@gm...> - 2013-11-14 12:32:23
|
On Thu, Nov 14, 2013 at 10:38:46AM +0100, Miroslav Lichvar wrote: > > I made some progress in investigation of the bug and I'll try to post > the results/patch soon. If you post to LKML then please put me on CC so that I can spot the thread. Thanks, Richard |
From: Rich S. <sch...@gm...> - 2013-11-06 20:30:07
|
Richard thanks for the suggestions. I have been running ptp4l alone (hardware stamping) for about 2 hrs 45m with no phase jumps. Now I'm testing ptp4l -S , but in fact my PTP from the FEI-Zyfer Gsync is in UTC, not PTP, and ptp4l thinks otherwise, so it set the system clock ahead 35 seconds. Yes kernel 3.12 has e1000e driver 2.3.2. I built it from kernel.orgsource. I downloaded 2.5.4 from Intel and loaded it as a module (only mode supported by Intel for that version.) # ethtool -T eth1 Time stamping parameters for eth1: Capabilities: hardware-transmit (SOF_TIMESTAMPING_TX_HARDWARE) software-transmit (SOF_TIMESTAMPING_TX_SOFTWARE) hardware-receive (SOF_TIMESTAMPING_RX_HARDWARE) software-receive (SOF_TIMESTAMPING_RX_SOFTWARE) software-system-clock (SOF_TIMESTAMPING_SOFTWARE) hardware-raw-clock (SOF_TIMESTAMPING_RAW_HARDWARE) PTP Hardware Clock: 1 Hardware Transmit Timestamp Modes: off (HWTSTAMP_TX_OFF) on (HWTSTAMP_TX_ON) Hardware Receive Filter Modes: none (HWTSTAMP_FILTER_NONE) all (HWTSTAMP_FILTER_ALL) ptpv1-l4-sync (HWTSTAMP_FILTER_PTP_V1_L4_SYNC) ptpv1-l4-delay-req (HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ) ptpv2-l4-sync (HWTSTAMP_FILTER_PTP_V2_L4_SYNC) ptpv2-l4-delay-req (HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ) ptpv2-l2-sync (HWTSTAMP_FILTER_PTP_V2_L2_SYNC) ptpv2-l2-delay-req (HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ) ptpv2-event (HWTSTAMP_FILTER_PTP_V2_EVENT) ptpv2-sync (HWTSTAMP_FILTER_PTP_V2_SYNC) ptpv2-delay-req (HWTSTAMP_FILTER_PTP_V2_DELAY_REQ) ... Nov 6 20:22:05 pluto ptp4l: [448891.338] master offset -215 s2 freq +7423 path delay 41013 Nov 6 20:22:07 pluto ptp4l: [448893.338] master offset 1614 s2 freq +7573 path delay 41013 Nov 6 20:22:09 pluto ptp4l: [448895.338] master offset -2071 s2 freq +7271 path delay 41013 Nov 6 20:22:11 pluto ptp4l: [448897.338] master offset 5175 s2 freq +7867 path delay 40792 Nov 6 20:22:13 pluto ptp4l: [448899.338] master offset 2514 s2 freq +7654 path delay 40792 Nov 6 20:22:15 pluto ptp4l: [448901.339] master offset 4204 s2 freq +7797 path delay 40792 Nov 6 20:22:17 pluto ptp4l: [448903.338] master offset 3497 s2 freq +7744 path delay 40792 ... Rich On Wed, Nov 6, 2013 at 2:23 PM, Richard Cochran <ric...@gm...>wrote: > A couple of things to try, below... > > On Tue, Nov 05, 2013 at 04:26:21PM -0500, Rich Schmidt wrote: > > > Intel 82547L NICs driver: e1000e version: 2.5.4-NAPI > firmware-version: 1.9-0 > > Debian with kernel 3.12.0-rc > > 1. This driver also supports software time stamping. So please try > 'ptp4l -S' and no phc2sys. (In this case, ptp4l will adjust the > Linux system clock.) If that works without the jumps, then the > issue is most likely a driver or kernel issue. > > 2. You gave the e1000e version as 2.5.4, but in the kernel source > drivers/net/ethernet/intel/e1000e/netdev.c I see version 2.3.2, so > you must using the SF drivers. You might also try the driver from > the kernel. > > 3. Are you using your own kernel.org kernel build from source? If so, > that is good, but you should know that v3.12 has been released, and > it is better to drop the -rc kernel. > > Thanks, > Richard > > |
From: Rich S. <sch...@gm...> - 2013-11-11 22:12:18
Attachments:
log-nov11.txt
|
Nov 11 21:42:50 pluto phc2sys: [2757.866] phc offset 261267301 s0 freq +0 delay 4722 Nov 11 21:43:22 pluto phc2sys: [2789.866] phc offset 277500554 s1 freq +7285 delay 4722 Nov 11 21:43:54 pluto phc2sys: [2821.867] phc offset 3462 s2 freq +7375 delay 4715 Nov 11 21:44:26 pluto phc2sys: [2853.867] phc offset 5913 s2 freq +7453 delay 4710 Nov 11 21:44:58 pluto phc2sys: [2885.867] phc offset -491 s2 freq +7310 delay 4705 Nov 11 21:45:30 pluto phc2sys: [2917.867] phc offset 3465 s2 freq +7411 delay 4720 Nov 11 21:46:02 pluto phc2sys: [2949.867] phc offset 4888 s2 freq +7462 delay 4700 Nov 11 21:46:34 pluto phc2sys: [2981.867] phc offset -1083 s2 freq +7326 delay 4710 Nov 11 21:47:06 pluto phc2sys: [3013.868] phc offset 6008 s2 freq +7506 delay 4720 Nov 11 21:47:38 pluto phc2sys: [3045.868] phc offset 4738 s2 freq +7497 delay 4700 Nov 11 21:48:10 pluto phc2sys: [3077.868] phc offset 578 s2 freq +7408 delay 4710 Nov 11 21:48:42 pluto phc2sys: [3109.868] phc offset 1082 s2 freq +7423 delay 4700 Nov 11 21:49:14 pluto phc2sys: [3141.868] phc offset 1497 s2 freq +7438 delay 4705 Nov 11 21:49:46 pluto phc2sys: [3173.869] phc offset -1775 s2 freq +7359 delay 4709 Nov 11 21:50:18 pluto phc2sys: [3205.869] phc offset -2171 s2 freq +7342 delay 4715 Nov 11 21:50:50 pluto phc2sys: [3237.869] phc offset -1054 s2 freq +7362 delay 4705 Nov 11 21:51:22 pluto phc2sys: [3269.869] phc offset 466 s2 freq +7398 delay 4700 Nov 11 21:51:54 pluto phc2sys: [3301.869] phc offset 2186 s2 freq +7444 delay 4720 Nov 11 21:52:26 pluto phc2sys: [3333.869] phc offset -1351 s2 freq +7361 delay 4695 Nov 11 21:52:58 pluto phc2sys: [3365.870] phc offset -7468 s2 freq +7196 delay 4720 Nov 11 21:53:30 pluto phc2sys: [3397.870] phc offset 6053 s2 freq +7518 delay 4709 Nov 11 21:54:02 pluto phc2sys: [3429.870] phc offset -1937 s2 freq +7335 delay 4700 Nov 11 21:54:34 pluto phc2sys: [3461.870] phc offset 2551 s2 freq +7444 delay 4711 Nov 11 21:55:06 pluto phc2sys: [3493.870] phc offset 1067 s2 freq +7415 delay 4690 Nov 11 21:55:38 pluto phc2sys: [3525.871] phc offset -6052 s2 freq +7234 delay 4695 Nov 11 21:56:10 pluto phc2sys: [3557.871] phc offset 7692 s2 freq +7568 delay 4700 Nov 11 21:56:42 pluto phc2sys: [3589.871] phc offset 4897 s2 freq +7526 delay 4710 Nov 11 21:57:14 pluto phc2sys: [3621.871] phc offset -8351 s2 freq +7201 delay 4720 Nov 11 21:57:46 pluto phc2sys: [3653.871] phc offset 8703 s2 freq +7611 delay 4809 Nov 11 21:58:18 pluto phc2sys: [3685.871] phc offset -3012 s2 freq +7341 delay 4719 Nov 11 21:58:50 pluto phc2sys: [3717.872] phc offset -872 s2 freq +7385 delay 4719 Nov 11 21:59:22 pluto phc2sys: [3749.872] phc offset -2757 s2 freq +7332 delay 4710 Nov 11 21:59:54 pluto phc2sys: [3781.872] phc offset 506 s2 freq +7406 delay 4710 Nov 11 22:00:26 pluto phc2sys: [3813.872] phc offset -1086 s2 freq +7367 delay 4710 Nov 11 22:00:58 pluto phc2sys: [3845.872] phc offset 6962 s2 freq +7571 delay 4710 Nov 11 22:01:30 pluto phc2sys: [3877.873] phc offset -3331 s2 freq +7332 delay 4719 Nov 11 22:02:02 pluto phc2sys: [3909.873] phc offset -5253 s2 freq +7268 delay 4705 Nov 11 22:02:34 pluto phc2sys: [3941.873] phc offset -791 s2 freq +7363 delay 4720 Nov 11 22:03:06 pluto phc2sys: [3973.873] phc offset -78 s2 freq +7379 delay 4700 Nov 11 22:03:38 pluto phc2sys: [4005.873] phc offset -5569 s2 freq +7236 delay 4710 Nov 11 22:04:10 pluto phc2sys: [4037.873] phc offset 3904 s2 freq +7460 delay 4710 Nov 11 22:04:42 pluto phc2sys: [4069.874] phc offset -5725 s2 freq +7225 delay 4830 Nov 11 22:05:14 pluto phc2sys: [4101.874] phc offset 6031 s2 freq +7508 delay 4710 Nov 11 22:05:46 pluto phc2sys: [4133.874] phc offset 7248 s2 freq +7563 delay 4705 Nov 11 22:06:18 pluto phc2sys: [4165.874] phc offset -9122 s2 freq +7167 delay 4700 Nov 11 22:06:50 pluto phc2sys: [4197.874] phc offset 1891 s2 freq +7417 delay 4705 Nov 11 22:07:22 pluto phc2sys: [4229.875] phc offset -53 s2 freq +7374 delay 4711 Nov 11 22:07:54 pluto phc2sys: [4261.875] phc offset 4310 s2 freq +7487 delay 4705 Nov 11 22:08:26 pluto phc2sys: [4293.875] phc offset -1578 s2 freq +7351 delay 4719 Nov 11 22:08:58 pluto phc2sys: [4325.875] phc offset -6171 s2 freq +7225 delay 4711 Nov 11 22:09:30 pluto phc2sys: [4357.875] phc offset 2605 s2 freq +7429 delay 4710 Nov 11 22:10:02 pluto phc2sys: [4389.875] phc offset 4649 s2 freq +7492 delay 4689 |
From: Rich S. <sch...@gm...> - 2013-11-15 16:45:53
|
Miroslav, My kernel configuration of linux 3.12 has: CONFIG_TICK_ONESHOT=y CONFIG_NO_HZ_COMMON=y # CONFIG_HZ_PERIODIC is not set CONFIG_NO_HZ_IDLE=y # CONFIG_NO_HZ_FULL is not set CONFIG_NO_HZ=y CONFIG_HIGH_RES_TIMERS=y CONFIG_HZ_1000=y CONFIG_HZ=1000 chrony was assuming 100hz. Nov 15 16:28:29 pluto chronyd[2504]: hz=100 shift_hz=7 freq_scale=1.00000000 nominal_tick=10000 slew_delta_tick=833 max_tick_bias=1000 shift_pll=2 When I specify "linux_hz 1000" in chrony.conf, I get: Nov 15 16:39:11 pluto chronyd[2532]: Fatal error : adjtimex failed for set_frequency, freq_ppm=7.4496e+00 scaled_freq=-7.4496e+00 required_tick=1000 Do I need to recompile the kernel with 100hz? Also, the description of log files in chrony.txt does not match the format of the files in my log. Can you provide an interpretation? 52864df1 000a0aa5 7.278382e-06 9.000000e-06 1.380000e-04 1.066010e-06 1.380000e-04 1.130216e-03 1.000000e+00 1 "Measurements" does not log? Rich Schmidt On Fri, Nov 15, 2013 at 7:51 AM, Miroslav Lichvar <mli...@re...>wrote: > On Thu, Nov 14, 2013 at 06:47:39PM -0500, Rich Schmidt wrote: > > Miroslav, thank you for the info on chrony! > > I have downloaded and started chrony 1.29. Very nice. Then I went to the > > git repository and downloaded chrony-7261d11.tar.gz. When compiling I > get: > > > #define PTP_MAX_SAMPLES 10 (I just guessed at a number) > > > so I added to Makefile: LIBS = -lm -lrt > > Thanks for the report, Rich. It should be fixed in git now, please let > me know if it still doesn't compile for you. > > > I added to chrony.conf: > > refclock PHC /dev/ptp1 poll 3 dpoll -2 offset -0 > > > 210 Number of sources = 3 > > MS Name/IP address Stratum Poll Reach LastRx Last sample > > > =============================================================================== > > #- PHC0 0 3 377 6 +92us[ +58us] +/- > > 84.3s > > That 84.3s figure doesn't look good. It should be in nanoseconds, not > seconds :). You can see the individial samples in the refclocks.log, > maybe something interesting will be there. > > Here is what I get on an idle machine with i210 card: > > 2013-11-15 12:46:50.171265 PHC0 21 N 0 1.000000e-09 7.506664e-23 > 6.465e-09 > 2013-11-15 12:46:50.421633 PHC0 22 N 0 1.600000e-08 1.500000e-08 > 6.465e-09 > 2013-11-15 12:46:50.671984 PHC0 23 N 0 0.000000e+00 -1.000000e-09 > 6.465e-09 > 2013-11-15 12:46:50.922248 PHC0 24 N 0 -5.500000e-09 -6.500000e-09 > 6.465e-09 > 2013-11-15 12:46:51.172608 PHC0 25 N 0 -6.500000e-09 -6.500000e-09 > 6.465e-09 > 2013-11-15 12:46:51.422972 PHC0 26 N 0 1.250000e-08 1.250000e-08 > 6.465e-09 > 2013-11-15 12:46:51.673240 PHC0 27 N 0 1.400000e-08 1.400000e-08 > 6.465e-09 > 2013-11-15 12:46:51.923629 PHC0 28 N 0 1.100000e-08 1.100000e-08 > 6.465e-09 > 2013-11-15 12:46:52.173989 PHC0 29 N 0 -5.000000e-10 -5.000000e-10 > 6.465e-09 > 2013-11-15 12:46:52.424144 PHC0 30 N 0 1.350000e-08 1.350000e-08 > 6.465e-09 > 2013-11-15 12:46:52.674523 PHC0 31 N 0 0.000000e+00 0.000000e+00 > 6.465e-09 > 2013-11-15 12:46:48.156077 PHC0 - N - - 4.175000e-09 > 1.450e-09 > 2013-11-15 12:46:52.924976 PHC0 0 N 0 4.200000e-08 4.000000e-08 > 6.174e-09 > 2013-11-15 12:46:53.175264 PHC0 1 N 0 -8.500000e-09 -1.050000e-08 > 6.174e-09 > 2013-11-15 12:46:53.425603 PHC0 2 N 0 7.500000e-09 5.500000e-09 > 6.174e-09 > > $ chronyc sources > 210 Number of sources = 1 > MS Name/IP address Stratum Poll Reach LastRx Last sample > > =============================================================================== > #* PHC0 0 3 377 3 -1ns[ +10ns] +/- > 3ns > > $ chronyc sourcestats > 210 Number of sources = 1 > Name/IP Address NP NR Span Frequency Freq Skew Offset Std > Dev > > ============================================================================== > PHC0 6 3 38 0.000 0.001 +0ns > 2ns > > > Now I will read chrony.txt. Would like to log statistics as NTP does. I > > will run it all night tonight and check tomorrow. > > That can be configured with logdir and log: > logdir /var/log/chrony > log measurements statistics tracking refclocks > > -- > Miroslav Lichvar > |
From: Miroslav L. <mli...@re...> - 2013-11-15 17:23:27
|
On Fri, Nov 15, 2013 at 11:45:45AM -0500, Rich Schmidt wrote: > Miroslav, > My kernel configuration of linux 3.12 has: > > CONFIG_TICK_ONESHOT=y > CONFIG_NO_HZ_COMMON=y > # CONFIG_HZ_PERIODIC is not set > CONFIG_NO_HZ_IDLE=y > # CONFIG_NO_HZ_FULL is not set > CONFIG_NO_HZ=y > CONFIG_HIGH_RES_TIMERS=y > CONFIG_HZ_1000=y > CONFIG_HZ=1000 That means the kernel is compiled with nohz. If you want to disable it without recompiling the kernel, you can add nohz=off to the kernel boot options. (getting off topic on this list) > chrony was assuming 100hz. That's ok, that's the HZ visible from user space (getconf CLK_TCK), which is fixed per arch to not break the programs. > Also, the description of log files in chrony.txt does not match the format > of the files in my log. > Can you provide an interpretation? > 52864df1 000a0aa5 7.278382e-06 9.000000e-06 1.380000e-04 1.066010e-06 > 1.380000e-04 1.130216e-03 1.000000e+00 1 Is that from the statistics log? The line seems to be mangled. > "Measurements" does not log? The chrony measurements log is for network measurements, not reference clocks. -- Miroslav Lichvar |
From: Richard C. <ric...@gm...> - 2013-11-16 06:23:47
|
On Fri, Nov 15, 2013 at 11:45:45AM -0500, Rich Schmidt wrote: > Miroslav, > My kernel configuration of linux 3.12 has: > > CONFIG_TICK_ONESHOT=y > CONFIG_NO_HZ_COMMON=y > # CONFIG_HZ_PERIODIC is not set > CONFIG_NO_HZ_IDLE=y That explains the poor phc2sys performance. Miroslav has identified a time keeping bug in the Linux kernel that appears when this option is enabled. > Do I need to recompile the kernel with 100hz? You don't need to recompile. You can add "nohz=off" to the kernel command line (e.g. from the grub boot menu). Please do this and then rerun your phc2sys tests. I expect that this will make a huge improvement. Thanks, Richard |
From: Rich S. <sch...@gm...> - 2013-11-12 17:09:08
|
Nov 11 21:42:50 2757.866 261267301 s0 freq +0 4722 Nov 11 21:43:22 2789.866 277500554 s1 freq +7285 4722 Nov 11 21:43:54 2821.867 3462 +7375 4715 Nov 11 21:44:26 2853.867 5913 +7453 4710 ... Nov 11 23:12:26 8133.897 -12638 +7082 4711 Nov 11 23:12:58 8165.897 10997 +7645 4719 Nov 11 23:13:30 8197.897 -70368744166426 -500000 4720 Nov 11 23:14:02 8229.897 3518453432642 +500000 4717 Nov 11 23:14:34 8261.897 3512213078157 +500000 4704 Nov 11 23:15:06 8293.898 3492986001963 +500000 4707 ... Nov 11 23:55:38 10725.911 2031727550676 +500000 4787 Nov 11 23:56:10 10757.911 2012500472205 +500000 4702 ... Nov 12 00:34:34 13061.923 628150600222 +500000 4697 Nov 12 00:35:06 13093.923 608923535029 +500000 4707 Nov 12 00:35:38 13125.923 -69779047728383 -500000 4703 Nov 12 00:36:10 13157.923 -2315629608 -500000 4712 Nov 12 00:36:42 13189.924 -2364688921 -500000 4698 ... Nov 12 02:06:50 18597.958 21561 +7597 4710 Nov 12 02:07:22 18629.958 18370 +7600 4710 Nov 12 02:07:54 18661.959 13352 +7543 4711 Nov 12 02:08:26 18693.959 11357 +7545 4715 Nov 12 02:08:58 18725.959 8546 +7517 4705 Nov 12 02:09:30 18757.959 7370 +7521 4720 Nov 12 02:10:02 18789.959 3376 +7446 4710 Nov 12 02:10:34 18821.959 5736 +7521 4700 Nov 12 02:11:06 18853.960 2087 +7449 4726 Nov 12 02:12:42 18949.960 554 +7444 4830 Nov 12 02:13:14 18981.960 -187 +7427 4700 Nov 12 02:13:46 19013.961 3278 +7516 4715 Nov 12 02:14:18 19045.961 -979 +7418 4711 ... Nov 12 02:26:34 19781.965 -2503 +7397 4701 Nov 12 02:27:06 19813.965 -986 +7426 4701 Nov 12 02:27:38 19845.965 4347 +7561 4720 Nov 12 02:28:10 19877.965 -70368744179149 -500000 4710 Nov 12 02:28:42 19909.966 3518453449519 +500000 4711 Nov 12 02:29:14 19941.966 3506419020568 +500000 4857 Nov 12 02:29:46 19973.966 3487191947960 +500000 4697 ... Nov 12 03:17:46 22853.979 1756756010257 +500000 4687 Nov 12 03:18:18 22885.979 1737528941707 +500000 4717 ... Nov 12 04:11:06 26053.999 -963261146 -500000 4712 Nov 12 04:11:38 26085.999 -947030769 -500000 4702 Nov 12 04:12:10 26118.000 -930800188 -500000 4722 Nov 12 04:12:42 26150.000 -70369658744841 -500000 4722 Nov 12 04:13:14 26182.000 -898334456 -500000 4723 ... Nov 12 04:53:14 28582.017 31184 +7742 4705 Nov 12 04:53:46 28614.017 16720 +7491 4700 Nov 12 04:54:18 28646.017 21197 +7674 4705 Nov 12 04:54:50 28678.017 11083 +7496 4705 Nov 12 04:55:22 28710.017 12842 +7586 4700 Nov 12 04:55:54 28742.018 8813 +7533 4720 Nov 12 04:56:26 28774.018 8074 +7549 4715 Nov 12 04:56:58 28806.018 4479 +7488 4720 Nov 12 04:57:30 28838.018 5512 +7533 4704 ... Nov 12 05:05:30 29318.021 1148 +7492 4720 Nov 12 05:06:02 29350.021 -1275 +7433 4705 Nov 12 05:06:34 29382.021 2896 +7537 4710 Nov 12 05:07:06 29414.021 344 +7482 4720 Nov 12 05:07:38 29446.022 798 +7495 4700 ... Nov 12 06:31:22 34470.050 -1427 +7465 4710 Nov 12 06:31:54 34502.050 7969 +7703 4710 Nov 12 06:32:26 34534.050 94 +7531 4710 Nov 12 06:32:58 34566.051 -1839 +7481 4720 Nov 12 06:33:30 34598.051 -70368744178218 -500000 4710 Nov 12 06:34:02 34630.051 16225279 +429378 4712 Nov 12 06:34:34 34662.051 2715379 +143022 4703 Nov 12 06:35:06 34694.051 -1613228 +41340 4714 Nov 12 06:35:38 34726.052 -2698087 +6681 4699 Nov 12 06:36:10 34758.052 -2668799 -3350 4709 Nov 12 06:36:42 34790.052 -2331560 -5257 4710 Nov 12 06:37:14 34822.052 -1918316 -3839 4710 Nov 12 06:37:46 34854.052 -1554358 -2049 4711 Nov 12 06:38:18 34886.053 -1249295 -335 4711 ... Nov 12 06:54:50 35878.058 -1772 +7460 4710 Nov 12 06:55:22 35910.058 2787 +7572 4720 Nov 12 06:55:54 35942.059 -1782 +7464 4705 Nov 12 06:56:26 35974.059 -2391 +7441 4689 Nov 12 06:56:58 36006.059 349 +7503 4710 Nov 12 06:57:30 36038.059 882 +7518 4720 Nov 12 06:58:02 36070.059 3952 +7602 4715 Nov 12 06:58:34 36102.059 -7383 +7323 4706 Nov 12 06:59:06 36134.060 9864 +7742 4709 ... Nov 12 11:49:46 53574.156 471 +7491 4710 Nov 12 11:50:18 53606.156 18413 +7959 4699 Nov 12 11:50:50 53638.156 7464 +7748 4699 Nov 12 11:51:22 53670.157 -13980 +7221 4705 Nov 12 11:51:54 53702.157 -10936 +7244 4720 Nov 12 11:52:26 53734.157 -4285 +7373 4871 Nov 12 11:52:58 53766.157 -4120 +7360 4720 Nov 12 11:53:30 53798.157 7780 +7653 4700 Nov 12 11:54:02 53830.157 14539 +7860 4699 Nov 12 11:54:34 53862.158 -70368744196527 -500000 4791 Nov 12 11:55:06 53894.158 16243645 +429875 4712 Nov 12 11:55:38 53926.158 2725650 +143382 4687 Nov 12 11:56:10 53958.158 -1634519 +40920 4719 ... Nov 12 12:25:30 55718.168 12079 +7799 4869 Nov 12 12:26:02 55750.168 -15486 +7130 4710 ... Nov 12 12:28:10 55878.169 831 +7576 4720 Nov 12 12:28:42 55910.169 4129 +7665 4820 Nov 12 12:29:14 55942.169 -17983 +7107 4790 ... Nov 12 15:52:58 68166.235 2546 +7603 4865 Nov 12 15:53:30 68198.236 1091 +7575 4700 Nov 12 15:54:02 68230.236 -3903 +7450 4710 Nov 12 15:54:34 68262.236 -4744 +7412 4720 Nov 12 15:55:06 68294.236 5526 +7660 4700 |
From: Richard C. <ric...@gm...> - 2013-11-13 08:46:46
|
On Tue, Nov 12, 2013 at 12:08:58PM -0500, Rich Schmidt wrote: > In an 18 hour test of phc2sys with these parameters: > numactl --physcpubind=0 --localalloc phc2sys -s /dev/ptp1 -l 7 -R 0.03125 > -O 0 -P 0.022 -I 0.004 > on four occasions phc2sys failed with large phase offsets (19.55 hours!). Thanks for running these tests and posting the results. The strange events are very infrequent. I'll see if I can set up a longer test with my own 82574 card. The big jumps definitely look like either a driver, kernel, or main board issue, or that someone is messing with the system time. Are you sure that the Spectracom drivers/libraries/services are not interfering in some way? Thanks, Richard |
From: Richard C. <ric...@gm...> - 2013-11-14 11:56:16
|
On Tue, Nov 12, 2013 at 12:08:58PM -0500, Rich Schmidt wrote: > > Even with the large phase offsets excluded, phc2sys performed poorly as a > timing standard, 50 microseconds peak-peak, standard deviation 8.1 > microseconds. See phc2sys-offset.png and phc2sys-offs2.png for a look at > the best data. Rich, Is your kernel configured with CONFIG_NO_HZ_IDLE? That is the likely cause of the time error jitter. Thanks, Richard PS. I think I have been referring to you as "Rick" in some other mails. Please forgive me for that. |
From: Miroslav L. <mli...@re...> - 2013-11-12 18:09:20
|
On Tue, Nov 12, 2013 at 12:08:58PM -0500, Rich Schmidt wrote: > In an 18 hour test of phc2sys with these parameters: > numactl --physcpubind=0 --localalloc phc2sys -s /dev/ptp1 -l 7 -R 0.03125 > -O 0 -P 0.022 -I 0.004 > Even with the large phase offsets excluded, phc2sys performed poorly as a > timing standard, 50 microseconds peak-peak, standard deviation 8.1 > microseconds. See phc2sys-offset.png and phc2sys-offs2.png for a look at > the best data. Just curious, why are you using 32s update interval? If the clock is not very stable (e.g. rapid changes in the temperature), you may be hitting the limit of what's possible. I'd expect better results with a shorter interval. -- Miroslav Lichvar |
From: Rich S. <sch...@gm...> - 2013-11-12 19:28:46
|
We use 16s update interval on our NTP servers that are synced to PTP time code using a Spectracom PCIe PTP card, and our system clock jitter is around 700 nanoseconds! Frequency changes due to temperature effects are not rapid in our environment. If you run adjtime too frequently you may end up chasing your tail. Rich Schmidt Time Service Dept. US Naval Observatory On Tue, Nov 12, 2013 at 1:07 PM, Miroslav Lichvar <mli...@re...>wrote: > On Tue, Nov 12, 2013 at 12:08:58PM -0500, Rich Schmidt wrote: > > In an 18 hour test of phc2sys with these parameters: > > numactl --physcpubind=0 --localalloc phc2sys -s /dev/ptp1 -l 7 -R 0.03125 > > -O 0 -P 0.022 -I 0.004 > > > Even with the large phase offsets excluded, phc2sys performed poorly as a > > timing standard, 50 microseconds peak-peak, standard deviation 8.1 > > microseconds. See phc2sys-offset.png and phc2sys-offs2.png for a look at > > the best data. > > Just curious, why are you using 32s update interval? If the clock is > not very stable (e.g. rapid changes in the temperature), you may be > hitting the limit of what's possible. I'd expect better results with a > shorter interval. > > -- > Miroslav Lichvar > |