Re: [Linuxptp-users] clockcheck - need to filter large spurious phase jumps?
From: Rich S. <sch...@gm...> - 2013-11-05 23:50:42
Thanks, Jake.

Well, it is a jump of 19.55 hours in the last example that you note. We
would notice that on the GrandMaster. These jumps happen four or five
times per hour at random times.

I am using a newer e1000e driver than the sourceforge driver (oh, oh):
2.5.4-NAPI, from the Intel site. I will try the sourceforge version next.

# testptp -c -d /dev/ptp1
capabilities:
  599999999 maximum frequency adjustment (ppb)
  0 programmable alarms
  0 external time stamp channels
  0 programmable periodic signals
  0 pulse per second

# testptp -g -d /dev/ptp1;date
clock time: 1383695290.199580197 or Tue Nov 5 23:48:10 2013
Tue Nov 5 23:48:10 UTC 2013

# ethtool -T eth1
Time stamping parameters for eth1:
Capabilities:
  hardware-transmit     (SOF_TIMESTAMPING_TX_HARDWARE)
  software-transmit     (SOF_TIMESTAMPING_TX_SOFTWARE)
  hardware-receive      (SOF_TIMESTAMPING_RX_HARDWARE)
  software-receive      (SOF_TIMESTAMPING_RX_SOFTWARE)
  software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
  hardware-raw-clock    (SOF_TIMESTAMPING_RAW_HARDWARE)
PTP Hardware Clock: 1
Hardware Transmit Timestamp Modes:
  off                   (HWTSTAMP_TX_OFF)
  on                    (HWTSTAMP_TX_ON)
Hardware Receive Filter Modes:
  none                  (HWTSTAMP_FILTER_NONE)
  all                   (HWTSTAMP_FILTER_ALL)
  ptpv1-l4-sync         (HWTSTAMP_FILTER_PTP_V1_L4_SYNC)
  ptpv1-l4-delay-req    (HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ)
  ptpv2-l4-sync         (HWTSTAMP_FILTER_PTP_V2_L4_SYNC)
  ptpv2-l4-delay-req    (HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ)
  ptpv2-l2-sync         (HWTSTAMP_FILTER_PTP_V2_L2_SYNC)
  ptpv2-l2-delay-req    (HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ)
  ptpv2-event           (HWTSTAMP_FILTER_PTP_V2_EVENT)
  ptpv2-sync            (HWTSTAMP_FILTER_PTP_V2_SYNC)
  ptpv2-delay-req       (HWTSTAMP_FILTER_PTP_V2_DELAY_REQ)

My real question is: can the clock_sanity check in linuxptp filter out
crazy big offsets that are, say, greater than 3 s.d. from the mean?

On Tue, Nov 5, 2013 at 5:02 PM, Keller, Jacob E <jac...@in...> wrote:

> Hi Rich,
>
> On Tue, 2013-11-05 at 16:26 -0500, Rich Schmidt wrote:
> > This is Rich Schmidt, linuxptp newbie.
> >
> > I am testing linuxptp on this system at the US Naval Observatory:
> >
> > Supermicro SYS-5015A-EHF-D525 (Atom)
> >
> > Intel 82547L NICs driver: e1000e version: 2.5.4-NAPI
> > firmware-version: 1.9-0
> > Debian with kernel 3.12.0-rc
> >
> >
> > Running:
> > Sync PHC to USNO Master Clock via Zyfer Gsync PTP GrandMaster:
> > ptp4l -i eth1 -l 7 -s -p /dev/ptp1
> >
> > Sync CLOCK_REALTIME to PHC:
> > phc2sys -s /dev/ptp1 -L 100000000 -l 7 -R 0.25 -O 0
> >
> >
> >
> > Things seem to work fine for a while, then I get a single large phase
> > offset detected by ptp4l. The -L freq limit was an attempt to
> > control these offsets, but did not help.
> >
> >
> > Are these large phase jumps filtered out by ptp4l? It seems not,
> > because phc2sys sees them. Or is this some unreliability in the Intel
> > 82547L NICs? Is the PHC read failing? Thank you for your thoughts.
> >
> >
> >
> > Here is a sample. The clock is not being steered by NTP or any other
> > program.
> >
>
> Are you sure? I can't think of anything else controlling the clock, but
> something is obviously controlling it as seen in the logs.
>
> > Nov 5 18:12:27 pluto ptp4l: [354666.428] master offset 57 s2 freq +34356 path delay 6086
> > Nov 5 18:12:29 pluto ptp4l: [354668.428] master offset -139 s2 freq +34266 path delay 6092
> > Nov 5 18:12:30 pluto phc2sys: [354669.993] phc offset 4529 s2 freq +8805 delay 4715
> > Nov 5 18:12:31 pluto ptp4l: [354670.428] master offset -32 s2 freq +34299 path delay 6092
> > Nov 5 18:12:33 pluto ptp4l: [354672.428] master offset 20 s2 freq +34320 path delay 6092
> > Nov 5 18:12:34 pluto phc2sys: [354673.993] phc offset 470 s2 freq +7931 delay 4705
> > Nov 5 18:12:35 pluto ptp4l: [354674.428] master offset 54 s2 freq +34340 path delay 6095
> > Nov 5 18:12:37 pluto ptp4l: [354676.428] master offset -15 s2 freq +34314 path delay 6095
> > Nov 5 18:12:38 pluto phc2sys: [354677.993] phc offset -6992 s2 freq +3968 delay 4870
> > Nov 5 18:12:39 pluto ptp4l: [354678.428] master offset -19 s2 freq +34309 path delay 6096
> > Nov 5 18:12:41 pluto ptp4l: [354680.429] master offset 55 s2 freq +34344 path delay 6096
> > Nov 5 18:12:42 pluto phc2sys: [354681.994] phc offset 11326 s2 freq +11945 delay 4715
> > Nov 5 18:12:43 pluto ptp4l: [354682.428] master offset -90 s2 freq +34279 path delay 6096
> > Nov 5 18:12:45 pluto ptp4l: [354684.429] master offset -49 s2 freq +34286 path delay 6096
> > Nov 5 18:12:46 pluto phc2sys: [354685.994] phc offset -70368744182111 s2 freq -500000 delay 4715
> > Nov 5 18:12:47 pluto ptp4l: [354686.428] clockcheck: clock jumped forward or running faster than expected!
>
> This should pretty much be caused by something managing the clock
> causing a jump. Possibly your grand master on the other end is doing
> something? I can't think of any other reason this would occur... Do you
> have the ability to monitor the grand master state and see if it was
> jumped?
>
> Since you're doing hardware timestamping, nothing would control the
> clock on the device except ptp4l.. so even NTP running shouldn't cause
> an issue (other than phc2sys trying to interfere with it... but that
> wouldn't be in the ptp4l logs)
>
> My gut says the driver is resetting the clock to 0 somehow on
> accident...
>
> What about the driver, what version are you using? The debian in-kernel
> e1000e driver? Could you try this against the one available on
> sourceforge.net from our e1000 project? This could theoretically be
> caused by a bug in the driver..
>
> Since I am not part of the e1000e team, I don't know the specifics for
> that driver... maybe they have some logic that is resetting the register
> values incorrectly..
>
> You could also check the output of the clock directly by using the ptp
> test program provided in the Documentation folder in the kernel source..
> you might be able to kill ptp4l in time and check to see what the value
> of the ptp device clock says it is at that point...
>
> Could you show us some of the dmesg output as well? Maybe that might
> indicate some other issue occurring.. I'm not really sure..
>
> Regards,
> Jake
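
On the question above about dropping offsets that are more than 3 s.d. from the mean, here is a minimal sketch of that idea. It is written in Python rather than linuxptp's C and is not a description of what ptp4l or phc2sys actually do. The function name filter_offsets and the window and warmup parameters are hypothetical choices made for illustration. The sample values are the phc2sys offsets from the quoted log.

```python
from collections import deque
from statistics import mean, stdev


def filter_offsets(offsets, n_sigma=3.0, window=16, warmup=4):
    """Split offsets (ns) into accepted and rejected samples.

    Hypothetical helper, not part of linuxptp. A sample is rejected
    when it lies more than n_sigma sample standard deviations from
    the mean of the most recently accepted samples. Rejected samples
    are not added to the history, so a single spike does not inflate
    the statistics used to judge the next sample.
    """
    history = deque(maxlen=window)
    accepted, rejected = [], []
    for x in offsets:
        # Only start judging once enough samples have been collected.
        if len(history) >= warmup:
            mu = mean(history)
            sigma = stdev(history)
            if abs(x - mu) > n_sigma * sigma:
                rejected.append(x)
                continue
        history.append(x)
        accepted.append(x)
    return accepted, rejected


# phc2sys "phc offset" values in nanoseconds, copied from the log.
samples = [4529, 470, -6992, 11326, -70368744182111]
accepted, rejected = filter_offsets(samples)
print("accepted:", accepted)
print("rejected:", rejected)

# Size of the rejected jump, in hours and relative to 2**46 ns.
jump_ns = abs(rejected[0])
print("jump: %.2f hours" % (jump_ns / 1e9 / 3600))
print("jump - 2**46 ns:", jump_ns - 2**46)
```

With the logged samples, the four ordinary offsets are accepted and the large one is rejected. The rejected value works out to about 19.55 hours, the figure given in the reply, and its magnitude differs from 2**46 ns by only 4447 ns, which is an arithmetic observation and not a diagnosis. A filter like this only hides the symptom: it would also discard a genuine step in the clock, and if the recent samples happen to be identical the standard deviation is zero and every differing sample is rejected, so a real implementation would need a floor on sigma and some way to accept a persistent change.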