Re: [Linuxptp-users] clockcheck - need to filter large spurious phase jumps?


Hi Rich,

On Tue, 2013-11-05 at 16:26 -0500, Rich Schmidt wrote:
> This is Rich Schmidt, linuxptp newbie. 
> 
> I am testing linuxptp on this system at the US Naval Observatory:
> 
> Supermicro SYS-5015A-EHF-D525 (Atom)
> 
> Intel 82547L  NICs   driver: e1000e version: 2.5.4-NAPI
> firmware-version: 1.9-0
> Debian with kernel 3.12.0-rc 
> 
> 
> Running:
> Sync PHC to USNO Master Clock via Zyfer Gsync PTP GrandMaster:
> ptp4l -i eth1 -l 7 -s -p /dev/ptp1
> 
> Sync CLOCK_REALTIME to PHC:
> phc2sys -s /dev/ptp1 -L 100000000 -l 7 -R 0.25 -O 0 
> 
> 
> 
> Things seem to work fine for a while, then I get a single large phase
> offset detected by ptp4l.  The  -L freq limit was an attempt to
> control these offsets, but did not help. 
> 
> 
> Are these large phase jumps filtered out by ptp4l?  It seems not,
> because phc2sys sees them. Or is this some unreliability in the Intel
> 82547L NICs?  Is the PHC read failing?   Thank you for your thoughts.
> 
> 
> 
> Here is a sample.  The clock is not being steered by NTP or any other
> program.  
> 

Are you sure? I can't think of anything else controlling the clock, but
something is obviously controlling it as seen in the logs.

> Nov  5 18:12:27 pluto ptp4l: [354666.428] master offset         57 s2 freq  +34356 path delay      6086
> Nov  5 18:12:29 pluto ptp4l: [354668.428] master offset       -139 s2 freq  +34266 path delay      6092
> Nov  5 18:12:30 pluto phc2sys: [354669.993] phc offset      4529 s2 freq   +8805 delay   4715
> Nov  5 18:12:31 pluto ptp4l: [354670.428] master offset        -32 s2 freq  +34299 path delay      6092
> Nov  5 18:12:33 pluto ptp4l: [354672.428] master offset         20 s2 freq  +34320 path delay      6092
> Nov  5 18:12:34 pluto phc2sys: [354673.993] phc offset       470 s2 freq   +7931 delay   4705
> Nov  5 18:12:35 pluto ptp4l: [354674.428] master offset         54 s2 freq  +34340 path delay      6095
> Nov  5 18:12:37 pluto ptp4l: [354676.428] master offset        -15 s2 freq  +34314 path delay      6095
> Nov  5 18:12:38 pluto phc2sys: [354677.993] phc offset     -6992 s2 freq   +3968 delay   4870
> Nov  5 18:12:39 pluto ptp4l: [354678.428] master offset        -19 s2 freq  +34309 path delay      6096
> Nov  5 18:12:41 pluto ptp4l: [354680.429] master offset         55 s2 freq  +34344 path delay      6096
> Nov  5 18:12:42 pluto phc2sys: [354681.994] phc offset     11326 s2 freq  +11945 delay   4715
> Nov  5 18:12:43 pluto ptp4l: [354682.428] master offset        -90 s2 freq  +34279 path delay      6096
> Nov  5 18:12:45 pluto ptp4l: [354684.429] master offset        -49 s2 freq  +34286 path delay      6096
> Nov  5 18:12:46 pluto phc2sys: [354685.994] phc offset -70368744182111 s2 freq -500000 delay   4715
> Nov  5 18:12:47 pluto ptp4l: [354686.428] clockcheck: clock jumped forward or running faster than expected!

This is almost certainly caused by something stepping the clock. Is
your grandmaster on the other end possibly doing something? I can't
think of any other reason this would occur. Are you able to monitor the
grandmaster's state and see whether it jumped?

Since you're using hardware timestamping, nothing should be controlling
the clock on the device except ptp4l, so even a running NTP daemon
shouldn't cause an issue (other than phc2sys and NTP fighting over the
system clock, but that wouldn't show up in the ptp4l logs).

My gut says the driver is somehow resetting the clock to 0 by
accident...

What about the driver? Is that the Debian in-kernel e1000e driver?
Could you try this against the one available on sourceforge.net from
our e1000 project? This could in theory be caused by a bug in the
driver.

Since I am not part of the e1000e team, I don't know the specifics for
that driver... maybe they have some logic that is resetting the register
values incorrectly..
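
One thing that stands out in the log is the size of the bad phc2sys
offset. A quick arithmetic check (a sketch only, using the value copied
from the 18:12:46 line; the helper name is made up for illustration)
shows that its magnitude sits 4447 ns above 2^46 ns, and 4447 ns is in
the same range as the normal offsets before it. That pattern would fit
a single high bit or a counter wrap being mishandled somewhere between
the hardware and the driver, though this is a guess from the numbers
and not a confirmed diagnosis.

```python
# Offset reported by phc2sys at 18:12:46 in the quoted log, in nanoseconds.
offset_ns = -70368744182111


def nearest_power_of_two(value):
    """Split abs(value) into 2**exponent + remainder, picking the
    power of two that leaves the smallest remainder.

    Hypothetical helper for illustration; not part of linuxptp.
    """
    magnitude = abs(value)
    low = magnitude.bit_length() - 1          # largest power of two <= magnitude
    exponent = min((low, low + 1), key=lambda e: abs(magnitude - 2 ** e))
    return exponent, magnitude - 2 ** exponent


exponent, remainder_ns = nearest_power_of_two(offset_ns)
print(f"|offset| = 2**{exponent} ns + {remainder_ns} ns")
print(f"2**{exponent} ns is about {2 ** exponent / 1e9 / 3600:.2f} hours")
```

2^46 ns is roughly 19.5 hours, so the bad sample does not look like a
clock reset to zero, which would show up as an offset of many years.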

You could also read the clock directly using the PTP test program
(testptp) provided in the Documentation folder of the kernel source.
You might be able to kill ptp4l in time and check what the PTP device
clock reads at that point.
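
If building the kernel test program is inconvenient, a PHC can also be
read from a script. The following is a minimal sketch, assuming Python
3.7 or later on Linux, that /dev/ptp1 is the right device (taken from
the commands quoted above), and that read-only access is enough for
reading the time. The function names are made up for illustration. It
uses the same fd-to-clockid arithmetic as the FD_TO_CLOCKID macro in
the kernel's test program.

```python
import os
import time


def fd_to_clockid(fd):
    """Turn an open /dev/ptpN file descriptor into a dynamic POSIX
    clock id, mirroring the C macro ((~fd) << 3) | 3."""
    return (~fd << 3) | 3


def read_phc_ns(device="/dev/ptp1"):
    """Return the current PHC time in nanoseconds."""
    fd = os.open(device, os.O_RDONLY)
    try:
        return time.clock_gettime_ns(fd_to_clockid(fd))
    finally:
        os.close(fd)


def phc_minus_system_ns(device="/dev/ptp1"):
    """Rough PHC minus CLOCK_REALTIME difference in nanoseconds.

    The two clocks are read back to back, so the result includes the
    read latency; it is only good for spotting large jumps.
    """
    phc_ns = read_phc_ns(device)
    sys_ns = time.clock_gettime_ns(time.CLOCK_REALTIME)
    return phc_ns - sys_ns
```

Calling phc_minus_system_ns() in a loop right after stopping ptp4l and
phc2sys should show whether the device clock itself holds a bogus
value or whether the jump only appears while the daemons are running.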

Could you show us some of the dmesg output as well? It might point to
some other issue, though I'm not really sure.

Regards,
Jake
