[Linuxptp-users] ptp4l confusing UTC offset of different masters
PTP IEEE 1588 stack for Linux
Brought to you by:
rcochran
|
From: Luke B. <luk...@lm...> - 2019-04-30 16:39:44
|
Hello, I've got some odd PTP behaviour from a Metamako switch (which runs it's own version of linuxptp-1.9), and have reproduced similar behaviour with linuxptp-2.0. We've got a new Grand Master appliance on trial, an Orelia VelaSync 2. It's plugged into a LAN in our lab along side it's predecessor, the VelaSync 1 (which is really FSM Labs' TimeKeeper Server). These masters deal with the UTC offset and leap seconds differently. The VS1 / TimeKeeper takes the inventive approach of publishing PTP messages containing UTC time and a UTC offset of zero (it does this so it can be in control of leap second slewing). The VS2 simply publishes TAI time with a UTC offset of 37 seconds. As I understand it, net result should be the same: UTC - 0s and TAI - 37s. I've set the VS2 priority1 field to 127, higher than the VS1's priority1 of 128 (the point of the exercise is to test the VS2). What appears to happen is ptp4l first tries to calibrate to the VS1 (the VS1 sends a boat load more Announce messages than the VS2), and sets the UTC offset to zero: ptp4l[419887.425]: selected best master clock 248a07.fffe.fdb701 ptp4l[419887.425]: updating UTC offset to 0 ptp4l[419887.425]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE ptp4l[419888.699]: rms 172 max 221 freq -24777 +/- 102 delay 2141 +/- 7 ptp4l[419889.727]: rms 96 max 141 freq -24730 +/- 28 delay 2142 +/- 4 ptp4l[419890.739]: rms 54 max 79 freq -24747 +/- 41 delay 2145 +/- 3 Not long after the VS2 is discovered, and selected as the best clock. The new UTC offset flag is set to match the different time in the Follow_Up messages, but the clock jumps 37 seconds anyway: ptp4l[419888.434]: port 1: new foreign master ec0d9a.fffe.2a5680-1 ptp4l[419892.434]: selected best master clock ffffff.ffff.ffffff ptp4l[419892.434]: updating UTC offset to 37 ptp4l[419892.434]: port 1: SLAVE to UNCALIBRATED on RS_SLAVE ptp4l[419892.811]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED ptp4l[419893.306]: selected best master clock ffffff.ffff.ffffff ptp4l[419893.306]: updating UTC offset to 37 ptp4l[419893.434]: rms 32034559090 max 37000000060 freq -23256182 +/- 13412686 delay 2139 +/- 20 ptp4l[419894.434]: rms 36963221067 max 36976771182 freq -31000000 +/- 0 ptp4l[419894.841]: selected best master clock ffffff.ffff.ffffff ptp4l[419894.841]: updating UTC offset to 37 ptp4l[419895.352]: selected best master clock ffffff.ffff.ffffff ptp4l[419895.352]: updating UTC offset to 37 ptp4l[419895.434]: rms 36930870374 max 36945061895 freq -31000000 +/- 0 delay -1099123 +/- 367074 Note the clock ID of 0xffff here is a bogus value from the VS2 announce messages, it's populating it's grandmasterClockIdentity field with a reserved address. I've already emailed Orelia about it (side question; why is ptp4l logging the grandmasterClockIdentity field in the log file as the "master" and not the clockIdentity field?). When we were just running the VS1, everything was fine, this worked. If you shut off the VS1 (or change it's PTP domain number) so just the VS2 messages are "seen", everything is also fine, it works. The craziness only occurs when there is a mix of messages from both masters on the LAN. Also, this doesn't affect 'ptpd' (which is what most of my clients run), it only affects ptp4l. The other odd thing from the log (and the Metamako switch) is it constantly spits out these messages: selected best master clock ffffff.ffff.ffffff Which comes from handle_state_decision_event() in clock.c (remember, ignore the clock ID), and: updating UTC offset to 37 Which is in clock_update_slave(). That seems odd to me... I would expect to see them during/after a BMC election, not over and over again. The PTP priority hasn't changed, VS2 is still best clock. Part of the condition of the second message is: (c->tds.currentUtcOffset > c->utc_offset) I wonder if the VS1 messages are somehow being processed (even though it's not master), and the UTC offset flag is being polluted and this is what's jamming the clock away from real time? Full ptp4l log file is attached. -- Luke Bigum Head of Systems, LMAX Exchange |