[Linuxptp-devel] phc2sys, timemaster: stale/bad receive time /w ptp4l using 2 ethernet interfaces
PTP IEEE 1588 stack for Linux
Brought to you by:
rcochran
From: Stefan L. <s....@ga...> - 2017-03-21 07:38:03
|
Hello linuxptp community, thank you for your work. We are experiencing a problem using the timemaster application with PTP synchronizing to two ethernet ports. In detail I suspect the problem to be connected to the phc2sys autoconfiguration mode. We are not sure if this is caused by - wrong command line parameters / config files on our side - an inherent disablity of the respective applications for this use case - a bug The levels of likeliness are likely to decrease from top to bottom. ######################### ### setup description ### ######################### Our setup is as follows: The hardware is a custom board with a NXP P2020 PowerPC CPU. The CPU provides 3 MACs and one PTP Clock. The CPUs MACs provide hardware timestamping functionality. The device shall act as a PTP slave only. For redundacy reasons, one PTP Grandmaster Clock Master Clock is attached to our device's eth1, another PTP Grandmaster Clock Master Clock is attached to our device's eth2. The ntp server (a regular server pc) is currently connected to eth0. The timemaster application basically spawns the applications ptp4l, phc2sys and ntpd (instead of cronyd in our case). The effect seen is the same, whether ptp4l, phc2sys and ntpd are called manually or via timemaster. To make this issue more understandable, I will only display the console dumps for the manual calling of the applications here. The issue is not attributed to the timemaster application itself. ########################################## ### problem description / console logs ### ########################################## We are calling the respective applications in the following order (in different ssh sessions): 1. ptp4l: /usr/bin/ptp4l -l 7 -f /var/run/timemaster/ptp4l.0.conf.bak -H -i eth1 -i eth2 -m Config file: # cat /var/run/timemaster/ptp4l.0.conf.bak [global] slaveOnly 1 domainNumber 0 uds_address /var/run/timemaster/ptp4l.0.socket message_tag [0:eth1+eth2] It is very rudimentary and a copy of the config file created by the timemaster when we first ran into this problem. ptp4l correctly syncs /dev/ptp0 to one of the grandmaster clocks: root@cp61850:/var/log# pmc -u -b 0 'GET PORT_DATA_SET' sending: GET PORT_DATA_SET 00d093.fffe.274a5e-1 seq 0 RESPONSE MANAGEMENT PORT_DATA_SET portIdentity 00d093.fffe.274a5e-1 portState SLAVE logMinDelayReqInterval 0 peerMeanPathDelay 0 logAnnounceInterval 1 announceReceiptTimeout 3 logSyncInterval 0 delayMechanism 1 logMinPdelayReqInterval 0 versionNumber 2 00d093.fffe.274a5e-2 seq 0 RESPONSE MANAGEMENT PORT_DATA_SET portIdentity 00d093.fffe.274a5e-2 portState LISTENING logMinDelayReqInterval 0 peerMeanPathDelay 0 logAnnounceInterval 1 announceReceiptTimeout 3 logSyncInterval 0 delayMechanism 1 logMinPdelayReqInterval 0 versionNumber 2 Debug prints of ptp4l are attached. 2. phc2sys: /usr/bin/phc2sys -l 7 -a -r -R 1.00 -m -z /var/run/timemaster/ptp4l.0.socket -t [0:eth1+eth2] -n 0 -E ntpshm -M 0 resulting in following debug output: phc2sys[28965.849]: [0:eth1+eth2] config item (null).uds_address is '/var/run/timemaster/ptp4l.0.socket' phc2sys[28965.850]: [0:eth1+eth2] config item (null).ntpshm_segment is 0 phc2sys[28965.850]: [0:eth1+eth2] config item (null).step_threshold is 0.000000 phc2sys[28965.850]: [0:eth1+eth2] config item (null).first_step_threshold is 0.000020 phc2sys[28965.850]: [0:eth1+eth2] config item (null).max_frequency is 900000000 phc2sys[28965.851]: [0:eth1+eth2] config item (null).ntpshm_segment is 0 phc2sys[28965.851]: [0:eth1+eth2] config item (null).step_threshold is 0.000000 phc2sys[28965.851]: [0:eth1+eth2] config item (null).first_step_threshold is 0.000020 phc2sys[28965.851]: [0:eth1+eth2] config item (null).max_frequency is 900000000 phc2sys[28965.851]: [0:eth1+eth2] config item (null).ntpshm_segment is 0 phc2sys[28965.851]: [0:eth1+eth2] config item (null).step_threshold is 0.000000 phc2sys[28965.851]: [0:eth1+eth2] config item (null).first_step_threshold is 0.000020 phc2sys[28965.851]: [0:eth1+eth2] config item (null).max_frequency is 900000000 phc2sys[28966.852]: [0:eth1+eth2] reconfiguring after port state change phc2sys[28966.852]: [0:eth1+eth2] selecting eth2 for synchronization phc2sys[28966.852]: [0:eth1+eth2] selecting CLOCK_REALTIME for synchronization phc2sys[28966.853]: [0:eth1+eth2] selecting eth1 as the master clock phc2sys[28966.853]: [0:eth1+eth2] phc offset -6283178 s0 freq +0 delay 680 phc2sys[28966.853]: [0:eth1+eth2] phc offset 5 s0 freq +0 delay 2130 phc2sys[28967.853]: [0:eth1+eth2] phc offset -6285375 s0 freq +0 delay 680 phc2sys[28967.853]: [0:eth1+eth2] phc offset 5 s0 freq +0 delay 2170 phc2sys[28968.853]: [0:eth1+eth2] phc offset -6287640 s0 freq +0 delay 680 phc2sys[28968.854]: [0:eth1+eth2] phc offset 5 s0 freq +0 delay 2130 ... 3. ntpd /usr/sbin/ntpd -u ntp:ntp -g -n -c /var/run/timemaster/ntp.conf.bak Config file: # cat /var/run/timemaster/ntp.conf.bak includefile /etc/ntp.conf server 192.168.11.100 minpoll 6 maxpoll 10 server 127.127.28.0 minpoll 2 maxpoll 2 mode 1 fudge 127.127.28.0 refid PTP0 It is very rudimentary and a copy of the config file created by the timemaster when we first ran into this problem. resulting in following debug output: 20 Mar 15:48:15 ntpd[12373]: ntpd 4.2.8p8@1.3265 Wed Jan 25 14:36:50 UTC 2017 (2): Starting 20 Mar 15:48:15 ntpd[12373]: Command line: /usr/sbin/ntpd -u ntp:ntp -g -n -c /var/run/timemaster/ntp.conf.bak 20 Mar 15:48:15 ntpd[12373]: proto: precision = 0.780 usec (-20) 20 Mar 15:48:15 ntpd[12373]: minpoll: provided value (2) is out of range [3-255]) 20 Mar 15:48:15 ntpd[12373]: Listen and drop on 0 v6wildcard [::]:123 20 Mar 15:48:15 ntpd[12373]: Listen and drop on 1 v4wildcard 0.0.0.0:123 20 Mar 15:48:15 ntpd[12373]: Listen normally on 2 lo 127.0.0.1:123 20 Mar 15:48:15 ntpd[12373]: Listen normally on 3 eth0 192.168.11.247:123 20 Mar 15:48:15 ntpd[12373]: Listen normally on 4 eth1 192.168.1.236:123 20 Mar 15:48:15 ntpd[12373]: Listen normally on 5 eth2 192.168.91.220:123 20 Mar 15:48:15 ntpd[12373]: Listen normally on 6 lo [::1]:123 20 Mar 15:48:15 ntpd[12373]: Listen normally on 7 eth0 [fe80::2d0:93ff:fe27:4a5d%3]:123 20 Mar 15:48:15 ntpd[12373]: Listen normally on 8 eth1 [fe80::2d0:93ff:fe27:4a5e%4]:123 20 Mar 15:48:15 ntpd[12373]: Listen normally on 9 eth2 [fe80::2d0:93ff:fe27:4a5f%5]:123 20 Mar 15:48:15 ntpd[12373]: Listening on routing socket on fd #26 for interface updates 20 Mar 15:48:16 ntpd[12373]: SHM: stale/bad receive time, delay=-37s 20 Mar 15:48:17 ntpd[12373]: SHM: stale/bad receive time, delay=-37s 20 Mar 15:48:18 ntpd[12373]: SHM: stale/bad receive time, delay=-37s (continuing..) Obviously there is a problem with the leap seconds. In my understanding, phc2sys should "translate" the PHC's TAI into UTC and provide this to the ntpd via the SHM. "In hardware time stamping mode, ptp4l announces use of PTP time scale and PHC is used for the stamps. That means PHC must follow PTP time scale while system clock follows UTC. Time offset between these two is maintained by phc2sys." [phy2sys manpage] The ntpd only synchronises to the ntp server 192.168.11.100, not the PTP clock: root@cp61850:~# ntpq -c peers remote refid st t when poll reach delay offset jitter ============================================================================== *192.168.11.100 131.188.3.221 2 u 5 64 1 0.384 1.938 0.001 SHM(0) .PTP0. 0 l - 8 0 0.000 0.000 0.000 However the grand master clock and ntp server are set correctly to the correct UTC resp. TAI times. The above behaviour is the same if eth1 or eth2 are unplugged before or after the calling of the applications. For investigation, further tests were performed: ######################################################### ### comparative test 1: ptp4l using only eth1 or eth2 ### ######################################################### For a test, the basic setup (### problem description ###) is repeated with ptp4l using only one ethernet interface: /usr/bin/ptp4l -l 7 -f /var/run/timemaster/ptp4l.0.conf.bak -H -i eth1 -m Order and syntax of the other applications' calling are the same as above. The ntpd error message SHM: stale/bad receive time, delay=-37s does not appear. ntpd correctly syncs to the PTP Clock: root@cp61850:~# ntpq -c peers remote refid st t when poll reach delay offset jitter ============================================================================== +192.168.11.100 131.188.3.222 2 u 4 64 1 0.185 1.010 0.001 *SHM(0) .PTP0. 0 l 3 8 1 0.000 6.986 0.001 The behaviour is the same if eth2 instead of eth1 is used by ptp4l. ############################################################## ### comparative test 2: call phc2sys with -w instead of -a ### ############################################################## For a test, the basic setup (### problem description ###) is repeated with phc2sys flags -w instead of -a (autoconfiguration disabled). # /usr/bin/phc2sys -l 7 -w -r -R 1.00 -m -z /var/run/timemaster/ptp4l.0.socket -t [0:eth1+eth2] -n 0 -E ntpshm -M 0 -s /dev/ptp0 Order and syntax of the other applications' calling are the same as above. The ntpd error message SHM: stale/bad receive time, delay=-37s does not appear. ntpd correctly syncs to the PTP Clock: root@cp61850:~# ntpq -c peers remote refid st t when poll reach delay offset jitter ============================================================================== +192.168.11.100 131.188.3.222 2 u 4 64 1 0.185 1.010 0.001 *SHM(0) .PTP0. 0 l 3 8 1 0.000 6.986 0.001 ################## ### discussion ### ################## It seems there is a problem with the autoconfiguration featue of phc2sys in combination with ptp4l working with two ethernet interfaces. Can anyone confirm this behaviour? Did we make mistakes with the application flags and/or config files, respectively is there a solution by changing these? Best regards, Stefan -- Stefan Lange Entwicklung ___________________________________________________________ GateWare Communications GmbH !ACHTUNG, NEUE ADRESSE! Friggastr. 2, D-90461 Nürnberg Germany Tel: +49 (0) 911 / 946016-190 Fax: +49 (0) 911 / 946016-80 Eingetragene Rechtsform : GmbH Sitz und Registergericht: Amtsgericht Nürnberg HRB 18334 Geschäftsführer : Wolfgang Haubner Web: http://www.gateware.de E-Mail: s....@ga... ___________________________________________________________ Diese E-mail kann vertrauliche Informationen enthalten. Falls Sie diese E-Mail irrtümlich erhalten haben, informieren Sie bitte unverzüglich den Absender und löschen Sie diese E-Mail von jedem Rechner, auch von den Mailservern. Jede Verbreitung des Inhalts, auch die teilweise Verbreitung, ist in diesem Fall untersagt. Außer bei Vorsatz oder grober Fahrlässigkeit schließen wir jegliche Haftung für Verluste oder Schäden aus, die durch Viren befallene Software oder E-Mails verursacht werden. This e-mail may contain confidential information. If you received this e-mail in error, please contact the sender and delete this e-mail from your computer, including your mailservers. Any dissemination, even partly, is prohibited. Except in case of gross negligence or wilful misconduct we accept no liability for any loss or damage caused by software or e-mail viruses. |