Thread: [Linuxptp-users] question regarding clockcheck_sample() @clockcheck.c
PTP IEEE 1588 stack for Linux
Brought to you by:
rcochran
From: Ruei, E. <e-...@ti...> - 2017-10-26 15:54:03
|
To whom it may concern: We observed that the PTP reported "clockcheck: clock jumped" events when we added some CPU/Memory load by using stress-ng. It seemed to us that the problem was triggered by the uncertainty of the mono_interval = (int64_t)mono_ts - cc->last_mono_ts at this function where mono_ts is the local time when this routine is called. There is some uncertainty when the function is called in particular when the system is busy and therefore there will be a large variance of mono_interval. In our test case, the PTP will receive the sync event every 120/130ms, the difference of receive timestamp is very stable, but there could be large difference at the mono_interval and it triggers false clock jump event. // light load ptp4l[1984.369]: interval 130021595 mono_interval 130022231 ptp4l[1984.489]: interval 119977963 mono_interval 119997411 ptp4l[1984.619]: interval 130005312 mono_interval 129992626 ptp4l[1984.739]: interval 120002293 mono_interval 119996434 ptp4l[1984.870]: interval 130287464 mono_interval 130400107 ptp4l[1984.989]: interval 119719964 mono_interval 119621813 ptp4l[1985.119]: interval 129993071 mono_interval 129975709 // two bad intervals ptp4l[1985.262]: interval 119997966 mono_interval 142649231 ptp4l[1985.369]: interval 130029848 mono_interval 107389768 ptp4l[1985.369]: clockcheck: clock jumped forward or running faster than expected! //heavy load ptp4l[1090.977]: interval 119990608 mono_interval 109828955 ptp4l[1091.110]: interval 130000707 mono_interval 133090775 ptp4l[1091.226]: interval 120002438 mono_interval 116894219 ptp4l[1091.373]: interval 130025626 mono_interval 146662543 ptp4l[1091.500]: interval 120022987 mono_interval 126504891 ptp4l[1091.607]: interval 129964127 mono_interval 106861426 ptp4l[1091.608]: clockcheck: clock jumped forward or running faster than expected! Is it expected behavior? Best regards, Eric |
From: Miroslav L. <mli...@re...> - 2017-10-27 05:46:36
|
On Thu, Oct 26, 2017 at 03:40:27PM +0000, Ruei, Eric via Linuxptp-users wrote: > To whom it may concern: > > We observed that the PTP reported "clockcheck: clock jumped" events when we added some CPU/Memory load by using stress-ng. > It seemed to us that the problem was triggered by the uncertainty of the mono_interval = (int64_t)mono_ts - cc->last_mono_ts at this function where mono_ts is the local time when this routine is called. > There is some uncertainty when the function is called in particular when the system is busy and therefore there will be a large variance of mono_interval. > ptp4l[1090.977]: interval 119990608 mono_interval 109828955 > ptp4l[1091.110]: interval 130000707 mono_interval 133090775 > ptp4l[1091.226]: interval 120002438 mono_interval 116894219 > ptp4l[1091.373]: interval 130025626 mono_interval 146662543 > ptp4l[1091.500]: interval 120022987 mono_interval 126504891 > ptp4l[1091.607]: interval 129964127 mono_interval 106861426 > ptp4l[1091.608]: clockcheck: clock jumped forward or running faster than expected! > > Is it expected behavior? If a very short sync interval is used and the process may run up to 20 milliseconds late, then yes, I think it's expected that the clock check in default configuration will have false positives. You may want to increase or disable sanity_freq_limit. Just curious, on what hardware do you see this? -- Miroslav Lichvar |
From: Ruei, E. <e-...@ti...> - 2017-10-27 14:21:11
|
Hi, Miroslav: Thank you so much for your quick and kind response! I uses the TI AM571x IDK EVM. Best regards, Eric -----Original Message----- From: Miroslav Lichvar [mailto:mli...@re...] Sent: Friday, October 27, 2017 1:46 AM To: lin...@li... Subject: [EXTERNAL] Re: [Linuxptp-users] question regarding clockcheck_sample() @clockcheck.c On Thu, Oct 26, 2017 at 03:40:27PM +0000, Ruei, Eric via Linuxptp-users wrote: > To whom it may concern: > > We observed that the PTP reported "clockcheck: clock jumped" events when we added some CPU/Memory load by using stress-ng. > It seemed to us that the problem was triggered by the uncertainty of the mono_interval = (int64_t)mono_ts - cc->last_mono_ts at this function where mono_ts is the local time when this routine is called. > There is some uncertainty when the function is called in particular when the system is busy and therefore there will be a large variance of mono_interval. > ptp4l[1090.977]: interval 119990608 mono_interval 109828955 > ptp4l[1091.110]: interval 130000707 mono_interval 133090775 > ptp4l[1091.226]: interval 120002438 mono_interval 116894219 > ptp4l[1091.373]: interval 130025626 mono_interval 146662543 > ptp4l[1091.500]: interval 120022987 mono_interval 126504891 > ptp4l[1091.607]: interval 129964127 mono_interval 106861426 > ptp4l[1091.608]: clockcheck: clock jumped forward or running faster than expected! > > Is it expected behavior? If a very short sync interval is used and the process may run up to 20 milliseconds late, then yes, I think it's expected that the clock check in default configuration will have false positives. You may want to increase or disable sanity_freq_limit. Just curious, on what hardware do you see this? -- Miroslav Lichvar ------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot _______________________________________________ Linuxptp-users mailing list Lin...@li... https://lists.sourceforge.net/lists/listinfo/linuxptp-users |
From: Richard C. <ric...@gm...> - 2017-10-27 15:38:10
|
On Fri, Oct 27, 2017 at 07:46:26AM +0200, Miroslav Lichvar wrote: > If a very short sync interval is used and the process may run up to 20 > milliseconds late, then yes, I think it's expected that the clock > check in default configuration will have false positives. Right. > You may want to increase or disable sanity_freq_limit. Or reduce latency by running the ptp4l process with sched_fifo and high priority (see chrt(1)). Thanks, Richard |
From: Gary E. M. <ge...@re...> - 2017-10-27 19:44:13
|
Yo All! About a year ago I was running tests feeding time from ptp4l to ntpd using a SHM. I just repeated the experiment after updating my linuxptp to git head, and now no luck. My version: # ptp4l -v 1.8-00063-g5ce04f9 I'm pretty much following the instructions here: http://www.catb.org/gpsd/gpsd-time-service-howto.html#_providing_local_ntp_service_using_ptp Here is my ptp.conf: ********************* [global] # only syslog every 1024 seconds summary_interval 10 # send to to chronyd/ntpd using SHM 2 clock_servo ntpshm ntpshm_segment 2 # set our priority high since we have PPS priority1 10 priority2 10 [eth1] ********************* I start the daemon this way: # ptp4l -s -f /etc/ptp-howto.conf I have several masters setup similarly. Serving local time. My logs show that ptp4l has found a master: Oct 27 12:22:02 spidey ptp4l[25614]: [3704655.074] tx_type 1 not 1 Oct 27 12:22:02 spidey ptp4l[25614]: [3704655.074] rx_filter 1 not 12 Oct 27 12:22:02 spidey ptp4l[25614]: [3704655.074] selected best master clock 0cc47a.fffe.72716b But no new shm in /dev/shm. My /dev/shm is world writeable, with the sticky bit: # ls -ld /dev/shm drwxrwxrwt 2 root root 40 Oct 27 12:26 /dev/shm And ntpshmmon shows my GPS and PPS SHMs, but nothing on SHM 2: # ntpshmmon ntpshmmon version 1 # Name Seen@ Clock Real L Prec sample NTP1 1509132165.355194143 1509132165.000085726 1509132165.000000000 0 -30 sample NTP0 1509132165.700599465 1509132165.699807167 1509132165.703999996 0 -20 sample NTP1 1509132166.000999683 1509132166.000049468 1509132166.000000000 0 -30 sample NTP0 1509132166.721725714 1509132166.721308687 1509132166.703999996 0 -20 I thought this worked last year, I had not touched the config or startup files. What stupid thing did I do to make this not work? RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 ge...@re... Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can’t measure it, you can’t improve it." - Lord Kelvin |
From: Gary E. M. <ge...@re...> - 2017-10-27 19:52:01
|
Yo All! Sorry, a typo: > I start the daemon this way: > > # ptp4l -s -f /etc/ptp-howto.conf I start this way, with a capital S: # ptp4l -S -f /etc/ptp-howto.conf strace shows me a bit of SHM action: # strace -o tmp ptp4l -S -f /etc/ptp-howto.conf shmget(0x4e545032, 96, IPC_CREAT|0600) = 65538 shmat(65538, NULL, 0) = 0x7fc60af87000 I'm out of ideas.... RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 ge...@re... Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can’t measure it, you can’t improve it." - Lord Kelvin |
From: Miroslav L. <mli...@re...> - 2017-10-30 09:21:04
|
On Fri, Oct 27, 2017 at 12:29:40PM -0700, Gary E. Miller wrote: > Yo All! > > About a year ago I was running tests feeding time from ptp4l to > ntpd using a SHM. I just repeated the experiment after updating > my linuxptp to git head, and now no luck. > My logs show that ptp4l has found a master: > > Oct 27 12:22:02 spidey ptp4l[25614]: [3704655.074] tx_type 1 not 1 > Oct 27 12:22:02 spidey ptp4l[25614]: [3704655.074] rx_filter 1 not 12 > Oct 27 12:22:02 spidey ptp4l[25614]: [3704655.074] selected best master clock 0cc47a.fffe.72716b This could be the local clock. Is there a "LISTENING to UNCALIBRATED on RS_SLAVE" message in the log? If not, ptp4l is probably not receiving master's messages, e.g. due to firewall or switch configuration. -- Miroslav Lichvar |
From: Gary E. M. <ge...@re...> - 2017-10-31 03:08:03
|
Yo Miroslav! On Mon, 30 Oct 2017 10:20:54 +0100 Miroslav Lichvar <mli...@re...> wrote: > On Fri, Oct 27, 2017 at 12:29:40PM -0700, Gary E. Miller wrote: > > Yo All! > > > > About a year ago I was running tests feeding time from ptp4l to > > ntpd using a SHM. I just repeated the experiment after updating > > my linuxptp to git head, and now no luck. > > > My logs show that ptp4l has found a master: > > > > Oct 27 12:22:02 spidey ptp4l[25614]: [3704655.074] tx_type 1 not 1 > > Oct 27 12:22:02 spidey ptp4l[25614]: [3704655.074] rx_filter 1 not > > 12 Oct 27 12:22:02 spidey ptp4l[25614]: [3704655.074] selected best > > master clock 0cc47a.fffe.72716b > > This could be the local clock. OK. > Is there a "LISTENING to UNCALIBRATED on RS_SLAVE" message in the log? > If not, ptp4l is probably not receiving master's messages, e.g. due to > firewall or switch configuration. Nope. Can you suggest a working configuration or a howto? I thought it used to work. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 ge...@re... Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can’t measure it, you can’t improve it." - Lord Kelvin |
From: Miroslav L. <mli...@re...> - 2017-10-31 10:52:23
|
On Mon, Oct 30, 2017 at 08:07:49PM -0700, Gary E. Miller wrote: > Miroslav Lichvar <mli...@re...> wrote: > > Is there a "LISTENING to UNCALIBRATED on RS_SLAVE" message in the log? > > If not, ptp4l is probably not receiving master's messages, e.g. due to > > firewall or switch configuration. > > Nope. Can you suggest a working configuration or a howto? I thought > it used to work. A minimal configuration using software timestamping and the NTP SHM segment #2 would be: 1. on master run: ptp4l -S -i eth0 -m 2. on slave run: ptp4l -s -S -i eth0 -m --clock_servo NTPSHM --ntpshm_segment 2 The slave ptp4l should print something like this: ptp4l[3444911.527]: port 1: INITIALIZING to LISTENING on INIT_COMPLETE ptp4l[3444911.528]: port 0: INITIALIZING to LISTENING on INIT_COMPLETE ptp4l[3444916.592]: port 1: new foreign master xxxxxx.xxxx.xxxxxx-1 ptp4l[3444918.313]: selected best master clock yyyyyy.yyyy.yyyyyy ptp4l[3444920.592]: selected best master clock xxxxxx.xxxx.xxxxxx ptp4l[3444920.592]: foreign master not using PTP timescale ptp4l[3444920.592]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE ptp4l[3444922.592]: master offset -333886 s0 freq +0 path delay 87021 ptp4l[3444923.592]: master offset -335317 s0 freq +0 path delay 89726 For each "master offset" message there should be an update of the SHM segment. # ntpshmmon ntpshmmon version 1 # Name Seen@ Clock Real L Prec sample NTP2 1509446684.245300996 1509446683.444564513 1509446683.444868882 0 -30 sample NTP2 1509446684.445200649 1509446684.444584459 1509446684.444876828 0 -30 sample NTP2 1509446685.444777467 1509446685.444555127 1509446685.444893429 0 -30 sample NTP2 1509446686.444997907 1509446686.444603946 1509446686.444903255 0 -30 sample NTP2 1509446687.445353599 1509446687.444604192 1509446687.444912416 0 -30 sample NTP2 1509446688.445426052 1509446688.444619136 1509446688.444923429 0 -30 If the slave does not synchronize, you can try on slave # tcpdump -i eth0 -n port ptp-event or port ptp-general 11:47:10.446485 IP 192.168.101.1.ptp-event > 224.0.1.129.ptp-event: UDP, length 44 11:47:10.446503 IP 192.168.101.1.ptp-general > 224.0.1.129.ptp-general: UDP, length 44 11:47:10.446508 IP 192.168.101.1.ptp-general > 224.0.1.129.ptp-general: UDP, length 64 11:47:11.446489 IP 192.168.101.1.ptp-event > 224.0.1.129.ptp-event: UDP, length 44 11:47:11.446515 IP 192.168.101.1.ptp-general > 224.0.1.129.ptp-general: UDP, length 44 If there are some ptp message from the master, it's likely a firewall rule which blocks the messages from reaching ptp4l. UDP ports 319 and 320 need to be allowed. If no messages appear, the switch (if it's a managed switch) might need to be configured to allow multicast traffic between the hosts. -- Miroslav Lichvar |
From: Gary E. M. <ge...@re...> - 2017-10-31 16:47:52
|
Yo Miroslav! On Tue, 31 Oct 2017 11:52:14 +0100 Miroslav Lichvar <mli...@re...> wrote: > > Nope. Can you suggest a working configuration or a howto? I > > thought it used to work. > > A minimal configuration using software timestamping and the NTP SHM > segment #2 would be: > > 1. on master run: ptp4l -S -i eth0 -m > 2. on slave run: ptp4l -s -S -i eth0 -m --clock_servo NTPSHM > --ntpshm_segment 2 Bit I do not want a master/slave configuration. I want multi-master. Like this: http://www.catb.org/gpsd/gpsd-time-service-howto.html#_providing_local_ntp_service_using_ptp That config used to work. > If there are some ptp message from the master, it's likely a firewall > rule which blocks the messages from reaching ptp4l. UDP ports 319 and > 320 need to be allowed. Nope, my config is all local network, I only firewall at the connection to my ISP. > If no messages appear, the switch (if it's a managed switch) might > need to be configured to allow multicast traffic between the hosts. Nope, dumb switches only. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 ge...@re... Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can’t measure it, you can’t improve it." - Lord Kelvin |