Precision Time Protocol daemon / Discussion / Help: How to know the Accuracy of PTP

Black,

Right, so here comes an essay ;)

Your question touches on many topics. Are you asking about precision or accuracy? In this case, precision is how stable is the offset from your reference clock (master or master's reference) that PTPd is able to maintain, and accuracy is how close the mean/median of the offset is to zero (your reference).

Question number one: what is your precision target? If it's 100us, you're super safe. If it's 10us, you're OK. If it's 5us, read on, if it's less than that, read on and consider hardware assistance.

Checking against NTP can only give you a ballpark figure, and you should not rely on it, especially on a one-shot command like ntpq. It may be better if you run ntp client constantly but with clock control disabled (line: "disable ntp kernel" in ntp.conf). However ntp poll intervals are too infrequent and in no way correspond to the momentary conditions that PTP messages sent in-between were subject to. And finally - is the NTP server also your PTP GM? Again, what is your precision target?

The sad truth is that if you're running ptpd with no hardware assistance whatsoever, you cannot trust the offset from master and one-way delay statistics logged by it. Ptpd will try its best to minimise the offset it thinks it sees, under the assumption that the ingress timestamps (sync and delay response) are captured as soon as the data reaches the host system (straight off the wire). Same with egress timestamps (delay request) - they are assumed to be taken once the data hits the wire. With software timestamping not only it is not true, but the delay between when you've captured the timestamp and when it was taken off / placed on the wire is subject to variation (jitter) and you have no easy way of knowing it. I'm not saying that the numbers you see are random and false, but they are nearly always off by some amount. There are many factors contributing to it, but mainly simply the kernel IP stack latency, presence of firewall filters (like IPTables on Linux), and finally scheduler and interrupts. This gets worse as the server I/O and CPU load increases, but behaves better on realtime kernels. In fact, if loads are high but constant, the filters have the time to adapt to the situation. The worst case is when CPU, I/O and network load changes constantly and rapidly.

With the data out of ptpd output you can more-or-less (tm) establish the precision, but cannot establish the accuracy. The only reliable way to see what ptpd is actually doing and to establish the precision and accuracy (always with some degree of uncertainty though) is to have an external clock reference available in the OS (such as a hardware PTP NIC using the same source as your ptpd GM), and be able to compare its time to the OS time.

So having said all that, the accuracy (base offset from master) I usually see with ptpd (precise to few hundred nanos) using the above measurement method on a standard server is about 10 microseconds. You can try and correct it with the -l option - inbound and outbound latency - but first you would need to know what it actually is, and constant it is not. The overall precision (standard deviation of the offset from reference) you can get over long periods of measurement is within the 2us region (+/-1us) on a simple, quiet network, and about 5us (+/-2.5us) with a more complex, multi-hop network with average CPU and network loads. If you are running hardware (NIC) that gives you ingress and egress timestamping and plug that into ptpd, you can get into the sub-microsecond precision (and accuracy) region. All of it is far better than what you can get with standard NTPd. Funny enough, "Bastardised" NTP with fast polling rates can also give you good results.

The more simple and deterministic (latency wise) hardware and OS you use, the better it gets. On a microcontroller that does nothing but ptpd you will easily get sub-micro precision, and similar accuracy if you have hardware timestamping. In a real-world server OS, you will get events kicking in that will cause timing spikes. Trivial things like log rotation and other scheduled runs like updatedb if you use it, and then the actual applications on the server. They will not be huge spikes, they will be slight fluctuations.

While experimenting with some code that's not in ptpd yet, I was able to achieve sub-microsecond precision - however still with suboptimal accuracy, but there isn't much you can do about it. Well there is, but this involves measuring the duration of every step you take. There is also some code coming to ptpd that will use BPF / pcap to receive and timestamp packets. This is going to further improve accuracy.

Also the OS you're using is important, for example older Linux kernels (like 2.6.18 in RHEL5) did not have nanosecond resolution time system. The clock_gettime() function would take a microsecond value, multiply it by 1000 and sell it to you as a nanosecond field. More recent kernels (RHEL6 2.6.32 for example), actually run the clock subsystem using nanoseconds from start to finish.

There is always some uncertainty, but you can expect ptpd to at least keep the time on multiple slaves in sync with each other, if they run similar hardware and software.

This is an endless topic, but I hope this gives you a good insight into what you can expect, what affects it and why. The only really reliable PTP clock is the one implemented in hardware from A to Z.

So in the end, clear as mud.

Thanks
Woj

Hi Woj,
Thank you for your response - even if this isn´t even my topic ;).
I am currently working on a solution to test the precision of PTP over larger networks and the impact on precision when there are load-changes etc. (universty project). Therefore, your explanation was very good to get an idea of measuring and evaluating the precision.

Our idea is to run one ptp instance on a master, followed by a hub, which splits the connection into two different directions: One "shortcut" directly to the slave and the other connection over a network to the slave. The slave has 2 NICs where both connections from the master arrive. We wanted one ptpd instance to sync with the OS clock and one ptpd instance should only compare to the OS time. Our offset from the network would then be the difference between OS clock and Network-PTP process. We use simple Ubuntu machines and normal NICs for instance, and a +-15us with the shortcut would be OK to begin.

Do you have a suggestion on how it is possible to measure the offset between internal and OS clock (Over log files?)

Looking forward to hear from you!

Greetings,
Joris

Last edit: jolueckenga 2014-01-22

How to know the Accuracy of PTP

Portable, complete and BSD-licenced IEEE 1588 (PTP) implementation

Forums

Help

How to know the Accuracy of PTP

How to know the Accuracy of PTP

Portable, complete and BSD-licenced IEEE 1588 (PTP) implementation

Forums

Help

How to know the Accuracy of PTP document.SUBSCRIPTION_OPTIONS = { "thing": "topic", "subscribed": false, "url": "subscribe", "icon": { "css": "fa fa-envelope-o" } };

How to know the Accuracy of PTP