Re: [Linuxptp-users] Configuration for boundary clock with on two-port NIC
From: Andre P. <and...@sr...> - 2023-11-20 12:49:39
Hi,

I've tried three different ConnectX-3 NICs now and they all behave the same. To rule out any issues with the GM I tried an Intel i210 as well, and that is spot on with excellent sync. The Mellanox, however, is not.

It's almost as if there is a frequency correction happening inside the NIC every so often. The rms values suddenly go bad, improve as the frequency is adjusted to the GM, and then go really bad again.

ptp4l[447.094]: rms 1580 max 6078 freq +1598846 +/- 4029 delay 983 +/- 3
ptp4l[448.091]: rms 1608 max 6080 freq +1598430 +/- 4117 delay 971 +/- 2
ptp4l[449.089]: rms 111 max 244 freq +1598218 +/- 180 delay 977 +/- 2
ptp4l[450.086]: rms 25 max 39 freq +1598474 +/- 23 delay 983 +/- 2
ptp4l[451.084]: rms 30 max 57 freq +1598512 +/- 23 delay 984 +/- 1
ptp4l[452.081]: rms 17 max 28 freq +1598504 +/- 14 delay 984 +/- 1
ptp4l[453.078]: rms 13 max 24 freq +1598507 +/- 15 delay 985 +/- 2
ptp4l[454.076]: rms 14 max 43 freq +1598510 +/- 26 delay 984 +/- 1
ptp4l[455.073]: rms 7 max 16 freq +1598506 +/- 12 delay 985 +/- 1
ptp4l[456.071]: rms 11 max 18 freq +1598525 +/- 13 delay 983 +/- 1
ptp4l[457.068]: rms 7 max 15 freq +1598519 +/- 14 delay 983 +/- 1
ptp4l[458.066]: rms 6 max 14 freq +1598514 +/- 14 delay 984 +/- 1
ptp4l[459.063]: rms 6 max 13 freq +1598519 +/- 14 delay 985 +/- 0
ptp4l[460.061]: rms 1563 max 5997 freq +1598869 +/- 3991 delay 982 +/- 3

Thanks
Andre

On 20/11/23 10:27, Andre Puschmann wrote:
> Hey,
>
> > How the GM side is configured? Are you writing system time to PHC
> > every second? If so, you can try make the phc free run. Without 1PPS
> > signal connecting to the phc or PTM enabled, it's not recommended to
> > set pmc's time by software, the jitter is quite big.
>
> I am not writing any time to the PHC. I just start ptp4l. Shouldn't that
> be enough to adjust the PHC to the GM's?
>
> > Is the GM and the client connected directly or through a switch? Try
> > connect them directly with an utp or fiber.
>
> The GM is directly connected to port0 of the NIC. And the GM is GPS synced.
>
> > Try the L2 transport. IIRC at least some Mellanox NICs performed
> > worse with UDP transport for some reason.
>
> This is already with L2.
>
>
> Meanwhile I tried with yet another ConnectX-3 card. This time an
> IBM-branded one with FW 2.42.5032, but the results are similar. One new
> thing I've observed is this:
>
> ptp4l[226.813]: rms 69 max 164 freq +1593151 +/- 131 delay 972 +/- 3
> ptp4l[227.811]: rms 31 max 40 freq +1593342 +/- 23 delay 977 +/- 1
> ptp4l[228.810]: rms 1607 max 6189 freq +1593762 +/- 4095 delay 979 +/- 2
> ptp4l[229.816]: rms 23001617550970840 max 65058399023124016 freq -11106166 +/- 33598711 delay 970 +/- 4
> ptp4l[230.695]: clockcheck: clock jumped backward or running slower than expected!
> ptp4l[230.695]: port 1 (enp1s0): SLAVE to UNCALIBRATED on SYNCHRONIZATION_FAULT
> ptp4l[230.821]: rms 65058399528662992 max 65058399969385488 freq -100000000 +/- 0 delay 9404287 +/- 6628445
> ptp4l[231.825]: rms 65058400466045568 max 65058400899446072 freq -100000000 +/- 0 delay 15392460 +/- 2718826
> ptp4l[232.828]: rms 65058401394850600 max 65058401833399432 freq -100000000 +/- 0 delay 12181777 +/- 2190097
> ptp4l[233.831]: rms 65058402330812608 max 65058402771727544 freq -100000000 +/- 0 delay 16575665 +/- 1701941
>
>
> RMS values were like before, but then suddenly increased and now don't
> go back.
>
> Thanks
> Andre
>
>
> On 19/11/23 22:07, Andre Puschmann wrote:
>> Hey,
>>
>> I've been able to get my hands on a ConnectX-3 Pro card and have done
>> some initial testing. The card indeed has a shared PHC for both ports
>> so running ptp4l as BC or TC does indeed work without the jbod option.
>>
>> However, sync performance (i.e. rms values) for the downstream OCs
>> isn't great. And in fact, even the Mellanox as an OC isn't giving great
>> results - rms values jump a lot (and I've tried various PI value
>> combinations).
>>
>> Is anyone else seeing this with Mlx cards as well? Could it be my
>> model or the firmware?
>>
>> Here is the output of an OC config with the card:
>>
>> $ sudo /opt/linuxptp/ptp4l -i enp1s0 -f ~/configs/ptp/oc.cfg -m -l6
>> ptp4l[12737.960]: selected /dev/ptp0 as PTP clock
>> ptp4l[12738.012]: port 1 (enp1s0): INITIALIZING to LISTENING on INIT_COMPLETE
>> ptp4l[12738.012]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on INIT_COMPLETE
>> ptp4l[12738.012]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on INIT_COMPLETE
>> ptp4l[12738.060]: port 1 (enp1s0): new foreign master fcaf6a.fffe.02b447-1
>> ptp4l[12738.314]: selected best master clock fcaf6a.fffe.02b447
>> ptp4l[12738.314]: port 1 (enp1s0): LISTENING to UNCALIBRATED on RS_SLAVE
>> ptp4l[12740.148]: port 1 (enp1s0): minimum delay request interval 2^-4
>> ptp4l[12740.512]: port 1 (enp1s0): UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
>> ptp4l[12741.138]: rms 1450 max 1934 freq +270168 +/- 1641 delay 951 +/- 14
>> ptp4l[12742.139]: rms 129 max 179 freq +268843 +/- 296 delay 963 +/- 11
>> ptp4l[12743.140]: rms 241 max 490 freq +268455 +/- 452 delay 948 +/- 1
>> ptp4l[12744.141]: rms 135 max 180 freq +268381 +/- 25 delay 947 +/- 1
>> ptp4l[12745.142]: rms 1357 max 5277 freq +269064 +/- 3459 delay 950 +/- 1
>> ptp4l[12746.143]: rms 1397 max 5092 freq +268197 +/- 3539 delay 935 +/- 7
>> ptp4l[12747.144]: rms 210 max 417 freq +268048 +/- 243 delay 942 +/- 3
>> ptp4l[12748.145]: rms 15 max 32 freq +268415 +/- 29 delay 947 +/- 2
>> ptp4l[12749.146]: rms 1430 max 5594 freq +269126 +/- 3617 delay 950 +/- 1
>> ptp4l[12750.147]: rms 1391 max 5162 freq +268252 +/- 3543 delay 942 +/- 4
>>
>>
>> Thanks
>> Andre
>>
>>
>> On 2/11/23 17:37, Jacob Keller wrote:
>>>
>>> On 11/2/2023 4:15 AM, Andre Puschmann wrote:
>>>> Hi,
>>>>
>>>> On 2/11/23 4:11, James Clark wrote:
>>>>> I have a dual-port Mellanox ConnectX-3 (specifically MCX312A-XCBT),
>>>>> which has a shared PHC. You can get them for less than $50 on
>>>>> eBay/AliExpress. I had to upgrade the firmware on mine to get PTP
>>>>> support. I haven't yet tried it as a boundary clock.
>>>>
>>>> Excellent. This is very helpful James. I've ordered an MCX312A and B and
>>>> will compare both here. I'll share my results here soon. If you have a
>>>> chance please also share the firmware version you're currently using on
>>>> your NIC.
>>>>
>>>> With my Intel NIC I could get the BC config working but I needed to set
>>>> the twoStepFlag to 1. Otherwise I was getting this for both ports:
>>>>
>>>> ptp4l[1040.180]: ioctl SIOCSHWTSTAMP failed: Numerical result out of range
>>>>
>>>
>>> Yep, that would indicate the device doesn't support one-step mode.
>>>
>>>> Sync quality wasn't great as expected though. I'll repeat with the
>>>> Mellanox once I have them here.
>>>>
>>>> Thanks
>>>> Andre
>>>>
>>>
>>> For Intel NICs, the only products I am aware of which share PHC across
>>> the device are the E800 series devices. Prior devices (E500, and E700,
>>> as well as the gigabit products) do share the same internal oscillator
>>> but due to the register interface each function has to setup its own
>>> clock.
>>>
>>> Thanks,
>>> Jake
>>>
>>>
>>> _______________________________________________
>>> Linuxptp-users mailing list
>>> Lin...@li...
>>> https://lists.sourceforge.net/lists/listinfo/linuxptp-users
>>
>

-- 
Andre Puschmann
Software Radio Systems (SRS)
https://www.srs.io
an...@sr...
PGP/GnuPG key: 0x204A85DFEA324D58
fingerprint: 3924 1C60 D52E 81A2 1F2E 0C9D 204A 85DF EA32 4D58
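
In case it helps anyone compare captures from different cards: below is a minimal sketch for checking whether the rms bursts in ptp4l summary lines like the ones in this thread recur at a regular interval. It is not part of linuxptp. The function names and the 1000 ns burst threshold are assumptions made for illustration, and the regex only covers the "rms ... max ... freq ..." summary format shown above.

```python
import re

# Matches ptp4l summary lines such as:
#   ptp4l[447.094]: rms 1580 max 6078 freq +1598846 +/- 4029 delay 983 +/- 3
SUMMARY_RE = re.compile(
    r"ptp4l\[(?P<ts>\d+\.\d+)\]:\s+rms\s+(?P<rms>\d+)\s+max\s+(?P<max>\d+)\s+"
    r"freq\s+(?P<freq>[+-]?\d+)"
)


def parse_summaries(text):
    """Return (timestamp, rms, max, freq) tuples for every summary line in text."""
    return [
        (float(m["ts"]), int(m["rms"]), int(m["max"]), int(m["freq"]))
        for m in SUMMARY_RE.finditer(text)
    ]


def find_bursts(samples, rms_threshold=1000):
    """Return the timestamp at which each run of high-rms samples starts.

    rms_threshold is an assumed cut-off in ns; adjust it for your setup.
    """
    starts = []
    in_burst = False
    for ts, rms, _max, _freq in samples:
        if rms >= rms_threshold:
            if not in_burst:
                starts.append(ts)
            in_burst = True
        else:
            in_burst = False
    return starts


def burst_intervals(starts):
    """Return the time in seconds between consecutive burst starts."""
    return [round(b - a, 3) for a, b in zip(starts, starts[1:])]


if __name__ == "__main__":
    sample = """
    ptp4l[447.094]: rms 1580 max 6078 freq +1598846 +/- 4029 delay 983 +/- 3
    ptp4l[448.091]: rms 1608 max 6080 freq +1598430 +/- 4117 delay 971 +/- 2
    ptp4l[449.089]: rms 111 max 244 freq +1598218 +/- 180 delay 977 +/- 2
    ptp4l[459.063]: rms 6 max 13 freq +1598519 +/- 14 delay 985 +/- 0
    ptp4l[460.061]: rms 1563 max 5997 freq +1598869 +/- 3991 delay 982 +/- 3
    """
    print(burst_intervals(find_bursts(parse_summaries(sample))))
```

A constant interval between bursts would point towards something periodic in the NIC or driver rather than the servo settings, but that is only a guess until more captures have been compared.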