Thread: [Linuxptp-users] Intel NICs PTP Performance Comparison
PTP IEEE 1588 stack for Linux
From: Joseph M. <jos...@gm...> - 2021-06-17 14:14:38
|
Hi experts,

I'm working with different NIC devices, and I've seen that some of them don't deliver "good" PTP performance (I define "good" below). I ran a short test and put together a comparison table. Some of you may be familiar with the behaviour I've encountered.

The NICs under test are (all Intel): I210, 82574L, I218-V, I219-V.

My setup is as follows:

[NIC under test / MASTER] <---- [2 meter cable] ----> [KSZ9477 / SLAVE]

The Slave is always the KSZ9477.
Only the Master is replaced between tests.
There is only a cable between Master and Slave. No other devices or switches.
Both master and slave support HW timestamping.

NIC            Driver   Version      FW-Version         Test Results
Intel I210     igb      5.4.0-k      3.25, 0x80000678   GOOD
Intel 82574L   e1000e   3.2.6-k      1.8-0              GOOD
Intel I218-V   e1000e   3.2.6-k      0.2-4              BAD
Intel I218-V   e1000e   3.8.7-NAPI   0.2-4              BAD
Intel I219-V   e1000e   3.2.6-k      0.4-4              BAD

Test results description:

GOOD:
Delay is always around 20 nsec (10 nsec jitter), even under heavy traffic.
Offset is always < 200 nsec, even under heavy traffic.

BAD:
Unstable delay.
With no traffic, the delay is around 7500-8000 nsec.
Under heavy traffic it gets smaller, dropping to 800 nsec and stabilizing there.
When the traffic stops, the delay goes back up to 7500-8000 nsec.
Offset can jitter by up to 5000 nsec, but it stabilizes once the delay has stabilized.

The interesting thing I saw with the I218/I219 is that after a while of traffic (1 or 2 minutes), the delay stabilizes at 800 nsec, and when I turn the traffic off, the delay stays at 800 nsec for ~15 seconds and only then starts to go back to its original value of 7500-8000 nsec (and the same happens in the other direction).

I must say it looks more like a feature than a bug (one that maybe I just need to disable). I thought it might be EEE, Low Power Mode or Flow Control (I tried disabling them with ethtool, but the results didn't change; maybe ethtool is not the right way to do that...)

Is anyone familiar with this behaviour on the Intel I218/I219?

Thanks,
Joseph
|
From: Joseph M. <jos...@gm...> - 2021-06-17 19:06:46
|
Just editing the comparison table:

# NIC          # Driver # Version    # FW-Version       # Test-Results
=======================================================================
# Intel I210   # igb    # 5.4.0-k    # 3.25, 0x80000678 # GOOD
# Intel 82574L # e1000e # 3.2.6-k    # 1.8-0            # GOOD
# Intel I218-V # e1000e # 3.2.6-k    # 0.2-4            # BAD
# Intel I218-V # e1000e # 3.8.7-NAPI # 0.2-4            # BAD
# Intel I219-V # e1000e # 3.2.6-k    # 0.4-4            # BAD
|
From: Dale S. <dal...@gm...> - 2021-06-17 22:04:25
|
So how in the world could *more* traffic cause a *smaller* delay?

Very curious.

Wild guess: maybe interrupt coalescing is going on, and with more packets, it's actually responding sooner?

Another guess: that somehow the determination of the timestamps is just plain wrong. Like maybe they were fudged to some value while under heavy load, and are way more off when under light load?

Just guesses, as I said. I don't do 1588 much anymore, but this is very intriguing. Please report back if you ever get to the bottom of this.

-Dale
|
From: Keller, J. E <jac...@in...> - 2021-06-18 00:04:38
|
> So how in the world could *more* traffic cause a *smaller* delay?

EEE low-power Ethernet kicking in, which puts the device into a low power state that has a higher latency to wake up. You could check "ethtool --show-eee" to see if the device supports it and what the status is.

> Wild Guess, maybe interrupt coalescing is going on, and with more
> packets, it's actually responding sooner?

This is unlikely, since timestamps are captured in hardware, so I don't think it should impact the latency with regards to the actual timestamps.

> Another guess, that somehow the determination of the timestamps are
> just plain wrong. Like maybe they were fudged to some value while
> under heavy load, and are way more off when under light load?

I don't think this is the case for Intel devices.
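
To illustrate the EEE explanation, here is a minimal sketch of the standard delay request-response arithmetic (t1 = Sync sent by master, t2 = Sync received by slave, t3 = Delay_Req sent by slave, t4 = Delay_Req received by master). The timestamps are made-up numbers, and the 14 us wake-up latency in the master-to-slave direction is purely an assumption chosen for illustration, not a value measured in this thread. It also assumes the extra latency is added after the hardware timestamp is taken, which has not been confirmed for these NICs.

```python
def mean_path_delay(t1, t2, t3, t4):
    """Mean path delay from the four delay request-response timestamps (ns)."""
    return ((t2 - t1) + (t4 - t3)) / 2


def offset_from_master(t1, t2, t3, t4):
    """Slave offset from master as computed from the same four timestamps (ns)."""
    return (t2 - t1) - mean_path_delay(t1, t2, t3, t4)


TRUE_ONE_WAY_NS = 800   # hypothetical real one-way latency
WAKE_NS = 14_000        # hypothetical wake-up latency, master-to-slave only


def exchange(extra_master_to_slave_ns):
    """Build one timestamp set with both clocks perfectly aligned."""
    t1 = 0
    t2 = t1 + TRUE_ONE_WAY_NS + extra_master_to_slave_ns
    t3 = 100_000
    t4 = t3 + TRUE_ONE_WAY_NS
    return t1, t2, t3, t4


busy = exchange(0)        # link kept awake by traffic
idle = exchange(WAKE_NS)  # link has to wake up before the Sync goes out

print(mean_path_delay(*busy), offset_from_master(*busy))  # 800.0 0.0
print(mean_path_delay(*idle), offset_from_master(*idle))  # 7800.0 7000.0
```

Because the clocks in this sketch are aligned, any non-zero offset is pure measurement error: a one-way extra latency shows up half in the reported delay and half in the reported offset.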
|
From: Keller, J. E <jac...@in...> - 2021-06-18 00:05:54
|
> EEE low power ethernet kicking in, which puts the device into a low power state
> that has a higher latency to wake up. You could check "ethtool --show-eee" to see
> if the device supports it and what the status is.

I missed that you ruled out EEE already. I wonder if the I218 supports it without supporting the options to disable it, though... I'm not certain.

Thanks,
Jake
|
From: Joseph M. <jos...@gm...> - 2021-06-18 09:21:46
|
The original state of the NIC (after reset) is:
ethtool --show-eee eth0
EEE Settings for eth0:
EEE status: enabled - inactive
Tx LPI: 17 (us)
Supported EEE link modes: 100baseT/Full
1000baseT/Full
Advertised EEE link modes: 100baseT/Full
1000baseT/Full
Link partner advertised EEE link modes: Not reported
ethtool -a eth0
Pause parameters for eth0:
Autonegotiate: on
RX: on
TX: on
So my first thought was that it's obviously EEE kicking in.
(If it were flow control in action, I would expect the delay to get
bigger...)
But just to be sure, I ran:
ethtool -A eth0 autoneg off rx off tx off
ethtool -a eth0
Pause parameters for eth0:
Autonegotiate: off
RX: off
TX: off
ethtool --set-eee eth0 eee off
EEE Settings for eth0:
EEE status: disabled
Tx LPI: 17 (us)
Supported EEE link modes: 100baseT/Full
1000baseT/Full
Advertised EEE link modes: 100baseT/Full
1000baseT/Full
Link partner advertised EEE link modes: Not reported
But there was no effect...
Since I still think it's EEE in action, my only guess is that turning EEE
off via ethtool doesn't really work...
I'll try to see if I can dump the value from the relevant register (and not
use ethtool).
I just wonder how this issue was never mentioned before in this forum...
From what I see this NIC is quite common.
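
Before going down to register level, the two "EEE Settings" dumps in this message can be compared mechanically. The following is a minimal sketch that assumes the output layout exactly as pasted here (indentation lost, link modes continued on the next line); `parse_eee`, `eee_enabled` and `changed_fields` are hypothetical helper names, not part of ethtool or linuxptp.

```python
BEFORE = """\
EEE Settings for eth0:
EEE status: enabled - inactive
Tx LPI: 17 (us)
Supported EEE link modes: 100baseT/Full
1000baseT/Full
Advertised EEE link modes: 100baseT/Full
1000baseT/Full
Link partner advertised EEE link modes: Not reported
"""

AFTER = BEFORE.replace("enabled - inactive", "disabled")


def parse_eee(text):
    """Parse the pasted EEE settings text into {field: [values]}."""
    fields = {}
    last_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("EEE Settings for"):
            continue  # skip blank lines and the per-interface header
        if ":" in line:
            key, _, value = line.partition(":")
            last_key = key.strip()
            fields[last_key] = [value.strip()] if value.strip() else []
        elif last_key is not None:
            # Continuation line, e.g. a second link mode.
            fields[last_key].append(line)
    return fields


def eee_enabled(fields):
    """True if the reported EEE status starts with 'enabled'."""
    return fields["EEE status"][0].startswith("enabled")


def changed_fields(before, after):
    """Names of fields whose values differ between two parsed dumps."""
    return [key for key in before if before[key] != after.get(key)]


before, after = parse_eee(BEFORE), parse_eee(AFTER)
print(eee_enabled(before), eee_enabled(after))  # True False
print(changed_fields(before, after))            # ['EEE status']
```

On the outputs pasted above, only the status line changes; the advertised EEE link modes are listed identically before and after "eee off". Whether that has anything to do with the observed delay is not established here.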
|
|
From: Jacob K. <jac...@in...> - 2021-06-18 15:34:55
|
On 6/18/2021 2:10 AM, Joseph Matan wrote:
> But there was no effect...
> Since I still think it's eee in action, my only guess is that turning
> eee off via ethtool doesn't really work...

The only thing I can think of here that would cause this behavior is EEE, but it is possible I am missing something else.

> I'll try to see if I can dump the value from the relevant register (and
> not use ethtool).
> I just wonder how this issue was never mentioned before in this forum...
> From what I see this NIC is quite common.

I guess no one else looked at the delay all that closely?
|
From: Joseph M. <jos...@gm...> - 2021-06-20 16:43:50
|
I posted the issue on the Intel Community forum. I hope they can help.

In the meantime, I'll try to work with a USB-to-Ethernet adapter (one that supports PTP, of course). Does anyone have recommendations?
(I think I should post this question under a different topic...)