From: Jon K. <jo...@nu...> - 2024-08-23 18:06:29
|
Hi e1000/igb folks,

Reaching out about a compile-time error with the IGB driver, kernel 6.6, and trying to compile with -Werror.

In the Intel side of the IGB source, there is a flush_scheduled_work() in igb_remove, which according to git blame from the GitHub side has been there since 1.0.1:
https://github.com/intel/ethernet-linux-igb/blame/d4658bb8811ea60deb8d3398e8682b64dc0e1f07/src/igb_main.c#L3429

This same flush is not present in the mainline driver:
https://github.com/torvalds/linux/blame/master/drivers/net/ethernet/intel/igb/igb_main.c#L3860

This flush now produces a compile-time warning, which turns into a failure with -Werror. There was an announcement on LKML about this for in-tree users a while back, here:
https://lore.kernel.org/all/49925af7-78a8-a3dd-bce6-cfc02e1a9236@I-love.SAKURA.ne.jp/T/#u

Compiling without -Werror and using the driver is just fine, but I wanted to see if this issue has been raised before and if there was any harm in simply removing this call (as the mainline driver appears to be working just fine without it)?

Thanks,
Jon

In file included from ./include/linux/srcu.h:21,
                 from ./include/linux/notifier.h:16,
                 from ./arch/x86/include/asm/uprobes.h:13,
                 from ./include/linux/uprobes.h:49,
                 from ./include/linux/mm_types.h:16,
                 from ./include/linux/buildid.h:5,
                 from ./include/linux/module.h:14,
                 from /builddir/build/BUILD/igb-5.16.9/src/igb_main.c:4:
/builddir/build/BUILD/igb-5.16.9/src/igb_main.c: In function 'igb_remove':
./include/linux/workqueue.h:639:2: error: call to '__warn_flushing_systemwide_wq' declared with attribute warning: Please avoid flushing system-wide workqueues. [-Werror]
  __warn_flushing_systemwide_wq(); \
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/igb-5.16.9/src/igb_main.c:3431:2: note: in expansion of macro 'flush_scheduled_work'
  flush_scheduled_work();
  ^~~~~~~~~~~~~~~~~~~~
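For comparison, the general shape of what the mainline igb driver does in igb_remove() instead of flushing the system-wide workqueue is to stop its own timers and cancel its own work items. A minimal sketch follows; the field names (watchdog_timer, phy_info_timer, reset_task, watchdog_task) are taken from the upstream driver and are assumptions for the out-of-tree source, so treat this as an illustration rather than a drop-in replacement for the flush_scheduled_work() call.

```
/*
 * Sketch of upstream-style teardown in igb_remove(): stop the timers
 * that schedule work, then cancel this driver's own work items rather
 * than flushing the system-wide workqueue.  Field names follow the
 * mainline igb driver and are assumed for the out-of-tree source.
 */
set_bit(__IGB_DOWN, &adapter->state);

del_timer_sync(&adapter->watchdog_timer);
del_timer_sync(&adapter->phy_info_timer);

cancel_work_sync(&adapter->reset_task);
cancel_work_sync(&adapter->watchdog_task);
```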
From: Rustad, M. D <mar...@in...> - 2024-06-14 17:24:02
|
> On May 20, 2024, at 3:33 AM, Alexander Kokorin <zuk...@gm...> wrote:
>
> We have noticed that when we receive TCP packets with the wrong
> checksum from the internet, on the receiving node it goes through,
> NIC compares the checksum, lets packet further and increases the
> kernel counter for RX ERR. It doesn't make sense as nothing can be
> done with such packets.

Don't be so sure. Years ago, when I worked for another company, another team had a product that handled networking traffic. Their product dropped packets that had bad TCP checksums. There were certain features in some software that simply would not work. They spent months trying to figure out why the features failed only when traffic went through their product. Eventually they realized that there were always some TCP checksum errors when that software was used. They stopped dropping the packets with bad checksums, and then it worked fine. Yes, there is (or at least was?) software that abused the TCP checksum to pass some other data through the connection.

The point of the hardware checksum check is to allow software to not have to do it when the checksum is good - to optimize the normal case. It is not to drop the bad packets at a lower level, hiding them from TCP.

--
Mark Rustad (he/him), Ethernet Products Group, Intel Corporation
From: Alexander K. <zuk...@gm...> - 2024-05-20 10:41:45
|
Hello,

We are using E810 NICs in our work, and mostly they are used to pass through network traffic using QinQ. We have noticed that when we receive TCP packets with the wrong checksum from the internet, on the receiving node the packet goes through: the NIC compares the checksum, lets the packet further, and increases the kernel counter for RX ERR. It doesn't make sense, as nothing can be done with such packets.

During the debugging process we found a place in the code where it is not working correctly; we think it goes straight to checksum_fail:

```
if (ipv4 && (rx_status0 & (BIT(ICE_RX_FLEX_DESC_STATUS0_XSUM_IPE_S))))
	goto checksum_fail;
if (ipv6 && (rx_status0 & (BIT(ICE_RX_FLEX_DESC_STATUS0_IPV6EXADD_S))))
	goto checksum_fail;

/* check for L4 errors and handle packets that were not able to be
 * checksummed due to arrival speed
 */
if (rx_status0 & BIT(ICE_RX_FLEX_DESC_STATUS0_XSUM_L4E_S))
	goto checksum_fail;

/* check for outer UDP checksum error in tunneled packets */
if ((rx_status1 & BIT(ICE_RX_FLEX_DESC_STATUS1_NAT_S)) &&
    (rx_status0 & BIT(ICE_RX_FLEX_DESC_STATUS0_XSUM_EUDPE_S)))
	goto checksum_fail;
```

instead of going to the next section, where the checksum for tunneled packets is marked unnecessary and shouldn't increase the counter:

```
/* Only report checksum unnecessary for TCP, UDP, or SCTP */
switch (decoded.inner_prot) {
case ICE_RX_PTYPE_INNER_PROT_TCP:
case ICE_RX_PTYPE_INNER_PROT_UDP:
case ICE_RX_PTYPE_INNER_PROT_SCTP:
	skb->ip_summed = CHECKSUM_UNNECESSARY;
```

One way to deal with it is to disable rx checksumming in ethtool, but we don't want to lose the monitoring for "normal" L2 packets.

Here is an example of such a packet (the only difference is the src and dst IPs and the checksum):

Frame 2088: 70 bytes on wire (560 bits), 70 bytes captured (560 bits) Encapsulation type: Ethernet (1) UTC Arrival Time: May 13, 2024 13:12:17.815209000 UTC [Time shift for this packet: 0.000000000 seconds] [Time delta from previous captured frame: 0.054763000 seconds] [Time delta from previous displayed frame: 3.864796000 seconds] [Time since reference or first frame: 5.132255000 seconds] Frame Number: 2088 Frame Length: 70 bytes (560 bits) Capture Length: 70 bytes (560 bits) [Frame is marked: False] [Frame is ignored: False] [Protocols in frame: eth:ethertype:vlan:ethertype:ip:tcp] [Coloring Rule Name: Checksum Errors] [Coloring Rule String [truncated]: eth.fcs.status=="Bad" ip.checksum.status=="Bad" tcp.checksum.status=="Bad" udp.checksum.status=="Bad" sctp.checksum.status=="Bad" mstp.checksum.status=="Bad" cdp.checksum.status=="Bad" ||]

Ethernet II, Src: , Dst: Destination: Source: Type: 802.1Q Virtual LAN (0x8100)

802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 258 000. .... .... .... = Priority: Best Effort (default) (0) ...0 .... .... .... = DEI: Ineligible .... 0001 0000 0010 = ID: 258 Type: IPv4 (0x0800)

Internet Protocol Version 4, Src: , Dst: 0100 .... = Version: 4 .... 0101 = Header Length: 20 bytes (5) Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT) Total Length: 52 Identification: 0x1763 (5987) 000. .... = Flags: 0x0 ...0 0000 0000 0000 = Fragment Offset: 0 Time to Live: 119 Protocol: TCP (6) Header Checksum: 0x42c9 [correct] [Header checksum status: Good] [Calculated Checksum: 0x42c9] Source Address: Destination Address:

Transmission Control Protocol, Src Port: 64455, Dst Port: 22, Seq: 0, Len: 0 Source Port: 64455 Destination Port: 22 [Stream index: 2] [Conversation completeness: Incomplete, SYN_SENT (1)] [TCP Segment Len: 0] Sequence Number: 0 (relative sequence number) Sequence Number (raw): 4100199712 [Next Sequence Number: 1 (relative sequence number)] Acknowledgment Number: 0 Acknowledgment number (raw): 0 1000 .... = Header Length: 32 bytes (8) Flags: 0x002 (SYN) Window: 64240 [Calculated window size: 64240] Checksum: 0x9bb8 incorrect, should be 0x8524 (maybe caused by "TCP checksum offload"?) [Expert Info (Error/Checksum): Bad checksum [should be 0x8524]] [Bad checksum [should be 0x8524]] [Severity level: Error] [Group: Checksum] [Calculated Checksum: 0x8524] [Checksum Status: Bad] Urgent Pointer: 0

Options: (12 bytes), Maximum segment size, No-Operation (NOP), Window scale, No-Operation (NOP), No-Operation (NOP), SACK permitted TCP Option - Maximum segment size: 1460 bytes TCP Option - No-Operation (NOP) TCP Option - Window scale: 8 (multiply by 256) TCP Option - No-Operation (NOP) TCP Option - No-Operation (NOP) TCP Option - SACK permitted

[Timestamps] [Time since first frame in this TCP stream: 0.000000000 seconds] [Time since previous frame in this TCP stream: 0.000000000 seconds]

Looking forward to your reply on that matter.

--
Kind regards
Alexander Kokorin
From: Ross V. <ro...@ka...> - 2024-03-26 21:48:34
|
Hello,

Is the patch at [1] relevant to the out-of-tree ice driver? It was merged into upstream linux 6.1, but doesn't seem to have been applied out of tree. I checked 1.11.14 and 1.13.7.

That patch came up while investigating an error from the out-of-tree 1.11.14:

[ 2.731841] ice 0000:05:00.0: ice_init_interrupt_scheme failed: -34
[ 2.730619] ice 0000:05:00.0: not enough device MSI-X vectors. requested = 44, available = 1

Thanks,
Ross

[1] - https://lore.kernel.org/netdev/202...@in.../
From: Billie A. (balsup) <ba...@ci...> - 2024-02-13 16:44:03
|
> In the latest i40e 2.24.6, there is inconsistent usage of conditionals leading to compilation errors.

I have resolved the i40e compilation problem for the 6.6.9 kernel with the attached patch. I basically swapped the order of CONFIG_PCI_IOV and CONFIG_DCB, and moved two functions outside of the CONFIG_DCB conditional.

The separate stub of pci_disable_pcie_error_reporting is because this function was deleted in 6.6. I'm not positive that this is the correct solution. I am just guessing that the functionality was added in the pci bus path somewhere. I had to do similar with the ixgbe driver.
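The attached patch itself is not in the archive, so the following is only a hedged sketch of the kind of compatibility stub described above: on kernels where the PCIe AER helpers were removed (6.6, per the message), an out-of-tree driver can supply no-op replacements with the old signatures, since the PCI core handles error-reporting enablement itself on those kernels. The version boundary, placement, and the decision to stub both the enable and disable calls (the enable variant is the one the ixgbe report later in this archive also trips over) are assumptions, not the actual fix.

```
/*
 * Hedged compatibility-stub sketch, not the attached patch.  On kernels
 * that no longer export the PCIe AER helpers, define no-op versions
 * with the old int-returning signatures so existing call sites compile.
 */
#include <linux/version.h>
#include <linux/pci.h>

#if LINUX_VERSION_CODE >= KERNEL_VERSION(6, 6, 0)	/* boundary per the message; assumption */
static inline int pci_enable_pcie_error_reporting(struct pci_dev *dev)
{
	return 0;	/* AER reporting is managed by the PCI core here */
}

static inline int pci_disable_pcie_error_reporting(struct pci_dev *dev)
{
	return 0;
}
#endif
```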
From: Billie A. (balsup) <ba...@ci...> - 2024-02-13 01:01:06
|
In the latest i40e 2.24.6, there is inconsistent usage of conditionals leading to compilation errors. I am porting to the 6.6.9 kernel.

In source src/i40e_virtchnl_pf.c, the function i40e_set_link_state is defined/implemented under three conditions, which must all be met (line 6663):

#ifdef HAVE_NDO_SET_VF_LINK_STATE
#ifdef CONFIG_DCB
#ifdef CONFIG_PCI_IOV

However, it is subsequently used by function i40e_set_vf_enable under the single conditional at line 7595:

#ifdef CONFIG_PCI_IOV

Similarly, it is set in the i40e_vfd_ops table under the same conditional at line 9490:

#ifdef CONFIG_PCI_IOV
	.get_link_state = i40e_get_link_state,
	.set_link_state = i40e_set_link_state,
#endif

The inconsistency leads to errors during compilation if CONFIG_PCI_IOV is defined and either HAVE_NDO_SET_VF_LINK_STATE or CONFIG_DCB is not. In my case, CONFIG_DCB is not enabled, and I would prefer not to enable it in my kernel config.

What is the recommended solution? It is not clear to me why CONFIG_DCB is necessary. Should the two usages under a single #ifdef CONFIG_PCI_IOV be changed to require all three? e.g.

#if defined(HAVE_NDO_SET_VF_LINK_STATE) && defined(CONFIG_DCB) && defined(CONFIG_PCI_IOV)

Or perhaps can the two functions i40e_set_link_state and i40e_configure_vf_link be moved outside of the CONFIG_DCB check? Or is there another recommended solution?
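For what it's worth, the first option asked about above just means repeating the combined guard at the two call sites. A minimal sketch of the i40e_vfd_ops entry with that guard, assembled from the fragments quoted in this message and untested against the 2.24.6 tree, would be:

```
/*
 * Sketch of the "require all three" option: the call sites use the
 * same combined guard as the definition of i40e_set_link_state.
 * Assembled from the fragments quoted above; untested.
 */
#if defined(HAVE_NDO_SET_VF_LINK_STATE) && defined(CONFIG_DCB) && \
    defined(CONFIG_PCI_IOV)
	.get_link_state = i40e_get_link_state,
	.set_link_state = i40e_set_link_state,
#endif
```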
From: Skyler M. <sm+...@sk...> - 2024-01-18 21:26:25
|
Hi there,

As we can see here https://github.com/samipsolutions/vyos-build/actions/runs/7564805403/job/20599526418#step:20:117 I'm unable to compile the driver due to it complaining about `pci_disable_pcie_error_reporting` and the enable version of that too. This is with ixgbe 5.19.9 source, and 5.19.6 at least used to compile.

I think it's the result of `-Werror=implicit-function-declaration`, but I'm not sure where it gets that, as I'm using the vyos build container for this.

Any ideas as to what to try to fix it would be much appreciated.

Skyler
From: Jesse B. <jes...@in...> - 2024-01-18 20:32:58
|
On 1/11/2024 4:21 AM, Kum...@sw... wrote:
> Hi,
>
> Unlike the previous releases for the drivers, we don’t see the column for version anymore for the Intel(R) Gigabit Ethernet Network Driver in SUSE Linux (SLE15-SP4) when doing modinfo igb.
>
> Is this something expected and if yes, is there any other way to get the igb driver versions? For older SUSE installations we have, we see 5.6.0-k or something like that for igb driver versions.

Hi Mohit,

The Intel out-of-tree (OOT) drivers (like the one you download from sourceforge) have a version number in them, but in the upstream, version numbers were removed by the kernel community, and the version is equivalent to the kernel the driver was released with. If there was some reason you thought you needed a driver version, please let us know.

The reason the kernel community removed the driver versions from upstream (and therefore from consumers of upstream, like the SLES distro you mention) is that the version numbers were misleading, wrong, or not kept up to date. Basically, the idea is that comparing in-kernel to OOT using a version number is not a good idea, as the drivers are not the same; they're two different products released at different times, with differing functionality.

If you need the specific upstream commit that igb was updated to in the SLE15 SP4 release, please contact SuSE.

Hope this helps!
Jesse
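The difference Jesse describes shows up in the module metadata: the out-of-tree source exports an explicit version string, while the in-tree driver no longer does, so modinfo prints no version line and the module is effectively identified by the kernel release it shipped with. A minimal illustration (not the actual igb source; the version string is a placeholder) is:

```
#include <linux/module.h>

/* Out-of-tree style: an explicit version string becomes the
 * "version:" line shown by `modinfo igb`. */
#define DRV_VERSION "x.y.z"		/* placeholder, not a real release */
MODULE_VERSION(DRV_VERSION);

/* In-tree style: MODULE_VERSION() is omitted entirely, so modinfo
 * shows no "version:" line and the driver is identified by the
 * kernel it shipped with (the vermagic field). */
```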
From: <Kum...@sw...> - 2024-01-11 12:34:32
|
Hi,

Unlike the previous releases for the drivers, we don’t see the column for version anymore for the Intel(R) Gigabit Ethernet Network Driver in SUSE Linux (SLE15-SP4) when doing modinfo igb.

Is this something expected, and if yes, is there any other way to get the igb driver versions? For older SUSE installations we have, we see 5.6.0-k or something like that for igb driver versions.

Br,
Mohit
From: Pierre S. <psa...@ex...> - 2024-01-08 14:57:39
|
Hi,

The attached patch fixes the issue below with 5.19.9.

Thanks,
Pierre

From: Pierre Sangouard
Sent: Thursday, August 31, 2023 17:53
To: e10...@li...
Subject: ixgbe driver version 5.19.6 build error

Hi,

Building ixgbe driver version 5.19.6 for kernel 4.9.337 with CONFIG_I40E_DISABLE_PACKET_SPLIT=1, I get the following failure:

env -u KERNELRELEASE make -C ixgbe-5.19.6/src KSRC=my_linux_directory EXTRA_CFLAGS=-DCONFIG_IXGBE_DISABLE_PACKET_SPLIT=1 INSTALL_MOD_DIR=extra || exit 1;
make[1]: Entering directory 'my_driver_directory'
filtering include/linux/dev_printk.h out
filtering include/net/flow_keys.h out
filtering include/net/flow_offload.h out
all files (for given query) filtered out
filtering include/linux/device/class.h out
all files (for given query) filtered out
filtering include/linux/gnss.h out
all files (for given query) filtered out
filtering include/linux/jump_label_type.h out
filtering include/linux/jump_label_type.h out
make[2]: Entering directory 'my_linux_directory'
  CC [M]  my_driver_directory/ixgbe_main.o
my_driver_directory/ixgbe_main.c: In function 'ixgbe_configure_rx_ring':
my_driver_directory/ixgbe_main.c:4423:20: error: implicit declaration of function 'ixgbe_rx_offset'; did you mean 'ixgbe_rx_bufsz'? [-Werror=implicit-function-declaration]
  ring->rx_offset = ixgbe_rx_offset(ring);
                    ^~~~~~~~~~~~~~~
                    ixgbe_rx_bufsz
cc1: some warnings being treated as errors
scripts/Makefile.build:307: recipe for target 'my_driver_directory/ixgbe_main.o' failed
make[3]: *** [my_driver_directory/ixgbe_main.o] Error 1
Makefile:1544: recipe for target '_module_my_driver_directory' failed
make[2]: *** [_module_my_driver_directory] Error 2
make[2]: Leaving directory 'my_linux_directory'
Makefile:100: recipe for target 'default' failed
make[1]: *** [default] Error 2
make[1]: Leaving directory 'my_driver_directory'

Any idea how to fix it?

Thanks,
Pierre

Pierre Sangouard
Integration Manager / Extreme Networks
psa...@ex...
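The attached patch is not reproduced in the archive, so the following is only a guess at the shape of the fix: ixgbe_rx_offset() appears to be built only when the packet-split/build-skb receive path is compiled in, so with CONFIG_IXGBE_DISABLE_PACKET_SPLIT the call site in ixgbe_configure_rx_ring() has to be guarded (or a trivial fallback supplied). Both the guard and the zero fallback below are assumptions, not the actual patch.

```
/*
 * Hedged sketch only -- not the attached patch.  Guard the call site
 * flagged by the compiler so the legacy (packet-split-disabled) build
 * does not reference ixgbe_rx_offset().
 */
#ifndef CONFIG_IXGBE_DISABLE_PACKET_SPLIT
	ring->rx_offset = ixgbe_rx_offset(ring);
#else
	ring->rx_offset = 0;	/* assumption: no headroom offset on the legacy path */
#endif
```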
From: stefanx <st...@lr...> - 2023-12-19 17:09:10
|
Hello,

one of our servers crashes regularly, apparently during heavy network load. The log files are then full of this message:

kernel: [514257.305733] i40e 0000:02:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0020 address=0x79ea8113f60 flags=0x0000]

This is the driver:

i40e: Intel(R) Ethernet Connection XL710 Network Driver
i40e: Copyright (c) 2013 - 2019 Intel Corporation.
i40e 0000:02:00.0: fw 8.5.67516 api 1.15 nvm 8.50 0x8000be1e 1.3295.0 [8086:15ff] [15d9:1c76]
i40e 0000:02:00.0: MAC address: 7c:c2:55:9d:d2:78
i40e 0000:02:00.0: FW LLDP is enabled
i40e 0000:02:00.0 eth0: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
i40e 0000:02:00.0: PCI-Express: Speed 8.0GT/s Width x4
i40e 0000:02:00.0: PCI-Express bandwidth available for this device may be insufficient for optimal performance.
i40e 0000:02:00.0: Please move the device to a different PCI-e link with more lanes and/or higher transfer rate.
i40e 0000:02:00.0: Features: PF-id[0] VFs: 64 VSIs: 66 QP: 119 RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
i40e 0000:02:00.1: fw 8.5.67516 api 1.15 nvm 8.50 0x8000be1e 1.3295.0 [8086:15ff] [15d9:1c76]
i40e 0000:02:00.1: MAC address: 7c:c2:55:9d:d2:79
i40e 0000:02:00.1: FW LLDP is enabled
i40e 0000:02:00.1: PCI-Express: Speed 8.0GT/s Width x4
i40e 0000:02:00.1: PCI-Express bandwidth available for this device may be insufficient for optimal performance.
i40e 0000:02:00.1: Please move the device to a different PCI-e link with more lanes and/or higher transfer rate.
i40e 0000:02:00.1: Features: PF-id[1] VFs: 64 VSIs: 66 QP: 119 RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
i40e 0000:02:00.0 enp2s0f0: renamed from eth0
i40e 0000:02:00.1 enp2s0f1: renamed from eth1
i40e 0000:02:00.0: entering allmulti mode.

These messages stand out there:

i40e 0000:02:00.1: PCI-Express bandwidth available for this device may be insufficient for optimal performance.
i40e 0000:02:00.1: Please move the device to a different PCI-e link with more lanes and/or higher transfer rate.

Does somebody have any idea? GRUB_CMDLINE_LINUX_DEFAULT="iommu=soft" is sometimes recommended in similar cases with IO_PAGE_FAULT. Maybe I should lower the speed from 10 Gbit/s to 1 Gbit/s as a test?

Thanks
Stefan
From: Assaf A. <as...@qw...> - 2023-12-10 10:21:01
|
Thank you, Donald, Jesse, and all, for helping us. We'll go through the responses you gave us and let you know if we have any questions or findings. Assaf, On Thu, Dec 7, 2023 at 9:56 PM Brandeburg, Jesse <jes...@in...> wrote: > Hi Assaf, and thanks Don for mentioning the Cisco link. > > > > I had a further look at the stats and see this: > > mac_local_faults.nic: 0 > > mac_remote_faults.nic: 1 > > > > on both the sender and receiver stats. Remote fault means the switch RX > PCS failed to maintain locked state (far end of the cable away from our > adapter). This might help you switch team or cisco figure out what is going > on. > > > > In this case I don’t think it’s the driver or the local end firmware, but > I would strongly suggest that you update the firmware to a newer version on > (some of) your cards, and you can get the updated firmware from Cisco. > > > > So, I’d be asking, why is the switch cycling or dropping the link? Hope > this helps! > > > > Jesse > > > > *From:* Buchholz, Donald <don...@in...> > *Sent:* Thursday, December 7, 2023 11:05 AM > *To:* Assaf Albo <as...@qw...> > *Cc:* Brandeburg, Jesse <jes...@in...>; > e10...@li...; Matan Levy <ma...@qw...>; Itamar > Maron <it...@qw...> > *Subject:* RE: [e1000-devel] Intel E810 100Gb goes down sporadically > > > > Hi Assaf, > > > > Thank you for the data. I see from the data files you included that > > you are working with a Cisco-branded E810-CQDA2 NIC. > > > > As this is a Cisco supported NIC, have you consulted Cisco support > > and configured your system with Cisco-approved firmware/vendor > > versions? > > > > I do not support the Cisco products, but I see immediately that the > > NIC FW is revision 2.25. The ice driver v1.9.11 was developed at > > Intel for use with 4.xx firmware. > > > > Please contact Cisco. If it is a problem that they cannot resolve the > matter, they will reach out to the appropriate Intel support team > > for this product. > > > > Best regards, > > - Don > > > > > > *From:* Assaf Albo <as...@qw...> > *Sent:* Wednesday, December 6, 2023 3:34 AM > *To:* Buchholz, Donald <don...@in...> > *Cc:* Brandeburg, Jesse <jes...@in...>; > e10...@li...; Matan Levy <ma...@qw...>; Itamar > Maron <it...@qw...> > *Subject:* Re: [e1000-devel] Intel E810 100Gb goes down sporadically > > > > Hey guys, > > Firstly, I'd like to thank you all for helping us out. > > Attached to this mail are two files with all the statistics (client > machine + server machine). > > > > > > > > > > *"The passthrough device shouldn't be any problem but I do recommend that > if you're passing through the device to a VM, you try to match the > destination PCIe function number to the origination ID to prevent odd > issues. like if your host device is: 01:00.1 then (I'm not sure you can do > this) I'd hope the VM device is 00:06.1, and not 00:06.0"* > > Exactly what we are doing, we are matching. > You can see in the attached files that one of the machines is working with > eth0 00:06.0 and the other eth1 00:06.1 > > > > *"Also, do you see any stats or events on the switch side when link is > lost?"* > > We use Cisco Nexus switches, and our network engineer said that he > sees events of link down from the ports. > > > > On Wed, Dec 6, 2023 at 6:42 AM Buchholz, Donald <don...@in...> > wrote: > > Hi Assaf, > > In addition to the commands listed by Jesse, > please also provide "ethtool -i <eth#>" output. > This will assist us in identifying the NIC and > Firmware revision you are using. 
> > - Don > > > > -----Original Message----- > > From: Jesse Brandeburg <jes...@in...> > > Sent: Tuesday, December 5, 2023 10:47 AM > > To: Assaf Albo <as...@qw...>; e10...@li...; > Matan > > Levy <ma...@qw...> > > Subject: Re: [e1000-devel] Intel E810 100Gb goes down sporadically > > > > On 12/3/2023 1:26 AM, Assaf Albo via E1000-devel wrote: > > > Hello guys, > > > > > > We are having constant network issues in production in that the link > goes > > > down, waits *exactly* 7-8 seconds, and goes up again. > > > This can happen zero to a few times a day on all our servers; they are > not > > > in the same location and are connected to different network devices. > > > > > > Each server runs as a KVM virtual machine with 60 CPUs (Pinning) and > 224Gi > > > (Huge pages) - overall performance is excellent. > > > The NIC is PCI passed through to the KVM machine AS IS. > > > OS Rocky Linux 8.5, kernel 4.18.0-348.23.1.el8_5.x86_64 with Intel ice > > > 1.9.11 built and installed using rpm. > > > We have a traffic generator between two servers (our app: > client+server) > > > that is reaching 94Gb and can replicate this issue. > > > > > > The dmesg once the issue occur: > > > Nov 28 16:01:27 SERVER kernel: ice 0000:00:06.0 eth0: NIC Link is Down > > > Nov 28 16:01:35 SERVER kernel: ice 0000:00:06.0 eth0: NIC Link is up > 100 > > > Gbps Full Duplex, Requested FEC: RS-FEC, Negotiated FEC: RS-FEC, > Autoneg > > > Advertised: Off, Autoneg Negotiated: False, Flow Control: None > > > > Hi Assaf, sorry hear you're having problems. > > > > w.r.t. the link down events we need to determine if it is a local down > > or remote. > > > > Please gather the 'ethtool -S eth0' statistics for a system that has had > > some problems, and send to the list as text. > > > > also, 'ethtool -m eth0' > > > > The passthrough device shouldn't be any problem but I do recommend that > > if you're passing through the device to a VM, you try to match the > > destination PCIe function number to the origination ID to prevent odd > > issues. > > > > like if your host device is: > > 01:00.1 then (I'm not sure you can do this) I'd hope the VM device is > > 00:06.1, and not 00:06.0 > > > > So I guess with that statement I'd ask do you ever see the problem on > > systems with > > 3b:00.0 (ice PF PCIe in host) > > 00:06.0 (ice PF in VM) > > > > having the link down issues? > > > > Please include output from devlink dev info, and if you know it, what > > switch you're connected to. > > > > Also, do you see any stats or events on the switch side when link is > lost? > > > > - Jesse > > > > > > _______________________________________________ > > E1000-devel mailing list > > E10...@li... > > https://lists.sourceforge.net/lists/listinfo/e1000-devel > > To learn more about Intel Ethernet, visit > > https://community.intel.com/t5/Ethernet-Products/bd-p/ethernet-products > > |
From: Brandeburg, J. <jes...@in...> - 2023-12-07 19:56:22
|
Hi Assaf, and thanks Don for mentioning the Cisco link. I had a further look at the stats and see this: mac_local_faults.nic: 0 mac_remote_faults.nic: 1 on both the sender and receiver stats. Remote fault means the switch RX PCS failed to maintain locked state (far end of the cable away from our adapter). This might help you switch team or cisco figure out what is going on. In this case I don’t think it’s the driver or the local end firmware, but I would strongly suggest that you update the firmware to a newer version on (some of) your cards, and you can get the updated firmware from Cisco. So, I’d be asking, why is the switch cycling or dropping the link? Hope this helps! Jesse From: Buchholz, Donald <don...@in...> Sent: Thursday, December 7, 2023 11:05 AM To: Assaf Albo <as...@qw...> Cc: Brandeburg, Jesse <jes...@in...>; e10...@li...; Matan Levy <ma...@qw...>; Itamar Maron <it...@qw...> Subject: RE: [e1000-devel] Intel E810 100Gb goes down sporadically Hi Assaf, Thank you for the data. I see from the data files you included that you are working with a Cisco-branded E810-CQDA2 NIC. As this is a Cisco supported NIC, have you consulted Cisco support and configured your system with Cisco-approved firmware/vendor versions? I do not support the Cisco products, but I see immediately that the NIC FW is revision 2.25. The ice driver v1.9.11 was developed at Intel for use with 4.xx firmware. Please contact Cisco. If it is a problem that they cannot resolve the matter, they will reach out to the appropriate Intel support team for this product. Best regards, - Don From: Assaf Albo <as...@qw...<mailto:as...@qw...>> Sent: Wednesday, December 6, 2023 3:34 AM To: Buchholz, Donald <don...@in...<mailto:don...@in...>> Cc: Brandeburg, Jesse <jes...@in...<mailto:jes...@in...>>; e10...@li...<mailto:e10...@li...>; Matan Levy <ma...@qw...<mailto:ma...@qw...>>; Itamar Maron <it...@qw...<mailto:it...@qw...>> Subject: Re: [e1000-devel] Intel E810 100Gb goes down sporadically Hey guys, Firstly, I'd like to thank you all for helping us out. Attached to this mail are two files with all the statistics (client machine + server machine). "The passthrough device shouldn't be any problem but I do recommend that if you're passing through the device to a VM, you try to match the destination PCIe function number to the origination ID to prevent odd issues. like if your host device is: 01:00.1 then (I'm not sure you can do this) I'd hope the VM device is 00:06.1, and not 00:06.0" Exactly what we are doing, we are matching. You can see in the attached files that one of the machines is working with eth0 00:06.0 and the other eth1 00:06.1 "Also, do you see any stats or events on the switch side when link is lost?" We use Cisco Nexus switches, and our network engineer said that he sees events of link down from the ports. On Wed, Dec 6, 2023 at 6:42 AM Buchholz, Donald <don...@in...<mailto:don...@in...>> wrote: Hi Assaf, In addition to the commands listed by Jesse, please also provide "ethtool -i <eth#>" output. This will assist us in identifying the NIC and Firmware revision you are using. 
- Don > -----Original Message----- > From: Jesse Brandeburg <jes...@in...<mailto:jes...@in...>> > Sent: Tuesday, December 5, 2023 10:47 AM > To: Assaf Albo <as...@qw...<mailto:as...@qw...>>; e10...@li...<mailto:e10...@li...>; Matan > Levy <ma...@qw...<mailto:ma...@qw...>> > Subject: Re: [e1000-devel] Intel E810 100Gb goes down sporadically > > On 12/3/2023 1:26 AM, Assaf Albo via E1000-devel wrote: > > Hello guys, > > > > We are having constant network issues in production in that the link goes > > down, waits *exactly* 7-8 seconds, and goes up again. > > This can happen zero to a few times a day on all our servers; they are not > > in the same location and are connected to different network devices. > > > > Each server runs as a KVM virtual machine with 60 CPUs (Pinning) and 224Gi > > (Huge pages) - overall performance is excellent. > > The NIC is PCI passed through to the KVM machine AS IS. > > OS Rocky Linux 8.5, kernel 4.18.0-348.23.1.el8_5.x86_64 with Intel ice > > 1.9.11 built and installed using rpm. > > We have a traffic generator between two servers (our app: client+server) > > that is reaching 94Gb and can replicate this issue. > > > > The dmesg once the issue occur: > > Nov 28 16:01:27 SERVER kernel: ice 0000:00:06.0 eth0: NIC Link is Down > > Nov 28 16:01:35 SERVER kernel: ice 0000:00:06.0 eth0: NIC Link is up 100 > > Gbps Full Duplex, Requested FEC: RS-FEC, Negotiated FEC: RS-FEC, Autoneg > > Advertised: Off, Autoneg Negotiated: False, Flow Control: None > > Hi Assaf, sorry hear you're having problems. > > w.r.t. the link down events we need to determine if it is a local down > or remote. > > Please gather the 'ethtool -S eth0' statistics for a system that has had > some problems, and send to the list as text. > > also, 'ethtool -m eth0' > > The passthrough device shouldn't be any problem but I do recommend that > if you're passing through the device to a VM, you try to match the > destination PCIe function number to the origination ID to prevent odd > issues. > > like if your host device is: > 01:00.1 then (I'm not sure you can do this) I'd hope the VM device is > 00:06.1, and not 00:06.0 > > So I guess with that statement I'd ask do you ever see the problem on > systems with > 3b:00.0 (ice PF PCIe in host) > 00:06.0 (ice PF in VM) > > having the link down issues? > > Please include output from devlink dev info, and if you know it, what > switch you're connected to. > > Also, do you see any stats or events on the switch side when link is lost? > > - Jesse > > > _______________________________________________ > E1000-devel mailing list > E10...@li...<mailto:E10...@li...> > https://lists.sourceforge.net/lists/listinfo/e1000-devel > To learn more about Intel Ethernet, visit > https://community.intel.com/t5/Ethernet-Products/bd-p/ethernet-products |
From: Buchholz, D. <don...@in...> - 2023-12-07 19:07:48
|
Hi Assaf, Thank you for the data. I see from the data files you included that you are working with a Cisco-branded E810-CQDA2 NIC. As this is a Cisco supported NIC, have you consulted Cisco support and configured your system with Cisco-approved firmware/vendor versions? I do not support the Cisco products, but I see immediately that the NIC FW is revision 2.25. The ice driver v1.9.11 was developed at Intel for use with 4.xx firmware. Please contact Cisco. If it is a problem that they cannot resolve the matter, they will reach out to the appropriate Intel support team for this product. Best regards, - Don From: Assaf Albo <as...@qw...> Sent: Wednesday, December 6, 2023 3:34 AM To: Buchholz, Donald <don...@in...> Cc: Brandeburg, Jesse <jes...@in...>; e10...@li...; Matan Levy <ma...@qw...>; Itamar Maron <it...@qw...> Subject: Re: [e1000-devel] Intel E810 100Gb goes down sporadically Hey guys, Firstly, I'd like to thank you all for helping us out. Attached to this mail are two files with all the statistics (client machine + server machine). "The passthrough device shouldn't be any problem but I do recommend that if you're passing through the device to a VM, you try to match the destination PCIe function number to the origination ID to prevent odd issues. like if your host device is: 01:00.1 then (I'm not sure you can do this) I'd hope the VM device is 00:06.1, and not 00:06.0" Exactly what we are doing, we are matching. You can see in the attached files that one of the machines is working with eth0 00:06.0 and the other eth1 00:06.1 "Also, do you see any stats or events on the switch side when link is lost?" We use Cisco Nexus switches, and our network engineer said that he sees events of link down from the ports. On Wed, Dec 6, 2023 at 6:42 AM Buchholz, Donald <don...@in...<mailto:don...@in...>> wrote: Hi Assaf, In addition to the commands listed by Jesse, please also provide "ethtool -i <eth#>" output. This will assist us in identifying the NIC and Firmware revision you are using. - Don > -----Original Message----- > From: Jesse Brandeburg <jes...@in...<mailto:jes...@in...>> > Sent: Tuesday, December 5, 2023 10:47 AM > To: Assaf Albo <as...@qw...<mailto:as...@qw...>>; e10...@li...<mailto:e10...@li...>; Matan > Levy <ma...@qw...<mailto:ma...@qw...>> > Subject: Re: [e1000-devel] Intel E810 100Gb goes down sporadically > > On 12/3/2023 1:26 AM, Assaf Albo via E1000-devel wrote: > > Hello guys, > > > > We are having constant network issues in production in that the link goes > > down, waits *exactly* 7-8 seconds, and goes up again. > > This can happen zero to a few times a day on all our servers; they are not > > in the same location and are connected to different network devices. > > > > Each server runs as a KVM virtual machine with 60 CPUs (Pinning) and 224Gi > > (Huge pages) - overall performance is excellent. > > The NIC is PCI passed through to the KVM machine AS IS. > > OS Rocky Linux 8.5, kernel 4.18.0-348.23.1.el8_5.x86_64 with Intel ice > > 1.9.11 built and installed using rpm. > > We have a traffic generator between two servers (our app: client+server) > > that is reaching 94Gb and can replicate this issue. > > > > The dmesg once the issue occur: > > Nov 28 16:01:27 SERVER kernel: ice 0000:00:06.0 eth0: NIC Link is Down > > Nov 28 16:01:35 SERVER kernel: ice 0000:00:06.0 eth0: NIC Link is up 100 > > Gbps Full Duplex, Requested FEC: RS-FEC, Negotiated FEC: RS-FEC, Autoneg > > Advertised: Off, Autoneg Negotiated: False, Flow Control: None > > Hi Assaf, sorry hear you're having problems. 
> > w.r.t. the link down events we need to determine if it is a local down > or remote. > > Please gather the 'ethtool -S eth0' statistics for a system that has had > some problems, and send to the list as text. > > also, 'ethtool -m eth0' > > The passthrough device shouldn't be any problem but I do recommend that > if you're passing through the device to a VM, you try to match the > destination PCIe function number to the origination ID to prevent odd > issues. > > like if your host device is: > 01:00.1 then (I'm not sure you can do this) I'd hope the VM device is > 00:06.1, and not 00:06.0 > > So I guess with that statement I'd ask do you ever see the problem on > systems with > 3b:00.0 (ice PF PCIe in host) > 00:06.0 (ice PF in VM) > > having the link down issues? > > Please include output from devlink dev info, and if you know it, what > switch you're connected to. > > Also, do you see any stats or events on the switch side when link is lost? > > - Jesse > > > _______________________________________________ > E1000-devel mailing list > E10...@li...<mailto:E10...@li...> > https://lists.sourceforge.net/lists/listinfo/e1000-devel > To learn more about Intel Ethernet, visit > https://community.intel.com/t5/Ethernet-Products/bd-p/ethernet-products |
From: Assaf A. <as...@qw...> - 2023-12-06 12:05:25
|
Hey guys, Firstly, I'd like to thank you all for helping us out. Attached to this mail are two files with all the statistics (client machine + server machine). *"The passthrough device shouldn't be any problem but I do recommend thatif you're passing through the device to a VM, you try to match thedestination PCIe function number to the origination ID to prevent oddissues.like if your host device is:01:00.1 then (I'm not sure you can do this) I'd hope the VM device is00:06.1, and not 00:06.0"* Exactly what we are doing, we are matching. You can see in the attached files that one of the machines is working with eth0 00:06.0 and the other eth1 00:06.1 *"Also, do you see any stats or events on the switch side when link is lost?"* We use Cisco Nexus switches, and our network engineer said that he sees events of link down from the ports. On Wed, Dec 6, 2023 at 6:42 AM Buchholz, Donald <don...@in...> wrote: > Hi Assaf, > > In addition to the commands listed by Jesse, > please also provide "ethtool -i <eth#>" output. > This will assist us in identifying the NIC and > Firmware revision you are using. > > - Don > > > > -----Original Message----- > > From: Jesse Brandeburg <jes...@in...> > > Sent: Tuesday, December 5, 2023 10:47 AM > > To: Assaf Albo <as...@qw...>; e10...@li...; > Matan > > Levy <ma...@qw...> > > Subject: Re: [e1000-devel] Intel E810 100Gb goes down sporadically > > > > On 12/3/2023 1:26 AM, Assaf Albo via E1000-devel wrote: > > > Hello guys, > > > > > > We are having constant network issues in production in that the link > goes > > > down, waits *exactly* 7-8 seconds, and goes up again. > > > This can happen zero to a few times a day on all our servers; they are > not > > > in the same location and are connected to different network devices. > > > > > > Each server runs as a KVM virtual machine with 60 CPUs (Pinning) and > 224Gi > > > (Huge pages) - overall performance is excellent. > > > The NIC is PCI passed through to the KVM machine AS IS. > > > OS Rocky Linux 8.5, kernel 4.18.0-348.23.1.el8_5.x86_64 with Intel ice > > > 1.9.11 built and installed using rpm. > > > We have a traffic generator between two servers (our app: > client+server) > > > that is reaching 94Gb and can replicate this issue. > > > > > > The dmesg once the issue occur: > > > Nov 28 16:01:27 SERVER kernel: ice 0000:00:06.0 eth0: NIC Link is Down > > > Nov 28 16:01:35 SERVER kernel: ice 0000:00:06.0 eth0: NIC Link is up > 100 > > > Gbps Full Duplex, Requested FEC: RS-FEC, Negotiated FEC: RS-FEC, > Autoneg > > > Advertised: Off, Autoneg Negotiated: False, Flow Control: None > > > > Hi Assaf, sorry hear you're having problems. > > > > w.r.t. the link down events we need to determine if it is a local down > > or remote. > > > > Please gather the 'ethtool -S eth0' statistics for a system that has had > > some problems, and send to the list as text. > > > > also, 'ethtool -m eth0' > > > > The passthrough device shouldn't be any problem but I do recommend that > > if you're passing through the device to a VM, you try to match the > > destination PCIe function number to the origination ID to prevent odd > > issues. > > > > like if your host device is: > > 01:00.1 then (I'm not sure you can do this) I'd hope the VM device is > > 00:06.1, and not 00:06.0 > > > > So I guess with that statement I'd ask do you ever see the problem on > > systems with > > 3b:00.0 (ice PF PCIe in host) > > 00:06.0 (ice PF in VM) > > > > having the link down issues? 
> > > > Please include output from devlink dev info, and if you know it, what > > switch you're connected to. > > > > Also, do you see any stats or events on the switch side when link is > lost? > > > > - Jesse > > > > > > _______________________________________________ > > E1000-devel mailing list > > E10...@li... > > https://lists.sourceforge.net/lists/listinfo/e1000-devel > > To learn more about Intel Ethernet, visit > > https://community.intel.com/t5/Ethernet-Products/bd-p/ethernet-products > |
From: Buchholz, D. <don...@in...> - 2023-12-06 04:43:17
|
Hi Assaf, In addition to the commands listed by Jesse, please also provide "ethtool -i <eth#>" output. This will assist us in identifying the NIC and Firmware revision you are using. - Don > -----Original Message----- > From: Jesse Brandeburg <jes...@in...> > Sent: Tuesday, December 5, 2023 10:47 AM > To: Assaf Albo <as...@qw...>; e10...@li...; Matan > Levy <ma...@qw...> > Subject: Re: [e1000-devel] Intel E810 100Gb goes down sporadically > > On 12/3/2023 1:26 AM, Assaf Albo via E1000-devel wrote: > > Hello guys, > > > > We are having constant network issues in production in that the link goes > > down, waits *exactly* 7-8 seconds, and goes up again. > > This can happen zero to a few times a day on all our servers; they are not > > in the same location and are connected to different network devices. > > > > Each server runs as a KVM virtual machine with 60 CPUs (Pinning) and 224Gi > > (Huge pages) - overall performance is excellent. > > The NIC is PCI passed through to the KVM machine AS IS. > > OS Rocky Linux 8.5, kernel 4.18.0-348.23.1.el8_5.x86_64 with Intel ice > > 1.9.11 built and installed using rpm. > > We have a traffic generator between two servers (our app: client+server) > > that is reaching 94Gb and can replicate this issue. > > > > The dmesg once the issue occur: > > Nov 28 16:01:27 SERVER kernel: ice 0000:00:06.0 eth0: NIC Link is Down > > Nov 28 16:01:35 SERVER kernel: ice 0000:00:06.0 eth0: NIC Link is up 100 > > Gbps Full Duplex, Requested FEC: RS-FEC, Negotiated FEC: RS-FEC, Autoneg > > Advertised: Off, Autoneg Negotiated: False, Flow Control: None > > Hi Assaf, sorry hear you're having problems. > > w.r.t. the link down events we need to determine if it is a local down > or remote. > > Please gather the 'ethtool -S eth0' statistics for a system that has had > some problems, and send to the list as text. > > also, 'ethtool -m eth0' > > The passthrough device shouldn't be any problem but I do recommend that > if you're passing through the device to a VM, you try to match the > destination PCIe function number to the origination ID to prevent odd > issues. > > like if your host device is: > 01:00.1 then (I'm not sure you can do this) I'd hope the VM device is > 00:06.1, and not 00:06.0 > > So I guess with that statement I'd ask do you ever see the problem on > systems with > 3b:00.0 (ice PF PCIe in host) > 00:06.0 (ice PF in VM) > > having the link down issues? > > Please include output from devlink dev info, and if you know it, what > switch you're connected to. > > Also, do you see any stats or events on the switch side when link is lost? > > - Jesse > > > _______________________________________________ > E1000-devel mailing list > E10...@li... > https://lists.sourceforge.net/lists/listinfo/e1000-devel > To learn more about Intel Ethernet, visit > https://community.intel.com/t5/Ethernet-Products/bd-p/ethernet-products |
From: Jesse B. <jes...@in...> - 2023-12-05 18:47:11
|
On 12/3/2023 1:26 AM, Assaf Albo via E1000-devel wrote:
> Hello guys,
>
> We are having constant network issues in production in that the link goes
> down, waits *exactly* 7-8 seconds, and goes up again.
> This can happen zero to a few times a day on all our servers; they are not
> in the same location and are connected to different network devices.
>
> Each server runs as a KVM virtual machine with 60 CPUs (Pinning) and 224Gi
> (Huge pages) - overall performance is excellent.
> The NIC is PCI passed through to the KVM machine AS IS.
> OS Rocky Linux 8.5, kernel 4.18.0-348.23.1.el8_5.x86_64 with Intel ice
> 1.9.11 built and installed using rpm.
> We have a traffic generator between two servers (our app: client+server)
> that is reaching 94Gb and can replicate this issue.
>
> The dmesg once the issue occur:
> Nov 28 16:01:27 SERVER kernel: ice 0000:00:06.0 eth0: NIC Link is Down
> Nov 28 16:01:35 SERVER kernel: ice 0000:00:06.0 eth0: NIC Link is up 100
> Gbps Full Duplex, Requested FEC: RS-FEC, Negotiated FEC: RS-FEC, Autoneg
> Advertised: Off, Autoneg Negotiated: False, Flow Control: None

Hi Assaf, sorry to hear you're having problems.

W.r.t. the link down events, we need to determine if it is a local down or remote.

Please gather the 'ethtool -S eth0' statistics for a system that has had some problems, and send them to the list as text. Also, 'ethtool -m eth0'.

The passthrough device shouldn't be any problem, but I do recommend that if you're passing through the device to a VM, you try to match the destination PCIe function number to the origination ID to prevent odd issues. Like if your host device is 01:00.1, then (I'm not sure you can do this) I'd hope the VM device is 00:06.1, and not 00:06.0.

So I guess with that statement I'd ask: do you ever see the problem on systems with
3b:00.0 (ice PF PCIe in host)
00:06.0 (ice PF in VM)
having the link down issues?

Please include output from devlink dev info, and if you know it, what switch you're connected to.

Also, do you see any stats or events on the switch side when link is lost?

- Jesse
From: Assaf A. <as...@qw...> - 2023-12-03 10:26:14
|
Hello guys,

We are having constant network issues in production in that the link goes down, waits *exactly* 7-8 seconds, and goes up again. This can happen zero to a few times a day on all our servers; they are not in the same location and are connected to different network devices.

Each server runs as a KVM virtual machine with 60 CPUs (pinning) and 224Gi (huge pages) - overall performance is excellent. The NIC is PCI passed through to the KVM machine AS IS. OS Rocky Linux 8.5, kernel 4.18.0-348.23.1.el8_5.x86_64 with Intel ice 1.9.11 built and installed using rpm. We have a traffic generator between two servers (our app: client+server) that is reaching 94Gb and can replicate this issue.

The dmesg once the issue occurs:

Nov 28 16:01:27 SERVER kernel: ice 0000:00:06.0 eth0: NIC Link is Down
Nov 28 16:01:35 SERVER kernel: ice 0000:00:06.0 eth0: NIC Link is up 100 Gbps Full Duplex, Requested FEC: RS-FEC, Negotiated FEC: RS-FEC, Autoneg Advertised: Off, Autoneg Negotiated: False, Flow Control: None

Thanks,
Assaf
From: Eduardo E <ee...@gm...> - 2023-11-08 20:57:20
|
Hi,

I have a board based on an Intel C3000 (Denverton) processor which has an embedded X553 controller; attached to uPC MDIO there's a PHY that I have to access from user space. Using the in-tree ixgbe driver I can access it, although the ioctl call takes ~40ms to complete (as below), which is too slow.

In-tree driver ioctl performance:

$ sudo strace -T --trace=ioctl ./phytool read eno1/0/2
ioctl(3, SIOCGMIIREG, 0x7fffc81f42c0) = 0 <0.039746>
0x0141

Using a test program which calls PHY read 1000 times, the program takes 40s to complete although it uses 22ms of user + system time.

$ time sudo ./mdio_ioctl_performance 1000
IOCTL Mode reads
real 0m40.050s
user 0m0.011s
sys 0m0.011s

Changing to the out-of-tree driver (5.19.6), external MDIO access does not work at all even though the ethernet ports work. I've noticed that using the out-of-tree driver the mdio bus disappears from "/sys/class/mdio_bus/ixgbe-mdio-0000:05:00.0".

I've tried kernels 5.4, 5.15 and 6.5 and the result is the same both using the in-tree and out-of-tree driver.

Any idea on how to enable external MDIO access in the out-of-tree driver or improve performance using the in-tree driver?
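For context, below is a minimal user-space sketch of the SIOCGMIIREG path that phytool exercises in the strace output above. It illustrates the standard MII ioctl interface only, not the ixgbe driver internals; the ~40 ms per call reported in the message is spent on the kernel/driver side of this ioctl, and the interface name, PHY address and register in main() are just illustrative values.

```
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <linux/mii.h>
#include <linux/sockios.h>

/* Read one MDIO register via SIOCGMIIREG (the same ioctl phytool uses). */
static int mdio_read(int sock, const char *ifname, int phy_addr, int reg)
{
	struct ifreq ifr;
	/* The MII request is carried inside the ifreq union, mii-tool style. */
	struct mii_ioctl_data *mii = (struct mii_ioctl_data *)&ifr.ifr_data;

	memset(&ifr, 0, sizeof(ifr));
	strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
	mii->phy_id = phy_addr;
	mii->reg_num = reg;

	if (ioctl(sock, SIOCGMIIREG, &ifr) < 0)
		return -1;

	return mii->val_out;
}

int main(int argc, char **argv)
{
	const char *ifname = argc > 1 ? argv[1] : "eno1";	/* illustrative */
	int sock = socket(AF_INET, SOCK_DGRAM, 0);
	int val;

	if (sock < 0) {
		perror("socket");
		return 1;
	}

	val = mdio_read(sock, ifname, 0, 2);	/* PHY 0, register 2, as in the example above */
	if (val < 0)
		perror("SIOCGMIIREG");
	else
		printf("0x%04x\n", val);

	close(sock);
	return val < 0;
}
```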
From: adelio A. <aa...@ac...> - 2023-11-08 15:39:59
|
Good morning,

With the driver provided with the Ubuntu 22.04.3 LTS kernel, the I210 interface continually resets, even with the latest kernel update; same with Ubuntu version 23.10.1.

I retrieved the igb-5.14.16 driver from the site http://sourceforge.net/projects/e1000, the compilation went smoothly, and my I210 interface no longer resets. My current version is:

Ubuntu 22.04.3 LTS (GNU/Linux 5.15.0-88-generic x86_64)
filename: /lib/modules/5.15.0-88-generic/updates/drivers/net/ethernet/intel/igb/igb.ko
version: 5.14.16

My previous email is in the attached file.

Best regards,

Adelio ALVES
Technicien Hotline: 0 892 700 131
aa...@ac...
www.actn.fr | 3 impasse Denis Papin - Z.I. de Pahin - BP 10016 - 31170 Tournefeuille

-----Original Message-----
From: Jesse Brandeburg <jes...@in...>
Sent: Wednesday, November 8, 2023 00:16
To: adelio ALVES <aa...@ac...>; e10...@li...
Subject: Re: [e1000-devel] idg driver compilation error on Ubuntu

On 10/30/2023 3:27 AM, adelio ALVES wrote:

Thanks for your report! Something happened to the content of your message when I released it to the mailing list. Please use the driver included in your kernel (igb.ko.xz or the like) and let us know if you have any problems. Was there a reason you wanted to run the out-of-tree igb-5.7.2 driver? Kernel version 5.15.0-97-generic should already have a working igb driver.

Thanks,
Jesse
From: Jesse B. <jes...@in...> - 2023-11-07 23:31:44
|
On 10/30/2023 3:27 AM, adelio ALVES wrote: Thanks for your report! Something happened to the content of your message when I released it to the mailing list. Please use the driver included in your kernel (igb.ko.xz or the like) and let us know if you have any problems. Was there a reason you wanted to run the out-of-tree igb-5.7.2 driver? Kernel version 5.15.0-97-generic should already have a working igb driver. Thanks, Jesse |
From: pkz <pen...@gm...> - 2023-11-01 07:29:09
|
Hello, I need help with my network card, which is experiencing a network interruption issue. The log shows: "NETDEV WATCHDOG: ens238f1 (i40e): transmit queue 86 timed out."

------
Oct 27 16:40:05 C2-82-172 kernel: [3806344.657561] i40e 0000:c2:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0067 address=0x8 flags=0x0000]
Oct 27 16:40:05 C2-82-172 kernel: [3806344.657645] i40e 0000:c2:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0067 address=0x1008 flags=0x0000]
Oct 27 16:40:05 C2-82-172 kernel: [3806344.657711] i40e 0000:c2:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0067 address=0x2008 flags=0x0000]
Oct 27 16:40:05 C2-82-172 kernel: [3806344.657776] i40e 0000:c2:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0067 address=0x2500 flags=0x0000]
Oct 27 16:40:05 C2-82-172 kernel: [3806344.657841] i40e 0000:c2:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0067 address=0x3008 flags=0x0000]
Oct 27 16:40:05 C2-82-172 kernel: [3806344.657905] i40e 0000:c2:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0067 address=0x3a08 flags=0x0000]
Oct 27 16:40:05 C2-82-172 kernel: [3806344.657969] i40e 0000:c2:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0067 address=0x4008 flags=0x0000]
Oct 27 16:40:05 C2-82-172 kernel: [3806344.658034] i40e 0000:c2:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0067 address=0x4608 flags=0x0000]
Oct 27 16:40:05 C2-82-172 kernel: [3806344.658099] i40e 0000:c2:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0067 address=0x4c08 flags=0x0000]
Oct 27 16:40:05 C2-82-172 kernel: [3806344.658164] i40e 0000:c2:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0067 address=0x5008 flags=0x0000]
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537473] ------------[ cut here ]------------
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537503] NETDEV WATCHDOG: ens238f1 (i40e): transmit queue 86 timed out
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537526] WARNING: CPU: 56 PID: 0 at net/sched/sch_generic.c:472 dev_watchdog+0x258/0x260
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537530] Modules linked in: ipmi_poweroff ipmi_watchdog xt_CT xt_tcpudp ipt_rpfilter xt_multiport iptable_raw ip_set_hash_ip ip_set_hash_net ipip tunnel4 ip_tunnel wireguard ip6_udp_tunnel udp_tunnel nf_tables veth ip6table_mangle iptable_mangle ip6table_filter xt_conntrack xt_MASQUERADE xt_mark xt_addrtype xt_set nf_conntrack_netlink ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_bitmap_port ip_set_hash_ipport dummy xt_comment iptable_filter ip6table_nat ip6_tables iptable_nat nf_nat bpfilter binfmt_misc aufs ip_set nfnetlink overlay bonding dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua amd64_edac_mod edac_mce_amd kvm_amd kvm ipmi_ssif snd_hda_codec_hdmi joydev cdc_ether input_leds usbnet mii ccp snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd soundcore k10temp ipmi_si ipmi_devintf ipmi_msghandler mac_hid nvidia_uvm(OE) sch_fq_codel msr ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter bridge stp
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537615] llc sunrpc ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear nvidia_drm(POE) nvidia_modeset(POE) ast drm_vram_helper crct10dif_pclmul crc32_pclmul i2c_algo_bit ghash_clmulni_intel aesni_intel nvidia(POE) ttm hid_generic crypto_simd cryptd glue_helper drm_kms_helper syscopyarea mpt3sas sysfillrect sysimgblt ahci usbhid raid_class fb_sys_fops i40e hid libahci scsi_transport_sas drm i2c_piix4
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537668] CPU: 56 PID: 0 Comm: swapper/56 Tainted: P OE 5.4.0-81-generic #91-Ubuntu
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537669] Hardware name: ASUSTeK COMPUTER INC. ESC8000A-E11/KMPG-D32 Series, BIOS 0103 04/18/2022
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537672] RIP: 0010:dev_watchdog+0x258/0x260
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537677] Code: 85 c0 75 e5 eb 9f 4c 89 ff c6 05 20 e3 ec 00 01 e8 fd c1 fa ff 44 89 e9 4c 89 fe 48 c7 c7 d8 b6 03 9c 48 89 c2 e8 ea d8 13 00 <0f> 0b eb 80 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 41 57 49 89 d7
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537682] RSP: 0018:ffffb05a99da8e30 EFLAGS: 00010286
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537684] RAX: 0000000000000000 RBX: ffff9115f89b9ec0 RCX: 0000000000000000
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537685] RDX: ffff91163f227740 RSI: ffff91163f2178c8 RDI: 0000000000000300
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537686] RBP: ffffb05a99da8e60 R08: ffff91163f2178c8 R09: 0000000000000004
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537687] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000080
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537688] R13: 0000000000000056 R14: ffff9115f98a2480 R15: ffff9115f98a2000
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537690] FS: 0000000000000000(0000) GS:ffff91163f200000(0000) knlGS:0000000000000000
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537695] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537696] CR2: 000000000307f6e0 CR3: 0000010df0c0a000 CR4: 0000000000340ee0
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537697] Call Trace:
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537699] <IRQ>
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537704] ? pfifo_fast_enqueue+0x150/0x150
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537710] call_timer_fn+0x32/0x130
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537721] __run_timers.part.0+0x180/0x280
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537725] ? tick_sched_handle+0x33/0x60
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537729] ? tick_sched_timer+0x3d/0x80
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537734] ? ktime_get+0x3e/0xa0
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537736] run_timer_softirq+0x2a/0x50
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537741] __do_softirq+0xe1/0x2d6
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537746] ? hrtimer_interrupt+0x136/0x220
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537752] irq_exit+0xae/0xb0
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537756] smp_apic_timer_interrupt+0x7b/0x140
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537758] apic_timer_interrupt+0xf/0x20
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537762] </IRQ>
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537770] RIP: 0010:cpuidle_enter_state+0xc5/0x450
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537777] Code: ff e8 0f f2 84 ff 80 7d c7 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 65 03 00 00 31 ff e8 72 f6 8a ff fb 66 0f 1f 44 00 00 <45> 85 ed 0f 88 8f 02 00 00 49 63 cd 4c 8b 7d d0 4c 2b 7d c8 48 8d
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537781] RSP: 0018:ffffb05a8068fe38 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537787] RAX: ffff91163f22adc0 RBX: ffffffff9c369380 RCX: 000000000000001f
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537788] RDX: 0000000000000000 RSI: 000000002c388bcf RDI: 0000000000000000
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537789] RBP: ffffb05a8068fe78 R08: 000d85db4747addb R09: 0000000000000000
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537790] R10: ffff91193ff55328 R11: 0000000000000000 R12: ffff91160001bc00
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537794] R13: 0000000000000002 R14: 0000000000000002 R15: ffff91160001bc00
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537798] ? cpuidle_enter_state+0xa1/0x450
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537802] cpuidle_enter+0x2e/0x40
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537807] call_cpuidle+0x23/0x40
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537811] do_idle+0x1dd/0x270
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537814] cpu_startup_entry+0x20/0x30
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537820] start_secondary+0x167/0x1c0
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537824] secondary_startup_64+0xa4/0xb0
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537828] ---[ end trace ab47ea50b75ed1a1 ]---
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537852] i40e 0000:c2:00.1 ens238f1: tx_timeout: VSI_seid: 391, Q 86, NTC: 0xc8, HWB: 0xc8, NTU: 0xf1, TAIL: 0xf1, INT: 0x1
Oct 27 16:40:12 C2-82-172 kernel: [3806351.537859] i40e 0000:c2:00.1 ens238f1: tx_timeout recovery level 1, hung_queue 86
Oct 27 16:40:12 C2-82-172 kernel: [3806351.538666] i40e 0000:c2:00.1: VSI seid 391 Tx ring 0 disable timeout
Oct 27 16:40:13 C2-82-172 kernel: [3806352.148680] i40e 0000:c2:00.1: VSI seid 393 Tx ring 128 disable timeout
Oct 27 16:40:13 C2-82-172 kernel: [3806352.209797] bond0: (slave ens238f1): link status definitely down, disabling slave
Oct 27 16:40:13 C2-82-172 kernel: [3806352.380340] i40e 0000:c2:00.0: VSI seid 390 Tx ring 0 disable timeout
Oct 27 16:40:13 C2-82-172 kernel: [3806352.550050] i40e 0000:c2:00.0: VSI seid 392 Tx ring 128 disable timeout
Oct 27 16:40:13 C2-82-172 kernel: [3806352.609619] bond0: (slave ens238f0): link status definitely down, disabling slave
Oct 27 16:40:16 C2-82-172 kernel: [3806355.499723] i40e 0000:c2:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0067 address=0x5208 flags=0x0000]
Oct 27 16:40:16 C2-82-172 kernel: [3806355.499780] i40e 0000:c2:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0067 address=0x6008 flags=0x0000]
Oct 27 16:40:16 C2-82-172 kernel: [3806355.499818] i40e 0000:c2:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0067 address=0x8008 flags=0x0000]
Oct 27 16:40:16 C2-82-172 kernel: [3806355.499856] i40e 0000:c2:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0067 address=0x7008 flags=0x0000]
Oct 27 16:40:16 C2-82-172 kernel: [3806355.499894] i40e 0000:c2:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0067 address=0x9008 flags=0x0000]
Oct 27 16:40:16 C2-82-172 kernel: [3806355.499932] i40e 0000:c2:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0067 address=0xa008 flags=0x0000]
Oct 27 16:40:16 C2-82-172 kernel: [3806355.499968] i40e 0000:c2:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0067 address=0xac08 flags=0x0000]
Oct 27 16:40:16 C2-82-172 kernel: [3806355.500005] i40e 0000:c2:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0067 address=0xb008 flags=0x0000]
Oct 27 16:40:16 C2-82-172 kernel: [3806355.500042] i40e 0000:c2:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0067 address=0xbc08 flags=0x0000]
Oct 27 16:40:16 C2-82-172 kernel: [3806355.500080] i40e 0000:c2:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0067 address=0xc008 flags=0x0000]
Oct 27 16:40:16 C2-82-172 kernel: [3806355.500121] amd_iommu_report_page_fault: 20 callbacks suppressed
Oct 27 16:40:16 C2-82-172 kernel: [3806355.500122] AMD-Vi: Event logged [IO_PAGE_FAULT device=c2:00.1 domain=0x0067 address=0xcc08 flags=0x0000]
Oct 27 16:40:16 C2-82-172 kernel: [3806355.500159] AMD-Vi: Event logged [IO_PAGE_FAULT device=c2:00.1 domain=0x0067 address=0xd008 flags=0x0000]
Oct 27 16:40:16 C2-82-172 kernel: [3806355.500196] AMD-Vi: Event logged [IO_PAGE_FAULT device=c2:00.1 domain=0x0067 address=0xdc08 flags=0x0000]
Oct 27 16:40:16 C2-82-172 kernel: [3806355.500233] AMD-Vi: Event logged [IO_PAGE_FAULT device=c2:00.1 domain=0x0067 address=0xe008 flags=0x0000]
Oct 27 16:40:16 C2-82-172 kernel: [3806355.500271] AMD-Vi: Event logged [IO_PAGE_FAULT device=c2:00.1 domain=0x0067 address=0xec08 flags=0x0000]
Oct 27 16:40:16 C2-82-172 kernel: [3806355.500306] AMD-Vi: Event logged [IO_PAGE_FAULT device=c2:00.1 domain=0x0067 address=0xf008 flags=0x0000]
Oct 27 16:40:16 C2-82-172 kernel: [3806355.500342] AMD-Vi: Event logged [IO_PAGE_FAULT device=c2:00.1 domain=0x0067 address=0xfc08 flags=0x0000]
Oct 27 16:40:16 C2-82-172 kernel: [3806355.500378] AMD-Vi: Event logged [IO_PAGE_FAULT device=c2:00.1 domain=0x0067 address=0x10008 flags=0x0000]
Oct 27 16:40:16 C2-82-172 kernel: [3806355.500413] AMD-Vi: Event logged [IO_PAGE_FAULT device=c2:00.1 domain=0x0067 address=0x10e08 flags=0x0000]
Oct 27 16:40:16 C2-82-172 kernel: [3806355.500449] AMD-Vi: Event logged [IO_PAGE_FAULT device=c2:00.1 domain=0x0067 address=0x11008 flags=0x0000]
Oct 27 16:40:17 C2-82-172 kernel: [3806356.241885] bond0: (slave ens238f1): link status definitely up, 10000 Mbps full duplex
Oct 27 16:40:17 C2-82-172 kernel: [3806356.241905] bond0: active interface up!
Oct 27 16:40:17 C2-82-172 kernel: [3806356.242036] bond0: (slave ens238f0): link status definitely up, 10000 Mbps full duplex
Oct 27 16:40:22 C2-82-172 kernel: [3806361.521403] i40e 0000:c2:00.0 ens238f0: tx_timeout: VSI_seid: 390, Q 47, NTC: 0x0, HWB: 0x0, NTU: 0x1, TAIL: 0x1, INT: 0x1
Oct 27 16:40:22 C2-82-172 kernel: [3806361.521433] i40e 0000:c2:00.0 ens238f0: tx_timeout recovery level 1, hung_queue 47
Oct 27 16:40:22 C2-82-172 kernel: [3806361.521465] i40e 0000:c2:00.1 ens238f1: tx_timeout: VSI_seid: 391, Q 47, NTC: 0x0, HWB: 0x0, NTU: 0x1, TAIL: 0x1, INT: 0x1
Oct 27 16:40:22 C2-82-172 kernel: [3806361.521469] i40e 0000:c2:00.1 ens238f1: tx_timeout recovery level 2, hung_queue 47
Oct 27 16:40:22 C2-82-172 kernel: [3806361.522227] i40e 0000:c2:00.0: VSI seid 390 Tx ring 0 disable timeout
Oct 27 16:40:22 C2-82-172 kernel: [3806361.689171] i40e 0000:c2:00.0: VSI seid 393 Tx ring 128 disable timeout
Oct 27 16:40:22 C2-82-172 kernel: [3806361.742281] i40e 0000:c2:00.1: VSI seid 391 Tx ring 0 disable timeout
Oct 27 16:40:23 C2-82-172 kernel: [3806362.002685] i40e 0000:c2:00.1: VSI seid 392 Tx ring 128 disable timeout
Oct 27 16:40:23 C2-82-172 kernel: [3806362.061900] bond0: (slave ens238f1): link status definitely down, disabling slave
Oct 27 16:40:23 C2-82-172 kernel: [3806362.061911] bond0: now running without any active interface!
Oct 27 16:40:23 C2-82-172 kernel: [3806362.062180] bond0: (slave ens238f0): link status definitely down, disabling slave
Oct 27 16:40:26 C2-82-172 kernel: [3806365.405761] bond0: (slave ens238f1): link status definitely up, 10000 Mbps full duplex
Oct 27 16:40:26 C2-82-172 kernel: [3806365.405771] bond0: active interface up!
Oct 27 16:40:26 C2-82-172 kernel: [3806365.405894] bond0: (slave ens238f0): link status definitely up, 10000 Mbps full duplex
Oct 27 16:40:31 C2-82-172 kernel: [3806370.737324] i40e 0000:c2:00.0 ens238f0: tx_timeout: VSI_seid: 390, Q 47, NTC: 0x0, HWB: 0x0, NTU: 0x1, TAIL: 0x1, INT: 0x1
Oct 27 16:40:31 C2-82-172 kernel: [3806370.737349] i40e 0000:c2:00.0 ens238f0: tx_timeout recovery level 2, hung_queue 47
Oct 27 16:40:31 C2-82-172 kernel: [3806370.737464] i40e 0000:c2:00.1 ens238f1: tx_timeout: VSI_seid: 391, Q 22, NTC: 0x0, HWB: 0x0, NTU: 0x2, TAIL: 0x2, INT: 0x1
Oct 27 16:40:31 C2-82-172 kernel: [3806370.737472] i40e 0000:c2:00.1 ens238f1: tx_timeout recovery level 3, hung_queue 22
Oct 27 16:40:31 C2-82-172 kernel: [3806370.738263] i40e 0000:c2:00.1: VSI seid 391 Tx ring 0 disable timeout
Oct 27 16:40:32 C2-82-172 kernel: [3806371.006836] i40e 0000:c2:00.1: VSI seid 392 Tx ring 128 disable timeout
Oct 27 16:40:32 C2-82-172 kernel: [3806371.058110] i40e 0000:c2:00.0: VSI seid 390 Tx ring 0 disable timeout
Oct 27 16:40:32 C2-82-172 kernel: [3806371.235049] i40e 0000:c2:00.0: VSI seid 393 Tx ring 128 disable timeout
Oct 27 16:40:32 C2-82-172 kernel: [3806371.301567] bond0: (slave ens238f1): link status definitely down, disabling slave
Oct 27 16:40:32 C2-82-172 kernel: [3806371.301580] bond0: now running without any active interface!
Oct 27 16:40:32 C2-82-172 kernel: [3806371.301790] bond0: (slave ens238f0): link status definitely down, disabling slave
Oct 27 16:40:38 C2-82-172 kernel: [3806377.827814] i40e 0000:c2:00.0 ens238f0: NIC Link is Down
Oct 27 16:40:39 C2-82-172 kernel: [3806378.302982] i40e 0000:c2:00.1 ens238f1: NIC Link is Down
Oct 27 16:40:39 C2-82-172 kernel: [3806378.601264] i40e 0000:c2:00.0 ens238f0: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Oct 27 16:40:39 C2-82-172 kernel: [3806378.611104] i40e 0000:c2:00.1 ens238f1: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Oct 27 16:40:39 C2-82-172 kernel: [3806378.641449] bond0: (slave ens238f1): link status definitely up, 10000 Mbps full duplex
Oct 27 16:40:39 C2-82-172 kernel: [3806378.641464] bond0: active interface up!
Oct 27 16:40:39 C2-82-172 kernel: [3806378.641590] bond0: (slave ens238f0): link status definitely up, 10000 Mbps full duplex
Oct 27 16:40:44 C2-82-172 kernel: [3806383.793205] i40e 0000:c2:00.1 ens238f1: tx_timeout: VSI_seid: 391, Q 47, NTC: 0x0, HWB: 0x0, NTU: 0x1, TAIL: 0x1, INT: 0x1
Oct 27 16:40:44 C2-82-172 kernel: [3806383.793229] i40e 0000:c2:00.1 ens238f1: tx_timeout recovery level 4, hung_queue 47
Oct 27 16:40:44 C2-82-172 kernel: [3806383.793231] i40e 0000:c2:00.1 ens238f1: tx_timeout recovery unsuccessful
Oct 27 16:40:44 C2-82-172 kernel: [3806383.793316] i40e 0000:c2:00.0 ens238f0: tx_timeout: VSI_seid: 390, Q 47, NTC: 0x0, HWB: 0x0, NTU: 0x1, TAIL: 0x1, INT: 0x1
Oct 27 16:40:44 C2-82-172 kernel: [3806383.793324] i40e 0000:c2:00.0 ens238f0: tx_timeout recovery level 3, hung_queue 47
Oct 27 16:40:44 C2-82-172 kernel: [3806383.794875] i40e 0000:c2:00.1: VSI seid 391 Tx ring 0 disable timeout
Oct 27 16:40:45 C2-82-172 kernel: [3806383.984483] i40e 0000:c2:00.1: VSI seid 393 Tx ring 128 disable timeout
Oct 27 16:40:45 C2-82-172 kernel: [3806384.035239] i40e 0000:c2:00.0: VSI seid 390 Tx ring 0 disable timeout
Oct 27 16:40:45 C2-82-172 kernel: [3806384.279796] i40e 0000:c2:00.0: VSI seid 392 Tx ring 128 disable timeout
Oct 27 16:40:45 C2-82-172 kernel: [3806384.341668] bond0: (slave ens238f1): link status definitely down, disabling slave
Oct 27 16:40:45 C2-82-172 kernel: [3806384.341679] bond0: now running without any active interface!
Oct 27 16:40:45 C2-82-172 kernel: [3806384.341945] bond0: (slave ens238f0): link status definitely down, disabling slave
Oct 27 16:40:48 C2-82-172 kernel: [3806387.019142] i40e 0000:c2:00.0 ens238f0: NIC Link is Down
Oct 27 16:40:48 C2-82-172 kernel: [3806387.442791] i40e 0000:c2:00.1 ens238f1: NIC Link is Down
Oct 27 16:40:48 C2-82-172 kernel: [3806387.737179] i40e 0000:c2:00.0 ens238f0: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Oct 27 16:40:48 C2-82-172 kernel: [3806387.747166] i40e 0000:c2:00.1 ens238f1: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Oct 27 16:40:48 C2-82-172 kernel: [3806387.773306] bond0: (slave ens238f1): link status definitely up, 10000 Mbps full duplex
Oct 27 16:40:48 C2-82-172 kernel: [3806387.773315] bond0: active interface up!
Oct 27 16:40:48 C2-82-172 kernel: [3806387.773438] bond0: (slave ens238f0): link status definitely up, 10000 Mbps full duplex
Oct 27 16:40:58 C2-82-172 kernel: [3806397.873092] i40e 0000:c2:00.0 ens238f0: tx_timeout: VSI_seid: 390, Q 47, NTC: 0x0, HWB: 0x0, NTU: 0x1, TAIL: 0x1, INT: 0x1
Oct 27 16:40:58 C2-82-172 kernel: [3806397.873096] i40e 0000:c2:00.0 ens238f0: tx_timeout recovery level 4, hung_queue 47
Oct 27 16:40:58 C2-82-172 kernel: [3806397.873097] i40e 0000:c2:00.0 ens238f0: tx_timeout recovery unsuccessful
Oct 27 16:40:58 C2-82-172 kernel: [3806397.873181] i40e 0000:c2:00.1 ens238f1: tx_timeout: VSI_seid: 391, Q 47, NTC: 0x0, HWB: 0x0, NTU: 0x1, TAIL: 0x1, INT: 0x1
Oct 27 16:40:58 C2-82-172 kernel: [3806397.873183] i40e 0000:c2:00.1 ens238f1: tx_timeout recovery level 5, hung_queue 47
Oct 27 16:40:58 C2-82-172 kernel: [3806397.873184] i40e 0000:c2:00.1 ens238f1: tx_timeout recovery unsuccessful
|
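[Editor's note] For readers triaging similar reports, the sketch below is a minimal, hypothetical Python 3 helper for summarizing a dump like the one above; it assumes the log has been saved to a plain-text file (the name i40e.log is chosen only for illustration), and the regexes are derived solely from the line formats visible in the message, not from any official i40e or IOMMU tooling.

    #!/usr/bin/env python3
    """Summarize AMD-Vi IO_PAGE_FAULT events and i40e tx_timeout recoveries
    from a saved kernel log (assumed format matches the dump above)."""
    import re
    import sys
    from collections import Counter

    # Matches e.g. "... [IO_PAGE_FAULT domain=0x0067 address=0x8 flags=0x0000]"
    PAGE_FAULT = re.compile(r"IO_PAGE_FAULT .*?address=(0x[0-9a-fA-F]+)")
    # Matches e.g. "i40e 0000:c2:00.1 ens238f1: tx_timeout recovery level 1, hung_queue 86"
    TX_TIMEOUT = re.compile(r"(\S+): tx_timeout recovery level (\d+), hung_queue (\d+)")

    def summarize(path: str) -> None:
        faults = []            # DMA addresses reported by the IOMMU
        timeouts = Counter()   # (interface, hung queue) -> number of recovery attempts
        with open(path, encoding="utf-8", errors="replace") as fh:
            for line in fh:
                m = PAGE_FAULT.search(line)
                if m:
                    faults.append(int(m.group(1), 16))
                m = TX_TIMEOUT.search(line)
                if m:
                    timeouts[(m.group(1), m.group(3))] += 1
        if faults:
            print(f"{len(faults)} IO_PAGE_FAULT events, "
                  f"addresses 0x{min(faults):x}..0x{max(faults):x}")
        else:
            print("no IO_PAGE_FAULT events found")
        for (ifname, queue), count in timeouts.most_common():
            print(f"{ifname}: queue {queue} hit tx_timeout recovery {count} time(s)")

    if __name__ == "__main__":
        summarize(sys.argv[1] if len(sys.argv) > 1 else "i40e.log")

Run as "python3 summarize_log.py i40e.log" (script name hypothetical); the output makes it easier to see that the page faults all hit low DMA addresses on device c2:00.1 and that both bonded ports cycle through escalating tx_timeout recovery levels.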
From: adelio A. <aa...@ac...> - 2023-10-30 15:01:34
|