#383 rx_missed_errors on ixgbe with several RSS queues

closed
standalone_driver
1
2015-04-06
2013-11-07
Tomaz Buh
No

During performance tests of my dual-port 10GbE NIC based on the Intel® 82599ES 10GbE controller, which uses the ixgbe driver, I experience significant packet loss when using several RSS queues mapped to multiple cores.
My configuration:
Proc: 2x Intel Xeon X5660 2.8 GHz with HT enabled (6 cores each) - 12 cores altogether.
Mem: 12 GB DDR3 RAM (1333MHz)
NIC: 2x dual port Intel® 82599ES 10GbE NIC

Steps to reproduce the problem:

  1. Compile the newest version of the ixgbe driver (3.18.7).
  2. Load the module with default values (only LRO and GSO disabled). In my case this creates 24 Rx/Tx queues, because HT doubles the number of logical processors.
  3. Map RSS queues uniformly among processors with irqbalance or via the set_irq_affinity script, so that each RSS queue of a single NIC interface is mapped to a different core.
  4. Set up a Linux bridge by loading the bridge module and creating a br0 interface with the brctl tool (bridge-utils); see the sketch after this list.
  5. When testing L2 forwarding, packets get lost.
  6. The rx_missed_errors statistical counter is also increasing.
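
A minimal sketch of the setup in steps 2-4, assuming the two bridged ports are named eth0 and eth1 (placeholder names) and that set_irq_affinity is the script shipped in the driver's scripts directory:

# load the driver and disable LRO/GSO on both ports
modprobe ixgbe
ethtool -K eth0 lro off gso off
ethtool -K eth1 lro off gso off
# spread the per-queue interrupts across the cores
./set_irq_affinity eth0 eth1
# build the L2 bridge used for the forwarding test
modprobe bridge
brctl addbr br0
brctl addif br0 eth0
brctl addif br0 eth1
ip link set eth0 up
ip link set eth1 up
ip link set br0 up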

There is also no difference if HyperThreading is disabled; in that case only 12 RSS queues are created by default.

Please note that the problem can be avoided by using only 4 RSS queues:
modprobe ixgbe RSS=4,4,4,4

Any more queues than that causes the mentioned issue. I also tried the older version 3.13.10, where the limit was 6 RSS queues.
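
To confirm how many queues are actually active after a driver load, counting the per-queue interrupt vectors works (assuming the eth0-TxRx-N vector naming used by ixgbe MSI-X interrupts):
grep -c eth0-TxRx /proc/interrupts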

Could you provide me with a workaround or solution? Please do not hesitate to ask me for more information or tests.

Tomaz Buh

Related

Bugs: #383

Discussion

  • Emil Tantilov

    Emil Tantilov - 2013-11-07

    rx_missed_errors usually indicates that the driver is running out of buffer space. It can also be that you are stressing the interface beyond what the bus can take.

    Could you upload the output of ethtool -S on the interface which shows the errors after disabling flow control?

     
  • Tomaz Buh

    Tomaz Buh - 2013-11-08

    Thanks for a quick response.
    I'm not sure how to disable the flow control on the NIC.
    Anyway, I'm attaching the statistics file. I'm running 100 kfps and packet loss occurs.
    With a lower number of RSS queues (e.g. 4) I can get up to 1.6 Mfps while performing the bridging functionality.

     
  • Emil Tantilov

    Emil Tantilov - 2013-11-08

    to disable flow control:
    ethtool -A ethX tx off rx off autoneg off

    To check the state:
    ethtool -a ethX

    Also from your stats it seems like you have Flow Director enabled, but you get no hits, so you may be better off just using RSS. You can disable Flow Director:
    ethtool -K ethX ntuple on

    or by setting AtrSampleRate to 0 on driver load.

    Based on the above please post ethtool -S stats with flow control off and also see if you get better results with Flow Director disabled in your test.

     
  • Tomaz Buh

    Tomaz Buh - 2013-11-11

    I tested with disabled flow control. Packets are still lost.
    ethtool -S output is in file FlowControl_OFF12RSS.txt

    I disabled Flow Director with the command: ethtool -K ethX ntuple on
    Packets are still lost.
    ethtool -S output is in file FlowDirector_OFF12RSS.txt

    I then used only 4 RSS queues with default settings (Flow Control and Flow Director on), and no packets are lost:
    ethtool -S output is in file DefaultRSS.txt

    Please note that I tested the bridging functionality with 1 Mfps traffic on the receive side, which is then distributed among the other three NICs. The traffic has several TCP and UDP flows, which makes the RSS functionality effective. However, with these settings I can only use 4 cores (I have a 12-core machine with HyperThreading disabled).

    Best regards,
    Tomaz

     
  • Emil Tantilov

    Emil Tantilov - 2013-11-12

    The reason for disabling Flow Control is not to resolve the missed packets, but to get a better idea of what is causing them. Flow Control can mask some of the rx_missed_errors and their cause - the rx_missed_errors will trickle down to either rx_no_buffer_count or rx_no_dma_resource. But since in your case those remain at 0, the only reasonable explanation is that the interface cannot keep up with the incoming traffic.

    The FlowDirector_OFF12RSS.txt still shows lots of fdir_miss, so it doesn't seem that Flow Director was disabled in this case. Flow Director will not work in a bridging scenario (you can also see that from your stats), which is why I suggested you try disabling it.

    If you get good numbers with 4 queues - just use that in your setup. Having more queues is not always better since this will cause additional overhead in terms of allocated memory and CPU. Also if your system has more than 1 NUMA node (numactl --hardware) you may be seeing the effects of cross-numa traffic (especially with multiple NICs).

    Another thing you can try is increase the number of descriptors (ethtool -G ethX rx Y tx Z, max is 4096) - again this will increase the memory usage of the driver.

    You mentioned that you have TCP and UDP flows - what you can do is run only with TCP and only with UDP; chances are that you will see more rx_missed_errors with UDP. By default RSS is not enabled for UDP, because with RSS some UDP packets may arrive out of order, which can lead to issues for some applications. There is a section regarding UDP RSS in the README included with the driver. If you have not done so already, enable RSS for UDP:
    ethtool -N ethX rx-flow-hash udp4 sdfn
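
    For example, assuming the interface is eth0 and using the 4096-descriptor maximum mentioned above, a combined sketch (my illustration, not part of the original advice) would be:
    ethtool -g eth0                          # show current and maximum ring sizes
    ethtool -G eth0 rx 4096 tx 4096          # raise both descriptor rings to the maximum
    ethtool -n eth0 rx-flow-hash udp4        # check which fields UDP4 traffic is hashed on
    ethtool -N eth0 rx-flow-hash udp4 sdfn   # hash UDP4 on src/dst IP and src/dst ports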

     
  • Tomaz Buh

    Tomaz Buh - 2013-11-12

    I performed all the tests you suggested, and it seems to me that the problem is indeed with multiple NUMA nodes. Because this is a dual-processor machine, it has 2 NUMA nodes, which seems to be what causes the packet drops.

    I made the following tests:

    I disabled Flow Director by setting AtrSampleRate to 0 on driver load; rx_missed_errors still occur, while fdir_miss stays at zero. The statistics file is attached.
    Also using higher number of descriptors (4096) doesn't seem to help.
    Using only TCP traffic still produces the same results.

    If I use more than 6 RSS queues, the rx_missed_errors occur. Because my system has 2x6 cores, more than 6 RSS queues evidently causes cross-NUMA traffic, which then consequently drops packets.

    Is there any way that packets wouldn't be dropped when crossing NUMA nodes? Of course it makes sense to use fewer RSS queues, but I'm doing research in network handling optimization, so I'm searching for the performance peak of bridging (in terms of throughput and latency) with the Linux bridge.
    My results show that currently each core in my system can handle ca. 700 kfps. By distributing traffic over several cores with RSS queues I can achieve N x 700 kfps, where N is the number of engaged cores.
    So currently I can only use the cores of a single NUMA node and not all the available cores. Is it possible to modify the driver not to cause packet drops when packets cross NUMA nodes? (As I understand it, memory access between NUMA nodes is significantly slower, so even packets that weren't dropped would still introduce significant latency.)
    Maybe the driver could ensure that packets received on a specific NUMA node are also handled and transmitted on that node, so that cross-NUMA traffic would be avoided.
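
    For reference, the node-local pinning I have in mind could be approximated from user space. A rough sketch, assuming the port is eth0, it sits on NUMA node 0, node 0 owns cores 0-5 (all placeholder values), and the IRQ vectors follow the eth0-TxRx-N naming:
    cat /sys/class/net/eth0/device/numa_node   # NUMA node the NIC is attached to (e.g. 0)
    numactl --hardware                         # which CPUs belong to that node
    # as root, pin every eth0-TxRx-N interrupt to cores 0-5 only (CPU mask 0x3f)
    for irq in $(grep eth0-TxRx /proc/interrupts | awk -F: '{print $1}'); do
        echo 3f > /proc/irq/$irq/smp_affinity
    done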

     
  • whshin

    whshin - 2014-07-10

    Is the problem fixed?
    I have a similar issue.

    Only eth2's rx_missed_errors counter increased.
    The other interfaces are normal.

    ixgbe driver : 3.2.9
    firmware-version: 3.13-0
    cpu cores : 16 (HT enable)
    intel chip : 82599

    If you know why this happens, please help me.

    I am attaching my issue data (ethtool_lspci.txt).

     
    Last edit: whshin 2014-07-10
    • Don Skidmore

      Don Skidmore - 2014-07-10

      This count is incremented for one of two reasons.

      • The HW Receive FIFO doesn’t have room for incoming packets.

      • Not enough bandwidth on the PCIe bus.

      Have you tried to enable pause frames?

      -Don Skidmore donald.c.skidmore@intel.com


  • whshin

    whshin - 2014-07-11

    Does packet size have anything to do with reproducing the problem?

     
  • whshin

    whshin - 2014-07-17

    When the appliance doesn't transmit packets to the next hop or to local hosts, only the rx_missed_errors statistical counter increases.

    traffic flow : packets -> [eth0 -> eth1] -> packets

    eth1 doesn't know the next hop's ARP entry, so the appliance queues packets in a buffer, and eth0's rx_missed_errors counter keeps increasing.
    Is this normal operation?

     
    Last edit: whshin 2014-07-17
  • Tore Anderson

    Tore Anderson - 2015-02-11

    We're experiencing the same issue. The setup is a HP ProLiant DL360G7 server with two Intel Xeon X5670 CPUs (24 threads/cores total) and a dual-port Intel 82599ES 10G NIC, using the bonding driver to create an active/active aggregated 20G uplink to the network. Ubuntu 14.04.1, kernel 3.13.0-45-generic, arch x86_64, ixgbe driver version 3.15.1-k. The uplinks are nowhere near saturated (the traffic level is somewhere between 1 and 2 Gb/s on the aggregated uplink).

    By default, irqbalance will honor the ixgbe driver's affinity_hint setting, which results in the interrupts being spread out across all the available CPU cores. This results in terrible performance. A ping session to a host on the other end of the uplink switch shows that the RTT spikes to very high levels and it's very jittery, as seen in the ping output below. Also there is some packet loss, as evidenced by the "missed" counter in "ip -s -s link show" output increasing.

    $ sudo ping -f -c 100000 svc1-osl2.i.bitbit.net
    PING svc1-osl2.i.bitbit.net (87.238.33.50) 56(84) bytes of data.
    
    --- svc1-osl2.i.bitbit.net ping statistics ---
    100000 packets transmitted, 100000 received, 0% packet loss, time 19099ms
    rtt min/avg/max/mdev = 0.060/0.159/23.788/0.628 ms, pipe 3, ipg/ewma 0.190/2.964 ms
    

    If I gather all the interrupts on a single NUMA node (e.g., using the attached irqbalance policy script and running irqbalance with "--hintpolicy=ignore --policyscript=/path/to/irqbalance-ixgbe-policy"), the RTT/jitter of the ping test returns to acceptable levels and there is no more packet loss:

    $ sudo ping -f -c 100000 svc1-osl2.i.bitbit.net
    PING svc1-osl2.i.bitbit.net (87.238.33.50) 56(84) bytes of data.
    
    --- svc1-osl2.i.bitbit.net ping statistics ---
    100000 packets transmitted, 100000 received, 0% packet loss, time 11833ms
    rtt min/avg/max/mdev = 0.059/0.083/1.117/0.030 ms, ipg/ewma 0.118/0.087 ms
    

    It doesn't seem to matter which NUMA node I choose to gather the interrupts on; as long as they're all on the same node, performance is good.

    Note that it doesn't work to simply run irqbalance with "--hintpolicy=ignore": since the numa_node file in the adapter's sysfs directory contains "-1", irqbalance considers the adapter equidistant from all NUMA nodes and therefore ends up balancing the interrupts across both NUMA nodes, with resulting poor performance.

    Another thing worth noting is that it does not work to balance all interrupts belonging to the first adapter (eth0) to one NUMA node and the second adapter (eth1) to another. This also results in poor performance. I have found no way to utilise both physical CPUs for network traffic without impacting performance, so for now the second CPU will have to sit there idle.
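
    Since the policy script itself is only available as an attachment, here is a minimal sketch of the approach, assuming irqbalance's policy-script interface (the script is called once per IRQ with the sysfs device path and the IRQ number, and prints key=value overrides such as numa_node and hintpolicy on stdout); the attached script may differ in detail:
    #!/bin/sh
    # irqbalance --policyscript sketch: force all ixgbe interrupts onto NUMA node 0
    # $1 = sysfs device path of the IRQ's device, $2 = IRQ number
    driver=$(basename "$(readlink "$1/driver" 2>/dev/null)")
    if [ "$driver" = "ixgbe" ]; then
        echo "numa_node=0"
        echo "hintpolicy=ignore"
    fi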

    http://sourceforge.net/p/e1000/bugs/394/ is related to this one, by the way.

    Tore

     
  • Todd Fujinaka

    Todd Fujinaka - 2015-04-06
    • status: open --> closed
     
  • Todd Fujinaka

    Todd Fujinaka - 2015-04-06

    Closing due to inactivity.

    Tore, I'm hoping you started another bug with your issue.

     
    • Tore Anderson

      Tore Anderson - 2015-04-06

      Um, why would I open another duplicate bug when this one already exists?

       
  • Todd Fujinaka

    Todd Fujinaka - 2015-04-06

    Because we asked you to, and this issue is closed.

    Similar initial symptoms don't always lead to the same cause.

     
  • Tore Anderson

    Tore Anderson - 2015-04-06

    But this issue got closed because you closed it for inactivity, not because it actually got fixed in any way. If nobody cares enough about the issue to fix it, then that is fair enough, but then I see no point whatsoever in opening a duplicate bug report about it (presumably it too will eventually end up being closed due to inactivity). If you on the other hand would like to have an open issue in order to keep track of the problem, then I suggest you simply re-open this one. That's way simpler than having me copy&paste all the information from this one into a new bug report.

    Note that Tomaz confirmed in his update 2013-11-12 that the cause of his problems is the cross-NUMA traffic, so it seems clear that he and I were experiencing the exact same issue.

     
    • Emil Tantilov

      Emil Tantilov - 2015-04-06

      When it comes to performance issues related to NUMA cross node traffic there is no one silver bullet solution that would work for everyone. It depends on what your setup/traffic is and finding the right configuration. There are already multiple other tickets and discussions on e1000-devel (and I'm sure all over the web) regarding NUMA optimizations.

      There is also a tool from Intel that can help when optimizing for NUMA:
      http://www.intel.com/software/pcm

       
  • Todd Fujinaka

    Todd Fujinaka - 2015-04-06

    Issue solved, then. Talk to the kernel people.

     
