#10 Corrupted TX traffic from 82599

ixgbe (40)

1) Two hosts connected via ixgbe NICs
HostA X520-LR1
HostB 10GbE XF LR
2) Send traffic from HostA to HostB (~9.7 Gbps)
3) Run on HostB
ifconfig ethX down
ifconfig ethX up
ifconfig ethX down
ifconfig ethX up
...repeat many times until
4) Corrupted traffic appears on HostB RX, with rx_errors increasing

In every corrupted packet, 16 bytes are added at the head and 16 bytes are missing from the tail.
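
One way to check a captured pair for this pattern is sketched below. It assumes each packet has been saved as a raw file and that the corrupted copy has the same length as the original. The function name `is_shifted_16` and the file names in the usage line are made up for illustration and are not part of any tool mentioned here.

```shell
# Succeed when BAD looks like ORIG shifted right by 16 bytes:
# both files have the same length, and BAD from byte 17 onward
# equals ORIG with its last 16 bytes dropped.
is_shifted_16() {
    orig=$1
    bad=$2
    n=$(( $(wc -c < "$orig") ))
    m=$(( $(wc -c < "$bad") ))
    # Same length, and long enough to contain a 16-byte shift at all.
    [ "$n" -eq "$m" ] && [ "$n" -gt 16 ] || return 1
    tmp=$(mktemp) || return 1
    # Skip the 16 extra leading bytes of the corrupted packet.
    tail -c +17 "$bad" > "$tmp"
    # Compare against the original minus its 16 trailing bytes.
    if head -c "$((n - 16))" "$orig" | cmp -s - "$tmp"; then
        rc=0
    else
        rc=1
    fi
    rm -f "$tmp"
    return "$rc"
}
```

Usage: `is_shifted_16 original.bin corrupted.bin && echo "16-byte shift"`. The first 16 bytes of the corrupted packet are ignored on purpose, since nothing here says what they contain.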

Example of corrupted and original packet attached


1 2 > >> (Page 1 of 2)
  • vkozhevnikov

    vkozhevnikov - 2011-02-04
  • vkozhevnikov

    vkozhevnikov - 2011-02-04
  • vkozhevnikov

    vkozhevnikov - 2011-02-04

    Once the NIC is in this strange state, nothing helps (driver reload, reduced or no traffic, physical reconnect). Normal operation is restored only after a host reboot.

  • Emil Tantilov

    Emil Tantilov - 2011-02-04

    Could you provide detailed information about your setup:
    1. lspci -vvv
    2. kernel version (and config if possible)
    3. driver version
    4. ethtool info (ethtool -i ethX, ethtool -e ethX)
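
A small helper along these lines could gather those four items in one pass. It is only a sketch: `run_labeled` and `collect_nic_info` are hypothetical names, ethX stands for the actual interface, and the kernel config is not collected because its location varies by distribution.

```shell
# Run one command and label its output; note it when the tool is missing.
run_labeled() {
    printf '== %s ==\n' "$*"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || printf '(exit status %s)\n' "$?"
    else
        printf '(%s not installed)\n' "$1"
    fi
}

# Collect the information requested above for one interface.
collect_nic_info() {
    iface=$1
    run_labeled lspci -vvv
    run_labeled uname -a              # kernel version
    run_labeled ethtool -i "$iface"   # driver name and version
    run_labeled ethtool -e "$iface"   # EEPROM dump
}
```

Usage: `collect_nic_info ethX > nic-info.txt`, run as root so that `lspci -vvv` and `ethtool -e` can read everything.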

  • Don Skidmore

    Don Skidmore - 2011-02-04

    Thanks for bringing this to our attention and all the debugging info. I do have a couple of additional questions:

    • Which system needs to be power cycled to recover, Host A or B?
    • Was the test bi-directional traffic? If so, do you see the same packet corruption in both directions?
    • Any rough timeline for how long it took for the failure to occur?
  • vkozhevnikov

    vkozhevnikov - 2011-02-07

    lspci -vvvnn

  • vkozhevnikov

    vkozhevnikov - 2011-02-07

    uname -a

  • vkozhevnikov

    vkozhevnikov - 2011-02-07

    From /boot, but I'm not sure whether it corresponds to the kernel or not.

  • vkozhevnikov

    vkozhevnikov - 2011-02-07

    Host A (with X520) needs to be rebooted to recover. No power cycle is required.

    Only X520 TX traffic is corrupted.
    X520 RX and 82598 NIC RX/TX traffic is fine.

    The failure occurs after running "ifconfig ... " on HostB exactly 16 times.

  • vkozhevnikov

    vkozhevnikov - 2011-02-09

    Any news?

  • Don Skidmore

    Don Skidmore - 2011-02-10

    I haven't been able to recreate your failure. I'm going to try again tomorrow.

    How many cores do the systems you're running on have?
    You are using a fairly old driver (2.1.4). Have you tried using our latest driver (3.2.9)?

  • vkozhevnikov

    vkozhevnikov - 2011-02-10

    HP ProLiant DL380 G6 (8 cores)
    HP ProLiant DL585 G7 (48 cores)

  • vkozhevnikov

    vkozhevnikov - 2011-02-25

    The failure reproduces with the latest (3.2.9) driver too.

  • Don Skidmore

    Don Skidmore - 2011-02-26

    Thanks for trying this out with the latest driver.

    I have yet to be able to recreate this failure on my systems. They, however, don't have nearly as many cores as yours. I'm going to try to run a test over the weekend with the driver you were able to get to fail. If that doesn't work, I'll find systems in the lab that more closely match yours and borrow them for a test.

  • vkozhevnikov

    vkozhevnikov - 2011-02-28

    I found a way to avoid the failure.

    In the 2.1.4 driver (I like this driver version :-)))) :

        if (some_tx_pending) {
            /* We've lost link, so the controller stops DMA,
             * but we've got queued Tx work that's never going
             * to get done, so reset controller to flush Tx.
             * (Do the reset outside of interrupt context).
             */
            // schedule_work(&adapter->reset_task); // COMMENTED BY ME
        }

    This change may of course affect something else, but it solves my issue.

    Maybe it is a bad idea to disable DMA by resetting the DMATXCTL.TE bit before all DMA operations have completed? The datasheet for the 82599 doesn't contain relevant information about disabling TX...

  • Don Skidmore

    Don Skidmore - 2011-03-01

    This is great news, I’m glad you found a way to avoid the failure.

    I’m still really interested in finding out the root cause of this issue. Sadly, I still haven’t had any luck recreating the failure in house, but today I asked our validation folks to attempt a recreation on a system more closely resembling yours. I also have two additional questions.

    1 – What are you using to create your test traffic (netperf, iperf, …)? Maybe I’m not creating the correct traffic flow pattern to hit this as regularly as you are?

    2 - Since you’re able to recreate this failure consistently, would you mind testing the latest driver with a modification very similar to the one you made in the 2.1.4 driver? Basically, just don’t set the bit requesting a reset. So in ixgbe_watchdog_flush_tx:

                if (some_tx_pending) {
                        /* We've lost link, so the controller stops DMA,
                         * but we've got queued Tx work that's never going
                         * to get done, so reset controller to flush Tx.
                         * (Do the reset outside of interrupt context).
                         */
                        // adapter->flags2 |= IXGBE_FLAG2_RESET_REQUESTED;
                }

    I’m interested in whether this also corrects the problem in the 3.2.9 driver. Between 2.1.4 and 3.2.9 we combined all the tasklets into one “master” tasklet. The idea is to help avoid corner-case race conditions by forcing serial execution. Knowing the result would help further isolate where the race you’re seeing could be.

  • vkozhevnikov

    vkozhevnikov - 2011-03-01

    1) We are using our own proprietary player for dumps. It makes direct calls to the card driver, so we can easily reach ~10G using only a few cores.

    Today I reproduced the issue with tcpreplay (http://tcpreplay.synfin.net/wiki/tcpreplay)

    TX side:

    HP DL585 G7
    Fedora 10 + kernel from kernel.org
    driver 3.2.9

    insmod ixgbe.ko RSS=2,2,2,2,2,2 MQ=1,1,1,1,1,1 FdirMode=0,0,0,0,0,0 RxBufferMode=0,0,0,0,0,0
    ifconfig eth5 mtu 1700 promisc -arp up

    Run 3 copies of
    tcpreplay -i eth5 -t -l 0 /dumps/my.dump

    my.dump is a real dump from an ISP

    RX side:

    HP DL580 G6
    10GbE XF LR
    driver 2.1.4

    ifconfig eth5 mtu 1700 promisc -arp up

    Receiving normal traffic ~3Gbps

    while true; do ifconfig eth5 down; ifconfig eth5 up; done

    Run and observe rx_errors counter
    watch -n 0 "ethtool -S eth5 | grep rx"

    Counter rx_errors usually starts growing after 1-3 minutes.

    2) The proposed modification corrects the problem in 3.2.9 too.
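
The rx_errors check from the RX side can also be scripted rather than watched by eye. The sketch below assumes the `name: value` line layout that `ethtool -S` prints; `stat_value` and `counter_grew` are hypothetical helper names.

```shell
# Print the value of one counter from "name: value" lines on stdin.
# The name must match exactly, so rx_errors does not pick up rx_missed_errors.
stat_value() {
    awk -v name="$1" -F': *' '
        { key = $1; sub(/^[ \t]+/, "", key) }
        key == name && !seen { print $2; seen = 1 }
    '
}

# Succeed when the second sample is larger than the first.
counter_grew() {
    [ "$2" -gt "$1" ]
}
```

Usage, with eth5 as in the setup above: take `before=$(ethtool -S eth5 | stat_value rx_errors)`, wait a while, take `after` the same way, then `counter_grew "$before" "$after" && echo "rx_errors is growing"`.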

  • Don Skidmore

    Don Skidmore - 2011-03-02

    I was unable to reply to your direct email at users.sourceforge.net; SourceForge's mail server responded with unknown user.

    Thanks for the offer to ssh into the system, but I'm not sure how helpful that would end up being. I'm pushing for recreation in house so that I can better isolate what events are leading to this failure. This may end up requiring a PCIe bus trace so I can find the exact sequence/timing of the reset race you seem to be hitting so regularly.

    That said, I very much appreciate all the detailed recreation information you have been providing. Once I can force the hardware into this condition every 1-3 minutes like you have been able to, it should just be a matter of time before we know the root cause. Currently I suspect this may be related to (or even the same as) an issue that we occasionally see with very heavy stress/reset testing. However, that scenario takes days to recreate (if at all). I'm hopeful that your setup will help lead us to a solution to both of these issues much faster.

    If you want to send me a direct email (outside of SourceForge), use donald.c.skidmore@intel.com


  • vkozhevnikov

    vkozhevnikov - 2011-03-17

    Any news? Have you reproduced the issue?

  • Don Skidmore

    Don Skidmore - 2011-03-18

    I've made some progress, but I keep getting pulled off onto other things, so it's been slow.

    I can now consistently recreate the failure by doing the following:

    On Peer system:
    - Just run iperf client for UDP

    On SUT:
    - run several parallel iperf sessions (iperf -c <peer IP> -u -t 3600 -b 1G -P 10)
    - run simple reset script (while true; do ifconfig ethX down; ifconfig ethX up; sleep 5; done)

    It normally fails within anywhere from a few seconds to as long as an hour. It seems I need the SUT Tx queues full and a fast down->up transition.

    The symptoms of the failure vary slightly in how the Tx packets get mangled, but at some point things go south.

    Right now I'm attempting to isolate the timings. I have yet to recreate this with a PCIe bus analyzer attached, but I haven't attempted it with the new test setup above either.
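
For reference, a bounded version of the reset loop might look like the following. This is a sketch, not the script used in the lab: `toggle_link` and the `IFCMD` override are invented here so the loop can be dry-run with `IFCMD=echo` before pointing it at a real interface.

```shell
# Bring an interface down and up COUNT times, pausing PAUSE seconds
# between cycles (default 5, as in the reset script above).
# IFCMD is a hypothetical override hook; it defaults to ifconfig.
toggle_link() {
    iface=$1
    count=$2
    pause=${3:-5}
    i=0
    while [ "$i" -lt "$count" ]; do
        ${IFCMD:-ifconfig} "$iface" down || return 1
        ${IFCMD:-ifconfig} "$iface" up || return 1
        i=$((i + 1))
        # No need to wait after the final cycle.
        if [ "$i" -lt "$count" ]; then
            sleep "$pause"
        fi
    done
    return 0
}
```

Usage: `toggle_link ethX 100 5` runs 100 down/up cycles five seconds apart. A fixed count makes it easier to report how many transitions it took before the failure appeared.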

  • Alexander Duyck

    Alexander Duyck - 2011-03-29

    Please try applying the attached patch to one of the broken drivers.

    The patch addresses an issue that could cause problems similar to what you have described. It resolves the issue by issuing a pair of resets to the hardware instead of just one in order to flush out pending PCIe transactions that could cause data corruption.

  • vkozhevnikov

    vkozhevnikov - 2011-05-25

    Patch didn't help.
