From: Mike L. <jo...@iu...> - 2006-01-11 22:06:47
On Tue, 2006-01-10 at 14:59 -0800, Jesse Brandeburg wrote:
> Sorry for the multiple threads, that and our email server here is
> being rejected by the sourceforge server, so none of these messages are
> showing up at sf.net
>
> On Tue, 10 Jan 2006, Mike Lowe wrote:
>
> > On Tue, 2006-01-10 at 10:28 -0800, Jesse Brandeburg wrote:
> >> On Tue, 10 Jan 2006, Mike Lowe wrote:
> >>> I am having considerable trouble with the tx performance on my e1000
> >>> cards. Using iperf I see significant numbers of out-of-order and
> >>> dropped UDP packets when using an e1000 to transmit. I am using both
> >>> ia32 and ia64 architectures, sles8 and rhel4 distros, and the 6.3.9
> >>> e1000 driver. Any help is much appreciated.
> >>
> >> Hi Mike, can you be a bit more specific on which kernel version you
> >> used with rhel4. Also, please include what e1000 hardware you have
> >> from lspci -vvv and any other system information. Are you running at
> >> 1G speed?
>
> I can host your upload if you need me to (please bzip2 it)
>
> thanks for the detailed info, have you been able to repro this with
> netperf? I ask because I don't really use iperf very much.

I am not able to reproduce this with netperf, but that is only because I
can't get netperf to do any UDP tests. I am, however, able to reproduce a
40 Mbps difference based on the type of card sending, using netperf and
TCP.

> I know it seems like a random question, but does this still occur with
> our drivers <= 6.1.16?

Yes, I have some 6.0.54-k2-NAPI drivers that have the same problem. I
believe that these are the standard ones distributed with rhel.

> The 6.2.15 driver introduced multiple descriptor receives for jumbo
> frames. We've seen a possible kernel issue with IP reassembly due to
> this change.
>
> you can eliminate ip reassembly by running jumbo frame MTU > your
> message size, or, you can run your iperf app with a message size of
> 1500 (equal to the default MTU)
>
> The other thing is: are you running jumbo frames, and why does one of
> your adapters show rx_long_length errors?

Yes, we are running jumbo frames. The rx_long_length errors concern me,
but the amount of loss seems inversely proportional to the datagram size:
I see some of the largest amounts of loss and reordering when I use a 1k
datagram size.

I whipped up a naive little UDP tester in Python that iterates through a
loop, sending a datagram with a sequential ASCII number as the payload (a
sketch of it appears below my signature). I ran tcpdump on the sender
(e1000 6.3.9-NAPI) and on the receiver (tigon3 3.10). The run was for
i = 1000, but tcpdump only showed 875 datagrams on both sides, with no
kernel drops reported. The socket calls did not return errors; the
packets seem to evaporate somewhere in the stack. I am afraid I don't
know enough to tell whether this is expected behavior. I am attaching the
script and the tcpdump files.

If a phone conversation would help this process, I am happy to call you
at your convenience.

> anyway, let me know, and as soon as I get some time (it may be next
> week) i will take a look into this.
>
> Jesse

--
_______________________________
Mike Lowe
Research and Technical Services
Indiana University
http://www.indiana.edu/~rats/
317.274.1352
_______________________________
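
Below is a minimal sketch of the tester for anyone who doesn't want to
pull the attachment. The receiver address, port, and padding size are
placeholders rather than the values from my actual runs; the attached
script is the authoritative version.

#!/usr/bin/env python
# Naive UDP tester: send COUNT datagrams whose payload is the
# sequence number as ASCII, padded out to SIZE bytes.
import socket

RECEIVER = "10.0.0.2"   # placeholder: receiver's address
PORT = 5001             # placeholder: receiver's UDP port
COUNT = 1000            # the run described above was for i = 1000
SIZE = 1024             # placeholder: pad each payload to a 1k datagram

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
for i in range(COUNT):
    payload = str(i).ljust(SIZE)
    # sendto() returning without an error does not guarantee delivery;
    # a UDP datagram can still be dropped anywhere along the path.
    sock.sendto(payload.encode("ascii"), (RECEIVER, PORT))
sock.close()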
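
P.S. In case it helps anyone reproduce this: as far as I know, the usual
way to request a UDP test from netperf is something like the following
(the receiver host and message size are placeholders, and netserver must
already be running on the receiving machine):

netperf -H 10.0.0.2 -t UDP_STREAM -- -m 1024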