On 11/4/07, Jon Nelson <jnelson-aoe@jamponi.net> wrote:

I've been playing with AoE for quite some time, and had to step away from it to work on other things.
I recently had a bit of time to play some more, and ran into a really weird issue that I hope somebody can help me figure out.

I have three machines:

A: AMD x86-64, dual core, 3600+
B: Intel PIII 900
C: AMD Athlon XP 2600+ (32bit)

'A' has dual MCP55 (NVidia) gig-e
'B' has Intel EEPro/100
'C' has Intel EEPro/100 (SiS 900 built-in, but not used)

A and C each talk to B without trouble, but not to each other.

The problem:
C is never able to get more than 500 KB/s whether it is the client or the server.

I'm using vblade-14 and aoe6-53.
I've tried compiling a 32-bit vblade for 'A' but that made no difference.
Whether it is the client or the server, 'A' reports lots of frame errors.
BOTH machines, when acting as client, show lots of retransmits in /dev/etherd/err.
A tcpdump shows weird behavior: after a series of packets, the client won't send a SINGLE PACKET for 0.3 to 0.6 seconds.
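
One way to quantify those pauses is to scan a saved text dump for large jumps between consecutive timestamps. The following is only a rough Python sketch: it assumes each packet line starts with tcpdump's default HH:MM:SS.ffffff timestamp, the function name find_gaps and the 0.3 second threshold are made up for illustration, and the sample lines are synthetic placeholders, not real capture output.

```python
from datetime import datetime

def find_gaps(lines, threshold=0.3):
    """Return (prev_time, cur_time, gap_seconds) for every pause >= threshold.

    Assumes each packet line begins with an HH:MM:SS.ffffff timestamp.
    Lines that do not start with such a timestamp are skipped.
    A capture that crosses midnight is not handled.
    """
    gaps = []
    prev = None
    for line in lines:
        parts = line.split(None, 1)
        if not parts:
            continue
        try:
            ts = datetime.strptime(parts[0], "%H:%M:%S.%f")
        except ValueError:
            continue  # continuation line or other non-packet output
        if prev is not None:
            gap = (ts - prev).total_seconds()
            if gap >= threshold:
                gaps.append((prev.time(), ts.time(), gap))
        prev = ts
    return gaps

if __name__ == "__main__":
    # Synthetic placeholder lines, NOT taken from a real capture.
    sample = [
        "10:00:00.000000 packet-1",
        "10:00:00.100000 packet-2",
        "10:00:00.600000 packet-3",
    ]
    for before, after, gap in find_gaps(sample):
        print(f"pause of {gap:.3f}s between {before} and {after}")
```

Feeding it the lines of a saved dump (for example from a file opened with open()) would list every stall, which makes it easier to see whether the pauses are regular or tied to particular packets.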

Performance between A and B, and between B and C, is always reasonable.
What is going on here? Why is performance so bad from A->C or C->A ?
I've tried different NICs in C:

SiS 900 built-in
Intel EEPro/100
VIA Velocity (2 different ones, gig-e)

I've tried two different network devices (a 100 Mbit hub and a gig-e switch).
The operating system is openSUSE 10.2 on C and openSUSE 10.3 on A and B (although the exact same problem occurred with openSUSE 10.2 on all three machines).


My brain is not functioning correctly - I tested with TCP and also got really HORRIBLE performance (with noapic, nolapic). Here is a sample from an strace of a program that sends 100 MB of data from C to A:


10:07:34.378454 send(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 262144, 0) = 262144 <2.292836>
10:07:36.672099 send(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 262144, 0) = 262144 <3.119648>
10:07:39.792324 send(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 262144, 0 <unfinished ...>

5.4 seconds to send 512K. Does anybody have any ideas?
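
For reference, the arithmetic behind that figure can be reproduced with a small Python sketch. It assumes the strace line layout shown above (return value followed by the elapsed time in angle brackets), and the names SEND_RE and summarize are invented for this example. The zero-filled buffer is abbreviated in the sample strings.

```python
import re

# Matches a completed send() call: captures the byte count returned
# and the elapsed time that strace prints in <...>.
SEND_RE = re.compile(r"send\(.*\)\s*=\s*(\d+)\s*<\s*([\d.]+)>")

def summarize(lines):
    """Return (total_bytes, total_seconds) over all completed send() calls."""
    total_bytes = 0
    total_secs = 0.0
    for line in lines:
        m = SEND_RE.search(line)
        if m:  # unfinished calls have no return value and are skipped
            total_bytes += int(m.group(1))
            total_secs += float(m.group(2))
    return total_bytes, total_secs

if __name__ == "__main__":
    # The three calls from the strace above, with the buffer shortened.
    sample = [
        r'10:07:34.378454 send(3, "\0\0\0"..., 262144, 0) = 262144 <2.292836>',
        r'10:07:36.672099 send(3, "\0\0\0"..., 262144, 0) = 262144 <3.119648>',
        r'10:07:39.792324 send(3, "\0\0\0"..., 262144, 0 <unfinished ...>',
    ]
    nbytes, secs = summarize(sample)
    print(f"{nbytes} bytes in {secs:.3f}s = {nbytes / secs / 1024:.1f} KiB/s")
```

The two completed calls add up to 524288 bytes in about 5.41 seconds, which is a little under 100 KiB/s.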



--
Jon