From: Jeff Sturm <jeff.sturm@ep...> - 2010-10-25 19:31:40
(Apologies for the duplicate send.)
Background: We have VS21 and SR1621 appliances connected over a pair of dedicated switches to 5 Linux hosts. All Linux hosts use jumbo frames and version 7.5 of the aoe driver. Each host has two interfaces for aoe traffic, plugged into separate switches, as do the VS21 and SR1621.
We regularly see about 0.2% packet loss (judging from the rate of retransmits), and I'm trying to understand why. We've already enabled hardware flow control on all host and switch ports carrying aoe traffic.
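For reference, the loss figure above is simply retransmits divided by frames sent. A minimal Python sketch of that estimate follows; the function name and the counter values are hypothetical, chosen only to reproduce a 0.2% rate, and are not readings from these hosts.

```python
def estimated_loss_percent(retransmits, frames_sent):
    """Estimate packet loss as the percentage of sent frames that were retransmitted."""
    if frames_sent <= 0:
        raise ValueError("frames_sent must be positive")
    return 100.0 * retransmits / frames_sent


# Hypothetical counters: 2 retransmits per 1000 frames sent.
print(f"{estimated_loss_percent(2, 1000):.2f}%")
```

This treats every retransmit as one lost frame, which can overstate the loss if some retransmits were triggered by late rather than dropped responses.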
The aoe driver queries the shelf to see how many outstanding packets it can process, then sets each target's limit to this number. For the VS21, this is 64 * 2, or 128 total. The AoE protocol spec, however, says:
    The maximum number of outstanding messages the server can queue for
    processing. Messages in excess of this value are dropped.
It does not say whether this number applies per interface or per shelf. With multipath, given the way the Linux aoe driver currently operates, the total number of outstanding messages can conceivably exceed what the shelf can handle.
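The arithmetic behind this concern can be sketched in a few lines of Python. The sketch assumes, purely for illustration, that the advertised count (64 for the VS21) is a per-shelf limit while the initiator allows that many outstanding messages on each path; the spec does not confirm either reading, and the function names are hypothetical.

```python
def worst_case_outstanding(buffer_count, paths):
    """Outstanding messages if the advertised count is applied to every path."""
    return buffer_count * paths


def excess_over_shelf_limit(buffer_count, paths):
    """Messages beyond the limit, ASSUMING the advertised count is per shelf."""
    return max(0, worst_case_outstanding(buffer_count, paths) - buffer_count)


# VS21 figures from the message: 64 advertised, 2 interfaces.
print(worst_case_outstanding(64, 2))   # 128
print(excess_over_shelf_limit(64, 2))  # 64
```

If the count is instead per interface, the excess is zero and the driver's current behavior matches what the shelf advertises.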
I'm going to experiment to find out whether there is an optimum setting (by lowering the module parameter). In the meantime, I'm curious what the experts have to say about this setting.