That confirms some of my suspicions. In my testing I can see requests for 1024 sectors (512K) of data from the hard drive, which the AoE client has to carve up into individual read/write requests that fit into an AoE packet. At the server, each of these arrives as an individual read/write request against the disk, so if you followed the optimal packet usage for Ethernet for AoE, each jumbo-frame request would carry 17 sectors. The initial 1024-sector request at the client starts on an aligned boundary, so the first AoE request covers two full 4k "sectors" plus one trailing 512-byte sector. That trailing sector causes the next 7 requests to start unaligned and end aligned... so a 1024-sector request from the host OS results in only 1 out of 8 requests starting on an aligned boundary.
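The carve-up arithmetic above can be checked with a short sketch (assuming the host request itself starts on a 4k boundary, and 8 sectors per 4k block):

```python
SECTOR = 512
CHUNK = 17    # sectors per jumbo-frame AoE request
TOTAL = 1024  # sectors in the host request (assumed to start 4k-aligned)

# Starting sector of each AoE request carved out of the host request.
starts = range(0, TOTAL, CHUNK)

# A start is 4k-aligned when it falls on an 8-sector boundary.
aligned = [s for s in starts if s % 8 == 0]

print(len(list(starts)))  # 61 AoE requests
print(len(aligned))       # 8 of them start aligned -- roughly 1 in 8
```

Because 17 mod 8 is 1, each request shifts the next start by one sector within the 4k block, so the starts cycle through all eight offsets and only return to alignment every eighth request.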

Since the AoE client driver is handling disk requests from the host OS, the host OS is going to assume certain things about the disk and try to issue properly aligned requests. I believe I've even seen that if an application writes to sector 1, the host will read (or page) in the 4k chunk starting at sector 0 and then write out that whole 4k chunk with sector 1 modified.
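That read-modify-write behavior can be sketched roughly as follows (a hypothetical simulation with the device as a byte buffer, not the actual host code path):

```python
PAGE = 4096
SECTOR = 512
SPP = PAGE // SECTOR  # sectors per 4k page

def write_sector(dev: bytearray, lba: int, data: bytes) -> None:
    """Write one 512-byte sector via a page-sized read-modify-write."""
    assert len(data) == SECTOR
    page_lba = lba - (lba % SPP)               # round down to the 4k boundary
    start = page_lba * SECTOR
    page = bytearray(dev[start:start + PAGE])  # read in the whole 4k chunk
    off = (lba - page_lba) * SECTOR
    page[off:off + SECTOR] = data              # modify just the one sector
    dev[start:start + PAGE] = page             # write the whole 4k chunk back

# An application write to sector 1 turns into a 4k read and a 4k write at sector 0.
disk = bytearray(PAGE * 2)
write_sector(disk, 1, b"\xab" * SECTOR)
```

The point is that from the device's perspective, the I/O is always a full, aligned 4k chunk even though the application only touched 512 bytes.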

It seems like if we wanted to ensure alignment while supporting this configurable "max sector count" request size, the value we would want is 16: that is exactly two 4k blocks, so it keeps these large requests aligned and ensures maximum efficiency for disk usage at the server. But this goes back to some of my original questions:

1) What is the test setup to determine the results of changing the max request size?
2) How does one measure latency and "responsiveness"?
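A quick check of the 16-sector suggestion above (same assumptions as before: the host request starts 4k-aligned, 8 sectors per 4k block):

```python
CHUNK = 16    # sectors per AoE request -- exactly two 4k blocks
TOTAL = 1024  # sectors in the host request (assumed to start 4k-aligned)

# With a 16-sector max request size, every carved-out request starts on
# an 8-sector (4k) boundary, so none of them straddles a 4k block.
starts = range(0, TOTAL, CHUNK)
assert all(s % 8 == 0 for s in starts)
```

Since 16 is a multiple of 8, the request starts never drift off the 4k boundary the way they do with 17-sector requests.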