Thread: [Aoetools-discuss] GGAoEd request merging [was:GGAoEd - initial evaluation feedback]
From: Delian K. <kr...@kr...> - 2010-01-08 13:27:42
Hi,

>> On Wed, 6 Jan 2010 16:17:18 +0100 Gabor Gombas wrote:
>>> On Tue, Jan 05, 2010 at 07:04:31PM +0200, Delian Krustev wrote:
>>>
>>> > 1. I browsed the project on Google Code and then downloaded your 1.1
>>> > distribution. It was missing the handy "debian" directory which I'd
>>> > seen in the SVN repository. I don't know why that is, but it might be
>>> > handy to package it, at least for Debian users like me. (I've checked
>>> > out the tagged version and built the package from there.)
>>>
>>> Well, Debian developers currently disagree about whether it is a good
>>> idea to include the "debian" directory in the upstream tarball or not.
>>
>> If the package is included in Debian you might stop distributing it and
>> use the Debian package management facilities to manage the packaging
>> part.
>>
>> But since it's not, for now I thought it might be useful to others too.
>> Anyway, this was just a suggestion.
>>
>>> > 2. You might want to mention in the README the build dependency on
>>> > libblkid-dev. I didn't have it at first and the configuration step
>>> > failed.
>>>
>>> Thanks, I'll add that.
>>>
>>> > P.S. The motivation for testing ggaoed is a write performance issue
>>> > I've faced with vblade. In case you're interested you might look at:
>>> >
>>> > http://krustev.net/w/articles/Backup_service_and_software_block_devices_over_the_net/
>>>
>>> With ggaoed I expect you get much more even performance, since it uses
>>> direct I/O by default, and therefore avoids the read/modify/write
>>> cycles you get when using buffered I/O (like vblade does) and an MTU
>>> smaller than the page size.
>>
>> Unfortunately this is not the case. I've identified the bottleneck as
>> too many I/O transactions when using either vblade or ggaoed.
>
> IMHO the lack of jumbo frames is biting you.

That is for sure.

> Can't you borrow two jumbo-capable NICs for testing?

Unfortunately no. The servers are sitting in a data centre in a different
country, and changes to the hardware specification cannot easily be made.
The solution I need to implement is for this DC, so I need to find a
reasonable option.

> If the MTU is 1500, you have 2 sectors per request. If the MTU is 9000,
> you can have 17 sectors per request - that's a more than 8 times
> reduction in the number of I/O operations you're sending to the disk.

Yep. This is why I hoped to get the request merging working.

To illustrate the numbers, first the local test:

# dd if=/dev/zero of=/dev/mapper/vg0-nbd6.0 bs=1M count=1000 seek=100
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 8.18595 s, 128 MB/s

At the same time, on a nearby console, iostat shows:

Device:  tps       kB_read/s  kB_wrtn/s  kB_read  kB_wrtn
sda      147.00    4.00       73420.00   4        73420
sdb      141.00    8.00       70724.00   8        70724
md8      38662.00  0.00       154648.00  0        154648
dm-2     38662.00  0.00       154648.00  0        154648

The physical devices (sda/sdb) do about 1/2 MB per transfer operation.

Then the AoE test:

# dd if=/dev/zero of=/dev/etherd/e6.0 bs=1M count=100 seek=100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 4.36242 s, 24.0 MB/s

And the iostat results:

Device:  tps      kB_read/s  kB_wrtn/s  kB_read  kB_wrtn
sda      2962.00  0.00       13145.00   0        13145
sdb      2918.00  0.00       12971.00   0        12971
md8      7658.00  0.00       25364.00   0        25364
dm-2     7658.00  0.00       25364.00   0        25364

So this time sda & sdb do about 4 kB per transfer operation.
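The sector counts Gabor quotes can be sanity-checked with a quick sketch. The ~24 bytes of AoE/ATA header overhead inside the MTU is an assumption here (the exact figure may differ by a few bytes, but it does not change the result):

```shell
# Sectors that fit in one AoE request for a given MTU.
# Assumption: ~24 bytes of AoE/ATA header overhead, 512-byte sectors.
for mtu in 1500 9000; do
    echo "MTU $mtu: $(( (mtu - 24) / 512 )) sectors per request"
done
```

This reproduces the 2 and 17 sectors mentioned above, and with it the roughly 8x difference in the number of I/O operations hitting the disk.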
# ggaoectl stats
# Statistics for device nbd6.0
read_cnt: 504
read_bytes: 516096
read_time: 4.79039
write_cnt: 204800
write_bytes: 209715200
write_time: 103.907
other_cnt: 34
other_time: 0.00017897
io_slots: 58789
io_runs: 58789
queue_length: 2128789
queue_stall: 0
queue_over: 0
ata_err: 0
proto_err: 0

# Statistics for interface eth1
rx_cnt: 205337
rx_bytes: 217120220
rx_runs: 58822
rx_buffers_full: 0
tx_cnt: 205338
tx_bytes: 12832088
tx_runs: 0
tx_buffers_full: 0
dropped: 0
ignored: 0
broadcast: 11

>> Then I decided to play with the ggaoed settings to see if I could get
>> this feature working:
>>
>>> Request merging: read/write requests for adjacent data blocks can
>>> be submitted as a single I/O request
>>
>>> You can also use "ggaoectl stats" and "ggaoectl monitor" to see how
>>> things are going; ggaoed has quite a few more knobs to tune than
>>> vblade.
>>
>> So I've tried various values for what I thought were the related
>> parameters:
>>
>> queue-length
>> max-delay
>> merge-delay
>>
>> (and the other params too)
>>
>> The number of I/O operations was always too high for decent
>> performance on a real block device. So I guess the request merging was
>> just not working in some case. I was not able to get more than 30 MB.
>
> You can check the output of "ggaoectl stats": for every exported
> device, the (read_cnt + write_cnt) / io_slots ratio gives how many
> requests could be merged on average.

From the numbers above: (504 + 204800) / 58789 = 3.49

Here goes my config:

# egrep -v '^(#|$)' /etc/ggaoed.conf
[defaults]
queue-length = 16
interfaces = eth1
direct-io = true
pid-file = /var/run/ggaoed.pid
control-socket = /var/run/ggaoed.sock
state-directory = /var/lib/ggaoed
ring-buffer-size = 0
send-buffer-size = 1024
receive-buffer-size = 1024
[acls]
[nbd6.0]
path = /dev/mapper/vg0-nbd6.0
shelf = 6
slot = 0

Please let me know if you have any comments on the numbers.

>> Disabling direct-io actually increased the performance in my case.
>>
>> I've also tried exporting an in-memory file, and this test easily
>> utilized around 900 Mbit/s of bandwidth.
>>
>> P.S. I was not able to find a public discussion board (a mailing
>> list?) for your project. Otherwise I would have posted there, since
>> the discussed information is not private in any way and could be of
>> interest to others.
>
> IMHO you can use the aoe...@li... list,
> especially if you're also testing vblade. The volume is quite low so I
> think there is no need to create a separate list for ggaoed.

Thanks. I'll post there.

I could conclude that I'm hitting a protocol limitation which you're
trying to work around with GGAoEd (request merging).

Cheers
--
Delian
From: Sam H. <sa...@co...> - 2010-01-08 14:59:32
> # Statistics for interface eth1
> rx_cnt: 205337
> rx_bytes: 217120220
> rx_runs: 58822
> rx_buffers_full: 0
> tx_cnt: 205338
> tx_bytes: 12832088
> tx_runs: 0
> tx_buffers_full: 0
> dropped: 0
> ignored: 0
> broadcast: 11

If rx_runs is an overrun condition (I'd guess), then you probably have a
flow control problem, with packets getting dropped at the NIC.

Sam
From: Gabor G. <go...@sz...> - 2010-01-15 09:37:28
On Fri, Jan 08, 2010 at 09:42:50AM -0500, Sam Hopkins wrote:

> > # Statistics for interface eth1
> > rx_cnt: 205337
> > rx_bytes: 217120220
> > rx_runs: 58822
> > rx_buffers_full: 0
> > tx_cnt: 205338
> > tx_bytes: 12832088
> > tx_runs: 0
> > tx_buffers_full: 0
> > dropped: 0
> > ignored: 0
> > broadcast: 11
>
> If rx_runs is an overrun condition (I'd guess), then you probably have
> a flow control problem with packets getting dropped at the nic.

No, rx_runs is roughly the number of times the daemon got woken up by
the kernel due to input data being available on the network socket. So
compared with rx_cnt, it says that the kernel queued roughly 3.5 packets
on average before ggaoed got to process them.

Gabor

--
---------------------------------------------------------
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
---------------------------------------------------------
From: Gabor G. <go...@sz...> - 2010-01-08 15:50:53
On Fri, Jan 08, 2010 at 02:27:30PM +0200, Delian Krustev wrote:

> To illustrate the numbers, first the local test:
>
> # dd if=/dev/zero of=/dev/mapper/vg0-nbd6.0 bs=1M count=1000 seek=100
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 8.18595 s, 128 MB/s
>
> At the same time on the nearby console iostat shows:
>
> Device:  tps       kB_read/s  kB_wrtn/s  kB_read  kB_wrtn
> sda      147.00    4.00       73420.00   4        73420
> sdb      141.00    8.00       70724.00   8        70724
> md8      38662.00  0.00       154648.00  0        154648
> dm-2     38662.00  0.00       154648.00  0        154648
>
> The physical devices (sda/sdb) do about 1/2 MB per transfer operation.

Yes, that's the default maximum request size for the disks, and since
"dd" gives nice big 1M requests to the kernel, it can easily submit such
large requests to the disks.

> Then do the AoE test:
>
> # dd if=/dev/zero of=/dev/etherd/e6.0 bs=1M count=100 seek=100
> 100+0 records in
> 100+0 records out
> 104857600 bytes (105 MB) copied, 4.36242 s, 24.0 MB/s
>
> And the iostat results:
>
> Device:  tps      kB_read/s  kB_wrtn/s  kB_read  kB_wrtn
> sda      2962.00  0.00       13145.00   0        13145
> sdb      2918.00  0.00       12971.00   0        12971
> md8      7658.00  0.00       25364.00   0        25364
> dm-2     7658.00  0.00       25364.00   0        25364
>
> So this time sda & sdb do about 4 kB per transfer operation.

[...]

> # ggaoectl stats
> # Statistics for device nbd6.0
> read_cnt: 504
> read_bytes: 516096
> read_time: 4.79039
> write_cnt: 204800
> write_bytes: 209715200
> write_time: 103.907
> other_cnt: 34
> other_time: 0.00017897
> io_slots: 58789
> io_runs: 58789
> queue_length: 2128789
> queue_stall: 0
> queue_over: 0
> ata_err: 0
> proto_err: 0
>
> # Statistics for interface eth1
> rx_cnt: 205337
> rx_bytes: 217120220
> rx_runs: 58822
> rx_buffers_full: 0
> tx_cnt: 205338
> tx_bytes: 12832088
> tx_runs: 0
> tx_buffers_full: 0
> dropped: 0
> ignored: 0
> broadcast: 11

[...]

> From the numbers above: (504 + 204800) / 58789 = 3.49

That's basically the same ratio as rx_cnt/rx_runs. It means that on
average there were 3.49 packets in the memory mapped ring buffer
whenever ggaoed got woken up by the kernel, and ggaoed could almost
always merge them into a single I/O request.

So request merging works nicely; it's just that with an MTU of 1500 the
average request size is still only about 3.5 kB, less than a page size.
That's very small for a modern disk.

> I could conclude that I'm hitting a protocol limitation which you're
> trying to workaround with GGAoEd (request merging)

It's not a workaround but an optimization: request merging should happen
as high in the stack as possible, and it's certainly possible to do it
at the AoE daemon level. However, it's not a magic bullet.

Gabor

--
---------------------------------------------------------
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
---------------------------------------------------------
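The two ratios Gabor compares can be reproduced directly from the stats quoted earlier in the thread:

```shell
# Average merge ratio: requests handled per I/O submission, and packets
# received per daemon wakeup (all numbers taken from "ggaoectl stats").
awk 'BEGIN {
    printf "device merge ratio: %.2f\n", (504 + 204800) / 58789
    printf "rx batching ratio:  %.2f\n", 205337 / 58822
}'
```

Both work out to 3.49, which is why the device-level merge ratio tracks rx_cnt/rx_runs: ggaoed merges essentially everything the kernel has batched for it, and no more.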
From: Gabor G. <go...@sz...> - 2010-01-08 15:59:14
On Fri, Jan 08, 2010 at 02:27:30PM +0200, Delian Krustev wrote:

> ring-buffer-size = 0
> send-buffer-size = 1024
> receive-buffer-size = 1024

Hmm, wait. You're using traditional send()/receive() instead of the
memory mapped ring buffer. Why?

Gabor

--
---------------------------------------------------------
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
---------------------------------------------------------
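For reference, the setting in question lives in /etc/ggaoed.conf; a minimal sketch of the relevant fragment (the values are illustrative, not recommendations, and the KiB unit is inferred from the log message later in the thread):

```
[defaults]
interfaces = eth1
direct-io = true
# 0 selects plain send/receive buffers; a non-zero size (KiB assumed)
# enables the memory mapped ring buffer instead
ring-buffer-size = 2048
```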
From: Delian K. <kr...@kr...> - 2010-01-08 16:54:52
On Fri, 8 Jan 2010 16:59:04 +0100 Gabor Gombas wrote:

> On Fri, Jan 08, 2010 at 02:27:30PM +0200, Delian Krustev wrote:
>
> > ring-buffer-size = 0
> > send-buffer-size = 1024
> > receive-buffer-size = 1024
>
> Hmm, wait. You're using traditional send()/receive() instead of the
> memory mapped ring buffer. Why?

I disabled it due to:

Jan 6 15:38:24 ggaoed[1360]: net/eth1: Failed to set up the TX ring buffer: Protocol not available
Jan 6 15:38:24 ggaoed[1360]: net/eth1: Set up 2048 KiB ring buffer (1344 RX/0 TX packets)

It seems to me that, when it is enabled in the ggaoed config, it is only
enabled for received packets. I was not able to notice any performance
difference between the two, and disabled it in order to have the same
settings for RX/TX.

Cheers
--
Delian
From: Gabor G. <go...@sz...> - 2010-01-15 09:18:40
Hi,

On Fri, Jan 08, 2010 at 06:54:42PM +0200, Delian Krustev wrote:

> I've disabled it due to:
>
> Jan 6 15:38:24 ggaoed[1360]: net/eth1: Failed to set up the TX ring buffer: Protocol not available
> Jan 6 15:38:24 ggaoed[1360]: net/eth1: Set up 2048 KiB ring buffer (1344 RX/0 TX packets)

So you have a kernel older than 2.6.31.

> It seems to me that (when it is enabled in the ggaoed config) it is
> only enabled for the received packets.

Yes, the sending side needs a newer kernel.

> I was not able to notice any performance differences between the two,
> and disabled it in order to have the same settings for RX/TX.

The effect is less CPU time used if you're handling a large number of
packets.

Gabor

--
---------------------------------------------------------
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
---------------------------------------------------------
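The kernel requirement can be checked before enabling the TX ring. A small sketch, where the 2.6.31 threshold comes from the message above and `sort -V` is assumed to be available (it is a GNU coreutils feature):

```shell
# Succeeds when the given kernel version is >= 2.6.31, the first
# release with the TX ring (PACKET_TX_RING) support ggaoed needs.
tx_ring_ok() {
    [ "$(printf '%s\n%s\n' 2.6.31 "$1" | sort -V | head -n1)" = "2.6.31" ]
}

if tx_ring_ok "$(uname -r | cut -d- -f1)"; then
    echo "kernel supports the TX ring buffer"
else
    echo "kernel too old for the TX ring buffer"
fi
```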
From: Delian K. <kr...@kr...> - 2010-01-08 17:01:43
On Fri, 8 Jan 2010 16:50:41 +0100 Gabor Gombas wrote:

> > I could conclude that I'm hitting a protocol limitation which you're
> > trying to workaround with GGAoEd (request merging)
>
> It's not a workaround, but an optimization: request merging should
> happen as high in the stack as possible, and it's certainly possible
> to do it at the AoE daemon level. However it's not a magic bullet.

I just hoped that more requests would be merged, and thus played with:

queue-length
max-delay
merge-delay

I was not able to get better performance than with the default values,
though.

Why is it not possible to have more requests merged? E.g. what seems
logical to me is to have a bigger queue, increase the delays, and have
more requests merged.

Cheers
--
Delian
From: Gabor G. <go...@sz...> - 2010-01-15 09:33:54
On Fri, Jan 08, 2010 at 07:01:34PM +0200, Delian Krustev wrote:

> I just hoped that more requests would be merged, and thus played with:
>
> queue-length
> max-delay
> merge-delay
>
> I was not able to get better performance than with the default values,
> though.
>
> Why is it not possible to have more requests merged? E.g. what seems
> logical to me is to have a bigger queue, increase the delays, and have
> more requests merged.

What ggaoed does is basically:

- read from the network until the kernel says "no more data"
- merge the requests if possible, and submit them in one go

From your stats, on average the kernel queued about 3.5 packets by the
time ggaoed got woken up, and it indeed could merge those requests
almost all of the time. If you want more merging, you have to increase
the number of requests queued by the kernel. The queue length does not
have a direct effect on this, although it certainly limits the _maximum_
merging that can be performed.

The 'merge-delay' parameter tells ggaoed not to start the I/O
immediately when no more incoming data is available, but to wait for
the specified time to allow receiving more packets and do more merging.
'merge-delay' is, however, directly added to the latency the clients
experience for a single request, so setting it too high will also kill
performance.

I consider 'merge-delay' experimental at this time, as I do not know
whether it really helps and I have no time to do extensive testing. But
if you want to play with it: given a queue size of N, measure/calculate
how much time it takes for the client to send N/2 packets over the
wire, and then set 'merge-delay' to this value. Then you can increase
or decrease it slightly to see if it has any effect.

The other option would be to tune your network driver to wait for more
incoming packets before notifying the operating system, if it has such
a capability.

Gabor

--
---------------------------------------------------------
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
---------------------------------------------------------
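Gabor's suggested starting point can be estimated with a quick sketch. The queue size (16, from the config earlier in the thread), the 1500-byte MTU, and a 1 Gbit/s link are assumptions here; 38 bytes is the standard Ethernet framing overhead per frame (preamble, headers, FCS, inter-frame gap):

```shell
# Time for the client to put queue-length/2 full-size frames on the
# wire, as a rough initial value for merge-delay.
awk 'BEGIN {
    queue = 16; mtu = 1500; link_bps = 1e9
    wire_bytes = mtu + 38     # payload plus Ethernet framing overhead
    usec = (queue / 2) * wire_bytes * 8 / link_bps * 1e6
    printf "merge-delay starting point: ~%.0f us\n", usec
}'
```

Under these assumptions that comes out to roughly 100 microseconds, a value to tune up or down experimentally as Gabor describes.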