Thread: [Aoetools-discuss] AOE
From: Stefan P. - a. i. ag <s.p...@al...> - 2010-05-11 14:36:31
Hi List,

I was trying to replace iSCSI with AoE, but while I was using AoE the target system was continually reading even though I was only writing to the disk. When I switched from AoE back to iSCSI without changing anything else, everything was fine again.

Stefan
From: Tracy R. <tr...@ul...> - 2010-05-11 20:06:57
On Tue, May 11, 2010 at 04:09:41PM +0200, Stefan Priebe - allied internet ag spake thusly:
> i was trying to replace iSCSI with AOE - but when i was using AOE the
> target system was continually reading while i was just writing to the
> disk. When switching from AOE to iSCSI without changing anything else
> everything was fine again.

You are probably having the infamous AoE alignment issue. It has really been bugging me lately too. I would say this is probably my biggest hassle in using AoE. I really wish this could be fixed in the target. There are workarounds (playing with disk geometry) but you have to be very careful to ensure alignment all throughout the various layers of your IO system. And if you are doing RAID 5 be aware of the RAID 5 "write hole".

The AoE alignment issue has been discussed previously. This might be of use to you:

http://copilotco.com/Virtualization/wiki/aoe-caching-alignment.pdf/at_download/file

--
Tracy Reed
http://tracyreed.org
From: Gabor G. <go...@di...> - 2010-05-12 05:54:49
On Tue, May 11, 2010 at 12:49:28PM -0700, Tracy Reed wrote:
> You are probably having the infamous AoE alignment issue. It has
> really been bugging me lately too. I would say this is probably my
> biggest hassle in using AoE. I really wish this could be fixed in the
> target.

Fixing it is very easy: just do not use the page cache, i.e. tell the target to use direct I/O. Of course you had better have a HW RAID controller with sufficient on-board cache, otherwise you'll see a noticeable drop in performance. The simplicity of the AoE protocol has its price.

Gabor
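A rough illustration of why a write-only workload can cause reads when the target goes through the page cache: any write that covers only part of a page forces the page to be read first so the untouched bytes can be preserved. The sketch below only counts such pages. The 4096-byte page size, the 512-byte sector size and the function name `partially_written_pages` are assumptions made for illustration; nothing here is taken from vblade or ggaoed.

```python
def partially_written_pages(offset, length, page_size=4096):
    """Return the indices of pages that a write covers only in part.

    offset and length are in bytes. A page that is only partly covered
    would have to be read before it can be modified in a page cache.
    The 4096-byte default is an assumption, not a measured value.
    """
    if length <= 0:
        return []
    end = offset + length                  # first byte after the write
    first = offset // page_size            # first page touched
    last = (end - 1) // page_size          # last page touched
    partial = []
    for page in range(first, last + 1):
        page_start = page * page_size
        page_end = page_start + page_size
        # Partly covered if the write starts after the page begins
        # or stops before the page ends.
        if offset > page_start or end < page_end:
            partial.append(page)
    return partial


if __name__ == "__main__":
    sector = 512  # assumed sector size in bytes
    # A 4 KiB write at sector 63 straddles two pages.
    print(partially_written_pages(63 * sector, 4096))
    # The same write at sector 64 covers exactly one whole page.
    print(partially_written_pages(64 * sector, 4096))
    # A 1 KiB write never fills a whole page, wherever it starts.
    print(partially_written_pages(0, 1024))
```

With direct I/O the page cache is bypassed, which is why the suggestion above avoids these extra reads at the price of losing the cache.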
From: Stefan P. - a. i. ag <s.p...@al...> - 2010-05-12 06:38:06
Seems to be really complicated. Isn't there a small tool or something which tells me how to set up the disk?

It seems I'll stay with iSCSI - this is too complicated for daily usage.

Stefan

Tracy Reed wrote:
> On Tue, May 11, 2010 at 10:54:21PM +0200, Stefan Priebe - allied internet ag spake thusly:
>> I'm using a Raid 10 and I already know that document. I played
>> around with fdisk and different sectors, heads, ... but nothing
>> helped me. I'm sure this is an alignment problem but I don't
>> know what to try.
>
> You may need to align the start of data of your partition on a 64
> cylinder boundary. I don't have a web browser handy but google for
> linux raid alignment and you should find some pointers which may help.
From: Tracy R. <tr...@ul...> - 2010-05-12 06:53:11
On Tue, May 11, 2010 at 10:54:21PM +0200, Stefan Priebe - allied internet ag spake thusly:
> I'm using a Raid 10 and I already know that document. I played
> around with fdisk and different sectors, heads, ... but nothing
> helped me. I'm sure this is an alignment problem but I don't
> know what to try.

You may need to align the start of data of your partition on a 64 cylinder boundary. I don't have a web browser handy but google for linux raid alignment and you should find some pointers which may help.

--
Tracy Reed
http://tracyreed.org
From: Stefan P. - a. i. ag <s.p...@al...> - 2010-05-12 06:39:14
Just for your information:

<tr...@ed...>:
206.71.189.130 does not like recipient.
Remote host said: 554 5.7.1 <server655-han.de-nserver.de[85.158.177.45]>: Client host rejected: Germany
Giving up on 206.71.189.130.

Stefan

Tracy Reed wrote:
> On Tue, May 11, 2010 at 10:54:21PM +0200, Stefan Priebe - allied internet ag spake thusly:
>> I'm using a Raid 10 and I already know that document. I played
>> around with fdisk and different sectors, heads, ... but nothing
>> helped me. I'm sure this is an alignment problem but I don't
>> know what to try.
>
> You may need to align the start of data of your partition on a 64
> cylinder boundary. I don't have a web browser handy but google for
> linux raid alignment and you should find some pointers which may help.
From: Stefan P. - a. i. ag <s.p...@al...> - 2010-05-12 06:58:03
Hi!

I'm using a hardware RAID 10 controller with 512MB cache. Could you give me an example of how to tell the target to use direct I/O?

Stefan

Gabor Gombas wrote:
> On Tue, May 11, 2010 at 12:49:28PM -0700, Tracy Reed wrote:
>
>> You are probably having the infamous AoE alignment issue. It has
>> really been bugging me lately too. I would say this is probably my
>> biggest hassle in using AoE. I really wish this could be fixed in the
>> target.
>
> Fixing it is very easy: just do not use the page cache, i.e. tell the
> target to use direct I/O. Of course you had better have a HW RAID controller
> with sufficient on-board cache, otherwise you'll see a noticeable drop in
> performance. The simplicity of the AoE protocol has its price.
>
> Gabor
From: Tracy R. <tr...@ul...> - 2010-05-12 07:13:37
On Wed, May 12, 2010 at 08:57:55AM +0200, Stefan Priebe - allied internet ag spake thusly:
> I'm using a Hardware Raid 10 Controller with 512MB Cache. Could you
> give me an example how to tell the target to use direct I/O?

The vblade manpage says:

    -d  The -d flag selects O_DIRECT mode for accessing the underlying
        block device.

--
Tracy Reed
http://tracyreed.org
From: Gabor G. <go...@di...> - 2010-05-12 17:44:24
On Wed, May 12, 2010 at 08:57:55AM +0200, Stefan Priebe - allied internet ag wrote:
> I'm using a Hardware Raid 10 Controller with 512MB Cache. Could you
> give me an example how to tell the target to use direct I/O?

If you're using ggaoed, then add "direct-io = true" to the config file (see man ggaoed.conf). AFAIR vblade has a command line option, I don't have the man page handy.

Gabor
From: Jesse B. <bec...@ma...> - 2010-05-12 13:47:05
On Wed, May 12, 2010 at 02:37:57AM -0400, Stefan Priebe - allied internet ag wrote:
>I seems i'll stay with iSCSI - this is too complicated for daily usage.

I have to admit that I find this statement more than a little amusing, with the AoE spec at 12 pages[1], and iSCSI weighing in at over 20 times that[2].

The only time I've seen a need to worry about alignment issues--which I'd expect would plague iSCSI as well under certain cases--is when you are using LVM. I've yet to see a problem when using basic AoE block devices.

[1] http://www.coraid.com/RESOURCES/AoE-Protocol-Definition
[2] http://www.ietf.org/rfc/rfc3720.txt

--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
From: Stefan P. - a. i. ag <s.p...@al...> - 2010-05-12 14:04:53
That's interesting, as I don't have any problem with iSCSI using the same target block device AND not using LVM at all.

Stefan

Jesse Becker wrote:
> On Wed, May 12, 2010 at 02:37:57AM -0400, Stefan Priebe - allied
> internet ag wrote:
>> I seems i'll stay with iSCSI - this is too complicated for daily usage.
>
> I have to admit that I find this statement more than a little amusing,
> with the AoE spec at 12 pages[1], and iSCSI weighing in at over 20 times
> that[2].
>
> The only time I've seen a need to worry about alignment issues--which
> I'd expect would plague iSCSI as well under certain cases--is when you are
> using LVM. I've yet to see a problem when using basic AoE block devices.
>
> [1] http://www.coraid.com/RESOURCES/AoE-Protocol-Definition
> [2] http://www.ietf.org/rfc/rfc3720.txt
From: Tracy R. <tr...@ul...> - 2010-05-12 16:40:06
On Wed, May 12, 2010 at 09:46:58AM -0400, Jesse Becker spake thusly:
> I have to admit that I find this statement more than a little amusing,
> with the AoE spec at 12 pages, and iSCSI weighing in at over 20
> times that[2].

Ease of use counts for something.

> The only time I've seen a need to worry about alignment issues--which
> I'd expect would plague iSCSI as well under certain cases--is when you are
> using LVM. I've yet to see a problem when using basic AoE block devices.

How does LVM affect alignment? Doesn't LVM put its info at the end of the disk? If not, how do you compensate for it?

--
Tracy Reed
From: Jesse B. <bec...@ma...> - 2010-05-12 13:53:10
On Wed, May 12, 2010 at 09:49:41AM -0400, Stefan Priebe - allied internet ag wrote:
>That's interesting as i haven't any problem with iSCSI using the same
>target Block device AND not using LVM at all.

Agreed, that is strange.

Is there an underlying RAID and/or LVM layer with your current setup?

--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
From: Stefan P. - a. i. ag <s.p...@al...> - 2010-05-12 14:04:53
Only an Adaptec 5808 controller with a RAID 10 and DRBD on top - nothing more.

Stefan

Jesse Becker wrote:
> On Wed, May 12, 2010 at 09:49:41AM -0400, Stefan Priebe - allied
> internet ag wrote:
>> That's interesting as i haven't any problem with iSCSI using the same
>> target Block device AND not using LVM at all.
>
> Agreed, that is strange.
>
> Is there an underlying RAID and/or LVM layer with your current setup?
From: Sam H. <sa...@co...> - 2010-05-12 15:48:10
> Fixing it is very easy: just do not use the page cache, i.e. tell the
> target to use direct I/O. Of course you had better have a HW RAID
> controller with sufficient on-board cache, otherwise you'll see a
> noticeable drop in performance. The simplicity of the AoE protocol
> has its price.

The simplicity of the vblade userspace AoE implementation has its price. AoE screams using our target appliances. Don't confuse the limitations of an open source target implementation with limitations inherent in AoE.

Sam
From: Gabor G. <go...@di...> - 2010-05-12 18:00:56
On Wed, May 12, 2010 at 11:31:05AM -0400, Sam Hopkins wrote:
> The simplicity of the vblade userspace AoE implementation has its
> price. AoE screams using our target appliances. Don't confuse the
> limitations of an open source target implementation with limitations
> inherent in AoE.

True, but the message starting the thread was about the SW AoE implementations.

Gabor
From: Gabor G. <go...@di...> - 2010-05-12 17:50:50
On Tue, May 11, 2010 at 04:38:21PM -0700, Tracy Reed wrote:
> You may need to align the start of data of your partition on a 64
> cylinder boundary. I don't have a web browser handy but google for
> linux raid alignment and you should find some pointers which may help.

You're mixing things up. There are certain circumstances when you want to align the _first_ partition to start at _sector_ 64. And there are cases (e.g. you're partitioning a RAID device) when the requirements are different (usually bigger).

Gabor
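The alignment arithmetic being discussed can be sketched in a few lines. This is only an illustration: the function names, the 512-byte sector size and the example RAID geometry (64 KiB chunks on 2 data disks) are assumptions, not values from anyone's actual setup in this thread.

```python
def is_aligned(start_sector, boundary_sectors=64):
    """True if a partition start falls exactly on the boundary."""
    if boundary_sectors <= 0:
        raise ValueError("boundary_sectors must be positive")
    return start_sector % boundary_sectors == 0


def align_up(start_sector, boundary_sectors=64):
    """Round a start sector up to the next multiple of the boundary."""
    if boundary_sectors <= 0:
        raise ValueError("boundary_sectors must be positive")
    # Ceiling division using integer arithmetic only.
    return -(-start_sector // boundary_sectors) * boundary_sectors


def stripe_sectors(chunk_kib, data_disks, sector_size=512):
    """Size of one full RAID stripe in sectors (hypothetical helper)."""
    return chunk_kib * 1024 * data_disks // sector_size


if __name__ == "__main__":
    # Traditional fdisk default: first partition at sector 63.
    print(is_aligned(63), align_up(63))
    # On a RAID device the boundary is usually the full stripe, which
    # is bigger than 64 sectors (assumed geometry: 64 KiB x 2 disks).
    stripe = stripe_sectors(chunk_kib=64, data_disks=2)
    print(stripe, align_up(63, stripe))
```

The same check applies to any later partition: only the start sector and the boundary matter, not the partition number.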
From: Tracy R. <tr...@ul...> - 2010-05-12 18:01:48
On Wed, May 12, 2010 at 07:50:36PM +0200, Gabor Gombas spake thusly:
> You're mixing things up. There are certain circumstances when you want
> to align the _first_ partition to start at _sector_ 64. And there are
> cases (e.g. you're partitioning a RAID device) when the requirements are
> different (usually bigger).

You are right, I misspoke. Just the other day I wanted to use the first few partitions of a pair of 1T drives for mirrored /boot and operating system stuff and sda6 and sdb6 for a RAID0. I adjusted the beginnings of sda6 and sdb6 (some huge number) until it fell on an even multiple of 64. This is what you are getting at, right?

--
Tracy Reed
http://tracyreed.org
From: Gabor G. <go...@di...> - 2010-05-12 17:56:18
On Wed, May 12, 2010 at 08:37:57AM +0200, Stefan Priebe - allied internet ag wrote:
> Seems to be really complicated. Isn't there a small tool or something
> which tells me how to set up the disk?

I don't know the exact versions required (and I'm too lazy to look it up), but if you are using a recent kernel with recent libblkid and recent parted/fdisk, then everything should be aligned by default.

Gabor
From: Stefan P. - a. i. ag <s.p...@al...> - 2010-05-14 09:07:09
I'm using the latest 2.6.32.12 vanilla kernel. But as you've read, I see a lot of strange reads instead of only writes.

I'm using vblade, as I can't compile ggaoed because Debian lenny's libc isn't new enough. I can't remember what it needs.

Stefan

Gabor Gombas wrote:
> On Wed, May 12, 2010 at 08:37:57AM +0200, Stefan Priebe - allied internet ag wrote:
>
>> Seems to be really complicated. Isn't there a small tool or something
>> which tells me how to set up the disk?
>
> I don't know the exact versions required (and I'm too lazy to look it
> up), but if you are using a recent kernel with recent libblkid and
> recent parted/fdisk, then everything should be aligned by default.
>
> Gabor
From: Tracy R. <tr...@ul...> - 2010-05-12 18:04:38
On Wed, May 12, 2010 at 11:31:05AM -0400, Sam Hopkins spake thusly:
> AoE screams using our target appliances. Don't confuse the
> limitations of an open source target implementation with limitations
> inherent in AoE.

I understand that this is probably by design. However, the complexity of getting AoE to perform well in initial tests hurts the ability to sell the appliance as a solution. I have recommended Coraid appliances 4 times in the last few years. One company actually purchased. But when they found they could only export whole drives they ended up returning them as that setup was not suitable for their virtualization goals. Has that issue been corrected yet? This was 3-4 years ago.

--
Tracy Reed
http://tracyreed.org
From: Jesse B. <bec...@ma...> - 2010-05-12 18:53:59
On Wed, May 12, 2010 at 02:04:29PM -0400, Tracy Reed wrote:
>On Wed, May 12, 2010 at 11:31:05AM -0400, Sam Hopkins spake thusly:
>> AoE screams using our target appliances. Don't confuse the
>> limitations of an open source target implementation with limitations
>> inherent in AoE.
>
>I understand that this is probably by design. However, the complexity
>of getting AoE to perform well in initial tests hurts the ability to
>sell the appliance as a solution. I have recommended Coraid appliances
>4 times in the last few years. One company actually purchased. But
>when they found they could only export whole drives they ended up
>returning them as that setup was not suitable for their virtualization
>goals. Has that issue been corrected yet? This was 3-4 years ago.

It is certainly possible to create multiple LUNs within a single device. These are all exported separately. For example, given a 24 drive array, you could make a 6 disk RAID5 array, two different 4 disk RAID 10 arrays, a 10 disk RAID6 array, and 2 hot spares. These would be exported as 4 different devices in the form of /dev/etherd/eX.Y (where X is the shelf number, and Y is the LUN number).

You can't, however, create a large RAID10 array and export smaller chunks of it. (At least, not yet; I think that's being worked on.)

--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
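As a small illustration of the eX.Y naming described above, the sketch below splits a device name into its shelf and LUN numbers. The function name `parse_aoe_device` is hypothetical, and the code only handles whole-device names of the plain eX.Y form; anything else (partitions, for instance) is rejected rather than guessed at.

```python
import os
import re

# eX.Y, where X is the shelf number and Y is the LUN number.
_AOE_NAME = re.compile(r"e(\d+)\.(\d+)")


def parse_aoe_device(path):
    """Return (shelf, lun) for a device name of the form eX.Y.

    Only the last path component is inspected, so both "e1.2" and a
    full device path ending in "e1.2" are accepted.
    """
    name = os.path.basename(path)
    match = _AOE_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not an AoE device name: {name!r}")
    shelf, lun = match.groups()
    return int(shelf), int(lun)


if __name__ == "__main__":
    # Four exported devices on shelf 1, as in the example above.
    for lun in range(4):
        print(parse_aoe_device(f"e1.{lun}"))
```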
From: Mark S. <ma...@fu...> - 2010-05-12 19:21:09
On 12 May 2010, at 19:04, Tracy Reed wrote:
> I have recommended Coraid appliances
> 4 times in the last few years. One company actually purchased. But
> when they found they could only export whole drives they ended up
> returning them as that setup was not suitable for their virtualization
> goals. Has that issue been corrected yet? This was 3-4 years ago.

I can report great success with the Coraid VS21 device for chopping up large storage pools into smaller volumes, not sure when it was first available. The VS21 device gives much the same functionality as LVM on Linux...

We switched production systems from software target (qaoed/LVM) about 12 months ago after using, struggling with, and contributing to it for about 2-3 years.

Mark