Thread: Re: [Aoetools-discuss] Error checking against data corruption via ethernet
From: PongráczI <pon...@po...> - 2013-04-06 10:00:32
----------------original message-----------------
From: "Alexandre" al...@gm...
To: "PongráczI"
CC: "aoetools-discuss lists.sourceforge.net" aoe...@li...
Date: Sat, 6 Apr 2013 10:24:10 +0100
----------------------------------------------------------

> I am not sure I understand which type of error you want protection against, or detection of.
>
> Are you talking about corruption of data during transport over the wire?
>
> Can you find the name of the feature implemented in iSCSI that you'd like to check for in AoE?
>
> regards, Alex

Hi,

Thank you for your feedback! I will try to explain my question better :)

Your first question is exactly my question: "Are you talking about corruption of data during transport over the wire?" Yes.

On the client side, I want to write some data to the disk, for example the binary data 01111000001111100001111. I want to be sure that the same data, 01111000001111100001111, is written to the disk on the server side.

At the moment I do not know how layer 2 and AoE handle the situation where electrical noise, or anything else, alters the data sent through the wire.

For example, I write 'piano' on the client, but 'violin' is written to the disk on the server because of corruption during transport over the wire. Is this kind of corruption possible, or can layer 2 + AoE catch this situation and fix this kind of issue?

I use the ZFS filesystem, which provides end-to-end data protection by adding an extra checksum to every block written to the disk. This provides very good data protection, but it would be nice to know that using AoE will not cause corruption in a SAN environment. Others, who use iSCSI, always tell me that iSCSI has a checksum to detect transfer issues while AoE has no protection of this kind, so data corruption can happen and will not be detected.

In fact, I have never used iSCSI, because it is much more complicated than AoE and I have seen several people have trouble getting it running in their environment. AoE just works.

Thank you!

István
From: PongráczI <pon...@po...> - 2013-04-06 10:06:14
----------------original message-----------------
From: "Alexandre" al...@gm...
To: "PongráczI"
CC: "aoetools-discuss lists.sourceforge.net" aoe...@li...
Date: Sat, 6 Apr 2013 10:57:22 +0100
----------------------------------------------------------

> Oh, are you talking about iSCSI Digest?
>
> If so, I recommend you read this (only the chapter should be enough to shed some light):
>
> http://www.jdsu.com/ProductLiterature/Understanding-iSCSI-Digests-white-paper-30162803.pdf
>
> To make a long story short, the iSCSI digest is meant to protect against errors during protocol transitions on the hosts, which iSCSI relies on heavily ([data]/iSCSI -> TCP -> IP -> Ethernet), while AoE only uses Ethernet as a carrier and so is far less prone to this kind of error.
>
> Moreover, the overhead introduced by the digest seems to have led people to disable it in most cases (most initiators disable it by default, and the document states that this is known to be common practice in the iSCSI world). Instead, integrity is ensured by the Ethernet and IP checksums (while AoE only needs to check the Ethernet CRC).
>
> If you're talking about another mechanism, please let us know.

Alexandre, thank you for your valuable answer! Good to know that iSCSI has good protection which in practice is commonly disabled, great :)

So, I checked layer 2, and it has error checking as you wrote. As I understand it now, data corruption cannot happen on layer 2, because this basic CRC protection handles the layer 1 issues that can occur in the physical layer.

István
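The CRC check discussed in this message can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: `zlib.crc32` implements the same CRC-32 that Ethernet uses for its frame check sequence, but a real NIC computes it in hardware over the whole frame, not just the payload. The helper names `fcs` and `flip_bit` are made up for this example.

```python
import zlib


def fcs(frame: bytes) -> int:
    """CRC-32 of the given bytes (same CRC-32 as the Ethernet frame check sequence)."""
    return zlib.crc32(frame) & 0xFFFFFFFF


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted, simulating noise on the wire."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


payload = b"piano"
sent_fcs = fcs(payload)          # computed by the sender before transmission
received = flip_bit(payload, 3)  # one bit damaged in transit (hypothetical fault)

# The receiver recomputes the CRC; a mismatch means the frame is discarded.
frame_ok = fcs(received) == sent_fcs
print(frame_ok)  # False: a single flipped bit always changes the CRC-32
```

A CRC-32 is guaranteed to catch any single-bit error, so the damaged frame never reaches the AoE layer. It only detects the error; it does not correct it.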
From: Hilko B. <be...@hi...> - 2013-04-08 10:17:25
* PongráczI:

> So, as I checked the Layer2, it has error checking as you wrote. So,
> as I understand now, data corruption cannot happen on layer2, because
> this basic CRC protection handle layer 1 issues which can happen in
> the physical layer.

This is only true if your Ethernet switch does not rewrite the frames as it forwards them.

For example, you may want to use VLANs in order to separate initiators that are connected to the one machine that exports multiple targets. Using 802.1q-tagged VLANs on the target side and untagged VLANs for the initiator is a good idea for doing this. However, the switch will have to add/remove tags and recalculate the checksums. In such a case, the checksum will not protect you against data corruption that may occur within the switch.

Cheers,
-Hilko
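The point about switches that rewrite frames can be sketched as follows. This is a simplified model, not a description of any real switch: the frame layout is reduced to MAC addresses, EtherType and payload, the VLAN ID 10 is arbitrary, and the memory fault is injected by hand. The helper names `fcs` and `check` are hypothetical; only the AoE EtherType (0x88A2) and the 802.1Q TPID (0x8100) are real values.

```python
import zlib


def fcs(frame: bytes) -> int:
    """CRC-32 of the given bytes (same CRC-32 as the Ethernet frame check sequence)."""
    return zlib.crc32(frame) & 0xFFFFFFFF


def check(frame: bytes, frame_fcs: int) -> bool:
    """True if the frame matches the checksum that travels with it."""
    return fcs(frame) == frame_fcs


# Simplified untagged frame: 12 bytes of MAC addresses, EtherType, payload.
macs = bytes(12)
ethertype = bytes.fromhex("88a2")  # AoE EtherType
payload = b"piano"
untagged = macs + ethertype + payload
ingress_fcs = fcs(untagged)

# Switch ingress: the original checksum is verified and then thrown away.
assert check(untagged, ingress_fcs)

# Switch inserts an 802.1Q tag (TPID 0x8100 + 2-byte TCI, VLAN ID 10 assumed).
tag = bytes.fromhex("8100") + (10).to_bytes(2, "big")
buffered = bytearray(macs + tag + ethertype + payload)
buffered[-1] ^= 0x01  # simulated fault in switch memory, after the ingress check

# Switch egress: a fresh checksum is computed over the already damaged bytes.
egress = bytes(buffered)
egress_fcs = fcs(egress)

receiver_accepts = check(egress, egress_fcs)
payload_intact = egress[-len(payload):] == payload
print(receiver_accepts, payload_intact)  # True False
```

The receiver sees a valid checksum on a damaged payload, because the checksum it verifies was generated after the damage occurred. Only a check computed at the endpoints, such as a filesystem-level checksum, would notice.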
From: PongráczI <pon...@po...> - 2013-04-09 08:47:26
----------------original message-----------------
From: "Hilko Bengen" be...@hi...
To: "PongráczI"
CC: aoe...@li...
Date: Mon, 08 Apr 2013 11:32:19 +0200
-------------------------------------------------

> * PongráczI:
>
>> So, as I checked the Layer2, it has error checking as you wrote. So,
>> as I understand now, data corruption cannot happen on layer2, because
>> this basic CRC protection handle layer 1 issues which can happen in
>> the physical layer.
>
> This is only true if your Ethernet switch does not rewrite the frames as
> it forwards them.
>
> For example, you may want to use VLANs in order to separate initiators
> that are connected to the one machine that exports multiple targets.
> Using 802.1q-tagged VLANs on the target side and untagged VLANs for the
> initiator is a good idea for doing this. However, the switch will have
> to add/remove tags and recalculate the checksums. In such a case, the
> checksum will not protect you against data corruption that may occur
> within the switch.
>
> Cheers,
> -Hilko

Hi Hilko,

Thank you very much for your comment, I completely missed this use case.

So, when using AoE, is it better to use a direct connection or reliable switches, preferably without VLAN tags?

Cheers,
István
From: Alexandre <al...@gm...> - 2013-04-09 17:47:44
You can also use fully tagged 802.1Q VLANs. I don't think this should affect performance. The decrease in Ethernet payload between the switch and the initiator is only 4 bytes (a loss of payload of less than 0.045% for 9000-byte frames).

Moreover, the use case presented already had tagged VLANs on the target side, so I am not sure this would have any impact at all.

Nevertheless, if this is a concern, maybe you can also take a look at non-802.1Q VLANs (macvlan on Linux?).

Cheers, Alex.
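The overhead figure quoted above is simple arithmetic and can be checked with a few lines of Python. The sketch assumes, as the message does, that the 4-byte 802.1Q tag comes out of the payload; the function name `tag_overhead_percent` is made up for this example.

```python
def tag_overhead_percent(payload_bytes: int, tag_bytes: int = 4) -> float:
    """Share of the payload, in percent, taken up by a VLAN tag of tag_bytes."""
    return 100.0 * tag_bytes / payload_bytes


# Compare a standard 1500-byte payload with 9000-byte jumbo frames.
for size in (1500, 9000):
    print(size, round(tag_overhead_percent(size), 4))
# 1500 0.2667
# 9000 0.0444
```

With 9000-byte frames the tag costs about 0.044%, which matches the "less than 0.045%" estimate; with standard 1500-byte frames it is still only about 0.27%.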
From: Alexandre <al...@gm...> - 2013-04-06 10:08:15
As stated in my previous answer, protection against data corruption on the wire is ensured by the Ethernet CRC32 checksum:
http://en.wikipedia.org/wiki/Error_detection_and_correction#Internet

If the corruption of data happens before the Ethernet stack is reached, neither iSCSI nor AoE can do anything about it on the wire. Although I don't know ZFS, I guess in this case the data protection you're talking about should apply.

But don't take my word for it... keep on digging.

Regards, Alex.

2013/4/6 PongráczI <pon...@po...>

> ----------------original message-----------------
> From: "Alexandre" <al...@gm...>
> To: "PongráczI" <pon...@po...>
> CC: "aoetools-discuss lists.sourceforge.net" <aoe...@li...>
> Date: Sat, 6 Apr 2013 10:24:10 +0100
> ----------------------------------------------------------
>
>> I am not sure I understand which type of error you want protection
>> against, or detection of.
>>
>> *Are you talking about corruption of data during transport over
>> the wire?*
>>
>> Can you find the name of the feature implemented in iSCSI that you'd
>> like to check for in AoE?
>>
>> regards, Alex
>
> Hi,
>
> Thank you for your feedback! I will try to explain my question better :)
>
> Your first question is exactly my question: *Are you talking about
> corruption of data during transport over the wire?* Yes.
>
> On the client side, I want to write some data to the disk, for example
> the binary data 01111000001111100001111. I want to be sure that the same
> data, 01111000001111100001111, is written to the disk on the server side.
>
> At the moment I do not know how layer 2 and AoE handle the situation
> where electrical noise, or anything else, alters the data sent through
> the wire.
>
> For example, I write 'piano' on the client, but 'violin' is written to
> the disk on the server because of corruption during transport over the
> wire. Is this kind of corruption possible, or can layer 2 + AoE catch
> this situation and fix this kind of issue?
>
> I use the ZFS filesystem, which provides end-to-end data protection by
> adding an extra checksum to every block written to the disk. This
> provides very good data protection, but it would be nice to know that
> using AoE will not cause corruption in a SAN environment. Others, who
> use iSCSI, always tell me that iSCSI has a checksum to detect transfer
> issues while AoE has no protection of this kind, so data corruption can
> happen and will not be detected.
>
> In fact, I have never used iSCSI, because it is much more complicated
> than AoE and I have seen several people have trouble getting it running
> in their environment. AoE just works.
>
> Thank you!
>
> István
From: PongráczI <pon...@po...> - 2013-04-06 10:21:32
In this case I cannot see any difference between local and AoE disks.

ZFS has the advantage (if we use it directly as the filesystem, not ext3 on top of a zvol) that it can detect exactly which file changed due to any kind of corruption (on the disk surface, an HDD BIOS bug, a chipset bug, whatever bug). If you have redundancy (the equivalent of RAID 1, RAID 5 or RAID 6), it can fix this kind of issue and make the data redundant again. (ZFS has a lot of benefits; it is worth checking out.)

Thank you very much again! Have a nice weekend!

István

----------------original message-----------------
From: "Alexandre" al...@gm...
To: "PongráczI"
CC: "aoetools-discuss lists.sourceforge.net" aoe...@li...
Date: Sat, 6 Apr 2013 11:08:08 +0100
----------------------------------------------------------

> As stated in my previous answer, protection against data corruption on the wire is ensured by the Ethernet CRC32 checksum:
> http://en.wikipedia.org/wiki/Error_detection_and_correction#Internet
>
> If the corruption of data happens before the Ethernet stack is reached, neither iSCSI nor AoE can do anything about it on the wire. Although I don't know ZFS, I guess in this case the data protection you're talking about should apply.
>
> But don't take my word for it... keep on digging.
>
> Regards, Alex.
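The detect-and-repair behaviour described for ZFS can be sketched in a few lines of Python. This is a toy model and not ZFS code: SHA-256 stands in for whatever checksum the pool is configured with, a two-element list stands in for a mirror, and the names `checksum` and `read_with_repair` are invented for this example.

```python
import hashlib


def checksum(block: bytes) -> bytes:
    """Checksum of one data block (SHA-256 used as a stand-in)."""
    return hashlib.sha256(block).digest()


def read_with_repair(copies: list, expected: bytes) -> bytes:
    """Return the first copy matching the checksum and overwrite bad copies with it."""
    good = next((bytes(c) for c in copies if checksum(bytes(c)) == expected), None)
    if good is None:
        raise IOError("all copies are corrupted")
    for c in copies:
        if checksum(bytes(c)) != expected:
            c[:] = good  # heal the damaged copy from the good one
    return good


block = b"piano"
stored_checksum = checksum(block)  # kept separately from the data block itself
mirror = [bytearray(block), bytearray(block)]

mirror[0][:] = b"violn"  # silent corruption of one copy (disk, cable, switch, ...)

data = read_with_repair(mirror, stored_checksum)
print(data, mirror[0] == mirror[1])  # b'piano' True
```

Because the checksum is stored apart from the block and verified on read, it does not matter where the damage happened, whether on a local disk or somewhere along an AoE path. Without a second copy the corruption would still be detected but could not be repaired.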