aoetools-discuss Mailing List for ATA over Ethernet Tools (Page 4)
Brought to you by: ecashin, elcapitansam
From: Catalin S. <cs...@us...> - 2015-02-19 13:10:30
|
I tend to be of the same opinion, with the added note that I knew about the alignment differences prior to debugging this and we still fell for it. I doubt that a warning in a man page would come to mind (or really match what one would be looking for) when a user notices corruption issues in running systems. A log warning also has the added benefit that it can provide a straightforward value to truncate against.

@Ed: I'm not rigid regarding this and it's your call; however, I lack the time to provide pull requests for any of these at the moment, and the code I provided wasn't tested (we currently apply that correction externally, but it's essentially the same and it's not hard to follow).
|
From: Joshua J. K. <jo...@az...> - 2015-02-19 04:39:24
|
You might argue that people are more likely to read the logs than the docs... but then, a lot of people read neither until something goes wrong. But maybe finding that message in the logs is more likely to happen when something goes wrong, rather than "Hmm, something is wrong, I think I'll go look for warnings in the docs."

But maybe that's just me. :)

j

--
Joshua J. Kugler - Fairbanks, Alaska
Azariah Enterprises - Programming and Website Design
jo...@az... - Jabber: ped...@gm...
PGP Key: http://pgp.mit.edu/ ID 0x73B13B6A
|
From: Ed C. <ed....@ac...> - 2015-02-19 04:18:43
|
Would you consider a pull request that includes an addition to the documentation? That seems like a more appropriate place for a warning. |
From: Catalin S. <cs...@us...> - 2015-02-19 03:28:17
|
Hi.

While I haven't gotten around to testing any of the "recent" changes, a colleague finally tracked down one of our long-standing corruption issues some time ago, and I think I should suggest a change that might help others.

WinAoE has some code in the GettingsSize state that truncates a disk to CHS geometry. Prior to Vista, Windows enforced CHS alignment for partition boundaries, so this was not a problem. However, if you installed a newer OS (one using 1MB boundaries) and then moved it to AoE storage, truncating at a partition boundary could cause sectors to be missing under WinAoE, corrupting your data. Windows probably never actually relied on this behaviour, since it was enforcing alignment itself.

I would like to request a warning along the lines of the following (while the 512 byte sector size is superfluous, I include it for clarity):

    /* parenthesized so that % applies to the whole product */
    #define CHSALIGN (255*63*512)
    if ((size*512) % CHSALIGN) {
        /* round down: a truncation target cannot exceed the current size */
        vlong recsz = (size*512) - (size*512)%CHSALIGN;
        printf("Exported size (%llu) is not aligned to usual CHS geometry.\n", size*512);
        printf("Consider truncating to %llu bytes to prevent issues.\n", recsz);
    }

Please excuse the lack of a pull request. I'll try getting back to the other changes I was proposing at a later time.

Thanks!
|
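The check requested above can be sketched as a pair of standalone C functions. This is only an illustration under stated assumptions, not vblade or WinAoE code: the names `CHS_CYL_BYTES`, `chs_truncated_bytes` and `chs_warn` are hypothetical, the geometry is assumed to be the conventional 255 heads by 63 sectors with 512-byte sectors, and the suggested size is rounded down because the message recommends truncating.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Bytes in one cylinder of the conventional 255-head, 63-sector CHS
 * geometry with 512-byte sectors. Parenthesized so it is safe inside
 * larger expressions such as x % CHS_CYL_BYTES. */
#define CHS_CYL_BYTES (255ULL * 63ULL * 512ULL)

/* Hypothetical helper: largest CHS-aligned size, in bytes, that does not
 * exceed an export of nsectors 512-byte sectors. */
uint64_t chs_truncated_bytes(uint64_t nsectors)
{
    assert(nsectors <= UINT64_MAX / 512);   /* guard against overflow */
    uint64_t bytes = nsectors * 512ULL;
    return bytes - bytes % CHS_CYL_BYTES;
}

/* Print a warning when the export is not CHS-aligned.
 * Returns 1 if a warning was printed, 0 otherwise. */
int chs_warn(uint64_t nsectors)
{
    assert(nsectors <= UINT64_MAX / 512);
    uint64_t bytes = nsectors * 512ULL;

    if (bytes % CHS_CYL_BYTES == 0)
        return 0;
    printf("Exported size (%llu) is not aligned to usual CHS geometry.\n",
           (unsigned long long)bytes);
    printf("Consider truncating to %llu bytes to prevent issues.\n",
           (unsigned long long)chs_truncated_bytes(nsectors));
    return 1;
}
```

For example, a 20480000-sector export is 10485760000 bytes, which is not a multiple of 8225280 bytes; under these assumptions the sketch suggests 10479006720 bytes, that is, 1274 whole cylinders.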
From: Ed C. <ed....@ac...> - 2014-12-05 01:59:50
|
On 12/04/2014 11:28 AM, Adi Kriegisch wrote:
> Hey!
>
>> Linux 2.6.32-71.el6.x86_64 #1 SMP Fri May 20 03:51:51 BST 2011 x86_64
>> x86_64 x86_64 GNU/Linux
> (...)
>> # aoe-version
>> aoetools: 35
>> installed aoe driver: 47
>> running aoe driver: 47
> I experienced similar behaviour (although not even close to a load of 200
> ;-) with the in-kernel version of aoe in kernel 3.2. I'd do an upgrade to
> the most recent version of the aoe driver and test again...

Yes, and just in case you don't know, Absolutely Free, the version of the aoe driver that is distributed at coraid.com is backward compatible with an old kernel like yours. It gets patched when you build it with make, as the README says.

http://support.coraid.com/support/linux/

--
Ed
|
From: Adi K. <ad...@cg...> - 2014-12-04 16:44:38
|
Hey!

> Linux 2.6.32-71.el6.x86_64 #1 SMP Fri May 20 03:51:51 BST 2011 x86_64
> x86_64 x86_64 GNU/Linux
(...)
> # aoe-version
> aoetools: 35
> installed aoe driver: 47
> running aoe driver: 47

I experienced similar behaviour (although not even close to a load of 200 ;-) with the in-kernel version of aoe in kernel 3.2. I'd do an upgrade to the most recent version of the aoe driver and test again...

-- Adi
|
From: <abs...@li...> - 2014-12-04 16:17:13
|
Hi,

I am using Coraid (Model: ST3750330NS).

OS is CentOS:

    Distributor ID: CentOS
    Description: CentOS release 6.6 (Final)
    Release: 6.6
    Codename: Final

kernel:

    Linux 2.6.32-71.el6.x86_64 #1 SMP Fri May 20 03:51:51 BST 2011 x86_64 x86_64 x86_64 GNU/Linux

8 GB RAM, 2 x Intel(R) Pentium(R) D CPU 3.00GHz

I use this server (directly connected to the shelf through a crossover cable) as a mail server. The mail spool is on the etherd device.

I am having big load problems (> 200!). I would like to investigate whether this is related to a file system issue or a network issue.

I set MTU 9000 on the dedicated Ethernet interface:

    0a:00.0 Ethernet controller: Intel Corporation 82572EI Gigabit Ethernet Controller (Copper) (rev 06)

    eth2 Link encap:Ethernet HWaddr 00:15:17:CB:57:AB
         inet6 addr: fe80::215:17ff:fecb:57ab/64 Scope:Link
         UP BROADCAST RUNNING MULTICAST MTU:9000 Metric:1

In /var/log/messages I see this:

    Call Trace:
    [<ffffffff8120051f>] ? security_inode_permission+0x1f/0x30
    [<ffffffff8117ab4d>] ? __link_path_walk+0xfd/0x1040
    [<ffffffff814c97ae>] __mutex_lock_slowpath+0x13e/0x180
    [<ffffffff811794b1>] ? path_put+0x31/0x40
    [<ffffffff814c964b>] mutex_lock+0x2b/0x50
    [<ffffffff81178b7f>] lock_rename+0x3f/0xe0
    [<ffffffff8117c133>] sys_renameat+0x113/0x260
    [<ffffffff81135837>] ? handle_pte_fault+0xf7/0xad0
    [<ffffffff81171b14>] ? cp_new_stat+0xe4/0x100
    [<ffffffff81180210>] ? filldir+0x0/0xe0
    [<ffffffff810d40a2>] ? audit_syscall_entry+0x272/0x2a0
    [<ffffffff8117c29b>] sys_rename+0x1b/0x20
    [<ffffffff81013172>] system_call_fastpath+0x16/0x1b

and this is the output of the "top" command:

    top - 17:15:32 up 1:29, 1 user, load average: 160.25, 164.19, 242.60
    Tasks: 597 total, 2 running, 595 sleeping, 0 stopped, 0 zombie
    Cpu(s): 10.2%us, 5.2%sy, 0.0%ni, 0.0%id, 84.1%wa, 0.0%hi, 0.5%si, 0.0%st
    Mem: 8059612k total, 5515976k used, 2543636k free, 1737132k buffers
    Swap: 10289144k total, 0k used, 10289144k free, 2254068k cached

(notice the high I/O wait)

    # aoe-version
    aoetools: 35
    installed aoe driver: 47
    running aoe driver: 47

Do you have any suggestions?

Thank you very much!
|
From: Keri A. <k.a...@sy...> - 2014-08-28 20:57:58
|
Unless I'm mistaken, it seems that some kernel updates are also helping. |
From: Robin P. B. <ro...@co...> - 2014-08-28 13:11:41
|
this diff seems to move things along...

# diff -u aoe6-85/linux/Documentation/aoe/udev.txt /etc/udev/rules.d/60-aoe.rules
--- aoe6-85/linux/Documentation/aoe/udev.txt 2013-11-14 22:09:20.000000000 -0500
+++ /etc/udev/rules.d/60-aoe.rules 2014-08-28 08:44:24.020125669 -0400
@@ -16,11 +16,11 @@
 #

 # aoe char devices
-SUBSYSTEM=="aoe", KERNEL=="discover", NAME="etherd/%k", GROUP="disk", MODE="0220"
-SUBSYSTEM=="aoe", KERNEL=="err", NAME="etherd/%k", GROUP="disk", MODE="0440"
-SUBSYSTEM=="aoe", KERNEL=="interfaces", NAME="etherd/%k", GROUP="disk", MODE="0220"
-SUBSYSTEM=="aoe", KERNEL=="revalidate", NAME="etherd/%k", GROUP="disk", MODE="0220"
-SUBSYSTEM=="aoe", KERNEL=="flush", NAME="etherd/%k", GROUP="disk", MODE="0220"
+SUBSYSTEM=="aoe", KERNEL=="discover", SYMLINK="etherd/%k", GROUP="disk", MODE="0220"
+SUBSYSTEM=="aoe", KERNEL=="err", SYMLINK="etherd/%k", GROUP="disk", MODE="0440"
+SUBSYSTEM=="aoe", KERNEL=="interfaces", SYMLINK="etherd/%k", GROUP="disk", MODE="0220"
+SUBSYSTEM=="aoe", KERNEL=="revalidate", SYMLINK="etherd/%k", GROUP="disk", MODE="0220"
+SUBSYSTEM=="aoe", KERNEL=="flush", SYMLINK="etherd/%k", GROUP="disk", MODE="0220"

 # aoe block devices
 KERNEL=="etherd*", GROUP="disk"

# ls -l /dev/etherd/*
lrwxrwxrwx 1 root root 11 Aug 28 08:44 /dev/etherd/discover -> ../discover
lrwxrwxrwx 1 root root 6 Aug 28 08:44 /dev/etherd/err -> ../err
lrwxrwxrwx 1 root root 8 Aug 28 08:44 /dev/etherd/flush -> ../flush
lrwxrwxrwx 1 root root 13 Aug 28 08:44 /dev/etherd/interfaces -> ../interfaces
lrwxrwxrwx 1 root root 13 Aug 28 08:44 /dev/etherd/revalidate -> ../revalidate

--
Robin P. Blanchard
Technical Solutions Engineer
Global Field Services and Support
Coraid: storage that fits your business
www.coraid.com
|
From: Keri A. <k.a...@sy...> - 2014-08-28 10:28:34
|
It would appear that a new udev package has appeared within the last 24 hours or so. While it's a little preliminary... I think that it's helping... something was corrected. I'll confirm some of this within the next few tests.

So the basic steps:

    sudo apt-get update && sudo apt-get upgrade

    make && sudo make install

|
From: Ed C. <ed....@ac...> - 2014-08-28 01:23:59
|
Hi!

I will respond between selected quotes below.

On 08/27/2014 08:24 PM, Keri Alleyne wrote:
...
> [ 109.114933] aoe: module verification failed: signature and/or
> required key missing - tainting kernel
> [ 109.132450] aoe: AoE v85 initialised.
> [ 109.134865] aoe: e0.0: setting 1024 byte data frames
> [ 109.135635] aoe: 00e04cd770c5 e0.0 v4014 has 20480000 sectors
> [ 109.138386] etherd/e0.0: unknown partition table

So far so good. You have AoE going on. You might consider using jumbo frames when you get the device nodes issue sorted out.

That's what's going on with the aoetools commands:

...
> sudo aoe-discover
> aoe-discover: /dev/etherd/discover does not exist or is not writeable.
>
> sudo aoe-flush
> aoe-flush: /dev/etherd/flush does not exist or is not writeable.

That probably means that udev didn't create the device files for the aoe driver. Ubuntu 14.04 might have a udev that's new enough to have stopped supporting the udev rules that the aoe makefile installs. You probably just need to adjust the rules based on udev's complaints in the system logs in light of the udev man pages.

You're not alone, though, so I'll probably take a stab at updating the aoe makefile to support the Ubuntu 14.04 udev, unless you beat me to it. (Or anyone else here does.)

--
Ed
|
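The error text quoted above amounts to an existence-and-writability test on the control files under /dev/etherd. A minimal C sketch of an equivalent check follows, assuming that the POSIX access() call is an acceptable stand-in for whatever test the aoetools scripts actually perform; the function name `control_file_ok` is hypothetical and is not part of aoetools.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: return 0 if path exists and is writable by the
 * calling process, -1 otherwise (with a diagnostic on stderr). */
int control_file_ok(const char *path)
{
    assert(path != NULL);
    if (access(path, W_OK) == 0)
        return 0;
    fprintf(stderr, "%s does not exist or is not writeable: %s\n",
            path, strerror(errno));
    return -1;
}
```

Running such a check against /dev/etherd/discover and /dev/etherd/flush after loading the module would show whether udev created the nodes at all, which separates a udev rules problem from a permissions problem.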
From: Keri A. <k.a...@sy...> - 2014-08-28 00:41:44
|
Good day.

I would like to report that we are running into some difficulty executing commands from AoE Tools following aoe6-85 driver compilation on Ubuntu 14.04.1 LTS Server (i386).

Applied Ubuntu patches.

    sudo apt-get install build-essential linux-headers-`uname -r`

Extracted aoe6-85.tar.gz

    make
    sudo make install
    sudo modprobe aoe

dmesg shows:

    [ 109.114933] aoe: module verification failed: signature and/or required key missing - tainting kernel
    [ 109.132450] aoe: AoE v85 initialised.
    [ 109.134865] aoe: e0.0: setting 1024 byte data frames
    [ 109.135635] aoe: 00e04cd770c5 e0.0 v4014 has 20480000 sectors
    [ 109.138386] etherd/e0.0: unknown partition table

    sudo aoe-stat
    e0.0 10.485GB eth0 1024 up

    sudo aoe-discover
    aoe-discover: /dev/etherd/discover does not exist or is not writeable.

    sudo aoe-flush
    aoe-flush: /dev/etherd/flush does not exist or is not writeable.

-----------------------------------------------------

As you see, there are some problems running aoe-discover and aoe-flush. In addition, the tainting of the kernel is a little strange.

Anyway, when this process is repeated on a Ubuntu 12.04 LTS system, the AoE Tools work just fine. So there is something different in how AoE compiles on 14.04.1.

Any suggestions?

Thanks.
|
From: Ed C. <ed....@ac...> - 2014-07-16 01:28:36
|
On 07/15/2014 01:29 AM, David Leach wrote:
> Ed,
>
> I'm less concerned about the initiator side as we don't really have
> direct control over what it requests. What I suggest is that these
> requests from the host on the initiator will likely be aligned
> requests due to how their file system works to try to keep things
> efficient. If we then cause the resulting AoE requests to the server
> be unaligned accesses then that will likely cause additional IO
> transactions to the file system which would then likely cause latency
> delays on the responses to the these requests.

As long as you don't specify the sync or direct options, though, the vblade will write to a buffered backing store. Then the ultimate backing store (e.g., disk drive), the ultimate driver (e.g., SCSI layer), the block layer, the middle layer (e.g., dm and md), the VM subsystem and (if it's a file) the filesystem will get a chance to merge and align I/O.

--
Ed
|
From: Ed C. <ed....@ac...> - 2014-07-16 01:12:35
|
For measuring latency and responsiveness, fio is a great tool. It's by the maintainer of the Linux kernel's block layer. You can even export its data easily to data analysis software like GNU R or a Python script that uses pandas.

There's a feature of the Linux kernel that I've never tried, but you might be interested in it. The block layer has a tracing feature in recent kernels, and there's a blktrace tool that works with it.
|
From: David L. <ta...@gm...> - 2014-07-15 05:29:20
|
Ed,

I'm less concerned about the initiator side, as we don't really have direct control over what it requests. What I suggest is that these requests from the host on the initiator will likely be aligned requests, due to how its file system works to try to keep things efficient. If we then cause the resulting AoE requests to the server to be unaligned accesses, that will likely cause additional I/O transactions to the file system, which would in turn likely add latency to the responses to these requests.

David
|
From: David & L. L. <dl...@po...> - 2014-07-15 02:16:22
|
Killer,

That confirms some of my suspicion. In my testing I can see requests for 1024 sectors (512K) of data from the hard drive which the AoE client would have to carve up into individual read/write requests that can fit into an AoE packet. At the server, each of these requests would appear as individual read/write requests of the disk so if you followed the optimal packet usage for Ethernet for AoE you would end up with a jumbo frame request for 17 sectors. The initial 1024 sector request at the client would start on an aligned boundary for the first two 4k "sectors" but then have a trailing 512 byte sector request which will cause the next 7 requests to start off unaligned and end aligned... so a 1024 sector request from the host OS will result in only 1 out of 8 requests starting on an aligned boundary.

Since the AoE client driver is handling disk requests from the host OS, the host OS is going to assume certain things about the disk and try to ensure proper alignment requests. I even think I've seen that if you have an application that is going to write to sector 1 that the host will read (or page) in the 4k chunk starting at sector 0 and then write out the 4k chunk at sector 0 with the modification of sector 1.

It seems like if we wanted to ensure alignment and support this configurable "max sector count" request size that the size we would want would be 16 to keep these large requests aligned and to ensure maximum efficiency for disk usage at the server. But this goes back to some of my original questions:

1) What is the test setup to determine the results of changing the max request size?
2) How does one measure latency and "responsiveness"?

David
|
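The one-in-eight figure above can be checked with a few lines of C. This is a sketch under simplifying assumptions: the host request starts on an aligned LBA, it is split into back-to-back chunks of a fixed number of 512-byte sectors, and alignment means a 4 KiB (8-sector) boundary. The function names `aligned_starts` and `report_split` are hypothetical and are not taken from the aoe driver.

```c
#include <assert.h>
#include <stdio.h>

/* Count how many AoE requests begin on an aligned boundary when a host
 * request of total_sectors, starting at an aligned LBA, is split into
 * chunks of at most per_frame sectors. align_sectors is the alignment
 * unit in 512-byte sectors (8 for 4 KiB). The number of requests is
 * stored in *nrequests when that pointer is not NULL. */
unsigned aligned_starts(unsigned total_sectors, unsigned per_frame,
                        unsigned align_sectors, unsigned *nrequests)
{
    unsigned aligned = 0, count = 0;

    assert(per_frame > 0 && align_sectors > 0);
    for (unsigned off = 0; off < total_sectors; off += per_frame) {
        count++;
        if (off % align_sectors == 0)
            aligned++;
    }
    if (nrequests)
        *nrequests = count;
    return aligned;
}

/* Print the split of one host request for a given frame payload. */
void report_split(unsigned total_sectors, unsigned per_frame)
{
    unsigned n = 0;
    unsigned a = aligned_starts(total_sectors, per_frame, 8, &n);

    printf("%u sectors per frame: %u requests, %u start 4 KiB aligned\n",
           per_frame, n, a);
}
```

Under these assumptions a 1024-sector request split into 17-sector frames gives 61 requests, of which 8 start on a 4 KiB boundary (roughly one in eight), while 16-sector frames give 64 requests that all start aligned.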
From: Ed C. <ed....@ac...> - 2014-07-15 01:47:10
|
On 07/13/2014 06:23 PM, David Leach wrote: > So I do find it interesting to have a configuration to limit the size > of the read/write request but it seems like it would be useful to > understand the side affects on why someone would want to do this. > Catalin suggested that reducing the size of the jumbo frames decreases > latency and improves boot-times and said that the system "feels more > response". This is were I have a problem though because something > "feeling" more responsive is not very satisfying. It would be better > to have some hard numbers behind what this change does. Yes, I agree. If Catalin posts the patch here, then perhaps any interested parties would be able to gather some data. [Leach correctly notes that some jumbos carry ...] > 17 sectors of data per request. There is often a lot going on there. For example, if the initiator host is using a filesystem, then writes will dirty pages of memory that are buffering the data from the AoE device. The virtual memory subsystem will flush that data when it gets around to it, using whatever chunks it likes, then the block layer will probably consolidate or split the I/O as it likes inside the I/O scheduler, and only then will the aoe initiator get the data. But the aoe driver will set up network buffers (sk_buff structures) that point right into the memory associated with the I/O. The network card itself often does the transfer from RAM into the card and vice versa. I'm not sure there's a significant penalty paid for telling the NIC to DMA seventeen sectors. It would be a good test to do in the aoe driver with a few different representative NICs. Further, on the target side, there's no guarantee that the target will do the I/O in exactly the same chunks that appear in the AoE packets. Even disk drives have elevator algorithms scheduling I/O from write buffers. I agree that test results here would be interesting, but a big "Your Mileage May Vary" should accompany the results. -- Ed |
From: Killer{R} <su...@ki...> - 2014-07-14 19:29:59
|
Hello David, Monday, July 14, 2014, 1:23:56 AM, you wrote: IMHO the problem is caused not only by the page size but also by the HDD's sector size. Nowadays HDDs have a 4K physical sector size. They still support access in 512-byte units, but this is inefficient, because every unaligned read that doesn't fit into a 4K sector will result in a 4K read, and every unaligned write will cause the disk to read the sector's data, modify it internally in its buffers and then write it back. Sure, the firmware tries to do this in the fastest way, but my tests show that there is about a 20..30% sequential write speed degradation (with O_DIRECT) when writing 4K blocks if the beginning of each block is not aligned to 4K too. So simply using jumbo frames is not enough to make the hardware work as fast as it can. The AoE protocol doesn't support 4K sectors directly, because it has to support 'normal' MTUs and not only jumbo frames. However, it's theoretically possible to make the initiator report to the OS that it is a '4K sector drive', and a proper ('4K sector aware' :) ) OS will then access it in 4K-aligned portions, which together with some buffering on the target's side should make it all work faster :). But it all looks like a tricky workaround. DL> So I do find it interesting to have a configuration to limit the size of DL> the read/write request but it seems like it would be useful to understand DL> the side affects on why someone would want to do this. Catalin suggested DL> that reducing the size of the jumbo frames decreases latency and improves DL> boot-times and said that the system "feels more response". This is were I DL> have a problem though because something "feeling" more responsive is not DL> very satisfying. It would be better to have some hard numbers behind what DL> this change does. DL> AoE using normal Ethernet frames end up having a protocol efficiency of DL> only 89.82% which on a 1Gb Ethernet would give you a theoretical maximum DL> throughput of ~112 MB/s. 
Going up to a 9000 byte frame bumps the efficiency DL> to 98.68% and a theoretical max throughput of ~123 MB/s. Something DL> interesting about jumbo frames though is that it ends up being able to DL> request 17 sectors of data per request. DL> Why is this interesting? Because on some Linux systems, a page size is 4096 DL> or 8 sectors so the 17 sectors works out to 2 full pages plus touching into DL> another page. If you are not using direct IO but instead letting Linux DL> manage the underlying file system then it would seem like you will end up DL> making unaligned IO requests of the system causing additional I/Os to be DL> issued. This might be the reason for the latency affects and it would be DL> interesting to get the numbers that Catalin may have in his tests... I DL> wouldn't mind seeing results for 17, 16, 8 sector count requests. DL> But what I don't understand is that if the throughput is 80 MB/s and drops DL> to 60 MB/s as Catalin suggests then I don't get how a 20 MB/s drop in DL> throughput would make the system be more responsive ... I also don't DL> understand what the test setup would be to even measure the affects of DL> latency, throughput and having it correlate to responsiveness? DL> David -- Best regards, Killer{R} mailto:su...@ki... |
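Killer's point about read-modify-write on 4K-sector drives can be illustrated with a simplified model. The sketch below assumes a drive with 512-byte logical and 4096-byte physical sectors and treats any physical sector that a write covers only partially as needing a read-modify-write. Real firmware may cache or merge writes, so this is only an approximation, and the function name is made up for illustration.

```python
LOGICAL = 512     # bytes per logical sector exposed by the drive
PHYSICAL = 4096   # bytes per physical sector (assumed 4K drive)
RATIO = PHYSICAL // LOGICAL  # 8 logical sectors per physical sector


def partial_physical_sectors(lba, count):
    """Return the physical sectors that a write of `count` logical sectors
    starting at `lba` covers only partially.

    In this simplified model each of them would need a read-modify-write
    inside the drive.
    """
    first = lba // RATIO
    last = (lba + count - 1) // RATIO
    partial = []
    for phys in range(first, last + 1):
        phys_start = phys * RATIO
        phys_end = phys_start + RATIO
        # Number of logical sectors of this physical sector touched by the write.
        covered = min(lba + count, phys_end) - max(lba, phys_start)
        if covered < RATIO:
            partial.append(phys)
    return partial


print(partial_physical_sectors(0, 16))   # aligned start, whole 4K blocks
print(partial_physical_sectors(0, 17))   # one trailing 512-byte sector
print(partial_physical_sectors(17, 17))  # the next 17-sector request
```

In this model a 16-sector write at LBA 0 touches no partial physical sector, a 17-sector write at LBA 0 leaves one partial sector at its tail, and the following 17-sector write at LBA 17 is partial at both ends.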
From: David L. <ta...@gm...> - 2014-07-13 22:24:03
|
So I do find it interesting to have a configuration to limit the size of the read/write request, but it seems like it would be useful to understand the side effects and why someone would want to do this. Catalin suggested that reducing the size of the jumbo frames decreases latency and improves boot times, and said that the system "feels more response". This is where I have a problem, though, because something "feeling" more responsive is not very satisfying. It would be better to have some hard numbers behind what this change does. AoE using normal Ethernet frames ends up having a protocol efficiency of only 89.82%, which on 1Gb Ethernet would give you a theoretical maximum throughput of ~112 MB/s. Going up to a 9000 byte frame bumps the efficiency to 98.68% and the theoretical max throughput to ~123 MB/s. Something interesting about jumbo frames, though, is that they end up being able to request 17 sectors of data per request. Why is this interesting? Because on some Linux systems a page is 4096 bytes, or 8 sectors, so the 17 sectors work out to 2 full pages plus part of another page. If you are not using direct I/O but instead letting Linux manage the underlying file system, then it would seem like you will end up making unaligned I/O requests of the system, causing additional I/Os to be issued. This might be the reason for the latency effects, and it would be interesting to get the numbers that Catalin may have from his tests... I wouldn't mind seeing results for 17, 16 and 8 sector count requests. But what I don't understand is this: if the throughput is 80 MB/s and drops to 60 MB/s as Catalin suggests, I don't get how a 20 MB/s drop in throughput would make the system more responsive ... I also don't understand what the test setup would be to even measure the effects on latency and throughput and correlate them to responsiveness. David |
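The figures in this message can be reproduced with a back-of-the-envelope sketch. Two constants are assumptions rather than anything stated in the thread. The first is 22 bytes of AoE plus ATA header inside the Ethernet payload. The second is 116 bytes of total per-request wire overhead, which is simply the value that makes the quoted 89.82% and 98.68% come out; the message does not say how the overhead was counted.

```python
SECTOR = 512          # bytes per sector carried in an AoE ATA request
AOE_ATA_HEADER = 22   # assumed: 10-byte AoE header + 12-byte ATA header
OVERHEAD = 116        # assumed: back-derived from the quoted percentages
LINK_MB_S = 125.0     # 1 Gb/s expressed in MB/s (10**9 / 8 / 10**6)


def sectors_per_frame(mtu):
    """Whole sectors that fit in one frame after the AoE/ATA headers."""
    return (mtu - AOE_ATA_HEADER) // SECTOR


def efficiency(mtu):
    """Fraction of the bytes on the wire that is sector payload."""
    payload = sectors_per_frame(mtu) * SECTOR
    return payload / (payload + OVERHEAD)


for mtu in (1500, 9000):
    eff = efficiency(mtu)
    print(mtu, sectors_per_frame(mtu), f"{eff:.2%}", f"{eff * LINK_MB_S:.0f} MB/s")
```

With these assumptions a 1500-byte MTU carries 2 sectors per frame (89.82%, about 112 MB/s) and a 9000-byte MTU carries 17 sectors (98.68%, about 123 MB/s), matching the numbers quoted above.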
From: Ed C. <ed....@ac...> - 2014-07-12 02:23:15
|
Catalin, Salgau, hi. Did you send a patch for this packet-size-tuning feature? It seems like it would be a nice patch for contrib/ if you could put a nice description at the top, including its motivation and your personal experiences. During testing you might even get a chance to jot down some details about specific performance differences, to motivate potential users to try the patch. (I'm composing this in Thunderbird, and I hope I'm not going to send HTML mail!) On 06/14/2014 11:59 PM, Catalin Salgau wrote: > On 15/06/2014 4:06 AM, Ed Cashin wrote: >> Hi, Catalin Salgau. >> >> I have questions below between selected quotes. >> >> On 2014-06-10 07:59, Catalin Salgau wrote: >>> I would like to request two changes before release. >>> - An option to restrict the size of packets over automatic detection of >>> MTU. >> You mean like if the MTU is 9000, you want the ability to tell the >> vblade to act like it's smaller, right? > Yes. That's the gist of it. > I believe there is some value in the ability to manually tweak the > maximum packet size used by vlade. > At the very least it would help with determining optimal parameters for > a deployment/use case. >> If you have some numbers to share (MTUs and packet sizes as well as >> throughput rates and latencies), that would fill out your interesting >> story with important details. > I sadly made no effort to document it, and, in retrospect, it might have > made for an interesting study.. > As 'methodology', the target was configured to support 9014 byte frames > and the initiator was switched between the two tested packet sizes on > Windows XP x86 and Windows 7 amd64. > Due to some problems with making these changes to multiple test images, > I haven't replicated the results over a larger set of machines. > I'm hoping to get back to this in a week or two. >> ... 
>>> - change to vblade(8) manual Synopsis section to include current syntax >> That change might be simple enough for a patch to be easier for me to >> understand than a description, so please send the patch if you don't >> mind. > This was probably poorly worded. > What I was trying to say was that the Synopsis section in the manual > page supplied with vblade has not been kept in sync with options added > to vblade. > Since changing this as suggested below would yield a line longer than 80 > chars, I'm not providing a real patch; I don't know how one should > format this. Use it as reference, maybe? > > --- a/vblade.8 > +++ b/vblade.8 > @@ -6,1 +6,1 @@ > -.B vblade [ -m mac[,mac...] ] shelf slot netif filename > +.B vblade [-b bufcnt] [-o offset] [-l length] [-dsr] [ -m mac[,mac...] > ] shelf slot netif filename > > ------------------------------------------------------------------------------ > HPCC Systems Open Source Big Data Platform from LexisNexis Risk Solutions > Find What Matters Most in Your Big Data with HPCC Systems > Open Source. Fast. Scalable. Simple. Ideal for Dirty Data. > Leverages Graph Analysis for Fast Processing & Easy Data Exploration > http://p.sf.net/sfu/hpccsystems > _______________________________________________ > Aoetools-discuss mailing list > Aoe...@li... > https://lists.sourceforge.net/lists/listinfo/aoetools-discuss -- Ed |
From: Ed C. <ed....@ac...> - 2014-07-10 02:06:01
|
On 2014-07-07 20:18, Ed Cashin wrote: > On 2014-06-15 07:08, Catalin Salgau wrote: > ... >> Legacy note: >> Negotiation is, I believe, a reminiscence from before the Query Config >> Information 'Sector Count' field was added, in AoEr9. >> But I would argue that this was invalid behaviour in AoEr8 (and maybe >> previously. I was unable to find previous revisions. @Ed some help >> here?) > > I'm working on this. I think that 8 is the first publicly released revision. -- Ed Cashin <ed....@ac...> |
From: Ed C. <ed....@ac...> - 2014-07-08 02:45:19
|
On 2014-06-15 07:08, Catalin Salgau wrote: ... > Legacy note: > Negotiation is, I believe, a reminiscence from before the Query Config > Information 'Sector Count' field was added, in AoEr9. > But I would argue that this was invalid behaviour in AoEr8 (and maybe > previously. I was unable to find previous revisions. @Ed some help > here?) I'm working on this. -- Ed Cashin <ed....@ac...> |
From: Catalin S. <cs...@us...> - 2014-06-15 16:03:03
|
On 15/06/2014 6:40 PM, Killer{R} wrote: > Hello Catalin, > > Sunday, June 15, 2014, 6:30:09 PM, you wrote: > > CS> Hi again! > CS> I like my long emails, don't I? > Yep :) > > About drop rate - its not my pure-theoretical assumption that drop > rate must be minimized. I played with WinAOE variable named as > OutstandingThreshold and found that IO performance best when it > nearby buffers count specified in vblade's command line (or in case of > FreeBSD - that value that actually means buffered packets count). > Also there is other yet theoretical for me aspect - if there're lot of > AoE targets/initiators sharing same wire it definately better to has > lower possible resend rate. That feature was not present in the old build, regrettably. I'll post back on the drop-rate when I have a chance to test this on Monday maybe. I was actually looking at adding more links or switching to 10G to get around this. > CS> On 15/06/2014 4:22 PM, Killer{R} wrote: >>> Hello Catalin, >>> >>> Sunday, June 15, 2014, 4:08:15 PM, you wrote: >>> >>>>>>>> I would like to request two changes before release. >>>>>>>> - An option to restrict the size of packets over automatic detection of >>>>>>>> MTU. >>>>>>> You mean like if the MTU is 9000, you want the ability to tell the >>>>>>> vblade to act like it's smaller, right? >>>>> CS> Yes. That's the gist of it. >>>>> CS> I believe there is some value in the ability to manually tweak the >>>>> CS> maximum packet size used by vlade. >>>>> But its all to initiator side. Actually for example WinAoE (and its >>>>> forks ;) ) does MTU 'autodetection' instead of using Conf::scnt. >>>>> >>> CS> That's not entirely correct. >>> CS> WinAoE indeed does a form of negotiation there - it will start at >>> CS> (MTU/sector size) and will do reads of decreasing size, until it >>> CS> receives a valid packet. >>> CS> However! If you would kindly check ata.c:157 (on v22-rc1) any ATA >>> CS> request for more than the supported packet size will be refused. 
>>> >>> That's also not entirely correct :) It increases sectors count from >>> 1 to ether MTU limit, either any kind of error from target, including >>> timeout. > CS> You're probably right there. I haven't looked at it recently. In any > CS> event, the observation stands. > CS> Changing the supported MTU in vblade will limit packets to that size (I > CS> wouldn't have bothered with the FreeBSD MTU detection code if that > CS> wasn't the case) >>> However in my investigation I found that its usefull for initiator to >>> know also value called in vblade as 'buffers count' .. I mean such >>> a count of packets initiator can send to target knowing that it will >>> likely process them all. Because sending more request than this value >>> as 'outstanding' sharply increases drops (and resends) rate. >>> I implemented also kind of negotiation to detect this by sending >>> 'congestion' extension command that does usleep(500000) and the >>> responds for all commands received in buffer. Such approach by >>> comparing with directly asking target for buffers count will >>> detect also any implicit buffering between initiator and target >>> > CS> As per the AoE spec, messages in excess of Buffer Count are dropped. > CS> Since vblade processes these synchronously, this happens at the network > CS> buffer level. If using async I/O, you're responsible for that, in theory. > CS> As far as I remember, WinAoE not only doesn't care about that, but > CS> doesn't even request this information from the target. 
> CS> Should WinAoE limit the number of floating packets, as the target says > CS> it should, we wouldn't actually be talking about that, but that would > CS> probably cause more latency, since the initiator would have to wait for > CS> confirmation for at least one packet before sending another one in > CS> excess of bufcnt (and as I remember, WinAoE does not apply limits to > CS> sending packets) > CS> This would probably reduce throughput and increase average response time > CS> under even moderate load, but decrease drop-rate. > CS> I'm not actually sure that the drop/resend rate is something to aim for. > CS> It's clearly desirable to minimise these, but not for the sake of the > CS> number. > > CS> Regarding your proposed extension, I could see something like this being > CS> valuable in the event that the target can detect increased drop-rate and > CS> inform the initiator to ease-off or resend packets faster than the > CS> default timeout, but I since the target is not allowed to send > CS> unsolicited packets to the initiator, a specific request would be > CS> needed(say when a large number of packets are outstanding), but this > CS> would raise the question - if those packets are being dropped, what is > CS> there to stop the target's network stack from dropping our congestion > CS> detection packet? > CS> On that note, vblade could be thought to broadcast load status > CS> periodically or on high-drop rate, and in initiators would notice that > CS> and adapt, but I believe that this raises some security concerns and > CS> also would slightly slow the target since it would need to yield to the > CS> kernel for the drop-rate information every few requests. > > CS> @Ed > CS> Now, thanks to Killer's reference to the Buffer Count, I remember that > CS> the freebsd code does not actually use it to allocate the network buffers. 
> CS> Under Linux, following setsockopt with the default bufcnt, the receive > CS> buffer would end up 24000 bytes long for an MTU of 1500 bytes, and > CS> 144000 for a 9K MTU. > CS> Under FreeBSD the code defaults to a 64K buffer. That makes bufcnt 43 on > CS> an 1500 byte MTU, but 7 on a 9K MTU. > CS> This could cause an increase in dropped packets and explain the decrease > CS> in throughput I mentioned in a previous mail. I did not check for this > CS> when testing. > CS> I was not concerned because multiple instances of vblade on the same > CS> interface would saturate the channel anyway, but now I'm starting to worry:) > > > CS> ------------------------------------------------------------------------------ > CS> HPCC Systems Open Source Big Data Platform from LexisNexis Risk Solutions > CS> Find What Matters Most in Your Big Data with HPCC Systems > CS> Open Source. Fast. Scalable. Simple. Ideal for Dirty Data. > CS> Leverages Graph Analysis for Fast Processing & Easy Data Exploration > CS> http://p.sf.net/sfu/hpccsystems > CS> _______________________________________________ > CS> Aoetools-discuss mailing list > CS> Aoe...@li... > CS> https://lists.sourceforge.net/lists/listinfo/aoetools-discuss > > > |
From: Killer{R} <su...@ki...> - 2014-06-15 15:40:46
|
Hello Catalin, Sunday, June 15, 2014, 6:30:09 PM, you wrote: CS> Hi again! CS> I like my long emails, don't I? Yep :) About the drop rate - it's not a purely theoretical assumption of mine that the drop rate must be minimized. I played with the WinAoE variable named OutstandingThreshold and found that I/O performance is best when it is close to the buffer count specified on vblade's command line (or, in the case of FreeBSD, the value that actually determines the number of buffered packets). There is also another aspect, still theoretical for me - if there are a lot of AoE targets/initiators sharing the same wire, it's definitely better to have the lowest possible resend rate. CS> On 15/06/2014 4:22 PM, Killer{R} wrote: >> Hello Catalin, >> >> Sunday, June 15, 2014, 4:08:15 PM, you wrote: >> >>>>>>> I would like to request two changes before release. >>>>>>> - An option to restrict the size of packets over automatic detection of >>>>>>> MTU. >>>>>> You mean like if the MTU is 9000, you want the ability to tell the >>>>>> vblade to act like it's smaller, right? >>>> CS> Yes. That's the gist of it. >>>> CS> I believe there is some value in the ability to manually tweak the >>>> CS> maximum packet size used by vlade. >>>> But its all to initiator side. Actually for example WinAoE (and its >>>> forks ;) ) does MTU 'autodetection' instead of using Conf::scnt. >>>> >> CS> That's not entirely correct. >> CS> WinAoE indeed does a form of negotiation there - it will start at >> CS> (MTU/sector size) and will do reads of decreasing size, until it >> CS> receives a valid packet. >> CS> However! If you would kindly check ata.c:157 (on v22-rc1) any ATA >> CS> request for more than the supported packet size will be refused. >> >> That's also not entirely correct :) It increases sectors count from >> 1 to ether MTU limit, either any kind of error from target, including >> timeout. CS> You're probably right there. I haven't looked at it recently. In any CS> event, the observation stands. 
CS> Changing the supported MTU in vblade will limit packets to that size (I CS> wouldn't have bothered with the FreeBSD MTU detection code if that CS> wasn't the case) >> However in my investigation I found that its usefull for initiator to >> know also value called in vblade as 'buffers count' .. I mean such >> a count of packets initiator can send to target knowing that it will >> likely process them all. Because sending more request than this value >> as 'outstanding' sharply increases drops (and resends) rate. >> I implemented also kind of negotiation to detect this by sending >> 'congestion' extension command that does usleep(500000) and the >> responds for all commands received in buffer. Such approach by >> comparing with directly asking target for buffers count will >> detect also any implicit buffering between initiator and target >> CS> As per the AoE spec, messages in excess of Buffer Count are dropped. CS> Since vblade processes these synchronously, this happens at the network CS> buffer level. If using async I/O, you're responsible for that, in theory. CS> As far as I remember, WinAoE not only doesn't care about that, but CS> doesn't even request this information from the target. CS> Should WinAoE limit the number of floating packets, as the target says CS> it should, we wouldn't actually be talking about that, but that would CS> probably cause more latency, since the initiator would have to wait for CS> confirmation for at least one packet before sending another one in CS> excess of bufcnt (and as I remember, WinAoE does not apply limits to CS> sending packets) CS> This would probably reduce throughput and increase average response time CS> under even moderate load, but decrease drop-rate. CS> I'm not actually sure that the drop/resend rate is something to aim for. CS> It's clearly desirable to minimise these, but not for the sake of the CS> number. 
CS> Regarding your proposed extension, I could see something like this being CS> valuable in the event that the target can detect increased drop-rate and CS> inform the initiator to ease-off or resend packets faster than the CS> default timeout, but I since the target is not allowed to send CS> unsolicited packets to the initiator, a specific request would be CS> needed(say when a large number of packets are outstanding), but this CS> would raise the question - if those packets are being dropped, what is CS> there to stop the target's network stack from dropping our congestion CS> detection packet? CS> On that note, vblade could be thought to broadcast load status CS> periodically or on high-drop rate, and in initiators would notice that CS> and adapt, but I believe that this raises some security concerns and CS> also would slightly slow the target since it would need to yield to the CS> kernel for the drop-rate information every few requests. CS> @Ed CS> Now, thanks to Killer's reference to the Buffer Count, I remember that CS> the freebsd code does not actually use it to allocate the network buffers. CS> Under Linux, following setsockopt with the default bufcnt, the receive CS> buffer would end up 24000 bytes long for an MTU of 1500 bytes, and CS> 144000 for a 9K MTU. CS> Under FreeBSD the code defaults to a 64K buffer. That makes bufcnt 43 on CS> an 1500 byte MTU, but 7 on a 9K MTU. CS> This could cause an increase in dropped packets and explain the decrease CS> in throughput I mentioned in a previous mail. I did not check for this CS> when testing. CS> I was not concerned because multiple instances of vblade on the same CS> interface would saturate the channel anyway, but now I'm starting to worry:) -- Best regards, Killer{R} mailto:su...@ki... |
From: Catalin S. <cs...@us...> - 2014-06-15 15:30:28
|
Hi again! I like my long emails, don't I? On 15/06/2014 4:22 PM, Killer{R} wrote: > Hello Catalin, > > Sunday, June 15, 2014, 4:08:15 PM, you wrote: > >>>>>> I would like to request two changes before release. >>>>>> - An option to restrict the size of packets over automatic detection of >>>>>> MTU. >>>>> You mean like if the MTU is 9000, you want the ability to tell the >>>>> vblade to act like it's smaller, right? >>> CS> Yes. That's the gist of it. >>> CS> I believe there is some value in the ability to manually tweak the >>> CS> maximum packet size used by vlade. >>> But its all to initiator side. Actually for example WinAoE (and its >>> forks ;) ) does MTU 'autodetection' instead of using Conf::scnt. >>> > CS> That's not entirely correct. > CS> WinAoE indeed does a form of negotiation there - it will start at > CS> (MTU/sector size) and will do reads of decreasing size, until it > CS> receives a valid packet. > CS> However! If you would kindly check ata.c:157 (on v22-rc1) any ATA > CS> request for more than the supported packet size will be refused. > > That's also not entirely correct :) It increases sectors count from > 1 to ether MTU limit, either any kind of error from target, including > timeout.
You're probably right there. I haven't looked at it recently. In any event, the observation stands. Changing the supported MTU in vblade will limit packets to that size (I wouldn't have bothered with the FreeBSD MTU detection code if that wasn't the case).
> However in my investigation I found that its usefull for initiator to > know also value called in vblade as 'buffers count' .. I mean such > a count of packets initiator can send to target knowing that it will > likely process them all. Because sending more request than this value > as 'outstanding' sharply increases drops (and resends) rate. > I implemented also kind of negotiation to detect this by sending > 'congestion' extension command that does usleep(500000) and the > responds for all commands received in buffer. Such approach by > comparing with directly asking target for buffers count will > detect also any implicit buffering between initiator and target >
As per the AoE spec, messages in excess of Buffer Count are dropped. Since vblade processes these synchronously, this happens at the network buffer level. If using async I/O, you're responsible for that, in theory. As far as I remember, WinAoE not only doesn't care about that, but doesn't even request this information from the target. Should WinAoE limit the number of in-flight packets, as the target says it should, we wouldn't actually be talking about that, but that would probably cause more latency, since the initiator would have to wait for confirmation of at least one packet before sending another one in excess of bufcnt (and as I remember, WinAoE does not apply limits to sending packets). This would probably reduce throughput and increase average response time under even moderate load, but decrease the drop rate. I'm not actually sure that the drop/resend rate is something to aim for. It's clearly desirable to minimise these, but not for the sake of the number.
Regarding your proposed extension, I could see something like this being valuable in the event that the target can detect an increased drop rate and inform the initiator to ease off, or to resend packets faster than the default timeout. But since the target is not allowed to send unsolicited packets to the initiator, a specific request would be needed (say, when a large number of packets are outstanding), and this raises the question: if those packets are being dropped, what is there to stop the target's network stack from dropping our congestion detection packet? On that note, vblade could be taught to broadcast load status periodically or when the drop rate is high, and initiators would notice that and adapt, but I believe this raises some security concerns and would also slightly slow the target, since it would need to ask the kernel for the drop-rate information every few requests.
@Ed Now, thanks to Killer's reference to the Buffer Count, I remember that the FreeBSD code does not actually use it to allocate the network buffers. Under Linux, following setsockopt with the default bufcnt, the receive buffer would end up 24000 bytes long for an MTU of 1500 bytes, and 144000 bytes for a 9K MTU. Under FreeBSD the code defaults to a 64K buffer. That makes the effective bufcnt 43 on a 1500 byte MTU, but only 7 on a 9K MTU. This could cause an increase in dropped packets and explain the decrease in throughput I mentioned in a previous mail. I did not check for this when testing. I was not concerned because multiple instances of vblade on the same interface would saturate the channel anyway, but now I'm starting to worry :) |
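The buffer arithmetic in this message can be written out as a small sketch. It assumes a default bufcnt of 16, which is inferred from 24000 / 1500 rather than stated in the message, and it treats "64K" as 65536 bytes. Both helper names are hypothetical and do not come from the vblade source.

```python
DEFAULT_BUFCNT = 16         # assumed: inferred from 24000 bytes / 1500-byte MTU
FREEBSD_BUFFER = 64 * 1024  # assumed: "64K" taken as 65536 bytes


def linux_rcvbuf_bytes(mtu, bufcnt=DEFAULT_BUFCNT):
    """Receive buffer size when it is sized as `bufcnt` frames of `mtu` bytes."""
    return bufcnt * mtu


def effective_bufcnt(buffer_bytes, mtu):
    """Whole MTU-sized frames that fit in a fixed-size receive buffer."""
    return buffer_bytes // mtu


for mtu in (1500, 9000):
    print(mtu, linux_rcvbuf_bytes(mtu), effective_bufcnt(FREEBSD_BUFFER, mtu))
```

Under these assumptions the Linux buffer scales with the MTU (24000 and 144000 bytes), while a fixed 64K buffer holds 43 frames at a 1500-byte MTU but only 7 at 9000 bytes, which is the mismatch described above.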