From: Davies L. <dav...@gm...> - 2017-10-24 05:17:42
|
Hey MooseFS fans in China, We have created a WeChat group to chat about anything around MooseFS. For fans in China, please find the barcode to join attached. -- - Davies |
From: web u. <web...@gm...> - 2017-10-09 19:38:24
|
Hi,

I have the following MooseFS version:

MFS version 3.0.77-1
FUSE library version: 2.9.2
fusermount version: 2.9.2

I see in the document below:
https://moosefs.com/manpages/mfsmount.html
that there is a password option for mfsmount. Where do I set up the password on the mfsmaster server?

Regards,
WU |
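No reply appears in this archive chunk. For reference, mount passwords are normally defined on the master side in mfsexports.cfg; a hedged sketch (the network, path and password below are examples, and the exact option list should be checked against the mfsexports.cfg man page for your version):

```
# /etc/mfs/mfsexports.cfg on the mfsmaster (fragment, sketch)
# Clients from this network must supply the password to mount the root:
192.168.1.0/24  /  rw,alldirs,password=changeme
```

After editing, the master has to reload its configuration for the new export line to take effect.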
From: Tru H. <tr...@pa...> - 2017-10-09 08:48:51
|
On Mon, Oct 09, 2017 at 10:15:41AM +0200, Aleksander Wieliczko wrote:
> Hi,
>
> On 09.10.2017 08:46, Tru Huynh wrote:
> > On Sun, Oct 08, 2017 at 06:59:07PM -0700, Matt Welland wrote:
> >
> > In the same trend, if I know in advance that I need to take node A into maintenance,
> > I would like to preemptively copy the chunks of node A onto B and C
> > rather than
> > - put A in maintenance mode and lose the redundancy
> > - turn off A, and force the replication at "full" speed.
> >
> > Does that make sense?
>
> I would like to suggest using the "mark for removal" option in the mfshdd.cfg
> file rather than turning off the chunkserver.
> Just put '*' at the beginning of the path, as in the example below:
>
> */mnt/hdd1
> */mnt/hdd2
> */mnt/hdd3
>
> '*' means that the hard drive is 'marked for removal' and all its data will
> be replicated to other hard drives (usually on other chunkservers).
> In such a case you will not lose redundancy, and after the migration
> process ends, you will be able to disable the chunkserver safely.

:D I knew I already did that once, thanks!

Best regards

Tru
--
Dr Tru Huynh | mailto:tr...@pa... | tel/fax +33 1 45 68 87 37/19
https://research.pasteur.fr/en/team/structural-bioinformatics/
Institut Pasteur, 25-28 rue du Docteur Roux, 75724 Paris CEDEX 15 France |
From: Aleksander W. <ale...@mo...> - 2017-10-09 08:15:56
|
Hi,

On 09.10.2017 08:46, Tru Huynh wrote:
> On Sun, Oct 08, 2017 at 06:59:07PM -0700, Matt Welland wrote:
>
> In the same trend, if I know in advance that I need to take node A into maintenance,
> I would like to preemptively copy the chunks of node A onto B and C
> rather than
> - put A in maintenance mode and lose the redundancy
> - turn off A, and force the replication at "full" speed.
>
> Does that make sense?

I would like to suggest using the "mark for removal" option in the mfshdd.cfg file rather than turning off the chunkserver. Just put '*' at the beginning of the path, as in the example below:

*/mnt/hdd1
*/mnt/hdd2
*/mnt/hdd3

'*' means that the hard drive is 'marked for removal' and all its data will be replicated to other hard drives (usually on other chunkservers). In such a case you will not lose redundancy, and after the migration process ends, you will be able to disable the chunkserver safely.

Best regards

Aleksander Wieliczko
Technical Support Engineer
MooseFS.com |
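Put together, the change Aleksander describes is a one-character edit per line in the chunkserver's disk list (a sketch; the paths are the ones given in the message, and the reload step is my addition, to be checked against the mfschunkserver man page):

```
# /etc/mfs/mfshdd.cfg (fragment, sketch)
# A leading '*' marks the disk for removal: its chunks are replicated
# elsewhere first, so the chunkserver can later be stopped without
# losing redundancy.
*/mnt/hdd1
*/mnt/hdd2
*/mnt/hdd3
```

After editing, the chunkserver needs to re-read its configuration (e.g. a reload of the mfschunkserver service) before the marking takes effect.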
From: Tru H. <tr...@pa...> - 2017-10-09 08:15:32
|
Hi Zlatko,

On Mon, Oct 09, 2017 at 09:36:10AM +0200, Zlatko Čalušić wrote:
> On 09.10.2017 08:46, Tru Huynh wrote:
> > In the same trend, if I know in advance that I need to take node A into maintenance,
> > I would like to preemptively copy the chunks of node A onto B and C
> > rather than
> > - put A in maintenance mode and lose the redundancy
> > - turn off A, and force the replication at "full" speed.
> >
> > Does that make sense?
>
> Hello Tru,
>
> How about:
>
> * raise the goal by 1

That would raise the goal globally, per folder, not only on node A, so many more chunk copies, unless I can find the files hosted on A and replicate only those (plus the housekeeping of that list to decrease the goal afterwards). Maybe there is a way to do it that I am unaware of :P

Real usage here, slightly edited:

mfscli -S -SIG -H xxxxxx -f0
+--------------------------------------------------+
|                   Master Info                    |
+-----------------------+--------------------------+
| master version        | 2.0.81                   |
| RAM used              | 14 GiB                   |
| CPU used              | 4.31%                    |
| CPU used (system)     | 0.16%                    |
| CPU used (user)       | 4.15%                    |
| total space           | 79 TiB                   |
| avail space           | 8.3 TiB                  |
| trash space           | 1.3 MiB                  |
| trash files           | 11                       |
| sustained space       | 0 B                      |
| sustained files       | 0                        |
| all fs objects        | 50892617                 |
| directories           | 2870869                  |
| files                 | 46549380                 |
| chunks                | 46178544                 |
| all chunk copies      | 112360106                |
| regular chunk copies  | 112360106                |
| last successful store | Mon Oct 9 09:00:17 2017  |
| last save duration    | ~17.7s                   |
| last save status      | Saved in background      |
+-----------------------+--------------------------+

mfscli -S -SIC -H xxxxxx -f0
All chunks state matrix (rows: goal, columns: valid copies):
+--------+---+---+----------+----------+----+---+---+---+---+---+-----+----------+
| goal   | 0 | 1 | 2        | 3        | 4  | 5 | 6 | 7 | 8 | 9 | 10+ | all      |
+--------+---+---+----------+----------+----+---+---+---+---+---+-----+----------+
| 0      | - | - | -        | -        | -  | - | - | - | - | - | -   | 0        |
| 1      | - | - | -        | -        | -  | - | - | - | - | - | -   | 0        |
| 2      | - | - | 26175580 | 38       | -  | - | - | - | - | - | -   | 26175618 |
| 3      | - | - | -        | 20002872 | 54 | - | - | - | - | - | -   | 20002926 |
| 4      | - | - | -        | -        | -  | - | - | - | - | - | -   | 0        |
| 5      | - | - | -        | -        | -  | - | - | - | - | - | -   | 0        |
| 6      | - | - | -        | -        | -  | - | - | - | - | - | -   | 0        |
| 7      | - | - | -        | -        | -  | - | - | - | - | - | -   | 0        |
| 8      | - | - | -        | -        | -  | - | - | - | - | - | -   | 0        |
| 9      | - | - | -        | -        | -  | - | - | - | - | - | -   | 0        |
| 10+    | - | - | -        | -        | -  | - | - | - | - | - | -   | 0        |
| all 1+ | 0 | 0 | 26175580 | 20002910 | 54 | 0 | 0 | 0 | 0 | 0 | 0   | 46178544 |
+--------+---+---+----------+----------+----+---+---+---+---+---+-----+----------+
missing: 0 / endangered: 0 / undergoal: 0 / stable: 46178452 / overgoal: 92 / pending deletion: 0 / to be removed: 0

mfscli -S -SCS -H xxxxxx -f0
Chunk Servers ('regular' hdd space shown; the 'marked for removal' hdd space columns were 0 chunks / 0 B / 0 B / - on every server):
+--------------------------+------+----+---------+------+-------------+----------+---------+---------+--------+
| ip/host                  | port | id | version | load | maintenance | chunks   | used    | total   | % used |
+--------------------------+------+----+---------+------+-------------+----------+---------+---------+--------+
| abcde.fghijkl.pasteur.fr | 9422 |  2 | 2.0.81  |    1 | off         |  8942176 | 5.6 TiB | 6.3 TiB | 89.51% |
| abcde.fghijkl.pasteur.fr | 9422 |  1 | 2.0.81  |    1 | off         |  7582069 | 4.8 TiB | 5.4 TiB | 89.51% |
| abcde.fghijkl.pasteur.fr | 9422 |  3 | 2.0.81  |    1 | off         |  7960066 | 4.8 TiB | 5.4 TiB | 89.51% |
| abcde.fghijkl.pasteur.fr | 9422 | 12 | 2.0.81  |    1 | off         |  7588504 | 4.8 TiB | 5.4 TiB | 89.52% |
| abcde.fghijkl.pasteur.fr | 9422 |  9 | 2.0.81  |    1 | off         |  7788434 | 4.8 TiB | 5.4 TiB | 89.51% |
| abcde.fghijkl.pasteur.fr | 9422 | 15 | 2.0.81  |    1 | off         |  8915044 | 5.6 TiB | 6.3 TiB | 89.51% |
| abcde.fghijkl.pasteur.fr | 9422 | 10 | 2.0.81  |    1 | off         |  8818411 | 5.6 TiB | 6.3 TiB | 89.51% |
| abcde.fghijkl.pasteur.fr | 9422 | 14 | 2.0.81  |    1 | off         |  7627085 | 4.8 TiB | 5.4 TiB | 89.52% |
| abcde.fghijkl.pasteur.fr | 9422 |  6 | 2.0.81  |    1 | off         |  7699094 | 4.8 TiB | 5.4 TiB | 89.51% |
| abcde.fghijkl.pasteur.fr | 9422 | 13 | 2.0.81  |    1 | off         |  7595978 | 4.8 TiB | 5.4 TiB | 89.51% |
| abcde.fghijkl.pasteur.fr | 9422 | 11 | 2.0.81  |    1 | off         |  6311446 | 4.0 TiB | 4.5 TiB | 89.65% |
| abcde.fghijkl.pasteur.fr | 9422 |  7 | 2.0.81  |    1 | off         |  7752132 | 4.8 TiB | 5.4 TiB | 89.51% |
| abcde.fghijkl.pasteur.fr | 9422 |  5 | 2.0.81  |    1 | off         | 10162022 | 6.4 TiB | 7.2 TiB | 89.43% |
| abcde.fghijkl.pasteur.fr | 9422 |  8 | 2.0.81  |    1 | off         |  7617645 | 4.8 TiB | 5.4 TiB | 89.52% |
+--------------------------+------+----+---------+------+-------------+----------+---------+---------+--------+

So 79 TB, some with goal=2 or goal=3, and chunk servers with 4.0-6.4 TB used. In terms of chunks, they range from 6311446 to 10162022, compared to 26175580 (goal=2) and 20002910 (goal=3). It's OK to temporarily add 6-10M chunks, but 20-26M is additional overhead :D

Thanks

Tru
--
Dr Tru Huynh | mailto:tr...@pa... | tel/fax +33 1 45 68 87 37/19
https://research.pasteur.fr/en/team/structural-bioinformatics/
Institut Pasteur, 25-28 rue du Docteur Roux, 75724 Paris CEDEX 15 France |
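Tru's overhead estimate can be checked back-of-the-envelope (an illustrative sketch using the chunk counts from the mfscli output above, not part of the thread):

```python
# Chunk counts from the "all chunks state matrix" above.
chunks_goal2 = 26_175_580   # chunks with goal 2 at stable copy count
chunks_goal3 = 20_002_910   # chunks with goal 3 at stable copy count

# Raising the goal by 1 cluster-wide gives every chunk one extra copy:
extra_copies_global = chunks_goal2 + chunks_goal3

# Pre-replicating only one node's chunks costs roughly that server's
# chunk count (the per-server range from the chunk servers table):
per_server_min, per_server_max = 6_311_446, 10_162_022

print(extra_copies_global)              # 46178490
print(per_server_min, per_server_max)   # 6311446 10162022
```

So the global raise-goal approach creates roughly 4-7x more replication work than copying only the chunks of the node going into maintenance, which is exactly Tru's objection.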
From: Zlatko Č. <zca...@bi...> - 2017-10-09 07:51:23
|
On 09.10.2017 08:46, Tru Huynh wrote:
> In the same trend, if I know in advance that I need to take node A into maintenance,
> I would like to preemptively copy the chunks of node A onto B and C
> rather than
> - put A in maintenance mode and lose the redundancy
> - turn off A, and force the replication at "full" speed.
>
> Does that make sense?

Hello Tru,

How about:

* raise the goal by 1
* let MooseFS do the extra replication
* put A in maintenance mode
* do whatever you need to do with it
* bring it back
* lower the goal by 1
* let MooseFS delete the extra chunks

I believe you could do what you want with that procedure: do the maintenance and have extra safety during the window.

Best regards,
--
Zlatko |
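Zlatko's steps map onto the standard MooseFS goal tools; a hedged command sketch (the mount point and directory are hypothetical, and the exact option syntax should be checked against the mfssetgoal man page for your version):

```
# raise the goal by 1 on the affected tree before the maintenance window
mfsgetgoal -r /mnt/mfs/data      # inspect current goals first
mfssetgoal -r 3 /mnt/mfs/data    # was 2; extra copies get replicated

# ...wait until the CGI "chunks state matrix" shows no undergoal chunks,
#    take A down, do the maintenance, bring A back...

# then lower the goal; the extra copies are deleted over time
mfssetgoal -r 2 /mnt/mfs/data
```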
From: Tru H. <tr...@pa...> - 2017-10-09 07:22:11
|
On Sun, Oct 08, 2017 at 06:59:07PM -0700, Matt Welland wrote:
> If the feature I describe below is already available (I did not see
> anything relevant in the config files) then consider this message a "howto" ...
> Over time very long term data will converge on having chunks with goal of 3
> and new data will start out with chunks at 2. Overall churn due to
> chunkservers being taken on/off line will be reduced even more than what
> can be achieved with the grace period setting (*).
>
> (*) Well, I think so but I actually don't know as I haven't figured out how
> to safely set the grace period settings. I think I can set
> QUOTA_DEFAULT_GRACE_PERIOD to say, 3 years (in seconds), and that would
> give similar behaviour to what I'm asking.
>
> ==end of idea==

In the same trend, if I know in advance that I need to take node A into maintenance, I would like to preemptively copy the chunks of node A onto B and C rather than
- put A in maintenance mode and lose the redundancy
- turn off A, and force the replication at "full" speed.

Does that make sense?

> An aside: I tried to migrate my network storage to using btrfs with raid
> and served via NFS but I am going back to MooseFS for exactly the reasons I
> outlined in a previous message. MooseFS is so much easier for me to
> maintain, grow and shrink. Thank you MooseFS team!

+1 for the thanks.

Cheers

Tru
--
Dr Tru Huynh | mailto:tr...@pa... | tel/fax +33 1 45 68 87 37/19
https://research.pasteur.fr/en/team/structural-bioinformatics/
Institut Pasteur, 25-28 rue du Docteur Roux, 75724 Paris CEDEX 15 France |
From: Matt W. <mat...@gm...> - 2017-10-09 01:59:16
|
If the feature I describe below is already available (I did not see anything relevant in the config files) then consider this message a "howto" request. Some time ago I raised the idea of a related mechanism of lazy chunk removal, which I think can actually be achieved using the chunkserver grace period controls. It might be that what I'm asking for here can also be implemented with the grace period, but I'm not quite sure about that.

MooseFS is very aggressive about meeting, and not exceeding, the goal for chunks. There are methods to control how fast MooseFS proceeds to rebalance, but the number of chunks is always brought to the goal. My thought is that there is a use case for MooseFS treating goals as a range. This could be captured as a config setting called "overchunks" or something similar. If overchunks were set to 1, then a chunk with a goal of 2 would be allowed to go to 3, and no action would ever be taken to remove the third copy.

The goal of the idea is to reduce or eliminate chunk churn when a chunkserver is taken offline. Imagine this scenario: my files have goal set to 2 and I have three chunk servers, A, B and C. Chunks are currently balanced and goals met over the three chunk servers. Chunk server A goes down due to a hardware issue. Chunks that were on B or C and replicated on A must now be replicated to B or C to meet the goal of 2. After some time (and some performance impact) all the chunks are back to having two copies. When chunkserver A is brought back into service, many of the files with goal of 2 will now have three copies. But with overgoal set to 1, no action will be taken and MooseFS will NOT start removing chunks. If a chunk server again goes down, the amount of chunk-balance churn will be much reduced, since some chunks are already replicated to goal 3. Over time, very long term data will converge on having chunks with goal of 3, and new data will start out with chunks at 2. Overall churn due to chunkservers being taken on/off line will be reduced even more than what can be achieved with the grace period setting (*).

(*) Well, I think so, but I actually don't know, as I haven't figured out how to safely set the grace period settings. I think I can set QUOTA_DEFAULT_GRACE_PERIOD to, say, 3 years (in seconds), and that would give similar behaviour to what I'm asking for.

==end of idea==

An aside: I tried to migrate my network storage to btrfs with raid, served via NFS, but I am going back to MooseFS for exactly the reasons I outlined in a previous message. MooseFS is so much easier for me to maintain, grow and shrink. Thank you MooseFS team!

Aside #2: I did see bup (the backup tool) fail to handle files on MooseFS but succeed on NFS. This was where the bup backup itself was being put on the network file system. It should be easy to replicate by doing a bup backup to MooseFS. I am using 3.0.81. I'm guessing this is a very unlikely use case but thought I'd mention my experience anyway.

Thanks,

Matt
-=- |
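The proposed behaviour is easy to state precisely with a toy model (purely illustrative: "tolerance" below is the poster's hypothetical "overchunks" setting, not an existing MooseFS option):

```python
# Toy model of chunk churn with and without an "overgoal tolerance".
# Each chunk has a goal and a current copy count. Copies below goal are
# re-replicated; copies above goal + tolerance are deleted. The deletion
# step is exactly the churn the proposal wants to avoid.

def rebalance(copies, goal, tolerance):
    """Return (new_copies, replications, deletions) for one chunk."""
    replications = max(0, goal - copies)
    deletions = max(0, copies - (goal + tolerance))
    return copies + replications - deletions, replications, deletions

# A goal-2 chunk that ended up with 3 copies after a failure/recovery cycle:
print(rebalance(3, goal=2, tolerance=0))  # (2, 0, 1): extra copy deleted (churn)
print(rebalance(3, goal=2, tolerance=1))  # (3, 0, 0): extra copy kept, no churn
```

With tolerance 1, the third copy survives the server's return, so the next outage needs no re-replication for that chunk.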
From: Devin A. <lin...@gm...> - 2017-09-18 15:11:36
|
I am currently hosting lots of different sites using Wordpress / Apache / MySQL / PHP, and I am getting to the point where I need to scale my VM instances so that each site has multiple VMs behind a load balancer. I was looking to have a special mount point for each site that is unique to the user, then have MooseFS mounted into each VM so that any change on any of the VMs replicates to the others, keeping the FS in sync. I'm curious what people think about using MooseFS for this situation. I'm also looking at using it to write backups and have those backups synced to another data center for DR. Any input is appreciated. Anything I should consider or keep in mind? |
From: Wilson, S. M <st...@pu...> - 2017-09-13 18:43:01
|
________________________________________
From: Davies Liu <dav...@gm...>
Sent: Wednesday, September 13, 2017 12:22 PM
To: Wilson, Steven M
Cc: Matt Welland; moo...@li...
Subject: Re: [MooseFS-Users] Dealing with millions of files

On Tue, Sep 12, 2017 at 12:42 PM, Wilson, Steven M <st...@pu...> wrote:
> The chunk servers have at least 64GB of memory in each one. There is some
> swapping taking place but I set the LOCK_MEMORY option in mfschunkserver.cfg
> to avoid swapping out the chunk server process. A quick check shows that
> the most memory being used by any mfschunkserver is about 55% on one of the
> 64GB chunk servers.
>
> The ls and cat (both to /dev/null) seem fairly responsive:
> ls takes 0.169s on a directory of 57K files

This one sounds reasonable: the result of the listing is about 2.5MB (about 50B per file), so mfsmaster should not be the bottleneck.

> cat takes 0.363s on a 308KB file

This one is slow: you can only read about 3 files per second. The internal rebalance may be the reason for this slowness, as it takes most of your disks' IOPS. There are some configs (number of reads/writes per cycle) to tune the speed of rebalancing; that will help the health of the cluster.
==============================================================

Thanks for that suggestion. I'll try setting HDD_REBALANCE_UTILIZATION to see if that helps.

Steve |
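For reference, a hedged sketch of the setting Steve mentions (in MooseFS 3.x it lives in mfschunkserver.cfg on each chunkserver; the value shown is illustrative, not a recommendation, and the documented default should be checked for your version):

```
# /etc/mfs/mfschunkserver.cfg (fragment, illustrative value)
# Cap, in percent of disk time, on what the internal rebalance may use.
# Lowering it leaves more IOPS for client I/O at the cost of a slower
# rebalance.
HDD_REBALANCE_UTILIZATION = 10
```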
From: Wilson, S. M <st...@pu...> - 2017-09-13 18:25:28
|
Yes, most POSIX-based distributed filesystems like MooseFS will have trouble dealing with millions of small files. I like MooseFS enough that I'm willing to put up with the performance issue. I just thought that perhaps there were some things I could do to boost our performance a little. Steve ________________________________________ From: Davies Liu <dav...@gm...> Sent: Wednesday, September 13, 2017 2:11 PM To: Wilson, Steven M Cc: Aleksander Wieliczko; moo...@li... Subject: Re: [MooseFS-Users] Dealing with millions of files In general, MooseFS is not good for many small files (both for scalability and performance), other KV store (which pack small files into bigger chunks) will works better than MFS. On Wed, Sep 13, 2017 at 10:43 AM, Wilson, Steven M <st...@pu...> wrote: > Hi! > > > I used the timing of a tar/zip job just as an example of the poor > performance I'm seeing with small files. I agree that there are faster ways > to do that! > > > The ping times between a typical client and the MooseFS chunkservers and > master average aound 0.2 ms. > > > I ran the "big file" / "small files" copy test against on the MooseFS > filesystem with these results: > > Big file copy: 67.65 MB/s > > Small files copy (cp -r): 1.57 MB/s > > > Just for comparison purposes, I did the same test on another MooseFS > filesystem with the following results: > > Big file copy: 62.05 MB/s > > Small files copy (cp -r): 4.18 MB/s > > This filesystem is much less busy, only hosts ~12 million files, and neither > of its two chunkservers is undergoing an internal rebalance. But the timing > for the small files copy still seems quite slow. > > > I'm curious if someone could run a similar test on their own MooseFS > filesystem to see what kind of timings they get. > > > A few weeks ago, I tried setting the CPU governor on the master server to > "performance" instead of "powersave" but that showed no measurable > improvement. 
Just for good measure, I again set it to "performance" and > re-ran the copy tests and the result for the small files copy was 1.44 MB/s > (i.e., no significant difference). > > > Thanks for your help! > > > Regards, > > > Steve > > > > ________________________________ > From: Aleksander Wieliczko <ale...@mo...> > Sent: Wednesday, September 13, 2017 3:34 AM > To: Wilson, Steven M; Matt Welland; moo...@li... > Subject: Re: [MooseFS-Users] Dealing with millions of files > > Hi. > I would like to suggest to do what Davies Liu said. I mean use some kind of > parallel tool. > > What is the ping time between MooseFS client and other MooseFS components? > For small file operations most crucial is latency between all MooseFS > components - this is how TCP/IP protocol works. > > Also please check MooseFS master CPU power governor. > Maybe your CPU is not working with full speed. Command like lscpu will show > you current CPU MHz value. > > By the way, very simple TCP/IP exercise for example with NFS share: > Please try to copy one big 1GiB file and 10486 small 100KiB files. What > speed you will get? > > Best regards > Alex. > > On 12.09.2017 21:42, Wilson, Steven M wrote: > > The chunk servers have at least 64GB of memory in each one. There is some > swapping taking place but I set the LOCK_MEMORY option in mfschunkserver.cfg > to avoid swapping out the chunk server process. A quick check shows that > the most memory being used by any mfschunkserver is about 55% on one of the > 64GB chunk servers. > > > The ls and cat (both to /dev/null) seem fairly responsive: > > ls takes 0.169s on a directory of 57K files > > cat takes 0.363s on a 308KB file > > > It looks like the write speed for small files may be what's killing me. I > tried copying 900 small files (308KB) to another directory on the same > MooseFS filesystem and it took 273 seconds or about 1.1MB/s. Ouch! > > > Three of the four chunk servers are doing an internal rebalance of chunks. 
> Perhaps that's having a larger impact on my overall I/O performance than I
> expected.
>
> Steve
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _________________________________________
> moosefs-users mailing list
> moo...@li...
> https://lists.sourceforge.net/lists/listinfo/moosefs-users

-- - Davies |
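One way to read Steve's measurements (an illustrative calculation, not from the thread): comparing the streaming rate with the small-file rate yields the fixed per-file cost, and it is far too large to be network latency alone.

```python
# Figures measured in this thread: 308 KB files, 67.65 MB/s big-file
# copy, 1.57 MB/s small-files copy, ~0.2 ms client-to-server ping.
file_mb = 308 / 1024

transfer_s = file_mb / 67.65        # time the bytes themselves need
total_s = file_mb / 1.57            # actual wall time per small file
overhead_s = total_s - transfer_s   # fixed per-file cost

print(round(overhead_s * 1000, 1), "ms per file")   # ~187 ms
print(round(overhead_s / 0.0002), "ping round trips")  # ~900+
```

Since a 0.2 ms metadata round trip would have to happen roughly 900 times per file to account for this, per-request service time on the disks (e.g. the internal rebalance stealing IOPS) is the more plausible bottleneck, which matches Davies' advice in this thread.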
From: Wilson, S. M <st...@pu...> - 2017-09-13 18:13:46
|
I looked some more at the swapping on our chunk servers. I had assumed, because I was seeing kswapd0 show up in top using 10-30% of a CPU, that we were doing some swapping to/from disk. But when monitoring with vmstat, I see that the swap-in and swap-out columns are usually zero. Good point about sizing memory. I should consider beefing up my chunk server memory during my next hardware upgrade. Thanks!

Steve
________________________________
From: Casper Langemeijer <cas...@pr...>
Sent: Wednesday, September 13, 2017 5:07 AM
To: Wilson, Steven M; moosefs-users
Subject: Re: [MooseFS-Users] Dealing with millions of files

"There is some swapping taking place" What does that mean? A few MBs swapped out, or is your chunkserver actively swapping? I've configured my chunk servers with a memory size that leaves about 85% of memory unused by processes. My mfschunkserver processes use about 10% of available memory (actual memory, as there is almost no swap being used, <0.1%). This ~80% gets used by the kernel's disk cache. Increasing the memory available for disk caches could speed up your cluster even more if the size of your 'hot' zone of popular files is bigger and your read/write ratio is higher (more reads). Any less than 80% for disk cache will probably have a significant impact on the disks, especially the old rotating ones.

-- Casper

2017-09-12 21:42 GMT+02:00 Wilson, Steven M <st...@pu...>: The chunk servers have at least 64GB of memory in each one. There is some swapping taking place but I set the LOCK_MEMORY option in mfschunkserver.cfg to avoid swapping out the chunk server process. A quick check shows that the most memory being used by any mfschunkserver is about 55% on one of the 64GB chunk servers. The ls and cat (both to /dev/null) seem fairly responsive: ls takes 0.169s on a directory of 57K files cat takes 0.363s on a 308KB file It looks like the write speed for small files may be what's killing me.
I tried copying 900 small files (308KB) to another directory on the same MooseFS filesystem and it took 273 seconds or about 1.1MB/s. Ouch! Three of the four chunk servers are doing an internal rebalance of chunks. Perhaps that's having a larger impact on my overall I/O performance than I expected. Steve ________________________________ From: est...@gm...<mailto:est...@gm...> <est...@gm...<mailto:est...@gm...>> on behalf of Matt Welland <mat...@gm...<mailto:mat...@gm...>> Sent: Tuesday, September 12, 2017 2:06 PM To: Wilson, Steven M Cc: moo...@li...<mailto:moo...@li...> Subject: Re: [MooseFS-Users] Dealing with millions of files To ask the obvious ... you don't explicitly state that the chunk servers have plenty of memory and are not hitting swap, are they fine? Can you tell from the logs or cgi web view if the slowness comes from metadata queries or from retrieving the chunks? Are other operations equally slow? E.g. if you change to one of the directories that you are tarring up, do an ls and then cat a file to /dev/null are both the ls and cat slow or just one or the other? On Tue, Sep 12, 2017 at 9:48 AM, Wilson, Steven M <st...@pu...<mailto:st...@pu...>> wrote: Hi, We have eight different MooseFS installations at our site and they are all doing terrific (thanks, MooseFS developers!) except for the one that hosts around 180 million files. The performance on this installation is quite dismal. I expect to see some decrease in performance due to the sheer number of files but not to the degree that I'm experiencing. For example, I am in the process of tarring/gzipping 176 directories that together contain 15M files and occupy 5.5TB of space. I am about 78% done after 25 days. Are there any users on this list who have MooseFS installations with large numbers of files and have some hints they can share about how to improve performance? Here is some basic information about this installation: * MooseFS master is running on a Xeon E5-1630v3 (3.7GHz) and 132GB of memory. 
Its system disk (where /var/lib/mfs is located) is a 256GB Samsung 850 Pro SSD.
* Four chunk servers, two with 16 SATA disks ranging from 3TB to 10TB each, and two chunk servers with two 10TB SATA disks each. All disks are JBOD and most are formatted using XFS (some are ext4).
* Disk schedulers on chunk servers are set to use deadline. Tried different settings for read_ahead_kb and nr_requests without much performance gain.
* Network is gigabit Ethernet.
* Master and chunk servers are running a mix of MooseFS versions 3.0.90 and 3.0.92.

Thanks in advance for any suggestions!

Steve |
From: Davies L. <dav...@gm...> - 2017-09-13 18:11:49
|
In general, MooseFS is not good for many small files (both for scalability and performance), other KV store (which pack small files into bigger chunks) will works better than MFS. On Wed, Sep 13, 2017 at 10:43 AM, Wilson, Steven M <st...@pu...> wrote: > Hi! > > > I used the timing of a tar/zip job just as an example of the poor > performance I'm seeing with small files. I agree that there are faster ways > to do that! > > > The ping times between a typical client and the MooseFS chunkservers and > master average aound 0.2 ms. > > > I ran the "big file" / "small files" copy test against on the MooseFS > filesystem with these results: > > Big file copy: 67.65 MB/s > > Small files copy (cp -r): 1.57 MB/s > > > Just for comparison purposes, I did the same test on another MooseFS > filesystem with the following results: > > Big file copy: 62.05 MB/s > > Small files copy (cp -r): 4.18 MB/s > > This filesystem is much less busy, only hosts ~12 million files, and neither > of its two chunkservers is undergoing an internal rebalance. But the timing > for the small files copy still seems quite slow. > > > I'm curious if someone could run a similar test on their own MooseFS > filesystem to see what kind of timings they get. > > > A few weeks ago, I tried setting the CPU governor on the master server to > "performance" instead of "powersave" but that showed no measurable > improvement. Just for good measure, I again set it to "performance" and > re-ran the copy tests and the result for the small files copy was 1.44 MB/s > (i.e., no significant difference). > > > Thanks for your help! > > > Regards, > > > Steve > > > > ________________________________ > From: Aleksander Wieliczko <ale...@mo...> > Sent: Wednesday, September 13, 2017 3:34 AM > To: Wilson, Steven M; Matt Welland; moo...@li... > Subject: Re: [MooseFS-Users] Dealing with millions of files > > Hi. > I would like to suggest to do what Davies Liu said. I mean use some kind of > parallel tool. 
> > What is the ping time between MooseFS client and other MooseFS components? > For small file operations most crucial is latency between all MooseFS > components - this is how TCP/IP protocol works. > > Also please check MooseFS master CPU power governor. > Maybe your CPU is not working with full speed. Command like lscpu will show > you current CPU MHz value. > > By the way, very simple TCP/IP exercise for example with NFS share: > Please try to copy one big 1GiB file and 10486 small 100KiB files. What > speed you will get? > > Best regards > Alex. > > On 12.09.2017 21:42, Wilson, Steven M wrote: > > The chunk servers have at least 64GB of memory in each one. There is some > swapping taking place but I set the LOCK_MEMORY option in mfschunkserver.cfg > to avoid swapping out the chunk server process. A quick check shows that > the most memory being used by any mfschunkserver is about 55% on one of the > 64GB chunk servers. > > > The ls and cat (both to /dev/null) seem fairly responsive: > > ls takes 0.169s on a directory of 57K files > > cat takes 0.363s on a 308KB file > > > It looks like the write speed for small files may be what's killing me. I > tried copying 900 small files (308KB) to another directory on the same > MooseFS filesystem and it took 273 seconds or about 1.1MB/s. Ouch! > > > Three of the four chunk servers are doing an internal rebalance of chunks. > Perhaps that's having a larger impact on my overall I/O performance than I > expected. > > > Steve > > > > ------------------------------------------------------------------------------ > Check out the vibrant tech community on one of the world's most > engaging tech sites, Slashdot.org! http://sdm.link/slashdot > _________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users > -- - Davies |
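The big-file vs. small-files comparison discussed in this thread is easy to script so results can be reproduced across clusters. A minimal sketch follows; the target directory, file count, and sizes are placeholders to adjust (the thread's test used one 1 GiB file and 10486 x 100 KiB files; the defaults here are scaled down and point at a local temp dir so the script is safe to dry-run anywhere):

```shell
#!/bin/bash
# Small-file vs big-file copy benchmark (sketch).
# Pass a directory on the MooseFS mount as $1 for a real measurement.
TARGET="${1:-$(mktemp -d)}"
COUNT="${2:-50}"                 # small-file count (10486 in the thread)
mkdir -p "$TARGET/src" "$TARGET/dst"

# one "big" file (10 MiB here; scale up to 1 GiB for a real test)
dd if=/dev/zero of="$TARGET/src/big" bs=1048576 count=10 2>/dev/null

# many 100 KiB files
for i in $(seq 1 "$COUNT"); do
    dd if=/dev/zero of="$TARGET/src/small.$i" bs=1024 count=100 2>/dev/null
done

t0=$(date +%s); cp "$TARGET/src/big" "$TARGET/dst/big"; t1=$(date +%s)
echo "big file copy took $((t1 - t0))s"
t0=$(date +%s); cp -r "$TARGET/src" "$TARGET/dst/many"; t1=$(date +%s)
echo "small files copy took $((t1 - t0))s"
```

On a network filesystem the two throughputs typically differ by an order of magnitude or more: the small-file case is dominated by per-file metadata round trips, not bandwidth.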
From: Wilson, S. M <st...@pu...> - 2017-09-13 17:43:19
|
Hi! I used the timing of a tar/zip job just as an example of the poor performance I'm seeing with small files. I agree that there are faster ways to do that! The ping times between a typical client and the MooseFS chunkservers and master average around 0.2 ms. I ran the "big file" / "small files" copy test on the MooseFS filesystem with these results: Big file copy: 67.65 MB/s Small files copy (cp -r): 1.57 MB/s Just for comparison purposes, I did the same test on another MooseFS filesystem with the following results: Big file copy: 62.05 MB/s Small files copy (cp -r): 4.18 MB/s This filesystem is much less busy, only hosts ~12 million files, and neither of its two chunkservers is undergoing an internal rebalance. But the timing for the small files copy still seems quite slow. I'm curious if someone could run a similar test on their own MooseFS filesystem to see what kind of timings they get. A few weeks ago, I tried setting the CPU governor on the master server to "performance" instead of "powersave" but that showed no measurable improvement. Just for good measure, I again set it to "performance" and re-ran the copy tests and the result for the small files copy was 1.44 MB/s (i.e., no significant difference). Thanks for your help! Regards, Steve ________________________________ From: Aleksander Wieliczko <ale...@mo...> Sent: Wednesday, September 13, 2017 3:34 AM To: Wilson, Steven M; Matt Welland; moo...@li... Subject: Re: [MooseFS-Users] Dealing with millions of files Hi. I would like to suggest to do what Davies Liu said. I mean use some kind of parallel tool. What is the ping time between MooseFS client and other MooseFS components? For small file operations most crucial is latency between all MooseFS components - this is how TCP/IP protocol works. Also please check MooseFS master CPU power governor. Maybe your CPU is not working with full speed. Command like lscpu will show you current CPU MHz value. 
By the way, very simple TCP/IP exercise for example with NFS share: Please try to copy one big 1GiB file and 10486 small 100KiB files. What speed you will get? Best regards Alex. On 12.09.2017 21:42, Wilson, Steven M wrote: The chunk servers have at least 64GB of memory in each one. There is some swapping taking place but I set the LOCK_MEMORY option in mfschunkserver.cfg to avoid swapping out the chunk server process. A quick check shows that the most memory being used by any mfschunkserver is about 55% on one of the 64GB chunk servers. The ls and cat (both to /dev/null) seem fairly responsive: ls takes 0.169s on a directory of 57K files cat takes 0.363s on a 308KB file It looks like the write speed for small files may be what's killing me. I tried copying 900 small files (308KB) to another directory on the same MooseFS filesystem and it took 273 seconds or about 1.1MB/s. Ouch! Three of the four chunk servers are doing an internal rebalance of chunks. Perhaps that's having a larger impact on my overall I/O performance than I expected. Steve |
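The governor check and switch mentioned in this exchange looks like this on Linux. The sysfs paths are the standard cpufreq layout; they may be absent on VMs or kernels without cpufreq support, hence the guards:

```shell
#!/bin/sh
# Show the current CPU frequency governor for each core; switching to
# "performance" means writing to the same file as root (commented out).
for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
    [ -f "$g" ] || continue
    echo "$g: $(cat "$g")"
    # echo performance > "$g"    # root required
done

# sanity check: how many cores we scanned
NCPU=$(getconf _NPROCESSORS_ONLN)
echo "online CPUs: $NCPU"
```

`lscpu | grep -i mhz` then confirms whether the clocks actually rose after the change.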
From: Davies L. <dav...@gm...> - 2017-09-13 16:23:14
|
On Tue, Sep 12, 2017 at 12:42 PM, Wilson, Steven M <st...@pu...> wrote: > The chunk servers have at least 64GB of memory in each one. There is some > swapping taking place but I set the LOCK_MEMORY option in mfschunkserver.cfg > to avoid swapping out the chunk server process. A quick check shows that > the most memory being used by any mfschunkserver is about 55% on one of the > 64GB chunk servers. > > > The ls and cat (both to /dev/null) seem fairly responsive: > > ls takes 0.169s on a directory of 57K files This one sounds reasonable: the listing result is about 2.5MB (about 50B per file), so mfsmaster should not be the bottleneck. > cat takes 0.363s on a 308KB file This one is slow - at that rate you can only read about 3 files per second. The internal rebalance may be the reason for this slowness, since it consumes most of your disk IOPS. There are some configs (number of reads/writes per cycle) in the master to tune the speed of rebalancing; tuning them will help the health of the cluster. > > It looks like the write speed for small files may be what's killing me. I > tried copying 900 small files (308KB) to another directory on the same > MooseFS filesystem and it took 273 seconds or about 1.1MB/s. Ouch! > > > Three of the four chunk servers are doing an internal rebalance of chunks. > Perhaps that's having a larger impact on my overall I/O performance than I > expected. > > > Steve > > > ________________________________ > From: est...@gm... <est...@gm...> on behalf of Matt Welland > <mat...@gm...> > Sent: Tuesday, September 12, 2017 2:06 PM > To: Wilson, Steven M > Cc: moo...@li... > Subject: Re: [MooseFS-Users] Dealing with millions of files > > To ask the obvious ... you don't explicitly state that the chunk servers > have plenty of memory and are not hitting swap, are they fine? > > Can you tell from the logs or cgi web view if the slowness comes from > metadata queries or from retrieving the chunks? Are other operations equally > slow? E.g. 
if you change to one of the directories that you are tarring up, > do an ls and then cat a file to /dev/null are both the ls and cat slow or > just one or the other? > > On Tue, Sep 12, 2017 at 9:48 AM, Wilson, Steven M <st...@pu...> wrote: >> >> Hi, >> >> >> We have eight different MooseFS installations at our site and they are all >> doing terrific (thanks, MooseFS developers!) except for the one that hosts >> around 180 million files. The performance on this installation is quite >> dismal. I expect to see some decrease in performance due to the sheer >> number of files but not to the degree that I'm experiencing. For example, I >> am in the process of tarring/gzipping 176 directories that together contain >> 15M files and occupy 5.5TB of space. I am about 78% done after 25 days. >> >> >> Are there any users on this list who have MooseFS installations with large >> numbers of files and have some hints they can share about how to improve >> performance? >> >> >> Here is some basic information about this installation: >> >> * MooseFS master is running on a Xeon E5-1630v3 (3.7GHz) and 132GB of >> memory. Its system disk (where /var/lib/mfs is located) is a 256GB Samsung >> 850 Pro SSD. >> >> * Four chunk servers, two with 16 SATA disks ranging from 3TB to 10TB >> each. And two chunk servers with two 10TB SATA disks each. All disks are >> JBOD and most are formatted using XFS (some are ext4). >> >> * Disk schedulers on chunk servers are set to use deadline. Tried >> different settings for read_ahead_kb and nr_requests without much >> performance gain. >> >> * Network is gigabit Ethernet >> >> * Master and chunk servers are running a mix of MooseFS versions >> 3.0.90 and 3.0.92. >> >> >> Thanks in advance for any suggestions! >> >> >> Steve >> >> >> >> ------------------------------------------------------------------------------ >> Check out the vibrant tech community on one of the world's most >> engaging tech sites, Slashdot.org! 
http://sdm.link/slashdot >> _________________________________________ >> moosefs-users mailing list >> moo...@li... >> https://lists.sourceforge.net/lists/listinfo/moosefs-users >> > > > ------------------------------------------------------------------------------ > Check out the vibrant tech community on one of the world's most > engaging tech sites, Slashdot.org! http://sdm.link/slashdot > _________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users > -- - Davies |
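The rebalance throttles Davies refers to are set in mfsmaster.cfg. The option names below exist in MooseFS 3.x (each takes four comma-separated limits for the different replication classes), but the values shown are the 3.0-era defaults and are illustrative only - check your own config before lowering them:

```
# mfsmaster.cfg fragment (values are a starting point, not a recommendation;
# lower the limits while the cluster is busy to leave IOPS for clients)
# CHUNKS_WRITE_REP_LIMIT = 2,1,1,4   # replications written per chunkserver per loop
# CHUNKS_READ_REP_LIMIT = 10,5,2,5   # replications read per chunkserver per loop
# CHUNKS_LOOP_MIN_TIME = 300         # minimum seconds per chunk maintenance loop
```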
From: Casper L. <cas...@pr...> - 2017-09-13 09:37:37
|
"There is some swapping taking place" What does that mean? A few MB's of swapped out, or is your chunkserver actively swapping? I've configured my chunk servers to a memory size that allows about 85% of unused memory for processes. My mfschunkserver processes use about 10% of available memory. Actual memory as there is almost no swap being used <0.1%. This 80% gets used by the kernels disk cache. Increasing memory available for disk caches could speed up your cluster even more if the size of your 'hot' zone of popular files is bigger and your read/write ratio is higher (more reads). Any less than 80% for disk cache will probably have a significant impact on disks, especially on the old rotating ones. -- Casper 2017-09-12 21:42 GMT+02:00 Wilson, Steven M <st...@pu...>: > The chunk servers have at least 64GB of memory in each one. There is some > swapping taking place but I set the LOCK_MEMORY option in > mfschunkserver.cfg to avoid swapping out the chunk server process. A quick > check shows that the most memory being used by any mfschunkserver is about > 55% on one of the 64GB chunk servers. > > > The ls and cat (both to /dev/null) seem fairly responsive: > > ls takes 0.169s on a directory of 57K files > > cat takes 0.363s on a 308KB file > > > It looks like the write speed for small files may be what's killing me. I > tried copying 900 small files (308KB) to another directory on the same > MooseFS filesystem and it took 273 seconds or about 1.1MB/s. Ouch! > > > Three of the four chunk servers are doing an internal rebalance of > chunks. Perhaps that's having a larger impact on my overall > I/O performance than I expected. > > > Steve > > > ------------------------------ > *From:* est...@gm... <est...@gm...> on behalf of Matt > Welland <mat...@gm...> > *Sent:* Tuesday, September 12, 2017 2:06 PM > *To:* Wilson, Steven M > *Cc:* moo...@li... > *Subject:* Re: [MooseFS-Users] Dealing with millions of files > > To ask the obvious ... 
you don't explicitly state that the chunk servers > have plenty of memory and are not hitting swap, are they fine? > > Can you tell from the logs or cgi web view if the slowness comes from > metadata queries or from retrieving the chunks? Are other operations > equally slow? E.g. if you change to one of the directories that you are > tarring up, do an ls and then cat a file to /dev/null are both the ls and > cat slow or just one or the other? > > On Tue, Sep 12, 2017 at 9:48 AM, Wilson, Steven M <st...@pu...> > wrote: > >> Hi, >> >> >> We have eight different MooseFS installations at our site and they are >> all doing terrific (thanks, MooseFS developers!) except for the one that >> hosts around 180 million files. The performance on this installation is >> quite dismal. I expect to see some decrease in performance due to the >> sheer number of files but not to the degree that I'm experiencing. For >> example, I am in the process of tarring/gzipping 176 directories that >> together contain 15M files and occupy 5.5TB of space. I am about 78% done >> after 25 days. >> >> >> Are there any users on this list who have MooseFS installations with >> large numbers of files and have some hints they can share about how to >> improve performance? >> >> >> Here is some basic information about this installation: >> >> * MooseFS master is running on a Xeon E5-1630v3 (3.7GHz) and 132GB of >> memory. Its system disk (where /var/lib/mfs is located) is a 256GB Samsung >> 850 Pro SSD. >> >> * Four chunk servers, two with 16 SATA disks ranging from 3TB to 10TB >> each. And two chunk servers with two 10TB SATA disks each. All disks are >> JBOD and most are formatted using XFS (some are ext4). >> >> * Disk schedulers on chunk servers are set to use deadline. Tried >> different settings for read_ahead_kb and nr_requests without much >> performance gain. >> >> * Network is gigabit Ethernet >> >> * Master and chunk servers are running a mix of MooseFS >> versions 3.0.90 and 3.0.92. 
>> >> >> Thanks in advance for any suggestions! >> >> >> Steve >> >> ------------------------------------------------------------ >> ------------------ >> Check out the vibrant tech community on one of the world's most >> engaging tech sites, Slashdot.org! http://sdm.link/slashdot >> _________________________________________ >> moosefs-users mailing list >> moo...@li... >> https://lists.sourceforge.net/lists/listinfo/moosefs-users >> >> > > ------------------------------------------------------------ > ------------------ > Check out the vibrant tech community on one of the world's most > engaging tech sites, Slashdot.org! http://sdm.link/slashdot > _________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users > > |
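Casper's split (process memory vs. page cache vs. swap) can be checked directly from /proc on each chunkserver; a read-only sketch:

```shell
#!/bin/sh
# How memory is divided between the page cache and swap (Linux /proc).
grep -E '^(MemTotal|MemFree|Cached|SwapTotal|SwapFree):' /proc/meminfo

# chunkserver process list, if one is running on this host
pgrep -a mfschunkserver 2>/dev/null || echo "no mfschunkserver on this host"
```

If SwapFree is well below SwapTotal and Cached is small relative to MemTotal, the disk cache is being squeezed, which matches the symptom Casper describes.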
From: Aleksander W. <ale...@mo...> - 2017-09-13 07:35:02
|
Hi. I would like to suggest doing what Davies Liu said, i.e. use some kind of parallel tool. What is the ping time between the MooseFS client and the other MooseFS components? For small-file operations the most crucial factor is the latency between all MooseFS components - this is how the TCP/IP protocol works. Also please check the MooseFS master CPU power governor. Maybe your CPU is not working at full speed. A command like lscpu will show you the current CPU MHz value. By the way, a very simple TCP/IP exercise, for example with an NFS share: please try to copy one big 1GiB file and 10486 small 100KiB files. What speeds do you get? Best regards Alex. On 12.09.2017 21:42, Wilson, Steven M wrote: > > The chunk servers have at least 64GB of memory in each one. There is > some swapping taking place but I set the LOCK_MEMORY option in > mfschunkserver.cfg to avoid swapping out the chunk server process. A > quick check shows that the most memory being used by > any mfschunkserver is about 55% on one of the 64GB chunk servers. > > > The ls and cat (both to /dev/null) seem fairly responsive: > > ls takes 0.169s on a directory of 57K files > > cat takes 0.363s on a 308KB file > > > It looks like the write speed for small files may be what's killing > me. I tried copying 900 small files (308KB) to another directory on > the same MooseFS filesystem and it took 273 seconds or about 1.1MB/s. > Ouch! > > > Three of the four chunk servers are doing an internal rebalance of > chunks. Perhaps that's having a larger impact on my overall > I/O performance than I expected. > > > Steve > > |
From: Wilson, S. M <st...@pu...> - 2017-09-12 19:42:41
|
The chunk servers have at least 64GB of memory in each one. There is some swapping taking place but I set the LOCK_MEMORY option in mfschunkserver.cfg to avoid swapping out the chunk server process. A quick check shows that the most memory being used by any mfschunkserver is about 55% on one of the 64GB chunk servers. The ls and cat (both to /dev/null) seem fairly responsive: ls takes 0.169s on a directory of 57K files cat takes 0.363s on a 308KB file It looks like the write speed for small files may be what's killing me. I tried copying 900 small files (308KB) to another directory on the same MooseFS filesystem and it took 273 seconds or about 1.1MB/s. Ouch! Three of the four chunk servers are doing an internal rebalance of chunks. Perhaps that's having a larger impact on my overall I/O performance than I expected. Steve ________________________________ From: est...@gm... <est...@gm...> on behalf of Matt Welland <mat...@gm...> Sent: Tuesday, September 12, 2017 2:06 PM To: Wilson, Steven M Cc: moo...@li... Subject: Re: [MooseFS-Users] Dealing with millions of files To ask the obvious ... you don't explicitly state that the chunk servers have plenty of memory and are not hitting swap, are they fine? Can you tell from the logs or cgi web view if the slowness comes from metadata queries or from retrieving the chunks? Are other operations equally slow? E.g. if you change to one of the directories that you are tarring up, do an ls and then cat a file to /dev/null are both the ls and cat slow or just one or the other? On Tue, Sep 12, 2017 at 9:48 AM, Wilson, Steven M <st...@pu...<mailto:st...@pu...>> wrote: Hi, We have eight different MooseFS installations at our site and they are all doing terrific (thanks, MooseFS developers!) except for the one that hosts around 180 million files. The performance on this installation is quite dismal. I expect to see some decrease in performance due to the sheer number of files but not to the degree that I'm experiencing. 
For example, I am in the process of tarring/gzipping 176 directories that together contain 15M files and occupy 5.5TB of space. I am about 78% done after 25 days. Are there any users on this list who have MooseFS installations with large numbers of files and have some hints they can share about how to improve performance? Here is some basic information about this installation: * MooseFS master is running on a Xeon E5-1630v3 (3.7GHz) and 132GB of memory. Its system disk (where /var/lib/mfs is located) is a 256GB Samsung 850 Pro SSD. * Four chunk servers, two with 16 SATA disks ranging from 3TB to 10TB each. And two chunk servers with two 10TB SATA disks each. All disks are JBOD and most are formatted using XFS (some are ext4). * Disk schedulers on chunk servers are set to use deadline. Tried different settings for read_ahead_kb and nr_requests without much performance gain. * Network is gigabit Ethernet * Master and chunk servers are running a mix of MooseFS versions 3.0.90 and 3.0.92. Thanks in advance for any suggestions! Steve ------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! 
http://sdm.link/slashdot _________________________________________ moosefs-users mailing list moo...@li... https://lists.sourceforge.net/lists/listinfo/moosefs-users |
From: Matt W. <mat...@gm...> - 2017-09-12 18:06:59
|
To ask the obvious ... you don't explicitly state that the chunk servers have plenty of memory and are not hitting swap, are they fine? Can you tell from the logs or cgi web view if the slowness comes from metadata queries or from retrieving the chunks? Are other operations equally slow? E.g. if you change to one of the directories that you are tarring up, do an ls and then cat a file to /dev/null are both the ls and cat slow or just one or the other? On Tue, Sep 12, 2017 at 9:48 AM, Wilson, Steven M <st...@pu...> wrote: > Hi, > > > We have eight different MooseFS installations at our site and they are all > doing terrific (thanks, MooseFS developers!) except for the one that > hosts around 180 million files. The performance on this installation is > quite dismal. I expect to see some decrease in performance due to the > sheer number of files but not to the degree that I'm experiencing. For > example, I am in the process of tarring/gzipping 176 directories that > together contain 15M files and occupy 5.5TB of space. I am about 78% done > after 25 days. > > > Are there any users on this list who have MooseFS installations with large > numbers of files and have some hints they can share about how to improve > performance? > > > Here is some basic information about this installation: > > * MooseFS master is running on a Xeon E5-1630v3 (3.7GHz) and 132GB of > memory. Its system disk (where /var/lib/mfs is located) is a 256GB Samsung > 850 Pro SSD. > > * Four chunk servers, two with 16 SATA disks ranging from 3TB to 10TB > each. And two chunk servers with two 10TB SATA disks each. All disks are > JBOD and most are formatted using XFS (some are ext4). > > * Disk schedulers on chunk servers are set to use deadline. Tried > different settings for read_ahead_kb and nr_requests without much > performance gain. > > * Network is gigabit Ethernet > > * Master and chunk servers are running a mix of MooseFS > versions 3.0.90 and 3.0.92. 
> > > Thanks in advance for any suggestions! > > > Steve > > ------------------------------------------------------------ > ------------------ > Check out the vibrant tech community on one of the world's most > engaging tech sites, Slashdot.org! http://sdm.link/slashdot > _________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users > > |
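Matt's ls-then-cat diagnostic separates the metadata path (client to master) from the data path (client to chunkserver). A sketch, with the directory as a placeholder argument; GNU `date +%s%N` is assumed for sub-second timing:

```shell
#!/bin/sh
# Time a directory listing (metadata, served by mfsmaster) and a file
# read (data, served by a chunkserver) separately.
DIR="${1:-.}"

t0=$(date +%s%N)
ls "$DIR" > /dev/null
t1=$(date +%s%N)
echo "ls $DIR: $(( (t1 - t0) / 1000000 )) ms"

# read the first regular file found, if any, discarding the data
FILE=$(find "$DIR" -maxdepth 1 -type f | head -n 1)
if [ -n "$FILE" ]; then
    t0=$(date +%s%N); cat "$FILE" > /dev/null; t1=$(date +%s%N)
    echo "cat $FILE: $(( (t1 - t0) / 1000000 )) ms"
fi
```

A slow ls with a fast cat points at the master; the reverse points at the chunkservers or their disks.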
From: Wilson, S. M <st...@pu...> - 2017-09-12 17:28:01
|
Hi Dave, I'll have to check that out or some other caching method that's more generic (not just HTTP requests). Thanks! Steve ________________________________ From: David Myer <dav...@pr...> Sent: Tuesday, September 12, 2017 1:06 PM To: Wilson, Steven M; MooseFS-Users Subject: Re: [MooseFS-Users] Dealing with millions of files Hi Steve, We have about 6.3 million files in our MFS cluster and use Varnish to cache MFS file retrievals on each node. Having Varnish in front of mfsclient ensures that client requests for popular files are not occupying moosefs and potentially repeatedly going between MFS nodes to retrieve file chunks. This is for a HTTP download system. Cheers, Dave Sent with ProtonMail Secure Email. ---------------------------- Hi, We have eight different MooseFS installations at our site and they are all doing terrific (thanks, MooseFS developers!) except for the one that hosts around 180 million files. The performance on this installation is quite dismal. I expect to see some decrease in performance due to the sheer number of files but not to the degree that I'm experiencing. For example, I am in the process of tarring/gzipping 176 directories that together contain 15M files and occupy 5.5TB of space. I am about 78% done after 25 days. Are there any users on this list who have MooseFS installations with large numbers of files and have some hints they can share about how to improve performance? Here is some basic information about this installation: * MooseFS master is running on a Xeon E5-1630v3 (3.7GHz) and 132GB of memory. 
Its system disk (where /var/lib/mfs is located) is a 256GB Samsung 850 Pro SSD. * Four chunk servers, two with 16 SATA disks ranging from 3TB to 10TB each. And two chunk servers with two 10TB SATA disks each. All disks are JBOD and most are formatted using XFS (some are ext4). * Disk schedulers on chunk servers are set to use deadline. Tried different settings for read_ahead_kb and nr_requests without much performance gain. * Network is gigabit Ethernet * Master and chunk servers are running a mix of MooseFS versions 3.0.90 and 3.0.92. Thanks in advance for any suggestions! Steve |
From: Wilson, S. M <st...@pu...> - 2017-09-12 17:25:02
|
________________________________________ From: Davies Liu <dav...@gm...> Sent: Tuesday, September 12, 2017 1:13 PM To: Wilson, Steven M Cc: moo...@li... Subject: Re: [MooseFS-Users] Dealing with millions of files On Tue, Sep 12, 2017 at 9:48 AM, Wilson, Steven M <st...@pu...> wrote: > Hi, > > > We have eight different MooseFS installations at our site and they are all > doing terrific (thanks, MooseFS developers!) except for the one that hosts > around 180 million files. The performance on this installation is quite > dismal. I expect to see some decrease in performance due to the sheer > number of files but not to the degree that I'm experiencing. For example, I > am in the process of tarring/gzipping 176 directories that together contain > 15M files and occupy 5.5TB of space. I am about 78% done after 25 days. It seems it's slower than expected (5 files per second, 2MB/s); besides investigating the slowness, you could also use a parallel tool to do it in parallel (with 1 thread per directory, you should be able to finish in a day), rather than waiting a month. -- - Davies That's an excellent point, but in this case I am trying to package all these directories with minimal impact on the users of the filesystem. Hence I actually prefer a single job rather than multiple jobs in parallel. Like you said, it is slower, much slower, than expected. And this is just one example that represents the performance problem experienced by our users. Thanks! Steve
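Given Steve's goal of minimal impact rather than speed, one option (an editor's sketch, not from the thread) is to keep the single job but run it at the lowest CPU and I/O priority. Note that ionice only affects the local block layer, so over a network filesystem its effect is limited; nice still helps with the gzip CPU load. The default paths below are placeholders that fall back to a throwaway demo directory:

```shell
#!/bin/sh
# Pack one directory at low priority. Pass SRC (and optionally OUT)
# pointing at the real MooseFS directories.
if [ -n "$1" ]; then
    SRC="$1"
else
    SRC=$(mktemp -d)                  # demo dir so the sketch runs anywhere
    echo demo > "$SRC/demo.txt"
fi
OUT="${2:-${SRC%/}.tar.gz}"

IONICE=""
command -v ionice >/dev/null 2>&1 && IONICE="ionice -c 3"   # idle I/O class

nice -n 19 $IONICE tar czf "$OUT" -C "$(dirname "$SRC")" "$(basename "$SRC")"
echo "wrote $OUT"
```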
From: Davies L. <dav...@gm...> - 2017-09-12 17:14:04
|
On Tue, Sep 12, 2017 at 9:48 AM, Wilson, Steven M <st...@pu...> wrote: > Hi, > > > We have eight different MooseFS installations at our site and they are all > doing terrific (thanks, MooseFS developers!) except for the one that hosts > around 180 million files. The performance on this installation is quite > dismal. I expect to see some decrease in performance due to the sheer > number of files but not to the degree that I'm experiencing. For example, I > am in the process of tarring/gzipping 176 directories that together contain > 15M files and occupy 5.5TB of space. I am about 78% done after 25 days. It seems it's slower than expected (5 files per second, 2MB/s); besides investigating the slowness, you could also use a parallel tool to do it in parallel (with 1 thread per directory, you should be able to finish in a day), rather than waiting a month. > > Are there any users on this list who have MooseFS installations with large > numbers of files and have some hints they can share about how to improve > performance? > > > Here is some basic information about this installation: > > * MooseFS master is running on a Xeon E5-1630v3 (3.7GHz) and 132GB of > memory. Its system disk (where /var/lib/mfs is located) is a 256GB Samsung > 850 Pro SSD. > > * Four chunk servers, two with 16 SATA disks ranging from 3TB to 10TB > each. And two chunk servers with two 10TB SATA disks each. All disks are > JBOD and most are formatted using XFS (some are ext4). > > * Disk schedulers on chunk servers are set to use deadline. Tried > different settings for read_ahead_kb and nr_requests without much > performance gain. > > * Network is gigabit Ethernet > > * Master and chunk servers are running a mix of MooseFS versions 3.0.90 > and 3.0.92. > > > Thanks in advance for any suggestions! > > > Steve > > > ------------------------------------------------------------------------------ > Check out the vibrant tech community on one of the world's most > engaging tech sites, Slashdot.org! 
http://sdm.link/slashdot > _________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users > -- - Davies |
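Davies' one-thread-per-directory suggestion can be sketched with `xargs -P` (GNU parallel would work the same way). `pack_dirs`, the paths, and the job count below are illustrative names chosen for this sketch, not anything from MooseFS itself:

```shell
#!/bin/bash
# Pack each immediate subdirectory of SRC into OUT/<name>.tar.gz,
# running JOBS tar processes at once.
pack_dirs() {   # usage: pack_dirs SRC OUT [JOBS]
    local src="$1" out="$2" jobs="${3:-8}"
    mkdir -p "$out"
    find "$src" -mindepth 1 -maxdepth 1 -type d -print0 |
      OUT="$out" xargs -0 -P "$jobs" -n 1 sh -c \
        'tar czf "$OUT/$(basename "$1").tar.gz" -C "$(dirname "$1")" "$(basename "$1")"' _
}

# e.g.: pack_dirs /mnt/mfs/project /backup 8   (placeholders)
```

The trade-off raised later in the thread still applies: more concurrent jobs finish sooner but take more IOPS away from other clients, so the job count is the knob to tune.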
From: David M. <dav...@pr...> - 2017-09-12 17:06:49
|
Hi Steve, We have about 6.3 million files in our MFS cluster and use Varnish to cache MFS file retrievals on each node. Having Varnish in front of mfsclient ensures that client requests for popular files are not occupying moosefs and potentially repeatedly going between MFS nodes to retrieve file chunks. This is for a HTTP download system. Cheers, Dave Sent with [ProtonMail](https://protonmail.com) Secure Email. ---------------------------- Hi, We have eight different MooseFS installations at our site and they are all doing terrific (thanks, MooseFS developers!) except for the one that hosts around 180 million files. The performance on this installation is quite dismal. I expect to see some decrease in performance due to the sheer number of files but not to the degree that I'm experiencing. For example, I am in the process of tarring/gzipping 176 directories that together contain 15M files and occupy 5.5TB of space. I am about 78% done after 25 days. Are there any users on this list who have MooseFS installations with large numbers of files and have some hints they can share about how to improve performance? Here is some basic information about this installation: * MooseFS master is running on a Xeon E5-1630v3 (3.7GHz) and 132GB of memory. Its system disk (where /var/lib/mfs is located) is a 256GB Samsung 850 Pro SSD. * Four chunk servers, two with 16 SATA disks ranging from 3TB to 10TB each. And two chunk servers with two 10TB SATA disks each. All disks are JBOD and most are formatted using XFS (some are ext4). * Disk schedulers on chunk servers are set to use deadline. Tried different settings for read_ahead_kb and nr_requests without much performance gain. * Network is gigabit Ethernet * Master and chunk servers are running a mix of MooseFS versions 3.0.90 and 3.0.92. Thanks in advance for any suggestions! Steve |
From: Wilson, S. M <st...@pu...> - 2017-09-12 16:48:43
|
Hi, We have eight different MooseFS installations at our site and they are all doing terrific (thanks, MooseFS developers!) except for the one that hosts around 180 million files. The performance on this installation is quite dismal. I expect to see some decrease in performance due to the sheer number of files but not to the degree that I'm experiencing. For example, I am in the process of tarring/gzipping 176 directories that together contain 15M files and occupy 5.5TB of space. I am about 78% done after 25 days. Are there any users on this list who have MooseFS installations with large numbers of files and have some hints they can share about how to improve performance? Here is some basic information about this installation: * MooseFS master is running on a Xeon E5-1630v3 (3.7GHz) and 132GB of memory. Its system disk (where /var/lib/mfs is located) is a 256GB Samsung 850 Pro SSD. * Four chunk servers, two with 16 SATA disks ranging from 3TB to 10TB each. And two chunk servers with two 10TB SATA disks each. All disks are JBOD and most are formatted using XFS (some are ext4). * Disk schedulers on chunk servers are set to use deadline. Tried different settings for read_ahead_kb and nr_requests without much performance gain. * Network is gigabit Ethernet * Master and chunk servers are running a mix of MooseFS versions 3.0.90 and 3.0.92. Thanks in advance for any suggestions! Steve |
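The scheduler and readahead settings mentioned above live under the standard block-device sysfs paths. A read-only sketch to inspect them per disk (write to the same files as root to change them, e.g. `echo deadline > /sys/block/sdb/queue/scheduler`); the `sd*` glob is an assumption matching the SATA disks described here:

```shell
#!/bin/sh
# Report I/O scheduler, readahead and queue depth for each sd* disk.
FOUND=0
for q in /sys/block/sd*/queue; do
    [ -d "$q" ] || continue
    FOUND=1
    printf '%s: scheduler=%s read_ahead_kb=%s nr_requests=%s\n' \
        "${q%/queue}" "$(cat "$q/scheduler")" \
        "$(cat "$q/read_ahead_kb")" "$(cat "$q/nr_requests")"
done
[ "$FOUND" -eq 1 ] || echo "no /sys/block/sd* disks visible (container/VM?)"
```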
From: Aleksander W. <ale...@mo...> - 2017-09-07 07:35:22
|
Hi. Would you be so kind as to tell us something more about your hardware configuration - mainly we need to know if you are using RAID or JBOD as MooseFS chunkserver disks. A good idea is to look into the system logs. By the way, which MooseFS version are you using? Best regards Aleksander Wieliczko Technical Support Engineer MooseFS.com On 07.09.2017 09:20, Michael Tinsay wrote: > > Hello! > > > Last weekend, due to a building electrical issue, we had to power down > all our servers. When we started the process to power down our > chunkserver, "service moosefs-chunkserver stop" took more than 30 > minutes to complete. What could have caused this behavior? We'll > need to power down our servers again this weekend for a similar reason. > What do I look for should this happen again? Is it safe to send > a SIGKILL signal if this happens again? > > > Warm Regards, > > > > --- mike t. > |