From: Wilson, S. M <st...@pu...> - 2018-10-19 17:19:43
Marco,

Ah, yes... I could also run multiple instances of MooseFS to help divide the load. I was hoping to avoid doing something like that, though. I will keep it in mind as a last resort.

For the moment, I'm seeing about a 50% improvement in the benchmarks after adding the mfsmount option "mfsfsyncmintime=5", so I'll see how that plays out with real workloads.

Thanks for your willingness to try this out in your own environment. It is very kind of you, but please do not make too much work for yourself!

Regards,
Steve

________________________________________
From: Marco Milano <mar...@gm...>
Sent: Friday, October 19, 2018 12:36 PM
To: moo...@li...
Subject: Re: [MooseFS-Users] Performance suggestions for millions of small files

Steve,

My wild guess is that the master server somehow does not handle that many files efficiently. (Obviously, if this is the case, it is very bad.) I will do some tests with the 4.x series and 0.5 billion files and let you know. (It will take me several weeks to create that test environment.)

In the meantime, you can split the namespace into two, which may help on the same hardware. (Basically, you can run as many namespaces as you want on the same hardware setup; however, this will require a lot of work to set up and migrate. In this case, there will be two master server processes running on your master server hardware, just on different ports.)

Or just hope that the performance of the master server is better with version 4.x.

-- Marco

On 10/19/18 11:10 AM, Wilson, Steven M wrote:
> Hi Diego,
>
> I appreciate you taking the time to run these tests on your own setup!
> My parameters to the smallfile benchmark were a little different (I took
> them from some GlusterFS benchmarking documentation):
>
> smallfile_cli.py --top smallfile-tests --threads 4 --file-size 4
> --files 10000 --response-times Y
>
> Steve
>
> ------------------------------------------------------------------------
> *From:* Remolina, Diego J <dij...@ae...>
> *Sent:* Friday, October 19, 2018 7:19 AM
> *To:* MooseFS-Users
> *Subject:* Re: [MooseFS-Users] Performance suggestions for millions of
> small files
>
> Hi Steve,
>
> I have by no means a similar amount of files and space, as I am just
> testing, but this is what I see with MooseFS 4.6.0 and goal=3 on a
> pretty new (in testing phase, no load) 3-way server setup:
>
> time tar -xf linux-4.9-rc3.tar
>
> real 4m0.332s
> user 0m1.668s
> sys 0m9.517s
>
> python /tmp/smallfile/smallfile_cli.py --operation create --threads 8
> --file-size 1 --files 2048 --top /nethome/dijuremo/test
>
> total threads = 8
> total files = 15948
> total IOPS = 15948
> total data = 0.015 GiB
> 97.34% of requested files processed, minimum is 90.00
> elapsed time = 11.608
> files/sec = 1373.870032
> IOPS = 1373.870032
> MiB/sec = 1.341670
>
> python /tmp/smallfile/smallfile_cli.py --operation read --threads 8
> --file-size 1 --files 2048 --top /nethome/dijuremo/test
>
> total threads = 8
> total files = 16384
> total IOPS = 16384
> total data = 0.016 GiB
> 100.00% of requested files processed, minimum is 90.00
> elapsed time = 2.553
> files/sec = 6416.909838
> IOPS = 6416.909838
> MiB/sec = 6.266514
>
> python /tmp/smallfile/smallfile_cli.py --operation append --threads 8
> --file-size 1 --files 2048 --top /nethome/dijuremo/test
>
> total threads = 8
> total files = 15348
> total IOPS = 15348
> total data = 0.015 GiB
> 93.68% of requested files processed, minimum is 90.00
> elapsed time = 8.018
> files/sec = 1914.272783
> IOPS = 1914.272783
> MiB/sec = 1.869407
>
> I will be happy to adjust the smallfile test
> settings if any of my tests are useful to you and re-run them for
> comparison.
>
> Diego
>
> ------------------------------------------------------------------------
> *From:* Wilson, Steven M <st...@pu...>
> *Sent:* Thursday, October 18, 2018 4:47:14 PM
> *To:* MooseFS-Users
> *Subject:* [MooseFS-Users] Performance suggestions for millions of small
> files
>
> Hi,
>
> We have ten different MooseFS installations in our research group and
> one, in particular, is struggling with poor I/O performance. This
> installation currently has 315 million files occupying 170TB of disk
> space (goal = 2). If anyone else has a similar installation, I would
> like to hear what you have done to maintain performance at a reasonable
> level.
>
> Here are some metrics to give a basic idea of the performance
> characteristics. I'll include in parentheses the range of measurements
> from other MFS installations with far fewer files for comparison.
>
> * tar xf linux-4.9-rc3.tar: 1185 secs (220 - 296 secs)
> * smallfile test, create MB/s: 0.8 (2.3 - 4.8) <== Ouch!
> * smallfile test, read MB/s: 10.7 (12.8 - 15.4)
> * smallfile test, append MB/s: 6.1 (3.0 - 7.7)
>
> It looks like file creation is where I'm losing most of my performance
> compared to the other installations. My master server has a Xeon
> E5-1630v3 3.7GHz CPU with 256GB of DDR4 2133MHz memory.
>
> I tried several mfsmount options, but the only one that showed any
> significant improvement was the mfsfsyncmintime option
> ("mfsfsyncmintime=5"). As expected, the improvement was in the
> write/append operations. Here are the results using the same
> tests as above:
>
> * tar xf linux-4.9-rc3.tar: 683 secs
> * smallfile test, create MB/s: 1.2
> * smallfile test, read MB/s: 11.7
> * smallfile test, append MB/s: 11.4 <== Dramatic improvement over
>   6.1 MB/s
>
> The smallfile benchmark test I used is from
> https://github.com/distributed-system-analysis/smallfile.
>
> Thanks for any suggestions you might have!
>
> Regards,
>
> Steve
>
> _________________________________________
> moosefs-users mailing list
> moo...@li...
> https://lists.sourceforge.net/lists/listinfo/moosefs-users
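
For anyone comparing the two sets of figures in Steve's original message, here is a minimal Python sketch that works out the relative change for each metric before and after mounting with "mfsfsyncmintime=5". The numbers are copied from the thread; the names RESULTS, percent_change, speedup and summarize are made up for this illustration and are not part of MooseFS or the smallfile tool.

```python
# Benchmark figures copied from Steve's original message in this thread.
# Each entry: (metric, before, with mfsfsyncmintime=5, higher_is_better).
# All names below are hypothetical helpers for illustration only.
RESULTS = [
    ("tar xf linux-4.9-rc3.tar (secs)", 1185.0, 683.0, False),
    ("smallfile create (MB/s)", 0.8, 1.2, True),
    ("smallfile read (MB/s)", 10.7, 11.7, True),
    ("smallfile append (MB/s)", 6.1, 11.4, True),
]


def percent_change(before, after):
    """Signed change of `after` relative to `before`, in percent."""
    return (after - before) / before * 100.0


def speedup(before, after, higher_is_better):
    """Improvement factor; values above 1.0 mean the new result is better."""
    return after / before if higher_is_better else before / after


def summarize(results):
    """Return (metric, percent change, speedup factor) for each entry."""
    return [
        (name, percent_change(before, after), speedup(before, after, better))
        for name, before, after, better in results
    ]


if __name__ == "__main__":
    for name, pct, factor in summarize(RESULTS):
        print(f"{name:34s} {pct:+7.1f}%   {factor:.2f}x")
```

For the elapsed-time metric a negative percentage is an improvement, which is why the sketch also reports a speedup factor that reads the same way for every row.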