From: Wilson, S. M <st...@pu...> - 2018-10-19 14:44:40
Thanks! Yes, it is very good to be thorough, and this is a good reminder to double-check the things that you mentioned. I am also planning to get all the systems updated to 3.0.101 as soon as possible.

Steve

________________________________________
From: Zlatko Čalušić <zca...@bi...>
Sent: Friday, October 19, 2018 9:28 AM
To: Wilson, Steven M; Marco Milano; moo...@li...
Subject: Re: [MooseFS-Users] Performance suggestions for millions of small files

Hello Steve,

It's of course hard to debug problems without knowing all the details, so I can only suggest what I would look for if I were in your shoes.

If you suspect that the system is slower than it should be, try to find its weakest link, i.e. whether any resource in the cluster is exhausted. Since you indicated that file creation is the weakest link, definitely start from the master server:

- what is CPU usage %?
- what is /var/lib/mfs disk busy %?
- any other issue, memory contention, swapping?

If you can eliminate the master server as the culprit, then proceed to the chunkservers, where you also want to know:

- disk busy % for each spindle you have in the pool?
- CPU usage % for each chunkserver?
- swapping, etc...?

It would be best to collect all those metrics while you're running the creation test (say, in a loop). Then check whether any particular resource is exhausted.

Finally, check the network: any packet loss in any segment that provokes TCP retransmissions would slow down the whole cluster a lot.

Yeah, a lot of stuff to check, but then again you do have a hefty 170 TB cluster with lots of moving parts in it, don't you. :)

I'd also suggest upgrading the whole cluster to the newest 3.0.101 version, which has some caching improvements, i.e. it might utilize memory on chunkservers slightly better than any previous version.

Hope it helps!

On 19. 10. 2018. 15:02, Wilson, Steven M wrote:
> Marco,
>
> I re-ran the benchmark test asking for four threads to create four 10GB files and the problematic cluster shows 45.8 MB/s while one of the other clusters shows 77.3 MB/s. Certainly much better than many small files being created!
>
> And in answer to your second question, the versions of MooseFS on the slow cluster are mixed (two chunkservers and many clients running 3.0.97, master server and two other chunkservers running 3.0.101).
>
> Steve
>
>
> ________________________________________
> From: Marco Milano <mar...@gm...>
> Sent: Friday, October 19, 2018 6:26 AM
> To: moo...@li...
> Subject: Re: [MooseFS-Users] Performance suggestions for millions of small files
>
> Steve,
>
> I don't have a solution for this problem.
> Just out of curiosity:
>
> -- what is your large file create speed on this cluster
> compared to your other clusters?
> (i.e. how long does it take to create a single 10GB file
> on this one compared to others?)
>
> -- You said you have a mix of 3.0.97 and 3.0.101,
> are the versions of MooseFS uniform on this "slow cluster"?
>
> -- Marco
>
> On 10/18/18 8:35 PM, Wilson, Steven M wrote:
>> We have a mix of 3.0.97 and 3.0.101.
>>
>> Steve
>>
>> ________________________________________
>> From: Marco Milano <mar...@gm...>
>> Sent: Thursday, October 18, 2018 5:43 PM
>> To: moo...@li...
>> Subject: Re: [MooseFS-Users] Performance suggestions for millions of small files
>>
>> Steve,
>>
>> What is the version of MooseFS?
>>
>> -- Marco
>>
>> On 10/18/18 4:47 PM, Wilson, Steven M wrote:
>>> Hi,
>>>
>>>
>>> We have ten different MooseFS installations in our research group and
>>> one, in particular, is struggling with poor I/O performance. This
>>> installation currently has 315 million files occupying 170TB of disk
>>> space (goal = 2). If anyone else has a similar installation, I would
>>> like to hear what you have done to maintain performance at a reasonable
>>> level.
>>>
>>>
>>> Here are some metrics to give a basic idea of the performance
>>> characteristics. I'll include in parentheses the range of measurements
>>> from other MFS installations with far fewer files for comparison.
>>>
>>> * tar xf linux-4.9-rc3.tar: 1185 secs (220 - 296 secs)
>>>
>>> * smallfile test, create MB/s: 0.8 (2.3 - 4.8) <== Ouch!
>>>
>>> * smallfile test, read MB/s: 10.7 (12.8 - 15.4)
>>>
>>> * smallfile test, append MB/s: 6.1 (3.0 - 7.7)
>>>
>>>
>>> It looks like file creation is where I'm losing most of my performance
>>> compared to the other installations. My master server has a Xeon
>>> E5-1630v3 3.7GHz CPU with 256GB of DDR4 2133MHz memory.
>>>
>>>
>>> I tried several mfsmount options but the only one that showed any
>>> significant improvement was the mfsfsyncmintime option
>>> ("mfsfsyncmintime=5"). As expected, the improvement gained was
>>> during the write/append operation. Here are the results using the same
>>> tests as above:
>>>
>>> * tar xf linux-4.9-rc3.tar: 683 secs
>>> * smallfile test, create MB/s: 1.2
>>> * smallfile test, read MB/s: 11.7
>>> * smallfile test, append MB/s: 11.4 <== Dramatic improvement over
>>> 6.1 MB/s
>>>
>>>
>>> The smallfile benchmark test I used is from
>>> https://github.com/distributed-system-analysis/smallfile.
>>>
>>>
>>> Thanks for any suggestions you might have!
>>>
>>>
>>> Regards,
>>>
>>> Steve
>>>
>>>
>>>
>>> _________________________________________
>>> moosefs-users mailing list
>>> moo...@li...
>>> https://lists.sourceforge.net/lists/listinfo/moosefs-users

--
Zlatko
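
The advice in this thread to collect CPU usage % and disk busy % while the creation test runs in a loop could be sketched roughly as follows. This is only a minimal sketch, assuming Linux hosts that expose /proc/stat and /proc/diskstats; all function names are hypothetical and not part of MooseFS, and standard tools such as iostat or sar report the same figures.

```python
import time


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into jiffy counters.

    Fields kept: user, nice, system, idle, iowait, irq, softirq, steal.
    """
    return [int(x) for x in line.split()[1:9]]


def cpu_busy_percent(prev, curr):
    """Percentage of CPU time not spent in idle or iowait between two samples."""
    deltas = [c - p for p, c in zip(prev, curr)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    idle = deltas[3] + deltas[4]  # idle + iowait
    return 100.0 * (total - idle) / total


def parse_diskstats(text):
    """Map device name -> cumulative milliseconds spent doing I/O.

    In /proc/diskstats the device name is the third column and the
    'time spent doing I/Os (ms)' counter is the thirteenth column.
    """
    busy_ms = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 13:
            busy_ms[parts[2]] = int(parts[12])
    return busy_ms


def disk_busy_percent(prev, curr, elapsed_s):
    """Per-device busy % over the elapsed interval (given in seconds)."""
    elapsed_ms = elapsed_s * 1000.0
    return {
        dev: 100.0 * (curr[dev] - prev[dev]) / elapsed_ms
        for dev in curr
        if dev in prev
    }


def read_text(path):
    with open(path) as f:
        return f.read()


def monitor(interval_s=5.0, samples=12):
    """Print CPU and per-disk busy % every interval_s seconds (Linux only)."""
    cpu_prev = parse_cpu_line(read_text("/proc/stat").splitlines()[0])
    disk_prev = parse_diskstats(read_text("/proc/diskstats"))
    t_prev = time.monotonic()
    for _ in range(samples):
        time.sleep(interval_s)
        cpu_curr = parse_cpu_line(read_text("/proc/stat").splitlines()[0])
        disk_curr = parse_diskstats(read_text("/proc/diskstats"))
        t_curr = time.monotonic()
        print("cpu busy %%: %.1f" % cpu_busy_percent(cpu_prev, cpu_curr))
        for dev, pct in sorted(disk_busy_percent(disk_prev, disk_curr, t_curr - t_prev).items()):
            print("  %s busy %%: %.1f" % (dev, pct))
        cpu_prev, disk_prev, t_prev = cpu_curr, disk_curr, t_curr
```

Running something like monitor(interval_s=5, samples=12) on the master and on each chunkserver while the smallfile create test is in progress would show whether any CPU or spindle sits near 100%. Swapping and TCP retransmissions are not covered by this sketch and would need to be checked separately.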