From: Wilson, S. M <st...@pu...> - 2018-10-18 21:22:20
|
Hi,

We have ten different MooseFS installations in our research group, and one in particular is struggling with poor I/O performance. This installation currently has 315 million files occupying 170TB of disk space (goal = 2). If anyone else has a similar installation, I would like to hear what you have done to maintain performance at a reasonable level.

Here are some metrics to give a basic idea of the performance characteristics. I'll include in parentheses the range of measurements from other MFS installations with far fewer files for comparison.

* tar xf linux-4.9-rc3.tar: 1185 secs (220 - 296 secs)
* smallfile test, create MB/s: 0.8 (2.3 - 4.8) <== Ouch!
* smallfile test, read MB/s: 10.7 (12.8 - 15.4)
* smallfile test, append MB/s: 6.1 (3.0 - 7.7)

It looks like file creation is where I'm losing most of my performance compared to the other installations. My master server has a Xeon E5-1630v3 3.7GHz CPU with 256GB of DDR4 2133MHz memory.

I tried several mfsmount options, but the only one that showed any significant improvement was the mfsfsyncmintime option ("mfsfsyncmintime=5"). As expected, the improvement came in the write/append operations. Here are the results using the same tests as above:

* tar xf linux-4.9-rc3.tar: 683 secs
* smallfile test, create MB/s: 1.2
* smallfile test, read MB/s: 11.7
* smallfile test, append MB/s: 11.4 <== Dramatic improvement over 6.1 MB/s

The smallfile benchmark test I used is from https://github.com/distributed-system-analysis/smallfile.

Thanks for any suggestions you might have!

Regards,
Steve |
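To put the figures in the message above in proportion, here is a minimal Python sketch that turns them into slowdown factors. The numbers are copied from the message; the helper name `slowdown` and the table layout are only illustrative, not part of any MooseFS or smallfile tooling.

```python
# Figures copied from the message above; helper and layout are illustrative.

def slowdown(slow, fast, higher_is_better=True):
    """Return how many times worse `slow` is than `fast`."""
    return fast / slow if higher_is_better else slow / fast

# (metric, slow cluster, worst of the others, best of the others, higher_is_better)
baseline = [
    ("tar xf (secs)", 1185.0, 296.0, 220.0, False),
    ("create (MB/s)", 0.8, 2.3, 4.8, True),
    ("read (MB/s)", 10.7, 12.8, 15.4, True),
    ("append (MB/s)", 6.1, 3.0, 7.7, True),
]

for name, slow, worst, best, hib in baseline:
    low = slowdown(slow, worst, hib)
    high = slowdown(slow, best, hib)
    print(f"{name:15s} {low:.2f}x to {high:.2f}x relative to the other clusters")

# Effect of mfsfsyncmintime=5 on the slow cluster (before, after, higher_is_better)
tuned = [
    ("tar xf (secs)", 1185.0, 683.0, False),
    ("create (MB/s)", 0.8, 1.2, True),
    ("append (MB/s)", 6.1, 11.4, True),
]

for name, before, after, hib in tuned:
    print(f"{name:15s} {slowdown(before, after, hib):.2f}x faster with mfsfsyncmintime=5")
```

A factor below 1.00 means the slow cluster is actually ahead on that metric, which is the case for append against the slowest of the other installations. Create and tar extraction are the two metrics where the gap is largest.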
From: Marco M. <mar...@gm...> - 2018-10-18 21:43:36
|
Steve, which version of MooseFS are you running? -- Marco |
From: Wilson, S. M <st...@pu...> - 2018-10-19 00:36:12
|
We have a mix of 3.0.97 and 3.0.101. Steve |
From: Marco M. <mar...@gm...> - 2018-10-19 10:27:06
|
Steve,

I don't have a solution for this problem. Just out of curiosity:

-- What is your large-file create speed on this cluster compared to your other clusters? (i.e., how long does it take to create a single 10GB file on this one compared to the others?)

-- You said you have a mix of 3.0.97 and 3.0.101. Are the versions of MooseFS uniform on this "slow cluster"?

-- Marco |
From: Alexander A. <ale...@op...> - 2018-10-19 06:11:26
|
Hi Steve! You know, I have come to the conclusion not to store millions of small files on MooseFS ;--) In such cases I use one big file stored on MooseFS as a container, format it as EXT4 or XFS, and then mount it on the client side. Yes, I know that in this case the small files are no longer clustered... it's a pity, but there you are. Wbr Alexander (Anri) Akhobadze, ale...@op... System administrator, DATAVISION NN |
From: Wilson, S. M <st...@pu...> - 2018-10-19 12:47:38
|
Thanks for the suggestion! I had thought about that also but a lot of our files need to be accessed simultaneously from different clients and, as you mentioned, this approach doesn't support that. Steve |
From: Wilson, S. M <st...@pu...> - 2018-10-19 13:02:26
|
Marco,

I re-ran the benchmark test asking for four threads to create four 10GB files and the problematic cluster shows 45.8 MB/s while one of the other clusters shows 77.3 MB/s. Certainly much better than many small files being created!

And in answer to your second question, the versions of MooseFS on the slow cluster are mixed (two chunkservers and many clients running 3.0.97, master server and two other chunkservers running 3.0.101).

Steve |
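For anyone who wants a quick large-file data point without setting up the smallfile tool, a rough Python sketch along these lines may be enough. It is not how the 45.8 and 77.3 MB/s figures above were measured: it is single-threaded, the function name `write_throughput_mb_s` is made up for this example, and the temporary directory is only a stand-in for a directory on the MooseFS mount.

```python
import os
import tempfile
import time

def write_throughput_mb_s(path, total_bytes, block_size=1024 * 1024):
    """Write total_bytes to path in block_size pieces, fsync, and return MB/s."""
    block = b"\0" * block_size
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())  # include the time needed to push data out
    elapsed = time.perf_counter() - start
    return written / elapsed / 1e6

if __name__ == "__main__":
    # Assumption: replace the temporary directory with a directory on the
    # MooseFS mount to test the cluster; a local temp dir is used here only
    # so that the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as top:
        rate = write_throughput_mb_s(os.path.join(top, "bigfile"), 64 * 1024 * 1024)
        print(f"{rate:.1f} MB/s")
```

Because the timing includes the final fsync, the result is closer to what the storage sustains than to what the page cache absorbs, but a 64 MiB file is still far smaller than the 10GB files used in the test above.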
From: Zlatko Č. <zca...@bi...> - 2018-10-19 13:46:21
|
Hello Steve,

It's of course hard to debug problems without knowing all the details, so I can only suggest what I would look for if I were in your shoes.

If you suspect that the system is slower than it should be, try to find its weakest link, i.e. whether any resource in the cluster is exhausted.

Since you report that file creation is the weakest link, definitely start with the master server:

- what is CPU usage %?
- what is /var/lib/mfs disk busy %?
- any other issue, memory contention, swapping?

If you can eliminate the master server as the culprit, then proceed to the chunkservers, where you also want to know:

- disk busy % for each spindle you have in the pool?
- CPU usage % on each chunkserver?
- swapping, etc...?

It would be best to collect all those metrics while you're running the creation test (say, in a loop). Then you can see whether any particular resource is exhausted.

Finally, check the network: packet loss in any segment provokes TCP retransmissions, which would slow down the whole cluster a lot.

Yeah, a lot of stuff to check, but then again you do have a hefty 170 TB cluster with lots of moving parts in it, don't you. :)

I'd also suggest upgrading the whole cluster to the newest 3.0.101 version, which has some caching improvements, i.e. it might utilize memory on chunkservers slightly better than any previous version.

Hope it helps!

-- Zlatko |
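Two items from the checklist above, CPU usage and TCP retransmissions, can be sampled with a few lines of Python on a Linux master or chunkserver. This is only a sketch under assumptions: it reads the standard Linux `/proc/stat` and `/proc/net/snmp` files, the function names are made up for this example, and it does not cover disk busy % or swapping (tools such as `iostat -x` and `vmstat` are the usual choice there).

```python
import os
import time

def cpu_times(stat_text):
    """Return (busy, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            values = [int(v) for v in fields[1:]]
            # Fields: user nice system idle iowait irq softirq steal guest guest_nice.
            # idle + iowait are treated as "not busy" here.
            idle = values[3] + (values[4] if len(values) > 4 else 0)
            total = sum(values[:8])  # guest time is already included in user/nice
            return total - idle, total
    raise ValueError("no aggregate cpu line found")

def cpu_busy_percent(before, after):
    """Busy percentage between two (busy, total) samples."""
    busy = after[0] - before[0]
    total = after[1] - before[1]
    return 100.0 * busy / total if total else 0.0

def tcp_retrans_segs(snmp_text):
    """Return the RetransSegs counter from the text of /proc/net/snmp."""
    tcp_lines = [l.split() for l in snmp_text.splitlines() if l.startswith("Tcp:")]
    header, values = tcp_lines[0], tcp_lines[1]
    return int(values[header.index("RetransSegs")])

def read_text(path):
    with open(path) as f:
        return f.read()

if __name__ == "__main__" and os.path.exists("/proc/stat") and os.path.exists("/proc/net/snmp"):
    interval = 1.0  # seconds; run this while the creation test is looping
    cpu_before = cpu_times(read_text("/proc/stat"))
    retrans_before = tcp_retrans_segs(read_text("/proc/net/snmp"))
    time.sleep(interval)
    cpu_after = cpu_times(read_text("/proc/stat"))
    retrans_after = tcp_retrans_segs(read_text("/proc/net/snmp"))
    print(f"CPU busy: {cpu_busy_percent(cpu_before, cpu_after):.1f}%")
    print(f"TCP segments retransmitted in {interval:.0f}s: {retrans_after - retrans_before}")
```

The aggregate CPU figure averages over all cores, so a single saturated core on a multi-core master can hide behind a low overall percentage; the per-core `cpu0`, `cpu1`, ... lines in `/proc/stat` would need the same treatment to rule that out.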
From: Wilson, S. M <st...@pu...> - 2018-10-19 14:44:40
|
Thanks! Yes, it is very good to be thorough and this is a good reminder to double-check the things that you mentioned. I am also planning to get all the systems updated to 3.0.101 as soon as possible. Steve |
From: Remolina, D. J <dij...@ae...> - 2018-10-19 14:52:33
Attachments:
pastedImage.png
|
Hi Steve,

I have nowhere near that number of files or that much space, as I am just testing, but this is what I see with MooseFS 4.6.0 and goal=3 on a pretty new (in testing phase, no load) 3-way server setup:

time tar -xf linux-4.9-rc3.tar
real 4m0.332s
user 0m1.668s
sys 0m9.517s

python /tmp/smallfile/smallfile_cli.py --operation create --threads 8 --file-size 1 --files 2048 --top /nethome/dijuremo/test
total threads = 8
total files = 15948
total IOPS = 15948
total data = 0.015 GiB
97.34% of requested files processed, minimum is 90.00
elapsed time = 11.608
files/sec = 1373.870032
IOPS = 1373.870032
MiB/sec = 1.341670

python /tmp/smallfile/smallfile_cli.py --operation read --threads 8 --file-size 1 --files 2048 --top /nethome/dijuremo/test
total threads = 8
total files = 16384
total IOPS = 16384
total data = 0.016 GiB
100.00% of requested files processed, minimum is 90.00
elapsed time = 2.553
files/sec = 6416.909838
IOPS = 6416.909838
MiB/sec = 6.266514

python /tmp/smallfile/smallfile_cli.py --operation append --threads 8 --file-size 1 --files 2048 --top /nethome/dijuremo/test
total threads = 8
total files = 15348
total IOPS = 15348
total data = 0.015 GiB
93.68% of requested files processed, minimum is 90.00
elapsed time = 8.018
files/sec = 1914.272783
IOPS = 1914.272783
MiB/sec = 1.869407

I will be happy to adjust the smallfile test settings and re-run them for comparison if any of my tests are useful to you.

Diego |
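When several people paste smallfile summaries into a thread like this, it helps to pull the rates out mechanically and sanity-check them. The Python sketch below does that for the create run quoted above; the function names are invented for this example, and the only thing taken from the smallfile output is its `name = value` summary layout as it appears in the message.

```python
import re

def parse_smallfile_summary(text):
    """Extract elapsed time, files/sec and MiB/sec from a smallfile summary."""
    out = {}
    for key in ("elapsed time", "files/sec", "MiB/sec"):
        match = re.search(re.escape(key) + r"\s*=\s*([0-9.]+)", text)
        if match:
            out[key] = float(match.group(1))
    return out

def expected_mib_per_sec(files_per_sec, file_size_kib):
    """Throughput implied by a file rate when every file is file_size_kib KiB."""
    return files_per_sec * file_size_kib / 1024.0

# Summary of the create run from the message above (--file-size 1).
create_summary = """
total threads = 8
total files = 15948
total IOPS = 15948
total data = 0.015 GiB
97.34% of requested files processed, minimum is 90.00
elapsed time = 11.608
files/sec = 1373.870032
IOPS = 1373.870032
MiB/sec = 1.341670
"""

summary = parse_smallfile_summary(create_summary)
implied = expected_mib_per_sec(summary["files/sec"], file_size_kib=1)
print(f"reported {summary['MiB/sec']:.6f} MiB/sec, implied {implied:.6f} MiB/sec")
```

The reported and implied throughput agree, which confirms that with 1 KiB files the MiB/sec column is just the file rate divided by 1024. In other words, these runs measure per-file overhead rather than bandwidth.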
From: Wilson, S. M <st...@pu...> - 2018-10-19 15:10:12
Attachments:
pastedImage.png
|
Hi Diego,

I appreciate you taking the time to run these tests on your own setup! My parameters to the smallfile benchmark were a little different (I took them from some GlusterFS benchmarking documentation):

smallfile_cli.py --top smallfile-tests --threads 4 --file-size 4 --files 10000 --response-times Y

Steve
|
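Because the two smallfile invocations in this thread use different thread counts, file counts, and file sizes, their MB/s figures are not directly comparable. The Python sketch below works out the requested workload for each parameter set and converts a MB/s figure back into files per second. It assumes --files is a per-thread count and --file-size is in KiB (both consistent with the "total files = 16384" and "total data = 0.016 GiB" lines posted for the 8-thread run), and it treats the MB/s values quoted in the thread as MiB/s. The class and field names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SmallfileRun:
    """One set of smallfile parameters (names here are illustrative)."""

    label: str
    threads: int
    files_per_thread: int  # smallfile --files, assumed to be per thread
    file_size_kib: int     # smallfile --file-size, assumed to be in KiB

    @property
    def total_files(self) -> int:
        return self.threads * self.files_per_thread

    @property
    def total_mib(self) -> float:
        return self.total_files * self.file_size_kib / 1024

    def files_per_sec(self, mib_per_sec: float) -> float:
        """Convert a throughput figure into files per second."""
        return mib_per_sec * 1024 / self.file_size_kib


diego = SmallfileRun("Diego", threads=8, files_per_thread=2048, file_size_kib=1)
steve = SmallfileRun("Steve", threads=4, files_per_thread=10000, file_size_kib=4)

for run in (diego, steve):
    print(f"{run.label}: {run.total_files} files, {run.total_mib:.2f} MiB requested")

# Create rate reported for the 315-million-file installation: 0.8 MB/s
print(f"Steve create: about {steve.files_per_sec(0.8):.0f} files/s")
```

Comparing files per second rather than MB/s removes the file-size difference, although the different thread counts and hardware still make this a rough comparison at best.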
From: Marco M. <mar...@gm...> - 2018-10-19 16:36:36
|
Steve,

My wild guess is that the master server somehow does not handle that many files very efficiently. (Obviously, if this is the case, it is very bad.)

I will do some tests with the 4.x series and 0.5 billion files and let you know. (It will take me several weeks to create that test environment.)

In the meantime, you can split the namespace into two, which may help on the same hardware. (Basically, you can run as many namespaces as you want on the same hardware setup. However, this will require a lot of work to set up and migrate. In this case, there will be two master server processes running on your master server hardware, just on different ports.)

Or just hope that the performance of the master server is better with version 4.x.

-- Marco
|
From: Wilson, S. M <st...@pu...> - 2018-10-19 17:19:43
|
Marco,

Ah, yes... I could also run multiple instances of MooseFS to help divide the load. I was hoping to avoid doing something like that, though. I will keep it in mind as a last resort. For the moment, I'm seeing about a 50% improvement in the benchmarks after adding the mfsmount option "mfsfsyncmintime=5", so I'll see how that plays out with real workloads.

Thanks for your willingness to try this out in your own environment. It is very kind of you, but please do not make too much work for yourself!

Regards,
Steve
|
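To put the "about 50%" figure in context, the Python sketch below computes the relative change for each benchmark from the numbers in the first message of this thread (before and after adding "mfsfsyncmintime=5"). The helper name pct_change is only illustrative.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# smallfile throughput in MB/s: (without option, with mfsfsyncmintime=5)
throughput = {
    "create": (0.8, 1.2),
    "read": (10.7, 11.7),
    "append": (6.1, 11.4),
}

for op, (before, after) in throughput.items():
    print(f"{op:6s} {pct_change(before, after):+.0f}%")

# tar extraction time in seconds: lower is better
tar_before, tar_after = 1185, 683
print(f"tar    {pct_change(tar_before, tar_after):+.0f}% elapsed time "
      f"({tar_before / tar_after:.2f}x faster)")
```

On these figures, create improves by roughly 50%, read by about 9%, append by about 87%, and the tar extraction takes about 42% less time, so the gain is concentrated in the write paths.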
From: Piotr R. K. <pio...@ge...> - 2018-10-19 17:14:33
|
Hi Steve,

What is the latency between:

Client <----> Master Server
Client <----> Chunkservers?

For small-file operations, latency becomes the most crucial parameter. Could you please provide us with some ping test results?

Also, what are the results (of single tests and summed up) if you run, e.g., two, three, or more such tests at the same time?

Could you please also paste some Master charts from the time the tests were being run?

Thank you,

Best regards,
Peter

Piotr Robert Konopelko | m: +48 601 476 440 | e: pio...@mo...
Business & Technical Support Manager
MooseFS Client Support Team
WWW <http://moosefs.com/> | GitHub <https://github.com/moosefs/moosefs> | Twitter <https://twitter.com/moosefs> | Facebook <https://www.facebook.com/moosefs> | LinkedIn <https://www.linkedin.com/company/moosefs>
|
From: Wilson, S. M <st...@pu...> - 2018-10-19 19:26:36
|
Hi Peter,

Let me get some of this information to you.

Ping results:

Client to Master Server (which is also Chunkserver #1)
25 packets transmitted, 25 received, 0% packet loss, time 24581ms
rtt min/avg/max/mdev = 0.082/0.108/0.147/0.023 ms

Client to Chunkserver #2
25 packets transmitted, 25 received, 0% packet loss, time 24567ms
rtt min/avg/max/mdev = 0.102/0.118/0.135/0.016 ms

Client to Chunkserver #3
25 packets transmitted, 25 received, 0% packet loss, time 24556ms
rtt min/avg/max/mdev = 0.104/0.126/0.153/0.014 ms

Client to Chunkserver #4
25 packets transmitted, 25 received, 0% packet loss, time 24557ms
rtt min/avg/max/mdev = 0.067/0.073/0.081/0.012 ms

Three tests run at the same time from different clients (using the results from client #1):

Create IOPS: 174
Create MB/s: 0.68
Read IOPS: 3166
Read MB/s: 12.37
Append IOPS: 1221
Append MB/s: 4.77
Rename files/s: 1158
tar -xf linux-4.9-rc3.tar: 824 secs

Single test run on client #1:

Create IOPS: 281
Create MB/s: 1.1
Read IOPS: 3351
Read MB/s: 13.09
Append IOPS: 1678
Append MB/s: 6.55
Rename files/s: 1319
tar -xf linux-4.9-rc3.tar: 937 secs (not sure why this took longer... perhaps due to an increase in other activity)

These tests took place between 14:00 and 15:15, so you can see the related activity on the attached Master Charts images.

Thanks!
Steve
|
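One rough way to relate the ping times to the benchmark results is to express the average time per operation as a multiple of the client-to-master round-trip time. The Python sketch below does this for the single-client run. It assumes the run used the 4-thread smallfile invocation quoted earlier in the thread and that each thread issues its operations one after another; neither is confirmed in this message, and the function names are illustrative. It is a back-of-the-envelope estimate, not a diagnosis.

```python
def ms_per_op(iops: float, threads: int) -> float:
    """Average wall-clock time per operation as seen by one thread, in ms."""
    return 1000.0 * threads / iops


def rtts_per_op(iops: float, threads: int, rtt_ms: float) -> float:
    """How many network round trips would fit into one operation."""
    return ms_per_op(iops, threads) / rtt_ms


MASTER_RTT_MS = 0.108  # average client -> master ping from the results above
THREADS = 4            # assumption: same --threads value as the earlier command

# IOPS from the single test run on client #1
single_run_iops = {"create": 281, "read": 3351, "append": 1678}

for op, iops in single_run_iops.items():
    per_op = ms_per_op(iops, THREADS)
    rtts = rtts_per_op(iops, THREADS, MASTER_RTT_MS)
    print(f"{op:6s} {per_op:6.2f} ms/op, about {rtts:.0f} master round trips")
```

Under these assumptions a create takes on the order of 14 ms per thread, which corresponds to over a hundred round trips to the master at 0.108 ms each, while a read takes closer to a dozen. That would suggest that raw network latency alone is unlikely to account for the slow creates, though the Master charts would be needed to say where the time actually goes.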