From: Michał B. <mic...@ge...> - 2011-09-24 13:05:20
|
I can't see which operating system you use. The /etc/fstab approach works only on Linux platforms (tested on Debian). On other platforms you need to prepare a script in /usr/local/etc/rc.d which will run mfsmount with the needed options.

Kind regards
Michał Borychowski
MooseFS Support Manager
Gemius S.A.
ul. Wołoska 7, 02-672 Warszawa
Budynek MARS, klatka D
Tel.: +4822 874-41-00
Fax : +4822 874-41-01

-----Original Message-----
From: Tom De Vylder [mailto:to...@pe...]
Sent: Monday, August 29, 2011 11:39 PM
To: moo...@li...
Subject: [Moosefs-users] fstab entry not working

Hi all,

We're having some trouble getting the fstab entry to work:

~# mount /mnt/moose
mount: wrong fs type, bad option, bad superblock on mfsmount,
       missing codepage or helper program, or other error
       (for several filesystems (e.g. nfs, cifs) you might
       need a /sbin/mount.<type> helper program)
       In some cases useful info is found in syslog - try
       dmesg | tail or so

~# tail -1 /etc/fstab
mfsmount /mnt/moose fuse mfsmaster=10.0.0.1,mfsport=9421,_netdev 0 0
~#

It's the same format as listed on http://www.moosefs.org/reference-guide.html but it seems "mfsmount" isn't accepted.

# mfsmount --version
MFS version 1.6.20
FUSE library version: 2.8.4

# cat /etc/debian_version
6.0.1

If anyone needs more information I'd be glad to provide it.

Kind regards,
Tom De Vylder |
From: Kristofer P. <kri...@cy...> - 2011-09-23 19:01:38
|
It looks like it tries to increase the limit whenever the backlog of pending deletions grows too fast. Perhaps there should be a hard-stop maximum (something like chunks_del_hard_limit) that TmpMaxDel would never exceed. In your case it could be set to the same value as the current chunks_del_limit, so that it has the best of both worlds.

----- Original Message -----
From: "WK" <wk...@bn...>
To: moo...@li...
Sent: Friday, September 23, 2011 1:53:04 PM
Subject: Re: [Moosefs-users] Why is it increasing my DEL_LIMIT when I don't want it to!

On 9/22/2011 2:42 AM, Ólafur Ósvaldsson wrote:
> We have the exact same problem, chunk deletions have caused problems in the past and we have DEL_LIMIT set at 5, but mfsmaster increases it to 40-50 right away and sometimes goes to 70/s

So I checked the code, and here is the offending section in chunks.c of the mfsmaster code:

    if (delnotdone > deldone && delnotdone > prevdelnotdone) {
        TmpMaxDelFrac *= 1.3;
        TmpMaxDel = TmpMaxDelFrac;
        syslog(LOG_NOTICE,"DEL_LIMIT temporary increased to: %u/s",TmpMaxDel);
    }

So CHUNKS_DEL_LIMIT will automatically increase by 30% every cycle until the deletion queue gets caught up. Of course it just keeps rising if you are deleting hundreds of thousands of files, regardless of the performance hit, which can be severe (at least that's what we see).

That whole section could be commented out and mfsmaster recompiled, so that the overwhelming deletion run doesn't happen and CHUNKS_DEL_LIMIT really means LIMIT, even if the deletion queue stacks up.

Instead we decided to drop our DEL_LIMIT to 12, which has no impact on our system for normal deletions, and we are going to let the rate increase by 10% per cycle up to DEL_LIMIT*2 (i.e. 24 in our case), which is still comfortable, and then give us a warning if we are still not keeping up.

    // Local version 09-23-2011
    if (delnotdone > deldone && delnotdone > prevdelnotdone) {
        if (TmpMaxDelFrac < (MaxDel*2)) {
            TmpMaxDelFrac *= 1.1;
            TmpMaxDel = TmpMaxDelFrac;
            syslog(LOG_NOTICE,"DEL_LIMIT temporary increased to: %u/s",TmpMaxDel);
        } else {
            syslog(LOG_NOTICE,"DEL_LIMIT at MAXIMUM of: %u/s",TmpMaxDel);
        }
    }

We are testing now on our test cluster. This was a quickie, so someone let me know if I'm missing something important occurring elsewhere.

-bill |
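The hard-cap idea proposed above can be sketched directly against the loop quoted in the message. The snippet below is only an illustration of that proposal, not actual mfsmaster code; the variable name ChunksDelHardLimit and the helper function are assumptions.

    /* Illustrative sketch only -- not the actual mfsmaster source.
     * It shows the hard-cap idea from this thread: let the limit ramp up
     * under backlog pressure, but never past a configured ceiling.
     * "ChunksDelHardLimit" is a hypothetical config value, not a real option. */
    #include <stdint.h>
    #include <syslog.h>

    static double   TmpMaxDelFrac;      /* current (fractional) temporary limit */
    static uint32_t TmpMaxDel;          /* rounded limit actually applied, per second */
    static uint32_t ChunksDelHardLimit; /* hypothetical hard ceiling from mfsmaster.cfg */

    static void adjust_del_limit(uint32_t delnotdone, uint32_t deldone,
                                 uint32_t prevdelnotdone) {
        if (delnotdone > deldone && delnotdone > prevdelnotdone) {
            /* backlog is growing: ramp up, but clamp to the hard limit */
            TmpMaxDelFrac *= 1.3;
            if (TmpMaxDelFrac > (double)ChunksDelHardLimit) {
                TmpMaxDelFrac = (double)ChunksDelHardLimit;
            }
            TmpMaxDel = (uint32_t)TmpMaxDelFrac;
            syslog(LOG_NOTICE, "DEL_LIMIT temporarily increased to: %u/s (hard cap %u/s)",
                   TmpMaxDel, ChunksDelHardLimit);
        }
    }

With the cap set equal to CHUNKS_DEL_LIMIT, the behaviour degenerates to a strict limit; set higher, it gives the "best of both worlds" described above.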
From: WK <wk...@bn...> - 2011-09-23 18:53:33
|
On 9/22/2011 2:42 AM, Ólafur Ósvaldsson wrote:
> We have the exact same problem, chunk deletions have caused problems in the past and we have DEL_LIMIT set at 5, but mfsmaster increases it to 40-50 right away and sometimes goes to 70/s

So I checked the code, and here is the offending section in chunks.c of the mfsmaster code:

    if (delnotdone > deldone && delnotdone > prevdelnotdone) {
        TmpMaxDelFrac *= 1.3;
        TmpMaxDel = TmpMaxDelFrac;
        syslog(LOG_NOTICE,"DEL_LIMIT temporary increased to: %u/s",TmpMaxDel);
    }

So CHUNKS_DEL_LIMIT will automatically increase by 30% every cycle until the deletion queue gets caught up. Of course it just keeps rising if you are deleting hundreds of thousands of files, regardless of the performance hit, which can be severe (at least that's what we see).

That whole section could be commented out and mfsmaster recompiled, so that the overwhelming deletion run doesn't happen and CHUNKS_DEL_LIMIT really means LIMIT, even if the deletion queue stacks up.

Instead we decided to drop our DEL_LIMIT to 12, which has no impact on our system for normal deletions, and we are going to let the rate increase by 10% per cycle up to DEL_LIMIT*2 (i.e. 24 in our case), which is still comfortable, and then give us a warning if we are still not keeping up.

    // Local version 09-23-2011
    if (delnotdone > deldone && delnotdone > prevdelnotdone) {
        if (TmpMaxDelFrac < (MaxDel*2)) {
            TmpMaxDelFrac *= 1.1;
            TmpMaxDel = TmpMaxDelFrac;
            syslog(LOG_NOTICE,"DEL_LIMIT temporary increased to: %u/s",TmpMaxDel);
        } else {
            syslog(LOG_NOTICE,"DEL_LIMIT at MAXIMUM of: %u/s",TmpMaxDel);
        }
    }

We are testing now on our test cluster. This was a quickie, so someone let me know if I'm missing something important occurring elsewhere.

-bill |
From: Brian P. <axe...@ya...> - 2011-09-23 18:53:09
|
Hi, I've looked around some and I see moosefs has a max filesize of about 2TB. But, I am unable to find what the max file system size is. Can someone please tell me? thanks, Brian |
From: Kristofer P. <kri...@cy...> - 2011-09-23 18:39:46
|
I know that you guys can't give an exact date, but when is the next release expected? Q4 2011? It's been a long time; I'm starting to wonder if MFS is still in development :) |
From: Thomas S. <tho...@gm...> - 2011-09-23 16:30:47
|
Hi,

I only see this at startup of the mfsmaster daemon:

Sep 23 16:52:39 brick01 mfsmaster[1981]: set gid to 1003
Sep 23 16:52:39 brick01 mfsmaster[1981]: set uid to 1003
Sep 23 16:52:39 brick01 mfsmaster[1981]: sessions have been loaded
Sep 23 16:52:39 brick01 mfsmaster[1981]: exports file has been loaded
Sep 23 16:52:39 brick01 mfsmaster[1981]: stats file has been loaded
Sep 23 16:52:39 brick01 mfsmaster[1981]: master <-> metaloggers module: listen on *:9419
Sep 23 16:52:39 brick01 mfsmaster[1981]: master <-> chunkservers module: listen on *:9420
Sep 23 16:52:39 brick01 mfsmaster[1981]: main master server module: listen on *:9421
Sep 23 16:52:39 brick01 mfsmaster[1981]: open files limit: 5000
Sep 23 16:54:00 brick01 mfsmaster[1981]: chunkservers status:
Sep 23 16:54:00 brick01 mfsmaster[1981]: total: usedspace: 0 (0.00 GiB), totalspace: 0 (0.00 GiB), usage: 0.00%
Sep 23 16:54:00 brick01 mfsmaster[1981]: no meta loggers connected !!!

There is nothing about mfsmaster.cfg.

Regards
Thomas

2011/9/23 Davies Liu <dav...@gm...>

> You can check in the syslog whether mfsmaster.cfg is being used or not.
>
> Davies
>
> On Fri, Sep 23, 2011 at 3:07 PM, Thomas Schend <tho...@gm...> wrote:
>
>> Hello everyone,
>>
>> I have a small test setup with 4 Debian 6.0 VMs and MooseFS 1.6.20.
>>
>> I have one master and 4 chunkservers running. It works very well so far.
>>
>> The only problem I see is that MooseFS starts replication immediately after
>> one chunkserver is down or when I change the goal.
>> I also tried setting REPLICATIONS_DELAY_INIT to 300 and
>> REPLICATIONS_DELAY_DISCONNECT to 3600.
>>
>> I even tried passing the config file with -c to the mfsmaster, but no change.
>>
>> What can I do to solve this, or am I doing something wrong?
>>
>> Regards
>> Thomas |
From: Davies L. <dav...@gm...> - 2011-09-23 14:01:40
|
You can check in the syslog whether mfsmaster.cfg is being used or not.

Davies

On Fri, Sep 23, 2011 at 3:07 PM, Thomas Schend <tho...@gm...> wrote:

> Hello everyone,
>
> I have a small test setup with 4 Debian 6.0 VMs and MooseFS 1.6.20.
>
> I have one master and 4 chunkservers running. It works very well so far.
>
> The only problem I see is that MooseFS starts replication immediately after
> one chunkserver is down or when I change the goal.
> I also tried setting REPLICATIONS_DELAY_INIT to 300 and
> REPLICATIONS_DELAY_DISCONNECT to 3600.
>
> I even tried passing the config file with -c to the mfsmaster, but no change.
>
> What can I do to solve this, or am I doing something wrong?
>
> Regards
> Thomas |
From: Steve W. <st...@pu...> - 2011-09-23 13:11:48
|
Hi Michal,

Thanks for your response. It looks like all the invalid chunks (originally 29 of them) contained reserved files (files in the Trash?). Yesterday MooseFS reported only 5 invalid chunks, so it looks like these chunks are being cleaned up automatically as the files they contain expire and are deleted from the Trash. So, I think I'll just wait over the weekend and see if they all disappear.

Steve

On 09/23/2011 04:25 AM, Michał Borychowski wrote:
> Hi Steve!
>
> The "realpath" function resolves the real path to the file - we are not sure
> why it may return an error - probably some path fragment is a symbolic link
> to a non-existing object.
>
> I send mfstools.c which will output more debug information in case of this
> function. Please put the file in the "mfsmount" folder and issue "make ;
> make install" commands and later run "mfsfilerepair" again. And give us some
> feedback on what you get.
>
> Kind regards
> Michał Borychowski
> MooseFS Support Manager
> Gemius S.A.
> ul. Wołoska 7, 02-672 Warszawa
> Budynek MARS, klatka D
> Tel.: +4822 874-41-00
> Fax : +4822 874-41-01
>
> -----Original Message-----
> From: Steve Wilson [mailto:st...@pu...]
> Sent: Tuesday, September 20, 2011 2:52 PM
> To: Michał Borychowski
> Cc: moo...@li...
> Subject: Re: [Moosefs-users] Invalid copies of chunks
>
> Hi Michal,
>
> I've already run mfsfilerepair on the files in these chunks. Most of
> the corrupted files are temporary files that must have been open at the
> time of the changeover. I wouldn't mind just deleting the files, if I
> could. Some of the files, however, give a "realpath error" when I
> attempt to run mfsfilerepair:
>
> mfsfilerepair /net/jiang/fguo/.cache/indicator-applet-session.log
> /net/jiang/fguo/.cache/indicator-applet-session.log: realpath error
>
> Is there any way to clean up these chunks if mfsfilerepair fails?
> Again, I wouldn't mind losing files like the indicator-applet-session.log...
>
> Thanks,
>
> Steve
>
> On 09/20/2011 04:18 AM, Michał Borychowski wrote:
>> Hi!
>>
>> You need to run mfsfilerepair on the files belonging to these chunks.
>>
>> Kind regards
>> Michał Borychowski
>> MooseFS Support Manager
>>
>> -----Original Message-----
>> From: Steve Wilson [mailto:st...@pu...]
>> Sent: Monday, September 19, 2011 4:11 PM
>> To: moo...@li...
>> Subject: [Moosefs-users] Invalid copies of chunks
>>
>> Hi,
>>
>> Last week I swapped two MooseFS servers so that server A previously
>> running mfs-master started running mfs-metalogger and server B
>> previously running mfs-metalogger started running mfs-master. Somehow
>> in the process I ended up with metadata_ml.mfs.back and
>> metadata.mfs.back files that had problems preventing "mfsmetarestore -a"
>> from working correctly. So I went to a previous metadata file and
>> applied changelogs manually.
>>
>> Now I have 29 chunks that have no valid copies and messages like the
>> following appear in my logs:
>>
>> Sep 18 06:15:04 noro mfsmaster[9008]: chunk 000000000568DC8C has only
>> invalid copies (2) - please repair it manually
>> Sep 18 06:15:04 noro mfsmaster[9008]: chunk 000000000568DC8C_00000001 -
>> invalid copy on (128.210.48.90 - ver:00000002)
>> Sep 18 06:15:04 noro mfsmaster[9008]: chunk 000000000568DC8C_00000001 -
>> invalid copy on (128.210.48.88 - ver:00000002)
>>
>> So my question is: how should I proceed to repair these chunks manually
>> as suggested by the log message?
>>
>> Thanks!
>>
>> Steve

--
Steven M. Wilson, Systems and Network Manager
Markey Center for Structural Biology
Purdue University
(765) 496-1946 |
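The "realpath error" reported in this thread usually means some component of the path could not be resolved, for example a dangling symlink. A minimal, hypothetical diagnostic along these lines shows which path fails and why; it is not part of mfstools.c or mfsfilerepair, just an illustration of the same realpath() call:

    /* Hypothetical stand-alone diagnostic -- not part of the MooseFS sources.
     * It calls realpath() the way a tool like mfsfilerepair might and, on
     * failure, prints errno so you can see why resolution failed
     * (ENOENT typically indicates a dangling symlink or missing component). */
    #include <errno.h>
    #include <limits.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(int argc, char *argv[]) {
        char resolved[PATH_MAX];
        if (argc != 2) {
            fprintf(stderr, "usage: %s <path>\n", argv[0]);
            return 1;
        }
        if (realpath(argv[1], resolved) == NULL) {
            fprintf(stderr, "%s: realpath error: %s\n", argv[1], strerror(errno));
            return 1;
        }
        printf("%s -> %s\n", argv[1], resolved);
        return 0;
    }

Run against a path that mfsfilerepair rejects, this would typically print "No such file or directory" if a symlink somewhere in the path points to a missing target.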
From: Michał B. <mic...@ge...> - 2011-09-23 08:26:30
|
Hi Steve! The "realpath" function resolves the real path to the file - we are not sure why it may return an error - probably some path fragment is a symbolic link to a non-existing object. I send mfstools.c which will output more debug information in case of this function. Please put the file in the "mfsmount" folder and issue "make ; make install" commands and later run "mfsfilerepair" again. And give us some feedback what you get. Kind regards Michał Borychowski MooseFS Support Manager _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ Gemius S.A. ul. Wołoska 7, 02-672 Warszawa Budynek MARS, klatka D Tel.: +4822 874-41-00 Fax : +4822 874-41-01 -----Original Message----- From: Steve Wilson [mailto:st...@pu...] Sent: Tuesday, September 20, 2011 2:52 PM To: Michał Borychowski Cc: moo...@li... Subject: Re: [Moosefs-users] Invalid copies of chunks Hi Michal, I've already run mfsfilerepair on the files in these chunks. Most of the corrupted files are temporary files that must have been open at the time of the changeover. I wouldn't mind just deleting the files, if I could. Some of the files, however, give a "realpath error" when I attempt to run mfsfilerepair: mfsfilerepair /net/jiang/fguo/.cache/indicator-applet-session.log /net/jiang/fguo/.cache/indicator-applet-session.log: realpath error Is there any way to clean up these chunks if mfsfilerepair fails? Again, I wouldn't mind losing files like the indicator-applet-session.log... Thanks, Steve On 09/20/2011 04:18 AM, Michał Borychowski wrote: > Hi! > > You need to run mfsfilerepair on the files belonging to these chunks. > > > > Kind regards > Michał Borychowski > MooseFS Support Manager > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > Gemius S.A. > ul. Wołoska 7, 02-672 Warszawa > Budynek MARS, klatka D > Tel.: +4822 874-41-00 > Fax : +4822 874-41-01 > > > > > -----Original Message----- > From: Steve Wilson [mailto:st...@pu...] > Sent: Monday, September 19, 2011 4:11 PM > To: moo...@li... > Subject: [Moosefs-users] Invalid copies of chunks > > Hi, > > Last week I swapped two MooseFS servers so that server A previously > running mfs-master started running mfs-metalogger and server B > previously running mfs-metalogger started running mfs-master. Somehow > in the process I ended up with metadata_ml.mfs.back and > metadata.mfs.back files that had problems preventing "mfsmetarestore -a" > from working correctly. So I went to a previous metadata file and > applied changelogs manually. > > Now I have 29 chunks that have no valid copies and messages like the > following appear in my logs: > > Sep 18 06:15:04 noro mfsmaster[9008]: chunk 000000000568DC8C has only > invalid copies (2) - please repair it manually > Sep 18 06:15:04 noro mfsmaster[9008]: chunk 000000000568DC8C_00000001 - > invalid copy on (128.210.48.90 - ver:00000002) > Sep 18 06:15:04 noro mfsmaster[9008]: chunk 000000000568DC8C_00000001 - > invalid copy on (128.210.48.88 - ver:00000002) > > So my question is: how should I proceed to repair these chunks manually > as suggested by the log message? > > Thanks! > > Steve > -- Steven M. Wilson, Systems and Network Manager Markey Center for Structural Biology Purdue University (765) 496-1946 ---------------------------------------------------------------------------- -- All the data continuously generated in your IT infrastructure contains a definitive record of customers, application performance, security threats, fraudulent activity and more. Splunk takes this data and makes sense of it. Business sense. IT sense. Common sense. 
http://p.sf.net/sfu/splunk-d2dcopy1 _______________________________________________ moosefs-users mailing list moo...@li... https://lists.sourceforge.net/lists/listinfo/moosefs-users |
From: Thomas S. <tho...@gm...> - 2011-09-23 07:08:22
|
Hello everyone,

I have a small test setup with 4 Debian 6.0 VMs and MooseFS 1.6.20.

I have one master and 4 chunkservers running. It works very well so far.

The only problem I see is that MooseFS starts replication immediately after one chunkserver is down or when I change the goal. I also tried setting REPLICATIONS_DELAY_INIT to 300 and REPLICATIONS_DELAY_DISCONNECT to 3600.

I even tried passing the config file with -c to the mfsmaster, but no change.

What can I do to solve this, or am I doing something wrong?

Regards
Thomas |
From: WK <wk...@bn...> - 2011-09-23 01:07:10
|
Well I suppose it would be easy enough to grep the source code for 'DEL_LIMIT temporary increase' and start commenting some things out for a quick fix. However, I'd prefer if the maintainers addressed the issue with something more comprehensive and/or a flag for strict enforcement of the DEL_LIMIT setting or the current setup which is obviously some sort of 'oh no, we have a LOT of files we need to work through and need more resources' logic. There may also be some sort of reason they do this, such as some resource issue. We will see what they have to say. Maybe its already addressed in the next version. -bill On 9/22/2011 2:42 AM, Ólafur Ósvaldsson wrote: > We have the exact same problem, chunk deletions have caused problems in the past and we have DEL_LIMIT set at 5, but mfsmaster increases it to 40-50 right away and sometimes goes to 70/s > > This is also the case for chunk replications, it does not seem to honor the mfsmaster.cfg settings, although that does not get logged. > > /Oli > > On 22.9.2011, at 05:03, wk...@bn... wrote: > >> Ok, we deleted a couple hundred thousand files from a large Maildir >> folder set. >> >> We have had problems with deletions overwhelming the cluster in the >> past, so we have a DEL_LIMIT set to 20 (which we will probably lower) >> >> But when the expiretime hit, the server became lethargic. in checking >> the logs I see this >> >> Sep 21 21:10:31 mfs1master mfsmaster[2373]: DEL_LIMIT temporary >> increased to: 26/s >> Sep 21 21:15:30 mfs1master mfsmaster[2373]: DEL_LIMIT temporary >> increased to: 33/s >> Sep 21 21:55:24 mfs1master mfsmaster[2373]: DEL_LIMIT decreased back to: >> 26/s >> >> OK, WHY IS DOING THIS! I told it no more than 20 >> >> I do NOT want this, it kills my server and we've learned the hard way it >> can really screw up VM images (they go read-only) if the deletions >> overwhelm the cluster. >> >> >> -bill >> >> >> >> ------------------------------------------------------------------------------ >> All the data continuously generated in your IT infrastructure contains a >> definitive record of customers, application performance, security >> threats, fraudulent activity and more. Splunk takes this data and makes >> sense of it. Business sense. IT sense. Common sense. >> http://p.sf.net/sfu/splunk-d2dcopy1 >> _______________________________________________ >> moosefs-users mailing list >> moo...@li... >> https://lists.sourceforge.net/lists/listinfo/moosefs-users > -- > Ólafur Osvaldsson > System Administrator > Nethonnun ehf. > e-mail: osv...@ne... > phone: +354 517 3400 > > > ------------------------------------------------------------------------------ > All the data continuously generated in your IT infrastructure contains a > definitive record of customers, application performance, security > threats, fraudulent activity and more. Splunk takes this data and makes > sense of it. Business sense. IT sense. Common sense. > http://p.sf.net/sfu/splunk-d2dcopy1 > _______________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users |
From: Ólafur Ó. <osv...@ne...> - 2011-09-22 10:00:20
|
We have the exact same problem, chunk deletions have caused problems in the past and we have DEL_LIMIT set at 5, but mfsmaster increases it to 40-50 right away and sometimes goes to 70/s This is also the case for chunk replications, it does not seem to honor the mfsmaster.cfg settings, although that does not get logged. /Oli On 22.9.2011, at 05:03, wk...@bn... wrote: > Ok, we deleted a couple hundred thousand files from a large Maildir > folder set. > > We have had problems with deletions overwhelming the cluster in the > past, so we have a DEL_LIMIT set to 20 (which we will probably lower) > > But when the expiretime hit, the server became lethargic. in checking > the logs I see this > > Sep 21 21:10:31 mfs1master mfsmaster[2373]: DEL_LIMIT temporary > increased to: 26/s > Sep 21 21:15:30 mfs1master mfsmaster[2373]: DEL_LIMIT temporary > increased to: 33/s > Sep 21 21:55:24 mfs1master mfsmaster[2373]: DEL_LIMIT decreased back to: > 26/s > > OK, WHY IS DOING THIS! I told it no more than 20 > > I do NOT want this, it kills my server and we've learned the hard way it > can really screw up VM images (they go read-only) if the deletions > overwhelm the cluster. > > > -bill > > > > ------------------------------------------------------------------------------ > All the data continuously generated in your IT infrastructure contains a > definitive record of customers, application performance, security > threats, fraudulent activity and more. Splunk takes this data and makes > sense of it. Business sense. IT sense. Common sense. > http://p.sf.net/sfu/splunk-d2dcopy1 > _______________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users -- Ólafur Osvaldsson System Administrator Nethonnun ehf. e-mail: osv...@ne... phone: +354 517 3400 |
From: <wk...@bn...> - 2011-09-22 05:31:07
|
Ok, we deleted a couple hundred thousand files from a large Maildir folder set.

We have had problems with deletions overwhelming the cluster in the past, so we have DEL_LIMIT set to 20 (which we will probably lower).

But when the expire time hit, the server became lethargic. Checking the logs, I see this:

Sep 21 21:10:31 mfs1master mfsmaster[2373]: DEL_LIMIT temporary increased to: 26/s
Sep 21 21:15:30 mfs1master mfsmaster[2373]: DEL_LIMIT temporary increased to: 33/s
Sep 21 21:55:24 mfs1master mfsmaster[2373]: DEL_LIMIT decreased back to: 26/s

OK, WHY IS IT DOING THIS?! I told it no more than 20.

I do NOT want this. It kills my server, and we've learned the hard way that it can really screw up VM images (they go read-only) if the deletions overwhelm the cluster.

-bill |
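The log lines quoted above are consistent with the 30% ramp shown elsewhere in this thread: each adjustment cycle multiplies the temporary limit by 1.3, so a configured limit of 20/s becomes roughly 26/s after one cycle and 33/s after the next, and a limit of 5/s reaches about 70/s after ten cycles. A tiny stand-alone sketch (not MooseFS code) that reproduces the progression:

    /* Stand-alone illustration (not MooseFS source): how a 1.3x ramp per
     * adjustment cycle grows a configured DEL_LIMIT while the deletion
     * backlog keeps growing. */
    #include <stdio.h>

    int main(void) {
        double limit = 20.0;              /* configured CHUNKS_DEL_LIMIT */
        for (int cycle = 1; cycle <= 5; cycle++) {
            limit *= 1.3;                 /* same factor as in chunks.c */
            printf("after cycle %d: %u/s\n", cycle, (unsigned)limit);
        }
        return 0;
    }
    /* Prints 26, 33, 43, 57, 74 per second -- matching the 26/s and 33/s in the log. */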
From: Steve W. <st...@pu...> - 2011-09-20 12:53:10
|
Hi Michal, I've already run mfsfilerepair on the files in these chunks. Most of the corrupted files are temporary files that must have been open at the time of the changeover. I wouldn't mind just deleting the files, if I could. Some of the files, however, give a "realpath error" when I attempt to run mfsfilerepair: mfsfilerepair /net/jiang/fguo/.cache/indicator-applet-session.log /net/jiang/fguo/.cache/indicator-applet-session.log: realpath error Is there any way to clean up these chunks if mfsfilerepair fails? Again, I wouldn't mind losing files like the indicator-applet-session.log... Thanks, Steve On 09/20/2011 04:18 AM, Michał Borychowski wrote: > Hi! > > You need to run mfsfilerepair on the files belonging to these chunks. > > > > Kind regards > Michał Borychowski > MooseFS Support Manager > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > Gemius S.A. > ul. Wołoska 7, 02-672 Warszawa > Budynek MARS, klatka D > Tel.: +4822 874-41-00 > Fax : +4822 874-41-01 > > > > > -----Original Message----- > From: Steve Wilson [mailto:st...@pu...] > Sent: Monday, September 19, 2011 4:11 PM > To: moo...@li... > Subject: [Moosefs-users] Invalid copies of chunks > > Hi, > > Last week I swapped two MooseFS servers so that server A previously > running mfs-master started running mfs-metalogger and server B > previously running mfs-metalogger started running mfs-master. Somehow > in the process I ended up with metadata_ml.mfs.back and > metadata.mfs.back files that had problems preventing "mfsmetarestore -a" > from working correctly. So I went to a previous metadata file and > applied changelogs manually. > > Now I have 29 chunks that have no valid copies and messages like the > following appear in my logs: > > Sep 18 06:15:04 noro mfsmaster[9008]: chunk 000000000568DC8C has only > invalid copies (2) - please repair it manually > Sep 18 06:15:04 noro mfsmaster[9008]: chunk 000000000568DC8C_00000001 - > invalid copy on (128.210.48.90 - ver:00000002) > Sep 18 06:15:04 noro mfsmaster[9008]: chunk 000000000568DC8C_00000001 - > invalid copy on (128.210.48.88 - ver:00000002) > > So my question is: how should I proceed to repair these chunks manually > as suggested by the log message? > > Thanks! > > Steve > -- Steven M. Wilson, Systems and Network Manager Markey Center for Structural Biology Purdue University (765) 496-1946 |
From: Ken <ken...@gm...> - 2011-09-20 09:37:07
|
This situation *may* be caused by mfsmetalogger still holding the changelog file(s) open.

On Sun, Sep 18, 2011 at 12:29 AM, Allen Landsidel <lan...@gm...> wrote:

> After a hardware failure test, mfsmetarestore is returning an error
> that I can't quite get my head around or figure out how to recover from.
>
> There is/was no metalogger yet, as this is an evaluation install on a
> single machine.
>
> The output of mfsmetarestore is:
>
> # mfsmetarestore -a
> loading objects (files,directories,etc.) ... ok
> loading names ... ok
> loading deletion timestamps ... ok
> checking filesystem consistency ... ok
> loading chunks data ... ok
> connecting files and chunks ... ok
> 2642207: ',' expected
> 2642207: error: 1 (Operation not permitted)
>
> I'm unsure how to proceed.
>
> The metadata.mfs.back file exists. There are also changelog files
> (changelog.N.mfs) where N is 0-6 and 14-16; 7-13 are missing.
>
> Any advice on how to get the system back online (without just purging
> all the data) would be appreciated! |
From: Ken <ken...@gm...> - 2011-09-20 09:33:33
|
Maybe I can try to answer some of these.

On Tue, Sep 13, 2011 at 10:16 PM, Zachary Wagner <zw...@ha...> wrote:

> Hello,
> I am looking into MooseFS for one of my courses and I was hoping people
> could answer a few questions for me.

> 1. Does MooseFS save "deltas"?
> Does it save a whole copy of a changed document or just the changes? Does
> this affect the amount of data stored?

No, MooseFS does not save change sets.

> 2. Does MooseFS synchronize all writes before returning?

Yes. Take a look at the write process described on the moosefs.org home page.

> 3. Does the Master Server round robin the requests to different chunk servers?

More detail on the round robin: for writes, the choice is based on free space; for reads, the candidates are sorted by the IP addresses of the chunkservers and the client.

>> When deleting, does the Master or metalogger re-balance the distribution of data across chunkservers?

I am not sure what this means. The master decides which chunkserver a chunk is placed on. The metalogger never *changes* the metadata or the filesystem.

>> How does the Master server assign writes?

http://www.moosefs.org/system/html/write862-008171c7.png - a picture is worth a thousand words.

>> Does the Master server move data to alleviate congestion?

No.

>> Can the system automatically move chunks to spread reads out?

I do not think so.

>> What is the process of synchronizing data across chunks?

The master sends commands to the chunkservers. It's complex.

> 4. System specifications - what is most important?
> Processor - high GHz or large cache?
> Fast RAM?
> Fast hard drive or big hard drive?

CPU and RAM are important for the master. Hard drives are important for the chunkservers. Network throughput is important for the whole system.

> 5. Error correction - who is responsible for the proper checksum?
> The OS level, or a checksum in the metalogger?

There is a checksum task in the chunkserver. The chunkserver reports the chunks whose checksums failed.

> Thank you for any help you can give me.
>
> Zach

----------------------------------------------------
邵军辉 ext: 1365 msn: boo...@gm... 北京 朝阳区 静安中心2612 |
From: Davies L. <dav...@gm...> - 2011-09-20 09:11:52
|
It's normal, we have mfschunkserver consuming 55M of RAM with 270k chunks, about 25 byte per chunk. On Thu, Aug 25, 2011 at 9:58 PM, Jesus <jes...@gm...> wrote: > Hi, > > I have a chunkserver with 4 million chunks. The mfschunkserver process is > consuming almost 1 GB of RAM, is it normal? > > I am using MFS 1.6.20-2 in Debian squeeze servers (64 bit 2.6.32-5 kernel) > > I asumed the RAM in the chunkserver was not very important (i.e, it was not > dependent on the number of chunks). As opossed to the MFS master, which > caches all the metadata in RAM. > > Thanks and kind regards, > > > ------------------------------------------------------------------------------ > All the data continuously generated in your IT infrastructure contains a > definitive record of customers, application performance, security > threats, fraudulent activity and more. Splunk takes this data and makes > sense of it. Business sense. IT sense. Common sense. > http://p.sf.net/sfu/splunk-d2dcopy1 > _______________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users > > -- - Davies |
From: Michał B. <mic...@ge...> - 2011-09-20 08:46:02
|
Hi! You need to run mfsfilerepair on the files belonging to these chunks. Kind regards Michał Borychowski MooseFS Support Manager _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ Gemius S.A. ul. Wołoska 7, 02-672 Warszawa Budynek MARS, klatka D Tel.: +4822 874-41-00 Fax : +4822 874-41-01 -----Original Message----- From: Steve Wilson [mailto:st...@pu...] Sent: Monday, September 19, 2011 4:11 PM To: moo...@li... Subject: [Moosefs-users] Invalid copies of chunks Hi, Last week I swapped two MooseFS servers so that server A previously running mfs-master started running mfs-metalogger and server B previously running mfs-metalogger started running mfs-master. Somehow in the process I ended up with metadata_ml.mfs.back and metadata.mfs.back files that had problems preventing "mfsmetarestore -a" from working correctly. So I went to a previous metadata file and applied changelogs manually. Now I have 29 chunks that have no valid copies and messages like the following appear in my logs: Sep 18 06:15:04 noro mfsmaster[9008]: chunk 000000000568DC8C has only invalid copies (2) - please repair it manually Sep 18 06:15:04 noro mfsmaster[9008]: chunk 000000000568DC8C_00000001 - invalid copy on (128.210.48.90 - ver:00000002) Sep 18 06:15:04 noro mfsmaster[9008]: chunk 000000000568DC8C_00000001 - invalid copy on (128.210.48.88 - ver:00000002) So my question is: how should I proceed to repair these chunks manually as suggested by the log message? Thanks! Steve -- Steven M. Wilson, Systems and Network Manager Markey Center for Structural Biology Purdue University (765) 496-1946 ---------------------------------------------------------------------------- -- BlackBerry® DevCon Americas, Oct. 18-20, San Francisco, CA Learn about the latest advances in developing for the BlackBerry® mobile platform with sessions, labs & more. See new tools and technologies. Register for BlackBerry® DevCon today! http://p.sf.net/sfu/rim-devcon-copy1 _______________________________________________ moosefs-users mailing list moo...@li... https://lists.sourceforge.net/lists/listinfo/moosefs-users |
From: Steve W. <st...@pu...> - 2011-09-19 14:11:31
|
Hi,

Last week I swapped two MooseFS servers so that server A previously running mfs-master started running mfs-metalogger and server B previously running mfs-metalogger started running mfs-master. Somehow in the process I ended up with metadata_ml.mfs.back and metadata.mfs.back files that had problems preventing "mfsmetarestore -a" from working correctly. So I went to a previous metadata file and applied changelogs manually.

Now I have 29 chunks that have no valid copies and messages like the following appear in my logs:

Sep 18 06:15:04 noro mfsmaster[9008]: chunk 000000000568DC8C has only invalid copies (2) - please repair it manually
Sep 18 06:15:04 noro mfsmaster[9008]: chunk 000000000568DC8C_00000001 - invalid copy on (128.210.48.90 - ver:00000002)
Sep 18 06:15:04 noro mfsmaster[9008]: chunk 000000000568DC8C_00000001 - invalid copy on (128.210.48.88 - ver:00000002)

So my question is: how should I proceed to repair these chunks manually as suggested by the log message?

Thanks!

Steve

--
Steven M. Wilson, Systems and Network Manager
Markey Center for Structural Biology
Purdue University
(765) 496-1946 |
From: Allen L. <lan...@gm...> - 2011-09-17 16:29:45
|
After a hardware failure test, mfsmetarestore is returning an error that I can't quite get my head around or figure out how to recover from.

There is/was no metalogger yet, as this is an evaluation install on a single machine.

The output of mfsmetarestore is:

# mfsmetarestore -a
loading objects (files,directories,etc.) ... ok
loading names ... ok
loading deletion timestamps ... ok
checking filesystem consistency ... ok
loading chunks data ... ok
connecting files and chunks ... ok
2642207: ',' expected
2642207: error: 1 (Operation not permitted)

I'm unsure how to proceed.

The metadata.mfs.back file exists. There are also changelog files (changelog.N.mfs) where N is 0-6 and 14-16; 7-13 are missing.

Any advice on how to get the system back online (without just purging all the data) would be appreciated! |
From: Stéphane B. <ste...@ga...> - 2011-09-14 18:06:08
|
Hi,

We have a cluster of 4 Apache servers with MooseFS mount points used to share content to the web servers. I know MooseFS is not best suited for this, but we have never had problems on other clusters with more servers.

The thing is, on that cluster we are seeing a lot of getattr calls and lookups:

240650 getattr
2076747 lookups
247438 opens
9242 reads
0 writes

The system is working fine most of the time, but lately we see the mfsmount process using 100% CPU. We tried to send fewer queries to the server having the problem, and even with a very small number of queries the mfsmount process still uses nearly 100% of one CPU core. We also see slowdowns on that machine. All the remaining client machines in the cluster (about 20) are fine.

It looks like a leak or some other problem with the MFS client. If I remount the share, it starts working correctly again for a few days/weeks.

I wonder what I can check to troubleshoot this issue, which is happening only on that cluster of 4 machines. By the way, the load on the master and chunkservers is very low...

Thanks
Stephane Boisvert |
From: Zachary W. <zw...@ha...> - 2011-09-13 15:13:00
|
Hello,

I am looking into MooseFS for one of my courses and I was hoping people could answer a few questions for me. Here are a few; some have sub-questions:

1. Does MooseFS save "deltas"?
Does it save a whole copy of a changed document or just the changes? Does this affect the amount of data stored?

2. Does MooseFS synchronize all writes before returning?

3. Does the Master Server round robin the requests to different chunk servers?
When deleting, does the Master or metalogger re-balance the distribution of data across chunkservers?
How does the Master server assign writes?
Does the Master server move data to alleviate congestion? Can the system automatically move chunks to spread reads out?
How does the master direct requests?
What is the process of synchronizing data across chunks?

4. System specifications - what is most important?
Processor - high GHz or large cache?
Fast RAM?
Fast hard drive or big hard drive?

5. Error correction - who is responsible for the proper checksum?
The OS level, or a checksum in the metalogger?

Thank you for any help you can give me.

Zach |
From: Fredrik R. <fr...@se...> - 2011-09-12 00:24:48
|
Hi, I'm quite new to moose and decided to test it with a media archive. My configuration is 1 x master server running intel Atom D525 with 4 GB of memory. 2 x chunk servers also running Atom D525 with 4 GB of memory. Each chunk server has a 6 x 2 TB drives mounted as /moose-$serialnumberofdrive What I've found after some testing is that for the most part performance is acceptable however I find that there are periods where its very very slow. When this happens I also see reads of 2 - 300 megabytes/second for each drive. Looking at when its slow it seems that this happens when it does testing of chunks. After checking the configuration I see the following parameter: HDD_TEST_FREQ = 10 What would be the recommended value of this parameter? I've been trying to search for documentation or an explanation of pro/con's of lowering or increasing this value. I would like to see that the checks are much more rare and that when they do happen they are very very low priority. Thank you for any feedback on this. Regards, Fredrik Rovik |
From: Davies L. <dav...@gm...> - 2011-09-07 09:22:26
|
Hi, Robert: I have written a mfs client in Go in last one more week, at https://github.com/davies/go-mfsclient It use connection pool to deal with master and chunk server, cache inode to reduce master lookup. You could do some benchmark with you deploy, any feedback is welcomed. Davies On Fri, Sep 2, 2011 at 8:57 AM, Robert Sandilands <rsa...@ne...>wrote: > I looked at the mfsmount code. It would be a significant effort to provide > a usable library/API from it that is as fully functional as mfsmount. > > I found a work around for the open()/close() limitation: > > I modified my web-server to be able to serve files from multiple mfs > mounts. I changed each of the 5 web servers to mount the file system on 8 > different folders having 8 instances of mfsmount running. This is a total of > 40 mounts. The individual web servers would then load balance between the > different mounts. > > It seems that if you have more than about 10 simultaneous accesses per > mfsmount then you run into a significant slowdown with open() and close(). > Here are the averages for a slightly shorter time period after I made this > change: > > File Open average 13.73 ms > File Read average 118.29 ms > File Close average 0.44 ms > File Size average 0.02 ms > Net Read average 2.7 ms > Net Write average 2.36 ms > Log Access average 0.37 ms > Log Error average 0.04 ms > > Average time to process a file 137.96 ms > Total files processed 1,391,217 > > This is a significant improvement and proves, for me at least, that the > handling of open() in mfsmount that is serialized with a single TCP socket > is a cause for scaling issues even at low numbers of clients per mount. > > Another thing I noticed in the source code is in mfschunkserver. It seems > like it creates 24 threads. 4 Helper threads and 2 groups of 10 worker > threads. The one group handles requests from mfsmaster and is used for > replication etc. The other group handles requests from mfsmount. This > basically implies that you can have at most 20 simultaneous accesses to the > disks controlled by a single chunk server at any specific time. Is there a > reason it is that low and what would be needed to make that tunable or > increase the number? > > Modern disk controllers work well with multiple pending requests and can > re-order it to get the most performance out of your disks. SAS and SATA > controllers can do this, but SAS can do it a bit better. It generally seems > to get the most out of your disk subsystem if you always have a few more > pending requests than spindles. > > Robert > > > On 8/31/11 2:19 AM, Davies Liu wrote: > > Not yet, but we can export parts of mfsmount, then create Python or Go > binding of it. > > Davies > > On Wed, Aug 31, 2011 at 11:18 AM, Robert Sandilands < > rsa...@ne...> wrote: > >> There is a native API? Where can I find information about it? Or do you >> have to reverse it from the code? >> >> > Robert >> >> >> On 8/30/11 10:42 PM, Davies Liu wrote: >> >> The bottle neck is FUSE and mfsmount, you should try to use native API ( >> borrowed from mfsmount) >> of MFS to re-implement a HTTP server, one socket per thread , or sockets >> pool. >> >> I just want do it in Go, may by python is easier. >> >> Davies >> >> On Wed, Aug 31, 2011 at 8:54 AM, Robert Sandilands <rsa...@ne... >> > wrote: >> >>> Further on this subject. >>> >>> I wrote a dedicated http server to serve the files instead of using >>> Apache. It allowed me to gain a few extra percent of performance and >>> decreased the memory usage of the web servers. 
The web server also gave me >>> some interesting timings: >>> >>> File open average 405.3732 ms >>> File read average 238.7784 ms >>> File close average 286.8376 ms >>> File size average 0.0026 ms >>> Net read average 2.536 ms >>> Net write average 2.2148 ms >>> Log to access log average 0.2526 ms >>> Log to error log average 0.2234 ms >>> >>> Average time to process a file 936.2186 ms >>> Total files processed 1,503,610 >>> >>> What I really find scary is that to open a file takes nearly half a >>> second. To close a file a quarter of a second. The time to open() and >>> close() is nearly 3 times more than the time to read the data. The server >>> always reads in multiples of 64 kB except if there are less data available. >>> Although it uses posix_fadvise() to try and do some read-ahead. This is the >>> average over 5 machines running mfsmount and my custom web server running >>> for about 18 hours. >>> >>> On a machine that only serves a low number of clients the times for open >>> and close are negligible. open() and close() seems to scale very badly with >>> an increase in clients using mfsmount. >>> >>> From looking at the code for mfsmount it seems like all communication to >>> the master happens over a single TCP socket with a global handle and mutex >>> to protect it. This may be the bottle neck? If there are multiple open()'s >>> at the same time they may end up waiting for the mutex to get an opportunity >>> to communicate with the master? The same handle and mutex is also used to >>> read replies and this may also not help the situation? >>> >>> What prevents multiple sockets to the master? >>> >>> It also seems to indicate that the only way to get the open() average >>> down is to introduce more web servers and that a single web server can only >>> serve a very low number of clients. Is that a correct assumption? >>> >>> >>> Robert >>> >>> On 8/26/11 3:25 AM, Davies Liu wrote: >>> >>> Hi Robert, >>> >>> Another hint to make mfsmaster more responsive is to locate the >>> metadata.mfs >>> on a separated disk with change logs, such as SAS array, then you should >>> modify >>> the source code of mfsmaster to do this. >>> >>> PS: what is the average size of you files? MooseFS (like GFS) is >>> designed for >>> large file (100M+), it can not serve well for amount of small files. >>> Haystack from >>> Facebook may be the better choice. We (douban.com) use MooseFS to serve >>> 200+T(1M files) offline data and beansdb [1] to serve 500 million online >>> small >>> files, it performs very well. >>> >>> [1]: http://code.google.com/p/*beansdb*/ >>> >>> Davies >>> >>> On Fri, Aug 26, 2011 at 9:08 AM, Robert Sandilands < >>> rsa...@ne...> wrote: >>> >>>> Hi Elliot, >>>> >>>> There is nothing in the code to change the priority. >>>> >>>> Taking virtually all other load from the chunk and master servers seems >>>> to have improved this significantly. I still see timeouts from mfsmount, >>>> but not enough to be problematic. >>>> >>>> To try and optimize the performance I am experimenting with accessing >>>> the data using different APIs and block sizes but this has been >>>> inconclusive. I have tried the effect of posix_fadvise(), sendfile() and >>>> different sized buffers for read(). I still want to try mmap(). >>>> Sendfile() did seem to be slightly slower than read(). >>>> >>>> Robert >>>> >>>> On 8/24/11 11:05 AM, Elliot Finley wrote: >>>> > On Tue, Aug 9, 2011 at 6:46 PM, Robert Sandilands< >>>> rsa...@ne...> wrote: >>>> >> Increasing the swap space fixed the fork() issue. 
It seems that you >>>> have to >>>> >> ensure that memory available is always double the memory needed by >>>> >> mfsmaster. None of the swap space was used over the last 24 hours. >>>> >> >>>> >> This did solve the extreme comb-like behavior of mfsmaster. It still >>>> does >>>> >> not resolve its sensitivity to load on the server. I am still seeing >>>> >> timeouts on the chunkservers and mounts on the hour due to the high >>>> CPU and >>>> >> I/O load when the meta data is dumped to disk. It did however >>>> decrease >>>> >> significantly. >>>> > Here is another thought on this... >>>> > >>>> > The process is niced to -19 (very high priority) so that it has good >>>> > performance. It forks once per hour to write out the metadata. I >>>> > haven't checked the code for this, but is the forked process lowering >>>> > it's priority so it doesn't compete with the original process? >>>> > >>>> > If it's not, it should be an easy code change to lower the priority in >>>> > the child process (metadata writer) so that it doesn't compete with >>>> > the original process at the same priority. >>>> > >>>> > If you check into this, I'm sure the list would appreciate an update. >>>> :) >>>> > >>>> > Elliot >>>> >>>> >>>> >>>> ------------------------------------------------------------------------------ >>>> EMC VNX: the world's simplest storage, starting under $10K >>>> The only unified storage solution that offers unified management >>>> Up to 160% more powerful than alternatives and 25% more efficient. >>>> Guaranteed. http://p.sf.net/sfu/emc-vnx-dev2dev >>>> _______________________________________________ >>>> moosefs-users mailing list >>>> moo...@li... >>>> https://lists.sourceforge.net/lists/listinfo/moosefs-users >>>> >>> >>> >>> >>> -- >>> - Davies >>> >>> >>> >>> >>> ------------------------------------------------------------------------------ >>> Special Offer -- Download ArcSight Logger for FREE! >>> Finally, a world-class log management solution at an even better >>> price-free! And you'll get a free "Love Thy Logs" t-shirt when you >>> download Logger. Secure your free ArcSight Logger TODAY! >>> http://p.sf.net/sfu/arcsisghtdev2dev >>> _______________________________________________ >>> moosefs-users mailing list >>> moo...@li... >>> https://lists.sourceforge.net/lists/listinfo/moosefs-users >>> >>> >> >> >> -- >> - Davies >> >> >> > > > -- > - Davies > > > -- - Davies |
From: Robert S. <rsa...@ne...> - 2011-09-02 00:57:32
|
I looked at the mfsmount code. It would be a significant effort to provide a usable library/API from it that is as fully functional as mfsmount. I found a work around for the open()/close() limitation: I modified my web-server to be able to serve files from multiple mfs mounts. I changed each of the 5 web servers to mount the file system on 8 different folders having 8 instances of mfsmount running. This is a total of 40 mounts. The individual web servers would then load balance between the different mounts. It seems that if you have more than about 10 simultaneous accesses per mfsmount then you run into a significant slowdown with open() and close(). Here are the averages for a slightly shorter time period after I made this change: File Open average 13.73 ms File Read average 118.29 ms File Close average 0.44 ms File Size average 0.02 ms Net Read average 2.7 ms Net Write average 2.36 ms Log Access average 0.37 ms Log Error average 0.04 ms Average time to process a file 137.96 ms Total files processed 1,391,217 This is a significant improvement and proves, for me at least, that the handling of open() in mfsmount that is serialized with a single TCP socket is a cause for scaling issues even at low numbers of clients per mount. Another thing I noticed in the source code is in mfschunkserver. It seems like it creates 24 threads. 4 Helper threads and 2 groups of 10 worker threads. The one group handles requests from mfsmaster and is used for replication etc. The other group handles requests from mfsmount. This basically implies that you can have at most 20 simultaneous accesses to the disks controlled by a single chunk server at any specific time. Is there a reason it is that low and what would be needed to make that tunable or increase the number? Modern disk controllers work well with multiple pending requests and can re-order it to get the most performance out of your disks. SAS and SATA controllers can do this, but SAS can do it a bit better. It generally seems to get the most out of your disk subsystem if you always have a few more pending requests than spindles. Robert On 8/31/11 2:19 AM, Davies Liu wrote: > Not yet, but we can export parts of mfsmount, then create Python or Go > binding of it. > > Davies > > On Wed, Aug 31, 2011 at 11:18 AM, Robert Sandilands > <rsa...@ne... <mailto:rsa...@ne...>> wrote: > > There is a native API? Where can I find information about it? Or > do you have to reverse it from the code? > > Robert > > > On 8/30/11 10:42 PM, Davies Liu wrote: >> The bottle neck is FUSE and mfsmount, you should try to use >> native API ( borrowed from mfsmount) >> of MFS to re-implement a HTTP server, one socket per thread , or >> sockets pool. >> >> I just want do it in Go, may by python is easier. >> >> Davies >> >> On Wed, Aug 31, 2011 at 8:54 AM, Robert Sandilands >> <rsa...@ne... <mailto:rsa...@ne...>> wrote: >> >> Further on this subject. >> >> I wrote a dedicated http server to serve the files instead of >> using Apache. It allowed me to gain a few extra percent of >> performance and decreased the memory usage of the web >> servers. 
The web server also gave me some interesting timings: >> >> File open average 405.3732 ms >> File read average 238.7784 ms >> File close average 286.8376 ms >> File size average 0.0026 ms >> Net read average 2.536 ms >> Net write average 2.2148 ms >> Log to access log average 0.2526 ms >> Log to error log average 0.2234 ms >> >> Average time to process a file 936.2186 ms >> Total files processed 1,503,610 >> >> What I really find scary is that to open a file takes nearly >> half a second. To close a file a quarter of a second. The >> time to open() and close() is nearly 3 times more than the >> time to read the data. The server always reads in multiples >> of 64 kB except if there are less data available. Although it >> uses posix_fadvise() to try and do some read-ahead. This is >> the average over 5 machines running mfsmount and my custom >> web server running for about 18 hours. >> >> On a machine that only serves a low number of clients the >> times for open and close are negligible. open() and close() >> seems to scale very badly with an increase in clients using >> mfsmount. >> >> From looking at the code for mfsmount it seems like all >> communication to the master happens over a single TCP socket >> with a global handle and mutex to protect it. This may be the >> bottle neck? If there are multiple open()'s at the same time >> they may end up waiting for the mutex to get an opportunity >> to communicate with the master? The same handle and mutex is >> also used to read replies and this may also not help the >> situation? >> >> What prevents multiple sockets to the master? >> >> It also seems to indicate that the only way to get the open() >> average down is to introduce more web servers and that a >> single web server can only serve a very low number of >> clients. Is that a correct assumption? >> >> >> Robert >> >> On 8/26/11 3:25 AM, Davies Liu wrote: >>> Hi Robert, >>> >>> Another hint to make mfsmaster more responsive is to locate >>> the metadata.mfs >>> on a separated disk with change logs, such as SAS array, >>> then you should modify >>> the source code of mfsmaster to do this. >>> >>> PS: what is the average size of you files? MooseFS (like >>> GFS) is designed for >>> large file (100M+), it can not serve well for amount of >>> small files. Haystack from >>> Facebook may be the better choice. We (douban.com >>> <http://douban.com>) use MooseFS to serve >>> 200+T(1M files) offline data and beansdb [1] to serve 500 >>> million online small >>> files, it performs very well. >>> >>> [1]: http://code.google.com/p/ >>> <http://code.google.com/p/>*beansdb*/ >>> >>> Davies >>> >>> On Fri, Aug 26, 2011 at 9:08 AM, Robert Sandilands >>> <rsa...@ne... <mailto:rsa...@ne...>> wrote: >>> >>> Hi Elliot, >>> >>> There is nothing in the code to change the priority. >>> >>> Taking virtually all other load from the chunk and >>> master servers seems >>> to have improved this significantly. I still see >>> timeouts from mfsmount, >>> but not enough to be problematic. >>> >>> To try and optimize the performance I am experimenting >>> with accessing >>> the data using different APIs and block sizes but this >>> has been >>> inconclusive. I have tried the effect of >>> posix_fadvise(), sendfile() and >>> different sized buffers for read(). I still want to try >>> mmap(). >>> Sendfile() did seem to be slightly slower than read(). >>> >>> Robert >>> >>> On 8/24/11 11:05 AM, Elliot Finley wrote: >>> > On Tue, Aug 9, 2011 at 6:46 PM, Robert >>> Sandilands<rsa...@ne... 
>>> <mailto:rsa...@ne...>> wrote: >>> >> Increasing the swap space fixed the fork() issue. It >>> seems that you have to >>> >> ensure that memory available is always double the >>> memory needed by >>> >> mfsmaster. None of the swap space was used over the >>> last 24 hours. >>> >> >>> >> This did solve the extreme comb-like behavior of >>> mfsmaster. It still does >>> >> not resolve its sensitivity to load on the server. I >>> am still seeing >>> >> timeouts on the chunkservers and mounts on the hour >>> due to the high CPU and >>> >> I/O load when the meta data is dumped to disk. It did >>> however decrease >>> >> significantly. >>> > Here is another thought on this... >>> > >>> > The process is niced to -19 (very high priority) so >>> that it has good >>> > performance. It forks once per hour to write out the >>> metadata. I >>> > haven't checked the code for this, but is the forked >>> process lowering >>> > it's priority so it doesn't compete with the original >>> process? >>> > >>> > If it's not, it should be an easy code change to lower >>> the priority in >>> > the child process (metadata writer) so that it doesn't >>> compete with >>> > the original process at the same priority. >>> > >>> > If you check into this, I'm sure the list would >>> appreciate an update. :) >>> > >>> > Elliot >>> >>> >>> ------------------------------------------------------------------------------ >>> EMC VNX: the world's simplest storage, starting under $10K >>> The only unified storage solution that offers unified >>> management >>> Up to 160% more powerful than alternatives and 25% more >>> efficient. >>> Guaranteed. http://p.sf.net/sfu/emc-vnx-dev2dev >>> _______________________________________________ >>> moosefs-users mailing list >>> moo...@li... >>> <mailto:moo...@li...> >>> https://lists.sourceforge.net/lists/listinfo/moosefs-users >>> >>> >>> >>> >>> -- >>> - Davies >> >> >> ------------------------------------------------------------------------------ >> Special Offer -- Download ArcSight Logger for FREE! >> Finally, a world-class log management solution at an even better >> price-free! And you'll get a free "Love Thy Logs" t-shirt >> when you >> download Logger. Secure your free ArcSight Logger TODAY! >> http://p.sf.net/sfu/arcsisghtdev2dev >> _______________________________________________ >> moosefs-users mailing list >> moo...@li... >> <mailto:moo...@li...> >> https://lists.sourceforge.net/lists/listinfo/moosefs-users >> >> >> >> >> -- >> - Davies > > > > > -- > - Davies |
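The per-call averages quoted in this thread (file open, read, and close times) can be gathered with a simple wall-clock wrapper around the system calls. The sketch below is only an illustration of that kind of measurement, not the custom web server described above.

    /* Minimal timing sketch -- not the web server discussed in this thread.
     * It measures how long open(), read() and close() take on one file,
     * which is the kind of per-call average quoted above. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>

    static double ms_since(const struct timespec *t0) {
        struct timespec t1;
        clock_gettime(CLOCK_MONOTONIC, &t1);
        return (t1.tv_sec - t0->tv_sec) * 1000.0 + (t1.tv_nsec - t0->tv_nsec) / 1e6;
    }

    int main(int argc, char *argv[]) {
        if (argc != 2) {
            fprintf(stderr, "usage: %s <file>\n", argv[0]);
            return 1;
        }
        struct timespec t;
        char buf[65536];                  /* 64 kB reads, as in the thread */
        ssize_t n;

        clock_gettime(CLOCK_MONOTONIC, &t);
        int fd = open(argv[1], O_RDONLY);
        printf("open : %.2f ms\n", ms_since(&t));
        if (fd < 0)
            return 1;

        clock_gettime(CLOCK_MONOTONIC, &t);
        while ((n = read(fd, buf, sizeof(buf))) > 0) {
            /* drain the file; a real server would hand the data to a client */
        }
        printf("read : %.2f ms\n", ms_since(&t));

        clock_gettime(CLOCK_MONOTONIC, &t);
        close(fd);
        printf("close: %.2f ms\n", ms_since(&t));
        return 0;
    }

Running something like this from many concurrent threads against a FUSE mount is what exposes the open()/close() serialization effects described in this thread.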