Messages by month:

| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|
| 2009 | - | - | - | - | - | - | - | - | - | - | - | 4 |
| 2010 | 20 | 11 | 11 | 9 | 22 | 85 | 94 | 80 | 72 | 64 | 69 | 89 |
| 2011 | 72 | 109 | 116 | 117 | 117 | 102 | 91 | 72 | 51 | 41 | 55 | 74 |
| 2012 | 45 | 77 | 99 | 113 | 132 | 75 | 70 | 58 | 58 | 37 | 51 | 15 |
| 2013 | 28 | 16 | 25 | 38 | 23 | 39 | 42 | 19 | 41 | 31 | 18 | 18 |
| 2014 | 17 | 19 | 39 | 16 | 10 | 13 | 17 | 13 | 8 | 53 | 23 | 7 |
| 2015 | 35 | 13 | 14 | 56 | 8 | 18 | 26 | 33 | 40 | 37 | 24 | 20 |
| 2016 | 38 | 20 | 25 | 14 | 6 | 36 | 27 | 19 | 36 | 24 | 15 | 16 |
| 2017 | 8 | 13 | 17 | 20 | 28 | 10 | 20 | 3 | 18 | 8 | - | 5 |
| 2018 | 15 | 9 | 12 | 7 | 123 | 41 | - | 14 | - | 15 | - | 7 |
| 2019 | 2 | 9 | 2 | 9 | - | - | 2 | - | 6 | 1 | 12 | 2 |
| 2020 | 2 | - | - | 3 | - | 4 | 4 | 1 | 18 | 2 | - | - |
| 2021 | - | 3 | - | - | - | - | 6 | - | 5 | 5 | 3 | - |
| 2022 | - | - | 3 | - | - | - | - | - | - | - | - | - |
From: Thomas S H. <tha...@gm...> - 2011-07-10 19:56:41
I just released Salt 0.8.9 and thought I would mention that it includes a module for moosefs support. The moosefs support is still very small, but it is growing. If anyone is interested in Salt check it out: Github site: https://github.com/thatch45/salt Homepage: http://saltstack.org/ Release announcement: http://saltstack.org/topics/releases/0.8.9.html - Thomas S Hatch |
From: Michal B. <mic...@ge...> - 2011-07-08 10:24:39
Hi! This is normal, specially while your machine is swapping. Master forks in order to save metadata to disk (in your case 7072). When the data is saved, the process will quit. When machine needs to use swap, saving of metada takes longer and amount of commonly used memory by two processes reduces and finally total amount of memory used increases which causes further swapping... You should normally stop the master by: /usr/sbin/mfsmaster stop and wait for the writing process to finish saving metadata (you'll see metadata.mfs.back.tmp growing). Add RAM and restart :) Kind regards Michał Borychowski MooseFS Support Manager _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ Gemius S.A. ul. Wołoska 7, 02-672 Warszawa Budynek MARS, klatka D Tel.: +4822 874-41-00 Fax : +4822 874-41-01 -----Original Message----- From: Krzysztof Janiszewski - ecenter sp. z o.o. [mailto:k.j...@ec...] Sent: Thursday, July 07, 2011 9:53 PM To: moo...@li... Subject: [Moosefs-users] mfsmaster Hello! Is it normal that I have two mfsmaster processes running at the same time on master machine? root@master:~# 180s ps -ax | grep mfs 3659 ? D< 41:29 /usr/sbin/mfsmaster start 3667 ? S 0:22 python /usr/sbin/mfscgiserv 7072 ? D< 0:06 /usr/sbin/mfsmaster start Can I kill -9 mfsmaster when all chunkservers and metalogger are stopped? I ran out of memory and master is using 1GB of swap now and mfsmaster don't want to stop in "normal" way. Best regards Krzysztof Janiszewski ecenter sp. z o.o. -------------------------------------- Domeny, hosting, poczta wideo :: http://www.ecenter.pl :: Niniejsza wiadomość przekazana została Państwu przez ecenter sp z o.o. 87-100 Toruń, Ul. Goździkowa 2 Zarejestrowana w Sądzie Rejonowym w Toruniu VII Wydział Gospodarczy Krajowego Rejestru Sądowego pod numerem 0000251110 Z kapitałem zakładowym w wysokości 142500zł NIP 956-216-66-73 ---------------------------------------------------------------------------- -- All of the data generated in your IT infrastructure is seriously valuable. Why? It contains a definitive record of application performance, security threats, fraudulent activity, and more. Splunk takes this data and makes sense of it. IT sense. And common sense. http://p.sf.net/sfu/splunk-d2d-c2 _______________________________________________ moosefs-users mailing list moo...@li... https://lists.sourceforge.net/lists/listinfo/moosefs-users |
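A minimal sketch of the shutdown sequence Michał describes, assuming the default MooseFS data directory (adjust DATA_PATH to whatever mfsmaster.cfg points at on your install):

```bash
#!/bin/bash
# Ask the master to stop, then wait until the forked child has finished
# dumping metadata to disk. Do not kill -9 while the dump is in progress.
DATA_PATH=/var/lib/mfs    # assumed default; check DATA_PATH in mfsmaster.cfg

/usr/sbin/mfsmaster stop &

# metadata.mfs.back.tmp grows while the dump is running and goes away when
# the dump has completed, so just watch for it to disappear.
while [ -e "$DATA_PATH/metadata.mfs.back.tmp" ]; do
    ls -l "$DATA_PATH/metadata.mfs.back.tmp"
    sleep 10
done
echo "metadata dump finished"
```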
From: Krzysztof J. - e. s. z o.o. <k.j...@ec...> - 2011-07-07 20:09:29
Hello! Is it normal that I have two mfsmaster processes running at the same time on master machine? root@master:~# 180s ps -ax | grep mfs 3659 ? D< 41:29 /usr/sbin/mfsmaster start 3667 ? S 0:22 python /usr/sbin/mfscgiserv 7072 ? D< 0:06 /usr/sbin/mfsmaster start Can I kill -9 mfsmaster when all chunkservers and metalogger are stopped? I ran out of memory and master is using 1GB of swap now and mfsmaster don't want to stop in "normal" way. Best regards Krzysztof Janiszewski ecenter sp. z o.o. -------------------------------------- Domeny, hosting, poczta wideo :: http://www.ecenter.pl :: Niniejsza wiadomość przekazana została Państwu przez ecenter sp z o.o. 87-100 Toruń, Ul. Goździkowa 2 Zarejestrowana w Sądzie Rejonowym w Toruniu VII Wydział Gospodarczy Krajowego Rejestru Sądowego pod numerem 0000251110 Z kapitałem zakładowym w wysokości 142500zł NIP 956-216-66-73 |
From: Tuukka L. <tlu...@gm...> - 2011-07-07 16:33:18
Hey Michal, Thanks for the info. Actually you got me thinking about this again, and maybe a simpler/scalable solution to the problem I was experiencing would be to run a sanity check on the meta file that gets dumped and if it doesn't pass it some kind of error is presented in the admin console, logs and/or emailed/text messaged etc. Also in that case the system should keep the bad one as well as the last good one at the minimum. Thanks, Tuukka 2011/7/7 Michal Borychowski <mic...@ge...>: > Hi Tuukka! > > In reply to your old post :) Files below 64MB would occupy single chunks so > they would be in one part, not divided. > > Chunks have 5kB which needs to be removed and later it is necessary to check > where the file ends. Chunks length is rounded up to (5+64*n)kB, where n can > be 0 up to 1024. > > > Kind regards > Michał Borychowski > MooseFS Support Manager > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > Gemius S.A. > ul. Wołoska 7, 02-672 Warszawa > Budynek MARS, klatka D > Tel.: +4822 874-41-00 > Fax : +4822 874-41-01 > > > > -----Original Message----- > From: Tuukka Luolamo [mailto:tlu...@gm...] > Sent: Monday, June 06, 2011 8:17 AM > To: Michal Borychowski > Cc: moo...@li... > Subject: Re: [Moosefs-users] Problems after power failure > > So the chunk servers in effect do not know what files they have. Only > the master is aware? I looked at the chunks themselves and they don't > seem to be particularly special, I was able to identify some of the > files inside the chunks simply by using head/tail/cat commands on > them, so it would seem like it would not be hard for each chunkserver > to be at least minimally aware of what is in themselves, to remove > some dependency on the master. Not sure how other people feel, but if > I knew that each chunkserver can tell the master what it has, it would > make me feel better that there is a way to recover the contents of the > file system in case of a extreme failure, specially in small > implementations. I realize in larger implementations it may be > impractical to ever recover the meta data from the chunkservers, > simply for the time it might take. So having very reliable masters and > backup masters would be a key. > > Thanks. > > 2011/6/5 Michal Borychowski <mic...@ge...>: >> If you really don't have your metadata, your files are dead... You need >> metadata to know about them. Of course you can try to recover by reading > the >> surface of hard drives on chunkservers but this would be very tedious... >> >> >> Regards >> Michał >> >> -----Original Message----- >> From: Tuukka Luolamo [mailto:tlu...@gm...] >> Sent: Monday, June 06, 2011 7:56 AM >> To: Michal Borychowski >> Cc: moo...@li... >> Subject: Re: [Moosefs-users] Problems after power failure >> >> Well I guess the question is in the impossible situation that the >> metadata were to be completely corrupt or lost because there is no >> backup and master goes dead, what recourse is there if any? >> >> Thanks >> >> 2011/6/5 Michal Borychowski <mic...@ge...>: >>> Hi! >>> >>> That's interesting... So this bit could be changed by your CPU or >>> motherboard or it is an error in the software but it would be very very >>> difficult to find it as the error probably cannot be easily repeated. >>> >>> Regarding your previous questions - it's almost impossible that your >>> metadata is "completely" corrupt. You really can recover most of your >> files >>> at most times. This situation was really weird as there was a single >> change >>> in an information bit. 
Normally you would run mfsmetarestore with a flag >> -i >>> (ignore) and it would just ignore this one file. Unfortunately you would >> not >>> be able to repair this single bit as this was quite a complicated > process. >>> >>> >>> Kind regards >>> Michał >>> >>> >>> -----Original Message----- >>> From: Tuukka Luolamo [mailto:tlu...@gm...] >>> Sent: Monday, June 06, 2011 5:01 AM >>> To: Michal Borychowski; moo...@li... >>> Subject: Re: [Moosefs-users] Problems after power failure >>> >>> OK I ran the memtest on the master server, It got through without >>> finding any errors. >>> >>> 2011/6/2 Michal Borychowski <mic...@ge...>: >>>> Hi! >>>> >>>> In the meantime please run the memtest - we are curious if it really was >>>> hardware problem or maybe it could be a software problem >>>> >>>> >>>> Regards >>>> Michal >>>> >>>> -----Original Message----- >>>> From: Tuukka Luolamo [mailto:tlu...@gm...] >>>> Sent: Thursday, June 02, 2011 9:55 AM >>>> To: WK >>>> Cc: moo...@li... >>>> Subject: Re: [Moosefs-users] Problems after power failure >>>> >>>> OK I put in place the file the dev sent me and can not see any data >>> loss... >>>> >>>> I found the file in question the one I got in the error and it seems >> fine. >>>> >>>> The whole system is up and functioning. >>>> >>>> I run the system on a old desktop computer and a another PC I bought >>>> for $25 so the dev recommends making sure you have good memory, but I >>>> guess I am using whatever I got =) Aside from this error everything >>>> has been fine. Didn't run the memtest they recommended, but I would >>>> not count out memory errors. >>>> >>>> However I would like to understand the situation better mainly for >>>> what are my recourses. As WK articulated already had my metadata been >>>> completely corrupt would I have lost all my data? Would I lose just >>>> one file the one with the error? And can I fix this error myself? >>>> >>>> Thanks, >>>> >>>> Tuukka >>>> >>>> On Wed, Jun 1, 2011 at 3:39 PM, WK <wk...@bn...> wrote: >>>>> On 6/1/2011 2:30 AM, Michal Borychowski wrote: >>>>>> >>>>>> We think that this problem could be caused by your RAM in the master. >> We >>>>>> recommend using RAM with parity control. You can also run a test from >>>>>> http://www.memtest.org/ on your server and check your existing RAM. Of >>>>>> course, the bit could have been changed also on the motherboard level >> or >>>> CPU >>>>>> - which is much less probable. >>>>>> >>>>>> >>>>>> Also you can see in the log that file 7538 is located between 7553 and >>>> 7555: >>>>> >>>>> >>>>> So in a situation like this where the metadata is now corrupt. >>>>> >>>>> Is the problem fixable with only the loss of the one file? (and how > does >>>>> one fix it). >>>>> >>>>> or is his entire MFS setup completely corrupt and he would need to have >>>>> had a backup? >>>>> >>>>> Can I assume that older archived versions of the metadata.mfs could be >>>>> used to recover most of the files. >>>>> >>>>> -bill >>>>> >>>>> >>>> >>> >> > ---------------------------------------------------------------------------- >>>> -- >>>>> Simplify data backup and recovery for your virtual environment with >>>> vRanger. >>>>> Installation's a snap, and flexible recovery options mean your data is >>>> safe, >>>>> secure and there when you need it. Data protection magic? >>>>> Nope - It's vRanger. Get your free trial download today. >>>>> http://p.sf.net/sfu/quest-sfdev2dev >>>>> _______________________________________________ >>>>> moosefs-users mailing list >>>>> moo...@li... 
>>>>> https://lists.sourceforge.net/lists/listinfo/moosefs-users >>>>> >>>> >>>> >>> >> > ---------------------------------------------------------------------------- >>>> -- >>>> Simplify data backup and recovery for your virtual environment with >>> vRanger. >>>> >>>> Installation's a snap, and flexible recovery options mean your data is >>> safe, >>>> secure and there when you need it. Data protection magic? >>>> Nope - It's vRanger. Get your free trial download today. >>>> http://p.sf.net/sfu/quest-sfdev2dev >>>> _______________________________________________ >>>> moosefs-users mailing list >>>> moo...@li... >>>> https://lists.sourceforge.net/lists/listinfo/moosefs-users >>>> >>>> >>> >>> >> > ---------------------------------------------------------------------------- >>> -- >>> Simplify data backup and recovery for your virtual environment with >> vRanger. >>> Installation's a snap, and flexible recovery options mean your data is >> safe, >>> secure and there when you need it. Discover what all the cheering's > about. >>> Get your free trial download today. >>> http://p.sf.net/sfu/quest-dev2dev2 >>> _______________________________________________ >>> moosefs-users mailing list >>> moo...@li... >>> https://lists.sourceforge.net/lists/listinfo/moosefs-users >>> >>> >> >> > > ---------------------------------------------------------------------------- > -- > Simplify data backup and recovery for your virtual environment with vRanger. > Installation's a snap, and flexible recovery options mean your data is safe, > secure and there when you need it. Discover what all the cheering's about. > Get your free trial download today. > http://p.sf.net/sfu/quest-dev2dev2 > _______________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users > > |
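As a rough illustration of the "read the chunks directly" last resort discussed in this thread, the sketch below strips the 5 kB chunk header so the raw payload can be inspected with ordinary tools. The chunk file name and location are assumptions (chunkservers keep chunk files under their configured hdd paths), and reassembling multi-chunk files would still require the master's metadata:

```bash
#!/bin/bash
# Hypothetical example: carve user data out of one chunk file found on a
# chunkserver disk. Chunks carry a 5 kB header that has to be skipped, and
# the payload is at most 64 MB (chunk length rounds to (5 + 64*n) kB).
CHUNK=/mnt/hdd1/00/chunk_0000000000001234_00000001.mfs   # example path, adjust
OUT=/tmp/chunk_payload.bin

# Skip the 5 kB header; what remains is the file data stored in this chunk.
dd if="$CHUNK" of="$OUT" bs=1024 skip=5

# Inspect the payload to try to identify which file it belonged to.
file "$OUT"
head -c 256 "$OUT" | hexdump -C | head
```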
From: i- <it...@it...> - 2011-07-07 15:08:15
Big up for this feature ! This REALLY is interresting. Le 07/07/2011 15:36, Roger Skjetlein a écrit : > How will this rack maps implemention be? Is is rack de-centric or > co-centric? > > RS > > On 6/30/11 10:24 AM, Michal Borychowski wrote: >> Hi! >> >> This could be quite difficult to keep "distances" in IP addresses. We >> introduced rack maps which should be easier in the mainenance. It would be >> soon available in the public version. >> >> >> -----Original Message----- >> From: Mike [mailto:isp...@gm...] >> Sent: Tuesday, June 21, 2011 5:02 AM >> To: moo...@li... >> Subject: [Moosefs-users] geographical fun with MooseFS >> >> >> What if someone wrote a patch for MooseFS that looked at the IP of the >> client, and the IP of the chunk servers that had the chunk the client >> wanted, and tried to pick the closest one? >> >> something like >> >> client = 10.1.1.1 >> chunkserver with copy#1 = 10.1.1.2 >> chunkserver with copy#2 = 10.1.1.20 >> chunkserver with copy#3 = 10.1.2.2 >> >> Those 3 chunk servers would be used in that order, since their IPs are >> closer to the client's IP (using a formula like a*256^3+b*256^2+c*256+d >> to calculate an integer? long int? based on an IP address). >> >> This way you could have "close" chunk servers respond to most of the >> requests from a client, but if no "close" server had the chunk you >> wanted, you could go to a "distant" one. >> >> Drop two IPs on the client's interface, and do some careful numbering, >> and you can even set preference on a machine on the same LAN. >> >> This might make for a really simple way to do "distant" archives that >> don't get used for reads unless they are the only source that's >> available, and other similar problems. >> >> writes would be a different problem. >> >> Thoughts? comments? |
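For what it's worth, the "closeness" formula Mike proposes is easy to prototype outside MooseFS. The sketch below is purely illustrative (the chunkserver addresses are the made-up ones from his example): it converts dotted quads with a*256^3 + b*256^2 + c*256 + d and sorts the candidate chunkservers by numeric distance from the client. In MooseFS itself such an ordering would have to live in the client's replica selection; the rack maps Michal mentions are the official answer to the same problem.

```bash
#!/bin/bash
# Illustration of the proposed IP-distance heuristic: convert dotted quads to
# integers and pick the chunkserver numerically closest to the client.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    # 10# guards against octets with leading zeros being read as octal
    echo $(( 10#$a * 256**3 + 10#$b * 256**2 + 10#$c * 256 + 10#$d ))
}

client=10.1.1.1
chunkservers="10.1.1.2 10.1.1.20 10.1.2.2"   # hypothetical copies of one chunk

cint=$(ip_to_int "$client")
for cs in $chunkservers; do
    csint=$(ip_to_int "$cs")
    dist=$(( cint > csint ? cint - csint : csint - cint ))
    echo "$dist $cs"
done | sort -n        # nearest chunkserver first
```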
From: Roger S. <rog...@un...> - 2011-07-07 13:36:36
How will this rack maps implemention be? Is is rack de-centric or co-centric? RS On 6/30/11 10:24 AM, Michal Borychowski wrote: > Hi! > > This could be quite difficult to keep "distances" in IP addresses. We > introduced rack maps which should be easier in the mainenance. It would be > soon available in the public version. > > > Regards > Kind regards > Michał Borychowski > MooseFS Support Manager > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > Gemius S.A. > ul. Wołoska 7, 02-672 Warszawa > Budynek MARS, klatka D > Tel.: +4822 874-41-00 > Fax : +4822 874-41-01 > > > -----Original Message----- > From: Mike [mailto:isp...@gm...] > Sent: Tuesday, June 21, 2011 5:02 AM > To: moo...@li... > Subject: [Moosefs-users] geographical fun with MooseFS > > > What if someone wrote a patch for MooseFS that looked at the IP of the > client, and the IP of the chunk servers that had the chunk the client > wanted, and tried to pick the closest one? > > something like > > client = 10.1.1.1 > chunkserver with copy#1 = 10.1.1.2 > chunkserver with copy#2 = 10.1.1.20 > chunkserver with copy#3 = 10.1.2.2 > > Those 3 chunk servers would be used in that order, since their IPs are > closer to the client's IP (using a formula like a*256^3+b*256^2+c*256+d > to calculate an integer? long int? based on an IP address). > > This way you could have "close" chunk servers respond to most of the > requests from a client, but if no "close" server had the chunk you > wanted, you could go to a "distant" one. > > Drop two IPs on the client's interface, and do some careful numbering, > and you can even set preference on a machine on the same LAN. > > This might make for a really simple way to do "distant" archives that > don't get used for reads unless they are the only source that's > available, and other similar problems. > > writes would be a different problem. > > Thoughts? comments? > > ---------------------------------------------------------------------------- > -- > EditLive Enterprise is the world's most technically advanced content > authoring tool. Experience the power of Track Changes, Inline Image > Editing and ensure content is compliant with Accessibility Checking. > http://p.sf.net/sfu/ephox-dev2dev > _______________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users > > > ------------------------------------------------------------------------------ > All of the data generated in your IT infrastructure is seriously valuable. > Why? It contains a definitive record of application performance, security > threats, fraudulent activity, and more. Splunk takes this data and makes > sense of it. IT sense. And common sense. > http://p.sf.net/sfu/splunk-d2d-c2 > _______________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users |
From: jyc <mai...@gm...> - 2011-07-07 09:08:24
Hi,

Thanks for your answer. How can I help you with this problem? I can tell that I still have these 5 chunks with one valid copy each, and a goal of zero. I don't know if it's possible, but it would be cool to be able to query the mfsmaster process like an SQL request, for example: select chunks where file_goal=0 and valid_copies=1; :-) There must be a way to find it using bash and mfsfileinfo, but I think it would be a killer app for debugging.

Thanks again for your great job.

Michal Borychowski wrote:
> Hi!
>
> We'll have to look closer at this situation. We know about this behaviour and it needs further investigation. But fortunately this is nothing serious.
>
> Kind regards
> -Michał
>
> From: jyc [mailto:mai...@gm...]
> Sent: Monday, July 04, 2011 2:35 PM
> To: moo...@li...
> Subject: [Moosefs-users] chunks valid copies problem
>
> Hi everyone, (excuse me, html mail...)
>
> I'm testing MooseFS 1.6.20-2 with up to 30 TB. Everything is running fine, except that I still have 5 chunks with 1 valid copy each and a goal of zero. They never go to 0 valid copies, even when the whole cluster is "stable", i.e. without any rebalancing. Is there any way to find out why these chunks exist, and where they are?
>
> Filesystem check info: check loop from Mon Jul 4 07:33:03 2011 to Mon Jul 4 11:33:17 2011; files: 388938; under-goal files: 0; missing files: 0; chunks: 606531; under-goal chunks: 0; missing chunks: 0.
>
> Regular chunks state matrix (counts only 'regular' hdd space): goal 0 with 1 valid copy: 5 chunks; 5 in total.
>
> jyc
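There is no built-in SQL-like query, but something close to what jyc asks for can be approximated from a client with the standard tools. A rough sketch follows; the mount point is assumed, and the exact mfsgetgoal/mfsfileinfo output format should be checked against your version. Note that it only sees chunks still attached to visible files, so goal-0 chunks of deleted or reserved files will not show up, which is precisely what makes them hard to track down.

```bash
#!/bin/bash
# Walk the mount and report files whose chunks have fewer valid copies than
# their goal. Slow on large trees; meant purely as an illustration.
MNT=/mnt/mfs        # assumed mount point

find "$MNT" -type f | while read -r f; do
    goal=$(mfsgetgoal "$f" | awk '{print $NF}')
    # mfsfileinfo prints one "copy N: host:port" line per valid copy of a chunk
    mfsfileinfo "$f" | awk -v goal="$goal" -v file="$f" '
        /chunk / { if (chunk != "" && copies < goal)
                       print file, "chunk", chunk, "copies:", copies "/" goal
                   chunk = $2; copies = 0 }
        /copy /  { copies++ }
        END      { if (chunk != "" && copies < goal)
                       print file, "chunk", chunk, "copies:", copies "/" goal }'
done
```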
From: Florent B. <fl...@co...> - 2011-07-07 08:13:36
Of course I have it ! Fully mounted using mfsmount and used by client (many KVM images are in this share, are running). Just this file seems missing... Le 07/07/2011 10:08, Michal Borychowski a écrit : > > So probalby you don’t even have the folder /mfs/Ahng2u/ itself? > > Regards > > Michał > > *From:*Florent Bautista [mailto:fl...@co...] > *Sent:* Thursday, July 07, 2011 9:38 AM > *To:* moo...@li... > *Subject:* Re: [Moosefs-users] register to master: Permission denied > > Hi Michal, > > I still have the problem since I haven't rebooted master or client (I > will migrate master soon, so ...). > > The result of that command is just... > > -bash: /mnt/Ahng2u/.masterinfo: No such file or directory > > What is this file and why is it missing ? > > Thank you > > Le 07/07/2011 08:39, Michal Borychowski a écrit : > > Hi Florent! > > Do you still have this problem? > > Can you run this command and send us the results? > > "hexdump -C /mnt/Ahng2u/.masterinfo" lub "xxd -g1 < > /mnt/Ahng2u/.masterinfo" > > Regards > > -Michał > > *From:*Florent Bautista [mailto:fl...@co...] > *Sent:* Thursday, June 23, 2011 9:46 AM > *To:* moo...@li... > <mailto:moo...@li...> > *Subject:* [Moosefs-users] register to master: Permission denied > > Hi all, > > I have a problem with an installation of MooseFS. > > MFS is successfully mounted by a client, all files are readable and > writtable, but every command mfs* is not working and returns : > > register to master: Permission denied > > For example : > > test1:~# mfsdirinfo /mfs/Ahng2u/ > register to master: Permission denied > /mfs/Ahng2u/: can't register to master (.masterinfo) > > But I confirm that this client is using the files without problem > (some KVM machines are stored on it and running !). > > What can be the problem ? This is the first time I'm having this error. > > I do not see anything in syslog, but maybe I'm not looking in the > right place! > > -- > > > Florent Bautista > > ------------------------------------------------------------------------ > > Ce message et ses éventuelles pièces jointes sont personnels, > confidentiels et à l'usage exclusif de leur destinataire. > Si vous n'êtes pas la personne à laquelle ce message est destiné, > veuillez noter que vous avez reçu ce courriel par erreur et qu'il vous > est strictement interdit d'utiliser, de diffuser, de transférer, > d'imprimer ou de copier ce message. > > This e-mail and any attachments hereto are strictly personal, > confidential and intended solely for the addressee. > If you are not the intended recipient, be advised that you have > received this email in error and that any use, dissemination, > forwarding, printing, or copying of this message is strictly prohibited. > > ------------------------------------------------------------------------ > > -- > > > Florent Bautista > > ------------------------------------------------------------------------ > > Ce message et ses éventuelles pièces jointes sont personnels, > confidentiels et à l'usage exclusif de leur destinataire. > Si vous n'êtes pas la personne à laquelle ce message est destiné, > veuillez noter que vous avez reçu ce courriel par erreur et qu'il vous > est strictement interdit d'utiliser, de diffuser, de transférer, > d'imprimer ou de copier ce message. > > This e-mail and any attachments hereto are strictly personal, > confidential and intended solely for the addressee. 
> If you are not the intended recipient, be advised that you have > received this email in error and that any use, dissemination, > forwarding, printing, or copying of this message is strictly prohibited. > > ------------------------------------------------------------------------ -- Florent Bautista ------------------------------------------------------------------------ Ce message et ses éventuelles pièces jointes sont personnels, confidentiels et à l'usage exclusif de leur destinataire. Si vous n'êtes pas la personne à laquelle ce message est destiné, veuillez noter que vous avez reçu ce courriel par erreur et qu'il vous est strictement interdit d'utiliser, de diffuser, de transférer, d'imprimer ou de copier ce message. This e-mail and any attachments hereto are strictly personal, confidential and intended solely for the addressee. If you are not the intended recipient, be advised that you have received this email in error and that any use, dissemination, forwarding, printing, or copying of this message is strictly prohibited. ------------------------------------------------------------------------ |
From: Michal B. <mic...@ge...> - 2011-07-07 07:56:40
Hi Tuukka! In reply to your old post :) Files below 64MB would occupy single chunks so they would be in one part, not divided. Chunks have 5kB which needs to be removed and later it is necessary to check where the file ends. Chunks length is rounded up to (5+64*n)kB, where n can be 0 up to 1024. Kind regards Michał Borychowski MooseFS Support Manager _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ Gemius S.A. ul. Wołoska 7, 02-672 Warszawa Budynek MARS, klatka D Tel.: +4822 874-41-00 Fax : +4822 874-41-01 -----Original Message----- From: Tuukka Luolamo [mailto:tlu...@gm...] Sent: Monday, June 06, 2011 8:17 AM To: Michal Borychowski Cc: moo...@li... Subject: Re: [Moosefs-users] Problems after power failure So the chunk servers in effect do not know what files they have. Only the master is aware? I looked at the chunks themselves and they don't seem to be particularly special, I was able to identify some of the files inside the chunks simply by using head/tail/cat commands on them, so it would seem like it would not be hard for each chunkserver to be at least minimally aware of what is in themselves, to remove some dependency on the master. Not sure how other people feel, but if I knew that each chunkserver can tell the master what it has, it would make me feel better that there is a way to recover the contents of the file system in case of a extreme failure, specially in small implementations. I realize in larger implementations it may be impractical to ever recover the meta data from the chunkservers, simply for the time it might take. So having very reliable masters and backup masters would be a key. Thanks. 2011/6/5 Michal Borychowski <mic...@ge...>: > If you really don't have your metadata, your files are dead... You need > metadata to know about them. Of course you can try to recover by reading the > surface of hard drives on chunkservers but this would be very tedious... > > > Regards > Michał > > -----Original Message----- > From: Tuukka Luolamo [mailto:tlu...@gm...] > Sent: Monday, June 06, 2011 7:56 AM > To: Michal Borychowski > Cc: moo...@li... > Subject: Re: [Moosefs-users] Problems after power failure > > Well I guess the question is in the impossible situation that the > metadata were to be completely corrupt or lost because there is no > backup and master goes dead, what recourse is there if any? > > Thanks > > 2011/6/5 Michal Borychowski <mic...@ge...>: >> Hi! >> >> That's interesting... So this bit could be changed by your CPU or >> motherboard or it is an error in the software but it would be very very >> difficult to find it as the error probably cannot be easily repeated. >> >> Regarding your previous questions - it's almost impossible that your >> metadata is "completely" corrupt. You really can recover most of your > files >> at most times. This situation was really weird as there was a single > change >> in an information bit. Normally you would run mfsmetarestore with a flag > -i >> (ignore) and it would just ignore this one file. Unfortunately you would > not >> be able to repair this single bit as this was quite a complicated process. >> >> >> Kind regards >> Michał >> >> >> -----Original Message----- >> From: Tuukka Luolamo [mailto:tlu...@gm...] >> Sent: Monday, June 06, 2011 5:01 AM >> To: Michal Borychowski; moo...@li... >> Subject: Re: [Moosefs-users] Problems after power failure >> >> OK I ran the memtest on the master server, It got through without >> finding any errors. >> >> 2011/6/2 Michal Borychowski <mic...@ge...>: >>> Hi! 
>>> >>> In the meantime please run the memtest - we are curious if it really was >>> hardware problem or maybe it could be a software problem >>> >>> >>> Regards >>> Michal >>> >>> -----Original Message----- >>> From: Tuukka Luolamo [mailto:tlu...@gm...] >>> Sent: Thursday, June 02, 2011 9:55 AM >>> To: WK >>> Cc: moo...@li... >>> Subject: Re: [Moosefs-users] Problems after power failure >>> >>> OK I put in place the file the dev sent me and can not see any data >> loss... >>> >>> I found the file in question the one I got in the error and it seems > fine. >>> >>> The whole system is up and functioning. >>> >>> I run the system on a old desktop computer and a another PC I bought >>> for $25 so the dev recommends making sure you have good memory, but I >>> guess I am using whatever I got =) Aside from this error everything >>> has been fine. Didn't run the memtest they recommended, but I would >>> not count out memory errors. >>> >>> However I would like to understand the situation better mainly for >>> what are my recourses. As WK articulated already had my metadata been >>> completely corrupt would I have lost all my data? Would I lose just >>> one file the one with the error? And can I fix this error myself? >>> >>> Thanks, >>> >>> Tuukka >>> >>> On Wed, Jun 1, 2011 at 3:39 PM, WK <wk...@bn...> wrote: >>>> On 6/1/2011 2:30 AM, Michal Borychowski wrote: >>>>> >>>>> We think that this problem could be caused by your RAM in the master. > We >>>>> recommend using RAM with parity control. You can also run a test from >>>>> http://www.memtest.org/ on your server and check your existing RAM. Of >>>>> course, the bit could have been changed also on the motherboard level > or >>> CPU >>>>> - which is much less probable. >>>>> >>>>> >>>>> Also you can see in the log that file 7538 is located between 7553 and >>> 7555: >>>> >>>> >>>> So in a situation like this where the metadata is now corrupt. >>>> >>>> Is the problem fixable with only the loss of the one file? (and how does >>>> one fix it). >>>> >>>> or is his entire MFS setup completely corrupt and he would need to have >>>> had a backup? >>>> >>>> Can I assume that older archived versions of the metadata.mfs could be >>>> used to recover most of the files. >>>> >>>> -bill >>>> >>>> >>> >> > ---------------------------------------------------------------------------- >>> -- >>>> Simplify data backup and recovery for your virtual environment with >>> vRanger. >>>> Installation's a snap, and flexible recovery options mean your data is >>> safe, >>>> secure and there when you need it. Data protection magic? >>>> Nope - It's vRanger. Get your free trial download today. >>>> http://p.sf.net/sfu/quest-sfdev2dev >>>> _______________________________________________ >>>> moosefs-users mailing list >>>> moo...@li... >>>> https://lists.sourceforge.net/lists/listinfo/moosefs-users >>>> >>> >>> >> > ---------------------------------------------------------------------------- >>> -- >>> Simplify data backup and recovery for your virtual environment with >> vRanger. >>> >>> Installation's a snap, and flexible recovery options mean your data is >> safe, >>> secure and there when you need it. Data protection magic? >>> Nope - It's vRanger. Get your free trial download today. >>> http://p.sf.net/sfu/quest-sfdev2dev >>> _______________________________________________ >>> moosefs-users mailing list >>> moo...@li... 
>>> https://lists.sourceforge.net/lists/listinfo/moosefs-users >>> >>> >> >> > ---------------------------------------------------------------------------- >> -- >> Simplify data backup and recovery for your virtual environment with > vRanger. >> Installation's a snap, and flexible recovery options mean your data is > safe, >> secure and there when you need it. Discover what all the cheering's about. >> Get your free trial download today. >> http://p.sf.net/sfu/quest-dev2dev2 >> _______________________________________________ >> moosefs-users mailing list >> moo...@li... >> https://lists.sourceforge.net/lists/listinfo/moosefs-users >> >> > > ---------------------------------------------------------------------------- -- Simplify data backup and recovery for your virtual environment with vRanger. Installation's a snap, and flexible recovery options mean your data is safe, secure and there when you need it. Discover what all the cheering's about. Get your free trial download today. http://p.sf.net/sfu/quest-dev2dev2 _______________________________________________ moosefs-users mailing list moo...@li... https://lists.sourceforge.net/lists/listinfo/moosefs-users |
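For reference, the recovery path discussed in this thread boils down to rebuilding metadata.mfs from the last saved copy plus the changelogs. A sketch of the usual invocation follows; the paths are the defaults and may differ on your install, and -i is the "ignore" flag Michal mentions:

```bash
#!/bin/bash
# Rebuild master metadata after a crash, as discussed above. Run on the master
# (or against a metalogger's copies of the files). Paths assume the default
# data directory.
cd /var/lib/mfs

# Automatic mode: uses metadata.mfs.back plus the changelog files in this dir.
mfsmetarestore -a

# Or explicitly, skipping single broken entries instead of aborting (-i):
# mfsmetarestore -i -m metadata.mfs.back -o metadata.mfs changelog.*.mfs

# Then start the master again:
/usr/sbin/mfsmaster start
```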
From: Florent B. <fl...@co...> - 2011-07-07 07:54:50
Hi Michal, I still have the problem since I haven't rebooted master or client (I will migrate master soon, so ...). The result of that command is just... -bash: /mnt/Ahng2u/.masterinfo: No such file or directory What is this file and why is it missing ? Thank you Le 07/07/2011 08:39, Michal Borychowski a écrit : > > Hi Florent! > > Do you still have this problem? > > Can you run this command and send us the results? > > "hexdump -C /mnt/Ahng2u/.masterinfo" lub "xxd -g1 < > /mnt/Ahng2u/.masterinfo" > > Regards > > -Michał > > *From:*Florent Bautista [mailto:fl...@co...] > *Sent:* Thursday, June 23, 2011 9:46 AM > *To:* moo...@li... > *Subject:* [Moosefs-users] register to master: Permission denied > > Hi all, > > I have a problem with an installation of MooseFS. > > MFS is successfully mounted by a client, all files are readable and > writtable, but every command mfs* is not working and returns : > > register to master: Permission denied > > For example : > > test1:~# mfsdirinfo /mfs/Ahng2u/ > register to master: Permission denied > /mfs/Ahng2u/: can't register to master (.masterinfo) > > But I confirm that this client is using the files without problem > (some KVM machines are stored on it and running !). > > What can be the problem ? This is the first time I'm having this error. > > I do not see anything in syslog, but maybe I'm not looking in the > right place! > > -- > > > Florent Bautista > > ------------------------------------------------------------------------ > > Ce message et ses éventuelles pièces jointes sont personnels, > confidentiels et à l'usage exclusif de leur destinataire. > Si vous n'êtes pas la personne à laquelle ce message est destiné, > veuillez noter que vous avez reçu ce courriel par erreur et qu'il vous > est strictement interdit d'utiliser, de diffuser, de transférer, > d'imprimer ou de copier ce message. > > This e-mail and any attachments hereto are strictly personal, > confidential and intended solely for the addressee. > If you are not the intended recipient, be advised that you have > received this email in error and that any use, dissemination, > forwarding, printing, or copying of this message is strictly prohibited. > > ------------------------------------------------------------------------ > > 30440 Saint Laurent le Minier > France > > *Compagnie pour des Prestations Internet* > > Téléphone : +33 (0)467 73 89 48 > Télécopie : + 33 (0)9 59 48 06 27 > > Courriel : Fl...@Co... <mailto:fl...@co...> > > ------------------------------------------------------------------------ -- Florent Bautista ------------------------------------------------------------------------ Ce message et ses éventuelles pièces jointes sont personnels, confidentiels et à l'usage exclusif de leur destinataire. Si vous n'êtes pas la personne à laquelle ce message est destiné, veuillez noter que vous avez reçu ce courriel par erreur et qu'il vous est strictement interdit d'utiliser, de diffuser, de transférer, d'imprimer ou de copier ce message. This e-mail and any attachments hereto are strictly personal, confidential and intended solely for the addressee. If you are not the intended recipient, be advised that you have received this email in error and that any use, dissemination, forwarding, printing, or copying of this message is strictly prohibited. ------------------------------------------------------------------------ |
From: Michal B. <mic...@ge...> - 2011-07-07 07:48:52
Hi!

We'll have to look closer at this situation. We know about this behaviour and it needs further investigation. But fortunately this is nothing serious.

Kind regards
-Michał

From: jyc [mailto:mai...@gm...]
Sent: Monday, July 04, 2011 2:35 PM
To: moo...@li...
Subject: [Moosefs-users] chunks valid copies problem

Hi everyone, (excuse me, html mail...)

I'm testing MooseFS 1.6.20-2 with up to 30 TB. Everything is running fine, except that I still have 5 chunks with 1 valid copy each and a goal of zero. They never go to 0 valid copies, even when the whole cluster is "stable", i.e. without any rebalancing. Is there any way to find out why these chunks exist, and where they are?

Filesystem check info:

| check loop start time | check loop end time | files | under-goal files | missing files | chunks | under-goal chunks | missing chunks |
|---|---|---|---|---|---|---|---|
| Mon Jul 4 07:33:03 2011 | Mon Jul 4 11:33:17 2011 | 388938 | 0 | 0 | 606531 | 0 | 0 |

Regular chunks state matrix (counts only 'regular' hdd space; switch to 'all': <http://172.16.33.248:9425/mfs.cgi?HDperiod=0&sections=HD%7CIN&INmatrix=0>):

| goal | 0 copies | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10+ | all |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | - | 5 | - | - | - | - | - | - | - | - | - | 5 |

jyc
From: Michal B. <mic...@ge...> - 2011-07-07 06:42:48
Hi! Not much. You should check if BDB (or some earlier versions) could be recompiled so that it doesn't use mmap. Kind regards -Michal -----Original Message----- From: youngcow [mailto:you...@gm...] Sent: Monday, July 04, 2011 2:32 PM To: Michal Borychowski Cc: 'moosefs-users' Subject: Re: [Moosefs-users] mmap file on moosefs error Hi Thanks. rpm Use BDB Library and error happened on bdb library. Do you have any workaround for it? > Hi > > We made some tests with mmap some time ago and for the moment FUSE on Linux > doesn't support flag MAP_SHARED - it returns ENODEV error then. Only private > mappings are supported (MAP_PRIVATE). > > Probably rpm or BerkeleyDB (library responsible for rpm database) tries to > run mmap with this flag. > > > Kind regards > Michał Borychowski > MooseFS Support Manager > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > Gemius S.A. > ul. Wołoska 7, 02-672 Warszawa > Budynek MARS, klatka D > Tel.: +4822 874-41-00 > Fax : +4822 874-41-01 > > > -----Original Message----- > From: youngcow [mailto:you...@gm...] > Sent: Saturday, July 02, 2011 11:40 AM > To: moosefs-users > Subject: [Moosefs-users] mmap file on moosefs error > > Hi, > I found a error when I used moosefs. I used openvz(container-based > virtualization for Linux) on RHEL 5.6 x86_64 > and put all container file into moosefs and I notice an error when I was > running "rpm -qa " in vm: > > rpmdb: mmap: No such device > error: db4 error(19) from dbenv->open: No such device > error: cannot open Packages index using db3 - No such device (19) > error: cannot open Packages database in /var/lib/rpm > > So I tested it in physical machine: > I put the directory "/var/lib/rpm" into moosefs, link the dir to > /var/lib and run "rpm -qa",the error message also: > rpmdb: mmap: No such device > error: db4 error(19) from dbenv->open: No such device > error: cannot open Packages index using db3 - No such device > error: cannot open Packages database in /var/lib/rpm > > Anyone knows the reason. Is it a bug of moosefs? > > Thanks. > > ---------------------------------------------------------------------------- > -- > All of the data generated in your IT infrastructure is seriously valuable. > Why? It contains a definitive record of application performance, security > threats, fraudulent activity, and more. Splunk takes this data and makes > sense of it. IT sense. And common sense. > http://p.sf.net/sfu/splunk-d2d-c2 > _______________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users > |
From: Michal B. <mic...@ge...> - 2011-07-07 06:38:01
Hi! You can just try to limit write buffer, eg. pass "-o mfswritecachesize=16" Generally speaking mfsmount should not exceed 1GB RAM (unless FUSE invokes too many threads but probably we cannot do much about it). You can try to run FUSE with the '-s' option but we afraid it would cause dramatic decrease in the performance. Kind regards Michał Borychowski MooseFS Support Manager _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ Gemius S.A. ul. Wołoska 7, 02-672 Warszawa Budynek MARS, klatka D Tel.: +4822 874-41-00 Fax : +4822 874-41-01 From: WK [mailto:wk...@bn...] Sent: Sunday, June 26, 2011 11:28 PM To: moo...@li... Subject: [Moosefs-users] khugepaged/fuse issue on RH6/SL6 One one of our MFS clusters has four clients mounted. Three of them running RHEL5/Cent5 never have issues. The fourth locks up at least once a week, with the below /var/log/messages (note the mount errors just go on forever until we reboot). It starts with khugepaged. Googling the issue indicates that many people are seeing this with other fuse projects all with recent kernels. In particular the ZFS project has a number of threads. Here is just one thread: http://zfs-fuse.net/issues/123 In that thread, aside from downgrading the distro there is a recommenation of limiting the memory used to 1GB or less using "zfs-fuse --stack-size=1024 -m 1024 --no-kstat-mount --disable-block-cache --disable-page-cache -v 1 --zfs-prefetch-disable". Is there a MFSmount equivalent for limiting memory or any suggestions/feedback regarding this issue. Sincerely, WK LOG FILE snippet Jun 26 13:41:25 ariel kernel: INFO: task khugepaged:52 blocked for more than 120 seconds. Jun 26 13:41:25 ariel kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Jun 26 13:41:25 ariel kernel: khugepaged D ffff88012fc23080 0 52 2 0x00000000 Jun 26 13:41:25 ariel kernel: ffff88012af9f900 0000000000000046 0000000000000000 ffffffff8104b9c8 Jun 26 13:41:25 ariel kernel: 0000000002dae000 ffffea000027a050 000000000000000e 0000000113d439da Jun 26 13:41:25 ariel kernel: ffff88012afa3ad8 ffff88012af9ffd8 0000000000010518 ffff88012afa3ad8 Jun 26 13:41:25 ariel kernel: Call Trace: Jun 26 13:41:25 ariel kernel: [<ffffffff8104b9c8>] ? flush_tlb_others_ipi+0x128/0x130 Jun 26 13:41:25 ariel kernel: [<ffffffff8110c330>] ? sync_page+0x0/0x50 Jun 26 13:41:25 ariel kernel: [<ffffffff814c9a53>] io_schedule+0x73/0xc0 Jun 26 13:41:25 ariel kernel: [<ffffffff8110c36d>] sync_page+0x3d/0x50 Jun 26 13:41:25 ariel kernel: [<ffffffff814ca17a>] __wait_on_bit_lock+0x5a/0xc0 Jun 26 13:41:25 ariel kernel: [<ffffffff8110c307>] __lock_page+0x67/0x70 Jun 26 13:41:25 ariel kernel: [<ffffffff81091ee0>] ? wake_bit_function+0x0/0x50 Jun 26 13:41:25 ariel kernel: [<ffffffff81122781>] ? lru_cache_add_lru+0x21/0x40 Jun 26 13:41:25 ariel kernel: [<ffffffff8115bf10>] lock_page+0x30/0x40 Jun 26 13:41:25 ariel kernel: [<ffffffff8115c58d>] migrate_pages+0x59d/0x5d0 Jun 26 13:41:25 ariel kernel: [<ffffffff81152b20>] ? compaction_alloc+0x0/0x370 Jun 26 13:41:25 ariel kernel: [<ffffffff811525cc>] compact_zone+0x4cc/0x600 Jun 26 13:41:25 ariel kernel: [<ffffffff8111cffc>] ? 
get_page_from_freelist+0x15c/0x820 Jun 26 13:41:25 ariel kernel: [<ffffffff8115297e>] compact_zone_order+0x7e/0xb0 Jun 26 13:41:25 ariel kernel: [<ffffffff81152ab9>] try_to_compact_pages+0x109/0x170 Jun 26 13:41:25 ariel kernel: [<ffffffff8111e99d>] __alloc_pages_nodemask+0x5ed/0x850 Jun 26 13:41:25 ariel kernel: [<ffffffff81150db3>] alloc_pages_vma+0x93/0x150 Jun 26 13:41:25 ariel kernel: [<ffffffff81165c4b>] khugepaged+0xa9b/0x1210 Jun 26 13:41:25 ariel kernel: [<ffffffff81091ea0>] ? autoremove_wake_function+0x0/0x40 Jun 26 13:41:25 ariel kernel: [<ffffffff811651b0>] ? khugepaged+0x0/0x1210 Jun 26 13:41:25 ariel kernel: [<ffffffff81091b36>] kthread+0x96/0xa0 Jun 26 13:41:25 ariel kernel: [<ffffffff810141ca>] child_rip+0xa/0x20 Jun 26 13:41:25 ariel kernel: [<ffffffff81091aa0>] ? kthread+0x0/0xa0 Jun 26 13:41:25 ariel kernel: [<ffffffff810141c0>] ? child_rip+0x0/0x20 Jun 26 13:41:25 ariel kernel: INFO: task mfsmount:7808 blocked for more than 120 seconds. Jun 26 13:41:25 ariel kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Jun 26 13:41:25 ariel kernel: mfsmount D ffff88012fc23280 0 7808 1 0x00000000 Jun 26 13:41:25 ariel kernel: ffff8800730e1b70 0000000000000086 0000000000000000 0000000000000000 Jun 26 13:41:25 ariel kernel: ffff8800282912c0 0000000000000400 0000000000001000 0000000113d44058 Jun 26 13:41:25 ariel kernel: ffff8801121f45f8 ffff8800730e1fd8 0000000000010518 ffff8801121f45f8 Jun 26 13:41:25 ariel kernel: Call Trace: Jun 26 13:41:25 ariel kernel: [<ffffffff814cb6e5>] rwsem_down_failed_common+0x95/0x1d0 Jun 26 13:41:25 ariel kernel: [<ffffffff81059e02>] ? finish_task_switch+0x42/0xd0 Jun 26 13:41:25 ariel kernel: [<ffffffff814cb876>] rwsem_down_read_failed+0x26/0x30 Jun 26 13:41:25 ariel kernel: [<ffffffff81264db4>] call_rwsem_down_read_failed+0x14/0x30 Jun 26 13:41:25 ariel kernel: [<ffffffff814cad74>] ? down_read+0x24/0x30 Jun 26 13:41:25 ariel kernel: [<ffffffffa0370419>] fuse_copy_fill+0x99/0x1f0 [fuse] Jun 26 13:41:25 ariel kernel: [<ffffffffa03705b1>] fuse_copy_one+0x41/0x70 [fuse] Jun 26 13:41:25 ariel kernel: [<ffffffffa03714c4>] fuse_dev_read+0x224/0x310 [fuse] Jun 26 13:41:25 ariel kernel: [<ffffffff81091ea0>] ? autoremove_wake_function+0x0/0x40 Jun 26 13:41:25 ariel kernel: [<ffffffff8116d19a>] do_sync_read+0xfa/0x140 Jun 26 13:41:25 ariel kernel: [<ffffffff81091ea0>] ? autoremove_wake_function+0x0/0x40 Jun 26 13:41:25 ariel kernel: [<ffffffff81401e77>] ? release_sock+0xb7/0xd0 Jun 26 13:41:25 ariel kernel: [<ffffffff811fff16>] ? security_file_permission+0x16/0x20 Jun 26 13:41:25 ariel kernel: [<ffffffff8116dbc5>] vfs_read+0xb5/0x1a0 Jun 26 13:41:25 ariel kernel: [<ffffffff8116dd01>] sys_read+0x51/0x90 Jun 26 13:41:25 ariel kernel: [<ffffffff81013172>] system_call_fastpath+0x16/0x1b Jun 26 13:41:25 ariel kernel: INFO: task mfsmount:3885 blocked for more than 120 seconds. Jun 26 13:41:25 ariel kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Jun 26 13:41:25 ariel kernel: mfsmount D ffff88012fc23280 0 3885 1 0x00000000 Jun 26 13:41:25 ariel kernel: ffff880063e1bb70 0000000000000086 0000000000000000 ffff880063e1baf8 Jun 26 13:41:25 ariel kernel: ffff880028316980 ffff880063e1bb18 ffffffff8105c846 0000000113d44249 Jun 26 13:41:25 ariel kernel: ffff880044ea6678 ffff880063e1bfd8 0000000000010518 ffff880044ea6678 Jun 26 13:41:25 ariel kernel: Call Trace: Jun 26 13:41:25 ariel kernel: [<ffffffff8105c846>] ? update_curr+0xe6/0x1e0 Jun 26 13:41:25 ariel kernel: [<ffffffff81061c61>] ? 
dequeue_entity+0x1a1/0x1e0 Jun 26 13:41:25 ariel kernel: [<ffffffff814cb6e5>] rwsem_down_failed_common+0x95/0x1d0 Jun 26 13:41:25 ariel kernel: [<ffffffff81059e02>] ? finish_task_switch+0x42/0xd0 Jun 26 13:41:25 ariel kernel: [<ffffffff814cb876>] rwsem_down_read_failed+0x26/0x30 Jun 26 13:41:25 ariel kernel: [<ffffffff81264db4>] call_rwsem_down_read_failed+0x14/0x30 Jun 26 13:41:25 ariel kernel: [<ffffffff814cad74>] ? down_read+0x24/0x30 Jun 26 13:41:25 ariel kernel: [<ffffffffa0370419>] fuse_copy_fill+0x99/0x1f0 [fuse] Jun 26 13:41:25 ariel kernel: [<ffffffffa03705b1>] fuse_copy_one+0x41/0x70 [fuse] Jun 26 13:41:25 ariel kernel: [<ffffffffa03714c4>] fuse_dev_read+0x224/0x310 [fuse] Jun 26 13:41:25 ariel kernel: [<ffffffff81091ea0>] ? autoremove_wake_function+0x0/0x40 Jun 26 13:41:25 ariel kernel: [<ffffffff8116d19a>] do_sync_read+0xfa/0x140 Jun 26 13:41:25 ariel kernel: [<ffffffff81091ea0>] ? autoremove_wake_function+0x0/0x40 Jun 26 13:41:25 ariel kernel: [<ffffffff81401e77>] ? release_sock+0xb7/0xd0 Jun 26 13:41:25 ariel kernel: [<ffffffff811fff16>] ? security_file_permission+0x16/0x20 Jun 26 13:41:25 ariel kernel: [<ffffffff8116dbc5>] vfs_read+0xb5/0x1a0 Jun 26 13:41:25 ariel kernel: [<ffffffff8116dd01>] sys_read+0x51/0x90 Jun 26 13:41:25 ariel kernel: [<ffffffff81013172>] system_call_fastpath+0x16/0x1b Jun 26 13:41:25 ariel kernel: INFO: task mfsmount:21898 blocked for more than 120 seconds. Jun 26 13:41:25 ariel kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Jun 26 13:41:25 ariel kernel: mfsmount D ffff88012fc23280 0 21898 1 0x00000000 Jun 26 13:41:25 ariel kernel: ffff8800cdfe7b70 0000000000000086 0000000000000000 0000000000000000 Jun 26 13:41:25 ariel kernel: ffff8800282912c0 0000000000000400 0000000000001000 0000000113d439a9 Jun 26 13:41:25 ariel kernel: ffff8801298d45f8 ffff8800cdfe7fd8 0000000000010518 ffff8801298d45f8 Jun 26 13:41:25 ariel kernel: Call Trace: Jun 26 13:41:25 ariel kernel: [<ffffffff814cb6e5>] rwsem_down_failed_common+0x95/0x1d0 Jun 26 13:41:25 ariel kernel: [<ffffffff81059e02>] ? finish_task_switch+0x42/0xd0 Jun 26 13:41:25 ariel kernel: [<ffffffff814cb876>] rwsem_down_read_failed+0x26/0x30 Jun 26 13:41:25 ariel kernel: [<ffffffff81264db4>] call_rwsem_down_read_failed+0x14/0x30 Jun 26 13:41:25 ariel kernel: [<ffffffff814cad74>] ? down_read+0x24/0x30 Jun 26 13:41:25 ariel kernel: [<ffffffffa0370419>] fuse_copy_fill+0x99/0x1f0 [fuse] Jun 26 13:41:25 ariel kernel: [<ffffffffa03705b1>] fuse_copy_one+0x41/0x70 [fuse] Jun 26 13:41:25 ariel kernel: [<ffffffffa03714c4>] fuse_dev_read+0x224/0x310 [fuse] Jun 26 13:41:25 ariel kernel: [<ffffffff81091ea0>] ? autoremove_wake_function+0x0/0x40 Jun 26 13:41:25 ariel kernel: [<ffffffff8116d19a>] do_sync_read+0xfa/0x140 Jun 26 13:41:25 ariel kernel: [<ffffffff81091ea0>] ? autoremove_wake_function+0x0/0x40 Jun 26 13:41:25 ariel kernel: [<ffffffff8118bf70>] ? mntput_no_expire+0x30/0x110 Jun 26 13:41:25 ariel kernel: [<ffffffff811fff16>] ? security_file_permission+0x16/0x20 Jun 26 13:41:25 ariel kernel: [<ffffffff8116dbc5>] vfs_read+0xb5/0x1a0 Jun 26 13:41:25 ariel kernel: [<ffffffff8116dd01>] sys_read+0x51/0x90 Jun 26 13:41:25 ariel kernel: [<ffffffff81013172>] system_call_fastpath+0x16/0x1b and so on. |
From: Rajeev K M. <raj...@gm...> - 2011-07-06 13:53:21
Hi All, Can some one help me with the following situation. CGI interface is showing 1 under goal chunk (red), and gui is reporting this file under 'reserved' metadata area. This is showing for more than 2 weeks now. currently unavailable chunk 00000000003791F5 (inode: 298219 ; index: 2) + currently unavailable reserved file 298219: [File_Path_snipped...]/mytestfile unavailable chunks: 1 unavailable reserved files: 1 I have trash quarantine setup for 2 days. Per documentation, files in 'reserved' metadata area are deleted files, but still open by some process and MFS will remove these once the file is closed. I have restarted master, all chunk servers and rebooted client from where this file was actually written in the first place. I mounted (META) data area and can see corresponding file under 'reserved' directory, but can not remove from there. Any other way to clean this up? rxknhe |
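A sketch of how to poke at this from the client side. The MFSMETA mount below is the same "META data area" the poster already mounted; the flags are taken from the mfsmount man page and worth verifying for your version. Reserved entries normally disappear only when the last client closes the deleted file, so the second half just illustrates hunting for the process that still holds it open; the mount points are assumptions.

```bash
#!/bin/bash
# Inspect reserved (deleted-but-still-open) files via the MFSMETA mount.
mkdir -p /mnt/mfsmeta
mfsmount /mnt/mfsmeta -H mfsmaster -m      # -m / --meta mounts the meta filesystem

ls -l /mnt/mfsmeta/reserved                # entries here are still open somewhere

# On each client that mounts this filesystem, look for processes with files
# open on the MooseFS mount; a deleted-but-open file typically shows up here.
lsof /mnt/mfs 2>/dev/null | grep -i mytestfile
```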
From: i- <it...@it...> - 2011-07-06 12:30:44
Hi all, I'm running a very simple cluster with 4 machines (1 master+meta+chunk, 2 chunk-only, 1 client, all on the same gigabit network, all servers using SSDs) and I have 2 questions : 1/ in the server charts, my chunkservers all show a lot of bytes read (7M / s) though they are idle, no client is doing anything, how can this be possible ? I also noticed there are 2 colors in the charts : light green and dark green. The chart shows both dark and light green when I use the cluster and only light green when I don't. What are the colors for ? 2/ I'm using bonnie++ to do some basic performance testing and it shows the following performance : - Write : ~40MB/s - Rewrite : ~4MB/s - Read : ~50MB/s I guess the network latency is the bottleneck here because "iostats -m -x" shows small cpu load and small ssd usage on all machines. How can I verify that ? Thank you very much! |
From: Samuel H. O. N. <sam...@ol...> - 2011-07-06 08:16:57
Hi all, I know there is a roadmap on the MooseFS website, but do you have a planning for the next releases? We are experiencing (http://on-master.olympe-network.com:9425/mfs.cgi) problems of performances during multiple small files accesses (such as Joomla or Wordpress hosting), we try to mount the filesystem with these options: mfsmount -H on-master -o mfsattrcacheto=30,mfsentrycacheto=30,mfsdirentrycacheto=30 Can you advice us more tuning options in order to improve the performances on our filesystem? Do you think the next releases will help us more? Thanks for your answers. Best regards. Sam |
From: Robert S. <rsa...@ne...> - 2011-07-05 12:47:37
Hi Michal, I need this to see if there is a way I can optimize the system to open more files per minute. At this stage our systems can open a few hundred files in parallel; I am not yet at the point where I can do thousands. What I think I am seeing is that the writes are starved because there are too many pending opens and most of them are for reading files. There seems to be a limit of around 2,400 opens per minute on the hardware I have, and I am looking at what needs to be done to improve that. Based on your answer it sounds like the network traffic from the machine running mfsmount to the master may be the biggest delay? Short of converting to 10 Gb/s networking or trying to get all the servers on the same switch I don't know if there is much to be done about it? Robert On 7/5/11 3:15 AM, Michal Borychowski wrote: > Hi Robert! > > Ad. 1. There is no limit in mfsmount itself, but there are some limits in the operating system. Generally speaking it is wise not to open more than several thousand files in parallel. > > Ad. 2. fopen invokes open, and open invokes (through the kernel and FUSE) the functions mfs_lookup and mfs_open. mfs_lookup resolves consecutive path elements into an i-node number, while mfs_open performs the actual opening of the target file. It sends a packet to the master to learn whether the file's data may be kept in the cache. It also marks the file in the master as opened - if the file is deleted, it is retained until the moment it is closed. > > BTW. Why do you need this? > > > Kind regards > Michał Borychowski > MooseFS Support Manager > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > Gemius S.A. > ul. Wołoska 7, 02-672 Warszawa > Budynek MARS, klatka D > Tel.: +4822 874-41-00 > Fax : +4822 874-41-01 > > > > -----Original Message----- > From: Robert Sandilands [mailto:rsa...@ne...] > Sent: Saturday, July 02, 2011 2:54 AM > To: moo...@li... > Subject: Re: [Moosefs-users] Write starvation > > Based on some tests I think the limit in this case is the number of > opens per minute. I think I need to understand what happens with an open > before I can make guesses on what can be done to get the number higher. > > But then it still does not quite explain the write starvation except if > the number of pending reads is just so much higher than the number of > pending writes that it seems to starve the writes. Maybe this will > resolve itself as I add more chunk servers. > > Some questions: > > 1. Is there a limit to the number of handles that client applications > can open per mount, per chunk server, per disk? > 2. What happens when an application does fopen() on a mount? Can > somebody give a quick overview or do I have to read some code? > > Robert > > On 6/30/11 11:32 AM, Ricardo J. Barberis wrote: >> On Wednesday, 29 June 2011, Robert wrote: >>> Yes, we use Centos, but installing and using the ktune package generally >>> resolves most of the performance issues and differences I have seen with >>> Ubuntu/Debian. >> Nice to know about ktune and thank you for bringing it up, I'll take a look at >> it. >> >>> I don't understand the comment on hitting metadata a lot? What is a lot? >> A lot = reading / (re)writing / ls -l'ing / stat'ing too often. >> >> If the client can't cache the metadata but uses it often, that means it has to >> query the master every time. >> >> Network latencies might also play a role in the performance degradation. >> >>> Why would it make a difference? All the metadata is in RAM anyway? The >>> biggest limit to speed seems to be the number of IOPS that you can get out >>> of the disks you have available to you. Looking up the metadata from RAM >>> should be several orders of magnitude faster than that. >> Yep, and you have plenty of RAM, so that shouldn't be an issue in your case. >> >>> The activity reported through the CGI interface on the master is around >>> 2,400 opens per minute average. Reads and writes are also around 2,400 per >>> minute, alternating with each other. mknod has some peaks around 2,800 per >>> minute but is generally much lower. Lookups are around 8,000 per minute >>> and getattr is around 700 per minute. Chunk replication and deletion is >>> around 50 per minute. The other numbers are generally very low. >> Mmm, maybe 2 chunkservers are just too little to handle that activity but I >> would also check the network latencies. >> >> I'm also not really confident about having master and chunkserver on the same >> server but I don't have any hard evidence to support my feelings ;) >> >>> Is there a guide/hints specific to MooseFS on what IO/Net/Process >>> parameters would be good to investigate for mfsmaster? >> I'd like to know that too! >> >> Cheers, > > ------------------------------------------------------------------------------ > All of the data generated in your IT infrastructure is seriously valuable. > Why? It contains a definitive record of application performance, security > threats, fraudulent activity, and more. Splunk takes this data and makes > sense of it. IT sense. And common sense. > http://p.sf.net/sfu/splunk-d2d-c2 > _______________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users > |
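A rough way to probe the open rate from a client, as a sketch only: /mnt/mfs/testdir and the file count are placeholders, each cat does one open/read/close, and filenames are assumed not to contain whitespace. Elapsed time divided into the file count approximates opens per second at that level of parallelism.

    cd /mnt/mfs/testdir
    find . -type f | head -n 5000 > /tmp/filelist
    time xargs -P 64 -n 1 cat < /tmp/filelist > /dev/null

Varying -P (the number of parallel workers) shows whether the bottleneck is per-open latency to the master or the aggregate throughput of the chunkservers.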
From: Robert S. <rsa...@ne...> - 2011-07-05 12:47:21
|
Hi Ricardo, My guess is that the number of chunks the chunkserver knows about has an influence on the memory usage. In my case mfschunkserver uses a bit more memory. As reported by top:

    USER     PR  NI  VIRT   RES   SHR  S  %CPU  %MEM  TIME+      COMMAND
    daemon    0  -19 3622m  3.4g  756  S  16.9   5.5  188:10.92  mfschunkserver
    daemon    0  -19 8763m  8.5g  676  D  22.6  13.5  7766:43    mfschunkserver
    daemon    0  -19 24.3g  24g   768  S  10.0  39.2  248:44.71  mfsmaster

I have seen mfsmount use around 1 GB of RAM too. This is the current usage on two servers where only read traffic happens:

    root      4  -19 745m   51m   496  S   3.3   0.1  453:04.87  mfsmount
    root      3  -19 684m   52m   748  S   2.7   0.1   55:44.67  mfsmount

This is mfsmount on a server where only write traffic happens:

    root      0  -19 298m   39m   528  S   4.0   2.0  2017:43    mfsmount

On another machine with a mix of read and write:

    root      6  -19 364m   12m   492  S   0.0   0.2   57:15.53  mfsmount

But, yes. We will have to see how we can juggle hardware to get the best performance within the other constraints we have. Robert On 7/4/11 5:58 PM, Ricardo J. Barberis wrote: > On Monday, 4 July 2011, Robert Sandilands wrote: >> We plan on adding more chunkservers as we move content from traditional >> file systems to MFS. A dedicated master may be some time away. Getting a >> machine with 64+ GB of RAM is not always easy to get past the budget >> monsters. > If the problem is RAM, you can take some from your chunkservers, as they don't > use it much (only for disk cache I would guess). > > Or, once you add another chunkserver you might free the master disks and keep > it as master only. > > I don't have so many files (600,000) but my chunkservers have 2 or 4 GB and > are only using ~200 MB and my master is using 580 MB from a total of 4 GB. > > The rest of the memory is all used for disk cache and buffers, at least > according to "free -m". > > Regards, |
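To compare memory use of the MooseFS daemons on one box without wading through full top output, a one-liner along these lines can help; this is only a sketch and assumes a procps ps that accepts a comma-separated -C list and --sort:

    ps -C mfsmaster,mfschunkserver,mfsmount -o pid,user,rss,vsz,comm --sort=-rss

RSS here is resident memory in kB, so it maps directly onto the RES column in the top output above.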
From: Michal B. <mic...@ge...> - 2011-07-05 07:27:15
|
Hi! Have you finally managed to run the MooseFS system? If not, we'd need more information about your problem. Kind regards Michał Borychowski MooseFS Support Manager _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ Gemius S.A. ul. Wołoska 7, 02-672 Warszawa Budynek MARS, klatka D Tel.: +4822 874-41-00 Fax : +4822 874-41-01 -----Original Message----- From: Zachary A Wagner [mailto:zw...@ii...] Sent: Wednesday, June 29, 2011 9:12 PM To: moo...@li... Subject: [Moosefs-users] Client cannot mount to mfsmaster So after successfully building a test network last week for my professor (consisting of 1 master, 1 meta, 2 chunkservers, and 1 client), I arrived this week with the machines turned off. After restarting, I started the machines in order of master, meta, chunk1, chunk2, client. However, now the client will not mount to the master no matter what I do. Has anybody ever had this problem before? Is it something I need to change on my MooseFS machines or do you think it may have to do with my network? FYI I am very new to Linux and extremely new to MooseFS. Thank you, Zach ---------------------------------------------------------------------------- -- All of the data generated in your IT infrastructure is seriously valuable. Why? It contains a definitive record of application performance, security threats, fraudulent activity, and more. Splunk takes this data and makes sense of it. IT sense. And common sense. http://p.sf.net/sfu/splunk-d2d-c2 _______________________________________________ moosefs-users mailing list moo...@li... https://lists.sourceforge.net/lists/listinfo/moosefs-users |
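For reference, a minimal restart-and-mount check in the same order described above; the paths, the host name mfsmaster and port 9421 (the usual client port) are assumptions to adjust to the actual setup:

    /usr/sbin/mfsmaster start        # on the master
    /usr/sbin/mfsmetalogger start    # on the metalogger
    /usr/sbin/mfschunkserver start   # on each chunkserver

    # on the client: confirm the master is reachable before mounting
    ping -c 3 mfsmaster
    telnet mfsmaster 9421
    mfsmount /mnt/mfs -H mfsmaster

If the mount still fails, running mfsmount in the foreground (mfsmount -f ...) usually prints the reason, for example a name-resolution problem or a "can't connect to master" error.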
From: Michal B. <mic...@ge...> - 2011-07-05 07:15:35
|
Hi Robert! Ad. 1. There is no limit in mfsmount itself, but there are some limits in the operating system. Generally speaking it is wise not to open more than several thousand files in parallel. Ad. 2. fopen invokes open, and open invokes (through the kernel and FUSE) the functions mfs_lookup and mfs_open. mfs_lookup resolves consecutive path elements into an i-node number, while mfs_open performs the actual opening of the target file. It sends a packet to the master to learn whether the file's data may be kept in the cache. It also marks the file in the master as opened - if the file is deleted, it is retained until the moment it is closed. BTW. Why do you need this? Kind regards Michał Borychowski MooseFS Support Manager _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ Gemius S.A. ul. Wołoska 7, 02-672 Warszawa Budynek MARS, klatka D Tel.: +4822 874-41-00 Fax : +4822 874-41-01 -----Original Message----- From: Robert Sandilands [mailto:rsa...@ne...] Sent: Saturday, July 02, 2011 2:54 AM To: moo...@li... Subject: Re: [Moosefs-users] Write starvation Based on some tests I think the limit in this case is the number of opens per minute. I think I need to understand what happens with an open before I can make guesses on what can be done to get the number higher. But then it still does not quite explain the write starvation except if the number of pending reads is just so much higher than the number of pending writes that it seems to starve the writes. Maybe this will resolve itself as I add more chunk servers. Some questions: 1. Is there a limit to the number of handles that client applications can open per mount, per chunk server, per disk? 2. What happens when an application does fopen() on a mount? Can somebody give a quick overview or do I have to read some code? Robert On 6/30/11 11:32 AM, Ricardo J. Barberis wrote: > On Wednesday, 29 June 2011, Robert wrote: >> Yes, we use Centos, but installing and using the ktune package generally >> resolves most of the performance issues and differences I have seen with >> Ubuntu/Debian. > Nice to know about ktune and thank you for bringing it up, I'll take a look at > it. > >> I don't understand the comment on hitting metadata a lot? What is a lot? > A lot = reading / (re)writing / ls -l'ing / stat'ing too often. > > If the client can't cache the metadata but uses it often, that means it has to > query the master every time. > > Network latencies might also play a role in the performance degradation. > >> Why would it make a difference? All the metadata is in RAM anyway? The >> biggest limit to speed seems to be the number of IOPS that you can get out >> of the disks you have available to you. Looking up the metadata from RAM >> should be several orders of magnitude faster than that. > Yep, and you have plenty of RAM, so that shouldn't be an issue in your case. > >> The activity reported through the CGI interface on the master is around >> 2,400 opens per minute average. Reads and writes are also around 2,400 per >> minute, alternating with each other. mknod has some peaks around 2,800 per >> minute but is generally much lower. Lookups are around 8,000 per minute >> and getattr is around 700 per minute. Chunk replication and deletion is >> around 50 per minute. The other numbers are generally very low. > Mmm, maybe 2 chunkservers are just too little to handle that activity but I > would also check the network latencies. > > I'm also not really confident about having master and chunkserver on the same > server but I don't have any hard evidence to support my feelings ;) > >> Is there a guide/hints specific to MooseFS on what IO/Net/Process >> parameters would be good to investigate for mfsmaster? > I'd like to know that too! > > Cheers, ------------------------------------------------------------------------------ All of the data generated in your IT infrastructure is seriously valuable. Why? It contains a definitive record of application performance, security threats, fraudulent activity, and more. Splunk takes this data and makes sense of it. IT sense. And common sense. http://p.sf.net/sfu/splunk-d2d-c2 _______________________________________________ moosefs-users mailing list moo...@li... https://lists.sourceforge.net/lists/listinfo/moosefs-users |
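To see the cost of that lookup/open path from a client, per-syscall timing in strace can serve as a quick probe; this is only a sketch, /mnt/mfs/some/file is a placeholder, and on newer systems the call may show up as openat rather than open:

    strace -T -e trace=open cat /mnt/mfs/some/file > /dev/null

The time printed in angle brackets next to open() includes the FUSE round trips for path resolution and the open itself, so deep directory trees (more path components to look up) and a busy master both show up directly as longer open() times.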
From: Ricardo J. B. <ric...@da...> - 2011-07-04 21:59:05
|
On Monday, 4 July 2011, Robert Sandilands wrote: > We plan on adding more chunkservers as we move content from traditional > file systems to MFS. A dedicated master may be some time away. Getting a > machine with 64+ GB of RAM is not always easy to get past the budget > monsters. If the problem is RAM, you can take some from your chunkservers, as they don't use it much (only for disk cache I would guess). Or, once you add another chunkserver you might free the master disks and keep it as master only. I don't have so many files (600,000) but my chunkservers have 2 or 4 GB and are only using ~200 MB and my master is using 580 MB from a total of 4 GB. The rest of the memory is all used for disk cache and buffers, at least according to "free -m". Regards, -- Ricardo J. Barberis Senior SysAdmin / ITI Dattatec.com :: Soluciones de Web Hosting Tu Hosting hecho Simple! |
From: Robert S. <rsa...@ne...> - 2011-07-04 17:13:21
|
Hi Michal, In our case we never modify the files once we have received them. It is also very rare for multiple processes to read the same file at the same time. What is not rare is hundreds of simultaneous reads on hundreds of unique files randomly distributed through the file system. We plan on adding more chunkservers as we move content from traditional file systems to MFS. A dedicated master may be some time away. Getting a machine with 64+ GB of RAM is not always easy to get past the budget monsters. Robert On 7/4/11 3:30 AM, Michal Borychowski wrote: > Hi Robert! > > Parallel reading and writing to the same file indeed causes some write > delays. If it is not your case, adding 1-2 chunkservers and having a > dedicated machine for the master should help. > > > Kind regards > Michał Borychowski > MooseFS Support Manager > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > Gemius S.A. > ul. Wołoska 7, 02-672 Warszawa > Budynek MARS, klatka D > Tel.: +4822 874-41-00 > Fax : +4822 874-41-01 > > > > -----Original Message----- > From: Robert Sandilands [mailto:rsa...@ne...] > Sent: Wednesday, June 29, 2011 4:30 AM > To: moo...@li... > Subject: [Moosefs-users] Write starvation > > I have been moving data from existing non-distributed file systems onto > a MooseFS file system. > > I am using 1.6.20 on Centos 5.6. > > While moving the data I have also transferred some of the normal > read-only traffic load to use the data already moved onto the MFS volume. > > What I can see is that when there is any significant read traffic, write > traffic slows down to a crawl. > > When I look at the server charts for any of the chunk servers generated > by mfscgiserv, read and write traffic seem to alternate. > > Write traffic does not stop completely, but seems to slow down to < 10 > kB per second under high read traffic conditions. When the read traffic > decreases the write traffic will increase to normal levels. > > Is this a known problem? Is there something I can do to ensure that > write traffic is not starved by read traffic? > > Robert > > > > > > ---------------------------------------------------------------------------- > -- > All of the data generated in your IT infrastructure is seriously valuable. > Why? It contains a definitive record of application performance, security > threats, fraudulent activity, and more. Splunk takes this data and makes > sense of it. IT sense. And common sense. > http://p.sf.net/sfu/splunk-d2d-c2 > _______________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users > |
From: jyc <mai...@gm...> - 2011-07-04 12:34:50
|
Hi everyone (excuse the HTML mail), I'm testing MooseFS 1.6.20-2 with up to 30 TB. Everything is running fine, except that I still have 5 chunks, each with 1 valid copy and a goal of zero, and they never drop to 0 valid copies, even when the whole cluster is "stable", i.e. without any rebalancing going on. Is there any way to find out why these chunks exist, and where they are?

Filesystem check info:
    check loop start time: Mon Jul 4 07:33:03 2011
    check loop end time:   Mon Jul 4 11:33:17 2011
    files: 388938    under-goal files: 0    missing files: 0
    chunks: 606531   under-goal chunks: 0   missing chunks: 0

Regular chunks state matrix (counts only 'regular' hdd space):

    goal \ valid copies    0    1    2    3    4    5    6    7    8    9   10+   all
    0                      -    5    -    -    -    -    -    -    -    -    -     5

jyc |
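A sketch of one way to track such chunks down, assuming the mfsmeta mount option is available in this mfsmount build and that chunks with goal 0 and a single copy belong to trashed or deleted-but-still-open (reserved) files, both of which are visible through the meta filesystem; the host name is a placeholder:

    mkdir -p /mnt/mfsmeta
    mfsmount -o mfsmeta -H mfsmaster /mnt/mfsmeta
    ls /mnt/mfsmeta/reserved        # files deleted while still held open by a client
    ls /mnt/mfsmeta/trash | head    # trashed files awaiting expiry

If a reserved entry corresponds to a process that still holds the file open on some client, closing that process (or unmounting that client) should let the master finally release the chunks.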