From: Tuukka L. <tlu...@gm...> - 2011-07-07 16:33:18
Hey Michal,

Thanks for the info. Actually you got me thinking about this again, and
maybe a simpler/more scalable solution to the problem I was experiencing
would be to run a sanity check on the meta file that gets dumped, and if
it doesn't pass, some kind of error is presented in the admin console,
logs and/or emailed/text messaged etc. Also, in that case the system
should keep the bad one as well as the last good one at the minimum.

Thanks,

Tuukka

2011/7/7 Michal Borychowski <mic...@ge...>:
> Hi Tuukka!
>
> In reply to your old post :) Files below 64MB would occupy single
> chunks, so they would be in one part, not divided.
>
> Chunks have a 5kB header which needs to be removed, and later it is
> necessary to check where the file ends. Chunk length is rounded up to
> (5+64*n)kB, where n can be 0 up to 1024.
>
> Kind regards
> Michał Borychowski
> MooseFS Support Manager
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> Gemius S.A.
> ul. Wołoska 7, 02-672 Warszawa
> Budynek MARS, klatka D
> Tel.: +4822 874-41-00
> Fax : +4822 874-41-01
>
> -----Original Message-----
> From: Tuukka Luolamo [mailto:tlu...@gm...]
> Sent: Monday, June 06, 2011 8:17 AM
> To: Michal Borychowski
> Cc: moo...@li...
> Subject: Re: [Moosefs-users] Problems after power failure
>
> So the chunk servers in effect do not know what files they have. Only
> the master is aware? I looked at the chunks themselves and they don't
> seem to be particularly special. I was able to identify some of the
> files inside the chunks simply by using head/tail/cat commands on
> them, so it would seem like it would not be hard for each chunkserver
> to be at least minimally aware of what it holds, to remove some
> dependency on the master. Not sure how other people feel, but if I
> knew that each chunkserver can tell the master what it has, it would
> make me feel better that there is a way to recover the contents of the
> file system in case of an extreme failure, especially in small
> implementations.
> I realize in larger implementations it may be impractical to ever
> recover the metadata from the chunkservers, simply for the time it
> might take. So having very reliable masters and backup masters would
> be key.
>
> Thanks.
>
> 2011/6/5 Michal Borychowski <mic...@ge...>:
>> If you really don't have your metadata, your files are dead... You
>> need metadata to know about them. Of course you can try to recover by
>> reading the surface of hard drives on chunkservers but this would be
>> very tedious...
>>
>> Regards
>> Michał
>>
>> -----Original Message-----
>> From: Tuukka Luolamo [mailto:tlu...@gm...]
>> Sent: Monday, June 06, 2011 7:56 AM
>> To: Michal Borychowski
>> Cc: moo...@li...
>> Subject: Re: [Moosefs-users] Problems after power failure
>>
>> Well I guess the question is, in the impossible situation that the
>> metadata were to be completely corrupt or lost because there is no
>> backup and the master goes dead, what recourse is there, if any?
>>
>> Thanks
>>
>> 2011/6/5 Michal Borychowski <mic...@ge...>:
>>> Hi!
>>>
>>> That's interesting... So this bit could have been changed by your
>>> CPU or motherboard, or it is an error in the software, but it would
>>> be very very difficult to find, as the error probably cannot be
>>> easily repeated.
>>>
>>> Regarding your previous questions - it's almost impossible that your
>>> metadata is "completely" corrupt. You really can recover most of
>>> your files at most times. This situation was really weird as there
>>> was a single change in an information bit. Normally you would run
>>> mfsmetarestore with a flag -i (ignore) and it would just ignore this
>>> one file. Unfortunately you would not be able to repair this single
>>> bit as this was quite a complicated process.
>>>
>>> Kind regards
>>> Michał
>>>
>>> -----Original Message-----
>>> From: Tuukka Luolamo [mailto:tlu...@gm...]
>>> Sent: Monday, June 06, 2011 5:01 AM
>>> To: Michal Borychowski; moo...@li...
>>> Subject: Re: [Moosefs-users] Problems after power failure
>>>
>>> OK I ran the memtest on the master server. It got through without
>>> finding any errors.
>>>
>>> 2011/6/2 Michal Borychowski <mic...@ge...>:
>>>> Hi!
>>>>
>>>> In the meantime please run the memtest - we are curious if it
>>>> really was a hardware problem or maybe it could be a software
>>>> problem
>>>>
>>>> Regards
>>>> Michal
>>>>
>>>> -----Original Message-----
>>>> From: Tuukka Luolamo [mailto:tlu...@gm...]
>>>> Sent: Thursday, June 02, 2011 9:55 AM
>>>> To: WK
>>>> Cc: moo...@li...
>>>> Subject: Re: [Moosefs-users] Problems after power failure
>>>>
>>>> OK I put in place the file the dev sent me and cannot see any data
>>>> loss...
>>>>
>>>> I found the file in question, the one I got in the error, and it
>>>> seems fine.
>>>>
>>>> The whole system is up and functioning.
>>>>
>>>> I run the system on an old desktop computer and another PC I bought
>>>> for $25, so the dev recommends making sure you have good memory,
>>>> but I guess I am using whatever I got =) Aside from this error
>>>> everything has been fine. Didn't run the memtest they recommended,
>>>> but I would not count out memory errors.
>>>>
>>>> However I would like to understand the situation better, mainly for
>>>> what my recourses are. As WK articulated already, had my metadata
>>>> been completely corrupt, would I have lost all my data? Would I
>>>> lose just the one file with the error? And can I fix this error
>>>> myself?
>>>>
>>>> Thanks,
>>>>
>>>> Tuukka
>>>>
>>>> On Wed, Jun 1, 2011 at 3:39 PM, WK <wk...@bn...> wrote:
>>>>> On 6/1/2011 2:30 AM, Michal Borychowski wrote:
>>>>>>
>>>>>> We think that this problem could be caused by your RAM in the
>>>>>> master. We recommend using RAM with parity control. You can also
>>>>>> run a test from http://www.memtest.org/ on your server and check
>>>>>> your existing RAM.
>>>>>> Of course, the bit could also have been changed at the
>>>>>> motherboard or CPU level - which is much less probable.
>>>>>>
>>>>>> Also you can see in the log that file 7538 is located between
>>>>>> 7553 and 7555:
>>>>>
>>>>> So in a situation like this, where the metadata is now corrupt:
>>>>>
>>>>> Is the problem fixable with only the loss of the one file? (And
>>>>> how does one fix it?)
>>>>>
>>>>> Or is his entire MFS setup completely corrupt, and he would need
>>>>> to have had a backup?
>>>>>
>>>>> Can I assume that older archived versions of the metadata.mfs
>>>>> could be used to recover most of the files?
>>>>>
>>>>> -bill
>>>>>
>>>>> _______________________________________________
>>>>> moosefs-users mailing list
>>>>> moo...@li...
>>>>> https://lists.sourceforge.net/lists/listinfo/moosefs-users
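For reference, the chunk-size arithmetic from Michal's 2011/7/7 note can be sketched in Python. This is only an illustration under stated assumptions: "kB" is read as 1024 bytes, the 5kB is treated as a header at the start of the chunk file, and the function names are made up for this example rather than taken from MooseFS.

```python
# Sketch of the (5 + 64*n) kB chunk-length rule quoted in this thread.
# Assumptions (not confirmed by the thread): kB means 1024 bytes, and the
# 5 kB sits at the start of the chunk file as a header.
HEADER_BYTES = 5 * 1024    # the 5 kB that "needs to be removed"
BLOCK_BYTES = 64 * 1024    # chunk length grows in 64 kB steps
MAX_BLOCKS = 1024          # n can be 0 up to 1024 (64 MB of data)


def chunk_file_size(data_len: int) -> int:
    """Size on disk of a chunk holding data_len bytes of file data."""
    if not 0 <= data_len <= MAX_BLOCKS * BLOCK_BYTES:
        raise ValueError("data does not fit in a single chunk")
    n = -(-data_len // BLOCK_BYTES)  # round up to whole 64 kB blocks
    return HEADER_BYTES + n * BLOCK_BYTES


def blocks_in_chunk(file_size: int) -> int:
    """Recover n from a chunk file size, rejecting sizes off the grid."""
    payload = file_size - HEADER_BYTES
    if payload < 0 or payload % BLOCK_BYTES != 0:
        raise ValueError("size is not of the form (5 + 64*n) kB")
    n = payload // BLOCK_BYTES
    if n > MAX_BLOCKS:
        raise ValueError("more than 1024 blocks")
    return n


def strip_header(chunk_bytes: bytes) -> bytes:
    """Drop the leading 5 kB; the result is still padded to 64 kB blocks."""
    return chunk_bytes[HEADER_BYTES:]
```

For example, a 100 kB file would need two 64 kB blocks, so its chunk file would come to 5 + 128 = 133 kB. The other step Michal mentions, checking where the file actually ends inside the padded data, is not covered by this sketch.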