From: Ken <ken...@gm...> - 2012-03-13 03:57:37
>> The upcoming version has this feature, ....

Glad to hear this again. I noticed this in January.
I really appreciate the work of the MooseFS team, and we are very much looking forward to that version.

Best Regards
-Ken

On Tue, Mar 13, 2012 at 11:37 AM, Davies Liu <dav...@gm...> wrote:
>
> On Tue, Mar 13, 2012 at 10:21 AM, Ken <ken...@gm...> wrote:
>
>> Hi, Davies and MooseFS
>>
>> I agree with Davies' opinion in the previous mail: undeleting metadata.mfs.bak may be
>> the best way.
>>
>> The recovery was very simple.
>>
>> After downloading the damaged metadata from ChenGang, I began digging into the
>> code and commented out a few "return -1" lines (filesystem.c) in fs_loadnodes,
>> fs_loadedges and chunk_load (procedures used in restore).
>> I also printed some counts, such as the fsnodes count and the edge count. Now we know the inode
>> information is complete, half of the edges are lost, and the chunks are totally lost.
>>
>> The most important information is the inode ==> chunk id(s) mapping. It is complete,
>> so the data will be kept.
>>
>
> Great work! I just did not notice that the relations between inodes and
> chunk ids are in the nodes;
> I thought they were in the chunk block.
>
> I'm very sorry for the wrong conclusion I gave Chen Gang; he might have lost
> data without your help :-(
> And thank you for the wonderful hack.
>
>> Then I collected all the chunk ids and version info from the disks of the chunkservers and
>> wrote them (using a Python script) to a single file (chunks.bin) in the same format
>> as metadata.mfs. Here I made a mistake because of duplicate chunks
>> caused by goal 2.
>>
>> I made some dirty changes to chunk_load (in chunks.c) so that it loads chunks.bin
>> instead of "metadata.mfs".
>> Finally I executed "mfsmetarestore -o metadata.mfs -m metadata.mfs.part". We
>> got the metadata.mfs.
>>
>> This took me almost 4 hours.
>>
>
> Great efficiency. He should buy you a drink :)
>
>>
>> During the accident, I noticed a few things:
>> a. The cause was a full disk on the master server. Why did the damage occur on the
>> metalogger? Take a close look at fs_storeall (filesystem.c).
>> Why unlink("metadata.mfs") after writing the metadata failed?
>>
>
> The upcoming version may have already fixed this issue; it needs more checking.
>
>> b. Keeping more copies of metadata.mfs on the metalogger and mfsmaster may be good.
>>
>
> The upcoming version has this feature, and I have patched my cluster also.
>
>> c. Splitting the huge metadata.mfs into 3 files (inode, edge, chunk) may be
>> beneficial for diagnosis and performance optimization, I guess.
>>
>> Any suggestions?
>>
>> Best Regards.
>> -Ken
>>
>> On Tue, Mar 13, 2012 at 9:53 AM, Davies Liu <dav...@gm...> wrote:
>>
>>> Hi,
>>>
>>> Congratulations!
>>>
>>> Can you share some details about how you recovered the data?
>>>
>>> Davies
>>>
>>> On Tue, Mar 13, 2012 at 4:24 AM, 陈钢 <yik...@gm...> wrote:
>>>
>>>> Hi all. I have some news to report.
>>>>
>>>> The situation I faced was that my metadata.mfs got broken on
>>>> "mfsmaster restart", including the metadata.mfs on the mfsmetalogger.
>>>>
>>>> In fact, the metadata on the mfsmetalogger was "more broken" than the
>>>> metadata on the mfsmaster. That is weird.
>>>>
>>>> So I got all my files back, but half of them lost their filenames.
>>>>
>>>> Luckily, ken...@gm... helped me, and I have a SQLite file that
>>>> stores file sizes and filenames. I have already restored 90 percent of my important
>>>> data.
>>>>
>>>> I also wrote a script that copies the metadata.mfs and then rsyncs it to
>>>> another server every hour, just like the suggestion in
>>>> http://www.moosefs.org/moosefs-faq.html#metadata-backup.
>>>>
>>>> Metadata is really important; it is worth an incremental backup.
>>>>
>>>> 2012/3/9 陈钢 <yik...@gm...>
>>>>
>>>>> Maybe 80% of the data can be found. I am still trying.
>>>>> I tried restoring every metadata.mfs I have; it just did not work.
>>>>>
>>>>> 2012/3/8 Olivier Thibault <Oli...@lm...>
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Did you solve your problem?
>>>>>> A few days ago I had an mfsmaster crash when it ran out of memory.
>>>>>> When I tried to restart, it crashed saying that there was no
>>>>>> metadata.mfs file.
>>>>>> I tried "mfsmetarestore -a". It didn't work, saying that
>>>>>> metadata.mfs.back was corrupted.
>>>>>> There was a metadata.mfs.back file and a metadata.mfs.back.tmp file.
>>>>>> metadata.mfs.back was half the size it should be.
>>>>>> I restored the metadata.mfs file from a daily backup, then ran
>>>>>> 'mfsmetarestore -a' again, and this time it worked. I could then start mfsmaster
>>>>>> successfully.
>>>>>> Did you try that? I mean, just restoring the latest working
>>>>>> metadata.mfs file?
>>>>>>
>>>>>> HTH.
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Olivier
>>>>>>
>>>>>> On 07/03/12 04:20, 陈钢 wrote:
>>>>>>
>>>>>>> In the master's metadata.mfs, I saw that the nodes part is complete, with part of
>>>>>>> the names, no free, no chunks..
>>>>>>>
>>>>>>> 2012/3/7 Davies Liu <dav...@gm...>
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I tried to recover it from metadata.mfs and
>>>>>>> metadata_ml.mfs.back, but failed.
>>>>>>>
>>>>>>> Because the disk was full, mfsmaster did not dump all the metadata
>>>>>>> to disk;
>>>>>>> there is part of the nodes in metadata.mfs, no names, no edges, no chunks.
>>>>>>> There is no hope of recovering from the broken metadata.mfs; it's too short.
>>>>>>>
>>>>>>> The changelogs do not help either; they only cover two days.
>>>>>>>
>>>>>>> The only hope is to try to undelete the previous
>>>>>>> metadata_ml.mfs.back on the
>>>>>>> metalogger machine. Chenggang has also failed with extundelete
>>>>>>> and ext3grep;
>>>>>>> maybe some experts can achieve this.
>>>>>>>
>>>>>>> The final option is to GUESS the relation between files and
>>>>>>> chunks from the
>>>>>>> chunk ids and file sizes, if the data loss cannot be afforded.
>>>>>>> Each chunk
>>>>>>> combines a CRC checksum with the real data; if we know the relation
>>>>>>> between
>>>>>>> files and chunkserver, then we can get the data back.
>>>>>>> Or we can construct the
>>>>>>> metadata according to the GUESS, then use mfsmaster to recover
>>>>>>> them.
>>>>>>>
>>>>>>> Davies
>>>>>>>
>>>>>>> On Wed, Mar 7, 2012 at 10:56 AM, Ken <ken...@gm...> wrote:
>>>>>>>
>>>>>>> Hi, chengang
>>>>>>>
>>>>>>> I think you should try more, and post the details here. Someone
>>>>>>> must be able to resolve it.
>>>>>>> Maybe you will lose some data from the last few minutes, but the 250T
>>>>>>> should be saved.
>>>>>>>
>>>>>>> First, BACK UP all files:
>>>>>>> /var/lib/mfs/* on master
>>>>>>> /var/lib/mfs/* on mfsmetalogger
>>>>>>>
>>>>>>> About the restore error:
>>>>>>>
>>>>>>>     file 'metadata.mfs.back' not found - will try 'metadata_ml.mfs.back' instead
>>>>>>>     loading objects (files,directories,etc.) ... loading node: read error: ENOENT (No such file or directory)
>>>>>>>     error
>>>>>>>     can't read metadata from file: .//metadata_ml.mfs.back
>>>>>>>
>>>>>>> How did you run mfsmetarestore? Did you add the -d option?
>>>>>>> If stat(datapath + metadata_ml.mfs.back) fails, this error
>>>>>>> will occur.
>>>>>>> Maybe strace will show exactly why stat fails.
>>>>>>>
>>>>>>> PS: I am in Beijing now and I can provide more help.
>>>>>>>
>>>>>>> HTH
>>>>>>>
>>>>>>> -Ken
>>>>>>>
>>>>>>> On Wed, Mar 7, 2012 at 10:21 AM, 陈钢 <yik...@gm...> wrote:
>>>>>>>
>>>>>>> I cannot start mfsmaster with the file "78962688 Mar 6 17:18 metadata.mfs".
>>>>>>> I tried that. :(
>>>>>>>
>>>>>>> 2012/3/7 Ricardo J. Barberis <ric...@da...>
>>>>>>>
>>>>>>> On Tuesday 06/03/2012, 陈钢 wrote:
>>>>>>> > on master
>>>>>>> [ ... ]
>>>>>>> > -rw-r----- 1 mfs  mfs  78962688 Mar 6 17:18 metadata.mfs
>>>>>>> > -rw-r--r-- 1 root root        8 Jul 4  2011 metadata.mfs.empty
>>>>>>> > -rw-r----- 1 mfs  mfs      5984 Mar 6 12:00 sessions.mfs
>>>>>>> > -rw-r----- 1 mfs  mfs         0 Mar 6 16:46 sessions.mfs.tmp
>>>>>>> > -rw-r----- 1 mfs  mfs    131072 Mar 6 17:18 stats.mfs
>>>>>>>
>>>>>>> You have /var/lib/mfs/metadata.mfs on the master; it
>>>>>>> might not be corrupt after all?
>>>>>>>
>>>>>>> I'd suggest:
>>>>>>>
>>>>>>> - back up /var/lib/mfs to another disk/server (for later recovery if needed)
>>>>>>> - make sure you have free space on your main disk
>>>>>>> - then simply try to start mfsmaster
>>>>>>> - check mfs.cgi (web interface) to see if it looks OK
>>>>>>>
>>>>>>> BUT: if you can, wait for confirmation from Michał Borychowski
>>>>>>> first, in case what I'm telling you is not safe.
>>>>>>>
>>>>>>> (BTW: You have Reply-To set to che...@cp...; I don't know if that's
>>>>>>> intentional on your part)
>>>>>>>
>>>>>>> Hope it helps,
>>>>>>> --
>>>>>>> Ricardo J. Barberis
>>>>>>> Senior SysAdmin / ITI
>>>>>>> Dattatec.com :: Soluciones de Web Hosting
>>>>>>> Tu Hosting hecho Simple!
>>>>>>>
>>>>>>> ------------------------------------------
>>>>>>>
>>>>>>> ------------------------------------------------------------------------------
>>>>>>> Virtualization & Cloud Management Using Capacity Planning
>>>>>>> Cloud computing makes use of virtualization - but cloud computing
>>>>>>> also focuses on allowing computing to be delivered as a service.
>>>>>>> http://www.accelacomm.com/jaw/sfnl/114/51521223/
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> moosefs-users mailing list
>>>>>>> moosefs-users@lists.sourceforge.net
>>>>>>> https://lists.sourceforge.net/lists/listinfo/moosefs-users
>>>>>>>
>>>>>>> --
>>>>>>> - Davies
>>>>>>>
>>> --
>>> - Davies
>>>
> --
> - Davies
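
A note on the chunk collection step in the recovery described above (ids and versions gathered from the chunkserver disks, with duplicates appearing because of goal 2). The script actually used is not shown in this thread, so the following is only a minimal sketch of what such a scan might look like. It assumes chunk files are named chunk_<16 hex digits of chunk id>_<8 hex digits of version>.mfs; that naming and the function names are assumptions, so check them against your own chunkserver directories. It does not write chunks.bin, since the binary layout is not given here.

```python
import os
import re

# ASSUMPTION: chunk files are named chunk_<16 hex chunk id>_<8 hex version>.mfs
CHUNK_NAME = re.compile(r"^chunk_([0-9A-Fa-f]{16})_([0-9A-Fa-f]{8})\.mfs$")


def parse_chunk_name(name):
    """Return (chunk_id, version) parsed from a file name, or None if it is not a chunk."""
    match = CHUNK_NAME.match(name)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


def collect_chunks(roots):
    """Walk the given data directories and map each chunk id to the highest version seen."""
    chunks = {}
    for root in roots:
        for _dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                parsed = parse_chunk_name(name)
                if parsed is None:
                    continue  # skip anything that is not a chunk file
                chunk_id, version = parsed
                # With goal 2 the same chunk exists on two servers: keep a single entry.
                if version > chunks.get(chunk_id, -1):
                    chunks[chunk_id] = version
    return chunks
```

Calling collect_chunks() with the data directories copied from every chunkserver gives one (id, version) pair per chunk, which avoids the duplicate-entry mistake mentioned for goal 2.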
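
A note on point (a) above, the question of why metadata.mfs is unlinked after a failed write. This is not the MooseFS code, only a small Python illustration of the usual alternative: write the new dump to a temporary file, and replace the old file only after the write has fully succeeded. The function name store_atomically and the ".tmp" suffix are made up for this sketch.

```python
import os


def store_atomically(path, data):
    """Write data (bytes) to path; on any failure the previous file is left untouched."""
    tmp = path + ".tmp"
    try:
        with open(tmp, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before renaming
    except Exception:
        # For example a full disk: remove only the partial temp file, never the old dump.
        if os.path.exists(tmp):
            os.remove(tmp)
        raise
    # The rename happens only after a complete, synced write.
    os.replace(tmp, path)
```

With this pattern a full disk costs you the newest dump, but the last good metadata file stays in place.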