From: Ken <ken...@gm...> - 2012-05-02 02:32:52
Hi,

I am fairly sure the mfschunkserver crash is caused by the issue discussed in:
http://sourceforge.net/mailarchive/message.php?msg_id=29149150

>> How do I figure out which file(s) are in that chunk for example?

Maybe this will help:

    find mount-point -exec mfsfileinfo {} \;

>> I assume repairing it would be done with mfsfilerepair?

About the log line:

    0000000000000AA5_00000005 - invalid copy on (10.0.0.1 - ver:00000003)

I think mfsfilerepair may work; an earlier version of the file may help.

This is not an official answer, so please correct me if I have made a mistake.

-Ken

On Sun, Apr 29, 2012 at 4:26 PM, Jens Kristian Søgaard <je...@me...> wrote:
> Hi again,
>
>> I received this error message:
>> backups: master query: receive error
>
> A bit more happened after I wrote this message. First I noticed that all
> writes to files on the mfs mount hung. I had to force unmount the file
> system, but even after remounting mfs writes would still not come through.
>
> In the end I had to force unmount the filesystem everywhere, and then
> close down the chunkservers, the metalogger and the master. When I
> closed down the chunkservers with "mfschunkserver stop", I got the
> following log message:
>
> Apr 29 00:23:16 localhost kernel: [284815.544120] mfschunkserver[2229]:
> segfault at 0 ip 0000000000411997 sp 00007fff1429b598 error 4 in
> mfschunkserver[400000+2a000]
>
> I got the exact same error message on all chunkservers with only the
> stack pointer different in each one.
>
> After starting the master, metalogger and chunkservers again I was
> greeted with some bad news.
> The log shows messages like this:
>
> Apr 29 05:47:45 localhost mfsmaster[5036]: chunk 0000000000000AA5 has
> only invalid copies (2) - please repair it manually
> Apr 29 05:47:45 localhost mfsmaster[5036]: chunk
> 0000000000000AA5_00000005 - invalid copy on (10.0.0.1 - ver:00000003)
> Apr 29 05:47:45 localhost mfsmaster[5036]: chunk
> 0000000000000AA5_00000005 - invalid copy on (10.0.0.2 - ver:00000004)
>
> How do I figure out which file(s) are in that chunk for example?
>
> I assume repairing it would be done with mfsfilerepair?
>
>
> On the web interface is now displayed under "Important messages":
>
> unavailable chunks: 4543067
> unavailable files: 4749936
>
> How can I find out which files are affected?
>
> Weirdly in the "Regular chunks state matrix" in the web interface only 6
> chunks show as having 0 valid copies.
>
> I know that nothing happened to the disks backing the chunk servers at
> any point (they're RAID-5 arrays). So it sounds weird to me that 6
> blocks went missing?
>
> The web interface lists 14 file names under "Important messages" as
> "currently unavailable file". But if I check them with mfsfileinfo and
> mfscheckfile - they look fine?
>
> I hope someone could help me a bit in the right direction!
> I'm new to moosefs and currently testing it.
>
> Thanks in advance,
> --
> Jens Kristian Søgaard, Mermaid Consulting ApS,
> je...@me...,
> http://www.mermaidconsulting.com/
>
> _______________________________________________
> moosefs-users mailing list
> moo...@li...
> https://lists.sourceforge.net/lists/listinfo/moosefs-users
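
A rough sketch of the find/mfsfileinfo idea from the reply, in case it is useful: it runs mfsfileinfo on every regular file under a mount point and keeps the files whose output mentions a given chunk ID. This is only an illustration under assumptions, not an official MooseFS tool. The function names, the example mount point and the assumed output layout (an unindented "path:" header line followed by indented "chunk ..." lines) are all mine and are not taken from the thread, so check them against the real output of your mfsfileinfo version first.

```python
import subprocess
from pathlib import Path
from typing import List


def files_with_chunk(fileinfo_output: str, chunk_id: str) -> List[str]:
    """Return the file names whose mfsfileinfo block mentions chunk_id.

    Assumption: each file starts with an unindented line ending in ':'
    (the path), followed by indented lines describing its chunks.
    """
    wanted = chunk_id.upper()
    matches: List[str] = []
    current = None
    for line in fileinfo_output.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace() and line.rstrip().endswith(":"):
            # Header line: remember which file the next lines belong to.
            current = line.rstrip()[:-1]
        elif current is not None and wanted in line.upper():
            # Indented line that mentions the chunk ID we are looking for.
            if current not in matches:
                matches.append(current)
    return matches


def scan(mount_point: str, chunk_id: str) -> List[str]:
    """Run mfsfileinfo on each regular file below mount_point (slow on big trees)."""
    hits: List[str] = []
    for path in Path(mount_point).rglob("*"):
        if not path.is_file():
            continue
        result = subprocess.run(
            ["mfsfileinfo", str(path)], capture_output=True, text=True
        )
        hits.extend(files_with_chunk(result.stdout, chunk_id))
    return hits
```

For the chunk in the log above, a call could look like scan("/mnt/mfs", "0000000000000AA5"), where /mnt/mfs is a placeholder for the real mount point. With millions of files this will take a long time, since it starts one mfsfileinfo process per file.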