From: Allen L. <lan...@gm...> - 2011-11-04 15:37:55
My chunk servers are on different machines, I actually have a setup that's supposed to be pretty resilient. 4 Chunkservers, 1 master, 1 metalogger, and 1 app server running mfsmount -- all separate machines (VMs).

I too lost all the data. The chunks survived, but since the metadata was corrupt, all the files were lost and the chunks marked for recycling (goal 0).

I should have paid more attention to the default drive the master was using for metadata, but a full-on crash just due to disk space exhaustion on the master seems excessive.

On 11/4/2011 10:52 AM, Steve wrote:
> This happened to me too, on a mfsmaster running on a small silicon drive.
> Lost the lot. A powercut causing a mfschunkserver failure creating extra
> logging.
>
> Looks like a repeatable problem that needs addressing.
>
> -------Original Message-------
>
> > From: Allen Landsidel
> > Date: 04/11/2011 14:37:09
> > To: moo...@li...
> > Subject: [Moosefs-users] Out of disk space on master / recovery failed
> >
> > So I didn't plan ahead well and ended up with /var filling up on my
> > master over night, causing the master to crash.
> >
> > mfsmetarestore refused to recover the system, I think because it didn't
> > get a chance to write out the metadata file. It seems there's something
> > wrong with the way it's doing writes. After the crash both the
> > metadata.mfs and metadata.mfs.back were 0 bytes, and mfsmetarestore
> > (obviously) refused to read from them.
> >
> > Some but not all of the changelog files were 0 bytes as well. Same
> > story on the backup (metalogger) server.
> >
> > Just a heads up, I think a little more checking would be in order here
> > to make sure there is space available for the metadata, and at least to
> > prevent the master from crashing when/if it can't write the metadata.
> > If it had stayed up with all the metadata in memory I could've seen the
> > disk issue and brought up another metalogger with more disk space to
> > catch up and take over.
> >
> > _______________________________________________
> > moosefs-users mailing list
> > moo...@li...
> > https://lists.sourceforge.net/lists/listinfo/moosefs-users
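
For anyone wondering what "a little more checking" could look like in practice, here is a minimal sketch of the general idea: check free space first, write the new copy to a temporary file, and only then rename it over the old one, so a failed write never leaves a 0-byte metadata file behind. This is not MooseFS code and says nothing about how mfsmaster actually writes its metadata. The function name `safe_write_metadata` and the `min_free_bytes` parameter are made up for illustration, and the example is in Python only for brevity.

```python
import os
import shutil
import tempfile


def safe_write_metadata(path, data, min_free_bytes=0):
    """Replace `path` with `data` without truncating the existing file first.

    Hypothetical helper for illustration only; not part of MooseFS.
    """
    directory = os.path.dirname(os.path.abspath(path))

    # Preflight: refuse early if the filesystem clearly cannot hold the new copy.
    free = shutil.disk_usage(directory).free
    if free < len(data) + min_free_bytes:
        raise OSError(f"not enough free space in {directory}: {free} bytes free")

    # Write to a temporary file in the same directory so the final rename
    # stays on one filesystem and is atomic on POSIX systems.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".metadata-", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk before renaming
        os.replace(tmp_path, path)  # the old file stays intact until this succeeds
    except BaseException:
        # The previous file has not been touched; just drop the partial copy.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

With this pattern a caller can catch the `OSError`, keep running with the data it still holds in memory, and log the problem, which is roughly the behaviour asked for above: the old file on disk remains readable either way.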