Re: [Moosefs-users] Question about MooseFS metadata distribution/recovery

On 12/12/2011 3:36 PM, René Pfeiffer wrote:
> Yes, it did, but I get the "D" state for processes accessing the mount,
> too. The logs show messages of the type "chunk xyz has only invalid copies
> (1) - please repair it manually", so I guess the metadata is still not
> correct (IP addresses and names of the chunk servers haven't changed).
>
> The biggest problem is that we cannot figure out what the RAID controller
> exactly did to the file system of the master server, and we haven't found
> any traces of a more recent metadata file. The metalogger system had no
> problem, but can it be that the metalogger was/is out of sync due to the
> silent file system corruption on the master system?

That is a question for the devs, but early in our MFS testing with
essentially throwaway kit, we had a master fail with a broken RAID. In
that case the underlying disk system had been effectively read-only for
a few days and there was no recent data in /usr/local/var/mfs.

However, the metalogger DID have accurate information, and we simply
recovered from that data using the restore process and then copied the
resulting metadata file over to the now-fixed master. Except for the
'on the fly' files lost when the damn thing crashed, no other data was
lost, including files that had been received and written to the
chunkservers during the time the disk subsystem was out of order.
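
For anyone wanting to script that restore step, here is a rough sketch
rather than the exact procedure we ran. It assumes the MooseFS 1.6.x
mfsmetarestore tool with its -m and -o options, and the usual
metalogger file names (metadata_ml.mfs.back, changelog_ml.*.mfs). The
helper name build_restore_command is made up for illustration. It only
builds the command line; it does not run anything.

```python
from pathlib import Path


def build_restore_command(metalogger_dir, output_name="metadata.mfs"):
    """Build an argv list for mfsmetarestore from a metalogger data dir.

    Assumptions (not confirmed by the thread): the metalogger keeps its
    metadata snapshot as metadata_ml.mfs.back and its changelogs as
    changelog_ml.<N>.mfs, as in MooseFS 1.6.x.
    """
    data_dir = Path(metalogger_dir)
    backup = data_dir / "metadata_ml.mfs.back"
    if not backup.exists():
        raise FileNotFoundError(f"no metadata snapshot found at {backup}")

    # Keep only changelogs whose middle part is a number, e.g. changelog_ml.3.mfs.
    changelogs = [
        p for p in data_dir.glob("changelog_ml.*.mfs")
        if p.name.split(".")[1].isdigit()
    ]
    # Sort numerically so that .10 does not land before .2.
    changelogs.sort(key=lambda p: int(p.name.split(".")[1]))

    return (
        ["mfsmetarestore", "-m", str(backup), "-o", str(data_dir / output_name)]
        + [str(p) for p in changelogs]
    )
```

Calling build_restore_command("/usr/local/var/mfs") on the metalogger
returns a list you can inspect and then hand to subprocess.run(). The
resulting metadata.mfs is what gets copied to the repaired master
before starting mfsmaster again.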

So my guess is that the metaloggers get their info from the master's
memory, not from a file on the master.

But that is something that should be confirmed by the devs.
