From: René P. <ly...@lu...> - 2011-12-11 18:13:41
Hello!

We have the following scenario with a MooseFS deployment:

- 3 servers
- server #3 : 1 master
- server #2 : chunk server
- server #1 : chunk server, one metalogger process running on this node

Server #3 suffered from a hardware RAID failure, including a trashed JFS file system where the master logs were stored. We tried
- recovering the log/state data from server #1 and restarting the master process, and
- using the data from the metalogger process and starting a new master on server #1 (including changing the configs to use the new master as described in the documentation).

The MooseFS mount is accessible, but some file operations involving data end with processes hanging in the "D" state (uninterruptible sleep). Logs on the chunk servers show that MooseFS is missing data, although the state of the chunk servers has not changed (no reboot, no file system damage, only a reconfigured master).

Is there a way to extract the data from the chunk directories? Is there an "fsck.mfs" or a similar tool? mfsfilerepair is not useful, because it only zeroed a file which was affected by the "D" state problem.

We're currently trying to recover the original master's directory containing the metadata. Is there any documentation regarding the metadata or chunk data besides the source code? Has anyone experienced a similar situation?

Best regards, René Pfeiffer.
From: René P. <ly...@lu...> - 2011-12-12 19:45:25
On Dec 11, 2011 at 1913 +0100, René Pfeiffer appeared and said:
> We have the following scenario with a MooseFS deployment.
> - 3 servers
> - server #3 : 1 master
> - server #2 : chunk server
> - server #1 : chunk server, one metalogger process running on this node

I forgot to include:
- All storage nodes run Debian 6.0 (x86_64).
- All storage nodes run MooseFS on /var which is JFS.
- MooseFS 1.6.20 is used (compiled from source).

> We're currently trying to recover the original master's directory
> containing the metadata.

Recovery only yielded a metadata.mfs.back file which dates back to 29 June 2011.

Best regards, René Pfeiffer.
From: wkmail <wk...@bn...> - 2011-12-12 23:21:48
On 12/12/2011 11:45 AM, René Pfeiffer wrote:
> Recovery only yielded a metadata.mfs.back file which dates back to 29 June 2011.

What about your metalogger? Didn't that have a more up to date metadata file?

-bill
From: René P. <ly...@lu...> - 2011-12-12 23:37:13
On Dec 12, 2011 at 1520 -0800, wkmail appeared and said:
> What about your metalogger? Didn't that have a more up to date metadata file?

Yes, it did, but I get the "D" state for processes accessing the mount, too. The logs show messages of the type "chunk xyz has only invalid copies (1) - please repair it manually", so I guess the metadata is still not correct (IP addresses and names of the chunk servers haven't changed).

The biggest problem is that we cannot figure out what the RAID controller exactly did to the file system of the master server, and we haven't found any traces of a more recent metadata file. The metalogger system had no problem, but can it be that the metalogger was/is out of sync due to the silent file system corruption on the master system?

Best, René.
From: wkmail <wk...@bn...> - 2011-12-12 23:55:57
On 12/12/2011 3:36 PM, René Pfeiffer wrote:
> The biggest problem is that we cannot figure out what the RAID controller
> exactly did to the file system of the master server, and we haven't found
> any traces of a more recent metadata file. The metalogger system had no
> problem, but can it be that the metalogger was/is out of sync due to the
> silent file system corruption on the master system?

That is a question for the devs, but early in our MFS testing with essentially throwaway kit, we had a master fail with a broken RAID. In that case the underlying disk system had been essentially read-only for a few days and no recent data was in /usr/local/var/mfs.

However, the metalogger DID have accurate information, and we simply recovered using that data via the restore process and then copied the metadata file over to the now fixed master. Except for the 'on the fly' files lost when the damn thing crashed, no other data was lost, including files that had been received and written to the chunkservers during the time the disk subsystem was out of order.

So my guess is that the metaloggers get their info from the master's memory, not from a file on the master. But that is something that should be confirmed by the devs.
From: René P. <ly...@lu...> - 2011-12-13 13:17:39
On Dec 12, 2011 at 1554 -0800, wkmail appeared and said:
> So my guess is that the metaloggers get their info from the masters
> memory, not from a file on the master.

Ok, this might be the reason then, because the master went down hard two times (first time 21 October, second time 9 December) because the RAID controller totally locked the system. I assume this could explain some missing metadata.

> But that is something that should be confirmed by the devs.

Thanks, René.
From: Laurent W. <lw...@hy...> - 2011-12-13 14:13:16
Hi,

I'm no MFS dev, but AFAIK each metadata change on the master is sent immediately to the metalogger and then gets written to disk on the master.

HTH,
--
Laurent Wandrebeck
HYGEOS, Earth Observation Department / Observation de la Terre
Euratechnologies, 165 Avenue de Bretagne, 59000 Lille, France
tel: +33 3 20 08 24 98  http://www.hygeos.com
GPG fingerprint/Empreinte GPG: F5CA 37A4 6D03 A90C 7A1D 2A62 54E6 EF2C D17C F64C
From: Steve <st...@bo...> - 2011-12-14 13:08:36
Hi,

Does anyone have a backup methodology and script to take the metalogger data off the MFS system, perhaps with several generations?
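A minimal sketch of one way to do this, assuming the metalogger keeps its files in /usr/local/var/mfs (the path mentioned elsewhere in this thread); the backup target, retention count, and GNU coreutils behaviour are assumptions:

    #!/bin/sh
    # sketch: snapshot metalogger metadata into dated generation directories
    ML_DIR=/usr/local/var/mfs          # metalogger data directory (assumed)
    BACKUP_ROOT=/backup/mfs-metadata   # backup target (placeholder)
    KEEP=14                            # number of generations to keep

    dest="$BACKUP_ROOT/$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$dest"
    cp -p "$ML_DIR"/metadata_ml.mfs.back "$ML_DIR"/changelog_ml.*.mfs "$dest"/

    # prune old generations (GNU head/xargs assumed for -n -N and -r)
    ls -1d "$BACKUP_ROOT"/*/ | sort | head -n -"$KEEP" | xargs -r rm -rf

Run from cron on the metalogger host, this keeps the newest KEEP snapshots and removes older ones.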
From: Michał B. <mic...@ge...> - 2011-12-14 13:00:59
No, but you can try with something like:

    find . -type f | xargs mfsfilerepair

Kind regards
Michał Borychowski
MooseFS Support Manager, Gemius S.A.

-----Original Message-----
From: René Pfeiffer
Sent: Wednesday, December 14, 2011 1:22 PM
> Is there a way to run mfsfilerepair recursively? The man page did not
> show such an option.
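A slightly more defensive variant of the same idea, assuming GNU find and xargs; the mount point is a placeholder:

    # same as above, but safe for file names containing spaces or newlines
    find /mnt/mfs -type f -print0 | xargs -0 mfsfilerepair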
From: René P. <ly...@lu...> - 2011-12-14 13:24:52
On Dec 14, 2011 at 1400 +0100, Michał Borychowski appeared and said:
> No, but you can try with sth like:
> "find . -type f | xargs mfsfilerepair"

I did that, and we're currently assessing the file system. I asked for the recursive option because it is easier for operators to deal with these situations. Also a "--dry-run" option would be nice (just as rsync has), so mfsfilerepair can generate a preview before an admin decides to run the actual repair task.

Do the *.emergency files have a signature that can be used to recognise them? Since the JFS on the RAID1 volumes has massive metadata damage, we have to scan the content of every file to look for the *.emergency files.

Best regards, René.
From: Michał B. <mic...@ge...> - 2011-12-14 14:07:20
At the beginning of the metadata file there is: "MFSM 1.5"

Kind regards
Michał Borychowski
MooseFS Support Manager, Gemius S.A.

-----Original Message-----
From: René Pfeiffer
Sent: Wednesday, December 14, 2011 2:25 PM
> Do the *.emergency files have a signature that can be used to recognise
> them? Since the JFS on the RAID1 volumes has massive metadata damages, we
> have to scan the content of every file to look for the *.emergency files.
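Given that signature, a minimal sketch of how a salvaged dump could be scanned, assuming the signature occupies the first 8 bytes of the file; the salvage directory is a placeholder:

    # look for files starting with the MooseFS metadata signature "MFSM 1.5"
    find /mnt/salvage -type f | while read -r f; do
        [ "$(head -c 8 "$f" 2>/dev/null)" = "MFSM 1.5" ] && echo "possible metadata file: $f"
    done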
From: Ricardo J. B. <ric...@da...> - 2011-12-14 21:35:19
On Wednesday, 14 December 2011, René Pfeiffer wrote:
> I did that, and we're currently assessing the file system. I asked for the
> recursive option, because it is easier for operators to deal with these
> situations. Also a "--dry-run" option would be nice (just as rsync has), so
> mfsfilerepair can generate a preview before an admin decides to run the
> actual repair task.

Exactly my concern. Whenever I get unavailable chunks (not many times, fortunately) but I don't know which files they belong to, I do this:

    find /path/to/mfs/mount -type f | xargs mfsfileinfo > /tmp/mfsfileinfo.log
    grep -B2 "no valid copies" /tmp/mfsfileinfo.log

And then I mfsfilerepair only those files that have chunks with invalid copies. But I don't know beforehand whether the missing chunk will be zeroed or restored from another (maybe older) version, so a --dry-run would be more than welcome :)

Regards,
--
Ricardo J. Barberis
Senior SysAdmin / ITI
Dattatec.com :: Soluciones de Web Hosting
Tu Hosting hecho Simple!
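For reference, a small sketch that glues these two steps into a review-then-repair sequence, as a stand-in for the missing --dry-run. It assumes mfsfileinfo prints each file name on its own line ending with a colon (check your version's output first); the paths are placeholders:

    MNT=/path/to/mfs/mount
    LOG=/tmp/mfsfileinfo.log

    # collect chunk status for every file
    find "$MNT" -type f -print0 | xargs -0 mfsfileinfo > "$LOG"

    # extract only the names of files whose chunks have no valid copies
    grep -B2 "no valid copies" "$LOG" | grep "^$MNT" | sed 's/:$//' | sort -u > /tmp/damaged-files.txt

    # review /tmp/damaged-files.txt first, then repair only those files:
    # xargs -d '\n' mfsfilerepair < /tmp/damaged-files.txt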
From: Michał B. <mic...@ge...> - 2011-12-14 08:38:16
Hi!

This is still too little information about how we could help you... What about the metaloggers? Don't you have metadata_ml.mfs.back and changelog_ml.*.mfs files? You could put them on FTP as a tar.gz and give us a link so that we can try to recover them.

In "emergency" situations MooseFS tries to write the metadata file on the master machine in these locations:

/metadata.mfs.emergency
/tmp/metadata.mfs.emergency
/var/metadata.mfs.emergency
/usr/metadata.mfs.emergency
/usr/share/metadata.mfs.emergency
/usr/local/metadata.mfs.emergency
/usr/local/var/metadata.mfs.emergency
/usr/local/share/metadata.mfs.emergency

You can try to find them there, but as the RAID went broken it is possible you won't see them. And it is impossible to recover metadata from the chunks themselves. But file data are in the chunks, so if you need to find something specific it would be possible. Each chunk has a 5 kB header (plain text), so a simple 'grep' would be enough to find what you are looking for.

BTW. How much data was kept on this MooseFS installation?

Kind regards
Michał Borychowski
MooseFS Support Manager, Gemius S.A.

-----Original Message-----
From: René Pfeiffer
Sent: Tuesday, December 13, 2011 2:17 PM
> Ok, this might be the reason then, because the master went down hard two
> times (first time 21 October, second time 9 December) because the RAID
> controller totally locked the system. I assume this could explain some
> missing metadata.
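A minimal sketch of that grep approach on a chunkserver, assuming the usual chunk file naming (chunk_<id>_<version>.mfs); the data directory is a placeholder and the search string stands for any content you know was in the lost file:

    # list chunk files that contain a known piece of content
    find /var/mfschunks -type f -name 'chunk_*.mfs' -print0 \
        | xargs -0 grep -l --binary-files=text "some known string"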
From: René P. <ly...@lu...> - 2011-12-16 17:26:26
Hello, Michał! I have an update on the metadata case.

On Dec 14, 2011 at 0937 +0100, Michał Borychowski appeared and said:
> This is still too little information about how we could help you... What
> about metaloggers? Don't you have metadata_ml.mfs.back and
> changelog_ml.*.mfs files? You could put them on ftp as tar.gz and give us
> a link so that we try to recover them.

I will take some time to compile this information, because the servers in question are managed by different teams (long story) and the coordination did not run smoothly (i.e. the metalogger's logs might have been overwritten, because MooseFS continued to be used after the master server crash). I searched through 129 GB of salvaged data from the master and found not a single file with a metadata signature.

> In "emergency" situations MooseFS tries to write metadata file on the
> master machine in these locations: ... You can try to find them there, but
> as RAID went broken it is possible you won't see them.

Correct, we've not found any *emergency* files on the dump either. Since the state of the MooseFS moved back in time to 28 June 2011 we suspect that this is when the data corruption started, and since no warnings or errors were logged by the servers or the hardware controllers, no one noticed.

Best, René.
From: René P. <ly...@lu...> - 2011-12-14 10:00:56
On Dec 14, 2011 at 0937 +0100, Michał Borychowski appeared and said:
> This is still too little information about how we could help you...

Sorry, I know, it's been a bit stressful since the server failure.

> What about metaloggers? Don't you have metadata_ml.mfs.back and
> changelog_ml.*.mfs files?

Yes, I have, and we used the metalogger's data in order to set up a new master, but the access errors (invalid chunks and such) stayed.

> You could put them on ftp as tar.gz and give us a link so that we try to
> recover them.

I will prepare an archive and make it available.

> In "emergency" situations MooseFS tries to write metadata file on the
> master machine in these locations: [...] You can try to find them there,
> but as RAID went broken it is possible you won't see them.

The data rescue company gave us the salvaged data and we found not a single *.emergency file on the dump. The file might have been renamed though. I suspect that the RAID controller suffered from a data corruption condition for weeks prior to completely failing. Not even the Linux kernel got sensible warnings from the block device. Since the RAID controller totally locked up the system, I'm not sure if the master process was able to write the data to disk (especially because all storage containers were managed by the faulty RAID controller on this server).

> And it is impossible to recover metadata from the chunks themselves.

Yes, I know, but I just asked because I thought that there's a way to scan the data on the chunkservers themselves.

> But file data are in the chunks so if you need to find something specific
> it would be possible. Each chunk has 5kb header (plain text) so simple
> 'grep' would be enough to find what you are looking at.

We mainly deal with image files (JPG and PNG). I guess we'll walk through the chunks this way.

> BTW. How much data was kept on this MooseFS installation?

About 7.7 GB, mostly image files and a few videos. The images are the most important data, but the application dealing with these files unfortunately used hashes and a directory structure similar to the Squid proxy (multiple directories, path based on the hashes) to store the data. I'm not sure if the developers can reconstruct this information without the metadata.

Best regards, René.
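As a rough illustration of "walking through the chunks" for images, a sketch that skips the 5 kB chunk header Michał mentions and checks whether the payload starts with JPEG or PNG data. It assumes the header is exactly 5120 bytes and that a small image begins at the start of a chunk's data area; both are assumptions, and the paths are placeholders:

    mkdir -p /srv/carved
    for c in /var/mfschunks/*/chunk_*.mfs; do
        # skip the assumed 5 kB (5120-byte) chunk header
        dd if="$c" of=/tmp/payload bs=1024 skip=5 2>/dev/null
        case "$(file -b /tmp/payload)" in
            JPEG*|PNG*)
                # note: the payload still carries the chunk's full length,
                # so trailing padding may need trimming afterwards
                cp /tmp/payload "/srv/carved/$(basename "$c" .mfs).img"
                ;;
        esac
    done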
From: Michał B. <mic...@ge...> - 2011-12-14 10:59:43
If you have "invalid copies" you should use mfsfilerepair, which will put the proper file version in the metadata. It's important that you have all chunkservers running while doing mfsfilerepair.

Kind regards
Michał Borychowski
MooseFS Support Manager, Gemius S.A.

-----Original Message-----
From: René Pfeiffer
Sent: Wednesday, December 14, 2011 11:01 AM
> Yes, I have, and we used the metalogger's data in order to set up a new
> master, but the access errors (invalid chunks and such) stayed.
From: René P. <ly...@lu...> - 2011-12-14 12:22:25
On Dec 14, 2011 at 1159 +0100, Michał Borychowski appeared and said:
> If you have "invalid copies" you should use mfsfilerepair which will put
> in metadata proper file version. It's important that you have all
> chunkservers running while doing mfsfilerepair.

Is there a way to run mfsfilerepair recursively? The man page did not show such an option.

Best regards, René.
From: René P. <ly...@lu...> - 2012-02-28 11:57:06
Hello! Here's a follow-up and a warning to everyone deploying servers. It's not meant to be yet another war story; it should illustrate what to look for when deploying MooseFS master servers.

On Dec 11, 2011 at 1913 +0100, René Pfeiffer appeared and said:
> We have the following scenario with a MooseFS deployment.
> - 3 servers
> - server #3 : 1 master
> - server #2 : chunk server
> - server #1 : chunk server, one metalogger process running on this node
>
> Server #3 suffered from a hardware RAID failure including a trashed JFS
> file system where the master logs were on.

We have spent a couple of days diagnosing the failure with the server vendor and the ISP that did the provisioning. Apparently the RAID meltdown was due to firmware bugs, since the servers were deployed without any firmware upgrades (classic case of communication failure). After recovery, all firmwares on the servers were upgraded. After that the RAID failed again. This time the controller removed 2 out of 4 disks because they weren't responding. Since the 2 removed disks formed a complete RAID1 container, the server went out of service again (this time not as master but only as metalogger).

Analysis yielded that storage server #3 is the only one with 2 TB disks, which were _not_ approved by the hardware vendor (the ISP used third-party disks to boost storage capacity). Storage servers #1 and #2 run 1 TB disks approved by the hardware vendor. Apparently the RAID controller firmware does not like the 2 TB disks and their firmware, leading to timeouts and communication errors on the data bus. Server types and disk models are available (please ask me off-list) if anyone is interested.

We haven't figured out why the metalogger data was not useful after the first failure, but we suspect that due to the massive data corruption on storage server #3 the data sent to the metalogger was corrupt as well. I don't know if the master sends data from disk or from memory to the metalogger(s). If it reads data from disk and sends it, then our RAID controller might have eaten the data already.

Best regards, René Pfeiffer.