From: René P. <ly...@lu...> - 2011-12-11 18:13:41
Hello!

We have the following scenario with a MooseFS deployment:

- 3 servers
- server #3 : 1 master
- server #2 : chunk server
- server #1 : chunk server, one metalogger process running on this node

Server #3 suffered from a hardware RAID failure, including a trashed JFS file system where the master logs were stored. We tried
- recovering the log/state data from server #1 and restarting the master process, and
- using the data from the metalogger process and starting a new master on server #1 (including changing the configs to use the new master as described in the documentation).

The MooseFS mount is accessible, but some file operations involving data end with processes hanging in the "D" state (uninterruptible sleep). Logs on the chunk servers show that MooseFS is missing data, although the state of the chunk servers has not changed (no reboot, no file system damage, only a reconfigured master).

Is there a way to extract the data from the chunk directories? Is there an "fsck.mfs" or a similar tool? mfsfilerepair is not useful, because it only zeroed a file which was affected by the "D" state problem.

We're currently trying to recover the original master's directory containing the metadata. Is there any documentation regarding the metadata or chunk data besides the source code? Has anyone experienced a similar situation?

Best regards, René Pfeiffer.
From: René P. <ly...@lu...> - 2011-12-12 19:45:25
On Dec 11, 2011 at 1913 +0100, René Pfeiffer appeared and said:
> We have the following scenario with a MooseFS deployment.
> - 3 servers
> - server #3 : 1 master
> - server #2 : chunk server
> - server #1 : chunk server, one metalogger process running on this node

I forgot to include:
- All storage nodes run Debian 6.0 (x86_64).
- All storage nodes run MooseFS on /var which is JFS.
- MooseFS 1.6.20 is used (compiled from source).

> We're currently trying to recover the original master's directory
> containing the metadata.

Recovery only yielded a metadata.mfs.back file which dates back to 29 June 2011.

Best regards, René Pfeiffer.
From: wkmail <wk...@bn...> - 2011-12-12 23:21:48
On 12/12/2011 11:45 AM, René Pfeiffer wrote:
> Recovery only yielded a metadata.mfs.back file which dates back to 29 June 2011.

What about your metalogger? Didn't that have a more up to date metadata file?

-bill
From: René P. <ly...@lu...> - 2011-12-12 23:37:13
On Dec 12, 2011 at 1520 -0800, wkmail appeared and said:
> What about your metalogger? Didn't that have a more up to date metadata file?

Yes, it did, but I get the "D" state for processes accessing the mount, too. The logs show messages of the type "chunk xyz has only invalid copies (1) - please repair it manually", so I guess the metadata is still not correct (IP addresses and names of the chunk servers haven't changed).

The biggest problem is that we cannot figure out what the RAID controller exactly did to the file system of the master server, and we haven't found any traces of a more recent metadata file. The metalogger system had no problem, but can it be that the metalogger was/is out of sync due to the silent file system corruption on the master system?

Best, René.
From: wkmail <wk...@bn...> - 2011-12-12 23:55:57
On 12/12/2011 3:36 PM, René Pfeiffer wrote:
> The biggest problem is that we cannot figure out what the RAID controller
> exactly did to the file system of the master server, and we haven't found
> any traces of a more recent metadata file. The metalogger system had no
> problem, but can it be that the metalogger was/is out of sync due to the
> silent file system corruption on the master system?

That is a question for the devs, but early in our MFS testing with essentially throwaway kit, we had a master fail with a broken RAID. In that case the underlying disk system had been essentially read-only for a few days and no recent data was in /usr/local/var/mfs.

However, the metalogger DID have accurate information, and we simply recovered using that data via the restore process and then copied the metadata file over to the now fixed master. Except for the 'on the fly' files lost when the damn thing crashed, no other data was lost, including files that had been received and written to the chunkservers during the time the disk subsystem was out of order.

So my guess is that the metaloggers get their info from the master's memory, not from a file on the master. But that is something that should be confirmed by the devs.
From: René P. <ly...@lu...> - 2011-12-13 13:17:39
On Dec 12, 2011 at 1554 -0800, wkmail appeared and said:
> So my guess is that the metaloggers get their info from the masters
> memory, not from a file on the master.

Ok, this might be the reason then, because the master went down hard two times (first time 21 October, second time 9 December) because the RAID controller totally locked the system. I assume this could explain some missing metadata.

> But that is something that should be confirmed by the devs.

Thanks, René.
From: Laurent W. <lw...@hy...> - 2011-12-13 14:13:16
Hi,

I'm no MFS dev, but AFAIK each metadata change on the master is sent immediately to the metalogger and then gets written to disk on the master.

HTH,
--
Laurent Wandrebeck
HYGEOS, Earth Observation Department / Observation de la Terre
Euratechnologies, 165 Avenue de Bretagne, 59000 Lille, France
tel: +33 3 20 08 24 98  http://www.hygeos.com
GPG fingerprint/Empreinte GPG: F5CA 37A4 6D03 A90C 7A1D 2A62 54E6 EF2C D17C F64C
From: Steve <st...@bo...> - 2011-12-14 13:08:36
Hi,

Does anyone have a backup methodology and script to take the metalogger data off the MFS system, perhaps with several generations?
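A minimal sketch of one way to do this, assuming the metalogger keeps its files in /usr/local/var/mfs (the path mentioned elsewhere in this thread); the backup target, retention count, and GNU coreutils behaviour are assumptions:

    #!/bin/sh
    # sketch: snapshot metalogger metadata into dated generation directories
    ML_DIR=/usr/local/var/mfs          # metalogger data directory (assumed)
    BACKUP_ROOT=/backup/mfs-metadata   # backup target (placeholder)
    KEEP=14                            # number of generations to keep

    dest="$BACKUP_ROOT/$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$dest"
    cp -p "$ML_DIR"/metadata_ml.mfs.back "$ML_DIR"/changelog_ml.*.mfs "$dest"/

    # prune old generations (GNU head/xargs assumed for -n -N and -r)
    ls -1d "$BACKUP_ROOT"/*/ | sort | head -n -"$KEEP" | xargs -r rm -rf

Run from cron on the metalogger host, this keeps the newest KEEP snapshots and removes older ones.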
From: Michał B. <mic...@ge...> - 2011-12-14 13:00:59
No, but you can try with something like:

    find . -type f | xargs mfsfilerepair

Kind regards
Michał Borychowski
MooseFS Support Manager, Gemius S.A.

-----Original Message-----
From: René Pfeiffer
Sent: Wednesday, December 14, 2011 1:22 PM
> Is there a way to run mfsfilerepair recursively? The man page did not
> show such an option.
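A slightly more defensive variant of the same idea, assuming GNU find and xargs; the mount point is a placeholder:

    # same as above, but safe for file names containing spaces or newlines
    find /mnt/mfs -type f -print0 | xargs -0 mfsfilerepair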
From: René P. <ly...@lu...> - 2011-12-14 13:24:52
On Dec 14, 2011 at 1400 +0100, Michał Borychowski appeared and said:
> No, but you can try with sth like:
> "find . -type f | xargs mfsfilerepair"

I did that, and we're currently assessing the file system. I asked for the recursive option because it is easier for operators to deal with these situations. Also a "--dry-run" option would be nice (just as rsync has), so mfsfilerepair can generate a preview before an admin decides to run the actual repair task.

Do the *.emergency files have a signature that can be used to recognise them? Since the JFS on the RAID1 volumes has massive metadata damage, we have to scan the content of every file to look for the *.emergency files.

Best regards, René.
From: Michał B. <mic...@ge...> - 2011-12-14 14:07:20
At the beginning of the metadata file there is: "MFSM 1.5"

Kind regards
Michał Borychowski
MooseFS Support Manager, Gemius S.A.

-----Original Message-----
From: René Pfeiffer
Sent: Wednesday, December 14, 2011 2:25 PM
> Do the *.emergency files have a signature that can be used to recognise
> them? Since the JFS on the RAID1 volumes has massive metadata damages, we
> have to scan the content of every file to look for the *.emergency files.
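Given that signature, a minimal sketch of how a salvaged dump could be scanned, assuming the signature occupies the first 8 bytes of the file; the salvage directory is a placeholder:

    # look for files starting with the MooseFS metadata signature "MFSM 1.5"
    find /mnt/salvage -type f | while read -r f; do
        [ "$(head -c 8 "$f" 2>/dev/null)" = "MFSM 1.5" ] && echo "possible metadata file: $f"
    done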
From: Ricardo J. B. <ric...@da...> - 2011-12-14 21:35:19
On Wednesday, 14 December 2011, René Pfeiffer wrote:
> I did that, and we're currently assessing the file system. I asked for the
> recursive option, because it is easier for operators to deal with these
> situations. Also a "--dry-run" option would be nice (just as rsync has), so
> mfsfilerepair can generate a preview before an admin decides to run the
> actual repair task.

Exactly my concern. Whenever I get unavailable chunks (not many times, fortunately) but I don't know which files they belong to, I do this:

    find /path/to/mfs/mount -type f | xargs mfsfileinfo > /tmp/mfsfileinfo.log
    grep -B2 "no valid copies" /tmp/mfsfileinfo.log

And then I mfsfilerepair only those files that have chunks with invalid copies. But I don't know beforehand whether the missing chunk will be zeroed or restored from another (maybe older) version, so a --dry-run would be more than welcome :)

Regards,
--
Ricardo J. Barberis
Senior SysAdmin / ITI
Dattatec.com :: Soluciones de Web Hosting
Tu Hosting hecho Simple!
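For reference, a small sketch that glues these two steps into a review-then-repair sequence, as a stand-in for the missing --dry-run. It assumes mfsfileinfo prints each file name on its own line ending with a colon (check your version's output first); the paths are placeholders:

    MNT=/path/to/mfs/mount
    LOG=/tmp/mfsfileinfo.log

    # collect chunk status for every file
    find "$MNT" -type f -print0 | xargs -0 mfsfileinfo > "$LOG"

    # extract only the names of files whose chunks have no valid copies
    grep -B2 "no valid copies" "$LOG" | grep "^$MNT" | sed 's/:$//' | sort -u > /tmp/damaged-files.txt

    # review /tmp/damaged-files.txt first, then repair only those files:
    # xargs -d '\n' mfsfilerepair < /tmp/damaged-files.txt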
From: Michał B. <mic...@ge...> - 2011-12-14 08:38:16
Hi!

This is still too little information about how we could help you... What about the metaloggers? Don't you have metadata_ml.mfs.back and changelog_ml.*.mfs files? You could put them on FTP as a tar.gz and give us a link so that we can try to recover them.

In "emergency" situations MooseFS tries to write the metadata file on the master machine in these locations:

/metadata.mfs.emergency
/tmp/metadata.mfs.emergency
/var/metadata.mfs.emergency
/usr/metadata.mfs.emergency
/usr/share/metadata.mfs.emergency
/usr/local/metadata.mfs.emergency
/usr/local/var/metadata.mfs.emergency
/usr/local/share/metadata.mfs.emergency

You can try to find them there, but as the RAID went broken it is possible you won't see them. And it is impossible to recover metadata from the chunks themselves. But file data are in the chunks, so if you need to find something specific it would be possible. Each chunk has a 5 kB header (plain text), so a simple 'grep' would be enough to find what you are looking for.

BTW. How much data was kept on this MooseFS installation?

Kind regards
Michał Borychowski
MooseFS Support Manager, Gemius S.A.

-----Original Message-----
From: René Pfeiffer
Sent: Tuesday, December 13, 2011 2:17 PM
> Ok, this might be the reason then, because the master went down hard two
> times (first time 21 October, second time 9 December) because the RAID
> controller totally locked the system. I assume this could explain some
> missing metadata.
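A minimal sketch of that grep approach on a chunkserver, assuming the usual chunk file naming (chunk_<id>_<version>.mfs); the data directory is a placeholder and the search string stands for any content you know was in the lost file:

    # list chunk files that contain a known piece of content
    find /var/mfschunks -type f -name 'chunk_*.mfs' -print0 \
        | xargs -0 grep -l --binary-files=text "some known string"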
From: René P. <ly...@lu...> - 2011-12-16 17:26:26
Hello, Michał! I have an update on the metadata case.

On Dec 14, 2011 at 0937 +0100, Michał Borychowski appeared and said:
> This is still too little information about how we could help you... What
> about metaloggers? Don't you have metadata_ml.mfs.back and
> changelog_ml.*.mfs files? You could put them on ftp as tar.gz and give us
> a link so that we try to recover them.

I will take some time to compile this information, because the servers in question are managed by different teams (long story) and the coordination did not run smoothly (i.e. the metalogger's logs might have been overwritten, because MooseFS continued to be used after the master server crash). I searched through 129 GB of salvaged data from the master and found not a single file with a metadata signature.

> In "emergency" situations MooseFS tries to write metadata file on the
> master machine in these locations: ... You can try to find them there, but
> as RAID went broken it is possible you won't see them.

Correct, we've not found any *emergency* files on the dump either. Since the state of the MooseFS moved back in time to 28 June 2011 we suspect that this is when the data corruption started, and since no warnings or errors were logged by the servers or the hardware controllers, no one noticed.

Best, René.
From: René P. <ly...@lu...> - 2011-12-14 10:00:56
On Dec 14, 2011 at 0937 +0100, Michał Borychowski appeared and said:
> This is still too little information about how we could help you...

Sorry, I know, it's been a bit stressful since the server failure.

> What about metaloggers? Don't you have metadata_ml.mfs.back and
> changelog_ml.*.mfs files?

Yes, I have, and we used the metalogger's data in order to set up a new master, but the access errors (invalid chunks and such) stayed.

> You could put them on ftp as tar.gz and give us a link so that we try to
> recover them.

I will prepare an archive and make it available.

> In "emergency" situations MooseFS tries to write metadata file on the
> master machine in these locations: [...] You can try to find them there,
> but as RAID went broken it is possible you won't see them.

The data rescue company gave us the salvaged data and we found not a single *.emergency file on the dump. The file might have been renamed though. I suspect that the RAID controller suffered from a data corruption condition for weeks prior to completely failing. Not even the Linux kernel got sensible warnings from the block device. Since the RAID controller totally locked up the system, I'm not sure if the master process was able to write the data to disk (especially because all storage containers were managed by the faulty RAID controller on this server).

> And it is impossible to recover metadata from the chunks themselves.

Yes, I know, but I just asked because I thought that there's a way to scan the data on the chunkservers themselves.

> But file data are in the chunks so if you need to find something specific
> it would be possible. Each chunk has 5kb header (plain text) so simple
> 'grep' would be enough to find what you are looking at.

We mainly deal with image files (JPG and PNG). I guess we'll walk through the chunks this way.

> BTW. How much data was kept on this MooseFS installation?

About 7.7 GB, mostly image files and a few videos. The images are the most important data, but the application dealing with these files unfortunately used hashes and a directory structure similar to the Squid proxy (multiple directories, path based on the hashes) to store the data. I'm not sure if the developers can reconstruct this information without the metadata.

Best regards, René.
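As a rough illustration of "walking through the chunks" for images, a sketch that skips the 5 kB chunk header Michał mentions and checks whether the payload starts with JPEG or PNG data. It assumes the header is exactly 5120 bytes and that a small image begins at the start of a chunk's data area; both are assumptions, and the paths are placeholders:

    mkdir -p /srv/carved
    for c in /var/mfschunks/*/chunk_*.mfs; do
        # skip the assumed 5 kB (5120-byte) chunk header
        dd if="$c" of=/tmp/payload bs=1024 skip=5 2>/dev/null
        case "$(file -b /tmp/payload)" in
            JPEG*|PNG*)
                # note: the payload still carries the chunk's full length,
                # so trailing padding may need trimming afterwards
                cp /tmp/payload "/srv/carved/$(basename "$c" .mfs).img"
                ;;
        esac
    done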
From: Michał B. <mic...@ge...> - 2011-12-14 10:59:43
If you have "invalid copies" you should use mfsfilerepair, which will put the proper file version in the metadata. It's important that you have all chunkservers running while doing mfsfilerepair.

Kind regards
Michał Borychowski
MooseFS Support Manager, Gemius S.A.

-----Original Message-----
From: René Pfeiffer
Sent: Wednesday, December 14, 2011 11:01 AM
> Yes, I have, and we used the metalogger's data in order to set up a new
> master, but the access errors (invalid chunks and such) stayed.
From: René P. <ly...@lu...> - 2011-12-14 12:22:25
On Dec 14, 2011 at 1159 +0100, Michał Borychowski appeared and said:
> If you have "invalid copies" you should use mfsfilerepair which will put
> in metadata proper file version. It's important that you have all
> chunkservers running while doing mfsfilerepair.

Is there a way to run mfsfilerepair recursively? The man page did not show such an option.

Best regards, René.
From: René P. <ly...@lu...> - 2012-02-28 11:57:06
Hello! Here's a follow-up and a warning to everyone deploying servers. It's not meant to be yet another war story; it should illustrate what to look for when deploying MooseFS master servers.

On Dec 11, 2011 at 1913 +0100, René Pfeiffer appeared and said:
> We have the following scenario with a MooseFS deployment.
> - 3 servers
> - server #3 : 1 master
> - server #2 : chunk server
> - server #1 : chunk server, one metalogger process running on this node
>
> Server #3 suffered from a hardware RAID failure including a trashed JFS
> file system where the master logs were on.

We have spent a couple of days diagnosing the failure with the server vendor and the ISP that did the provisioning. Apparently the RAID meltdown was due to firmware bugs, since the servers were deployed without any firmware upgrades (classic case of communication failure). After recovery, all firmwares on the servers were upgraded. After that the RAID failed again. This time the controller removed 2 out of 4 disks because they weren't responding. Since the 2 removed disks formed a complete RAID1 container, the server went out of service again (this time not as master but only as metalogger).

Analysis yielded that storage server #3 is the only one with 2 TB disks, which were _not_ approved by the hardware vendor (the ISP used third-party disks to boost storage capacity). Storage servers #1 and #2 run 1 TB disks approved by the hardware vendor. Apparently the RAID controller firmware does not like the 2 TB disks and their firmware, leading to timeouts and communication errors on the data bus. Server types and disk models are available (please ask me off-list) if anyone is interested.

We haven't figured out why the metalogger data was not useful after the first failure, but we suspect that due to the massive data corruption on storage server #3 the data sent to the metalogger was corrupt as well. I don't know if the master sends data from disk or from memory to the metalogger(s). If it reads data from disk and sends it, then our RAID controller might have eaten the data already.

Best regards, René Pfeiffer.