From: Tuukka L. <tlu...@gm...> - 2011-05-29 01:35:43
|
I had a power failure and both my master and metalogger went down simultaneously.

When I turned them back on, the master process failed to start, so I ran mfsmetarestore -a but got the following error:

    loading objects (files,directories,etc.) ... ok
    loading names ... loading edge: 7527,DSC01862.JPG->7554 error: child not found
    error
    can't read metadata from file: metadata.mfs.back

So I went to the metalogger and got the same error.

Now I am not sure what to try next.

Any help would be appreciated.

Tuukka
|
From: Steve <st...@bo...> - 2011-05-29 08:18:53
|
I wonder why chunkservers aren't capable of logging the data too by default, as a built-in function.

If it's mission critical, I guess you should have had at least one of these on a UPS. I can't preach, as mine are not!!

What filesystems are you running on?

_______________________________________________
moosefs-users mailing list
moo...@li...
https://lists.sourceforge.net/lists/listinfo/moosefs-users
|
From: Tuukka L. <tlu...@gm...> - 2011-05-29 08:55:16
|
Sorry, I forgot to send my response to the group too.

Tuukka

On Sun, May 29, 2011 at 1:41 AM, Tuukka Luolamo <tlu...@gm...> wrote:
> This is just a home setup, so no UPS, though I have thought maybe I should
> have one. However, I assumed the metalogger mechanism should be enough
> of a backup to recover in case of a failure. If this disk goes down
> and it takes me an hour to get back up, it is not the end of the world =)
> (Right now it has been down for a week.)
>
> They run on ext4.
>
> Thanks,
>
> Tuukka
|
From: Steve <st...@bo...> - 2011-05-29 09:34:42
|
Yes, it does get confusing with 'reply to all' not being the default. It's a simple change, at least on the listservs I've used in the past.

Same as me, then: a home system. We don't get power cuts often here (the last two were caused by me!), but I've survived a few and have no UPS.

Your ext4 should give some resilience over older filesystems and is the same as I use. So I guess we can assume your underlying filesystem is OK and you've just been unlucky with the timing of what Moose was doing at the moment of power loss?

You probably need to get your metadata to the devs on Monday. They may help with a fix, and may want to see whether they can handle recovery better, if that's at all possible.

Dual load-sharing master servers may be desirable to help reduce the risk of data loss too.

Otherwise I've had no problems with Moose, having run it since before it became well known, except one of my own making: a brief try of early btrfs!
|
From: Michal B. <mic...@ge...> - 2011-06-01 09:30:31
|
Hi!

We have run several instances of MooseFS for over 5 years and have never seen an error like yours. In your metadata, one file was missing while another existed but was not related to anything. We added the -i (ignore) flag to mfsmetarestore and got this result:

    loading objects (files,directories,etc.) ... ok
    loading names ... loading edge: 7527,DSC01862.JPG->7554 error: child not found
    ok
    loading deletion timestamps ... ok
    checking filesystem consistency ... fschk: found lost inode: 7538
    ok
    loading chunks data ... ok
    connecting files and chunks ... ok
    store metadata into file: ../../../Downloads/mfs/metadata.mfs

The inode numbers of the two files differ by exactly 0x10, that is, in a single hex digit:

    >>> "%02X" % 7554
    '1D82'
    >>> "%02X" % 7538
    '1D72'

We think that this problem could have been caused by the RAM in your master. We recommend using RAM with parity control. You can also run a test from http://www.memtest.org/ on your server to check your existing RAM. Of course, the value could also have been changed at the motherboard or CPU level, which is much less probable.
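For anyone who wants to check the comparison above at the bit level, here is a minimal sketch using only Python's built-in integer operations. The function name `compare_inodes` is made up for this illustration and is not part of MooseFS. For the two inode numbers quoted in this message it reports an arithmetic difference of 16 (0x10), while the XOR of the two values is 0xF0, meaning four bit positions differ between them.

```python
def compare_inodes(expected: int, found: int) -> dict:
    """Compare two inode numbers arithmetically and bit by bit."""
    xor = expected ^ found  # bits set here are the positions that differ
    return {
        "expected_hex": "%04X" % expected,
        "found_hex": "%04X" % found,
        "difference": expected - found,
        "xor_hex": "%04X" % xor,
        "bits_changed": bin(xor).count("1"),
    }


if __name__ == "__main__":
    # 7554 is the inode the directory edge points to,
    # 7538 is the lost inode reported by the consistency check.
    for key, value in compare_inodes(7554, 7538).items():
        print(f"{key}: {value}")
```

The same helper can be pointed at any other pair of suspicious values (for example chunk ids) to see whether a single flipped bit is a plausible explanation.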
Also, you can see in the log that file 7538 is located between 7553 and 7555:

    -|i: 7549|#:2|e:0|m:0777|u: 65534|g: 65534|a:1302156861,m:1088897360,c:1302340549|t: 86400|l: 978749|c:(0000000000001B1B)|r:()
    -|i: 7550|#:2|e:0|m:0777|u: 65534|g: 65534|a:1302156866,m:1088897400,c:1302340549|t: 86400|l: 804362|c:(0000000000001B1C)|r:()
    -|i: 7551|#:2|e:0|m:0777|u: 65534|g: 65534|a:1302156869,m:1088897438,c:1302340549|t: 86400|l: 850289|c:(0000000000001B1D)|r:()
    -|i: 7552|#:2|e:0|m:0777|u: 65534|g: 65534|a:1302156873,m:1088897474,c:1302340549|t: 86400|l: 710445|c:(0000000000001B1E)|r:()
    -|i: 7553|#:2|e:0|m:0777|u: 65534|g: 65534|a:1302156876,m:1098246428,c:1302340549|t: 86400|l: 456633|c:(0000000000001B1F)|r:()
    -|i: 7538|#:2|e:0|m:0777|u: 65534|g: 65534|a:1302154827,m:1088893918,c:1302340549|t: 86400|l: 848797|c:(0000000000001B10)|r:()
    -|i: 7555|#:2|e:0|m:0777|u: 65534|g: 65534|a:1302156878,m:1088897534,c:1302340549|t: 86400|l: 137858|c:(0000000000001B21)|r:()
    -|i: 7556|#:2|e:0|m:0777|u: 65534|g: 65534|a:1302156878,m:1088898128,c:1302340549|t: 86400|l: 805701|c:(0000000000001B22)|r:()
    -|i: 7557|#:2|e:0|m:0777|u: 65534|g: 65534|a:1302156880,m:1088898148,c:1302340549|t: 86400|l: 817717|c:(0000000000001B23)|r:()
    -|i: 7558|#:2|e:0|m:0777|u: 65534|g: 65534|a:1157861440,m:1088898162,c:1302340549|t: 86400|l: 852298|c:(0000000000001B24)|r:()
    -|i: 7559|#:2|e:0|m:0777|u: 65534|g: 65534|a:1157861440,m:1088898186,c:1302340549|t: 86400|l: 797550|c:(0000000000001B25)|r:()
    -|i: 7560|#:2|e:0|m:0777|u: 65534|g: 65534|a:1157861440,m:1088898530,c:1302340549|t: 86400|l: 764878|c:(0000000000001B26)|r:()

Kind regards
-Michal

-----Original Message-----
From: Tuukka Luolamo [mailto:tlu...@gm...]
Sent: Monday, May 30, 2011 7:40 PM
To: Michal Borychowski
Subject: Re: [Moosefs-users] Problems after power failure

Hello Michal,

Attached are the files you requested. Let me know if you need anything else.

Getting the meta files fixed would be great, but if there is a way to rebuild them from the chunkservers' contents, that would also be a viable option for this system, as I only have two servers in the cluster: one acting as the master and a chunkserver, and the other acting as the metalogger and a second chunkserver. I have the replication set to 2, so both have all the contents of the file system. Also, when it went down I am pretty sure nothing was being written to the servers.

This is my home / test system, so getting the data back is important, but the time it takes to recover it is not.

Thanks,

Tuukka

2011/5/30 Michal Borychowski <mic...@ge...>:
> Hi!
>
> If you could send us your "metadata.mfs*" and "changelog*" files
> (tar.gzipped) - we'll see what can be done about it.
>
> Kind regards
> Michał Borychowski
> MooseFS Support Manager
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> Gemius S.A.
> ul. Wołoska 7, 02-672 Warszawa
> Budynek MARS, klatka D
> Tel.: +4822 874-41-00
> Fax : +4822 874-41-01
|
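The misplaced entry in the inode listing in Michal's message can also be spotted mechanically. The sketch below is only an illustration: it assumes the listing is meant to be in ascending inode order (as the excerpt suggests) and that every line carries an `|i: <number>|` field. The function name `out_of_sequence` and the three sample lines (copied from the listing) are used purely for demonstration.

```python
import re

# Matches the inode field of one listing line, e.g. "|i: 7549|".
INODE_FIELD = re.compile(r"\|i:\s*(\d+)\|")

# Three lines copied from the listing quoted in this thread.
SAMPLE = [
    "-|i: 7553|#:2|e:0|m:0777|u: 65534|g: 65534|a:1302156876,m:1098246428,c:1302340549|t: 86400|l: 456633|c:(0000000000001B1F)|r:()",
    "-|i: 7538|#:2|e:0|m:0777|u: 65534|g: 65534|a:1302154827,m:1088893918,c:1302340549|t: 86400|l: 848797|c:(0000000000001B10)|r:()",
    "-|i: 7555|#:2|e:0|m:0777|u: 65534|g: 65534|a:1302156878,m:1088897534,c:1302340549|t: 86400|l: 137858|c:(0000000000001B21)|r:()",
]


def out_of_sequence(lines):
    """Return inode numbers that are smaller than the entry before them.

    Assumption: the listing is expected to be sorted by inode number.
    """
    suspects = []
    previous = None
    for line in lines:
        match = INODE_FIELD.search(line)
        if not match:
            continue  # skip lines that are not inode entries
        inode = int(match.group(1))
        if previous is not None and inode < previous:
            suspects.append(inode)
        previous = inode
    return suspects


if __name__ == "__main__":
    print(out_of_sequence(SAMPLE))
```

Run against the full listing, a check like this would flag 7538 as the only entry that breaks the ascending order.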
From: WK <wk...@bn...> - 2011-06-01 22:39:46
|
On 6/1/2011 2:30 AM, Michal Borychowski wrote:
>
> We think that this problem could be caused by your RAM in the master. We
> recommend using RAM with parity control. You can also run a test from
> http://www.memtest.org/ on your server and check your existing RAM. Of
> course, the bit could have been changed also on the motherboard level or CPU
> - which is much less probable.
>
> Also you can see in the log that file 7538 is located between 7553 and 7555:

So, in a situation like this, where the metadata is now corrupt:

Is the problem fixable with only the loss of the one file? (And how does one fix it?)

Or is his entire MFS setup completely corrupt, so that he would have needed a backup?

Can I assume that older archived versions of metadata.mfs could be used to recover most of the files?

-bill
|
From: Tuukka L. <tlu...@gm...> - 2011-06-02 07:54:43
|
OK, I put in place the file the dev sent me and cannot see any data loss...

I found the file in question, the one named in the error, and it seems fine.

The whole system is up and functioning.

I run the system on an old desktop computer and another PC I bought for $25. The dev recommends making sure you have good memory, but I guess I am using whatever I've got =) Aside from this error everything has been fine. I didn't run the memtest they recommended, but I would not rule out memory errors.

However, I would like to understand the situation better, mainly to know what my recourse is. As WK already articulated: had my metadata been completely corrupt, would I have lost all my data? Would I lose just one file, the one with the error? And can I fix this error myself?

Thanks,

Tuukka
|
From: Tuukka L. <tlu...@gm...> - 2011-06-06 03:01:23
|
OK, I ran the memtest on the master server. It got through without finding any errors.

2011/6/2 Michal Borychowski <mic...@ge...>:
> Hi!
>
> In the meantime please run the memtest - we are curious if it really was
> hardware problem or maybe it could be a software problem
>
> Regards
> Michal
|
From: Michal B. <mic...@ge...> - 2011-06-06 05:47:12
|
Hi!

That's interesting... So this value could have been changed by your CPU or motherboard, or it is an error in the software, but that would be very, very difficult to find, as the error probably cannot be easily reproduced.

Regarding your previous questions: it's almost impossible for your metadata to be "completely" corrupt. You really can recover most of your files in most cases. This situation was really unusual, as only a single value in the metadata was changed. Normally you would run mfsmetarestore with the -i (ignore) flag and it would just ignore this one file. Unfortunately, you would not have been able to repair this single value yourself, as that was quite a complicated process.

Kind regards
Michał
|
From: Tuukka L. <tlu...@gm...> - 2011-06-06 05:56:19
|
Well, I guess the question is: in the unlikely situation that the metadata were completely corrupt, or lost because there is no backup and the master goes dead, what recourse is there, if any?

Thanks
|
From: Michal B. <mic...@ge...> - 2011-06-06 06:01:41
|
If you really don't have your metadata, your files are dead... You need metadata to know about them. Of course you can try to recover by reading the surface of hard drives on chunkservers but this would be very tedious... Regards Michał -----Original Message----- From: Tuukka Luolamo [mailto:tlu...@gm...] Sent: Monday, June 06, 2011 7:56 AM To: Michal Borychowski Cc: moo...@li... Subject: Re: [Moosefs-users] Problems after power failure Well I guess the question is in the impossible situation that the metadata were to be completely corrupt or lost because there is no backup and master goes dead, what recourse is there if any? Thanks 2011/6/5 Michal Borychowski <mic...@ge...>: > Hi! > > That's interesting... So this bit could be changed by your CPU or > motherboard or it is an error in the software but it would be very very > difficult to find it as the error probably cannot be easily repeated. > > Regarding your previous questions - it's almost impossible that your > metadata is "completely" corrupt. You really can recover most of your files > at most times. This situation was really weird as there was a single change > in an information bit. Normally you would run mfsmetarestore with a flag -i > (ignore) and it would just ignore this one file. Unfortunately you would not > be able to repair this single bit as this was quite a complicated process. > > > Kind regards > Michał > > > -----Original Message----- > From: Tuukka Luolamo [mailto:tlu...@gm...] > Sent: Monday, June 06, 2011 5:01 AM > To: Michal Borychowski; moo...@li... > Subject: Re: [Moosefs-users] Problems after power failure > > OK I ran the memtest on the master server, It got through without > finding any errors. > > 2011/6/2 Michal Borychowski <mic...@ge...>: >> Hi! 
>> >> In the meantime please run the memtest - we are curious if it really was >> hardware problem or maybe it could be a software problem >> >> >> Regards >> Michal >> >> -----Original Message----- >> From: Tuukka Luolamo [mailto:tlu...@gm...] >> Sent: Thursday, June 02, 2011 9:55 AM >> To: WK >> Cc: moo...@li... >> Subject: Re: [Moosefs-users] Problems after power failure >> >> OK I put in place the file the dev sent me and can not see any data > loss... >> >> I found the file in question the one I got in the error and it seems fine. >> >> The whole system is up and functioning. >> >> I run the system on a old desktop computer and a another PC I bought >> for $25 so the dev recommends making sure you have good memory, but I >> guess I am using whatever I got =) Aside from this error everything >> has been fine. Didn't run the memtest they recommended, but I would >> not count out memory errors. >> >> However I would like to understand the situation better mainly for >> what are my recourses. As WK articulated already had my metadata been >> completely corrupt would I have lost all my data? Would I lose just >> one file the one with the error? And can I fix this error myself? >> >> Thanks, >> >> Tuukka >> >> On Wed, Jun 1, 2011 at 3:39 PM, WK <wk...@bn...> wrote: >>> On 6/1/2011 2:30 AM, Michal Borychowski wrote: >>>> >>>> We think that this problem could be caused by your RAM in the master. We >>>> recommend using RAM with parity control. You can also run a test from >>>> http://www.memtest.org/ on your server and check your existing RAM. Of >>>> course, the bit could have been changed also on the motherboard level or >> CPU >>>> - which is much less probable. >>>> >>>> >>>> Also you can see in the log that file 7538 is located between 7553 and >> 7555: >>> >>> >>> So in a situation like this where the metadata is now corrupt. >>> >>> Is the problem fixable with only the loss of the one file? (and how does >>> one fix it). 
>>> >>> or is his entire MFS setup completely corrupt and he would need to have >>> had a backup? >>> >>> Can I assume that older archived versions of the metadata.mfs could be >>> used to recover most of the files. >>> >>> -bill |
From: Tuukka L. <tlu...@gm...> - 2011-06-06 06:17:17
|
So the chunkservers in effect do not know what files they hold; only the master is aware? I looked at the chunks themselves and they do not seem particularly special: I was able to identify some of the files inside the chunks simply by running head/tail/cat on them. So it seems it would not be hard for each chunkserver to be at least minimally aware of what it holds, to remove some dependency on the master.
Not sure how other people feel, but if I knew that each chunkserver could tell the master what it has, I would feel better that there is a way to recover the contents of the file system after an extreme failure, especially in small implementations. I realize that in larger implementations it may be impractical to ever recover the metadata from the chunkservers, simply because of the time it might take, so very reliable masters and backup masters would be key.
Thanks. |
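The head/tail/cat inspection described above can be approximated in a few lines. This is only a rough sketch and not part of MooseFS: the function name `sniff_chunk`, the signature table, and the 8 KiB search window are assumptions made for illustration. It looks for a few well-known file signatures near the start of a chunk file, without assuming the payload begins at offset 0. A match is a hint only, since short signatures can also occur by chance.

```python
# Hypothetical helper (not a MooseFS tool): guess what kind of file a chunk
# holds by searching for well-known magic numbers near the start of the data.
SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"%PDF-": "pdf",
    b"PK\x03\x04": "zip",
}


def sniff_chunk(data, window=8192):
    """Return (kind, offset) of the earliest signature found in the first
    `window` bytes of `data`, or None if no known signature is present."""
    head = data[:window]
    best = None
    for magic, kind in SIGNATURES.items():
        pos = head.find(magic)
        if pos != -1 and (best is None or pos < best[1]):
            best = (kind, pos)
    return best
```

For a real chunk one would read the file first, for example `sniff_chunk(open(path, "rb").read(8192))`, where `path` is whatever chunk file is being examined.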
From: Michal B. <mic...@ge...> - 2011-07-07 07:56:40
|
Hi Tuukka!
In reply to your old post :) Files below 64 MB occupy a single chunk, so they are stored in one part, not divided.
Each chunk starts with a 5 kB header, which needs to be removed, and afterwards it is necessary to check where the file ends. Chunk length is rounded up to (5 + 64*n) kB, where n ranges from 0 to 1024.
Kind regards
Michał Borychowski
MooseFS Support Manager
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Gemius S.A.
ul. Wołoska 7, 02-672 Warszawa
Budynek MARS, klatka D
Tel.: +4822 874-41-00
Fax : +4822 874-41-01 |
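The chunk arithmetic from Michał's reply can be written out as a small sketch. It assumes that "kB" means 1024 bytes and that the 5 kB sit at the start of the chunk file. The names `chunk_file_length` and `strip_header` are hypothetical and are not MooseFS functions.

```python
# Sizes taken from the message above, assuming 1 kB = 1024 bytes.
HEADER_BYTES = 5 * 1024   # part of the chunk file that is not file data
BLOCK_BYTES = 64 * 1024   # chunk length grows in 64 kB steps
MAX_BLOCKS = 1024         # 1024 * 64 kB = 64 MB of data per chunk


def chunk_file_length(payload_bytes):
    """On-disk length of a chunk holding `payload_bytes` of file data:
    (5 + 64*n) kB, with n the number of 64 kB blocks, rounded up."""
    if not 0 <= payload_bytes <= MAX_BLOCKS * BLOCK_BYTES:
        raise ValueError("a chunk holds between 0 and 64 MB of data")
    blocks = -(-payload_bytes // BLOCK_BYTES)  # ceiling division
    return HEADER_BYTES + blocks * BLOCK_BYTES


def strip_header(chunk, file_size=None):
    """Drop the leading 5 kB; if the real file size is known, also cut the
    padding that the rounding to 64 kB blocks leaves at the end."""
    payload = chunk[HEADER_BYTES:]
    return payload if file_size is None else payload[:file_size]
```

Without the metadata the true file size is unknown, which is why it is "necessary to check where the file ends": the tail of the last 64 kB block is padding that has to be trimmed by some other means.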
From: Tuukka L. <tlu...@gm...> - 2011-07-07 16:33:18
|
Hey Michal,
Thanks for the info. Actually you got me thinking about this again, and maybe a simpler, more scalable solution to the problem I experienced would be to run a sanity check on the metadata file that gets dumped. If it does not pass, some kind of error would be shown in the admin console, written to the logs, and/or sent by email or text message. In that case the system should also keep the bad file as well as the last good one, at a minimum.
Thanks,
Tuukka |
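The proposal above could look roughly like the following sketch. It is an illustration of the idea, not existing MooseFS behaviour: `rotate_metadata_dump`, its path arguments, and the `is_sane` and `notify` callables are all hypothetical. The actual check is left to the caller, since the thread does not say how a dump would be validated.

```python
import os
import shutil


def rotate_metadata_dump(dump_path, good_path, bad_path, is_sane, notify=print):
    """Keep a 'last good' copy of the metadata dump.

    `is_sane` is a caller-supplied function that takes a path and returns
    True if the dump looks valid. A dump that passes replaces the last good
    copy; a dump that fails is saved separately and reported, and the last
    good copy is left untouched.
    """
    if is_sane(dump_path):
        tmp_path = good_path + ".tmp"
        shutil.copy2(dump_path, tmp_path)
        os.replace(tmp_path, good_path)  # swap in the new good copy in one step
        return True
    shutil.copy2(dump_path, bad_path)
    notify("metadata dump %s failed the sanity check; last good copy is %s"
           % (dump_path, good_path))
    return False
```

`notify` defaults to `print`, but it could just as well write to a log or send an email, which covers the alerting part of the suggestion.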