From: Thomas S H. <tha...@gm...> - 2010-12-07 15:35:36
Thanks Michal, will do!

2010/12/7 Michał Borychowski <mic...@ge...>

> Hi Thomas!
>
> These errors were caused by a "disconnected" hdd. If you looked in the
> cgi monitor you would see a disk with "damaged" status.
>
> The strange thing is that this bad chunk was retested after 10 seconds.
> It should have been removed after the first test. Unfortunately, in this
> case these errors caused MooseFS to mark the hdd as damaged. But this
> was a "logical" error, not a physical one. You should probably run
> "fsck" on this hard drive. On the other hand, we will make a patch so
> that the system doesn't test the same chunk in a loop.
>
> Kind regards
> Michal
>
> *From:* Thomas S Hatch [mailto:tha...@gm...]
> *Sent:* Friday, December 03, 2010 6:05 PM
> *To:* moosefs-users
> *Subject:* [Moosefs-users] Errors and then "crash"
>
> This is the second time a chunkserver has issued this type of failure in
> our environment. After giving this log message the chunkserver does not
> crash, but all files on the chunk become unavailable and it shows 0%
> usage on the mfsmaster.
>
> Dec 3 14:58:08 localhost mfschunkserver[6969]: testing chunk: /mnt/moose1/6C/chunk_000000000008F96C_00000001.mfs
> Dec 3 14:58:18 localhost mfschunkserver[6969]: testing chunk: /mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs
> Dec 3 14:58:18 localhost mfschunkserver[6969]: chunk_readcrc: file:/mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs - wrong id/version in header (000000000001BB0D_00000000)
> Dec 3 14:58:18 localhost mfschunkserver[6969]: hdd_io_begin: file:/mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs - read error: Unknown error
> Dec 3 14:58:28 localhost mfschunkserver[6969]: testing chunk: /mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs
> Dec 3 14:58:28 localhost mfschunkserver[6969]: chunk_readcrc: file:/mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs - wrong id/version in header (000000000001BB0D_00000000)
> Dec 3 14:58:28 localhost mfschunkserver[6969]: hdd_io_begin: file:/mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs - read error: Unknown error
> Dec 3 14:58:38 localhost mfschunkserver[6969]: testing chunk: /mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs
> Dec 3 14:58:38 localhost mfschunkserver[6969]: chunk_readcrc: file:/mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs - wrong id/version in header (000000000001BB0D_00000000)
> Dec 3 14:58:38 localhost mfschunkserver[6969]: hdd_io_begin: file:/mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs - read error: Unknown error
> Dec 3 14:58:38 localhost mfschunkserver[6969]: 3 errors occurred in 60 seconds on folder: /mnt/moose1/
> Dec 3 14:58:39 localhost mfschunkserver[6969]: replicator: hdd_create status: 21
>
> I am running the prerelease of 1.6.18 on Ubuntu 10.04.
>
> After restarting the chunkserver everything comes back online without
> problems.
>
> Any ideas as to what could be causing this?
>
> -Tom Hatch
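
For anyone hitting the same pattern, here is a rough Python sketch for spotting it in a chunkserver log. It is not part of MooseFS; the helper names `repeated_tests` and `header_mismatches` are made up for illustration, and the regular expressions assume the exact message wording shown in the log above, with the file name read as chunk_<id>_<version>.mfs (which is how the "wrong id/version" message itself pairs the two values).

```python
import re
from collections import Counter

# Log lines copied from the message above (one entry per line).
LOG = """\
Dec 3 14:58:08 localhost mfschunkserver[6969]: testing chunk: /mnt/moose1/6C/chunk_000000000008F96C_00000001.mfs
Dec 3 14:58:18 localhost mfschunkserver[6969]: testing chunk: /mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs
Dec 3 14:58:18 localhost mfschunkserver[6969]: chunk_readcrc: file:/mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs - wrong id/version in header (000000000001BB0D_00000000)
Dec 3 14:58:18 localhost mfschunkserver[6969]: hdd_io_begin: file:/mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs - read error: Unknown error
Dec 3 14:58:28 localhost mfschunkserver[6969]: testing chunk: /mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs
Dec 3 14:58:28 localhost mfschunkserver[6969]: chunk_readcrc: file:/mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs - wrong id/version in header (000000000001BB0D_00000000)
Dec 3 14:58:28 localhost mfschunkserver[6969]: hdd_io_begin: file:/mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs - read error: Unknown error
Dec 3 14:58:38 localhost mfschunkserver[6969]: testing chunk: /mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs
Dec 3 14:58:38 localhost mfschunkserver[6969]: chunk_readcrc: file:/mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs - wrong id/version in header (000000000001BB0D_00000000)
Dec 3 14:58:38 localhost mfschunkserver[6969]: hdd_io_begin: file:/mnt/moose1/0D/chunk_000000000001BB0D_00000001.mfs - read error: Unknown error
Dec 3 14:58:38 localhost mfschunkserver[6969]: 3 errors occurred in 60 seconds on folder: /mnt/moose1/
Dec 3 14:58:39 localhost mfschunkserver[6969]: replicator: hdd_create status: 21
"""

# Assumed message formats, taken from the wording of the lines above.
TESTING = re.compile(r"testing chunk: (\S+)")
MISMATCH = re.compile(
    r"chunk_readcrc: file:\S+/chunk_([0-9A-F]{16})_([0-9A-F]{8})\.mfs"
    r" - wrong id/version in header \(([0-9A-F]{16})_([0-9A-F]{8})\)"
)


def repeated_tests(log):
    """Map each chunk path tested more than once to its test count."""
    counts = Counter(TESTING.findall(log))
    return {path: n for path, n in counts.items() if n > 1}


def header_mismatches(log):
    """Return distinct (name id, name version, header id, header version) tuples."""
    return {m.groups() for m in MISMATCH.finditer(log)}


if __name__ == "__main__":
    for path, n in repeated_tests(LOG).items():
        print(f"tested {n} times: {path}")
    for name_id, name_ver, hdr_id, hdr_ver in sorted(header_mismatches(LOG)):
        print(f"file name says {name_id}_{name_ver}, header says {hdr_id}_{hdr_ver}")
```

On the log above this reports the 0D chunk as tested three times, and shows that the id in the header matches the file name while the version does not (00000001 in the name, 00000000 in the header).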