From: Michał B. <mic...@ge...> - 2010-06-08 09:50:07
Hi!

The system reports that the chunk numbered "D710" is not available: none of the 3 copies required by the goal can be found. If all chunkservers and all their disks are connected, this means that the chunk simply does not exist.

If a reboot took place while the file was being written, such a chunk can be lost. The important question is: was it a reboot of the master server, of the chunkservers, or of the whole system? An abrupt reboot of the whole system (e.g. a power failure) could cause something like this. Unfortunately, fsck on a chunkserver could have deleted this chunk. It may be worth looking into "lost+found" on the disks connected to the mfschunkservers.

You can also run "mfsfilerepair", but this helps only by putting zeros in the "damaged" part of the file. The system would then not try to read it. (To be exact, the system does not hang: it retries the read many times, waiting for the file to show up, and gives up after several minutes.)

If you need any further assistance please let us know.

Kind regards
Michał Borychowski
MooseFS Support Manager
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Gemius S.A.
ul. Wołoska 7, 02-672 Warszawa
Budynek MARS, klatka D
Tel.: +4822 874-41-00
Fax : +4822 874-41-01

From: kuer ku [mailto:ku...@gm...]
Sent: Saturday, June 05, 2010 12:05 PM
To: moo...@li...
Subject: [Moosefs-users] how to fix unavailabe chunk ??

Hi, all,

I set up a MooseFS storage system with 1 metaserver + 4 chunkservers. Today I found some error messages on the HTTP interface saying that some files are lost:

currently unavailable chunk 000000000000D710 (inode: 331 ; index: 0)
* currently unavailable file 331: sink/fifodata/00126/20100604/00126_20100604164805

On the box running mfsmount, the 'ls' command shows:

-rw-rw-rw- 1 sea sea 2778996 6 4 17:32 00126_20100604164805

A system reboot occurred on 06/04 at 17:32, which is the last time the file was written. At present I can list the file, but I cannot cat its contents: when I cat the file, the command hangs.

I can find some error messages in /var/log/messages:

Jun 5 17:51:26 nbase07 mfsmount[6625]: file: 331, index: 0, chunk: 55056, version: 2 - there are no valid copies
Jun 5 17:51:26 nbase07 mfsmount[6625]: file: 331, index: 0 - can't connect to proper chunkserver (try counter: 15)
Jun 5 17:52:26 nbase07 mfsmount[6625]: file: 331, index: 0, chunk: 55056, version: 2 - there are no valid copies
Jun 5 17:52:26 nbase07 mfsmount[6625]: file: 331, index: 0 - can't connect to proper chunkserver (try counter: 22)
Jun 5 17:53:26 nbase07 mfsmount[6625]: file: 331, index: 0, chunk: 55056, version: 2 - there are no valid copies
Jun 5 17:53:26 nbase07 mfsmount[6625]: file: 331, index: 0 - can't connect to proper chunkserver (try counter: 29)

The goal of the file should be 3, because I set the goal of its parent directory to 3.

What is the problem? How can I fix it?

My environment:

metaserver : moosefs 1.6.13 built on CentOS 5.3 x86_64
chunkserver : moosefs 1.6.13 built on CentOS 5.3 x86_64
mfsmount : MFS version 1.6.15 (FUSE library version: 2.7.4) on FreeBSD 6.2

thanks,
- kuer
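
A side note on reading the two reports together: the chunk number in the mfsmount log (55056) and the chunk ID on the HTTP interface (000000000000D710) appear to be the same value written in decimal and in hexadecimal. Below is a minimal Python sketch that checks this for the log line quoted above. The regular expression and the helper names `parse_line` and `chunk_hex` are only illustrative; the field layout is inferred from the quoted log lines, not from any MooseFS specification.

```python
import re

# Field layout inferred from the "no valid copies" log lines quoted in this
# thread; this is an assumption, not a documented MooseFS log format.
LOG_RE = re.compile(
    r"file: (?P<inode>\d+), index: (?P<index>\d+), "
    r"chunk: (?P<chunk>\d+), version: (?P<version>\d+)"
)


def parse_line(line):
    """Return inode, index, chunk and version as ints, or None if no match."""
    match = LOG_RE.search(line)
    if match is None:
        return None
    return {key: int(value) for key, value in match.groupdict().items()}


def chunk_hex(chunk_decimal):
    """Format a decimal chunk number as 16 upper-case hex digits."""
    return f"{chunk_decimal:016X}"


line = (
    "Jun 5 17:51:26 nbase07 mfsmount[6625]: file: 331, index: 0, "
    "chunk: 55056, version: 2 - there are no valid copies"
)
info = parse_line(line)
print(chunk_hex(info["chunk"]))  # prints 000000000000D710
```

Run against the line above, this prints the same ID that the HTTP interface shows for inode 331, so both messages refer to one missing chunk.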