From: Wenhua Z. <shi...@gm...> - 2012-03-31 02:28:42
Hi all,

While running a write test against MFS, I stopped all the chunkservers. After restarting them, I found "damaged" errors on the CGI page and many "invalid copies" errors in the master's log.

We have 4 chunkservers and 5 other servers. All 9 servers run mfsmount and write data to the mounted folder (I had changed the goal from 1 to 3 about 2 hours earlier). Before I stopped the chunkservers, the inflow rate of the MFS was about 40 MBytes/sec.

1. Errors in logs

"chunk invalid copies" errors such as the following appear in the mfsmaster log:

Mar 29 17:07:52 XXX-22 mfsmaster[7192]: chunk 00000000000EC3A8 has only invalid copies (2) - please repair it manually
Mar 29 17:07:52 XXX-22 mfsmaster[7192]: chunk 00000000000EC3A8_00000002 - invalid copy on (10.7.17.54 - ver:00000001)
Mar 29 17:07:52 XXX-22 mfsmaster[7192]: chunk 00000000000EC3A8_00000002 - invalid copy on (10.7.17.86 - ver:00000000)
......
Mar 29 17:07:54 XXX-22 mfsmaster[7192]: chunk 00000000000EC3BF has only invalid copies (1) - please repair it manually
Mar 29 17:07:54 XXX-22 mfsmaster[7192]: chunk 00000000000EC3BF_00000003 - invalid copy on (10.7.17.54 - ver:00000001)
......

Besides that, we also found other errors in the logs on the chunkserver machines:

Mar 29 17:07:21 XXX-54 mfsmount[1883]: file: 170882, index: 31 -fs_writechunk returns status 8
...
Mar 29 17:07:43 XXX-85 mfschunkserver[26178]: write_block_to_chunk: file:/data2/mfsdata/84/chunk_00000000000EC384_00000002.mfs - crc error
...
Mar 29 17:07:43 XXX-85 mfsmount[6604]: writeworker: write error: 29
......
Mar 29 17:07:44 XXX-85 mfsmount[6604]: writeworker: write error: 13
......
Mar 29 17:07:44 XXX-85 mfsmount[6604]: writeworker: write error: 28

The error numbers are defined in the code as:

#define ERROR_CHUNKLOST 8 // Chunk lost
#define ERROR_NOCHUNK 13 // No such chunk
#define ERROR_DISCONNECTED 28 // Disconnected
#define ERROR_CRC 29 // CRC error

We gathered more information about the chunks from the chunkservers and found that the failing chunks have more than one copy, but the versions of the copies differ. For example, chunk 00000000000EC3A8:

The mfsmaster log:

Mar 29 22:42:01 XXX-22 mfsmaster[7192]: chunk 00000000000EC3A8 has only invalid copies (2) - please repair it manually
Mar 29 22:42:01 XXX-22 mfsmaster[7192]: chunk 00000000000EC3A8_00000003 - invalid copy on (10.7.17.86 - ver:00000002)
Mar 29 22:42:01 XXX-22 mfsmaster[7192]: chunk 00000000000EC3A8_00000003 - invalid copy on (10.7.17.54 - ver:00000001)

The chunkserver logs:

Mar 29 17:07:43 XXX-85 mfschunkserver[26178]: write_block_to_chunk: file:/data8/mfsdata/A8/chunk_00000000000EC3A8_00000002.mfs - crc error
Mar 29 17:31:24 XXX-85 mfschunkserver[15680]: write_block_to_chunk: file:/data8/mfsdata/A8/chunk_00000000000EC3A8_00000003.mfs - crc error
Mar 29 17:07:43 XXX-86 mfschunkserver[8547]: write_block_to_chunk: file:/data9/mfsdata/A8/chunk_00000000000EC3A8_00000002.mfs - crc error

The files on the chunkservers (54, 85 and 86):

54: 41096192 Mar 29 17:07 chunk_00000000000EC3A8_00000001.mfs
85: 41096192 Mar 29 17:31 chunk_00000000000EC3A8_00000003.mfs
86: 41096192 Mar 29 17:07 chunk_00000000000EC3A8_00000002.mfs

MD5 values of the files:

7bd65382eb63db86d5b68395ae546f40 /data3/mfsdata/A8/chunk_00000000000EC3A8_00000001.mfs
aa8f3bab55dfbf3f7a2dbd42993e4e51 /data8/mfsdata/A8/chunk_00000000000EC3A8_00000003.mfs
9101e3feb0ecaea386afe0500df56941 /data9/mfsdata/A8/chunk_00000000000EC3A8_00000002.mfs

In fact, this chunk is part of the file "/mnt/mfs/test/p/20120329/0000000c/0000027e":

/mnt/mfs/test/p/20120329/0000000c/0000027e:
    chunk 0: 00000000000EC181_00000001 / (id:967041 ver:1)
        copy 1: 10.7.17.54:9422
        copy 2: 10.7.17.85:9422
        copy 3: 10.7.17.86:9422
    chunk 1: 00000000000EC1F3_00000001 / (id:967155 ver:1)
        copy 1: 10.7.17.54:9422
        copy 2: 10.7.17.55:9422
        copy 3: 10.7.17.86:9422
    ......
    chunk 6: 00000000000EC3A8_00000003 / (id:967592 ver:3)
        no valid copies !!!

When we use the mfsfileinfo command, mfsmount sends a "MATOCU_FUSE_READ_CHUNK" message to the master. If a chunk of the file is not valid, the response from the master does not contain the information we expect, and "no valid copies !!!" is printed (as for chunk 6: 00000000000EC3A8_00000003).

2. Questions

1). So far, I think the main cause of the "invalid copy" error is a chunk-version conflict. Am I right? What I do not understand is when the chunk version changes. From the logs, we find many files whose chunks have a version that is not 1, but 2, 3 or even 7. For example:

    chunk 0: 00000000000D5394_00000003 / (id:873364 ver:3)
        copy 1: 10.7.17.54:9422
        copy 2: 10.7.17.85:9422
        copy 3: 10.7.17.86:9422
    chunk 1: 00000000000D5505_00000003 / (id:873733 ver:3)
        copy 1: 10.7.17.55:9422
        copy 2: 10.7.17.85:9422
        copy 3: 10.7.17.86:9422
    chunk 2: 00000000000D55F3_00000007 / (id:873971 ver:7)
        copy 1: 10.7.17.54:9422
        copy 2: 10.7.17.55:9422
        copy 3: 10.7.17.86:9422
    ......

2). What happens to files waiting to be saved when a chunkserver goes down while mfsmount is already running? And when the chunkserver is restarted, is there any effect on files that were already saved (for example, when all the chunkservers lose power, possibly including the master)? According to "http://www.moosefs.org/moosefs-faq.html#master-online", when the master server goes down while mfsmount is already running, mfsmount does not disconnect the mounted resource, and files waiting to be saved stay in the queue for quite a long time while it tries to reconnect to the master server.

3). As we know, if we want to stop a chunkserver or remove one of its hard disks, we have to follow "http://www.moosefs.org/moosefs-faq.html#add_remove".
It takes a long time and many steps before we can remove the chunkserver or its disks. Is there a better method? We are thinking of adding an "access-level" value to the chunkserver: data can be written to it only when its access level is set to "WRITE"; otherwise the chunkserver is READ-ONLY. Once this is implemented, we could set the access level of a chunkserver to "READ-ONLY" when we want to stop it. So far we are not sure whether this method will work well, and we need to do more tests. Do you have any thoughts on this, or a better solution? Thanks.

4). When a disk of a chunkserver is marked "damaged" in the CGI monitor, does that mean the disk is read-only? And what causes the chunkserver to be marked "damaged"?

5). We can find the affected file from the logs, such as "/mnt/mfs/test/p/20120329/0000000c/0000027e" above. After I tried to repair this file with "mfsfilerepair", the version of chunk "00000000000EC3A8" changed to 2, not 3. What is the difference between 00000000000EC3A8_00000002 and 00000000000EC3A8_00000003? According to the MD5 values of these two files, their contents are not the same, so was any data lost in this mfsfilerepair operation?

# mfsfileinfo /mnt/mfs/test/p/20120329/0000000c/0000027e
/mnt/mfs/test/p/20120329/0000000c/0000027e:
    chunk 0: 00000000000EC181_00000001 / (id:967041 ver:1)
        copy 1: 10.7.17.54:9422
        copy 2: 10.7.17.85:9422
        copy 3: 10.7.17.86:9422
    ......
    chunk 6: 00000000000EC3A8_00000002 / (id:967592 ver:2)
        copy 1: 10.7.17.86:9422

Thanks,
Best Wishes,
Wenhua
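
For anyone comparing versions across many chunks, here is a minimal sketch (not part of MooseFS) that groups the "invalid copy" lines per chunk. It assumes the mfsmaster log format is exactly as in the excerpts above, and it reads the suffix after the chunk id as the version the master expects and "ver:" as the version found on that chunkserver; that reading is an assumption. The names INVALID_COPY and group_invalid_copies are hypothetical.

```python
import re

# Matches lines such as:
#   chunk 00000000000EC3A8_00000003 - invalid copy on (10.7.17.86 - ver:00000002)
# Assumption: the suffix after "_" is the version the master expects,
# and "ver:" is the version of the copy held by that chunkserver.
INVALID_COPY = re.compile(
    r"chunk (?P<id>[0-9A-F]{16})_(?P<want>[0-9A-F]{8}) - invalid copy on "
    r"\((?P<ip>[\d.]+) - ver:(?P<have>[0-9A-F]{8})\)"
)


def group_invalid_copies(lines):
    """Return {chunk id: {"expected": int, "copies": {ip: version}}}."""
    chunks = {}
    for line in lines:
        m = INVALID_COPY.search(line)
        if m is None:
            continue  # skip "has only invalid copies" and unrelated lines
        entry = chunks.setdefault(m.group("id"), {"expected": 0, "copies": {}})
        entry["expected"] = int(m.group("want"), 16)  # the latest line wins
        entry["copies"][m.group("ip")] = int(m.group("have"), 16)
    return chunks


# Sample taken from the 22:42:01 mfsmaster excerpt quoted above.
sample = [
    "Mar 29 22:42:01 XXX-22 mfsmaster[7192]: chunk 00000000000EC3A8 has only invalid copies (2) - please repair it manually",
    "Mar 29 22:42:01 XXX-22 mfsmaster[7192]: chunk 00000000000EC3A8_00000003 - invalid copy on (10.7.17.86 - ver:00000002)",
    "Mar 29 22:42:01 XXX-22 mfsmaster[7192]: chunk 00000000000EC3A8_00000003 - invalid copy on (10.7.17.54 - ver:00000001)",
]

if __name__ == "__main__":
    for chunk_id, info in group_invalid_copies(sample).items():
        print(chunk_id, "expected", info["expected"], "found", info["copies"])
```

Feeding it the whole mfsmaster log (for example, the lines of an open file object) gives one entry per chunk, which makes it easier to see which servers lag behind the expected version.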