From: Guowen S. <guo...@gm...> - 2010-06-01 13:55:03
Hi,

I was testing MFS with 16 GB of RAM on the master. After 16,000,000 files had been written, I ran “mfschunkserver stop” on one chunkserver and left it stopped for some time. Then the trouble started. The master's log filled with a large number of messages like these:

---------------------------------------------------------------------------------------------------------------------
* currently unavailable file 158921: photos/47/74/82/r_163042.jpg
currently unavailable chunk 00000000004465CA (inode: 4353226 ; index: 0)
---------------------------------------------------------------------------------------------------------------------

The master stays in the loop in “fs_test_files()” and does not return to serving clients. The goal is 1, of course.

I tried to reconnect the chunkserver to the master, but it failed. The master could not provide normal service to clients and could not accept packets from the chunkserver; it was not even possible to get information from mfscgiserv. MFS had effectively crashed.

From the code, the master decreases the value of “allvalidcopies” and then, after entering “fs_test_files()”, prints log messages like the ones above in a loop. Because the number of files is so large, this lasts for a very long time (most of the time is probably spent in “printf”).

I think it is quite common for a chunkserver to disconnect temporarily and then connect again, so maybe “fs_test_files()” could run in a separate thread.

What do you think about this problem?

--
Yours sincerely,
Guowen Shen <guo...@gm...>
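
To make the suggestion above concrete, here is a minimal sketch in C of one possible way to keep the master responsive during such a scan. It is not taken from the MooseFS source, and it uses a different technique from the separate thread proposed above: the scan is split into bounded steps driven by the main loop, and the number of log lines per pass is capped. All names here (`scan_state`, `scan_step`, `file_unavailable`, the `valid_copies` array) are hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical scan state; none of these names come from the MooseFS source. */
typedef struct {
    size_t next;        /* index of the next file to examine */
    size_t total;       /* number of files in the table */
    size_t unavailable; /* unavailable files found so far in this pass */
    size_t logged;      /* messages actually printed in this pass */
    size_t log_limit;   /* cap on printed messages per pass */
} scan_state;

/* Stand-in for the real availability check (an assumption): a count of 0
   means no chunkserver currently holds a valid copy of file i. */
static int file_unavailable(const unsigned char *valid_copies, size_t i) {
    return valid_copies[i] == 0;
}

/* Examine at most `budget` files, then return so the caller's event loop
   can go back to serving clients and chunkservers.
   Returns 1 when the whole pass is complete, 0 otherwise. */
static int scan_step(scan_state *st, const unsigned char *valid_copies,
                     size_t budget) {
    assert(budget > 0);
    size_t remaining = st->total - st->next;
    size_t n = budget < remaining ? budget : remaining;
    size_t end = st->next + n;

    for (; st->next < end; st->next++) {
        if (!file_unavailable(valid_copies, st->next)) {
            continue;
        }
        st->unavailable++;
        /* Print only the first log_limit messages; count the rest. */
        if (st->logged < st->log_limit) {
            fprintf(stderr, "currently unavailable file %zu\n", st->next);
            st->logged++;
        }
    }

    if (st->next < st->total) {
        return 0; /* more work left; call again on the next loop iteration */
    }
    if (st->unavailable > st->logged) {
        fprintf(stderr, "... %zu more unavailable files not logged\n",
                st->unavailable - st->logged);
    }
    return 1;
}
```

The caller would invoke `scan_step()` once per iteration of its main loop with a small budget, handling network traffic in between. A separate thread, as suggested above, would also work, but it would need locking around the shared file table.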