From: TianYuchuan(田玉川) <ti...@fo...> - 2011-03-25 06:10:01
Hi! Thanks!

I have put a 15K RPM SAS disk into the master server and the problem is now solved. I have also upgraded MooseFS; the version is now mfs-1.6.20-2. After the upgrade, CPU usage is about 16%.

Another question, about upgrading MooseFS: I installed mfs-1.6.20-2 on a new server and started the master, chunkserver and client there. The old master server was not stopped. It was no longer connected to any chunkserver or any client, but its master process still occupied 80% of a CPU; after I restarted that master service, CPU usage dropped to 5%. Does the master not release the CPU on its own?

-----Original Message-----
From: Michal Borychowski [mailto:mic...@ge...]
Sent: Thursday, March 24, 2011 4:34 PM
To: 'TianYuchuan(田玉川)'; 'Shen Guowen'
Cc: moo...@li...
Subject: RE: [Moosefs-users] To access data was very slowly,nearly 2 minute。oh my god!

Hi!

You have almost all RAM consumed. As you have 100 million files in the system, we suggest putting some extra RAM into the master server. It would also be advisable to put an SSD disk into the master server so that the hourly metadata dump takes less time.

Kind regards
Michał Borychowski
MooseFS Support Manager
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Gemius S.A.
ul. Wołoska 7, 02-672 Warszawa
Budynek MARS, klatka D
Tel.: +4822 874-41-00
Fax : +4822 874-41-01

-----Original Message-----
From: TianYuchuan(田玉川) [mailto:ti...@fo...]
Sent: Thursday, March 17, 2011 10:03 AM
To: Shen Guowen
Cc: moo...@li...
Subject: [Moosefs-users] To access data was very slowly,nearly 2 minute。oh my god!

Hello,

Access to my MooseFS system is very slow and I have no idea why. Please help me, thanks!

There are 104964618 files and 104963962 chunks. The master load is not high, but at the top of every hour data cannot be accessed at all, and this lasts for several minutes. Even with little concurrency, access to data is delayed by a few seconds.

My MooseFS system has nine chunkservers (the last four columns are the counters for chunks marked for removal, as shown in the CGI monitor):

 #  host       ip             port  version  chunks    used     total    % used
 1  localhost  192.168.0.118  9422  1.6.19   23387618  3.6 TiB  4.5 TiB  79.72   0  0 B  0 B  -
 2  localhost  192.168.0.119  9422  1.6.19   23246974  3.6 TiB  4.5 TiB  79.72   0  0 B  0 B  -
 3  localhost  192.168.0.120  9422  1.6.19   23360333  3.6 TiB  4.5 TiB  79.72   0  0 B  0 B  -
 4  localhost  192.168.0.121  9422  1.6.19   23192013  3.6 TiB  4.5 TiB  79.69   0  0 B  0 B  -
 5  localhost  192.168.0.122  9422  1.6.19   23483418  3.6 TiB  4.5 TiB  79.70   0  0 B  0 B  -
 6  localhost  192.168.0.123  9422  1.6.19   23308366  3.6 TiB  4.5 TiB  79.70   0  0 B  0 B  -
 7  localhost  192.168.0.124  9422  1.6.19   23361992  3.6 TiB  4.5 TiB  79.69   0  0 B  0 B  -
 8  localhost  192.168.0.125  9422  1.6.19   23300478  3.6 TiB  4.5 TiB  79.70   0  0 B  0 B  -
 9  localhost  192.168.0.127  9422  1.6.19   23284897  3.5 TiB  4.5 TiB  78.72   0  0 B  0 B  -

--------------------------------------------------------------------------------------------------------------------------------------------------

[root@localhost mfs]# free -m
             total       used       free     shared    buffers     cached
Mem:         48295      46127       2168          0         38       8204
-/+ buffers/cache:      37884      10411
Swap:            0          0          0

Master CPU usage is around 95%, peaking at about 150%.

-----Original Message-----
From: Shen Guowen [mailto:sh...@ui...]
Sent: Monday, August 9, 2010 10:42 AM
To: TianYuchuan(田玉川)
Cc: moo...@li...
Subject: Re: [Moosefs-users] mfs-master[10546]: CS(192.168.0.125) packet too long (115289537/50000000)

Don't worry! This happens because some of your chunkservers are currently unreachable. The master server notices this and updates the metadata of the files stored on those chunkservers, setting "allvalidcopies" to 0 in "struct chunk". When the master rescans the files (fs_test_files() in filesystem.c) it finds that the number of valid copies is 0 and prints a message to syslog for each one, just as listed below.

However, this printing is quite time-consuming, especially when the number of files is large. During this period the master ignores the chunkservers' connections (it is stuck in the big file-test loop, and a single thread does all of this; arguably a pitfall). So even though the chunkservers themselves are working correctly, it does not help (you can see the reconnection attempts in the chunkservers' syslog). You can either let the master finish printing, after which it will reconnect to the chunkservers, notice that the files are still there, and set "allvalidcopies" back to the correct value, and everything works normally again; or you can recompile the program with lines 5512 and 5482 of filesystem.c (mfs-1.6.15) commented out, which suppresses the log messages and, of course, reduces the file-test time.
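In outline, the behaviour described above looks roughly like this (a simplified sketch, not the actual code from filesystem.c; NODEHASHSIZE's value, node_hash, fsnode and chunk_valid_copies() are illustrative stand-ins):

/* Simplified sketch of the fs_test_files() behaviour described above -
 * NOT the real MooseFS source; types and helper names are illustrative. */
#include <stdint.h>
#include <syslog.h>

#define NODEHASHSIZE 1048576                       /* illustrative value */

typedef struct fsnode {
    uint32_t inode;
    uint32_t chunk_count;
    uint64_t *chunkids;
    struct fsnode *next;
} fsnode;

extern fsnode *node_hash[NODEHASHSIZE];            /* hypothetical node hash table */
extern uint32_t chunk_valid_copies(uint64_t id);   /* hypothetical helper */

void fs_test_files_sketch(void) {
    static uint32_t i = 0;       /* position in the hash, kept between calls */
    uint32_t k, idx;
    fsnode *f;

    /* Called from the single-threaded main loop (apparently about once per
     * second, given the /3600 divisor), scanning only a slice of the hash
     * per call, so one full pass over all files takes roughly an hour. */
    for (k = 0; k < (NODEHASHSIZE / 3600) && i < NODEHASHSIZE; k++, i++) {
        for (f = node_hash[i]; f != NULL; f = f->next) {
            for (idx = 0; idx < f->chunk_count; idx++) {
                if (chunk_valid_copies(f->chunkids[idx]) == 0) {
                    /* One syslog line per missing chunk (plus one per file).
                     * With tens of millions of affected chunks this logging
                     * dominates the loop, and chunkserver reconnections are
                     * not serviced until the current slice finishes. */
                    syslog(LOG_WARNING,
                           "currently unavailable chunk %016llX (inode: %u ; index: %u)",
                           (unsigned long long)f->chunkids[idx],
                           (unsigned)f->inode, (unsigned)idx);
                }
            }
        }
    }
    if (i >= NODEHASHSIZE) {
        i = 0;                   /* start the next pass from the beginning */
    }
}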
Below is from Michal:
-----------------------------------------------------------------------
We give you here some quick patches you can apply to the master server to improve its performance with that number of files.

In matocsserv.c in mfsmaster you need to change this line:

#define MaxPacketSize 50000000

into this:

#define MaxPacketSize 500000000

We also suggest a change in filesystem.c in mfsmaster, in the "fs_test_files" function. Change this line:

if ((uint32_t)(main_time())<=starttime+150) {

into:

if ((uint32_t)(main_time())<=starttime+900) {

And also change this line:

for (k=0 ; k<(NODEHASHSIZE/3600) && i<NODEHASHSIZE ; k++,i++) {

into this:

for (k=0 ; k<(NODEHASHSIZE/14400) && i<NODEHASHSIZE ; k++,i++) {

Then recompile the master server and start it again. The above changes should make the master server work more stably with a large number of files.

Another suggestion would be to create two MooseFS instances (e.g. 2 x 200 million files). One master server could also be the metalogger for the other system, and vice versa.

Kind regards
Michał
-----------------------------------------------------------------------------
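For reference, here are the three changes above collected into one diff-style summary (the quoted lines are exactly as given above; actual line numbers and surrounding context vary between MooseFS releases):

--- mfsmaster/matocsserv.c
-#define MaxPacketSize 50000000
+#define MaxPacketSize 500000000

--- mfsmaster/filesystem.c  (in fs_test_files)
-if ((uint32_t)(main_time())<=starttime+150) {
+if ((uint32_t)(main_time())<=starttime+900) {

-for (k=0 ; k<(NODEHASHSIZE/3600) && i<NODEHASHSIZE ; k++,i++) {
+for (k=0 ; k<(NODEHASHSIZE/14400) && i<NODEHASHSIZE ; k++,i++) {

The first change raises the limit that was rejecting the oversized chunkserver packets (the "packet too long (115289537/50000000)" errors in this thread). The other two presumably postpone the start of the file test from 150 to 900 seconds after master startup, giving chunkservers more time to register, and spread one full pass of fs_test_files over roughly four hours instead of one, so each slice of the loop does about a quarter of the work.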
--
Guowen Shen

On Sun, 2010-08-08 at 22:51 +0800, TianYuchuan(田玉川) wrote:
> Hello everyone!
> I have a big question, please help me, thank you very much.
> We intend to use MooseFS in our production environment as the storage for
> our online photo service. We will store about 200 million photo files.
> I have built one master server (48 GB RAM), one metalogger server, and eight
> chunkservers (8 x 1 TB SATA). When I started copying photo files into the
> MooseFS system, everything was good at first. But after I had copied about
> 57 million files, the master machine's CPU usage went to 100%.
> I stopped the master with "/usr/local/mfs/sbin/mfsmaster -s" and then started
> it again, but there was a big problem: the master had not read my files.
> These files are important to me and I am very anxious; please help me recover
> them, thanks.
>
> I got many errors in syslog on the master server:
>
> Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 41991323: 2668/2526212449954462668/176s.jpg
> Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 00000000043CD358 (inode: 50379931 ; index: 0)
> Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 50379931: 2926/4294909215566102926/163b.jpg
> Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 00000000002966C3 (inode: 48284 ; index: 0)
> Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 48284: bookdata/178/8533354296639220178/180b.jpg
> Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 0000000000594726 (inode: 4242588 ; index: 0)
> Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 4242588: bookdata/6631/4300989258725036631/85s.jpg
> Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 0000000000993541 (inode: 8436892 ; index: 0)
> Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 8436892: bookdata/7534/3147352338521267534/122b.jpg
> Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 0000000000D906E6 (inode: 12631196 ; index: 0)
> Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 12631196: bookdata/8691/11879047433161548691/164s.jpg
> Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 000000000118DC1E (inode: 16825500 ; index: 0)
> Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 16825500: bookdata/1232/17850056326363351232/166b.jpg
> Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 0000000001681BC7 (inode: 21019804 ; index: 0)
> Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 21019804: bookdata/26/12779298489336140026/246s.jpg
> Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 0000000001A804E1 (inode: 25214108 ; index: 0)
> Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 25214108: bookdata/3886/8729781571075193886/30s.jpg
> Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 0000000001E7E826 (inode: 29408412 ; index: 0)
> Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 29408412: bookdata/4757/142868991575144757/316b.jpg
>
> Aug 7 23:56:36 localhost mfsmaster[10546]: CS(192.168.0.124) packet too long (115289537/50000000)
> Aug 7 23:56:36 localhost mfsmaster[10546]: chunkserver disconnected - ip: 192.168.0.124, port: 0, usedspace: 0 (0.00 GiB), totalspace: 0 (0.00 GiB)
> Aug 8 00:08:14 localhost mfsmaster[10546]: CS(192.168.0.127) packet too long (104113889/50000000)
> Aug 8 00:08:14 localhost mfsmaster[10546]: chunkserver disconnected - ip: 192.168.0.127, port: 0, usedspace: 0 (0.00 GiB), totalspace: 0 (0.00 GiB)
> Aug 8 00:21:03 localhost mfsmaster[10546]: CS(192.168.0.120) packet too long (117046565/50000000)
> Aug 8 00:21:03 localhost mfsmaster[10546]: chunkserver disconnected - ip: 192.168.0.120, port: 0, usedspace: 0 (0.00 GiB), totalspace: 0 (0.00 GiB)
> When I visited the MFS CGI monitor, the error was "Can't connect to MFS master
> (IP:127.0.0.1 ; PORT:9421)".
>
> Thanks all!