From: Fabien G. <fab...@gm...> - 2010-06-21 14:41:19
Hello,

We had exactly the same issue as Marco this morning: while copying lots of files, the cluster suddenly stopped working with the same error messages. The three source-code modifications provided by Michal, plus a recompilation of the mfsmaster binary, solved the problem; it's back to life :-)

Note that we "only" have 11'480'000 chunks (whereas Gemius seems to run a 26'000'000-chunk MFS cluster). Do you have any clue why this can happen when our current cluster is quite small?

Our configuration: one master server (8 GB of RAM), one master backup server, and 5 chunk servers (1 GB of RAM and 2 x 4 TB HDDs on each chunk server, with about 2'200'000 chunks on each HDD, which means about 4'500'000 chunks stored on each chunk server).

Regards,
Fabien

2010/6/21 Michał Borychowski <mic...@ge...>

> We give you here some quick patches you can apply to the master
> server to improve its performance for that number of files:
>
> In matocsserv.c in mfsmaster you need to change this line:
>
> #define MaxPacketSize 50000000
>
> into this:
>
> #define MaxPacketSize 500000000
>
> We also suggest a change in filesystem.c in mfsmaster, in the
> "fs_test_files" function. Change this line:
>
> if ((uint32_t)(main_time())<=starttime+150) {
>
> into:
>
> if ((uint32_t)(main_time())<=starttime+900) {
>
> And also change this line:
>
> for (k=0 ; k<(NODEHASHSIZE/3600) && i<NODEHASHSIZE ; k++,i++) {
>
> into this:
>
> for (k=0 ; k<(NODEHASHSIZE/14400) && i<NODEHASHSIZE ; k++,i++) {
>
> You need to recompile the master server and start it again. The above
> changes should make the master server more stable with a large number of
> files.
>
> Another suggestion would be to create two MooseFS instances (e.g. 2 x 200
> million files). One master server could also be the metalogger for the other
> system and vice versa.
>
> Kind regards
> Michał
>
> *From:* marco lu [mailto:mar...@gm...]
> *Sent:* Monday, June 21, 2010 6:04 AM
> *To:* moo...@li...
> *Subject:* [Moosefs-users] mfs-master[4166]: CS(10.10.10.10) packet too long (226064141/50000000)
>
> Hi everyone,
>
> We intend to use MooseFS in our production environment as the storage for our online photo service.
> We'll store about 400 million photo files, so the master server's memory is a big problem.
>
> I've built one master server (64 GB of memory), one metalogger server, and three chunk servers (10 x 1 TB SATA). When I copy photo files to the MooseFS system, everything is fine at first. But once the master server exhausts its memory, I get many error messages in syslog from the master server:
>
> Jun 21 11:48:58 mfs-master[4166]: currently unavailable chunk 00000000018140FF (inode: 26710547 ; index: 0)
> Jun 21 11:48:58 mfs-master[4166]: * currently unavailable file 26710547: img.xxx.com/003/810/560/b.jpg
> Jun 21 11:48:58 mfs-master[4166]: currently unavailable chunk 000000000144B907 (inode: 22516243 ; index: 0)
> Jun 21 11:48:58 mfs-master[4166]: * currently unavailable file 22516243: img.xxx.com/051/383/419/a.jpg
>
> and some error messages like this:
>
> Jun 21 11:49:31 mfs-master[4166]: chunkserver disconnected - ip: 10.10.10.11, port: 0, usedspace: 0 (0.00 GiB), totalspace: 0 (0.00 GiB)
> Jun 21 11:50:03 mfs-master[4166]: CS(10.25.40.111) packet too long (226064141/50000000)
> Jun 21 11:50:03 mfs-master[4166]: chunkserver disconnected - ip: 10.10.10.12, port: 0, usedspace: 0 (0.00 GiB), totalspace: 0 (0.00 GiB)
> Jun 21 11:50:34 mfs-master[4166]: CS(10.25.40.113) packet too long (217185941/50000000)
> Jun 21 11:50:34 mfs-master[4166]: chunkserver disconnected - ip: 10.10.10.13, port: 0, usedspace: 0 (0.00 GiB), totalspace: 0 (0.00 GiB)
>
> Is it a memory problem or a kernel tuning problem? Can anyone give me some information?
>
> Thanks all.
>
> Mumonitor