From: marco lu <mar...@gm...> - 2010-06-21 04:03:55
Hi everyone,

We intend to use MooseFS in our production environment as the storage for our online photo service. We will store about 400 million photo files, so the master server's memory is a big concern.

I've built one master server (64 GB of RAM), one metalogger server, and three chunk servers (10 x 1 TB SATA). When I copy photo files to the MooseFS system, everything is fine at first. But once the master server exhausts its memory, I get many errors in syslog from the master server:

    Jun 21 11:48:58 mfs-master[4166]: currently unavailable chunk 00000000018140FF (inode: 26710547 ; index: 0)
    Jun 21 11:48:58 mfs-master[4166]: * currently unavailable file 26710547: img.xxx.com/003/810/560/b.jpg
    Jun 21 11:48:58 mfs-master[4166]: currently unavailable chunk 000000000144B907 (inode: 22516243 ; index: 0)
    Jun 21 11:48:58 mfs-master[4166]: * currently unavailable file 22516243: img.xxx.com/051/383/419/a.jpg

and some error messages like this:

    Jun 21 11:49:31 mfs-master[4166]: chunkserver disconnected - ip: 10.10.10.11, port: 0, usedspace: 0 (0.00 GiB), totalspace: 0 (0.00 GiB)
    Jun 21 11:50:03 mfs-master[4166]: CS(10.25.40.111) packet too long (226064141/50000000)
    Jun 21 11:50:03 mfs-master[4166]: chunkserver disconnected - ip: 10.10.10.12, port: 0, usedspace: 0 (0.00 GiB), totalspace: 0 (0.00 GiB)
    Jun 21 11:50:34 mfs-master[4166]: CS(10.25.40.113) packet too long (217185941/50000000)
    Jun 21 11:50:34 mfs-master[4166]: chunkserver disconnected - ip: 10.10.10.13, port: 0, usedspace: 0 (0.00 GiB), totalspace: 0 (0.00 GiB)

Is this a memory problem or a kernel tuning problem? Can anyone give me some information?

Thanks all.

Mumonitor
From: Roast <zha...@gm...> - 2010-06-21 10:58:03
Master server clustering, or the ability to store metadata on disk, would be a great feature for us.

--
The time you enjoy wasting is not wasted time!

_______________________________________________
moosefs-users mailing list
moo...@li...
https://lists.sourceforge.net/lists/listinfo/moosefs-users
From: Michał B. <mic...@ge...> - 2010-06-21 11:51:08
Here are some quick patches you can apply to the master server to improve its performance with that number of files.

In matocsserv.c in mfsmaster, change this line:

    #define MaxPacketSize 50000000

into this:

    #define MaxPacketSize 500000000

We also suggest a change in filesystem.c in mfsmaster, in the "fs_test_files" function. Change this line:

    if ((uint32_t)(main_time())<=starttime+150) {

into:

    if ((uint32_t)(main_time())<=starttime+900) {

And also change this line:

    for (k=0 ; k<(NODEHASHSIZE/3600) && i<NODEHASHSIZE ; k++,i++) {

into this:

    for (k=0 ; k<(NODEHASHSIZE/14400) && i<NODEHASHSIZE ; k++,i++) {

You need to recompile the master server and start it again. These changes should make the master server more stable with a large number of files.

Another suggestion would be to create two MooseFS instances (e.g. 2 x 200 million files). Each master server could also act as the metalogger for the other system.

Kind regards
Michał

From: marco lu [mailto:mar...@gm...]
Sent: Monday, June 21, 2010 6:04 AM
To: moo...@li...
Subject: [Moosefs-users] mfs-master[4166]: CS(10.10.10.10) packet too long (226064141/50000000)

[...]
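To make the first patch concrete, here is a minimal sketch of the kind of length check that would produce the "packet too long (226064141/50000000)" lines from the original report. It is not the actual matocsserv.c code: the helper name packet_length_ok and the two MAX_PACKET_SIZE_* macros are hypothetical, and only the two limit values and the logged packet sizes come from this thread.

```c
#include <assert.h>
#include <stdint.h>

/* The two limits quoted in the thread: the shipped value and the patched one.
 * The macro names are made up for this sketch. */
#define MAX_PACKET_SIZE_ORIGINAL 50000000u
#define MAX_PACKET_SIZE_PATCHED  500000000u

/* Hypothetical helper: returns 1 if a packet announcing `length` bytes is
 * accepted under `limit`, and 0 if it would be rejected (the case the
 * master reports as "packet too long"). */
static int packet_length_ok(uint32_t length, uint32_t limit)
{
    assert(limit > 0);
    return length <= limit;
}
```

With the sizes from the log, packet_length_ok(226064141u, MAX_PACKET_SIZE_ORIGINAL) is 0 and packet_length_ok(226064141u, MAX_PACKET_SIZE_PATCHED) is 1, which matches the report that raising the limit let the chunkservers reconnect.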
From: Fabien G. <fab...@gm...> - 2010-06-21 14:41:19
Hello,

We had exactly the same issue as Marco this morning (while copying lots of files, it suddenly stopped working with the same error messages). The three source code modifications provided by Michal, plus recompiling the mfsmaster binary, solved the problem. It's back to life :-)

Note that we "only" have 11'480'000 chunks (whereas Gemius seems to run a 26'000'000-chunk MFS cluster). Do you have any clue why this can happen when our current cluster is quite small?

Our configuration: one master server (8 GB of RAM), one master backup server, and 5 chunk servers (1 GB of RAM and 2 x 4 TB HDD on each chunkserver, with about 2'200'000 chunks on each HDD, which means about 4'500'000 chunks stored on each chunk server).

Regards,
Fabien
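Two of the three modifications discussed in this thread touch the loop in fs_test_files that walks the node hash table in slices of NODEHASHSIZE/3600 (patched to NODEHASHSIZE/14400) buckets. The arithmetic below is only a sketch of what that divisor means. The helper names are hypothetical, the real value of NODEHASHSIZE is not given in the thread, and the idea that the loop runs once per second (so that 3600 calls is about one hour and 14400 about four) is an assumption, not something the thread states.

```c
#include <assert.h>
#include <stdint.h>

/* Buckets visited per call when the table is cut into `divisor` slices,
 * mirroring the bound k < (NODEHASHSIZE / divisor) from the patch. */
static uint32_t buckets_per_call(uint32_t node_hash_size, uint32_t divisor)
{
    assert(divisor > 0);
    return node_hash_size / divisor;
}

/* Number of calls needed to visit every bucket once, rounded up. */
static uint32_t calls_per_full_sweep(uint32_t node_hash_size, uint32_t divisor)
{
    uint32_t step = buckets_per_call(node_hash_size, divisor);
    assert(step > 0);
    return (node_hash_size + step - 1) / step;
}
```

For an illustrative table of 4194304 buckets (a made-up size), the original divisor gives 1165 buckets per call and the patched one gives 291, so each call does roughly a quarter of the work while a full sweep takes about four times as many calls.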
From: Michał B. <mic...@ge...> - 2010-06-23 09:09:34
Hi Fabien!

The important difference is probably the number of chunks per chunkserver. We have about 800,000 chunks per chunkserver (60 million chunks on 75 machines).

How many files do you have? What is the average size of a file? What goal do you have set?

Regards
Michał
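The chunks-per-chunkserver comparison can be turned into a rough estimate. One plausible reading, not confirmed anywhere in this thread, is that a chunkserver reports its chunk list to the master in a packet whose size grows linearly with the number of chunks it holds. The sketch below assumes a fixed per-chunk record size; the value of 12 bytes used in the usage note is purely an assumption for illustration, and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Average number of chunks held by each chunkserver. */
static uint64_t chunks_per_server(uint64_t total_chunks, uint32_t servers)
{
    assert(servers > 0);
    return total_chunks / servers;
}

/* ASSUMPTION: every chunk adds a fixed `bytes_per_chunk` to the report a
 * chunkserver sends to the master; any fixed header is ignored here. */
static uint64_t estimated_chunk_list_bytes(uint64_t chunks,
                                           uint32_t bytes_per_chunk)
{
    assert(bytes_per_chunk > 0);
    return chunks * (uint64_t)bytes_per_chunk;
}
```

Under that assumed 12-byte record, 800'000 chunks per server would need about 9.6 million bytes, well under the original 50000000 limit, while 4'500'000 chunks per server would need about 54 million bytes and exceed it. That would be consistent with a small cluster with few, heavily loaded chunkservers hitting the limit before a much larger cluster with many lightly loaded ones.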
From: marco lu <mar...@gm...> - 2010-06-22 08:17:18
Thank you, Michał Borychowski!

This problem is resolved, and the MFS system is restored too.

Another question: after I recompiled mfsmaster as you suggested, the mfscgiserv process no longer works normally. The process disappears when I visit its URL, without any message (in syslog or dmesg) to help debug the problem.

Thanks again.

Mumonitor
From: Michał B. <mic...@ge...> - 2010-06-22 10:47:35
Mfscgiserv was not touched by the patches. We ran tests with exactly these patches and it worked properly.

You can also try running mfscgiserv with the options -f and -v:

    /usr/local/sbin/mfscgiserv -f -v

This way mfscgiserv runs in the foreground and prints the requests it handles, like this:

    # /usr/local/sbin/mfscgiserv -f -v
    starting simple cgi server (host: any , port: 9425 , rootpath: /usr/local/share/mfscgi)
    Asynchronous HTTP server running on port 9425
    localhost - - [22/Jun/2010 11:14:11] "GET / HTTP/1.1" 301
    localhost - - [22/Jun/2010 11:14:11] "GET /index.html HTTP/1.1" 200
    localhost - - [22/Jun/2010 11:14:12] "GET /mfs.cgi HTTP/1.1" 200
    localhost - - [22/Jun/2010 11:14:12] "GET /mfs.css HTTP/1.1" 200
    localhost - - [22/Jun/2010 11:14:12] "GET /logomini.png HTTP/1.1" 200

This should give us more useful information.

We were also wondering whether you could test your 400-million-file environment with the master's swap file placed on an SSD?

Regards
Michał
From: Fabien G. <fab...@gm...> - 2010-06-23 15:10:48
Hi Michal and moosefs-users@,

2010/6/23 Michał Borychowski <mic...@ge...>

> Probably important is the difference in the amount of chunks per one
> chunkserver. We have about 800,000 chunks per chunkserver (60 million chunks
> on 75 machines).

Thanks for your quick answer. Yes, you're right, that's what I thought too: 4.7M chunks per chunkserver is far more than 800K.

Just a question: how much RAM do you use on the master, and on the slaves? In our case (11'800'000 chunks on 5 chunkservers):

* 'mfsmaster' process on the master: 4.9 GB (64-bit recompilation required: the 32-bit version of mfsmaster crashed without a message when it reached 4 GB)
* 'mfschunkserver' process on the chunkservers: 580 MB

> How many files do you have? What is the average size of a file? What goal
> do you have set?

Our current test cluster is used for backup storage. We have lots of files of all sizes (rsync of /etc, /home/, ... from several servers), and also a lot of big archive files (several GB each). We currently have 14.7 million inodes in use.

Maybe later we'd like to use it for webhosting. But since LOCK is not supported, that's not yet possible.

Fabien
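The figures in this message (about 4.9 GB of mfsmaster memory for 14.7 million inodes) invite a back-of-the-envelope extrapolation to the 400 million files from the original question. The sketch below does only a naive linear scaling. The helper name is hypothetical, and the linearity itself is an assumption: this cluster holds many multi-gigabyte files with many chunks each, so its memory-per-inode ratio may not transfer to a photo store full of small files.

```c
#include <assert.h>

/* Naive linear extrapolation: scale a measured memory footprint by the
 * ratio of object counts. The result is in the same unit as `measured_gb`. */
static double extrapolate_ram_gb(double measured_gb,
                                 double measured_objects,
                                 double target_objects)
{
    assert(measured_objects > 0.0);
    return measured_gb / measured_objects * target_objects;
}
```

Calling extrapolate_ram_gb(4.9, 14.7e6, 400e6) gives a little over 133 GB, roughly twice the 64 GB master described at the start of the thread. Treat this as an order-of-magnitude hint rather than a sizing rule.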
From: Michał B. <mic...@ge...> - 2010-06-29 07:23:59
Hi!

> Just a question: how much RAM do you use on the master, and on the slaves?
> In our case (11'800'000 chunks on 5 chunkservers):
> * 'mfsmaster' process on the master: 4.9 GB (64-bit recompilation required: the 32-bit version of mfsmaster crashed without a message when it reached 4 GB)
> * 'mfschunkserver' process on the chunkservers: 580 MB

[MB] 32-bit machines are not capable of addressing more than 4 GB, so that was quite normal behaviour.

Regarding memory, please have a look at this FAQ entry: http://www.moosefs.org/moosefs-faq.html#cpu. It has information about CPU load and RAM usage. Keep in mind that RAM usage depends on the total number of files and folders (not on their size), and CPU load in mfsmaster depends on the number of operations taking place in the filesystem.

[...]

> Maybe later we'd like to use it for webhosting. But since LOCK is not supported, that's not yet possible.

[MB] What do you need LOCK for in webhosting?

Kind regards
Michal
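The 4 GB ceiling mentioned here follows from pointer width: a process built for a 32-bit target cannot describe more than 2^32 bytes with a pointer or a size_t, whatever the amount of RAM in the machine. A minimal sketch for checking what a given build can address is shown below; both helper names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Width of a data pointer in this build, in bits (typically 32 or 64). */
static unsigned pointer_width_bits(void)
{
    return (unsigned)(sizeof(void *) * CHAR_BIT);
}

/* Returns 1 if an object of `bytes` bytes can even be expressed as a
 * size_t in this build, 0 otherwise. It says nothing about whether that
 * much memory is actually available. */
static int fits_in_address_space(uint64_t bytes)
{
    assert(sizeof(size_t) <= sizeof(uint64_t));
    return bytes <= (uint64_t)SIZE_MAX;
}
```

On a 32-bit build, fits_in_address_space() returns 0 for anything above 4 GiB minus one byte, so a 4.9 GB metadata set like the one reported above cannot be held by such a binary.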
From: Fabien G. <fab...@gm...> - 2010-07-08 23:41:09
Hi all,

2010/6/29 Michał Borychowski <mic...@ge...>

> > * 'mfsmaster' process on master : 4.9 GB (64 bits recompilation required
> > : the 32 bits version of mfsmaster crashed without a message when it came to
> > 4 GB)
> > * 'mfschunkserver' process on chunkservers : 580 MB
>
> [MB] 32bit machines are not capable of addressing more than 4GB, so that
> was quite a normal behaviour.
>
> Regarding memory please have a look at this FAQ entry:
> http://www.moosefs.org/moosefs-faq.html#cpu

Yes, I read that page (actually I read every page of moosefs.org to really understand how it works!). It was just a mistake on my part to compile it on a 32-bit platform. But maybe you could tell the dev team that in case of a memory allocation failure, mfsmaster crashes without a message... well, if we consider that "segmentation fault" is not a real error message :-)

> > Maybe later, we'd like to use it for webhosting. But since LOCK is not
> > supported, it's not yet possible.
>
> [MB] What for do you need LOCK for webhosting?

Dynamic websites that write information to files. Several web servers using the same MooseFS could try to write to the same file at the same time.

Fabien
From: Michał B. <mic...@ge...> - 2010-07-09 08:26:02
> Yes I read that page (actually I read every page of moosefs.org to really understand how it works!).

[MB] Perfect :)

> It's just a mistake I made to compile it on a 32-bit platform. But maybe you could tell the dev team that in case of memory allocation failure, mfsmaster crashes without a message... well, if we consider that "segmentation fault" is not a real error message :-)

[MB] It's on our todo list (but to be honest, with low priority: one cannot expect a 32-bit machine to work with more than 4 GB of RAM :))

> > Maybe later, we'd like to use it for webhosting. But since LOCK is not supported, it's not yet possible.
> > [MB] What for do you need LOCK for webhosting?
> Dynamic websites, writing information to files. Several web servers using the same MooseFS could try to write to the same file at the same time.

[MB] It should not be a problem for you. There is a mechanism of chunk locking for writes, but writing would be slow. There is no mechanism for informing a client waiting to write that the lock has been released (we will probably implement it at some point). So for now, a client that could not start writing will try again every second. In theory this can lead to starvation, but in practice it shouldn't.

So this is a safe operation, but it is still not recommended. It is better when different processes on different machines write to different files, and some other system later combines the data from those files into one target file (something like "map-reduce" processing).

The only problem would be simultaneous appending (writing at the end) to the same file by two clients.

Please also read the thread "Append and seek while writing functionality" in the group archive:
http://sourceforge.net/mailarchive/forum.php?forum_name=moosefs-users&max_rows=25&style=ultimate&viewmonth=201006

Regards
Michał
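The "try again every second" behaviour described above can be pictured as a plain polling loop. The sketch below is an illustration of that pattern only, not MooseFS client code: the callback type, the function names and the attempt cap are all hypothetical, and the delay is a parameter so that the once-per-second interval from the message is just one possible setting.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback: returns 1 when the write may start, 0 when the
 * chunk is still locked by another writer. */
typedef int (*try_start_write_fn)(void *ctx);

/* Polls `try_start` up to `max_attempts` times, sleeping `delay_seconds`
 * between attempts. Returns the attempt number that succeeded, or 0 if
 * every attempt found the chunk busy (the starvation case). */
static int retry_until_writable(try_start_write_fn try_start, void *ctx,
                                unsigned delay_seconds, unsigned max_attempts)
{
    assert(try_start != NULL);
    for (unsigned attempt = 1; attempt <= max_attempts; attempt++) {
        if (try_start(ctx)) {
            return (int)attempt;
        }
        if (attempt < max_attempts && delay_seconds > 0) {
            struct timespec pause = { .tv_sec = (time_t)delay_seconds,
                                      .tv_nsec = 0 };
            nanosleep(&pause, NULL);
        }
    }
    return 0;
}

/* Example callback for illustration: reports "busy" twice, then succeeds.
 * `ctx` points to an int counting how often it has been called. */
static int busy_twice_then_free(void *ctx)
{
    int *calls = ctx;
    *calls += 1;
    return *calls >= 3;
}
```

Because nothing wakes the waiting client, a writer that keeps losing the race simply uses up its attempts, which is the theoretical starvation mentioned in the message. Passing delay_seconds as 1 reproduces the interval described there.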
From: Fabien G. <fab...@gm...> - 2010-07-09 08:48:01
|
Hi,

2010/7/9 Michał Borychowski <mic...@ge...>

> It's just a mistake I made to compile it on a 32 bits platform.
>
> But maybe you could tell the dev team that in case of memory allocation
> failure, mfsmaster crashes without a message... well, if we consider that
> "segmentation fault" is not a real error message :-)
>
> [MB] It’s on our todo list (but to be honest, with low priority – one
> cannot expect a 32bit machine to work with more than 4GB RAM :))

Sure! But the problem is more general than 32-bit machines: just catching the "no more memory available" error would be great, since it can happen on both 32-bit and 64-bit machines. For example, on our 64-bit machine with a 64-bit compiled mfsmaster binary, the metadata has become so big that mfsmaster crashed, and we can't even restore it, since the restore takes too much memory and ends with a segmentation fault:

[root@mfsmaster ~]# mfsmetarestore -a -d /data/MFS/
loading objects (files,directories,etc.) ... ok
loading names ... ok
loading deletion timestamps ... ok
checking filesystem consistency ... ok
loading chunks data ... Segmentation fault
[root@mfsmaster ~]#

[root@mfsmaster ~]# strace mfsmetarestore -a -d /data/MFS/
[...]
read(3, "\0\0\0\0\36\347\314\0\0\0\1\0\0\0\0\0\0\0\0\0\35\347\314\0\0\0\1\0\0\0\0\0"..., 4096) = 4096
mmap2(NULL, 561152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
brk(0xb08e8000) = 0xffffffffb0844000
mmap2(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap2(NULL, 2097152, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap2(NULL, 1048576, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap2(NULL, 2097152, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap2(NULL, 1048576, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
[root@mfsmaster ~]#

> [MB] What for do you need LOCK for webhosting?
>
> Dynamic websites, writing information to files. Several web servers using
> the same MooseFS could try to write to the same file in the same time.
>
> [MB] It should not be a problem for you.
>
> There is a mechanism of chunk locking for write, but the writing process
> would be slow. There is no mechanism of informing the client waiting to
> write that the lock had been released (probably we’ll implement it one
> time). So now client which couldn’t start writing process will try again
> every second. This solution can in theory lead to starvation. But
> practically it shouldn’t.

Oh, OK! I missed that part of the documentation, shame on me. Thank you for the information.

> So this is a safe operation but still is not recommended. It is better
> when different process on different machines write to different files and
> later some other system combine this data from many files into one target
> file (something like in “map-reduce” processing).

I totally agree with you, Michal, but we do web hosting for thousands of customers, and most of them don't even know what a cluster is ;-)

Fabien |
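The pattern recommended in the quoted text above, where each machine writes to its own file and a separate job later combines them, can be sketched roughly as follows. This is only an illustration in Python, not MooseFS code: the directory layout, the `.part` suffix, and the function names `append_record` and `combine` are all assumptions made for the example.

```python
import os
import tempfile


def append_record(directory, writer_id, line):
    """Append one line to the file owned by a single writer (hypothetical layout)."""
    path = os.path.join(directory, f"{writer_id}.part")
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")
    return path


def combine(directory, target):
    """Merge every per-writer file into one target file, in sorted file-name order."""
    parts = sorted(p for p in os.listdir(directory) if p.endswith(".part"))
    with open(target, "w", encoding="utf-8") as out:
        for name in parts:
            with open(os.path.join(directory, name), encoding="utf-8") as f:
                out.write(f.read())
    return len(parts)


# Two hypothetical web servers log to their own files; no file has two writers.
workdir = tempfile.mkdtemp()
append_record(workdir, "web1", "hit /index.html")
append_record(workdir, "web2", "hit /about.html")
append_record(workdir, "web1", "hit /contact.html")

merged_path = os.path.join(workdir, "combined.log")
merged_count = combine(workdir, merged_path)
```

Because no file ever has more than one writer, the chunk write lock described above is never contended; the cost is that readers only see a unified view after the combine step has run.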
From: Michał B. <mic...@ge...> - 2010-07-09 08:59:43
|
From: Fabien Germain [mailto:fab...@gm...]
Sent: Friday, July 09, 2010 10:48 AM
To: Michał Borychowski
Cc: moo...@li...
Subject: Re: [Moosefs-users] mfs-master[4166]: CS(10.10.10.10) packet too long (226064141/50000000)

Hi,

2010/7/9 Michał Borychowski <mic...@ge...>

It's just a mistake I made to compile it on a 32 bits platform.

But maybe you could tell the dev team that in case of memory allocation failure, mfsmaster crashes without a message... well, if we consider that "segmentation fault" is not a real error message :-)

[MB] It’s on our todo list (but to be honest, with low priority – one cannot expect a 32bit machine to work with more than 4GB RAM :))

Sure ! But the problem is more general than 32bit machines : just catching the "no more memory available" error would be great, since it can happen on both 32bit and 64bit machines. For example, on our 64 bit machine with a 64 bit compiled mfsmaster binary, metadata has became such big that mfsmaster crashed, and we can't even restore it since it takes too much memory, and ends with a segmentation fault :

[root@mfsmaster ~]# mfsmetarestore -a -d /data/MFS/
loading objects (files,directories,etc.) ... ok
loading names ... ok
loading deletion timestamps ... ok
checking filesystem consistency ... ok
loading chunks data ... Segmentation fault
[root@mfsmaster ~]#

[root@mfsmaster ~]# strace mfsmetarestore -a -d /data/MFS/
[...]
read(3, "\0\0\0\0\36\347\314\0\0\0\1\0\0\0\0\0\0\0\0\0\35\347\314\0\0\0\1\0\0\0\0\0"..., 4096) = 4096
mmap2(NULL, 561152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
brk(0xb08e8000) = 0xffffffffb0844000
mmap2(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap2(NULL, 2097152, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap2(NULL, 1048576, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap2(NULL, 2097152, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap2(NULL, 1048576, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
[root@mfsmaster ~]#

[MB] We’ll look into it

[...]

So this is a safe operation but still is not recommended. It is better when different process on different machines write to different files and later some other system combine this data from many files into one target file (something like in “map-reduce” processing).

I totally agree with you Michal, but we make webhosting for thousands of customers and most of them don't even know what a cluster is ;-)

[MB] So what kind of simultaneous writing happens there mainly? Could you give us some examples?

Michal |
From: Stas O. <sta...@gm...> - 2010-07-09 12:43:02
|
Hi.

> Sure ! But the problem is more general than 32bit machines : just catching
> the "no more memory available" error would be great, since it can happen on
> both 32bit and 64bit machines. For example, on our 64 bit machine with a 64
> bit compiled mfsmaster binary, metadata has became such big that mfsmaster
> crashed, and we can't even restore it since it takes too much memory, and
> ends with a segmentation fault :

Can you tell us the total number of files stored and the total space used at the point where you hit this issue? Also, how much space does the metadata take, and how much memory do you have?

Regards. |
From: Stas O. <sta...@gm...> - 2010-07-09 12:47:42
|
Hi.

2010/6/21 Michał Borychowski <mic...@ge...>

> We give you here some quick patches you can implement to the master
> server to improve its performance for that amount of files:
>
> In matocsserv.c in mfsmaster you need to change this line:
>
> #define MaxPacketSize 50000000
>
> into this:
>
> #define MaxPacketSize 500000000
>
> Also we suggest a change in filesystem.c in mfsmaster in "fs_test_files"
> function. Change this line:
>
> if ((uint32_t)(main_time())<=starttime+150) {
>
> into:
>
> if ((uint32_t)(main_time())<=starttime+900) {
>
> And also changing this line:
>
> for (k=0 ; k<(NODEHASHSIZE/3600) && i<NODEHASHSIZE ; k++,i++) {
>
> into this:
>
> for (k=0 ; k<(NODEHASHSIZE/14400) && i<NODEHASHSIZE ; k++,i++) {
>
> You need to recompile the master server and start it again. The above
> changes should make the master server work more stable with large amount of
> files.

Can these changes be added to the next MFS release? Or do they affect performance in any way for smaller numbers of files?

Regards. |
From: Michał B. <mic...@ge...> - 2010-07-12 07:33:19
|
Yes, probably these patches would be applied to the new version or we would implement a still better solution for registering large amounts of files.

Regards
Michal

From: Stas Oskin [mailto:sta...@gm...]
Sent: Friday, July 09, 2010 2:47 PM
To: Michał Borychowski
Cc: moo...@li...; marco lu
Subject: Re: [Moosefs-users] mfs-master[4166]: CS(10.10.10.10) packet too long (226064141/50000000)

Hi.

2010/6/21 Michał Borychowski <mic...@ge...>

We give you here some quick patches you can implement to the master server to improve its performance for that amount of files:

In matocsserv.c in mfsmaster you need to change this line:

#define MaxPacketSize 50000000

into this:

#define MaxPacketSize 500000000

Also we suggest a change in filesystem.c in mfsmaster in "fs_test_files" function. Change this line:

if ((uint32_t)(main_time())<=starttime+150) {

into:

if ((uint32_t)(main_time())<=starttime+900) {

And also changing this line:

for (k=0 ; k<(NODEHASHSIZE/3600) && i<NODEHASHSIZE ; k++,i++) {

into this:

for (k=0 ; k<(NODEHASHSIZE/14400) && i<NODEHASHSIZE ; k++,i++) {

You need to recompile the master server and start it again. The above changes should make the master server work more stable with large amount of files.

Can these changes be added to next MFS release? Or they impact the performance in any way for smaller amounts?

Regards. |
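The "packet too long (226064141/50000000)" message in the subject line reports a received length against the compiled-in MaxPacketSize. A minimal sketch of that kind of check is given below in Python. It is not the code from matocsserv.c: the 8-byte header layout (two big-endian unsigned 32-bit fields, type then length), the function name `check_packet_header`, and the packet type value are assumptions made for illustration.

```python
import struct

OLD_MAX_PACKET_SIZE = 50000000    # MaxPacketSize before the patch
NEW_MAX_PACKET_SIZE = 500000000   # MaxPacketSize after the patch


def check_packet_header(header, max_packet_size):
    """Return (packet_type, length), or raise ValueError if length exceeds the limit.

    Assumes a header of two big-endian unsigned 32-bit integers: type, then length.
    """
    packet_type, length = struct.unpack(">II", header)
    if length > max_packet_size:
        raise ValueError(f"packet too long ({length}/{max_packet_size})")
    return packet_type, length


# The length comes from the subject line of this thread; the type value 1 is made up.
header = struct.pack(">II", 1, 226064141)

try:
    check_packet_header(header, OLD_MAX_PACKET_SIZE)
    rejected_by_old_limit = False
except ValueError as err:
    rejected_by_old_limit = True
    old_limit_message = str(err)

accepted = check_packet_header(header, NEW_MAX_PACKET_SIZE)
```

With the old limit the 226064141-byte packet is rejected, and in the logs the rejection is followed by a chunkserver disconnect; with the tenfold limit the same packet passes the check.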
From: Stas O. <sta...@gm...> - 2010-07-09 12:57:10
|
> > Sure ! But the problem is more general than 32bit machines : just catching > the "no more memory available" error would be great, since it can happen on > both 32bit and 64bit machines. For example, on our 64 bit machine with a 64 > bit compiled mfsmaster binary, metadata has became such big that mfsmaster > crashed, and we can't even restore it since it takes too much memory, and > ends with a segmentation fault : > > [root@mfsmaster ~]# mfsmetarestore -a -d /data/MFS/ > loading objects (files,directories,etc.) ... ok > loading names ... ok > loading deletion timestamps ... ok > checking filesystem consistency ... ok > loading chunks data ... Segmentation fault > [root@mfsmaster ~]# > > [root@mfsmaster ~]# strace mfsmetarestore -a -d /data/MFS/ > [...] > read(3, > "\0\0\0\0\36\347\314\0\0\0\1\0\0\0\0\0\0\0\0\0\35\347\314\0\0\0\1\0\0\0\0\0"..., > 4096) = 4096 > mmap2(NULL, 561152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) > = -1 ENOMEM (Cannot allocate memory) > brk(0xb08e8000) = 0xffffffffb0844000 > mmap2(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, > 0) = -1 ENOMEM (Cannot allocate memory) > mmap2(NULL, 2097152, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, > -1, 0) = -1 ENOMEM (Cannot allocate memory) > mmap2(NULL, 1048576, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, > -1, 0) = -1 ENOMEM (Cannot allocate memory) > mmap2(NULL, 2097152, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, > -1, 0) = -1 ENOMEM (Cannot allocate memory) > mmap2(NULL, 1048576, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, > -1, 0) = -1 ENOMEM (Cannot allocate memory) > --- SIGSEGV (Segmentation fault) @ 0 (0) --- > +++ killed by SIGSEGV +++ > [root@mfsmaster ~]# > > *[MB] We’ll look into it* > > > Another suggestion: Perhaps it's possible to measure the total available memory to MFS master / logger, and show via chart how much is left? Similar to how disk space is measured today per chunk servers. 
That would allow us to plan memory expansion in advance, rather than being pressed to locate and add more memory modules after the MFS master / logger has crashed (or even stopped normally, once this is added) due to insufficient memory.

Regards. |
From: Michał B. <mic...@ge...> - 2010-07-15 08:48:25
|
From: Stas Oskin [mailto:sta...@gm...]
Sent: Friday, July 09, 2010 2:57 PM
To: Michał Borychowski
Cc: moo...@li...
Subject: Re: [Moosefs-users] mfs-master[4166]: CS(10.10.10.10) packet too long (226064141/50000000)

Another suggestion: Perhaps it's possible to measure the total available memory to MFS master / logger, and show via chart how much is left? Similar to how disk space is measured today per chunk servers. That would allow to plan the memory expansion in advance, and not to be pressed to locate and add more memory modules when the MFS master / logger has crashed (or even normally stopped once this added) due to insufficient memory.

[MB] We could probably check quite easily how much memory a given process occupies. We have to see how all the supported operating systems return this value. But it would be much more difficult to check how much memory or swap is still left. So yes, we can add "RAM usage" information for the master server to the CGI Monitor, but the admin would still have to judge whether that is a lot or not.

Regards
Michał |
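As a rough illustration of the point about operating systems returning this value differently, here is a small Python sketch that reads the peak resident set size of the current process. It is not how the CGI Monitor works. The helper names `peak_rss_bytes` and `format_gib` are hypothetical, and peak RSS is only an approximation of "how much memory a process occupies".

```python
import resource
import sys


def peak_rss_bytes():
    """Peak resident set size of the current process, in bytes.

    ru_maxrss is reported in kilobytes on Linux and in bytes on macOS,
    so the unit has to be normalised per platform.
    """
    usage = resource.getrusage(resource.RUSAGE_SELF)
    if sys.platform == "darwin":
        return usage.ru_maxrss
    return usage.ru_maxrss * 1024


def format_gib(num_bytes):
    """Format a byte count in the style of the syslog lines in this thread, e.g. '0.00 GiB'."""
    return f"{num_bytes / (1024 ** 3):.2f} GiB"


ram_usage = peak_rss_bytes()
ram_usage_text = format_gib(ram_usage)
```

This reports what the process itself has used, which is the easy half of the request; it says nothing about how much memory or swap the machine still has free.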
From: TianYuchuan(田玉川) <ti...@fo...> - 2010-08-07 18:23:40
|
Hello, everyone!

I have a big question. Please help me, thank you very much.

We intend to use MooseFS in our production environment as the storage for our online photo service. We'll store about 200 million photo files.

I've built one master server (48G mem), one metalogger server, and eight chunk servers (8*1T SATA). When I started copying photo files to the MooseFS system, everything was fine at first. But after I had copied 57 million files, the master machine's CPU usage went to 100%. I stopped the master with “/user/local/mfs/sbin/mfsmasterserver -s” and then started it again, but there was a big problem: the master could not read my files. These files are important to me and I am very anxious. Please help me recover them, thanks.

I got many errors in syslog from the master server:

Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 41991323: 2668/2526212449954462668/176s.jpg
Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 00000000043CD358 (inode: 50379931 ; index: 0)
Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 50379931: 2926/4294909215566102926/163b.jpg
Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 00000000002966C3 (inode: 48284 ; index: 0)
Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 48284: bookdata/178/8533354296639220178/180b.jpg
Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 0000000000594726 (inode: 4242588 ; index: 0)
Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 4242588: bookdata/6631/4300989258725036631/85s.jpg
Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 0000000000993541 (inode: 8436892 ; index: 0)
Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 8436892: bookdata/7534/3147352338521267534/122b.jpg
Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 0000000000D906E6 (inode: 12631196 ; index: 0)
Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 12631196: bookdata/8691/11879047433161548691/164s.jpg
Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 000000000118DC1E (inode: 16825500 ; index: 0)
Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 16825500: bookdata/1232/17850056326363351232/166b.jpg
Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 0000000001681BC7 (inode: 21019804 ; index: 0)
Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 21019804: bookdata/26/12779298489336140026/246s.jpg
Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 0000000001A804E1 (inode: 25214108 ; index: 0)
Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 25214108: bookdata/3886/8729781571075193886/30s.jpg
Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 0000000001E7E826 (inode: 29408412 ; index: 0)
Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 29408412: bookdata/4757/142868991575144757/316b.jpg

Aug 7 23:56:36 localhost mfsmaster[10546]: CS(192.168.0.124) packet too long (115289537/50000000)
Aug 7 23:56:36 localhost mfsmaster[10546]: chunkserver disconnected - ip: 192.168.0.124, port: 0, usedspace: 0 (0.00 GiB), totalspace: 0 (0.00 GiB)
Aug 8 00:08:14 localhost mfsmaster[10546]: CS(192.168.0.127) packet too long (104113889/50000000)
Aug 8 00:08:14 localhost mfsmaster[10546]: chunkserver disconnected - ip: 192.168.0.127, port: 0, usedspace: 0 (0.00 GiB), totalspace: 0 (0.00 GiB)
Aug 8 00:21:03 localhost mfsmaster[10546]: CS(192.168.0.120) packet too long (117046565/50000000)
Aug 8 00:21:03 localhost mfsmaster[10546]: chunkserver disconnected - ip: 192.168.0.120, port: 0, usedspace: 0 (0.00 GiB), totalspace: 0 (0.00 GiB)

When I visited the mfscgi, the error was “Can't connect to MFS master (IP:127.0.0.1 ; PORT:9421)”.

Thanks all! |
From: TianYuchuan(田玉川) <ti...@fo...> - 2010-08-08 14:52:21
|
Hello, everyone!

I have a big question. Please help me, thank you very much.

We intend to use MooseFS in our production environment as the storage for our online photo service. We'll store about 200 million photo files.

I've built one master server (48G mem), one metalogger server, and eight chunk servers (8*1T SATA). When I started copying photo files to the MooseFS system, everything was fine at first. But after I had copied 57 million files, the master machine's CPU usage went to 100%. I stopped the master with “/user/local/mfs/sbin/mfsmasterserver -s” and then started it again, but there was a big problem: the master could not read my files. These files are important to me and I am very anxious. Please help me recover them, thanks.

I got many errors in syslog from the master server:

Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 41991323: 2668/2526212449954462668/176s.jpg
Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 00000000043CD358 (inode: 50379931 ; index: 0)
Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 50379931: 2926/4294909215566102926/163b.jpg
Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 00000000002966C3 (inode: 48284 ; index: 0)
Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 48284: bookdata/178/8533354296639220178/180b.jpg
Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 0000000000594726 (inode: 4242588 ; index: 0)
Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 4242588: bookdata/6631/4300989258725036631/85s.jpg
Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 0000000000993541 (inode: 8436892 ; index: 0)
Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 8436892: bookdata/7534/3147352338521267534/122b.jpg
Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 0000000000D906E6 (inode: 12631196 ; index: 0)
Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 12631196: bookdata/8691/11879047433161548691/164s.jpg
Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 000000000118DC1E (inode: 16825500 ; index: 0)
Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 16825500: bookdata/1232/17850056326363351232/166b.jpg
Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 0000000001681BC7 (inode: 21019804 ; index: 0)
Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 21019804: bookdata/26/12779298489336140026/246s.jpg
Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 0000000001A804E1 (inode: 25214108 ; index: 0)
Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 25214108: bookdata/3886/8729781571075193886/30s.jpg
Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 0000000001E7E826 (inode: 29408412 ; index: 0)
Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 29408412: bookdata/4757/142868991575144757/316b.jpg

Aug 7 23:56:36 localhost mfsmaster[10546]: CS(192.168.0.124) packet too long (115289537/50000000)
Aug 7 23:56:36 localhost mfsmaster[10546]: chunkserver disconnected - ip: 192.168.0.124, port: 0, usedspace: 0 (0.00 GiB), totalspace: 0 (0.00 GiB)
Aug 8 00:08:14 localhost mfsmaster[10546]: CS(192.168.0.127) packet too long (104113889/50000000)
Aug 8 00:08:14 localhost mfsmaster[10546]: chunkserver disconnected - ip: 192.168.0.127, port: 0, usedspace: 0 (0.00 GiB), totalspace: 0 (0.00 GiB)
Aug 8 00:21:03 localhost mfsmaster[10546]: CS(192.168.0.120) packet too long (117046565/50000000)
Aug 8 00:21:03 localhost mfsmaster[10546]: chunkserver disconnected - ip: 192.168.0.120, port: 0, usedspace: 0 (0.00 GiB), totalspace: 0 (0.00 GiB)

When I visited the mfscgi, the error was “Can't connect to MFS master (IP:127.0.0.1 ; PORT:9421)”.

Thanks all! |
From: Shen G. <sh...@ui...> - 2010-08-09 02:57:23
|
Don't worry!

This happens because some of your chunk servers are currently unreachable. The master server notices this and modifies the metadata of the files on those chunk servers, setting "allvalidcopies" to 0 in "struct chunk". When the master rescans the files (fs_test_files() in filesystem.c), it finds that the number of valid copies is 0 and prints information to the syslog file, as listed below. However, this printing is quite time-consuming, especially when the number of files is large. During this period the master ignores the chunk servers' connections (because it is in a big loop testing files, and a single thread does this; maybe this is a pitfall). So even if you make sure the chunk servers are working correctly, it does not help (you can see the reconnecting messages in the chunk servers' syslog files).

You could let the master finish printing; it will then reconnect with the chunk servers, notice that the files are there, and set "allvalidcopies" to the correct value. After that it works normally. Or you can recompile the program after commenting out line 5512 and line 5482 in filesystem.c (mfs-1.6.15). This suppresses the printed messages and, of course, reduces the fs test time.

Below is from Michal:
-----------------------------------------------------------------------
We give you here some quick patches you can implement to the master server to improve its performance for that amount of files:

In matocsserv.c in mfsmaster you need to change this line:

#define MaxPacketSize 50000000

into this:

#define MaxPacketSize 500000000

Also we suggest a change in filesystem.c in mfsmaster in "fs_test_files" function. Change this line:

if ((uint32_t)(main_time())<=starttime+150) {

into:

if ((uint32_t)(main_time())<=starttime+900) {

And also changing this line:

for (k=0 ; k<(NODEHASHSIZE/3600) && i<NODEHASHSIZE ; k++,i++) {

into this:

for (k=0 ; k<(NODEHASHSIZE/14400) && i<NODEHASHSIZE ; k++,i++) {

You need to recompile the master server and start it again.
The above changes should make the master server work more stable with large amount of files. Another suggestion would be to create two MooseFS instances (eg. 2 x 200 million files). One master server could also be metalogger for the another system and vice versa. Kind regards Michał ----------------------------------------------------------------------------- -- Guowen Shen On Sun, 2010-08-08 at 22:51 +0800, TianYuchuan(田玉川) wrote: > > > hello,everyone! > I have a big quertion,please help me,thank you very much. > We intend to use moosefs at our product environment as the storage of > our online photo service. > We'll store for about 200 million photo files. > I've built one master server(48G mem), one metalogger server, eight > chunk servers(8*1T SATA). When I copy photo files to the moosefs > system. At start everything is good. But I had copyed files 57 > million ,the master machines'CPU were used 100% > I sthoped the master when used “/user/local/mfs/sbin/mfsmasterserver > -s”,that I started the master。but there was a big problem ,the > master had not read my files。 These documents are important to me,I > am very anxious,please help me recover these files,tihanks。 > > I got many error syslog from master server: > > Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable > file 41991323: 2668/2526212449954462668/176s.jpg > Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable > chunk 00000000043CD358 (inode: 50379931 ; index: 0) > Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable > file 50379931: 2926/4294909215566102926/163b.jpg > Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable > chunk 00000000002966C3 (inode: 48284 ; index: 0) > Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable > file 48284: bookdata/178/8533354296639220178/180b.jpg > Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable > chunk 0000000000594726 (inode: 4242588 ; index: 0) > Aug 6 00:57:01 localhost 
mfsmaster[10546]: * currently unavailable > file 4242588: bookdata/6631/4300989258725036631/85s.jpg > Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable > chunk 0000000000993541 (inode: 8436892 ; index: 0) > Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable > file 8436892: bookdata/7534/3147352338521267534/122b.jpg > Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable > chunk 0000000000D906E6 (inode: 12631196 ; index: 0) > Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable > file 12631196: bookdata/8691/11879047433161548691/164s.jpg > Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable > chunk 000000000118DC1E (inode: 16825500 ; index: 0) > Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable > file 16825500: bookdata/1232/17850056326363351232/166b.jpg > Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable > chunk 0000000001681BC7 (inode: 21019804 ; index: 0) > Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable > file 21019804: bookdata/26/12779298489336140026/246s.jpg > Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable > chunk 0000000001A804E1 (inode: 25214108 ; index: 0) > Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable > file 25214108: bookdata/3886/8729781571075193886/30s.jpg > Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable > chunk 0000000001E7E826 (inode: 29408412 ; index: 0) > Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable > file 29408412: bookdata/4757/142868991575144757/316b.jpg > > > Aug 7 23:56:36 localhost mfsmaster[10546]: CS(192.168.0.124) packet > too long (115289537/50000000) > Aug 7 23:56:36 localhost mfsmaster[10546]: chunkserver disconnected - > ip: 192.168.0.124, port: 0, usedspace: 0 (0.00 GiB), totalspace: 0 > (0.00 GiB) > Aug 8 00:08:14 localhost mfsmaster[10546]: CS(192.168.0.127) packet > too long (104113889/50000000) > Aug 8 00:08:14 localhost 
mfsmaster[10546]: chunkserver disconnected - > ip: 192.168.0.127, port: 0, usedspace: 0 (0.00 GiB), totalspace: 0 > (0.00 GiB) > Aug 8 00:21:03 localhost mfsmaster[10546]: CS(192.168.0.120) packet > too long (117046565/50000000) > Aug 8 00:21:03 localhost mfsmaster[10546]: chunkserver disconnected - > ip: 192.168.0.120, port: 0, usedspace: 0 (0.00 GiB), totalspace: 0 > (0.00 GiB) > > when I visited the mfscgi,the error was“Can't connect to MFS master > (IP:127.0.0.1 ; PORT:9421)” > 。 > > Thanks all! > ------------------------------------------------------------------------------ > This SF.net email is sponsored by > > Make an app they can't live without > Enter the BlackBerry Developer Challenge > http://p.sf.net/sfu/RIM-dev2dev > _______________________________________________ moosefs-users mailing list moo...@li... https://lists.sourceforge.net/lists/listinfo/moosefs-users |
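The effect of the NODEHASHSIZE/3600 to NODEHASHSIZE/14400 change in the patch quoted above can be worked through with a little arithmetic. The Python sketch below only mirrors the loop bound `k < NODEHASHSIZE/divisor`; the value used for NODEHASHSIZE is a hypothetical one chosen for illustration, and the helper names are not from the MooseFS source.

```python
def buckets_per_call(node_hash_size, divisor):
    """Upper bound on hash buckets visited in one call, mirroring k < NODEHASHSIZE/divisor."""
    return node_hash_size // divisor


def calls_per_full_sweep(node_hash_size, divisor):
    """Number of calls needed for i to advance from 0 to NODEHASHSIZE."""
    step = buckets_per_call(node_hash_size, divisor)
    return -(-node_hash_size // step)  # ceiling division


NODE_HASH_SIZE = 1 << 22  # hypothetical value, used only for this illustration

# Each tuple is (buckets visited per call, calls for one full sweep).
before = (buckets_per_call(NODE_HASH_SIZE, 3600), calls_per_full_sweep(NODE_HASH_SIZE, 3600))
after = (buckets_per_call(NODE_HASH_SIZE, 14400), calls_per_full_sweep(NODE_HASH_SIZE, 14400))
```

Under these assumptions each call does roughly a quarter of the work after the patch, and a full sweep needs about four times as many calls. If the function runs once per second (also an assumption), that stretches one sweep from about one hour to about four.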
From: Michał B. <mic...@ge...> - 2010-08-09 13:15:50
|
Shen, thanks for the reply :) Tian, these limits have been changed in 1.6.16 and now the latest stable is 1.6.17 so we would recommend you just update the master server to 1.6.17. If you need any further assistance please let us know. Kind regards Michał Borychowski MooseFS Support Manager _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ Gemius S.A. ul. Wołoska 7, 02-672 Warszawa Budynek MARS, klatka D Tel.: +4822 874-41-00 Fax : +4822 874-41-01 -----Original Message----- From: Shen Guowen [mailto:sh...@ui...] Sent: Monday, August 09, 2010 4:42 AM To: TianYuchuan(田玉川) Cc: moo...@li... Subject: Re: [Moosefs-users] mfs-master[10546]: CS(192.168.0.125) packet too long (115289537/50000000) Don't worry! This is because some of your chunk servers are currently unreachable, and the master server notices it, then modifies the meta data of files in those chunk servers to set the "allvalidcopies" to 0 in "struct chunk". When the master is rescanning the files (fs_test_files() in filesystem.c), it finds out the valid copy is 0, then print information into syslog file, just as listed below. However, printing process is quite time-consuming, especially the mount of files is large. During this period, the master ignores the chunk server's connection (because it is in a big loop of test files, and it is a single thread to do this, maybe this is a pitfall). So although you make sure the chunk server working correctly, it is useless (you can notice the reconnecting information in chunk server's syslog file). You could let the master finish printing, then it will reconnect with chunk servers, and will notice the files is there, then set the "allvalidcopies" to a correct value. Then works normally. Or you can re-compile the program with commenting the line 5512 and line 5482 in filesystem.c(mfs-1.6.15). It will ignore the print messages and of cause, reduce the fs test time. 
Below is from Michal: ----------------------------------------------------------------------- We give you here some quick patches you can implement to the master server to improve its performance for that amount of files: In matocsserv.c in mfsmaster you need to change this line: #define MaxPacketSize 50000000 into this: #define MaxPacketSize 500000000 Also we suggest a change in filesystem.c in mfsmaster in "fs_test_files" function. Change this line: if ((uint32_t)(main_time())<=starttime+150) { into: if ((uint32_t)(main_time())<=starttime+900) { And also changing this line: for (k=0 ; k<(NODEHASHSIZE/3600) && i<NODEHASHSIZE ; k++,i++) { into this: for (k=0 ; k<(NODEHASHSIZE/14400) && i<NODEHASHSIZE ; k++,i++) { You need to recompile the master server and start it again. The above changes should make the master server work more stable with large amount of files. Another suggestion would be to create two MooseFS instances (eg. 2 x 200 million files). One master server could also be metalogger for the another system and vice versa. Kind regards Michał ----------------------------------------------------------------------------- -- Guowen Shen On Sun, 2010-08-08 at 22:51 +0800, TianYuchuan(田玉川) wrote: > > > hello,everyone! > I have a big quertion,please help me,thank you very much. > We intend to use moosefs at our product environment as the storage of > our online photo service. > We'll store for about 200 million photo files. > I've built one master server(48G mem), one metalogger server, eight > chunk servers(8*1T SATA). When I copy photo files to the moosefs > system. At start everything is good. 
> But after I had copied 57 million files, the master machine's CPU usage
> reached 100%.
> I stopped the master with “/user/local/mfs/sbin/mfsmasterserver -s” and
> then started it again, but there was a big problem: the master did not
> read my files. These files are important to me and I am very anxious.
> Please help me recover them, thanks.
>
> I got many errors in syslog from the master server:
>
> Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 41991323: 2668/2526212449954462668/176s.jpg
> Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 00000000043CD358 (inode: 50379931 ; index: 0)
> Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 50379931: 2926/4294909215566102926/163b.jpg
> Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 00000000002966C3 (inode: 48284 ; index: 0)
> Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 48284: bookdata/178/8533354296639220178/180b.jpg
> Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 0000000000594726 (inode: 4242588 ; index: 0)
> Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 4242588: bookdata/6631/4300989258725036631/85s.jpg
> Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 0000000000993541 (inode: 8436892 ; index: 0)
> Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 8436892: bookdata/7534/3147352338521267534/122b.jpg
> Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 0000000000D906E6 (inode: 12631196 ; index: 0)
> Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 12631196: bookdata/8691/11879047433161548691/164s.jpg
> Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 000000000118DC1E (inode: 16825500 ; index: 0)
> Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 16825500: bookdata/1232/17850056326363351232/166b.jpg
> Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 0000000001681BC7 (inode: 21019804 ; index: 0)
> Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 21019804: bookdata/26/12779298489336140026/246s.jpg
> Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 0000000001A804E1 (inode: 25214108 ; index: 0)
> Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 25214108: bookdata/3886/8729781571075193886/30s.jpg
> Aug 6 00:57:01 localhost mfsmaster[10546]: currently unavailable chunk 0000000001E7E826 (inode: 29408412 ; index: 0)
> Aug 6 00:57:01 localhost mfsmaster[10546]: * currently unavailable file 29408412: bookdata/4757/142868991575144757/316b.jpg
>
> Aug 7 23:56:36 localhost mfsmaster[10546]: CS(192.168.0.124) packet too long (115289537/50000000)
> Aug 7 23:56:36 localhost mfsmaster[10546]: chunkserver disconnected - ip: 192.168.0.124, port: 0, usedspace: 0 (0.00 GiB), totalspace: 0 (0.00 GiB)
> Aug 8 00:08:14 localhost mfsmaster[10546]: CS(192.168.0.127) packet too long (104113889/50000000)
> Aug 8 00:08:14 localhost mfsmaster[10546]: chunkserver disconnected - ip: 192.168.0.127, port: 0, usedspace: 0 (0.00 GiB), totalspace: 0 (0.00 GiB)
> Aug 8 00:21:03 localhost mfsmaster[10546]: CS(192.168.0.120) packet too long (117046565/50000000)
> Aug 8 00:21:03 localhost mfsmaster[10546]: chunkserver disconnected - ip: 192.168.0.120, port: 0, usedspace: 0 (0.00 GiB), totalspace: 0 (0.00 GiB)
>
> When I opened mfscgi, the error was “Can't connect to MFS master (IP:127.0.0.1 ; PORT:9421)”.
>
> Thanks all!
|
From: TianYuchuan(田玉川) <ti...@fo...> - 2011-03-17 09:55:51
|
Hello,

My MooseFS system has become very slow to access and I have no idea why. Please help me, thanks!

Number of files: 104964618. Number of chunks: 104963962.

The master load is not high, but every hour the data becomes inaccessible for several minutes. In general, even with few concurrent visits, data access is delayed by a few seconds.

My MooseFS system has nine chunk servers. Chunk server status:

    1  localhost  192.168.0.118  9422  1.6.19  23387618  3.6 TiB  4.5 TiB  79.72  0  0 B  0 B  -
    2  localhost  192.168.0.119  9422  1.6.19  23246974  3.6 TiB  4.5 TiB  79.72  0  0 B  0 B  -
    3  localhost  192.168.0.120  9422  1.6.19  23360333  3.6 TiB  4.5 TiB  79.72  0  0 B  0 B  -
    4  localhost  192.168.0.121  9422  1.6.19  23192013  3.6 TiB  4.5 TiB  79.69  0  0 B  0 B  -
    5  localhost  192.168.0.122  9422  1.6.19  23483418  3.6 TiB  4.5 TiB  79.70  0  0 B  0 B  -
    6  localhost  192.168.0.123  9422  1.6.19  23308366  3.6 TiB  4.5 TiB  79.70  0  0 B  0 B  -
    7  localhost  192.168.0.124  9422  1.6.19  23361992  3.6 TiB  4.5 TiB  79.69  0  0 B  0 B  -
    8  localhost  192.168.0.125  9422  1.6.19  23300478  3.6 TiB  4.5 TiB  79.70  0  0 B  0 B  -
    9  localhost  192.168.0.127  9422  1.6.19  23284897  3.5 TiB  4.5 TiB  78.72  0  0 B  0 B  -

    [root@localhost mfs]# free -m
                 total       used       free     shared    buffers     cached
    Mem:         48295      46127       2168          0         38       8204
    -/+ buffers/cache:      37884      10411
    Swap:            0          0          0

CPU usage is around 95%, with a peak of 150%.

-----Original Message-----
From: Shen Guowen [mailto:sh...@ui...]
Sent: August 9, 2010 10:42
To: TianYuchuan(田玉川)
Cc: moo...@li...
Subject: Re: [Moosefs-users] mfs-master[10546]: CS(192.168.0.125) packet too long (115289537/50000000)

Don't worry!

This happens because some of your chunk servers are currently unreachable. The master server notices this and modifies the metadata of the files on those chunk servers, setting "allvalidcopies" to 0 in "struct chunk". When the master rescans the files (fs_test_files() in filesystem.c), it finds that the number of valid copies is 0 and prints the information to syslog, as in the messages listed below. However, this printing is quite time-consuming, especially when the number of files is large. During this period the master ignores connections from the chunk servers, because it is inside a big loop testing files and does this in a single thread (maybe this is a pitfall). So even if you make sure the chunk servers are working correctly, it does not help; you can see the reconnection messages in the chunk servers' syslog.

You can let the master finish printing. It will then reconnect with the chunk servers, notice that the files are there, and set "allvalidcopies" to the correct value. After that it works normally.

Or you can recompile the program after commenting out line 5512 and line 5482 in filesystem.c (mfs-1.6.15). This suppresses the printed messages and, of course, reduces the fs test time.

Below is from Michal:
-----------------------------------------------------------------------
Here are some quick patches you can apply to the master server to improve its performance with that number of files:

In matocsserv.c in mfsmaster you need to change this line:

    #define MaxPacketSize 50000000

into this:

    #define MaxPacketSize 500000000

We also suggest a change in filesystem.c in mfsmaster, in the "fs_test_files" function. Change this line:

    if ((uint32_t)(main_time())<=starttime+150) {

into:

    if ((uint32_t)(main_time())<=starttime+900) {

And also change this line:

    for (k=0 ; k<(NODEHASHSIZE/3600) && i<NODEHASHSIZE ; k++,i++) {

into this:

    for (k=0 ; k<(NODEHASHSIZE/14400) && i<NODEHASHSIZE ; k++,i++) {

You need to recompile the master server and start it again. The above changes should make the master server more stable with a large number of files.

Another suggestion would be to create two MooseFS instances (e.g. 2 x 200 million files). One master server could also act as the metalogger for the other system, and vice versa.

Kind regards
Michał
-----------------------------------------------------------------------------

--
Guowen Shen
|
From: Michal B. <mic...@ge...> - 2011-03-24 08:34:34
|
Hi!

You have almost all RAM consumed. As you have 100 million files in the system we suggest putting some extra RAM to the master server. Also it would be advisable to insert SSD disk into the master server so that the hourly metadata dump takes less time.

Kind regards
Michał Borychowski
MooseFS Support Manager
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Gemius S.A.
ul. Wołoska 7, 02-672 Warszawa
Budynek MARS, klatka D
Tel.: +4822 874-41-00
Fax : +4822 874-41-01
|
From: TianYuchuan(田玉川) <ti...@fo...> - 2011-03-25 06:10:01
|
Hi!

Thanks! I installed a 15K RPM SAS disk in the master server, and the problem is now solved. I also upgraded MooseFS; the version is now mfs-1.6.20-2. After the upgrade, CPU usage is 16%.

Another question, about upgrading MooseFS. I installed mfs-1.6.20-2 on a new server and started the master, chunkserver, and client there. The old master server was not stopped. No chunkservers or clients were connected to it any more, but its master process still used 80% CPU. After I restarted the master service, CPU usage dropped to 5%.

Does the master not release the CPU on its own?
|