From: Michał B. <mic...@ge...> - 2010-07-09 08:59:43
From: Fabien Germain [mailto:fab...@gm...]
Sent: Friday, July 09, 2010 10:48 AM
To: Michał Borychowski
Cc: moo...@li...
Subject: Re: [Moosefs-users] mfs-master[4166]: CS(10.10.10.10) packet too long (226064141/50000000)

Hi,

2010/7/9 Michał Borychowski <mic...@ge...>

It was just my mistake to compile it on a 32-bit platform. But maybe you could tell the dev team that in case of a memory allocation failure, mfsmaster crashes without a message... well, if we consider that "segmentation fault" is not a real error message :-)

[MB] It’s on our todo list (but to be honest, with low priority – one cannot expect a 32-bit machine to work with more than 4GB RAM :))

Sure! But the problem is more general than 32-bit machines: just catching the "no more memory available" error would be great, since it can happen on both 32-bit and 64-bit machines.

For example, on our 64-bit machine with a 64-bit compiled mfsmaster binary, the metadata has become so big that mfsmaster crashed, and we can't even restore it, since the restore takes too much memory and ends with a segmentation fault:

[root@mfsmaster ~]# mfsmetarestore -a -d /data/MFS/
loading objects (files,directories,etc.) ... ok
loading names ... ok
loading deletion timestamps ... ok
checking filesystem consistency ... ok
loading chunks data ... Segmentation fault
[root@mfsmaster ~]#

[root@mfsmaster ~]# strace mfsmetarestore -a -d /data/MFS/
[...]
read(3, "\0\0\0\0\36\347\314\0\0\0\1\0\0\0\0\0\0\0\0\0\35\347\314\0\0\0\1\0\0\0\0\0"..., 4096) = 4096
mmap2(NULL, 561152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
brk(0xb08e8000) = 0xffffffffb0844000
mmap2(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap2(NULL, 2097152, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap2(NULL, 1048576, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap2(NULL, 2097152, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap2(NULL, 1048576, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
[root@mfsmaster ~]#

[MB] We’ll look into it

[...]

So this is a safe operation, but it is still not recommended. It is better when different processes on different machines write to different files and later some other system combines the data from those many files into one target file (something like in “map-reduce” processing).

I totally agree with you Michal, but we provide web hosting for thousands of customers, and most of them don't even know what a cluster is ;-)

[MB] So what kind of simultaneous writing mainly happens there? Could you give us some examples?

Michal
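
To make the "different files, combined later" suggestion above concrete, here is a minimal sketch in Python. It is only an illustration, not anything shipped with MooseFS: the function names write_part and combine_parts, the part-<id>.log naming and the combined.log target are all hypothetical. Each writer appends only to its own file, and a separate step concatenates the parts into one target file.

```python
import os
import tempfile
from pathlib import Path


def write_part(directory: Path, writer_id: str, lines: list) -> Path:
    """Append lines to a file owned by exactly one writer (hypothetical naming)."""
    part = directory / f"part-{writer_id}.log"
    with part.open("a", encoding="utf-8") as fh:
        for line in lines:
            fh.write(line + "\n")
    return part


def combine_parts(directory: Path, target: Path) -> int:
    """Concatenate every part file into target and return the number of lines copied."""
    count = 0
    # Build the result under a temporary name so readers never see a half-written target.
    tmp = target.with_suffix(target.suffix + ".tmp")
    with tmp.open("w", encoding="utf-8") as out:
        # Sort by file name so the order of the parts is deterministic.
        for part in sorted(directory.glob("part-*.log")):
            with part.open(encoding="utf-8") as fh:
                for line in fh:
                    out.write(line)
                    count += 1
    os.replace(tmp, target)  # rename the finished file over the target
    return count


if __name__ == "__main__":
    workdir = Path(tempfile.mkdtemp())
    write_part(workdir, "web01", ["GET /index.html", "GET /about.html"])
    write_part(workdir, "web02", ["POST /login"])
    total = combine_parts(workdir, workdir / "combined.log")
    print(total, "lines combined")
```

In this sketch no file ever has more than one writer, so the question of simultaneous writes to a single file does not arise; the cost is that the combined file is only as fresh as the last combine run.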