From: 田忠博(Zhongbo T. <win...@gm...> - 2017-03-30 03:02:20
Hi Aleksander,

Thanks for the quick reply. 3.0.90 does fix the issue! Thank you.

On our production cluster, we also found a lot of unconsumed messages in the client's TCP receive queue, which led to periodic high load. After some investigation, our guess is that the client's `conncache` is too slow to digest KEEPALIVE messages. So we modified the source code to decrease the sleep time, and that seems to work for us. Here is our patch:

"""
diff --git a/mfscommon/conncache.c b/mfscommon/conncache.c
index 4d33c19..b7a99bf 100644
--- a/mfscommon/conncache.c
+++ b/mfscommon/conncache.c
@@ -161,7 +161,7 @@ void* conncache_keepalive_thread(void* arg) {
 		}
 		ka = keep_alive;
 		zassert(pthread_mutex_unlock(&glock));
-		portable_usleep(10000);
+		portable_usleep(5000);
 	}
 	return arg;
 }
"""

Finally, I am curious about the progress of MooseFS 4.0. We have been looking forward to the erasure-coding implementation for quite a long time. We would also like to know the MooseFS team's opinion on the Container Storage Interface (CSI); you can find more details here:
https://github.com/docker/docker/issues/31923

And lastly, thank you for this excellent project.

On Wed, Mar 29, 2017 at 6:04 PM Aleksander Wieliczko <ale...@mo...> wrote:

> Hi.
> Did you try the latest stable MooseFS version, 3.0.90?
>
> The MooseFS 3.0.86 client has a few bugs, but they were fixed.
>
> Best regards
> Aleksander Wieliczko
> Technical Support Engineer
> MooseFS.com <http://moosefs.com>
>
> On 29.03.2017 11:46, 田忠博(Zhongbo Tian) wrote:
>
> Hi all,
>
> We encountered a weird issue after upgrading to MooseFS 3.0.86.
> When we try to run ' TMPDIR=/some/moosefs/path python -c "import ctypes" ',
> we end up with a SIGBUS.
> After some investigations, we found it seems related with mmap, and we
> can reproduce this bug using following C code:
>
> """
> #include <stdio.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <sys/mman.h>
>
> int main(int argc, char** argv) {
>     int fd;
>     char* filename;
>     char *c2;
>     if (argc != 2) {
>         fprintf(stderr, "usage: %s <file>\n", argv[0]);
>         return 1;
>     }
>     filename = argv[1];
>     unlink(filename);
>     fd = open(filename, O_RDWR|O_CREAT, 0600);
>     ftruncate(fd, 4096);
>     c2 = mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
>     *c2 = '\0'; // SIGBUS
>     return 0;
> }
> """
>
> Here is the strace for when we run this on a moosefs path:
>
> """
> $ strace ./test /mfs/user/tianzhongbo/temp/test
> execve("./test", ["./test", "/mfs/user/tianzhongbo/temp/test"], [/* 52 vars */]) = 0
> brk(0) = 0x949000
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f825d85a000
> access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
> open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
> fstat(3, {st_mode=S_IFREG|0644, st_size=114873, ...}) = 0
> mmap(NULL, 114873, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f825d83d000
> close(3) = 0
> open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
> read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0p\t\2\0\0\0\0\0"..., 832) = 832
> fstat(3, {st_mode=S_IFREG|0755, st_size=1697568, ...}) = 0
> mmap(NULL, 3804928, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f825d299000
> mprotect(0x7f825d430000, 2097152, PROT_NONE) = 0
> mmap(0x7f825d630000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x197000) = 0x7f825d630000
> mmap(0x7f825d636000, 16128, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f825d636000
> close(3) = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f825d83c000
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f825d83b000
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f825d83a000
> arch_prctl(ARCH_SET_FS, 0x7f825d83b700) = 0
> mprotect(0x7f825d630000, 16384, PROT_READ) = 0
> mprotect(0x600000, 4096, PROT_READ) = 0
> mprotect(0x7f825d85b000, 4096, PROT_READ) = 0
> munmap(0x7f825d83d000, 114873) = 0
> unlink("/mfs/user/tianzhongbo/temp/test") = 0
> open("/mfs/user/tianzhongbo/temp/test", O_RDWR|O_CREAT, 0600) = 3
> ftruncate(3, 4096) = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x7f825d859000
> --- SIGBUS {si_signo=SIGBUS, si_code=BUS_ADRERR, si_addr=0x7f825d859000} ---
> +++ killed by SIGBUS +++
> Bus error
> """
>
> Can anyone help to resolve this?
>
> _________________________________________
> moosefs-users mailing lis...@li...
> https://lists.sourceforge.net/lists/listinfo/moosefs-users
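
A variant of the reproduction above that reports the fault instead of dying may be handy when comparing client versions. This is only a sketch under stated assumptions: the name `probe_mmap_write`, the SIGBUS handler, and the return codes are invented for illustration and are not part of the original test program or of MooseFS. It performs the same unlink/open/ftruncate/mmap/write sequence, but checks each return value and catches SIGBUS on the first write.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Jump target used to recover from SIGBUS (hypothetical helper state). */
static sigjmp_buf probe_env;

static void on_sigbus(int signo) {
    (void)signo;
    siglongjmp(probe_env, 1);
}

/*
 * Same steps as the reproduction in the thread, with error checks.
 * Returns  0 if the first write to the mapped page succeeds,
 *          1 if that write raises SIGBUS,
 *         -1 if open, ftruncate, or mmap fails before the write.
 */
int probe_mmap_write(const char *path) {
    struct sigaction sa, old;
    volatile int result;
    char *page;
    int fd;

    assert(path != NULL);

    unlink(path); /* the file may not exist yet; failure is ignored */
    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) { perror("open"); return -1; }
    if (ftruncate(fd, 4096) != 0) { perror("ftruncate"); close(fd); return -1; }

    page = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (page == MAP_FAILED) { perror("mmap"); close(fd); return -1; }

    /* Install a temporary SIGBUS handler around the single write. */
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGBUS, &sa, &old);

    if (sigsetjmp(probe_env, 1) == 0) {
        *(volatile char *)page = '\0'; /* the write that faulted in the report */
        result = 0;
    } else {
        result = 1; /* reached via siglongjmp from the handler */
    }

    /* Restore the previous handler and release resources. */
    sigaction(SIGBUS, &old, NULL);
    munmap(page, 4096);
    close(fd);
    return result;
}
```

Called from a small `main` with a file path on the mount under test, this should return 1 where the behaviour described in the thread occurs and 0 where the first write goes through, which makes it easier to script a before/after check across client upgrades.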