From: Flow J. <fl...@gm...> - 2011-02-26 08:44:57
Hi,

About 2 weeks after integrating moosefs into our production environment, I have found a lot of core files dumped by mfsmount and left in the "/" directory. All the backtraces look similar: mfsmount dies at the free(freecblockshead) call in write_data_term() when mainloop() ends.

Fedora 12 x64 (mfsmount is compiled from source):

Core was generated by `mfsmount /home/fwjiang -o rw,mfssubfolder=UserHome/fwjiang'.
Program terminated with signal 6, Aborted.
#0 0x00000039ab4327f5 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install fuse-libs-2.8.5-2.fc12.x86_64 glibc-2.11.2-3.x86_64 libgcc-4.4.4-10.fc12.x86_64
(gdb) bt
#0 0x00000039ab4327f5 in raise () from /lib64/libc.so.6
#1 0x00000039ab433fd5 in abort () from /lib64/libc.so.6
#2 0x00000039ab46fa1b in __libc_message () from /lib64/libc.so.6
#3 0x00000039ab475336 in malloc_printerr () from /lib64/libc.so.6
#4 0x000000000040eb12 in write_data_term () at writedata.c:906
#5 0x000000000041282d in mainloop (args=0x7fff49f484d0, mp=0x1bb72e0 "/tmp/autotAfzu1", mt=1, fg=0) at main.c:599
#6 0x0000000000412c48 in main (argc=<value optimized out>, argv=0x7fff49f485f8) at main.c:819

CentOS 5.5 x86 (mfsmount is from the DAG repository):

Core was generated by `mfsmount /project/ui -o rw,mfssubfolder=ProjectData/project/ui'.
Program terminated with signal 6, Aborted.
#0 0x00417410 in __kernel_vsyscall ()
(gdb) bt
#0 0x00417410 in __kernel_vsyscall ()
#1 0x00a8ddf0 in raise () from /lib/libc.so.6
#2 0x00a8f701 in abort () from /lib/libc.so.6
#3 0x00ac628b in __libc_message () from /lib/libc.so.6
#4 0x00ace5a5 in _int_free () from /lib/libc.so.6
#5 0x00ace9e9 in free () from /lib/libc.so.6
#6 0x08056cc3 in write_data_term ()
#7 0x0805a768 in mainloop ()
#8 0x0805ab37 in main ()

The auto.home file used to automount user home directories on the Fedora 12 boxes looks like:

* -fstype=fuse,mfssubfolder=UserHome/& :mfsmount

All servers and clients run mfs 1.6.19.
All of the core files come from mounts with read/write access. Looking at the timestamps of the core dumps listed above, I found that each one is written when the autofs timeout expires (the default timeout is 300s on CentOS 5.5). To confirm this, I manually copied a file (about 80MB) to a user folder that had not been automounted yet, then waited 300s until the folder was automatically unmounted. A core was dumped as expected.

Does anyone have the same issue? Am I automounting MooseFS the right way?

Thanks
Flow
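
For anyone who wants to try this, the reproduction steps above could be scripted roughly as follows. This is only a sketch under my own assumptions: the function names, the `core*` file-name pattern, and the 30s safety margin are illustrative and not part of mfsmount or autofs, and the real core file name depends on the system's core_pattern setting.

```python
import glob
import os
import shutil
import time


def find_core_files(directory="/"):
    """Return a sorted list of regular files named core* directly under `directory`.

    Assumption: cores are written as "core" or "core.<pid>"; adjust the
    pattern if kernel.core_pattern is configured differently.
    """
    return sorted(
        path
        for path in glob.glob(os.path.join(directory, "core*"))
        if os.path.isfile(path)
    )


def reproduce(src_file, automount_dir, timeout=300, margin=30, core_dir="/"):
    """Copy a file into an automounted folder, wait for the autofs timeout,
    and return any core files that appeared in `core_dir` in the meantime."""
    before = set(find_core_files(core_dir))
    # Touching the path triggers the automount; the copy exercises the write path.
    shutil.copy(src_file, automount_dir)
    # Wait past the autofs timeout so the folder gets unmounted.
    time.sleep(timeout + margin)
    return sorted(set(find_core_files(core_dir)) - before)
```

Calling `reproduce()` with a test file of about 80MB and a home folder that is not currently mounted should return a non-empty list if the crash occurs on unmount.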