From: Michał B. <mic...@ge...> - 2010-10-19 07:37:42
Hi!

We wonder if you still have the problem with dumping the metadata after deleting the /var/lib/mfs directory? We recreated the problem and found a solution.

1. We delete the directory with the metadata and create a new one:

root@ubuntu10:~/mfs-1.6.18# rm -rf /usr/local/var/mfs/
root@ubuntu10:~# mkdir /usr/local/var/mfs
root@ubuntu10:~# chown nobody:nogroup /usr/local/var/mfs
root@ubuntu10:~# chmod 755 /usr/local/var/mfs

2. We check the master’s CWD with the lsof command:

root@ubuntu10:~/mfs-1.6.18# lsof | grep ^mfsmaster
mfsmaster 22351 nobody cwd DIR 251,0 0 2624067 /usr/local/var/mfs (deleted)
(....)

So now we have the same situation as you: the master’s working directory still refers to the deleted directory, not to the new one created at the same path. Now we start recovering the master server:

root@ubuntu10:~/mfs-1.6.18# gdb
GNU gdb (GDB) 7.2-ubuntu
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
(gdb) attach 22351
Attaching to process 22351
Reading symbols from /usr/local/sbin/mfsmaster...done.
Reading symbols from /lib/libz.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libz.so.1
Reading symbols from /lib/libc.so.6...Reading symbols from /usr/lib/debug/lib/libc-2.12.1.so...done.
done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.12.1.so...done.
done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib/libnss_compat.so.2...Reading symbols from /usr/lib/debug/lib/libnss_compat-2.12.1.so...done.
done.
Loaded symbols for /lib/libnss_compat.so.2
Reading symbols from /lib/libnsl.so.1...Reading symbols from /usr/lib/debug/lib/libnsl-2.12.1.so...done.
done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libnss_nis.so.2...Reading symbols from /usr/lib/debug/lib/libnss_nis-2.12.1.so...done.
done.
Loaded symbols for /lib/libnss_nis.so.2
Reading symbols from /lib/libnss_files.so.2...Reading symbols from /usr/lib/debug/lib/libnss_files-2.12.1.so...done.
done.
Loaded symbols for /lib/libnss_files.so.2
0x00007f9dc92451a8 in __poll (fds=0x7fff2812d3b0, nfds=6, timeout=<value optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:83
83 ../sysdeps/unix/sysv/linux/poll.c: No such file or directory.
 in ../sysdeps/unix/sysv/linux/poll.c
(gdb) call chdir("/usr/local/var/mfs")
$1 = 0
(gdb) detach
Detaching from program: /usr/local/sbin/mfsmaster, process 22351
(gdb) quit

So that’s all: gdb, attach PID, call chdir(PATH), detach and quit. The return value $1 = 0 means that chdir() succeeded. Checking with lsof again shows the CWD is no longer marked as deleted:

root@ubuntu10:~/mfs-1.6.18# lsof | grep ^mfsmaster
mfsmaster 22351 nobody cwd DIR 251,0 4096 2624071 /usr/local/var/mfs
mfsmaster 22351 nobody rtd DIR 251,0 4096 2 /
mfsmaster 22351 nobody txt REG 251,0 937984 2503085 /usr/local/sbin/mfsmaster
mfsmaster 22351 nobody mem REG 251,0 51712 524325 /lib/libnss_files-2.12.1.so
mfsmaster 22351 nobody mem REG 251,0 43552 524320 /lib/libnss_nis-2.12.1.so
mfsmaster 22351 nobody mem REG 251,0 97256 524324 /lib/libnsl-2.12.1.so
mfsmaster 22351 nobody mem REG 251,0 35712 524322 /lib/libnss_compat-2.12.1.so
mfsmaster 22351 nobody mem REG 251,0 1572232 524311 /lib/libc-2.12.1.so
mfsmaster 22351 nobody mem REG 251,0 96816 524531 /lib/libz.so.1.2.3.4
mfsmaster 22351 nobody mem REG 251,0 141072 524319 /lib/ld-2.12.1.so
mfsmaster 22351 nobody 0u CHR 136,2 0t0 5 /dev/pts/2
mfsmaster 22351 nobody 1u CHR 136,2 0t0 5 /dev/pts/2
mfsmaster 22351 nobody 2u CHR 136,2 0t0 5 /dev/pts/2
mfsmaster 22351 nobody 3u unix 0xffff880030080000 0t0 44276 socket
mfsmaster 22351 nobody 4wW REG 251,0 0 2624064 /usr/local/var/mfs/.mfsmaster.lock (deleted)
mfsmaster 22351 nobody 5u IPv4 44281 0t0 TCP *:9419 (LISTEN)
mfsmaster 22351 nobody 6u IPv4 44283 0t0 TCP *:9420 (LISTEN)
mfsmaster 22351 nobody 7u IPv4 44285 0t0 TCP *:9421 (LISTEN)
mfsmaster 22351 nobody 9u IPv4 44583 0t0 TCP 10.37.129.202:9419->10.37.129.202:59003 (ESTABLISHED)
mfsmaster 22351 nobody 10u IPv4 44582 0t0 TCP 10.37.129.202:9420->10.37.129.202:36911 (ESTABLISHED)
mfsmaster 22351 nobody 11w REG 251,0 341 2624065 /usr/local/var/mfs/changelog.0.mfs (deleted)
mfsmaster 22351 nobody 12u IPv4 45330 0t0 TCP localhost:9421->localhost:50494 (ESTABLISHED)

It is now advisable to mount any client: after mounting, the sessions file will be written to the new directory. Once there is a “sessions.mfs” file in the new directory, it is safe to restart the master server. Alternatively, you can wait until the top of the hour (:00 minutes), when metadata.mfs.back will be dumped.

Regarding the restart: the “lockfile” is still in the old, deleted folder, so the standard “/usr/local/sbin/mfsmaster stop” won’t work and you’ll need to use “kill”:

root@ubuntu10:~/mfs-1.6.18# kill 22351

Now wait until the metadata file has been dumped to the new directory, and then start the master in the regular way:

root@ubuntu10:~/mfs-1.6.18# /usr/local/sbin/mfsmaster start

Of course your PID and your directory will differ from ours.

We hope this helps you recover the metadata file.

Kind regards
Michał

From: leon hong [mailto:cod...@gm...]
Sent: Friday, October 15, 2010 6:05 PM
To: Michał Borychowski
Cc: moo...@li...
Subject: Re: [Moosefs-users] help!!!!! I met a very serious problem

I know that "metadata gets dumped from memory each hour", but what I deleted is the directory "/var/lib/mfs" itself. After I recreated "/var/lib/mfs" with mkdir, the process does not recognize the new directory and floods the log. What can I do?

Oct 15 22:42:20 master1 mfsmaster[3024]: lost MFS change 322533404: 1287153740|UNLOCK(33003916)
Oct 15 22:42:20 master1 mfsmaster[3024]: lost MFS change 322533405: 1287153740|LENGTH(36510654,1024)
Oct 15 22:42:20 master1 mfsmaster[3024]: lost MFS change 322533406: 1287153740|UNLOCK(33003913)
.................................................................
................................................................

2010/10/15 Michał Borychowski <mic...@ge...>

Exactly as I said: metadata gets dumped from memory each hour. Just make sure /var/lib/mfs is writeable for the master process.

Regards
Michał

From: leon hong [mailto:cod...@gm...]
Sent: Friday, October 15, 2010 5:35 PM
To: Michał Borychowski
Cc: moo...@li...
Subject: Re: [Moosefs-users] help!!!!! I met a very serious problem

But I deleted "/var/lib/mfs/" and then created "/var/lib/mfs" again, and mfsmaster cannot find these files (changelog.0.mfs, metadata.mfs.back, stats.mfs) in "/var/lib/mfs". What can I do? This is a live service; how do I get "metadata.mfs.back" out of memory?

Thanks again!

2010/10/15 Michał Borychowski <mic...@ge...>

The "metadata.mfs.back" file gets saved each hour, so it should be back there on the hour (at 00 minutes). You should also have a new "changelog.0.mfs" file.

Regards
Michał

From: leon hong [mailto:cod...@gm...]
Sent: Friday, October 15, 2010 4:46 PM
To: moo...@li...
Subject: Re: [Moosefs-users] help!!!!! I met a very serious problem

I removed the "/var/lib/mfs" directory and later created a new "/var/lib/mfs". Help! This is from "/var/log/message":

-----------------------------------------------------------------------------------------------------------------------
Oct 15 22:42:20 master1 mfsmaster[3024]: lost MFS change 322533404: 1287153740|UNLOCK(33003916)
Oct 15 22:42:20 master1 mfsmaster[3024]: lost MFS change 322533405: 1287153740|LENGTH(36510654,1024)
Oct 15 22:42:20 master1 mfsmaster[3024]: lost MFS change 322533406: 1287153740|UNLOCK(33003913)
.............

Thanks again!

2010/10/15 leon hong <cod...@gm...>

Hi,

I accidentally deleted the directory "/var/lib/mfs". How do I dump the data out of memory? Help!

Thanks!
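
For readers who want to script the checks from the reply at the top of this thread, here is a minimal sketch, not an official MooseFS tool. It assumes Linux (where /proc/PID/cwd shows a removed working directory with a " (deleted)" suffix, the same marker lsof prints) and an installed gdb. The function names is_deleted_link, cwd_state, gdb_chdir_command and safe_to_restart are hypothetical helpers invented for this example. The gdb command is only printed, not executed, so it can be reviewed before being run as root against the real PID.

```shell
#!/bin/sh
# Sketch only: all helper names below are hypothetical, not part of MooseFS.

# Succeed if a /proc/<pid>/cwd link target carries the Linux "(deleted)" marker.
is_deleted_link() {
    case "$1" in
        *" (deleted)") return 0 ;;
        *) return 1 ;;
    esac
}

# Print "deleted" or "ok" for the working directory of the given PID.
# Returns 2 if the link cannot be read (no such PID, or no permission).
cwd_state() {
    target=$(readlink "/proc/$1/cwd") || return 2
    if is_deleted_link "$target"; then
        echo deleted
    else
        echo ok
    fi
}

# Print (do not run) a non-interactive gdb command equivalent to the manual
# session: attach to PID, call chdir(DIR), detach. The (int) cast is an
# addition so the call also works when libc has no debug symbols.
gdb_chdir_command() {
    printf 'gdb -p %s -batch -ex '\''call (int)chdir("%s")'\'' -ex detach\n' "$1" "$2"
}

# Succeed once sessions.mfs exists in DIR, the condition the reply gives
# for a safe restart of the master.
safe_to_restart() {
    [ -f "$1/sessions.mfs" ]
}
```

With the PID and directory from the reply, `cwd_state 22351` would report the state of the master's working directory, and `gdb_chdir_command 22351 /usr/local/var/mfs` prints the command to review. Your PID and directory will differ.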