Re: [Moosefs-users] I have problem ,Need help,thanks


On Monday, 10 May 2010, lwxian_aha wrote:

>  158922: photos/47/62/63/m_853671.jpg May 10 20:34:58 localhost
>  mfsmaster[20168]: currently unavailable chunk 00000000004465CB (inode:
>  4353227 ; index: 0) May 10 20:34:58 localhost mfsmaster[20168]: *
>  currently unavailable file 4353227: photos/46/45/10/s_848116.jpg May 10
>  20:34:58 localhost mfsmaster[20168]: currently unavailable chunk
>  0000000000026B8E (inode: 158923 ; index: 0) May 10 20:34:58 localhost
>  mfsmaster[20168]: * currently unavailable file 158923:
>  photos/47/62/63/739789.jpg May 10 20:34:58 localhost mfsmaster[20168]:
>  currently unavailable chunk 00000000004465CC (inode: 4353228 ; index: 0)
>  May 10 20:34:58 localhost mfsmaster[20168]: * currently unavailable file
>  4353228: photos/46/91/73/936449.jpg May 10 20:34:58 localhost
>  mfsmaster[20168]: currently unavailable chunk 00000000000CBC48 (inode:
>  158924 ; index: 0) May 10 20:34:58 localhost mfsmaster[20168]: * currently
>  unavailable file 158924: photos/47/74/82/s_171883.jpg May 10 20:34:58
>  localhost mfsmaster[20168]: currently unavailable chunk 00000000004465CD
>  (inode: 4353229 ; index: 0) May 10 20:34:58 localhost mfsmaster[20168]: *
>  currently unavailable file 4353229: photos/46/45/10/l_848116.jpg May 10
>  20:34:58 localhost mfsmaster[20168]: currently unavailable chunk
>  00000000000CBC49 (inode: 158925 ; index: 0) May 10 20:34:58 localhost
>  mfsmaster[20168]: * currently unavailable file 158925:
>  photos/47/74/82/m_156158.jpg May 10 20:34:58 localhost mfsmaster[20168]:
>  currently unavailable chunk 00000000004465CE (inode: 4353230 ; index: 0)
>  May 10 20:34:58 localhost mfsmaster[20168]: * currently unavailable file
>  4353230: photos/46/91/73/m_928481.jpg May 10 20:34:58 localhost
>  mfsmaster[20168]: currently unavailable chunk 0000000000026B91 (inode:
>  158926 ; index: 0) May 10 20:34:58 localhost mfsmaster[20168]: * current
> 

* It seems that the affected files had only one copy, and that copy was lost in a disk failure.

* The only solution I have found for this situation is to run mfsfilerepair on 
each of the affected files.

* In my case there were tens of thousands of them, so I deleted the directories 
that contained affected files, because the data was backed up on the main cluster.

* The procedure follows. It is needed because the sheer number of messages can 
overwhelm rsyslog and mfscgi and even hang the machine.

1.- /etc/rsyslog.d/mfs
----------- begin ------------
# => write all mfs messages to their own file, then discard them:
if     ($programname == 'mfsmaster' or $syslogtag == '[mfsmaster]:') then \
       -/var/log/mfs/mfs;RSYSLOG_TraditionalFileFormat
if     ($programname == 'mfsmaster' or $syslogtag == '[mfsmaster]:') then \
        ~
--------- end ----------------------

2.- mkdir -p /var/log/mfs/mfs-old

3.- /etc/init.d/syslog restart

4.- rotate the logs every five minutes

vi /root/scripts-admin/LogrotateMfs

----------- begin -----------
/var/log/mfs/mfs {
    dateext
    ifempty
    copytruncate
    create 640 root root
    olddir /var/log/mfs/mfs-old
    sharedscripts
    postrotate
        /etc/init.d/syslog reload
    endscript

    lastaction
        DATE=`date +%Y%m%d_%H%M` ; \
        mv /var/log/mfs/mfs-old/mfs* /var/log/mfs/mfs-old/LOG-MASTER-$DATE ; \
        find /var/log/mfs/mfs-old -type f -mtime +1 -exec rm -f {} \;
    endscript
}

------------ end --------------------

5.- crontab -e

*/5 * * * * /usr/sbin/logrotate /root/scripts-admin/LogrotateMfs -f >/dev/null 2>&1

6.- start cluster
7.- mount cluster
8.- start mfscgiserv 

List the affected directories and files, then decide what to do according to 
their number and importance, whether a backup exists, etc.

9.- stop mfscgiserv
10.- apply mfssettrashtime 0 to the directory and remove it.

Example:

mfsmount /media/mfs -H 172.26.0.10

cd /media/mfs/

mfssettrashtime -r 0 photos
(or: mfssettrashtime -r 0 photos/46/45/10)

rm -Rf photos
(or: rm -Rf photos/46/45/10)
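
To decide which directories to remove, it may help to count the affected files per directory first. The following is only a sketch: the function name count_unavailable_by_dir is made up for illustration, and it assumes that the log written by the rsyslog rule above holds one mfsmaster message per line in the "currently unavailable file <inode>: <path>" form quoted at the top.

```shell
#!/bin/sh
# Count "currently unavailable file" messages per directory.
# Reads mfsmaster syslog lines on standard input (assumed: one message
# per line) and prints "<count> <directory>", largest count first.
# count_unavailable_by_dir is a hypothetical helper name.
count_unavailable_by_dir() {
    # Keep only the path that follows "currently unavailable file <inode>: ".
    sed -n 's/^.*currently unavailable file [0-9][0-9]*: \(.*\)$/\1/p' |
        # The master repeats its report, so count each path only once.
        sort -u |
        # Strip the last path component and tally per directory.
        awk '{
            dir = "."
            if (match($0, "/[^/]*$")) dir = substr($0, 1, RSTART - 1)
            count[dir]++
        }
        END { for (d in count) print count[d], d }' |
        sort -rn
}
```

Usage, with the log path from step 1: count_unavailable_by_dir < /var/log/mfs/mfs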

* continue ...........

* Users obviously need tools that produce such listings and apply mfs commands 
in bulk, as well as a warning system that reports disk failures by mail, SMS, etc.

* smartmontools does not work in this case.

* Sorry, this was written with Google Translate.
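
As a small example of the kind of bulk tool asked for above, here is a sketch that runs mfsfilerepair on every file reported in the log. The function name repair_unavailable_files is made up for illustration. It assumes one mfsmaster message per line and that the logged paths are relative to the root of the mounted file system. As far as I understand, mfsfilerepair only makes a broken file readable again (missing chunks are filled with zeros); it does not bring lost data back.

```shell
#!/bin/sh
# Run mfsfilerepair on every file that mfsmaster reported as
# "currently unavailable file <inode>: <path>".
# $1 is the MooseFS mount point; syslog lines are read from standard input.
# Assumptions: one message per line, and logged paths are relative to the
# root of the mounted file system.
# repair_unavailable_files is a hypothetical helper name.
repair_unavailable_files() {
    mountpoint=$1
    sed -n 's/^.*currently unavailable file [0-9][0-9]*: \(.*\)$/\1/p' |
        # The master repeats its report, so repair each path only once.
        sort -u |
        while IFS= read -r path; do
            # </dev/null keeps the command from consuming the path list.
            mfsfilerepair "$mountpoint/$path" </dev/null
        done
}
```

Usage, with the mount point and log path from the procedure: repair_unavailable_files /media/mfs < /var/log/mfs/mfs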
