From: jose m. <aso...@zo...> - 2010-05-12 17:22:34
On Monday 10 May 2010, lwxian_aha wrote:
> 158922: photos/47/62/63/m_853671.jpg
> May 10 20:34:58 localhost mfsmaster[20168]: currently unavailable chunk 00000000004465CB (inode: 4353227 ; index: 0)
> May 10 20:34:58 localhost mfsmaster[20168]: * currently unavailable file 4353227: photos/46/45/10/s_848116.jpg
> May 10 20:34:58 localhost mfsmaster[20168]: currently unavailable chunk 0000000000026B8E (inode: 158923 ; index: 0)
> May 10 20:34:58 localhost mfsmaster[20168]: * currently unavailable file 158923: photos/47/62/63/739789.jpg
> May 10 20:34:58 localhost mfsmaster[20168]: currently unavailable chunk 00000000004465CC (inode: 4353228 ; index: 0)
> May 10 20:34:58 localhost mfsmaster[20168]: * currently unavailable file 4353228: photos/46/91/73/936449.jpg
> May 10 20:34:58 localhost mfsmaster[20168]: currently unavailable chunk 00000000000CBC48 (inode: 158924 ; index: 0)
> May 10 20:34:58 localhost mfsmaster[20168]: * currently unavailable file 158924: photos/47/74/82/s_171883.jpg
> May 10 20:34:58 localhost mfsmaster[20168]: currently unavailable chunk 00000000004465CD (inode: 4353229 ; index: 0)
> May 10 20:34:58 localhost mfsmaster[20168]: * currently unavailable file 4353229: photos/46/45/10/l_848116.jpg
> May 10 20:34:58 localhost mfsmaster[20168]: currently unavailable chunk 00000000000CBC49 (inode: 158925 ; index: 0)
> May 10 20:34:58 localhost mfsmaster[20168]: * currently unavailable file 158925: photos/47/74/82/m_156158.jpg
> May 10 20:34:58 localhost mfsmaster[20168]: currently unavailable chunk 00000000004465CE (inode: 4353230 ; index: 0)
> May 10 20:34:58 localhost mfsmaster[20168]: * currently unavailable file 4353230: photos/46/91/73/m_928481.jpg
> May 10 20:34:58 localhost mfsmaster[20168]: currently unavailable chunk 0000000000026B91 (inode: 158926 ; index: 0)
> May 10 20:34:58 localhost mfsmaster[20168]: * current

* It seems that the affected files had only one copy, and that copy was lost when a disk failed.
* The only solution I have found for this situation is to run mfsfilerepair on each of the affected files.
* In my case there were tens of thousands of them, so I deleted the directories that contained affected files, because I had a backup on the main cluster.
* The procedure follows. The logging setup comes first because the number of messages can block rsyslog, mfscgi and even the whole machine.

1.- /etc/rsyslog.d/mfs

----------- begin ------------
# => all mfs messages go to their own file:
if ($programname == 'mfsmaster' or $syslogtag == '[mfsmaster]:') then \
-/var/log/mfs/mfs;RSYSLOG_TraditionalFileFormat
# => then discard them so they do not reach the other logs:
if ($programname == 'mfsmaster' or $syslogtag == '[mfsmaster]:') then \
~
----------- end --------------

2.- mkdir -p /var/log/mfs/mfs-old

3.- /etc/init.d/syslog restart

4.- Rotate the logs every five minutes:

vi /root/scripts-admin/LogrotateMfs

----------- begin ------------
/var/log/mfs/mfs {
    dateext
    ifempty
    copytruncate
    create 640 root root
    olddir /var/log/mfs/mfs-old
    sharedscripts
    postrotate
        /etc/init.d/syslog reload
    endscript
    lastaction
        DATE=`date +%Y%m%d_%H%M` ; \
        mv /var/log/mfs/mfs-old/mfs* /var/log/mfs/mfs-old/LOG-MASTER-$DATE ; \
        find /var/log/mfs/mfs-old -type f -mtime +1 -exec rm -f {} \;
    endscript
}
----------- end --------------

5.- crontab -e

*/5 * * * * /usr/sbin/logrotate /root/scripts-admin/LogrotateMfs -f >/dev/null 2>&1

6.- Start the cluster.

7.- Mount the cluster.

8.- Start mfscgiserv and list the affected directories and files. Decide what to do according to how many there are, how important they are, whether a backup exists, etc.

9.- Stop mfscgiserv.

10.- Apply mfssettrashtime 0 to the directory and remove it. Example:

mfsmount /media/mfs -H 172.26.0.10
cd /media/mfs/
mfssettrashtime -r 0 photos
   or
mfssettrashtime -r 0 photos/46/45/10
rm -Rf photos
   or
rm -Rf photos/46/45/10

* (to be continued ...)
* Users obviously need tools that produce listings and apply the mfs commands in bulk, and a warning system that reports disk failures by mail, SMS, etc.
* smartmontools does not help in this case.
* Sorry for the Google Translate English.
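
As a rough illustration of the kind of listing tool mentioned above, here is a minimal Python sketch. It assumes the master log keeps one message per line in the "currently unavailable file <inode>: <path>" form seen in the quoted log. The function names and the script name are hypothetical and are not part of MooseFS.

```python
import re
import sys
from collections import Counter
from pathlib import PurePosixPath

# Matches lines such as:
#   ... mfsmaster[20168]: * currently unavailable file 4353227: photos/46/45/10/s_848116.jpg
# Assumption: one syslog message per line, path running to the end of the line.
FILE_LINE = re.compile(r"currently unavailable file (\d+): (.+)")


def affected_files(lines):
    """Return {inode: path} for every 'currently unavailable file' message."""
    found = {}
    for line in lines:
        match = FILE_LINE.search(line)
        if match:
            # Keyed by inode, so repeated messages for one file count once.
            found[int(match.group(1))] = match.group(2).strip()
    return found


def count_by_directory(files):
    """Count the affected files in each parent directory."""
    return Counter(str(PurePosixPath(path).parent) for path in files.values())


if __name__ == "__main__":
    # Read the log from standard input and print the worst directories first.
    files = affected_files(sys.stdin)
    for directory, count in count_by_directory(files).most_common():
        print(f"{count:8d}  {directory}")
```

It could be run as, for example, `python3 mfs_unavailable.py < /var/log/mfs/mfs`. The per-directory counts may help decide between running mfsfilerepair on a few files and removing a whole directory.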