From: jose m. <aso...@zo...> - 2010-05-12 17:22:34
On Monday 10 May 2010, lwxian_aha wrote:
> 158922: photos/47/62/63/m_853671.jpg
> May 10 20:34:58 localhost mfsmaster[20168]: currently unavailable chunk 00000000004465CB (inode: 4353227 ; index: 0)
> May 10 20:34:58 localhost mfsmaster[20168]: * currently unavailable file 4353227: photos/46/45/10/s_848116.jpg
> May 10 20:34:58 localhost mfsmaster[20168]: currently unavailable chunk 0000000000026B8E (inode: 158923 ; index: 0)
> May 10 20:34:58 localhost mfsmaster[20168]: * currently unavailable file 158923: photos/47/62/63/739789.jpg
> May 10 20:34:58 localhost mfsmaster[20168]: currently unavailable chunk 00000000004465CC (inode: 4353228 ; index: 0)
> May 10 20:34:58 localhost mfsmaster[20168]: * currently unavailable file 4353228: photos/46/91/73/936449.jpg
> May 10 20:34:58 localhost mfsmaster[20168]: currently unavailable chunk 00000000000CBC48 (inode: 158924 ; index: 0)
> May 10 20:34:58 localhost mfsmaster[20168]: * currently unavailable file 158924: photos/47/74/82/s_171883.jpg
> May 10 20:34:58 localhost mfsmaster[20168]: currently unavailable chunk 00000000004465CD (inode: 4353229 ; index: 0)
> May 10 20:34:58 localhost mfsmaster[20168]: * currently unavailable file 4353229: photos/46/45/10/l_848116.jpg
> May 10 20:34:58 localhost mfsmaster[20168]: currently unavailable chunk 00000000000CBC49 (inode: 158925 ; index: 0)
> May 10 20:34:58 localhost mfsmaster[20168]: * currently unavailable file 158925: photos/47/74/82/m_156158.jpg
> May 10 20:34:58 localhost mfsmaster[20168]: currently unavailable chunk 00000000004465CE (inode: 4353230 ; index: 0)
> May 10 20:34:58 localhost mfsmaster[20168]: * currently unavailable file 4353230: photos/46/91/73/m_928481.jpg
> May 10 20:34:58 localhost mfsmaster[20168]: currently unavailable chunk 0000000000026B91 (inode: 158926 ; index: 0)
> May 10 20:34:58 localhost mfsmaster[20168]: * current
>
* It seems that only one copy of the affected files existed, and it was lost
when a disk failed.
* The only solution I found for this situation is to run mfsfilerepair on each
of the affected files.
* In my case there were tens of thousands, so I deleted the directories that
contained affected files, because the data was still available on the main
cluster.
* The procedure I used follows. The log handling comes first because the
number of messages overwhelms rsyslog and mfscgi and can even hang the machine.
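
Before the procedure, a rough sketch of the mfsfilerepair route for cases with
fewer files. It assumes each syslog entry sits on its own line in the log file
and uses the "* currently unavailable file INODE: PATH" format quoted above.
The function names and the default mount point are only illustrative. As far
as I know mfsfilerepair fills missing chunks with zeros, so it makes a file
readable again but does not bring the lost data back.

```shell
#!/bin/sh
# Sketch only: helper names and the default mount point are assumptions.

# Read mfsmaster log lines on stdin, print each affected path once.
affected_files() {
    sed -n 's/^.*\* currently unavailable file [0-9][0-9]*: \(.*\)$/\1/p' | sort -u
}

# Read the same log on stdin, print "COUNT DIRECTORY", most affected first.
affected_dirs() {
    affected_files | sed 's|/[^/]*$||' |
        awk '{ n[$0]++ } END { for (d in n) print n[d], d }' | sort -rn
}

# Run mfsfilerepair on every affected file below the mount point $1.
repair_affected() {
    mnt=${1:-/media/mfs}
    affected_files | while IFS= read -r path; do
        # stdin is redirected so the tool cannot consume the path list
        mfsfilerepair "$mnt/$path" < /dev/null
    done
}
```

Typical use, once the master log goes to its own file as in step 1 below:
affected_dirs < /var/log/mfs/mfs to see where the damage is, then
repair_affected /media/mfs < /var/log/mfs/mfs.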
1.- /etc/rsyslog.d/mfs
----------- begin ------------
# => write all mfsmaster messages to their own file:
if ($programname == 'mfsmaster' or $syslogtag == '[mfsmaster]:') then \
    -/var/log/mfs/mfs;RSYSLOG_TraditionalFileFormat
# => then discard them so they reach no other log:
if ($programname == 'mfsmaster' or $syslogtag == '[mfsmaster]:') then \
    ~
--------- end ----------------------
2.- mkdir -p /var/log/mfs/mfs-old
3.- /etc/init.d/syslog restart
4.- rotate the logs every five minutes:
vi /root/scripts-admin/LogrotateMfs
----------- begin -----------
/var/log/mfs/mfs {
    # date-stamped names; rotate even when the log is empty
    dateext
    ifempty
    # copy the log, then truncate it in place
    copytruncate
    create 640 root root
    olddir /var/log/mfs/mfs-old
    sharedscripts
    postrotate
        /etc/init.d/syslog reload
    endscript
    # rename the rotated log with a timestamp, then prune old archives
    lastaction
        DATE=`date +%Y%m%d_%H%M` ; \
        mv /var/log/mfs/mfs-old/mfs* /var/log/mfs/mfs-old/LOG-MASTER-$DATE ; \
        find /var/log/mfs/mfs-old -type f -mtime +1 -exec rm -f {} \;
    endscript
}
------------ end --------------------
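
The lastaction step above passes every mfs* match to a single mv, which works
only while exactly one rotated file is present. A possible variant, written as
a standalone shell function that handles any number of matches, is sketched
here. The function name archive_rotated and the "-N" suffix for name clashes
are my own assumptions; the timestamp format and the find rule are taken from
the configuration above.

```shell
#!/bin/sh
# Sketch only: archive_rotated is a hypothetical helper, not part of logrotate.

# archive_rotated DIR
# Rename each rotated "mfs*" file in DIR to LOG-MASTER-<timestamp>,
# adding "-N" when that name is already taken, then prune old archives.
archive_rotated() {
    dir=$1
    stamp=$(date +%Y%m%d_%H%M)
    n=0
    for f in "$dir"/mfs*; do
        [ -f "$f" ] || continue            # the glob matched nothing
        target="$dir/LOG-MASTER-$stamp"
        while [ -e "$target" ]; do         # never overwrite an earlier archive
            n=$((n + 1))
            target="$dir/LOG-MASTER-$stamp-$n"
        done
        mv "$f" "$target"
    done
    # same pruning rule as the logrotate configuration
    find "$dir" -type f -mtime +1 -exec rm -f {} \;
}
```

It could be saved as a script and called from lastaction with
/var/log/mfs/mfs-old as its argument.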
5.- crontab -e
*/5 * * * * /usr/sbin/logrotate /root/scripts-admin/LogrotateMfs -f >/dev/null 2>&1
6.- start cluster
7.- mount cluster
8.- start mfscgiserv
List the affected directories and files and decide what to do according to
their number and importance, whether a backup exists, etc.
9.- stop mfscgiserv
10.- apply mfssettrashtime 0 to the directory and remove it.
Example:
mfsmount /media/mfs -H 172.26.0.10
cd /media/mfs/
mfssettrashtime -r 0 photos
(or: mfssettrashtime -r 0 photos/46/45/10)
rm -Rf photos
(or: rm -Rf photos/46/45/10)
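
Since rm -Rf is unforgiving, the two commands above could be wrapped in a
small guard. This is only a sketch: the function name purge_dir and its checks
are my assumptions, while mfssettrashtime -r 0 and rm -Rf are used exactly as
in the example.

```shell
#!/bin/sh
# Sketch only: purge_dir is a hypothetical wrapper around the two commands.

# purge_dir MOUNTPOINT RELPATH
# Set trash time 0 on MOUNTPOINT/RELPATH, then remove it.
purge_dir() {
    mnt=$1
    rel=$2
    # refuse empty, absolute, or parent-escaping paths
    case $rel in
        ''|/*|..|../*|*/..|*/../*)
            echo "refusing path: '$rel'" >&2
            return 1 ;;
    esac
    if [ ! -d "$mnt/$rel" ]; then
        echo "not a directory: $mnt/$rel" >&2
        return 1
    fi
    # trash time 0, so removed files are not kept in the MooseFS trash
    mfssettrashtime -r 0 "$mnt/$rel" || return 1
    rm -Rf "$mnt/$rel"
}
```

For the example above this would be purge_dir /media/mfs photos/46/45/10.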
* Continue in the same way for the remaining directories.
* Users clearly need tools that produce such listings and apply the mfs
commands in bulk, plus a warning system that reports disk failures by mail,
SMS, etc.
* smartmontools does not help in this case.
* Sorry, this was written with Google Translate.