From: Bruce A. <ba...@gr...> - 2008-03-23 21:20:26
Dear smartmontools users and developers,

My laptop disk decided to start failing yesterday. In the course of
rescuing my files from it, I figured out a simple trick for identifying
files with uncorrectable sectors, and "repairing" the disk drive. This
was on a MacBook Pro laptop, but should work on other Unix/Linux systems
also.

THE PROBLEM
-----------
I knew that the drive had developed uncorrectable sectors (apparent with
smartctl -A) and so some files were unreadable. I wanted to IDENTIFY the
unreadable files (so I knew what data was lost or corrupted) and then
force the drive to REALLOCATE the sectors. Any files that were important
to me or to the operating system would then have to be replaced.

CAVEAT
------
The method advocated below can DESTROY YOUR DATA. Use this AT YOUR OWN
RISK. Do NOT do this unless you are an experienced sysadmin and
understand the different steps and their implications. This method is
guaranteed to destroy some data. If not carried out properly it will
destroy ALL data and make your computer unusable.

METHOD
------
To identify all unreadable files, try reading all data from all files. As
root, do:

    su
    find / -type f -exec md5sum {} \; >/tmp/filelist.out 2>/tmp/filelist.err &

This creates a list of all files with uncorrectable sectors in
/tmp/filelist.err. All readable files are listed in /tmp/filelist.out.

"REPAIR OF THE DISK"
--------------------
I looked at the files one by one. All files were things I didn't care
about (GarageBand sample Audio files, for example). So I forced sector
overwriting using this script:

    #!/bin/sh
    FILE="$1"
    if [ ! -f "$FILE" ] ; then
        echo regular file "$FILE" does not exist
        exit 0
    fi
    SIZE=`ls -l "$FILE" | awk -- '{print \$5;}'`
    dd if=/dev/urandom of="$FILE" bs=${SIZE} count=1 conv=notrunc
    echo done with "$FILE"
    sync

This opens the file in 'no truncation' mode and overwrites it with random
numbers. WARNING: THIS DESTROYS ALL DATA IN THE FILE. It would be
possible to do this in a much more subtle way that only overwrites the
data that is already lost in an UNC sector. This method just overwrites
the entire file under the assumption that the entire file is not
necessary.

(Note that by doing WRITES to the uncorrectable sectors this forces the
disk to reallocate bad sectors.)

EFFECT
------
In my case, there were too many failed sectors, and after doing this
process for about 700 unreadable files, my laptop disk ran out of spare
sectors. Here's what it looked like afterwards:
http://smartmontools.sourceforge.net/examples/ST910021AS.txt

For a disk with just a few uncorrectable sectors, this method should
identify the files that are unreadable, and if those files can be replaced
or destroyed, then permits the administrator to force sector reallocation
for those bad sectors.

Cheers,
Bruce
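The "more subtle way" mentioned in the message above could look roughly like the sketch below. It is not the script from the post: the function name repair_unreadable_blocks and the 4096-byte default block size are assumptions made for illustration. It reads the file one block at a time with dd and writes zeros only over blocks whose read fails, so readable data is left alone. Whether a failed read maps cleanly onto one disk sector depends on the file system and on caching, so treat it as a starting point only.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original post: overwrite only the
# blocks of FILE that fail to read, leaving readable blocks untouched.
# Usage: repair_unreadable_blocks FILE [BLOCKSIZE]
repair_unreadable_blocks() {
    file="$1"
    bs="${2:-4096}"          # assumed block size; adjust to the file system
    if [ ! -f "$file" ]; then
        echo "regular file $file does not exist" >&2
        return 1
    fi
    # Take the size from the directory listing so no file data is read here.
    size=$(ls -l "$file" | awk '{print $5}')
    blocks=$(( (size + bs - 1) / bs ))
    rewritten=0
    i=0
    while [ "$i" -lt "$blocks" ]; do
        # dd exits non-zero if the block cannot be read.
        if ! dd if="$file" of=/dev/null bs="$bs" skip="$i" count=1 2>/dev/null; then
            offset=$(( i * bs ))
            left=$(( size - offset ))
            if [ "$left" -ge "$bs" ]; then
                # Full block: overwrite it in place with zeros.
                dd if=/dev/zero of="$file" bs="$bs" seek="$i" count=1 conv=notrunc 2>/dev/null
            else
                # Partial last block: write byte-wise so the file does not grow.
                dd if=/dev/zero of="$file" bs=1 seek="$offset" count="$left" conv=notrunc 2>/dev/null
            fi
            rewritten=$(( rewritten + 1 ))
        fi
        i=$(( i + 1 ))
    done
    sync
    echo "rewrote $rewritten of $blocks blocks in $file"
}
```

Working block by block also avoids the single huge dd buffer (bs set to the file size) that the whole-file script needs, which can be a problem for very large files. The same caveat applies as above: any block that gets rewritten has lost its data.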
From: Douglas G. <do...@to...> - 2008-03-24 03:56:16
Bruce Allen wrote:
> Dear smartmontools users and developers,
>
> My laptop disk decided to start failing yesterday. In the course of
> rescuing my files from it, I figured out a simple trick for identifying
> files with uncorrectable sectors, and "repairing" the disk drive. This
> was on a MacBook Pro laptop, but should work on other Unix/Linux systems
> also.
>
> THE PROBLEM
> -----------
> I knew that the drive had developed uncorrectable sectors (apparent with
> smartctl -A) and so some files were unreadable. I wanted to IDENTIFY the
> unreadable files (so I knew what data was lost or corrupted) and then
> force the drive to REALLOCATE the sectors. Any files that were important
> to me or to the operating system would then have to be replaced.
>
> CAVEAT
> ------
> The method advocated below can DESTROY YOUR DATA. Use this AT YOUR OWN
> RISK. Do NOT do this unless you are an experienced sysadmin and
> understand the different steps and their implications. This method is
> guaranteed to destroy some data. If not carried out properly it will
> destroy ALL data and make your computer unusable.
>
> METHOD
> ------
> To identify all unreadable files, try reading all data from all files. As
> root, do:
>
>     su
>     find / -type f -exec md5sum {} \; >/tmp/filelist.out 2>/tmp/filelist.err &
>
> This creates a list of all files with uncorrectable sectors in
> /tmp/filelist.err. All readable files are listed in /tmp/filelist.out.
>
>
> "REPAIR OF THE DISK"
> --------------------
> I looked at the files one by one. All files were things I didn't care
> about (GarageBand sample Audio files, for example). So I forced sector
> overwriting using this script:
>
>     #!/bin/sh
>     FILE="$1"
>     if [ ! -f "$FILE" ] ; then
>         echo regular file "$FILE" does not exist
>         exit 0
>     fi
>     SIZE=`ls -l "$FILE" | awk -- '{print \$5;}'`
>     dd if=/dev/urandom of="$FILE" bs=${SIZE} count=1 conv=notrunc
>     echo done with "$FILE"
>     sync
>
> This opens the file in 'no truncation' mode and overwrites it with random
> numbers. WARNING: THIS DESTROYS ALL DATA IN THE FILE. It would be
> possible to do this in a much more subtle way that only overwrites the
> data that is already lost in an UNC sector. This method just overwrites
> the entire file under the assumption that the entire file is not
> necessary.
>
> (Note that by doing WRITES to the uncorrectable sectors this forces the
> disk to reallocate bad sectors.)
>
> EFFECT
> ------
> In my case, there were too many failed sectors, and after
> doing this process for about 700 unreadable files, my laptop disk ran out
> of spare sectors. Here's what it looked like afterwards:
> http://smartmontools.sourceforge.net/examples/ST910021AS.txt
>
> For a disk with just a few uncorrectable sectors, this method should
> identify the files that are unreadable, and if those files can be replaced
> or destroyed, then permits the administrator to force sector reallocation
> for those bad sectors.

Bruce,
Ouch.

Thinking about the above, I assume you want to get as much
data as possible off that drive, then use it for smartmontools
testing :-)

So I'm not sure forcing (or encouraging) bad blocks to be
reallocated on the media is required. At the file system
level you just sacrifice the tainted files (tainted
directories would be a more difficult problem) and then
want the file system to add those just freed blocks
(inodes, whatever) to its badblock list so no other file
system operation tries to grab them (and compound the
problem). Looking at ext2/ext3 in Linux, the 'e2fsck -c'
command seems to do this. After that a backup program
(e.g. tar) should run through clean, allowing recoverable
files to be moved to different media.

I'm not sure if you can run something like 'e2fsck -c'
on a live root file system. Using a recent "live CD"
may help in this situation. Another possibility is to
remove the failing disk from the laptop and connect it
as a non-boot, non-root-file-system disk on another
computer that has spare SATA ports.

Doug Gilbert
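The concern about running 'e2fsck -c' on a live file system can be turned into a small guard. The sketch below is an illustration only: the function names is_mounted and check_badblocks are made up here, and /proc/mounts is a Linux-specific assumption. It refuses to start the scan if the device shows up as a mounted source. The device may be listed under another name (a UUID or /dev/root, for example), so a "not mounted" result should still be checked by hand.

```shell
#!/bin/sh
# Hypothetical helper: succeed if DEVICE is listed as a mounted source in a
# mounts table. The default table, /proc/mounts, exists only on Linux.
# Usage: is_mounted DEVICE [MOUNTS_FILE]
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Run the bad-block scan only when the file system is not mounted.
# Usage: check_badblocks DEVICE [MOUNTS_FILE]
check_badblocks() {
    dev="$1"
    if is_mounted "$dev" "$2"; then
        echo "$dev is mounted; unmount it or boot a live/rescue CD first" >&2
        return 1
    fi
    # -f forces a check even if the file system looks clean;
    # -c runs a read-only badblocks scan and records the bad blocks found.
    e2fsck -f -c "$dev"
}
```

From a live or rescue CD this would be called as root with the real partition name in place of a placeholder such as /dev/sdXN.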
From: Bruno W. I. <br...@wo...> - 2008-03-24 20:41:50
On Sun, Mar 23, 2008 at 23:55:58 -0400,
  Douglas Gilbert <do...@to...> wrote:
>
> I'm not sure if you can run something like 'e2fsck -c'
> on a live root file system. Using a recent "live CD"
> may help in this situation. Another possibility is to
> remove the failing disk from the laptop and connect it
> as a non-boot, non-root-file-system disk on another
> computer that has spare SATA ports.

"Rescue" CDs can also do this. They tend to have significantly less data
on them than "Live" CDs and you work in text mode instead of using X. They
aren't as general purpose and you probably want to use one made for your
particular distro.
From: Stanislav B. <sb...@su...> - 2008-03-27 16:45:19
Bruce Allen wrote:
>
> For a disk with just a few uncorrectable sectors, this method should
> identify the files that are unreadable, and if those files can be replaced
> or destroyed, then permits the administrator to force sector reallocation
> for those bad sectors.

You forgot to fill the rest of the disc with zeroes. That forces the
drive to relocate bad blocks in the free area as well.

But even then, it didn't help for my disc and I had to zero the complete
partition to force relocation. I guess that my bad blocks were in some
type of metadata. (And the disc has continued to work for several years
without further problems.)

--
Best Regards / S pozdravem,

Stanislav Brabec
software developer
---------------------------------------------------------------------
SUSE LINUX, s. r. o.          e-mail: sb...@su...
Lihovarská 1060/12            tel: +420 284 028 966, +49 911 740538747
190 00 Praha 9                fax: +420 284 028 951
Czech Republic                http://www.suse.cz/
From: Bruce A. <ba...@gr...> - 2008-03-28 02:29:29
On Thu, 27 Mar 2008, Stanislav Brabec wrote:

> Bruce Allen wrote:
>
>>
>> For a disk with just a few uncorrectable sectors, this method should
>> identify the files that are unreadable, and if those files can be replaced
>> or destroyed, then permits the administrator to force sector reallocation
>> for those bad sectors.
>
> You forgot to fill the rest of the disc with zeroes. That forces the
> drive to relocate bad blocks in the free area as well.

Agreed. In fact a user can use any method to create a large file that
fills all remaining free space on the disk.

> But even then, it didn't help for my disc and I had to zero the complete
> partition to force relocation. I guess that my bad blocks were in some
> type of metadata. (And the disc has continued to work for several years
> without further problems.)

Yes, the method I suggest will overwrite data but not metadata. If a
directory or superblock is not readable then more serious intervention is
needed.

Cheers,
Bruce
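One way to create such a file is sketched below. The function name fill_free_space and the filler file name are assumptions made for illustration, not something used in the thread. It writes zeros into a temporary file until the file system is full (or until an optional size limit in MiB is reached), syncs, and removes the file again. Filling a file system completely can disrupt running programs, especially when done as root, so it is best tried on a quiet system.

```shell
#!/bin/sh
# Hypothetical helper: fill the free space of the file system holding DIR
# with zeros, then delete the filler file again.
# Usage: fill_free_space DIR [LIMIT_MIB]
fill_free_space() {
    dir="$1"
    limit="$2"                      # optional cap in MiB, useful for a trial run
    filler="$dir/zero-fill.$$"
    if [ -n "$limit" ]; then
        dd if=/dev/zero of="$filler" bs=1048576 count="$limit" 2>/dev/null
    else
        # Without a cap dd stops with "No space left on device";
        # that error is the expected way for this step to end.
        dd if=/dev/zero of="$filler" bs=1048576 2>/dev/null
    fi
    sync                            # push the zeros to the disk before deleting
    rm -f "$filler"
}
```

A trial run such as `fill_free_space /tmp 10` writes only 10 MiB; leaving out the second argument fills everything that is free. As noted above, this touches free data blocks only and does not rewrite file system metadata.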