Readme for ddrescue features patch 1.2 20140624
This patch currently works on ddrescue-1.18.1. It will not work on older versions, and will likely not work on newer versions.
This patch is now obsolete with the release of ddrescue 1.19. It is recommended that you update to ddrescue 1.19, as it has a better implementation of all the options in this patch.
WARNING! There is absolutely no warranty with this patch. Use at your own risk!
This patch is meant for advanced users. There were features that I wanted for personal use in ddrescue, and that is how this patch came into existence. I have just decided to share it.
This patch is not fully tested, and it is always possible that this patch could break something in ddrescue, as this is kind of a hack, and is NOT supported by the author of ddrescue in any way at this time.
To use this patch, extract and put the whole patch folder in the ddrescue source folder. Then navigate to the patch folder ("cd ddrescue_features_patch-x.x") and execute "./_patch.sh". Navigate up one directory ("cd ..") back to the ddrescue source, and then you can run "make clean", "make" and "make install" as normal to install ddrescue (assuming you already ran "./configure"). Notice the addition of "make clean", as this may be needed for the changes to take effect.
If the patch fails, it will restore the original files from backup. If it is run a second time on an already patched ddrescue, it will fail and therefore revert back to stock ddrescue. So you can switch back and forth with this patch. The backups are made the first time the patch is run and then never overwritten again. So if you use this along with another one of my patches and it fails, it will revert the changed files back to stock and remove ALL patches. To use more than one patch, you must start with stock ddrescue and run each patch once, and possibly in a certain order.
This patch currently adds the following options:
--no-reverse-pass do not switch direction for each pass
--skip-on-first-err start skipping on first error
--trim-sequentially don't trim small blocks first
--split-sequentially don't split large blocks first
--no-reverse: This makes the second pass go in the same direction as the first. It is included for those who may ask for the option, but in my benchmark testing there was no real benefit to turning off the reverse pass.
--skip-on-first-err: By default, ddrescue doesn't start skipping until 2 errors are encountered in a row. Sometimes the errors are spread out so that skipping does not happen very often if at all. This option will make ddrescue skip on the first error on the first pass forwards, and also on the second pass in reverse. If used with --no-reverse, the second forward pass skips on the second error like normal. Note that if used with the --reverse option then ddrescue will behave as normal and this option will not do anything. This option works best with a larger skip size; with the default skip size it does not have a positive effect.
--trim-sequentially: Normally ddrescue trims the smallest block first, which can cause unwanted head movement. This option makes it trim in order in one pass in the direction specified. My tests did not show any speed difference, but the test was small and did not have excessive head movement to begin with.
--split-sequentially: Normally ddrescue splits the largest blocks first (which can cause a lot of unwanted head movement), and then when there are only small blocks of less than 7 sectors in size it will split sequentially. This option makes it split in order in one pass in the direction specified. In my benchmarking tests this helped slightly with overall recovery time, which is likely a result of drive read-ahead. This was even with a small test size, so it is possible that there could be more to gain on a full size recovery. Note that this speed increase would not normally be noticed due to the amount of time errors take to process, and is a very small increase overall. The biggest benefit is the reduced head movement.
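To get a rough feel for why processing order matters for head movement, here is a small toy model in Python. It is only a sketch and not how ddrescue works internally: the block positions and sizes are made up, and it treats the distance between starting sectors as head travel, which ignores how sectors really map to heads and platters.

```python
# Toy model: total "head travel" when visiting blocks in different orders.
# Positions and sizes are in sectors; all values are made up for illustration.

def head_travel(positions):
    """Sum of absolute jumps between consecutively visited positions."""
    return sum(abs(b - a) for a, b in zip(positions, positions[1:]))

# (start_sector, size_in_sectors) of hypothetical blocks waiting to be split
blocks = [(1000, 8), (50000, 512), (2000, 64), (90000, 256), (30000, 16)]

# Largest block first, the stock behavior as described in this readme
largest_first = [start for start, size in sorted(blocks, key=lambda b: -b[1])]

# Strictly by position, as --split-sequentially is described
sequential = [start for start, size in sorted(blocks)]

print("largest first:", head_travel(largest_first))  # 185000
print("sequential:   ", head_travel(sequential))     # 89000
```

With these made-up blocks the sequential order travels less than half the distance of the largest-first order. Real numbers will depend entirely on where the bad blocks are.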
************************************************************
Now let's look at some practical uses of the options of ddrescue, including these new ones in this patch. Some of this information is based on my benchmarking tests. But first we need to understand how the data is stored on the platters.
A typical disk can have between 1 and 4 platters, and 2 to 8 heads. The data is actually stored in small groups that could be 100MB or less up to 1GB or more, depending on the drive. So for example if the group size were exactly 100MB, then on a 2 platter 4 head drive the first 0-100MB would be read from head 1, 100-200MB from head 2, 200-300MB from head 3, 300-400MB from head 4. Then the next 400-500MB would go back to head 1, and so on. So as you see, the data is not all in straight line order.
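The example above can be written as a small calculation. This is a sketch of the simplified round-robin layout only; the function name is made up, and real drives have varying group sizes and do not have to follow this exact pattern.

```python
# Simplified model from the example: fixed-size groups assigned to heads in turn.

def head_for_offset(offset_bytes, group_size_bytes, num_heads):
    """Return the 1-based head number that would read this offset."""
    group_index = offset_bytes // group_size_bytes
    return group_index % num_heads + 1

MB = 1000 * 1000
GROUP = 100 * MB   # assumed group size of exactly 100MB
HEADS = 4          # 2 platters, 4 heads

for offset_mb in (50, 150, 250, 350, 450):
    print(offset_mb, "MB -> head", head_for_offset(offset_mb * MB, GROUP, HEADS))
# 50 MB -> head 1, 150 MB -> head 2, 250 MB -> head 3,
# 350 MB -> head 4, 450 MB -> head 1
```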
There are normally two basic hard drive errors (ones that can be worked with using ddrescue). The first is a damaged area on one of the platters. The size of this error can vary, and the error can span multiple groups on the head. A damaged platter can also cause head damage (or further head damage) when the head passes over it. The less time spent in this area the better.
The second common error is a weak or damaged head. This will affect reads across the entire disk. I have seen more than one logfile that shows this. There are usually many small errors spaced a bit apart, and usually there is also somewhat of a pattern (that can only be seen by examining the logfile). You can use ddrescueview to see a visual reference of the errors caused by the bad head, and you can also use it to get an idea of the group size of the head.
So how do we best handle these errors with ddrescue?
Fully sequential read: For those who want absolute control, use the following normal options to get a totally sequential read in one pass, '--skip-size=0 --cluster-size=1'. This turns off skipping, and also makes ddrescue read one sector at a time. The disk is read in order from start to finish with no trimming or splitting, and no areas are tried twice. This can actually speed up the overall time of a recovery over using the default options if there are a lot of small errors. The downside to this is that reads of the good areas are slow due to the small read size. And you don't get the most good data first, so if the drive gets worse as you progress you risk losing more data. As stated, this is for those who want the absolute control, and is not normally recommended.
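For reference, here is a small Python sketch that only assembles the full command line for this method and prints it. It does not run ddrescue. The device and file names (/dev/sdX, image.img, rescue.log) are placeholders, and the argument order assumes the usual "ddrescue [options] infile outfile [logfile]" form.

```python
import shlex

def sequential_read_command(infile, outfile, logfile):
    """Build the argument list for a one-pass, one-sector-at-a-time read."""
    return ["ddrescue", "--skip-size=0", "--cluster-size=1",
            infile, outfile, logfile]

# Placeholder names; substitute your own device, image and logfile.
cmd = sequential_read_command("/dev/sdX", "image.img", "rescue.log")
print(shlex.join(cmd))
# ddrescue --skip-size=0 --cluster-size=1 /dev/sdX image.img rescue.log
```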
Skip out fast: This method involves using the --skip-size option to set both the skip size and the max skip size. By default the skip size is 64KiB and the max is either 1GiB or 1% of the drive size, whichever is smaller. So for example if we use ddrescueview (or examine the logfile) for the error pattern early on in the rescue to get an estimate that the data group size is about 100MB, then we might want to go with something like a 5Mi skip size with a 10Mi max ("--skip-size=5Mi,10Mi"). We want to keep skipping out of the bad head as fast as possible on the first pass, but don't want to skip way too far out if we can help it. The untried area that is skipped out away from the bad head will get processed by the reverse pass (a good benefit of the reverse pass). This means that we can skip out big and fast if wanted, but understand that reverse reads are usually slower than forward reads. And you also don't want to allow skipping more than half way to the next bad read, or good data could be missed on the reverse pass and would have to wait for the third no-skip pass. The skip out fast method will also work for a damaged area on the platter, although you will likely not know in advance the group size. The big benefit of this method is getting the most good data as fast as possible before working on the problem areas.
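The numbers in this method can be checked with a little arithmetic. The Python sketch below uses the default max skip rule exactly as stated above (1GiB or 1% of the drive size, whichever is smaller) and the "no more than half way to the next bad read" rule of thumb. The function names and the 100MB distance are made up for illustration, so treat this as a helper for thinking about the values and not as part of ddrescue.

```python
GiB = 1024 ** 3
MiB = 1024 ** 2

def default_max_skip(drive_size_bytes):
    """Default max skip as stated above: 1GiB or 1% of the drive, whichever is smaller."""
    return min(GiB, drive_size_bytes // 100)

def skip_size_option(skip_mib, max_skip_mib):
    """Format the option the same way as the example in the text."""
    return f"--skip-size={skip_mib}Mi,{max_skip_mib}Mi"

def max_skip_is_safe(max_skip_bytes, distance_to_next_bad_read_bytes):
    """Rule of thumb: do not skip more than half way to the next bad read."""
    return max_skip_bytes <= distance_to_next_bad_read_bytes // 2

print(default_max_skip(500 * 10**9))   # 500GB drive -> capped at 1GiB
print(default_max_skip(40 * 10**9))    # 40GB drive  -> 1% = 400000000 bytes
print(skip_size_option(5, 10))         # --skip-size=5Mi,10Mi

# Assumed for illustration: the next bad read is about 100MB away.
print(max_skip_is_safe(10 * MiB, 100 * 10**6))  # True
print(max_skip_is_safe(GiB, 100 * 10**6))       # False
```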
If you notice that you are getting read errors but it is not skipping ahead like it should, then you could also use my patch option of --skip-on-first-err. This option can help get more good data faster, or help skip away from a bad spot on the disk as fast as possible.
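Here is a toy Python model of why spread out errors may never trigger skipping by default. It is a simplification and not the actual ddrescue logic: it just walks a made-up list of read results and counts how often a run of consecutive errors reaches the trigger count (2 for the default behavior, 1 for --skip-on-first-err).

```python
def count_skips(read_results, errors_to_trigger):
    """Count how often skipping would start in this toy model.

    read_results: sequence of booleans, True = good read, False = read error.
    errors_to_trigger: consecutive errors needed before skipping starts.
    """
    skips = 0
    consecutive_errors = 0
    for ok in read_results:
        if ok:
            consecutive_errors = 0
            continue
        consecutive_errors += 1
        if consecutive_errors >= errors_to_trigger:
            skips += 1
            consecutive_errors = 0  # the skip moves away from the bad spot
    return skips

# Made-up read results: errors separated by good reads, as with a weak head.
spread_out = [False, True, True, False, True, False, True, True]

print(count_skips(spread_out, 2))  # 0, never two errors in a row
print(count_skips(spread_out, 1))  # 3, every error starts a skip
```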
No matter what, when you get to the trimming and splitting phases, it is going to go slow. My new options may help a little bit, although it may not be a noticeable difference.
Antonio is working on these improvements in 1.19 (and 1.19-pre1 is definitely a big improvement), but after seeing the head movement plots from trimming and splitting (splitting bounced the head all over like a basketball), I myself am sold on the --trim-sequentially and --split-sequentially options, at least for version 1.18.1. These options will most likely be obsolete for 1.19.