I believe I've had great success using your ddrutility scripts, and it's awesome to be able to have ddrescue only attempt recovery of blocks used by files rather than an entire raw disk.
I also ran ddru_ntfsfindbad on a particular image/logfile, and there are a lot of entries with an errorsize of zero. For example:
It would seem to indicate that there is actually no error at all in the file, and that is corroborated by using my previous method of finding errors, which is:
1. Mount the rescued image file and create a list of hashes for every file using hashdeep.
2. Use ddrescue in fill-mode to fill all non-rescued areas of the image with a character string.
3. Create a new list of hashes and then diff the two files containing the before and after hashes.
So, that begs the question of why the entry is in the ntfsfindbad.log file at all. Can you shed any light on what I might be missing?
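For reference, the before/after comparison in the last step of that method could be sketched roughly as below. This is only an illustration: `parse_manifest` and `changed_files` are made-up helper names, the sample data is invented, and the comma-separated layout (size, two hashes, file name, with header lines starting with `%` or `#`) is an assumption about the hash lists rather than something confirmed in this thread.

```python
def parse_manifest(text):
    """Map file name -> hash from a hashdeep-style list (layout is assumed)."""
    hashes = {}
    for line in text.splitlines():
        # Skip blank lines and header/comment lines (assumed to start with % or #).
        if not line.strip() or line[0] in "%#":
            continue
        # Assumed columns: size, first hash, second hash, file name.
        size, first_hash, second_hash, name = line.split(",", 3)
        hashes[name] = second_hash
    return hashes


def changed_files(before, after):
    """Names whose hash differs, or that appear in only one of the lists."""
    names = set(before) | set(after)
    return sorted(n for n in names if before.get(n) != after.get(n))


# Invented sample lists standing in for the two hashing runs.
before_text = """%%%% HASHDEEP-1.0
%%%% size,md5,sha256,filename
## made-up sample data
10,aaa,111,/mnt/img/good.txt
20,bbb,222,/mnt/img/damaged.bin
"""
# After fill-mode, only files overlapping non-rescued areas should hash differently.
after_text = before_text.replace("20,bbb,222", "20,ccc,333")

damaged = changed_files(parse_manifest(before_text), parse_manifest(after_text))
print(damaged)  # ['/mnt/img/damaged.bin']
```

Any file that shows up in this list overlaps a non-rescued area; a file listed by ddru_ntfsfindbad but absent here is one of the puzzling zero-errorsize entries.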
Because I do not have access to the recovery in question, I cannot be 100% certain of this, but the fact that you verified this result (AWESOME!) helps me reach this conclusion. First, let’s look at the documentation for ddru_ntfsfindbad:
“Errorsize= the total size of all the combined errors in bytes. Note that while this number would normally be a multiple of the sector size, if an error was in the last sector and the file did not fill the whole sector, only the used portion would be included in the errorsize”
I believe this would explain what you are seeing, had I explained it better in the first place (now on the to-do list). NTFS uses clusters, usually 4096 bytes (8 x 512-byte sectors), so a 1-byte file would still take 4096 bytes of disk space. The ddru_ntfsfindbad script may still consider an error in any of those 8 sectors as linked to that file, but it is designed to report only the size of the data actually lost. So if the file is only big enough to take up a few (or only one) of the sectors, an error in the unused sectors would still be reported as linked to that file, but because the error is outside the used portion, the error size is reported as 0 bytes.
I don’t have the time right now to verify if that is how it is, but the other possibility would be if the file was actually 0 bytes in size, and the first sector is always considered. I know I wrote the code, but I can’t remember exactly how it is, and don’t have the time to look right now. But one way or another should explain it. Either the file is 0 bytes in size, or the bad sector was in an unused part of the cluster.
Hope this explains it.
Scott
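The unused-sector reasoning in the post above can be made concrete with a little arithmetic. The sketch below is not the ddru_ntfsfindbad code; `allocated_size` and `lost_bytes` are hypothetical names, and it assumes a contiguous file, the 512-byte sectors and 4096-byte clusters mentioned above, and bad sectors given as indexes counted from the start of the file's first cluster.

```python
SECTOR = 512    # bytes per sector, as in the post above
CLUSTER = 4096  # bytes per cluster, as in the post above


def allocated_size(file_size):
    """Disk space taken by a non-resident file: whole clusters, rounded up."""
    return -(-file_size // CLUSTER) * CLUSTER


def lost_bytes(file_size, bad_sectors):
    """Bytes of actual file data covered by the given bad sectors.

    Only the part of each bad sector that lies before the end of the
    file is counted, which mirrors the documented errorsize rule.
    """
    lost = 0
    for index in bad_sectors:
        start = index * SECTOR
        end = start + SECTOR
        lost += max(0, min(end, file_size) - start)
    return lost


print(allocated_size(1))      # 4096: a 1-byte file still occupies a full cluster
print(lost_bytes(1, [3]))     # 0: the bad sector is in the unused part of the cluster
print(lost_bytes(1, [0]))     # 1: only the single used byte is lost
print(lost_bytes(1300, [2]))  # 276: sector 2 covers bytes 1024..1535, file ends at 1300
print(lost_bytes(4096, [7]))  # 512: a full sector of real data is lost
```

Under these assumptions, an entry with an errorsize of 0 would correspond to the second case: a bad sector inside the file's cluster but past the end of its data.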
After I posted, I realized something. The chances of this happening due to a 0 byte file are almost impossible. If the file was that small it would be held in the MFT record itself. So I will go with my first reason of the error being in an unused sector in the cluster.
I am ashamed to report that I was wrong either way. I took what little time I had today to fire up my test disk and look for this issue, and have found that it is a slight bug in ddru_ntfsbitmap. It happens when the beginning of the file starts right after the end of the last error sector. It is getting reported in the list, but does not actually have an error in it. The good news is that at least it reports an accurate error size of 0 bytes. Finding and fixing this bug is going on the to-do list at the top, and should be fixed in the next release.
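The thread does not show the code behind this bug, so the following is purely an illustration of how this kind of symptom can arise in general, not a description of the actual ddrutility source. It assumes byte ranges stored as half-open `[start, end)` pairs; all function names are hypothetical. If a "does this file touch an error?" test treats the end positions as inclusive, a file that begins exactly where the error ends gets listed, yet the overlap that would be reported as its error size is still 0.

```python
def overlap_bytes(file_start, file_end, err_start, err_end):
    """Bytes shared by two half-open ranges [start, end)."""
    return max(0, min(file_end, err_end) - max(file_start, err_start))


def touches_inclusive(file_start, file_end, err_start, err_end):
    """Hypothetical faulty test: treats end positions as part of the range."""
    return file_start <= err_end and err_start <= file_end


def touches_half_open(file_start, file_end, err_start, err_end):
    """Boundary-safe test: adjacent ranges do not count as touching."""
    return file_start < err_end and err_start < file_end


# An error sector covering bytes 4096..4607, and a file starting right after it.
err = (4096, 4608)
adjacent_file = (4608, 8704)

print(touches_inclusive(*adjacent_file, *err))  # True: file would be listed
print(touches_half_open(*adjacent_file, *err))  # False: file would not be listed
print(overlap_bytes(*adjacent_file, *err))      # 0: no data is actually affected
```

This reproduces the pattern described above: an entry in the list with an accurate error size of 0 bytes.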
No problem. I'm glad you found the cause of the issue, and I'm also glad to have someone like you writing utilities to enhance the usefulness of ddrescue. As hard drives get bigger and bigger it's great to have a good chance at finding only the part of the drive that actually contains data. Thanks for the detailed explanation.
I just released ddrutility 2.4, which should have the fix for this. If you feel the need to verify it, I would not object, as I did not do much testing on it :)