Re: [sleuthkit-users] hashing a file system
From: RB <ao...@gm...> - 2014-09-05 01:11:34
On Thu, Sep 4, 2014 at 7:01 PM, Simson Garfinkel <si...@ac...> wrote:
> This doesn't work unless you are prepared to buffer the later fragments of a file when they appear on disk before earlier fragments. So in the worst case, you need to hold the entire disk in RAM.

Perhaps I'm being dense, but "dd if=file | md5sum -" in no way holds the entire file in RAM, and the process can be slept/interrupted/etc; all this means that md5 can be calculated over a stream.

Looking at the API for Perl & Python MD5 libraries (expected to be the simplest), they have standard functionality for adding data to a hash object, and I don't expect it holds that in memory either.

This would mean you should be able to make a linear scan through the disk and, as you read blocks associated with a file, append them to the md5 object for that file, and move on. You'd have a lot of md5 objects in-memory, but it shouldn't be of a size equivalent to the entire [used] disk.
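
For illustration, here is a minimal Python sketch of the idea above, using the standard hashlib module. The names md5_stream, hash_files_linear, disk_blocks and block_owner are hypothetical and not part of The Sleuth Kit; the block-to-file map is assumed to come from somewhere else. Note the big assumption: MD5 depends on the order of its input, so this only gives the right digest if each file's blocks are met in file order during the scan. Blocks that show up on disk ahead of earlier ones would still have to be buffered, which is the case Simson describes.

```python
import hashlib


def md5_stream(chunks):
    """Hash an iterable of byte chunks without joining them in memory."""
    h = hashlib.md5()
    for chunk in chunks:
        h.update(chunk)  # only the running hash state is kept, not the data
    return h.hexdigest()


def hash_files_linear(disk_blocks, block_owner):
    """One pass over the disk, one md5 object per file.

    disk_blocks: iterable of (block_number, data) in on-disk order.
    block_owner: dict mapping block_number -> file name (hypothetical map).
    ASSUMPTION: every file's blocks appear on disk in file order.
    """
    hashers = {}
    for block_no, data in disk_blocks:
        name = block_owner.get(block_no)
        if name is None:
            continue  # unallocated block, not part of any file
        if name not in hashers:
            hashers[name] = hashlib.md5()
        hashers[name].update(data)
    return {name: h.hexdigest() for name, h in hashers.items()}


if __name__ == "__main__":
    # Two files whose blocks are interleaved on disk but each in file order.
    disk = [(0, b"aaaa"), (1, b"1111"), (2, b"bbbb"), (3, b"2222")]
    owner = {0: "a.txt", 1: "b.txt", 2: "a.txt", 3: "b.txt"}
    for name, digest in hash_files_linear(disk, owner).items():
        print(name, digest)
```

Memory use here grows with the number of files being hashed (one small hash state each), not with the amount of data read.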