Re: [sleuthkit-users] hashing a file system
From: Luís F. N. <lfc...@gm...> - 2014-09-05 23:02:09
Hi Simson,

I have had thoughts about implementing this "sort by sector number of first run" approach in a forensic tool based on TskJavaBindings, but I did not see how to get a file's first sector number through the API. Do you know if it is possible with the TSK Java bindings?

Regards,
Luis Nassif

2014-09-04 20:13 GMT-03:00 Simson Garfinkel <si...@ac...>:

> Hi Stuart.
>
> You are correct - I put this in numerous presentations but never published
> it.
>
> The MD5 algorithm won't let you combine a partial hash from the middle of
> the file with one from the beginning. You need to start at the beginning
> and hash through the end. (That's one of the many problems with MD5 for
> forensics, BTW.) So I believe that the only approach is sorting the files
> by the sector number of the first run, and just leaving it at that.
>
> I saw speedup with both HDs and SSDs, strangely enough, but not as much
> with SSDs. There may be a prefetch thing going on here.
>
> I think that the Autopsy framework should hash this way, but currently it
> doesn't. On the other hand, it may be more useful to hash based on the
> "importance" of the files.
>
> Simson
>
>
> On Sep 4, 2014, at 7:04 PM, Stuart Maclean <st...@ap...>
> wrote:
>
> > I am tracking recent efforts in STIX and Cybox and all things Mitre.
> > One indicator of compromise is an md5 hash of some file. Presumably you
> > compare the hash with all files on some file system to see if there is a
> > match. Obviously this requires a walk of the host fs, using e.g. fls or
> > fiwalk or the tsk library in general.
> >
> > Is this a common activity, the hashing of a complete filesystem that
> > is? If yes, some experiments I have done with minimising total disk
> > seek time by ordering Runs, reading content from the ordered Runs and
> > piecing each file's hash back together would show that this is indeed a
> > worthy optimization since it can decrease the time spent deriving the
> > full hash table considerably.
> >
> > I did see a slide deck by Simson G where he alluded to a similar win
> > situation when disk reads are ordered so as to minimise seek time, but
> > wonder if much has been published on the topic, specifically relating to
> > the digital forensics arena, i.e. when an entire file system's contents are
> > to be read in a single pass, for the purposes of producing an 'md5 ->
> > file path' map.
> >
> > Opinions and comments welcomed.
> >
> > Stuart
> >
> > ------------------------------------------------------------------------------
> > Slashdot TV.
> > Video for Nerds. Stuff that matters.
> > http://tv.slashdot.org/
> > _______________________________________________
> > sleuthkit-users mailing list
> > https://lists.sourceforge.net/lists/listinfo/sleuthkit-users
> > http://www.sleuthkit.org
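
For illustration only, the approach Simson describes in the thread (sort the files by the sector number of the first run, then hash each file from beginning to end) might look roughly like the Python sketch below. It is a minimal sketch under stated assumptions, not code from the thread and not the Sleuth Kit API: `FileEntry`, `first_sector`, `hash_in_disk_order` and the `read_file` callback are hypothetical names, and how to obtain the first sector through the TSK Java bindings is left open, exactly as in the question above.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List


@dataclass(frozen=True)
class FileEntry:
    """Hypothetical record; a real tool would fill it from a file system walk."""
    path: str
    first_sector: int  # sector number of the file's first run (assumed known)


def hash_in_disk_order(
    entries: Iterable[FileEntry],
    read_file: Callable[[FileEntry], Iterator[bytes]],
) -> Dict[str, List[str]]:
    """Build an 'md5 -> file paths' map, visiting files in first-sector order.

    Each file is still hashed from its first byte to its last, because MD5
    cannot merge a partial hash from the middle of a file with one from
    the beginning. Only the order in which files are visited changes.
    """
    md5_to_paths: Dict[str, List[str]] = {}
    for entry in sorted(entries, key=lambda e: e.first_sector):
        digest = hashlib.md5()
        for chunk in read_file(entry):  # chunks must arrive in file order
            digest.update(chunk)
        # Files with identical content share a hash, so keep a list of paths.
        md5_to_paths.setdefault(digest.hexdigest(), []).append(entry.path)
    return md5_to_paths


if __name__ == "__main__":
    # Stand-in for real disk reads: file contents held in memory.
    contents = {"/a.txt": b"alpha", "/b.txt": b"beta"}

    def read_from_memory(entry: FileEntry) -> Iterator[bytes]:
        yield contents[entry.path]

    files = [FileEntry("/a.txt", 4096), FileEntry("/b.txt", 128)]
    for md5, paths in hash_in_disk_order(files, read_from_memory).items():
        print(md5, paths)
```

The sketch only reorders whole files. It does not attempt the per-run reassembly Stuart mentions, and a fragmented file would still cause seeks while it is being read.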