Hi,
Has anyone tried running tsk_loaddb on an image of a Mac backup (aka
Time Machine) drive? Time Machine creates a lot of hard links, and I
don't think tsk_loaddb recognises them, because the result is that
processing either takes more memory than I have or takes a very long
time.
Here's a new issue report I just posted on the bugtracker:
https://github.com/sleuthkit/sleuthkit/issues/397
For example, I suspect that tsk_loaddb -h (to calculate checksums)
actually checksums files not just once, but once for each hard link. I
haven't verified this in the code, but the processing time increases
by so much when I add the -h flag that I suspect this must be the
case.
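To illustrate, here's a rough sketch of hashing each inode only once
by remembering which meta addresses have already been hashed. It uses
the pytsk3 bindings rather than tsk_loaddb itself, and the function
name and walk are just illustrative; I'm not claiming this is how
tsk_loaddb is structured internally.

import hashlib
import pytsk3

def hash_once_per_inode(image_path):
    img = pytsk3.Img_Info(image_path)
    fs = pytsk3.FS_Info(img)
    seen = set()    # meta (inode) addresses we have already hashed
    hashes = {}

    def walk(directory):
        for entry in directory:
            if entry.info.meta is None:
                continue
            name = entry.info.name.name.decode("utf-8", "replace")
            if name in (".", ".."):
                continue
            addr = entry.info.meta.addr
            if entry.info.meta.type == pytsk3.TSK_FS_META_TYPE_DIR:
                walk(entry.as_directory())
            elif entry.info.meta.type == pytsk3.TSK_FS_META_TYPE_REG:
                if addr in seen:
                    # hard link to an inode already hashed: skip the data
                    continue
                seen.add(addr)
                md5 = hashlib.md5()
                size = entry.info.meta.size
                offset = 0
                while offset < size:
                    chunk = entry.read_random(
                        offset, min(1024 * 1024, size - offset))
                    if not chunk:
                        break
                    md5.update(chunk)
                    offset += len(chunk)
                hashes[addr] = md5.hexdigest()

    walk(fs.open_dir(path="/"))
    return hashes

The point is just that the cost of hashing should scale with the
number of inodes, not the number of names pointing at them.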
As noted on the issue tracker, the tsk_files table in the database
contains the full path and metadata (like size, timestamps, checksum,
etc.) for every name entry, and the tsk_file_layout table stores the
layout runs. This replicates all the metadata once per hard link to
the same inode. That probably isn't a problem in the ordinary case,
but it means that a lot of data is processed and stored many times in
a case like mine. Perhaps this is something to consider for the
database schema. Is v3 of the schema implemented yet?
http://wiki.sleuthkit.org/index.php?title=SQLite_Database_v3_Schema
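On an existing database, one can get a rough measure of the
duplication with a query like this (assuming the tsk_files table
carries fs_obj_id and meta_addr columns, as in the current schema; the
database path is illustrative):

import sqlite3

conn = sqlite3.connect("image.db")  # the tsk_loaddb output database
rows = conn.execute(
    """
    SELECT fs_obj_id, meta_addr, COUNT(*) AS links
    FROM tsk_files
    WHERE meta_addr IS NOT NULL
    GROUP BY fs_obj_id, meta_addr
    HAVING COUNT(*) > 1
    ORDER BY links DESC
    LIMIT 20
    """
).fetchall()
for fs_obj_id, meta_addr, links in rows:
    print("fs %s: inode %s has %s name entries" %
          (fs_obj_id, meta_addr, links))

Each row this prints is an inode whose metadata is stored more than
once in tsk_files.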
It'd also be interesting to hear if anyone else has seen real-life
cases where there are a lot of hard links.
-Ketil