Re: [sleuthkit-developers] First Draft - Layout Hash Database
From: Matthias H. <mat...@mh...> - 2004-01-30 14:25:55
Brian Carrier said:
[...]
> So, after thinking about this thread some more, there are two problems
> that are being addressed at the same time and I think they can be more
> independent and I think the merging has caused some confusion.
>
> 1. A small set of application categories for any hash database.
>
> 2. An implementation of a database that can import hashes from
> multiple sources.
>
> As I mentioned before, the categories are a problem with all databases
> and I think it would be useful if we could publish a list with
> requirements for each category. From Doug's email, it sounds like NIST
> would be interested in such categories (assuming that they are
> comprehensive and make sense).

Ok, then let's treat the list of applications separately. We can later
decide if/how we want to implement this in our database. I'll compile a
list with examples from our recent discussion and post it this weekend
for further discussion.

> For the implementation, it seems that we need to have a clear goal for
> the DB. Is it for a comprehensive DB or is it just for quick good vs
> bad lookups. Both are needed, but can we satisfy both goals with one
> DB? Or, could that be an option at install time. They can chose the
> quick / dirty / less data version or the full version. I'm not a DB
> guy, so I have no clue what the answers for this are.

After thinking about the recent discussion and your comments, I would
prefer to split the interface rather than the database:

- we use a comprehensive database with a large set of information for
  each hash set
- upon importing, everybody can decide for himself how much data to
  include in the database
- we provide a mapping table in order to map the very detailed
  categories to a small set of super-categories
- we provide 2 interfaces: "quick&dirty" (->super-categories) and
  "long&detailed"

The biggest part of the database is the hash sets themselves.
The organization of comprehensive add-on information doesn't use many
resources; it only requires a good data model. So we don't gain much by
using two different database models.

> It has occurred to me that there should be a 'source' column in the
> database, so that the entry can be attributed to the NSRL, hashkeeper,
> custom etc. A version may also be useful. This is also useful so that
> you can remove the hashes from the DB at a later point.

Good idea, I do use this already (without a version) in my forensic hash
database.

Regards,
Matthias
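
The data model discussed above (detailed categories mapped to a small
set of super-categories, a quick and a detailed lookup interface, and a
source with an optional version per entry) could be sketched roughly as
follows. This is only an illustration under assumptions: it uses SQLite
through Python's sqlite3 module, and every table, column, function and
category name, as well as the sample data, is hypothetical and not taken
from any existing schema.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative only.
SCHEMA = """
CREATE TABLE super_category (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE category (             -- detailed category -> super-category
    id                INTEGER PRIMARY KEY,
    name              TEXT NOT NULL UNIQUE,
    super_category_id INTEGER NOT NULL REFERENCES super_category(id)
);
CREATE TABLE source (               -- e.g. NSRL, hashkeeper, custom
    id      INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    version TEXT                    -- optional
);
CREATE TABLE hash_entry (
    md5         TEXT NOT NULL,
    file_name   TEXT,
    category_id INTEGER NOT NULL REFERENCES category(id),
    source_id   INTEGER NOT NULL REFERENCES source(id)
);
CREATE INDEX hash_entry_md5 ON hash_entry(md5);
"""

EMPTY_MD5 = "d41d8cd98f00b204e9800998ecf8427e"  # MD5 of zero-length input
PLACEHOLDER_MD5 = "f" * 32                       # made-up value, not a real file


def open_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def quick_lookup(conn, md5):
    """'quick&dirty' interface: set of super-category names for a hash."""
    rows = conn.execute(
        "SELECT DISTINCT s.name FROM hash_entry h "
        "JOIN category c ON c.id = h.category_id "
        "JOIN super_category s ON s.id = c.super_category_id "
        "WHERE h.md5 = ?", (md5,))
    return {row[0] for row in rows}


def detailed_lookup(conn, md5):
    """'long&detailed' interface: (file, category, source, version) rows."""
    return conn.execute(
        "SELECT h.file_name, c.name, src.name, src.version FROM hash_entry h "
        "JOIN category c ON c.id = h.category_id "
        "JOIN source src ON src.id = h.source_id "
        "WHERE h.md5 = ? ORDER BY src.name", (md5,)).fetchall()


def remove_source(conn, name):
    """Delete all hashes attributed to one source; returns rows deleted."""
    cur = conn.execute(
        "DELETE FROM hash_entry WHERE source_id IN "
        "(SELECT id FROM source WHERE name = ?)", (name,))
    conn.execute("DELETE FROM source WHERE name = ?", (name,))
    conn.commit()
    return cur.rowcount


def build_example():
    """Populate an in-memory database with made-up sample rows."""
    conn = open_db()
    conn.executemany("INSERT INTO super_category (id, name) VALUES (?, ?)",
                     [(1, "known good"), (2, "known bad")])
    conn.executemany(
        "INSERT INTO category (id, name, super_category_id) VALUES (?, ?, ?)",
        [(1, "operating system", 1), (2, "rootkit", 2)])
    conn.executemany("INSERT INTO source (id, name, version) VALUES (?, ?, ?)",
                     [(1, "NSRL", "example-1"), (2, "custom", None)])
    conn.executemany(
        "INSERT INTO hash_entry (md5, file_name, category_id, source_id) "
        "VALUES (?, ?, ?, ?)",
        [(EMPTY_MD5, "empty.txt", 1, 1), (PLACEHOLDER_MD5, "evil.bin", 2, 2)])
    conn.commit()
    return conn


if __name__ == "__main__":
    db = build_example()
    print(quick_lookup(db, EMPTY_MD5))
    print(detailed_lookup(db, EMPTY_MD5))
```

Both lookups read the same tables, so only the query differs between
the two interfaces, and removing a source is a delete keyed on the
source name.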