RE: [sleuthkit-developers] Search
From: Paul B. <ba...@fo...> - 2004-05-06 06:57:14
Michael,

> While we are on the topic of searching, I was thinking about comments made
> previously on this list with regards to indexing (specifically the thread
> "[sleuthkit-developers] blindly indexing garbage..."). Rather than indexing
> every possible string in the image (which would take huge amounts of space
> for the index file - almost the same size of the image itself in some cases).
> Would it make more sense to have a dictionary of words to search for and then
> only index the offsets of these words in the image? You would get stuck if
> you wanted to search for a word not in the dictionary, but it might be useful
> for a standard initial search. For example, the english language dictionary
> on my linux box is about 85k words big.

While I'm not convinced that this is wanted behaviour, I will think about
adding this feature to the Searchtools.

One problem you might encounter is that you cannot use a very large
dictionary file: one wants to load the entire dictionary into memory in some
tree-like format, which takes precious memory and thus limits the size of
your dictionary.

But as said, I will look into this and probably integrate this functionality
in either release 3 or release 4 of Searchtools.

Paul Bakker
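
For illustration, the idea discussed in this message (hold the dictionary in memory in a tree-like structure and record only the offsets of dictionary words in the image) could be sketched roughly as below. This is not the Searchtools implementation. The function names `build_trie` and `index_offsets` are hypothetical, the dictionary is assumed to be ASCII-only, and matching is case-insensitive on raw substrings with no word-boundary checks.

```python
from collections import defaultdict


def build_trie(words):
    """Build a nested-dict trie keyed by byte value.

    The special key None marks the end of a dictionary word and stores it.
    Assumption: all words are plain ASCII.
    """
    root = {}
    for word in words:
        node = root
        for byte in word.lower().encode("ascii"):
            node = node.setdefault(byte, {})
        node[None] = word.lower()
    return root


def index_offsets(data, trie):
    """Return {word: [byte offsets]} for every dictionary word found in data.

    data is a bytes object (for example, a chunk read from an image file).
    Every offset is tried as a possible word start, so words embedded in
    longer strings are reported too.
    """
    hits = defaultdict(list)
    lowered = data.lower()  # ASCII-only case folding on bytes
    for start in range(len(lowered)):
        node = trie
        for pos in range(start, len(lowered)):
            node = node.get(lowered[pos])
            if node is None:
                break  # no dictionary word continues with this byte
            if None in node:
                hits[node[None]].append(start)
    return dict(hits)


if __name__ == "__main__":
    trie = build_trie(["pass", "password"])
    print(index_offsets(b"xxPasswordyypasszz", trie))
```

The trie grows with the number and length of dictionary words, which is the memory cost mentioned above. The index itself only stores offsets for words that actually occur in the image.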