[sleuthkit-users] Indexed searching in Autopsy and Sleuthkit (Second release/version)
From: Paul B. <ba...@fo...> - 2003-08-08 08:05:36
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

I work at Fox-IT (http://www.fox-it.com), a company in the Netherlands that does forensic IT investigations. We are working on an all-Linux environment for forensic research.

We would like to use Autopsy/Sleuthkit as our main forensic tool. Because it is missing some features compared to (commercial) Windows products, we have decided to contribute and add new features to Autopsy and Sleuthkit. We are doing this in cooperation with Brian Carrier.

One of the major missing features is indexed searching. Indexed searching greatly speeds up searches for words during investigations.

In May 2003 we released a first implementation of indexed searching in Autopsy and Sleuthkit. This resulted in a lot of feedback and feature requests.

This e-mail announces the release of the second version of indexed searching in Autopsy and Sleuthkit. The patch can be downloaded from

http://www.fox-it.com/files/autopsy-indexing-2.patch.tar.gz
(MD5 http://www.fox-it.com/files/autopsy-indexing-2.patch.tar.gz.md5)
(MD5: 9889 52cf dcb3 a318 f3c8 9920 43b8 d6fb)

This second version uses a different and better technique for indexing image files, one that can support more advanced options in the future.

The new version has the following improvements and features:

* Tools for indexed searching in Sleuthkit.
* Creation of the necessary files integrated into the Autopsy interface.
* Indexed Search field (at the bottom of the "Keyword search" page).
* Case-insensitive searching.
* Possibility to search for whole words only or for parts of words.
* No strings file necessary. Only the image file is needed for indexing. The size of a normal combined index is about the same as that of a strings file for the same image (this depends on the settings used for indexing).
* Can be used to index image files of any size (indexing results in multiple small indexes).
* Includes a tool to combine multiple index files of the same image.
* The Autopsy interface is currently only usable for "small" images, because it combines the index files into a single index file, which takes a long time for very large images (> 20 GB). A future version will add more flexibility here.
* Support for different default index character sets. This release lets you index using:
  - Alphabet [a-z,A-Z]
  - Alphanumeric [a-z,A-Z,0-9]
  - EMail and Alphanumeric [a-z,A-Z,0-9,.,_,-,@]
  The smaller the set, the smaller the index file.
* Lots of flexibility for the indexing process (specify the maximum memory usage, the minimum and maximum index word length, and more).

The next version will include:

* Folding (mapping diacritic characters to their normal equivalent, allowing for more powerful searches).
* Default support for folding of the default ISO-8859-1 character set, and perhaps for others too.
* Better flexibility in the Autopsy interface.
* Support for index specification files. These files describe exactly which characters should be indexed and how they should be folded, thus allowing full control over the indexing process.
* More documentation on the format used in the index file and the process involved.

It has been tested on a Debian Linux system and on a number of forensic images. The following statistics have been gathered:

* Index time. The index time depends on the index character set used, the minimum and maximum index word size, and the maximum memory that is available. Indexing a 5 GB image with only 200 MB of memory, using the Alphanumeric character set, takes 74 minutes and results in 39 index files with a total size of 3.8 GB.
* Combine time. Multiple index files can be combined into a single index file. This decreases the size of the index and increases the search speed. Combining 3.8 GB of index files into a single 2.4 GB index file takes about 33 minutes (the strings file for the same image is 2.0 GB).
* Search time.
The search time depends on the number of results that are returned. The more results, the longer the search takes, because it has to access the original image file for every hit. The speedup for searching is very large. Searching a 5 GB image file for a single word takes:
  - less than 1 second (resulting in 4935 hits), compared to 111 seconds for a regular grep on the strings file.
  - 66 seconds (resulting in 366587 hits), compared to 111 seconds for a regular grep on the strings file.

The available patches are for Autopsy 1.72 and Sleuthkit 1.64. They add the second beta version of indexed searching to Autopsy.

It is still in beta, so I would greatly appreciate it if people would test the indexed searching on other machines and images and send their problems, feedback and feature requests to me.

All feedback is appreciated! My goal is to add useful features (like indexed searching) to Autopsy and Sleuthkit. This requires feedback! ;-)

- --
Paul Bakker

Fox-IT
Experts in IT Security!
Haagweg 137
2281 AG RIJSWIJK
T 070 336 9999
F 070 336 9990
I www.fox-it.com
E ba...@fo...

57A6 C5EA 55E4 CC1C A967 B13C F8C0 C0FB 8135 E225

Disclaimer: This email may contain confidential information. If this message is not addressed to you, you may not retain or use the information in it for any purpose. If you have received it in error, please notify the sender and delete this message. We try to screen out viruses but take no responsibility if this email contains a virus.

-----BEGIN PGP SIGNATURE-----
Version: PGP 8.0

iQA/AwUBPzNZpfjAwPuBNeIlEQJQwQCePQG2bhGRBG6qtz67obh9DfxllnUAoLsY
Is+Scu1ZsBYrlMyjVbReB/t9
=vjyl
-----END PGP SIGNATURE-----
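
The indexing options described in the message above (a selectable character set, minimum and maximum index word length, case-insensitive matching, and whole-word or partial-word searching) can be illustrated with a small in-memory sketch. This is not the Fox-IT patch or its on-disk index format, which the message does not describe; the names `CHARSETS`, `build_index` and `search` are hypothetical, and the default lengths of 3 and 32 are assumptions chosen only for the example.

```python
import re
from collections import defaultdict

# The three character sets named in the announcement, written as regex
# character classes. The dictionary keys are hypothetical labels.
CHARSETS = {
    "alphabet": "a-zA-Z",
    "alphanumeric": "a-zA-Z0-9",
    "email": "a-zA-Z0-9._@-",
}


def build_index(data: bytes, charset: str = "alphanumeric",
                min_len: int = 3, max_len: int = 32) -> dict:
    """Map each lowercased word to the byte offsets where it occurs.

    min_len and max_len are illustrative defaults, not values from the patch.
    """
    pattern = re.compile(("[%s]+" % CHARSETS[charset]).encode("ascii"))
    index = defaultdict(list)
    for match in pattern.finditer(data):
        word = match.group().lower()  # lowercase keys give case-insensitive search
        if min_len <= len(word) <= max_len:
            index[word].append(match.start())
    return dict(index)


def search(index: dict, term: str, whole_word: bool = True) -> list:
    """Return the sorted byte offsets of all hits for term."""
    needle = term.lower().encode("ascii")
    if whole_word:
        return list(index.get(needle, []))
    hits = []
    # Partial-word search: scan the indexed words, not the image itself.
    for word, offsets in index.items():
        pos = word.find(needle)
        while pos != -1:
            hits.extend(off + pos for off in offsets)
            pos = word.find(needle, pos + 1)
    return sorted(hits)
```

A whole-word lookup is a single dictionary access, while a partial-word lookup scans only the set of distinct indexed words. A smaller character set yields fewer and shorter index entries, which matches the remark in the message that a smaller set gives a smaller index file.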