Re: [sleuthkit-developers] Re: IO Subsystem patch for fstools
From: Michael C. <mic...@ne...> - 2004-02-23 12:42:10
On Mon, 23 Feb 2004 10:12 pm, Paul Bakker wrote:
> Hi Michael,
Hi Paul,
> I currently support two modes (As will the new release)...
>
> The first is raw mode, meaning that all data on the disk is indexed
> as it is!.. That means that the whole disk is walked sequentially in
> (currently) 64k blocks... This can be enlarged if that would increase
> performance of the underlying subsystem..
Is it possible for you then to just malloc a buffer (say 1MB), fill it
sequentially, and then just index the buffer? Or will that complicate
matters?
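Something like this is what I have in mind (just a rough sketch, not your
code - read() straight off the raw device and the index_strings() name are
my own placeholders):

    #include <stddef.h>
    #include <unistd.h>

    #define BUF_SIZE (1024 * 1024)   /* index in 1MB chunks */

    /* hypothetical indexer: whatever gets done with a filled buffer */
    extern void index_strings(const char *p, size_t n);

    /* Walk the raw device sequentially, filling a large buffer before
     * handing it to the indexer.  (Words split across buffer boundaries
     * are not handled here - more on that below.) */
    void raw_index(int fd, char *buf /* BUF_SIZE bytes */)
    {
        ssize_t n;
        size_t  fill = 0;

        while ((n = read(fd, buf + fill, BUF_SIZE - fill)) > 0) {
            fill += (size_t)n;
            if (fill == BUF_SIZE) {        /* buffer full: index and reuse */
                index_strings(buf, fill);
                fill = 0;
            }
        }
        if (fill)                          /* index the final partial buffer */
            index_strings(buf, fill);
    }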
> The second is raw_fragment mode, meaning that all fragmented pieces
> of files are indexed in a similar manner as icat runs through them...
> I use both inode_walk and file_walk.. Thus this consists of more small
> reads.
Have a look at the dbtools patch, because David Collett has done something
similar for FLAG. In his file_walk he simply builds a linked list of the
blocks in the file (without actually reading those blocks), and then in his
print_blocks he saves entire runs of blocks as unique entries.
So for example, suppose you have a file that's 30 blocks big (~120KB). While
file_walk might call the callback 30 times (once for each block), in the end
David's print_blocks function will print a single entry for the 30
consecutive blocks.
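The coalescing amounts to something like this (my own sketch of the idea,
not David's actual code - the names and the DADDR_T typedef here are just
placeholders):

    #include <stdio.h>

    typedef unsigned long DADDR_T;   /* assumption: block address type */

    static DADDR_T run_start = 0;
    static DADDR_T run_len   = 0;

    /* called once per block, in file order (e.g. from a file_walk callback) */
    static void note_block(DADDR_T addr)
    {
        if (run_len && addr == run_start + run_len) {
            run_len++;                          /* extends the current run */
        } else {
            if (run_len)                        /* emit the finished run */
                printf("run: start=%lu len=%lu\n", run_start, run_len);
            run_start = addr;                   /* start a new run */
            run_len   = 1;
        }
    }

    /* call once at the end of the walk to emit the last run */
    static void flush_runs(void)
    {
        if (run_len)
            printf("run: start=%lu len=%lu\n", run_start, run_len);
        run_len = 0;
    }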
So the idea is that you might use this information to preallocate a buffer
30 blocks big, read a large chunk into it, and then index that buffer. The
result is that you need to do 30 times less reading on the actual file (in
this example).
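In other words, given a (start, length) run the read side is just one big
read - roughly like this (sketch only; pread() straight off the image file
is an assumption about where the data comes from):

    #include <stdlib.h>
    #include <unistd.h>

    /* Read one contiguous run of blocks with a single pread(), instead of
     * one read per block from the callback.  Caller frees the buffer. */
    char *read_run(int fd, unsigned long start, unsigned long len,
                   size_t block_size)
    {
        size_t  nbytes = len * block_size;
        char   *buf    = malloc(nbytes);
        if (buf == NULL)
            return NULL;

        off_t offset = (off_t)start * (off_t)block_size;
        if (pread(fd, buf, nbytes, offset) != (ssize_t)nbytes) {
            free(buf);
            return NULL;
        }
        return buf;          /* index this buffer, then free it */
    }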
From experience, most files are not really fragmented, and at most I have
seen large files fragmented into 2-3 parts. That's an average of 2-3 reads
for large files, and a single read for small files - not too expensive at
all. (Contrast this with reading the block on each callback: you would need
to read every 4KB in every file, or up to several hundred times for each
large file.)
I just ran icat under gdb again to confirm what I'm saying here. This is the
result:
(gdb) b icat_action
(gdb) r -f linux-ext2 /var/tmp/honeypot.hda1.dd 16 > /tmp/vmlinuz
Breakpoint 1, icat_action (fs=0x8064e18, addr=2080,
    buf=0x8065c48 ... , size=1024, flags=1028, ptr=0x805d934 "")
The size is the size that icat writes in every call to icat_action, and it
seems to always be 1024 here (the block size). So the icat_action callback
is called for every single block.
The other problem I can think of (again, I haven't seen your code yet, so
I'm sorry if I'm talking crap here) is that if you only do the indexing on
each buffer returned by the file_walk callback, then wouldn't you be missing
words that happen to be cut by a block boundary? i.e. half a word will be on
one block and the other half on the next block. This problem is alleviated
to some extent by indexing larger buffers than block sizes.
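One way around it (again, just a sketch of the idea, nothing to do with your
code - index_strings() is a made-up name) is to carry the tail of each
buffer over to the front of the next one before indexing, so a word split
across the seam is still seen whole:

    #include <string.h>

    #define MAX_WORD 64            /* assumption: longest string we index */

    /* hypothetical indexing routine */
    extern void index_strings(const char *p, size_t n);

    static char   carry[MAX_WORD];
    static size_t carry_len = 0;

    /* Index `buf`, keeping a MAX_WORD-byte tail so that words split across
     * two consecutive buffers are still found (a few duplicates near the
     * seam are tolerable for a sketch). */
    void index_buffer(const char *buf, size_t len)
    {
        char tmp[MAX_WORD * 2];

        /* index the seam: tail of the previous buffer + head of this one */
        if (carry_len) {
            size_t head = len < MAX_WORD ? len : MAX_WORD;
            memcpy(tmp, carry, carry_len);
            memcpy(tmp + carry_len, buf, head);
            index_strings(tmp, carry_len + head);
        }

        index_strings(buf, len);             /* index the buffer itself */

        /* remember the tail for the next call */
        carry_len = len < MAX_WORD ? len : MAX_WORD;
        memcpy(carry, buf + len - carry_len, carry_len);
    }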
Another suggestion... The way we currently do string indexing in FLAG (it's
mostly in CVS and not quite finished, but we should have it finished soon
:-) is by using David's dbtool to extract _all_ the meta information about
the image into the database - this includes all the inodes, files, and the
blocks these files occupy (in other words, file_walk and inode_walk) - we do
not read the blocks themselves, we just note where they are. We then index
the entire image file, storing all the offsets for the strings in the
database. Then it's quite easy to tell if a string is in an allocated file,
and exactly which file (and inode) it's in. We can extract entire files by
simply reading the complete block run (as I described above). The result is
that we don't really seek very much; we seek a bit in the dbtool to pull out
all the meta data, but then we just read sequentially for indexing - and
very large reads at that (I think about 1MB buffers).
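The offset-to-file lookup then boils down to finding which block run
contains a given image offset - roughly like this (purely illustrative, not
the actual dbtool/FLAG code; the run structure and field names are made up):

    #include <stddef.h>

    /* hypothetical in-memory form of the runs the database stores: one
     * entry per contiguous run of blocks, tagged with its owning inode */
    struct run {
        unsigned long start_block;   /* first block of the run in the image */
        unsigned long num_blocks;    /* length of the run in blocks */
        unsigned long inode;         /* owning inode */
    };

    /* Return the inode owning the byte at `offset` in the image,
     * or 0 if the offset falls in unallocated space. */
    unsigned long offset_to_inode(const struct run *runs, size_t nruns,
                                  unsigned long offset, size_t block_size)
    {
        unsigned long block = offset / block_size;
        for (size_t i = 0; i < nruns; i++) {
            if (block >= runs[i].start_block &&
                block < runs[i].start_block + runs[i].num_blocks)
                return runs[i].inode;
        }
        return 0;
    }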
That said, indexing is a tough job and it does take a long time... it's
inescapable. I'm interested in your indexing implementation, because the
database implementation requires a copy of all the strings to live in the
db, which basically blows the db up to about 1/3-1/2 the size of the
(uncompressed) image. This is not really practical.
> As fragmented parts usually comprise only a very small amount of the
> disk, this should not be used as an indication of access.. Especially
> the first mode (raw) is a real time/disk access/processor roughy... In
> it's current form it does not use any seeks, as this greatly increases
> speed (Almost double if otherwise)...
Caching can certainly help here, although if you can restructure your code
so that you do a few big reads rather than lots of small reads, it would
alleviate the need for caching.
This is especially important when you think about the possibility of your
code directly operating on Encase evidence files or compressed volumes,
where the major cost is the decompression overhead. Remember that with
Encase the minimum buffer is 32KB, so even if you wanted to read 1 byte, it
will still need to decompress the whole 32KB chunk to give you that byte -
very expensive. In that case the cost of seeking is negligible relative to
the cost of decompression.
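So a chunk-aligned cache in front of the decompressor pays off. Something
like this (purely illustrative - decompress_chunk() is a stand-in for
whatever the evidence-file library provides; 32KB chunking is the only real
assumption):

    #include <string.h>

    #define CHUNK_SIZE (32 * 1024)   /* Encase compresses in 32KB chunks */

    /* hypothetical decompressor: fills `out` with chunk number `chunk` */
    extern void decompress_chunk(unsigned long chunk, char out[CHUNK_SIZE]);

    static char          cache[CHUNK_SIZE];
    static unsigned long cached_chunk = (unsigned long)-1;

    /* Serve a small read from the cached chunk, decompressing only when
     * the request crosses into a chunk we have not seen yet. */
    size_t cached_read(unsigned long offset, char *dst, size_t len)
    {
        size_t done = 0;
        while (done < len) {
            unsigned long chunk = (offset + done) / CHUNK_SIZE;
            size_t        off   = (offset + done) % CHUNK_SIZE;
            size_t        n     = CHUNK_SIZE - off;
            if (n > len - done)
                n = len - done;

            if (chunk != cached_chunk) {         /* miss: decompress once */
                decompress_chunk(chunk, cache);
                cached_chunk = chunk;
            }
            memcpy(dst + done, cache + off, n);  /* hit: copy from cache */
            done += n;
        }
        return done;
    }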
Michael.