RE: [sleuthkit-developers] Re: IO Subsystem patch for fstools

Hi Michael,

> The solution to this problem, I think, is to implement some kind of
> caching in memory. A cache system can solve all those problems very
> efficiently, particularly for the case where you make lots of small
> reads, very close together (i.e. no seeks). A simple cache (with a
> simple policy) can be implemented quite easily I think, and will be
> effective for the scenario you are describing.
>
> What kind of IO do you do for indexing? Is it very localised? If you
> were to cache a block into memory, what would be the optimal size of
> the block? (say 1 mb or more like 32kb?) If you were to cache 1 mb in
> memory, how many reads would you get out of it on average?

I currently support two modes (as will the new release).

The first is raw mode, meaning that all data on the disk is indexed
as it is. The whole disk is walked sequentially in blocks of
(currently) 64k. This can be enlarged if that would increase the
performance of the underlying subsystem.
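
A minimal sketch of such a sequential walk, written in Python for
brevity (the fstools themselves are C). The names walk_raw and
handle_block are hypothetical and are not taken from the indexer or
from the fstools; the callback only stands in for the indexing step.

```python
import io

BLOCK_SIZE = 64 * 1024  # 64k, the block size mentioned above


def walk_raw(image, handle_block, block_size=BLOCK_SIZE):
    """Read `image` front to back in fixed-size blocks, never seeking.

    `image` is any binary file-like object. `handle_block` is a
    hypothetical callback standing in for the indexer; it receives the
    byte offset of the block and the block itself.
    Returns the total number of bytes walked.
    """
    offset = 0
    while True:
        block = image.read(block_size)
        if not block:  # empty read means end of image
            break
        handle_block(offset, block)
        offset += len(block)
    return offset


if __name__ == "__main__":
    # Stand-in for a disk image: 150k of zero bytes held in memory.
    fake_image = io.BytesIO(bytes(150 * 1024))
    seen = []
    total = walk_raw(fake_image, lambda off, blk: seen.append((off, len(blk))))
    print(total, seen)
```

Since the block size is a parameter, trying a larger block only means
passing a different block_size.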

The second is raw_fragment mode, meaning that all fragmented pieces
of files are indexed, in a similar manner to how icat runs through
them. I use both inode_walk and file_walk, so this mode consists of a
larger number of small reads.
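
For many small reads that sit close together, the kind of simple
in-memory cache suggested in the quoted mail could look roughly like
the sketch below. It is an illustration only: the class name
BlockCache, the 32k default block size, the 32-block limit and the
least-recently-used eviction policy are all assumptions, not part of
the IO subsystem patch.

```python
import io
from collections import OrderedDict


class BlockCache:
    """Tiny read cache over a seekable binary file-like object.

    Hypothetical helper: whole blocks are kept in memory and the least
    recently used block is evicted once `max_blocks` is exceeded.
    """

    def __init__(self, image, block_size=32 * 1024, max_blocks=32):
        self.image = image
        self.block_size = block_size
        self.max_blocks = max_blocks
        self.blocks = OrderedDict()  # block number -> bytes
        self.hits = 0
        self.misses = 0

    def _block(self, number):
        if number in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(number)  # mark as most recently used
            return self.blocks[number]
        self.misses += 1
        self.image.seek(number * self.block_size)
        data = self.image.read(self.block_size)
        self.blocks[number] = data
        if len(self.blocks) > self.max_blocks:
            self.blocks.popitem(last=False)  # evict least recently used
        return data

    def read(self, offset, length):
        """Return up to `length` bytes starting at `offset`."""
        out = bytearray()
        end = offset + length
        while offset < end:
            number, start = divmod(offset, self.block_size)
            data = self._block(number)
            chunk = data[start:start + (end - offset)]
            if not chunk:
                break  # read past the end of the image
            out += chunk
            offset += len(chunk)
        return bytes(out)


if __name__ == "__main__":
    image = io.BytesIO(bytes(range(256)) * 1024)  # 256k stand-in image
    cache = BlockCache(image)
    for i in range(100):
        cache.read(i * 16, 16)  # small, closely spaced reads
    print(cache.hits, cache.misses)
```

The hits and misses counters give a rough answer to the "how many
reads per cached block" question for a given access pattern.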

As fragmented parts usually comprise only a very small amount of the
disk, this mode should not be taken as indicative of the access
pattern. The first mode (raw) in particular is a real time, disk
access and processor hog. In its current form it does not use any
seeks, as avoiding them greatly increases speed (it takes almost
double the time otherwise).

Paul Bakker