[sleuthkit-developers] Re: IO Subsystem patch for fstools


On Feb 9, 2004, at 3:44 AM, Michael Cohen wrote:

>> Yea, but autopsy or flag need to store all of those options as well in
>> their own configuration file so that they can pass them to the fstools.
>> It would seem more flexible if the sleuth kit had a configuration file
>> for the image, which could then be used by any GUI (including autopsy,
>> flag, rex etc.)
> The problem with this approach is that the fstools are then too integrated
> with the GUI, especially if the configuration file becomes so complex that
> you really need a GUI to make one. In that case you can't just use them by
> themselves. Is that something we are prepared to live with? Or is it a
> design goal to make small self-contained tools that may be used from the
> command line?

I guess that depends on how complex the config file is.  RAID systems 
are obviously the most complex.  Is there a way that we can make the 
config file fairly simple and have the maps built into the sleuth kit 
instead of the GUI?  That will be faster to load as well if it just has 
to read a few parameters from the config file instead of a full 
mapping.

>
>> Sure.  For this to work with the Sleuth Kit though, there must be the
>> ability to create the configurations in the sleuth kit.   If the only
>> way to create the map is in flag, then that doesn't do autopsy any 
>> good
>> or future interfaces and it doesn't make sense to replicate the stuff
>> in each gui.
>
> That's true. However, a raid map must be generated internally anyway in
> order to reassemble the individual raid implementations (e.g. lvm, linux
> raid, etc).  Perhaps we can have different IO subsystems whose only job is
> to generate a generic map and then call the generic raid implementation?
> So for example say we have a generic raid io subsystem as described above
> that takes a raid map as input, then we have another subsystem called lvm
> for example which accepts a bunch of lvm specific parameters and then
> generates a raid map and calls the generic raid io subsystem. This way
> autopsy doesn't need to be able to build a generic map in the gui, but one
> will be built automatically as required. If the user works out a way to
> build a raid map by some other means (i.e. some other GUI, by hand, or
> whatever), they can still use the generic raid implementation.

I would rather let each of the RAID types have their own data 
structures and read functions.  That seems much simpler, more scalable 
and more efficient to run.   It seems that making a map for every type of 
RAID system is like trying to make a mapping for all file systems so 
that we can use generic code for processing.

Maybe we can have a type for generic RAID, but for Windows LDM or the 
Linux RAID configurations where the data structures and layout are 
known I would rather have standard code.  That is much easier to audit 
as well.  If someone wants to review what is going on, having to audit 
a raid map would be a major pain.

>
>>> Maybe it would make more sense to populate the IO_INFO structure
>>> inside the FS_INFO structure?
>>
>> I would rather not.  I would prefer to keep the file system code
>> separate from the image format code.  In fact, I would even consider
>> making all of this image stuff its own library,
>> imgtools maybe.  It seems much more logical to call the file system
>> processing code with the filled in IO_INFO structure and let it read
>> from it.  The file system code would never touch any of the file
>> descriptors, it would just call the read functions.  This also allows
>> the 'mm...' tools to use the image formats and any other future tools,
>> such as memory images that are split or saved in another tool's
>> proprietary format.
>
> Just to clarify what you are saying... Are you proposing to make the
> io_subsystem and file system code into separate libraries, and then the
> individual tools (e.g. fls) would open the subsystem, and initialise it,
> and then call the file system code giving it a filled in IO_INFO
> structure? If I understood your comment right it sounds great.
>
> So the IO_INFO structure will contain function pointers to the
> read_random and read_block which will be initialised by the constructor,
> and the fs code would just call those methods? Sounds great:

Yea.  That seems to scale the best.

> FS_INFO *
> ext2fs_open(const char *name, unsigned char ftype)
>
> Changes to:
> FS_INFO *
> ext2fs_open(IO_INFO *io)
>
> (BTW do you think that ftype is a little redundant here?

Yes and no.  For most cases it is, but for FAT where the user can force 
it to be FAT12, FAT16 or FAT32, then it is needed.  If we do auto 
detection though, then we can probably scrap it.

>  and a little off topic, it would be nicer if the *fs_open routines
> returned NULL if they couldn't find the filesystem rather than error out,
> because then you could cycle over all filesystem decoders until one worked
> rather than demanding the user specify the -f parameter all the time. You
> could use -f to override the automatic detection)

Yea, that is a good idea.  These types of changes are the ones that I 
want to examine when The Sleuth Kit gets a makeover in the next few 
months.  Auto detection of file systems would be very nice.
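
A rough sketch of the cycling idea quoted above, assuming each opener takes the image handle and returns NULL when it does not recognise the data. Everything here is hypothetical and only for illustration: the tiny IMG_INFO and FS_INFO stand-ins, the driver table, the names sketch_ext2_open, sketch_fat_open and fs_open_auto, and the in-memory image. The ext2 check looks at the 0xEF53 superblock magic; the FAT check only looks at the 0x55AA boot sector signature, which is a weak test that other formats also pass.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; the real structures hold far more state. */
typedef struct IMG_INFO {
    const unsigned char *buf;   /* image bytes, in memory for the sketch */
    size_t len;
} IMG_INFO;

typedef struct FS_INFO {
    const char *type_name;
    IMG_INFO *img;
} FS_INFO;

static FS_INFO *make_fs(IMG_INFO *img, const char *name)
{
    FS_INFO *fs = malloc(sizeof(*fs));
    if (fs == NULL)
        return NULL;
    fs->type_name = name;
    fs->img = img;
    return fs;
}

/* Returns NULL instead of exiting when the ext2 magic is absent.
 * The superblock starts at byte 1024 and s_magic sits 56 bytes in. */
static FS_INFO *sketch_ext2_open(IMG_INFO *img)
{
    if (img->len < 1082 || img->buf[1080] != 0x53 || img->buf[1081] != 0xEF)
        return NULL;
    return make_fs(img, "ext2");
}

/* Weak check: only the boot sector signature at bytes 510 and 511. */
static FS_INFO *sketch_fat_open(IMG_INFO *img)
{
    if (img->len < 512 || img->buf[510] != 0x55 || img->buf[511] != 0xAA)
        return NULL;
    return make_fs(img, "fat");
}

struct fs_driver {
    const char *name;
    FS_INFO *(*open)(IMG_INFO *img);
};

static const struct fs_driver drivers[] = {
    { "ext2", sketch_ext2_open },
    { "fat",  sketch_fat_open },
};

/* Try every driver in turn.  A non-NULL 'forced' plays the role of -f
 * and restricts the search to the driver with that name. */
static FS_INFO *fs_open_auto(IMG_INFO *img, const char *forced)
{
    assert(img != NULL);
    for (size_t i = 0; i < sizeof(drivers) / sizeof(drivers[0]); i++) {
        if (forced != NULL && strcmp(forced, drivers[i].name) != 0)
            continue;
        FS_INFO *fs = drivers[i].open(img);
        if (fs != NULL)
            return fs;
    }
    return NULL;    /* nothing matched; the caller reports the error */
}
```

A tool would call fs_open_auto(img, NULL) by default and pass the -f argument as the second parameter when the user gives one. The order of the table matters once the checks are weak, so the more specific detectors should come first.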

>
>> I would lean towards the way that FS_INFO is structured.  There would
>> be a few basic items in IO_INFO, such as the function pointers and
>> maybe the maximum size of the image.  Then there are image specific
>> structures that have their needed values.  For example, the structure
>> for split images may have an array of file descriptors and a structure
>> with the sizes of each split image. The normal image structure may just
>> have one file descriptor.  Actually, maybe this whole thing is better
>> called IMG_INFO instead of IO_INFO.
>
> That sounds great. We could cast a void* to achieve this, and then each io
> subsystem makes its own pointer and casts to void*:
> struct IMG_INFO {
>      common fields
>      ....
>      common fields
>      function pointers....
>      void *data;
> };

To keep the fstools and imgtools code consistent we should either 
change the fstools structures to use the void *, or we can just use the 
method that they use.  The file system structures, NTFS_INFO for 
example, are defined like:

struct NTFS_INFO {
	FS_INFO fs_info;	/* must be the first member so the casts work */
	int blah;
	int boo;
};

The code casts the pointers as both FS_INFO and NTFS_INFO.  So, the 
argument to an API function takes the FS_INFO structure, but the ntfs 
code just casts it to an NTFS_INFO. Both do the job, but I would like 
to be consistent.
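
For illustration, here is a minimal sketch of how the same embedding convention might look on the image side, with the read function pointer living in the generic structure. All of it is assumed rather than existing code: the IMG_INFO layout, the RAW_INFO type, and the names raw_init, raw_read_random and img_read are made up, and the "image" is a memory buffer instead of a file descriptor so the example stays self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct IMG_INFO IMG_INFO;

/* Generic part: the only thing file system code would ever see. */
struct IMG_INFO {
    size_t size;    /* total image size in bytes */
    /* Copies up to len bytes starting at off; returns bytes copied. */
    size_t (*read_random)(IMG_INFO *img, void *dst, size_t len, size_t off);
};

/* Format-specific part.  IMG_INFO is the first member, so a pointer to
 * RAW_INFO and a pointer to its img_info member share one address. */
typedef struct RAW_INFO {
    IMG_INFO img_info;
    const unsigned char *data;  /* stands in for a file descriptor */
} RAW_INFO;

static size_t raw_read_random(IMG_INFO *img, void *dst, size_t len, size_t off)
{
    RAW_INFO *raw = (RAW_INFO *)img;    /* cast back to the specific type */

    if (off >= img->size)
        return 0;
    if (len > img->size - off)
        len = img->size - off;          /* clamp at the end of the image */
    memcpy(dst, raw->data + off, len);
    return len;
}

/* The "constructor": fills in the common fields and function pointer. */
static IMG_INFO *raw_init(RAW_INFO *raw, const unsigned char *data, size_t size)
{
    assert(raw != NULL && data != NULL);
    raw->img_info.size = size;
    raw->img_info.read_random = raw_read_random;
    raw->data = data;
    return &raw->img_info;
}

/* What file system code would call; it never touches RAW_INFO. */
static size_t img_read(IMG_INFO *img, void *dst, size_t len, size_t off)
{
    return img->read_random(img, dst, len, off);
}
```

Compared with the void *data variant, this costs no extra allocation and no extra pointer hop, but every format-specific structure has to keep the generic one as its first member.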

> and maybe the multipart reassembly has:
> struct part {
> 	char *filename;
> 	struct part* next;
> };

Yea, and we can include a file descriptor that is only opened when that 
file is needed and isn't closed until the end.  We'll also need a size 
field in there somehow as well, but we can figure out the details 
later.
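
One possible shape for that, as a sketch only: the quoted struct part extended with a size and a file descriptor that stays at -1 until the part is first read, plus a lookup that maps an offset in the logical image to a part and an offset inside it. The field names fd and size, the use of long long for offsets, and the function name part_find are assumptions made for the example; the actual opening and reading of the file is left out.

```c
#include <assert.h>
#include <stddef.h>

struct part {
    const char *filename;
    int fd;             /* -1 until this part is first needed */
    long long size;     /* bytes contributed by this part */
    struct part *next;
};

/* Walk the list to find the part that holds logical offset 'off'.
 * On success, *rel receives the offset within that part.  Returns
 * NULL when 'off' lies at or beyond the end of the whole image. */
static struct part *part_find(struct part *head, long long off, long long *rel)
{
    assert(off >= 0 && rel != NULL);
    for (struct part *p = head; p != NULL; p = p->next) {
        if (off < p->size) {
            *rel = off;
            return p;
        }
        off -= p->size;     /* skip past this part */
    }
    return NULL;
}
```

A read function for split images would call part_find, open the returned part if its fd is still -1, and seek to the relative offset. With many parts, an array of cumulative sizes and a binary search would avoid walking the list on every read.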

thanks,
brian