[sleuthkit-developers] Re: IO Subsystem patch for fstools
From: Michael C. <mic...@ne...> - 2004-02-09 08:44:47
> Yea, but autopsy or flag need to store all of those options as well in
> their own configuration file so that they can pass them to the fstools.
> It would seem more flexible if the sleuth kit had a configuration file
> for the image, which could then be used by any GUI (including autopsy,
> flag, rex etc.)

The problem with this approach is that the fstools are then too tightly
integrated with the GUI, especially if the configuration file becomes so
complex that you really need a GUI to make one. In that case you can't
just use them by themselves. Is that something we are prepared to live
with? Or is it a design goal to make small, self-contained tools that
may be used from the command line?

I agree that the fstools by themselves are probably not all that useful
without some sort of GUI. So perhaps we just live with an increased
level of complexity for the fstools, in favour of better integration
into larger GUIs?

The other problem that may arise from trying to make the fstools
integrate with the GUI's configuration files is that different GUIs
store configuration in different ways. For example, flag stores
everything in the database (not even in a file), so having a single
configuration file format is a little clunky. It's not so bad currently
for flag, since we are using the database patch that was posted on the
list a little while ago to dump out all the data from the image, and we
never really use the individual tools like ils, fls, icat etc. So it
won't be too hard to simply write out a conf file for each image.

However, I'm just thinking of the old version of flag, where we shelled
out to these tools basically for each file in the filesystem. The cost
of parsing a huge config file for each invocation of icat would be
tremendous, I would imagine.

> If we are going to start discussing configuration files for the fstools
> (which we both agree are required for at least RAID), then I would
> rather make them general enough so that they can be used for other
> formats besides RAID. I would even like to have these include
> the file system type, mounting point, and hashes of each partition.
> Basically the stuff that the other tools include in some proprietary
> format with the image, we would put in a separate text file.

That's a great idea (accepting that the level of complexity of the
fstools is increased). I would vote for using an XML config file format,
since it's standard and easy to deal with, and we don't have to write a
parser. The downside is that we increase the program's dependencies by
requiring libxml2 to be present. Alternatively we could write a yacc/lex
parser, but we would then need to discuss a good format which will be
sufficient for autopsy and allow future growth.

> Sure. For this to work with the Sleuth Kit though, there must be the
> ability to create the configurations in the sleuth kit. If the only
> way to create the map is in flag, then that doesn't do autopsy any good
> or future interfaces and it doesn't make sense to replicate the stuff
> in each gui.

That's true. However, a raid map must be generated internally anyway in
order to reassemble the individual raid implementations (e.g. lvm, linux
raid, etc). Perhaps we can have different IO subsystems whose only job
is to generate a generic map and then call the generic raid
implementation?

So, for example, say we have a generic raid IO subsystem as described
above that takes a raid map as input. Then we have another subsystem,
called lvm for example, which accepts a bunch of lvm-specific parameters,
generates a raid map, and calls the generic raid IO subsystem. This way
autopsy doesn't need to be able to build a generic map in the GUI, but
one will be built automatically as required.
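
To make the layering idea a bit more concrete, here is a rough sketch in C. Everything in it is an assumption made for illustration: the thread does not define what a raid map looks like, so `struct raid_map`, `raid_map_lookup` and `stripe_build_map` are hypothetical names, and plain striping (RAID-0 style, chunks dealt round-robin across the disks) stands in for a format-specific front end such as lvm. The front end only builds the map; the generic layer only consumes it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical generic map: each entry is a run of logical bytes
 * that lives contiguously on one member disk. */
struct map_entry {
    size_t disk;      /* index of the member disk */
    size_t disk_off;  /* byte offset of the run on that disk */
    size_t len;       /* length of the run in bytes */
};

struct raid_map {
    struct map_entry *entries;
    size_t count;
};

/* Generic layer: translate a logical offset into (disk, offset).
 * Returns 0 on success, -1 if the offset is past the end of the map. */
int raid_map_lookup(const struct raid_map *map, size_t logical,
                    size_t *disk, size_t *disk_off)
{
    size_t i;

    assert(map != NULL && disk != NULL && disk_off != NULL);
    for (i = 0; i < map->count; i++) {
        const struct map_entry *e = &map->entries[i];
        if (logical < e->len) {
            *disk = e->disk;
            *disk_off = e->disk_off + logical;
            return 0;
        }
        logical -= e->len;  /* skip this run and keep looking */
    }
    return -1;
}

/* Hypothetical front end for plain striping: chunk i goes to disk
 * (i % ndisks) at offset (i / ndisks) * chunk. Returns NULL on bad
 * parameters or allocation failure. */
struct raid_map *stripe_build_map(size_t ndisks, size_t chunk, size_t nchunks)
{
    struct raid_map *map;
    size_t i;

    if (ndisks == 0 || chunk == 0 || nchunks == 0)
        return NULL;
    map = malloc(sizeof(*map));
    if (map == NULL)
        return NULL;
    map->entries = malloc(nchunks * sizeof(*map->entries));
    if (map->entries == NULL) {
        free(map);
        return NULL;
    }
    for (i = 0; i < nchunks; i++) {
        map->entries[i].disk = i % ndisks;
        map->entries[i].disk_off = (i / ndisks) * chunk;
        map->entries[i].len = chunk;
    }
    map->count = nchunks;
    return map;
}

void raid_map_free(struct raid_map *map)
{
    if (map != NULL) {
        free(map->entries);
        free(map);
    }
}
```

A map built by hand or by some other GUI could be passed to `raid_map_lookup` in exactly the same way, which is the point of keeping the generic layer separate.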

If the user works out a way to build a raid map by some other means
(i.e. some other GUI, by hand, or whatever), they can still use the
generic raid implementation.

> I'm still not convinced that we need so many options on the command
> line. The only case that I can see where all of the command line
> options are beneficial is for a live analysis where you don't want to
> write to the disk. But, in that case I don't see why you would need to
> use any of these complex image formats because you will have access to
> the raw device corresponding to the partition.

That's true, and if you have access to the raw device you would not need
extra options or more complex IO subsystems.

> Is there a specific reason with flag that command line options are
> easier?

No reason currently, because we have our own program (dbtool) written
using the fstools library (as seen in the patch Dave submitted). I'm
just thinking about the way it used to work by shelling out. Maybe a
better way is to simply document the fstools library and define a clear
interface (with a proper shared library), and then people would be
expected to use the library rather than shell out to the tools all the
time.

> > Maybe it would make more sense to populate the IO_INFO structure
> > inside the
> > FS_INFO structure?
>
> I would rather not. I would prefer to keep the file system code
> separate from the image format code. In fact, I would even consider
> making all of this image stuff its own library,
> imgtools maybe. It seems much more logical to call the file system
> processing code with the filled in IO_INFO structure and let it read
> from it. The file system code would never touch any of the file
> descriptors, it would just call the read functions. This also allows
> the 'mm...' tools to use the image formats and any other future tools,
> such as memory images that are split or saved in another tool's
> proprietary format.

Just to clarify what you are saying...
Are you proposing to make the io_subsystem and file system code into
separate libraries, so that the individual tools (e.g. fls) would open
the subsystem, initialise it, and then call the file system code, giving
it a filled-in IO_INFO structure? If I understood your comment right, it
sounds great.

So the IO_INFO structure will contain function pointers to read_random
and read_block, which will be initialised by the constructor, and the fs
code would just call those methods? Sounds great:

    FS_INFO *ext2fs_open(const char *name, unsigned char ftype)

changes to:

    FS_INFO *ext2fs_open(IO_INFO *io)

(BTW, do you think that ftype is a little redundant here? And, a little
off topic, it would be nicer if the *fs_open routines returned NULL when
they couldn't find the filesystem rather than erroring out, because then
you could cycle over all filesystem decoders until one worked, rather
than demanding that the user specify the -f parameter all the time. You
could use -f to override the automatic detection.)

> I would lean towards the way that FS_INFO is structured. There would
> be a few basic items in IO_INFO, such as the function pointers and
> maybe the maximum size of the image. Then there are image specific
> structures that have their needed values. For example, the structure
> for split images may have an array of file descriptors and a structure
> with the sizes of each split image. The normal image structure may just
> have one file descriptor. Actually, maybe this whole thing is better
> called IMG_INFO instead of IO_INFO.

That sounds great. We could use a void* to achieve this: each IO
subsystem makes its own pointer and casts it to void*:

    struct IMG_INFO {
        /* common fields ... */
        /* common function pointers ... */
        void *data;
    };

and maybe the multipart reassembly has:

    struct part {
        char *filename;
        struct part *next;
    };

So we initialise as:

    IMG_INFO *img;   /* allocated by the subsystem's constructor */
    img->data = (void *)part_list;

While the raid is totally different:

    struct raid {
        /* whatever, ... */
        /* more stuff */
    };

The advantage of this option is that the IMG_INFO struct doesn't need to
know about each subsystem.

> In the imgtools collection, we could actually have a tool that
> converts the proprietary image formats to a raw image.

That could be a new stand-alone tool which chooses the right
IO subsystem and dumps a dd image out. It would be useful in the case of
raid.

> Very cool. I had never seen sgzip before. I guess it isn't as much of
> a pain as I thought :)

It only took a day or so to write sgzip for this purpose, and I thought
it would be useful in general for any application needing quick seeking
in a compressed file. The library is now available on sf:
http://sourceforge.net/project/showfiles.php?group_id=100803

> > Do you have any idea how you would read in encase files? I didnt get
> > the
> > chance to ever use it so i dont know how complex the file format is
> > but there
> > is nothing i can find on the net re the format.
>
> Check out asrdata.com. Somewhere on there is a link to the expert
> witness format.

Thanks for that. The format looks remarkably similar to sgzip, except
with some extra metadata stuck in there. It should be easy to write a
library to access this. I just need to get a small example encase image
to play with.

> I apologize if I am being a pain with some of these details, but after
> having to redesign autopsy because of a bad initial design, I want to
> make sure we add this new functionality the right way.

I think the discussion is very constructive so far. I was initially
expecting a small change, but it looks like there is now a need to do a
larger reorganization of the code. It's going to pay off in the long
run, I expect.

cheers
Michael.
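
As a footnote to the IMG_INFO discussion above, here is a minimal sketch in C of how the function-pointer plus `void *data` arrangement might look. It is only an illustration under assumptions, not the actual design: the field set, `split_open`, and the read signature are hypothetical, and the parts are held as in-memory buffers (rather than filenames or file descriptors) purely so the sketch runs without image files. The caller only ever goes through `img->read_random`, so it never learns which subsystem is behind the handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct IMG_INFO IMG_INFO;

/* Common part: what every image subsystem exposes (hypothetical fields). */
struct IMG_INFO {
    size_t size;  /* total image size in bytes */
    size_t (*read_random)(IMG_INFO *img, void *buf, size_t len, size_t offset);
    void (*close)(IMG_INFO *img);
    void *data;   /* private to the subsystem that created this handle */
};

/* Private state of the split-image subsystem: a linked list of parts.
 * Assumption: parts are memory buffers here, not files. */
struct part {
    const unsigned char *bytes;
    size_t len;
    struct part *next;
};

/* Read up to len bytes at a logical offset, crossing part boundaries.
 * Returns the number of bytes actually copied. */
static size_t split_read_random(IMG_INFO *img, void *buf, size_t len, size_t offset)
{
    struct part *p;
    unsigned char *out = buf;
    size_t done = 0;

    assert(img != NULL && buf != NULL);
    for (p = img->data; p != NULL && done < len; p = p->next) {
        size_t n;
        if (offset >= p->len) {  /* requested range starts after this part */
            offset -= p->len;
            continue;
        }
        n = p->len - offset;
        if (n > len - done)
            n = len - done;
        memcpy(out + done, p->bytes + offset, n);
        done += n;
        offset = 0;              /* later parts are read from their start */
    }
    return done;
}

static void split_close(IMG_INFO *img)
{
    struct part *p = img->data;
    while (p != NULL) {
        struct part *next = p->next;
        free(p);
        p = next;
    }
    free(img);
}

/* Constructor: builds the part list and fills in the function pointers.
 * The buffers are borrowed, not copied. Returns NULL on allocation failure. */
IMG_INFO *split_open(const unsigned char **bufs, const size_t *lens, size_t count)
{
    IMG_INFO *img = calloc(1, sizeof(*img));
    struct part *head = NULL, **tail = &head;
    size_t i;

    if (img == NULL)
        return NULL;
    for (i = 0; i < count; i++) {
        struct part *p = malloc(sizeof(*p));
        if (p == NULL) {
            img->data = head;
            split_close(img);
            return NULL;
        }
        p->bytes = bufs[i];
        p->len = lens[i];
        p->next = NULL;
        *tail = p;
        tail = &p->next;
        img->size += lens[i];
    }
    img->data = head;
    img->read_random = split_read_random;
    img->close = split_close;
    return img;
}
```

A raid subsystem would provide its own constructor, read function and private struct behind `data`, and the file system code calling `img->read_random(img, buf, len, offset)` would not change.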