[htdig] file type statistics and ht://Dig

Hello,

I work for the ITS department at my university, and we are looking for a way to count the files of each type found during a full crawl of our domain. Running htdig -t and then a simple script that gathers the information from the document database seems like it should do the trick; however, I have two questions just to make sure:

1) Does the title field (or any other field stored in the database, for that matter) include the file type, either on its own or at the end of the full URL or file path?

and

2) If so, is it possible to gather this information without the extra overhead of reading and parsing each file again?
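
For reference, the kind of script I have in mind would look roughly like the sketch below. It is only an illustration under assumptions I have not verified: that the text file written by htdig -t holds one document per line, that each line contains the document's full URL, and that the first URL on a line is the document's own. The function names are made up for this example, and the file type is guessed purely from the extension at the end of the URL path.

```python
import posixpath
import re
from collections import Counter
from urllib.parse import urlsplit

# Matches the first http(s) URL on a line, stopping at any whitespace.
# Assumption: each record line contains its document URL.
URL_RE = re.compile(r"https?://\S+")


def extension_of(url):
    """Return the lowercased extension of the URL path, or "(none)"."""
    path = urlsplit(url).path  # drops any query string and fragment
    ext = posixpath.splitext(path)[1].lower()
    return ext if ext else "(none)"


def count_file_types(lines):
    """Count documents per extension, one document per input line."""
    counts = Counter()
    for line in lines:
        match = URL_RE.search(line)
        if match:
            counts[extension_of(match.group(0))] += 1
    return counts
```

It would be run over the dump by opening the file and passing the file object to count_file_types(), then printing counts.most_common(). URLs that end in a directory or have no extension are all lumped together under "(none)", so they would still need separate handling.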

Thank you,
Ryan Bach
ITS Web Services
University of Rochester