From: Ryan B. <sol...@ma...> - 2006-03-28 20:20:56
Hello,

I work for the ITS department at my university, and we are trying to find a way to count the files of each type found during a full crawl of our domain. Running htdig -t and then a simple script to gather the information from the document database seems like it should do the trick. However, I have two questions just to make sure:

1) Does the title field (or any other field stored in the db, for that matter) include the file type, either on its own or at the end of the full URL or file directory path?

2) If so, is it possible to gather this information without the unnecessary overhead of further reading, parsing, etc. of each file?

Thank you,

Ryan Bach
ITS Web Services
University of Rochester