From: Douglas K. <kl...@he...> - 2005-04-27 18:11:34
> If you want all files indexed within that tree, then you could use some
> sort of script to dump out a recursive directory listing to a file, then
> use that file as the source for your start_url.
>
> If you only want a subset, then that technique might not be practical.
>
> Mike
>
> > -----Original Message-----
> > From: htd...@li...
> > [mailto:htd...@li...] On Behalf
> >
> > Hi,
> >
> > is there a simple incantation of htdig that would allow me to index a
> > file tree (not via the web server)?
> > I have a hodgepodge of files, all text, some with and some without
> > extensions, that I would like to have full-text search capabilities on.
> > But I cannot figure out how to get them all indexed. Some do, some
> > don't. (I am using a file:// URL as the starting point.)
> > I am using 3.2.0b6.
> > Looks like htdig is geared mainly towards web site indexing and I am
> > trying to bend it too much...

If this is under Unix, you could use the "find" command to write out all
the files in a tree. If you want to select only some of them, you could
use options to the "find" command (more plentiful in the GNU version), or
you could pipe the output through a command like grep or sed to select the
files you want.

The argument to htdig should be a URL or a list of URLs, so somewhere a
URL has to be used to address the files. Once you have a URL you can use,
the various files can be addressed by pathnames starting from that URL.
You could convert the list of files into a list of URLs with
substitutions executed in the same pipe from the find command.

If you're not running Unix, there might be some parallel operations you
could use.

Douglas

========
Douglas Kline
kl...@he...
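A minimal sketch of the pipeline Douglas describes. The directory names
and output filename here are illustrative only; a small sample tree is
created so the example is self-contained, but in practice DOCROOT would
point at the real document tree:

```shell
#!/bin/sh
# Illustrative document root; replace with your real tree.
DOCROOT=/tmp/htdig_demo
rm -rf "$DOCROOT"
mkdir -p "$DOCROOT/sub"
echo "alpha" > "$DOCROOT/readme"
echo "beta"  > "$DOCROOT/sub/notes.txt"

# Dump a recursive listing of plain files with find, then use sed to
# rewrite each pathname into a file:// URL that htdig can crawl.
find "$DOCROOT" -type f | sed 's|^|file://|' > /tmp/url_list.txt
```

The resulting file holds one file:// URL per line. A `grep -v` stage could
be spliced into the same pipe to drop unwanted files before the sed
substitution. If memory serves, htdig's configuration can pull such a list
into an attribute via a backquoted filename (e.g. start_url:
`/tmp/url_list.txt`), but check the attribute documentation for your
version.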