From: Gilles D. <gr...@sc...> - 2002-06-11 18:26:53
According to Hinrich Specht:
> I'm looking for a search engine to be used in an intranet solution. There
> are pdf-files placed in a directory that is not part of the webroot (and for
> some security reasons it must not be part of it). The pdf-files in this
> directory should be indexed by the search engine...

Well, if security is a concern, wouldn't it defeat that security to put
a searchable index of these documents up on your web site?
(See http://www.htdig.org/FAQ.html#q4.20)
Or are you looking for a non-web based search interface? The htsearch
program can be configured to run from a shell script and give plain text
results.

> now my question: Is htdig able to
> search in the local filesystem (outside webroot) - or, if not: Can anyone
> suggest a way to solve the problem? One idea might be to have a dedicated
> webserver installed and to make the pdf-directories the webroot of this webserver,
> then only let the searchengine have access to it ?!?!?

Well, a dedicated, secure and password-controlled web server might be a
solution to the problem, but there are other ways. With the local_urls
attribute, you can convince or trick htdig to look elsewhere for files
(even outside of the DocumentRoot), by mapping http:// URLs to specific
local directories. Also, with the recent 3.2.0b4 snapshots, you can get
htdig to index file:/ URLs too.

-- 
Gilles R. Detillieux              E-mail: <gr...@sc...>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)
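
To illustrate the local_urls idea mentioned above (mapping http:// URL
prefixes to local directories), here is a minimal Python sketch of the
prefix substitution. It is not htdig's own code, only an illustration of
the concept. The host name intranet.example, the directory /data/pdf/ and
both function names are hypothetical, and the "prefix=directory" pair
format is assumed to match what the htdig attribute documentation
describes.

```python
from typing import List, Optional, Tuple

# Hypothetical mapping in an assumed "URL-prefix=directory" form.
# intranet.example and /data/pdf/ are placeholders, not real values.
LOCAL_URLS = "http://intranet.example/docs/=/data/pdf/"


def parse_mappings(spec: str) -> List[Tuple[str, str]]:
    """Split a whitespace-separated list of prefix=directory pairs."""
    pairs = []
    for item in spec.split():
        # Split at the first "=": left is the URL prefix, right the directory.
        prefix, _, directory = item.partition("=")
        pairs.append((prefix, directory))
    return pairs


def url_to_local_path(url: str, mappings: List[Tuple[str, str]]) -> Optional[str]:
    """Return the local path for url, or None if no prefix matches."""
    for prefix, directory in mappings:
        if url.startswith(prefix):
            # Replace the URL prefix with the local directory.
            return directory + url[len(prefix):]
    return None


if __name__ == "__main__":
    mappings = parse_mappings(LOCAL_URLS)
    print(url_to_local_path("http://intranet.example/docs/a/report.pdf", mappings))
```

A URL that matches no prefix yields None here, which in this sketch stands
for "fetch it over HTTP as usual"; how htdig itself handles that case
should be checked against its documentation.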