From: Gilles D. <gr...@sc...> - 2002-04-12 19:53:49
According to Willy Calderon:
> I've got a few output lines from doing a rundig in which I'm being asked
> what to index.
>
> =========================
> host# rundig -vvv
> 1:1:
> New server: , 0
OK, the two lines above tell me htdig isn't even seeing a valid URL to
begin with. That points the finger at a faulty start_url definition...
> Unknown host: 0/robots.txt
> pushed
> pick: , # servers = 1
> htmerge: Unable to open word list file '/opt/www/htdig/db/db.allwords.text'.
> Did you index anything?
> Check your config file and try running htdig again.
> ==========================
>
> At the moment my htdig.conf file looks something like this
> ==========================
> database_dir: /opt/www/htdig/db
> database_base: ${database_dir}/db
> word_db: ${database_base}.allwords.db
> word_list: ${database_base}.allwords.text
> config_dir: /opt/www/htdig/conf
> common_url: /var/www/htdocs/www/
> start_url: ${common_dir}/index.html
Bingo! The 3.1.x series of htdig only handles http:// URLs, so you can't
use a bare UNIX directory pathname as a URL. Your start_url refers to
${common_dir}, but the attribute you define above is common_url; even if
the names matched, it would expand to a UNIX directory path rather than
an http:// URL. (Even with the 3.2 betas, which allow protocols other
than HTTP, you still need to give the protocol explicitly in the URL,
even for file:/ URLs.)
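
As a rough sketch of what a corrected definition might look like: the
host name www.example.com below is only a placeholder (your message
doesn't say which server publishes these pages), and the local_urls line
is an optional extra that assumes /var/www/htdocs/www/ is that server's
document root.

```
# Placeholder host name: substitute the web server that serves these pages.
common_url:	http://www.example.com
# Refer to the attribute by the same name it was defined with.
start_url:	${common_url}/index.html
# Optional, and an assumption about your layout: map the URL prefix to the
# local directory so htdig can read the files from disk rather than over HTTP.
local_urls:	http://www.example.com/=/var/www/htdocs/www/
```

With something along these lines in place, "rundig -vvv" should report a
real host name on the "New server:" line instead of an empty one.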
--
Gilles R. Detillieux E-mail: <gr...@sc...>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930