From: Rupert J. <ru...@sa...> - 2004-06-11 08:15:17
Hi Graeme,

Htdig works through HTTP, so it indexes from a web server. If you don't
have a web server running on your local machine, it won't be able to
perform HTTP requests and therefore won't 'dig' anything (or so I believe).

It should work indexing the live website, as this will allow HTTP
requests. The htdig files will be stored on your local machine.

The error messages:

htfuzzy: Unable to open word database /var/lib/htdig/db.words.db
htfuzzy: Unable to open word database /var/lib/htdig/db.words.db

are probably generated because nothing has been indexed yet and therefore
the database files don't exist. It might also be down to file permissions,
though. Does htdig have write access to its db directory?

You might want to consider downloading and installing Apache if you just
want to index websites on your local machine. You should get a decent
install just by doing

./configure --prefix=/path/to/install/location

(typical is /var/www or /usr/local/apache), followed by the usual make and
make install. If you can, you might prefer to do an RPM install.

HTH
Rupert.

-----Original Message-----
From: htd...@li... [mailto:htd...@li...] On Behalf Of Graeme Nichols
Sent: 11 June 2004 06:34
To: Jim
Cc: htd...@li...
Subject: Re: [htdig] Having trouble getting rundig to run, need some help for a newbie

On Thu, 2004-06-10 at 16:30, Jim wrote:
> On Wed, 10 Jun 2004, Graeme Nichols wrote:
>
> > [graeme@localhost graeme]$ sudo rundig -vvv
> > ht://dig Start Time: Thu Jun 10 13:13:17 2004
> > 1:1:http://localhost/home/graeme/gramps/web/
> > New server: localhost, 80
> > - Persistent connections: enabled
> > - HEAD before GET: disabled
> > - Timeout: 30
> > - Connection space: 0
> > - Max Documents: -1
> > - TCP retries: 1
> > - TCP wait time: 5
> > - Accept-Language:
> > Trying to retrieve robots.txt file
> > Making HTTP request on http://localhost/robots.txt
> > Unable to establish the connection with host: localhost (port 80)
>
> Do you have a web server running on port 80? If you open up a browser and
> enter http://localhost/robots.txt in the address box, do you get a
> response, or does your browser just timeout? If you do have a web server
> setup to run on port 80, are you certain it was running when you tried to
> index the site?

No, I don't have a web server running on port 80, and when I enter
http://localhost/robots.txt into my browser (Galeon) it goes to Google.

> Are you sure that http://localhost/home/graeme/gramps/web/ is a valid
> start_url? Such a URL would seem to imply that your web browser is
> configured to use / as the document root, which would generally be
> considered a bad thing and is certainly not typical. If you type this URL
> into your browser, what happens?

Same as above.

> If you are just trying to index things locally, you might want to take a
> look at the local_urls and local_urls_only attributes.
>
> http://www.htdig.org/attrs.html#local_urls
> http://www.htdig.org/attrs.html#local_urls_only
>
>
> Jim

Thanks Jim, will go and check out the above and try again. What I'm trying
to do is index a web site which I have a copy of on my local disk in the
following directory: /home/graeme/gramps/web/. I will also try indexing
the URL of the live web site and see what happens.

Thanks again for your help. I'll get back if I still cannot work it out.

--
Kind regards,
Graeme Nichols
Public Key available from http://keyserver.kjsl.com:11371/#extract

----------------------------------------------------------------------
Oh, by the way, which one's Pink?
                -- Pink Floyd
----------------------------------------------------------------------
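
The two checks suggested in this thread (is anything listening on port 80, and can htdig write to its db directory) can be scripted rather than tested by hand. What follows is only a rough sketch, not part of htdig: the function names can_connect and dir_is_writable are made up for illustration, and the /var/lib/htdig path is simply taken from the htfuzzy error message above, so adjust it to wherever your database directory actually lives. Run it as the same user that runs rundig, otherwise the permission result may not match.

```python
import os
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (refused, timeout, DNS failure)
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def dir_is_writable(path: str) -> bool:
    """Return True if path is an existing directory the current user
    may create files in (write plus search permission)."""
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)


if __name__ == "__main__":
    # Host and port match the "New server: localhost, 80" line in the log.
    print("web server reachable on localhost:80:", can_connect("localhost", 80))
    # Directory taken from the htfuzzy error message; an assumption for
    # any other installation.
    print("db directory writable:", dir_is_writable("/var/lib/htdig"))
```

If the first line prints False, rundig has nothing to talk to over HTTP, which matches the "Unable to establish the connection" message. If the second prints False, the missing db.words.db may be a permissions problem rather than an empty index.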