From: Lachlan A. <lh...@us...> - 2003-02-26 22:21:19
On Thursday 27 February 2003 04:36, Neal Richter wrote:
> On Wed, 26 Feb 2003, Lachlan Andrew wrote:
> > 1) The -i option doesn't remove the _weakcmpr file.
> > 2) I've just run htdig on an existing database *without* -i
> >    and it also complained about weakcmpr problems.
> >    (I've forgotten whether I ran htpurge after the first run,
> >    so I'm running it again without it.)
>
> #1 is easy to fix.

Yes.  While we're at it, we should remove db.log (="url_log").  I was
just thinking it might give you/us some insight into the cause of the
problem.

For #2, I have run htdig again without -i and without having purged
the database, but after 'touch'ing each html file.  It complains:

WordDB: CDB___memp_cmpr_read: unable to uncompress page at pgno = 40435
WordDB: PANIC: Input/output error

Whenever this appears, it appears twice.

> #3
> What is htpurge being run for????  Isn't its used to remove
> entries from the index?  I know that htpurge is called immediately
> after htdig in rundig... my question is WHY???!!!

Entries are created for all of the pages referred to during the dig,
even if they don't exist.  Purging gets rid of these useless entries.

> How are you guys using it?

../bin/htpurge -v -c <file>.conf

> An interesting test would be to establish two test datasets that
> are exact duplicates of each other at different URLs on your
> server.
>
> %htdig -i URL1
> %htdig -i URL2
>
> This would access, expand and rewrite nearly every page in the
> WordDB.  If there are problems rewriting/expanding pages, they may
> show up.

If -i works, the database should be erased before being accessed in
the second dig, shouldn't it?

Regards,
Lachlan