From: Christopher M. <chr...@mc...> - 2003-11-15 01:40:17
On Fri, 2003-11-14 at 20:12, Neal Richter wrote:
> Ack! This would imply that the 'purged document' is still returned in
> the search results AFTER you run htpurge!! True????
>
> I am assuming that you did something like this:
>
> 1) index pages
> 2) htdump -w
> 3) mv db.docs db.docs1
> 4) htpurge
> 5) htdump -w
> 6) mv db.docs db.docs2
> 7) diff db.docs1 db.docs2

Sorry, my bad. I had to do a fresh index first (I had already purged the same one earlier today). After the fresh index, I did a dump, purged a record and diffed the second dump. Here's what I got:

824a825
> 818 u:http://newfind.mcgill.ca/indexes/ads/?AdsID=1025825 t:*** BASSIST WANTED *** \
a:0 m:1068859617 s:336 H: anyone out there play bass? we're a groove/funk\
/jazz/rock improv band with influences from medeski martin wood and bela fleck to phish, \
pink floyd and hendrix... anything and everything in between... improv skills would help...\
email fa...@ho... for details... h: l:1068859617 L:0 b:2 c:1 g:0\
e: n: S: d:1025825 A:
1357a1359
> 2 u:http://newfind.mcgill.ca/indexes/ads/ t: a:2 m:1068859603 s:112334 \
H: h: l:1068859604 L:1403 b:1 c:0 g:0e: n: S: d: \
A:

After the purge, it doesn't show up any more.
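For anyone repeating the dump/purge/diff check, here is a minimal Python sketch of the same comparison. It is not part of ht://Dig. It assumes only what the diff output above shows: each record in the document dump starts with a numeric document ID followed by a `u:<URL>` field. The function names (`urls_by_id`, `purged_records`) and the shortened sample records are made up for illustration.

```python
import re

# Assumption: a record line begins with a numeric document ID, then
# whitespace, then a "u:<URL>" field, as in the diff output above.
RECORD_RE = re.compile(r"^(\d+)\s+u:(\S+)")


def urls_by_id(dump_text):
    """Map document ID -> URL for every record line in a dump."""
    records = {}
    for line in dump_text.splitlines():
        match = RECORD_RE.match(line)
        if match:  # wrapped continuation lines do not match and are skipped
            records[int(match.group(1))] = match.group(2)
    return records


def purged_records(before_text, after_text):
    """Return {doc_id: url} for records present before the purge but not after."""
    before = urls_by_id(before_text)
    after = urls_by_id(after_text)
    return {doc_id: url for doc_id, url in before.items() if doc_id not in after}


# Shortened, hypothetical stand-ins for the two dumps (db.docs1 / db.docs2).
before_dump = (
    "2\tu:http://newfind.mcgill.ca/indexes/ads/\tt:\ta:2\n"
    "818\tu:http://newfind.mcgill.ca/indexes/ads/?AdsID=1025825\tt:*** BASSIST WANTED ***\ta:0\n"
)
after_dump = "2\tu:http://newfind.mcgill.ca/indexes/ads/\tt:\ta:2\n"

for doc_id, url in purged_records(before_dump, after_dump).items():
    print(doc_id, url)
```

Comparing by document ID rather than by raw line avoids false differences when only a timestamp or size field changes between two dumps.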
Then after that, I tried to re-index it by doing this:

[root@lovelace bin]# echo 'http://newfind.mcgill.ca/indexes/ads/?AdsID=1025825' | ./htdig - -s -v \
-m -c /www/htdig/install/conf/ads.conf
ht://dig Start Time: Fri Nov 14 20:36:08 2003
New server: newfind.mcgill.ca, 80
0:11476:0:http://newfind.mcgill.ca/indexes/ads/?AdsID=1025825: (changed) size = 336
htdig: Run complete
htdig: 1 server seen:
htdig: newfind.mcgill.ca:80 1 document

HTTP statistics
===============
Persistent connections : Yes
HEAD call before GET : Yes
Connections opened : 2
Connections closed : 1
Changes of server : 0
HTTP Requests : 3
HTTP KBytes requested : 0.442383
HTTP Average request time : 0 secs
HTTP Average speed : inf KBytes/secs

ht://dig End Time: Fri Nov 14 20:36:08 2003

but it still doesn't show up in the search results (even after I changed my start_url to be 'http://newfind.mcgill.ca/indexes/ads/?AdsID=1025825').

Cheers,

Chris

--
Christopher Murtagh
Enterprise Systems Administrator
ISR / Web Communications Group
McGill University
Montreal, Quebec
Canada

Tel.: (514) 398-3122
Fax: (514) 398-2017