From: Olivier V. <ova...@ch...> - 2005-02-25 09:41:40
|
Hello,

The indexing of my site seems to work fine, but I can't find anything in my database. I get the message "No matches were found" for every word on the search page.

What I have already tried to troubleshoot this problem:

1- When I do a "htdig -h 1 -v", everything works fine: I can search for words correctly in my search engine. So something goes wrong when I try to dig into more files.

2- htdig -vvvv > /var/log/rundig.log 2>&1 &
   then htmerge -vvvv
   I get no error message and everything seems to work fine, but the search engine gives me no results!

3- I tried htsearch from the command line, with the same result.

4- I tried deleting the database and digging again, without success.

Here are my database files:

su-2.05b# ls -al
total 130270
drwxrwxrwx  2 root  wheel        512 Feb 25 10:48 .
drwxr-xr-x  4 root  wheel        512 Feb 17 16:31 ..
-rwxrwxrwx  1 root  wheel    1630208 Feb 24 19:02 db.docdb
-rwxrwxrwx  1 root  wheel     942080 Feb 24 19:02 db.docs.index
-rwxrwxrwx  1 root  wheel    8634368 Feb 24 19:02 db.excerpts
-rwxrwxrwx  1 root  wheel  123279360 Feb 24 19:10 db.words.db
-rwxrwxrwx  1 root  wheel      16384 Feb 24 03:00 db.words.db_weakcmpr
-rwxrwxrwx  1 root  wheel       9807 Feb 23 15:45 htdex.Bgh1uG

Everything seems to look fine. It seems that my database is full of words.

I don't know how to troubleshoot further... Any help or ideas are welcome!

Thanks,
Olivier.
From: Jim <li...@yg...> - 2005-02-26 10:19:22
|
On Fri, 25 Feb 2005, Olivier Vautrin wrote:

> The indexing of my site seems to work fine but I can't search anything into
> my database. I have the message "No matches were found" for every words in
> the search page.

...

> Everything seems to look fine. It seems that my database is full of words.
>
> I don't know now how to troubleshoot further... Any help or ideas are
> welcome!

The first thing I would try is rerunning htdig/htmerge using the -c option to explicitly specify the configuration file you want to use. Then run htsearch from the command line also using the -c option to specify the configuration file. If the search works in this case, it is likely that htdig/htmerge and htsearch do not agree on the default location of the databases (i.e. they are using different configuration settings).

Jim
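A minimal sketch of the check Jim describes, in which every tool is given the same configuration file through -c. The configuration path is hypothetical and must be replaced with your own. Passing the query as an argument to htsearch is an assumption about the installed build; if it is not accepted, run htsearch with only -c and enter the search words when prompted. The `run` helper is illustrative only: by default it prints the commands rather than executing them.

```shell
#!/bin/sh
# Hypothetical path: point this at the configuration file you actually use.
CONF="${CONF:-/usr/local/etc/htdig.conf}"

# Print each command instead of running it unless DRY_RUN=0 is set,
# so the exact invocations can be reviewed first.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Rebuild the index from scratch with an explicit configuration file.
run htdig -i -v -c "$CONF"
# Post-process the databases with the same configuration file.
run htmerge -v -c "$CONF"
# Query from the command line, again with the same configuration file.
# Assumption: this build accepts the query string as an argument.
run htsearch -c "$CONF" "words=test"
```

Run it once as is to review the three commands, then with DRY_RUN=0 to execute them. If the command-line search then returns matches while the web search does not, the two are probably reading different configuration settings.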
From: Olivier V. <ova...@ch...> - 2005-03-17 11:41:01
|
Jim wrote:

> On Fri, 25 Feb 2005, Olivier Vautrin wrote:
>
>> The indexing of my site seems to work fine but I can't search
>> anything into my database. I have the message "No matches were found"
>> for every words in the search page.
>
> ...
>
>> Everything seems to look fine. It seems that my database is full of
>> words.
>>
>> I don't know now how to troubleshoot further... Any help or ideas are
>> welcome!
>
>
> The first thing I would try is rerunning htdig/htmerge using the -c
> option to explicitly specify the configuration file you want to use.
> Then run htsearch from the command line also using the -c option to
> specify the configuration file. If the search works in this case, it
> is likely that htdig/htmerge and htsearch do not agree on the default
> location of the databases (i.e. they are using different configuration
> settings).
>
> Jim

It does not seem to be a configuration file problem. I tried your suggestion, without success.

The behaviour is quite simple to reproduce:

When I use the command "./htdig -i -h 3 -v", the search engine works well, and so does htsearch.
When I use the command "./htdig -i -v", htsearch gives me no results at all ("No matches were found").

The database is not empty:

su-2.05b# cd /usr/local/share/htdig/database/
su-2.05b# ls -al
total 148798
drwxrwxrwx  2 root  wheel       512 Mar 17 10:32 .
drwxr-xr-x  4 root  wheel       512 Feb 17 16:31 ..
-rw-r--r--  1 root  wheel    679936 Mar 17 12:28 db.docdb
-rwxrwxrwx  1 root  wheel   1376256 Mar 15 18:52 db.docdb.work
-rw-r--r--  1 root  wheel    180224 Mar 17 12:28 db.docs.index
-rwxrwxrwx  1 root  wheel    729088 Mar 15 18:52 db.docs.index.work
-rw-r--r--  1 root  wheel   5251072 Mar 17 12:28 db.excerpts
-rwxrwxrwx  1 root  wheel   7249920 Mar 15 18:52 db.excerpts.work
-rw-r--r--  1 root  wheel  47339520 Mar 17 12:28 db.words.db
-rwxrwxrwx  1 root  wheel  90491904 Mar 15 18:52 db.words.db.work
-rwxrwxrwx  1 root  wheel     16384 Mar 15 18:52 db.words.db.work_weakcmpr
-rw-r--r--  1 root  wheel     16384 Mar 17 12:28 db.words.db_weakcmpr
-rwxrwxrwx  1 root  wheel      9807 Feb 23 15:45 htdex.Bgh1uG

and an htdig with "-vvv" gives me nothing interesting in the logs...

Is there a maximum size for the database?

Thanks,
Olivier.
From: Jim <li...@yg...> - 2005-03-19 09:20:56
|
On Thu, 17 Mar 2005, Olivier Vautrin wrote:

> It seems that it is not a problem for the configuration files. I tried your
> suggestion without result.
>
> The behaviour is quite simple to reproduce:
>
> When I use the command: "./htdig -i -h 3 -v". the search engine is working
> well and htsearch also.
> When I use the command: "./htdig -i -v". htsearch give me no result at all
> ("No matches were found").

Strange. The only difference here would be in the number of files indexed. According to the database sizes provided below, the dig is not generating database files large enough to hit any limits.

> The database is not empty:
> su-2.05b# cd /usr/local/share/htdig/database/
> su-2.05b# ls -al
> total 148798
> drwxrwxrwx  2 root  wheel       512 Mar 17 10:32 .
> drwxr-xr-x  4 root  wheel       512 Feb 17 16:31 ..
> -rw-r--r--  1 root  wheel    679936 Mar 17 12:28 db.docdb
> -rwxrwxrwx  1 root  wheel   1376256 Mar 15 18:52 db.docdb.work
> -rw-r--r--  1 root  wheel    180224 Mar 17 12:28 db.docs.index
> -rwxrwxrwx  1 root  wheel    729088 Mar 15 18:52 db.docs.index.work
> -rw-r--r--  1 root  wheel   5251072 Mar 17 12:28 db.excerpts
> -rwxrwxrwx  1 root  wheel   7249920 Mar 15 18:52 db.excerpts.work
> -rw-r--r--  1 root  wheel  47339520 Mar 17 12:28 db.words.db
> -rwxrwxrwx  1 root  wheel  90491904 Mar 15 18:52 db.words.db.work
> -rwxrwxrwx  1 root  wheel     16384 Mar 15 18:52 db.words.db.work_weakcmpr
> -rw-r--r--  1 root  wheel     16384 Mar 17 12:28 db.words.db_weakcmpr
> -rwxrwxrwx  1 root  wheel      9807 Feb 23 15:45 htdex.Bgh1uG
>
> and an htdig with "-vvv" give me no interesting results in the logs...

Have you double and triple checked this? Most of the issues that could result in corrupt databases (e.g. running out of disk space, running out of tmp space, network problems, etc.) should show up in the verbose output.

Have you tried running the htdump program on the database to see if the databases are even readable by the ht://Dig tools? If that works, you might want to check the resulting files to ensure that the terms and documents you are expecting to find are actually present.

> Is there a maximum size for the database?

Yes, but it depends on the version of ht://Dig, the way it is compiled, and the hardware it is running on. However, the first limit you would be likely to run into is at about 2 GB, and you are clearly nowhere near that based on the numbers above.

Is htsearch running on a machine over which you have full control? If you are running on a shared server maintained by someone else, it might be that resource limits have been put in place that limit the amount of memory/CPU time that CGIs are allowed to consume. I would expect a server error in this case rather than a "No matches" response, but I am not sure what else to suggest.

Jim
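The two checks Jim mentions can be sketched roughly as follows. The size comparison takes "about 2 GB" to mean 2^31 bytes, which is an assumption, since the real limit depends on the ht://Dig version, the build and the hardware. The configuration path is hypothetical; the database directory is the one shown in the listing. The htdump step runs only if the program is installed.

```shell
#!/bin/sh
# Assumption: "about 2 GB" is taken here as 2^31 bytes.
LIMIT=2147483648

# Succeeds when a file size in bytes is below the assumed limit.
under_limit() {
    [ "$1" -lt "$LIMIT" ]
}

# Sizes of db.words.db and db.words.db.work from the listing above.
for size in 47339520 90491904; do
    if under_limit "$size"; then
        echo "$size bytes: below the assumed limit"
    else
        echo "$size bytes: at or above the assumed limit"
    fi
done

# Hypothetical configuration path: adjust to your installation.
CONF="${CONF:-/usr/local/etc/htdig.conf}"
# Database directory as shown in the listing.
DB_DIR="${DB_DIR:-/usr/local/share/htdig/database}"

# Dump the databases to text only if htdump is installed, then list
# the directory to see which dump files were written.
if command -v htdump >/dev/null 2>&1; then
    htdump -c "$CONF" && ls -l "$DB_DIR"
fi
```

If htdump completes, the text files it writes can be searched with grep for a word or URL that should have been indexed. A term that is present in the dump but not returned by htsearch points at the search side rather than at the dig.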