I'm new to htdig and have run into some problems configuring it for my web server (Apache on SUSE 10).

 

The server manages 4 virtual hosts: 2 IP-based virtual hosts and 2 name-based virtual hosts, configured as described below.

 

==============================================================================

            HOST 1 ip ADDRESS

            ====================

            <VirtualHost 192.168.aaa.aaa>

             DocumentRoot /srv/www/htdocs/extranet2

             ServerName extranet2.unità.azienda.it

             ServerAdmin webmaster@azienda.it

             ErrorLog /var/log/apache2/extranet2.azienda.it-error_log

             Alias /extranet2 "/srv/www/htdocs/extranet2"

             <Directory "/srv/www/htdocs/extranet2">

              Options Indexes

              AllowOverride None

              Order allow,deny

              Allow from all

             </Directory>

            </VirtualHost>

 

            HOST 2 ip ADDRESS

            ====================

            <VirtualHost 192.168.aaa.bbb>

             DocumentRoot /srv/www/htdocs/intranet2

             ServerName intranet2.unità.azienda.it

             ServerAdmin webmaster@azienda.it

             ErrorLog /var/log/apache2/intranet2.azienda.it-error_log

             Alias /extranet2 "/srv/www/htdocs/intranet2"

             <Directory "/srv/www/htdocs/intranet2">

              Options Indexes

              AllowOverride None

              Order allow,deny

              Allow from all

             </Directory>

            </VirtualHost>

 

            NAME-BASED HOSTS (2 HOSTS)

            ====================

            <VirtualHost 192.168.aaa.ccc>

             ServerName test1.unità.azienda.it

             DocumentRoot /srv/www/htdocs/test1

             ServerAdmin webmaster@azienda.it

             ErrorLog /var/log/apache2/test1.azienda.it-error_log

             Alias /test1 "/srv/www/htdocs/test1"

             <Directory "/srv/www/htdocs/test1">

              Options None

              AllowOverride None

              Order allow,deny

              Allow from all

             </Directory>

            </VirtualHost>

 

            <VirtualHost 192.168.aaa.ccc>

             ServerName test2.unità.azienda.it

             DocumentRoot /srv/www/htdocs/test2

             ServerAdmin webmaster@azienda.it

             ErrorLog /var/log/apache2/test2.azienda.it-error_log

             Alias /test2 "/srv/www/htdocs/test2"

             <Directory "/srv/www/htdocs/test2">

              Options None

              AllowOverride None

              Order allow,deny

              Allow from all

             </Directory>

            </VirtualHost>

            ===========================================================================

 

The web server works well and DNS is fine (all forward and reverse lookups are OK).

I can reach each site as xx.unità.azienda.it or by IP address, and from each site I can

open htdig at http://namesite/htdig and the program starts.

 

This is the config file used for htdig (the converters are present and, I think, correctly

configured using Xpdf and catdoc):

            /////////////////////////////////////////////////////////////////////////

            database_dir:                /var/lib/htdig/db

            start_url:                       http://extranet2.unità.azienda.it

            http://intranet2.unità.azienda.it http://test2.unità.azienda.it

            http://test1.unità.azienda.it

            external_parser:            application/pdf->text/html /usr/local/bin/conv_doc.pl

            external_parser:            application/msword->text/html /usr/local/bin/conv_doc.pl

            limit_urls_to:                 ${start_url}

            common_url_parts:        ${limit_urls_to} .html .htm .shtml .php .asp

            exclude_urls:                /cgi-bin/ .cgi    

            bad_extensions:                       .wav .gz .z .sit .au .zip .tar .hqx .exe .com .gif \

                                               .jpg .jpeg .aiff .class .map .ram .tgz .bin .rpm .mpg .mov .avi .css

            maintainer:                   webmaster@azienda.it

            max_head_length:         10000

            max_doc_size:             200000

            no_excerpt_show_top:   true

            search_algorithm:          exact:1 synonyms:0.5 endings:0.1

            next_page_text:                       

              ........image instructions omitted, but they work well........

            allow_virtual_host:         true     

            case_sensitive:             false    

            allow_numbers:                         true     

            logging:                        true     

            max_stars:                   6         

            /////////////////////////////////////////////////////////////////////////

 

I have run into two separate problems:

1) When I first ran rundig -vvv, the system retrieved a lot of pages from the aliases (the Apache manual) and from the main web site (generic content under htdocs), but not from the virtual sites (the directories below it). After I changed the parameters in htdig.conf, the process runs very quickly and only one page per site (index.html) is returned (showing just the site name, without ref:); if I click on it, the referenced page opens. I cannot retrieve the other pages under each site (.html and .php) or the documents (Word and PDF format) that I added to test the installation.
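
One thing that may be worth ruling out for problem 1, assuming the line breaks in the start_url value above are in the real file and are not just mail wrapping: an htdig attribute value only continues onto the next line when the line ends with a backslash, as bad_extensions does in the config above. A minimal sketch of start_url written that way (the trailing slashes on the URLs are an assumption, not something taken from the original file):

```
# Sketch only: the same four sites as above, joined with backslash
# continuations so that htdig reads them as one start_url value.
start_url:      http://extranet2.unità.azienda.it/ \
                http://intranet2.unità.azienda.it/ \
                http://test1.unità.azienda.it/ \
                http://test2.unità.azienda.it/
limit_urls_to:  ${start_url}
```

If the real file already has the backslashes, this is not the cause and the sketch can be ignored.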

 

2) How can I create a separate search engine for each site? This is important to me, to avoid topics from different sites being mixed together inside the company. I have found some papers on the web (and on your site) that say to create several .conf files and separate databases, but no instructions are given (not even in the official manual).

Is there a document that explains how to proceed?

How can I point htsearch at different databases with a single htDig installation?
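
To make the question concrete, here is a minimal sketch of the multi-config approach as I understand it, under stated assumptions: one .conf file per site, each with its own database_dir. The file name extranet2.conf, the /etc/htdig location and the db/extranet2 subdirectory are hypothetical; the attribute names are the ones already used in the config above.

```
# /etc/htdig/extranet2.conf  (hypothetical file name and location)
# One file like this per site; only the values below differ between sites.
database_dir:   /var/lib/htdig/db/extranet2
start_url:      http://extranet2.unità.azienda.it/
limit_urls_to:  ${start_url}
# ...remaining attributes as in the shared configuration above...
```

Each file would then presumably be indexed on its own, for example with rundig -c /etc/htdig/extranet2.conf, and the search form on each site would select its file through the htsearch "config" input parameter (for example a hidden form field named config with the value extranet2, i.e. the base name of the file without path or .conf, looked up in the configuration directory htsearch was built with). The per-site database directory would have to exist before indexing.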

 

Thanks a lot for any suggestions you can give me.

 

Bye Bye

 

Roberto Franchi