[htdig] pdf and doc hits sorted first in htsearch results?

We are running ht://Dig 3.2.0b4-011302 on a Red Hat 7.3 system, installed from 
the standard Red Hat RPMs.  We have been using doc2html to parse PDFs and DOCs, 
with the following lines at the end of /etc/htdig.conf:

external_parsers: application/msword->text/html /usr/local/bin/doc2html.pl \
                   application/postscript->text/html /usr/local/bin/doc2html.pl \
                   application/pdf->text/html /usr/local/bin/doc2html.pl

The mystery is:  How can we get htsearch to stop bunching all the .pdf and .doc 
files at the top of the results?  For reasons unclear to me, all matching .pdf 
files are listed first, then all the .doc files, and then all the .html files.

Our search algorithm and weighting factors are set as follows:

search_algorithm:       exact:1 synonyms:0.2 endings:0.1

#backlink_factor: 1000.0
#date_factor: 0.00
#description_factor:  150
#heading_factor: 5.0
keywords_factor: 500
meta_description_factor: 100
#text_factor: 1
#title_factor: 100
heading_factor_1: 10
heading_factor_2: 5
heading_factor_3: 4
#heading_factor_4: 1
#heading_factor_5: 1
#heading_factor_6: 0
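
To make clear which of these weights are actually in effect, here is a minimal Python sketch (not part of ht://Dig) that separates the factor attributes set explicitly from those that are commented out and therefore left at whatever the built-in defaults are. The function name split_factors and the CONF string are made up for illustration; the values are copied from the excerpt above.

```python
import re

# Excerpt copied from the htdig.conf lines above (illustrative input only).
CONF = """\
search_algorithm:       exact:1 synonyms:0.2 endings:0.1

#backlink_factor: 1000.0
#date_factor: 0.00
#description_factor:  150
#heading_factor: 5.0
keywords_factor: 500
meta_description_factor: 100
#text_factor: 1
#title_factor: 100
heading_factor_1: 10
heading_factor_2: 5
heading_factor_3: 4
#heading_factor_4: 1
#heading_factor_5: 1
#heading_factor_6: 0
"""

# Matches an optional leading '#', an attribute name containing "_factor",
# a colon, and a numeric-looking value.
FACTOR_LINE = re.compile(r"^(#?)\s*(\w*_factor\w*)\s*:\s*(\S+)")


def split_factors(conf_text):
    """Return (active, commented) dicts of factor attributes.

    'active' holds attributes set explicitly; 'commented' holds the ones
    prefixed with '#', which the config file does not set at all.
    """
    active, commented = {}, {}
    for line in conf_text.splitlines():
        match = FACTOR_LINE.match(line.strip())
        if not match:
            continue  # skips blank lines and non-factor attributes
        hashmark, name, value = match.groups()
        target = commented if hashmark else active
        target[name] = float(value)
    return active, commented


if __name__ == "__main__":
    active, commented = split_factors(CONF)
    print("explicitly set:", sorted(active))
    print("commented out: ", sorted(commented))
```

Run against the excerpt, this shows that only keywords_factor, meta_description_factor, and heading_factor_1 through heading_factor_3 are set by us; everything else, including text_factor and title_factor, is commented out.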

Any suggestions?  (We're just about ready to give up indexing .pdf and .doc 
files altogether.)