From: Gilles D. <gr...@sc...> - 2003-10-08 21:45:04
According to Martin Joisten:
> I have to admit not having followed this problem so far, but when
> Natalya writes "I don't get error message, but I have never .pdf-Files
> in my search-List!!!", I wonder if a simple misunderstanding is the
> cause for the trouble...
>
> For my understanding htdig doesn't index all the files in a subdirectory
> but only follows URLs which it finds on "webpages". So if no URL points
> to a PDF-File, no PDF will be indexed and therefore no PDF will show up
> in the search list.
>
> I wanted to index PDFs once and specially created a single PHP File that
> would browse through the subdirectories recursively and simple create a
> page with links to all the PDF Files found.
>
> I pointed htdig to this particular file and "voila" - all of the PDF
> Files were indexed. So maybe this is the problem - no links to the PDF
> Files.
>
> If this point had already been cleared in previous mails concerning this
> issue, I apologize for not having read these.

I raised the issue very briefly in my reply to Natalya yesterday. I.e.:

> >>>Also, make sure you have links in your HTML files to all PDF files you
> >>>want to index. (See http://www.htdig.org/FAQ.html#q5.25)

However, this is just one possibility among many possible trouble spots,
and the test results from attempting to index a single PDF, using the URL
of the PDF as start_url, suggest there's another problem at play here.
I think it's important to get htdig working with a single PDF before
tackling the bigger issue of whether it can find multiple ones.

-- 
Gilles R. Detillieux              E-mail: <gr...@sc...>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)
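
For reference, the link-page approach Martin describes could be sketched
roughly as follows. This is only a Python illustration of the idea, not his
PHP file: the function name build_pdf_index, the directory argument and the
base URL are all hypothetical, and it assumes the directory is served by the
web server under that base URL.

```python
import html
import os
from urllib.parse import quote


def build_pdf_index(root_dir, base_url):
    """Return an HTML page with one link per PDF found under root_dir.

    Assumption: files under root_dir are reachable at base_url plus their
    relative path. Both arguments are placeholders for illustration.
    """
    items = []
    for dirpath, dirnames, filenames in os.walk(root_dir):
        dirnames.sort()  # keep the traversal order deterministic
        for name in sorted(filenames):
            if not name.lower().endswith(".pdf"):
                continue  # only PDF files get a link
            rel = os.path.relpath(os.path.join(dirpath, name), root_dir)
            # Build a URL path with forward slashes and percent-encoding.
            url = base_url.rstrip("/") + "/" + quote(rel.replace(os.sep, "/"))
            items.append(
                '<li><a href="%s">%s</a></li>'
                % (html.escape(url, quote=True), html.escape(rel))
            )
    return "<html><body><ul>\n" + "\n".join(items) + "\n</ul></body></html>\n"
```

The returned string would be saved as an HTML file under the document root,
and that page's URL used as start_url so htdig has links to follow. It only
addresses the missing-links case, not whatever is failing with a single PDF.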