From: Gilles D. <gr...@sc...> - 2002-02-08 16:19:04
According to Keith Elliott:
> I am using ver. 3.2.0-0.b3.4 on Linux

I assume this is the Red Hat Linux RPM package.  You should upgrade to at
least the 3.2.0-1.b4.1 RPM update package that Red Hat issued in October,
which contains a number of bug fixes including a security related fix.

> The problem seems to show up on many pages, not
> just the one with which I'm experimenting.  I have not messed with
> noindex_start, but I will check it out.
...
> The page is dynamically generated (the URL is
> animal.nurelm.com/site/faqs.jsp).  It's full of Flash and Javascript, but I
> would hope that did not matter.  In fact, I removed all of the Flash and
> JavaScript and it did not matter.  I have messed with most of the
> configuration parameters and get seem to make any progress.

That faqs.jsp page has a series of 3 ASCII NUL characters in the middle of
an HTML tag about 10 lines below the second section of JavaScript code:

<td><img na^@^@^@me="tag" src="images/tag.gif" width="182" height="16" border="0"></td>

The htdig text parsers have always choked on NULs because that's the
normal C string terminator character.  The noindex_start attribute won't
help, because it also will choke on NULs.

So, the obvious fix is to find and get rid of all NULs in your files.
However, you say this seems to show up on many pages, so editing them by
hand may not be the easiest fix.  The question is, why are those NULs
in the file in the first place?  Are they the result of a buggy HTML code
generator?  Can you fix the problem by fixing the code generator?

You could try the following patch as well.  It's for 3.1.6, but it would
likely apply fine to the 3.2 code.  Of course, you'd need to build the
code by hand, or at least unpack a src.rpm file and alter it to use the
patch.

ftp://ftp.ccsf.org/htdig-patches/3.1.6/NUL.0

--
Gilles R. Detillieux              E-mail: <gr...@sc...>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
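
A note on removing the NULs in bulk rather than by hand: the following is only a minimal Python sketch, assuming the NULs sit in static source files on disk (for example the .jsp templates) and are not produced at request time by the generator. The function names count_nuls, strip_nuls and clean_file are made up for illustration and are not part of htdig.

```python
from pathlib import Path

NUL = b"\x00"  # the ASCII NUL byte, shown as ^@ in many editors


def count_nuls(data: bytes) -> int:
    """Return how many NUL bytes the data contains."""
    return data.count(NUL)


def strip_nuls(data: bytes) -> bytes:
    """Return a copy of the data with every NUL byte removed."""
    return data.replace(NUL, b"")


def clean_file(path: Path) -> int:
    """Remove NUL bytes from one file in place.

    Returns the number of NULs removed. The file is rewritten only
    if at least one NUL was found. Hypothetical helper: keep a backup
    of the originals before running it on real files.
    """
    data = path.read_bytes()
    found = count_nuls(data)
    if found:
        path.write_bytes(strip_nuls(data))
    return found
```

Calling clean_file(Path("faqs.jsp")) on a local copy of the page source would report how many NULs it removed. If the count is nonzero again after the files are regenerated, that points at the code generator as the real culprit.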