[Htmlparser-user] scanning / parsing bug?


For this URL, 
http://www.washingtonpost.com/wp-dyn/content/article/2007/12/10/AR2007121001600.html 
(and maybe other Washington Post URLs), I suspect HTML Parser is 
running into a bug.

The HTML source for this page has the following block of HTML in the 
middle:

    <!---------------- End New Comments Box ------------------>
    <div class="sidebarhack"><b></b></div>
    ....
    ....
    </div>
    <!-- sphereit end -->
    <br clear="all">

The parser ignores all content from the start of the 'End New Comments 
Box' line through 'sphereit end'. I suspect this is because there is no 
space before the closing '-->' on the first line. I tested this by 
manually adding a space at that point, and sure enough, the block of 
HTML in the middle was then recognized correctly.

Is there a workaround for this?  I am also willing to download the 
source code and incorporate a fix, if necessary.
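
In the meantime, one stopgap might be to normalize comments in the raw 
HTML before handing the string to the parser, which automates the manual 
edit described above. The following is only a rough sketch under that 
assumption: the class name CommentNormalizer and the method normalize 
are made up for illustration and are not part of HTML Parser. It uses 
only java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HTML Parser: rewrites every HTML
// comment so that its body is separated from "<!--" and "-->" by a space.
public class CommentNormalizer {

    // "<!--", then the shortest possible body, then the first "-->".
    private static final Pattern COMMENT =
            Pattern.compile("<!--(.*?)-->", Pattern.DOTALL);

    public static String normalize(String html) {
        Matcher m = COMMENT.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Drop leading and trailing runs of dashes and whitespace
            // from the comment body; inner text is left untouched.
            String body = m.group(1).replaceAll("^[-\\s]+|[-\\s]+$", "");
            m.appendReplacement(out,
                    Matcher.quoteReplacement("<!-- " + body + " -->"));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<!---------------- End New Comments Box ------------------>";
        System.out.println(normalize(html));
    }
}
```

This treats every "<!-- ... -->" span the same way, so it would also 
touch comment-like text inside script blocks and would alter IE 
conditional comments. It is a workaround for pages like the one above, 
not a fix for the parser itself.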

Thanks,
Subbu.