From: Brad C. <yo...@br...> - 2004-10-27 22:56:43
Sounds like it might be a problem with xerces or nekohtml. What versions of
these are you running?

Could you make a simple test case for this? (I can get to things much quicker
when I don't have to write a test)

Brad C

--- Niklas Vargensten <nik...@ja...> wrote:
> OK, so maybe the hyphen is not the issue. The problem is that the documents
> don't have a complete ending tag. In my example,
> i) <! DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" >
>
> is skipped altogether. I suppose the reason is that the space at the end of
> the tag (before the >) confuses the parser, making it expect
> a url such as "http://www.w3.org/TR/html4/loose.dtd" at the end.
>
> But - simply removing the last space, which gives us:
> ii) <! DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
>
> works just fine with htmlunit. This is very strange behaviour, wouldn't you
> say? If it is ok to exclude the ending url, then it should of course be ok to
> have spaces at the end of the tag (before the >). In particular since a HTML
> parser is not supposed to care too much about spaces. It just seems a bit too
> picky :)
>
> / Niklas
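
A possible starting point for the requested test case: the sketch below only builds the two input documents from the quoted message, identical except for the single space before the closing ">". It does not call htmlunit, xerces or nekohtml. The class name DoctypeCases, the helper page() and the minimal page body are made up for illustration and are not taken from the thread.

```java
public class DoctypeCases {
    // Doctype text as quoted in the message above, without the closing '>'.
    static final String DOCTYPE_PREFIX =
        "<! DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\"";

    // Hypothetical helper: puts a doctype line in front of a minimal page body.
    // The body is an assumption; the thread does not show the real documents.
    static String page(String doctype) {
        return doctype + "\n"
            + "<html><head><title>t</title></head>"
            + "<body><p>hello</p></body></html>\n";
    }

    // Case i): space before the closing '>' (reported as skipped altogether).
    static String withTrailingSpace() {
        return page(DOCTYPE_PREFIX + " >");
    }

    // Case ii): no space before the closing '>' (reported as working).
    static String withoutTrailingSpace() {
        return page(DOCTYPE_PREFIX + ">");
    }

    public static void main(String[] args) {
        // Print both variants so they can be saved as fixture files.
        System.out.println(withTrailingSpace());
        System.out.println(withoutTrailingSpace());
    }
}
```

Feeding each string to the parser under test and comparing the results should isolate whether the trailing space alone triggers the behaviour.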