From: Niklas V. <nik...@ja...> - 2004-10-22 14:29:10
I am trying to parse various downloaded HTML documents with htmlunit, using a mock web-connection. However, every page that starts with the following line (before <HTML>):

<! DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

is completely ignored by the parser. If I change the line to

<! DOCTYPE HTML PUBLIC - "//W3C//DTD HTML 4.01 Transitional//EN">

(the only difference being the hyphen placed outside the double quotes), it works just fine.

How is this possible?

/ Niklas
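
For anyone trying to reproduce this without the downloaded files, the two cases can be reduced to a pair of in-memory pages that differ only in their first line. The sketch below only builds those two inputs; it does not call htmlunit. The class name `DoctypeRepro`, the helper `page`, and the trivial page body are hypothetical and not taken from the original documents. Each resulting string would be supplied as the response body of the mock web-connection.

```java
public class DoctypeRepro {
    // First line of the pages that are reported as ignored by the parser.
    static final String FAILING =
        "<! DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\">";

    // Same line with the hyphen moved outside the double quotes (reported to work).
    static final String WORKING =
        "<! DOCTYPE HTML PUBLIC - \"//W3C//DTD HTML 4.01 Transitional//EN\">";

    // Hypothetical helper: puts the given first line in front of a trivial page body.
    // The body is an assumption; the real downloaded pages are not shown in the report.
    static String page(String doctypeLine) {
        return doctypeLine + "\n"
            + "<HTML><HEAD><TITLE>test</TITLE></HEAD>"
            + "<BODY><P>hello</P></BODY></HTML>\n";
    }

    public static void main(String[] args) {
        // Print both test inputs so they can be fed to the parser under test.
        System.out.println(page(FAILING));
        System.out.println(page(WORKING));
    }
}
```

Keeping the body identical in both pages means any difference in parser behaviour can be attributed to the first line alone.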