From: Brad C. <yo...@br...> - 2004-10-22 14:54:24
I'd guess that putting the hyphen outside the quotes makes it invalid, so nekohtml just throws it out. Our doctype looks like this and works just fine:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

Could you provide a small test case for your error?

Brad C

--- Niklas Vargensten <nik...@ja...> wrote:
> I am trying to parse various downloaded html documents with htmlunit, using a
> mock web-connection. But - all pages starting with the following line (before
> <HTML>):
>
> <! DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
>
> are completely ignored by the parser. However, if I change the line to
>
> <! DOCTYPE HTML PUBLIC - "//W3C//DTD HTML 4.01 Transitional//EN">
>
> (the only difference being the hyphen placed outside the double quotes) then
> it works just fine. How is this possible?
>
> / Niklas