[Htmlparser-user] Parsing malformed HTML whilst still leaving it intact
From: Marc C. <mc...@ja...> - 2006-01-21 22:48:05
Hi,

I'm parsing HTML pages one snippet at a time, making some changes, and then writing the result back out as HTML. The problem with HTML snippets is that they are often malformed: some closing tags, for example, will be missing because they fall outside the snippet.

The Parser seems to correct the malformed HTML automatically by adding the missing closing tags. Is it possible to prevent it from doing so? Or can it at least notify me when it does, so that I can delete the added tags before reconstructing the modified HTML output?

An alternative would be to use the Lexer, but then I lose all the hierarchical features of the Parser, which is not an option.

This is similar to the general problem brought up in <http://sourceforge.net/mailarchive/message.php?msg_id=12635550>.

Kind Regards
Mark
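
To make the request concrete, here is a minimal, self-contained sketch of the behaviour being asked for: a tree is built from a snippet, each element records whether its end tag was actually present in the input, and the serializer can leave out any end tag the input never had. This is not the HTMLParser API; the class `SnippetRoundTrip`, its `Node` type, and the methods `parse` and `render` are hypothetical names invented for illustration, and the regex tokenizer is a deliberate simplification (it ignores void elements such as `<br>` and attribute values that contain `>`).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; none of these names come from HTMLParser.
final class SnippetRoundTrip {

    /** A text node (text != null), an element (name != null), or the root (both null). */
    static final class Node {
        final String startTag;  // raw start tag as written in the input
        final String name;      // lower-cased element name
        final String text;      // raw text for text nodes
        String endTag;          // raw end tag from the input, or null if it was missing
        final List<Node> children = new ArrayList<>();

        Node(String startTag, String name, String text) {
            this.startTag = startTag;
            this.name = name;
            this.text = text;
        }
    }

    // Simplified tag matcher: "<name ...>" or "</name ...>".
    private static final Pattern TAG =
            Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>");

    static Node parse(String html) {
        Node root = new Node(null, null, null);
        Deque<Node> open = new ArrayDeque<>();
        open.push(root);
        Matcher m = TAG.matcher(html);
        int pos = 0;
        while (m.find()) {
            if (m.start() > pos) {  // text between two tags
                open.peek().children.add(new Node(null, null, html.substring(pos, m.start())));
            }
            pos = m.end();
            String name = m.group(2).toLowerCase(Locale.ROOT);
            if (m.group(1).isEmpty()) {            // start tag: open a new element
                Node element = new Node(m.group(), name, null);
                open.peek().children.add(element);
                open.push(element);
            } else if (isOpen(open, name)) {       // end tag with a matching start tag
                while (!name.equals(open.peek().name)) {
                    open.pop();                    // inner elements keep endTag == null
                }
                open.pop().endTag = m.group();
            } else {                               // stray end tag: keep it verbatim as text
                open.peek().children.add(new Node(null, null, m.group()));
            }
        }
        if (pos < html.length()) {
            open.peek().children.add(new Node(null, null, html.substring(pos)));
        }
        return root;  // elements still open here simply keep endTag == null
    }

    private static boolean isOpen(Deque<Node> open, String name) {
        for (Node n : open) {
            if (name.equals(n.name)) {
                return true;
            }
        }
        return false;
    }

    /** Serializes the tree; missing end tags are emitted only when addMissingEndTags is true. */
    static String render(Node n, boolean addMissingEndTags) {
        if (n.text != null) {
            return n.text;
        }
        StringBuilder sb = new StringBuilder();
        if (n.startTag != null) {
            sb.append(n.startTag);
        }
        for (Node child : n.children) {
            sb.append(render(child, addMissingEndTags));
        }
        if (n.endTag != null) {
            sb.append(n.endTag);
        } else if (n.name != null && addMissingEndTags) {
            sb.append("</").append(n.name).append('>');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node tree = parse("<li>one<li><b>two</b>");
        System.out.println(render(tree, true));   // <li>one<li><b>two</b></li></li>
        System.out.println(render(tree, false));  // <li>one<li><b>two</b>
    }
}
```

With `addMissingEndTags` set to `false`, the snippet comes back byte for byte as it went in, while the tree still offers the hierarchy needed for making changes. Whether the Parser exposes an equivalent "this end tag was added" marker is exactly the open question in this message.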