HTML text in script tag causes parser error
I'm parsing a poorly implemented HTML site (not
well formed) and have come across an interesting
situation.
Inside a link tag there's a script tag that defines
a string variable containing HTML. When the string
contains a </span> end tag, the parser treats it as
the end tag for the script and emits a </script> at
that point (visible in the output of .toHtml()).
The rest of the tags are therefore out of sync.
Example HTML code is attached.
Logged In: YES
user_id=1631925
Never mind, I found the answer in the forums. Setting
org.htmlparser.scanners.ScriptScanner.STRICT = false
resolves it.
Feel free to close this bug as a "known" issue with "wild"
HTML :)
Logged In: YES
user_id=605407
Originator: NO
See the org.htmlparser.scanners.ScriptScanner.STRICT = false solution above.
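
For anyone landing here with the same symptom, the difference between the two modes can be sketched without the library. Under the HTML 4 rule for script content, the script body ends at the first "</" followed by a letter, so a </span> inside a JavaScript string closes the script early. The sketch below is not the library's implementation: the class ScriptEndDemo, both method names, and the sample input are hypothetical, and the quote-aware scan is only an assumption about how a lenient scanner might behave.

```java
public class ScriptEndDemo {

    // True for ASCII letters only, which is what may start a tag name.
    private static boolean isAsciiLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // Strict scan: the script body ends at the first "</" followed by a
    // letter, regardless of whether it sits inside a quoted string.
    public static int strictEnd(String body) {
        for (int i = 0; i + 2 < body.length(); i++) {
            if (body.charAt(i) == '<' && body.charAt(i + 1) == '/'
                    && isAsciiLetter(body.charAt(i + 2))) {
                return i;
            }
        }
        return body.length();
    }

    // Lenient scan (assumed behaviour): skip over quoted strings and stop
    // only at a "</script" that appears outside quotes.
    public static int lenientEnd(String body) {
        char quote = 0; // 0 means "not inside a string"
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (quote != 0) {
                if (c == '\\') {
                    i++; // skip the escaped character
                } else if (c == quote) {
                    quote = 0; // closing quote
                }
            } else if (c == '"' || c == '\'') {
                quote = c; // opening quote
            } else if (body.regionMatches(true, i, "</script", 0, 8)) {
                return i;
            }
        }
        return body.length();
    }

    public static void main(String[] args) {
        // Text that follows an opening <script> tag (hypothetical sample).
        String body = "var s = \"<span>hi</span>\";</script><b>after</b>";
        System.out.println("strict : " + body.substring(0, strictEnd(body)));
        System.out.println("lenient: " + body.substring(0, lenientEnd(body)));
    }
}
```

With the sample input, the strict scan cuts the script off in the middle of the string literal, which matches the misplaced </script> described in the report, while the quote-aware scan keeps the whole statement. In the real library the switch is the static field quoted above, org.htmlparser.scanners.ScriptScanner.STRICT.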