I am writing to check whether what I am observing is a parsing bug and,
if so, whether there are any known workarounds.
When JavaScript is being parsed, my NodeVisitor's visitTag method gets
called at the start of the script. As expected, all opening HTML tags
within the JavaScript are ignored, since they are part of the script and
not of the HTML. However, at the first closing tag encountered within
the JavaScript code (even inside a string), the parser treats the script
as closed and calls my visitor's visitEndTag method.
For example, with a "</textarea>" string within the JavaScript, I get two
successive calls to visitEndTag, first with the SCRIPT tag and next with
the TEXTAREA tag. Likewise, with a "</he" + "ad>" string within the
JavaScript, I get two successive calls to visitEndTag, first with the
SCRIPT tag and next with the TEXTAREA tag.
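
To make the symptom concrete, here is a minimal, self-contained Java
sketch of one plausible explanation. It assumes the parser ends script
content at the first "</" followed by a letter, which is the strict
HTML 4 rule for CDATA elements. This is not the parser's actual code;
the class ScriptEndSketch and its methods strictEnd and lenientEnd are
hypothetical names made up for illustration.

```java
import java.util.Locale;

public class ScriptEndSketch {

    // Strict rule (assumed): script content ends at the first "</"
    // that is followed by a letter, whatever the tag name is and
    // whether or not it sits inside a JavaScript string.
    static int strictEnd(String html, int from) {
        for (int i = from; i + 2 < html.length(); i++) {
            if (html.charAt(i) == '<'
                    && html.charAt(i + 1) == '/'
                    && Character.isLetter(html.charAt(i + 2))) {
                return i;
            }
        }
        return -1; // no end-tag opener found
    }

    // Lenient rule: only "</script" (case-insensitive) ends the content.
    static int lenientEnd(String html, int from) {
        return html.toLowerCase(Locale.ROOT).indexOf("</script", from);
    }

    public static void main(String[] args) {
        String html = "<script>var s = \"</textarea>\";</script>";
        int start = "<script>".length();

        // Under the strict rule the script is cut off inside the string.
        System.out.println("strict : "
                + html.substring(start, strictEnd(html, start)));

        // Under the lenient rule the whole statement is kept.
        System.out.println("lenient: "
                + html.substring(start, lenientEnd(html, start)));
    }
}
```

If the parser behaves like strictEnd, that would account for both cases
above: the "</he" fragment also starts with "</" followed by a letter,
so splitting the tag name across two strings does not protect it.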
The behaviour can be reproduced with these two URLs:
1. http://youtube.com/?v=qf4tdOKKWic
2. http://www.deccanherald.com/Content/Oct202007/city2007102031594.asp
Any leads on how to fix this would be much appreciated.
Thanks,
Subbu.