Thread: [nekohtml-user] noframes parsing oddity

I was serializing the DOM of a page containing <noframes> and noticed a
problem. For background, <noframes> was a child of <html>, and the page also
contained a <frameset>, but <noframes> was outside <frameset>. After parsing,
the contents of <noframes> were stored in the DOM as one big, unparsed TEXT
node.

Because the contents were stored as an unparsed TEXT node, all of the markup
inside <noframes> was escaped when the DOM was serialized. That is, this
markup parsed by NekoHTML...

...became this after serialization...
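
The escaping itself can be shown without NekoHTML. The sketch below is only
an illustration of the mechanism, not the original test case: it uses the
JDK's own DOM and Transformer classes, builds the tree by hand, and the class
name NoframesEscapeDemo and method serializeWithTextChild are hypothetical.
It assumes the parser left the <noframes> contents as a single TEXT node, as
described above.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; it does not use NekoHTML at all.
public class NoframesEscapeDemo {

    // Builds <html><noframes>TEXT</noframes></html>, where TEXT is a single
    // DOM text node holding rawMarkup, then serializes the document.
    static String serializeWithTextChild(String rawMarkup) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element html = doc.createElement("html");
            doc.appendChild(html);
            Element noframes = doc.createElement("noframes");
            html.appendChild(noframes);

            // Assumption: this mimics what the parser left behind, i.e. the
            // markup is stored as character data, not as child elements.
            noframes.appendChild(doc.createTextNode(rawMarkup));

            Transformer transformer =
                    TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.METHOD, "xml");
            transformer.setOutputProperty(
                    OutputKeys.OMIT_XML_DECLARATION, "yes");

            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The "<" of the <p> tag is written out as "&lt;" because it lives
        // in a text node rather than in an element node.
        System.out.println(serializeWithTextChild("<p>No frames</p>"));
    }
}
```

Any serializer that follows the usual rules for character data will behave
this way, so the markup can only round-trip if the parser builds element
nodes or the serializer special-cases <noframes>.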

Looking at the change history, this seems to be the result of a "fix" in
release 1.9.14. Specifically, #2854697, which maps to NekoHTML bug #87...

http://sourceforge.net/p/nekohtml/bugs/87/

Can someone explain how this "fixes" anything? Apparently there was a
StackOverflowError that no longer occurs, but the change leaves the DOM in a
state where the serializer needs to take special care to output the contents
of <noframes> the way they were originally written. How does that help? Was
this really the intention of the "fix", or is it a bug/regression?

Note that <noscript> appears to be parsed normally. Why is <noframes>
treated differently?

Jake