#49 Invalid attribute bug in DocumentFragment parsing

Status: closed
Owner: nobody
Labels: scanner (58)
Priority: 5
Updated: 2009-08-27
Created: 2009-01-23
Creator: arshan
Private: No

Marc and I had a conversation last year about this bug (http://sourceforge.net/tracker/index.php?func=detail&aid=1995218&group_id=195122&atid=952178). Basically, if you try to parse the following text in a DocumentFragment parser, you'll get an error:

<a - href="/">link</a>
<a . href="/">link</a>

The error is as follows:

> Caused by: org.w3c.dom.DOMException: INVALID_CHARACTER_ERR: An invalid or illegal XML character is specified.

This is because the XML specification is very strict about which characters are allowed in attribute names. Running the same input through the regular DOMParser doesn't trigger the error because its parsing logic is different.
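The underlying DOM restriction can be reproduced without NekoHTML at all. The following is a minimal sketch, assuming only the JDK's built-in DOM implementation; the class and method names (AttrNameCheck, domRejects) are made up for illustration and are not part of the attached patch or test case.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of NekoHTML or the attached patch.
final class AttrNameCheck {

    // Returns true if the DOM refuses the attribute name with INVALID_CHARACTER_ERR.
    static boolean domRejects(String attrName) throws ParserConfigurationException {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element a = doc.createElement("a");
        try {
            // The DOM checks that the name is a legal XML name before creating the attribute.
            a.setAttribute(attrName, "");
            return false;
        } catch (DOMException e) {
            return e.code == DOMException.INVALID_CHARACTER_ERR;
        }
    }

    public static void main(String[] args) throws Exception {
        // "-" and "." are the stray tokens from the two sample links; "href" is a control.
        for (String name : new String[] {"-", ".", "href"}) {
            System.out.println("'" + name + "' rejected: " + domRejects(name));
        }
    }
}
```

Neither "-" nor "." can start an XML name, so both are refused, while "href" is accepted.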

Anyway, I have attached a patch that creates a new feature, "http://cyberneko.org/html/features/enforce-strict-attribute-names", that does not process XML-illegal attribute names for DocumentFragments. It is off by default at Marc's request. I hope that everyone benefits from this patch.
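The general idea of skipping XML-illegal attribute names can be sketched as follows. This is not the code from the attached patch: it is a standalone illustration using only the JDK DOM, and the names (StrictAttrCopy, copyLegalAttributes) as well as the assumption that the scanner reports the stray "-" as an attribute with an empty value are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical illustration, not the patch itself.
final class StrictAttrCopy {

    // Copies attributes onto target, skipping names the DOM rejects as illegal.
    // Returns the number of attributes that were skipped.
    static int copyLegalAttributes(Map<String, String> scanned, Element target) {
        int skipped = 0;
        for (Map.Entry<String, String> e : scanned.entrySet()) {
            try {
                target.setAttribute(e.getKey(), e.getValue());
            } catch (DOMException ex) {
                if (ex.code != DOMException.INVALID_CHARACTER_ERR) {
                    throw ex; // unrelated DOM failure: do not hide it
                }
                skipped++; // illegal name such as "-" or ".": ignore the attribute
            }
        }
        return skipped;
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element a = doc.createElement("a");

        // Assumed scanner output for: <a - href="/">link</a>
        Map<String, String> scanned = new LinkedHashMap<>();
        scanned.put("-", "");
        scanned.put("href", "/");

        int skipped = copyLegalAttributes(scanned, a);
        System.out.println("skipped=" + skipped + ", href=" + a.getAttribute("href"));
    }
}
```

Relying on the DOM's own check, rather than a hand-written character test, keeps the filter consistent with whatever name rules the DOM implementation enforces.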

I have also attached a non-JUnit test case. To run it in your environment, make sure the file path in the DOMParser test points to the HTML file that I also attached. That particular test is not very important because all it does is show that the DOMParser does not have the same flaw.

Basically, this is the same bug as reported before, but I've introduced a test case and a patch.

Discussion

  • arshan

    arshan - 2009-01-23

    Patch of the necessary files to prevent the bug from manifesting

     
  • Marc Guillemot

    Marc Guillemot - 2009-08-12
    • status: open --> pending
     
  • Marc Guillemot

    Marc Guillemot - 2009-08-12

    Sorry for the delay.

    To fix bug 2828534, attributes with invalid names are now simply ignored in DOMFragmentParser. Isn't that enough?

     
  • SourceForge Robot

    This Tracker item was closed automatically by the system. It was
    previously set to a Pending status, and the original submitter
    did not respond within 14 days (the time period specified by
    the administrator of this Tracker).

     
  • SourceForge Robot

    • status: pending --> closed
     