The attached code tries to use expat to parse the (wide) string
L"\ufeff<?xml version=\"1.0\" encoding=\"UTF-16\"?><root><child\u2070></root>".
When it's run, it produces the output:
Expat returned error 4: not well-formed (invalid token)
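For readers without the attachment, here is a minimal sketch of a comparable reproduction. It is not the attached code: it assumes Python's standard `xml.parsers.expat` binding instead of the C API, and `try_parse` is a hypothetical helper name. The result depends on the Expat version the binding is linked against.

```python
import xml.parsers.expat as expat

# The document from the report. str.encode("utf-16") prepends the BOM
# (U+FEFF) that the original wide string spells out explicitly.
DOC = '<?xml version="1.0" encoding="UTF-16"?><root><child\u2070></root>'


def try_parse(text):
    """Hypothetical helper: return None if Expat accepts the document,
    otherwise a (code, message) pair describing the first error."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(text.encode("utf-16"), True)  # True marks the final chunk
    except expat.ExpatError as err:
        return err.code, expat.ErrorString(err.code)
    return None


if __name__ == "__main__":
    result = try_parse(DOC)
    if result is None:
        print("Expat accepted the document")
    else:
        print("Expat returned error %d: %s" % result)
```

Note that `<child\u2070>` is never closed in the test string, so a parser that accepted the name would still be expected to reject the document, but with a mismatched-tag error rather than an invalid-token one.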
According to the XML 1.0 specification (Fifth Edition), the character \u2070 (SUPERSCRIPT ZERO) is legal in tag names, since it falls in the NameStartChar range [#x2070-#x218F], so I think expat should accept it.
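The claim about \u2070 can be checked against the Fifth Edition name productions. The sketch below transcribes the NameStartChar and NameChar ranges (productions [4] and [4a]) into a table; the function names are made up for illustration and are not part of any library.

```python
# NameStartChar ranges, XML 1.0 (Fifth Edition), production [4].
NAME_START_RANGES = [
    (0x3A, 0x3A), (0x41, 0x5A), (0x5F, 0x5F), (0x61, 0x7A),
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF), (0x370, 0x37D),
    (0x37F, 0x1FFF), (0x200C, 0x200D), (0x2070, 0x218F),
    (0x2C00, 0x2FEF), (0x3001, 0xD7FF), (0xF900, 0xFDCF),
    (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]

# Additional characters allowed after the first one, production [4a].
NAME_EXTRA_RANGES = [
    (0x2D, 0x2D), (0x2E, 0x2E), (0x30, 0x39), (0xB7, 0xB7),
    (0x300, 0x36F), (0x203F, 0x2040),
]


def _in_ranges(ch, ranges):
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


def is_name_start_char(ch):
    """True if ch may begin an XML 1.0 (Fifth Edition) Name."""
    return _in_ranges(ch, NAME_START_RANGES)


def is_name_char(ch):
    """True if ch may appear anywhere in a Name after the first character."""
    return is_name_start_char(ch) or _in_ranges(ch, NAME_EXTRA_RANGES)


def is_name(s):
    """True if s matches production [5]: NameStartChar (NameChar)*."""
    return bool(s) and is_name_start_char(s[0]) and all(map(is_name_char, s[1:]))


if __name__ == "__main__":
    print(is_name_start_char("\u2070"))  # U+2070 lies in [#x2070-#x218F]
    print(is_name("child\u2070"))
```

Under these ranges `child\u2070` is a valid Name, which is the basis for the expectation that a Fifth Edition parser would not report an invalid token at that character.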
I think expat currently conforms to the Fourth Edition of the specification, or an even earlier one.
I see that conformance to XML 1.0 is one of your goals, which I would expect to include the Fifth Edition.