I have a document for which the current xmlwf.exe tells me
"2:3: not well-formed (invalid token)" when called with
-e UTF-8
The character in question is: ᓴ (see also the attachment)
in
<?xml version="1.0" encoding="utf-8"?> <o.ᓴ android:gravity="center_vertical" android:layout_width="wrap_content" android:layout_height="wrap_content" android:divider="?actionBarDivider" android:dividerPadding="12.0dip" xmlns:android="http://schemas.android.com/apk/res/android" xmlns:app="http://schemas.android.com/apk/res-auto" />
According to some online resources such as http://www.opentag.com/xfaq_charrep.htm#char_nonasciitag
Unicode characters in tag names are valid.
Does expat have a problem, or is this not a valid XML file?
However, if it is not actually valid XML, I'd like to patch expat so that it supports these kinds of names anyway.
I guess I'd have to change something in xmltok_impl.c.
Could you give me a small hint on which function to change in order to do that?
Thank you very much in advance
Marc
Patched CHECK_NAME_CASE, works for me now.
Expat supports Unicode, just not the latest version.
It appears that your character is part of the "Unified Canadian Aboriginal Syllabics Extended" Unicode block, which has only been included in Unicode 8.0.
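For comparison, here is a minimal sketch of the NameStartChar rule as the fifth edition of XML 1.0 defines it, which may help when deciding whether a given character is allowed to start a tag name under that edition. This is not Expat's code: `decode_utf8` and `is_name_start_char_5e` are hypothetical helper names made up for this illustration, and the decoder assumes well-formed UTF-8 input without validating it.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode the first code point of a UTF-8 string.
   Assumption: the input is well-formed; no validation is performed. */
static uint32_t decode_utf8(const unsigned char *s)
{
    if (s[0] < 0x80)
        return s[0];
    if ((s[0] & 0xE0) == 0xC0)
        return ((uint32_t)(s[0] & 0x1F) << 6) | (uint32_t)(s[1] & 0x3F);
    if ((s[0] & 0xF0) == 0xE0)
        return ((uint32_t)(s[0] & 0x0F) << 12)
             | ((uint32_t)(s[1] & 0x3F) << 6)
             | (uint32_t)(s[2] & 0x3F);
    return ((uint32_t)(s[0] & 0x07) << 18)
         | ((uint32_t)(s[1] & 0x3F) << 12)
         | ((uint32_t)(s[2] & 0x3F) << 6)
         | (uint32_t)(s[3] & 0x3F);
}

/* NameStartChar ranges from XML 1.0 (Fifth Edition), production [4]. */
static int is_name_start_char_5e(uint32_t cp)
{
    static const struct { uint32_t lo, hi; } ranges[] = {
        { ':', ':' },         { 'A', 'Z' },         { '_', '_' },
        { 'a', 'z' },         { 0xC0, 0xD6 },       { 0xD8, 0xF6 },
        { 0xF8, 0x2FF },      { 0x370, 0x37D },     { 0x37F, 0x1FFF },
        { 0x200C, 0x200D },   { 0x2070, 0x218F },   { 0x2C00, 0x2FEF },
        { 0x3001, 0xD7FF },   { 0xF900, 0xFDCF },   { 0xFDF0, 0xFFFD },
        { 0x10000, 0xEFFFF }
    };
    size_t i;

    for (i = 0; i < sizeof ranges / sizeof ranges[0]; i++) {
        if (cp >= ranges[i].lo && cp <= ranges[i].hi)
            return 1;
    }
    return 0;
}
```

With a UTF-8 encoded source file, `is_name_start_char_5e(decode_utf8((const unsigned char *)"ᓴ"))` should return nonzero, since the syllabics characters fall inside the 0x37F to 0x1FFF range. The check says nothing about which edition or Unicode version a particular Expat release implements.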
Still valid for Expat 2.2.4.
I notice the bug is marked as "private". Marc, any concerns about making it public now?
https://sourceforge.net/p/expat/bugs/525/ is related.
Feel free to make it public.
Thanks, Marc!
Closing as a duplicate of https://sourceforge.net/p/expat/bugs/292/ .