#531 XMLWF rejects XML file

Milestone: Test Required
Status: closed-duplicate
Owner: nobody
Labels: None
Priority: 5
Updated: 2022-01-10
Created: 2015-12-09
Creator: Marc
Private: No

I have a document for which the current xmlwf.exe tells me
"2:3: not well-formed (invalid token)" when called with
-e UTF-8

The character concerned is ᓴ (see also the attachment),
in

<?xml version="1.0" encoding="utf-8"?>
<o.ᓴ android:gravity="center_vertical" android:layout_width="wrap_content" android:layout_height="wrap_content" android:divider="?actionBarDivider" android:dividerPadding="12.0dip"
  xmlns:android="http://schemas.android.com/apk/res/android" xmlns:app="http://schemas.android.com/apk/res-auto" />

According to some online resources, such as http://www.opentag.com/xfaq_charrep.htm#char_nonasciitag,
Unicode characters in tag names are valid.

Does Expat have a problem, or is this not a valid XML file?
If it really is not valid XML, I'd still like to patch Expat so that it supports these kinds of names.
I guess I'd have to change something in xmltok_impl.c.
Could you give me a small hint on which function to change in order to do that?

Thank you very much in advance
Marc
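
For anyone who wants to reproduce this without xmlwf, here is a minimal sketch using Python's xml.parsers.expat module, which wraps Expat. The helper name check_well_formed and the shortened attribute list are assumptions made for illustration, and the exact position reported may depend on the Expat version your Python was built against.

```python
import xml.parsers.expat

# Shortened version of the document from the report (assumption: the
# attributes do not matter for the error, only the element name does).
DOC = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<o.ᓴ android:gravity="center_vertical"\n'
    '  xmlns:android="http://schemas.android.com/apk/res/android" />\n'
).encode("utf-8")


def check_well_formed(data: bytes):
    """Return None if Expat accepts the data, else (line, column, message)."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True marks this as the final chunk
    except xml.parsers.expat.ExpatError as err:
        message = xml.parsers.expat.ErrorString(err.code)
        return (err.lineno, err.offset, message)
    return None


if __name__ == "__main__":
    # The report above saw the error at line 2, column 3.
    print(check_well_formed(DOC))
```

Replacing the element name with a plain ASCII one (for example `o.x`) should make the same helper return None, which narrows the problem down to the single character.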

1 Attachment

Discussion

  • Marc

    Marc - 2015-12-09

    Patched CHECK_NAME_CASE, works for me now.

     
  • Karl Waclawek

    Karl Waclawek - 2015-12-10

    Expat supports Unicode, just not the latest version.
    It appears that your character is part of the "Unified Canadian Aboriginal Syllabics Extended" Unicode block, which has only been included in Unicode 8.0.
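
Whether such a name is legal depends on which edition of XML 1.0 a parser follows. Expat describes itself as a parser for XML 1.0 Fourth Edition, whose name characters are listed explicitly, while the Fifth Edition defines them by broad code point ranges. The sketch below checks a name against the Fifth Edition NameStartChar and NameChar productions only; is_name_5e is a hypothetical helper and says nothing about what Expat itself accepts.

```python
# NameStartChar ranges from the XML 1.0 (Fifth Edition) grammar.
NAME_START_RANGES = [
    (0x3A, 0x3A), (0x41, 0x5A), (0x5F, 0x5F), (0x61, 0x7A),
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF), (0x370, 0x37D),
    (0x37F, 0x1FFF), (0x200C, 0x200D), (0x2070, 0x218F),
    (0x2C00, 0x2FEF), (0x3001, 0xD7FF), (0xF900, 0xFDCF),
    (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]

# Additional characters allowed after the first one (NameChar):
# "-", ".", digits, U+00B7, combining marks, U+203F-U+2040.
NAME_EXTRA_RANGES = [
    (0x2D, 0x2E), (0x30, 0x39), (0xB7, 0xB7),
    (0x300, 0x36F), (0x203F, 0x2040),
]


def _in_ranges(code_point, ranges):
    return any(low <= code_point <= high for low, high in ranges)


def is_name_5e(name: str) -> bool:
    """Check a name against the Fifth Edition Name production."""
    if not name:
        return False
    if not _in_ranges(ord(name[0]), NAME_START_RANGES):
        return False
    allowed = NAME_START_RANGES + NAME_EXTRA_RANGES
    return all(_in_ranges(ord(ch), allowed) for ch in name[1:])


if __name__ == "__main__":
    for candidate in ("o.ᓴ", "android:gravity", "1abc"):
        print(candidate, is_name_5e(candidate))
```

A name can pass this check and still be rejected by a parser that implements an earlier edition, which would match the behaviour described in this ticket.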

     
  • Sebastian Pipping

    Still valid for Expat 2.2.4.
    I notice the bug is marked as "private". Marc, any concerns about making it public now?

     
  • Marc

    Marc - 2017-09-03

    Feel free to make it public.

     
  • Sebastian Pipping

    • private: Yes --> No
     
  • Sebastian Pipping

    Thanks, Marc!

     
  • Sebastian Pipping

    • status: open --> closed-duplicate
     
  • Sebastian Pipping

    Closing as a duplicate of https://sourceforge.net/p/expat/bugs/292/ .

     
