Hi,
I am using expat-1.95.5 or expat-2.1.0 on Windows XP SP3.
I get an "XML_ERROR_UNCLOSED_TOKEN" error when I parse an XML file that contains particular Chinese characters:
级 (0xe7 0xB4 0x9A)
通 (0xe9 0x80 0x9A)
Without these characters, all other Chinese characters are parsed correctly.
I am using the "UTF-16" encoding.
Do you have any clue?
Thx
Some more information:
In UTF-16 (little-endian byte order), these Chinese characters are encoded like this:
级 (0x1A 0x7D)
通 (0x1A 0x90)
The problem seems to be the "0x1A" byte, which is the EOF (Ctrl-Z) character and is an illegal character according to the XML specification.
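
The byte values listed above can be reproduced with a short Python sketch. It assumes the code points implied by those bytes (U+7D1A and U+901A); the helper names `utf16le_bytes` and `contains_sub_byte` are made up for illustration.

```python
def utf16le_bytes(code_point: int) -> bytes:
    """Encode a single code point as UTF-16 little-endian (no BOM)."""
    return chr(code_point).encode("utf-16-le")


def contains_sub_byte(data: bytes) -> bool:
    """Return True if the byte 0x1A (SUB / Ctrl-Z) occurs anywhere in data."""
    return 0x1A in data


# Code points taken from the byte sequences quoted in this report.
for cp in (0x7D1A, 0x901A):
    utf16 = utf16le_bytes(cp)
    utf8 = chr(cp).encode("utf-8")
    print(f"U+{cp:04X}", "UTF-16LE:", utf16.hex(" "), "UTF-8:", utf8.hex(" "),
          "contains 0x1A:", contains_sub_byte(utf16))
```

Any code point whose low byte is 0x1A will put a 0x1A byte into the UTF-16 stream, which would explain why only some characters are affected.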
I have copied this bug report to GitHub now: https://github.com/libexpat/libexpat/issues/143
I had a closer look now and consider the cause to be outside of Expat, see https://github.com/libexpat/libexpat/issues/143#issuecomment-328603563 . Please re-open the ticket on GitHub with further details if my analysis missed anything. Thanks!
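
For anyone who wants to check this themselves, here is a minimal sketch using Python's `xml.parsers.expat` binding, which wraps Expat. It feeds a UTF-16 document containing U+7D1A and U+901A to the parser as raw bytes, then feeds the same bytes cut off at the first 0x1A. The truncation is only an assumption about what could happen outside Expat, for example when a file is read in text mode on Windows, where Ctrl-Z is treated as end-of-file on input. It is not confirmed as the cause in this report. The helper name `parse_text` and the sample document are made up for illustration.

```python
import xml.parsers.expat


def parse_text(data: bytes) -> str:
    """Feed raw bytes to Expat in one call and return the collected character data."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    # Character data may arrive in several pieces, so collect and join them.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(data, True)
    return "".join(chunks)


# Hypothetical sample document: BOM followed by UTF-16 little-endian content.
doc = '<?xml version="1.0" encoding="UTF-16"?><a>\u7d1a\u901a</a>'
raw = b"\xff\xfe" + doc.encode("utf-16-le")

# Complete binary input: the characters come back intact.
print(parse_text(raw) == "\u7d1a\u901a")

# Assumed failure mode: input cut off at the first 0x1A byte.
truncated = raw[: raw.index(0x1A)]
try:
    parse_text(truncated)
except xml.parsers.expat.ExpatError as exc:
    print("truncated input fails:", exc)
```

If the complete bytes parse but the truncated ones do not, the file reading code is worth checking first, for example whether the file is opened in binary mode.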