From: Mark S. <ma...@Sc...> - 2006-10-02 20:59:02
Jimmy Zhang wrote:
> Is ^L a valid XML character? What is its value in the UCS?
> Does Xerces have problem with this char??

Yeah, I can see I wasn't clear; I did look up ^L and didn't find anything, so I was hoping you would just know :-)

OK, I used hexdump and found the value of the offending character: 0x0C (form feed). This is not a valid XML character. Below 0x20 the only valid characters are 0x09, 0x0A and 0x0D; 0x20 and up is allowed, apart from the surrogate block and 0xFFFE/0xFFFF:
http://www.w3.org/TR/REC-xml/#charsets

So for now I have to write a method, removeAsciiControl(), that removes every byte < 0x20 except 0x09, 0x0A and 0x0D. Only then can I pass the cleaned-up data to VTD. I'd like to avoid this overhead; ideally VTD would just ignore invalid XML characters, which would save me from creating a buffer and cleaning the data manually.

Thank you.
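
For reference, here is a minimal sketch of what such a removeAsciiControl() could look like. It is not code from this thread: the class name and the (data, length) signature are made up for illustration. It assumes the document is in an ASCII-compatible encoding such as US-ASCII, ISO-8859-1 or UTF-8, where a byte below 0x20 can only be a control character; it would corrupt UTF-16 input. It compacts the array in place, so no second buffer is needed.

```java
import java.nio.charset.StandardCharsets;

class ControlCharFilter {
    /**
     * Hypothetical helper: compacts data[0..length) in place, dropping every
     * byte below 0x20 except tab (0x09), line feed (0x0A) and carriage
     * return (0x0D). Returns the number of bytes kept.
     * Assumes an ASCII-compatible encoding (not UTF-16).
     */
    static int removeAsciiControl(byte[] data, int length) {
        int kept = 0;
        for (int i = 0; i < length; i++) {
            // Java bytes are signed, so mask to 0..255 before comparing;
            // otherwise bytes >= 0x80 would look "less than 0x20".
            int b = data[i] & 0xFF;
            boolean allowed = b >= 0x20 || b == 0x09 || b == 0x0A || b == 0x0D;
            if (allowed) {
                data[kept++] = data[i];
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // "\f" is the form feed (0x0C) that triggered the parse error.
        byte[] doc = "<a>one\ftwo</a>".getBytes(StandardCharsets.UTF_8);
        int n = removeAsciiControl(doc, doc.length);
        System.out.println(new String(doc, 0, n, StandardCharsets.UTF_8));
    }
}
```

The returned length rather than a new array is a deliberate choice here: the cleaned bytes sit at the front of the original buffer, so they can be handed to a parser entry point that accepts an offset and a length. Whether your VTD-XML version offers such an overload is worth checking in its javadoc.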