From: Tatu S. <cow...@ya...> - 2006-08-22 01:05:30
--- Jimmy Zhang <cra...@co...> wrote:
> How many different kind of encoding does woodstox
> currently support??

Natively just a couple (UTF-8, ISO-8859-1, UTF-32): the first two for
performance reasons, the third because the JDK doesn't support it. Others
are handled by using a JDK-constructed Reader. EBCDIC doesn't work yet; I'm
not sure what would be the best way to handle it (the JDK may have some
decoders, but I don't know exactly which ones to use, nor do I have many
sample docs).

It's nice to be able to use JDK decoders as a fallback. With NIO, perhaps
VTD-XML could use them too?

I have been thinking of writing my own UTF-16 decoder/encoder, for
performance reasons and to be able to do better character validation, but
haven't had time.

> I am thinking about adding more encoding support
> right now VTD-XML supports
> support UTF8 ascii iso-8859 UTF-16BE and UTF-16LE

That's a reasonable starting point, I think. UTF-32 is quite easy to
support, but I don't know if many use it (there were some test docs, though,
in the XMLTest test suite). Other ISO-8859-x encodings beyond -1 might be
easy too: you just have to map byte values 128 - 255 to other parts of the
Unicode tables (I think).

-+ Tatu +-
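
For illustration, the ISO-8859-x idea mentioned above (mapping byte values 128 - 255 to other parts of Unicode) could be sketched in Java roughly as follows. This is only a sketch under assumptions: the class name SingleByteTable is hypothetical and not part of Woodstox or VTD-XML, and it assumes the running JDK ships the requested charset (only a handful, such as ISO-8859-1 and UTF-8, are guaranteed). The JDK decoder is used once, as a fallback, to fill a 256-entry table; after that, decoding is a plain array lookup per byte.

```java
import java.nio.charset.Charset;

// Hypothetical helper for illustration; not taken from Woodstox or VTD-XML.
final class SingleByteTable {
    // table[b] holds the Unicode character for byte value b (0..255).
    private final char[] table = new char[256];

    SingleByteTable(String charsetName) {
        // Throws UnsupportedCharsetException if this JDK lacks the charset.
        Charset cs = Charset.forName(charsetName);
        byte[] all = new byte[256];
        for (int i = 0; i < 256; i++) {
            all[i] = (byte) i;
        }
        // Let the JDK decoder say where each byte value lands in Unicode.
        String decoded = new String(all, cs);
        if (decoded.length() != 256) {
            throw new IllegalArgumentException(
                charsetName + " does not look like a single-byte encoding");
        }
        decoded.getChars(0, 256, table, 0);
    }

    // Decodes length bytes starting at offset using only the lookup table.
    String decode(byte[] input, int offset, int length) {
        char[] out = new char[length];
        for (int i = 0; i < length; i++) {
            out[i] = table[input[offset + i] & 0xFF];
        }
        return new String(out);
    }
}
```

With a table built for "ISO-8859-2", for example, the byte 0xA1 decodes to U+0104 rather than the U+00A1 that ISO-8859-1 would give, while bytes below 128 stay plain ASCII.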