Anonymous - 2013-08-10

Originally posted by: tavultes...@gmail.com

Per [RFC4627], this could be solved by detecting the encoding of the JSON string before decoding and converting it to UnicodeString (or WideString if you are stuck in the bad old days), after which the encoding issues go away. However, this would require considerable modifications to the parser, as it currently assumes AnsiString for memory management.

The alternative is to convert to UTF-8, but as this is not a native Delphi string format, you take on the messy task of validating the UTF-8 as you go.

As it stands, both the parser and the generator are non-compliant.

The relevant encoding section of [RFC4627]:

3.  Encoding

   JSON text SHALL be encoded in Unicode.  The default encoding is
   UTF-8.

   Since the first two characters of a JSON text will always be ASCII
   characters [RFC0020], it is possible to determine whether an octet
   stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
   at the pattern of nulls in the first four octets.

           00 00 00 xx  UTF-32BE
           00 xx 00 xx  UTF-16BE
           xx 00 00 00  UTF-32LE
           xx 00 xx 00  UTF-16LE
           xx xx xx xx  UTF-8
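
For illustration, the null-pattern check in the table above can be sketched as follows. This is Python rather than Delphi, purely to show the logic. The function names `detect_json_encoding` and `decode_json_bytes` are hypothetical and are not part of the parser under discussion. The sketch assumes the input has no byte order mark, and that falling back to UTF-8 is acceptable for inputs shorter than four octets.

```python
def detect_json_encoding(data: bytes) -> str:
    """Guess the encoding of a JSON octet stream from the nulls in its first four octets."""
    head = data[:4]
    if len(head) < 4:
        # Assumption: too short for the four-octet pattern, so use the RFC default.
        return "utf-8"

    # True where the octet is 00, False where it is non-zero (the "xx" in the table).
    nulls = tuple(octet == 0 for octet in head)

    patterns = {
        (True, True, True, False): "utf-32-be",   # 00 00 00 xx
        (True, False, True, False): "utf-16-be",  # 00 xx 00 xx
        (False, True, True, True): "utf-32-le",   # xx 00 00 00
        (False, True, False, True): "utf-16-le",  # xx 00 xx 00
    }
    # Any other pattern, including no nulls at all, is treated as UTF-8.
    return patterns.get(nulls, "utf-8")


def decode_json_bytes(data: bytes) -> str:
    """Convert raw JSON octets to a native Unicode string before parsing."""
    # bytes.decode raises UnicodeDecodeError on malformed input.
    return data.decode(detect_json_encoding(data))
```

A Delphi version would follow the same shape: inspect the first four bytes, pick the encoding, then convert to UnicodeString before handing the text to the parser.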