Re: encoding of json

On 3 June 2015 at 17:58, Ivan Herman <ivan@w3.org> wrote:
> I have just heard the remark from Addison that JSON is defined in terms of JavaScript, meaning that the encoding is utf-16 and not utf-8! This seems to be in line with
>
> http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf

I believe JSON is inspired by JavaScript, not defined in terms of it.

"3. Encoding

   JSON text SHALL be encoded in Unicode.  The default encoding is
   UTF-8.

   Since the first two characters of a JSON text will always be ASCII
   characters [RFC0020], it is possible to determine whether an octet
   stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
   at the pattern of nulls in the first four octets.

           00 00 00 xx  UTF-32BE
           00 xx 00 xx  UTF-16BE
           xx 00 00 00  UTF-32LE
           xx 00 xx 00  UTF-16LE
           xx xx xx xx  UTF-8"
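
For illustration, the null-pattern test quoted above could be sketched in Python roughly as follows. This is only a sketch, not anything taken from the RFC: the function name detect_json_encoding is made up, the returned strings are Python codec names, and falling back to UTF-8 for inputs shorter than four octets is my own assumption. It also assumes there is no byte order mark at the start of the stream.

```python
def detect_json_encoding(data: bytes) -> str:
    """Guess the encoding of a JSON text from the nulls in its first four octets.

    Hypothetical helper; follows the table in RFC 4627 section 3.
    """
    if len(data) < 4:
        # Assumption: too short for the four-octet test, so use the default.
        return "utf-8"

    # True where the octet is null (00), False where it is non-null (xx).
    nulls = tuple(octet == 0 for octet in data[:4])

    patterns = {
        (True, True, True, False): "utf-32-be",    # 00 00 00 xx
        (True, False, True, False): "utf-16-be",   # 00 xx 00 xx
        (False, True, True, True): "utf-32-le",    # xx 00 00 00
        (False, True, False, True): "utf-16-le",   # xx 00 xx 00
    }
    # Anything else (including xx xx xx xx) is treated as UTF-8.
    return patterns.get(nulls, "utf-8")


if __name__ == "__main__":
    sample = '{"name": "value"}'
    for codec in ("utf-8", "utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le"):
        print(codec, "->", detect_json_encoding(sample.encode(codec)))
```

The test only works because RFC 4627 restricts a JSON text to an object or an array, so the first two characters are always ASCII.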

"6. IANA Considerations

   The MIME media type for JSON text is application/json.

   Type name: application

   Subtype name: json

   Required parameters: n/a

   Optional parameters: n/a

   Encoding considerations: 8bit if UTF-8; binary if UTF-16 or UTF-32

      JSON may be represented using UTF-8, UTF-16, or UTF-32.  When JSON
      is written in UTF-8, JSON is 8bit compatible.  When JSON is
      written in UTF-16 or UTF-32, the binary content-transfer-encoding
      must be used."


Both excerpts are from RFC 4627:
http://www.ietf.org/rfc/rfc4627.txt

Couldn't find much in the JSON-LD spec on these issues.

Dan

Received on Wednesday, 3 June 2015 16:10:56 UTC