Re: encoding of json

From: Dan Brickley <danbri@google.com>
Date: Wed, 3 Jun 2015 18:10:27 +0200
Message-ID: <CAK-qy=5djuAVmvJiJ=Q-pCZeCkgO-h-bXWMza7GxMV22xS7Gbg@mail.gmail.com>
To: Ivan Herman <ivan@w3.org>
Cc: Jeremy Tandy <jeremy.tandy@gmail.com>, W3C CSV on the Web Working Group <public-csv-wg@w3.org>
On 3 June 2015 at 17:58, Ivan Herman <ivan@w3.org> wrote:
> I have just heard the remark from Addison that JSON is defined in terms of JavaScript, meaning that the encoding is utf-16 and not utf-8! This seems to be in line with
>
> http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf

I believe JSON is inspired by JavaScript, not defined in terms of it.

"3. Encoding

   JSON text SHALL be encoded in Unicode.  The default encoding is
   UTF-8.

   Since the first two characters of a JSON text will always be ASCII
   characters [RFC0020], it is possible to determine whether an octet
   stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
   at the pattern of nulls in the first four octets.

           00 00 00 xx  UTF-32BE
           00 xx 00 xx  UTF-16BE
           xx 00 00 00  UTF-32LE
           xx 00 xx 00  UTF-16LE
           xx xx xx xx  UTF-8"
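
To make the null-pattern check in the section quoted above concrete, here is a minimal Python sketch. The function name detect_json_encoding is hypothetical, and the fallback to UTF-8 for inputs shorter than four octets is an assumption of this sketch rather than something the quoted text specifies. It also assumes the stream carries no byte order mark.

```python
def detect_json_encoding(data: bytes) -> str:
    """Guess the Unicode encoding of a JSON text from its first four octets.

    Follows the null-pattern table quoted above. Returns a Python codec name.
    """
    if len(data) < 4:
        # Assumption: too short to apply the four-octet pattern,
        # so fall back to the stated default encoding.
        return "utf-8"

    # True where the octet is 0x00, False where it is non-zero ("xx").
    nulls = tuple(octet == 0 for octet in data[:4])

    patterns = {
        (True, True, True, False): "utf-32-be",   # 00 00 00 xx
        (True, False, True, False): "utf-16-be",  # 00 xx 00 xx
        (False, True, True, True): "utf-32-le",   # xx 00 00 00
        (False, True, False, True): "utf-16-le",  # xx 00 xx 00
    }
    # Any other pattern, including xx xx xx xx, is treated as UTF-8.
    return patterns.get(nulls, "utf-8")
```

A caller could then decode with the detected codec, for example raw.decode(detect_json_encoding(raw)), before handing the text to a JSON parser.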

"6. IANA Considerations

   The MIME media type for JSON text is application/json.

   Type name: application

   Subtype name: json

   Required parameters: n/a

   Optional parameters: n/a

   Encoding considerations: 8bit if UTF-8; binary if UTF-16 or UTF-32

      JSON may be represented using UTF-8, UTF-16, or UTF-32.  When JSON
      is written in UTF-8, JSON is 8bit compatible.  When JSON is
      written in UTF-16 or UTF-32, the binary content-transfer-encoding
      must be used."


Both excerpts above are from RFC 4627: http://www.ietf.org/rfc/rfc4627.txt

Couldn't find much in the JSON-LD spec on these issues.

Dan
Received on Wednesday, 3 June 2015 16:10:56 UTC