- From: Dan Brickley <danbri@google.com>
- Date: Wed, 3 Jun 2015 18:10:27 +0200
- To: Ivan Herman <ivan@w3.org>
- Cc: Jeremy Tandy <jeremy.tandy@gmail.com>, W3C CSV on the Web Working Group <public-csv-wg@w3.org>
On 3 June 2015 at 17:58, Ivan Herman <ivan@w3.org> wrote:
> I have just heard the remark from Addison that JSON is defined in terms of JavaScript, meaning that the encoding is utf-16 and not utf-8! This seems to be in line with
>
> http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf
I believe it is inspired by JavaScript, not defined in terms of it. From RFC 4627:
"3. Encoding
JSON text SHALL be encoded in Unicode. The default encoding is
UTF-8.
Since the first two characters of a JSON text will always be ASCII
characters [RFC0020], it is possible to determine whether an octet
stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
at the pattern of nulls in the first four octets.
    00 00 00 xx  UTF-32BE
    00 xx 00 xx  UTF-16BE
    xx 00 00 00  UTF-32LE
    xx 00 xx 00  UTF-16LE
    xx xx xx xx  UTF-8"
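For illustration, the null-pattern check quoted above could be sketched in Python roughly as follows. The function name `detect_json_encoding` is hypothetical, and the fallback to UTF-8 for inputs shorter than four octets is my assumption, not something the RFC spells out. It also relies on the RFC 4627 premise that a JSON text starts with two ASCII characters and carries no byte order mark.

```python
def detect_json_encoding(data: bytes) -> str:
    """Guess the encoding of a JSON text from the nulls in its first four octets.

    Hypothetical helper following the table in RFC 4627, section 3.
    """
    if len(data) < 4:
        # Assumption: too short to apply the four-octet table, use the default.
        return "utf-8"

    # True where the octet is a null byte, False otherwise.
    nulls = tuple(octet == 0 for octet in data[:4])

    patterns = {
        (True, True, True, False): "utf-32-be",   # 00 00 00 xx
        (True, False, True, False): "utf-16-be",  # 00 xx 00 xx
        (False, True, True, True): "utf-32-le",   # xx 00 00 00
        (False, True, False, True): "utf-16-le",  # xx 00 xx 00
    }
    # Anything else (including xx xx xx xx) is treated as UTF-8.
    return patterns.get(nulls, "utf-8")


if __name__ == "__main__":
    text = '{"name": "csv"}'
    for codec in ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"):
        encoded = text.encode(codec)
        guessed = detect_json_encoding(encoded)
        print(codec, "->", guessed, encoded.decode(guessed) == text)
```

The result can be passed straight to `bytes.decode`, since the returned strings are standard Python codec names.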
"6. IANA Considerations
The MIME media type for JSON text is application/json.
Type name: application
Subtype name: json
Required parameters: n/a
Optional parameters: n/a
Encoding considerations: 8bit if UTF-8; binary if UTF-16 or UTF-32
JSON may be represented using UTF-8, UTF-16, or UTF-32. When JSON
is written in UTF-8, JSON is 8bit compatible. When JSON is
written in UTF-16 or UTF-32, the binary content-transfer-encoding
must be used."
http://www.ietf.org/rfc/rfc4627.txt
Couldn't find much in the JSON-LD spec on these issues.
Dan
Received on Wednesday, 3 June 2015 16:10:56 UTC