- From: Pete Cordell <petejson@codalogic.com>
- Date: Fri, 22 Nov 2013 19:28:14 -0000
- To: "Matt Miller (mamille2)" <mamille2@cisco.com>, "JSON WG" <json@ietf.org>
- Cc: <www-tag@w3.org>, "es-discuss" <es-discuss@mozilla.org>
----- Original Message From: "Matt Miller (mamille2)"
> There does seem to be rough consensus that using an encoding
> other than UTF-8 can have interoperability issues. There also
> seems to be rough consensus that the current text and table
> in section 8.1 for detecting the encoding will be inaccurate
> (and potentially harmful).
>
> That appears to mean the approach with the most consensus is
> to remove the encoding detection entirely, leaving only:
>
> """"
> JSON text SHALL be encoded in Unicode. The default encoding is
> UTF-8.
> """"
I think we can be a little more helpful here. For example, something along
the lines of:
JSON text is a sequence of Unicode codepoints. The transfer encoding
used to represent those characters on-the-wire is beyond the scope of
this document. It is therefore up to the specifications that reference
this document to specify whether JSON messages will be transferred
using UTF-8 (recommended), UTF-16 and/or UTF-32 (discouraged), and
whether preceding BOMs must be present, must not be present or are
optional.

If multiple encodings are permitted, implementers may choose to
auto-detect a message's encoding by exploiting the fact that the first
character of a JSON text must be in the ASCII character range and use
the following table to deduce the active encoding:
00 00 -- -- UTF-32BE
00 xx -- -- UTF-16BE
xx 00 00 00 UTF-32LE
xx 00 00 xx UTF-16LE
xx 00 xx -- UTF-16LE
xx xx -- -- UTF-8
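
For illustration only, here is a rough sketch in Python of how the table
above might be applied. The function name detect_json_encoding is made up
for this example. It assumes the input carries no BOM and reads "xx" as
a non-zero byte and "--" as any byte. The table says nothing about inputs
shorter than four bytes, so the handling of those is my own assumption.

```python
from typing import Optional


def detect_json_encoding(data: bytes) -> Optional[str]:
    """Guess the encoding of a BOM-less JSON text from its leading bytes.

    Returns a Python codec name, or None if the input is empty or
    starts with a lone zero byte (cases the table does not cover).
    """
    if len(data) >= 2:
        if data[0] == 0:
            # 00 00 -- --  => UTF-32BE
            # 00 xx -- --  => UTF-16BE
            return "utf-32-be" if data[1] == 0 else "utf-16-be"
        if data[1] != 0:
            # xx xx -- --  => UTF-8
            return "utf-8"
        # Remaining rows all start with xx 00.
        if len(data) >= 4 and data[2] == 0 and data[3] == 0:
            # xx 00 00 00  => UTF-32LE
            return "utf-32-le"
        # xx 00 00 xx and xx 00 xx --  => UTF-16LE
        # (assumption: a two-byte "xx 00" input is also treated as UTF-16LE)
        return "utf-16-le"
    if len(data) == 1 and data[0] != 0:
        # Assumption: a single non-zero byte such as b"1" can only be UTF-8.
        return "utf-8"
    return None
```

The returned strings are Python codec names, so a caller could write
data.decode(detect_json_encoding(data)) once a non-None result is in hand.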
Pete Cordell
Codalogic Ltd
C++ tools for C++ programmers, http://codalogic.com
Read & write XML in C++, http://www.xml2cpp.com
Received on Friday, 22 November 2013 19:27:30 UTC