- From: Nico Williams <nico@cryptonector.com>
- Date: Tue, 26 Nov 2013 18:11:31 -0600
- To: Carsten Bormann <cabo@tzi.org>
- Cc: JSON WG <json@ietf.org>, Bjoern Hoehrmann <derhoermi@gmx.net>, www-tag <www-tag@w3.org>, es-discuss <es-discuss@mozilla.org>
On Wed, Nov 27, 2013 at 12:20:25AM +0100, Carsten Bormann wrote:
> On 27 Nov 2013, at 00:07, Nico Williams <nico@cryptonector.com> wrote:
> > Do you want to say anything about other encodings? What would that be?
>
> JSON is encoded in UTF-8.
>
> There is no need to discuss JSON in other encodings, because it
> wouldn’t be JSON.

Thanks.

My opinion as to MIME contexts:

I'm not opposed to saying that the application/json media type requires
UTF-8. Others have objected, and I believe the WG consensus to be that
the application/json media type allows all of UTF-8/16/32. I believe we
should settle for an interop note observing that UTF-8 has the best
interoperability, and a recommendation that UTF-8 be used.

My opinion as to non-MIME contexts:

I'm not opposed to recommending that JSON texts for interchange in
non-MIME contexts be encoded in UTF-8, and I'm not opposed to requiring
that use of any other encoding be expressed as metadata. I do object to
requiring that under all circumstances (even in non-MIME contexts)
UTF-8 must be used.

> (And no, I see no need to handle UTF-16LE, UTF-16BE, UTF-32LE or
> UTF-32BE in any special way, even if RFC 4627 was written at a time
> when it still seemed useful to pay them lip service. But I recognize
> that there appears to be WG consensus to keep these corpses on life
> support, maybe because UTF-16 is the internal encoding of the
> programming language that gave JSON its name.)

Right, that appears to be the consensus, and more than that, it seems
extremely unlikely to change. Assuming *that*, what are you willing to
settle for?

Nico

PS: Back to my hypo... If my hypothetical JSON-using shell were to
escape all non-ASCII characters in JSON string values, then encode the
JSON text in UTF-8, then convert the result to the current locale's
codeset (doing the reverse to parse), and the resulting texts never
leaked to other locales, why should anyone care? Most (but not all)
non-Unicode locales use ASCII-compatible codesets, thus the result
would be "proper" JSON texts in most cases anyway.

As to why one might want to do that: because JSON texts are... *text*,
i.e., editable in your favorite $EDITOR, readable with your favorite
$PAGER, and so on. It might be a problem if such texts leaked outside
that locale, but we already have that problem in spades, and no JSON
parser would be called upon to auto-detect any encodings other than
UTF-8/16/32.
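PPS: Here's a minimal sketch of the hypo in Python (the shell itself is
hypothetical, so the function names and the EUC-JP codeset below are
made up for illustration). json.dumps() with ensure_ascii=True escapes
every non-ASCII character in string values as \uXXXX, so the serialized
text is pure ASCII and survives conversion to any ASCII-compatible
locale codeset:

    import json

    def to_locale(obj, codeset="EUC-JP"):
        # Serialize with all non-ASCII escaped; the result is ASCII-only.
        text = json.dumps(obj, ensure_ascii=True)
        # For an ASCII-compatible codeset this conversion is byte-for-byte
        # identical to the UTF-8 encoding, so it cannot fail.
        return text.encode(codeset)

    def from_locale(data, codeset="EUC-JP"):
        # Reverse: decode from the locale codeset, then parse; the
        # \uXXXX escapes restore the original non-ASCII characters.
        return json.loads(data.decode(codeset))

    doc = {"greeting": "こんにちは"}
    wire = to_locale(doc)   # b'{"greeting": "\\u3053\\u3093\\u306b\\u3061\\u306f"}'
    assert from_locale(wire) == doc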
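PPPS: And for reference, the auto-detection in question is just the
RFC 4627, section 3 trick: under that grammar the first two characters
of a JSON text are always ASCII, so the pattern of NUL octets in the
first four octets tells you which of UTF-8/16/32 (BE/LE) you have. A
sketch, again in Python:

    def detect_encoding(octets):
        b = octets[:4]
        if len(b) < 4:
            return "utf-8"        # too short to tell; assume UTF-8
        if b[0] == 0 and b[1] == 0:
            return "utf-32-be"    # 00 00 00 xx
        if b[0] == 0:
            return "utf-16-be"    # 00 xx 00 xx
        if b[1] == 0 and b[2] == 0:
            return "utf-32-le"    # xx 00 00 00
        if b[1] == 0:
            return "utf-16-le"    # xx 00 xx 00
        return "utf-8"            # xx xx xx xx

    assert detect_encoding('["x"]'.encode("utf-16-le")) == "utf-16-le"
    assert detect_encoding('["x"]'.encode("utf-8")) == "utf-8"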
Received on Wednesday, 27 November 2013 00:11:55 UTC