- From: John Cowan <cowan@ccil.org>
- Date: Mon, 2 Nov 2009 22:54:35 -0500
- To: "Phillips, Addison" <addison@amazon.com>
- Cc: John Cowan <cowan@ccil.org>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>
Phillips, Addison scripsit:

> Our W3C WG is discussing this in our TPAC meeting today and have a
> couple of questions about your note. Could you help us understand the
> following comments:
>
> > Worse yet, the JSON RFC is self-contradictory, with the result that
> > it's not even clear that CESU-8-encoded JSON is illegal.
>
> Can you point out where you think the JSON spec is broken?

I was too hasty. It's better to say that JSON is not internally
self-consistent.

JSON documents MUST be Unicode, and section 3 suggests (without saying)
that the valid encodings are UTF-8 and UTF-{16,32}{LE,BE}, apparently
forbidding any BOM. In the underlying JSON data model, however, strings
need not be sequences of Unicode characters, since unpaired surrogates
are permitted in them.

Worse, the ES5 built-in encoder is not required to escape unpaired
surrogates on output, so it's possible for ES5-conforming implementations
to produce output that is not valid in any of the five encodings.
(Fortunately, it's permitted to encode any character on output.)

-- 
John Cowan  cowan@ccil.org  http://ccil.org/~cowan
Rather than making ill-conceived suggestions for improvement based on
uninformed guesses about established conventions in a field of study with
which familiarity is limited, it is sometimes better to stick to merely
observing the usage and listening to the explanations offered, inserting
only questions as needed to fill in gaps in understanding.
        --Peter Constable
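The unpaired-surrogate problem described in the message can be sketched in a few lines. This is only an illustration, and it uses Python's standard `json` module as a stand-in; it is not the ES5 encoder discussed above. The helper `encodable_in` and the `ENCODINGS` list are names invented for this example, and the list simply mirrors the five encodings the message reads into section 3 of the RFC.

```python
import json

# The five encodings the message reads into section 3 of the JSON RFC.
ENCODINGS = ["utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"]


def encodable_in(text, encodings=ENCODINGS):
    """Return the encodings in which `text` can be serialized to bytes."""
    ok = []
    for name in encodings:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            # Unpaired surrogates are rejected by all UTF encoders.
            continue
        ok.append(name)
    return ok


# A JSON text whose string value is a single unpaired high surrogate.
# The \u escape keeps the JSON text itself pure ASCII.
value = json.loads('"\\ud800"')

# Escaping on output (ensure_ascii=True is the default) keeps the text ASCII.
escaped = json.dumps(value)

# Without escaping, the unpaired surrogate appears literally in the output.
raw = json.dumps(value, ensure_ascii=False)

print(repr(escaped), encodable_in(escaped))
print(repr(raw), encodable_in(raw))
```

Assuming CPython 3.4 or later, the escaped form should be encodable in all five encodings, while the unescaped form should be encodable in none of them. That matches the point above: an encoder that does not escape unpaired surrogates can emit text that is not valid in any permitted encoding, and escaping on output avoids the problem.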
Received on Tuesday, 3 November 2009 03:55:08 UTC