Re: Changes to DOM3 Events Key Identifiers

From: John Cowan <cowan@ccil.org>
Date: Mon, 2 Nov 2009 22:54:35 -0500
To: "Phillips, Addison" <addison@amazon.com>
Cc: John Cowan <cowan@ccil.org>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>
Message-ID: <20091103035435.GZ7704@mercury.ccil.org>
Phillips, Addison scripsit:

> Our W3C WG is discussing this in our TPAC meeting today and have a
> couple of questions about your note. Could you help us understand the
> following comments:
> 
> > Worse yet, the JSON RFC is self-contradictory, with the result that
> > it's not even clear that CESU-8-encoded JSON is illegal.
>
> Can you point out where you think the JSON spec is broken?

I was too hasty.  It's better to say that JSON is not internally
consistent.  JSON documents MUST be Unicode, and section 3 of the RFC
suggests (without saying so outright) that the valid encodings are UTF-8
and UTF-{16,32}{LE,BE}, apparently forbidding any BOM.  In the underlying
JSON data model, however, strings need not be sequences of Unicode
characters, since unpaired surrogates are permitted in them.
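
To make the mismatch concrete, here is a minimal sketch in Python.  It
assumes the standard-library json module as the parser (my choice for
illustration, not something the RFC names) and uses a hypothetical
helper, is_unicode_text.  The JSON text is pure ASCII, yet the string
it denotes holds a lone surrogate and cannot be written out as UTF-8.

```python
import json

# A JSON text made only of ASCII characters: a string literal whose
# sole content is the escape for an unpaired high surrogate.
text = '"\\ud800"'

# Python's json module accepts the escape and yields a one-character
# string holding the surrogate code point U+D800.
value = json.loads(text)


def is_unicode_text(s: str) -> bool:
    """Return True if s can be encoded as UTF-8 (no unpaired surrogates)."""
    try:
        s.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


lone_surrogate_parsed = len(value) == 1 and ord(value) == 0xD800
round_trips_as_utf8 = is_unicode_text(value)
```

Other parsers may reject or replace the lone surrogate; this shows only
what one permissive implementation does.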

Worse, the ES5 built-in encoder is not required to escape unpaired
surrogates on output, so it's possible for ES5-conforming implementations
to produce output that is not valid in any of the five encodings.
(Fortunately, it's permitted to escape any character on output.)
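
A rough sketch of both outcomes, using Python's json module as a
stand-in for the ES5 encoder (an assumption for illustration; this is
not ES5 code, and the helper name encodable is hypothetical).  With
ensure_ascii=False the lone surrogate is emitted raw and the result
cannot be serialized in any of the five encodings; with
ensure_ascii=True it is escaped and the result is plain ASCII.

```python
import json

# The five encodings discussed above, by their Python codec names.
ENCODINGS = ["utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"]

# A string containing an unpaired high surrogate between two letters.
s = "a\ud800b"

# Unescaped output: the surrogate is copied into the JSON text as-is.
raw = json.dumps(s, ensure_ascii=False)

# Escaped output: every non-ASCII code point becomes a \uXXXX escape.
escaped = json.dumps(s, ensure_ascii=True)


def encodable(text: str, encoding: str) -> bool:
    """Return True if text can be serialized in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


raw_ok = [encodable(raw, e) for e in ENCODINGS]
escaped_ok = [encodable(escaped, e) for e in ENCODINGS]
```

Here raw_ok is False for every encoding and escaped_ok is True for
every encoding, which is the escape hatch mentioned in the parenthesis.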

-- 
John Cowan    cowan@ccil.org    http://ccil.org/~cowan
Rather than making ill-conceived suggestions for improvement based on
uninformed guesses about established conventions in a field of study with
which familiarity is limited, it is sometimes better to stick to merely
observing the usage and listening to the explanations offered, inserting
only questions as needed to fill in gaps in understanding. --Peter Constable
Received on Tuesday, 3 November 2009 03:55:08 GMT