Re: [XHR] responseType "json"

Please be careful with quote markers; you quoted text written by me as
written by Glenn Adams.

On Fri, Jan 6, 2012 at 10:00 AM, Jarred Nicholls <jarred@webkit.org> wrote:
> I'm getting responseType "json" landed in WebKit, and going to do so without
> the restriction of the JSON source being UTF-8.  We default our decoding to
> UTF-8 if none is dictated by the server or overrideMIMEType(), but we also
> do BOM detection and will gracefully switch to UTF-16(BE/LE) or
> UTF-32(BE/LE) if the content is encoded as such, and accept the source
> as-is.
>
> It's a matter of having that perfect recipe of "easiest implementation +
> most interoperability".  It actually adds complication to our decoder if we

Accepting content that other browsers don't accept will result in pages
created that work only in WebKit.  That gives the least
interoperability, not the most.

If this behavior gets propagated into other browsers, that's even
worse.  Gecko doesn't support UTF-32, and adding it would be a huge
step backwards.

> do something special just for (perfectly legit) JSON payloads.  I think
> keeping that UTF-8 bit in the spec is fine, but I don't think WebKit will be
> reducing our interoperability and complicating our code base.  If we don't
> want JSON to be UTF-16 or UTF-32, let's change the JSON spec and the JSON
> grammar and JSON.parse will do the leg work.

Big -1 to perpetuating UTF-16 and UTF-32 due to braindamage in an IETF spec.

Also, I'm a bit confused.  You talk about the rudimentary encoding
detection in the JSON spec (rfc4627 sec3), but you also mention HTTP
mechanisms (HTTP headers and overrideMimeType).  These are separate
and unrelated.  If you're using HTTP mechanisms, then the JSON spec
doesn't enter into it.  If you're using both HTTP headers (HTTP) and
UTF-32 BOM detection (rfc4627), then you're using a strange mix of the
two.  I can't tell what mechanism you're actually using.
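
For reference, the detection rfc4627 sec3 describes doesn't depend on
HTTP at all: since the first two characters of a JSON text are always
ASCII, it infers the encoding from the pattern of NUL octets in the
first four bytes.  A rough sketch of that heuristic, purely for
illustration (not what any browser actually ships):

    function sniffJsonEncoding(bytes: Uint8Array): string {
      // rfc4627 sec3: look at the NUL pattern in the first four octets.
      if (bytes.length < 4) return "UTF-8";
      const [a, b, c, d] = [bytes[0], bytes[1], bytes[2], bytes[3]];
      if (a === 0 && b === 0 && c === 0) return "UTF-32BE"; // 00 00 00 xx
      if (b === 0 && c === 0 && d === 0) return "UTF-32LE"; // xx 00 00 00
      if (a === 0 && c === 0)            return "UTF-16BE"; // 00 xx 00 xx
      if (b === 0 && d === 0)            return "UTF-16LE"; // xx 00 xx 00
      return "UTF-8";                                       // xx xx xx xx
    }

Note that there's no charset parameter anywhere in that; layering it on
top of Content-Type charsets and overrideMimeType is the strange mix I
mean.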

> As someone else stated, this is a good fight but probably not the right battlefield.

Strongly disagree.  Preventing legacy messes from being perpetuated
into new APIs is one of the *only* battlefields available where we
can get people to stop using legacy encodings without breaking
existing content.

Anne: There's one related change I'd suggest.  Currently, if a JSON
response says "Content-Encoding: application/json; charset=Shift_JIS",
the explicit charset will be silently ignored and UTF-8 will be used.
I think this should be explicitly rejected, returning null as the JSON
response entity body.  Don't decode as UTF-8 despite an explicitly
conflicting header, or people will start sending bogus charset values
without realizing it.
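
As a rough sketch of the behavior I'm suggesting (illustration only, not
proposed spec text; charsetOf() is just a throwaway helper):

    function charsetOf(contentType: string): string | null {
      // Pull the charset parameter, if any, out of a Content-Type value.
      const m = /;\s*charset="?([^";]+)"?/i.exec(contentType);
      return m ? m[1].trim() : null;
    }

    function jsonResponseEntityBody(contentType: string, body: Uint8Array): any {
      const charset = charsetOf(contentType);
      if (charset !== null && charset.toLowerCase() !== "utf-8") {
        // Explicitly conflicting charset: reject it rather than silently
        // decoding as UTF-8.
        return null;
      }
      try {
        // Decode as UTF-8 (fatal on invalid sequences), then parse.
        return JSON.parse(new TextDecoder("utf-8", { fatal: true }).decode(body));
      } catch (e) {
        return null;
      }
    }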

-- 
Glenn Maynard

Received on Friday, 6 January 2012 16:20:32 UTC