Re: [Json] Encoding detection (Was: Re: JSON: remove gap between Ecma-404 and IETF draft) from Nico Williams on 2013-11-26 (www-tag@w3.org from November 2013)

From: Nico Williams <nico@cryptonector.com>
Date: Tue, 26 Nov 2013 09:27:05 -0600
To: Henri Sivonen <hsivonen@hsivonen.fi>
Cc: Pete Cordell <petejson@codalogic.com>, John Cowan <cowan@mercury.ccil.org>, Paul Hoffman <paul.hoffman@vpnc.org>, JSON WG <json@ietf.org>, "Joe Hildebrand (jhildebr)" <jhildebr@cisco.com>, www-tag <www-tag@w3.org>, es-discuss <es-discuss@mozilla.org>
Message-ID: <20131126152700.GN3655@localhost>

On Tue, Nov 26, 2013 at 04:10:14PM +0200, Henri Sivonen wrote:
> On Fri, Nov 22, 2013 at 1:33 PM, Pete Cordell <petejson@codalogic.com> wrote:
> > I'm hoping that for the IETF's purposes we'll be looking at
> > JSON's wider utility into broader areas, which may even include logging to
> > files and interprocess communication where there might be sensible reasons
> > to choose something other than UTF-8.
> 
> What sensible reasons could there possibly be?

I can't think of any either.  UTF-32 is superficially appealing (O(1)
indexing!) but it's only O(1) indexing by code point, not by
character, so it's still lame, and you pay for it with longer strings.
It's possible that on some architectures / for some use cases it
performs fabulously better than the alternatives, and though I doubt it,
that would be a reason not to *forbid* the use of UTF-32.  What we
clearly don't have consensus for is requiring support of UTF-32.
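
To make the code point vs. character distinction concrete, here is a
minimal Python sketch.  The sample string is an assumption picked for
illustration, not something from this thread.  It shows one perceived
character spanning two code points, and the size cost of UTF-32
compared with UTF-8 for the same text.

```python
# Illustrative sketch only; the sample string is a made-up example.
# "e" followed by U+0301 COMBINING ACUTE ACCENT displays as one
# character, but it is two code points.
s = "e\u0301a"

# Python str indexing is by code point, much like indexing UTF-32
# code units: s[1] is the combining accent, not the second character.
code_points = len(s)

# Encoded sizes (the "-be" codec variant writes no BOM).
utf8_len = len(s.encode("utf-8"))        # 1 + 2 + 1 bytes
utf32_len = len(s.encode("utf-32-be"))   # 4 bytes per code point

print(code_points, utf8_len, utf32_len)  # prints: 3 4 12
```

So constant-time indexing lands you on a code point, which may be the
middle of what a user would call a character, and the text takes three
times the space here.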

> There exists no situation where using UTF-32 for interchange makes
> sense. I think proponents of craziness of the level of using UTF-32
> for interchange should show evidence of existing crazy deployments
> instead of asking future implementers to support UTF-32 just because
> it wasn't possible to prove non-existence.

No, I think this is too much.  If someone wants to use UTF-32 because
they have numbers showing that for IPC and local processing it's faster,
that might be compelling; let them.

Anyway, I think we're focusing too hard on details that aren't terribly
important.  The "non-BOM-based sniffing rules" work and can be derived
by any capable implementor whether or not the RFC states them.
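
For reference, the derivation can be sketched in a few lines of
Python.  This follows the null-octet pattern table in RFC 4627
section 3; the function name `sniff_json_encoding` is hypothetical,
and the sketch assumes no BOM and a text whose first two characters
are ASCII (true when the top-level value is an object or array).

```python
def sniff_json_encoding(data: bytes) -> str:
    """Guess the encoding of a BOM-less JSON text from its first octets.

    Assumption: the first two characters are ASCII, so the position of
    zero octets in the first four bytes identifies the encoding form.
    """
    head = data[:4]
    if len(head) < 4:
        # Two ASCII characters need at least 4 octets in UTF-16 and
        # 8 in UTF-32, so a shorter text can only be UTF-8.
        return "utf-8"

    nulls = tuple(b == 0 for b in head)
    patterns = {
        (True, True, True, False): "utf-32-be",   # 00 00 00 xx
        (True, False, True, False): "utf-16-be",  # 00 xx 00 xx
        (False, True, True, True): "utf-32-le",   # xx 00 00 00
        (False, True, False, True): "utf-16-le",  # xx 00 xx 00
    }
    # Anything else (typically xx xx xx xx) is treated as UTF-8.
    return patterns.get(nulls, "utf-8")


sample = '{"a": 1}'
for enc in ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"):
    encoded = sample.encode(enc)
    assert encoded.decode(sniff_json_encoding(encoded)) == sample
```

Note that the assumption matters: if a top-level string is allowed and
it starts with a non-ASCII character, the patterns above no longer
cover every case.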

> I continue to strongly disapprove of non-BOM-based sniffing rules
> unless there's compelling evidence that such rules are needed in order
> to interoperate with bogus existing serializers.

I think it's fair to object to requiring sniffing, and I support not
requiring it.  I don't see anything wrong with leaving the rules in for
those who want to support sniffing.

Nico
--

Received on Tuesday, 26 November 2013 15:27:32 UTC