Re: [Json] BOMs from Joe Hildebrand (jhildebr) on 2013-11-19 (www-tag@w3.org from November 2013)

From: Joe Hildebrand (jhildebr) <jhildebr@cisco.com>
Date: Tue, 19 Nov 2013 20:24:36 +0000
To: Allen Wirfs-Brock <allen@wirfs-brock.com>, Martin J. Dürst <duerst@it.aoyama.ac.jp>
CC: John Cowan <cowan@mercury.ccil.org>, IETF Discussion <ietf@ietf.org>, "Pete Cordell" <petejson@codalogic.com>, JSON WG <json@ietf.org>, "www-tag@w3.org" <www-tag@w3.org>, Anne van Kesteren <annevk@annevk.nl>, "t.p." <daedulus@btconnect.com>, es-discuss <es-discuss@mozilla.org>
Message-ID: <CEB1806C.2D9E5%jhildebr@cisco.com>

I've been back and forth on this several times, but I think I've come
to a conclusion for myself unless new arguments or data come in.

I don't see any value in a BOM ever being included intentionally.  RFC4627
does not specify anything about BOMs.  RFC4627 explicitly has a mechanism
for determining encoding that does not use BOMs.  Although ECMA-262
supports more whitespace than RFC4627, <BOM> is not one of the supported
whitespace characters in my reading.  Same with ECMA-404.  Many existing
implementations do not parse BOM-prepended text in practice (thanks
Allen). 
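
For reference, the RFC4627 mechanism (section 3 of that RFC) relies on the
first two characters of a JSON text being ASCII and inspects the pattern of
zero octets in the first four octets. A minimal sketch follows; the function
name detectJsonEncoding and the fallback for inputs shorter than four octets
are assumptions made for illustration, not part of any specification:

```javascript
// Sketch of the zero-octet heuristic from RFC 4627, section 3.
// detectJsonEncoding is a hypothetical name, not a library function.
function detectJsonEncoding(bytes) {
  // Assumption: with fewer than four octets there is nothing to inspect,
  // so fall back to the default encoding, UTF-8.
  if (bytes.length < 4) return 'UTF-8';

  const [a, b, c, d] = bytes;
  if (a === 0 && b === 0 && c === 0 && d !== 0) return 'UTF-32BE'; // 00 00 00 xx
  if (a === 0 && b !== 0 && c === 0 && d !== 0) return 'UTF-16BE'; // 00 xx 00 xx
  if (a !== 0 && b === 0 && c === 0 && d === 0) return 'UTF-32LE'; // xx 00 00 00
  if (a !== 0 && b === 0 && c !== 0 && d === 0) return 'UTF-16LE'; // xx 00 xx 00
  return 'UTF-8';                                                  // xx xx xx xx
}
```

Note that a text beginning with a UTF-16 BOM (FF FE or FE FF) has no zero
octet in its first two positions, so this heuristic would report UTF-8 for
it.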

I understand the use case of manually creating a document in notepad and
getting a BOM you didn't expect, but I don't think that's a compelling
enough reason to break backward-compatibility by adding effectively-new
blessings for BOMs.

Section 9 provides complete air-cover for any parser writer that wants to
support BOMs anyway:

"A JSON parser MAY accept non-JSON forms or extensions."

What they do with them, particularly when they don't match the encoding
detection of section 8.1, would of course be unspecified, but those
documents would have been rejected by a strictly-conforming parser anyway.
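
As an illustration of such an extension, a parser wrapper could drop a
single leading U+FEFF from already-decoded text before handing it to the
strict parser. This is only a sketch of one possible non-standard behavior;
the name parseJsonLenient is hypothetical:

```javascript
// Hypothetical lenient wrapper around the strict built-in parser.
// Accepting a leading BOM is a non-JSON extension, not standard behavior.
function parseJsonLenient(text) {
  // Drop at most one leading U+FEFF; everything else is parsed strictly.
  const body = text.charCodeAt(0) === 0xFEFF ? text.slice(1) : text;
  return JSON.parse(body);
}

// Usage: the same BOM-prefixed input that JSON.parse rejects.
const value = parseJsonLenient('\ufeff {"abc": 0} ');
```

A BOM anywhere other than the very start of the text is still rejected,
since only the first code unit is examined.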

As such, I think our best course of action is to change nothing with
respect to BOMs, leaving them as non-interoperable syntax extensions.

On 11/19/13 5:59 PM, "Allen Wirfs-Brock" <allen@wirfs-brock.com> wrote:

>There can be no doubt that the most widely deployed JSON parsers are
>those that are built into the browser JavaScript implementations.  The
>ECMAScript 5 specification for JSON.parse that they implement says BOM is
>an illegal character.  But what do the browsers actually implement?  This:
>
>
>//FireFox 25 scratchpad execution:
>JSON.parse('\ufeff {"abc": 0} ')
>/*
>Exception: JSON.parse: unexpected character
>@Scratchpad/1:1
>*/
>
>//Safari 5.1.9 JS console
>JSON.parse('\ufeff {"abc": 0} ')
>message: "JSON Parse error: Unrecognized token '?'"
>
>//Chrome 31 JS console
>JSON.parse('\ufeff {"abc": 0} ')
>SyntaxError: Unexpected token
>message: "Unexpected token "
>
>Unfortunately, I don't have access to IE right now, but the trend is clear.
>
>
>Allen

-- 
Joe Hildebrand

Received on Tuesday, 19 November 2013 20:25:05 UTC