- From: Henry S. Thompson <ht@inf.ed.ac.uk>
- Date: Thu, 14 Nov 2013 09:44:00 +0000
- To: John Cowan <cowan@mercury.ccil.org>
- Cc: "Joe Hildebrand (jhildebr)" <jhildebr@cisco.com>, Paul Hoffman <paul.hoffman@vpnc.org>, Anne van Kesteren <annevk@annevk.nl>, es-discuss <es-discuss@mozilla.org>, IETF Discussion <ietf@ietf.org>, "www-tag@w3.org" <www-tag@w3.org>, JSON WG <json@ietf.org>
John Cowan writes:
> Joe Hildebrand (jhildebr) scripsit:
>
>> If 404 doesn't allow [a BOM], I don't see a strong need to add it.
>> Parsers can always be more forgiving of what they will parse than what
>> the spec says, particularly since section 9 says "A JSON parser MAY
>> accept non-JSON forms or extensions".
>
> It's not clear that 404 disallows it, since 404 is defined in terms of
> characters, and a BOM is not a character but an out-of-band signal.
I think this is a crucial observation. I note that XML approaches
this problem in what might be a useful way. The XML grammar (EBNF)
makes no mention of a BOM; it's not part of any XML document as such.
But it _is_ allowed. The relevant wording [1] is:
  Entities ... may begin with the Byte Order Mark described by Annex H
  of [ISO/IEC 10646:2000], section 16.8 of [Unicode] (the ZERO WIDTH
  NO-BREAK SPACE character, #xFEFF). _This is an encoding signature,_
  _not part of either the markup or the character data of the XML_
  _document._ XML processors must be able to use this character to
  differentiate between UTF-8 and UTF-16 encoded documents. [emphasis
  added]
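As a rough sketch of how that treatment might carry over to JSON (an
illustration only, not something either spec prescribes): look at the
leading bytes, use a BOM if present to pick the encoding, and drop it
before the grammar ever sees the text. The names split_bom and
loads_tolerating_bom are hypothetical, and falling back to UTF-8 when
there is no BOM is an assumption of this sketch.

```python
import codecs
import json

# Known BOM byte sequences. UTF-32 is checked before UTF-16 because the
# UTF-32 LE BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def split_bom(data: bytes, default: str = "utf-8"):
    """Return (encoding, payload) with any leading BOM removed.

    The BOM is treated as an out-of-band encoding signature: it selects
    the decoder but is not part of the text handed to the parser.
    The default encoding is an assumption of this sketch.
    """
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return encoding, data[len(bom):]
    return default, data


def loads_tolerating_bom(data: bytes):
    """Parse JSON bytes, accepting but not requiring a leading BOM."""
    encoding, payload = split_bom(data)
    return json.loads(payload.decode(encoding))
```

With this shape the JSON grammar itself never needs to mention the
BOM; for example, loads_tolerating_bom(codecs.BOM_UTF8 + b'{"a": 1}')
parses the same text as the BOM-less bytes would.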
ht
[1] http://www.w3.org/TR/REC-xml/#charencoding
--
Henry S. Thompson, School of Informatics, University of Edinburgh
10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
URL: http://www.ltg.ed.ac.uk/~ht/
[mail from me _always_ has a .sig like this -- mail without it is forged spam]
Received on Thursday, 14 November 2013 09:45:23 UTC