- From: Henry S. Thompson <ht@inf.ed.ac.uk>
- Date: Thu, 14 Nov 2013 09:44:00 +0000
- To: John Cowan <cowan@mercury.ccil.org>
- Cc: "Joe Hildebrand (jhildebr)" <jhildebr@cisco.com>, Paul Hoffman <paul.hoffman@vpnc.org>, Anne van Kesteren <annevk@annevk.nl>, es-discuss <es-discuss@mozilla.org>, IETF Discussion <ietf@ietf.org>, "www-tag@w3.org" <www-tag@w3.org>, JSON WG <json@ietf.org>
John Cowan writes:

> Joe Hildebrand (jhildebr) scripsit:
>
>> If 404 doesn't allow [a BOM], I don't see a strong need to add it.
>> Parsers can always be more forgiving of what they will parse than what
>> the spec says, particularly since section 9 says "A JSON parser MAY
>> accept non-JSON forms or extensions".
>
> It's not clear that 404 disallows it, since 404 is defined in terms of
> characters, and a BOM is not a character but an out-of-band signal.

I think this is a crucial observation.

I note that XML approaches this problem in what might be a useful way.
The XML ABNF makes no mention of BOM, it's not part of any XML document
as such. But it _is_ allowed. The relevant wording [1] is:

  Entities ... may begin with the Byte Order Mark described by Annex H
  of [ISO/IEC 10646:2000], section 16.8 of [Unicode] (the ZERO WIDTH
  NO-BREAK SPACE character, #xFEFF). _This is an encoding signature,_
  _not part of either the markup or the character data of the XML_
  _document._ XML processors must be able to use this character to
  differentiate between UTF-8 and UTF-16 encoded documents.

  [emphasis added]

ht

[1] http://www.w3.org/TR/REC-xml/#charencoding
--
Henry S. Thompson, School of Informatics, University of Edinburgh
10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
URL: http://www.ltg.ed.ac.uk/~ht/
[mail from me _always_ has a .sig like this -- mail without it is forged spam]
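As an illustration of the "encoding signature" reading quoted above, here is a minimal Python sketch of a decoder that consumes a leading BOM as an out-of-band signal and hands only the remaining characters to a JSON parser. The helper name `decode_with_signature`, the UTF-8 default when no BOM is present, and the decision to ignore UTF-32 are all assumptions made for this example; none of them comes from the XML or JSON specifications.

```python
import codecs
import json

# Hypothetical helper (name and defaults are assumptions for illustration).
# UTF-32 is deliberately not handled: its little-endian BOM begins with the
# same two bytes as the UTF-16 LE BOM and would need a separate check.
_SIGNATURES = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def decode_with_signature(data: bytes) -> str:
    """Decode bytes, treating a leading BOM as a signature, not as content."""
    for bom, encoding in _SIGNATURES:
        if data.startswith(bom):
            # The BOM selects the encoding and is dropped before decoding,
            # so it never reaches the grammar-level parser.
            return data[len(bom):].decode(encoding)
    # No signature present: assume UTF-8 (an assumption of this sketch).
    return data.decode("utf-8")


# Usage: the JSON parser only ever sees the characters after the signature.
text = decode_with_signature(codecs.BOM_UTF8 + b'{"a": 1}')
print(json.loads(text))  # prints {'a': 1}
```

With this layering the character-level grammar never has to mention the BOM at all, which is the arrangement the XML wording describes.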
Received on Thursday, 14 November 2013 09:45:23 UTC