- From: Henry S. Thompson <ht@inf.ed.ac.uk>
- Date: Mon, 18 Nov 2013 12:59:18 +0000
- To: Bjoern Hoehrmann <derhoermi@gmx.net>
- Cc: Martin J. Dürst <duerst@it.aoyama.ac.jp>, IETF Discussion <ietf@ietf.org>, JSON WG <json@ietf.org>, Anne van Kesteren <annevk@annevk.nl>, www-tag@w3.org, es-discuss <es-discuss@mozilla.org>
Bjoern Hoehrmann writes:

> Perl's JSON module gives me
>
>   malformed JSON string, neither array, object, number, string
>   or atom, at character offset 0 (before "\x{ef}\x{bb}\x{bf}[]")
>
> Python's json module gives me
>
>   ValueError: No JSON object could be decoded
>
> Go's "encoding/json" module gives me
>
>   invalid character 'ï' looking for beginning of value

I'm curious to know what level you're invoking the parser at. As
implied by my previous post about the Python 'requests' package, it
handles application/json resources by stripping any initial BOM it
finds -- you can try this with

  >>> import requests
  >>> r=requests.get("http://www.ltg.ed.ac.uk/ov-test/b16le.json")
  >>> r.json()

Signatures are not part of the text of a document, as the UNICODE spec
makes clear, so asking what happens when you pass a string beginning
with a BOM to a parser is not really the right question in this
context, is it?

As I tried to say in an earlier post, there's a distinction which
needs to be carefully insisted on between, on the one hand, languages
and their parsers, where I agree signatures/BOMs have no place, and,
on the other hand, (media-typed) resources/entities/payloads and
_their_ processing, where a discussion of BOMs/signatures _is_
appropriate and, often, necessary.

BTW I agree that the status of the UTF-8 BOM as signature is slightly
hazy, but again the UNICODE spec itself [1] says "this sequence can
serve as signature for UTF-8 encoded text where the character set is
unmarked"

ht

[1] http://www.unicode.org/versions/Unicode6.2.0/ch16.pdf
-- 
       Henry S. Thompson, School of Informatics, University of Edinburgh
      10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
                Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
                       URL: http://www.ltg.ed.ac.uk/~ht/
 [mail from me _always_ has a .sig like this -- mail without it is forged spam]
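
The parser/payload distinction drawn above can be illustrated with a minimal Python sketch. It is not the code used by 'requests' or by any other library: the helper name load_json_payload, the signature table, and the UTF-8 default for unsigned payloads are all assumptions made for illustration. The payload layer recognises a leading BOM as a signature, strips it, and hands only the document text to the JSON parser.

```python
import codecs
import json

# Signatures to look for. The UTF-32 entries come before the UTF-16 ones
# because the UTF-32 LE BOM begins with the same two bytes as the UTF-16 LE BOM.
_SIGNATURES = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def load_json_payload(payload: bytes):
    """Hypothetical payload-level helper: treat a BOM as a signature, not as text."""
    for bom, encoding in _SIGNATURES:
        if payload.startswith(bom):
            # Drop the signature, then decode the remaining bytes.
            text = payload[len(bom):].decode(encoding)
            break
    else:
        # Assumption: a payload without a signature is UTF-8.
        text = payload.decode("utf-8")
    # The language-level parser only ever sees the text of the document.
    return json.loads(text)
```

Under these assumptions, load_json_payload(b"\xef\xbb\xbf[]") returns an empty list, whereas passing the decoded string "\ufeff[]" straight to json.loads raises a ValueError, which matches the parser-level failures quoted above.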
Received on Monday, 18 November 2013 13:00:19 UTC