- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Thu, 21 Nov 2013 20:37:49 +0100
- To: John Cowan <cowan@mercury.ccil.org>
- Cc: es-discuss <es-discuss@mozilla.org>, IETF Discussion <ietf@ietf.org>, www-tag <www-tag@w3.org>, JSON WG <json@ietf.org>
* John Cowan wrote:
>Bjoern Hoehrmann scripsit:
>
>> Is there any chance, by the way, to change `JSON.stringify` so it does
>> not output strings that cannot be encoded using UTF-8? Specifically,
>>
>>   JSON.stringify(JSON.parse("\"\uD800\""))
>>
>> would need to escape the surrogate instead of emitting it literally.
>
>No, there isn't. We've been down this road repeatedly. People can and
>do use JSON strings to encode arbitrary sequences of unsigned 16-bit integers.

The output of JSON.stringify("\uD800") contains no backslash character. If
you call `utf8_encode(JSON.stringify("\uD800"))`, you get an exception
because UTF-8 cannot encode the lone surrogate and `utf8_encode` does not
know it could encode it as `\uD800` without loss of information. If
`JSON.stringify` produced an escape sequence instead, there would be no
problem passing the output to `utf8_encode`.
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
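The failure described in the message can be sketched in JavaScript. This is a minimal illustration under stated assumptions, not code from the thread: `utf8EncodeStrict` is a hypothetical stand-in for the message's `utf8_encode`, and `escapeLoneSurrogates` is a hypothetical post-processing step showing the escaped form the message asks for. The JSON text is built by hand because engines that implement the ECMAScript 2019 well-formed `JSON.stringify` change already emit the escape themselves.

```javascript
// Matches any surrogate code unit that is not part of a valid high/low pair.
const LONE_SURROGATE =
  /[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]/;

// Hypothetical stand-in for the message's `utf8_encode`: throws on input
// that has no UTF-8 encoding instead of silently substituting U+FFFD.
function utf8EncodeStrict(text) {
  if (LONE_SURROGATE.test(text)) {
    throw new RangeError("lone surrogate cannot be encoded as UTF-8");
  }
  return new TextEncoder().encode(text);
}

// Hypothetical post-processing step: rewrite each lone surrogate in
// serialized JSON text as a \uXXXX escape, which is plain ASCII.
function escapeLoneSurrogates(jsonText) {
  return jsonText.replace(
    new RegExp(LONE_SURROGATE.source, "g"),
    (unit) => "\\u" + unit.charCodeAt(0).toString(16).padStart(4, "0")
  );
}

// JSON text with the surrogate emitted literally, built by hand so the
// example does not depend on what a given engine's JSON.stringify emits.
const literal = '"' + "\uD800" + '"';
const escaped = escapeLoneSurrogates(literal);

// The literal form cannot be encoded; record whether the encoder refused it.
let literalFailed = false;
try {
  utf8EncodeStrict(literal);
} catch (e) {
  literalFailed = e instanceof RangeError;
}

// The escaped form is ASCII only, so encoding succeeds, and parsing it
// yields the same single code unit U+D800 with no loss of information.
const bytes = utf8EncodeStrict(escaped);
const roundTrip = JSON.parse(escaped);
```

Both forms parse to the same one-unit string, so the escape changes only the serialized text, not the value it represents.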
Received on Thursday, 21 November 2013 19:38:27 UTC