- From: John Cowan <cowan@ccil.org>
- Date: Fri, 30 Oct 2009 20:47:38 -0400
- To: "Phillips, Addison" <addison@amazon.com>
- Cc: John Cowan <cowan@ccil.org>, Doug Schepers <schepers@w3.org>, Mark Davis <mark@macchiato.com>, "www-dom@w3.org" <www-dom@w3.org>, "www-international@w3.org" <www-international@w3.org>
Phillips, Addison scripsit:

> ECMAScript's "firm commitment" to a 16-bit character model (i.e. UTF-16)

If only. JavaScript and JSON strings aren't sequences of characters, they are sequences of 16-bit unsigned integers. If you happen to want to interpret them as UTF-16, you are free to do so, but there is not and never will be any guarantee that all strings are well-formed UTF-16.

What's more, the built-in JSON serializer provided by ECMAScript 5th edition does not generate escape sequences for isolated surrogate codepoints, so that some strings will be written out in CESU-8 rather than UTF-8. Worse yet, the JSON RFC is self-contradictory, with the result that it's not even clear that CESU-8-encoded JSON is illegal.

-- 
Let's face it: software is crap. Feature-laden and bloated, written under tremendous time-pressure, often by incapable coders, using dangerous languages and inadequate tools, trying to connect to heaps of broken or obsolete protocols, implemented equally insufficiently, running on unpredictable hardware -- we are all more than used to brokenness.
        --Felix Winkelmann
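
The point above, that a JavaScript string is a sequence of 16-bit unsigned integers with no guarantee of well-formed UTF-16, can be illustrated with a small sketch. The helper name isWellFormedUtf16 and the sample strings are hypothetical, introduced only for this illustration; the sketch relies on nothing beyond String.prototype.charCodeAt and the length property.

```javascript
// Hypothetical helper (not a standard API): returns true only if every
// surrogate code unit in `s` belongs to a proper high+low surrogate pair.
function isWellFormedUtf16(s) {
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i); // one 16-bit unsigned integer
    if (unit >= 0xD800 && unit <= 0xDBFF) {
      // High surrogate: the next unit must be a low surrogate.
      const next = i + 1 < s.length ? s.charCodeAt(i + 1) : 0;
      if (next < 0xDC00 || next > 0xDFFF) return false;
      i++; // skip the low surrogate that completes the pair
    } else if (unit >= 0xDC00 && unit <= 0xDFFF) {
      // Low surrogate with no preceding high surrogate.
      return false;
    }
  }
  return true;
}

// Illustrative values: an isolated high surrogate between two letters,
// and a correctly paired supplementary character (U+1F600).
const lone = "a\uD800b";
const paired = "a\uD83D\uDE00b";

// Both are perfectly legal strings as far as the language is concerned;
// only the second can be interpreted as well-formed UTF-16.
const loneIsWellFormed = isWellFormedUtf16(lone);
const pairedIsWellFormed = isWellFormedUtf16(paired);
```

Here lone is accepted by the language like any other string, and its length counts 16-bit units rather than characters, which is why any code that assumes UTF-16 has to check well-formedness for itself.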
Received on Saturday, 31 October 2009 00:48:12 UTC