- From: <bugzilla@jessica.w3.org>
- Date: Tue, 26 Nov 2013 18:06:11 +0000
- To: www-international@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=23927

--- Comment #6 from Addison Phillips <addison@lab126.com> ---

You're probably right about not being able to get to the UTF-16 encoder directly. I'm trying to think of cases, and the only one that occurs to me offhand would be reading data into a JS string? Or maybe writing an XML document (**NOT** XHTML, please note).

A UTF-16 encoder should deal with non-Unicode-scalar-value input: that is one of its edge conditions. Bad data exists everywhere, and the failure conditions should be well described. It's easy enough to chop a UTF-16 buffer between the two surrogate code units of a pair (if your code is surrogate-unaware). Similarly, someone might use a lone surrogate as a form of attack ("?" has a meaning in syntaxes such as URL, but U+D800 might look like a tofu box and not arouse suspicion).

In any case, don't you agree that the "error" instructions are written for ASCII-compatible encodings and, as written, aren't quite right for a UTF-16 encoder? Changing the word "byte" to "code unit" might fix it (at the cost of confusion for all other encodings).
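A minimal TypeScript sketch of the scenario described above: slicing a JS string between the two halves of a surrogate pair leaves an unpaired surrogate, and a UTF-16 encoder has to decide what to do with it. The `encodeUtf16le` function and its substitute-with-U+FFFD policy are illustrative assumptions only, not what the Encoding spec prescribes.

```ts
// "😀" (U+1F600) is stored in a JS/TS string as the surrogate pair
// U+D83D U+DE00, i.e. two UTF-16 code units.
const s = "a\u{1F600}b"; // length 4 code units, 3 code points

// Surrogate-unaware chopping: the slice ends between the pair,
// leaving a lone high surrogate at the end.
const chopped = s.slice(0, 2); // "a" + "\uD83D" (unpaired surrogate)
console.log(chopped.charCodeAt(1).toString(16)); // "d83d"

// Hypothetical UTF-16LE encoder: two bytes per code unit, with an
// explicit policy (here, emit U+FFFD) for unpaired surrogates.
function encodeUtf16le(input: string): Uint8Array {
  const bytes: number[] = [];
  const pushUnit = (u: number) => bytes.push(u & 0xff, u >>> 8);
  let i = 0;
  while (i < input.length) {
    const unit = input.charCodeAt(i);
    const next = i + 1 < input.length ? input.charCodeAt(i + 1) : -1;
    if (unit >= 0xd800 && unit <= 0xdbff && next >= 0xdc00 && next <= 0xdfff) {
      // Well-formed surrogate pair: emit both code units.
      pushUnit(unit);
      pushUnit(next);
      i += 2;
    } else if (unit >= 0xd800 && unit <= 0xdfff) {
      // Unpaired surrogate: the error condition under discussion.
      // This sketch substitutes U+FFFD instead of emitting bad UTF-16.
      pushUnit(0xfffd);
      i += 1;
    } else {
      pushUnit(unit);
      i += 1;
    }
  }
  return new Uint8Array(bytes);
}

// encodeUtf16le(chopped) ends with the bytes fd ff (U+FFFD), not 3d d8.
```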
Received on Tuesday, 26 November 2013 18:06:13 UTC