[Bug 23927] ASCII-incompatible encoder error handling

https://www.w3.org/Bugs/Public/show_bug.cgi?id=23927

--- Comment #6 from Addison Phillips <addison@lab126.com> ---
You're probably right about not being able to get to the UTF-16 encoder
directly. I'm trying to think of cases, and the only one that occurs to me
offhand would be reading data into a JS string? Or maybe writing an XML
document (**NOT** XHTML, please note).

A UTF-16 encoder should deal with non-Unicode-scalar-value input: that is one
of its edge conditions. Bad data exists everywhere, and the failure conditions
should be well described. It's easy enough to chop a UTF-16 buffer between the
two code units of a surrogate pair (if your code is surrogate-unaware).
Similarly, someone might use a lone surrogate as a form of attack ("?" has a
meaning in syntaxes such as URLs, but U+D800 might just look like a tofu box
and not arouse suspicion).
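To illustrate the "chopped buffer" case, here is a rough JavaScript sketch
(mine, just for illustration, not anything from the spec) of how a naive
slice produces exactly that kind of unpaired surrogate:

    // "😀" is the surrogate pair U+D83D U+DE00 in UTF-16.
    const s = "a\u{1F600}b";       // length 4 in UTF-16 code units
    const chopped = s.slice(0, 2); // "a\uD83D" -- ends on an unpaired high surrogate
    console.log(chopped.charCodeAt(1).toString(16)); // "d83d"
    // Any UTF-16 encoder handed `chopped` has to decide what to do with U+D800..U+DBFF.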

In any case, don't you agree that the "error" instructions are for
ASCII-compatible encodings and, as written, aren't quite right for a UTF-16
encoder? If you changed the word "byte" to "code unit", that might fix it (at
the cost of confusion for all other encodings).
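To be concrete about the "code unit" point, here is a rough sketch (again
mine, hypothetical, not the spec's algorithm) of a UTF-16LE encoder that
substitutes a whole replacement code unit for an unpaired surrogate, rather
than emitting a single "?" byte the way the ASCII-compatible error handling
does:

    // Hypothetical UTF-16LE encoder: unpaired surrogates become U+FFFD.
    function encodeUTF16LE(input: string): Uint8Array {
      const out: number[] = [];
      let i = 0;
      while (i < input.length) {
        let unit = input.charCodeAt(i);
        const isHigh = unit >= 0xd800 && unit <= 0xdbff;
        const isLow = unit >= 0xdc00 && unit <= 0xdfff;
        if (isHigh && i + 1 < input.length) {
          const next = input.charCodeAt(i + 1);
          if (next >= 0xdc00 && next <= 0xdfff) {
            // Well-formed surrogate pair: emit both code units unchanged.
            out.push(unit & 0xff, unit >> 8, next & 0xff, next >> 8);
            i += 2;
            continue;
          }
        }
        if (isHigh || isLow) {
          unit = 0xfffd; // Unpaired surrogate: substitute a whole code unit, not a byte.
        }
        out.push(unit & 0xff, unit >> 8); // little-endian byte order
        i += 1;
      }
      return new Uint8Array(out);
    }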

-- 
You are receiving this mail because:
You are on the CC list for the bug.

Received on Tuesday, 26 November 2013 18:06:13 UTC