- From: <bugzilla@jessica.w3.org>
- Date: Sun, 27 Oct 2013 14:15:54 +0000
- To: www-international@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=23646

John C Klensin <john+w3cbugs@jck.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |john+w3cbugs@jck.com

--- Comment #3 from John C Klensin <john+w3cbugs@jck.com> ---
Addison,

I certainly agree that generating additional replacement characters is a bad idea. But the argument that ASCII (and ISO 8859-1) are reasonable aliases for "windows-1252" because the latter is a proper superset could also be used to argue that ASCII is a reasonable alias for UTF-8, because UTF-8 is also a proper superset of ASCII.

If one then assumes transitivity of aliases, your suggestion for the case where one thinks something is ASCII and some octet is out of range breaks down, because that out-of-range octet could be either Windows-1252 (or ISO 8859-1 if it is in the 0xA1 - 0xFF range) or part of a UTF-8 sequence.

The document itself is probably OK because of the way it separates single-byte and multi-byte operations. But if an implementation gets even slightly sloppy about terminology or labeling, it seems to me that those aliases help us get onto rather thin ice. For better or worse, such sloppy behavior appears frequently on the Internet, probably most commonly induced by copying strings from one document and pasting them into another.
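
As a rough illustration of the ambiguity described above, here is a minimal Python sketch. The helper name `candidate_decodings` and the sample octets are hypothetical, chosen only for this example. It uses Python's own codec names (`ascii`, `utf-8`, `cp1252`, `latin-1`), which are strict and are not the label-to-encoding mapping the document under discussion defines.

```python
def candidate_decodings(data: bytes) -> dict:
    """Try several encodings on the same octets (hypothetical helper).

    Returns a mapping from Python codec name to the decoded text,
    or None where the octets are not valid in that encoding.
    """
    results = {}
    for label in ("ascii", "utf-8", "cp1252", "latin-1"):
        try:
            results[label] = data.decode(label)
        except UnicodeDecodeError:
            results[label] = None
    return results


# Sample input (an assumption for illustration): "caf" followed by the
# octets 0xC3 0xA9, both of which are outside the ASCII range.
sample = b"caf\xc3\xa9"

for label, text in candidate_decodings(sample).items():
    # repr() keeps the output unambiguous on any terminal.
    print(f"{label:8} -> {text!r}")
```

For these octets the strict ASCII decoder fails, while UTF-8 and both single-byte encodings each yield a valid but different string. The out-of-range octets alone therefore do not say which of the "aliased" encodings was intended.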
Received on Sunday, 27 October 2013 14:15:56 UTC