- From: Internationalization Working Group Issue Tracker <sysbot+tracker@w3.org>
- Date: Tue, 19 Aug 2014 21:38:19 +0000
- To: public-i18n-core@w3.org
I18N-ISSUE-385: Per Unicode: remove "violation" notes [encoding]

http://www.w3.org/International/track/issues/385

Raised by: Addison Phillips
On product: encoding

Ken Whistler advised us:

==

Re the note in Section 4.2, I do not understand at all why you word this as “In violation of section 1.4 of UTS #22…” How is this a “violation” of anything? The wording in UTS #22 is:

“… For best results, names should be compared after applying the following transformations: …”

That is simply a recommendation for how to minimize non-recognition of variations in spelling of charset names in labels. It doesn’t really have anything to do with the actual conformance clause of UTS #22. So I don’t see how anybody could actually be in “violation” of it.

The W3C “Encodings” document just makes a much more detailed and prescriptive mapping of charset labels to the specified encodings it enumerates. Why don’t you just say *that*, instead:

=============================================================
Note: This specification provides a more detailed and prescriptive mapping of charset labels to encodings than the loose matching for charset aliases recommended by UTS #22 … etc., etc.
=============================================================

See? No violation anywhere.

I have a similar reaction to your notes in 14.2 and 14.4. I also do not see those as “violations” of the Unicode Standard (which, by the way, I would spell with a capitalized “Standard”).

Start with 14.4 utf-16le. The Unicode Standard does not specify “labels” for charsets, so I don’t see how you’d be in violation of the standard by defining how you interpret charset labels. Essentially, you are saying:

=============================================================
Note: For [insert reason here] the label “utf-16” is treated as synonymous with the label “utf-16le”, and also identifies the utf-16le encoding.
=============================================================

And for your note in 14.2, I think the statement is just wrong. This is not a violation of the Unicode Standard. It is very much in the spirit of the definition of the UTF-16 encoding scheme to treat the BOM as a signature and use it to identify the actual byte order of a stream. And if that is used to override an explicit (but erroneous) charset labeling, so be it. See Asmus’ comment, which just crossed mine.

In any case, I would advise rewording all three of these notes in your document. Rather than having a rhetorical stance that says, “We violate the Unicode Standard, but that’s o.k., because this item is uncontroversial, and …”, why would you need to state any violations here at all? Just put in clarifying notes to forestall people from *claiming* that these practices violate the Unicode Standard (or UTS #22).
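
For readers following the three notes discussed above, here is a minimal Python sketch of the behaviors in question: the loose alias matching recommended in UTS #22 section 1.4, a stricter label lookup of the kind the Encoding document prescribes, and BOM sniffing that overrides a declared label. It is an illustration only. The function names and the `LABELS` table are hypothetical; the table is a tiny hand-picked subset, not the real label list, and the lowercase encoding names simply follow the spelling used in the message.

```python
import re
from typing import Optional

def uts22_loose_key(name: str) -> str:
    """Comparison key following the transformations recommended in UTS #22 section 1.4."""
    # 1. Delete all characters except ASCII letters and digits.
    # 2. Map uppercase to lowercase.
    s = re.sub(r"[^A-Za-z0-9]", "", name).lower()
    # 3. From left to right, delete each 0 that is not preceded by a digit.
    out = []
    for ch in s:
        if ch == "0" and not (out and out[-1] in "0123456789"):
            continue
        out.append(ch)
    return "".join(out)

# Hypothetical, abbreviated label table (assumption: illustration only).
LABELS = {
    "utf-8": "utf-8",
    "utf8": "utf-8",
    "utf-16": "utf-16le",   # the 14.4 note: "utf-16" is treated as "utf-16le"
    "utf-16le": "utf-16le",
    "utf-16be": "utf-16be",
}

def get_encoding(label: str) -> Optional[str]:
    """Prescriptive lookup: trim ASCII whitespace, lowercase, exact table match."""
    # str.lower() is adequate here because every label in the table is ASCII.
    return LABELS.get(label.strip(" \t\n\f\r").lower())

def sniff_bom(data: bytes) -> Optional[str]:
    """Return the encoding indicated by a leading byte order mark, if any."""
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    if data.startswith(b"\xfe\xff"):
        return "utf-16be"
    if data.startswith(b"\xff\xfe"):
        return "utf-16le"
    return None

def choose_encoding(data: bytes, label: str) -> Optional[str]:
    """The 14.2 note: a BOM, when present, wins over the declared label."""
    return sniff_bom(data) or get_encoding(label)
```

Under these assumptions, `uts22_loose_key("u.t.f-008")` and `uts22_loose_key("UTF-8")` produce the same key, while `get_encoding("u.t.f-008")` finds no match. That difference is the "more detailed and prescriptive mapping" the suggested note describes, rather than a violation of anything.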
Received on Tuesday, 19 August 2014 21:38:20 UTC