- From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
- Date: Tue, 10 Aug 2010 20:34:16 +0200
- To: "Tab Atkins Jr." <jackalmage@gmail.com>
- Cc: whatwg@whatwg.org, HTMLwg <public-html@w3.org>, Ian Hickson <ian@hixie.ch>, commit-watchers@whatwg.org
Tab Atkins Jr., Tue, 10 Aug 2010 10:11:25 -0700:
> On Tue, Aug 10, 2010 at 4:53 AM, Leif Halvard Silli
> <xn--mlform-iua@xn--mlform-iua.no> wrote:
>> whatwg@whatwg.org, Mon, 9 Aug 2010 18:16:12 -0700 (PDT):
>>> Author: ianh
>>> Date: 2010-08-09 18:16:10 -0700 (Mon, 09 Aug 2010)
>>> New Revision: 5258
>>
>>>    <p>Authors are encouraged to use UTF-8. Conformance checkers may
>>> -  advise authors against using legacy encodings.</p>
>>> +  advise authors against using legacy encodings. <a
>>> href=#refsRFC3629>[RFC3629]</a></p>
>>
>> Could we replace 'legacy encodings' with a clearer wording - or
>> eventually define what 'legacy encodings' mean? The current wording
>> could give the impression that any encoding other than UTF-8 is a
>> legacy encoding. But it is unclear whether that is actually what is
>> meant.
>>
>> Specifically, it is not clear from the above whether conformance
>> checkers may advice authors against using UTF-16, since UTF-16
>> generally isn't associated with 'legacy encoding'.
>
> That's precisely what's meant. UTF-8 is the encoding of the web. Any
> and all other encodings are legacy encodings.

But that is not clear from this paragraph in the specification, because
"legacy" is not synonymous with "deprecated", "unwanted" or "not
optimal". Rather, it is synonymous with "old" and "outdated" (and
therefore, as a consequence, deprecated/unwanted/suboptimal). Not
everyone reading the spec (probably quite few, in fact) will think of
UTF-16 as "old" and "out of date".

And unlike what I feel you are suggesting, "legacy" used about
encodings usually means the same thing irrespective of the context, be
it "the Web" or anything else. It is generally accepted that
non-Unicode encodings are legacy encodings, irrespective of the
context. But it is not generally accepted that UTF-16, being a Unicode
encoding in "good standing", is a legacy encoding.
Thus, if 'legacy encoding' is meant to cover UTF-16 as well, then the
wording is quite unclear and open to more than one interpretation. I
suggest using a wording that is more certain to get the message across:

EITHER define what 'legacy encoding' refers to [*].
OR avoid the term entirely [#].

It is not important to me which strategy is used - it only matters that
the wording becomes clear.

[*] For example: "Encodings other than UTF-8 are considered legacy
    encodings by this specification and conformance checkers may
    advise against their use."
-- 
leif halvard silli
Received on Tuesday, 10 August 2010 18:34:52 UTC