Re: Proposal to deprecate 'Character encodings' article

From: <ishida@w3.org>
Date: Mon, 25 Jan 2016 16:56:24 +0000
To: Martin J. Dürst <duerst@it.aoyama.ac.jp>, John C Klensin <john+w3c@jck.com>, www International <www-international@w3.org>
Message-ID: <56A653B8.50805@w3.org>
On 23/01/2016 10:03, Martin J. Dürst wrote:
> On 2016/01/23 03:59, John C Klensin wrote:
>> I think explicitly deprecating that document as outdated is a
>> fine idea (and, in retrospect, probably overdue).
>> However (and while it is probably part of a separate
>> discussion), I'm still anxious about having two separate
>> registries -- at IANA and in the Encoding spec.  We went to
>> great lengths to make what are now called media types the same
>> for the web, email, and everything else.  Separate lists for
>> character encoding identifiers (seen from the IETF and email
>> perspective as part of the media type picture) really benefits
>> no one.   Perhaps the solution is to point out the confusion we
>> have gotten into and see it as another reason for moving to
>> Unicode encoded in UTF-8, but I'm not sure that is a good reason
>> for encouraging worse confusion in the interim.
> I agree with John.
> I see the Encoding spec primarily as a document that gives more specific
> definitions for edge cases and some limitations on the legacy encodings
> expected to be supported, in the context of the "Web Platform" (i.e.
> mostly Web Browser implementations), motivated by security and
> uniformity concerns that apply very much in that context but not
> necessarily in other contexts where encodings are used.
> It would be very good if the Encoding spec said so (although hopefully
> in more, shorter sentences :-().


I think there are two questions here.

1) where W3C i18n articles should point when advising content authors 
about which encodings to use (if they don't do the sensible thing and 
use UTF-8)

2) the relationship between the Encoding spec and the IANA charset registry.

I think that only the first question is relevant to this thread.  Wrt 
the second, we did discuss the scope of the Encoding spec a while ago 
and Anne made some changes to reflect that. If you want to discuss that 
topic further, please do so by raising a github issue on the Encoding 
spec, not in this thread, thanks.

It may be useful to note, wrt the first, that we advise HTML content 
authors to check the list in the Encoding spec because it "provides a 
list that has been tested against actual browser implementations". For 
Web platform development, this is therefore the most useful list to 
choose from, since it takes into account interoperability in browsers. We 
do, however, also mention the IANA registry. (See 

Received on Monday, 25 January 2016 16:56:36 UTC
