- From: Anne van Kesteren <annevk@annevk.nl>
- Date: Sun, 31 Aug 2014 20:04:19 +0200
- To: John C Klensin <john+w3c@jck.com>
- Cc: Larry Masinter <masinter@adobe.com>, Richard Ishida <ishida@w3.org>, "Phillips, Addison" <addison@lab126.com>, "www-international@w3.org" <www-international@w3.org>
On Thu, Aug 28, 2014 at 8:08 PM, John C Klensin <john+w3c@jck.com> wrote:
> Where we seem to be today is that there are a lot of charset
> labels in the IANA Charset Registry. Some of them are
> irrelevant to web browsers (and, depending on how one defines
> it, to the web generally). Others are used in web browsers but
> with exactly the same definitions as appear in the IANA
> Registry. And a few are used --widely so-- in web browsers but
> with different definitions. At the same time, there are other
> applications (and probably some legacy web ones) that use the
> labels in the last category but strictly follow the IANA
> Registry definitions.

Actually, quite a lot have different definitions when you get down to the
details, because the specifications the IANA registry points to are often
not implemented in the prescribed manner (or lack essential details, such
as the handling of errors).

> The one solace here and the one I hope all involved can agree on
> (or have already) is that, with the exception of writing systems
> whose scripts have not yet been encoded in Unicode, everyone
> ought to be moving away from historical encodings and toward
> UTF-8 as soon as possible. That is the real solution to the
> problem of different definitions and the issues they can cause:
> just move forward to Standard UTF-8 to get away from them and
> consider the present mess as added incentive.

Writing systems that cannot be done in Unicode cannot be done on the web.
There's no infrastructure in place for such systems. (Apart from PUA font
hacks.)

--
http://annevankesteren.nl/
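A minimal sketch of the kind of divergence being discussed, assuming
Python's "latin-1" and "cp1252" codecs as stand-ins for the IANA
ISO-8859-1 definition and for windows-1252, which browsers (per the WHATWG
Encoding Standard) actually use for the "iso-8859-1" label:

    # Byte 0x80 is a C1 control character under the IANA ISO-8859-1
    # definition, but the euro sign under windows-1252, which is what
    # browsers map the "iso-8859-1" label to.
    data = b"\x80"

    print(repr(data.decode("latin-1")))  # '\x80' (U+0080, C1 control)
    print(repr(data.decode("cp1252")))   # '€' (U+20AC, euro sign)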
Received on Sunday, 31 August 2014 18:04:49 UTC