Re: [wmvs] do we still need charset.cfg to list the "acceptable" character encodings?

From: Terje Bless <link@pobox.com>
Date: Thu, 24 May 2007 09:40:05 +0200
cc: QA-dev Dev <public-qa-dev@w3.org>, Martin Duerst <duerst@it.aoyama.ac.jp>, Bjoern Hoehrmann <derhoermi@gmx.net>
Message-ID: <r02020000-215-1049-ppc-6D889AE439E84C308C4D61F4229D1447@pounder.neutri.no>

link@pobox.com (Terje Bless) wrote:

>ot@w3.org (olivier Thereaux) wrote:
>>Sounds reasonable, but what's the policy? And where does it come from?
>The policy is that nothing that's not registered with IANA will be
>accepted, and it comes from me. :-)

To elaborate somewhat[0]:

charset.cfg is an implementation artifact and reflects limited tools.

The planned “ideal” approach was to replace charset.cfg with the
actual IANA registry[1], so that what we whitelist is not whatever
we happened to have had time to find and stuff in a config file,
but what's actually registered.

The IANA registry also records each charset's preferred MIME name,
among other things, so we could emit warnings for non-preferred names.

Whether an unregistered encoding is a fatal error or a warning
is debatable.
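
As a rough illustration of the registry-driven check described above,
here is a minimal Python sketch. It is not the validator's code. It
assumes the registry is available as plain-text records with "Name:"
and "Alias:" lines and a "(preferred MIME name)" marker; the inline
sample, parse_registry() and check_charset() are all made up for the
example.

```python
import re

# Illustrative excerpt only; the record layout is an assumption
# modelled on "Name:" / "Alias:" style registry entries.
SAMPLE_REGISTRY = """\
Name: ANSI_X3.4-1968
Alias: iso-ir-6
Alias: US-ASCII (preferred MIME name)

Name: ISO_8859-1:1987
Alias: ISO-8859-1 (preferred MIME name)
Alias: latin1

Name: UTF-8
Alias: None
"""


def parse_registry(text):
    """Map each lower-cased name or alias to its record's preferred name."""
    table = {}
    # Records are separated by blank lines.
    for record in re.split(r"\n\s*\n", text.strip()):
        names, preferred = [], None
        for line in record.splitlines():
            m = re.match(r"(Name|Alias):\s*(\S+)(.*)", line)
            if not m or m.group(2) == "None":
                continue
            names.append(m.group(2))
            if "preferred MIME name" in m.group(3):
                preferred = m.group(2)
        if not names:
            continue
        # Fall back to the primary name when nothing is marked preferred.
        preferred = preferred or names[0]
        for name in names:
            table[name.lower()] = preferred
    return table


def check_charset(label, table):
    """Return (status, preferred_name) for a charset label."""
    key = label.strip().lower()
    preferred = table.get(key)
    if preferred is None:
        # Fatal error or warning is left to the caller to decide.
        return ("unregistered", None)
    if preferred.lower() != key:
        return ("non-preferred", preferred)
    return ("ok", preferred)
```

With the sample above, "latin1" is reported as non-preferred (with
ISO-8859-1 as the suggested name) and an unknown label comes back as
unregistered, leaving the error-versus-warning decision to the caller.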

A “charset.cfg” may still be needed, but then only for 
“exception” purposes such as bitching about vendor-specific 
charsets or usage boo boos (the -I variants and some Thai 
encodings, IIRC).
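
An exceptions-only config of that kind could be layered over the
registry lookup roughly as follows. This is a sketch under stated
assumptions: the table entries, their messages, the REGISTERED
stand-in and the classify() helper are placeholders invented for
illustration, not the validator's actual list or behaviour.

```python
# Hypothetical exception table: label -> (severity, note).
# Entries are placeholders, not the validator's real list.
EXCEPTIONS = {
    "x-example-vendor": ("warning", "vendor-specific charset"),
    "iso-8859-8-i": ("warning", "-I variant; check usage"),
}

# Stand-in for the set of names loaded from the IANA registry.
REGISTERED = {"utf-8", "iso-8859-1", "iso-8859-8-i"}


def classify(label, exceptions=EXCEPTIONS, registered=REGISTERED):
    """Consult the exception table first, then the registry."""
    key = label.strip().lower()
    if key in exceptions:
        return exceptions[key]
    if key in registered:
        return ("ok", "registered with IANA")
    # Treating this as an error rather than a warning is one possible
    # policy choice, not a settled one.
    return ("error", "not registered with IANA")
```

Checking the exceptions first means a registered name can still draw a
complaint, which is the point of keeping such a file at all.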

[0] — See <http://swhack.com/logs/2007-05-24#T07-12-02>.

[1] — Literally by parsing
       instead of “charset.cfg”.

I have lobbied for the update and improvement of SGML. I've done
it for years. I consider it the jewel for which XML is a setting.
It does deserve a bit of polishing now and then.
-- Len Bullard
Received on Thursday, 24 May 2007 07:40:21 UTC
