- From: Terje Bless <link@pobox.com>
- Date: Thu, 24 May 2007 09:40:05 +0200
- cc: QA-dev Dev <public-qa-dev@w3.org>, Martin Duerst <duerst@it.aoyama.ac.jp>, Bjoern Hoehrmann <derhoermi@gmx.net>
link@pobox.com (Terje Bless) wrote:
>ot@w3.org (olivier Thereaux) wrote:
>
>>Sounds reasonable, but what's the policy? And where does it come from?
>
>The policy is that nothing that's not registered with IANA will be
>accepted, and it comes from me. :-)
To elaborate somewhat[0]:
charset.cfg is an implementation artifact and reflects limited tools.
The planned “ideal” way for this to work was that
charset.cfg be replaced with the actual IANA registry[1] such
that what we whitelist is not what we happen to have had time to
find and stuff in a config file, but what's actually registered.
The IANA registry contains information on the preferred MIME name,
etc., based on which we could emit warnings for non-preferred names.
Whether an unregistered encoding is a fatal error or a warning
is debatable.
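As a rough illustration of the registry-driven check described above, here is a minimal Python sketch. It is not the validator's code. The names `SAMPLE`, `parse_registry` and `check_charset` are hypothetical, and the `Name:`/`Alias:` record layout with a "(preferred MIME name)" marker is an assumption about the registry's plain-text format, so the parsing would need adjusting against the real file. The severity for unregistered names is left as a parameter, since that question is open.

```python
import re

# Hypothetical excerpt; the record layout is an assumption, not a verbatim
# copy of the IANA character-sets registry.
SAMPLE = """\
Name: ANSI_X3.4-1968    [RFC1345]
MIBenum: 3
Alias: iso-ir-6
Alias: US-ASCII (preferred MIME name)

Name: UTF-8    [RFC3629]
MIBenum: 106
Alias: None
"""

# Matches "Name: <label> <rest>" and "Alias: <label> <rest>" lines only.
FIELD = re.compile(r"^(Name|Alias):\s*(\S+)(.*)$")


def parse_registry(text):
    """Map each lowercased name or alias to its record's preferred name."""
    records = []
    current = None
    for line in text.splitlines():
        m = FIELD.match(line)
        if not m:
            continue  # skip MIBenum, Source, blank lines, and so on
        kind, label, rest = m.groups()
        if kind == "Name":
            current = {"labels": [label], "preferred": None}
            records.append(current)
        elif current is not None and label.lower() != "none":
            current["labels"].append(label)
        if current is not None and "preferred MIME name" in rest:
            current["preferred"] = label

    table = {}
    for rec in records:
        # Fall back to the primary name when no label is marked preferred.
        preferred = rec["preferred"] or rec["labels"][0]
        for label in rec["labels"]:
            table[label.lower()] = preferred
    return table


def check_charset(label, table, unregistered="error"):
    """Return (severity, message) for a charset label found in a document."""
    preferred = table.get(label.lower())
    if preferred is None:
        return (unregistered, f"{label} is not registered with IANA")
    if preferred.lower() != label.lower():
        return ("warning", f"{label} is registered, but {preferred} is preferred")
    return ("ok", label)
```

With the sample above, `check_charset("iso-ir-6", parse_registry(SAMPLE))` yields a warning that points at US-ASCII, while an unknown label such as `x-foo` yields whatever severity was passed as `unregistered`.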
A “charset.cfg” may still be needed, but then only for
“exception” purposes such as bitching about vendor-specific
charsets or usage boo boos (the -I variants and some Thai
encodings, IIRC).
[0] — See <http://swhack.com/logs/2007-05-24#T07-12-02>.
[1] — Literally by parsing
<http://www.iana.org/assignments/character-sets>
instead of “charset.cfg”.
--
I have lobbied for the update and improvement of SGML. I've done it for years.
I consider it the jewel for which XML is a setting. It does deserve a bit of
polishing now and then.
-- Len Bullard
Received on Thursday, 24 May 2007 07:40:21 UTC