[wmvs] do we still need charset.cfg to list the "acceptable" character encodings?

From: olivier Thereaux <ot@w3.org>
Date: Thu, 24 May 2007 15:13:06 +0900
Message-Id: <F61F9805-F513-4638-A998-41DBF5346EE4@w3.org>
Cc: Martin Duerst <duerst@it.aoyama.ac.jp>, Bjoern Hoehrmann <derhoermi@gmx.net>
To: QA-dev Dev <public-qa-dev@w3.org>


If you don't mind going a few years back, I would like to get your  
recollections of (and opinion on) the character encoding list  
accepted by the markup validator.


Technically speaking, we do not really need this list any more. To  
know whether an encoding is technically supported, we have a small  
routine with Encode::decode() that does the job just fine. The Encode  
module seems to support a wide variety of encodings, too, much wider  
than the list we have.

e.g. iso_8859-1 - http://qa-dev.w3.org/wmvs/HEAD/dev/tests/197- 
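
To make the idea concrete, here is a rough sketch of such a "can we decode this?" routine. It is written in Python, with the standard codecs module standing in for Perl's Encode, so it is only an analogue and not the validator's actual code. The function names is_supported and try_decode are made up for illustration.

```python
import codecs


def is_supported(label: str) -> bool:
    """Return True if this runtime can resolve the charset label (hypothetical helper)."""
    try:
        codecs.lookup(label)
    except LookupError:
        return False
    return True


def try_decode(octets: bytes, label: str):
    """Decode strictly; return None if the label is unknown or the bytes are invalid."""
    try:
        return octets.decode(label, errors="strict")
    except (LookupError, UnicodeDecodeError):
        return None
```

A caller would check is_supported(declared_charset) before attempting try_decode(body, declared_charset). What counts as "supported" then depends entirely on the runtime's codec registry, not on any list we maintain.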

I haven't yet tested whether Encode supports all IANA-listed  
character encodings, but if it does not, we could always pass the  
declared character encoding through something like I18N::Alias, as  
suggested in

Therefore, there is no technical reason why we should enforce the use  
of a small list of accepted charsets.

However, charset.cfg documents itself as follows (since revision 1.11,  
committed by Bjoern):
[[
The Validator will refuse to decode documents in an encoding
other than those listed here. The list is independent of what
is supported on a specific system but subject to the Validator
policy for acceptable encodings.
]] -- http://dev.w3.org/cvsweb/validator/htdocs/config/ 
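
For illustration, the distinction that comment draws (policy list first, system support second) could be sketched like this. Again this is Python rather than the validator's Perl, the ACCEPTED set is a made-up placeholder and not the real contents of charset.cfg, and check_charset is a hypothetical name.

```python
import codecs

# Hypothetical policy list; the real charset.cfg entries are not reproduced here.
ACCEPTED = {"utf-8", "iso-8859-1", "shift_jis"}


def check_charset(label: str, accepted=ACCEPTED) -> str:
    """Classify a declared charset as 'ok', 'rejected-by-policy', or 'unsupported'."""
    name = label.strip()
    # Policy check: independent of what the local system can decode.
    if name.lower() not in accepted:
        return "rejected-by-policy"
    # System check: can the runtime actually resolve this label?
    try:
        codecs.lookup(name)
    except LookupError:
        return "unsupported"
    return "ok"
```

Dropping the list would amount to removing the first check and keeping only the second.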

Sounds reasonable, but what's the policy? And where does it come from?
All I can find so far in normative documents systematically points to  
the IANA registry.
And searching the list archives does not give me a clear lead on  
whether the validator ever had a policy of favoring one charset over  
another.

Does anyone have any thoughts or recollections on this?

Received on Thursday, 24 May 2007 06:13:08 UTC
