Re: charset name matching rules from Geoffrey Sneddon on 2009-08-17 (public-html-comments@w3.org from August 2009)

From: Geoffrey Sneddon <gsneddon@opera.com>
Date: Mon, 17 Aug 2009 11:31:41 +0200
To: Erik van der Poel <erikv@google.com>
CC: Ian Hickson <ian@hixie.ch>, public-html-comments@w3.org
Message-ID: <4A89237D.4000805@opera.com>

Erik van der Poel wrote:
> I had another look at section 2.7, and it does have a pointer to the 
> IANA charset registry, which also says "However, no distinction is 
> made between use of upper and lower case letters." This is the only 
> matching rule that we need. UTS22 is too lenient, and we all know
> what happens to the Web when browsers are too lenient.

Matching on case-insensitivity alone is incompatible with web content, 
as there is plenty of content out there which expects some 
normalization to be done. I originally suggested using the UTS22 rules 
because that seemed better than the status quo of three normalization 
rules (the case-insensitive one; what browsers currently do, which 
HTML 5 previously defined; and UTS22): it would have reduced this to 
only two. Pure case-insensitivity, as mentioned above, is incompatible 
with the web, so that's not an option, and as it turns out UTS22 is 
incompatible as well. I guess we should go back to the normalization 
rules that HTML 5 previously defined.
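
To make the difference between the two published rules concrete, here 
is a minimal sketch. It assumes UTS22 loose matching works as I read 
it (drop everything except ASCII letters and digits, lowercase, then 
drop each 0 that is not preceded by a digit). The function names are 
hypothetical, and the rules HTML 5 previously defined are not 
reproduced here.

```python
import re


def case_insensitive_key(name: str) -> str:
    """IANA registry rule: only upper/lower case is ignored."""
    return name.lower()


def uts22_key(name: str) -> str:
    """Loose matching key, assuming the UTS22 steps described above."""
    # Steps 1 and 2: keep only ASCII letters and digits, then lowercase.
    stripped = re.sub(r"[^A-Za-z0-9]", "", name).lower()
    # Step 3: delete each run of zeros that is not preceded by a digit.
    return re.sub(r"(?<![0-9])0+", "", stripped)


def same_charset(a: str, b: str, key=uts22_key) -> bool:
    """Two labels match when their keys are equal under the chosen rule."""
    return key(a) == key(b)
```

Under these assumptions, same_charset("UTF-8", "utf8") is true with 
uts22_key but false with case_insensitive_key, which shows how much 
more lenient the UTS22 rule is than the registry rule.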

-- 
Geoffrey Sneddon — Opera Software
<http://gsnedders.com/>
<http://www.opera.com/>

Received on Monday, 17 August 2009 09:32:34 UTC