
Re: Validator case-sensitive bug for CHARSET?

From: olivier Thereaux <ot@w3.org>
Date: Tue, 7 Aug 2007 14:46:54 +0900
Message-Id: <C0CC49DD-2B51-44E9-BB30-BE4411240B04@w3.org>
Cc: www-validator Community <www-validator@w3.org>, www-international@w3.org
To: Ernest Unrau <ejunrau@mts.net>

Hello Ernest, all,

On Aug 5, 2007, at 04:05 , Ernest Unrau wrote:
> Specifically, the validator is unable to detect the character encoding if
> "CHARSET" is uppercased in the CONTENT field (see below). It will detect it
> automatically if this parameter is lowercased.

This is the first time I have run into this issue. Looking at the HTTP
specification (which HTML normatively refers to for the http-equiv
meta information), I was unable to find precisely whether the
"charset=" string is case-sensitive or not but, lacking any mention,
I will assume that it is case-sensitive, as is the rest of HTTP.

I have added an entry in bugzilla to track the issue:

> If indeed this parameter must be lowercased, I would suggest the validator
> should return some help for this problem. I have seen some correspondence
> on your site noting problems with the doctype, but did not find any that
> specifically identified where the problem occurs.

I agree. The validator should probably be lenient in its detection of
the charset parameter in http-equiv, but should issue a warning if
the case is wrong. We are, however, lacking documentation on this.
The otherwise excellent document:
talks about this usage of <meta> but does not mention case.
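
As a rough illustration of the lenient-but-warning behaviour described
above, here is a minimal Python sketch. It is not the validator's actual
code: the function name detect_charset, the regular expression and the
warning text are all assumptions made for this example.

```python
import re

# Hypothetical helper, not the validator's real implementation.
# The parameter name is matched case-insensitively so that "CHARSET="
# is still detected; the original spelling is captured for the warning.
_CHARSET_RE = re.compile(r'(charset)\s*=\s*["\']?([^\s;"\']+)', re.IGNORECASE)


def detect_charset(content):
    """Return (charset, warning); both are None when no charset is found."""
    match = _CHARSET_RE.search(content)
    if match is None:
        return None, None
    name, value = match.group(1), match.group(2)
    warning = None
    if name != "charset":
        # Assumed policy: accept the value but flag the unexpected case.
        warning = 'parameter name "%s" is not lowercase' % name
    return value, warning


print(detect_charset("text/html; CHARSET=ISO-8859-1"))
print(detect_charset("text/html;charset=ISO-8859-1"))
```

With this approach the uppercase form still yields the encoding, and the
caller can decide whether and how to report the warning.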

> Testing variations of the CONTENT field, these constructions work:
>   <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
>   <META HTTP-EQUIV="Content-Type" CONTENT="text/html;charset=ISO-8859-1">
>   <META HTTP-EQUIV="Content-Type" CONTENT="text/html charset=ISO-8859-1">
> These constructions don't work:
>   <META HTTP-EQUIV="Content-Type" CONTENT="text/html CHARSET=ISO-8859-1">
>   <META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=ISO-8859-1">
>   <META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=iso-8859-1">
>   <META HTTP-EQUIV="Content-Type" CONTENT="text/html;CHARSET=ISO-8859-1">
>   <META HTTP-EQUIV="Content-Type" CONTENT="text/html;CHARSET=iso-8859-1">

Could you make at least a few of these into test documents?
* very minimal HTML documents
* encoded as iso-8859-1
* using one of these constructs
* including some non-ascii characters (will be a good test of the  
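
One possible way to produce such test documents is sketched below in
Python. Everything specific here is an assumption for illustration: the
file names, the sample text, the HTML 4.01 Strict doctype and the choice
of three constructs taken from the list quoted above.

```python
import os

# Constructs copied from the report; which ones to test is an assumption.
CONSTRUCTS = [
    "text/html; charset=ISO-8859-1",
    "text/html; CHARSET=ISO-8859-1",
    "text/html;CHARSET=iso-8859-1",
]

# Very minimal document; the paragraph holds non-ASCII characters
# (e-acute, i-diaeresis, u-diaeresis) that exist in ISO-8859-1.
TEMPLATE = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"\n'
    '  "http://www.w3.org/TR/html4/strict.dtd">\n'
    '<html>\n<head>\n'
    '<META HTTP-EQUIV="Content-Type" CONTENT="{content}">\n'
    '<title>charset test {index}</title>\n'
    '</head>\n<body>\n'
    '<p>caf\u00e9 na\u00efve \u00fcber</p>\n'
    '</body>\n</html>\n'
)


def build_test_documents():
    """Return a dict mapping a (hypothetical) file name to ISO-8859-1 bytes."""
    documents = {}
    for index, content in enumerate(CONSTRUCTS, start=1):
        html = TEMPLATE.format(content=content, index=index)
        documents["charset-test-%d.html" % index] = html.encode("iso-8859-1")
    return documents


def write_test_documents(directory):
    """Write each generated document into the given directory."""
    for name, data in build_test_documents().items():
        with open(os.path.join(directory, name), "wb") as handle:
            handle.write(data)
```

Calling write_test_documents(".") would create three small files, each
encoded as ISO-8859-1 and differing only in the CONTENT value.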

Thank you
Received on Tuesday, 7 August 2007 05:46:15 UTC
