- From: olivier Thereaux <ot@w3.org>
- Date: Tue, 7 Aug 2007 14:46:54 +0900
- To: Ernest Unrau <ejunrau@mts.net>
- Cc: www-validator Community <www-validator@w3.org>, www-international@w3.org
Hello Ernest, all,

On Aug 5, 2007, at 04:05, Ernest Unrau wrote:
> Specifically, the validator is unable to detect the character encoding if
> "CHARSET" is uppercased in the CONTENT field (see below). It will detect it
> automatically if this parameter is lowercased.

This is the first time I have run into this issue. Looking at the HTTP specification (which HTML normatively refers to for the http-equiv meta information), I was unable to find out precisely whether the "charset=" string is case-sensitive or not, but lacking any mention, I will assume that it is case-sensitive, as are the rest of the HTTP constructs.

I have added an entry in bugzilla to track the issue:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=4917

> If indeed this parameter must be lowercased, I would suggest the validator
> should return some help for this problem. I have seen some correspondence
> on your site noting problems with the doctype, but did not find any that
> specifically identified where the problem occurs.

I agree. The validator should probably be loose in its detection of the charset parameter in http-equiv, but should issue a warning if the case is wrong.

We are, however, lacking documentation on this. The otherwise excellent document:
http://www.w3.org/International/O-charset
talks about this usage of <meta> but does not mention case.
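To make the "loose detection plus warning" idea concrete, here is a minimal Python sketch. It is only an illustration, not the validator's actual code: the function name `detect_meta_charset`, the regular expression, and the warning wording are all assumptions made for this example.

```python
import re

# Hypothetical pattern: find "charset=" in any letter case, then capture the
# value up to the next whitespace, semicolon, or quote.
_CHARSET_RE = re.compile(r'charset\s*=\s*["\']?([^\s;"\']+)', re.IGNORECASE)


def detect_meta_charset(content):
    """Return (charset, warning) for the CONTENT value of a
    <META HTTP-EQUIV="Content-Type"> element.

    charset is None when no charset parameter is found; warning is None
    unless the parameter name was not written in lowercase.
    """
    match = _CHARSET_RE.search(content)
    if match is None:
        return None, None

    # The parameter name exactly as it was written in the document.
    name = content[match.start():match.start() + len("charset")]
    warning = None
    if name != "charset":
        warning = 'parameter name "%s" is not lowercase' % name
    return match.group(1), warning


if __name__ == "__main__":
    for value in ("text/html; charset=ISO-8859-1",
                  "text/html;CHARSET=iso-8859-1",
                  "text/html"):
        print(value, "->", detect_meta_charset(value))
```

With this approach the uppercase constructs reported as failing would still yield an encoding, and the returned warning could be shown to the author.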
> Testing variations of the CONTENT field, these constructions work:
>
> <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
> <META HTTP-EQUIV="Content-Type" CONTENT="text/html;charset=ISO-8859-1">
> <META HTTP-EQUIV="Content-Type" CONTENT="text/html charset=ISO-8859-1">
>
> These constructions don't work:
>
> <META HTTP-EQUIV="Content-Type" CONTENT="text/html CHARSET=ISO-8859-1">
> <META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=ISO-8859-1">
> <META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=iso-8859-1">
> <META HTTP-EQUIV="Content-Type" CONTENT="text/html;CHARSET=ISO-8859-1">
> <META HTTP-EQUIV="Content-Type" CONTENT="text/html;CHARSET=iso-8859-1">

Could you make at least a few of these into test documents?
* very minimal HTML documents
* encoded as iso-8859-1
* using one of these constructs
* including some non-ascii characters (will be a good test of the detection)

Thank you
--
olivier
Received on Tuesday, 7 August 2007 05:46:15 UTC