- From: Martin Duerst <duerst@it.aoyama.ac.jp>
- Date: Wed, 08 Aug 2007 10:32:39 +0900
- To: "McDonald, Ira" <imcdonald@sharplabs.com>, "David Dorward" <david@dorward.me.uk>, "Ernest Unrau" <ejunrau@mts.net>
- Cc: "www-validator Community" <www-validator@w3.org>, <www-international@w3.org>
It's very clear that the charset tags themselves are case-insensitive,
i.e. US-ASCII is as good as us-ascii is as good as uS-aSCii or any
other variant. It's also clear that for HTML, element and attribute
names are case-insensitive.
The question is whether the charset parameter on the Content-Type
HTTP header is case-sensitive or case-insensitive. Olivier earlier
said that he wasn't able to find anything relevant in the HTTP spec,
but I found this at http://www.ietf.org/rfc/rfc2616.txt:
>>>>>>>>
3.7 Media Types
HTTP uses Internet Media Types [17] in the Content-Type (section
14.17) and Accept (section 14.1) header fields in order to provide
open and extensible data typing and type negotiation.
media-type = type "/" subtype *( ";" parameter )
type = token
subtype = token
Parameters MAY follow the type/subtype in the form of attribute/value
pairs (as defined in section 3.6).
The type, subtype, and parameter attribute names are case-
insensitive. Parameter values might or might not be case-sensitive,
depending on the semantics of the parameter name.
>>>>>>>>
"charset" is a parameter attribute name, and therefore case-insensitive.
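To make that concrete, here is a minimal sketch in Python of how a consumer might read the header. The helper names parse_media_type and same_charset are my own, not from any spec or library. The sketch assumes simple parameters: it splits naively on ";" and would mishandle a quoted-string value that itself contains ";".

```python
def parse_media_type(value):
    """Split a Content-Type value into (type, subtype, params).

    Hypothetical helper for illustration only. Type, subtype and
    parameter names are lowercased because RFC 2616 section 3.7 makes
    them case-insensitive; parameter values are kept as sent.
    """
    parts = [p.strip() for p in value.split(";")]
    main_type, _, subtype = parts[0].partition("/")
    params = {}
    for part in parts[1:]:
        if not part:
            continue  # tolerate a trailing ";"
        name, _, val = part.partition("=")
        params[name.strip().lower()] = val.strip().strip('"')
    return main_type.strip().lower(), subtype.strip().lower(), params


def same_charset(a, b):
    # Charset tokens are case-insensitive (see the RFC 2616 text
    # quoted by Ira below), so compare them after case folding.
    return a.lower() == b.lower()


for header in ("text/html; charset=ISO-8859-4",
               "Text/HTML; CHARSET=iso-8859-4"):
    main_type, subtype, params = parse_media_type(header)
    print(main_type, subtype,
          same_charset(params["charset"], "ISO-8859-4"))
```

Both header spellings should come out as the same type, subtype and charset; only the stored parameter value keeps its original case.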
Section 3.7 is clearly referenced from Section 14.17:
>>>>>>>>
14.17 Content-Type
The Content-Type entity-header field indicates the media type of the
entity-body sent to the recipient or, in the case of the HEAD method,
the media type that would have been sent had the request been a GET.
Content-Type = "Content-Type" ":" media-type
Media types are defined in section 3.7. An example of the field is
Content-Type: text/html; charset=ISO-8859-4
Further discussion of methods for identifying the media type of an
entity is provided in section 7.2.1.
>>>>>>>>
Olivier said that "the rest of HTTP constructs" are case-sensitive,
but this is not true. Methods such as GET and PUT are case-sensitive,
but most of the other stuff is not because it was taken over from
email, where it is also not case-sensitive.
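A small sketch of the difference, again in Python and again with hypothetical helper names (is_method, is_header_name) that are mine rather than anything from the spec: method tokens are compared exactly, while header field names are compared after case folding.

```python
def is_method(token, method):
    # HTTP method tokens are case-sensitive: "get" is not "GET".
    return token == method


def is_header_name(name, expected):
    # Header field names are case-insensitive, as in email.
    return name.lower() == expected.lower()


print(is_method("GET", "GET"))                         # exact match
print(is_method("get", "GET"))                         # differs in case
print(is_header_name("content-type", "Content-Type"))  # folded match
```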
Regards, Martin.
At 04:33 07/08/08, McDonald, Ira wrote:
>
>Hi,
>
>Quoting HTTP/1.1 (RFC 2616), page 22:
>
>>> "HTTP character sets are identified by case-insensitive tokens. The
> complete set of tokens is defined by the IANA Character Set registry
> [19]."
>
>And the normative IANA Charset Registration Procedures (RFC 2978),
>page 4 says:
>
> "Finally, charsets being registered for use with the "text" media type
> MUST have a primary name that conforms to the more restrictive syntax
> of the charset field in MIME encoded-words [RFC-2047, RFC-2184] and
> MIME extended parameter values [RFC-2184]. A combined ABNF
> definition for such names is as follows:
>
> mime-charset = 1*mime-charset-chars
> mime-charset-chars = ALPHA / DIGIT /
> "!" / "#" / "$" / "%" / "&" /
> "'" / "+" / "-" / "^" / "_" /
> "`" / "{" / "}" / "~"
>>> ALPHA = "A".."Z" ; Case insensitive ASCII Letter
> DIGIT = "0".."9" ; Numeric digit"
>
>Any use of IANA charset tags in any standard that is case
>sensitive is broken.
>
>Cheers,
>- Ira - editor of IANA Charset MIB (RFC 3808)
>
>Ira McDonald (Musician / Software Architect)
>Chair - Linux Foundation Open Printing WG
>Blue Roof Music / High North Inc
>PO Box 221 Grand Marais, MI 49839
>phone: +1-906-494-2434
>email: imcdonald@sharplabs.com
>
>-----Original Message-----
>From: www-international-request@w3.org
>[mailto:www-international-request@w3.org]On Behalf Of David Dorward
>Sent: Tuesday, August 07, 2007 2:59 AM
>To: Ernest Unrau
>Cc: www-validator Community; www-international@w3.org
>Subject: Re: Validator case-sensitive bug for CHARSET?
>
>
>
>On 7 Aug 2007, at 08:11, Ernest Unrau wrote:
>> No HTML tags are case-sensitive, but it may indeed be that the CHARSET
>> parameter must be case sensitive since I'm told that the META tags are
>> mimicking HTML headers. Perhaps the servers that parse these
>> headers are
>> also case sensitive? But one would think that validation would fail on
>> other META tags also.
>
>There aren't any other meta tags that provide information needed in
>order to parse a document, so that isn't the case.
>
>--
>David Dorward
>http://dorward.me.uk/
>http://blog.dorward.me.uk/
>
#-#-# Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-# http://www.sw.it.aoyama.ac.jp mailto:duerst@it.aoyama.ac.jp
Received on Wednesday, 8 August 2007 01:34:25 UTC