Re: ignore dashes etc. (was Registration of new charset GB18030 (fwd))

You are quite right. I was inferring from Martin's message, and should have
looked at the source documents:

http://www.iana.org/assignments/character-sets says quite clearly:

The character set names may be up to 40 characters taken from the
printable characters of US-ASCII.  However, no distinction is made
between use of upper and lower case letters.

http://www.ietf.org/rfc/rfc2978.txt also mentions it briefly (and less than
clearly):

...A combined ABNF
   definition for such names is as follows:

     mime-charset = 1*mime-charset-chars
     mime-charset-chars = ALPHA / DIGIT /
                "!" / "#" / "$" / "%" / "&" /
                "'" / "+" / "-" / "^" / "_" /
                "`" / "{" / "}" / "~"
     ALPHA        = "A".."Z"    ; Case insensitive ASCII Letter
     DIGIT        = "0".."9"    ; Numeric digit
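
As a rough illustration of what that ABNF accepts, here is a minimal Python
sketch. The function name is_mime_charset and the max_len parameter are made
up for this example; the 40-character limit comes from the registry text
quoted above, not from the ABNF itself.

```python
import re

# Character class transcribed from the mime-charset-chars rule quoted above.
# ALPHA is case insensitive, so both letter cases are accepted.
_MIME_CHARSET_RE = re.compile(r"[A-Za-z0-9!#$%&'+\-^_`{}~]+")


def is_mime_charset(name: str, max_len: int = 40) -> bool:
    """Return True if name fits the quoted ABNF and the 40-character limit.

    Hypothetical helper for illustration only.
    """
    return len(name) <= max_len and _MIME_CHARSET_RE.fullmatch(name) is not None
```

Under these assumptions "GB18030" and "ISO-8859-1" pass, while a name
containing a space or a colon does not.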

And case-insensitivity is a good thing; also good would be hyphen and
underscore insensitivity.
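
To make the suggestion concrete, a loose comparison could look something like
the Python sketch below. The names charset_key and same_charset are invented
for this example, and nothing here checks whether dropping hyphens and
underscores could make two distinct registered names collide.

```python
def charset_key(name: str) -> str:
    """Fold case and drop hyphens and underscores (hypothetical helper)."""
    return name.lower().replace("-", "").replace("_", "")


def same_charset(a: str, b: str) -> bool:
    """Compare two charset names, ignoring case, hyphens, and underscores."""
    return charset_key(a) == charset_key(b)
```

With this, "US-ASCII" and "us_ascii" compare equal, while "UTF-8" and "UTF-16"
remain distinct.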

Mark
___
mark.davis@us.ibm.com
IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193
(408) 256-3148
fax: (408) 256-0799



From:     ned.freed@mrochek.com
Date:     2002.07.20 22:47
To:       Martin Duerst <duerst@w3.org>
cc:       Mark Davis/Cupertino/IBM@IBMUS, charsets <ietf-charsets@iana.org>,
          Markus Scherer <markus.scherer@jtcsv.com>
Subject:  Re: ignore dashes etc. (was Registration of new charset GB18030 (fwd))



> At 20:41 02/07/18 -0700, Mark Davis wrote:

> >And what harm does it do, to make the name matching case-insensitive --
> >especially since a great many implementations do that anyway?

> Case-insensitive matching doesn't harm, as 'charset' matching was
> always case sensitive in the specs and in all implementations.

I don't know where you got this idea, but it simply isn't true. RFC 2046
section 4.1.2 is quite clear on the matter:

  Unlike some other parameter values, the values of the charset parameter
  are NOT case sensitive.

I also can assure you that various cases of US-ASCII, Iso-8859-1, and
numerous other charsets are routinely used in practice.

Now, it is true that RFC 2278 doesn't come out and say that all charset
values are case-insensitive. And this should probably be clarified. But it
is a heck of a stretch to infer that they are case sensitive given that the
subset intended for use in MIME most definitely is not. (This last point is
actually reiterated in the ABNF in RFC 2978 section 2.3.)

                                                 Ned

Received on Sunday, 21 July 2002 05:10:11 UTC