i67: quoting charsets

From: Mark Nottingham <mnot@mnot.net>
Date: Fri, 4 Apr 2008 16:55:23 +1100
Message-Id: <3EC52F82-8440-4444-BA99-CEC6E62999BC@mnot.net>
To: HTTP Working Group <ietf-http-wg@w3.org>


Earlier related thread at:

Roy wrote in the issue:
> There is some confusion here. First, HTTP allows both quoted and  
> unquoted forms in Content-Type, and that certainly isn't going to  
> change. However, HTTP only uses the charset ABNF production in  
> Accept-Charset, and thus is currently defined to only allow tokens  
> in Accept-Charset.
> Should Accept-Charset allow charset quoted strings? I don't think  
> so. Should the charset production be removed to reduce the  
> confusion? Perhaps. This is really a design issue.
> This would be a lot easier if IANA kept a decent registry for  
> charset that only included the "MIME preferred names". We may need  
> to request that in the IANA considerations

p3 3.1 already says:
> HTTP uses charset in two contexts: within an Accept-Charset request  
> header (in which the charset value is an unquoted token) and as the  
> value of a parameter in a Content-Type header (within a request or  
> response), in which case the parameter value of the charset  
> parameter may be quoted.
I can't find this text in 2616, so I'm guessing that the editors took  
a stab at resolving this before flipping it to a design issue?

At any rate, the interesting thing to me here is that the argument for  
allowing quoted charset content-type parameters seems to be that the  
BNF for params is
token | quoted-string
i.e., all parameters inherit the ability to be quoted from the generic
parameter syntax.

However, as part of the discussion of our favourite issue, #74, we've
come to the point of saying that field-content is *not* generically
subject to RFC2047 encoding, even though its BNF refers to TEXT
(albeit in comments).

I think we need to be more explicit about when a higher-level BNF  
rule's attributes (such as encoding and quoting) are inherited. This  
will help avoid a fair amount of reader confusion.

In this case, I'm fine with the added text above, but I think we also  
need to explicitly state that quoting in media-type parameters is  
syntactic, not semantic, and so both forms are equivalent (probably in  
p2 section 3.3) for any given parameter.
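
To make "syntactic, not semantic" concrete, here is a rough Python sketch
of a recipient that treats the two forms as equivalent. The names
param_value and charset_of are made up for illustration, and the regular
expressions are my own approximation of the token and quoted-string
rules, not text from any draft.

```python
import re

# Approximation of the token rule: visible ASCII minus separators.
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
# Approximation of quoted-string: DQUOTE, then text or quoted-pairs, then DQUOTE.
QUOTED = r'"(?:[^"\\]|\\.)*"'
# A parameter is ";" name "=" ( token | quoted-string ).
PARAM = re.compile(r";\s*(" + TOKEN + r")=(" + TOKEN + "|" + QUOTED + ")")


def param_value(raw: str) -> str:
    """Return the value a parameter carries, with any quoting removed."""
    if len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"':
        # Drop the surrounding quotes and resolve quoted-pairs (\x becomes x).
        return re.sub(r"\\(.)", r"\1", raw[1:-1])
    return raw


def charset_of(content_type: str):
    """Return the lower-cased charset parameter of a Content-Type value, or None."""
    for name, raw in PARAM.findall(content_type):
        if name.lower() == "charset":
            # Charset names are case-insensitive, so normalise for comparison.
            return param_value(raw).lower()
    return None


# Both spellings yield the same charset.
print(charset_of('text/html; charset=utf-8'))
print(charset_of('text/html; charset="UTF-8"'))
```

The point is only that the unquoting happens in one place, before any
parameter-specific code looks at the value.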

As far as accept-charset goes, I'm fine with leaving it just a token,  
and don't think we need any change there.
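
For contrast, a token-only reading of Accept-Charset might look like the
sketch below. The function name accept_charset_names is hypothetical,
q-values are simply discarded, and splitting on "," is a simplification
that is only safe because quoted strings are not accepted here.

```python
import re

# Approximation of the token rule: visible ASCII minus separators.
TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def accept_charset_names(header: str):
    """Return the lower-cased charset names listed in an Accept-Charset value."""
    names = []
    for element in header.split(","):
        element = element.strip()
        if not element:
            continue  # tolerate empty list elements
        # Keep the part before any ";q=..." parameter.
        name = element.split(";", 1)[0].strip()
        if not TOKEN_RE.fullmatch(name):
            # A quoted string such as "utf-8" is rejected: tokens only.
            raise ValueError(f"charset in Accept-Charset must be a token: {name!r}")
        names.append(name.lower())
    return names


print(accept_charset_names("iso-8859-5, unicode-1-1;q=0.8"))
```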

Mark Nottingham     http://www.mnot.net/
Received on Friday, 4 April 2008 05:55:59 UTC
