- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Mon, 17 Mar 2008 13:48:18 +0100
- To: Mark Nottingham <mnot@mnot.net>
- CC: HTTP Working Group <ietf-http-wg@w3.org>
Mark Nottingham wrote:
> 
> I think we've actually made progress on this; AFAICT, we seem to be 
> moving towards removing the generic text WRT RFC2047 encoding and 
> replacing it* with something that says that individual headers need to 
> nominate an encoding mechanism directly, and giving guidance on when 
> they should do so (roughly, wherever something is a candidate for 
> display and/or user input).
Right. Let's clarify which of the HTTP/1.1 headers allow RFC2047-style 
encoding; and let's also document a sane solution for new headers.
> Is that where we're at?
> 
> If so, the next step would be to craft recommendations / requirements 
> about what that mechanism will be. Possibilities discussed;
> 
> a) RFC2047
I haven't seen any evidence of this being implemented.
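For reference, a minimal sketch of what an RFC2047 encoded-word round trip looks like, using Python's standard email.header module. The helper names encode_rfc2047 and decode_rfc2047 are made up for illustration, and nothing here implies that any HTTP implementation behaves this way.

```python
from email.header import Header, decode_header

def encode_rfc2047(text, charset="utf-8"):
    # Hypothetical helper: wrap text in an RFC 2047 encoded-word
    # ("=?charset?encoding?payload?=") so the result is pure ASCII.
    return Header(text, charset).encode()

def decode_rfc2047(value):
    # Hypothetical helper: reverse the encoding above.
    parts = []
    for raw, charset in decode_header(value):
        if isinstance(raw, bytes):
            parts.append(raw.decode(charset or "ascii"))
        else:
            parts.append(raw)
    return "".join(parts)

encoded = encode_rfc2047("Zürich")
print(encoded)
print(decode_rfc2047(encoded))
```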
> b) UTF-8
Unfortunately, RFC2616, Section 4.2 currently states:
     message-header = field-name ":" [ field-value ]
     field-name     = token
     field-value    = *( field-content | LWS )
     field-content  = <the OCTETs making up the field-value
                      and consisting of either *TEXT or combinations
                      of token, separators, and quoted-string>
Thus, if we take that as the final word, we can't use anything but 
Latin-1, and so need to encode any non-Latin-1 characters.
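To make the constraint concrete, here is a small sketch that tests whether a field value fits into ISO-8859-1 as is. The function name needs_encoding is an assumption for illustration, and the check only looks at the character repertoire; it does not validate control characters or the rest of the field-content grammar.

```python
def needs_encoding(field_value):
    # Hypothetical helper: True if the value contains characters
    # that cannot be represented in ISO-8859-1 (Latin-1).
    try:
        field_value.encode("latin-1")
    except UnicodeEncodeError:
        return True
    return False

print(needs_encoding("Zürich"))  # fits into Latin-1
print(needs_encoding("東京"))     # does not fit, would need encoding
```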
> c) Something from BCP137 section 5
...which would be \u'nnnnnn' or &#xnnnnnn;...
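A rough sketch of the two escape forms, assuming we escape every non-ASCII code point and pad to at least four hex digits. The function escape_bcp137 and its style parameter are invented for this example; it also does not deal with literal backslashes or ampersands already present in the input, which a real mechanism would have to address.

```python
def escape_bcp137(text, style="backslash"):
    # Hypothetical helper: replace each non-ASCII character with
    # \u'nnnn' (style="backslash") or &#xnnnn; (any other style).
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)  # ASCII passes through unchanged
        elif style == "backslash":
            out.append("\\u'{:04X}'".format(cp))
        else:
            out.append("&#x{:04X};".format(cp))
    return "".join(out)

print(escape_bcp137("Zürich"))          # Z\u'00FC'rich
print(escape_bcp137("Zürich", "xml"))   # Z&#x00FC;rich
```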
> d) IRI->URI
> Separately, we'd need to open new issues for specifying these encodings 
> for the field-values of:
>   - From
...this one is currently defined in terms of RFC2822, Section 3.4...
>   - Warning
Currently explicitly refers to RFC2047.
>   - Content-Location
>   - Location
>   - Referer
These are URI references, so no non-ASCII characters anyway.
>   - Content-Disposition (?)
Content-Disposition uses I18N *inside* the parameters, for which there 
already is RFC2231.
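As an illustration of the RFC2231 extended-parameter form (charset'language'percent-encoded-value), here is a minimal sketch. The helper rfc2231_param is hypothetical; it percent-encodes everything except unreserved ASCII characters and does not handle parameter continuations.

```python
from urllib.parse import quote

def rfc2231_param(name, value, charset="UTF-8", language=""):
    # Hypothetical helper: build name*=charset'language'value,
    # percent-encoding the value's octets in the given charset.
    encoded = quote(value, safe="", encoding=charset)
    return "{}*={}'{}'{}".format(name, charset, language, encoded)

print("Content-Disposition: attachment; "
      + rfc2231_param("filename", "naïve file.txt"))
# Content-Disposition: attachment; filename*=UTF-8''na%C3%AFve%20file.txt
```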
> Am I overlooking anything?
Reason-Phrase, for instance. In general, we need to answer whether 
RFC2047 applies to everything using "comment" or "quoted-string".
> * It isn't actually replacing it, it's moving it to something specific 
> to the field-value of headers. I don't hear anyone talking about 
> internationalising other protocol elements at this point...
BR, Julian
Received on Monday, 17 March 2008 12:49:13 UTC