- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Mon, 17 Mar 2008 13:48:18 +0100
- To: Mark Nottingham <mnot@mnot.net>
- CC: HTTP Working Group <ietf-http-wg@w3.org>
Mark Nottingham wrote:
>
> I think we've actually made progress on this; AFAICT, we seem to be
> moving towards removing the generic text WRT RFC2047 encoding and
> replacing it* with something that says that individual headers need to
> nominate an encoding mechanism directly, and giving guidance on when
> they should do so (roughly, wherever something is a candidate for
> display and/or user input).
Right. Let's clarify which of the HTTP/1.1 headers allow RFC2047-style
encoding; and let's also document a sane solution for new headers.
> Is that where we're at?
>
> If so, the next step would be to craft recommendations / requirements
> about what that mechanism will be. Possibilities discussed:
>
> a) RFC2047
I haven't seen any evidence of this being implemented.
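For reference, here is a minimal sketch of what an RFC2047 encoded-word round trip looks like, using Python's standard `email.header` module. The helper names `encode_rfc2047` and `decode_rfc2047` are made up for illustration; nothing here reflects what any HTTP implementation actually does.

```python
from email.header import Header, decode_header


def encode_rfc2047(text: str, charset: str = "utf-8") -> str:
    """Return text as RFC 2047 encoded-word(s), i.e. pure ASCII."""
    return Header(text, charset=charset).encode()


def decode_rfc2047(value: str) -> str:
    """Decode a field value that may contain RFC 2047 encoded-words."""
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            # Encoded chunks carry their charset; unencoded ones are ASCII.
            parts.append(chunk.decode(charset or "ascii"))
        else:
            # decode_header returns str when no encoded-word was found.
            parts.append(chunk)
    return "".join(parts)


if __name__ == "__main__":
    wire = encode_rfc2047("Grüße")
    print(wire)
    print(decode_rfc2047(wire))
```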
> b) UTF-8
Unfortunately, RFC2616, Section 4.2 currently states:
    message-header = field-name ":" [ field-value ]
    field-name     = token
    field-value    = *( field-content | LWS )
    field-content  = <the OCTETs making up the field-value
                     and consisting of either *TEXT or combinations
                     of token, separators, and quoted-string>
Thus, if we take that as the final word, we can't use anything but
Latin-1, and therefore need to encode non-Latin-1 characters.
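To make the constraint concrete, a small Python sketch that checks whether a value can be carried as-is under the Latin-1 reading. The function name `fits_latin1` is hypothetical, and the check is a simplification: it ignores the exclusion of control characters from TEXT.

```python
def fits_latin1(value: str) -> bool:
    """True if every character of value is representable in ISO-8859-1."""
    try:
        value.encode("latin-1")
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    for sample in ("Grüße", "Dvořák", "日本語"):
        # Values that do not fit would need some escaping mechanism.
        print(sample, fits_latin1(sample))
```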
> c) Something from BCP137 section 5
...which would be \u'nnnnnn' or &#xnnnnnn;...
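A rough Python sketch of those two escape forms, assuming the BCP137 form takes 4 to 6 hex digits between the quotes. The function names are invented for illustration, and the sketch does not deal with input that already contains a literal escape sequence.

```python
import re


def escape_bcp137(text: str) -> str:
    """Replace non-ASCII characters with the \\u'nnnn' form."""
    return "".join(
        ch if ord(ch) < 0x80 else "\\u'%04X'" % ord(ch) for ch in text
    )


def escape_xml_ncr(text: str) -> str:
    """Replace non-ASCII characters with the &#xnnnn; form."""
    return "".join(
        ch if ord(ch) < 0x80 else "&#x%X;" % ord(ch) for ch in text
    )


# Assumption: 4 to 6 hex digits inside the quotes.
_BCP137_ESCAPE = re.compile(r"\\u'([0-9A-Fa-f]{4,6})'")


def unescape_bcp137(text: str) -> str:
    """Reverse escape_bcp137."""
    return _BCP137_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


if __name__ == "__main__":
    print(escape_bcp137("Dvořák"))
    print(escape_xml_ncr("Dvořák"))
```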
> d) IRI->URI
> Separately, we'd need to open new issues for specifying these encodings
> for the field-values of:
> - From
...this one is currently defined in terms of RFC2822, Section 3.4...
> - Warning
Currently explicitly refers to RFC2047.
> - Content-Location
> - Location
> - Referer
These are URI references. No non-ASCII characters anyway.
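If an application starts from an IRI, the usual mapping is to percent-encode the UTF-8 octets of the non-ASCII characters before the value goes into one of these fields. A minimal Python sketch, assuming the host part is already ASCII (internationalized host names are not handled); `iri_to_uri` is a hypothetical helper name.

```python
from urllib.parse import quote

# Reserved characters and "%" are left alone so that existing URI
# syntax and existing percent-escapes pass through unchanged.
_SAFE = ":/?#[]@!$&'()*+,;=%"


def iri_to_uri(iri: str) -> str:
    """Percent-encode the UTF-8 octets of characters not allowed in a URI."""
    return quote(iri, safe=_SAFE, encoding="utf-8")


if __name__ == "__main__":
    print(iri_to_uri("http://example.com/r\u00e9sum\u00e9?q=caf\u00e9"))
```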
> - Content-Disposition (?)
Content-Disposition uses I18N *inside* the parameters, for which
RFC2231 already exists.
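As an illustration of the RFC2231 parameter form (charset, optional language, then percent-encoded octets), here is a short Python sketch. The helper names are made up, and parameter continuations are not covered.

```python
from urllib.parse import quote, unquote


def encode_ext_param(name: str, value: str,
                     charset: str = "utf-8", language: str = "") -> str:
    """Build name*=charset'language'percent-encoded-value."""
    encoded = quote(value, safe="", encoding=charset)
    return "%s*=%s'%s'%s" % (name, charset, language, encoded)


def decode_ext_value(ext_value: str) -> str:
    """Decode the charset'language'value part of an extended parameter."""
    charset, _language, encoded = ext_value.split("'", 2)
    return unquote(encoded, encoding=charset)


if __name__ == "__main__":
    param = encode_ext_param("filename", "naïve file.txt")
    print("Content-Disposition: attachment; " + param)
    print(decode_ext_value(param.split("=", 1)[1]))
```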
> Am I overlooking anything?
Reason-Phrase, for instance. In general, we need to answer whether
RFC2047 applies to everything using "comment" or "quoted-string".
> * It isn't actually replacing it, it's moving it to something specific
> to the field-value of headers. I don't hear anyone talking about
> internationalising other protocol elements at this point...
BR, Julian
Received on Monday, 17 March 2008 12:49:13 UTC