UTF-8 (was: PROPOSAL: i74: Encoding for non-ASCII headers)

From: Frank Ellermann <nobody@xyzzy.claranet.de>
Date: Mon, 31 Mar 2008 20:12:07 +0200
To: ietf-http-wg@w3.org
Message-ID: <fsr9d8$pbm$1@ger.gmane.org>

Henrik Nordstrom wrote:

> My gut feeling is that the best long term move would be to move to
> UTF-8 and forget about 2047 and accept that some existing things
> MAY break, BUT as you say it can not be proved to be completely
> without problems for existing implementations. In fact it's very
> likely to cause problems in some areas:

> - Authentication (RFC2617, here ISO-8859-1 is actively supported
> today, but not really sufficient)

ACK, in theory that could be fixed by adopting RFC 2831 or 2831bis
magic.  I'm not exactly sure about Unicode 3.2 SASLPREP in 2831bis,
maybe RFC 5198 NFC minus anything declared to be bad in 3987bis is
good enough for a 2617bis.
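
To illustrate why some normalization step matters for credentials, here
is a minimal sketch of the NFC part only. It assumes a hypothetical
helper nfc_utf8() that is not taken from any of the drafts mentioned,
and it implements neither SASLprep nor any 3987bis exclusions:

```python
import unicodedata


def nfc_utf8(text: str) -> bytes:
    """Hypothetical helper: normalize to NFC, then encode as UTF-8."""
    return unicodedata.normalize("NFC", text).encode("utf-8")


composed = "J\u00fcrgen"     # precomposed U+00FC (u with diaeresis)
decomposed = "Ju\u0308rgen"  # 'u' followed by U+0308 combining diaeresis

# The two spellings look identical on screen but differ as octets,
# so a byte-wise comparison or hash of the raw UTF-8 would not match.
assert composed.encode("utf-8") != decomposed.encode("utf-8")

# After NFC both spellings produce the same octets.
assert nfc_utf8(composed) == nfc_utf8(decomposed)
```

Both ends would have to normalize the same way before hashing; the
sketch only shows that unnormalized UTF-8 is not enough on its own.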

> - Cookie, for applications using/setting cookies in both client
> and server contexts (not just echoing what you got)

I cannot judge cookies.  I'm happy when I find a way to disable
double-analytics-tracker cookies from 3rd parties; for undisclosed
reasons FF2 makes that more difficult than IE6, but it is possible.

> For now I think the only possible outcome is to keep what we have;
> ISO-8859-1 as default, but clarifying that intermediaries should
> handle them as 8-bit ASCII strings and consider the C1
> set (0x80-0x9F) as just another set of octets of the string (not
> as control characters or invalid) and a note that future headers
> MAY be seen using UTF-8 encoding.

Point taken - I did not consider that when I proposed to exclude C1
from Mark's 2B proposal.  Oddly, we all agree that we want UTF-8
"later", but have different ideas about how to get there.
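
The "just another set of octets" handling quoted above can be sketched
as follows. This is only an illustration under stated assumptions:
forward_header_value() and is_valid_utf8() are hypothetical names, and
the sample values are made up for the example.

```python
def forward_header_value(value: bytes) -> bytes:
    """Hypothetical intermediary step: pass a header value through unchanged.

    Decoding as ISO-8859-1 maps every octet 0x00-0xFF to exactly one code
    point, so the C1 range 0x80-0x9F survives the round trip untouched.
    """
    text = value.decode("latin-1")
    return text.encode("latin-1")


def is_valid_utf8(value: bytes) -> bool:
    """Return True if the octets form a well-formed UTF-8 sequence."""
    try:
        value.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A value containing an ISO-8859-1 e-acute plus two C1 octets.
c1_value = b"caf\xe9 \x80\x9f"
assert forward_header_value(c1_value) == c1_value

# A UTF-8 encoded value also passes through octet for octet.
utf8_value = "caf\u00e9".encode("utf-8")
assert forward_header_value(utf8_value) == utf8_value

# A strict UTF-8 decoder would reject the first value, which is why an
# intermediary that assumes UTF-8 everywhere could break existing traffic.
assert not is_valid_utf8(c1_value)
assert is_valid_utf8(utf8_value)
```

An intermediary that never interprets the octets stays neutral between
ISO-8859-1 today and UTF-8 "later"; only the endpoints need to agree.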

 Frank
Received on Monday, 31 March 2008 18:11:03 GMT
