- From: Henrik Nordstrom <henrik@henriknordstrom.net>
- Date: Mon, 31 Mar 2008 19:30:17 +0200
- To: Frank Ellermann <hmdmhdfmhdjmzdtjmzdtzktdkztdjz@gmail.com>
- Cc: ietf-http-wg@w3.org
On Fri 2008-03-28 at 16:33 +0100, Frank Ellermann wrote:

> If we'd work on HTTP/1.2 proposing to ignore RFC 5198 would
> be madness, but we are supposed to improve the HTTP/1.1 spec.

Actually there would be no real difference here if we worked on HTTP/1.2. In HTTP, headers are defined to have the same meaning for as long as the major version number is the same. If changing TEXT from ISO-8859-1 to UTF-8 is a problem for HTTP/1.1, it's likewise a problem for HTTP/1.2, as HTTP/1.2 still needs to deal with how the message is understood if it is downgraded to an earlier protocol version by an intermediary.

My gut feeling is that the best long-term move would be to move to UTF-8, forget about 2047, and accept that some existing things MAY break, BUT as you say it cannot be proved to be completely without problems for existing implementations. In fact it's very likely to cause problems in some areas:

- Authentication (RFC2617; here ISO-8859-1 is actively supported today, but not really sufficient)
- Cookie, for applications using/setting cookies in both client and server contexts (not just echoing what you got)

The business of relying on RFC2047 encoding or similar "obfuscation" is quite likely to get more bad implementations than UTF-8, and the risk of security implications at the protocol level due to mismatches between ISO-8859-1 / UTF-8 expectations is pretty minimal.

Related to Cookie, it may be worth mentioning that RFC2965 (Cookie2/Set-Cookie2) defines that the human-visible attribute (Comment) must have its value encoded in UTF-8, within HTTP...

For now I think the only possible outcome is to keep what we have: ISO-8859-1 as the default, but clarifying that intermediaries should handle header values as 8-bit (ASCII-superset) octet strings and consider the C1 set (0x80-0x9F) as just another set of octets of the string (not as control characters or invalid), plus a note that future headers MAY be seen using UTF-8 encoding.
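To illustrate the "just another set of octets" handling suggested above, here is a minimal Python sketch. The function names (`relay_field_value`, `has_c1_octets`) and the sample value are hypothetical, made up for illustration only; they are not taken from any spec or implementation. The sketch assumes an intermediary that forwards a header value byte for byte without interpreting it, and it shows why UTF-8 data can land in the 0x80-0x9F range even though the same text in ISO-8859-1 does not.

```python
def relay_field_value(raw: bytes) -> bytes:
    """Hypothetical intermediary step: forward the octets unchanged.

    No decoding, no filtering of 0x80-0x9F, no re-encoding.
    """
    return bytes(raw)


def has_c1_octets(raw: bytes) -> bool:
    """Return True if any octet falls in the 0x80-0x9F range."""
    return any(0x80 <= octet <= 0x9F for octet in raw)


text = "Grüße"  # illustrative sample value, an assumption of this sketch

as_utf8 = text.encode("utf-8")         # b'Gr\xc3\xbc\xc3\x9fe'
as_latin1 = text.encode("iso-8859-1")  # b'Gr\xfc\xdfe'

# The UTF-8 form of "ß" is C3 9F, so the second octet is in 0x80-0x9F.
print(has_c1_octets(as_utf8))    # True
print(has_c1_octets(as_latin1))  # False

# An opaque relay preserves the UTF-8 octets exactly.
print(relay_field_value(as_utf8) == as_utf8)  # True

# ISO-8859-1 maps every octet to a code point, so a component that
# reads the value as ISO-8859-1 and writes it back loses nothing,
# as long as it does not reject or strip the 0x80-0x9F octets.
round_trip = as_utf8.decode("iso-8859-1").encode("iso-8859-1")
print(round_trip == as_utf8)  # True
```

A recipient that expects UTF-8 can then decode the relayed octets itself; the point is only that the intermediary does not need to know which of the two encodings was used.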
Switching to UTF-8 in general is possible, but it may require a new header declaring that the message is sent using UTF-8, and it is outside the scope of RFC2616bis until addressing I18N becomes a strong requirement for advancing as a standard. But switching to UTF-8 is IMHO most likely the sanest way of addressing I18N in both 2616 and 2617, and the least likely to cause long-term interop issues.

Regards
Henrik
Received on Monday, 31 March 2008 17:31:13 UTC