- From: Martin Duerst <duerst@w3.org>
- Date: Wed, 16 Feb 2005 14:46:00 +0900
- To: ietf-http-wg@w3.org
Dear HTTP experts,

RFC 2616 currently says, in 3.4, Character Sets:

    HTTP character sets are identified by case-insensitive tokens. The
    complete set of tokens is defined by the IANA Character Set
    registry [19].

        charset = token

    Although HTTP allows an arbitrary token to be used as a charset
    value, any token that has a predefined value within the IANA
    Character Set registry [19] MUST represent the character set
    defined by that registry. Applications SHOULD limit their use of
    character sets to those defined by the IANA registry.

The references then give

    [19] Reynolds, J. and J. Postel, "Assigned Numbers", STD 2,
         RFC 1700, October 1994.

This is a very old snapshot of the IANA charset registry, missing a few important entries (such as UTF-8). Based on this, we have seen claims saying that utf-8 cannot be used in HTTP.

While I would personally consider such claims somewhere between 'bogus' and 'doubtful', it would be great if the HTTP spec were changed to directly point to the IANA registry if and when updated in the future.

Regards, Martin.

P.S.: As a separate, but related issue, it might also be a good idea to remove the never actually effective default of iso-8859-1.
Received on Wednesday, 16 February 2005 05:59:05 UTC