- From: Martin Duerst <duerst@w3.org>
- Date: Wed, 16 Feb 2005 14:46:00 +0900
- To: ietf-http-wg@w3.org
Dear HTTP experts,
RFC 2616 currently says, in Section 3.4, Character Sets:
HTTP character sets are identified by case-insensitive tokens. The
complete set of tokens is defined by the IANA Character Set registry
[19].
charset = token
Although HTTP allows an arbitrary token to be used as a charset
value, any token that has a predefined value within the IANA
Character Set registry [19] MUST represent the character set defined
by that registry. Applications SHOULD limit their use of character
sets to those defined by the IANA registry.
The references section then gives:
[19] Reynolds, J. and J. Postel, "Assigned Numbers", STD 2, RFC 1700,
October 1994.
This is a very old snapshot of the IANA charset registry, missing
a few important entries (such as UTF-8).
Based on this, we have seen claims that utf-8 cannot be used
in HTTP. While I would personally consider such claims somewhere
between 'bogus' and 'doubtful', it would be great if the HTTP spec
were changed to point directly to the IANA registry, if and when
the spec is updated in the future.
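To illustrate the case-insensitive token matching quoted from 3.4,
here is a minimal sketch in Python. The function names charset_token
and is_known_charset are hypothetical, and Python's own codec table is
used only as a stand-in for the live IANA registry, which it does not
mirror exactly.

```python
import codecs


def charset_token(content_type):
    """Return the charset parameter of a Content-Type value, lowercased.

    Returns None when no charset parameter is present. This is a
    simplified parser for illustration, not a full header parser.
    """
    for param in content_type.split(";")[1:]:
        name, _, value = param.partition("=")
        # Parameter names and charset tokens are both case-insensitive.
        if name.strip().lower() == "charset":
            return value.strip().strip('"').lower()
    return None


def is_known_charset(token):
    """Check the token against Python's codec table.

    Assumption: the codec table stands in for the IANA registry here.
    """
    try:
        codecs.lookup(token)
        return True
    except LookupError:
        return False


if __name__ == "__main__":
    token = charset_token("text/html; charset=UTF-8")
    print(token, is_known_charset(token))
```

A lookup against a current table accepts utf-8 in any letter case,
whereas a check against a frozen 1994 snapshot would not.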
Regards, Martin.
P.S.: As a separate but related issue, it might also be a good
idea to remove the default of iso-8859-1, which was never
actually effective in practice.
Received on Wednesday, 16 February 2005 05:59:05 UTC