Re: Accept-Charset support

# However, what I (and others) object against very strongly is
# the combination of RFC 1522 with the exception for ISO-8859-1.
# If everybody has to use RFC 1522, there must not be an
# exception for ISO-8859-1. ISO-8859-1 and Western Europe is
# not really anything special. If we choose RFC 1522, then
# everybody should use it, there should not be any exceptions.

The exception for ISO-8859-1 for warning messages in HTTP is based on
the fact that there is an exception for ISO-8859-1 for text documents,
and that it made no sense for the protocol to be inconsistent.

It is a historical fact that the web's origin at CERN in western
Europe gives it a western-European bias. This is perhaps unfortunate
(if you're not a western European), but your proposal that the default
be UTF-8 does little for the many parts of the world that currently
use other encodings as their default.

You're proposing that recipients apply heuristics to decide whether
the warning messages are in UTF-8 or ISO-8859-1. It seems like a bad
idea to turn something that's deterministic into something that's
heuristic. The 12-byte overhead of the "=?UTF-8?Q?" prefix and the
"?=" suffix in the warning message isn't so big, and isn't really
"Clogging up the 8-bit channel".
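
To make the deterministic rule concrete, here is a rough sketch in
Python of how a recipient might decode warning text: anything wrapped
as an RFC 1522 encoded-word is decoded with the charset it names, and
everything else is taken as ISO-8859-1. The names decode_warning_text,
ENCODED_WORD and OVERHEAD are my own, for illustration only, and the
sketch leaves out parts of RFC 1522 such as dropping whitespace
between adjacent encoded-words.

```python
import base64
import quopri
import re

# "=?charset?encoding?payload?=" as described in RFC 1522 (simplified).
ENCODED_WORD = re.compile(r"=\?([^?]+)\?([QqBb])\?([^?]*)\?=")

# Fixed cost of labelling a UTF-8 warning text: 10 + 2 = 12 bytes.
OVERHEAD = len("=?UTF-8?Q?") + len("?=")


def _decode_word(match):
    """Decode one encoded-word using the charset it declares."""
    charset, encoding, payload = match.group(1), match.group(2), match.group(3)
    data = payload.encode("ascii")
    if encoding.upper() == "B":
        raw = base64.b64decode(data)
    else:
        # header=True makes "_" decode to a space, as Q encoding requires.
        raw = quopri.decodestring(data, header=True)
    return raw.decode(charset)


def decode_warning_text(raw: bytes) -> str:
    """No guessing: unlabelled bytes are ISO-8859-1, labelled ones are decoded as labelled."""
    text = raw.decode("iso-8859-1")
    return ENCODED_WORD.sub(_decode_word, text)
```

With this rule, both the bytes "caf" followed by 0xE9 and the labelled
form "=?UTF-8?Q?caf=C3=A9?=" decode to the same four-character word,
and the recipient never has to inspect the bytes to guess a charset.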

Perhaps by the time Unicode is widespread -- in the next 3-5 years --
we'll have a new version of HTTP, whether HTTP 2.x or HTTP-NG. I would
certainly propose that future versions of HTTP default to UTF-8.


