Re: Default charsets for text media types [i20]

On Wed, 2008-03-26 at 00:15 +1100, Mark Nottingham wrote:
> A few people have noted a security issue in a widely-used browser that  
> requires (b). However, I haven't seen a reference to a vulnerability  
> report, etc. yet; is anyone aware of one?

The security issue exists only because some UAs sniff, which makes them
parse the content in a different character set than other UAs do.

If we do say something more about this, we should very clearly state
that IF the server has indicated a character set, the client MUST use
that character set and MUST NOT sniff the content to guess one.
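
To make the rule concrete, here is a rough Python sketch of a client
that takes the charset only from the Content-Type header and never
looks at the body bytes to guess. The function names and the
HTTP_DEFAULT_CHARSET constant are made up for illustration, and the
fallback value assumes the HTTP/1.1 default for text media types; this
is not taken from any particular UA.

```python
from email.message import Message

# Assumption: fallback when no charset parameter is present
# (the HTTP/1.1 default for text/* media types).
HTTP_DEFAULT_CHARSET = "iso-8859-1"


def declared_charset(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() lowercases the value; None if absent.
    return msg.get_content_charset()


def decode_body(body, content_type):
    """Decode using only the declared charset; the bytes are never sniffed.

    An unsupported charset name raises LookupError here. How a UA
    should really act in that case is the open question.
    """
    charset = declared_charset(content_type) or HTTP_DEFAULT_CHARSET
    return body.decode(charset)
```

With this, a UTF-8 body that the server labels as ISO-8859-1 is decoded
as ISO-8859-1 even though it "looks like" UTF-8, which is exactly the
behaviour that keeps all UAs in agreement.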

This is already said in "Missing Charset", more or less. What is left
open there is how the UA should act if it doesn't support the indicated
character set.


I prefer to keep the status quo on how user agents should behave if
no charset parameter is specified. The spec is clear that this means
ISO-8859-1, but note that some UAs do things differently.

Having ISO-8859-1 as the default instead of US-ASCII is not really a
big concern, as US-ASCII is a strict subset of ISO-8859-1. Content
meant to be rendered as US-ASCII will render just fine as ISO-8859-1.
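
The subset claim is easy to check. A small Python sketch (the helper
name is made up for illustration) that decodes the same bytes both
ways:

```python
def decodes_identically(data):
    """True if the bytes decode to the same text as US-ASCII and ISO-8859-1."""
    return data.decode("ascii") == data.decode("iso-8859-1")


# Every 7-bit byte value (0x00-0x7F) maps to the same character in both.
all_ascii = bytes(range(128))
ascii_is_subset = decodes_identically(all_ascii)

# The reverse does not hold: bytes 0x80-0xFF are valid ISO-8859-1
# but are rejected by a strict US-ASCII decoder.
try:
    b"\xe9".decode("ascii")
    high_byte_is_ascii = True
except UnicodeDecodeError:
    high_byte_is_ascii = False
```

So switching the assumed default from US-ASCII to ISO-8859-1 cannot
change how pure US-ASCII content is rendered.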

That said, I am in favor of softly deprecating the ISO-8859-1 default:
recommend that servers not rely on it, since some user agents are known
to guess the character set, and instead always specify the charset
explicitly when it is known. ISO-8859-1 would remain the official
default when no charset is specified, for historical reasons. This
means moving some of the compatibility notes about HTTP/1.0 user agents
not understanding Content-Type values with parameters to historical
notes.

Regards
Henrik

Received on Tuesday, 25 March 2008 16:23:00 UTC