Re: Default Charsets

Mark Nottingham wrote:
> 
> 2616 Section 3.7.1 states;
> 
>> When no explicit charset parameter is provided by the sender, media  
>> subtypes of the "text" type are defined to have a default charset  
>> value of "ISO-8859-1" when received via HTTP.
> 
> However, many, if not all, of the text/* media types define their own
> defaults; text/plain (RFC2046), for example, defaults to ASCII, as does
> text/xml (RFC3023).
> 
> How do these format-specific defaults interact with HTTP's default? Is
> HTTP really overriding them?

Recent releases of the Apache HTTP Server have -stopped- adding a default
charset directive to the stock configuration.  Of course, the user is free
to enable it.

The additional clutter of in-document metadata tags for charset further
complicates this issue.  Charset was defined at the protocol level (HTTP
headers), not at the content level, and now you have clients legitimately
confused over which card to draw.

> I'm far from the first to be confused by this text, and I'm sure it's
> been asked before, but I haven't been able to find a definitive answer.
> If errata are still being considered, perhaps removing/modifying this
> line would be a good start...

It would have been preferable for the HTTP RFC to define only the charset of
the URI and the charset of the headers, and to leave the charset of the
message body as a content-type-specific implicit assumption/default.
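
To make the two readings concrete, here is a rough sketch of how a recipient
might pick a charset for a body.  It is only an illustration, not what any
particular client does: the function name effective_charset, the
prefer_type_default flag, and the small TYPE_DEFAULTS table are made up for
this example, and the table lists only the two types mentioned above.

```python
from email.message import Message

# Illustrative table only: the format-specific defaults cited in this thread.
TYPE_DEFAULTS = {
    "text/plain": "us-ascii",  # RFC 2046
    "text/xml": "us-ascii",    # RFC 3023
}

# RFC 2616 section 3.7.1 default for text/* received via HTTP.
HTTP_TEXT_DEFAULT = "iso-8859-1"


def effective_charset(content_type, prefer_type_default):
    """Return the charset a recipient would assume, or None if unknown.

    prefer_type_default is a hypothetical switch between the two readings:
    True  -> the media type's own default wins over HTTP's default
    False -> HTTP's ISO-8859-1 default overrides the media type's default
    """
    msg = Message()
    msg["Content-Type"] = content_type

    # An explicit charset parameter always wins.
    explicit = msg.get_content_charset()
    if explicit:
        return explicit

    media_type = msg.get_content_type()
    if prefer_type_default and media_type in TYPE_DEFAULTS:
        return TYPE_DEFAULTS[media_type]

    # Fall back to the HTTP default, which applies to text/* only.
    if msg.get_content_maintype() == "text":
        return HTTP_TEXT_DEFAULT
    return None


print(effective_charset("text/xml", prefer_type_default=True))
print(effective_charset("text/xml", prefer_type_default=False))
```

The same text/xml response with no charset parameter gets a different answer
depending on which reading the client follows, which is the ambiguity in
question.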

Received on Monday, 1 May 2006 20:19:40 UTC