Re: default charset broken

From: Kjetil Torgrim Homme <kjetilho@ifi.uio.no>
Date: Sat, 07 Jun 2003 17:13:21 +0200
To: Bjoern Hoehrmann <derhoermi@gmx.net>
Cc: www-validator@w3.org
Message-ID: <1rfzmlly3y.fsf@vingodur.ifi.uio.no>

[Bjoern Hoehrmann]:
>
>   * Kjetil Torgrim Homme wrote:
>   > you have to keep the layers separate here.  the HTTP transport
>   > delivers a Content-Type header to HTML, this has charset
>   > "ISO-8859-1" set implicitly.  HTML can NOT distinguish between
>   > the charset being explicit in HTTP or not, that behaviour is
>   > specified in the HTTP standard.
>   
>   Most major user agents implement what HTML says rather than what
>   HTTP/1.1 says, why should the HTML Validator not do what the HTML
>   specification requires?

ok, let's look at it:

|  The HTTP protocol ([RFC2616], section 3.7.1) mentions ISO-8859-1 as
|  a default character encoding when the "charset" parameter is absent
|  from the "Content-Type" header field. In practice, this
|  recommendation has proved useless

it is not a RECOMMENDATION, it is a REQUIREMENT ("are defined to" --
no leeway there).

|  because some servers don't allow a "charset" parameter to be sent,
|  and others may not be configured to send the parameter. Therefore,
|  user agents must not assume any default value for the "charset"
|  parameter.

so this reasoning doesn't apply: the premise is wrong, and you can't
ignore a requirement out of hand.

I'll reiterate: when it comes to specifying how HTTP works, the HTTP
RFC trumps the HTML spec.
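
to make the two readings concrete, here is a minimal sketch in Python.
it is not the validator's code; the function names are hypothetical and
the only library call relied on is the standard email.message.Message
header parser.  the first function applies the RFC 2616 section 3.7.1
default for "text" media types, the second follows the HTML 4.01 text
quoted above and assumes no default.

```python
from email.message import Message


def _parse(content_type):
    """Wrap a raw Content-Type header value in a Message for parsing."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg


def charset_param(content_type):
    """Return the explicit charset parameter (lowercased), or None."""
    value = _parse(content_type).get_param("charset")
    # get_param may return a tuple for RFC 2231 encoded values; this
    # sketch only handles the plain string case.
    return value.lower() if isinstance(value, str) else None


def effective_charset_rfc2616(content_type):
    """HTTP reading: text/* without a charset is defined as ISO-8859-1."""
    explicit = charset_param(content_type)
    if explicit is not None:
        return explicit
    if _parse(content_type).get_content_maintype() == "text":
        return "iso-8859-1"
    return None


def effective_charset_html4(content_type):
    """HTML 4.01 reading: no default is assumed when charset is absent."""
    return charset_param(content_type)


if __name__ == "__main__":
    for header in ("text/html", "text/html; charset=ISO-8859-1"):
        print(header, "->",
              effective_charset_rfc2616(header),
              effective_charset_html4(header))
```

under the HTTP reading both headers yield the same charset, so a
consumer above the HTTP layer cannot tell them apart; under the HTML
reading the bare "text/html" yields nothing and the consumer has to
look elsewhere.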

-- 
Kjetil T.
Received on Saturday, 7 June 2003 11:13:30 GMT
