Re: charset parameter

From: Martin Duerst <duerst@w3.org>
Date: Wed, 25 Jul 2001 16:57:12 +0900
To: Terje Bless <link@pobox.com>, Bjoern Hoehrmann <derhoermi@gmx.net>
Cc: www-validator@w3.org
At 03:53 01/07/25 +0200, Terje Bless wrote:
>On 25.07.01 at 03:05, Bjoern Hoehrmann <derhoermi@gmx.net> wrote:

> >For what or whom? HTML 4 explicitly says user agents must not assume a
> >default value for the charset parameter, as does RFC 3023 for
> >application/xml (and application/xhtml+xml refers to that), so this is
> >rather intentional, isn't it? Sure, dumb applications that know
> >nothing about HTML may assume some default encoding (though for
> >application/xml they SHOULD NOT), but we don't have to deal with that.
>The issue is that the transport protocol says that the absence of an explicit
>charset parameter on the Content-Type means "ISO-8859-1"; HTML or XML rules
>don't apply here. When it comes time to parse the markup, you already have
>a charset; the XML/HTML rules do not govern HTTP.

Sorry, but the HTML 4 spec explicitly says that the HTTP default
doesn't work.
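
To make the two readings concrete, here is a minimal sketch in Python,
assuming the only input is the raw Content-Type header value. The
function names and the http_default flag are hypothetical and not taken
from the validator's code. With http_default=True a missing charset
parameter falls back to ISO-8859-1 (the HTTP reading); with the flag off
the result is None, meaning nothing is assumed (the HTML 4 reading).

```python
from email.message import Message


def declared_charset(content_type):
    """Return the charset parameter of a Content-Type value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_param("charset")
    # get_param returns None when the parameter is missing; a non-string
    # (RFC 2231 encoded) value is treated as "not declared" in this sketch.
    return charset.lower() if isinstance(charset, str) else None


def effective_charset(content_type, http_default=False):
    """Pick a charset under one of the two policies discussed in this thread.

    http_default=True  -> assume ISO-8859-1 when no charset is declared
    http_default=False -> assume nothing; the caller must look elsewhere
    """
    declared = declared_charset(content_type)
    if declared is not None:
        return declared
    return "iso-8859-1" if http_default else None
```

Under the second policy, a None result would leave it to the caller to
check other sources, such as a META declaration in the document, or to
report that no encoding was specified.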

>Now application/xml and application/xhtml+xml may well change this, but for
>text/html we're still stuck with it.
>That's the theory...
>In practice you have to decide between "Assume ISO-8859-1 as that's what
>/people/ tend to assume" or "Assume nothing as people will get it wrong
>some part of the time".

Well, in your part of the world, that's what /people/ tend to assume,
but in this part of the world, assumptions are quite different.

>In any case, we'll fix this in our pages when an opportunity presents
>itself. No reason to set a bad example. :-)


Regards,   Martin.
Received on Wednesday, 25 July 2001 08:11:01 UTC
