Re: charset parameter

On 25.07.01 at 03:05, Bjoern Hoehrmann <derhoermi@gmx.net> wrote:

>* Terje Bless wrote:
>
>(ok, I won't cc: you for this list any longer :-)

For the record: CCs are unnecessary for w-v and other lists I follow
closely, but I certainly don't mind getting them, and if you want to be
sure you get my attention, please do send me a CC. My email client tells me
I have 35335 messages in 199 mailboxes; something is bound to get lost in
the noise. :-)


>>Since HTTP/1.1 has a default, XHTML can wave its hands all
>>it likes and it won't change a thing.
>
>For what or whom? HTML 4 explicitly says user agents must not assume a
>default value for the charset parameter, as does RFC 3023 for
>application/xml (and application/xhtml+xml refers to that), so this is
>rather intentional, isn't it? Sure, dumb applications that know
>nothing about HTML may assume some default encoding (but as for
>application/xml they SHOULD NOT) but we don't have to deal with that.

The issue is that the transport protocol says that the absence of an
explicit charset parameter on the Content-Type means "ISO-8859-1"; HTML or
XML rules don't apply here. By the time you parse the markup, you already
have a charset; the XML/HTML rules do not govern HTTP.

Now application/xml and application/xhtml+xml may well change this, but for
text/html we're still stuck with it.

That's the theory...
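
To make the transport-level rule concrete, here is a rough sketch of how a
client might derive the charset from the Content-Type header alone, before
looking at any markup. It assumes Python and its standard email.message
module for parameter parsing; the function name transport_charset and the
constant HTTP_TEXT_DEFAULT are made up for illustration, and the default
applied is my reading of the text/* rule in RFC 2616 section 3.7.1.

```python
from email.message import Message
from typing import Optional

# Assumption: the HTTP/1.1 (RFC 2616, section 3.7.1) default for text/* media.
HTTP_TEXT_DEFAULT = "ISO-8859-1"


def transport_charset(content_type: str) -> Optional[str]:
    """Return the charset implied by a Content-Type header value alone.

    Hypothetical helper: an explicit charset parameter wins; otherwise
    text/* types fall back to the HTTP default, and other types get None.
    """
    msg = Message()
    msg["Content-Type"] = content_type

    # get_content_charset() returns the parameter lowercased, or None if absent.
    explicit = msg.get_content_charset()
    if explicit is not None:
        return explicit

    # No explicit parameter: the transport default only covers text/* types.
    if msg.get_content_maintype() == "text":
        return HTTP_TEXT_DEFAULT

    # e.g. application/xml: nothing is implied at this layer.
    return None
```

Under this reading, a bare "text/html" already carries a charset by the
time the markup is parsed, while "application/xml" does not.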


In practice you have to decide between "Assume ISO-8859-1 as that's what
/people/ tend to assume" or "Assume nothing as people will get it wrong
some part of the time".


In any case, we'll fix this in our pages when an opportunity presents
itself. No reason to set a bad example. :-)

Received on Tuesday, 24 July 2001 21:54:10 UTC