Re: flakey charset detection

Terje Bless wrote:
> David Brownell <david-b@pacbell.net> wrote:
> 
>>... Note that HTTP
>>is the "higher level protocol" in question, and it has a default
>>encoding of iso-8859-1 for all "text/*" mime types.
> 
> 
> Unfortunately, the HTML WG in their infinite wisdom have made the
> contradictory claim (normatively) that no default encoding should be
> assumed in the absence of an explicit character encoding indication.

Hmm -- news to me.  Not that I'd have noticed.  I suspect that to the
extent implementors even noticed that contradiction, they've resolved
it in favor of consistency with all the other web specs.  Certainly
my copy of Mozilla doesn't work that way, and neither have most of the
browsers where I've had occasion to notice such things.


>     The HTTP protocol ([RFC2616], section 3.7.1) mentions ISO-8859-1
>     as a default character encoding when the "charset" parameter is

It does more than "mention" it!  This looks like one of those "from
false premises, you can deduce anything" results.  Perhaps somebody
was unwilling to fix IE to obey that web standard, and was wielding
the editorial pen at a key point in the HTML4 process?  :)


> ]]] -- http://www.w3.org/TR/html401/charset.html#h-5.2.2

Curious.  But still, this page _used to validate_ just fine using
the W3C validator.  Issue a warning if you must, but I can't see
why an XHTML validator should count this as an error.  At the
very worst the 8859-1 default is common practice with strong support
in all standards except html4 (e.g. html3.2 over http) ... but it
seems most likely to me that that bit of html4 is a bug.
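
To make the "warn, don't fail" idea concrete, here is a rough sketch in
Python of the behaviour I'd expect.  It is not the W3C validator's code;
the function name, the constant, and the warning strings are all made up
for illustration.  It only assumes the RFC 2616 section 3.7.1 rule that
"text/*" types without a charset parameter default to iso-8859-1.

```python
from email.message import Message

# Assumption for this sketch: the HTTP/1.1 default for text/* media types.
HTTP_TEXT_DEFAULT = "iso-8859-1"


def effective_charset(content_type):
    """Return (charset, warning) for a Content-Type header value.

    Hypothetical helper: an explicit charset wins; otherwise text/*
    falls back to the HTTP default with a warning rather than an error.
    """
    msg = Message()
    msg["Content-Type"] = content_type

    # get_content_charset() returns the lowercased charset parameter,
    # or None when the header carries no charset.
    explicit = msg.get_content_charset()
    if explicit is not None:
        return explicit, None

    if msg.get_content_maintype() == "text":
        return HTTP_TEXT_DEFAULT, "no explicit charset; assuming iso-8859-1"

    # Non-text types have no HTTP-level default in this sketch.
    return None, "no explicit charset and no HTTP default for this type"


if __name__ == "__main__":
    for header in ("text/html", "text/html; charset=UTF-8",
                   "application/xhtml+xml"):
        print(header, "->", effective_charset(header))
```

The point of returning a warning alongside the charset is that the
caller can still report the missing declaration without rejecting the
page outright.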

- Dave

Received on Wednesday, 4 December 2002 16:54:24 UTC