
Re: Fallback to UTF-8

From: Jukka K. Korpela <jkorpela@cs.tut.fi>
Date: Fri, 25 Apr 2008 13:19:05 +0300
Message-ID: <024501c8a6c0$e111f040$0500000a@DOCENDO>
To: "W3C Validator Community" <www-validator@w3.org>

Henri Sivonen wrote:

> My point is that while HTML 4.01
> doesn't specify this properly, this is a solved problem (by HTML 5)

You're joking, right? "HTML 5" is a collection of incomplete sketches.

HTML 4.01 rather properly specifies how the encoding shall be specified. 
Data that does not do that is outside the scope of the specification.

If you wish to add _pragmatic_ notes to that, then you should say that 
in the absence of encoding information, browsers usually assume _some_ 
encoding.

Did you notice the news reports that there are now more 
Internet users in China than in the US? Would it make sense for a 
browser used in China to assume windows-1252?

> How about "The character encoding of the document was not explicit
> (assumed windows-1252) but the document contains non-ASCII."

Everything from the "(" onwards is gibberish to most authors and also 
fairly misleading. There's no "ASCII" or "non-ASCII" when the encoding 
has not been specified.
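
To illustrate the point, here is a minimal Python sketch. The sample 
text, the helper name decode_or_none and the list of candidate encodings 
are assumptions made up for the illustration, not anything the validator 
does. It decodes one and the same byte sequence under three different 
guesses:

```python
# Hypothetical sample: the bytes of a page whose encoding was never declared.
# Here they happen to be the GB2312 encoding of a two-character Chinese greeting.
data = "你好".encode("gb2312")


def decode_or_none(raw: bytes, encoding: str):
    """Decode raw with the given encoding; return None if the bytes are invalid in it."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# The same bytes, interpreted under three different assumed encodings.
guesses = {
    enc: decode_or_none(data, enc)
    for enc in ("gb2312", "windows-1252", "utf-8")
}

for enc, text in guesses.items():
    print(f"{enc:>12}: {text!r}")
```

Only the gb2312 guess recovers the original text. The windows-1252 guess 
yields four unrelated Latin characters without any error, and the utf-8 
guess fails outright, so what the bytes "contain" depends entirely on the 
encoding that is assumed.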

Jukka K. Korpela ("Yucca")
http://www.cs.tut.fi/~jkorpela/ 
Received on Friday, 25 April 2008 10:42:01 GMT
