Re: utf-8 validation help

From: Frank Ellermann <nobody@xyzzy.claranet.de>
Date: Thu, 31 Aug 2006 17:37:35 +0200
To: www-validator@w3.org
Message-ID: <44F7023F.6AD8@xyzzy.claranet.de>

Pete Forman wrote:
 
> RFC 2616 is trumped by the HTML spec which states that an
> absent HTTP Content-Type header may not be construed as
> ISO-8859-1.

Yes, for HTML 4 an explicit declaration is required.

> I'd suggest that a validating UA might use US-ASCII as
> its default encoding and raise errors for out of range
> characters.

For the validator as is that would cause more error messages.

For other UAs it might be a bad plan if authors read their pages
offline as file:///what/ever/funny.html - it would force them
to always use meta for their validating UA.
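
To illustrate what the suggested US-ASCII default would mean in
practice, here is a minimal Python sketch.  It is not how the
validator works; the function name non_ascii_offsets is made up
for this example.  It reports every byte above 0x7F as an
"out of range" error.

```python
def non_ascii_offsets(data: bytes) -> list[tuple[int, int]]:
    """Return (offset, byte value) for every byte outside US-ASCII.

    Hypothetical helper: a UA defaulting to US-ASCII would flag
    each of these positions as an error.
    """
    return [(i, b) for i, b in enumerate(data) if b > 0x7F]


# A page containing one non-ASCII character, saved as UTF-8
# without any declaration.
sample = "<p>caf\u00e9</p>".encode("utf-8")
errors = non_ascii_offsets(sample)
for offset, value in errors:
    print(f"byte 0x{value:02X} at offset {offset} is outside US-ASCII")
```

A single e-acute in UTF-8 already yields two flagged bytes, so
undeclared non-ASCII pages would collect errors quickly.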

> Of course there should still be a warning if neither the web
> server nor document specify an encoding.

For intentionally simple HTML 2 404-documents it's AFAIK no
error.  For HTML 4 it's not only a "warning", it's invalid.
For HTML 3.2 I can't tell.  The validator always wants an
explicit declaration, good enough for all relevant purposes.
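
A rough Python sketch of "is there an explicit declaration",
assuming only the two sources mentioned in this thread: the
charset parameter of the HTTP Content-Type header, then a meta
http-equiv element in the document.  The name declared_charset,
the regular expressions, and the 4096 byte limit are assumptions
for this example, not the validator's actual code.

```python
import re
from typing import Optional

# Simplified patterns, assumed for illustration only.
_CHARSET_PARAM = re.compile(r'charset\s*=\s*["\']?([A-Za-z0-9._:-]+)', re.IGNORECASE)
_META_TAG = re.compile(rb'<meta\b[^>]*>', re.IGNORECASE)
_HTTP_EQUIV_CT = re.compile(r'http-equiv\s*=\s*["\']?content-type', re.IGNORECASE)


def declared_charset(content_type: Optional[str], body: bytes) -> Optional[str]:
    """Return the explicitly declared charset, or None if there is none."""
    # 1. The HTTP header wins if it carries a charset parameter.
    if content_type:
        match = _CHARSET_PARAM.search(content_type)
        if match:
            return match.group(1).lower()
    # 2. Otherwise look for <meta http-equiv="Content-Type" ...>
    #    near the start of the document (arbitrary 4096 byte limit).
    for tag in _META_TAG.finditer(body[:4096]):
        text = tag.group(0).decode("ascii", errors="replace")
        if _HTTP_EQUIV_CT.search(text):
            match = _CHARSET_PARAM.search(text)
            if match:
                return match.group(1).lower()
    # 3. Nothing explicit: the caller decides whether to warn or reject.
    return None
```

A page read as file:///what/ever/funny.html has no HTTP header,
so content_type would be None and only the meta element counts.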

Frank
Received on Thursday, 31 August 2006 15:41:46 GMT
