Re: Fwd: Content-Type and encoding

From: Lachlan Hunt <lachlan.hunt@iinet.net.au>
Date: Tue, 26 Oct 2004 00:33:26 +1000
Message-ID: <417D0EB6.6030904@iinet.net.au>
To: olivier Thereaux <ot@w3.org>
CC: www-validator community <www-validator@w3.org>, Alexander Willner <mail@alexanderwillner.de>

olivier Thereaux wrote:
> I created a XHTML 1.1 website and based on the HTTP_ACCEPT and 
> HTTP_ACCEPT_CHARSET information I send "Content-Type: 
> application/xhtml+xml; charset=utf-8" or I fall back to "Content-Type: 
> text/html; charset=iso-8859-1" (or any combination).

If you're going to use XHTML 1.1, it should not be sent as text/html 
[1].  If you wish to negotiate the content type like that, you should 
use XHTML 1.0 Strict.  Alternatively, you can send XHTML 1.1 as 
application/xhtml+xml, and use some additional processing (e.g. XSLT) to 
convert the file to HTML 4.01 on the fly, for sending as text/html.
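
As a rough illustration of negotiating only the media type, here is a
minimal Python sketch.  It assumes an XHTML 1.0 Strict document stored as
UTF-8; the function names parse_accept and choose_content_type are
hypothetical and not taken from any server or from the validator.  The
point is that only the media type varies with the Accept header, while
the charset parameter stays fixed.

```python
def parse_accept(header):
    """Return {media_type: q} for a raw Accept header value (simplified)."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        prefs[media_type] = q
    return prefs


def choose_content_type(accept_header):
    """Pick the media type from Accept; the charset never changes."""
    prefs = parse_accept(accept_header or "")
    # Assumption: only an explicit application/xhtml+xml entry with q > 0
    # selects XHTML; wildcards such as */* fall back to text/html.
    if prefs.get("application/xhtml+xml", 0.0) > 0.0:
        media_type = "application/xhtml+xml"
    else:
        media_type = "text/html"
    return f"{media_type}; charset=utf-8"
```

With this approach a client that sends no Accept header at all gets
"text/html; charset=utf-8", so the declared encoding still matches the
file.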

> Your www-validator do not send any informations like HTTP_ACCEPT, 
> HTTP_ACCEPT_CHARSET, ...
> Because of this I get an error about the wrong encoding since my 
> website falls back to charset=iso-8859-1...

Why does it change the charset parameter?  If the file is encoded as 
UTF-8, then it should be sent as UTF-8 no matter which MIME type is 
being used.  Without a URI, and being unable to see the exact error 
message, I can only guess at the cause.  It is likely that the file is 
still encoded as UTF-8, but the header incorrectly claims that it is 
ISO-8859-1.  In that case, the file may contain some octets in the range 
127 to 159 (0x7F to 0x9F), which are control characters in ISO-8859-1.  
To fix this, configure your server to always send a charset parameter 
that matches the actual character encoding of your files.
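
To show how such octets can arise, here is a small Python sketch.  It
assumes the guess above is right, i.e. the bytes really are UTF-8 but are
labelled ISO-8859-1; the function name misread_controls is made up for
this example.  It encodes some text as UTF-8, decodes it as ISO-8859-1,
and lists the code points that land in the 127 to 159 range.

```python
def misread_controls(text):
    """Code points in 0x7F-0x9F seen when UTF-8 bytes are read as ISO-8859-1."""
    raw = text.encode("utf-8")           # what is actually stored in the file
    misread = raw.decode("iso-8859-1")   # what the charset parameter claims
    return [ord(ch) for ch in misread if 0x7F <= ord(ch) <= 0x9F]


# An en dash (U+2013) is the three octets E2 80 93 in UTF-8; the last two
# fall in the control range when interpreted as ISO-8859-1.
print(misread_controls("\u2013"))
```

Plain ASCII text produces an empty list, which is why the mismatch can go
unnoticed until a character such as a dash or curly quote appears in the
page.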

[1] http://www.w3.org/TR/2002/NOTE-xhtml-media-types-20020801/#summary
-- 
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/    Rediscover the Web
http://SpreadFirefox.com/   Igniting the Web
Received on Monday, 25 October 2004 14:34:05 GMT
