
Re: Character encoding trouble

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Mon, 10 May 2004 23:19:04 +0200
To: Håvard Skjæveland <pho@dataportalen.com>
Cc: <www-validator@w3.org>
Message-ID: <40a3f0c7.621673218@smtp.bjoern.hoehrmann.de>

* Håvard Skjæveland wrote:
>I am unable to validate my documents because there's no character
>encoding set found. This also worked before the make-over. I'm
>sending the character set with the AddCharset directive in .htaccess
>
>Strangely, it works when going ctrl + v in Opera, but not if
>validating manually or by using the link at the bottom of any of
>my pages: http://www.dataportalen.com/pho/

The Validator actually complains:

  Sorry, I am unable to validate this document because on lines 14, 32
  it contained one or more bytes that I cannot interpret as utf-8 (in
  other words, the bytes found are not valid values in the specified
  Character Encoding). Please check both the content of the file and the
  character encoding indication. 
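
A rough way to reproduce that kind of check locally is to try decoding
each line as UTF-8 and note which ones fail. This is only a sketch, not
the Validator's actual code; the function name invalid_utf8_lines and
the sample data are made up for illustration.

```python
def invalid_utf8_lines(data: bytes) -> list[int]:
    """Return the 1-based numbers of lines that are not valid UTF-8."""
    bad = []
    # Splitting on the newline byte is safe: 0x0A never occurs inside
    # a valid UTF-8 multi-byte sequence.
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(number)
    return bad


# Hypothetical sample: ISO-8859-1 bytes that a consumer treats as UTF-8.
sample = "Håvard\nplain ASCII\nSkjæveland\n".encode("iso-8859-1")
print(invalid_utf8_lines(sample))  # prints [1, 3]
```

Lines containing only ASCII pass either way, which is why only the
lines with non-ASCII characters get reported.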

Let's have a look:

  % netc www.dataportalen.com 80
  HEAD /pho/ HTTP/1.1
  Connection: close
  Host: www.dataportalen.com
  User-Agent: W3C_Validator/1.305.2.109 libwww-perl/5.79
  
  HTTP/1.1 200 OK
  Date: Mon, 10 May 2004 21:05:43 GMT
  Server: Apache/1.3.28 (Unix) ...
  Vary: User-Agent
  Connection: close
  Content-Type: application/xhtml+xml
  Content-Language: en-us

There is no charset=... parameter in the Content-Type header. You send
different resources with different metadata depending on the User-Agent
header; some user agents would indeed get

  Content-Type: text/html; charset=iso-8859-1

but the Validator obviously does not. The resource is ISO-8859-1 encoded
and lacks both an XML declaration with an encoding declaration and a
charset parameter in the HTTP Content-Type header, hence it will be
interpreted as UTF-8, which does not work as the resource is not UTF-8
encoded. It's a misconfiguration on your side.
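
The fallback described above can be sketched roughly as follows. This is
a simplified illustration under stated assumptions, not the Validator's
implementation: it ignores byte order marks, UTF-16 and meta elements,
and the names effective_encoding and XML_DECL are made up.

```python
import re
from email.message import Message

# Matches an encoding declaration in an XML declaration at the very
# start of the document (simplified; assumes an ASCII-compatible encoding).
XML_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def effective_encoding(content_type: str, body: bytes) -> str:
    """Pick the encoding an XML consumer would assume for this response."""
    # 1. A charset parameter in the Content-Type header wins.
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset()
    if charset:
        return charset
    # 2. Otherwise look for an encoding declaration in the XML declaration.
    match = XML_DECL.match(body)
    if match:
        return match.group(1).decode("ascii").lower()
    # 3. With neither present, XML defaults to UTF-8.
    return "utf-8"


# Hypothetical body standing in for the ISO-8859-1 encoded page.
body = "<html><p>Håvard</p></html>".encode("iso-8859-1")
print(effective_encoding("application/xhtml+xml", body))
print(effective_encoding("text/html; charset=iso-8859-1", body))
```

With the headers the Validator received, step 3 applies, so the
ISO-8859-1 bytes get decoded as UTF-8 and fail; with the header other
user agents receive, step 1 applies and the bytes decode fine.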
Received on Monday, 10 May 2004 17:19:20 GMT
