Re: How to correct tf8 "\x94"

From: Jukka K. Korpela <jkorpela@cs.tut.fi>
Date: Mon, 25 Feb 2013 09:45:12 +0200
Message-ID: <512B1688.9060807@cs.tut.fi>
To: Kolaylezzet yemek ve çay saatleri <kolaylezzet@gmail.com>
CC: "www-validator@w3.org" <www-validator@w3.org>
2013-02-23 20:06, Kolaylezzet yemek ve çay saatleri wrote:

> Following error message appears when I check my website
> http://www.kolaylezzet.com/
> The error was: utf8 "\x94" does not map to Unicode."

The error message is poorly worded, or even plain wrong. In 2008, there was a discussion on improving it, but nothing came of it.

Anyway, what the message tries to say is that when reading the document, with UTF-8 as the declared or implied character encoding, the validator encountered the byte 94 (hexadecimal) in a context where it must not appear in UTF-8 data. In UTF-8, a byte in the range 80 to BF (hexadecimal) may appear only as a continuation byte inside a multi-byte sequence, never on its own. So this is a character-level data error (the bytes do not constitute characters), quite independent of any HTML markup issues.
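To see where such bytes sit in a file, something like the following Python sketch may help. It is not how the validator works internally; the function name find_invalid_utf8 and the sample bytes are made up for illustration. It simply tries to decode the data as UTF-8 and records the line number and offset of each byte that the decoder rejects.

```python
def find_invalid_utf8(data: bytes):
    """Return (line_number, byte_offset, hex_byte) for each byte that
    cannot be decoded as UTF-8. Hypothetical helper for illustration."""
    problems = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the rest of the data is valid UTF-8
        except UnicodeDecodeError as err:
            bad = pos + err.start                 # offset in the whole data
            line = data.count(b"\n", 0, bad) + 1  # 1-based line number
            problems.append((line, bad, f"0x{data[bad]:02X}"))
            pos = bad + 1                         # resume after the bad byte
    return problems


# Made-up sample: windows-1252 curly quotes inside otherwise valid UTF-8.
sample = b'<a rel=\x93external\x94>\n'
for line, offset, byte in find_invalid_utf8(sample):
    print(f"line {line}, offset {offset}: byte {byte}")
```

For a real page, the bytes would be read with open(path, "rb").read() so that nothing is decoded before the check.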

When I tested the page, the error message reported line 677 as containing the error, and that line starts with

       <div class="nsb_container" align="center"><a id="l1" rel=�external� rel="nofollow"

where each “�” indicates a character data error as described above. Apparently, they are supposed to be quotation marks, but probably “curly” quotation marks were used instead of "vertical" ASCII quotation marks. And presumably, the line was inserted so that the “curly” quotation marks were in windows-1252 encoding, where they are the bytes 93 and 94 (hexadecimal). Windows-1252-encoded “curly” quotation marks, when inserted as raw data into UTF-8 data, make the data invalid.
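The difference between the two encodings can be checked directly in Python. This is only a small illustration; the helper name is_valid_utf8 is invented here.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


curly = "\u201c\u201d"  # left and right "curly" double quotation marks

cp1252_bytes = curly.encode("cp1252")  # b'\x93\x94'
utf8_bytes = curly.encode("utf-8")     # b'\xe2\x80\x9c\xe2\x80\x9d'

print(cp1252_bytes, is_valid_utf8(cp1252_bytes))
print(utf8_bytes, is_valid_utf8(utf8_bytes))
```

The same two characters are thus two single bytes in windows-1252 but two three-byte sequences in UTF-8, and the single bytes are not acceptable on their own in UTF-8 data.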

So you need to find out where the line comes from and to replace the “curly” quotation marks by the ASCII quotation marks (").
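If the source of the line cannot be fixed by hand, the replacement could in principle be scripted. The following Python sketch is one possible approach, not a recommended tool: the handler name cp1252_quotes_to_ascii and the function repair are invented for this example. It decodes the data as UTF-8 and replaces a stray byte 93 or 94 by an ASCII quotation mark only where the decoder rejects it, so that valid multi-byte characters that happen to contain those byte values are left alone. Any other invalid byte still raises an error.

```python
import codecs

# Stray windows-1252 curly double quotes and their ASCII replacement.
STRAY_QUOTES = {0x93: '"', 0x94: '"'}


def _quotes_handler(err):
    """Decode-error handler: map a stray 0x93/0x94 to an ASCII quote."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start]
    if bad not in STRAY_QUOTES:
        raise err  # some other invalid byte: do not guess
    return STRAY_QUOTES[bad], err.start + 1


# The handler name is an arbitrary label chosen for this sketch.
codecs.register_error("cp1252_quotes_to_ascii", _quotes_handler)


def repair(data: bytes) -> str:
    """Decode UTF-8 data, turning stray curly-quote bytes into ASCII quotes."""
    return data.decode("utf-8", errors="cp1252_quotes_to_ascii")


# Made-up sample resembling the reported line.
print(repair(b'<a id="l1" rel=\x93external\x94 rel="nofollow"'))
```

The result is a text string; writing it back with encoding="utf-8" would give valid UTF-8 data. A plain byte-level search and replace of 93 and 94 would be riskier, since those values also occur legitimately as continuation bytes.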

Received on Monday, 25 February 2013 07:45:42 GMT
