
Re: How to correct utf8 "\x94"

From: Jukka K. Korpela <jkorpela@cs.tut.fi>
Date: Mon, 25 Feb 2013 09:45:12 +0200
Message-ID: <512B1688.9060807@cs.tut.fi>
To: Kolaylezzet yemek ve çay saatleri <kolaylezzet@gmail.com>
CC: "www-validator@w3.org" <www-validator@w3.org>
2013-02-23 20:06, Kolaylezzet yemek ve çay saatleri wrote:

> Following error message appears when I check my website
> http://www.kolaylezzet.com/
> The error was: utf8 "\x94" does not map to Unicode."

The error message is poorly worded, or even plain wrong. In 2008, there 
was a discussion on improving it, with no result:

Anyway, what the message tries to say is that when reading the document, 
with UTF-8 as the declared or implied character encoding, the validator 
encountered the byte 94 (hexadecimal) in a context where it must not 
appear in UTF-8 data. So this is a character-level data error (the bytes 
do not constitute characters), quite independent of any HTML markup issues.
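
To illustrate, here is a minimal Python sketch. It is not the validator's
actual code, and the function name first_invalid_utf8 is made up for this
example. In binary, hexadecimal 94 is 10010100, which has the form 10xxxxxx
of a UTF-8 continuation byte, so it may only appear after a suitable lead
byte. A strict UTF-8 decoder rejects it anywhere else:

```python
def first_invalid_utf8(data: bytes):
    """Return (offset, byte value) of the first byte that breaks
    strict UTF-8 decoding, or None if the data decodes cleanly."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first offending byte.
        return err.start, data[err.start]
    return None


# Bytes modelled on the reported line: 0x94 where a quotation mark belongs.
sample = b'rel=\x94external\x94 rel="nofollow"'
result = first_invalid_utf8(sample)
if result is not None:
    offset, value = result
    print(f"invalid byte 0x{value:02X} at offset {offset}")
```

Running a check like this on the raw bytes of a page (not on text that has
already been decoded) shows where the offending bytes are.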

When I tested the page, the error message reported line 677 as 
containing the error, and that line starts with

       <div class="nsb_container" align="center"><a id="l1" rel=�external� rel="nofollow"

where “�” indicates a character data error as described above. Apparently, 
these are supposed to be quotation marks, but “curly” quotation marks were 
probably used instead of "vertical" ASCII quotation marks. Presumably, the 
line was inserted with the “curly” quotation marks in windows-1252 
encoding, where byte 94 (hexadecimal) is the right “curly” double quotation 
mark. Windows-1252 encoded “curly” quotation marks, when inserted as raw 
data into UTF-8 data, make the data invalid.
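
The difference between the two encodings can be seen with a short Python
sketch (the helper name encoded_forms is made up for this example). It
encodes the right “curly” double quotation mark, U+201D, both ways:

```python
def encoded_forms(ch: str) -> dict:
    """Return the bytes for one character in windows-1252 and in UTF-8."""
    return {
        "windows-1252": ch.encode("cp1252"),  # Python's name for windows-1252
        "utf-8": ch.encode("utf-8"),
    }


forms = encoded_forms("\u201d")  # right "curly" double quotation mark
for name, raw in forms.items():
    print(name, raw.hex(" "))
```

In windows-1252 the character is the single byte 94, whereas UTF-8 needs
the three bytes E2 80 9D. The lone byte 94 is therefore not something a
UTF-8 decoder can accept at that position.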

So you need to find out where the line comes from and to replace the 
“curly” quotation marks by the ASCII quotation marks (").
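
If the source of the line cannot easily be fixed, one possible workaround
is sketched below in Python. It rests on an assumption about your data:
that every byte which is invalid as UTF-8 is really windows-1252. The
names repair_stray_cp1252 and "cp1252_quotes" are hypothetical, chosen
only for this example.

```python
import codecs

# Map "curly" quotation marks to their ASCII counterparts.
ASCII_QUOTES = str.maketrans({
    "\u201c": '"', "\u201d": '"',
    "\u2018": "'", "\u2019": "'",
})


def _cp1252_quotes(err):
    """Error handler: reinterpret bytes that are invalid as UTF-8 as
    windows-1252 (an assumption), then turn curly quotes into ASCII."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    text = bad.decode("cp1252", errors="replace").translate(ASCII_QUOTES)
    return text, err.end  # resume decoding after the offending bytes


codecs.register_error("cp1252_quotes", _cp1252_quotes)


def repair_stray_cp1252(data: bytes) -> str:
    """Decode as UTF-8, repairing only the bytes that are not valid UTF-8."""
    return data.decode("utf-8", errors="cp1252_quotes")


print(repair_stray_cp1252(b'rel=\x94external\x94 rel="nofollow"'))
```

Only the invalid bytes are touched, so correctly encoded characters
elsewhere in the page (including properly encoded “curly” quotation marks
in the text) are left as they are. Replacing every byte 94 blindly would
be unsafe, since the same byte value also occurs legitimately inside valid
multi-byte UTF-8 sequences.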

Received on Monday, 25 February 2013 07:45:42 UTC
