Re: Meta Character Encoding Not Detected from olivier Thereaux on 2007-11-21 (www-validator@w3.org from November 2007)

From: olivier Thereaux <ot@w3.org>
Date: Wed, 21 Nov 2007 10:56:20 +0900
To: Email Reply <email_reply0234@mercysoftware.com>
Cc: www-validator@w3.org
Message-Id: <E27E046E-B0BB-4E2D-BAF2-2734BD5BC1A9@w3.org>

Dear "Email Reply",

On 21 nov. 07, at 04:35, Email Reply wrote:

> If I validate by Direct Input and paste the following code into the  
> validator:
>
> If I set encoding to "detect automatically" then the validator  
> selects utf-8 as the encoding despite the fact that I have set a  
> meta tag declaring it as iso-8859-1.

This is specific to "direct input". When the validator fetches a
document online or receives it by file upload, it has to determine
which encoding the file, or HTTP resource, uses. In direct input,
however, what gets sent to the validator is not a file but a string
of characters in the same encoding as the validator's interface,
that is, utf-8.

Even if the document you will eventually publish is not utf-8, the act
of copy-pasting it into the validator's text area makes it utf-8.

As a result, the meta charset declaration in the markup is ignored in
direct input mode.
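
To make the point above concrete, here is a minimal Python sketch. It is
not the validator's code; the sample markup and the helper name
decodes_as_utf8 are made up for illustration. It only shows that the same
text yields different bytes depending on whether it is saved as an
iso-8859-1 file or submitted through a utf-8 form, whatever the meta tag
says.

```python
# Illustrative only: sample markup and helper name are assumptions,
# not taken from the validator's source.
source_text = (
    '<meta http-equiv="Content-Type" '
    'content="text/html; charset=iso-8859-1">\n'
    "<p>caf\u00e9</p>"
)

# Bytes as they would sit in a file saved as iso-8859-1.
file_bytes = source_text.encode("iso-8859-1")

# Bytes as a text area on a utf-8 page would submit them.
form_bytes = source_text.encode("utf-8")


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if data is a valid utf-8 byte sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(decodes_as_utf8(file_bytes))  # False: a lone 0xE9 is not valid utf-8
print(decodes_as_utf8(form_bytes))  # True: the pasted text arrives as utf-8
```

Decoding form_bytes as iso-8859-1 because of the meta tag would garble
the accented character, which is why trusting the meta declaration would
be wrong in this mode.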

> If I set the encoding to iso-8859-1, then the validator issues a  
> warning that I'm overriding the detected character encoding of utf-8.

I am puzzled by this. For the reasons explained above, the "direct
input" interface does not have any character encoding override
mechanism. Where did you see that? Which validator are you using?

-- 
olivier

Received on Wednesday, 21 November 2007 01:56:30 UTC