W3C validator service bug. from Yaros, Al on 2012-04-29 (www-validator@w3.org from April 2012)

From: Yaros, Al <al.yaros@hp.com>
Date: Sun, 29 Apr 2012 10:49:21 +0000
To: "www-validator@w3.org" <www-validator@w3.org>
CC: "Malki, Elad" <elad.malki@hp.com>
Message-ID: <5C1642ACCAFC834485A0A6AEC3F826984BF98137@G4W3210.americas.hpqcorp.net>

Hi,
I am using your service API (http://validator.w3.org/docs/api.html) with the following parameters:

* debug = 1
* output = soap12
* fragment = <my web page content>

No character encoding is specified, so it should be detected automatically.
However, for any web page content I submit for analysis, the validator always assumes UTF-8 when parsing, regardless of the encoding the page declares.
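
For illustration, a request with the three parameters listed above could be built roughly as in the following Python sketch. It only constructs the request and does not send it. The endpoint URL, the function name build_validation_request, and the ISO-8859-1 sample page are assumptions made for this example and are not taken from the original report; the sketch does not reproduce or explain the validator's behaviour.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint for the validator's "check" script; the report itself
# only links to the API documentation page.
VALIDATOR_URL = "http://validator.w3.org/check"


def build_validation_request(page_bytes: bytes) -> Request:
    """Build (but do not send) a POST request carrying the page as 'fragment'.

    The page is passed as raw bytes so that urlencode percent-encodes it
    byte by byte and the page's own encoding is left untouched.
    """
    params = {
        "debug": "1",            # parameter named in the report
        "output": "soap12",      # parameter named in the report
        "fragment": page_bytes,  # the page content to validate
    }
    body = urlencode(params).encode("ascii")
    return Request(
        VALIDATOR_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical sample page that declares and uses ISO-8859-1.
sample_page = (
    '<html><head><meta http-equiv="Content-Type" '
    'content="text/html; charset=ISO-8859-1"></head>'
    "<body><p>caf\u00e9</p></body></html>"
).encode("iso-8859-1")

request = build_validation_request(sample_page)
```

Passing request to urllib.request.urlopen would actually submit it; the SOAP response could then be inspected to see which encoding the validator reports.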

I have tested this on multiple web pages with different encodings, and all of them were parsed as UTF-8.
Note that when I copy and paste the same content into the W3C validator web site (Direct Input), it works and detects the correct encoding.

I saw that you already fixed this bug in direct input mode, but the fix has probably not been applied to the API flow. Bug: https://www.w3.org/Bugs/Public/show_bug.cgi?id=2690
Thanks,
Al.

Received on Sunday, 29 April 2012 10:51:14 UTC