Re: Non SGML char errors with validator HEAD

From: olivier Thereaux <ot@w3.org>
Date: Fri, 18 May 2007 09:58:33 +0900
Message-Id: <DF72B21D-9FF8-4C0E-8545-500452F5DD7F@w3.org>
Cc: Shane McCarron <shane@aptest.com>, Yoshio Fukushige <fukushige.yoshio@jp.panasonic.com>
To: QA-dev Dev <public-qa-dev@w3.org>

Hi Ville,

On Apr 28, 2007, at 21:05 , Ville Skyttä wrote:
> I'm running the HEAD validator locally (Fedora Core 6 x86_64, OpenSP 1.5.2
> from Fedora Core, SGML::Parser::OpenSP 0.99 locally built), and I'm getting
> non SGML char errors where neither validator-test nor qa-dev's HEAD setup
> shows them.
>
> Ideas where to look for the problem?  I saw
> http://www.w3.org/Bugs/Public/show_bug.cgi?id=3164 but I gather it's
> supposedly already fixed in SPO 0.99, and I'm not sure if it's even the same
> issue.

I think I found the solution.

At first, we fixed the SGML declaration of XML, basically telling the
validator "the characters 128-159 are OK in XML dialects". That was
good, but the issue remained for e.g. HTML <= 4.01. And besides, the
error was triggered for a number of Unicode characters that have
nothing to do with the 128-159 control characters. Denmark, rotten,
something.

Yoshio's message earlier today, forwarded by Shane, prompted me to
take a serious new look at our transcoding.

I eventually found that when fixing
http://www.w3.org/Bugs/Public/show_bug.cgi?id=4474
I completely broke our transcoding system, but (and that's what  
confused me for a while) only for some versions of the Encode library.
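
To illustrate how broken transcoding alone can produce "non SGML
character" errors for ordinary text, here is a minimal Python sketch.
It is not the validator's code (which is Perl and uses Encode). It
assumes one particular failure mode, UTF-8 bytes being read as
ISO-8859-1, which may or may not be what happened here. The helper
name c1_controls and the sample string are made up for the example.

```python
def c1_controls(text):
    """Return the characters of `text` whose code points fall in 128-159."""
    return [ch for ch in text if 128 <= ord(ch) <= 159]


# Hypothetical sample: ordinary Danish text, no control characters in it.
original = "Ærø"
utf8_bytes = original.encode("utf-8")

# Correct transcoding: the bytes are decoded with the encoding they are in.
good = utf8_bytes.decode("utf-8")

# Assumed failure mode: the same bytes are treated as ISO-8859-1, so each
# byte of a multi-byte UTF-8 sequence becomes a character of its own.
bad = utf8_bytes.decode("iso-8859-1")

print(c1_controls(good))                     # no characters in 128-159
print([ord(ch) for ch in c1_controls(bad)])  # code points that now are
```

UTF-8 continuation bytes lie in the range 0x80-0xBF, so any non-ASCII
character can end up contributing a code point in 128-159 once the
bytes are misread this way, even though the original text contained
none.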

The patch I applied a few minutes ago
http://lists.w3.org/Archives/Public/www-validator-cvs/2007May/0064.html
should hopefully clean up everything.

Thanks all for the reports/info, it helped.
Cheers,
-- 
olivier
Received on Friday, 18 May 2007 00:58:38 GMT
