
Re: Character data messed up in 0.8.0 beta 1

From: olivier Thereaux <ot@w3.org>
Date: Thu, 19 Apr 2007 19:22:43 +0900
Message-Id: <22D92357-3D46-4877-A334-E99986E4A664@w3.org>
Cc: "www-validator@w3.org Community" <www-validator@w3.org>
To: Jukka K. Korpela <jkorpela@cs.tut.fi>

Hello Jukka,

On Apr 19, 2007, at 19:01 , Jukka K. Korpela wrote:
> I haven't found many differences yet (it's faster, but probably  
> just because of smaller load)

Actually, validator-test is hosted on the machine that takes most of
the load for validator.w3.org, so if it feels faster, that is because
of the module Björn Höhrmann built. Kudos to him for the performance
boost and the great code.

> , but this one is rather serious:
>
> When testing a page in ISO-8859-1 encoding, the echo of a source  
> line in an error message has the non-ASCII characters replaced by  
> malformed data, displayed by IE 7 as small rectangles, by Firefox 2  
> as U+FFFD (a white question mark in a black lozenge)
>
> Test page: http://www.cs.tut.fi/~jkorpela/test/val.html
>
> The reason is apparently that the beta version echoes the source  
> line "as is", even though the source is ISO-8859-1 encoded and the  
> validator's report page is UTF-8 encoded.
>
> This doesn't happen in the production version validator.w3.org,  
> which seems to convert the source to UTF-8 before echoing it.

This is indeed a bug: the transcoding is either not working or not
applied everywhere.
I'm adding a bug entry here:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=4474
and will track it there. I'll also try to report back here on the list
once the bug is fixed.
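
For anyone following along, here is a rough sketch of the symptom and of the transcoding step that seems to be missing. It is written in Python purely for illustration and is not the validator's actual code; the function names and the sample line are hypothetical. It assumes the report page is UTF-8 and the source bytes are ISO-8859-1, as in your test case.

```python
# Illustration only: hypothetical helpers, not the validator's real code.

def echo_as_is(raw: bytes) -> str:
    """Simulate a browser reading untranscoded source bytes on a UTF-8 page.

    Bytes that are not valid UTF-8 are shown as U+FFFD, which is roughly
    what Firefox does with malformed input.
    """
    return raw.decode("utf-8", errors="replace")


def echo_transcoded(raw: bytes, source_encoding: str) -> str:
    """Transcode the source bytes to UTF-8 first, then read them as UTF-8."""
    utf8_bytes = raw.decode(source_encoding).encode("utf-8")
    return utf8_bytes.decode("utf-8")


# A sample source line as it would be stored in an ISO-8859-1 document.
source_line = "<p>päivää</p>".encode("iso-8859-1")

broken = echo_as_is(source_line)                    # non-ASCII becomes U+FFFD
fixed = echo_transcoded(source_line, "iso-8859-1")  # non-ASCII survives
```

In this sketch the only difference between the two paths is whether the bytes are decoded with the document's own encoding before being written into the UTF-8 report, which matches the behaviour you describe for the production version.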

Thanks a lot!
-- 
olivier
Received on Thursday, 19 April 2007 10:22:53 GMT
