Character data messed up in 0.8.0 beta 1

On Thu, 19 Apr 2007, olivier Thereaux wrote:

> It is with great pleasure and excitement that we are starting a Beta Test 
> period for the W3C Markup Validator:

I haven't found many differences yet (it's faster, but probably just 
because of smaller load), but this one is rather serious:

When testing a page in ISO-8859-1 encoding, the echo of a source line in 
an error message has the non-ASCII characters replaced by malformed data, 
displayed by IE 7 as small rectangles and by Firefox 2 as U+FFFD (a white
question mark in a black lozenge).

Test page:

The reason is apparently that the beta version echoes the source line "as 
is", even though the source is ISO-8859-1 encoded and the validator's 
report page is UTF-8 encoded.
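
The mismatch can be reproduced with a small Python sketch. This is only an
illustration, not the validator's code, and the sample markup and variable
names are made up. It takes ISO-8859-1 bytes and decodes them as UTF-8, which
is what a browser does when such bytes appear unconverted in a UTF-8 page:

```python
# Hypothetical source line, stored the way the test page is: ISO-8859-1.
source_line = "<p>café au lait</p>"
raw_bytes = source_line.encode("iso-8859-1")  # "é" becomes the single byte 0xE9

# Echoing the bytes "as is" into a UTF-8 page: the browser decodes them as
# UTF-8. 0xE9 followed by a space is not a valid UTF-8 sequence, so it is
# shown as U+FFFD (REPLACEMENT CHARACTER).
echoed = raw_bytes.decode("utf-8", errors="replace")

print(repr(echoed))
```

The non-ASCII character is lost at this point; the ASCII part of the line
survives because ASCII bytes mean the same thing in both encodings.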

This doesn't happen in the production version, which 
seems to convert the source to UTF-8 before echoing it.
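
A minimal sketch of that kind of conversion in Python, assuming the source
encoding is known to be ISO-8859-1 (the sample markup and names are
hypothetical, and the production validator may well do this differently):

```python
# Hypothetical source line as ISO-8859-1 bytes, as fetched from the test page.
raw_bytes = "<p>café au lait</p>".encode("iso-8859-1")

# Decode with the source's own encoding first, then re-encode as UTF-8
# before writing the line into the UTF-8 report page.
text = raw_bytes.decode("iso-8859-1")
utf8_bytes = text.encode("utf-8")  # "é" is now the two bytes 0xC3 0xA9

# A browser reading the report as UTF-8 now recovers the original character.
shown = utf8_bytes.decode("utf-8")
print(shown)
```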

Jukka "Yucca" Korpela,

Received on Thursday, 19 April 2007 10:01:25 UTC