Re: Ill-formed Validator response

Hi Henri,

Thanks for reporting this.

On Jan 2, 2008, at 21:20 , Henri Sivonen wrote:
> http://validator.w3.org/check?uri=http%3A%2F%2Fphilip.html5.org%2Fmisc%2Fchars.html&charset=iso-8859-1&output=soap12

OK, that's an interesting case. If I understand correctly, this is how
it was constructed:

* take some document claiming to be utf-8, but which isn't. A bit more
metadata on the test case would help.
* force the validator to interpret that as iso-8859-1
  (note that if left to its own devices, the validator will refuse to
validate the document as it can't decode it as utf-8)
* the forced transcoding creates something ugly, which is then  
displayed in the error source
* That's bad, especially since it trips up a number of parsers which  
seem to think that the data stops there
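
To make sure we are talking about the same thing, here is a rough
Python sketch of what I think is going on. The sample bytes, the
<source> wrapper element and the helper names are all made up for
illustration; this is not the validator's code, and I have not checked
which characters the real test case contains.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: not valid UTF-8 (0xE9 and 0x80 are stray bytes),
# and it also carries a raw control byte (0x01).
raw = b"caf\xe9 \x01 \x80"


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if the bytes are valid UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Forcing iso-8859-1 never fails: every byte maps to one code point,
# so the control byte survives as the character U+0001.
forced = raw.decode("iso-8859-1")


def xml_is_well_formed(text: str) -> bool:
    """Wrap the text in a made-up <source> element and try to parse it."""
    try:
        ET.fromstring(f"<source>{text}</source>")
        return True
    except ET.ParseError:
        return False


print(decodes_as_utf8(raw))        # the utf-8 decode is refused
print(xml_is_well_formed(forced))  # the forced text breaks an XML parser
```

If that matches your case, the culprit would be characters such as
U+0001, which XML 1.0 does not allow anywhere in a document, ending up
unescaped in the SOAP output.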

Is that a proper assessment of what's happening? I am not an expert in
Unicode and your report is a bit terse. :)

Ideally, please report this to Bugzilla with more details and
information; that would be very helpful.

Thanks.
-- 
olivier

Received on Wednesday, 2 January 2008 13:05:21 UTC