- From: olivier Thereaux <ot@w3.org>
- Date: Wed, 2 Jan 2008 22:05:17 +0900
- To: Henri Sivonen <hsivonen@iki.fi>
- Cc: W3C Validator Community <www-validator@w3.org>
Hi Henri,

Thanks for reporting this.

On Jan 2, 2008, at 21:20, Henri Sivonen wrote:
> http://validator.w3.org/check?uri=http%3A%2F%2Fphilip.html5.org%2Fmisc%2Fchars.html&charset=iso-8859-1&output=soap12

OK, that's an interesting case. If I understand correctly, this is how it was constructed:

* take some document claiming to be utf-8, but which isn't. A bit more metadata on the test case would help.
* force the validator to interpret that as iso-8859-1 (note that if left to its own devices, the validator will refuse to validate the document, as it can't decode it as utf-8)
* the forced transcoding creates something ugly, which is then displayed in the error source
* that's bad, especially since it trips up a number of parsers, which seem to think that the data stops there

Is that a proper assessment of what's happening? I am not an expert in Unicode and your report is a bit terse. :)

Ideally, please report this to Bugzilla with more details and information; that would be very helpful.

Thanks.
-- 
olivier
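
To make the scenario above concrete, here is a minimal sketch in Python of the two decoding paths. It is only an illustration under stated assumptions: the sample bytes and the function names are hypothetical, not taken from the actual test case or from the validator's code.

```python
from typing import Optional

# Hypothetical sample: bytes served as "utf-8" that are not valid UTF-8.
# 0xE9 starts a multi-byte UTF-8 sequence but is followed by a space.
raw = b"caf\xe9 \x80\x9f"


def decode_as_declared(data: bytes) -> Optional[str]:
    """Strict decode using the declared charset; None means 'refuse'."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


def decode_forced(data: bytes) -> str:
    """Forced override: ISO-8859-1 maps every byte to a code point,
    so this never fails, whatever the bytes really were."""
    return data.decode("iso-8859-1")


if __name__ == "__main__":
    print(decode_as_declared(raw))   # strict path gives up
    print(repr(decode_forced(raw)))  # forced path yields control characters
```

The forced path always succeeds, which is why the override gets past the utf-8 check; the cost is that bytes such as 0x80 come out as control characters that then end up in the error source.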
Received on Wednesday, 2 January 2008 13:05:21 UTC