Hi Henri,

Thanks for reporting this.

On Jan 2, 2008, at 21:20, Henri Sivonen wrote:
> http://validator.w3.org/check?uri=http%3A%2F%2Fphilip.html5.org%2Fmisc%2Fchars.html&charset=iso-8859-1&output=soap12

OK, that's an interesting case. If I understand correctly, this is how it was constructed:

* take some document claiming to be utf-8, but which isn't. A bit more metadata on the test case would help.
* force the validator to interpret that as iso-8859-1 (note that if left to its own devices, the validator will refuse to validate the document, as it can't decode it as utf-8)
* the forced transcoding creates something ugly, which is then displayed in the error source
* that's bad, especially since it trips up a number of parsers, which seem to think that the data stops there

Is that a proper assessment of what's happening? I am not an expert in Unicode and your report is a bit terse. :)

Ideally, please report this to Bugzilla with more details and information; that would be very helpful.

Thanks.
-- 
olivier

Received on Wednesday, 2 January 2008 13:05:21 UTC
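For reference, the scenario guessed at in the message above (bytes that are not valid utf-8, then a forced iso-8859-1 decode) can be reproduced in a few lines. This is only a minimal sketch: the byte string is made up for illustration and is not taken from the actual test case, `decode_as` is a hypothetical helper, and the assumption that the "something ugly" is a C1 control character is just one plausible reading, not something the report confirms.

```python
# Hypothetical input: NOT the bytes of the real test document.
# 0xE9 followed by a space is not a valid UTF-8 sequence, and 0x85
# is a lone continuation byte.
raw = b"caf\xe9 \x85 end"


def decode_as(data: bytes, charset: str):
    """Return (text, None) on success, or (None, error message) on failure."""
    try:
        return data.decode(charset), None
    except UnicodeDecodeError as exc:
        return None, str(exc)


# Left to its own devices, a strict decoder refuses the input as utf-8.
utf8_text, utf8_error = decode_as(raw, "utf-8")

# Forcing iso-8859-1 always "succeeds", because every byte value maps
# to some code point, including the C1 control range U+0080..U+009F.
forced_text, forced_error = decode_as(raw, "iso-8859-1")

# Collect any C1 control characters produced by the forced decode.
# Assumption: characters like these are what upsets downstream parsers.
control_chars = [hex(ord(c)) for c in forced_text if 0x80 <= ord(c) <= 0x9F]
```

In this sketch the forced decode never fails, so any check for unprintable or control characters would have to happen after decoding, before the text is echoed back in the error source.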