- From: Michael Adams <linux_mike@paradise.net.nz>
- Date: Sat, 15 Nov 2008 22:42:34 +1300
- To: www-validator@w3.org
On Fri, 14 Nov 2008 17:28:09 +0200
Came this utterance formulated by Jukka K. Korpela to my mailbox:

> Michael Adams wrote:
>
> [ discussing the error message ...]
> >>> The error was: utf8 "\x80" does not map to Unicode
>
> > \x80 is illegal as a first byte in unicode.
>
> First of all, this relates to UTF-8 encoding only.
>
> Second, you're right in the sense that byte 80 is not allowed as the
> first byte of the encoding of character in UTF-8. I was confused when
> I wrote that it must be _followed_ by a byte pattern of a specific
> kind; instead, it must appear _within_ a byte combination of a certain
> kind.
>
> Anyway, the error message is wrong. The byte 80 occurring in UTF-8
> data stream surely "maps to Unicode" as part of byte patterns. A
> correct error message would be "The error was: Byte 80 (hexadecimal)
> found in purported UTF-8 data in a context where it is not allowed."
> This is fairly generic of course, but I suppose the error message
> pattern is generic as well, so we cannot assume that it's about
> occurrences as first bytes only.
>

I like it, other than the word 'purported', which is not an easy word
for those with little English. How about:

"The error is: Byte 80 (hexadecimal) is not allowed here in UTF-8 data."

-- 
Michael

All shall be well, and all shall be well, and all manner of things shall
be well

 - Julian of Norwich 1342 - 1416
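
For illustration only, here is a minimal Python sketch of the point under discussion: byte 80 (hexadecimal) is rejected as the first byte of a character but is accepted inside a multi-byte sequence such as C2 80. The function names first_invalid_byte and describe_error are hypothetical and are not part of the validator; the message text simply reuses the wording proposed above, with a byte offset added as an assumption about what a reader might find useful.

```python
from typing import Optional, Tuple


def first_invalid_byte(data: bytes) -> Optional[Tuple[int, int]]:
    """Return (offset, byte value) where UTF-8 decoding first fails, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first byte the decoder rejected.
        return err.start, data[err.start]
    return None


def describe_error(data: bytes) -> Optional[str]:
    """Format a message in the style proposed in this thread (hypothetical helper)."""
    bad = first_invalid_byte(data)
    if bad is None:
        return None
    offset, value = bad
    return (
        f"Byte {value:02X} (hexadecimal) is not allowed here "
        f"in UTF-8 data (offset {offset})."
    )


# A lone 0x80 cannot start a character ...
print(describe_error(b"\x80"))
# ... but the same byte is fine as the second byte of U+0080 (C2 80).
print(describe_error(b"\xc2\x80"))
```

The first call prints the proposed message with offset 0; the second prints None, because C2 80 is a well-formed two-byte sequence.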
Received on Saturday, 15 November 2008 09:39:57 UTC