
Re: Error message for invalid UTF-8 overlong forms should be improved

From: olivier Thereaux <ot@w3.org>
Date: Thu, 29 May 2008 12:57:13 -0400
To: Jukka K.Korpela <jkorpela@cs.tut.fi>
Message-Id: <6E2BDFB9-787B-4D9A-94B6-759F7B4AFCEC@w3.org>
Cc: Thomas Rutter <tom@thomasrutter.com>, Frank Ellermann <hmdmhdfmhdjmzdtjmzdtzktdkztdjz@gmail.com>, W3C Validator Community <www-validator@w3.org>


On 29-May-08, at 2:14 AM, Jukka K. Korpela wrote:
> That's inconsistent indeed, and the more I think of it, the more
> misleading this "utf8 "\x..." does not map to Unicode" thing looks like.
> It is difficult to express concisely that data that has been declared or
> assumed to be utf-8 encoded violates the rules of utf-8 and cannot thus
> be interpreted as characters. But the current formulation is misleading
> and even plain wrong, at least in the first case.

This error message, like the decoding of the bytes as UTF-8 itself,
comes from the Perl Encode module.

http://search.cpan.org/dist/Encode/

The validator can of course work around such issues and rewrite messages
from the modules it uses, but if there are indeed issues or suggestions, it
may be worth reporting them upstream.

http://rt.cpan.org/Public/Dist/Display.html?Name=Encode
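
To make the "rewrite messages" option a bit more concrete, here is a
minimal sketch. It is in Python rather than Perl and is not the
validator's actual code; the function name describe_utf8_error and the
wording of the message are hypothetical. It only illustrates catching a
strict decoder's error and reporting the offending bytes in terms closer
to what Jukka describes.

```python
def describe_utf8_error(data: bytes) -> str:
    """Return a reader-oriented message if data is not valid UTF-8, else ''.

    Hypothetical helper for illustration only; the message text is an
    assumption, not something produced by Encode or the validator.
    """
    try:
        # Strict decoding rejects overlong forms such as C0 AF.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        # exc.start and exc.end delimit the bytes the decoder rejected.
        bad = data[exc.start:exc.end]
        hex_bytes = " ".join(f"0x{b:02X}" for b in bad)
        return (
            f"Byte sequence {hex_bytes} at offset {exc.start} violates the "
            "rules of UTF-8 and cannot be interpreted as a character."
        )
    return ""


if __name__ == "__main__":
    # C0 AF is an overlong two-byte encoding of "/" and is invalid UTF-8.
    print(describe_utf8_error(b"ab\xC0\xAF"))
```

The point of the sketch is only that the wrapper, not the decoding
library, chooses the wording shown to the user.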

Thanks,
olivier
-- 
olivier Thereaux - W3C - http://www.w3.org/People/olivier
W3C Open Source Software : http://www.w3.org/Status
Received on Thursday, 29 May 2008 16:57:48 GMT
