Re: Error message for invalid UTF-8 overlong forms should be improved

On 29-May-08, at 2:14 AM, Jukka K. Korpela wrote:
> That's inconsistent indeed, and the more I think of it, the more
> misleading this "utf8 "\x..." does not map to Unicode" thing looks like.
> It is difficult to express concisely that data that has been declared
> or assumed to be utf-8 encoded violates the rules of utf-8 and cannot
> thus be interpreted as characters. But the current formulation is
> misleading and even plain wrong, at least in the first case.

This error message, like the decoding of the bytes as UTF-8 itself,
comes from the Encode Perl module.

http://search.cpan.org/dist/Encode/

The validator can of course work around such issues and rewrite messages
from the modules it uses, but if there are indeed issues or suggestions,
it may be worth reporting them upstream.

http://rt.cpan.org/Public/Dist/Display.html?Name=Encode
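
As for the workaround option, here is a rough sketch of what "rewriting
the message" could look like. It is in Python rather than Perl, it is
not the validator's actual code, and the function name and the wording
of the replacement message are made up for illustration. The idea is
just to catch the decoder's own error and say instead that the bytes
violate the rules of UTF-8:

```python
def decode_utf8_or_explain(data: bytes) -> str:
    """Decode bytes as UTF-8, or raise ValueError with a reworded message.

    Hypothetical helper, for illustration only.
    """
    try:
        # Python's strict UTF-8 decoder rejects ill-formed input,
        # including overlong forms such as C0 AF.
        return data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start and err.end delimit the offending byte(s) in the input.
        bad = " ".join(f"{b:02x}" for b in data[err.start:err.end])
        raise ValueError(
            f"bytes {bad} at offset {err.start} are not valid UTF-8 "
            "and cannot be interpreted as characters"
        ) from err
```

Calling it on well-formed input returns the decoded string; on input
such as b"abc\xc0\xaf" it raises a ValueError that names the offset
instead of claiming the bytes do "not map to Unicode".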

Thanks,
olivier
-- 
olivier Thereaux - W3C - http://www.w3.org/People/olivier
W3C Open Source Software : http://www.w3.org/Status

Received on Thursday, 29 May 2008 16:57:48 UTC