Re: 12. Are C1 controls and Unicode non-characters disallowed?

On 9/8/2012 1:14 AM, James Clark wrote:
>
> I find the case for excluding non-characters pretty compelling. I 
> would state it like this:

Just for the sake of completeness, would you mind explaining what's 
compelling about it?  My initial reaction was: if we don't *need* to 
restrict the code-point set, why would we?  Is it a benefit in that tool 
chains will catch invalid characters further upstream than they might 
otherwise? I understand Unicode bans these code points, but if someone 
puts them in a file and then processes them as uXML, where's the harm?  
Is there some difficulty encoding these in UTF-8 or another Unicode encoding?
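
On that last question, here is a quick check one could run. It is only a sketch in Python: the helper names (is_noncharacter, round_trips) are my own and come from no uXML draft, and it assumes the usual definition of the 66 noncharacters (U+FDD0..U+FDEF plus the last two code points of each of the 17 planes).

```python
def is_noncharacter(cp: int) -> bool:
    """Return True if cp is one of the 66 Unicode noncharacters."""
    in_bmp_block = 0xFDD0 <= cp <= 0xFDEF            # 32 code points in the BMP
    plane_tail = (cp & 0xFFFE) == 0xFFFE and cp <= 0x10FFFF  # U+xxFFFE / U+xxFFFF
    return in_bmp_block or plane_tail


def round_trips(cp: int, encoding: str) -> bool:
    """Encode then decode a single code point and compare with the original."""
    ch = chr(cp)
    return ch.encode(encoding).decode(encoding) == ch


# Collect every noncharacter in the Unicode code space.
noncharacters = [cp for cp in range(0x110000) if is_noncharacter(cp)]

# Try each one in the three standard encoding forms.
all_ok = all(
    round_trips(cp, enc)
    for cp in noncharacters
    for enc in ("utf-8", "utf-16-le", "utf-32-le")
)

print(len(noncharacters), all_ok)      # 66 True
print(chr(0xFFFE).encode("utf-8"))     # b'\xef\xbf\xbe'
```

At least with Python's codecs, these code points encode and decode like any other scalar value, so I would guess the case for excluding them rests on something other than encoding difficulty.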

-Mike

Received on Saturday, 8 September 2012 14:56:19 UTC