Re: [CSS21][css3-namespace][css3-page][css3-selectors][css3-content] Unicode Normalization from Anne van Kesteren on 2009-02-02 (www-style@w3.org from February 2009)

From: Anne van Kesteren <annevk@opera.com>
Date: Mon, 02 Feb 2009 16:21:29 +0100
To: "Boris Zbarsky" <bzbarsky@mit.edu>
Cc: public-i18n-core@w3.org, www-style@w3.org
Message-ID: <op.uoqcx3wn64w2qv@annevk-t60.oslo.opera.com>

On Mon, 02 Feb 2009 15:44:31 +0100, Boris Zbarsky <bzbarsky@mit.edu> wrote:
> Anne van Kesteren wrote:
>> HTML & XML character references and CSS character escapes would also  
>> allow for the same character to be represented in different forms as  
>> far as I can tell. (Though not quite as extreme as in JS, still  
>> basically the same "issue".)
>
> Yeah, true. We have existing code in Gecko to detect situations where  
> such escapes create high or low UTF-16 surrogates and we disallow that.  
> Would it be reasonable to also disallow insertion of combining  
> characters via such escapes?  Or to just not worry about the problem?

My suggestion would be not to worry about it at that level.
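
For illustration only, here is a minimal Python sketch of the kind of check
being discussed. It is not Gecko's code. The names classify_escape and
unescape are hypothetical, and replacing a disallowed escape with U+FFFD is
an assumption made for this example, not something stated in this thread.
It shows that a hex escape can introduce a combining character, and that
the result only matches the precomposed form after NFC normalization.

```python
import re
import unicodedata

# CSS 2.1 style hex escape: backslash, 1-6 hex digits, then one optional
# whitespace character (CRLF counts as a single whitespace).
_ESCAPE = re.compile(r"\\([0-9a-fA-F]{1,6})(?:\r\n|[ \t\r\n\f])?")


def classify_escape(code_point):
    """Classify an escaped code point (hypothetical helper)."""
    if 0xD800 <= code_point <= 0xDFFF:
        return "surrogate"
    if code_point > 0x10FFFF:
        return "out-of-range"
    # unicodedata.combining() returns a non-zero canonical combining class
    # for combining marks such as U+0301.
    if unicodedata.combining(chr(code_point)) != 0:
        return "combining"
    return "ok"


def unescape(text):
    """Expand hex escapes; surrogates and out-of-range values become U+FFFD
    (an assumption for this sketch)."""
    def repl(match):
        code_point = int(match.group(1), 16)
        if classify_escape(code_point) in ("surrogate", "out-of-range"):
            return "\ufffd"
        return chr(code_point)
    return _ESCAPE.sub(repl, text)


# 'e' followed by an escaped U+0301 COMBINING ACUTE ACCENT
decomposed = unescape(r"cafe\301")
# escaped U+00E9 LATIN SMALL LETTER E WITH ACUTE
precomposed = unescape(r"caf\e9")

same_before = decomposed == precomposed
same_after = unicodedata.normalize("NFC", decomposed) == precomposed
```

Rejecting the "combining" case at the escape level would be a policy choice
layered on top of this; the sketch only reports it.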

>> (And then there's the bits Martin pointed out.)
>
> What bits are those?  I couldn't make sense of what he was saying, to be  
> honest.  I couldn't even understand whether he thought normalization  
> should or should not be done...

That rule C6 is slightly different from how it was explained, and that  
it's not clear to him either that this is an actual problem developers are  
facing. Also, how this might affect font matching is an interesting  
interoperability question, though I suppose this issue would affect fonts  
regardless.

I understood from his various e-mails that he's sceptical that it should  
be done. (Like me.)

-- 
Anne van Kesteren
<http://annevankesteren.nl/>
<http://www.opera.com/>

Received on Monday, 2 February 2009 15:22:13 UTC