Re: [CSS21][css3-namespace][css3-page][css3-selectors][css3-content] Unicode Normalization from Boris Zbarsky on 2009-02-02 (public-i18n-core@w3.org from January to March 2009)

From: Boris Zbarsky <bzbarsky@MIT.EDU>
Date: Mon, 02 Feb 2009 09:44:31 -0500
To: Anne van Kesteren <annevk@opera.com>
CC: public-i18n-core@w3.org, www-style@w3.org
Message-ID: <498706CF.9010805@mit.edu>

Anne van Kesteren wrote:
> HTML & XML character references and CSS character escapes would also 
> allow for the same character to be represented in different forms as far 
> as I can tell. (Though not quite as extreme as in JS, still basically 
> the same "issue".)

Yeah, true. We have existing code in Gecko to detect situations where 
such escapes create high or low UTF-16 surrogates and we disallow that. 
Would it be reasonable to also disallow insertion of combining 
characters via such escapes?  Or to just not worry about the problem?
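
To make the question concrete, here is a rough sketch in Python of what 
such a check could look like. It is not the Gecko code. The function 
names, the reject_combining flag, the choice to substitute U+FFFD for a 
disallowed escape, and the reading of "combining character" as "nonzero 
canonical combining class" are all assumptions made for illustration.

```python
import re
import unicodedata

# A CSS 2.1 hex escape: backslash, 1-6 hex digits, then one optional
# whitespace character (or CRLF pair) that terminates the escape.
_CSS_HEX_ESCAPE = re.compile(r"\\([0-9a-fA-F]{1,6})(?:\r\n|[ \t\r\n\f])?")


def is_surrogate(cp):
    """True for code points in the UTF-16 surrogate range."""
    return 0xD800 <= cp <= 0xDFFF


def is_combining(cp):
    """Assumption: 'combining' means a nonzero canonical combining class."""
    return unicodedata.combining(chr(cp)) != 0


def decode_css_escapes(text, reject_combining=False):
    """Decode hex escapes, replacing disallowed ones with U+FFFD.

    Hypothetical helper: substituting U+FFFD is one possible way to
    'disallow' an escape, chosen here only to keep the sketch short.
    """
    def replace(match):
        cp = int(match.group(1), 16)
        # Zero, out-of-range values and surrogates are always rejected.
        if cp == 0 or cp > 0x10FFFF or is_surrogate(cp):
            return "\ufffd"
        # Optionally reject combining marks introduced via an escape.
        if reject_combining and is_combining(cp):
            return "\ufffd"
        return chr(cp)

    return _CSS_HEX_ESCAPE.sub(replace, text)


composed = decode_css_escapes(r"\e9")        # precomposed e-acute
decomposed = decode_css_escapes(r"e\301 ")   # 'e' + combining acute
same_after_nfc = (
    unicodedata.normalize("NFC", composed)
    == unicodedata.normalize("NFC", decomposed)
)
```

With the flag off, the two escaped spellings decode to different code 
point sequences that only compare equal after normalization, which is 
the "issue" in question. With reject_combining=True the second spelling 
is refused instead, at the cost of also refusing legitimate uses of 
escaped combining marks.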

> (And then there's the bits Martin pointed out.)

What bits are those?  I couldn't make sense of what he was saying, to be 
honest.  I couldn't even understand whether he thought normalization 
should or should not be done...

-Boris

Received on Monday, 2 February 2009 14:45:37 UTC