RE: Re: [CSS21][css3-namespace][css3-page][css3-selectors][css3-content] Unicode Normalization from Phillips, Addison on 2009-02-02 (www-style@w3.org from February 2009)

From: Phillips, Addison <addison@amazon.com>
Date: Mon, 2 Feb 2009 11:00:37 -0800
To: "L. David Baron" <dbaron@dbaron.org>
CC: Boris Zbarsky <bzbarsky@MIT.EDU>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>, "www-style@w3.org" <www-style@w3.org>
Message-ID: <4D25F22093241741BC1D0EEBC2DBB1DA017DA5F573@EX-SEA5-D.ant.amazon.com>

> > However, there are two problems with this observation.
> >
> > First, any two strings that are equal are, well, equal.
> > Normalizing them both won't change that. So an obvious performance
> > boost is to call strcmp() first.
> 
> Most string comparisons fail, so failing quickly is significantly
> more important than succeeding quickly.
> 

(laughs) I know.

If you make selectors require normalization form NFC, you can normalize the elements when atomizing them in the first place. For certain things, you might need to store both normalized and "original" representations.

If selectors require NFC internally, you can safely normalize during the initial processing. That's part of the point of recommending NFC in the first place. You might still store the original code point sequence for rendering purposes (although that should only be needed for non-markup content).
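
To make the atomizing idea concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular browser engine does it: the names Atom, AtomTable, and atomize are hypothetical, and the only real library call is the standard unicodedata.normalize. The assumption is that every name passes through the table once at parse time, so later comparisons are identity checks and fail just as fast as they do today.

```python
import unicodedata


class Atom:
    """Hypothetical interned name; holds only the NFC form."""
    __slots__ = ("normalized",)

    def __init__(self, normalized):
        self.normalized = normalized


class AtomTable:
    """Hypothetical intern table keyed by the NFC form of each name."""

    def __init__(self):
        self._atoms = {}

    def atomize(self, text):
        # Normalize once, up front. Canonically equivalent inputs
        # collapse to the same key and therefore the same Atom.
        key = unicodedata.normalize("NFC", text)
        atom = self._atoms.get(key)
        if atom is None:
            atom = Atom(key)
            self._atoms[key] = atom
        return atom


table = AtomTable()
composed = "caf\u00e9"      # precomposed U+00E9
decomposed = "cafe\u0301"   # "e" followed by combining acute U+0301

# The raw strings differ, so a strcmp()-style check would fail...
assert composed != decomposed
# ...but the atoms are the same object, so matching is an identity test.
assert table.atomize(composed) is table.atomize(decomposed)
```

If the original code point sequence is needed later, the caller can keep it next to the atom (for example as a pair of atom and original string) rather than inside the shared Atom, since different spellings map to the same atom.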

Addison

Received on Monday, 2 February 2009 19:01:52 UTC