Re: [CSS21][css3-namespace][css3-page][css3-selectors][css3-content] Unicode Normalization from Henri Sivonen on 2009-02-03 (public-i18n-core@w3.org from January to March 2009)

From: Henri Sivonen <hsivonen@iki.fi>
Date: Tue, 3 Feb 2009 10:48:13 +0200
To: "Phillips, Addison" <addison@amazon.com>
Cc: "L. David Baron" <dbaron@dbaron.org>, Boris Zbarsky <bzbarsky@MIT.EDU>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>, "www-style@w3.org" <www-style@w3.org>
Message-Id: <0A0FD4B4-626B-4BC8-9ED7-BFD84AA2C589@iki.fi>

On Feb 2, 2009, at 21:02, Phillips, Addison wrote:

> Because browsers are NOT the primary creator of the content. Early
> uniform normalization refers to every process that creates an
> XML/HTML/CSS/etc. document. The browser reads those documents and
> must still deal with normalization issues.

To me, it seems unreasonable to introduce serious performance-sensitive
complexity into Web content consumers to handle the case where a Web
developer fails to supply HTML, CSS and JS in a *consistent* form in
terms of combining characters. (I think even normalization in the HTML
parser after entity expansion would be undesirable.) How big a problem
is it in practice that an author fails to be self-consistent when
writing class names to .html and when writing them to .css or .js?

In my opinion the most reasonable way for browsers to deal with
normalization of identifiers is not to normalize at all before
performing a string equality comparison. It is then up to authors to
be *self*-consistent about normalization in their HTML, CSS and JS.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Tuesday, 3 February 2009 08:48:55 UTC