Subject: Re: [CSS21][css3-namespace][css3-page][css3-selectors][css3-content] Unicode Normalization

From: Philip TAYLOR (Ret'd) <P.Taylor@Rhul.Ac.Uk>
Date: Thu, 05 Feb 2009 15:31:48 +0000
To: Henri Sivonen <hsivonen@iki.fi>
CC: Jonathan Kew <jonathan@jfkew.plus.com>, Andrew Cunningham <andrewc@vicnet.net.au>, public-i18n-core@w3.org, W3C Style List <www-style@w3.org>
Message-ID: <498B0664.30105@Rhul.Ac.Uk>

Henri Sivonen wrote:

> My point is that it's generally not helpful to bring out the Western 
> bias[1] thing in discussions of using Unicode in computer languages. 
> Previously, too, performance has been preferred over full natural 
> language complexity for computer language identifier equality comparison 
> and in that instance clearly it could not have been an issue of Western 
> bias. The thing is that comparing computer language identifiers code 
> point for code point is the common-sense thing to do. 

With respect, it is the /simplest/ thing to do.  For those
who work in anything more complex than English, it is
probably anything /but/ "common sense".

> If you consider 
> the lack of case-insensitivity, some languages are not perfectly 
> convenienced. If you consider the lack of normalization, another 
> (overlapping) set of languages is not perfectly convenienced. If you 
> consider the sensitivity to diacritics, yet another set of languages is 
> not perfectly convenienced. No language is prohibited by code point for 
> code point comparison, though.

Yet for many (perhaps most) of the world's languages,
comparison by code-point is noticeably sub-optimal.
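
To make the distinction concrete, here is a minimal sketch in
Python (standard library only) of the comparisons discussed above.
The helper names are hypothetical, chosen for illustration, and
NFC is assumed as the normalization form; nothing here is meant
to reflect what any CSS specification or implementation does.

```python
import unicodedata

# Two spellings of the same visible text, "é":
composed = "\u00e9"      # one code point: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT


def codepoint_equal(a: str, b: str) -> bool:
    """Code point for code point comparison (plain string equality)."""
    return a == b


def nfc_equal(a: str, b: str) -> bool:
    """Compare after normalizing both sides to NFC (assumed form)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def caseless_equal(a: str, b: str) -> bool:
    """Normalize to NFC, then compare with case folding applied."""
    fold = lambda s: unicodedata.normalize("NFC", s).casefold()
    return fold(a) == fold(b)


def strip_marks(s: str) -> str:
    """Decompose to NFD and drop combining marks (diacritic-insensitive)."""
    decomposed_form = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed_form if not unicodedata.combining(c))


print(codepoint_equal(composed, decomposed))   # False
print(nfc_equal(composed, decomposed))         # True
print(caseless_equal("\u00c9", decomposed))    # True
print(strip_marks("r\u00e9sum\u00e9"))         # resume
```

The first comparison is the cheap one; each of the others costs
an extra pass over the string, which is the trade-off at issue.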

Philip TAYLOR

Received on Thursday, 5 February 2009 15:32:26 UTC