
Re: [CSS21][css3-namespace][css3-page][css3-selectors][css3-content] Unicode Normalization

From: Anne van Kesteren <annevk@opera.com>
Date: Sun, 01 Feb 2009 17:02:53 +0100
To: "Andrew Cunningham" <andrewc@vicnet.net.au>
Cc: "Jonathan Kew" <jonathan@jfkew.plus.com>, "Richard Ishida" <ishida@w3.org>, "'L. David Baron'" <dbaron@dbaron.org>, public-i18n-core@w3.org, www-style@w3.org
Message-ID: <op.uooj6w0e64w2qv@annevk-t60.oslo.opera.com>

On Sun, 01 Feb 2009 03:17:04 +0100, Andrew Cunningham  
<andrewc@vicnet.net.au> wrote:
> but developers have to type code, sometimes more than one developer needs
> to work on the code. And if they are using different input tools, and
> those tools are generating different codepoints, when identical
> codepoints are required ... then there is a problem.

I can definitely see that problems might arise. And I can also see that  
putting complexity on the user agent side is better than putting it on the  
developer side. However, there are several things to take into  
consideration here.

1. How many developers are actually facing this problem? We know that  
theoretically there is an issue here, but I do not believe research has  
shown that this is a problem in practice. E.g. as I understand things this  
could occur with the character ë, but has it?

2. What is the performance impact on processing? That is, is the impact so  
negligible that browser vendors can add it? (FWIW, we care about  
microseconds.)

3. How likely is it that XML will change to require doing NFC normalization  
on input? Currently XML does reference Unicode Normalization normatively,  
but it only does so from a non-normative section on guidelines for  
designing XML names. If XML does not change, it does not make a whole lot  
of sense to change e.g. CSS selector matching, because that would mean some  
XML element names that are not in NFC could no longer be selected.
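
To make points 1 and 3 concrete, here is a minimal sketch in Python. It  
assumes plain code point comparison, which is roughly how names are  
compared today. The helpers names_match and select are hypothetical and  
only illustrate the idea; they are not how any user agent implements  
selector matching. Only unicodedata.normalize is a real standard-library  
call.

```python
import unicodedata

COMPOSED = "\u00eb"     # "ë" as a single precomposed code point
DECOMPOSED = "e\u0308"  # "e" followed by COMBINING DIAERESIS


def nfc(s):
    """Return s in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", s)


def names_match(a, b, normalize=False):
    """Compare two names code point by code point, optionally after NFC."""
    if normalize:
        a, b = nfc(a), nfc(b)
    return a == b


def select(selector, element_names, normalize_selector=False):
    """Hypothetical type selector: return the element names it matches.

    Only the selector is normalized, modelling CSS changing while XML
    element names are left exactly as they appear in the document.
    """
    if normalize_selector:
        selector = nfc(selector)
    return [name for name in element_names if name == selector]


if __name__ == "__main__":
    # Point 1: the two spellings of "ë" look identical but differ.
    print(names_match(COMPOSED, DECOMPOSED))                  # False
    print(names_match(COMPOSED, DECOMPOSED, normalize=True))  # True

    # Point 3: an element whose name is not in NFC is selectable today,
    # but not once only the selector side is normalized.
    print(len(select(DECOMPOSED, [DECOMPOSED])))                           # 1
    print(len(select(DECOMPOSED, [DECOMPOSED], normalize_selector=True)))  # 0
```

The last call shows the inconsistency: normalizing on the CSS side alone  
makes a non-NFC element name unreachable, even when the author typed the  
selector with exactly the same code points as the element name.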

The last one is quite important. If Unicode Normalization is so important,  
it has to happen everywhere, otherwise the platform becomes inconsistent.  
This means XML will have to change, HTML will have to change, CSS will  
have to change, DOM APIs will have to change, etc. That's a lot of tedious  
work with great potential for bugs and performance issues. Without very  
clear evidence that such a major overhaul is needed, I doubt you'll  
convince many vendors.

I can see how this task might be difficult given that most vendors (and  
all that matter for most of the Web) are Western, things seem to work fine  
today, and changing this has a high cost, but I like to think that  
research can convince us.


-- 
Anne van Kesteren
<http://annevankesteren.nl/>
<http://www.opera.com/>
Received on Sunday, 1 February 2009 16:03:46 GMT
