RE: Normalization of CSS Selectors: a rough outline [I18N-ACTION-39] from Koji Ishii on 2011-05-12 (public-i18n-core@w3.org from April to June 2011)

From: Koji Ishii <kojiishi@gluesoft.co.jp>
Date: Thu, 12 May 2011 03:46:14 -0400
To: "Phillips, Addison" <addison@lab126.com>, fantasai <fantasai.lists@inkedblade.net>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>
Message-ID: <A592E245B36A8949BDB0A302B375FB4E0AC287597A@MAILR001.mail.lan>

I agree with fantasai's mail[1], but if we really want to consider save-time options, one possibility is to follow the non-normative suggestions for XML names[2], which say "3. Characters in names should be expressed using Normalization Form C". That would avoid applying NFC to text content.

It could be more complex to implement than other options though.
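
To illustrate the names-only idea, here is a minimal Python sketch of a save-time check that flags identifiers not in NFC and leaves text content alone. The function names are hypothetical and not taken from any spec, and extracting the names from a stylesheet or document is assumed to happen elsewhere.

```python
import unicodedata


def is_nfc(name: str) -> bool:
    """Return True if the name is already in Normalization Form C."""
    return unicodedata.normalize("NFC", name) == name


def non_nfc_names(names):
    """Return the names an authoring tool might warn about at save time.

    Hypothetical helper: only identifiers are checked, never text content.
    """
    return [n for n in names if not is_nfc(n)]


# "café" written two ways: precomposed U+00E9 vs. "e" + combining acute U+0301.
composed = "caf\u00e9"
decomposed = "cafe\u0301"

flagged = non_nfc_names([composed, decomposed, "main"])
```

Only the decomposed spelling is flagged; the precomposed and ASCII names pass unchanged.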

[1] http://lists.w3.org/Archives/Public/www-style/2011Apr/0196.html

[2] http://www.w3.org/TR/xml/#sec-suggested-names

Regards,
Koji

-----Original Message-----
From: public-i18n-core-request@w3.org [mailto:public-i18n-core-request@w3.org] On Behalf Of Phillips, Addison
Sent: Tuesday, May 10, 2011 5:52 AM
To: fantasai; public-i18n-core@w3.org
Subject: RE: Normalization of CSS Selectors: a rough outline [I18N-ACTION-39]

(personal response)

> On 05/09/2011 12:53 AM, Koji Ishii wrote:
> > I’ve got some more feedback. It looks like it will be hard for me to reach
> > a conclusion on this issue within a week or so.
> 
> I don't think that applying NFC or NFD normalization is appropriate for text
> content. I believe the last round of discussions around normalization seemed to
> make that very clear, to me at least.

I agree that automatically normalizing pages and stylesheets as part of processing them is not appropriate. Normalizing identifiers is potentially a different issue: they aren't visible to users and defining proper matching behavior that meets user expectations (i.e. that may be consistent with Unicode normalization) seems potentially useful.

Please note that my summary's "recommend that users save documents in NFC" is consistent with current recommendations of the I18N WG and others. For most languages, using NFC is actually the "default" behavior, and in the main it is the right approach to the normalization problem for non-normalized content. Any discussion of normalization, though, needs to take the various details into account and present them clearly for users.

> 
> I think the question before the i18nWG right now is the one I posted last month:
>    http://lists.w3.org/Archives/Public/www-style/2011Apr/0196.html

I agree. However, I wanted to cover the other options listed in my summary. The problem is that the different approaches to late normalization all have the potential to either confuse users or be complex to implement in a robust manner (or both). I'm not sure I agree with your analysis of NFC, but full-file normalization is a non-starter. We may still recommend parse-time or compare-time normalization (the former looks iffy, since it's pretty close to normalization of the file itself and its impact would be difficult to isolate).

For me, I guess the issue comes down to: if nothing else in the HTML/CSS/JS edifice does normalization, are we really hurting anyone by making CSS Selectors consistent with that? Doing nothing (with a health warning for users who use affected languages) has the least impact on developers, is apparently "robust enough", and is consistent. Putting normalization into, say, ID comparison probably isn't the performance problem some allege and probably isn't as complex as some have alleged. But it does require all CSS and JS implementations to embed a normalizer and handle the issue. And it is at variance with current practice.
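
For concreteness, the difference between "doing nothing" and compare-time normalization of IDs can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names are hypothetical, NFC is picked arbitrarily as the comparison form, and no actual CSS or JS implementation is claimed to work this way.

```python
import unicodedata


def ids_match_raw(a: str, b: str) -> bool:
    """Code point comparison, i.e. no normalization at all."""
    return a == b


def ids_match_normalized(a: str, b: str) -> bool:
    """Compare-time normalization: normalize both sides, then compare.

    Neither the document nor the stylesheet is modified.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# An ID in the document (precomposed) and the same ID as typed in a
# selector (base letter plus combining mark).
doc_id = "r\u00e9sum\u00e9"
selector_id = "re\u0301sume\u0301"

raw_result = ids_match_raw(doc_id, selector_id)
normalized_result = ids_match_normalized(doc_id, selector_id)
```

The raw comparison fails to match the two spellings while the normalized one matches them, which is the user-visible difference at stake; the cost is that every comparison site needs access to a normalizer.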

Addison

Received on Thursday, 12 May 2011 07:48:45 UTC