RE: I18N-ACTION-40: CSS Selectors and Normalization (new response) from Phillips, Addison on 2011-05-25 (public-i18n-core@w3.org from April to June 2011)

From: Phillips, Addison <addison@lab126.com>
Date: Wed, 25 May 2011 09:45:01 -0700
To: fantasai <fantasai.lists@inkedblade.net>
CC: "public-i18n-core@w3.org" <public-i18n-core@w3.org>
Message-ID: <131F80DEA635F044946897AFDA9AC3476A931F03B8@EX-SEA31-D.ant.amazon.com>

> >>>
> >>> That said, the Namespaces problem and the Selectors problem are
> >>> similar problems (from opposite sides). As a result, we can either:
> >>
> >> Actually, they have one major difference: Namespaces do not interact
> >> with anything outside CSS. Their matching is entirely internal to
> >> CSS, so problems involving normalization of the source markup or the
> >> DOM do not apply.
> >>
> >
> > The matching is entirely internal to CSS, so a stronger normalization
> > requirement could be applied to matching (if we were so inclined).
> > However, I don't believe that this really alters the recommendation as
> > given. Or are you suggesting that namespace prefixes ought to require
> > a specific normalization form (in order to be valid)?
> 
> It might make sense to require CSS identifiers to be parsed into a particular
> normalization form, and exposed to the DOM in that form.
> 

A requirement that names which are canonically equivalent be treated as equivalent is a de facto requirement that they be parsed to a normalized form. Consider:

@namespace \c5land "http://example.com";

@namespace A\30aland "http://example.com/bad?";


These two namespace prefixes should compare as equal under NFC, that is, they are in conflict. If we require namespace prefix matching to be normalizing (which suggests that all CSS matching should be normalizing, incidentally), then the identifiers have to be normalized when they are parsed. Otherwise you can have two tuples with the "same" identifier.
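
To make the comparison concrete, here is a minimal sketch in Python (chosen only for illustration; it says nothing about how a CSS implementation would do this). It assumes the two prefixes decode to U+00C5 followed by "land" and to "A", U+030A, "land" respectively. The variable names are mine.

```python
import unicodedata

# Prefix as written with the precomposed character U+00C5.
precomposed = "\u00c5land"
# Prefix as written with "A" followed by U+030A COMBINING RING ABOVE.
decomposed = "A\u030aland"

# A plain code point comparison treats these as different identifiers.
same_code_points = precomposed == decomposed

# After normalizing both sides to NFC they are the same string.
same_under_nfc = (
    unicodedata.normalize("NFC", precomposed)
    == unicodedata.normalize("NFC", decomposed)
)

print(same_code_points)  # False
print(same_under_nfc)    # True
```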

Since there is no requirement that a given CSS file use a particular normalization form when serialized (or that a normalizing transcoder be applied to non-Unicode encoded serializations), applying normalization to namespaces introduces the need for parse-time normalization of *some* identifiers. Otherwise, it introduces the possibility that two namespaces may be validly created, parsed, etc., and later compare as equal, either in the stylesheet or in the DOM itself.
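
A rough sketch of the two outcomes, again in Python and again only as an illustration. The `declare_namespace` helper and the dictionary standing in for a prefix table are hypothetical; for simplicity, a later declaration just replaces an earlier one under the same key.

```python
import unicodedata

def declare_namespace(table, prefix, uri, normalize=True):
    """Bind prefix to uri; return the URI previously bound to the same key, if any."""
    # Hypothetical parse-time step: fold the prefix to NFC before storing it.
    key = unicodedata.normalize("NFC", prefix) if normalize else prefix
    previous = table.get(key)
    table[key] = uri
    return previous

declarations = [
    ("\u00c5land", "http://example.com"),        # precomposed form
    ("A\u030aland", "http://example.com/bad?"),  # decomposed form
]

raw, normalized = {}, {}
for prefix, uri in declarations:
    declare_namespace(raw, prefix, uri, normalize=False)
    declare_namespace(normalized, prefix, uri)

print(len(raw))         # 2: two entries that later compare as equal under NFC
print(len(normalized))  # 1: the conflict surfaces at parse time
```

Without the parse-time step the table holds two entries with the "same" identifier; with it, the second declaration collides with the first as soon as it is parsed.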

The I18N WG is of the opinion that IDs that are canonically equivalent should compare identically. But this suggests that an otherwise-valid stylesheet can be created that has peculiar matching behavior. Maybe "early uniform normalization" (aka "compare character sequences for equivalence; be careful what sequence you use because it will be strictly observed") is a more likely solution.

Thoughts?

Addison

Received on Wednesday, 25 May 2011 16:45:31 UTC