RE: [CSS21][css3-namespace][css3-page][css3-selectors][css3-content] Unicode Normalization from Phillips, Addison on 2009-02-02 (public-i18n-core@w3.org from January to March 2009)

From: Phillips, Addison <addison@amazon.com>
Date: Mon, 2 Feb 2009 13:04:47 -0800
To: Boris Zbarsky <bzbarsky@MIT.EDU>, Mark Davis <mark.davis@icu-project.org>
CC: "public-i18n-core@w3.org" <public-i18n-core@w3.org>, "www-style@w3.org" <www-style@w3.org>
Message-ID: <4D25F22093241741BC1D0EEBC2DBB1DA017DA5F7CD@EX-SEA5-D.ant.amazon.com>

The question here is one of interpretation. Anne points out that, at least theoretically, it is possible to create XML document schemas that define two semantically identical names encoded using different code point sequences. This, of course, is an Extremely Bad Idea since, among other things, such a document might not survive transcoding to another character encoding or other forms of processing. Although Anne pointed to XML 1.1, XML 1.0 Fifth Edition in fact includes the same recommendations:

  http://www.w3.org/TR/xml/#sec-suggested-names
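
To make the problem concrete, here is a minimal sketch of two names that look identical but differ as code point sequences. The name "café" and the helper `canonically_equal` are illustrative assumptions of mine, not taken from any spec, and NFC is only one possible choice of normalization form:

```python
import unicodedata

# Two spellings of the same (hypothetical) element name "café".
precomposed = "caf\u00e9"   # ends in U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT


def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC (assumed form)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# A plain code point comparison treats the two names as different ...
print(precomposed == decomposed)
# ... while a normalizing comparison treats them as the same name.
print(canonically_equal(precomposed, decomposed))
```

A schema that relied on these two spellings being distinct names would lose that distinction as soon as any tool in the pipeline normalized the document.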

The real question is: what feature is more important to preserve? The non-normalizability of XML names (which is deprecated anyway)?

Addison

Addison Phillips
Globalization Architect -- Lab126

Internationalization is not a feature.
It is an architecture.

> -----Original Message-----
> From: Boris Zbarsky [mailto:bzbarsky@MIT.EDU]
> Sent: Monday, February 02, 2009 11:27 AM
> To: Mark Davis
> Cc: Phillips, Addison; public-i18n-core@w3.org; www-style@w3.org
> Subject: Re: [CSS21][css3-namespace][css3-page][css3-selectors][css3-content] Unicode Normalization
> 
> Mark Davis wrote:
> > When you are tokenizing, and then doing comparison, the simplest
> > approach is to normalize when creating the tokens.
> 
> At least one post to this thread has stated that doing that
> (parse-time normalization) is not acceptable.
> 
> Clearly either I'm missing something, or there is significant
> disagreement on what the acceptable behaviors are here...
> 
> -Boris
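
For reference, the "normalize when creating the tokens" approach Mark describes might look roughly like the sketch below. The `tokenize` function is hypothetical and deliberately naive (it only splits on whitespace and is in no way a CSS tokenizer), and NFC is an assumed choice of form:

```python
import unicodedata


def tokenize(source: str) -> list[str]:
    """Hypothetical tokenizer: split on whitespace, normalize each token to NFC."""
    return [unicodedata.normalize("NFC", token) for token in source.split()]


# The same selector written with a precomposed and a decomposed "é".
tokens_a = tokenize("caf\u00e9 > span")
tokens_b = tokenize("cafe\u0301 > span")

# Because normalization happened at token creation, later comparisons
# can be plain code point comparisons.
print(tokens_a == tokens_b)
```

The trade-off is the one under dispute in this thread: after parse-time normalization the original code point sequence is no longer available to later stages.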

Received on Monday, 2 February 2009 21:05:26 UTC